# Autosomal Admixture Levels Are Informative About Sex Bias in Admixed Populations

^{*}Department of Biology, Stanford University, Stanford, California 94305-5020

^{†}Centre National de la Recherche Scientifique–Muséum National d'Histoire Naturelle, Université Paris Diderot, Unité Mixte de Recherche 7206, Ecoanthropology and Ethnobiology, Paris, France 75005

^{1}Corresponding author: Stanford University, 371 Serra Mall, Gilbert Biology Bldg., Stanford, CA 94305. E-mail: agoldb@stanford.edu

## Abstract

Sex-biased admixture has been observed in a wide variety of admixed populations. Genetic variation in sex chromosomes and functions of quantities computed from sex chromosomes and autosomes have often been examined to infer patterns of sex-biased admixture, typically using statistical approaches that do not mechanistically model the complexity of a sex-specific history of admixture. Here, expanding on a model of Verdu and Rosenberg (2011) that did not include sex specificity, we develop a model that mechanistically examines sex-specific admixture histories. Under the model, multiple source populations contribute to an admixed population, potentially with their male and female contributions varying over time. In an admixed population descended from two source groups, we derive the moments of the distribution of the autosomal admixture fraction from a specific source population as a function of sex-specific introgression parameters and time. Considering admixture processes that are constant in time, we demonstrate that surprisingly, although the mean autosomal admixture fraction from a specific source population does not reveal a sex bias in the admixture history, the variance of autosomal admixture is informative about sex bias. Specifically, the long-term variance decreases as the sex bias from a contributing source population increases. This result can be viewed as analogous to the reduction in effective population size for populations with an unequal number of breeding males and females. Our approach suggests that it may be possible to use the effect of sex-biased admixture on autosomal DNA to assist with methods for inference of the history of complex sex-biased admixture processes.

- admixture
- demographic inference
- model
- population history
- sex bias

POPULATIONS often experience sex-biased demographic processes, in which males and females contributing to the gene pool of a population are drawn from source groups in different proportions, owing to patterns of inbreeding avoidance, dispersal, and mating practices (Pusey 1987; Lawson Handley and Perrin 2007). In humans, sex-biased demography has had a particular effect on admixed populations, populations that have often been founded or influenced by periods of colonization and forced migration involving an initial or continuing admixture process (Mesa *et al.* 2000; Seielstad 2000; Wilkins and Marlowe 2006; Tremblay and Vezina 2010; Heyer *et al.* 2012).

Genetic signatures of sex-biased admixture have been empirically investigated in a variety of human populations. In the Americas, these include African American, Latino, and Native American populations (Bolnick *et al.* 2006; Wang *et al.* 2008; Stefflova *et al.* 2009; Tishkoff *et al.* 2009; Bryc *et al.* 2010a,b; Moreno-Estrada *et al.* 2013; Verdu *et al.* 2014). Sex-biased admixture and migration have also been examined in populations throughout Asia (Oota *et al.* 2001; Wen *et al.* 2004; Chaix *et al.* 2007; Ségurel *et al.* 2008; Chaubey *et al.* 2011; Pemberton *et al.* 2012; Pijpe *et al.* 2013), Austronesia (Kayser *et al.* 2003, 2006, 2008; Cox *et al.* 2010; Lansing *et al.* 2011), and Africa (Wood *et al.* 2005; Tishkoff *et al.* 2007; Berniell-Lee *et al.* 2008; Beleza *et al.* 2013; Petersen *et al.* 2013; Verdu *et al.* 2013).

Sex-specific admixture and migration processes have typically been studied using comparisons of the Y chromosome, which is paternally inherited, and the mitochondrial genome, inherited maternally (Seielstad *et al.* 1998; Oota *et al.* 2001; Wood *et al.* 2005; Bolnick *et al.* 2006; Gunnarsdóttir *et al.* 2011; Lacan *et al.* 2011). More recently, as the Y chromosome and mitochondrial genome each represent single nonrecombining loci that provide an incomplete genomic perspective, sex-biased admixture has been examined by comparisons of autosomal DNA to the X chromosome (Lind *et al.* 2007; Wang *et al.* 2008; Bryc *et al.* 2010a,b; Cox *et al.* 2010; Beleza *et al.* 2013; Verdu *et al.* 2013).

The Y-mitochondrial and X-autosomal frameworks are both sensible, as both involve comparisons of two types of loci that follow different modes of inheritance in males and females. What has not been clear, however, is that autosomal data, which have not typically been viewed as the most informative loci for studies of sex-specific processes, can carry information about sex-biased admixture, even in the absence of a comparison with other components of the genome.

We demonstrate this surprising result through an extension of a mechanistic model for the admixture history of a hybrid population. In a diploid autosomal framework, Verdu and Rosenberg (2011) examined contributions of multiple source populations that varied through time, without considering sex specificity. Here, expanding on the model of Verdu and Rosenberg (2011), we develop a model that mechanistically considers sex-specific admixture histories in which multiple source populations contribute to the admixed population, potentially with varying female and male contributions across generations (Figure 1). In an admixed population descended from two source populations, we derive the moments of the distribution of the fraction of autosomal admixture from a specific source population, as a function of sex-specific admixture parameters and time. We analyze the behavior of the model, considering admixture processes that are constant in time, and we show that the moments contain information about the sex bias.

## The Model

Several studies have described mechanistic models of admixture (Chakraborty and Weiss 1988; Long 1991; Ewens and Spielman 1995; Guo *et al.* 2005; Verdu and Rosenberg 2011; Gravel 2012; Jin *et al.* 2014). We follow the notation and style of the model of Verdu and Rosenberg (2011), studying a hybrid population, $H$, which consists of immigrant individuals from $M$ isolated source populations and hybrid individuals who have ancestors from two or more source populations. The source populations are labeled $S_\alpha$, for $\alpha$ from 1 to $M$. We focus on the case of $M = 2$.

We define the parameters $s_{\alpha,g-1}$ and $h_{g-1}$ as the contributions from source populations $S_\alpha$ and $H$, respectively, to the gene pool of the hybrid population $H$ at the next generation, $g$. That is, for a random individual at generation $g$, the probabilities that a randomly chosen parent of the individual derives from $S_\alpha$ and $H$ are $s_{\alpha,g-1}$ and $h_{g-1}$, respectively. We define the sex-specific parameter $s^{\delta}_{\alpha,g-1}$, for $\delta \in \{f, m\}$, as the probability that the type-$\delta$ parent of a randomly chosen individual from the hybrid population at generation $g$ is from source population $S_\alpha$. Similarly, $h^{\delta}_{g-1}$ is the probability that the type-$\delta$ parent of a randomly chosen individual in $H$ at generation $g$ is from $H$ itself. We consider a two-sex model, using $f$ for female and $m$ for male. Because each individual has one parent of each type, female and male, we have

$$s_{\alpha,g} = \frac{s^{f}_{\alpha,g} + s^{m}_{\alpha,g}}{2} \tag{1}$$

$$h_{g} = \frac{h^{f}_{g} + h^{m}_{g}}{2}. \tag{2}$$

The contributions to the next generation of the three source populations ($S_1$, $S_2$, $H$) sum to one:

$$s_{1,g} + s_{2,g} + h_{g} = 1. \tag{3}$$

Similarly, the female and male contributions to the next generation separately sum to one,

$$s^{f}_{1,g} + s^{f}_{2,g} + h^{f}_{g} = s^{m}_{1,g} + s^{m}_{2,g} + h^{m}_{g} = 1. \tag{4}$$

At the first generation, $g = 1$, the hybrid population has not previously existed; therefore,

$$h_{0} = h^{f}_{0} = h^{m}_{0} = 0 \tag{5}$$

$$s^{f}_{1,0} + s^{f}_{2,0} = s^{m}_{1,0} + s^{m}_{2,0} = 1. \tag{6}$$

The first generation has two independent parameters, $s^{f}_{1,0}$ and $s^{m}_{1,0}$. Each subsequent generation contributes four independent parameters, $s^{f}_{1,g}$, $s^{m}_{1,g}$, $s^{f}_{2,g}$, and $s^{m}_{2,g}$, and considering $g$ generations, there are $4g - 2$ independent parameters. The model is discrete in time with nonoverlapping generations. As in Verdu and Rosenberg (2011), our deterministic treatment amounts to an assumption of infinite population size.
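The bookkeeping in the constraints above can be sketched in a few lines of Python. This is only an illustration: the class name `SexSpecificContributions`, its field names, and the helper `independent_parameter_count` are hypothetical and do not come from the paper. The sketch assumes that the total contribution of each population is the average of its female and male contributions and that the contributions of each sex sum to one, so that the hybrid contributions are determined by the four source-population parameters.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class SexSpecificContributions:
    """Sex-specific contributions for one generation (hypothetical container).

    s1f, s1m: female and male contributions from S1; s2f, s2m: from S2.
    The hybrid contributions hf and hm follow from each sex summing to one.
    """
    s1f: Fraction
    s1m: Fraction
    s2f: Fraction
    s2m: Fraction

    def __post_init__(self):
        for name in ("s1f", "s1m", "s2f", "s2m"):
            if not 0 <= getattr(self, name) <= 1:
                raise ValueError(f"{name} must lie in [0, 1]")
        if self.hf < 0 or self.hm < 0:
            raise ValueError("the contributions of one sex exceed one")

    @property
    def hf(self):
        return 1 - self.s1f - self.s2f

    @property
    def hm(self):
        return 1 - self.s1m - self.s2m

    @property
    def totals(self):
        """(s1, s2, h): averages of the female and male contributions."""
        return (
            (self.s1f + self.s1m) / 2,
            (self.s2f + self.s2m) / 2,
            (self.hf + self.hm) / 2,
        )


def independent_parameter_count(generations):
    """Two founding parameters plus four for each later generation."""
    return 2 + 4 * (generations - 1)


# Illustrative values only: S1 contributes through females, S2 through males.
example = SexSpecificContributions(
    Fraction(1, 5), Fraction(0), Fraction(0), Fraction(3, 5)
)
```

Exact fractions are used so that the sum-to-one constraints can be checked without floating-point tolerance.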

Our model enables us to consider complex sex-biased admixture processes by allowing uneven sex-specific contributions from each source population at each generation. It reduces to the Verdu and Rosenberg (2011) model when the sex-specific contributions are equal within source populations, that is, if for each $g$, $s^{f}_{1,g} = s^{m}_{1,g}$, $s^{f}_{2,g} = s^{m}_{2,g}$, and $h^{f}_{g} = h^{m}_{g}$. We perform similar computations to those of Verdu and Rosenberg (2011), finding that in certain cases, our results reduce to those obtained when sex specificity is not considered.

We let *L* be a random variable indicating the source populations of the parents of a random individual from the hybrid population, *H*. *L* takes its values from the set of all possible ordered parental combinations, {*S*_{1}*S*_{1}, *S*_{1}*H*, *S*_{1}*S*_{2}, *HS*_{1}, *HH*, *HS*_{2}, *S*_{2}*S*_{1}, *S*_{2}*H*, *S*_{2}*S*_{2}}, listing the female parent first. We assume random mating in the hybrid population at each generation, so that the probability that an offspring has a particular pair of source populations for his or her parents is simply the product of separate probabilities associated with the female and male parents (Table 1).
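As a minimal sketch of the random-mating assumption, the joint distribution of $L$ can be built as a product of the female and male origin probabilities. The function name `parental_pair_probabilities` and the numerical values are hypothetical, chosen only for illustration.

```python
from itertools import product


def parental_pair_probabilities(female, male):
    """Joint distribution of L under random mating.

    female, male: dicts mapping a parental origin ("S1", "H", "S2") to the
    probability that the parent of that sex comes from that population.
    """
    for name, contrib in (("female", female), ("male", male)):
        total = sum(contrib.values())
        if abs(total - 1.0) > 1e-12:
            raise ValueError(f"{name} contributions sum to {total}, not 1")
    # The female parent is listed first, matching the ordering used for L.
    return {(f, m): female[f] * male[m] for f, m in product(female, male)}


# Hypothetical sex-biased contributions for one generation.
female = {"S1": 0.2, "H": 0.6, "S2": 0.2}
male = {"S1": 0.0, "H": 0.6, "S2": 0.4}
table = parental_pair_probabilities(female, male)
```

The nine entries of `table` correspond to the nine ordered parental combinations, and they sum to one because each marginal does.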

We define the fraction of admixture, the random variable $H_{\alpha,g,\delta}$, as the probability that an autosomal genetic locus in a random individual of sex $\delta$ from the hybrid population in generation $g$ ultimately originates from source population $\alpha$. The sex-specific fractions of admixture are related to the total fraction of admixture $H_{\alpha,g}$ from source population $\alpha$ in generation $g$ by $H_{\alpha,g} = (H_{\alpha,g,f} + H_{\alpha,g,m})/2$.

Under the model, we derive expressions for the moments of the fraction of admixture. Autosomal DNA is inherited non-sex-specifically and from both parents; therefore, female and male offspring have identical distributions of admixture, and $H_{\alpha,g,f}$ and $H_{\alpha,g,m}$ are identically distributed. Each of these quantities depends on both the female and male fractions of admixture in the previous generation, but conditional on the previous generation (that is, on $H_{\alpha,g-1,f}$ and $H_{\alpha,g-1,m}$), they are independent. For our two-population model, we consider the non-sex-specific fraction of admixture, $H_{1,g,\delta}$, treating $\delta$ here as representing either $f$ or $m$, but retaining the same meaning through a calculation. The quantity $H_{1,g,\delta}$ depends on both sex-specific fractions of admixture from the previous generation, $H_{1,g-1,f}$ and $H_{1,g-1,m}$.

## Distribution of the Admixture Fraction from a Specific Source

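The definition of the model parameters and the values from Table 1 allow us to write a recursion relation for the fraction of admixture from source population 1 for a random individual of sex $\delta$ from the hybrid population at generation $g$, or $H_{1,g,\delta}$. For the first generation, $g = 1$, we have

$$H_{1,1,\delta} = \begin{cases} 1 & \text{if } L = S_1S_1 \\ \tfrac{1}{2} & \text{if } L = S_1S_2 \text{ or } L = S_2S_1 \\ 0 & \text{if } L = S_2S_2. \end{cases} \tag{7}$$

For all subsequent generations, $g \geq 2$, we have

$$H_{1,g,\delta} = \begin{cases} 1 & \text{if } L = S_1S_1 \\ (1 + H_{1,g-1,m})/2 & \text{if } L = S_1H \\ \tfrac{1}{2} & \text{if } L = S_1S_2 \\ (H_{1,g-1,f} + 1)/2 & \text{if } L = HS_1 \\ (H_{1,g-1,f} + H_{1,g-1,m})/2 & \text{if } L = HH \\ H_{1,g-1,f}/2 & \text{if } L = HS_2 \\ \tfrac{1}{2} & \text{if } L = S_2S_1 \\ H_{1,g-1,m}/2 & \text{if } L = S_2H \\ 0 & \text{if } L = S_2S_2. \end{cases} \tag{8}$$

Using Equations 7 and 8, we can analyze the distribution of the fraction of admixture as a function of the time $g$ and the parameters $s^{\delta}_{\alpha,g}$ and $h^{\delta}_{g}$. Under our model, $H_{1,g,\delta}$ takes its values in $Q_g = \{0, 1/2^g, \ldots, 1 - 1/2^g, 1\}$. Therefore, using Equations 7 and 8, and recalling that $H_{1,g,f}$ and $H_{1,g,m}$ are identically distributed, for a value $q$ in the set $Q_g$, we can compute the probability $\mathbb{P}(H_{1,g,\delta} = q)$ that a random individual from the hybrid population at generation $g$ has admixture fraction $q$. For $g = 1$, $Q_1 = \{0, \tfrac{1}{2}, 1\}$, and

$$\mathbb{P}(H_{1,1,\delta} = q) = \begin{cases} s^{f}_{1,0}s^{m}_{1,0} & \text{if } q = 1 \\ s^{f}_{1,0}s^{m}_{2,0} + s^{f}_{2,0}s^{m}_{1,0} & \text{if } q = \tfrac{1}{2} \\ s^{f}_{2,0}s^{m}_{2,0} & \text{if } q = 0. \end{cases} \tag{9}$$

For all subsequent generations, $g \geq 2$, for $q$ in $Q_g$,

$$\begin{aligned} \mathbb{P}(H_{1,g,\delta} = q) ={}& s^{f}_{1,g-1}h^{m}_{g-1}\,\mathbb{P}(H_{1,g-1,m} = 2q - 1) + h^{f}_{g-1}s^{m}_{1,g-1}\,\mathbb{P}(H_{1,g-1,f} = 2q - 1) \\ &+ h^{f}_{g-1}s^{m}_{2,g-1}\,\mathbb{P}(H_{1,g-1,f} = 2q) + s^{f}_{2,g-1}h^{m}_{g-1}\,\mathbb{P}(H_{1,g-1,m} = 2q) \\ &+ h^{f}_{g-1}h^{m}_{g-1}\sum_{r \in Q_{g-1}} \mathbb{P}(H_{1,g-1,f} = r)\,\mathbb{P}(H_{1,g-1,m} = 2q - r) + I_g(q), \end{aligned} \tag{10}$$

where a probability is taken to be zero if its argument lies outside $Q_{g-1}$. The function $I_g$ is defined for all values of $q$ in $Q_g$ and is equal to

$$I_g(q) = \begin{cases} s^{f}_{1,g-1}s^{m}_{1,g-1} & \text{if } q = 1 \\ s^{f}_{1,g-1}s^{m}_{2,g-1} + s^{f}_{2,g-1}s^{m}_{1,g-1} & \text{if } q = \tfrac{1}{2} \\ s^{f}_{2,g-1}s^{m}_{2,g-1} & \text{if } q = 0 \\ 0 & \text{otherwise.} \end{cases} \tag{11}$$

In Equation 10, we calculate the probability distribution of $H_{1,g,\delta}$ by taking a sum over all possible parental pairings at the previous generation that would lead to an admixture fraction $q$ at generation $g$. Only three values of $q$ allow for a history without a single hybrid ancestor, $q = 0$, $q = \tfrac{1}{2}$, and $q = 1$, producing the terms in Equation 11. When there is no sex bias and $s^{f}_{\alpha,g} = s^{m}_{\alpha,g}$ and $h^{f}_{g} = h^{m}_{g}$, Equations 9–11 reduce to the corresponding Equations 3–5 from Verdu and Rosenberg (2011).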
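The recursion for the distribution can be sketched numerically. The sketch below is one way to organize the sum over parental pairings, not the authors' implementation, and all function names are hypothetical. It assumes that each parent independently comes from $S_1$ (carrying fraction 1), from $S_2$ (fraction 0), or from $H$ (a fraction drawn from the distribution in $H$ one generation earlier), and that the offspring fraction is the average of the two parental fractions. Contributions are assumed constant after the founding.

```python
from collections import defaultdict
from fractions import Fraction


def parent_distribution(s1, s2, h, prev):
    """Distribution of the admixture fraction carried by one parent."""
    dist = defaultdict(Fraction)
    dist[Fraction(1)] += s1          # parent from S1
    dist[Fraction(0)] += s2          # parent from S2
    for q, p in prev.items():        # parent from H, fraction drawn from prev
        dist[q] += h * p
    return dist


def next_generation(female, male, prev):
    """One recursion step; female and male are (s1, s2, h) triples."""
    fdist = parent_distribution(*female, prev)
    mdist = parent_distribution(*male, prev)
    out = defaultdict(Fraction)
    for qf, pf in fdist.items():
        for qm, pm in mdist.items():
            out[(qf + qm) / 2] += pf * pm
    return {q: p for q, p in out.items() if p != 0}


def admixture_distribution(founding, continuing, generations):
    """founding = (s1f0, s1m0); continuing = ((s1f, s2f, hf), (s1m, s2m, hm))."""
    s1f0, s1m0 = founding
    dist = next_generation((s1f0, 1 - s1f0, 0), (s1m0, 1 - s1m0, 0), {})
    history = [dist]
    for _ in range(generations - 1):
        dist = next_generation(continuing[0], continuing[1], dist)
        history.append(dist)
    return history


def mean_and_variance(dist):
    mean = sum(q * p for q, p in dist.items())
    return mean, sum(q * q * p for q, p in dist.items()) - mean ** 2


# Founding with equal, unbiased contributions, then no further contributions.
half = Fraction(1, 2)
isolation = ((0, 0, 1), (0, 0, 1))
history = admixture_distribution((half, half), isolation, 3)
```

With these illustrative inputs, the distribution keeps its mean of 1/2 while its variance halves each generation, which is the contraction described for a hybrid population that receives no further contributions.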

Equations 9–11 can be used to analyze the behavior of the distribution of $H_{1,g,\delta}$ over time. In Figure 2, we consider constant admixture processes after the founding of the hybrid population ($s^{\delta}_{\alpha,g} = s^{\delta}_{\alpha}$ for each $\alpha \in \{1, 2\}$, $\delta \in \{f, m\}$, and $g \geq 1$), plotting $\mathbb{P}(H_{1,g,\delta})$ for the first six generations, as computed recursively using Equation 10. In Figure 2, A and B, we consider a hybrid population founded with equal contributions from source populations $S_1$ and $S_2$, but with no further contributions after $g = 1$. In both of these cases, the distribution of the autosomal admixture fraction contracts around the mean of $\tfrac{1}{2}$. However, whereas Figure 2A has equal contributions from each sex in the founding generation, Figure 2B has a large initial sex bias. We see that the width of the distribution is smaller with the sex-biased contributions, despite equality of the total contributions $s_{1,0}$ and $s_{2,0}$.

In Figures 2, C–E, we consider admixture scenarios in which the founding of the hybrid population is followed by constant contributions from the source populations over time, $s_1 = 0.1$ and $s_2 = 0.3$. Because the two source populations contribute after the founding, the distribution does not contract around the mean as in Figures 2, A and B. Also, because the total contributions from $S_1$ and $S_2$ are unequal, the distribution of $H_{1,g,\delta}$ is no longer symmetrical. Rather, because the contribution from $S_2$ is greater, the distribution is shifted toward zero.

Figures 2, C and D, have the same continuing contributions for $g \geq 2$, with no sex bias in the founding generation for Figure 2C, and a large initial sex bias for Figure 2D. Despite different founding contributions, Figures 2, C and D, have similar distributions of $H_{1,g,\delta}$ after a few generations. In Figure 2E, the hybrid population is founded without a sex bias and with equal contributions from the two source populations. The total contributions $s_1$ and $s_2$ are the same as in Figures 2, C and D, but unlike in Figures 2, C and D, the continuing contributions are sex biased. Even with $s_1$ and $s_2$ held constant, the distribution of $H_{1,g,\delta}$ depends on the sex-specific contributions $s^{\delta}_{\alpha}$. Notably, the probability of $H_{1,6,\delta} = 0$ drops from 0.157 in Figure 2C to 0.000 in Figure 2E. Similarly, $\mathbb{P}(H_{1,6,\delta} = 1)$ drops to zero in Figure 2E as well. With these reductions at the extremes, we see a rise in the probability of intermediate values for $H_{1,g,\delta}$.

## Expectation of the Fraction of Admixture

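Using the law of total expectation, we write the expectation of the fraction of admixture from source population 1 for a random individual of sex $\delta$ in population $H$ at generation $g$ as a function of conditional expectations for all possible pairs of parents $L$,

$$\mathbb{E}[H_{1,g,\delta}] = \mathbb{E}_L\big[\mathbb{E}[H_{1,g,\delta} \mid L]\big]. \tag{12}$$

We can simplify this recursion relation. For $g = 1$,

$$\mathbb{E}[H_{1,1,\delta}] = \sum_{\gamma \in \{S_1S_1,\, S_1S_2,\, S_2S_1,\, S_2S_2\}} \mathbb{P}(L = \gamma)\,\mathbb{E}[H_{1,1,\delta} \mid L = \gamma]. \tag{13}$$

For all subsequent generations, $g \geq 2$, we have

$$\mathbb{E}[H_{1,g,\delta}] = \sum_{\gamma} \mathbb{P}(L = \gamma)\,\mathbb{E}[H_{1,g,\delta} \mid L = \gamma], \tag{14}$$

where the sum is over all nine ordered parental combinations. Using Equations 7 and 8, for the first generation, $g = 1$, we have

$$\mathbb{E}[H_{1,1,\delta}] = s^{f}_{1,0}s^{m}_{1,0} + \frac{s^{f}_{1,0}s^{m}_{2,0} + s^{f}_{2,0}s^{m}_{1,0}}{2}. \tag{15}$$

For all subsequent generations, $g \geq 2$, we have

$$\begin{aligned} \mathbb{E}[H_{1,g,\delta}] ={}& s^{f}_{1,g-1}s^{m}_{1,g-1} + s^{f}_{1,g-1}h^{m}_{g-1}\,\mathbb{E}\!\left[\frac{1 + H_{1,g-1,m}}{2}\right] + \frac{s^{f}_{1,g-1}s^{m}_{2,g-1}}{2} \\ &+ h^{f}_{g-1}s^{m}_{1,g-1}\,\mathbb{E}\!\left[\frac{H_{1,g-1,f} + 1}{2}\right] + h^{f}_{g-1}h^{m}_{g-1}\,\mathbb{E}\!\left[\frac{H_{1,g-1,f} + H_{1,g-1,m}}{2}\right] + h^{f}_{g-1}s^{m}_{2,g-1}\,\mathbb{E}\!\left[\frac{H_{1,g-1,f}}{2}\right] \\ &+ \frac{s^{f}_{2,g-1}s^{m}_{1,g-1}}{2} + s^{f}_{2,g-1}h^{m}_{g-1}\,\mathbb{E}\!\left[\frac{H_{1,g-1,m}}{2}\right]. \end{aligned} \tag{16}$$

Recalling Equations 1, 2, and 4, we can simplify the expectation of the fraction of admixture in a random individual of sex $\delta$ from the hybrid population. For $g = 1$, Equation 15 gives

$$\mathbb{E}[H_{1,1,\delta}] = s_{1,0}, \tag{17}$$

the same expression found by Verdu and Rosenberg (2011, Equation 10). For $g \geq 2$, by Equation 16,

$$\mathbb{E}[H_{1,g,\delta}] = s_{1,g-1} + \frac{h^{f}_{g-1}\,\mathbb{E}[H_{1,g-1,f}] + h^{m}_{g-1}\,\mathbb{E}[H_{1,g-1,m}]}{2}. \tag{18}$$

Because $H_{1,g,f}$ and $H_{1,g,m}$ are identically distributed, recalling Equation 2, we can simplify the expectation using $\mathbb{E}[H_{1,g-1,f}] = \mathbb{E}[H_{1,g-1,m}] = \mathbb{E}[H_{1,g-1,\delta}]$, where $\delta$ is left as an unspecified sex ($f$ or $m$). For $g \geq 2$, the expectation of the fraction of admixture from source population 1 is

$$\mathbb{E}[H_{1,g,\delta}] = s_{1,g-1} + h_{g-1}\,\mathbb{E}[H_{1,g-1,\delta}]. \tag{19}$$

We see in Equations 17 and 19 that the expectation of the fraction of admixture for a random individual of sex $\delta$ from the hybrid population at generation $g$, $\mathbb{E}[H_{1,g,\delta}]$, depends on the total contributions of the source populations ($S_1$, $S_2$, $H$) at each generation, $s_{1,g-1}$ and $h_{g-1}$, and not on the sex-specific parameters, $s^{f}_{1,g-1}$, $s^{m}_{1,g-1}$, $h^{f}_{g-1}$, and $h^{m}_{g-1}$. This recursion (Equations 17 and 19) is the same as in the non-sex-specific model of Verdu and Rosenberg (2011, Equations 10 and 11).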
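A short numerical check can make the absence of a sex-bias signal in the mean concrete. The sketch below assumes that the offspring expectation is the average of the two parental expectations, where a parent from $S_1$ contributes 1, a parent from $S_2$ contributes 0, and a parent from $H$ contributes the previous generation's expectation. The function name and the parameter values are hypothetical, and contributions are taken to be constant after the founding.

```python
from fractions import Fraction as F


def expected_admixture(founding, continuing, generations):
    """Expectation of the admixture fraction from S1 for g = 1, ..., generations.

    founding = (s1f0, s1m0); continuing = ((s1f, hf), (s1m, hm)).
    """
    s1f0, s1m0 = founding
    e = (s1f0 + s1m0) / 2
    out = [e]
    (s1f, hf), (s1m, hm) = continuing
    for _ in range(generations - 1):
        e_female_parent = s1f + hf * e   # S1 parent carries 1, H parent carries e
        e_male_parent = s1m + hm * e
        e = (e_female_parent + e_male_parent) / 2
        out.append(e)
    return out


# Two hypothetical histories with the same totals (s1 = 1/10, h = 3/5, founding
# s1 = 1/2) but different sex-specific splits.
unbiased = expected_admixture((F(1, 2), F(1, 2)), ((F(1, 10), F(3, 5)), (F(1, 10), F(3, 5))), 6)
biased = expected_admixture((F(1), F(0)), ((F(1, 5), F(4, 5)), (F(0), F(2, 5))), 6)
```

Both trajectories coincide generation by generation, because only the averages of the female and male parameters enter the update.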

## Higher Moments of the Fraction of Admixture

We can write a general recursion for the higher moments of the admixture fraction from population $S_1$ in a randomly chosen individual of sex $\delta$ from the hybrid population. For $k \geq 1$, in generation $g = 1$, (20) For all subsequent generations, $g \geq 2$, (21) As in the case of $k = 1$, we use the law of total expectation to write a recursion for higher moments of the distribution of the fraction of admixture for all $k \geq 1$. Using the values for the recursion for the fraction of admixture, Equations 7 and 8, in the first generation, $g = 1$, we have (22) For $g \geq 2$, we have (23) Recalling Equation 3 and noting that $h_0 = 0$, we use the binomial theorem to simplify the recursion for the moments of $H_{1,g,\delta}$. For $g = 1$, we have (24) For $g \geq 2$, we have (25) Because $H_{1,g-1,f}$ and $H_{1,g-1,m}$ are conditionally independent given $H_{1,g-2,f}$ and $H_{1,g-2,m}$, we can simplify the $k$th moment of the distribution of the fraction of admixture from $S_1$, for $\delta \in \{f, m\}$, to give (26) For $k = 1$, Equations 24 and 26 should produce the expectation that we have already derived. For $k = 1$ in the first generation, using Equations 1, 2, and 4, Equation 24 gives (27), which matches Equation 17. For $g \geq 2$ and $k = 1$, Equation 26 gives (28), which simplifies to match Equation 19. Finally, with equal contributions in each population from females and males, so that $s^{f}_{\alpha,g} = s^{m}_{\alpha,g}$ and $h^{f}_{g} = h^{m}_{g}$, Equations 24 and 26 reduce to Equations 16 and 17 from Verdu and Rosenberg (2011).

## Variance of the Fraction of Admixture

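When $k = 2$, Equations 24 and 26 produce a recursion for the second moment of $H_{1,g,\delta}$. Recalling Equations 1–6, for $g = 1$, we have

$$\mathbb{E}[H_{1,1,\delta}^2] = \frac{s_{1,0}}{2} + \frac{s^{f}_{1,0}s^{m}_{1,0}}{2}. \tag{29}$$

For $g \geq 2$, we have

$$\begin{aligned} \mathbb{E}[H_{1,g,\delta}^2] ={}& \frac{s_{1,g-1}}{2} + \frac{s^{f}_{1,g-1}s^{m}_{1,g-1}}{2} + \frac{s^{f}_{1,g-1}h^{m}_{g-1}\,\mathbb{E}[H_{1,g-1,m}] + s^{m}_{1,g-1}h^{f}_{g-1}\,\mathbb{E}[H_{1,g-1,f}]}{2} \\ &+ \frac{h^{f}_{g-1}h^{m}_{g-1}\,\mathbb{E}[H_{1,g-1,f}]\,\mathbb{E}[H_{1,g-1,m}]}{2} + \frac{h^{f}_{g-1}\,\mathbb{E}[H_{1,g-1,f}^2] + h^{m}_{g-1}\,\mathbb{E}[H_{1,g-1,m}^2]}{4}. \end{aligned} \tag{30}$$

Recalling that $H_{1,g,f}$ and $H_{1,g,m}$ are identically distributed, Equation 30 simplifies to give

$$\mathbb{E}[H_{1,g,\delta}^2] = \frac{s_{1,g-1}}{2} + \frac{s^{f}_{1,g-1}s^{m}_{1,g-1}}{2} + \frac{s^{f}_{1,g-1}h^{m}_{g-1} + s^{m}_{1,g-1}h^{f}_{g-1}}{2}\,\mathbb{E}[H_{1,g-1,\delta}] + \frac{h^{f}_{g-1}h^{m}_{g-1}}{2}\,\mathbb{E}[H_{1,g-1,\delta}]^2 + \frac{h_{g-1}}{2}\,\mathbb{E}[H_{1,g-1,\delta}^2]. \tag{31}$$

Using the definition of the variance, $\mathbb{V}[X] = \mathbb{E}[X^2] - (\mathbb{E}[X])^2$, and Equations 17, 19, 29, and 31, for the first generation, for the variance of the fraction of admixture, we have

$$\mathbb{V}[H_{1,1,\delta}] = \frac{s^{f}_{1,0}(1 - s^{f}_{1,0}) + s^{m}_{1,0}(1 - s^{m}_{1,0})}{4}. \tag{32}$$

For all subsequent generations, $g \geq 2$, we have

$$\begin{aligned} \mathbb{V}[H_{1,g,\delta}] ={}& \frac{s^{f}_{1,g-1}(1 - s^{f}_{1,g-1}) + s^{m}_{1,g-1}(1 - s^{m}_{1,g-1})}{4} - \frac{s^{f}_{1,g-1}h^{f}_{g-1} + s^{m}_{1,g-1}h^{m}_{g-1}}{2}\,\mathbb{E}[H_{1,g-1,\delta}] \\ &+ \frac{h^{f}_{g-1}(1 - h^{f}_{g-1}) + h^{m}_{g-1}(1 - h^{m}_{g-1})}{4}\,\mathbb{E}[H_{1,g-1,\delta}]^2 + \frac{h_{g-1}}{2}\,\mathbb{V}[H_{1,g-1,\delta}]. \end{aligned} \tag{33}$$

With no sex bias, so that $s^{f}_{1,g} = s^{m}_{1,g}$ and $h^{f}_{g} = h^{m}_{g}$, Equations 32 and 33 are equivalent to Equations 22 and 23 from Verdu and Rosenberg (2011).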

The recursion for the variance of the fraction of admixture of a random individual of sex *δ* from the hybrid population is dependent on the variance from the previous generation, the expectation from the previous generation, and its square. By contrast with the expectation, the variance of the fraction of admixture depends on the sex-specific contributions from the source populations.

Equations 32 and 33 are invariant with respect to an exchange of all variables corresponding to males (superscript m) with those corresponding to females (superscript f). Thus, although the variance is affected by the sex-specific admixture contributions, it does not identify the direction of the bias. Despite the dependence of the variance of the autosomal fraction of admixture on sex-specific contributions, under the model, the symmetry demonstrates that autosomal DNA alone does not identify which sex contributes more to the hybrid population from a given source population. This result is reasonable given the non-sex-specific inheritance pattern of autosomal DNA.

## Special Case: A Single Admixture Event

Using the recursions in Equations 17, 19, 32, and 33, we can study specific cases in which the contributions are specified. We first consider the case in which the source populations $S_1$ and $S_2$ do not contribute to the hybrid population after its founding: $s_{1,g} = s_{2,g} = 0$, and $h_g = 1$, for all $g \geq 1$. As before, at the first generation, the hybrid population is not yet formed, and $h_0 = 0$. Therefore, $s_{1,0} + s_{2,0} = 1$.

Under this scenario, we can derive the exact expectation and variance of the autosomal fraction of admixture of a random individual from the hybrid population. In the case of a single admixture event, the expectation of the admixture fraction is equal to the expectation at the first generation, because the further contributions are all zero. Using Equation 19, $s_{1,g-1} = s_{2,g-1} = 0$ for all $g \geq 2$. Therefore, from Equation 17, in the case of a single admixture event, for all $g \geq 1$,

$$\mathbb{E}[H_{1,g,\delta}] = s_{1,0}. \tag{34}$$

The expectation of the autosomal fraction of admixture from $S_1$ is constant over time, and it depends on the total, not the sex-specific, contribution from the source population $S_1$. As in the general case in Equation 19, for a single admixture event, a sex bias does not affect the expectation. Because the source populations provide no further contributions after the founding generation, unlike in the general case, the mean admixture fraction does not change with time.

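Using Equations 32 and 33, because $s^{\delta}_{1,g-1} = 0$ and $h^{\delta}_{g-1} = 1$ for all $g \geq 2$, the variance of the fraction of admixture follows a geometric sequence with ratio $\tfrac{1}{2}$. For all generations $g \geq 1$,

$$\mathbb{V}[H_{1,g,\delta}] = \frac{s^{f}_{1,0}(1 - s^{f}_{1,0}) + s^{m}_{1,0}(1 - s^{m}_{1,0})}{2^{g+1}}. \tag{35}$$

For $s^{f}_{1,0} = s^{m}_{1,0} = s_{1,0}$, by Equations 1 and 2, the variance matches Equation 25 of Verdu and Rosenberg (2011).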

With a single admixture event, the variance decreases monotonically, and its limit is zero for all parameter values. Individuals from the hybrid population mate only within the population, decreasing the variance by a factor of 2 each generation. Thus, Equation 35 predicts that the distribution of the admixture fraction for a random individual in the hybrid population contracts around the mean, converging to a constant equal to the mean admixture from the first generation.

In Equation 35, considering all possible pairs $(s^{f}_{1,0}, s^{m}_{1,0})$, with each entry in [0, 1], the maximal $\mathbb{V}[H_{1,g,\delta}]$ occurs at $(\tfrac{1}{2}, \tfrac{1}{2})$, a scenario with equal contributions from the two source populations and no sex bias. At the maximum, the variance is $1/2^{g+2}$. Four minima occur, at $(s^{f}_{1,0}, s^{m}_{1,0}) = (0, 0)$, (0, 1), (1, 0), and (1, 1), cases in which all individuals in generation $g = 1$ have the same pair of source populations for their two parents, and in later generations, all individuals continue to have the same value of $H_{1,g,\delta}$. In these cases, $\mathbb{V}[H_{1,g,\delta}] = 0$.

Figure 3 plots the variance in Equation 35 as a function of the sex-specific parameters $s^{f}_{1,0}$ and $s^{m}_{1,0}$ for three values of $g$. For $g = 1$, a maximum of $\tfrac{1}{8}$ occurs at $(s^{f}_{1,0}, s^{m}_{1,0}) = (\tfrac{1}{2}, \tfrac{1}{2})$, and a minimum, 0, at (0, 0), (0, 1), (1, 0), or (1, 1). After one generation of mixing within the hybrid population, with no further contributions from the source populations, the maximum and minima occur at the same values of $(s^{f}_{1,0}, s^{m}_{1,0})$, but the variance is halved (Figure 3B). That is, for a given set of values $(s^{f}_{1,0}, s^{m}_{1,0})$, $\mathbb{V}[H_{1,2,\delta}] = \mathbb{V}[H_{1,1,\delta}]/2$. Similarly, for $g = 8$ in Figure 3C, $\mathbb{V}[H_{1,8,\delta}] = \mathbb{V}[H_{1,1,\delta}]/2^{7}$. By $g = 8$, the hybrid population is quite homogeneous in admixture, and the variance of the admixture fraction has decreased to near zero for all sets of founding parameters. Therefore, the admixture fraction distribution is close to constant, with $H_{1,8,\delta} \approx s_{1,0}$.

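We can analyze the dependence of the variance on the sex-specific parameters by considering constant total contributions $s_{1,0}$ and allowing the sex-specific contributions to vary, constrained by Equation 1 so that $s^{f}_{1,0} + s^{m}_{1,0} = 2s_{1,0}$. Rewriting Equation 35 in terms of $s_{1,0}$ and $s^{f}_{1,0}$,

$$\mathbb{V}[H_{1,g,\delta}] = \frac{s_{1,0}(1 - s_{1,0}) - (s^{f}_{1,0} - s_{1,0})^2}{2^{g}}. \tag{36}$$

From this expression, it is possible to observe that given a constant $s_{1,0}$ in [0, 1], the maximal variance is produced when $s^{f}_{1,0} = s^{m}_{1,0} = s_{1,0}$. The minimal variance occurs when $(s^{f}_{1,0}, s^{m}_{1,0}) = (0, 2s_{1,0})$ or $(2s_{1,0}, 0)$, for $s_{1,0} \leq \tfrac{1}{2}$, or $(1, 2s_{1,0} - 1)$ or $(2s_{1,0} - 1, 1)$, for $s_{1,0} \geq \tfrac{1}{2}$. This minimum takes the value 0 only when $s_{1,0}$ equals 0, $\tfrac{1}{2}$, or 1.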

For the specific case of $s_{1,0} = \tfrac{1}{2}$, the total contribution for which the maximal variance occurs in Figure 3, we illustrate the variance at several locations in the allowed range for $s^{f}_{1,0}$ and $s^{m}_{1,0}$ (Figure 4). Four scenarios are plotted with the same total founding contribution from source population 1, $s_{1,0} = \tfrac{1}{2}$, but with different levels of sex bias. As the female and male contributions become increasingly different, the initial variance decreases. The largest variance for $s_{1,0} = \tfrac{1}{2}$ occurs at $s^{f}_{1,0} = s^{m}_{1,0} = \tfrac{1}{2}$, with no sex bias. The minimum occurs when males all come from one source population and females all from the other. In this extreme sex-biased case, the variance is zero constantly over time, as each individual has a male parent from one population, a female parent from the other, and an admixture fraction of $\tfrac{1}{2}$.
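The effect of founding sex bias in the single-event case can be tabulated with a few lines of code. The closed form used below is an assumption stated here explicitly: the first-generation variance of an average of two independent parental origins, one with probability `s1f0` and one with probability `s1m0` of being $S_1$, halved in each later generation. The function names are hypothetical.

```python
from fractions import Fraction


def variance_single_event(s1f0, s1m0, g):
    """Variance of the admixture fraction g generations after a single event."""
    return (s1f0 * (1 - s1f0) + s1m0 * (1 - s1m0)) / 2 ** (g + 1)


def sweep_sex_bias(s1_0, female_values, g=1):
    """Rows (s1f0, s1m0, variance) at a fixed total founding contribution s1_0."""
    rows = []
    for s1f0 in female_values:
        s1m0 = 2 * s1_0 - s1f0      # the total is the average of the two sexes
        if not 0 <= s1m0 <= 1:
            continue                # outside the allowed parameter range
        rows.append((s1f0, s1m0, variance_single_event(s1f0, s1m0, g)))
    return rows


rows = sweep_sex_bias(Fraction(1, 2), [Fraction(k, 4) for k in range(5)])
for s1f0, s1m0, variance in rows:
    print(s1f0, s1m0, variance)
```

At a total founding contribution of 1/2, the variance is largest with no sex bias and falls to zero when all females come from one source population and all males from the other.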

## Special Case: Constant Nonzero Contributions

Next, we consider the case in which an initial admixture event founds the hybrid population and is then followed by constant nonzero contributions from the source populations. After the founding, for each $g \geq 1$, all admixture parameters are constant in time: $s^{\delta}_{\alpha,g} = s^{\delta}_{\alpha}$ for each $\alpha \in \{1, 2\}$ and $\delta \in \{f, m\}$, and $h^{\delta}_{g} = h^{\delta}$ for each $\delta$. Thus, we have parameter values for the founding, $s^{\delta}_{\alpha,0}$, and constant continuing admixture parameters $s^{f}_{1}$, $s^{m}_{1}$, $s^{f}_{2}$, and $s^{m}_{2}$. Each parameter takes its value in [0, 1], as do $s_1$ and $s_2$. By contrast, $h$ takes its value in (0, 1). The case of $h = 1$ is a single admixture event, analyzed above. The $h = 0$ case is trivial because the hybrid population is refounded at each generation, and the distribution of the admixture fraction thus depends only on the contribution in the previous generation. Therefore, we require $s_1 + s_2 \neq 0$ and $s_1 + s_2 \neq 1$. Individually, however, $h^{f}$ and $h^{m}$ can each vary in [0, 1], as long as they are not both zero or one.

The recursion for the expectation of the autosomal fraction of admixture, Equations 17 and 19, is equivalent to that derived by Verdu and Rosenberg (2011). Therefore, the closed form of the expectation is equivalent as well. From Verdu and Rosenberg (2011, Equation 30) we have

$$\mathbb{E}[H_{1,g,\delta}] = s_{1,0}h^{g-1} + s_1\frac{1 - h^{g-1}}{1 - h}. \tag{37}$$

We can use the same method as Verdu and Rosenberg (2011) to simplify the second moment. Under the special case of constant contributions across generations, for $g = 1$, Equation 29 gives

$$\mathbb{E}[H_{1,1,\delta}^2] = \frac{s_{1,0}}{2} + \frac{s^{f}_{1,0}s^{m}_{1,0}}{2}. \tag{38}$$

For $g \geq 2$, Equation 31 gives

$$\mathbb{E}[H_{1,g,\delta}^2] = \frac{s_1}{2} + \frac{s^{f}_{1}s^{m}_{1}}{2} + \frac{s^{f}_{1}h^{m} + s^{m}_{1}h^{f}}{2}\,\mathbb{E}[H_{1,g-1,\delta}] + \frac{h^{f}h^{m}}{2}\,\mathbb{E}[H_{1,g-1,\delta}]^2 + \frac{h}{2}\,\mathbb{E}[H_{1,g-1,\delta}^2]. \tag{39}$$

Because this equation is a nonhomogeneous first-order recurrence with the form (40), we can use Theorem 3.1.2 of Cull *et al.* (2005) to solve for a unique solution for $\mathbb{E}[H_{1,g,\delta}^2]$, as in Verdu and Rosenberg (2011). For the initial condition, we have (41). We define $\lambda = (h^{f} + h^{m})/4 = h/2$, and for all $g \geq 2$, we have (42). Using the expected admixture fraction from Equation 37, we can simplify Equation 42. For all $g \geq 2$, (43). Therefore, using Theorem 3.1.2 of Cull *et al.* (2005), we have a unique solution for $\mathbb{E}[H_{1,g,\delta}^2]$: (44). Equation 44 can be simplified by separating the sum and summing the resulting geometric series, (45), where $\alpha_0$ is defined in Equation 41, and $A_1$, $A_2$, $A_3$, and $A_4$ are given by Equations 46–49. When $s^{f}_{1} = s^{m}_{1}$ and $h^{f} = h^{m}$, $A_1$, $A_2$, $A_3$, and $A_4$ are equal to the corresponding quantities in Verdu and Rosenberg (2011, Equations 39–42). Therefore, without sex bias, the closed form of the second moment of the admixture fraction, Equation 45, is equal to Equation 38 in Verdu and Rosenberg (2011).
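As a quick consistency check on the closed form of the expectation, the sketch below compares it with direct iteration of the recursion under constant contributions. It assumes the recursion $E_g = s_1 + h E_{g-1}$ with $E_1 = s_{1,0}$, whose solution is a geometric sum; the function names and the numerical values are hypothetical.

```python
from fractions import Fraction


def expectation_recursive(s1_0, s1, h, g):
    """Iterate E_g = s1 + h * E_{g-1}, starting from E_1 = s1_0."""
    e = s1_0
    for _ in range(g - 1):
        e = s1 + h * e
    return e


def expectation_closed_form(s1_0, s1, h, g):
    """Geometric-sum solution of the same recursion; requires h != 1."""
    return s1_0 * h ** (g - 1) + s1 * (1 - h ** (g - 1)) / (1 - h)


s1_0, s1, h = Fraction(1, 2), Fraction(1, 10), Fraction(3, 5)
pairs = [
    (expectation_recursive(s1_0, s1, h, g), expectation_closed_form(s1_0, s1, h, g))
    for g in range(1, 9)
]
```

With these values the two computations agree exactly at every generation, and the expectation approaches $s_1/(1 - h) = s_1/(s_1 + s_2) = 1/4$ as $g$ grows.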

Using the relation $\mathbb{V}[X] = \mathbb{E}[X^2] - (\mathbb{E}[X])^2$ and Equations 37 and 45, for the variance of the autosomal fraction of admixture, we have (50). Two special cases of Equation 50 are given in Equations 51 and 52. Equations 50–52 simplify to Equations 43–45 of Verdu and Rosenberg (2011) when $s^{f}_{1} = s^{m}_{1}$ and $h^{f} = h^{m}$.

### Limiting variance of admixture over time

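Figure 5 illustrates the variance of the autosomal fraction of admixture as a function of $g$ when the contributions from the source populations are constant over time, computed using Equation 50. The figure shows that if the continuing contributions are held constant, then the long-term limiting variance does not depend on the founding parameters. Unlike in the hybrid isolation case, with constant, nonzero contributions from the source populations over time, $h \neq 0$ and $h \neq 1$, a nonzero limit is reached. Applying Equation 50,

$$\lim_{g \to \infty} \mathbb{V}[H_{1,g,\delta}] = \frac{1}{2(2 - h)}\left[ s^{f}_{1}(1 - s^{f}_{1}) + s^{m}_{1}(1 - s^{m}_{1}) - \frac{2s_1(s^{f}_{1}h^{f} + s^{m}_{1}h^{m})}{1 - h} + \frac{s_1^2\left[h^{f}(1 - h^{f}) + h^{m}(1 - h^{m})\right]}{(1 - h)^2} \right], \tag{53}$$

with $s_1 = (s^{f}_{1} + s^{m}_{1})/2$ and $h = (h^{f} + h^{m})/2$, which does not depend on the founding parameters. The limit matches that of Verdu and Rosenberg (2011, Equation 46) in the absence of sex bias ($s^{f}_{1} = s^{m}_{1} = s_1$, $h^{f} = h^{m} = h$, and $s^{f}_{2} = s^{m}_{2} = s_2$).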

### The maxima and minima of the limiting variance

Using Equations 1, 2, 4, and 46, the limit in Equation 53 can be equivalently written in terms of the two female sex-specific contributions, $s^{f}_{1}$ and $s^{f}_{2}$, and the total contributions from the two source populations, $s_1$ and $s_2$. Considering admixture scenarios with constant $s_1$, $s_2$, with $s_1 + s_2 \in (0, 1]$, but allowing $s^{f}_{1}$ and $s^{f}_{2}$ to range over the closed unit interval, the limiting variance depends on two independent parameters, $(s^{f}_{1}, s^{f}_{2})$, subject to the constraint in Equation 1:

$$\lim_{g \to \infty} \mathbb{V}[H_{1,g,\delta}] = \frac{s_1s_2(s_1 + s_2) - (s^{f}_{1}s_2 - s_1s^{f}_{2})^2}{(s_1 + s_2)^2(1 + s_1 + s_2)}. \tag{54}$$

Treating $s_1$ and $s_2$ as constants in [0, 1], the critical points of Equation 54 are the same as those of

$$f(s^{f}_{1}, s^{f}_{2}) = -(s^{f}_{1}s_2 - s_1s^{f}_{2})^2. \tag{55}$$

First we consider the maximum. Because $f$ is always negative or zero, the maximal variance given $s_1$ and $s_2$ occurs when $f = 0$, which occurs on the line $s^{f}_{1}s_2 = s_1s^{f}_{2}$. Equivalently, recalling Equation 1,

$$s^{f}_{1}s^{m}_{2} = s^{m}_{1}s^{f}_{2}. \tag{56}$$

Equation 56 has many solutions for given $s_1$ and $s_2$. One solution is $s^{f}_{1} = s_1$ and $s^{f}_{2} = s_2$, which by Equation 1 is equivalent to $s^{f}_{1} = s^{m}_{1}$ and $s^{f}_{2} = s^{m}_{2}$. Therefore, the limiting variance of the admixture fraction is maximized when there is no sex bias. Figure 6 plots two examples of the variance for constant $s_1$ and $s_2$, but increasingly different sex-specific contributions from the source populations. In Figure 6, A and B, the admixture history with no sex bias produces the greatest limit.

For fixed $s_1$ and $s_2$, however, the case without sex bias is not the only maximum of the limiting variance. Figure 7 plots the variance over time for four different admixture histories, each with the same total contributions $s_1$ and $s_2$, but quite different sex-specific contributions $(s^{f}_{1}, s^{m}_{1}, s^{f}_{2}, s^{m}_{2})$. Each of the four scenarios plotted reaches the same limit because each provides a solution to Equation 56. Because $f = 0$, Equation 54 depends only on the total contributions $s_1$ and $s_2$. For constant $s_1$ and $s_2$, any admixture history whose contributions solve $s^{f}_{1}s^{m}_{2} = s^{m}_{1}s^{f}_{2}$ has limiting variance

$$\lim_{g \to \infty} \mathbb{V}[H_{1,g,\delta}] = \frac{s_1s_2}{(s_1 + s_2)(1 + s_1 + s_2)}. \tag{57}$$

This maximal limiting variance depends on the total contributions from the source populations, but not on the sex-specific contributions; it is equivalent to Equation 47 of Verdu and Rosenberg (2011).

Thus far, we have considered the maximal limiting variance as a function of the sex-specific parameters given constant total contributions $s_1$ and $s_2$. We can also identify the values of $s_1$ and $s_2$ that maximize the limiting variance, considering all $s_1, s_2 \in [0, 1]$. For each choice of $s_1$ and $s_2$, the maximal variance over values of $s^{f}_{1}$ and $s^{f}_{2}$ is given by Equation 57. We can therefore find the $s_1$ and $s_2$ that maximize Equation 57. As shown by Verdu and Rosenberg (2011), given $s_1 + s_2$, the maximal limiting variance occurs when $s_1 = s_2$. Over the range of possible choices for $s_1 + s_2 \in (0, 1)$, the maximum occurs when $s_1 = s_2 = \tfrac{1}{2}$. Unlike in Verdu and Rosenberg (2011), however, this maximum requires the sex-specific contributions to solve $s^{f}_{1}s^{m}_{2} = s^{m}_{1}s^{f}_{2}$.

Interestingly, one of the *minima* of the limiting variance occurs when $s_1 = s_2 = \tfrac{1}{2}$, but with $s^{f}_{1} \neq s^{m}_{1}$. Specifically, when $s_1 = s_2 = \tfrac{1}{2}$, but all males come from one source population and all females from the other, $(s^{f}_{1}, s^{m}_{1}, s^{f}_{2}, s^{m}_{2}) = (1, 0, 0, 1)$ or (0, 1, 1, 0), the limiting variance in Equation 54 is zero. In this case, $L = S_1S_2$ or $L = S_2S_1$ for every individual in the hybrid population. By Equation 8, the hybrid population is founded anew at each generation, each individual having admixture fraction $\tfrac{1}{2}$.

More generally, given *s*_{1} and *s*_{2}, the minimum occurs when one source population contributes only females and the other contributes only males. Given *s*_{1} and *s*_{2}, the limiting variance is minimized with respect to the sex-specific contributions when *f* is smallest (Equation 54). Because *f* is the negative of the square of a difference of products, it is smallest when one term is zero and the other is at its maximum, as at these two configurations. These points represent the maximal sex bias for fixed (*s*_{1}, *s*_{2}).

If we allow *s*_{1} and *s*_{2} to vary, because a variance is bounded below by zero, any set of parameters that produces zero variance is a minimum. In Equation 54, if either *s*_{1} = 0 or *s*_{2} = 0, then the limiting variance of the admixture fraction is zero. When only one population contributes after the founding, in the limit, all ancestry in the hybrid population traces to that population.

### Properties of the limiting variance

The limiting variance of the fraction of admixture over time in Equation 53 is a function of the sex-specific contributions from the hybrid population, *h*^{f} and *h*^{m}, and source population 1, *s*_{1}^{f} and *s*_{1}^{m}. Recalling Equation 4, the limiting variance is equivalently written as a function of the sex-specific contributions from source population 2, *s*_{2}^{f} and *s*_{2}^{m}, and either source population 1 (Equation 54), or the hybrid population. It can be viewed as a function of all six sex-specific parameters (*s*_{1}^{f}, *s*_{1}^{m}, *s*_{2}^{f}, *s*_{2}^{m}, *h*^{f}, *h*^{m}), four of which can be selected while assigning the other two by the constraint from Equation 4.

We can therefore analyze the behavior of the limiting variance as a function of two of the sex-specific parameters by specifying two other parameters and allowing the final two parameters, one female and one male, to vary according to Equation 4. Of the four parameters we consider, using the constraint from Equation 4 separately in males and females, two must be male and two must be female. Because the variance is invariant with respect to exchanging the source populations or the sexes, the six-dimensional parameter space has a number of symmetries. Figure 8, Figure 9, Figure 10, Figure 11, and Figure 12 examine the five possible, nonredundant ways of choosing two populations and the corresponding male and female parameters from those populations and holding two corresponding parameters fixed (either from the same sex in the two populations, or for males and females from one population) while allowing the other two to vary. Figure 13 then highlights an informative case that considers the limiting variance in terms of a male and a female parameter from different populations.

Each figure shows multiple contour plots of the limiting variance as a function of two sex-specific parameters, for fixed values of two other parameters. Three cases plot the limiting variance as a function of the female and male parameters from a given population, with the female and male contributions of another population specified. In two other cases, parameters for a single sex from two populations are plotted, specifying the contributions from the other sex for those populations.

By considering these parameter combinations, we can examine the dependence of the variance on sex-specific parameters and parameter interactions, as well as potential bounds on both the parameters and the variance. We highlight a number of symmetries in the limiting variance. The plots also illustrate the maxima and minima found in the previous section.

#### Properties of the limiting variance in terms of *s*_{1}^{f} and *s*_{1}^{m}:

In Figure 8, we consider the variance of the fraction of admixture as a function of *s*_{1}^{f}, the female contribution from *S*_{1}, on the *x*-axis, and *s*_{1}^{m}, the male contribution from *S*_{1}, on the *y*-axis, computed using Equation 53. We plot the variance for fixed *h*^{f}, the female contribution from *H*, and *h*^{m}, the male contribution from *H*. The domain for *s*_{1}^{f} and *s*_{1}^{m} is constrained by Equation 4, with *s*_{1}^{f} taking values in [0, 1 − *h*^{f}], and *s*_{1}^{m} taking values in [0, 1 − *h*^{m}].

Figure 8, top left, shows the variance as a function of *s*_{1}^{f} and *s*_{1}^{m}, with *h*^{f} = *h*^{m} = *h* = 0. Here, the hybrid population is founded anew by the source populations each generation, and *s*_{1}^{f} and *s*_{1}^{m} both take values from the full domain [0, 1]. For *h*^{f} = *h*^{m} = *h* = 0, the maximal limiting variance is 1/8, occurring when *s*_{1}^{f} = *s*_{1}^{m} = 1/2. At this maximum, given Equation 4 and *h*^{f} = *h*^{m} = 0, we have *s*_{2}^{f} = *s*_{2}^{m} = 1/2. As in Equation 54, the maximum occurs when female and male contributions from the source populations are equal and the total contributions from the source populations are equal.

The minima of the limiting variance occur at the four corners of the plot. At the origin, when *s*_{1}^{f} = *s*_{1}^{m} = 0, the limiting variance is zero because only *S*_{2} contributes to the hybrid population. Individuals in the hybrid population all have parents *L* = *S*_{2}*S*_{2} and admixture fraction zero (Equation 8). By exchanging *S*_{1} for *S*_{2}, the case of *s*_{1}^{f} = *s*_{1}^{m} = 1 is similar. Additional minima occur at (*s*_{1}^{f}, *s*_{1}^{m}) = (1, 0) or (0, 1), where all males come from one source population and all females from the other, and all individuals at the next generation of the hybrid population have admixture fraction 1/2 (Equation 8).

For *h*^{f} = *h*^{m} = *h* = 0, the limiting variance is symmetrical over the line *s*_{1}^{f} = *s*_{1}^{m}, as a result of the symmetry between males and females in the variance (Equation 33). Because the hybrid population provides no contribution and the variance of the fraction of admixture is symmetric with respect to source population, the variance is also symmetric over the lines *s*_{1}^{f} = 1/2 and *s*_{1}^{m} = 1/2.
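The case *h*^{f} = *h*^{m} = 0 is simple enough to enumerate directly. The sketch below is an illustration under stated assumptions, not part of the original analysis: it assumes that the mother comes from *S*_{1} with probability `s1f` and the father with probability `s1m`, independently, and that the admixture fraction is the mean of the two parental indicators. The function names and the comparison form in `circular_form` are ours.

```python
import math
from itertools import product


def founded_anew_variance(s1f, s1m):
    """Variance of the admixture fraction when the hybrid population
    is founded anew each generation (assumed: hf = hm = 0)."""
    mean = 0.0
    second = 0.0
    # A parent from S1 has admixture fraction 1; a parent from S2 has 0.
    for mother, father in product((0, 1), repeat=2):
        p_mother = s1f if mother == 1 else 1 - s1f
        p_father = s1m if father == 1 else 1 - s1m
        prob = p_mother * p_father       # parents drawn independently
        fraction = (mother + father) / 2
        mean += prob * fraction
        second += prob * fraction**2
    return second - mean**2


def circular_form(s1f, s1m):
    """Equivalent expression with circular contours centered at (1/2, 1/2)."""
    return 1 / 8 - ((s1f - 0.5) ** 2 + (s1m - 0.5) ** 2) / 4


for s1f, s1m in [(0.5, 0.5), (0.0, 0.0), (1.0, 0.0), (0.2, 0.7)]:
    v = founded_anew_variance(s1f, s1m)
    agrees = math.isclose(v, circular_form(s1f, s1m), abs_tol=1e-12)
    print(s1f, s1m, v, agrees)
```

Under these assumptions the variance depends on the two parameters only through their squared distances from 1/2, which accounts for both the reflection symmetries and the circular contours in the top-left cell.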

The columns of Figure 8 consider increasing, fixed values for *h*^{f}, and the rows consider increasing, fixed values for *h*^{m}, both from {0, 0.25, 0.5, 0.75, 0.95}. All the cells maintain the general shape of the limiting variance as a function of *s*_{1}^{f} and *s*_{1}^{m} seen for *h*^{f} = *h*^{m} = 0. However, as the domain for *s*_{1}^{f} and *s*_{1}^{m} shrinks with increasing *h*^{f} and *h*^{m}, the location of the maximal variance changes across cells. In all cases, the maximum of the limiting variance occurs when *s*_{1}^{f} and *s*_{1}^{m} each lie at the midpoints of their respective domains, *s*_{1}^{f} = (1 − *h*^{f})/2 and *s*_{1}^{m} = (1 − *h*^{m})/2. The magnitude of the limiting variance at each maximum decreases as its location moves away from *s*_{1}^{f} = *s*_{1}^{m} = 1/2.
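The location of the maximum in each cell can be explored with a coarse grid search. This is a sketch under our own assumptions rather than a reproduction of Figure 8: the limiting moments are obtained as the fixed point of a moment recursion that we derived for independently drawn parents, the sum-to-one constraint on each sex is taken from the description of Equation 4, and the function names and grid size are hypothetical.

```python
def limiting_variance(s1f, s1m, hf, hm):
    """Limiting variance as the fixed point of an assumed moment recursion.

    Assumption (ours): mother and father are drawn independently, and the
    S2 contributions are implied by s2 = 1 - s1 - h in each sex.
    """
    s1 = (s1f + s1m) / 2
    h = (hf + hm) / 2
    mean = s1 / (1 - h)                # fixed point of mean = s1 + h * mean
    mother_mean = s1f + hf * mean      # expected fraction of a random mother
    father_mean = s1m + hm * mean      # expected fraction of a random father
    second = (s1 + mother_mean * father_mean) / (2 - h)
    return second - mean**2


def argmax_on_grid(hf, hm, n=41):
    """Return (variance, i, j) at the grid point with the largest variance."""
    best = None
    for i in range(n):
        for j in range(n):
            s1f = (1 - hf) * i / (n - 1)   # domain [0, 1 - hf]
            s1m = (1 - hm) * j / (n - 1)   # domain [0, 1 - hm]
            v = limiting_variance(s1f, s1m, hf, hm)
            if best is None or v > best[0]:
                best = (v, i, j)
    return best


levels = (0, 0.25, 0.5, 0.75, 0.95)
results = {(hf, hm): argmax_on_grid(hf, hm) for hf in levels for hm in levels}
for (hf, hm), (v, i, j) in results.items():
    print(hf, hm, round(v, 5), i, j)
```

With an odd number of grid points per axis, index 20 of 41 is the midpoint of each domain, so the reported indices can be compared directly with the midpoint claim.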

In all the cells, the minimum occurs when *s*_{1}^{f} and *s*_{1}^{m} are either both zero or they lie at the maxima of their respective domains. In these cases, only one source population contributes to the hybrid population, and all individuals in the hybrid population have an admixture fraction from *S*_{1} of either 0, when *s*_{1}^{f} = *s*_{1}^{m} = 0, or 1, when *s*_{1}^{f} = 1 − *h*^{f} and *s*_{1}^{m} = 1 − *h*^{m}. The limiting variance is no longer zero at the two corners of each plot where only one of *s*_{1}^{f} and *s*_{1}^{m} is at the maximum of its domain; these corners, however, are minima of the variance given the values of *s*_{1} and *s*_{2}. In these cases, males all come from one source population and females from the other, producing a minimum of Equation 54 for fixed *s*_{1} and *s*_{2}.

As in the case of *h*^{f} = *h*^{m} = 0, each cell is symmetrical in reflecting over both the midpoint of the *x*-axis, *s*_{1}^{f} = (1 − *h*^{f})/2, and that of the *y*-axis, *s*_{1}^{m} = (1 − *h*^{m})/2. The limiting variance is symmetrical with respect to source population (Equation 54), and this pair of reflections corresponds to an exchange of source populations. For *h*^{f} = *h*^{m} = 0, the limiting variance has circular contours, but as the contributions from the hybrid population increase, the contours become elliptical.

In Figure 8, cells on the diagonal have equal contributions from males and females in the hybrid population, *h*^{f} = *h*^{m} = *h*. For *h*^{f} ≠ *h*^{m}, cells above the diagonal are equivalent to those below the diagonal with an exchange of female for male contributions, for both *s*_{1} and *h*. For example, the cell with *h*^{f} = 0.25 and *h*^{m} = 0.5 is equivalent to the cell with *h*^{f} = 0.5 and *h*^{m} = 0.25 if the axes are also switched so that *s*_{1}^{m} appears along the *x*-axis and *s*_{1}^{f} is on the *y*-axis.

Figure 9 plots the limiting variance as a function of *s*_{1}^{f} on the *x*-axis and *s*_{1}^{m} on the *y*-axis, but we now fix values of *s*_{2}^{f} by column and *s*_{2}^{m} by row using Equations 54 and 1. The maxima and minima occur at the same parameter values found in Figure 8, but they appear in different locations on the plots. For example, in Figure 9, the global maximum across cells occurs in the plot with *s*_{2}^{f} = *s*_{2}^{m} = 0.5 specified, and *s*_{1}^{f} = *s*_{1}^{m} = 0.5. By Equation 4, this location has *h*^{f} = *h*^{m} = 0, the cell with the maximal variance in Figure 8. In Figure 9, top left, the variance is zero for all *s*_{1}^{f} and *s*_{1}^{m}, because *s*_{2} = 0 (Equation 54).

Whereas all cells in Figure 8 are symmetric in reflecting over the midpoints of both domains, in Figure 9, only the cells with *s*_{2}^{f} = *s*_{2}^{m} are symmetric over the line *s*_{1}^{f} = *s*_{1}^{m}. However, the symmetry corresponding to transposing males and females is visible in that a cell above the diagonal and its corresponding cell below the diagonal are equivalent if the axes for *s*_{1}^{f} and *s*_{1}^{m} are switched.

#### Properties of the limiting variance in terms of *h*^{f} and *h*^{m}:

Similarly to Figure 8, Figure 10 considers the limiting variance of the admixture fraction over time as a function of the four variables *s*_{1}^{f}, *s*_{1}^{m}, *h*^{f}, and *h*^{m} using Equation 53. In Figure 10, each cell shows *h*^{f} on the *x*-axis and *h*^{m} on the *y*-axis, for *s*_{1}^{f} and *s*_{1}^{m} specified, with the domains of *h*^{f} and *h*^{m} constrained by Equation 4. The cells on the diagonal have *s*_{1}^{f} = *s*_{1}^{m}, and there is a symmetry over this line of cells in that if values of *s*_{1}^{f} and *s*_{1}^{m} are switched, then the cells will be equivalent with a transposition of the axes.

In Figure 10, top left, *s*_{1}^{f} = *s*_{1}^{m} = 0, and the limiting variance is a constant zero. In Figure 10, the maximal variance occurs at the origin (*h*^{f} = *h*^{m} = *h* = 0) of the cell with *s*_{1}^{f} = *s*_{1}^{m} = 0.5. As in Figure 8, at the maximum, by Equation 4, *s*_{2}^{f} = *s*_{2}^{m} = 0.5. In this case, females and males contribute equally. Both source populations contribute maximally to pull the distribution of the fraction of admixture toward the extremes of zero and one.

Because the limiting variance is symmetrical with respect to source population, and recalling Equation 4, each cell in Figure 10 is equivalent to a corresponding cell in Figure 9 reflected along both the *x*-axis and *y*-axis. For example, the cell in Figure 10 with and is equivalent to the Figure 9 cell with and if reflected on both the *x*- and *y*-axes.

Figure 8, Figure 9, and Figure 10 illustrate that the global maximum of the limiting variance occurs when the two source populations contribute equally, the contributions from the two sexes are equal, and the hybrid population does not contribute to the next generation. As the parameters move from the location of the maximal limiting variance to the minimum, the variance monotonically decreases.

#### Properties of the limiting variance in terms of *s*_{1}^{f}, *h*^{f}, and *s*_{2}^{f}:

Next we plot on the *x*- and *y*-axes two parameters of the same sex from different populations. Because the variance is invariant with respect to transposition of females and males, we consider only females without loss of generality. Figure 11 plots the limiting variance as a function of *s*_{1}^{f} on the *x*-axis and *h*^{f} on the *y*-axis, for fixed values of *s*_{1}^{m} and *h*^{m}. Figure 12 plots the limiting variance as a function of *s*_{1}^{f} and *s*_{2}^{f}, for fixed *s*_{1}^{m} and *s*_{2}^{m}. For Figure 11 and Figure 12, the domains of *s*_{1}^{f}, *s*_{2}^{f}, and *h*^{f} are constrained by Equation 4.

For Figure 11, the maximal limiting variance occurs in the cell with and *h*^{m} = 0, at . By Equation 4, this location is the same parameter set for the maximum in Figure 8, Figure 9, and Figure 10. The maximum in each cell occurs when , but the magnitude of the variance decreases with increased distance from the cell with fixed and *h*^{m} = 0. Similarly, within each cell, the limiting variance decreases with distance from .

In the first column of Figure 11, where , the line produces zero variance because the hybrid population is homogenous, with only one source population contributing. Similarly, in cells with , as by Equation 4, the line has minimal variance.

In Figure 12, because of the symmetry in sex in Equation 54, the cells above and below those where *s*_{1}^{m} = *s*_{2}^{m} are equivalent with a transposition of axes. As in Figure 8, Figure 9, Figure 10, and Figure 11, the maximal variance occurs in the cell with *s*_{1}^{m} = *s*_{2}^{m} = 0.5 at *s*_{1}^{f} = *s*_{2}^{f} = 0.5. Also, similar to Figure 11, in the first column, when *s*_{1}^{m} = 0, the line *s*_{1}^{f} = 0 is a minimum; in the first row, when *s*_{2}^{m} = 0, the line *s*_{2}^{f} = 0 is a minimum.

Analogous to the similarity between Figure 9 and Figure 10, by Equation 4, each cell in Figure 12 is a transformation of a cell in Figure 11. For example, for the cell with and *h*^{m} = 0.5 specified in Figure 11, because the male contributions sum to one, this cell also specifies . Therefore, we can compare this cell to the cell with in Figure 12. Both show on the *x*-axis, and using Equation 4, we can rewrite the *y*-axis in Figure 12 as .

#### Properties of the limiting variance in terms of noncorresponding parameters:

Finally, we consider a case in which males from one population in (*S*_{1}, *S*_{2}, *H*) are compared to females from a different population. Although multiple parameter configurations are possible, we plot one that is particularly informative, providing a perspective on Equation 54 beyond the observations visible in Figure 8, Figure 9, Figure 10, Figure 11, and Figure 12. Figure 13 plots the limit of the variance of the admixture fraction as a function of on the *x*-axis and on the *y*-axis, for fixed and . We rewrite Equation 54 as a function of , and using Equation 1: (58) The limit depends on products of sex-specific parameters, including , as can be seen in the shape of the contours in Figure 13, but not in the analogous plots in Figure 9.

## Discussion

Our model demonstrates the potential informativeness of autosomal DNA in the study of sex-biased admixture histories. Under a framework in which admixture occurs over time, potentially with different male and female contributions from the source populations, we have derived recursive expressions for the expectation, variance, and higher moments of the fraction of autosomal admixture. For the special case of constant admixture over time, we have analyzed the behavior of the variance of the admixture fraction. Although the expectation of the autosomal admixture fraction depends only on the total contributions from the source populations, we found that the variance of the autosomal admixture can contain a signature of sex-specific contributions. In particular, for constant admixture over time, the variance of the autosomal admixture fraction decreases as the male and female contributions become increasingly unequal.

That autosomal DNA possesses a signature of sex-biased admixture might at first appear counterintuitive, as unlike the sex chromosomes, autosomes are carried equally in both sexes. The phenomenon can, however, be understood by analogy with the well-known result that increasing sex bias decreases the effective size of populations (Wright 1931; Crow and Dennison 1988; Caballero 1994; Hartl and Clark 2007). In a computation of effective size using the coalescent, for example (Nordborg and Krone 2002; Ramachandran *et al.* 2008), the sex bias causes pairs of genetic lineages to be likely to find common ancestors more recently than in a non-sex-biased population, as the reduced chance of a coalescence in the sex that represents a larger fraction of the breeding population is outweighed by the greater chance of a coalescence in the less populous sex. In a similar manner, if admixture is sex-biased, because lineages are more likely to travel along paths through populations with the larger sex-specific contributions, the variability of genealogical paths—and hence, the variance of the admixture fraction—is reduced compared to the non-sex-biased case.
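The effective-size analogy can be made concrete with the standard formula for a population with unequal numbers of breeding males and females, *N*_{e} = 4*N*_{m}*N*_{f}/(*N*_{m} + *N*_{f}) (Wright 1931). The short sketch below evaluates it for a fixed census of 100 breeders; the function name and the example counts are ours and are chosen only for illustration.

```python
def effective_size(n_males, n_females):
    """Effective population size with unequal numbers of breeding
    males and females: 4 * Nm * Nf / (Nm + Nf)."""
    if n_males <= 0 or n_females <= 0:
        raise ValueError("both sexes must have at least one breeder")
    return 4 * n_males * n_females / (n_males + n_females)


# Hypothetical populations of 100 breeders with increasing sex bias.
for n_males, n_females in [(50, 50), (25, 75), (10, 90), (1, 99)]:
    print(n_males, n_females, effective_size(n_males, n_females))
```

As the sex ratio among breeders departs from equality, the effective size falls below the census size, in the same direction as the decrease in the limiting variance with increasing sex bias.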

Autosomal DNA, with its multitude of independent loci, potentially provides more information about the complex histories of hybrid populations, and the autosomal genome might be less susceptible to locus-specific selective pressures than the sex chromosomes. To take advantage of autosomal information, many recent efforts to study sex-biased demography have compared autosomal DNA with the X chromosome (Ramachandran *et al.* 2004, 2008; Wilkins and Marlowe 2006; Hammer *et al.* 2008, 2010; Bustamante and Ramachandran 2009; Keinan *et al.* 2009; Casto *et al.* 2010; Emery *et al.* 2010; Keinan and Reich 2010; Labuda *et al.* 2010; Lambert *et al.* 2010; Gottipati *et al.* 2011; Heyer *et al.* 2012; Arbiza *et al.* 2014). Our study enhances the set of frameworks available for considering effects of admixture and sex bias on autosomal variation. Further, our theoretical results are potentially important to the interpretation of existing methods that utilize admixture fractions. In particular, a decreased variance, often interpreted as older admixture timing, can instead be a consequence of sex bias.

For a single admixture event, the expectation of the autosomal admixture fraction is constant in time and not dependent on sex-specific contributions. Unlike in the case of hybrid isolation, if constant nonzero contributions from the source populations occur over time, then the variance of the fraction of autosomal admixture reaches a nonzero limit, dependent on these continuing sex-specific admixture rates, but not on the founding contributions. In both scenarios, the variance contains information about the magnitude of a sex bias in the admixture history of a hybrid population. For an arbitrary constant total contribution from a source population, the maximal variance occurs when there is no sex bias. The maximal variance across allowable parameter values of the constant admixture model is seen when there is no sex bias and equal contributions from both source populations, that is, *s*_{1}^{f} = *s*_{1}^{m} = *s*_{2}^{f} = *s*_{2}^{m} = 1/2. Two types of admixture history minimize the variance of the autosomal admixture fraction. First, the variance is zero when only one source population contributes to the hybrid population. Second, the variance is zero if all males come from one source population and all females from the other. In this scenario, all individuals in the hybrid population have admixture fraction 1/2.
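The contrast between hybrid isolation and constant admixture can be illustrated by iterating the first two moments forward in time. This is a sketch under our own assumptions, not the recursion as printed in the Results: parents are taken to be drawn independently, the mother from *S*_{1} or *H* with probabilities `s1f` and `hf` and the father with `s1m` and `hm`, with the remainder from *S*_{2}. The function name, starting moments, and parameter values are hypothetical.

```python
def iterate_moments(mean, second, s1f, s1m, hf, hm, generations):
    """Advance the mean and second moment of the admixture fraction.

    Assumption (ours): an individual's fraction is the average of two
    independently drawn parental fractions; S1 parents have fraction 1,
    S2 parents 0, and H parents are drawn from the previous generation.
    Returns the mean and variance after the requested generations.
    """
    for _ in range(generations):
        mother_mean = s1f + hf * mean
        father_mean = s1m + hm * mean
        mother_second = s1f + hf * second
        father_second = s1m + hm * second
        mean, second = (
            (mother_mean + father_mean) / 2,
            (mother_second + father_second + 2 * mother_mean * father_mean) / 4,
        )
    return mean, second - mean**2


# Constant admixture: two very different starting populations.
rates = dict(s1f=0.2, s1m=0.05, hf=0.7, hm=0.8)
from_all_s1 = iterate_moments(1.0, 1.0, generations=200, **rates)
from_all_s2 = iterate_moments(0.0, 0.0, generations=200, **rates)

# Hybrid isolation: no further source contributions after the founding.
isolated = iterate_moments(0.5, 0.375, 0.0, 0.0, 1.0, 1.0, generations=3)

print(from_all_s1, from_all_s2, isolated)
```

Under these assumptions, the two constant-admixture runs converge to the same nonzero variance regardless of the starting moments, whereas in the isolated population the mean stays fixed and the variance halves each generation.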

Although the variance of the autosomal admixture fraction suggests that autosomal DNA is informative about sex-biased admixture, the relationship between the variance and the sex-specific parameters is complex. We uncovered an interesting case in which quite different sex-specific histories can lead to the same variance over time (Figure 7). The variance is in fact dependent on the product of multiple sex-specific parameters, not on each parameter separately (Figure 13). In particular, when *s*_{1}^{f}*s*_{2}^{m} = *s*_{1}^{m}*s*_{2}^{f}, the variance is maximized (Equations 54–56), and depends only on the total contributions from the source populations, *s*_{1} and *s*_{2} (Equation 57). The symmetry arises from the non-sex-specific inheritance of autosomal DNA.

We have considered two scenarios: isolation of a hybrid population after its founding, and constant contributions from source populations to the hybrid population over time. Although the admixture history of real hybrid populations is likely more complex than these scenarios, our models, by jointly considering the mean, variance, and potentially higher moments of the admixture fractions, can provide a starting point for statistical frameworks to estimate parameters of mechanistic admixture models. We have not numerically analyzed complex time-varying admixture histories, but our recursive expressions flexibly accommodate a range of population histories, especially if simplifying assumptions are employed to reduce the number of parameters.

Our model omits a number of potentially important phenomena. First, assortative mating by ancestry, preferential mating of individuals with those with similar admixture fractions, has been empirically observed in admixed populations (Risch *et al.* 2009), and may have sex-specific patterns. Second, our focus on a randomly chosen locus in a deterministic model amounts to a potentially unrealistic assumption of an infinite chromosome with infinitely many independent segments. Gravel (2012), however, calculated the variance of the admixture fraction including both finite chromosomes and finite population sizes, for a model similar to the one presented here, albeit without sex bias. Gravel (2012) found that the genealogy of individuals in the hybrid population—which our model explicitly examines—is the main factor affecting the variance when admixture is recent, showing that the Verdu and Rosenberg (2011) variance provides a good fit to the finite-population finite-chromosome result in that context. We expect that this suitability to conditions of recent admixture applies similarly for our sex-biased version of the Verdu and Rosenberg (2011) model.

Finally, although sex bias does influence autosomal variation, because autosomal DNA is not inherited sex-specifically, the sex that contributes more from a given source population is nonidentifiable with autosomal DNA alone. Because the X chromosome has a sex-specific mode of inheritance, consideration of the X chromosome alongside autosomal data under the mechanistic model may assist in differentiating between scenarios that produce the same variance with different choices of the sex with a greater contribution.

## Acknowledgments

We thank Ethan Jewett and Michael D. Edge for useful discussions. We acknowledge support from a National Science Foundation (NSF) Graduate Research Fellowship and from NSF grant BCS-1147534.

## Footnotes

Available freely online through the author-supported open access option.

*Communicating editor: B. A. Payseur*

- Received June 23, 2014.
- Accepted August 29, 2014.

- Copyright © 2014 by the Genetics Society of America
