
Originally published as Genetics Published Articles Ahead of Print on November 3, 2008.

Genetics, Vol. 181, 259-276, January 2009, Copyright © 2009
doi:10.1534/genetics.108.097998

The Joint Effects of Selection and Dominance on the QST – FST Contrast

Institute of Zoology, Zoological Society of London, London NW1 4RY, United Kingdom

1 Corresponding author: Institute of Zoology, Zoological Society of London, Regent's Park, London NW1 4RY, United Kingdom.
E-mail: asanture{at}gmail.com

Manuscript received October 24, 2008. Accepted for publication November 1, 2008.

ABSTRACT

QST measures the differentiation of quantitative traits between populations. It is often compared to FST, which measures population differentiation at neutral marker loci due to drift, migration, and mutation. When QST is different from FST, it is usually taken as evidence that selection has either restrained or accelerated the differentiation of the quantitative trait relative to neutral markers. However, a number of other factors such as inbreeding, dominance, and epistasis may also affect the QST – FST contrast. In this study, we examine the effects of dominance, selection, and inbreeding on QST – FST. We compare QST with FST at selected and neutral loci for populations at equilibrium between selection, drift, mutation, and migration using both analytic and simulation approaches. Interestingly, when divergent selection is acting on a locus, inbreeding and dominance generally inflate QST relative to FST when they are both measured at the quantitative locus at equilibrium. As a consequence, dominance is unlikely to hide the signature of divergent selection on the QST – FST contrast. However, although in theory dominance and inbreeding affect the expectation for QST – FST, of most concern is the very large variance in both QST and FST, suggesting that we should be cautious in attributing small differences between QST and FST to selection.


TWO measures of population differentiation are commonly used to assess the genetic structure of populations. The first of these is FST, which measures the differentiation of neutral markers between populations. FST is generally calculated using information on allele frequencies at neutral loci such as microsatellites and is therefore an estimate of the amount of differentiation that has occurred due to drift acting on the (finite) populations. FST is well characterized, with a vast literature on how factors such as mutation, migration, population size, population history and population subdivision affect it (WHITLOCK and MCCAULEY 1999) and how best to calculate and estimate it (WEIR and COCKERHAM 1984; WEIR and HILL 2002). In addition, numerous studies have calculated FST for natural populations to investigate, for instance, effective population size, migration rates, and divergence times as well as examine population differentiation itself.

The second widely used measure of population differentiation is that of quantitative traits, termed QST. Because the underlying genetic loci (of which there may be dozens) are generally unknown, quantitative traits are measured on individuals and QST is calculated in terms of variation in those traits between and within populations. Assuming an additive model, WRIGHT (1951) derived expressions for the neutral expectation of variance between and within populations, which LANDE (1992) noted may be translated into an FST equivalent to measure the differentiation of a quantitative trait. The origin of the term QST is attributed to an article the following year (SPITZE 1993).

In addition to the factors affecting neutral markers, quantitative traits may be under selection. In the absence of selection, the differentiation of an additive trait is equal to the differentiation of neutral markers (that is, QST = FST), as both sets of loci are affected equally by drift (LYNCH and SPITZE 1994; LATTA 1998). However, if selection has accelerated the divergence of quantitative traits between the populations (for instance, different phenotypes are selected for in different environments, leading to local adaptation of populations), QST will exceed FST. Alternatively, QST will be less than FST if quantitative traits are restrained by balancing selection (for instance, if natural selection favors the same phenotype in different populations). The comparison of QST and FST, when measured for the same populations (CRNOKRAK and MERILA 2002; WHITLOCK 2008), can therefore be used to infer the type and strength of selection on the quantitative trait. Many dozens of studies have now used the QST – FST contrast to predict the effect of selection (see references in LEINONEN et al. 2008), and the conclusion from two meta-analyses is that QST generally exceeds FST (MERILA and CRNOKRAK 2001; LEINONEN et al. 2008), supporting a general role for divergent selection driving the differentiation of quantitative traits.

There are two problems when comparing QST and FST values to elucidate the role of selection. The first is related to how FST and particularly QST are measured and the intrinsic biases in calculating them using samples of natural, finite populations, despite a number of sophisticated methods available to calculate both (O'HARA and MERILA 2005). For example, WHITLOCK (2008, p. 1894) asserts that QST is "difficult to estimate accurately and precisely," while O'HARA and MERILA (2005, p. 1337) conclude from simulations that "the precision of the QST estimates—irrespective of the estimation method used—is very low." Thus, when comparing QST and FST, it is not particularly clear how to best assess the significance of an observed difference or correct for biases in their estimates.

The second area of concern when comparing QST and FST is in the nature of the quantitative trait itself. In contrast to neutral loci, alleles at quantitative trait loci may exhibit dominance, may interact with other loci via epistasis or extended linkage disequilibrium, or may respond differently than neutral alleles to the effects of mutation. A reasonable research effort has recently been directed toward how these factors affect QST and QST – FST, particularly investigating the changes to QST – FST when FST is calculated using allele frequencies at the quantitative locus itself. Models of epistasis suggest that QST will be depressed by interactions between loci (WHITLOCK 1999; LOPEZ-FANJUL et al. 2003), masking the effect of divergent selection on the trait. Additionally, because multiple loci contribute to a quantitative trait and allele effects sum across loci, mean values of a quantitative trait may not be different between populations even if divergence at individual loci is high. Conversely, low levels of divergence at the allelic level may have significant impacts on the quantitative trait means (LATTA 1998; LE CORRE and KREMER 2003). Thus, the signatures of divergent or uniform selection may be masked if allele frequencies at the quantitative trait are measured (or modeled), rather than the trait values themselves.

Using an island model and a pure drift model respectively, Goudet and colleagues (GOUDET and BUCHI 2006; GOUDET and MARTIN 2007) and Lopez-Fanjul and colleagues (LOPEZ-FANJUL et al. 2003, 2007) have extensively investigated the impact of dominance on QST – FST for a neutral trait. For the island model at equilibrium between migration and drift, QST tends to be less than FST calculated at the same locus (GOUDET and BUCHI 2006; GOUDET and MARTIN 2007), while if populations diverge via drift from an infinite population, QST tends to be larger than FST when the recessive allele is at high frequency (LOPEZ-FANJUL et al. 2007). These results led GOUDET and BUCHI (2006) to suggest that dominance may mask the effects of divergent selection when QST is compared to FST calculated from neutral markers. To date, however, there has been no exploration of the combined effect of selection, dominance, and inbreeding on QST and QST – FST. In this article, we examine QST values at equilibrium to address whether divergent selection on quantitative traits may indeed be hidden by the effects of dominance. FST is calculated both at the selected locus and at neutral loci unlinked to the locus under selection. The numeric results from the analytic approach are checked and extended by simulations, in which multiple loci under divergent and balancing selection are also investigated. The following sections are arranged into (i) the derivation of and results from recursion equations used to calculate FST and QST, (ii) a description of and results from an individual-based simulation model, and finally (iii) a general discussion. Parameters and terms for the following FST and QST models are summarized in Table 1.



 
TABLE 1

Parameters of the FST and QST models

 


THE ANALYTIC MODELS
We consider the simple case of two populations, X and Y, linked by migration. Diploid individuals within a population outcross or self to form the next generation, and some of these offspring migrate so that a proportion mi of individuals in population i (i = X, Y) are migrants. We assume that migrants are indistinguishable from residents in reproduction, generations are discrete, and population sizes and structure are constant through time.

Let Ni be the number of individuals in population i (i = X, Y) and let the probability of selfing in both populations be f while the probability of parents outcrossing is (1 – f). Selfing and outcrossing are assumed to be independent, and so the number of selfed and outbred offspring per parent is independently binomially distributed with means f and (1 – f), respectively.

Population differentiation at neutral markers (FST):

To derive an explicit expression for FST, we construct recursion equations for coefficients describing the similarity between and within individuals in a subdivided population. These recursion equations may then be used to solve for FST at equilibrium. Full details of the derivation are shown in APPENDIX A. If we make the assumption that migration rates and population sizes are equal in populations X and Y (mX = mY = m and NX = NY = N), the explicit expression for FST is

Formula 1(1)
where the effective population size Ne is defined in APPENDIX A.

If migration rates are small, FST is expected to be

Formula 1
(MARUYAMA and TACHIDA 1992; WANG 1997). Interestingly, using Formula 1 overestimates the value of FST found in (1). This is likely to be the result of the small number of populations we are considering. FST increases with increasing selfing rate and rises sharply with decreasing migration rates and population sizes.

Population differentiation at quantitative traits (QST):

Following previous work (GOUDET and BUCHI 2006), we consider a single locus with two alleles, A and B, at frequencies pi and 1 – pi, respectively, in population i (i = X, Y). We set the genotypic value G of an individual to be –1 (GAA), 1 (GBB), and d (GAB) for genotypes AA, BB, and AB, respectively; we assume d > 0 so that allele B is dominant to allele A. Genotype frequencies r of AA, AB, and BB in population i (i = X, Y) with selfing rate f are shown in Table 2.



 
TABLE 2

Genotypic values, population frequencies, and selection coefficients for a single locus with two alleles under divergent linear selection

 
Following HEYWOOD (2005), we assume that selection acts via a linear fitness function, so that the fitness Formula 1 of an individual in population i with genotypic value Gjk is

Formula 2(2)
where si is the population-specific selection coefficient. The mean fitness of population i is

Formula 2
and the relative fitness of an individual becomes

Formula 2
We assume that selection dominates drift Formula 2, and so drift is not explicitly modeled in our equations.

Following selection and reproduction, the allele frequencies in the next generation (t) are

Formula 2
Finally, after migration, the frequency of allele A in the next generation for population X becomes

Formula 3(3)
A similar expression for pY(t) can be found by switching X and Y subscripts in the above equation.

To find the equilibrium frequencies, we solve pX(t) = pX(t–1) and pY(t) = pY(t–1). If the parameters Formula 3 are known, the equilibrium values of pX and pY may be easily found. However, it is not possible to find a general expression for pX and pY at equilibrium. Even solving for a reduced set of equations, when we know some of the parameters of the model, proves to be difficult. For a given set of parameter values, the expressions for pX (3) and pY can therefore be iterated over generations until there is no change in allele frequencies and the populations reach equilibrium. A polymorphic equilibrium with both alleles present in both populations requires that selection coefficients act in opposite directions in the two populations, that there is migration between the populations, and additionally that Formula 3. For example, Formula 3 gives pX = 0.3823 and pY = 0.6181, while Formula 3 and Formula 3 lead to pX = pY = 1 as the high migration rate from population Y to X causes the positive selection in population Y to dominate both populations. Interestingly, if f = 0, d = 0, mX = mY, and sX = –sY, then pX + pY = 1, and both alleles are maintained in populations X and Y [that is, Formula 3].
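The iteration to equilibrium can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code. It assumes that the linear fitness takes the form w = 1 + s·G, that genotype frequencies under inbreeding follow the standard form p² + Fpq, 2pq(1 – F), q² + Fpq with F treated as an inbreeding coefficient (the exact entries of Table 2 are not reproduced here), and that migration mixes the post-selection allele frequencies. All function names are hypothetical.

```python
def genotype_freqs(p, F):
    """Frequencies of AA, AB, BB given frequency p of allele A and
    inbreeding coefficient F (assumed standard form, not taken from Table 2)."""
    q = 1.0 - p
    return p * p + F * p * q, 2.0 * p * q * (1.0 - F), q * q + F * p * q


def after_selection(p, s, d, F):
    """Frequency of allele A after linear selection, assuming w = 1 + s * G
    with genotypic values G = -1 (AA), d (AB), 1 (BB)."""
    r_aa, r_ab, r_bb = genotype_freqs(p, F)
    w_aa, w_ab, w_bb = 1.0 - s, 1.0 + s * d, 1.0 + s
    w_bar = r_aa * w_aa + r_ab * w_ab + r_bb * w_bb
    return (r_aa * w_aa + 0.5 * r_ab * w_ab) / w_bar


def iterate_to_equilibrium(p_x, p_y, s_x, s_y, m_x, m_y, d, F,
                           tol=1e-12, max_gen=1_000_000):
    """Apply selection then migration until allele frequencies stop changing."""
    for _ in range(max_gen):
        sel_x = after_selection(p_x, s_x, d, F)
        sel_y = after_selection(p_y, s_y, d, F)
        # A proportion m_i of population i is made up of migrants from the other population.
        new_x = (1.0 - m_x) * sel_x + m_x * sel_y
        new_y = (1.0 - m_y) * sel_y + m_y * sel_x
        if abs(new_x - p_x) < tol and abs(new_y - p_y) < tol:
            return new_x, new_y
        p_x, p_y = new_x, new_y
    return p_x, p_y
```

Under these assumptions, setting F = 0, d = 0, equal migration rates, and s_x = –s_y, and starting from p_x = p_y = 0.5, keeps p_x + p_y = 1 in every generation, in line with the symmetric case noted above.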

The expression for QST including selfing rate f is

Formula 4(4)
(BONNIN et al. 1996), where VB is the variance between populations and VA is the additive variance within populations. For a single quantitative locus with two alleles, GOUDET and BUCHI (2006) derived a general expression for the trait variance among n populations. For our case of two populations and genotypic values of –1, d, and 1 (see Table 2), this expression becomes

Formula 5(5)
Similarly, for two populations the expression for the single-locus additive variance VA within the populations (TEMPLETON 1987; LYNCH and WALSH 1998; GOUDET and BUCHI 2006) becomes

Formula 6(6)
As demonstrated by GOUDET and BUCHI (2006), the expression for QST reduces to the definition of FST (WRIGHT 1951) at the locus when there is no dominance,

Formula 7(7)
where for the case of two populations X and Y

Formula 7
and

Formula 7

The value of QST at equilibrium may be found by substituting equilibrium allele frequencies Formula 7 and known population parameters into our expression for QST (Equation 4). For example, solving pX(t) = pX(t–1) and pY(t) = pY(t–1) for Formula 7 gives equilibrium allele frequencies Formula 7 and QST = 0.9570, while Formula 7 gives equilibrium allele frequencies Formula 7 and QST = 0.0555.
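As a quick numerical check of the additive case, the sketch below computes FST at the locus and QST for two populations with no dominance (d = 0) and no inbreeding (f = 0). It assumes Wright's variance-based definition, FST = var(p) / [p̄(1 – p̄)], and the random-mating form QST = VB / (VB + 2VA). The dominance and selfing terms of Equations 4–6 are not reproduced, and the function names are hypothetical.

```python
def fst_two_pops(p_x, p_y):
    """Wright's FST at a diallelic locus for two equally weighted populations."""
    p_bar = 0.5 * (p_x + p_y)
    var_p = 0.5 * ((p_x - p_bar) ** 2 + (p_y - p_bar) ** 2)
    return var_p / (p_bar * (1.0 - p_bar))


def qst_additive(p_x, p_y):
    """QST for genotypic values -1 (AA), 0 (AB), 1 (BB) under random mating."""
    # Population mean genotypic value is (1 - p) - p = 1 - 2p when d = 0.
    mean_x, mean_y = 1.0 - 2.0 * p_x, 1.0 - 2.0 * p_y
    grand = 0.5 * (mean_x + mean_y)
    v_b = 0.5 * ((mean_x - grand) ** 2 + (mean_y - grand) ** 2)
    # Additive variance within a population is 2pq when the allelic effect is 1.
    v_a = 0.5 * (2.0 * p_x * (1.0 - p_x) + 2.0 * p_y * (1.0 - p_y))
    return v_b / (v_b + 2.0 * v_a)
```

For any pair of allele frequencies the two functions return the same value, which is the d = 0 identity between QST and FST at the locus.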

Effects of dominance, inbreeding, and selection on QST:

When calculated using equilibrium allele frequencies, QST is a strictly increasing function of inbreeding, dominance, and the strength of divergent selection and decreases with increasing migration between populations. Note that equilibrium allele frequencies are themselves functions of the inbreeding, dominance, and selection coefficients. Under divergent selection, dominance has the greatest impact on QST when inbreeding is low. For example, Figure 1 demonstrates that when Formula 7, equilibrium QST increases dramatically with increasing inbreeding and also increases with increasing dominance at the locus. This nonadditive interaction between dominance and inbreeding was also noted by GOUDET and BUCHI (2006) for a neutral locus.


 
FIGURE 1.—

Equilibrium QST analytic values when Formula 7, shown as a function of increasing selfing rate for three different dominance coefficients.

 
Very small changes in the selection coefficients have a much larger impact on QST than do equivalent changes to the dominance coefficient or selfing rates. This suggests that even a low level of divergent selection will overwhelm the effects of dominance or inbreeding on QST. For example, we investigated the balance between increasing the dominance coefficient and decreasing the selection coefficient to maintain the same value of QST. Assuming a migration rate of Formula 7, a population size of N = 1000, and an inbreeding coefficient Formula 7, the initial values for the selection coefficients (sX = –sY) were chosen so that Formula 7 [calculated from (4) and (1), respectively] when d = 0, that is, when selection is approximately the same order of magnitude as drift. For Formula 7, sX = (0.0016, 0.0015, 0.0013) to maintain QST (= FSTN) = 0.1169. So an increase in the dominance coefficient from Formula 7 to 1 is offset by a decrease in the strength of selection of only 0.0002. Therefore, although we have seen that dominance and inbreeding indeed influence equilibrium QST values, divergent selection is a much more significant force in determining the level of quantitative trait differentiation.

QST – FSTQ contrast:

For a single locus with two alleles, the value of QST – FST is found by substituting equilibrium allele frequencies into Equations 4–6 for QST and Equation 7 for FST. The full expression for the difference is shown in APPENDIX B. For a neutral quantitative locus, QST = FSTQ (FST for the quantitative locus) only when there is no dominance (GOUDET and BUCHI 2006). If alleles at the locus do not act additively, the difference between QST and FSTQ may be positive or negative, depending on the allele frequencies. For example, in the one-locus, two-allele model with dominance, if the recessive allele is rare in both populations, then QST will be less than FSTQ, and vice versa if the recessive allele is at high frequency in both populations (GOUDET and BUCHI 2006). In general, the expected value for the difference between QST and FSTQ is dependent on assumptions made about the distribution of neutral allele frequencies for a quantitative trait and the mode of population subdivision (see discussion in GOUDET and BUCHI 2006, GOUDET and MARTIN 2007, and LOPEZ-FANJUL et al. 2007 and references therein).

When a dominant quantitative locus is under divergent selection, however, the range of possible allele frequencies in the two populations will be more restricted than in the neutral case. In particular, we no longer expect the frequency of the recessive allele to be positively correlated between populations, as the allele favored in one population is selected against in the other. Therefore, in contrast to the neutral case, it is unlikely that the same allele will be rare in both populations.

To test the impact of dominance with divergent selection on the difference between QST and FSTQ, we iterated the recursion equations for pX (Equation 3) and pY to find equilibrium allele frequencies for 10,000 random combinations of biologically feasible parameter values [Formula 7 while Formula 7] and calculated QST and FSTQ at equilibrium. The distribution of QST – FSTQ for the 10,000 random combinations is shown in Figure 2. We additionally restricted parameters to biologically realistic values [Formula 7, Formula 7, and Formula 7], which affected the magnitude of QST – FSTQ, but the total frequency of QST – FSTQ values less than zero was similar (data not shown). Significantly, these results demonstrate that for a range of feasible migration, selection, inbreeding, and dominance coefficients QST will most likely exceed FSTQ when both are calculated using equilibrium allele frequencies at the locus under selection. Therefore, we conclude that dominance is unlikely to hide the signature of divergent selection on the QST – FST contrast, and in fact the inflation of QST relative to FST due to dominance will make it easier to detect divergent selection on the quantitative trait.


 
FIGURE 2.—

Distribution of equilibrium QST – FSTQ analytic values, calculated from recursion equations, for 10,000 replicates, using population parameter values sampled from biologically feasible ranges, where Formula 7 and Formula 7.

 
To assess the impact of individual factors, we calculated the correlation of d, f, sX, sY, mX, and mY with QST – FSTQ and with Formula 7 for the 10,000 random combinations of parameter values described above. Interestingly, there is a significant (P < 0.001) positive correlation between d and both QST – FSTQ and Formula 7 (correlations 0.3449 and 0.5804), a significant negative correlation between QST – FSTQ and f, mX, and mY (correlations –0.3216, –0.2806, and –0.3211), and a significant negative correlation between Formula 7 and f (correlation –0.4750). These correlations indicate that, regardless of the strength of selection and the rates of migration and selfing, as dominance increases the difference between QST and FSTQ also increases. In addition, with other factors constant, increasing the selfing rate or the migration rates decreases the difference between QST and FSTQ. For loci under divergent selection, then, we expect the largest difference between QST and FSTQ when selfing rates are small and the dominance effect is large. This result agrees with the conclusion for neutral loci from GOUDET and BUCHI (2006): the effect of dominance becomes smaller as inbreeding increases.

We also iterated allele frequencies to equilibrium for a range of dominance and inbreeding coefficients given Formula 7. Figure 3A shows that, as expected from the correlation described above, there is a positive relationship between Formula 7 and the dominance coefficient. Interestingly, however, for the given parameter values, Formula 7 increases as the selfing rate approaches 0.3 and then decreases (Figure 3B). Both the numerator and the denominator in the expression for QST – FSTQ involve f^4 terms (APPENDIX B), so it is unsurprising to see such a nonlinear effect on the difference. When we plot QST – FSTQ instead of Formula 7, the relationship with d and with f looks very similar to those seen in Figure 3, A and B, respectively. These graphs indicate that in fact it may not be a low but a moderate level of inbreeding, coupled with strong dominance, that is likely to give the largest positive QST – FSTQ difference.


 
FIGURE 3.—

Equilibrium Formula 7 analytic values, calculated using recursion equations, as a function of (A) dominance coefficients (d) and (B) selfing rate (f). Other parameters are set at Formula 7, Formula 7, and Formula 7.

 

Nonequilibrium QST – FSTQ:

Populations may not be at equilibrium when they are sampled, for example, when populations are adapting to a new selection pressure. Therefore, we used our recursion equations (3) to investigate the change in allele frequencies and in QST – FSTQ as two populations adapt to a new selection pressure and stabilize toward equilibrium values. We found that in general, allele frequencies and QST initially moved very rapidly toward equilibrium values, but that the rate of change slowed considerably as populations approached the equilibrium values.

The initial allele frequencies determine how the value of QST – FSTQ changes over time. For example, if Formula 7, Formula 7, and Formula 7, then our expected equilibrium values for pX, pY, and QST – FSTQ are 0.2351, 0.8937, and 0.0249, respectively. The average equilibrium frequency of allele A across populations Formula 7 is 0.5644. Figure 4 demonstrates the change in the value of QST – FSTQ over time when the initial frequencies are Formula 7 (bottom line) and Formula 7 (top line). These graphs are typical of the change in QST – FSTQ when initial Formula 7 (bottom line) and initial Formula 7 (top line). If initial allele frequencies are approximately equal and are less than the average frequency of allele A at equilibrium, QST – FSTQ will initially decrease and become more negative and then increase toward the equilibrium QST – FSTQ. However, if initial allele frequencies are greater than the average equilibrium frequency, QST – FSTQ will always be positive.


 
FIGURE 4.—

Change in the value of QST – FSTQ over time, calculated from recursion Equations 3, when the initial frequencies are Formula 7 (bottom line) and Formula 7 (top line). Other parameters are Formula 7, Formula 7, and Formula 7. Expected equilibrium values are pX = 0.2351, pY = 0.8937, and QST – FSTQ = 0.0249.

 
Interestingly, these results suggest that when divergent selection pressure is first applied, the value of QST – FSTQ can become progressively more negative. In addition, a negative value of QST – FSTQ may persist for a substantial number of generations before a positive value is reached. The speed of the transition from negative to positive QST – FSTQ is predominantly determined by the strength of selection. As expected, however, QST is a strictly increasing function of time, so when comparing QST to FST measured at neutral loci, the value of QST – FSTN will always be positive and will increase as QST approaches its equilibrium value.


THE SIMULATION MODEL
Our numeric results from the analytic approach suggest that dominance is likely to enhance the difference between QST and FST for a quantitative trait under divergent selection, and therefore selection will be more easily detected. To test this prediction, and to incorporate the effect of drift, we simulated the genotypes of diploid individuals in two populations linked by migration. For a number l of unlinked loci, genotypes are initially randomly assigned to individuals in each population by assuming that all a alleles at a locus are equifrequent. A number of loci contribute to the quantitative trait values and are therefore selected, while the remainder act as neutral marker loci. Individuals randomly mate or self to produce offspring of the next generation. Offspring migrate with probability m to the other population. These offspring then join the parent pool for the following generation, and the process of mating and migration iterates through generations. This design closely follows the analytic model, with mating followed by migration, discrete generations, and stable population sizes over time.
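One generation of such a simulation might look like the following sketch for a single selected diallelic locus. It is a simplified illustration under stated assumptions, not the authors' program: parents are drawn with probability proportional to an assumed linear fitness 1 + s·G, each offspring is selfed with probability f, and migration is implemented by drawing an offspring's parents from the other population with probability m so that population sizes stay fixed. Mutation and neutral loci are omitted, and all names are hypothetical.

```python
import random

VALUE = {0: -1.0, 2: 1.0}  # genotypic value by number of B alleles (AB uses d)


def fitness(genotype, s, d):
    """Linear fitness 1 + s * G (assumed form); genotype is a pair of 0/1 alleles."""
    n_b = genotype[0] + genotype[1]
    g = d if n_b == 1 else VALUE[n_b]
    return 1.0 + s * g


def make_offspring(pop, weights, f, rng):
    """Produce one offspring by selfing (probability f) or outcrossing."""
    mother = rng.choices(pop, weights=weights)[0]
    father = mother if rng.random() < f else rng.choices(pop, weights=weights)[0]
    return (rng.choice(mother), rng.choice(father))


def next_generation(pop_x, pop_y, s_x, s_y, d, f, m, rng):
    """Selection, mating, and migration for two populations of constant size."""
    w_x = [fitness(g, s_x, d) for g in pop_x]
    w_y = [fitness(g, s_y, d) for g in pop_y]
    sources = {"x": (pop_x, w_x), "y": (pop_y, w_y)}
    new = {}
    for here, there in (("x", "y"), ("y", "x")):
        size = len(sources[here][0])
        new[here] = []
        for _ in range(size):
            # With probability m the offspring is a migrant born in the other population.
            pop, weights = sources[there] if rng.random() < m else sources[here]
            new[here].append(make_offspring(pop, weights, f, rng))
    return new["x"], new["y"]
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible. Allele 0 stands for A and allele 1 for B, so a genotype is simply a pair such as `(0, 1)`.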

We ran our simulation program for a number of different scenarios to investigate the impact both divergent and balancing selection have on the QST – FST contrast. For each of five simulation sets (A–E), we considered six designs combining high and low levels of inbreeding and low, medium, and high dominance coefficients. In addition, we assume equal migration rates and population sizes (mX = mY = m = Formula 7 and NX = NY = N = 1000). For all loci, mutation from one allele to any other occurs at equal, relatively high, probability µ to avoid fixation of alleles due to drift. Under each of the six designs, simulations were run for 2000 generations and replicated 100 times, and QST and FST values were averaged over replicates. In the following sections, we first describe the methods used to calculate QST and FST in our simulations and then present the results from the five simulation sets (A–E).

Calculating FST and QST in simulations:

Calculating FST:

The simulation calculates FST separately at neutral loci (FSTN) and at the loci under selection (FSTQ) to allow comparison of the QST – FST contrast both at selected and at neutral loci. Using allele frequencies in each population following migration, FST at each generation is calculated as

Formula 8(8)
(WRIGHT 1951), where

Formula 9(9)

Formula 10(10)
and

Formula 10
[recall that i indexes the population (i = X, Y), l is the number of loci, and a is the number of alleles at each locus]. pijk is the allele frequency of allele k at locus j in population i and pjk is the average allele frequency of allele k at locus j across populations X and Y. Note the small change of notation from our one-locus, two-allele model (Table 2); we number allele frequencies at locus j in population i as pij1, pij2, pij3, ... , for alleles A1, A2, A3, ... , instead of frequencies p and (1 – p) for alleles A and B.
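A heterozygosity-based calculation consistent with this notation can be sketched as follows. Because Equations 8–10 are not reproduced here, the sketch assumes the common form FST = (HT – HS) / HT, with HS the mean expected heterozygosity within populations and HT the expected heterozygosity computed from the average allele frequencies, each averaged over loci before the ratio is taken. The function name and the nested-list layout `freqs[i][j][k]` (population i, locus j, allele k) are hypothetical.

```python
def fst_from_frequencies(freqs):
    """FST across loci from allele frequencies freqs[i][j][k]
    (population i, locus j, allele k), assuming FST = (HT - HS) / HT."""
    n_pops = len(freqs)
    n_loci = len(freqs[0])
    h_s = 0.0  # mean within-population expected heterozygosity
    h_t = 0.0  # expected heterozygosity from average allele frequencies
    for j in range(n_loci):
        n_alleles = len(freqs[0][j])
        # Within populations: 1 - sum_k p_ijk^2, averaged over populations.
        h_s += sum(1.0 - sum(p * p for p in freqs[i][j])
                   for i in range(n_pops)) / n_pops
        # Total: 1 - sum_k pbar_jk^2, with pbar_jk the mean over populations.
        p_bar = [sum(freqs[i][j][k] for i in range(n_pops)) / n_pops
                 for k in range(n_alleles)]
        h_t += 1.0 - sum(p * p for p in p_bar)
    h_s /= n_loci
    h_t /= n_loci
    # Return 0 when no locus is polymorphic, to avoid dividing by zero.
    return 0.0 if h_t == 0.0 else (h_t - h_s) / h_t
```

For a single diallelic locus in two populations this reduces to var(p) / [p̄(1 – p̄)], so it agrees with the one-locus, two-allele definition used in the analytic model.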

Calculating QST:

For QST, a small number of diallelic loci are assumed to contribute to the quantitative trait, and genotypic values are assigned by summing over loci. For each of l loci, marginal genotypic values for AA, AB, and BB are –1/l, d/l, and 1/l, where d is the level of dominance. We scale by the number of loci l so that the minimum and maximum genotypic values (–1 and 1) remain constant. On the basis of the genotypic value of an individual, selection is then applied to transform genotypic values into relative fitness. Individuals with higher fitness are more likely to be selected as parents for the next generation. If only one locus is under selection, we may use a linear fitness function as in the analytic approach (2) and model either divergent (si of different sign) or uniform (si of the same sign) selection. For one or more loci, we also model optimal selection acting on the total genotypic value over loci. Following TURELLI (1984) and LE CORRE and KREMER (2003), we define the absolute fitness of an individual as

Formula 11(11)
where GiOPT is the local genotypic optimum in population i and γ > 0 is the strength of selection. As γ increases, the strength of selection decreases. Uniform and divergent selection are modeled by setting similar or different values for the local genotypic optima. Note that for the optimal selection regime, more than two populations and more than two alleles at the loci under selection may easily be included in the simulation design, but we restrict our simulations to the two-population, two-allele case for simplicity and comparison with the analytic approach.
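The scaling of genotypic values and the optimum-based fitness can be illustrated with a short sketch. The genotypic values follow the –1/l, d/l, 1/l scaling described above. The fitness function assumes a Gaussian form exp[–(G – GOPT)² / (2γ)], a common choice for selection toward an optimum that may differ in detail from Equation 11. Function names are hypothetical.

```python
import math


def genotypic_value(genotype, d):
    """Total genotypic value over loci; genotype is a list of B-allele
    counts (0 = AA, 1 = AB, 2 = BB), one entry per locus."""
    n_loci = len(genotype)
    per_locus = {0: -1.0, 1: d, 2: 1.0}
    # Dividing by the number of loci keeps the total between -1 and 1.
    return sum(per_locus[n_b] for n_b in genotype) / n_loci


def optimum_fitness(g, g_opt, gamma):
    """Gaussian fitness around a local optimum (assumed form);
    larger gamma means weaker selection."""
    return math.exp(-((g - g_opt) ** 2) / (2.0 * gamma))
```

In this sketch, divergent selection corresponds to different `g_opt` values in the two populations and uniform selection to the same value in both.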

After a large number of generations, the within- and between-population variances needed to calculate QST approach asymptotic values. Variances are calculated either from population allele frequencies or from observed genotypic values of individuals in each population. For example, using trait values, VA may be estimated as twice the covariance between offspring and midparent genotypic values or four times the covariance of half sibs [provided the level of inbreeding is not large and the dominance variance is small (COCKERHAM and WEIR 1984)]. Similarly VB may be calculated from the difference between the trait means of the populations. However, for the two-allele case we can use the exact expressions for VA and VB calculated from allele frequencies [(6) and (5), respectively]. For more than one locus, this expression becomes

Formula 12(12)
where we sum the between- and within-population variances for each locus across the l loci before calculating QST.

We compared the expectations for quantitative trait differentiation when QST is calculated from (i) allele frequency information [using (5) and (6)] or (ii) phenotypic measures and covariances between parents and offspring as described above. We ran 1000 replicates of our simulation model for the simple case of no inbreeding and no dominance, with selection coefficients and migration rates chosen randomly from their biologically feasible ranges [Formula 12 and Formula 12]. The number of individuals in each population was set to 1000, and the simulation was run for 2000 generations. As expected when d = 0, QST and FST values were identical when both were calculated using allele frequencies. The means of QST calculated from allele frequencies (0.4444) and QST calculated from phenotypic values (0.4461) were very similar, as were their variances (0.1376 and 0.1375, respectively). However, calculating QST from phenotypic values introduces additional sampling error to the value. For example, as there is no dominance, the correlation between FST and QST calculated from allele frequencies is 1, while the correlation between FST and QST calculated from phenotypes was 0.9985. Similarly, if we run the simulation under the same conditions but additionally select dominance and inbreeding coefficients randomly from their biologically feasible ranges [Formula 12 and 3000 replicates], the correlation between FST and QST calculated from allele frequencies is 0.9929, while the correlation between FST and QST calculated from phenotypes is 0.9501. Therefore, using phenotypic values to calculate QST would make the contrast between FST and QST more variable and any difference between them harder to detect. Results of the following simulations are therefore presented with values of QST calculated from allele frequencies.

Simulation set A—differentiation via divergent linear selection:

To verify the predictions of our analytic model, we first ran the simulation with one locus under divergent selection using the linear fitness function described in Table 2. Selection strength was set at Formula 12. Table 3 compares the expected analytic values (exp) for QST, FSTQ (FST at the selected quantitative locus), and FSTN (FST for nine neutral loci) with the average results from our simulations (sim) for each design. Exp values are calculated using Equations 1 (FSTN exp), 4 (QST exp), and 7 (FSTQ exp) after iterating recursion equations to equilibrium. Sim values are based on final allele frequencies from the simulations and calculated using Equations 8 (FSTN sim, FSTQ sim) and 12 (QST sim).


TABLE 3

Simulation set A: comparison of QST and FST values under divergent linear selection

 
In general, the simulated values agree well with those from our analytic models. There are some surprisingly large differences between the expected and the simulated values for FSTN for high rates of selfing. This appears to be due to a large proportion of replicates fixing different alleles in the two populations across many of the neutral loci (and hence FSTN ≈ 1). Therefore, despite explicitly including drift in our expression for FST (Equation 1), the combination of high rates of selfing and low migration has reduced Ne further than we predicted. Setting a higher mutation rate for neutral loci for these particular designs would certainly have prevented such loss of alleles and this elevation in FSTN simulated values.

QST – FSTQ:

The average simulation results for QST – FSTQ are very similar to the expected values, and, as expected, all averages are very close to zero at low dominance (d = Formula 12). Figure 5A demonstrates the close correlation between average expected (solid lines) and simulated values (dashed lines) at the three levels of dominance when the selfing rate is low. However, the variation in the value of QST – FSTQ between replicates in many cases exceeds its expectation (data not shown). In addition, we can see that in many cases a high proportion of individual replicates may have negative QST – FSTQ values. For example, for the medium dominance, low-inbreeding design we expect QST – FSTQ = 0.0202, and although the average QST – FSTQ from simulations is close to expectation, nearly a quarter of replicates have QST – FSTQ < 0.


FIGURE 5.—

Comparison of QST and FSTQ expected (exp) and simulated (sim) values across (A) three levels of dominance (d) and (B) two levels of selfing (f) under divergent linear selection, when f = Formula 12. See Simulation set A—differentiation via divergent linear selection for details of the simulation design.

 
To investigate the impact of the large variation of QST and FST further, we ran a set of 1000 simulations. Dominance, inbreeding and selection coefficients, and migration rates were chosen randomly from their biologically feasible ranges and population sizes fixed (NX = NY = 1000). Figure 6 demonstrates that the distribution of QST – FSTQ differs significantly from the distribution generated from our analytic model (Figure 2). Thus, although we generally expect QST > FSTQ on the basis of our results from the recursion equations, the simulations indicate that in real populations the effect of sampling (i.e., drift) may hide any elevation of QST due to dominance. This result agrees with the discussion in GOUDET and BUCHI (2006) and GOUDET and MARTIN (2007) regarding the large variance of QST and FSTQ for a neutral quantitative locus.
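The size of the replicate-to-replicate variation caused by drift alone can be illustrated with a much simpler model than the one used here. The sketch below is an assumption-laden toy, not the individual-based simulation of this study: it follows a single neutral diallelic locus in an island model with symmetric migration and mutation, ignores selection, dominance, and selfing, and uses hypothetical parameter values (`n`, `m`, `mu`, `generations`) chosen only to keep the run short.

```python
import random
import statistics


def simulate_fst(n_pops=2, n=50, m=0.01, mu=0.001, generations=100, rng=None):
    """Neutral Wright-Fisher sketch: migration, mutation, then binomial drift.

    Returns FST = Var(p) / [pbar * (1 - pbar)] at the final generation, or
    None if the locus is fixed for the same allele in every population.
    """
    rng = rng or random.Random()
    p = [0.5] * n_pops
    for _ in range(generations):
        pbar = sum(p) / n_pops
        new_p = []
        for x in p:
            if n_pops > 1:
                # Migrants are drawn from the mean of the other populations.
                others = (pbar * n_pops - x) / (n_pops - 1)
                x = (1 - m) * x + m * others
            x = x * (1 - mu) + (1 - x) * mu  # symmetric mutation
            x = min(1.0, max(0.0, x))        # guard against rounding error
            # Drift: sample 2n gene copies for the next generation.
            copies = sum(rng.random() < x for _ in range(2 * n))
            new_p.append(copies / (2 * n))
        p = new_p
    pbar = sum(p) / n_pops
    if pbar in (0.0, 1.0):
        return None
    var_p = sum((x - pbar) ** 2 for x in p) / n_pops
    return var_p / (pbar * (1 - pbar))


rng = random.Random(1)  # fixed seed so the run is repeatable
values = [simulate_fst(rng=rng) for _ in range(100)]
values = [v for v in values if v is not None]
print(statistics.mean(values), statistics.stdev(values))
```

Comparing the printed standard deviation with the mean gives a rough sense of how widely single-replicate values scatter; passing `n_pops=4` allows the same comparison for more populations.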


FIGURE 6.—

Distribution of QST – FSTQ simulated values for 1000 replicates, calculated by sampling population parameters from biologically feasible ranges, where Formula 12, Formula 12, NX = NY = 1000, and µ = Formula 12.

 
As predicted from Figure 2 and the analytic expectations, Table 3 and Figure 5A indicate that there is a positive relationship between QST – FSTQ and the dominance coefficient: within each of the two levels of selfing there is a general trend for increased difference between QST and FSTQ as dominance coefficients increase. This supports our earlier conclusion that dominance increases the likelihood that divergent selection will be detected.

Higher inbreeding levels tend to reduce the QST – FSTQ contrast in both the simulated and the expected analytic values (Figure 5B; plotted for the medium dominance level d = Formula 12), although it is difficult to compare these results to Figure 3B because only two selfing rates are modeled. There is no significant difference between the QST – FSTQ contrasts for high, medium, and low mutation rates (data not shown).

QST – FSTN:

Table 3 also shows the value of QST – FSTN for each design, averaged over the 100 replicates. These contrasts are large due to the strong selection coefficients used in the simulations and hence the high QST values compared to neutral marker FST. In a meta-analysis of 55 empirical studies comparing QST and FSTN, the average difference between QST and FSTN was 0.12 (LEINONEN et al. 2008), suggesting that on average selection is weak in reality. However, 6 of 62 studies (of which 4 of 55 were included in the meta-analysis) reported QST > 0.8, and of these 3 were also associated with relatively low FSTN values (FSTN < 0.25); therefore such high levels of divergence are not impossible.

Despite the very large average values of QST – FSTN, the large variance in values for QST and FSTN causes the QST – FSTN contrast for simulated values to be negative in some replicates. In particular, designs with a high selfing rate have a reasonable proportion (up to 4%) of negative QST – FSTN values. Interestingly, in a separate set of simulations with a much higher migration rate (m = Formula 12), the proportion of negative QST – FSTN values was much larger. In cases where high migration was paired with high selfing rates, even the average QST – FSTN over 100 replicates was negative (data not shown). Although this appeared to be a consequence of a large proportion of replicates fixing alleles at all neutral loci (FSTN ≈ 1; see above), these results indicate that high migration and drift may in fact cause QST < FSTN even when the quantitative trait is under divergent selection.

We now assess the effect a different selection regime has on the QST – FST contrast for the quantitative and neutral loci in simulation sets B–E. As in simulation set A above, we use a combination of low and high inbreeding coefficients and low, medium, and high dominance as the basis for our six simulation designs.

Simulation set B—differentiation via divergent optimal selection:

Simulation set B assessed whether divergent optimal selection at one locus gives similar results to the simulations using divergent linear selection above (set A). Recalling our optimal selection fitness function (Equation 11), we set divergent local optima for populations X and Y (GXOPT = –GYOPT = Formula 12) with relatively strong selection strength (γ = 5). Again each of the six designs was replicated 100 times and FSTQ, FSTN, and QST were calculated from final allele frequencies using Equations 8 and 12. Average values for QST were ~0.06 larger than corresponding simulation values under divergent linear selection (Table 3), suggesting that the strength of selection was slightly higher. All average QST – FSTQ and QST – FSTN values are positive (data not shown). In some cases a reasonable proportion of QST – FSTQ replicates are negative but this fails to affect the average QST – FSTQ values for each design. These results indicate that divergent optimal selection elevates QST still further above FSTQ than does divergent linear selection, making it even more likely that selection will be detected. These simulations further support the conclusion that a dominant locus under divergent selection is likely to increase the value of QST relative to FSTQ.

Simulation sets C and D—differentiation via balancing optimal selection:

Simulation sets C and D represent balancing selection acting on one and two loci, respectively. For both sets, the local optima were set at zero and were the same for populations X and Y (GiOPT = 0), with strong selection toward this optimum (γ = 5).

One selected locus:

Table 4 shows the results of 100 replicates for each design for set C (one locus). Values are based on final allele frequencies from the simulations and calculated using Equations 8 (FSTN, FSTQ) and 12 (QST). Under balancing selection, we expect QST – FSTN < 0 because the differentiation of quantitative traits will be restrained relative to neutral markers. Table 4 shows that indeed in all cases average QST – FSTN values are less than zero.


TABLE 4

Simulation set C: comparison of QST and FST values under balancing optimal selection

 
When allele frequencies at the quantitative trait locus are similar in populations X and Y, we expect QST – FSTQ to be approximately equal to zero. Provided allele frequencies are not identical, a low frequency of the recessive allele in both populations Formula 12 gives QST – FSTQ values slightly less than zero, while a higher frequency Formula 12 gives QST – FSTQ values slightly greater than zero. This difference between QST and FSTQ across different frequencies of the recessive allele in populations X and Y is demonstrated in Figure 7, for Formula 12 and Formula 12 (also see similar contour plots in Figure 2, A and B, in GOUDET and BUCHI 2006). The negative or close to zero values of QST – FSTQ under balancing selection (Table 4) therefore suggest a trend for low frequencies of the recessive allele in both populations. Negative QST – FSTQ values are more likely at medium and high dominance coefficients than for low dominance coefficients. This leads to the exciting conclusion that, as with divergent selection, balancing selection is more likely to be detected when alleles at the locus are not additive. QST values are depressed relative to FST at the quantitative locus, and when compared to FST for neutral loci, the signal of balancing selection (QST – FSTN < 0) becomes stronger.


FIGURE 7.—

Contour plot of analytic QST – FSTQ values, plotted across the frequency of the recessive allele in populations X and Y, for Formula 12 and Formula 12. The line of equality pX = pY and the line intersecting it midplot correspond to QST – FSTQ = 0, contours are spaced in Formula 12 intervals, and values to the right top (contours with lighter shading) become more positive.

 

Two selected loci:

We also simulated 100 replicates of the six designs where two loci contribute to the quantitative trait and eight loci are neutral (set D, data not shown). The FSTN values agree well with those from previous sets, as expected for these neutral loci. However, QST and FSTQ values are remarkably high (QST ≈ 0.4) compared to values for one selected locus (Table 4; QST ≈ 0.1). This high level of differentiation with low migration is explained by a reasonable proportion of replicates fixing different alleles in populations X and Y (see also GOLDSTEIN and HOLSINGER 1992 and LATTA 1998). For example, for two loci with two alleles A and B at frequencies p and q, the same mean (µ) for populations X and Y can be achieved when allele frequencies are similar in both populations (such as Formula 12 and Formula 12, giving Formula 12 and Formula 12) or different alleles are fixed in both populations (such as Formula 12 while Formula 12, giving Formula 12 and Formula 12) (where pij is the frequency of allele A at locus j in population i).

The elevation of QST levels has a large impact on the average QST – FSTN values across these designs (data not shown). Under balancing selection we expect QST – FSTN < 0, but in designs with low levels of inbreeding the average contrast is much greater than zero. This indicates that QST calculated using allele frequencies at the quantitative loci may be very large despite the mean values of the trait of interest being similar between populations (GOLDSTEIN and HOLSINGER 1992; LATTA 1998). Clearly because QST is usually measured on trait values rather than calculated from allele frequencies across loci (as the quantitative loci of interest are unlikely to be known), this is a rather academic conclusion. This result does suggest, however, that knowledge of allele frequencies at loci underlying a quantitative trait will not necessarily help determine whether the trait has been under divergent or balancing selection.
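A small numerical case makes the point concrete. The sketch below is illustrative only: it assumes two additive loci with genotypic values +a, 0, and –a, no inbreeding, linkage equilibrium within populations, and QST = VB/(VB + 2VA). The allele frequencies are hypothetical and are not the values used in the simulations.

```python
def locus_components(p, a=1.0):
    """(VB, VA, means) for one additive diallelic locus, no inbreeding."""
    n = len(p)
    means = [a * (2 * x - 1) for x in p]
    gbar = sum(means) / n
    vb = sum((g - gbar) ** 2 for g in means) / n
    va = sum(2 * x * (1 - x) * a ** 2 for x in p) / n
    return vb, va, means


# Hypothetical frequencies of allele A: [population X, population Y].
# The two populations are nearly fixed for opposite alleles at each locus.
locus1 = [0.95, 0.05]
locus2 = [0.05, 0.95]

vb1, va1, means1 = locus_components(locus1)
vb2, va2, means2 = locus_components(locus2)

# (i) Per-locus variances summed across loci before taking the ratio.
vb_loci, va_sum = vb1 + vb2, va1 + va2
qst_loci = vb_loci / (vb_loci + 2 * va_sum)

# (ii) Between-population variance taken from the trait means instead.
trait_means = [g1 + g2 for g1, g2 in zip(means1, means2)]
tbar = sum(trait_means) / len(trait_means)
vb_trait = sum((g - tbar) ** 2 for g in trait_means) / len(trait_means)
qst_trait = vb_trait / (vb_trait + 2 * va_sum)

print(qst_loci, qst_trait)
```

Here the two populations have the same trait mean, so the trait-based value is essentially zero, while the value built from per-locus allele frequencies is large.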

Simulation set E—neutral differentiation:

Finally, we replicated the results of GOUDET and BUCHI (2006) by simulating purely neutral differentiation for both the quantitative and the neutral loci, for two loci coding the quantitative trait and eight neutral loci (set E, data not shown). We set the optimum for both populations X and Y equal to zero, but additionally reduced the strength of selection (γ = 5 x 109) so that the fitness of every genotype was effectively equal. Because the quantitative trait is neutral, we expect for each design that QST ≈ FSTQ ≈ FSTN. Our simulations suggest that average QST – FSTQ values are slightly less than zero, although the proportions of QST – FSTQ greater than and less than zero within replicates of each design are approximately equal. This result agrees with that of LOPEZ-FANJUL et al. (2003, 2007), GOUDET and BUCHI (2006), and GOUDET and MARTIN (2007): when integrating over the surface of all feasible allele frequencies in populations X and Y, the average QST – FSTQ value tends to be negative when d ≠ 0, despite the positive and negative areas of parameter space being approximately equal (Figure 7). Thus, when a trait is evolving neutrally, dominance may deflate the value of QST relative to FST and therefore may lead us to conclude that balancing selection is restricting the differentiation of the quantitative trait.

Finally of note, the values of FSTN are reassuringly similar across sets A–E, reflecting pure drift in all cases. Note that the neutral loci are unlinked to the quantitative loci and their differentiation is dependent only upon the effective size of the population. In particular, FSTN values are consistently high when rates of selfing are high, reflecting local fixation of alleles due to the small effective population size of these designs.


DISCUSSION
There has been growing interest in the evolution of quantitative traits in natural populations, particularly in those traits that have important fitness consequences. Empirical evidence from natural populations suggests that quantitative trait divergence frequently exceeds the divergence seen at neutral markers (MERILA and CRNOKRAK 2001; LEINONEN et al. 2008). In addition to the large variances expected for QST and FST, and the difficulty in testing any difference between them (O'HARA and MERILA 2005; WHITLOCK 2008), recent modeling has further suggested that the observation of QST > FST may be caused by factors intrinsic to the quantitative loci themselves, such as dominance and epistatic interactions between loci (LATTA 1998; WHITLOCK 1999; LE CORRE and KREMER 2003; LOPEZ-FANJUL et al. 2003; GOUDET and BUCHI 2006).

In this article, we constructed both an analytic and a simulation model to describe the differentiation of dominant quantitative traits and neutral markers between two populations linked by migration. We included dominance, selfing, and a divergent linear selection function into recursive equations to find equilibrium allele frequencies for the two populations and used these allele frequencies to calculate QST and FST at the quantitative locus. In addition, we derived expressions and recursive equations to calculate FST at equilibrium for neutral loci. Finally, we constructed an individual-based simulation model to investigate the effects of divergent and balancing selection and drift on the QST – FST contrast.

These models have given important insight into the behavior of QST and FST in the presence of dominance and inbreeding. By using our recursion equations to find expected equilibrium values for QST, we have demonstrated that when a quantitative trait is under divergent selection, the expectation for QST – FSTQ (with FSTQ being FST calculated at the quantitative locus using equilibrium allele frequencies) is generally greater than zero when the quantitative locus is dominant. QST is likely to be elevated relative to the divergence calculated from allele frequencies at the quantitative trait (i.e., FSTQ), and when comparing QST to FST calculated from neutral loci (FSTN), this elevation in QST means that it is more likely that divergent selection will be detected. Results from simulations also indicate that for balancing selection, where we expect QST < FST, the presence of dominance may deflate QST relative to FSTQ and also lead to a larger negative value for QST – FSTN. Despite the impact that dominance, and to a certain extent inbreeding, has on enhancing the QST – FSTQ contrast, this should be seen in context: selection has a very much stronger impact on QST values than dominance or inbreeding.

Effect of dominance:

The reason that dominance generally enhances the difference between QST and FST for the selected locus can be seen clearly when we plot the increase of expected QST and FSTQ values against increasing dominance coefficients for a "typical" set of parameters (Figure 8). Although QST and FST have the same value when there is no dominance (d = 0), when the populations are under divergent selection, QST increases more rapidly than FST with increasing dominance coefficients. In contrast to divergent selection, simulation results for balancing selection suggest that the growth in FST values across dominance coefficients may be greater than that for QST (Table 4), explaining why the presence of dominance enhances the difference between QST and FST for both divergent and balancing selection. We conclude that dominance is very unlikely to mask the effect of selection on the signatures of divergent and balancing selection (QST > FST and QST < FST, respectively).


FIGURE 8.—

Plot of analytic QST – FSTQ values calculated from equilibrium allele frequencies against increasing dominance coefficients (d). Other population parameters are Formula 12, Formula 12 (i = X, Y).

 
There are in addition a number of interesting findings from the sets of simulations. For our simulation results for divergent selection using a linear fitness function (Table 3), the observed difference between QST and FSTQ for d > 0 was frequently negative due to the very large variance in both QST and FSTQ values, despite the relatively large population sizes modeled (NX = NY = 1000). In addition, even with strong selection acting on the quantitative trait, the value of QST was less than FSTN for neutral markers in some cases. Nevertheless, at high levels of dominance it was more probable that QST – FSTQ and QST – FSTN values were positive, suggesting that divergent selection is more likely to be detected in dominant than in additive traits. Interestingly, in simulations with a different divergent selection regime (where genotypic fitnesses are determined by an optimal fitness function), dominance had an even stronger effect on enhancing the difference between QST and FSTQ. Under balancing selection, QST seemed to be less than FSTN in the presence of dominance, suggesting that, as with divergent selection, balancing selection is more likely to be detected in dominant than in additive traits.

The results for one locus under balancing selection are consistent with expectation when comparing quantitative and neutral loci (QST – FSTN < 0). However, when two loci were selected, the observed QST values at low levels of migration and selfing were extremely high compared to the FSTN values. This was a consequence of calculating QST using allele frequencies at these loci rather than the overall trait values; although the mean trait values were similar between populations, allele frequencies could be highly differentiated. Therefore, when interpreting sequence differentiation at loci thought to contribute to quantitative traits, it is important to bear in mind that large sequence differences between two populations may not be reflected in any significant trait differentiation if they are offset by differences in other coding loci throughout the genome.

Finally, as explored by LOPEZ-FANJUL et al. (2003, 2007), GOUDET and BUCHI (2006), and GOUDET and MARTIN (2007), neutral traits show generally very little difference in QST and FSTQ values, with similar proportions of QST – FSTQ greater than and less than zero across designs.

Extensions to more general cases:

Multiple alleles:

In all of our simulations described above, we assumed that the quantitative trait was coded by one or more diallelic loci. To assess whether our conclusions are restricted to the case of two alleles, we ran an additional set of simulations. Inbreeding coefficients, migration rates, and mutation rates were fixed at 1/N with population size NX = NY = N = 1000. To model divergent selection, selection coefficients were chosen from a uniform distribution between 0 and 0.1 (population X) and between –0.1 and 0 (population Y). The simulations were run for 2000 generations with 200 replicates. One set of simulations was run for a single-locus, two-allele model, where the dominance coefficient d was chosen at random from (0, 1), with genotypic values defined as in Table 2. The second set of simulations was run for a single-locus, four-allele model (alleles Ai, i = 1–4). Genotypic values of homozygotes were defined as

Formula 12
where αi is the additive effect of allele Ai and is drawn from a normal distribution of mean 0 and standard deviation of Formula 12. Genotypic values of heterozygotes were defined as

Formula 12
where dij is the dominance term. Each of the six unique dominance terms was drawn from a normal distribution with mean 0 and standard deviation Formula 12 defined by

Formula 12
such that ~95% of heterozygotes fall within the range between the two homozygote genotypic values. FST at the quantitative locus (FSTQ) was calculated using (8). We calculated QST using Equation 4, where VA is defined as twice the covariance between parent and offspring genotypic values (averaged over populations), and VB is the variance in mean genotypic values between populations.

The results from the two sets of simulations strongly support our earlier conclusion that under divergent selection, dominance causes a positive difference between QST and FSTQ. The mean difference between QST and FSTQ was significantly greater than zero whether two or four alleles contribute to the quantitative trait (P < 0.001), and in both cases only a small proportion of replicates had QST – FSTQ values less than zero. Dominance is therefore likely to enhance the difference between QST and FSTQ, whether the quantitative trait locus has two or more alleles, when the quantitative trait is under selection.

Multiple populations:

An additional restriction of our simulations was that we investigated only two populations linked by migration. The variance in QST and FST is certainly affected by the number of populations studied (see WHITLOCK 2008 and references therein) and will affect the likelihood that we can detect a difference between them. We therefore performed simulations for two and four populations under divergent selection, to assess the change in variance in QST, FSTQ, and QST – FSTQ. QST and FST are calculated by extending Equations 8 and 12 to the case of four populations.

We performed two sets of simulations, the first with low migration and inbreeding, where many parameters were fixed (Formula 12, d = Formula 12). The second set of simulations had a larger proportion of parameters that are allowed to vary, with Formula 12 and f, d, and m chosen at random from their biologically feasible ranges. For both sets, half of the selection coefficients were chosen at random from (–1, 0), and half chosen at random from (0, 1). For two and four populations, each of the simulation sets was run for 2000 generations with 500 individuals in each population, and 1000 replicates were performed.

For both sets of simulations, the variances in QST, FSTQ, and QST – FSTQ decreased substantially for four populations compared to two populations. For the first set of simulations, the variances for QST, FSTQ, and QST – FSTQ were 0.0097, 0.0099, and 0.0017, respectively, for two populations and 0.0022, 0.0021, and 0.0007 for four populations. Results from the second set of simulations showed a similar trend. These findings suggest that increasing the number of populations will not only decrease the variance in QST and FSTQ, but also decrease the variance in the difference between the two. When comparing QST to FST from neutral marker loci, the increased precision in QST will make it more likely that we will detect divergent selection on the quantitative trait. The variances in QST and FST, and hence the likelihood that we will observe a difference between them, are influenced by many other factors, for example, sample size and how QST and FST are estimated (WHITLOCK 2008). These factors need to be taken into account when we determine the significance of any difference (or indeed, lack of difference) between QST and FST for empirical data.

We have demonstrated that under divergent selection, dominance is likely to increase the value of QST, relative to the value expected for a purely additive trait. Our simulations also suggest that balancing selection will decrease the value of QST. For neutral loci, the value of QST under dominance may be increased or decreased relative to QST expected for an additive trait. Under divergent, balancing, and no selection, we expect QST greater than, less than, and approximately equal to FST, respectively. Thus, the presence of dominance will enhance the difference between QST and FST when the quantitative locus is selected and will make it more likely that we will be able to detect these "signatures of selection" in natural populations. For neutral loci, dominance does lead to a difference between QST and FST. The difference is, however, small, making a false detection of selection unlikely.

A perhaps unsurprising result from both analytic and simulation models was the very large impact of migration rates on QST and FST values. Even with strong directional selection, high levels of migration can severely restrict the differentiation between populations, such that the difference between QST and FST measured at neutral markers becomes so low as to be indistinguishable from a neutral model (data not shown). Thus, while we cannot have confidence in the QST – FST contrast giving us any reasonable indication of selection when selection is weak and population sizes are small, we similarly are unlikely to detect consistent signatures of selection when migration is high. One major conclusion from these simulations has been previously well stated by GOUDET and MARTIN (2007, p. 1373): compared to dominance (or indeed other factors), "the large variance of QST is certainly more worrisome for the prospect of identifying traits under selection."


APPENDIX A

Population differentiation at neutral markers (FST):

Recall that Ni is the number of individuals in population i (i = X, Y) and that we let the probability of selfing in both populations be f while the probability of parents outcrossing is (1 – f). Selfing and outcrossing are assumed to be independent, and so the numbers of selfed and outbred offspring per parent are binomially distributed with means f and (1 – f), respectively.

We calculate a number of coefficients describing the similarity between and within individuals in a subdivided population, which we use to calculate FST. The inbreeding coefficient F describes the probability that two alleles at a given locus within an individual are identical by descent (IBD), while θ, the coancestry coefficient, is the probability that two alleles chosen at random from two individuals within a population are IBD. Both F and θ can be calculated separately for the two populations, and overall values found by multiplying each population parameter by its proportional population size. Finally, α is the probability that two alleles chosen at random from different populations are IBD.

Recursions for probabilities of identity by descent:

Following the approach of WANG (1997), we derive the full set of recursion equations for F, θ, and α. Individuals within a population outcross or self to give offspring in the next generation, and then a number of these offspring migrate so that population i has a proportion mi of migrants and (1 – mi) of nonmigrants. We first derive the expressions for the probability of IBD following reproduction and then following migration. Definitions of terms used in the following derivations may be found in Table 1.

F, θ, and α after reproduction:

F:

Offspring arise from selfing and outcrossing, with expected proportions f and (1 – f). The inbreeding coefficients of offspring from selfing and outbreeding are Formula 12 and θi, respectively. The average F in the next generation Formula 12 is therefore

Formula 12
for population i.

θ:

When calculating θ between two individuals I1 and I2, we need to consider a number of situations:
  1. Both I1 and I2 are the result of selfing, with probability f². Assuming all parents have an equal chance of producing selfed offspring, the probability that I1 and I2 have the same or different parents is 1/N and (N – 1)/N, respectively. The probability of IBD between I1 and I2 with the same parent is Formula 12 and with different parents is θ.
  2. I1 is the result of selfing and I2 is not (or vice versa), with probability 2f(1 – f). I1 and I2 share a parent with probability 2/N, and the probability of IBD between a selfed and an outbred sib is Formula 12. I1 and I2 share no parents with probability Formula 12, and the probability of IBD between unrelated individuals is θ.
  3. Both I1 and I2 are the result of outcrossing, with probability (1 – f)². The probability that I1 and I2 have the same parent pair is 2/N(N – 1), and the probability of IBD between these full sibs is Formula 12. I1 and I2 share one parent with probability 1/N and do not share the next with probability Formula 12, and can do so four ways, giving an overall probability of Formula 12. The probability of IBD between half sibs is Formula 12. Finally, all parents between and within I1 and I2 are different with probability Formula 12, and the probability of IBD between unrelated individuals is θ.

Multiplying probabilities, summing each of the subcases, and then summing over the three mating types, the overall probability of IBD between two individuals in population i after reproduction is

Formula 12

α:

Reproduction does not change the probability of IBD between populations, so

Formula 12

F, θ, and α after migration:

F:

After migration, the probability that alleles within an individual are IBD for population X is a weighted average of Formula 12 and Formula 12:

Formula 12
A similar expression for population Y is obtained by swapping the X and Y subscripts.
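The two steps for F described above (reproduction, then migration) can be sketched as follows. This is a hedged illustration, not the expressions printed in the article, which are not reproduced here: it assumes the standard result that offspring produced by selfing have inbreeding coefficient (1 + F)/2, and it reads the migration step as a weighted average with weights (1 – m) and m. Function names and starting values are hypothetical.

```python
def f_after_reproduction(F, theta, f):
    """Average inbreeding coefficient of offspring in one population.

    A fraction f of offspring come from selfing, with inbreeding coefficient
    (1 + F) / 2 (assumed standard result); the remaining (1 - f) come from
    outcrossing, with inbreeding coefficient theta.
    """
    return f * (1 + F) / 2 + (1 - f) * theta


def f_after_migration(F_star_own, F_star_other, m):
    """Weighted average over nonmigrants (1 - m) and migrants (m)."""
    return (1 - m) * F_star_own + m * F_star_other


# Hypothetical starting values for populations X and Y.
F_X, F_Y = 0.10, 0.20
theta_X, theta_Y = 0.05, 0.08
f, m_X = 0.3, 0.01

F_star_X = f_after_reproduction(F_X, theta_X, f)
F_star_Y = f_after_reproduction(F_Y, theta_Y, f)
F_X_next = f_after_migration(F_star_X, F_star_Y, m_X)
print(F_X_next)
```

Swapping the X and Y arguments in the last call gives the corresponding value for population Y, mirroring the subscript swap described in the text.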

θ:

After migration, the probability that two alleles from different individuals within X (or Y) are IBD is a weighted average of Formula 12 and Formula 12. For two individuals taken at random from population X, the probability that they are both nonmigrants is Formula 12, that they are both migrants is Formula 12, and that one is migrant and one nonmigrant is Formula 12. Then the average probability of IBD is

Formula 12
A similar expression for population Y is obtained by swapping the X and Y subscripts.

α:

After migration, the probability that two alleles, one from X and one from Y, are IBD is again a weighted average of Formula 12 and Formula 12: Formula 12.

By substituting values of Formula 12 and Formula 12 into the equations for FX(t), FY(t), θX(t), θY(t), and α(t) above, we derive the full set of recursive equations for the probabilities of identity by descent. These recursion equations may be written in matrix notation as

Formula 12
where

Formula A1 (A1)
and Mi = miNi is the number of migrant individuals in population i (i = X, Y). Interestingly, these equations differ from those of WANG (1997) and VITALIS and COUVET (2001), who assume exchange of genes rather than individuals during migration, and hence migration has a different interpretation in the F, θ, and α terms.

If migration and population sizes are the same in the two populations (mX = mY = m and NX = NY = N), then FX = FY = F, θX = θY = θ, and the recursions reduce to three equations, where

Formula A2 (A2)
and M = mN.

While the values of F, θ, and α increase monotonically over time, their instantaneous rates of increase [Formula A2, Formula A2, and Formula A2] converge to the same value if there is some migration between populations (WANG 1997). Their asymptote is equal to half the inverse of the asymptotic effective population size; that is,

Formula A3 (A3)
The expression for Ne may be found from the transition matrix T. Following WANG (1997),

Formula A4 (A4)
where λ is the leading eigenvalue of T. Although finding λ is straightforward when the parameters of the model (f, mX, mY, NX, and NY) are known, there are five eigenvalues for the 5 × 5 matrix T and it becomes difficult to write a general expression for the leading eigenvalue. For example, Mathematica 4.0 (WOLFRAM RESEARCH 1999) finds the five eigenvalues in terms of root objects that may be evaluated only numerically. For given values of NX, NY, mX, mY, and f, we may find a numeric value for Ne either by using the leading eigenvalue of T or by iterating the recursion S(t) = TS(t–1) + C until the changes in ΔF, Δθ, and Δα become small.
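
The two numerical routes to Ne described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation. Because the entries of T and C for the two-population model are not reproduced in this text, the sketch uses a stand-in: a single monoecious population of N individuals with random mating (including random selfing), for which F(t) = θ(t) = [1 + F(t–1)]/(2N) + (1 – 1/N)θ(t–1). It also assumes the standard definition of the rate of increase, ΔF = [F(t) – F(t–1)]/[1 – F(t–1)], and takes Ne = 1/[2(1 – λ)], which follows from setting the asymptotic rate 1 – λ equal to 1/(2Ne). All function names are hypothetical.

```python
def toy_model(N):
    """Stand-in T and C for one random-mating monoecious population.

    State S = (F, theta). This is NOT the 5 x 5 matrix of Equation A1;
    it is used only because its answer (Ne = N) is known.
    """
    T = [[1 / (2 * N), 1 - 1 / N],
         [1 / (2 * N), 1 - 1 / N]]
    C = [1 / (2 * N), 1 / (2 * N)]
    return T, C


def iterate_rates(T, C, tol=1e-12, max_gen=100_000):
    """Iterate S(t) = T S(t-1) + C from S(0) = 0 and return the
    per-generation rates of increase once they stop changing."""
    n = len(C)
    S = [0.0] * n
    prev = None
    for _ in range(max_gen):
        S_new = [sum(T[i][j] * S[j] for j in range(n)) + C[i] for i in range(n)]
        rates = [(S_new[i] - S[i]) / (1.0 - S[i]) for i in range(n)]
        if prev is not None and max(abs(r - p) for r, p in zip(rates, prev)) < tol:
            return rates
        S, prev = S_new, rates
    raise RuntimeError("rates of increase did not converge")


def leading_eigenvalue(T, tol=1e-14, max_iter=100_000):
    """Power iteration for the dominant eigenvalue of a non-negative matrix."""
    n = len(T)
    v = [1.0] * n
    lam = 0.0
    for _ in range(max_iter):
        w = [sum(T[i][j] * v[j] for j in range(n)) for i in range(n)]
        new_lam = max(abs(x) for x in w)
        v = [x / new_lam for x in w]
        if abs(new_lam - lam) < tol:
            return new_lam
        lam = new_lam
    raise RuntimeError("power iteration did not converge")


def ne_from_rates(T, C):
    """Ne from the converged rate of increase: rate = 1 / (2 Ne)."""
    return 1.0 / (2.0 * iterate_rates(T, C)[0])


def ne_from_eigenvalue(T):
    """Ne from the leading eigenvalue, assuming Ne = 1 / (2 (1 - lambda))."""
    return 1.0 / (2.0 * (1.0 - leading_eigenvalue(T)))


if __name__ == "__main__":
    T, C = toy_model(50)
    print(ne_from_rates(T, C), ne_from_eigenvalue(T))
```

For the stand-in model both routes should recover Ne = N up to floating-point error. For the full model, T and C would be filled in from Equation A1 and the same two functions applied unchanged.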

When we make the simplification that migration rates and population sizes are equal for the two populations, and T is reduced to a 3 × 3 matrix [see (A2) above], we can solve for the leading eigenvalue λ of T. By substituting λ into (A4), we can derive an explicit expression for Ne,

Formula A4
where

Formula A4
and

Formula A4
and | | is the modulus of the expression.

FST at equilibrium:

Using our recursive equations, we may also determine values for Wright's F-statistics, including FST. The F-statistics are defined as

Formula A4
(WRIGHT 1969), where F, θ, and α are the probabilities of IBD described above. For given parameter values of f, mi, and Ni, therefore, we can use recurrent Equation A1 to obtain equilibrium values for FST, FIS, and FIT. In the simple case of mX = mY = m and NX = NY = N, we obtain

Formula A4

Formula A4
and

Formula A4
Solving these equations for FST using (A3) yields

Formula A5 (A5)
where Ne is defined above.
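
As a minimal sketch of how F-statistics follow from the three IBD probabilities, the code below assumes the standard forms FIS = (F – θ)/(1 – θ), FST = (θ – α)/(1 – α), and FIT = (F – α)/(1 – α). These forms are an assumption here, since the defining equation is not reproduced in this text, and the function name and input values are hypothetical.

```python
def f_statistics(F, theta, alpha):
    """Return (FIS, FST, FIT) from IBD probabilities within individuals (F),
    between individuals within a population (theta), and between
    populations (alpha), using the assumed standard definitions."""
    f_is = (F - theta) / (1 - theta)
    f_st = (theta - alpha) / (1 - alpha)
    f_it = (F - alpha) / (1 - alpha)
    return f_is, f_st, f_it


if __name__ == "__main__":
    # Arbitrary illustrative values, not results from the paper.
    f_is, f_st, f_it = f_statistics(F=0.4, theta=0.3, alpha=0.1)
    print(f_is, f_st, f_it)
    # Wright's identity (1 - FIT) = (1 - FIS)(1 - FST) holds by construction.
    print(abs((1 - f_it) - (1 - f_is) * (1 - f_st)) < 1e-12)
```

With these definitions the identity (1 – FIT) = (1 – FIS)(1 – FST) holds for any admissible inputs, which gives a quick consistency check on values obtained from the recursions.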


APPENDIX B: EXPRESSION FOR QST–FST FOR A SINGLE QUANTITATIVE LOCUS WITH TWO ALLELES
Note that pi is the frequency of allele A in population i (i = X, Y).

Formula A5
where Formula A5 and Formula A5.


ACKNOWLEDGEMENTS
The authors thank the anonymous reviewers for their extremely helpful suggestions to improve the manuscript.


LITERATURE CITED

BONNIN, I., J. M. PROSPERI and I. OLIVIERI, 1996 Genetic markers and quantitative genetic variation in Medicago truncatula (Leguminosae): a comparative analysis of population structure. Genetics 143: 1795–1805.

COCKERHAM, C. C., and B. S. WEIR, 1984 Covariances of relatives stemming from a population undergoing mixed self and random mating. Biometrics 40: 157–164.

CRNOKRAK, P., and J. MERILA, 2002 Genetic population divergence: markers and traits. Trends Ecol. Evol. 17: 501.

GOLDSTEIN, D. B., and K. E. HOLSINGER, 1992 Maintenance of polygenic variation in spatially structured populations: roles for local mating and genetic redundancy. Evolution 46: 412–429.

GOUDET, J., and L. BUCHI, 2006 The effects of dominance, regular inbreeding and sampling design on QST, an estimator of population differentiation for quantitative traits. Genetics 172: 1337–1347.

GOUDET, J., and G. MARTIN, 2007 Under neutrality, QST ≤ FST when there is dominance in an island model. Genetics 176: 1371–1374.

HEYWOOD, J. S., 2005 An exact form of the breeder's equation for the evolution of a quantitative trait under natural selection. Evolution 59: 2287–2298.

LANDE, R., 1992 Neutral theory of quantitative genetic variance in an island model with local extinction and colonization. Evolution 46: 381–389.

LATTA, R. G., 1998 Differentiation of allelic frequencies at quantitative trait loci affecting locally adaptive traits. Am. Nat. 151: 283–292.

LE CORRE, V., and A. KREMER, 2003 Genetic variability at neutral markers, quantitative trait loci and traits in a subdivided population under selection. Genetics 164: 1205–1219.

LEINONEN, T., R. B. O'HARA, J. M. CANO and J. MERILA, 2008 Comparative studies of quantitative trait and neutral marker divergence: a meta-analysis. J. Evol. Biol. 21: 1–17.

LOPEZ-FANJUL, C., A. FERNANDEZ and M. A. TORO, 2003 The effect of neutral nonadditive gene action on the quantitative index of population divergence. Genetics 164: 1627–1633.

LOPEZ-FANJUL, C., A. FERNANDEZ and M. A. TORO, 2007 The effect of dominance on the use of the QST–FST contrast to detect natural selection on quantitative traits. Genetics 176: 501–511.

LYNCH, M., and K. SPITZE, 1994 Evolutionary genetics of Daphnia, pp. 109–128 in Ecological Genetics, edited by L. A. REAL. Princeton University Press, Princeton, NJ.

LYNCH, M., and B. WALSH, 1998 Genetics and Analysis of Quantitative Traits. Sinauer Associates, Sunderland, MA.

MARUYAMA, K., and H. TACHIDA, 1992 Genetic variability and geographical structure in partially selfing populations. Jpn. J. Genet. 67: 39–51.

MERILA, J., and P. CRNOKRAK, 2001 Comparison of genetic differentiation at marker loci and quantitative traits. J. Evol. Biol. 14: 892–903.

O'HARA, R. B., and J. MERILA, 2005 Bias and precision in QST estimates: problems and some solutions. Genetics 171: 1331–1339.

SPITZE, K., 1993 Population structure in Daphnia obtusa: quantitative genetic and allozymic variation. Genetics 135: 367–374.

TEMPLETON, A., 1987 The general relationship between average effect and average excess. Genet. Res. 49: 69–70.

TURELLI, M., 1984 Heritable genetic variation via mutation-selection balance: Lerch's zeta meets the abdominal bristle. Theor. Popul. Biol. 25: 138–193.

VITALIS, R., and D. COUVET, 2001 Estimation of effective population size and migration rate from one- and two-locus identity measures. Genetics 157: 911–925.

WANG, J., 1997 Effective size and F-statistics of subdivided populations. 1. Monoecious species with partial selfing. Genetics 146: 1453–1463.

WEIR, B. S., and C. C. COCKERHAM, 1984 Estimating F-statistics for the analysis of population structure. Evolution 38: 1358–1370.

WEIR, B. S., and W. G. HILL, 2002 Estimating F-statistics. Annu. Rev. Genet. 36: 721–750.

WHITLOCK, M. C., 1999 Neutral additive genetic variance in a metapopulation. Genet. Res. 74: 215–221.

WHITLOCK, M. C., 2008 Evolutionary inference from QST. Mol. Ecol. 17: 1885–1896.

WHITLOCK, M. C., and D. E. MCCAULEY, 1999 Indirect measures of gene flow and migration: FST ≠ 1/(4Nm + 1). Heredity 82: 117–125.

WOLFRAM RESEARCH, 1999 Mathematica Version 4.0. Wolfram Research, Champaign, IL.

WRIGHT, S., 1951 The genetic structure of populations. Ann. Eugen. 15: 323–354.

WRIGHT, S., 1969 Evolution and the Genetics of Populations. University of Chicago Press, Chicago.

Communicating editor: J. WAKELEY