## Abstract

The effective population size (*N*_{e}) is frequently estimated using temporal changes in allele frequencies at neutral markers. Such temporal changes in allele frequencies are usually estimated from the standardized variance in allele frequencies (*F*_{c}). We simulate Wright-Fisher populations to generate expected distributions of *F*_{c} and of *F̅*_{c} (*F*_{c} averaged over several loci). We explore the adjustment of these simulated *F̅*_{c} distributions to a chi-square distribution and evaluate the resulting precision on the estimation of *N*_{e} for various scenarios. Next, we outline a procedure to test for the homogeneity of the individual *F*_{c} across loci and identify markers exhibiting extreme *F*_{c}-values compared to the rest of the genome. Such loci are likely to be in genomic areas undergoing selection, driving *F*_{c} to values greater (or smaller) than expected under drift alone. Our procedure assigns a *P*-value to each locus under the null hypothesis (drift is homogeneous throughout the genome) and simultaneously controls the rate of false positives among loci declared as departing significantly from the null. The procedure is illustrated using two published data sets: (i) an experimental wheat population subject to natural selection and (ii) a maize population undergoing recurrent selection.

The effective population size (*N*_{e}), defined as the size of an ideal Wright-Fisher population undergoing the same rate of genetic change as the population under study, is an essential parameter to predict the evolution of a population due to genetic drift in terms of rates of loss of genetic variation, fixation of deleterious alleles, or inbreeding (Wright 1969). However, obtaining direct estimates of *N*_{e} from demographic data has often proved difficult. An alternative is to use indirect methods, for instance, those based on the measurement of temporal changes in allele frequencies at neutral markers (Krimbas and Tsakas 1971; Waples 1989a). The foundation of these methods is that the variance of allele frequency due to drift from parents to offspring, *V*(*P*_{1}), depends on *N*_{e} as follows: *V*(*P*_{1}) = *P*_{0}(1 − *P*_{0})/2*N*_{e}, where *P*_{0} is the frequency in the parental population. After *t* generations of drift, the expected frequency of the allele is *E*(*P*_{t}) = *P*_{0} and the variance of the allele frequency, *V*(*P*_{t}) = *E*(*P*_{t} − *P*_{0})^{2}, can be written as a function of *N*_{e}: *V*(*P*_{t}) = *P*_{0}(1 − *P*_{0})[1 − (1 − 1/2*N*_{e})^{t}] (Crow and Kimura 1970). If *t* is not too large (*t* ≪ *N*_{e}), then 1 − (1 − 1/2*N*_{e})^{t} ≅ *t*/2*N*_{e}, so *N*_{e} can be approximated by *N*_{e} ≅ (*P*_{0}(1 − *P*_{0})*t*)/(2*V*(*P*_{t})). An estimator of *N*_{e} can therefore be based on the standardized variance in allele frequency, *V*(*P*_{t})/(*P*_{0}(1 − *P*_{0})). Nei and Tajima (1981) proposed estimating the standardized variance in allele frequency between generations *t*_{x} and *t*_{y} for each locus *l* with *K*_{l} alleles as

$$\hat{F}_{c,l} = \frac{1}{K_l} \sum_{i=1}^{K_l} \frac{\left(p_{x(i,l)} - p_{y(i,l)}\right)^2}{\left(p_{x(i,l)} + p_{y(i,l)}\right)/2 - p_{x(i,l)}\, p_{y(i,l)}} \tag{1}$$

where *p*_{x(i,l)} [respectively *p*_{y(i,l)}] represents the frequency of allele *i* at locus *l* in the sample of *S*_{x} individuals drawn at generation *t*_{x} (respectively *S*_{y} individuals at *t*_{y}). A weighted mean of *F̂*_{c,l}-values across several loci,

$$\bar{F}_c = \frac{\sum_l K_l\, \hat{F}_{c,l}}{\sum_l K_l} \tag{2}$$

is then typically used to estimate *N*_{e} via

$$\hat{N}_e = \frac{t_y - t_x}{2\left[\bar{F}_c - 1/(2S_x) - 1/(2S_y)\right]} \tag{3}$$

(Waples 1989a). Note that Equation 3 assumes that allele frequencies are estimated from samples taken prior to reproduction (the so-called plan II sampling scheme). We use that sampling scheme in the remainder of this article; allowing for an alternative sampling scheme is straightforward.
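The chain from sample allele frequencies to an estimate of *N*_{e} can be sketched in a few lines of Python. This is only an illustration of the standard Nei and Tajima (1981) and Waples (1989a) plan II formulas, not the program used in this note. The function names are hypothetical, and the choice to skip alleles that are absent from (or fixed in) both samples, where the denominator of the per-allele term vanishes, is an assumption made here.

```python
import math

def fc_locus(px, py):
    """Standardized variance F_c at one locus (Nei and Tajima 1981).

    px, py: sample allele frequencies at generations t_x and t_y.
    Returns (F_c, K), where K counts the alleles actually used.
    Assumption: alleles with a zero denominator (absent from, or fixed in,
    both samples) carry no information and are skipped.
    """
    terms = []
    for x, y in zip(px, py):
        denom = (x + y) / 2.0 - x * y
        if denom > 0.0:
            terms.append((x - y) ** 2 / denom)
    k = len(terms)
    return (sum(terms) / k if k else 0.0), k

def mean_fc(loci):
    """Mean of single-locus F_c values, weighted by the number of alleles.

    loci: iterable of (px, py) pairs, one pair per locus.
    """
    num = den = 0.0
    for px, py in loci:
        fc, k = fc_locus(px, py)
        num += k * fc
        den += k
    return num / den

def ne_plan2(fbar, t, sx, sy):
    """Plan II estimator of N_e from the mean F_c (Waples 1989a).

    t: number of generations between samples; sx, sy: sample sizes
    (diploid individuals). Returns infinity when the observed variance
    does not exceed what sampling alone would produce.
    """
    drift = fbar - 1.0 / (2.0 * sx) - 1.0 / (2.0 * sy)
    return t / (2.0 * drift) if drift > 0.0 else math.inf
```

For instance, `ne_plan2(mean_fc(loci), t=9, sx=250, sy=213)` would apply the estimator to a list `loci` of frequency pairs sampled nine generations apart.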

Recently, renewed interest in estimating *N*_{e} has led to the development of numerous methods using allele frequencies observed in a series of temporally spaced samples of a population. Williamson and Slatkin (1999) and Anderson *et al.* (2000) introduced a maximum-likelihood approach to estimate *N*_{e}. Berthier *et al.* (2002) proposed a coalescent-based likelihood approach to estimate *N*_{e} and Wang (2001) devised a faster approximate version using a pseudo-maximum-likelihood method. These methods, tested by the authors for some values of population parameters (*N*_{e}, *l*, *K*_{l}, and *p*_{x(i,l)}), have proved to be slightly more accurate than the *F*_{c} method, but are much more computationally intensive. Hence *F*_{c}-based estimators of *N*_{e} remain frequently used in practice (Fujiio *et al.* 1999; Turner *et al.* 1999; Goldringer *et al.* 2001; Shikano *et al.* 2001). Properties of *F*_{c}-based estimators of *N*_{e} and the quality of the confidence intervals around such estimates depend critically on the distribution of *F̅*_{c}-values. Confidence intervals around *N*_{e} have been based on the fact that *n* *F̅*_{c}/*E*(*F̅*_{c}), with *n* = ∑_{l}(*K*_{l} − 1), is distributed approximately as a chi-square with *n* d.f. (Lewontin and Krakauer 1973). Hence, it is important to assess how well the actual *F̅*_{c} distribution fits a chi-square distribution. The chi-square approximation has been studied for some special cases, but the effects of initial allele frequencies, of the number of alleles, of the number of loci, and of the number of generations as well as the “true” effective population size on the distribution of *F̅*_{c}-values and on *N̂*_{e} are still poorly known (Waples 1989a).
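A minimal sketch of how the chi-square approximation translates into a confidence interval on *N*_{e} is given below, assuming the usual construction in which the bounds on *F̅*_{c} are *n* *F̅*_{c} divided by the upper and lower chi-square quantiles, and each bound is then converted to *N*_{e} with the plan II estimator. To stay within the Python standard library, the chi-square quantile is obtained from the Wilson-Hilferty approximation rather than an exact routine; this substitution, like the function names, is an assumption of the sketch.

```python
import math
from statistics import NormalDist

def chi2_quantile_wh(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile (df d.f.)."""
    z = NormalDist().inv_cdf(p)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * math.sqrt(a)) ** 3

def ne_from_f(f, t, sx, sy):
    """Plan II conversion of a standardized variance f into N_e."""
    drift = f - 1.0 / (2.0 * sx) - 1.0 / (2.0 * sy)
    return t / (2.0 * drift) if drift > 0.0 else math.inf

def ne_chi2_interval(fbar, n, t, sx, sy, level=0.95):
    """Chi-square-based confidence interval on N_e.

    fbar: mean F_c over loci; n: number of independent alleles,
    i.e. the sum over loci of (K_l - 1).
    """
    alpha = 1.0 - level
    f_low = n * fbar / chi2_quantile_wh(1.0 - alpha / 2.0, n)
    f_high = n * fbar / chi2_quantile_wh(alpha / 2.0, n)
    # A large variance corresponds to a small N_e, so the bounds swap.
    return ne_from_f(f_high, t, sx, sy), ne_from_f(f_low, t, sx, sy)
```

The upper bound is returned as infinity whenever the lower bound on *F̅*_{c} falls below the variance expected from sampling alone.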

Before averaging estimates of *F*_{c} obtained at individual loci to obtain *F̅*_{c} and an estimate of *N*_{e}, it is desirable to test whether all loci used for that study have experienced the same effective population size. This implicit assumption, which underlies all methods of estimation mentioned above, is rarely tested. Several factors can modify the local effective size at a given locus. The recurrent elimination of deleterious variants linked to a marker locus, known as “background selection,” will reduce the effective size locally; this effect depends on the local recombination rates and on genome-wide parameters describing spontaneous mutation and their effect on fitness (Charlesworth *et al.* 1993). Hitchhiking will also drive higher than expected the temporal variance in allele frequency of markers linked to a positively selected variant (Wiehe and Stephan 1993).

The remainder of this note is organized as follows. In the first part, we study the actual *F̅*_{c} distribution and the quality of the chi-square approximation. The actual distribution of *F̅*_{c}, its divergence from a chi-square distribution, and the quality of the *N*_{e} estimation based on *F*_{c} are studied under various scenarios by varying the initial allele frequencies, the number of loci, the number of generations, the sample size at both generations, and the true effective population size. In the second part, we outline a procedure to identify loci with “extreme” individual *F̂*_{c,l}-values. We illustrate our approach by reanalyzing two experimental data sets: temporal variation in allele frequencies at 29 markers in an experimental wheat population under natural selection and frequencies at 82 markers in a maize population under recurrent selection.

## The actual distribution of *F̅*_{c} and its consequences for estimating *N*_{e}:

To investigate the actual distribution of *F̅*_{c}, we simulated Wright-Fisher populations using an exact multinomial sampling scheme. We generated expected distributions of temporal variations in allele frequencies conditional on initial allele frequencies. Distributions were based on 3000 independent replicates. In each replicate, several loci with the same initial allele frequencies were simulated and *F̅*_{c} was computed. All simulations were carried out using *Mathematica* (Wolfram 1996). Our simulations show that the actual distribution of *F̅*_{c} can depart substantially from a chi-square distribution, as measured through Kullback's (1968) symmetric measure of divergence between both distributions, under a variety of conditions depending on the parameter values chosen (Table 1). The actual *F̅*_{c} distribution is closest to a chi-square, and thus *N*_{e} is best estimated, when biallelic marker loci with equal frequencies are used. Conversely, the discrepancy between the actual *F̅*_{c} distribution and the chi-square approximation is large when allele frequencies are strongly unbalanced (*P*_{0} < 0.1 for at least one allele), when the number of alleles per locus is large (*K* ≥ 5) such as for microsatellite markers, or when the number of generations increases (δ*T* = *t*_{y} − *t*_{x} > 15 when *N*_{e} = 100 is assumed). Increasing the sample sizes up to 200 or 500 individuals does not diminish the discrepancy, especially for a high number of alleles (data not shown). In most cases, the distribution of simulated values is narrower than the chi-square approximation. In addition, the actual distribution is slightly skewed toward higher values of *F*_{c}. As a consequence, the distribution of *N̂*_{e} in the simulations is often narrower than the one based on the chi-square approximation.

Confidence intervals at the 95% level based on either the chi-square approximation or the actual *F̅*_{c} distribution are given in Table 1. These are exactly the intervals that would be computed in experimental studies. Chi-square confidence intervals are often wider than the confidence intervals based on the simulated *F̅*_{c} (Table 1). The width of this interval, which is connected to the precision of the estimation, depends mainly on the number of independent alleles used [*L*(*K* − 1)]. Note that the product *L*(*K* − 1) is also the number of degrees of freedom of the chi-square used in previous approximations. Confidence intervals derived from simulated data are reduced by 10–25% relative to chi-square-based confidence intervals for scenarios involving unbalanced initial allele frequencies with 5 ≤ *L*(*K* − 1) ≤ 10, or a very large number of alleles (*K* = 10), sample sizes <100, δ*T* > 10, or *N*_{e} < 75. Otherwise, they are quite close to the chi-square-based intervals (reduction is <10%). Similar results are found with larger sample sizes (200 and 500 individuals).
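The simulation scheme described above can be sketched as follows. This is a Python illustration written for this purpose, not the *Mathematica* notebook used for the results reported here. It assumes that both samples are drawn multinomially (with replacement) from the population allele frequencies, that the population consists of *N*_{e} diploid individuals reproducing by multinomial sampling of 2*N*_{e} genes, and that alleles absent from both samples are dropped when computing *F*_{c}. All function and parameter names are hypothetical.

```python
import random

def multinomial_freqs(rng, probs, n_genes):
    """Draw n_genes genes multinomially and return the allele frequencies."""
    counts = [0] * len(probs)
    for allele in rng.choices(range(len(probs)), weights=probs, k=n_genes):
        counts[allele] += 1
    return [c / n_genes for c in counts]

def fc_locus(px, py):
    """F_c at one locus; returns (F_c, number of informative alleles)."""
    terms = []
    for x, y in zip(px, py):
        denom = (x + y) / 2.0 - x * y
        if denom > 0.0:  # skip alleles absent from, or fixed in, both samples
            terms.append((x - y) ** 2 / denom)
    k = len(terms)
    return (sum(terms) / k if k else 0.0), k

def simulate_fbar(rng, p0, n_loci, ne, t, sx, sy):
    """One replicate: mean F_c over n_loci loci sharing initial frequencies p0."""
    num = den = 0.0
    for _ in range(n_loci):
        px = multinomial_freqs(rng, p0, 2 * sx)        # sample at t_x
        pop = list(p0)
        for _ in range(t):                             # t generations of drift
            pop = multinomial_freqs(rng, pop, 2 * ne)
        py = multinomial_freqs(rng, pop, 2 * sy)       # sample at t_y
        fc, k = fc_locus(px, py)
        num += k * fc
        den += k
    # A replicate with no informative locus contributes a zero by convention.
    return num / den if den else 0.0

def simulate_fbar_distribution(p0, n_loci, ne, t, sx, sy, reps=3000, seed=1):
    """Sorted mean-F_c values over independent replicates."""
    rng = random.Random(seed)
    return sorted(simulate_fbar(rng, p0, n_loci, ne, t, sx, sy)
                  for _ in range(reps))
```

A call such as `simulate_fbar_distribution([0.5, 0.5], n_loci=10, ne=100, t=10, sx=100, sy=100)` yields an empirical distribution whose quantiles can be set against those of the chi-square approximation.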

Except for δ*T* = 5 generations, *N̂*^{*}_{e}, computed from *F̅*^{*}_{c} (an average of *F̅*_{c} over 3000 independent replicates), is always higher than the population size *N*_{e} used in simulations. This indicates that *F̅*_{c}-based estimation tends to overestimate *N*_{e}. Richards and Leberg (1996) and Luikart *et al.* (1999) argued that the overestimation of *N*_{e} using *F̅*_{c} [or Pollak's (1983) estimator, *F̂*_{k}] is due mainly to the loss of alleles in early generations, suggesting that the bias would be greater with increasing drift and when there are rare alleles. Luikart *et al.* (1999) focused on the estimation of *N*_{e} after very strong bottlenecks (*N*_{e} = 4–40), which were not considered here. Whereas our results confirm the existence of a greater bias for rare alleles, for the range of *N*_{e} we considered, population size does not appear as critical; however, increasing the number of generations between samples leads to overestimation of *N*_{e}. Hence, to improve the precision of the *F*_{c}-based estimation of *N*_{e}, we recommend generating the actual *F̅*_{c} distribution using simulations based on the estimated value *N̂*_{e} and obtaining a confidence interval directly on *N̂*_{e}. With such a gain in accuracy, the performance of the *F*_{c}-based estimator of *N*_{e} comes close to that of likelihood-based estimators.
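One possible reading of this recommendation is a percentile interval: each simulated *F̅*_{c} value, generated with the population size set to *N̂*_{e}, is converted to an estimate of *N*_{e}, and the interval is read from the empirical quantiles. The sketch below assumes that interpretation, the plan II conversion, and a simple nearest-rank quantile rule; the function names are hypothetical and the simulated values are taken as input from any Wright-Fisher simulator.

```python
import math

def ne_from_f(f, t, sx, sy):
    """Plan II conversion of a standardized variance f into N_e."""
    drift = f - 1.0 / (2.0 * sx) - 1.0 / (2.0 * sy)
    return t / (2.0 * drift) if drift > 0.0 else math.inf

def nearest_rank(sorted_values, q):
    """Empirical quantile by nearest rank on an already sorted list."""
    index = round(q * (len(sorted_values) - 1))
    return sorted_values[index]

def ne_simulated_interval(sim_fbar, t, sx, sy, level=0.95):
    """Percentile interval on N_e from simulated mean-F_c values.

    sim_fbar: mean-F_c values simulated under pure drift with the
    population size set to the point estimate of N_e.
    """
    alpha = 1.0 - level
    ne_values = sorted(ne_from_f(f, t, sx, sy) for f in sim_fbar)
    return (nearest_rank(ne_values, alpha / 2.0),
            nearest_rank(ne_values, 1.0 - alpha / 2.0))
```

Replicates in which the simulated variance does not exceed the sampling variance map to an infinite estimate, so the upper bound can be infinite when drift is weak relative to sampling error.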

## Distribution of *F*_{c} at individual loci and the detection of loci departing from pure drift:

Once *N*_{e} has been estimated on the basis of *F̅*_{c} by averaging over several marker loci, it is desirable to test whether all loci considered have undergone the same rate of change in allele frequency. Heterogeneity in individual *F*_{c}-values might be used as evidence for selection since “*while natural selection will operate differently for each locus and each allele at a locus, the effect of breeding structure* (migration, genetic drift, inbreeding) *is uniform over all loci*” (Lewontin and Krakauer 1973, pp. 176–177). Loci with significantly high *F̂*_{c,l}-values should be discarded before (re)computing *F̅*_{c} to yield a more reliable estimate of *N*_{e}.

We propose a way to identify, in a series of experimental *F̂*_{c,l} measurements, markers exhibiting *F̂*_{c,l}-values significantly higher than expected under pure drift based on *N̂*_{e}. To do so, each *F̂*_{c,l}-value should be compared to an expected distribution based on a genome-wide effective size estimated from the remaining loci and the trajectory of allele frequencies at this locus. We illustrate our method below with two published data sets.

First we consider individual *F̂*_{c,l}-values estimated from temporal variations of allele frequencies in an experimental composite wheat population undergoing natural selection. A total of 250 and 213 individuals were sampled at generations 1 and 10, respectively, and genotyped at 29 RFLP loci (Goldringer *et al.* 2001). For each locus *l*, we pooled the remaining 28 loci to obtain a global *F̅*_{c} estimate and an *F̅*_{c}-based estimate of *N*_{e} (described hereafter as the genome-wide average estimate). The expected distribution of *F*_{c} at locus *l* was then obtained (using typically 3000–5000 independent simulations) conditional on *N̂*_{e} (excluding locus *l*) and the observed initial allele frequencies (at locus *l*). We then tested whether the observed temporal variance of allele frequencies at locus *l*, *F̂*_{c,l}, was significantly larger than the genome-wide average variations by computing *P*, the probability for *F̂*_{c,l} to be greater than or equal to the observed value at this locus on the basis of the simulated distribution described above. Note that one could also test for the presence of loci exhibiting smaller than expected variations in allele frequency. Some loci exhibited “excess drift” relative to the rest of the loci and accordingly fairly small *P*-values: *Fba242-C* (*P* = 0.021), *Fba280-C* (*P* = 0.042), *Fba65-D* (*P* = 0.085), and *Fba204-A* (*P* = 0.09). However, the distribution of *P*-values was fairly uniform (data not shown). To take into account the fact that multiple loci were examined, we computed the expected false discovery rates, also known as *q*-values, using the distribution of *P*-values (see Storey and Tibshirani 2003 for details). The *q*-values were calculated using the package QVALUE (http://faculty.washington.edu/jstorey/qvalue/index.html). This analysis suggests that declaring only *Fba242-C* and *Fba280-C* as “significant” for excess of drift would still yield an estimated rate of false positives of ∼40% among these two loci.
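The two computations at the heart of this procedure, the simulation-based *P*-value at a locus and the conversion of a set of *P*-values into *q*-values, can be sketched as follows. The *q*-value function is a simplified stand-in for the QVALUE package: it estimates the proportion of true nulls from a single tuning value λ (an assumption; the package offers more refined estimators) and then applies the step-up rule of Storey and Tibshirani (2003). Function names and the fallback used when no *P*-value exceeds λ are choices made for this sketch.

```python
def empirical_p_value(observed, simulated):
    """Fraction of simulated null F_c values >= the observed value."""
    hits = sum(1 for s in simulated if s >= observed)
    return hits / len(simulated)

def q_values(p_values, lam=0.5):
    """Simplified q-values with a single-lambda estimate of pi0.

    pi0, the proportion of loci following the null hypothesis, is
    estimated as #{p > lam} / (m * (1 - lam)), capped at 1.
    Assumption: if no P-value exceeds lam, pi0 is set to 1, which
    reduces to the Benjamini-Hochberg adjustment.
    """
    m = len(p_values)
    pi0 = min(1.0, sum(1 for p in p_values if p > lam) / (m * (1.0 - lam)))
    if pi0 == 0.0:
        pi0 = 1.0
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * p_values[i] / rank)
        q[i] = running_min
    return q
```

In the leave-one-out scheme described above, `simulated` would hold the *F*_{c} values simulated at locus *l* under the effective size estimated from the other loci, and `q_values` would be applied once to the *P*-values of all loci.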

Next we consider the study published by Labate *et al.* (1999), where temporal variations in allele frequencies were surveyed at 82 RFLP loci after 12 generations in maize populations undergoing recurrent selection. A *P*-value was calculated at each locus (Figure 1), using the method described above. The distribution of *P*-values was then used to calculate corresponding *q*-values (Figure 2). In contrast with the previous case, the distribution of *P*-values is clearly L-shaped (Figure 1) and choosing a *q*-value cutoff of 0.05 yields 10–11 loci exhibiting significant departures from the genome-wide level of drift. This method proved to be somewhat more conservative than the one used by the authors, who declared 14 loci as outliers (Labate *et al.* 1999). After discarding those outlier loci, one can compute a new genome-wide effective size and check that no further loci exhibit *F*_{c}-values departing significantly from the null hypothesis of homogeneous drift (data not shown).

Waples (1989b) proposed a method based on the chi-square test of homogeneity to test the hypothesis that observed changes in allele frequencies can be satisfactorily explained by drift alone. This allows one to examine the variation of a particular allele according to the range of possible *N*_{e}-values for the population under study. Yet, this test has not been widely used in experimental studies. Indeed, it is rather complicated to implement, particularly in cases of multiple alleles since it is necessary to consider covariances of frequencies for different alleles sampled at different times. Lewontin and Krakauer (1973) provided the theoretical grounds for homogeneity tests of variation in allele frequencies, but they emphasized more spatial variation, and the test proposed for temporal variation, which again relies on the assumption of a chi-square distribution of individual *F*_{c,l}-values, is much too restrictive [see Beaumont and Nichols (1996) and Vitalis *et al.* (2001) for the case of spatial variation in allele frequencies]. The use of simulation-based distributions provides a robust method to test for homogeneity of *F̂*_{c,l}-values across loci before pooling estimates. Our procedure yields a more reliable genome-wide estimate of the realized *N*_{e} and can be used to detect markers exhibiting *F*_{c}-values significantly higher than expected on the basis of mean *N*_{e}, thereby providing a (formal) way of assessing if selection is operating on any given genomic segment (see also Luikart *et al.* 2003 for a review of available methods for population structure). One potential caveat of our method is that the distribution of *F*_{c} under the null hypothesis is generated using information from the data (to estimate the genome-wide *N*_{e}). We verified through simulations (see online supplementary material at http://www.genetics.org/supplemental/) that our procedure is actually fairly robust to uncertainty in the estimation of the genome-wide *N*_{e}. A program generating the expected individual or mean *F*_{c} distributions used in this note is available upon request as a *Mathematica* notebook from the authors.

## Acknowledgments

We thank F. Hospital, I. Bonnin, and C. Dillmann for helpful discussions and A. Tsitrone, R. Waples, and an anonymous reviewer for their comments on earlier versions of this article. We thank O. Martin for correcting and improving the English of this article.

## Footnotes

Communicating editor: O. Savolainen

- Received December 17, 2003.
- Accepted June 7, 2004.

- Genetics Society of America