Comparative Evaluation of a New Effective Population Size Estimator Based on Approximate Bayesian Computation
David A. Tallmon, Gordon Luikart, Mark A. Beaumont


We describe and evaluate a new estimator of the effective population size (Ne), a critical parameter in evolutionary and conservation biology. This new “SummStat” Ne estimator is based upon the use of summary statistics in an approximate Bayesian computation framework to infer Ne. Simulations of a Wright-Fisher population with known Ne show that the SummStat estimator is useful across a realistic range of individuals and loci sampled, generations between samples, and Ne values. We also address the paucity of information about the relative performance of Ne estimators by comparing the SummStat estimator to two recently developed likelihood-based estimators and a traditional moment-based estimator. The SummStat estimator is the least biased of the four estimators compared. In 32 of 36 parameter combinations investigated using initial allele frequencies drawn from a Dirichlet distribution, it has the lowest bias. The relative mean square error (RMSE) of the SummStat estimator was generally intermediate to the others. All of the estimators had RMSE > 1 when small samples (n = 20, five loci) were collected a generation apart. In contrast, when samples were separated by three or more generations and Ne ≤ 50, the SummStat and likelihood-based estimators all had greatly reduced RMSE. Under the conditions simulated, SummStat confidence intervals were more conservative than the likelihood-based estimators and more likely to include true Ne. The greatest strength of the SummStat estimator is its flexible structure. This flexibility allows it to incorporate any potentially informative summary statistic from population genetic data.

THE effective population size (Ne) plays a central role in how a population evolves because Ne affects the degree to which a population can respond to selection, as well as its sensitivity to inbreeding effects (Crow and Kimura 1970; Lande 1995; Lynch et al. 1995). As a result of the critical importance of Ne to evolution, a great deal of effort has focused upon estimating Ne accurately and precisely, and there is always a demand for efficient and useful Ne estimators. Currently, many different methods are available to infer Ne, including ones based on demographic or genetic data. These methods vary in the types of information they use, their accuracy, and the kinds of Ne estimates they provide (Crow and Denniston 1988; Harris and Allendorf 1989; Waples 1991; Schwartz et al. 1998; Storz et al. 2002).

Ne can be estimated from genetic data in one or more samples (Waples 1991). Most one-sample estimators use associations among alleles at different loci to infer Ne (Hill 1981; Vitalis and Couvet 2001). Multiple-sample methods infer Ne from temporal changes in allele frequencies or the rate of coalescence of alleles between sample periods (Nei and Tajima 1981; Wang 2001; Berthier et al. 2002). The most widely used estimator is a method-of-moments estimator (moment estimator), which infers Ne from the standardized variance in allele frequencies sampled one or more generations apart. The change in allele frequencies (F) between sample periods is an inverse function of Ne. Therefore, Ne can be derived from the amount of change in allele frequencies (Nei and Tajima 1981; Waples 1991). However, this estimator uses only the first two moments of the allele frequency distribution to obtain Ne and a number of approximations are made in its derivation. Several studies have noted that it is often biased high (Luikart et al. 1999; Wang 2001; Berthier et al. 2002).

Likelihood-based methods have been proposed to improve Ne estimation from temporally spaced samples and have become more feasible because fast computers required to calculate likelihoods are more generally available. In theory, these estimators should be more accurate and precise than the moment estimator because they use more of the information provided by the data. Williamson and Slatkin (1999) and Anderson et al. (2000) developed maximum-likelihood-based estimators that outperform the moment estimator. However, these methods are computationally extremely intensive, so they have been evaluated only with extremely small Ne and three or fewer alleles per locus. Recently, Wang (2001) used time-saving analytical approximations based on the method of Williamson and Slatkin (1999) to provide a more efficient likelihood Ne estimator. Unlike the latter study, which could be used only for biallelic markers, he assumed k alleles at the same locus could be treated as if from k independent, biallelic loci to generate the total likelihood, with an adjustment to take into account that there are only k − 1 independent allele frequencies. This pseudo-likelihood estimator appeared to perform well relative to the full-likelihood estimators, but was compared only with three alleles per locus and small Ne.

More recently, Berthier et al. (2002) developed a novel likelihood-based approach to obtain e from two samples using a genealogical representation from coalescent theory. Likelihoods were estimated by importance sampling. Markov chain Monte Carlo (MCMC) was then used to give a Bayesian posterior distribution for Ne. Using a prior distribution to set an upper limit for Ne, which ensured convergence of the MCMC estimates, Berthier et al. (2002) showed their estimator to be superior to the moment estimator when genetic drift was strong. Being based on the coalescent, it is only an approximation to the genealogy of the Wright-Fisher model, which assumes discrete generations. The approximation is most accurate when the sample size is small relative to Ne and Ne is large. Because their estimator relies upon Monte Carlo methods to provide the posterior distribution of Ne, it is computationally demanding and has not been extensively evaluated with simulations, nor has it been compared to the pseudo-likelihood estimator of Wang (2001).

We develop a new Ne estimator by combining simple summary statistics from multiple genetic samples with approximate Bayesian computation and compare this estimator to existing ones. Bayesian approaches are attractive because they allow for background information to be incorporated into the model, provide posterior probability distributions for parameters of interest, and integrate out cumbersome nuisance parameters that are common in population genetics data (Shoemaker et al. 1999). Potentially exact Bayesian computation (in the sense that the posterior distribution can be approximated to any desired level of accuracy) using MCMC is often very time consuming and requires a substantial amount of programming effort (Beaumont et al. 2002). These constraints make potentially exact Bayesian analyses impractical for many applications and difficult to evaluate in many cases, especially for large data sets. Thus, alternative, less time-consuming methods are desirable.

The use of summary statistics has been proposed as a means to avoid the problems presented by complex population genetics analyses (Tavaré et al. 1997). This requires the comparison of summary statistics from a real sample with an unknown parameter value to the same summary statistics generated under simulated conditions with known parameter values. Most applications have used a rejection sampling method (Pritchard et al. 1999), in which all summary statistic values that fall outside a given tolerance range are rejected, and only those summary statistics that fall within the tolerance range are used to estimate the target parameters (e.g., Tishkoff et al. 2001). The approach we use here differs by using local linear regression and smooth weighting of summary statistics and associated Ne values falling within the tolerance range. Beaumont et al. (2002) showed that local linear regression and smooth weighting can be used to improve the accuracy and precision of parameter estimation from summary statistics over that provided by rejection sample methods.

We develop a novel Ne estimator using four simple summary statistics and local weighted regression in a Bayesian framework. The four summary statistics are the divergence between samples using Weir and Cockerham's θ (Weir 1990), the change in the number of alleles from the first to second sample (Δa), the change in within-sample gene diversity from the first to second sample (ΔHs; Nei 1987), and the total expected heterozygosity between samples (Ht; Weir 1990). We evaluate the performance of this summary statistics (SummStat) estimator and compare its performance relative to three existing estimators. These estimators include the standard moment estimator, which has well-known properties over a wide range of parameter values, and promising likelihood estimators developed recently by Berthier et al. (2002) and Wang (2001) that are not as well known and have not been compared to each other. For this comparison we use simulated populations of known Ne and levels of genetic variation at marker loci typical of microsatellites, the preferred marker for Ne estimation (Luikart et al. 1999). We compare the accuracy and precision of these four Ne estimators, using a range of Ne's (20–100), numbers of loci (5 or 15), and sample sizes (20 or 60) separated by a range of generations (1–10) typical in studies of natural populations. Finally, we illustrate the use of the methods with a real microsatellite data set from an experimental population of mosquito fish (Spencer et al. 2000).


SummStat estimator:

We developed an Ne estimator using the summary statistics approach developed by several authors (Fu and Li 1997; Tavaré et al. 1997; Weiss and von Haeseler 1998) and modified by Beaumont et al. (2002) to incorporate weighted local regression in a Bayesian context. The general method is described in detail by Beaumont et al. (2002) and briefly here. This approach is especially useful when inferences about some parameter of interest, Φ, are difficult to make using full likelihoods. In this method, J values of Φi are simulated from a prior distribution, ΦiP(Φ). For each Φi a data set, Di, is simulated using a Wright-Fisher model described below. Summary statistics, Si, are then calculated from the data and scaled to have unit variance. Thus, the Si and Φi are drawn from the joint distribution P(S, Φ). The posterior distribution P|S = S*) is the conditional distribution of Φ given the target summary statistics S*, calculated from the sample data. To approximate this, the simulated candidate value Φi and associated Si are accepted when the Euclidean distance ||SiS*|| < g, where g defines a distance such that a proportion dg of points closest to S* are accepted (Fu and Li 1997; Tavaré et al. 1997; Pritchard et al. 1999). To improve the accuracy of the “rejection” scheme, in the method of Beaumont et al. (2002) each accepted Φi is given a weight that declines quadratically as a function of ||SiS*|| from 1 at distance 0 to 0 at distance g, and then weighted linear regression is used to adjust the values of Φi. The method fits a regression line such that each Φi = a + bSi + ei, and then, assuming constant variance within the interval given by ||SiS*|| < g, makes the adjustment Math. These Φi are then assumed to be random samples from the posterior distribution P(S, Φ), which, depending on how close to sufficient are the summary statistics, is itself assumed to be close to P(D|Φ).

To examine our SummStat Ne estimator, we created an individual-based, Wright-Fisher simulation model of diploid organisms using the programming language C. This model differs slightly from a Wright-Fisher model in that there are two allogamous sexes and equal numbers of each sex. In the present case, the model was initialized using genotypes drawn from a uniform Dirichlet distribution with eight alleles per locus, but it can also be initialized using a coalescent-based microsatellite distribution with any specified θ value or range of values. Following the collection of the first sample of n diploid individuals at time t1 = 0, a breeding population of size Ne was created and randomly mated for t generations, when the second sample of n diploid individuals was collected from progeny of adults in generation t2. Following the collection of the second sample, summary statistics were estimated over the L loci sampled. This sampling schedule follows plan 2 of Waples (1989). The summary statistics consisted of a common measure of divergence between samples, the coancestry coefficient, Mathwhere Qi is the estimated probability of identity of alleles in a sample at locus i and qi is the estimated probability of identity of alleles in the two samples at locus i (Weir 1990); the mean change in the number of alleles from the first to second sample, Mathwhere a1,i and a2,i are the estimated number of alleles at locus i in samples 1 and 2, respectively; the change in mean within-sample gene diversity from the first to second sample, Mathwhere p21,i and p22,i are the squares of the estimated frequency of each allele present at locus i in the first and second samples, respectively (Nei 1987); and total expected heterozygosity between samples, Mathwhere p2i is the estimated frequency of each allele in the combined samples (Weir 1990). All estimated values for these summary statistics also included appropriate sample size corrections.

It is important to note that we could have included any summary statistics that can be calculated from standard population genetic data, but limited ourselves to four that are straightforward to calculate, commonly used in population genetics studies, and thought to be related to our parameter of interest, Ne, on the basis of previous research and some preliminary simulations of our own.

The simulation model sampled from a uniform flat prior distribution of Ne between 4 and 400 to generate J = 50,000 values of the summary statistics. This prior is reasonable because Ne can fall in this range, even for some populations with thousands of individuals, and because most applications of Ne estimators are cases where Ne is small (Waples 1991). To test the performance of the SummStat estimator, we simulated independent populations of known Ne and calculated summary statistics for each target data set sampled from each population (see details of sampling conditions below). We used natural log of Ne in all regressions to adjust the values of Φi to ensure that the results were robust to changes in g. Values of Ne accepted within dg = 0.02, as described above, were then regarded as samples from the posterior distribution of Ne. The mode and credible intervals of the posterior distribution of backtransformed Ne values were calculated using the density estimation method of Loader (1996), implemented in the statistical package R (R Development Core Team 2003). The log transformation did not prevent some regression-adjusted points from projecting beyond the upper bound of Ne.

Comparison to other estimators:

We compared the performance of the SummStat estimator to three existing two-sample Ne estimators, including a coalescent (TMVP; Beaumont 2003, which is based on the program TM3 in Berthier et al. 2002), a moment (Nei and Tajima 1981), and a pseudo-likelihood (MLNE; Wang 2001) estimator. The comparisons were conducted over a range of parameter values, including the effective population sizes (Ne = 20, 50, 100), generations between samples (t = 1, 3, 5, 10), sample sizes (n = 20, 60 diploid individuals), and numbers of loci (L = 5, 15). Note that in some situations the sample size exceeded the effective size, which is common when sampling natural populations in which the number of juvenile or adult individuals can greatly exceed the number of breeders (Frankham 1995) and corresponds to sampling plan II of Waples (1989). Only one set of simulations with large Ne = 100 is included (t = 1, 3, 5, 10; n = 60; L = 15), because of the long time required to run all of the models on large Ne populations and because most applications of Ne estimators are to natural populations with small Ne. We ran 600 independent iterations of each combination of parameter values for the model comparisons. The models were compared in terms of bias and precision of e, using a number of metrics: relative mean square error (RMSE) of the mode, median bias of the mode, 95th percentile of 95% confidence/credible intervals, and proportion of confidence/credible intervals that excluded true Ne. We also show the bootstrapped estimates of the standard errors of the RMSE estimates.

All sampled genotypes generated in our simulations were written to files to provide input for TMVP and MLNE. TMVP is an updated version of the TM3 program used in Berthier et al. (2002) and provides a posterior distribution of Ne using a MCMC approach with importance sampling (Beaumont 2003). For the simulations used here, the size of the importance sample was 20, 20,000 MCMC updates were used with 10 updates between estimate outputs, and an Ne ceiling was set at 400. The TMVP Ne evaluated in this article is the mode from the posterior MCMC distribution of values (except for an initial 10% of values discarded as burn-in).

MLNE provides a pseudo-likelihood Ne as described in Wang (2001). We used an updated version of the program, described in Wang and Whitlock (2003), which provides a ceiling for Ne. We used an Ne ceiling of 400 in these simulations. MLNE also provides Ne from the moment estimator of Nei and Tajima (1981), but no confidence intervals. Therefore, the 95% confidence intervals of the moment estimator are not reported and described here, but they have been explored previously and their precision and bias are well known (Luikart et al. 1999; Wang 2001; Berthier et al. 2002).

Robustness to prior probability distribution of Ne:

We investigated the robustness of the SummStat estimator to changes in the prior probability distribution (using the same metrics described above) when the lower limit of the prior remained at 4 and upper limit of the prior probability distribution for Ne was 200, 400, or 1000. True Ne was set to 50. Diploid individuals were sampled 1 or 3 generations apart and genotyped at 15 loci. We did not directly compare the SummStat estimator to the others in these simulations. Results are based upon 600 iterations of each set of conditions.

Robustness to initial allele frequencies:

Allele frequencies are drawn from a uniform Dirichlet distribution in most simulation-based studies of Ne estimators, including ours (Anderson et al. 2000; Wang 2001). This corresponds to specific assumptions about the mutation process—namely, that it is a k-allele model where each mutation occurs at rate Θ, and the probability of mutation to any specific allele is 1/k. We investigated the robustness of our results to assumptions about the underlying allele frequency distribution by initializing a set of simulations and priors with allele frequency data drawn from the coalescent. For these simulations the Wright-Fisher population described above was used. Initial levels of polymorphism were determined by a randomly selected value of Θ = 5–15, and the population was first sampled t1 = 1–10 generations following a change in effective size to Ne = 20. We used uniform flat priors for Θ and the t1. The second sample was collected t2 = 1, 3, 5, or 10 generations after t1. In each sample, n = 60 individuals were genotyped at L = 15 loci. The estimators were evaluated as described above on the basis of 600 iterations of each set of conditions. With this slight modification we are now able to obtain the posterior distribution of Ne marginal to both Θ and t1. This highlights the advantage of approximate Bayesian computation based on summary statistics, and it is straightforward to make changes in the model with minimal programming effort.

Real data set:

To further illustrate the SummStat estimator, we estimated Ne for an experimentally bottlenecked population reported in Spencer et al. (2000) and evaluated by Berthier et al. (2002), using their coalescent-based Ne estimator and the moment estimator. The target data set was collected from a large source population of mosquito fish (Gambusia affinis) that was sampled and then experimentally reduced to eight pairs of founders, allowed to grow for two generations, and then resampled (Spencer et al. 2000). Forty individuals were genotyped at eight microsatellite loci in each sample. Although true Ne is unknown for this population, Ne = 16 is the hypothesized value. For this target data set we generated values for the same summary statistics used in our simulations: θ̂ = 0.283, Δâ = −5.50, ΔĤs = −0.096, and Ĥt = 0.708. We used the same Wright-Fisher model described above to simulate 50,000 populations, with an Ne randomly drawn from a uniform flat prior between 4 and 500 and initial levels of polymorphism determined by the coalescent with a uniform flat prior for θ between 5 and 15. In each simulation, 40 diploid individuals were sampled and genotyped at eight loci in generations 0 and 2. These genotypes were used to calculate the same summary statistics calculated for the target data set, with the tolerance set at 0.02. We report the mode and 95% credible intervals of the posterior distribution from SummStat and compare them to the other estimators. The estimates from TMVP and the moment method are taken directly from Berthier et al. (2002)(Table 3). MLNE estimates were obtained using a ceiling of 500.


Estimator performance:

The SummStat estimator provides accurate and precise e across a range of generations between samples and numbers of individuals and loci sampled. In general, this estimator shows a small, positive bias when only a single generation passes between samples (Figure 1) . Only when Ne = 50 and sampling conditions are at their worst (only 20 individuals, 5 loci, and 1 generation) did SummStat show a substantial positive bias. The bias decreases rapidly with increases in the number of individuals sampled, loci sampled, and generations between samples. The relative mean square error (RMSE) of the SummStat estimator is small when true Ne = 20, but is noticeably larger when Ne = 50 and only 1 generation separates the samples (Figure 2) . There is a consistent, striking decrease in the RMSE when the number of generations between samples increases from 1 to 3, regardless of the other sampling conditions. However, there is relatively little increase in accuracy and precision as the number of generations increases from 3 to 10.

Figure 1.—

Bias of four effective population size (Ne) estimators, including the new SummStat estimator, for known Ne values (20, 50, or 100) and various sampling conditions. Bias is reported as the median difference between the mode and true Ne from 600 simulations. The bottom right shows bias when initial allele frequencies are drawn from the coalescent (all others are from Dirichlet). The y-axis changes throughout the figure to emphasize differences among methods.

Figure 2.—

Relative mean square error (RMSE) of four effective population size (Ne) estimators, including the new SummStat estimator, for true Ne (20, 50, or 100), and various sampling conditions. RMSE values reported are means from 600 simulations with SE bars from 1000 bootstraps. The bottom right shows estimator RMSE when initial allele frequencies are drawn from the coalescent (all others are from Dirichlet). The y-axis changes throughout the figure to emphasize differences among methods.

The SummStat 95% credible intervals contain true Ne consistently and do not depart greatly from the expectation of 2.5% of the true Ne values falling above or below the credible intervals (Table 1) . There is a slight tendency to exclude true Ne from the lower credible interval with sparse data and a single generation between samples, which probably is due to the slight positive bias of the SummStat estimator under these conditions noted above. In addition, the 95th percentiles of the 95% credible intervals tend to be conservative when data are sparse or drift is weak (Table 2) . The SummStat 95% credible intervals narrow rapidly with increasing numbers of generations between samples. For the parameter combinations evaluated, a threefold increase in the number of loci sampled provides a markedly greater increase in precision than a threefold increase in the number of individuals sampled.

View this table:

Percentage of lower and upper credible/confidence limits that exclude true Ne

View this table:

Ninety-fifth percentiles of the lower and upper confidence/credible limits for the Ne estimators

The SummStat estimator appears robust to changes in the upper limit for the prior probability distribution of Ne. The bias, RMSE, and lower credible interval of the estimator do not change much with a fivefold change in the upper limit of the Ne prior from 200 to 1000 and true Ne = 50 (Table 3) . However, while the lower credible interval is relatively stable, the upper credible interval does appear to be sensitive to the choice of a prior when only a single generation separates samples. For example, the upper credible interval more than doubled with a fivefold increase in the upper limit of the Ne prior. With three generations between samples the upper credible limit changed by <25% with a fivefold increase in the upper limit of the Ne prior. This limit changes less with more generations between samples (results not shown).

View this table:

Robustness of SummStat estimator to changes in the Ne prior

Comparison to other estimators:

The SummStat estimator has the lowest bias and generally performs well relative to the other estimators. When only 1 generation passes between samples (t = 1), TMVP and MLNE tend to underestimate Ne, whereas the SummStat estimator slightly overestimates it (Figure 1). In general, TMVP shows the largest bias of the four estimators when 1–5 generations pass between sampling events. In contrast, with 10 generations passing between sampling events, the bias of TMVP is greatly reduced, whereas the moment estimator has the greatest bias. MLNE generally shows the second lowest bias after the SummStat estimator, and this bias decreases with increasing generations between samples. Wang (2001) reported a slight positive bias for MLNE that is based on the mean of the Ne estimates, rather than on the median as reported here. In fact, the distribution of Ne estimates is skewed (as illustrated in Wang and Whitlock 2003, Figure 3), and the mean is generally larger than the median, as reported here.

All of the estimators generally show reduced bias and RMSE with increasing sample sizes, loci, and generations between samples. The exception is the moment estimator, which shows increasing bias with increasing generations between samples. In most cases simulated, SummStat RMSE is intermediate to that of the other estimators, and the relative performance of the estimators changes with sampling conditions. However, the RMSE of the SummStat estimator is generally larger than that of MLNE, which often provides the smallest RMSE, and is consistently smaller than the RMSE of the moment estimator. TMVP shows very small RMSE and little bias when 10 generations pass between samples. In contrast, the moment estimator consistently has the largest RMSE when there are 10 generations between samples.

TMVP and MLNE tend to exclude true Ne from the upper 95% credible/confidence interval much more than expected by chance (Table 1). This tendency is most notable with TMVP, which consistently provides the lowest upper and lower confidence intervals. The observed bias of the confidence intervals generally worsens with the number of loci or individuals sampled, if the number of generations between samples is held constant. In contrast, the observed downward biases of the 95% confidence/credible intervals of TMVP and MLNE improve rapidly with increasing generations between samples. Although the 95% credible intervals of the SummStat estimator are less likely to exclude true Ne across the entire range of sampling conditions, the 95% confidence/credible intervals of the other estimators are often narrower when there are few generations between samples (Table 2). Consequently, there is a fairly consistent trade-off in the bias and precision of these estimators, especially when data are sparse, Ne large, and few generations pass between samples. TMVP and MLNE are more precise, but there is a risk of underestimating true Ne. The SummStat estimator provides relatively conservative credible intervals and is less biased.

The general patterns seen in the data generated using a Dirichlet distribution to draw initial allele frequencies also appear to be robust to changes in the underlying allele frequency distribution and lend credibility to our analyses. The patterns of bias and RMSE of the estimators are similar whether initial allele frequencies are drawn by sampling from a Dirichlet or from the coalescent. However, the tendencies of each estimator noted above appear to be accentuated. The SummStat estimator continued to show the lowest bias (Figure 1) as well as a relatively large RMSE when few generations passed between samples (Figure 2). TMVP and MLNE were precise, but showed considerably strong bias. A noticeable change in the relative performance of these estimators is the improved performance of the moment estimator in the coalescent simulations, especially with 5–10 generations between samples. The positive bias and low accuracy of the moment estimator when drift is strong have been noted elsewhere to result from the fixation of low-frequency alleles present in the first sample (Richards and Leberg 1996; Luikart et al. 1999; Williamson and Slatkin 1999; Wang 2001; Berthier et al. 2002). The improved performance here is probably due to there being fewer rare alleles because the simulated populations were first sampled as many as 10 generations following the initial reduction in Ne. This long period of strong drift before drawing the first sample could decrease the number of rare alleles, thus reducing this source of bias for the moment estimator.

The patterns seen in the 95% confidence/credible intervals when initial allele frequencies are drawn from the coalescent are also similar to the patterns in simulations that draw from the Dirichlet. Namely, the SummStat credible intervals are conservative and are slightly more likely to fall above than below true Ne (Table 4) . In contrast, the confidence/credible intervals of TMVP and MLNE tend to fall below true Ne and provide underestimates of true Ne quite frequently.

View this table:

Coverage of the Ne estimators with initial allele frequencies drawn from the coalescent

TMVP provides point estimates and upper credible intervals that are biased low when there are few generations between samples, and this bias has been noted previously (Berthier et al. 2002). However, TMVP is very accurate and precise when 10 generations pass between samples, despite the fact that it suffers from a disadvantage in our comparisons in that it assumes a coalescent model to simulate drift and a Wright-Fisher model was assumed for our simulations. Indeed, TMVP was very accurate in a subset of simulations using the coalescent rather than a Wright-Fisher model to simulate drift (data not shown).

Empirical data:

The application of SummStat to data collected from an experimental mosquito fish population studied by Spencer et al. (2000) demonstrates its utility in an empirical setting. The SummStat point estimate of Ne = 8.93 falls below the hypothesized true Ne = 16 for this population. In contrast, the coalescent, moment, and MLNE estimates are 21.6, 35.4, and 32.51, respectively. More importantly, the SummStat Ne 95% credible intervals contain the hypothesized true Ne value and are reasonably precise, whereas all of the others excluded Ne from their credible/confidence intervals (Figure 3) . In this case, only eight loci and 40 individuals were sampled, which is a modest sampling effort to expect for most natural populations. Although this is only a single application of the SummStat estimator, the results are encouraging and suggest the estimator may work well when applied to microsatellite data sampled from real populations.

Figure 3.—

Comparison of four effective population size (Ne) estimators using data from Richards and Leberg (1996) for an experimental population of mosquito fish. Lines indicate point estimates and shaded boxes indicate 95% confidence/credible intervals.

Other considerations:

It has been suggested by Wang and Whitlock (2003) that measurement of bias and precision should be based on 1/(2Ne) rather than on Ne. The relative advantages or disadvantages of different point estimators and transformations can be assessed only in a decision-theoretic framework (O'Hagan 1994), where a utility function can be specified for Ne, and in the absence of this we prefer to measure bias and precision in terms of Ne. Examination of a number of data sets indicates that the general patterns in the data reported here are robust whether one examines estimates of Ne or 1/(2Ne).

An important choice in Bayesian inference is the method used to generate the prior probability distribution. We have found our estimator to be robust whether initial allele frequencies are drawn from the Dirichlet or the coalescent. However, as with any Bayesian approach, it is important to recognize that large differences between the biological conditions that gave rise to the data and the method used to generate the prior can affect the validity of the results. Common effects of small population size, such as small departures from Hardy-Weinberg proportions or gametic phase equilibrium, should not create problems in our simulations because the assumed model incorporates these effects adequately. However, if the target data had been collected from a population with high rates of undetected immigration, for example, and the model used to generate the priors did not incorporate this, then biological inferences could be misleading. Consequently, researchers should temper their interpretations with careful consideration of the assumptions (e.g., immigration, no substructuring, neutral markers) used in creating the prior probability distribution and how differences within the biological context that created the data set of interest might affect their inferences.

It is also important to note that the quality of the information provided by each summary statistic will vary with sampling conditions, amounts of genetic drift, and type of genetic markers. For example, if molecular markers or populations with few alleles are studied, then the mean loss of alleles per locus may be a less informative summary statistic than, for example, divergence between samples. Thus, careful attention should be paid to choosing statistics that are likely to be informative, as one seeks to maximize the amount of information that can be extracted from a data set while avoiding the curse of dimensionality created by using many different summary statistics (Beaumont et al. 2002). Ideally, both the literature and simulations should be used to help choose the best summary statistics for a given set of biological conditions and sampling constraints.


The SummStat estimator performs well, relative to the others, using only four summary statistics. It is the least biased method over the full range of parameter values investigated, has an RMSE intermediate to the others in most of the scenarios investigated, and performed well when applied to a real data set. However, MLNE generally has the smallest RMSE of the estimators compared, despite a negative bias and tendency to exclude true Ne from the upper confidence intervals when few generations pass between samples. The complementary properties of MLNE (precision) and SummStat (accuracy) suggest that it would be wise to use both to estimate Ne. TMVP may be more appropriate when reproduction occurs continuously rather than in discrete generations.

The SummStat estimator is best viewed as a flexible approach to Ne estimation, in terms of both modeling and choice of summary statistics, as shown by the ease with which it is modified to consider a very different prior from a Dirichlet. An advantage of the SummStat estimator is that one can combine any potentially informative summary statistics calculated from sample data into a single approach to Ne estimation. For example, it could conceivably be extended beyond the allele frequency-based information used in the present example to include genotypic information, thus extracting more information provided by the data (Hill 1981; Waples 1991). The time-saving attributes of approximate Bayesian approaches relative to exact Bayesian methods make them an appealing alternative for population genetics applications. New molecular technologies provide ever-increasing numbers of molecular markers, which, in turn, make full-likelihood calculations very time consuming. The performance of the SummStat Ne estimator suggests further exploration and expansion using approximate Bayesian computation may be a fruitful means to improve efficient Ne estimation.


We thank L. Excoffier, J. Wang, and R. Waples for helpful reviews and suggestions that improved this manuscript. We also thank J. Wang for graciously recompiling slightly altered versions of his program for us. This work was supported by a National Science Foundation postdoctoral grant (INT 0202707) to D.A.T. and by the European Commission (Econogene contract QLK5-CT-2001-02461) to G.H.L. M.A.B. is supported by a Natural Environment Research Council Advanced Fellowship.


  • Communicating editor: L. Excoffier

  • Received December 26, 2003.
  • Accepted March 1, 2004.


View Abstract