## Abstract

We describe and evaluate a new estimator of the effective population size (*N*_{e}), a critical parameter in evolutionary and conservation biology. This new “SummStat” *N*_{e} estimator is based upon the use of summary statistics in an approximate Bayesian computation framework to infer *N*_{e}. Simulations of a Wright-Fisher population with known *N*_{e} show that the SummStat estimator is useful across a realistic range of individuals and loci sampled, generations between samples, and *N*_{e} values. We also address the paucity of information about the relative performance of *N*_{e} estimators by comparing the SummStat estimator to two recently developed likelihood-based estimators and a traditional moment-based estimator. The SummStat estimator is the least biased of the four estimators compared. In 32 of 36 parameter combinations investigated using initial allele frequencies drawn from a Dirichlet distribution, it has the lowest bias. The relative mean square error (RMSE) of the SummStat estimator was generally intermediate to the others. All of the estimators had RMSE > 1 when small samples (*n* = 20, five loci) were collected a generation apart. In contrast, when samples were separated by three or more generations and *N*_{e} ≤ 50, the SummStat and likelihood-based estimators all had greatly reduced RMSE. Under the conditions simulated, SummStat confidence intervals were more conservative than the likelihood-based estimators and more likely to include true *N*_{e}. The greatest strength of the SummStat estimator is its flexible structure. This flexibility allows it to incorporate any potentially informative summary statistic from population genetic data.

THE effective population size (*N*_{e}) plays a central role in how a population evolves because *N*_{e} affects the degree to which a population can respond to selection, as well as its sensitivity to inbreeding effects (Crow and Kimura 1970; Lande 1995; Lynch *et al.* 1995). As a result of the critical importance of *N*_{e} to evolution, a great deal of effort has focused upon estimating *N*_{e} accurately and precisely, and there is always a demand for efficient and useful *N*_{e} estimators. Currently, many different methods are available to infer *N*_{e}, including ones based on demographic or genetic data. These methods vary in the types of information they use, their accuracy, and the kinds of *N*_{e} estimates they provide (Crow and Denniston 1988; Harris and Allendorf 1989; Waples 1991; Schwartz* et al.* 1998; Storz *et al.* 2002).

*N*_{e} can be estimated from genetic data in one or more samples (Waples 1991). Most one-sample estimators use associations among alleles at different loci to infer *N*_{e} (Hill 1981; Vitalis and Couvet 2001). Multiple-sample methods infer *N*_{e} from temporal changes in allele frequencies or the rate of coalescence of alleles between sample periods (Nei and Tajima 1981; Wang 2001; Berthier *et al.* 2002). The most widely used estimator is a method-of-moments estimator (moment estimator), which infers *N*_{e} from the standardized variance in allele frequencies sampled one or more generations apart. The change in allele frequencies (*F*) between sample periods is an inverse function of *N*_{e}. Therefore, *N*_{e} can be derived from the amount of change in allele frequencies (Nei and Tajima 1981; Waples 1991). However, this estimator uses only the first two moments of the allele frequency distribution to obtain *N*_{e} and a number of approximations are made in its derivation. Several studies have noted that it is often biased high (Luikart *et al.* 1999; Wang 2001; Berthier *et al.* 2002).

Likelihood-based methods have been proposed to improve *N*_{e} estimation from temporally spaced samples and have become more feasible because fast computers required to calculate likelihoods are more generally available. In theory, these estimators should be more accurate and precise than the moment estimator because they use more of the information provided by the data. Williamson and Slatkin (1999) and Anderson *et al.* (2000) developed maximum-likelihood-based estimators that outperform the moment estimator. However, these methods are computationally extremely intensive, so they have been evaluated only with extremely small *N*_{e} and three or fewer alleles per locus. Recently, Wang (2001) used time-saving analytical approximations based on the method of Williamson and Slatkin (1999) to provide a more efficient likelihood *N*_{e} estimator. Unlike the latter study, which could be used only for biallelic markers, he assumed *k* alleles at the same locus could be treated as if from *k* independent, biallelic loci to generate the total likelihood, with an adjustment to take into account that there are only *k* − 1 independent allele frequencies. This pseudo-likelihood estimator appeared to perform well relative to the full-likelihood estimators, but was compared only with three alleles per locus and small *N*_{e}.

More recently, Berthier *et al.* (2002) developed a novel likelihood-based approach to obtain *N̂*_{e} from two samples using a genealogical representation from coalescent theory. Likelihoods were estimated by importance sampling. Markov chain Monte Carlo (MCMC) was then used to give a Bayesian posterior distribution for *N*_{e}. Using a prior distribution to set an upper limit for *N*_{e}, which ensured convergence of the MCMC estimates, Berthier *et al.* (2002) showed their estimator to be superior to the moment estimator when genetic drift was strong. Being based on the coalescent, it is only an approximation to the genealogy of the Wright-Fisher model, which assumes discrete generations. The approximation is most accurate when the sample size is small relative to *N*_{e} and *N*_{e} is large. Because their estimator relies upon Monte Carlo methods to provide the posterior distribution of *N*_{e}, it is computationally demanding and has not been extensively evaluated with simulations, nor has it been compared to the pseudo-likelihood estimator of Wang (2001).

We develop a new *N*_{e} estimator by combining simple summary statistics from multiple genetic samples with approximate Bayesian computation and compare this estimator to existing ones. Bayesian approaches are attractive because they allow for background information to be incorporated into the model, provide posterior probability distributions for parameters of interest, and integrate out cumbersome nuisance parameters that are common in population genetics data (Shoemaker *et al.* 1999). Potentially exact Bayesian computation (in the sense that the posterior distribution can be approximated to any desired level of accuracy) using MCMC is often very time consuming and requires a substantial amount of programming effort (Beaumont *et al.* 2002). These constraints make potentially exact Bayesian analyses impractical for many applications and difficult to evaluate in many cases, especially for large data sets. Thus, alternative, less time-consuming methods are desirable.

The use of summary statistics has been proposed as a means to avoid the problems presented by complex population genetics analyses (Tavaré *et al.* 1997). This requires the comparison of summary statistics from a real sample with an unknown parameter value to the same summary statistics generated under simulated conditions with known parameter values. Most applications have used a rejection sampling method (Pritchard *et al.* 1999), in which all summary statistic values that fall outside a given tolerance range are rejected, and only those summary statistics that fall within the tolerance range are used to estimate the target parameters (*e.g.*, Tishkoff *et al.* 2001). The approach we use here differs by using local linear regression and smooth weighting of summary statistics and associated *N*_{e} values falling within the tolerance range. Beaumont *et al.* (2002) showed that local linear regression and smooth weighting can be used to improve the accuracy and precision of parameter estimation from summary statistics over that provided by rejection sample methods.

We develop a novel *N*_{e} estimator using four simple summary statistics and local weighted regression in a Bayesian framework. The four summary statistics are the divergence between samples using Weir and Cockerham's θ (Weir 1990), the change in the number of alleles from the first to second sample (Δ*a*), the change in within-sample gene diversity from the first to second sample (Δ*H*_{s}; Nei 1987), and the total expected heterozygosity between samples (*H*_{t}; Weir 1990). We evaluate the performance of this summary statistics (SummStat) estimator and compare its performance relative to three existing estimators. These estimators include the standard moment estimator, which has well-known properties over a wide range of parameter values, and promising likelihood estimators developed recently by Berthier *et al.* (2002) and Wang (2001) that are not as well known and have not been compared to each other. For this comparison we use simulated populations of known *N*_{e} and levels of genetic variation at marker loci typical of microsatellites, the preferred marker for *N*_{e} estimation (Luikart *et al.* 1999). We compare the accuracy and precision of these four *N*_{e} estimators, using a range of *N*_{e}'s (20–100), numbers of loci (5 or 15), and sample sizes (20 or 60) separated by a range of generations (1–10) typical in studies of natural populations. Finally, we illustrate the use of the methods with a real microsatellite data set from an experimental population of mosquito fish (Spencer* et al.* 2000).

## METHODS

### SummStat estimator:

We developed an *N*_{e} estimator using the summary statistics approach developed by several authors (Fu and Li 1997; Tavaré *et al.* 1997; Weiss and von Haeseler 1998) and modified by Beaumont *et al.* (2002) to incorporate weighted local regression in a Bayesian context. The general method is described in detail by Beaumont *et al.* (2002) and briefly here. This approach is especially useful when inferences about some parameter of interest, Φ, are difficult to make using full likelihoods. In this method, *J* values of Φ* _{i}* are simulated from a prior distribution, Φ

*∼*

_{i}*P*(Φ). For each Φ

*a data set,*

_{i}*D*, is simulated using a Wright-Fisher model described below. Summary statistics,

_{i}*S*, are then calculated from the data and scaled to have unit variance. Thus, the

_{i}*S*and Φ

_{i}*are drawn from the joint distribution*

_{i}*P*(

*S*, Φ). The posterior distribution

*P*(Φ

*|S = S**) is the conditional distribution of Φ given the target summary statistics

*S**, calculated from the sample data. To approximate this, the simulated candidate value Φ

*and associated*

_{i}*S*are accepted when the Euclidean distance ||

_{i}*S*−

_{i}*S**|| <

*g*, where

*g*defines a distance such that a proportion

*d*of points closest to

_{g}*S** are accepted (Fu and Li 1997; Tavaré

*et al.*1997; Pritchard

*et al.*1999). To improve the accuracy of the “rejection” scheme, in the method of Beaumont

*et al.*(2002) each accepted Φ

*is given a weight that declines quadratically as a function of ||*

_{i}*S*−

_{i}*S**|| from 1 at distance 0 to 0 at distance

*g*, and then weighted linear regression is used to adjust the values of Φ

*. The method fits a regression line such that each Φ*

_{i}*=*

_{i}*a*+

*bS*+

_{i}*e*, and then, assuming constant variance within the interval given by ||

_{i}*S*−

_{i}*S**|| <

*g*, makes the adjustment . These Φ

^{′}

_{i}are then assumed to be random samples from the posterior distribution

*P*(

*S*, Φ), which, depending on how close to sufficient are the summary statistics, is itself assumed to be close to

*P*(

*D|*Φ).

To examine our SummStat *N*_{e} estimator, we created an individual-based, Wright-Fisher simulation model of diploid organisms using the programming language C. This model differs slightly from a Wright-Fisher model in that there are two allogamous sexes and equal numbers of each sex. In the present case, the model was initialized using genotypes drawn from a uniform Dirichlet distribution with eight alleles per locus, but it can also be initialized using a coalescent-based microsatellite distribution with any specified θ value or range of values. Following the collection of the first sample of *n* diploid individuals at time *t*_{1} = 0, a breeding population of size *N*_{e} was created and randomly mated for *t* generations, when the second sample of *n* diploid individuals was collected from progeny of adults in generation *t*_{2}. Following the collection of the second sample, summary statistics were estimated over the *L* loci sampled. This sampling schedule follows plan 2 of Waples (1989). The summary statistics consisted of a common measure of divergence between samples, the coancestry coefficient,
where *Q _{i}* is the estimated probability of identity of alleles in a sample at locus

*i*and

*q*is the estimated probability of identity of alleles in the two samples at locus

_{i}*i*(Weir 1990); the mean change in the number of alleles from the first to second sample, where

*a*

_{1,}

*and*

_{i}*a*

_{2,}

*are the estimated number of alleles at locus*

_{i}*i*in samples 1 and 2, respectively; the change in mean within-sample gene diversity from the first to second sample, where

*p*

^{2}

_{1,i}and

*p*

^{2}

_{2,i}are the squares of the estimated frequency of each allele present at locus

*i*in the first and second samples, respectively (Nei 1987); and total expected heterozygosity between samples, where

*p*

^{2}

_{i}is the estimated frequency of each allele in the combined samples (Weir 1990). All estimated values for these summary statistics also included appropriate sample size corrections.

It is important to note that we could have included any summary statistics that can be calculated from standard population genetic data, but limited ourselves to four that are straightforward to calculate, commonly used in population genetics studies, and thought to be related to our parameter of interest, *N*_{e}, on the basis of previous research and some preliminary simulations of our own.

The simulation model sampled from a uniform flat prior distribution of *N*_{e} between 4 and 400 to generate *J =* 50,000 values of the summary statistics. This prior is reasonable because *N*_{e} can fall in this range, even for some populations with thousands of individuals, and because most applications of *N*_{e} estimators are cases where *N*_{e} is small (Waples 1991). To test the performance of the SummStat estimator, we simulated independent populations of known *N*_{e} and calculated summary statistics for each target data set sampled from each population (see details of sampling conditions below). We used natural log of *N*_{e} in all regressions to adjust the values of Φ* _{i}* to ensure that the results were robust to changes in

*g*. Values of

*N*

_{e}accepted within

*d*= 0.02, as described above, were then regarded as samples from the posterior distribution of

_{g}*N*

_{e}. The mode and credible intervals of the posterior distribution of backtransformed

*N*

_{e}values were calculated using the density estimation method of Loader (1996), implemented in the statistical package R (R Development Core Team 2003). The log transformation did not prevent some regression-adjusted points from projecting beyond the upper bound of

*N*

_{e}.

### Comparison to other estimators:

We compared the performance of the SummStat estimator to three existing two-sample *N*_{e} estimators, including a coalescent (TMVP; Beaumont 2003, which is based on the program TM3 in Berthier *et al.* 2002), a moment (Nei and Tajima 1981), and a pseudo-likelihood (MLNE; Wang 2001) estimator. The comparisons were conducted over a range of parameter values, including the effective population sizes (*N*_{e} = 20, 50, 100), generations between samples (*t* = 1, 3, 5, 10), sample sizes (*n* = 20, 60 diploid individuals), and numbers of loci (*L* = 5, 15). Note that in some situations the sample size exceeded the effective size, which is common when sampling natural populations in which the number of juvenile or adult individuals can greatly exceed the number of breeders (Frankham 1995) and corresponds to sampling plan II of Waples (1989). Only one set of simulations with large *N*_{e} = 100 is included (*t* = 1, 3, 5, 10; *n* = 60; *L* = 15), because of the long time required to run all of the models on large *N*_{e} populations and because most applications of *N*_{e} estimators are to natural populations with small *N*_{e}. We ran 600 independent iterations of each combination of parameter values for the model comparisons. The models were compared in terms of bias and precision of *N̂*_{e}, using a number of metrics: relative mean square error (RMSE) of the mode, median bias of the mode, 95th percentile of 95% confidence/credible intervals, and proportion of confidence/credible intervals that excluded true *N*_{e}. We also show the bootstrapped estimates of the standard errors of the RMSE estimates.

All sampled genotypes generated in our simulations were written to files to provide input for TMVP and MLNE. TMVP is an updated version of the TM3 program used in Berthier *et al.* (2002) and provides a posterior distribution of *N*_{e} using a MCMC approach with importance sampling (Beaumont 2003). For the simulations used here, the size of the importance sample was 20, 20,000 MCMC updates were used with 10 updates between estimate outputs, and an *N*_{e} ceiling was set at 400. The TMVP *N*_{e} evaluated in this article is the mode from the posterior MCMC distribution of values (except for an initial 10% of values discarded as burn-in).

MLNE provides a pseudo-likelihood *N*_{e} as described in Wang (2001). We used an updated version of the program, described in Wang and Whitlock (2003), which provides a ceiling for *N*_{e}. We used an *N*_{e} ceiling of 400 in these simulations. MLNE also provides *N*_{e} from the moment estimator of Nei and Tajima (1981), but no confidence intervals. Therefore, the 95% confidence intervals of the moment estimator are not reported and described here, but they have been explored previously and their precision and bias are well known (Luikart *et al.* 1999; Wang 2001; Berthier *et al.* 2002).

### Robustness to prior probability distribution of *N*_{e}:

We investigated the robustness of the SummStat estimator to changes in the prior probability distribution (using the same metrics described above) when the lower limit of the prior remained at 4 and upper limit of the prior probability distribution for *N*_{e} was 200, 400, or 1000. True *N*_{e} was set to 50. Diploid individuals were sampled 1 or 3 generations apart and genotyped at 15 loci. We did not directly compare the SummStat estimator to the others in these simulations. Results are based upon 600 iterations of each set of conditions.

### Robustness to initial allele frequencies:

Allele frequencies are drawn from a uniform Dirichlet distribution in most simulation-based studies of *N*_{e} estimators, including ours (Anderson* et al.* 2000; Wang 2001). This corresponds to specific assumptions about the mutation process—namely, that it is a *k*-allele model where each mutation occurs at rate Θ, and the probability of mutation to any specific allele is 1/*k*. We investigated the robustness of our results to assumptions about the underlying allele frequency distribution by initializing a set of simulations and priors with allele frequency data drawn from the coalescent. For these simulations the Wright-Fisher population described above was used. Initial levels of polymorphism were determined by a randomly selected value of Θ = 5–15, and the population was first sampled *t*_{1} = 1–10 generations following a change in effective size to *N*_{e} = 20. We used uniform flat priors for Θ and the *t*_{1}. The second sample was collected *t*_{2} = 1, 3, 5, or 10 generations after *t*_{1}. In each sample, *n* = 60 individuals were genotyped at *L* = 15 loci. The estimators were evaluated as described above on the basis of 600 iterations of each set of conditions. With this slight modification we are now able to obtain the posterior distribution of *N*_{e} marginal to both Θ and *t*_{1}. This highlights the advantage of approximate Bayesian computation based on summary statistics, and it is straightforward to make changes in the model with minimal programming effort.

### Real data set:

To further illustrate the SummStat estimator, we estimated *N*_{e} for an experimentally bottlenecked population reported in Spencer *et al.* (2000) and evaluated by Berthier *et al.* (2002), using their coalescent-based *N*_{e} estimator and the moment estimator. The target data set was collected from a large source population of mosquito fish (*Gambusia affinis*) that was sampled and then experimentally reduced to eight pairs of founders, allowed to grow for two generations, and then resampled (Spencer *et al.* 2000). Forty individuals were genotyped at eight microsatellite loci in each sample. Although true *N*_{e} is unknown for this population, *N*_{e} = 16 is the hypothesized value. For this target data set we generated values for the same summary statistics used in our simulations: θ̂ = 0.283, Δ*â* = −5.50, Δ*Ĥ*_{s} = −0.096, and *Ĥ*_{t} = 0.708. We used the same Wright-Fisher model described above to simulate 50,000 populations, with an *N*_{e} randomly drawn from a uniform flat prior between 4 and 500 and initial levels of polymorphism determined by the coalescent with a uniform flat prior for θ between 5 and 15. In each simulation, 40 diploid individuals were sampled and genotyped at eight loci in generations 0 and 2. These genotypes were used to calculate the same summary statistics calculated for the target data set, with the tolerance set at 0.02. We report the mode and 95% credible intervals of the posterior distribution from SummStat and compare them to the other estimators. The estimates from TMVP and the moment method are taken directly from Berthier *et al.* (2002)(Table 3). MLNE estimates were obtained using a ceiling of 500.

## RESULTS AND DISCUSSION

### Estimator performance:

The SummStat estimator provides accurate and precise *N̂*_{e} across a range of generations between samples and numbers of individuals and loci sampled. In general, this estimator shows a small, positive bias when only a single generation passes between samples (Figure 1)
. Only when *N*_{e} = 50 and sampling conditions are at their worst (only 20 individuals, 5 loci, and 1 generation) did SummStat show a substantial positive bias. The bias decreases rapidly with increases in the number of individuals sampled, loci sampled, and generations between samples. The relative mean square error (RMSE) of the SummStat estimator is small when true *N*_{e} = 20, but is noticeably larger when *N*_{e} = 50 and only 1 generation separates the samples (Figure 2)
. There is a consistent, striking decrease in the RMSE when the number of generations between samples increases from 1 to 3, regardless of the other sampling conditions. However, there is relatively little increase in accuracy and precision as the number of generations increases from 3 to 10.

The SummStat 95% credible intervals contain true *N*_{e} consistently and do not depart greatly from the expectation of 2.5% of the true *N*_{e} values falling above or below the credible intervals (Table 1)
. There is a slight tendency to exclude true *N*_{e} from the lower credible interval with sparse data and a single generation between samples, which probably is due to the slight positive bias of the SummStat estimator under these conditions noted above. In addition, the 95th percentiles of the 95% credible intervals tend to be conservative when data are sparse or drift is weak (Table 2)
. The SummStat 95% credible intervals narrow rapidly with increasing numbers of generations between samples. For the parameter combinations evaluated, a threefold increase in the number of loci sampled provides a markedly greater increase in precision than a threefold increase in the number of individuals sampled.

The SummStat estimator appears robust to changes in the upper limit for the prior probability distribution of *N*_{e}. The bias, RMSE, and lower credible interval of the estimator do not change much with a fivefold change in the upper limit of the *N*_{e} prior from 200 to 1000 and true *N*_{e} = 50 (Table 3)
. However, while the lower credible interval is relatively stable, the upper credible interval does appear to be sensitive to the choice of a prior when only a single generation separates samples. For example, the upper credible interval more than doubled with a fivefold increase in the upper limit of the *N*_{e} prior. With three generations between samples the upper credible limit changed by <25% with a fivefold increase in the upper limit of the *N*_{e} prior. This limit changes less with more generations between samples (results not shown).

### Comparison to other estimators:

The SummStat estimator has the lowest bias and generally performs well relative to the other estimators. When only 1 generation passes between samples (*t* = 1), TMVP and MLNE tend to underestimate *N*_{e}, whereas the SummStat estimator slightly overestimates it (Figure 1). In general, TMVP shows the largest bias of the four estimators when 1–5 generations pass between sampling events. In contrast, with 10 generations passing between sampling events, the bias of TMVP is greatly reduced, whereas the moment estimator has the greatest bias. MLNE generally shows the second lowest bias after the SummStat estimator, and this bias decreases with increasing generations between samples. Wang (2001) reported a slight positive bias for MLNE that is based on the mean of the *N*_{e} estimates, rather than on the median as reported here. In fact, the distribution of *N*_{e} estimates is skewed (as illustrated in Wang and Whitlock 2003, Figure 3), and the mean is generally larger than the median, as reported here.

All of the estimators generally show reduced bias and RMSE with increasing sample sizes, loci, and generations between samples. The exception is the moment estimator, which shows increasing bias with increasing generations between samples. In most cases simulated, SummStat RMSE is intermediate to that of the other estimators, and the relative performance of the estimators changes with sampling conditions. However, the RMSE of the SummStat estimator is generally larger than that of MLNE, which often provides the smallest RMSE, and is consistently smaller than the RMSE of the moment estimator. TMVP shows very small RMSE and little bias when 10 generations pass between samples. In contrast, the moment estimator consistently has the largest RMSE when there are 10 generations between samples.

TMVP and MLNE tend to exclude true *N*_{e} from the upper 95% credible/confidence interval much more than expected by chance (Table 1). This tendency is most notable with TMVP, which consistently provides the lowest upper and lower confidence intervals. The observed bias of the confidence intervals generally worsens with the number of loci or individuals sampled, if the number of generations between samples is held constant. In contrast, the observed downward biases of the 95% confidence/credible intervals of TMVP and MLNE improve rapidly with increasing generations between samples. Although the 95% credible intervals of the SummStat estimator are less likely to exclude true *N*_{e} across the entire range of sampling conditions, the 95% confidence/credible intervals of the other estimators are often narrower when there are few generations between samples (Table 2). Consequently, there is a fairly consistent trade-off in the bias and precision of these estimators, especially when data are sparse, *N*_{e} large, and few generations pass between samples. TMVP and MLNE are more precise, but there is a risk of underestimating true *N*_{e}. The SummStat estimator provides relatively conservative credible intervals and is less biased.

The general patterns seen in the data generated using a Dirichlet distribution to draw initial allele frequencies also appear to be robust to changes in the underlying allele frequency distribution and lend credibility to our analyses. The patterns of bias and RMSE of the estimators are similar whether initial allele frequencies are drawn by sampling from a Dirichlet or from the coalescent. However, the tendencies of each estimator noted above appear to be accentuated. The SummStat estimator continued to show the lowest bias (Figure 1) as well as a relatively large RMSE when few generations passed between samples (Figure 2). TMVP and MLNE were precise, but showed considerably strong bias. A noticeable change in the relative performance of these estimators is the improved performance of the moment estimator in the coalescent simulations, especially with 5–10 generations between samples. The positive bias and low accuracy of the moment estimator when drift is strong have been noted elsewhere to result from the fixation of low-frequency alleles present in the first sample (Richards and Leberg 1996; Luikart *et al.* 1999; Williamson and Slatkin 1999; Wang 2001; Berthier *et al.* 2002). The improved performance here is probably due to there being fewer rare alleles because the simulated populations were first sampled as many as 10 generations following the initial reduction in *N*_{e}. This long period of strong drift before drawing the first sample could decrease the number of rare alleles, thus reducing this source of bias for the moment estimator.

The patterns seen in the 95% confidence/credible intervals when initial allele frequencies are drawn from the coalescent are also similar to the patterns in simulations that draw from the Dirichlet. Namely, the SummStat credible intervals are conservative and are slightly more likely to fall above than below true *N*_{e} (Table 4)
. In contrast, the confidence/credible intervals of TMVP and MLNE tend to fall below true *N*_{e} and provide underestimates of true *N*_{e} quite frequently.

TMVP provides point estimates and upper credible intervals that are biased low when there are few generations between samples, and this bias has been noted previously (Berthier *et al.* 2002). However, TMVP is very accurate and precise when 10 generations pass between samples, despite the fact that it suffers from a disadvantage in our comparisons in that it assumes a coalescent model to simulate drift and a Wright-Fisher model was assumed for our simulations. Indeed, TMVP was very accurate in a subset of simulations using the coalescent rather than a Wright-Fisher model to simulate drift (data not shown).

### Empirical data:

The application of SummStat to data collected from an experimental mosquito fish population studied by Spencer *et al.* (2000) demonstrates its utility in an empirical setting. The SummStat point estimate of *N*_{e} = 8.93 falls below the hypothesized true *N*_{e} = 16 for this population. In contrast, the coalescent, moment, and MLNE estimates are 21.6, 35.4, and 32.51, respectively. More importantly, the SummStat *N*_{e} 95% credible intervals contain the hypothesized true *N*_{e} value and are reasonably precise, whereas all of the others excluded *N*_{e} from their credible/confidence intervals (Figure 3)
. In this case, only eight loci and 40 individuals were sampled, which is a modest sampling effort to expect for most natural populations. Although this is only a single application of the SummStat estimator, the results are encouraging and suggest the estimator may work well when applied to microsatellite data sampled from real populations.

### Other considerations:

It has been suggested by Wang and Whitlock (2003) that measurement of bias and precision should be based on 1/(2*N*_{e}) rather than on *N*_{e}. The relative advantages or disadvantages of different point estimators and transformations can be assessed only in a decision-theoretic framework (O'Hagan 1994), where a utility function can be specified for *N*_{e}, and in the absence of this we prefer to measure bias and precision in terms of *N*_{e}. Examination of a number of data sets indicates that the general patterns in the data reported here are robust whether one examines estimates of *N*_{e} or 1/(2*N*_{e}).

An important choice in Bayesian inference is the method used to generate the prior probability distribution. We have found our estimator to be robust whether initial allele frequencies are drawn from the Dirichlet or the coalescent. However, as with any Bayesian approach, it is important to recognize that large differences between the biological conditions that gave rise to the data and the method used to generate the prior can affect the validity of the results. Common effects of small population size, such as small departures from Hardy-Weinberg proportions or gametic phase equilibrium, should not create problems in our simulations because the assumed model incorporates these effects adequately. However, if the target data had been collected from a population with high rates of undetected immigration, for example, and the model used to generate the priors did not incorporate this, then biological inferences could be misleading. Consequently, researchers should temper their interpretations with careful consideration of the assumptions (*e.g.*, immigration, no substructuring, neutral markers) used in creating the prior probability distribution and how differences within the biological context that created the data set of interest might affect their inferences.

It is also important to note that the quality of the information provided by each summary statistic will vary with sampling conditions, amounts of genetic drift, and type of genetic markers. For example, if molecular markers or populations with few alleles are studied, then the mean loss of alleles per locus may be a less informative summary statistic than, for example, divergence between samples. Thus, careful attention should be paid to choosing statistics that are likely to be informative, as one seeks to maximize the amount of information that can be extracted from a data set while avoiding the curse of dimensionality created by using many different summary statistics (Beaumont *et al.* 2002). Ideally, both the literature and simulations should be used to help choose the best summary statistics for a given set of biological conditions and sampling constraints.

### Conclusions:

The SummStat estimator performs well, relative to the others, using only four summary statistics. It is the least biased method over the full range of parameter values investigated, has an RMSE intermediate to the others in most of the scenarios investigated, and performed well when applied to a real data set. However, MLNE generally has the smallest RMSE of the estimators compared, despite a negative bias and tendency to exclude true *N*_{e} from the upper confidence intervals when few generations pass between samples. The complementary properties of MLNE (precision) and SummStat (accuracy) suggest that it would be wise to use both to estimate *N*_{e}. TMVP may be more appropriate when reproduction occurs continuously rather than in discrete generations.

The SummStat estimator is best viewed as a flexible approach to *N*_{e} estimation, in terms of both modeling and choice of summary statistics, as shown by the ease with which it is modified to consider a very different prior from a Dirichlet. An advantage of the SummStat estimator is that one can combine any potentially informative summary statistics calculated from sample data into a single approach to *N*_{e} estimation. For example, it could conceivably be extended beyond the allele frequency-based information used in the present example to include genotypic information, thus extracting more information provided by the data (Hill 1981; Waples 1991). The time-saving attributes of approximate Bayesian approaches relative to exact Bayesian methods make them an appealing alternative for population genetics applications. New molecular technologies provide ever-increasing numbers of molecular markers, which, in turn, make full-likelihood calculations very time consuming. The performance of the SummStat *N*_{e} estimator suggests further exploration and expansion using approximate Bayesian computation may be a fruitful means to improve efficient *N*_{e} estimation.

## Acknowledgments

We thank L. Excoffier, J. Wang, and R. Waples for helpful reviews and suggestions that improved this manuscript. We also thank J. Wang for graciously recompiling slightly altered versions of his program for us. This work was supported by a National Science Foundation postdoctoral grant (INT 0202707) to D.A.T. and by the European Commission (Econogene contract QLK5-CT-2001-02461) to G.H.L. M.A.B. is supported by a Natural Environment Research Council Advanced Fellowship.

## Footnotes

Communicating editor: L. Excoffier

- Received December 26, 2003.
- Accepted March 1, 2004.

- Genetics Society of America