Temporal Estimates of Effective Population Size in Species With Overlapping Generations
Robin S. Waples, Masashi Yokota


The standard temporal method for estimating effective population size (Ne) assumes that generations are discrete, but it is routinely applied to species with overlapping generations. We evaluated bias in the estimates Math caused by violation of this assumption, using simulated data for three model species: humans (type I survival), sparrow (type II), and barnacle (type III). We verify a previous proposal by Felsenstein that weighting individuals by reproductive value is the correct way to calculate parametric population allele frequencies, in which case the rate of change in age-structured populations conforms to that predicted by discrete-generation models. When the standard temporal method is applied to age-structured species, typical sampling regimes (sampling only newborns or adults; randomly sampling the entire population) do not yield properly weighted allele frequencies and result in biased Math. The direction and magnitude of the bias are shown to depend on the sampling method and the species' life history. Results for populations that grow (or decline) at a constant rate paralleled those for populations of constant size. If sufficient demographic data are available and certain sampling restrictions are met, the Jorde–Ryman modification of the temporal method can be applied to any species with overlapping generations. Alternatively, spacing the temporal samples many generations apart maximizes the drift signal compared to sampling biases associated with age structure.

BECAUSE effective population size (Ne) is an important parameter in evolutionary biology but is notoriously difficult to measure in natural populations, considerable interest has focused on genetic methods for estimating Ne (reviewed by Beaumont 2003; Leberg 2005; Wang 2005). By far the most widely used genetic approach for estimating contemporary Ne is the temporal method (Krimbas and Tsakas 1971; Nei and Tajima 1981), so called because it depends on estimates of allele frequency taken from a population at two or more points in time. In addition to other simplifying assumptions, the standard temporal method assumes that generations are discrete, whereas many species are age structured and hence have generations that overlap. Two variations of the temporal method can account for effects of age structure, at least in some circumstances. Waples (1990) developed a modified temporal method for species with life histories like Pacific salmon (semelparous with variable age at maturity), but this model is not intended for use with iteroparous species. A more general model for age-structured, iteroparous species was developed by Jorde and Ryman (1995), who showed that the magnitude of allele-frequency change is determined not only by effective size and the sampling interval, but also by age-specific survival and birth rates. They derived an adjustment to the standard model to account for age-structure effects, and subsequent evaluations (e.g., Jorde and Ryman 1996; Palm et al. 2003) documented the biases in estimates of effective size that can result from application of the standard temporal method without accounting for overlapping generations. However, the Jorde–Ryman method requires detailed demographic information and the ability to age individuals or group them into single-cohort samples and perhaps for these reasons has not been widely applied (but see Turner et al. 2002 and Palm et al. 2003 for examples). In spite of its obvious limitations, the discrete-generation temporal method has been and continues to be widely applied to species with overlapping generations (e.g., Scribner et al. 1997; Johnson et al. 2004; Hoffman et al. 2004; Kaeuffer et al. 2004; Poulsen et al. 2006). A priori, we expect bias in the resulting estimate of Ne when an otherwise unbiased method is applied to situations that violate assumptions of the model. With respect to application of the standard temporal method to species with overlapping generations, it is possible that the biases could be small if the elapsed time between samples is long enough that the drift signal strongly dominates sampling considerations (as suggested by Jorde and Ryman 1995 and assumed, for example, by Miller and Kapuscinski 1997 and Hauser et al. 2002). However, under what specific circumstances this might be true cannot be determined without a quantitative analysis, nor can the magnitude and direction of biases associated with samples more closely spaced in time.

In this article we evaluate performance of the standard temporal method when it is applied to iteroparous species with overlapping generations. Central to our evaluations is establishing a point of reference for evaluating the true Ne of such a population. For this we draw on two important contributions of previous authors. First, Hill (1972) showed that discrete-generation models for Ne can be modified to apply to organisms with overlapping generations, provided that the population is of constant size (N) and demographically stable. This means, for example, that over t generations of genetic drift, a population with overlapping generations should experience the same amount of allele-frequency change as a population with discrete generations and the same effective size per generation. But to test this, one must be able to measure (or estimate) the population allele frequency at a given point in time. This is straightforward if generations are discrete, but how can one measure the parametric allele frequency of a population when generations overlap? Felsenstein (1971) proposed that the correct way to calculate a population allele frequency in this case is to weight each individual by its reproductive value. We first show that this method correctly predicts the rate of allele-frequency change in simulated, iteroparous populations of constant size. Next, we evaluate bias of estimates of Ne using the standard temporal method and how they vary as a function of the species' life history and various commonly used sampling strategies.

Populations that change in size are of particular interest to evolutionary biologists and conservation biologists. Changing population size does not present a problem for the standard, discrete-generation temporal model (if population size varies the method estimates the harmonic mean Ne over the time between samples), but most models for Ne in species with overlapping generations depend on the assumption of constant population size. However, Felsenstein (1971) derived an expression for the variance-effective size in species with overlapping generations that grow (or decline) at a constant rate, in which case age structure remains constant. To complete our evaluations, we used Felsenstein's model to establish the benchmark “true” effective size in populations that deterministically change in size and evaluated performance of the standard temporal method under these conditions.


Definition of Ne when generations overlap:

Constant N:

The models developed by Felsenstein (1971) and Hill (1972, 1979) both have features that are useful for our analyses. Notation is consistent with that used by Felsenstein (1971) and is summarized in Table 1. The species is a monoecious diploid with the possibility of selfing. Demographic parameters are fixed; exactly N1 individuals are born in each time period, so the population will eventually reach a stable age distribution. Time units, indexed by t, are in years except as noted. The fraction of newborns that survive to age x is lx = Nx/N1, where Nx is the number in age class x. During a single time interval, individuals of age x produce an average of bx offspring that survive to the beginning of age class 1, so the probability that a newborn has a parent of age x is lxbx. The cohort size of newborns is Math. In this model, Math, generation length is given by Math, and reproductive value (Fisher 1958; Felsenstein 1971) can be calculated as Math.

View this table:

Notation used

Felsenstein (1971) showed that Ne in species with overlapping generations can be calculated directly from the age-specific survival and fecundity parameters contained in a standard Leslie matrix. In his model, each year within each age class there is random (binomial) variation among individuals in reproductive success, random survival of individuals between age classes, and no correlation between mortality and fecundity. Under these conditions, and assuming constant population size, inbreeding and variance-effective sizes are the same and are given byMath(1)(Felsenstein 1971, Equation 10), where dx is the probability of death at the end of age x and sx = 1 − dx.

Hill (1972, 1979) considered variance Ne in a model similar to Felsenstein's except that Hill made no particular assumption about variation in reproductive success among individuals. For monoecious diploids with random selfing, Hill showed that effective size is given byMath(2)where N1 and T are as defined above and Vk is the lifetime variance in reproductive success (production of newborns) by the N1 individuals making up a cohort of newborns. With random variation in (lifetime) reproductive success (Vk ≈ 2), Ne = N1T, which is the number of individuals entering the population over a period of one generation. In general, however, age structure leads to Vk > 2 and Ne < N1T.

Equation 2 is more general than Equation 1 but the latter is more convenient to use with data from a typical life table. However, if one assumes that the distribution of reproductive success is Poisson within each age class, Equations 1 and 2 are comparable (Johnson 1977). Under this assumption, a reformulation of Equation 2 (see appendix a) provides a way of implementing Hill's model on the basis of population vital rates,Math(3)where Math is the average lifetime reproductive success of individuals that die between years x and x + 1. Effective size calculated from Equation 3 is referred to as Ne(H).

Changing N:

Felsenstein (1971) also developed an expression for variance Ne in populations that are changing size at a constant rate λ per time period. λ is the dominant eigenvalue of the Leslie matrix and is the unique real solution to the discrete-time version of the Euler–Lotka equation Math. When population size changes deterministically, analogs to the life-history parameters described above are Math and Math, and effective size is given byMath(4)(Felsenstein 1971, Equation 24), where N1′ is the number of newborns in the next time interval. If λ = 1, N1′ = N1 and Equation 4 is identical to Equation 1 except for the last term in the denominator, which is zero if the number of offspring per parent in one time interval is Poisson (Felsenstein 1971). Effective size calculated from Equation 4 is referred to as Ne(F).

Model species:

We chose three model species (Figure 1; Table 2), each representative of one of the three basic survival schedules. Humans are a classic type I survivorship species, with high survival well into adulthood followed by a period of rapidly increasing mortality. The white-crowned sparrow (Zonotrichia leucophrys nuttalli; Baker et al. 1981) has a modified type II survivorship curve, with a constant survival rate after an episode of high early mortality. The barnacle (Balanus glandula; Connell 1970) exhibits a classical type III survivorship curve with very high early mortality, even after we reduced fecundity and age-1 mortality by an order of magnitude from published values to make the simulations more tractable. We used the human demographic data analyzed by Felsenstein (1971), which are arranged into 5-year age classes. For the other two species, life-history parameters were modified slightly from published values to provide for a constant population size and integer generation lengths in units of years (T = 3 years for the sparrow and 4 years for the barnacle). The published data for all three species are for females only, so for the purposes of this exercise we assumed that the same values apply to the entire population.

Figure 1.—

Survivorship curves for the three model species. See Table 2 for data sources.

View this table:

Life-history parameters for the model species, scaled to constant population size

Computer simulations:

We used computer simulations to model drift variance in allele frequency in populations with demographic parameters characteristic of the three life-history types. We considered both stable and growing populations.

Constant population size:

For each species we considered a “small” and a “large” population size, indexed by N1, the number of newborns produced each year [“newborns” were enumerated as the number of live births (humans), clutch size (sparrow), and plankton just after hatching (barnacle)]. The small and large N1 values were: human, 102, 103; sparrow, 103, 104; and barnacle, 105, 106 (Table 3). These population sizes translated into Ne values of order 102 for the small population sizes and of order 103 for the large population sizes (Table 3), on the basis of application of Equations 1 and 3 to the life-history data in Table 2. Given a fixed value of N1, the stable age distribution (and hence the number in each age class) is given by the right eigenvector of the Leslie matrix, which we calculated using the power method described by Caswell (2001). For the simulations, we rounded the number of individuals in each age class to the nearest integer, yielding (for each population size in each species) a vector of Nx values representing the (fixed) number of individuals in age class x.

View this table:

Fixed numbers in each age class and effective population size for constant populations of the three model species

For each fixed demographic trajectory, we modeled the stochastic process of genetic drift by randomly drawing genes to represent birth of newborns and survival from one age class to the next. At each time t, we calculated the frequency of a “gamete pool” of infinite size as the weighted mean of the allele frequencies in each age class of reproductive adults: Math. In generating this gamete pool, all individuals within an age class contribute equally, but the total contribution differs among age classes on the basis of age-specific survival and birth rates. Therefore, the process for reproduction simulated an array of Wright–Fisher subpopulations with the possibility of selfing, stratified by age. [Whether a species is monoecious or dioecious, and whether or not selfing is allowed, has a negligible effect on Ne (Crow and Denniston 1988; Caballero 1994); therefore, this aspect of the model should not have appreciably influenced the results.] Next, we calculated the allele frequency in the N1 newborns in time unit t + 1 (P1(t+1)) by drawing 2N1 genes binomially from this gamete pool. Finally, for each age class x > 1, allele frequency at time t + 1 was calculated by sampling hypergeometrically (without replacement) from the 2Nx−1 genes representing the frequency in age class x − 1 at time t. This entire random process of sampling genes due to births and deaths was then repeated to generate an age-structured vector of allele frequencies for the next time unit. Simulated populations were initialized using the stable age distribution and with the same allele frequency in all age classes at time 0 [Px(0) = 0.5, x = 1, 2, …]. Each replicate simulation was run for 100 time units. Each run of 100 time units was taken to represent the trajectory of one gene locus, and the entire procedure was repeated for each additional locus (5000 loci total).

At each time unit, population allele frequencies were computed parametrically (by exhaustive population census) and estimated by sampling subsets of individuals. Parametric frequencies were computed two ways: (1) the standard method, which involves counting alleles in all individuals in the population and leads to an unweighted frequency Math, and (2) the weighted method, in which each individual is weighted by its reproductive value (Math; Felsenstein 1971).

Sampling of individuals was simulated by random draw of the required number of genes from specified age classes, and sample allele frequencies were computed by counting alleles (hence the sample frequencies were unweighted). We considered three different sample sizes (S1 = S2 = 25, 50, and 100) and three different sampling regimes: (a) all individuals are at equal risk of being sampled regardless of their age, (b) only reproductive adults (those in age classes with bx > 0) are subject to sampling, and (c) only newborns are subject to sampling (so sampling is from a single cohort). In all cases, genes were counted without replacement, so no individual could be sampled more than once in any time period. However, after the sampling (or enumeration) was completed, all genes were returned to the population, so sampling did not affect the future demographic or genetic trajectory of the population.

After initializing all age classes at the same frequency (P0 = 0.5), a period of time is needed before the population reaches a dynamic equilibrium with respect to the amount of random allele-frequency variation among years and among age classes within years. In agreement with results reported by previous authors (Jorde and Ryman 1995; Waples 2002), we found that this dynamic equilibrium was reached quickly (within a few generations) under the conditions we modeled (data not shown). Therefore, we allowed each replicate to “warm up” for 20 time periods (periods 0–19) before collecting data.

Beginning in period 20, parametric or estimated population allele frequencies were compared at several different time intervals (L): 1 time unit and 1, 5, or 10 generations (T, 5T, and 10T time units, respectively). For each time interval, we used the method of Nei and Tajima (1981) to estimate F, the standardized variance of allele frequencies at two points in time. For diallelic loci such as those considered here, this measure is calculated asMath(5)where a is the number of loci and Pi1 and Pi2 are gene frequencies at the ith locus in the first and second samples, respectively. We also considered a closely related measure (Math) proposed by Pollak (1983). As previous authors have shown that the two measures generally lead to comparable results, we focused on Math because it has a smaller variance for diallelic loci (Tajima and Nei 1984; Waples 1989). However, we found that in some circumstances Math differed substantially depending on which estimator of F was used, and these cases are noted below.

For each time interval, we computed an overall Math as the mean across the 5000 replicate loci. After advancing the initial time period by one generation (e.g., first sample occurs in time period 20 + T), this process was repeated for each interval of L years, generating a vector of mean Math values for a number of intervals of the same duration equally spaced one generation apart across the last 80 years of each replicate. For some analyses, we also averaged across the elements of these vectors to arrive at an overall mean Math characteristic of a particular sampling interval of L years.

To evaluate accuracy of the demographic formula for Ne, we compared observed rates of allele frequency change in simulated populations with the magnitude of change expected assuming that true Ne was as given in Table 3. The expected variance of parametric population allele frequency at time period t among replicate populations or gene loci is given byMath(6)where P0 = initial allele frequency (0.5) and g = t/T = elapsed time in generations. For the simulated populations, we calculated an observed variance in allele frequency at time period t as the variance in Math among 5000 replicate gene loci.

Estimating Ne with the temporal method:

In the temporal method, effective size is estimated by relating the observed amount of allele frequency change to that expected under pure drift. The standard temporal method (Krimbas and Tsakas 1971; Nei and Tajima 1981; Pollak 1983; Waples 1989) is commonly referred to as the moment method because it relies on an estimate of F. Under a pure drift model, the expected value of Math depends on Ne, the number of generations between samples (g), and the sampling regime. Here we consider nonlethal sampling of the population (with replacement, as in collecting a biopsy for DNA analysis). This corresponds to sampling plan I described by Nei and Tajima (1981) and Waples (1989). Under these conditions,MathandMathwhere S1 and S2 are the sample sizes at times 1 and 2, N is the number of individuals at risk of being sampled at the time of the first sample, and g = L/T = elapsed time in generations. Because sample sizes S1 and S2 were the same in our analyses of constant population size, the above equation simplifies toMath(7)

The term 1/N accounts for the covariance of allele frequencies at times 1 and 2 that arises in plan I sampling because some individuals in the initial sample can also contribute genes to future generations. The appropriate value of N depends on the sampling scheme. If sampling is random with respect to the entire population, the estimator of Ne becomesMath(8)(sampling from entire population). For an exhaustive population census, S = Math, leading toMath(9)(complete population census). If only reproductive adults are sampled, then the number subject to sampling is the total in age classes equal to or older than the age at first maturity (j) and younger than the age of senescence, q (at which age bx = 0 again):Math(10)(sampling mature adults). Finally, if each sample is taken from a single cohort of N1 newborns, the appropriate estimator isMath(11)(sampling newborns). Assuming that survival among age classes is random, Equation 11 also applies to any random sample of a single cohort; it is necessary to substitute for N1 only the total number remaining in the cohort at the time of sampling.

Jorde and Ryman (1995) modified the standard temporal method to account for overlapping generations. Their modified estimator is (in current notation and assuming plan I sampling)Math(12)where C is a constant that depends on life-history features of the population. Jorde and Ryman's model assumes that population size and demographic parameters are constant and is based on a comparison of consecutive cohorts, which can be achieved by sampling single cohorts or sampling mixed cohorts and sorting individuals by age to reconstruct single-cohort samples. We sampled randomly from the N1 newborns in successive years and used Equation 12 to estimate Ne using the Jorde–Ryman method. The recurrent formulas (10–13) and (23) in Jorde and Ryman (1995) were used to calculate C. This iterative process rapidly converged on a constant C value for each life-history scenario we considered.

Changing population size:

We also modeled “human” and “sparrow” populations that grew deterministically, increasing each time period by the multiplicative factor λ. We used the λ values estimated from the original published data (1.037 for humans and 1.063 for the sparrow). In addition, we considered a human population that declined at the rate λ = 0.994 per time period. These populations were initiated and allowed to run for 20 time periods (periods 0–19) at constant size (determined by low N1) and demographic parameters given in Table 2. At time period 20, population growth was generated by scaling the birth rates to provide the desired λ, while the lx values remained unchanged. At time period 20 the age structure was also adjusted to conform to the stable age distribution for a growing (or declining) population. In each successive time period, the vector of Nx values was obtained by applying the birth and death rates (Table 6) to the current population: the newborn cohort was generated according to Math, and random mortality as described above determined which genes survived from one age class to the next. We generated 100 time periods of population sizes using real numbers for the Nx but rounded these to integers in each time period before conducting the simulations. Under the modeled growth rates, the human population (excluding senescent individuals) increased from 826 to >31,000, and the sparrow population size climbed from 1326 to >5 × 105 (Table 6). In the human decline scenario, population size dropped from 5118 to 2807.

With changing population size, effective size also changes every time interval. At a given time t, Equation 4 gives a per-generation effective size (Ne(Ft)) that characterizes the expected amount of gene frequency drift that occurs between times t and t + T (Felsenstein 1971). This definition creates an analytical difficulty here. Because there is a different Ne(Ft) for every time period, genetic change in the population is described by an array of Ne(Ft) values that apply to partially overlapping periods of one generation each, and it is not immediately clear how to determine the appropriate Ne value for any given point in time. In the present context, we need to evaluate genetic change over one or more individual time units that are fractions of a generation. Therefore, we need a way to calculate E(Ne(t)), which is the effective size that describes genetic change between times t and t + 1. As shown in appendix b, E(Ne(t)) is just the instantaneous Ne for the point midway between times t and t + 1,Math(13)where Ne*(t) is an instantaneous effective size that describes the actual rate of change in the population at time t. E(Ne(t)) from Equation 13 can be compared with Math estimated from samples taken in time periods t and t + 1. For longer time intervals, let E(Ne(t,t+x)) represent the effective size that determines the rate of change in the population between time periods t and t + x. Then E(Ne(t,t+x)) can be calculated as the harmonic mean of the E(Ne(t)) values for time periods t through t + x − 1.

When effective size changes over time, the expected variance of parametric population allele frequency at time period t among replicate populations or gene loci isMathwhere Ne(i) is the effective size for the time period i to i + 1. We used Equation 13 to calculate E(Ne(i)) for each time period.

To estimate Ne, we used the procedure described above to calculate Math for various time intervals and sampling regimes. With changing population size, however, it is necessary to adjust the terms for N to correspond to the correct time period. Therefore, in Equations 8, 10, and 11 we used the values for Math, Math, and N1, respectively, that applied to the time period of the first sample. The sample sizes of 25, 50, and 100 were fixed, but when complete population enumeration was done to calculate parametric population allele frequencies in growing populations, the sample size at time 2 was larger than that at time 1: S2 > S1 = Math. Therefore, Equation 9 was modified as follows:Math(14)(complete population census; growing population).


Constant population size:

For the three model species, we compared the demographic calculations of Ne (based on Felsenstein 1971) with values calculated by two other methods (Equation 3, modified from Hill 1972, and Jorde and Ryman 1995). The three methods produced essentially identical values of Ne under both low and high N1 (Table 3). For simplicity, in what follows we use only Equation 4 to calculate Ne from demographic data.

Comparison of expected and observed rates of change in allele frequency:

As shown in Figure 2, the observed variance in weighted population allele frequencies Math very closely tracked the expected variance in all three model species, indicating that the putative Ne values in Table 3 accurately describe the variance effective size of these populations. For the remainder of this section, therefore, we refer to the values in Table 3 as the “true” Ne.

Figure 2.—

Comparison of observed and expected Var(Pt) for the three model species under constant population size and low N1. Comparable results were found for high N1 (data not shown).

Genetic estimates of Ne:

Figure 3 compares true Ne values with estimates of Ne based on the standard temporal method, using complete population census to calculate weighted allele frequencies (Math). For both small and large cohort sizes in all three species, agreement of expected and true Ne was very good. Under every scenario, Math for a given time interval fluctuated randomly around the true value. Figure 3 shows results for low N1 for samples taken 1 year apart; similar results were found for high N1 and longer time between samples (L = 1, 5, or 10T; Table 4). At the extreme, deviations of Math from true Ne were <3% (sampling interval of 10T in humans and barnacle).

Figure 3.—

Comparison of estimated Ne (Formula) and expected Ne [E(Ne)] for the low N1 simulations of the three model species under constant population size. Expected Ne uses Equation 4 and the life-table data in Table 2. Estimated Ne was calculated using the standard temporal method (Equation 9) on the basis of samples taken 1 year apart; Formula was computed by complete population enumeration, using either unweighted allele frequencies or frequencies weighted by reproductive value.

View this table:

Effective population size estimated for the three model species using the standard temporal method

We also estimated Ne using the temporal method based on a complete population census using unweighted allele frequencies (Math); this produced estimates of Ne that were substantially biased in all cases: ∼50% too high for humans and ∼50% too low for the other two species (Figure 3; Table 4). That is, complete enumeration of all individuals alive at a given time does not provide a reliable way of calculating parametric population allele frequencies in species with overlapping generations.

The array of different sampling regimes produced a range of outcomes in the three species (Table 4). Two general patterns can be identified. First, small samples (S = 25) produced estimates of Ne that had large biases under many scenarios, even when Math was averaged across 5000 replicate gene loci. Sample sizes of 50 or 100 produced Ne estimates that were less variable and more accurate.

Second, under most sampling regimes in most species, Math was severely biased when F was estimated over short time intervals but approached true Ne as the sampling interval increased. For example, Figure 4 plots Math as a function of time between samples for random samples from either newborns or the entire population. In all six scenarios, the longest sampling interval (10T) produced the most accurate estimate of Ne. In five of the six cases the estimates for shorter time intervals were biased more strongly downward; however, sampling from the entire population for humans led to overestimates of Ne, and the bias was high for L = 1 generation (Figure 4). Figure 4 shows results for conditions under which the temporal method has the most power: large sample size (S = 100) and low N1 (hence relatively small Ne). The same basic pattern (more accurate estimates of Ne over longer time periods) is seen, albeit sometimes less clearly, in the scenarios with smaller sample sizes and larger N1 (Table 4). Sampling only from adult age classes produced results qualitatively similar to those for sampling from the entire population under all scenarios (Table 4).

Figure 4.—

The ratio Formula for the three model species under constant population size and low N1. E(Ne) was computed using Equation 4 and the life-table data in Table 2. Formula was calculated using Formula in the standard temporal method, using random samples either from the population as a whole (Equation 8; solid symbols) or only from the cohort of newborns (Equation 11; open symbols). Sample sizes at times 1 and 2 were fixed at S1 = S2 = 100 but the interval between samples varied from L = 1 to 10 generations.

Finally, we evaluated estimates of Ne from samples (rather than complete population census) of individuals weighted by reproductive value. These estimates were consistently biased downward (data not shown), a result that we determined arises because weighting the samples increases the sampling variance of allele frequency beyond that expected for the nominal sample size S. Although it is possible, for a given set of individual weights, to account for this increase in variance to arrive at an effective sample size, in natural populations the information necessary to do so will rarely be available. Consequently, attempting to weight samples by reproductive value is unlikely to be a practical solution and we did not pursue this idea further.

The Jorde–Ryman method was designed for application to data for two successive cohorts (equivalent to sampling newborns in two consecutive years in our model), and under that scenario Math for the Jorde–Ryman method asymptotically approached the true Ne as sample size increased (Table 5). In their published example (which was for a species with type I survivorship), Jorde and Ryman found that with S = 30, Math based on Math was ∼10% too high. With the human life history, we found comparable results for S = 25 and low N1 (8% upward bias in Math; Table 5), but this bias was considerably less for larger samples. For the sparrow and barnacle life histories, the Jorde–Ryman method also produced accurate estimates for large sample size but the upward bias was more extreme (up to 100%) for S = 25. A more pronounced upward bias was also found for high N1 simulations, especially for small sample sizes (Table 5). When we used the Jorde–Ryman method with Pollak's measure Math, we found the same general pattern (Math asymptotically accurate for large samples) except that small samples led to underestimates (rather than overestimates) of Ne (data not shown).

View this table:

Effective population size estimated for constant populations of the three model species using the method of Jorde and Ryman (1995)

Ne/N ratio:

As pointed out by previous authors (e.g., Nunney 1993; Frankham 1995), the Ne/N ratio is sensitive to the choice of which age classes to include in calculating N. Results in Table 3 illustrate this point. For the barnacle, Ne/N is 1.2 × 10−3 if population size is taken to be all individuals alive at a given time (Math) but is 0.79 (almost three orders of magnitude higher) if only mature adults are counted (N = Nadult). [Note that to make our simulations more tractable we reduced fecundity and increased age-1 survival of the barnacle by an order of magnitude; the difference between Ne/N ratios would be even more extreme using the published data]. The effects are not as dramatic in the other species, but even for humans the ratio Ne/N is over twice as high if N is taken to be the number of reproductive adults rather than the total population size (0.76 vs. 0.34; Table 3).

Changing population size:

Comparison of expected and observed rates of change in allele frequency:

For all three scenarios (human growth and decline and sparrow growth; see Table 6), the observed variance in weighted population allele frequencies Math closely tracked the expected variance (data not shown, but results are comparable to those shown in Figure 2), indicating that Equation 13 accurately predicts the effective size as it changes over time.

View this table:

Life-history parameters for the model species in populations that change in size

Genetic estimates of Ne:

Agreement between Math and E(Ne) for populations that changed in size was excellent for all three scenarios considered. Median Math ratios were almost exactly 1.0 for all time periods when weighted population allele frequencies were used to calculate Math (Table 7). Other sampling regimes produced results that roughly paralleled those for populations of constant size. In humans, a population census with unweighted frequencies led to overestimates of Ne for short time periods, with lower bias over longer time periods. Conversely, sampling newborns led to gross underestimates of Ne for short time periods (Table 7). In the sparrow, a complete population census using unweighted allele frequencies and random sampling from newborns both produced severe underestimates of Ne for short time intervals, but Math approached E(Ne) for longer time intervals. In contrast, sampling from the entire population or only adults led to substantial overestimates for short time periods. These patterns are all consistent with those found for constant population size (Tables 4 and 7).

View this table:

Ratio of estimated (Formula) to expected [E(Ne)] effective size for the three model species in populations that change in size


Felsenstein (1971) proposed, but did not prove, that weighting each individual by its reproductive value is the correct way to calculate parametric population allele frequencies in species with overlapping generations. Our results demonstrate that this is indeed the case, regardless of the species' life history and regardless of whether the population is fixed or changing in size at a constant rate. In contrast, measuring the magnitude of population change using unweighted population allele frequencies led to large departures from expected values in all scenarios considered, even when the entire population was sampled.

These results indicate that caution is needed in interpreting standard-model temporal estimates of effective population size in species with overlapping generations. The resulting estimates will be biased unless the entire population is sampled and individual genotypes are weighted by their reproductive value. Unfortunately, this is not a practical solution to the problem of estimating Ne. Rarely is it feasible to sample an entire population, and, even if this were possible (e.g., as might be the case in small captive populations), the age-specific life history data needed to compute individual reproductive values typically are not available. As discussed above, taking samples and weighting allele frequencies by reproductive value also are generally not feasible. One solution to this problem is to use the Jorde–Ryman method (Jorde and Ryman 1995), which we found generally performed well when it was applied to situations for which it was designed: analysis of samples from consecutive cohorts. This method can be used with any age-structured species, but in practice it has not been widely applied, perhaps because it requires both detailed demographic information to compute the correction factor C and samples from single cohorts in successive years. Most applications of the temporal method for species with overlapping generations continue to use the standard discrete-generation model, so it is important to consider the nature and magnitude of the bias that is likely to result from common sampling schemes. These results are complex, but some patterns are apparent.

First, bias in Math is largest for short time intervals and in many cases largely disappears after 5–10 generations between samples (Figure 4; Table 4). This pattern, which was also seen in analyses of growing populations (Table 7), can be understood by noting that the magnitude of Math (and hence Math) is determined by two major components: a component related to drift (the signal) and one related to sampling a finite number of individuals (the noise). The bias arises because the adjustment for sampling derived for the discrete-generation model does not capture all of the complexities associated with sampling from age-structured populations (Jorde and Ryman 1995). The magnitude of this sampling bias is fixed by the sample size and the nature of the sampling in relation to the species' life history; it does not change with the amount of time between samples. In contrast, a longer sampling interval allows more episodes of genetic drift to influence Math, thus increasing the signal-to-noise ratio in the data.

The second noteworthy pattern is that the direction of bias in Math based on Math differs markedly between the human and the barnacle: random sampling from the population (or a complete population census with unweighted allele frequencies) leads to an overestimate of Ne in humans but an underestimate in the barnacle. In the barnacle, the vast majority of individuals alive at any time are newborns, so a random sample will primarily reflect frequencies in the newborns, which are derived only from the fraction of the population that reproduced the previous year. The result is a sampling variance greater than would be expected from a random sample from the generation as a whole, with the consequence that Math is higher than expected and Ne is underestimated. Humans are much more evenly distributed across age classes, including a substantial contingent of older individuals with low reproductive value that represent a sort of genetic inertia in the population. If their frequencies are weighted equally with younger individuals that have higher reproductive value, the result will be an underestimate of the rate of genetic change and an overestimate of Ne. Sampling only human newborns, however, leads to an underestimate of Ne, as it does in the barnacle, because the progeny are drawn from a relatively small fraction of the population that reproduces in any given time period. Use of Math exacerbates the downward bias in Math for the barnacle (data not shown), because with diallelic loci Math is always larger than Math (Waples 1989), resulting in a lower Math. Use of Math with humans led to quite variable results, with Math being biased either upward or downward depending on the sampling regime, sample size, and elapsed time between samples (data not shown).

Results for the sparrow are somewhat intermediate but closer to the barnacle than to the human and consistent between the constant size and growth scenarios (Figures 3 and 4; Tables 4 and 7). Presumably this reflects high mortality in the sparrow between ages 1 and 2, which causes a large fraction of the population to have low reproductive value, as in the barnacle. Thus, although the sparrow displays a classical type II survival curve after age 1, the episode of high juvenile mortality has important consequences for estimation of effective population size.

Finally, an interesting result is that, at least for small N1, the magnitude of bias in Math was larger for the human than for the other two species (Table 4, Figure 4). This might mean that species with type I survivorship are particularly prone to bias with use of the standard temporal method. However, the generality of this result requires further evaluation, as other factors such as longer generation time could also be involved.


Our analyses have focused on bias, but precision is also an important consideration for any genetic method of estimating Ne. Although it is beyond the scope of this article to evaluate precision in any comprehensive way, some general observations can be made on the basis of previously published results (Nei and Tajima 1981; Waples 1989; Jorde and Ryman 1995). First, the same proportional increase in sample size, elapsed time between samples, and number of independent alleles used in the estimate of Math all have approximately the same effect on precision. Our results show that in some cases it is possible to get largely unbiased estimates of Ne from samples taken only a fraction of a generation apart. In general, however, such estimates would have very low precision unless Ne were very small and samples of individuals and loci very large. As the number of alleles used to compute Math increases, the mean contributions from sampling and drift both stabilize around their expected values and precision increases. With existing technology it is quite feasible to collect data for 10–20 highly polymorphic microsatellite loci in many natural populations, which can produce precise estimates of Ne under certain conditions.

Second, the temporal method can be used most effectively to study populations with small Ne, because the signal from drift is large relative to sampling error. With any genetic method it can be difficult to distinguish a large population from a very large one. For example, under many scenarios with high N1 the point estimates of Ne were infinity, even when Math was averaged across 5000 gene loci (Tables 4 and 5). A similar effect was seen in growing populations, which reached a large size before the end of the simulations (Table 7).

Finally, results obtained by Jorde and Ryman (1995) suggest that the coefficient of variation of Math when generations overlap is comparable to that for the discrete-generation model, so parametric confidence intervals for Math for the standard temporal method (Waples 1989) should also be applicable to iteroparous species.

Likelihood-based methods for estimating Ne:

In recent years a number of likelihood-based methods (Williamson and Slatkin 1999; Anderson et al. 2000; Wang 2001; Berthier et al. 2002) have been proposed for estimating Ne. In general, results for these methods are roughly comparable to the standard temporal method for moderate allele frequencies but are less biased and more precise when many low-frequency alleles are considered. These methods are all computationally demanding and were not evaluated here. However, all depend on estimates of allele-frequency change in samples taken at two or more points in time. Therefore, they should all be affected in the same general way by the various age-structure biases considered in this article. This conjecture should be tested empirically. The empirical Bayesian method proposed by Tallmon et al. (2004) should be able to deal with overlapping generations by modifying life-history parameters of the modeled species.

Fluctuating population size:

Our results confirmed the basic elements of Felsenstein's (1971) model for populations that change in size at a constant rate, regardless of whether they are increasing or decreasing. However, many populations fluctuate around a “mean” population size. Waples (2005) evaluated performance of the temporal method for semelparous, age-structured species following the Pacific salmon model, but comparable analyses have not been done for iteroparous species. The model recently developed by Engen et al. (2005) for defining Ne in populations with overlapping generations that fluctuate in size could form the basis for such an evaluation in the future.


Results discussed above lead to the following conclusions regarding application of the temporal method to iteroparous, age-structured species:

  • The Jorde–Ryman method (Jorde and Ryman 1995) is perhaps the best option if appropriate samples and demographic data are available. Ideally, samples from a number of consecutive cohorts can be analyzed, so that an overall mean Math (and associated Math) can be calculated that accounts for small variations over time in population size and demographic parameters (Jorde and Ryman 1996; Palm et al. 2003). If single-cohort samples cannot be collected directly, it might be possible to reconstruct them by ageing individuals. Jorde and Ryman (1995) suggested that if it is not possible to reconstruct consecutive cohorts, the correction factor C might be modified to reflect a different timing of the samples. Some uncertainty in demographic parameters might not seriously affect the results, depending on the species' life history (Jorde and Ryman 1995). Biases associated with small samples can be minimized by using at least 50 individuals in each cohort.

  • If the available data do not allow use of the Jorde–Ryman model, sampling biases can be minimized (and precision enhanced) by taking samples spaced far apart in time (at least three to five generations, preferably more). This in fact has been the assumption adopted by some authors who have used the temporal method with age-structured species, and our results show that it can be a reasonable one in at least some cases. Recent improvements in the ability to extract DNA from historical material can provide opportunities for retrospective analyses that span large numbers of generations (Hauser et al. 2002; Johnson et al. 2004; Poulsen et al. 2006).

  • In some cases (e.g., for species with type I survivorship), estimates of Ne using the standard temporal method for closely spaced, random samples might not be severely biased, but precision of such estimates is unlikely to be satisfactory for most applications. Therefore, when generations overlap and samples are closely spaced in time, standard temporal estimates of Ne should be interpreted with extreme caution. This is particularly true because in these cases the choice of which estimator of F to use can have a profound effect on Math. The minor differences between Math and Math have less impact on Math when a longer time elapses between samples and the drift signal is stronger. As a precaution, those employing the temporal method should compute Math using both Math and Math and use considerable caution in interpreting scenarios under which the estimate of effective size depends heavily on the estimator of F.

  • Large samples of individuals are important, not only for precision but also to help minimize bias arising from various violations of implicit sampling assumptions.

  • Understanding as much as possible about the species' life history is important to design a sampling strategy to minimize potential biases.


Consider a population with constant demographic parameters that produces N1 newborns per time period. The cohort of newborns can be partitioned into x demographic classes of individuals on the basis of age at death. The number of individuals living to age x, reproducing, and then dying before the next time interval is N1lxdx. The average lifetime reproductive success of individuals that die at the end of year x is Math. Population size is constant, so the mean reproductive success of the entire cohort of N1 individuals is Math = 2, which is equal to the weighted mean for each age.

Now let σx2 be the variance in lifetime reproductive success of individuals that die at the end of age x, and let αx = lxdx be the fraction of the initial cohort that reaches age x and then dies. It can be shown (see online Appendix A in Waples 2006 for details, and see Hill 1972 for a related treatment based on continuous time intervals) that the overall variance in reproductive success for the entire cohort isMath(A1)

If the distribution of reproductive success is Poisson within each age class, so that σx2 = Math, this simplifies to Math and (from Equation 2)Math(A2)


The objective is to calculate E(Ne(t)), which is the effective size that describes genetic change between times t and t + 1. This can be accomplished by treating Ne as a continuous function of time. Effective size is directly proportional to total reproductive value of the population, which increases at the rate λ per time unit (Felsenstein 1971). Therefore, Ne can be described by the exponential growth modelMath(B1)where Ne(F0) is the effective size in the generation immediately preceding population growth (506.2 in our human model after adjusting to stable age structure in a growing population). Ne(Ft) describes the “average” rate of change over the generation that starts at time t and ends at time t + T; however, since the population is growing geometrically, the actual effective size must be smaller early in the generation and larger later in the generation. Let us define an instantaneous effective size, denoted by Ne*(t), that describes the actual rate of change in the population at a point t in time. Ne* increases exponentially over the interval t to t + T, and it can be shown that Ne* = Ne(Ft) at the midpoint of the interval. That is, Ne*(t+T/2) = Ne(Ft), which is also equal to the geometric mean of the Ne* values over the period of the generation. It follows that the instantaneous Ne for any time period t can be calculated asMath(B2)That is, instantaneous Ne at time t is just Ne(F) from one-half generation earlier.

We want to find E(Ne(t)), which is analogous to Ne(Ft) except that it describes genetic change over the next time period rather than the next generation. The above logic indicates that E(Ne(t)) is just the instantaneous Ne for the point midway between times t and t + 1:Math(B3)


We thank Per Erik Jorde, Nils Ryman, and two anonymous reviewers for insightful comments on an earlier draft of the manuscript, and we are grateful to Shuichi Kitada and Toshihide Kitakado for stimulating discussions.


  • Communicating editor: J. Wakeley

  • Received August 25, 2006.
  • Accepted October 18, 2006.


View Abstract