## Abstract

The standard temporal method for estimating effective population size (*N*_{e}) assumes that generations are discrete, but it is routinely applied to species with overlapping generations. We evaluated bias in the estimates caused by violation of this assumption, using simulated data for three model species: humans (type I survival), sparrow (type II), and barnacle (type III). We verify a previous proposal by Felsenstein that weighting individuals by reproductive value is the correct way to calculate parametric population allele frequencies, in which case the rate of change in age-structured populations conforms to that predicted by discrete-generation models. When the standard temporal method is applied to age-structured species, typical sampling regimes (sampling only newborns or adults; randomly sampling the entire population) do not yield properly weighted allele frequencies and result in biased $\hat{N}_{e}$. The direction and magnitude of the bias are shown to depend on the sampling method and the species' life history. Results for populations that grow (or decline) at a constant rate paralleled those for populations of constant size. If sufficient demographic data are available and certain sampling restrictions are met, the Jorde–Ryman modification of the temporal method can be applied to any species with overlapping generations. Alternatively, spacing the temporal samples many generations apart maximizes the drift signal compared to sampling biases associated with age structure.

BECAUSE effective population size (*N*_{e}) is an important parameter in evolutionary biology but is notoriously difficult to measure in natural populations, considerable interest has focused on genetic methods for estimating *N*_{e} (reviewed by Beaumont 2003; Leberg 2005; Wang 2005). By far the most widely used genetic approach for estimating contemporary *N*_{e} is the temporal method (Krimbas and Tsakas 1971; Nei and Tajima 1981), so called because it depends on estimates of allele frequency taken from a population at two or more points in time. In addition to other simplifying assumptions, the standard temporal method assumes that generations are discrete, whereas many species are age structured and hence have generations that overlap. Two variations of the temporal method can account for effects of age structure, at least in some circumstances. Waples (1990) developed a modified temporal method for species with life histories like Pacific salmon (semelparous with variable age at maturity), but this model is not intended for use with iteroparous species. A more general model for age-structured, iteroparous species was developed by Jorde and Ryman (1995), who showed that the magnitude of allele-frequency change is determined not only by effective size and the sampling interval, but also by age-specific survival and birth rates. They derived an adjustment to the standard model to account for age-structure effects, and subsequent evaluations (*e.g*., Jorde and Ryman 1996; Palm *et al*. 2003) documented the biases in estimates of effective size that can result from application of the standard temporal method without accounting for overlapping generations. However, the Jorde–Ryman method requires detailed demographic information and the ability to age individuals or group them into single-cohort samples and perhaps for these reasons has not been widely applied (but see Turner *et al*. 2002 and Palm *et al*. 2003 for examples). 
In spite of its obvious limitations, the discrete-generation temporal method has been and continues to be widely applied to species with overlapping generations (*e.g*., Scribner *et al*. 1997; Johnson *et al*. 2004; Hoffman *et al*. 2004; Kaeuffer *et al*. 2004; Poulsen *et al*. 2006). *A priori*, we expect bias in the resulting estimate of *N*_{e} when an otherwise unbiased method is applied to situations that violate assumptions of the model. With respect to application of the standard temporal method to species with overlapping generations, it is possible that the biases could be small if the elapsed time between samples is long enough that the drift signal strongly dominates sampling considerations (as suggested by Jorde and Ryman 1995 and assumed, for example, by Miller and Kapuscinski 1997 and Hauser *et al*. 2002). However, under what specific circumstances this might be true cannot be determined without a quantitative analysis, nor can the magnitude and direction of biases associated with samples more closely spaced in time.

In this article we evaluate performance of the standard temporal method when it is applied to iteroparous species with overlapping generations. Central to our evaluations is establishing a point of reference for evaluating the true *N*_{e} of such a population. For this we draw on two important contributions of previous authors. First, Hill (1972) showed that discrete-generation models for *N*_{e} can be modified to apply to organisms with overlapping generations, provided that the population is of constant size (*N*) and demographically stable. This means, for example, that over *t* generations of genetic drift, a population with overlapping generations should experience the same amount of allele-frequency change as a population with discrete generations and the same effective size per generation. But to test this, one must be able to measure (or estimate) the population allele frequency at a given point in time. This is straightforward if generations are discrete, but how can one measure the parametric allele frequency of a population when generations overlap? Felsenstein (1971) proposed that the correct way to calculate a population allele frequency in this case is to weight each individual by its reproductive value. We first show that this method correctly predicts the rate of allele-frequency change in simulated, iteroparous populations of constant size. Next, we evaluate bias of estimates of *N*_{e} using the standard temporal method and how they vary as a function of the species' life history and various commonly used sampling strategies.

Populations that change in size are of particular interest to evolutionary biologists and conservation biologists. Changing population size does not present a problem for the standard, discrete-generation temporal model (if population size varies the method estimates the harmonic mean *N*_{e} over the time between samples), but most models for *N*_{e} in species with overlapping generations depend on the assumption of constant population size. However, Felsenstein (1971) derived an expression for the variance-effective size in species with overlapping generations that grow (or decline) at a constant rate, in which case age structure remains constant. To complete our evaluations, we used Felsenstein's model to establish the benchmark “true” effective size in populations that deterministically change in size and evaluated performance of the standard temporal method under these conditions.

## METHODS

#### Definition of *N*_{e} when generations overlap:

##### Constant N:

The models developed by Felsenstein (1971) and Hill (1972, 1979) both have features that are useful for our analyses. Notation is consistent with that used by Felsenstein (1971) and is summarized in Table 1. The species is a monoecious diploid with the possibility of selfing. Demographic parameters are fixed; exactly *N*_{1} individuals are born in each time period, so the population will eventually reach a stable age distribution. Time units, indexed by *t*, are in years except as noted. The fraction of newborns that survive to age *x* is *l*_{x} = *N*_{x}/*N*_{1}, where *N*_{x} is the number in age class *x*. During a single time interval, individuals of age *x* produce an average of *b*_{x} offspring that survive to the beginning of age class 1, so the probability that a newborn has a parent of age *x* is *l*_{x}*b*_{x}. The cohort size of newborns is $N_1 = \sum_x b_x N_x$. In this model, $\sum_x l_x b_x = 1$, generation length is given by $T = \sum_x x\,l_x b_x$, and reproductive value (Fisher 1958; Felsenstein 1971) can be calculated as $v_x = \frac{1}{l_x}\sum_{y \ge x} l_y b_y$.

Felsenstein (1971) showed that *N*_{e} in species with overlapping generations can be calculated directly from the age-specific survival and fecundity parameters contained in a standard Leslie matrix. In his model, each year within each age class there is random (binomial) variation among individuals in reproductive success, random survival of individuals between age classes, and no correlation between mortality and fecundity. Under these conditions, and assuming constant population size, inbreeding and variance-effective sizes are the same and are given by Equation 1 (Felsenstein 1971, Equation 10), in which *d*_{x} is the probability of death at the end of age *x* and *s*_{x} = 1 − *d*_{x}.

Hill (1972, 1979) considered variance *N*_{e} in a model similar to Felsenstein's except that Hill made no particular assumption about variation in reproductive success among individuals. For monoecious diploids with random selfing, Hill showed that effective size is given by

$$N_e = \frac{4 N_1 T}{V_k + 2} \qquad (2)$$

where *N*_{1} and *T* are as defined above and *V*_{k} is the lifetime variance in reproductive success (production of newborns) by the *N*_{1} individuals making up a cohort of newborns. With random variation in (lifetime) reproductive success (*V*_{k} ≈ 2), *N*_{e} = *N*_{1}*T*, which is the number of individuals entering the population over a period of one generation. In general, however, age structure leads to *V*_{k} > 2 and *N*_{e} < *N*_{1}*T*.

Equation 2 is more general than Equation 1 but the latter is more convenient to use with data from a typical life table. However, if one assumes that the distribution of reproductive success is Poisson within each age class, Equations 1 and 2 are comparable (Johnson 1977). Under this assumption, a reformulation of Equation 2 (see appendix a) provides a way of implementing Hill's model on the basis of population vital rates (Equation 3), using the average lifetime reproductive success of individuals that die between years *x* and *x* + 1. Effective size calculated from Equation 3 is referred to as *N*_{e}(*H*).
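The life-table quantities used in this section can be illustrated with a short Python sketch. It assumes the usual forms $T = \sum_x x\,l_x b_x$ and $v_x = (1/l_x)\sum_{y \ge x} l_y b_y$ for a constant-size population, and Hill's formula in the form $N_e = 4N_1T/(V_k + 2)$. The three-age-class life table and the function names are hypothetical and serve only as an example; they are not the data in Table 2.

```python
def life_table_summary(l, b):
    """Return (sum of l_x*b_x, generation length T, reproductive values v_x).

    l[i] and b[i] hold l_x and b_x for age class x = i + 1.
    """
    lb = [lx * bx for lx, bx in zip(l, b)]
    net = sum(lb)                                   # equals 1 at constant size
    T = sum((i + 1) * w for i, w in enumerate(lb))  # mean age of parents
    v = [sum(lb[i:]) / l[i] for i in range(len(l))]
    return net, T, v


def hill_ne(n1, T, vk):
    """Hill's effective size for cohort size n1 and lifetime variance vk."""
    return 4.0 * n1 * T / (vk + 2.0)


# Hypothetical life table (illustration only).
l = [1.0, 0.5, 0.25]
b = [0.0, 1.0, 2.0]
net, T, v = life_table_summary(l, b)
print(net, T, v)
print(hill_ne(100, T, 2.0))   # V_k = 2 gives N_1 * T
print(hill_ne(100, T, 6.0))   # V_k > 2 gives a smaller N_e
```

With *V*_{k} = 2 the function returns *N*_{1}*T*, and any larger variance lowers the result, matching the qualitative statement in the text.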

##### Changing N:

Felsenstein (1971) also developed an expression for variance *N*_{e} in populations that are changing size at a constant rate λ per time period. λ is the dominant eigenvalue of the Leslie matrix and is the unique positive real solution to the discrete-time version of the Euler–Lotka equation, $\sum_x \lambda^{-x} l_x b_x = 1$. When population size changes deterministically, the analogs of the generation length and reproductive value described above are obtained by discounting the *l*_{x}*b*_{x} terms by powers of λ, and effective size is given by Equation 4 (Felsenstein 1971, Equation 24), in which *N*_{1}′ is the number of newborns in the next time interval. If λ = 1, *N*_{1}′ = *N*_{1} and Equation 4 is identical to Equation 1 except for the last term in the denominator, which is zero if the number of offspring per parent in one time interval is Poisson (Felsenstein 1971). Effective size calculated from Equation 4 is referred to as *N*_{e}(*F*).
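For readers who want to compute λ from a life table, one possible approach is a simple bisection on the Euler–Lotka condition. The sketch below assumes the condition takes the form $\sum_x \lambda^{-x} l_x b_x = 1$ with age classes numbered from 1; the life-table values, the bracketing interval, and the function names are illustrative assumptions.

```python
def euler_lotka_residual(lam, l, b):
    """Sum of lam**(-x) * l_x * b_x minus 1, with x = i + 1 for list index i."""
    return sum(lx * bx * lam ** -(i + 1)
               for i, (lx, bx) in enumerate(zip(l, b))) - 1.0


def solve_lambda(l, b, lo=0.01, hi=10.0, tol=1e-12):
    """Bisection for the positive root; the residual decreases as lam grows."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if euler_lotka_residual(mid, l, b) > 0.0:
            lo = mid   # discounted births still exceed 1, so lam is larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical life table: constant size, then 20% higher birth rates.
l = [1.0, 0.5, 0.25]
print(solve_lambda(l, [0.0, 1.0, 2.0]))
print(solve_lambda(l, [0.0, 1.2, 2.4]))
```

Bisection is used here only because it is easy to verify; computing the dominant eigenvalue of the Leslie matrix directly should give the same value.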

#### Model species:

We chose three model species (Figure 1; Table 2), each representative of one of the three basic survival schedules. Humans are a classic type I survivorship species, with high survival well into adulthood followed by a period of rapidly increasing mortality. The white-crowned sparrow (*Zonotrichia leucophrys nuttalli*; Baker *et al*. 1981) has a modified type II survivorship curve, with a constant survival rate after an episode of high early mortality. The barnacle (*Balanus glandula*; Connell 1970) exhibits a classical type III survivorship curve with very high early mortality, even after we reduced fecundity and age-1 mortality by an order of magnitude from published values to make the simulations more tractable. We used the human demographic data analyzed by Felsenstein (1971), which are arranged into 5-year age classes. For the other two species, life-history parameters were modified slightly from published values to provide for a constant population size and integer generation lengths in units of years (*T* = 3 years for the sparrow and 4 years for the barnacle). The published data for all three species are for females only, so for the purposes of this exercise we assumed that the same values apply to the entire population.

#### Computer simulations:

We used computer simulations to model drift variance in allele frequency in populations with demographic parameters characteristic of the three life-history types. We considered both stable and growing populations.

##### Constant population size:

For each species we considered a “small” and a “large” population size, indexed by *N*_{1}, the number of newborns produced each year [“newborns” were enumerated as the number of live births (humans), clutch size (sparrow), and plankton just after hatching (barnacle)]. The small and large *N*_{1} values were: human, 10^{2}, 10^{3}; sparrow, 10^{3}, 10^{4}; and barnacle, 10^{5}, 10^{6} (Table 3). These population sizes translated into *N*_{e} values of order 10^{2} for the small population sizes and of order 10^{3} for the large population sizes (Table 3), on the basis of application of Equations 1 and 3 to the life-history data in Table 2. Given a fixed value of *N*_{1}, the stable age distribution (and hence the number in each age class) is given by the right eigenvector of the Leslie matrix, which we calculated using the power method described by Caswell (2001). For the simulations, we rounded the number of individuals in each age class to the nearest integer, yielding (for each population size in each species) a vector of *N*_{x} values representing the (fixed) number of individuals in age class *x*.
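A power-method calculation of the stable age distribution can be sketched as follows. This is a minimal illustration, not the code used for the simulations: the Leslie matrix is applied implicitly (births in the first row, survival on the subdiagonal), the vector is rescaled each iteration so that the newborn class equals 1, and the survival and birth rates shown are hypothetical.

```python
def stable_age_distribution(survival, births, n1, iterations=500):
    """Power iteration on a Leslie matrix, scaled so age class 1 holds n1.

    survival[i] is the probability of surviving from age class i+1 to i+2;
    births[i] is b_x for age class x = i + 1.
    """
    k = len(births)
    n = [1.0] * k
    for _ in range(iterations):
        newborns = sum(bx * nx for bx, nx in zip(births, n))
        n = [newborns] + [survival[i] * n[i] for i in range(k - 1)]
        n = [nx / n[0] for nx in n]      # renormalize to the newborn class
    return [round(n1 * nx) for nx in n]  # integer N_x, as in the simulations


# Hypothetical rates for a three-age-class population of constant size.
print(stable_age_distribution([0.5, 0.5], [0.0, 1.0, 2.0], 1000))
```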

For each fixed demographic trajectory, we modeled the stochastic process of genetic drift by randomly drawing genes to represent birth of newborns and survival from one age class to the next. At each time *t*, we calculated the frequency of a “gamete pool” of infinite size as the weighted mean of the allele frequencies in each age class of reproductive adults: $\sum_x l_x b_x P_{x(t)}$. In generating this gamete pool, all individuals within an age class contribute equally, but the total contribution differs among age classes on the basis of age-specific survival and birth rates. Therefore, the process for reproduction simulated an array of Wright–Fisher subpopulations with the possibility of selfing, stratified by age. [Whether a species is monoecious or dioecious, and whether or not selfing is allowed, has a negligible effect on *N*_{e} (Crow and Denniston 1988; Caballero 1994); therefore, this aspect of the model should not have appreciably influenced the results.] Next, we calculated the allele frequency in the *N*_{1} newborns in time unit *t* + 1 (*P*_{1(t+1)}) by drawing 2*N*_{1} genes binomially from this gamete pool. Finally, for each age class *x* > 1, allele frequency at time *t* + 1 was calculated by sampling hypergeometrically (without replacement) from the 2*N*_{x−1} genes representing the frequency in age class *x* − 1 at time *t*. This entire random process of sampling genes due to births and deaths was then repeated to generate an age-structured vector of allele frequencies for the next time unit. Simulated populations were initialized using the stable age distribution and with the same allele frequency in all age classes at time 0 [*P*_{x(0)} = 0.5, *x* = 1, 2, …]. Each replicate simulation was run for 100 time units. Each run of 100 time units was taken to represent the trajectory of one gene locus, and the entire procedure was repeated for each additional locus (5000 loci total).
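One time step of this birth-and-death sampling scheme could be sketched in Python as below. The sketch tracks allele counts per age class, forms the gamete-pool frequency as the *l*_{x}*b*_{x}-weighted mean, draws newborn genes binomially, and draws survivors without replacement. The function name, data layout, and life-table values are assumptions made for illustration, and the simple loop-based binomial draw is only practical for small *N*_{1}.

```python
import random


def advance_one_period(counts, n, l, b, rng):
    """Advance allele counts by one time unit.

    counts[i] is the number of copies of the focal allele among the 2*n[i]
    genes in age class i + 1; n, l, and b are fixed demographic vectors.
    """
    freqs = [c / (2 * nx) for c, nx in zip(counts, n)]
    # Gamete pool: weighted mean of age-class frequencies (weights l_x * b_x).
    pool = sum(lx * bx * p for lx, bx, p in zip(l, b, freqs))
    # Births: 2*N_1 independent (binomial) draws from the gamete pool.
    newborn = sum(1 for _ in range(2 * n[0]) if rng.random() < pool)
    # Deaths: hypergeometric sampling of 2*N_x genes from the younger class.
    survivors = []
    for i in range(1, len(n)):
        genes = [1] * counts[i - 1] + [0] * (2 * n[i - 1] - counts[i - 1])
        survivors.append(sum(rng.sample(genes, 2 * n[i])))
    return [newborn] + survivors


# Hypothetical population: every age class starts at frequency 0.5.
rng = random.Random(1)
n = [100, 50, 25]
l = [1.0, 0.5, 0.25]
b = [0.0, 1.0, 2.0]
counts = [nx for nx in n]          # nx copies out of 2*nx genes
for _ in range(5):
    counts = advance_one_period(counts, n, l, b, rng)
print([c / (2 * nx) for c, nx in zip(counts, n)])
```

Repeating the call 100 times for each of many independent loci would reproduce the overall structure of the simulations described in the text.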

At each time unit, population allele frequencies were computed parametrically (by exhaustive population census) and estimated by sampling subsets of individuals. Parametric frequencies were computed two ways: (1) the standard method, which involves counting alleles in all individuals in the population and leads to an unweighted frequency, $\sum_x N_x P_x / \sum_x N_x$, and (2) the weighted method, in which each individual is weighted by its reproductive value ($\sum_x N_x v_x P_x / \sum_x N_x v_x$; Felsenstein 1971).
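The difference between the two parametric frequencies is easy to see in a small numerical example. The sketch below assumes the unweighted frequency is the *N*_{x}-weighted mean of the age-class frequencies and the weighted frequency uses weights *N*_{x}*v*_{x}; all numbers are hypothetical.

```python
def unweighted_frequency(n, p):
    """Census frequency: every individual counts equally."""
    return sum(nx * px for nx, px in zip(n, p)) / sum(n)


def weighted_frequency(n, v, p):
    """Census frequency with each individual weighted by reproductive value."""
    weights = [nx * vx for nx, vx in zip(n, v)]
    return sum(w * px for w, px in zip(weights, p)) / sum(weights)


# Hypothetical age-class sizes, reproductive values, and allele frequencies.
n = [1000, 500, 250]
v = [1.0, 2.0, 2.0]
p = [0.4, 0.5, 0.6]
print(unweighted_frequency(n, p))
print(weighted_frequency(n, v, p))
```

In this example the numerous newborns dominate the unweighted frequency, whereas the weighted frequency gives more influence to older classes with higher reproductive value.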

Sampling of individuals was simulated by random draw of the required number of genes from specified age classes, and sample allele frequencies were computed by counting alleles (hence the sample frequencies were unweighted). We considered three different sample sizes (*S*_{1} = *S*_{2} = 25, 50, and 100) and three different sampling regimes: (a) all individuals are at equal risk of being sampled regardless of their age, (b) only reproductive adults (those in age classes with *b*_{x} > 0) are subject to sampling, and (c) only newborns are subject to sampling (so sampling is from a single cohort). In all cases, genes were counted without replacement, so no individual could be sampled more than once in any time period. However, after the sampling (or enumeration) was completed, all genes were returned to the population, so sampling did not affect the future demographic or genetic trajectory of the population.

After initializing all age classes at the same frequency (*P*_{0} = 0.5), a period of time is needed before the population reaches a dynamic equilibrium with respect to the amount of random allele-frequency variation among years and among age classes within years. In agreement with results reported by previous authors (Jorde and Ryman 1995; Waples 2002), we found that this dynamic equilibrium was reached quickly (within a few generations) under the conditions we modeled (data not shown). Therefore, we allowed each replicate to “warm up” for 20 time periods (periods 0–19) before collecting data.

Beginning in period 20, parametric or estimated population allele frequencies were compared at several different time intervals (*L*): 1 time unit and 1, 5, or 10 generations (*T*, 5*T*, and 10*T* time units, respectively). For each time interval, we used the method of Nei and Tajima (1981) to estimate *F*, the standardized variance of allele frequencies at two points in time. For diallelic loci such as those considered here, this measure is calculated as

$$\hat{F}_c = \frac{1}{a}\sum_{i=1}^{a}\frac{(P_{i1} - P_{i2})^2}{(P_{i1} + P_{i2})/2 - P_{i1}P_{i2}} \qquad (5)$$

where *a* is the number of loci and *P*_{i1} and *P*_{i2} are gene frequencies at the *i*th locus in the first and second samples, respectively. We also considered a closely related measure ($\hat{F}_k$) proposed by Pollak (1983). As previous authors have shown that the two measures generally lead to comparable results, we focused on $\hat{F}_c$ because it has a smaller variance for diallelic loci (Tajima and Nei 1984; Waples 1989). However, we found that in some circumstances $\hat{N}_{e}$ differed substantially depending on which estimator of *F* was used, and these cases are noted below.
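The Nei and Tajima measure for diallelic loci can be computed as in the following sketch, which assumes the per-locus form $(P_{1} - P_{2})^2 / [(P_{1} + P_{2})/2 - P_{1}P_{2}]$ averaged across loci. Skipping loci whose denominator is zero (the same allele fixed in both samples) is a choice made for this example rather than something specified in the text.

```python
def nei_tajima_fc(p1, p2):
    """Mean standardized variance in allele frequency across diallelic loci.

    p1[i] and p2[i] are the frequencies at locus i in the first and second
    samples. Loci with a zero denominator are uninformative and are skipped.
    """
    terms = []
    for x, y in zip(p1, p2):
        denom = (x + y) / 2.0 - x * y
        if denom > 0.0:
            terms.append((x - y) ** 2 / denom)
    if not terms:
        raise ValueError("no informative loci")
    return sum(terms) / len(terms)


# Two hypothetical loci that each change by 0.1 from a frequency of 0.5.
print(nei_tajima_fc([0.5, 0.5], [0.6, 0.4]))
```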

For each time interval, we computed an overall $\hat{F}$ as the mean across the 5000 replicate loci. After advancing the initial time period by one generation (*e.g*., first sample occurs in time period 20 + *T*), this process was repeated for each interval of *L* years, generating a vector of mean $\hat{F}$ values for a number of intervals of the same duration equally spaced one generation apart across the last 80 years of each replicate. For some analyses, we also averaged across the elements of these vectors to arrive at an overall mean $\hat{F}$ characteristic of a particular sampling interval of *L* years.

To evaluate accuracy of the demographic formula for *N*_{e}, we compared observed rates of allele frequency change in simulated populations with the magnitude of change expected assuming that true *N*_{e} was as given in Table 3. The expected variance of parametric population allele frequency at time period *t* among replicate populations or gene loci is given by

$$\mathrm{Var}(P_t) = P_0(1 - P_0)\left[1 - \left(1 - \frac{1}{2N_e}\right)^{g}\right] \qquad (6)$$

where *P*_{0} = initial allele frequency (0.5) and *g* = *t*/*T* = elapsed time in generations. For the simulated populations, we calculated an observed variance in allele frequency at time period *t* as the variance in population allele frequency among 5000 replicate gene loci.
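The expected drift variance is straightforward to evaluate numerically. The sketch below assumes the standard pure-drift form $P_0(1 - P_0)[1 - (1 - 1/(2N_e))^{g}]$, with *g* allowed to be fractional because *g* = *t*/*T*; the parameter values are hypothetical.

```python
def expected_drift_variance(p0, ne, g):
    """Expected variance of allele frequency after g generations of drift."""
    return p0 * (1.0 - p0) * (1.0 - (1.0 - 1.0 / (2.0 * ne)) ** g)


# Hypothetical case: N_e = 100, T = 4 years, evaluated every 20 years.
T = 4
for t in (0, 20, 40, 80):
    print(t, expected_drift_variance(0.5, 100, t / T))
```

Comparing these values with the variance of simulated allele frequencies among replicate loci is the kind of check summarized in Figure 2.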

#### Estimating *N*_{e} with the temporal method:

In the temporal method, effective size is estimated by relating the observed amount of allele frequency change to that expected under pure drift. The standard temporal method (Krimbas and Tsakas 1971; Nei and Tajima 1981; Pollak 1983; Waples 1989) is commonly referred to as the moment method because it relies on an estimate of *F*. Under a pure drift model, the expected value of $\hat{F}$ depends on *N*_{e}, the number of generations between samples (*g*), and the sampling regime. Here we consider nonlethal sampling of the population (with replacement, as in collecting a biopsy for DNA analysis). This corresponds to sampling plan I described by Nei and Tajima (1981) and Waples (1989). Under these conditions,

$$E(\hat{F}) \approx \frac{g}{2N_e} + \frac{1}{2S_1} + \frac{1}{2S_2} - \frac{1}{N}$$

and

$$\hat{N}_e = \frac{g}{2\left[\hat{F} - 1/(2S_1) - 1/(2S_2) + 1/N\right]}$$

where *S*_{1} and *S*_{2} are the sample sizes at times 1 and 2, *N* is the number of individuals at risk of being sampled at the time of the first sample, and *g* = *L*/*T* = elapsed time in generations. Because sample sizes *S*_{1} and *S*_{2} were the same in our analyses of constant population size, the above equation simplifies to

$$\hat{N}_e = \frac{g}{2\left[\hat{F} - 1/S + 1/N\right]} \qquad (7)$$

The term 1/*N* accounts for the covariance of allele frequencies at times 1 and 2 that arises in plan I sampling because some individuals in the initial sample can also contribute genes to future generations. The appropriate value of *N* depends on the sampling scheme. If sampling is random with respect to the entire population, $N = N_T = \sum_x N_x$ and the estimator of *N*_{e} becomes

$$\hat{N}_e = \frac{g}{2\left[\hat{F} - 1/S + 1/N_T\right]} \qquad (8)$$

(sampling from entire population). For an exhaustive population census, *S* = *N*_{T}, leading to

$$\hat{N}_e = \frac{g}{2\hat{F}} \qquad (9)$$

(complete population census). If only reproductive adults are sampled, then the number subject to sampling is the total in age classes equal to or older than the age at first maturity (*j*) and younger than the age of senescence, *q* (at which age *b*_{x} = 0 again), $N_{adult} = \sum_{x=j}^{q-1} N_x$:

$$\hat{N}_e = \frac{g}{2\left[\hat{F} - 1/S + 1/N_{adult}\right]} \qquad (10)$$

(sampling mature adults). Finally, if each sample is taken from a single cohort of *N*_{1} newborns, the appropriate estimator is

$$\hat{N}_e = \frac{g}{2\left[\hat{F} - 1/S + 1/N_1\right]} \qquad (11)$$

(sampling newborns). Assuming that survival among age classes is random, Equation 11 also applies to any random sample of a single cohort; it is necessary to substitute for *N*_{1} only the total number remaining in the cohort at the time of sampling.
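The plan I moment estimator can be written as one small function in which only the value of *N* changes with the sampling regime. The sketch assumes the estimator has the form $g / (2[\hat{F} - 1/(2S_1) - 1/(2S_2) + 1/N])$; returning infinity when the corrected $\hat{F}$ is not positive is a convention adopted for this example, and all input values are hypothetical.

```python
import math


def temporal_ne(f_hat, g, s1, s2, n):
    """Plan I temporal estimate of N_e.

    f_hat: estimated standardized variance F; g: generations between samples;
    s1, s2: sample sizes; n: number of individuals at risk of being sampled
    (total population, adults only, or one cohort, depending on the regime).
    """
    corrected = f_hat - 1.0 / (2.0 * s1) - 1.0 / (2.0 * s2) + 1.0 / n
    if corrected <= 0.0:
        return math.inf   # sampling noise exceeds the observed change
    return g / (2.0 * corrected)


# Hypothetical values: same F-hat, three choices of N.
f_hat, g, s = 0.03, 1.0, 50
for label, n in (("entire population", 1750), ("adults", 750), ("newborns", 1000)):
    print(label, temporal_ne(f_hat, g, s, s, n))

# A complete census (S = N) cancels the sampling terms, leaving g / (2 * F-hat).
print(temporal_ne(0.01, 2.0, 1750, 1750, 1750))
```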

Jorde and Ryman (1995) modified the standard temporal method to account for overlapping generations. Their modified estimator is (in current notation and assuming plan I sampling)

$$\hat{N}_e = \frac{C}{2T\left[\hat{F} - 1/S + 1/N_1\right]} \qquad (12)$$

where *C* is a constant that depends on life-history features of the population. Jorde and Ryman's model assumes that population size and demographic parameters are constant and is based on a comparison of consecutive cohorts, which can be achieved by sampling single cohorts or sampling mixed cohorts and sorting individuals by age to reconstruct single-cohort samples. We sampled randomly from the *N*_{1} newborns in successive years and used Equation 12 to estimate *N*_{e} using the Jorde–Ryman method. The recurrent formulas (10–13) and (23) in Jorde and Ryman (1995) were used to calculate *C*. This iterative process rapidly converged on a constant *C* value for each life-history scenario we considered.

##### Changing population size:

We also modeled “human” and “sparrow” populations that grew deterministically, increasing each time period by the multiplicative factor λ. We used the λ values estimated from the original published data (1.037 for humans and 1.063 for the sparrow). In addition, we considered a human population that declined at the rate λ = 0.994 per time period. These populations were initiated and allowed to run for 20 time periods (periods 0–19) at constant size (determined by low *N*_{1}) and demographic parameters given in Table 2. At time period 20, population growth was generated by scaling the birth rates to provide the desired λ, while the *l*_{x} values remained unchanged. At time period 20 the age structure was also adjusted to conform to the stable age distribution for a growing (or declining) population. In each successive time period, the vector of *N*_{x} values was obtained by applying the birth and death rates (Table 6) to the current population: the newborn cohort was generated according to $N_{1(t+1)} = \sum_x b_x N_{x(t)}$, and random mortality as described above determined which genes survived from one age class to the next. We generated 100 time periods of population sizes using real numbers for the *N*_{x} but rounded these to integers in each time period before conducting the simulations. Under the modeled growth rates, the human population (excluding senescent individuals) increased from 826 to >31,000, and the sparrow population size climbed from 1326 to >5 × 10^{5} (Table 6). In the human decline scenario, population size dropped from 5118 to 2807.
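One way to scale birth rates to a target λ while leaving the *l*_{x} values unchanged is sketched below. It assumes that every *b*_{x} is multiplied by the same factor and that λ satisfies $\sum_x \lambda^{-x} l_x b_x = 1$; neither detail is stated explicitly in the text, so both should be read as assumptions, and the life table is hypothetical.

```python
def scale_births_for_lambda(l, b, lam):
    """Rescale all b_x by one factor so the life table grows at rate lam."""
    discounted = sum(lx * bx * lam ** -(i + 1)
                     for i, (lx, bx) in enumerate(zip(l, b)))
    factor = 1.0 / discounted   # makes the discounted sum equal exactly 1
    return [factor * bx for bx in b]


# Hypothetical constant-size life table rescaled to lambda = 1.063.
l = [1.0, 0.5, 0.25]
b = [0.0, 1.0, 2.0]
print(scale_births_for_lambda(l, b, 1.063))
```

Because the factor has a closed form, no iterative search is needed once the target λ is chosen.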

With changing population size, effective size also changes every time interval. At a given time *t*, Equation 4 gives a per-generation effective size (*N*_{e}(*F*_{t})) that characterizes the expected amount of gene frequency drift that occurs between times *t* and *t* + *T* (Felsenstein 1971). This definition creates an analytical difficulty here. Because there is a different *N*_{e}(*F*_{t}) for every time period, genetic change in the population is described by an array of *N*_{e}(*F*_{t}) values that apply to partially overlapping periods of one generation each, and it is not immediately clear how to determine the appropriate *N*_{e} value for any given point in time. In the present context, we need to evaluate genetic change over one or more individual time units that are fractions of a generation. Therefore, we need a way to calculate *E*(*N*_{e(t)}), which is the effective size that describes genetic change between times *t* and *t* + 1. As shown in appendix b, *E*(*N*_{e(t)}) is just the instantaneous *N*_{e} for the point midway between times *t* and *t* + 1,

$$E(N_{e(t)}) = N^{*}_{e(t+0.5)} \qquad (13)$$

where $N^{*}_{e(t)}$ is an instantaneous effective size that describes the actual rate of change in the population at time *t*. *E*(*N*_{e(t)}) from Equation 13 can be compared with $\hat{N}_{e}$ estimated from samples taken in time periods *t* and *t* + 1. For longer time intervals, let *E*(*N*_{e(t,t+x)}) represent the effective size that determines the rate of change in the population between time periods *t* and *t* + *x*. Then *E*(*N*_{e(t,t+x)}) can be calculated as the harmonic mean of the *E*(*N*_{e(t)}) values for time periods *t* through *t* + *x* − 1.
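The harmonic-mean step can be illustrated with a few lines of Python. The per-period effective sizes below are hypothetical values for a population growing 5% per time unit; they are not taken from Table 6.

```python
def harmonic_mean_ne(ne_values):
    """Harmonic mean of per-period effective sizes over a sampling interval."""
    return len(ne_values) / sum(1.0 / ne for ne in ne_values)


# Hypothetical E(N_e(t)) values for ten consecutive time periods.
per_period = [100.0 * 1.05 ** i for i in range(10)]
print(harmonic_mean_ne(per_period))
print(sum(per_period) / len(per_period))   # arithmetic mean, for contrast
```

The harmonic mean is pulled toward the smaller early values, so it falls below the arithmetic mean whenever effective size varies across the interval.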

When effective size changes over time, the expected variance of parametric population allele frequency at time period *t* among replicate populations or gene loci is

$$\mathrm{Var}(P_t) = P_0(1 - P_0)\left[1 - \prod_{i=0}^{t-1}\left(1 - \frac{1}{2N_{e(i)}}\right)^{1/T}\right]$$

where *N*_{e(i)} is the effective size for the time period *i* to *i* + 1. We used Equation 13 to calculate *E*(*N*_{e(i)}) for each time period.

To estimate *N*_{e}, we used the procedure described above to calculate $\hat{N}_{e}$ for various time intervals and sampling regimes. With changing population size, however, it is necessary to adjust the terms for *N* to correspond to the correct time period. Therefore, in Equations 8, 10, and 11 we used the values for *N*_{T} (total population size), *N*_{adult}, and *N*_{1}, respectively, that applied to the time period of the first sample. The sample sizes of 25, 50, and 100 were fixed, but when complete population enumeration was done to calculate parametric population allele frequencies in growing populations, the sample size at time 2 was larger than that at time 1: *S*_{2} > *S*_{1} = *N*_{T}. Therefore, Equation 9 was modified as follows:

$$\hat{N}_e = \frac{g}{2\left[\hat{F} + 1/(2N_{T(1)}) - 1/(2N_{T(2)})\right]} \qquad (14)$$

(complete population census; growing population), where *N*_{T(1)} and *N*_{T(2)} are the total population sizes at the times of the first and second samples.

## RESULTS

#### Constant population size:

For the three model species, we compared the demographic calculations of *N*_{e} (based on Felsenstein 1971) with values calculated by two other methods (Equation 3, modified from Hill 1972, and Jorde and Ryman 1995). The three methods produced essentially identical values of *N*_{e} under both low and high *N*_{1} (Table 3). For simplicity, in what follows we use only Equation 4 to calculate *N*_{e} from demographic data.

##### Comparison of expected and observed rates of change in allele frequency:

As shown in Figure 2, the observed variance in weighted population allele frequencies very closely tracked the expected variance in all three model species, indicating that the putative *N*_{e} values in Table 3 accurately describe the variance effective size of these populations. For the remainder of this section, therefore, we refer to the values in Table 3 as the “true” *N*_{e}.

##### Genetic estimates of N_{e}:

Figure 3 compares true *N*_{e} values with estimates of *N*_{e} based on the standard temporal method, using complete population census to calculate weighted allele frequencies. For both small and large cohort sizes in all three species, agreement of estimated and true *N*_{e} was very good. Under every scenario, $\hat{N}_{e}$ for a given time interval fluctuated randomly around the true value. Figure 3 shows results for low *N*_{1} for samples taken 1 year apart; similar results were found for high *N*_{1} and longer time between samples (*L* = *T*, 5*T*, or 10*T*; Table 4). At the extreme, deviations of $\hat{N}_{e}$ from true *N*_{e} were <3% (sampling interval of 10*T* in humans and barnacle).

We also estimated *N*_{e} using the temporal method based on a complete population census using unweighted allele frequencies; this produced estimates of *N*_{e} that were substantially biased in all cases: ∼50% too high for humans and ∼50% too low for the other two species (Figure 3; Table 4). That is, complete enumeration of all individuals alive at a given time does not provide a reliable way of calculating parametric population allele frequencies in species with overlapping generations.

The array of different sampling regimes produced a range of outcomes in the three species (Table 4). Two general patterns can be identified. First, small samples (*S* = 25) produced estimates of *N*_{e} that had large biases under many scenarios, even when $\hat{F}$ was averaged across 5000 replicate gene loci. Sample sizes of 50 or 100 produced *N*_{e} estimates that were less variable and more accurate.

Second, under most sampling regimes in most species, $\hat{N}_{e}$ was severely biased when *F* was estimated over short time intervals but approached true *N*_{e} as the sampling interval increased. For example, Figure 4 plots $\hat{N}_{e}$ as a function of time between samples for random samples from either newborns or the entire population. In all six scenarios, the longest sampling interval (10*T*) produced the most accurate estimate of *N*_{e}. In five of the six cases the estimates for shorter time intervals were biased more strongly downward; however, sampling from the entire population for humans led to overestimates of *N*_{e}, and the bias was high for *L* = 1 generation (Figure 4). Figure 4 shows results for conditions under which the temporal method has the most power: large sample size (*S* = 100) and low *N*_{1} (hence relatively small *N*_{e}). The same basic pattern (more accurate estimates of *N*_{e} over longer time periods) is seen, albeit sometimes less clearly, in the scenarios with smaller sample sizes and larger *N*_{1} (Table 4). Sampling only from adult age classes produced results qualitatively similar to those for sampling from the entire population under all scenarios (Table 4).

Finally, we evaluated estimates of *N*_{e} from samples (rather than complete population census) of individuals weighted by reproductive value. These estimates were consistently biased downward (data not shown), a result that we determined arises because weighting the samples increases the sampling variance of allele frequency beyond that expected for the nominal sample size *S*. Although it is possible, for a given set of individual weights, to account for this increase in variance to arrive at an effective sample size, in natural populations the information necessary to do so will rarely be available. Consequently, attempting to weight samples by reproductive value is unlikely to be a practical solution and we did not pursue this idea further.

The Jorde–Ryman method was designed for application to data for two successive cohorts (equivalent to sampling newborns in two consecutive years in our model), and under that scenario $\hat{N}_{e}$ for the Jorde–Ryman method asymptotically approached the true *N*_{e} as sample size increased (Table 5). In their published example (which was for a species with type I survivorship), Jorde and Ryman found that with *S* = 30, $\hat{N}_{e}$ based on $\hat{F}_c$ was ∼10% too high. With the human life history, we found comparable results for *S* = 25 and low *N*_{1} (8% upward bias in $\hat{N}_{e}$; Table 5), but this bias was considerably less for larger samples. For the sparrow and barnacle life histories, the Jorde–Ryman method also produced accurate estimates for large sample size but the upward bias was more extreme (up to 100%) for *S* = 25. A more pronounced upward bias was also found for high *N*_{1} simulations, especially for small sample sizes (Table 5). When we used the Jorde–Ryman method with Pollak's measure $\hat{F}_k$, we found the same general pattern ($\hat{N}_{e}$ asymptotically accurate for large samples) except that small samples led to underestimates (rather than overestimates) of *N*_{e} (data not shown).

##### N_{e}/N ratio:

As pointed out by previous authors (*e.g*., Nunney 1993; Frankham 1995), the *N*_{e}/*N* ratio is sensitive to the choice of which age classes to include in calculating *N*. Results in Table 3 illustrate this point. For the barnacle, *N*_{e}/*N* is 1.2 × 10^{−3} if population size is taken to be all individuals alive at a given time but is 0.79 (almost three orders of magnitude higher) if only mature adults are counted (*N* = *N*_{adult}). [Note that to make our simulations more tractable we reduced fecundity and increased age-1 survival of the barnacle by an order of magnitude; the difference between *N*_{e}/*N* ratios would be even more extreme using the published data.] The effects are not as dramatic in the other species, but even for humans the ratio *N*_{e}/*N* is over twice as high if *N* is taken to be the number of reproductive adults rather than the total population size (0.76 *vs*. 0.34; Table 3).

#### Changing population size:

##### Comparison of expected and observed rates of change in allele frequency:

For all three scenarios (human growth and decline and sparrow growth; see Table 6), the observed variance in weighted population allele frequencies closely tracked the expected variance (data not shown, but results are comparable to those shown in Figure 2), indicating that Equation 13 accurately predicts the effective size as it changes over time.

##### Genetic estimates of N_{e}:

Agreement between the estimates of *N*_{e} and *E*(*N*_{e}) for populations that changed in size was excellent for all three scenarios considered. Median ratios were almost exactly 1.0 for all time periods when weighted population allele frequencies were used to calculate the estimates (Table 7). Other sampling regimes produced results that roughly paralleled those for populations of constant size. In humans, a population census with unweighted frequencies led to overestimates of *N*_{e} for short time periods, with lower bias over longer time periods. Conversely, sampling newborns led to gross underestimates of *N*_{e} for short time periods (Table 7). In the sparrow, a complete population census using unweighted allele frequencies and random sampling from newborns both produced severe underestimates of *N*_{e} for short time intervals, but approached *E*(*N*_{e}) for longer time intervals. In contrast, sampling from the entire population or only adults led to substantial overestimates for short time periods. These patterns are all consistent with those found for constant population size (Tables 4 and 7).

## DISCUSSION

Felsenstein (1971) proposed, but did not prove, that weighting each individual by its reproductive value is the correct way to calculate parametric population allele frequencies in species with overlapping generations. Our results demonstrate that this is indeed the case, regardless of the species' life history and regardless of whether the population is fixed or changing in size at a constant rate. In contrast, measuring the magnitude of population change using unweighted population allele frequencies led to large departures from expected values in all scenarios considered, even when the entire population was sampled.
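As a rough illustration of what weighting by reproductive value involves, the sketch below computes age-specific reproductive values from a life table and uses them to weight age-class allele frequencies. It assumes the classical discrete-time form of Fisher's reproductive value, *v*_{x} = (λ^{x}/*l*_{x}) Σ_{y≥x} λ^{−y}*l*_{y}*b*_{y}, defined only up to a constant factor. The life table, the age-class counts, the allele frequencies, and the function names are all hypothetical and are not taken from the simulations reported here.

```python
def reproductive_values(survivorship, births, lam=1.0):
    """Relative reproductive value of each age class.

    survivorship[x]: l_x, probability of surviving to age class x
    births[x]:       b_x, mean number of offspring produced at age x
    lam:             population growth rate per time unit (1.0 = constant size)
    Index 0 is the youngest age class (an indexing choice made for this sketch).
    """
    n = len(survivorship)
    values = []
    for x in range(n):
        remaining = sum(survivorship[y] * births[y] / lam ** y for y in range(x, n))
        values.append(lam ** x * remaining / survivorship[x])
    return values


def weighted_allele_frequency(counts, freqs, values):
    """Population allele frequency with each individual weighted by the
    reproductive value of its age class."""
    weights = [n * v for n, v in zip(counts, values)]
    return sum(w * p for w, p in zip(weights, freqs)) / sum(weights)


# Hypothetical three-age-class life table with sum(l_x * b_x) = 2.
l_x = [1.0, 0.5, 0.25]
b_x = [0.0, 2.0, 4.0]
v_x = reproductive_values(l_x, b_x)

counts = [100, 50, 25]       # individuals alive in each age class
freqs = [0.5, 0.4, 0.6]      # allele frequency within each age class
print(v_x)
print(weighted_allele_frequency(counts, freqs, v_x))
```

In this toy case the weighted frequency (0.48) differs from the unweighted census frequency (85/175 ≈ 0.486) because the numerous newborns carry less reproductive value per individual than the older classes.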

These results indicate that caution is needed in interpreting standard-model temporal estimates of effective population size in species with overlapping generations. The resulting estimates will be biased unless the entire population is sampled and individual genotypes are weighted by their reproductive value. Unfortunately, this is not a practical solution to the problem of estimating *N*_{e}. Rarely is it feasible to sample an entire population, and, even if this were possible (*e.g*., as might be the case in small captive populations), the age-specific life history data needed to compute individual reproductive values typically are not available. As discussed above, taking samples and weighting allele frequencies by reproductive value also are generally not feasible. One solution to this problem is to use the Jorde–Ryman method (Jorde and Ryman 1995), which we found generally performed well when it was applied to situations for which it was designed: analysis of samples from consecutive cohorts. This method can be used with any age-structured species, but in practice it has not been widely applied, perhaps because it requires both detailed demographic information to compute the correction factor *C* and samples from single cohorts in successive years. Most applications of the temporal method for species with overlapping generations continue to use the standard discrete-generation model, so it is important to consider the nature and magnitude of the bias that is likely to result from common sampling schemes. These results are complex, but some patterns are apparent.

First, bias in the estimate of *N*_{e} is largest for short time intervals and in many cases largely disappears after 5–10 generations between samples (Figure 4; Table 4). This pattern, which was also seen in analyses of growing populations (Table 7), can be understood by noting that the magnitude of the estimate of *F* (and hence the estimate of *N*_{e}) is determined by two major components: a component related to drift (the signal) and one related to sampling a finite number of individuals (the noise). The bias arises because the adjustment for sampling derived for the discrete-generation model does not capture all of the complexities associated with sampling from age-structured populations (Jorde and Ryman 1995). The magnitude of this sampling bias is fixed by the sample size and the nature of the sampling in relation to the species' life history; it does not change with the amount of time between samples. In contrast, a longer sampling interval allows more episodes of genetic drift to influence the estimate of *F*, thus increasing the signal-to-noise ratio in the data.
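The signal-to-noise argument can be made concrete with a small numerical sketch. It assumes the standard discrete-generation estimator for samples taken without replacement before reproduction, *N*_{e} ≈ *t*/(2[*F* − 1/(2*S*_{0}) − 1/(2*S*_{t})]) (Waples 1989), together with the linear approximation *E*(*F*) ≈ *t*/(2*N*_{e}) + sampling terms. The `extra_sampling` term is a hypothetical stand-in for the age-structure sampling effect that the standard correction does not capture; its value is arbitrary.

```python
import math


def expected_f(true_ne, generations, s0, st, extra_sampling=0.0):
    """Approximate expected F: drift signal plus sampling noise.

    extra_sampling is a hypothetical, time-independent term representing
    sampling variance not removed by the standard correction.
    """
    drift = generations / (2.0 * true_ne)
    sampling = 1.0 / (2.0 * s0) + 1.0 / (2.0 * st)
    return drift + sampling + extra_sampling


def temporal_ne(f_hat, generations, s0, st):
    """Standard temporal estimate of N_e from F and the two sample sizes."""
    drift = f_hat - 1.0 / (2.0 * s0) - 1.0 / (2.0 * st)
    if drift <= 0:
        # No drift signal remains after the sampling correction.
        return math.inf
    return generations / (2.0 * drift)


# True N_e = 100, S = 100 in both samples, fixed unmodeled term of 0.005.
for t in (1, 5, 10):
    f = expected_f(100, t, 100, 100, extra_sampling=0.005)
    print(t, round(temporal_ne(f, t, 100, 100), 1))
```

With these illustrative numbers the estimate rises from about 50 at one generation to about 91 at ten generations, because the fixed sampling term is diluted as the drift component accumulates. The same function returns infinity whenever the corrected *F* is zero or negative.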

The second noteworthy pattern is that the direction of bias in the estimate of *N*_{e} differs markedly between the human and the barnacle: random sampling from the population (or a complete population census with unweighted allele frequencies) leads to an overestimate of *N*_{e} in humans but an underestimate in the barnacle. In the barnacle, the vast majority of individuals alive at any time are newborns, so a random sample will primarily reflect frequencies in the newborns, which are derived only from the fraction of the population that reproduced the previous year. The result is a sampling variance greater than would be expected from a random sample from the generation as a whole, with the consequence that the estimate of *F* is higher than expected and *N*_{e} is underestimated. Humans are much more evenly distributed across age classes, including a substantial contingent of older individuals with low reproductive value that represent a sort of genetic inertia in the population. If their frequencies are weighted equally with younger individuals that have higher reproductive value, the result will be an underestimate of the rate of genetic change and an overestimate of *N*_{e}. Sampling only human newborns, however, leads to an underestimate of *N*_{e}, as it does in the barnacle, because the progeny are drawn from a relatively small fraction of the population that reproduces in any given time period. Use of Pollak's measure of *F* exacerbates the downward bias in the estimate of *N*_{e} for the barnacle (data not shown), because with diallelic loci it is always larger than the alternative estimator (Waples 1989), resulting in a lower estimate of *N*_{e}. Use of Pollak's measure with humans led to quite variable results, with the estimate of *N*_{e} being biased either upward or downward depending on the sampling regime, sample size, and elapsed time between samples (data not shown).
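The relationship between the two common estimators of *F* at diallelic loci can be checked numerically. The sketch below assumes the usual single-locus forms given in Waples (1989): the Nei and Tajima (1981) measure, (1/*K*) Σ (*x*_{i} − *y*_{i})²/[(*x*_{i} + *y*_{i})/2 − *x*_{i}*y*_{i}], and the Pollak measure, [1/(*K* − 1)] Σ (*x*_{i} − *y*_{i})²/[(*x*_{i} + *y*_{i})/2], where *x*_{i} and *y*_{i} are the frequencies of allele *i* in the two samples and *K* is the number of alleles. The function names and test frequencies are illustrative only.

```python
def f_nei_tajima(x, y):
    """Nei and Tajima (1981) measure for one locus with allele frequencies x, y."""
    k = len(x)
    total = 0.0
    for xi, yi in zip(x, y):
        denom = (xi + yi) / 2.0 - xi * yi
        if denom > 0:  # skip alleles absent from, or fixed in, both samples
            total += (xi - yi) ** 2 / denom
    return total / k


def f_pollak(x, y):
    """Pollak measure for one locus with allele frequencies x, y."""
    k = len(x)
    total = 0.0
    for xi, yi in zip(x, y):
        mean = (xi + yi) / 2.0
        if mean > 0:  # skip alleles absent from both samples
            total += (xi - yi) ** 2 / mean
    return total / (k - 1)


# Diallelic locus: allele frequency 0.5 in the first sample, 0.4 in the second.
x = [0.5, 0.5]
y = [0.4, 0.6]
print(f_nei_tajima(x, y), f_pollak(x, y))
```

For two alleles with frequencies *x* and *y*, the first measure reduces to (*x* − *y*)²/(*z* − *xy*) and the second to (*x* − *y*)²/[*z*(1 − *z*)], where *z* = (*x* + *y*)/2. Because *z*² ≥ *xy*, the second is never smaller than the first, and a larger *F* translates into a smaller estimate of *N*_{e}.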

Results for the sparrow are somewhat intermediate but closer to the barnacle than to the human and consistent between the constant size and growth scenarios (Figures 3 and 4; Tables 4 and 7). Presumably this reflects high mortality in the sparrow between ages 1 and 2, which causes a large fraction of the population to have low reproductive value, as in the barnacle. Thus, although the sparrow displays a classical type II survival curve after age 1, the episode of high juvenile mortality has important consequences for estimation of effective population size.

Finally, an interesting result is that, at least for small *N*_{1}, the magnitude of bias in the estimate of *N*_{e} was larger for the human than for the other two species (Table 4, Figure 4). This might mean that species with type I survivorship are particularly prone to bias with use of the standard temporal method. However, the generality of this result requires further evaluation, as other factors such as longer generation time could also be involved.

#### Precision:

Our analyses have focused on bias, but precision is also an important consideration for any genetic method of estimating *N*_{e}. Although it is beyond the scope of this article to evaluate precision in any comprehensive way, some general observations can be made on the basis of previously published results (Nei and Tajima 1981; Waples 1989; Jorde and Ryman 1995). First, the same proportional increases in sample size, elapsed time between samples, and number of independent alleles used in the estimate of *F* all have approximately the same effect on precision. Our results show that in some cases it is possible to get largely unbiased estimates of *N*_{e} from samples taken only a fraction of a generation apart. In general, however, such estimates would have very low precision unless *N*_{e} were very small and samples of individuals and loci very large. As the number of alleles used to compute the estimate of *F* increases, the mean contributions from sampling and drift both stabilize around their expected values and precision increases. With existing technology it is quite feasible to collect data for 10–20 highly polymorphic microsatellite loci in many natural populations, which can produce precise estimates of *N*_{e} under certain conditions.

Second, the temporal method can be used most effectively to study populations with small *N*_{e}, because the signal from drift is large relative to sampling error. With any genetic method it can be difficult to distinguish a large population from a very large one. For example, under many scenarios with high *N*_{1} the point estimates of *N*_{e} were infinity, even when the estimate of *F* was averaged across 5000 gene loci (Tables 4 and 5). A similar effect was seen in growing populations, which reached a large size before the end of the simulations (Table 7).

Finally, results obtained by Jorde and Ryman (1995) suggest that the coefficient of variation of the estimate of *F* when generations overlap is comparable to that for the discrete-generation model, so parametric confidence intervals for the estimate of *N*_{e} from the standard temporal method (Waples 1989) should also be applicable to iteroparous species.

#### Likelihood-based methods for estimating *N*_{e}:

In recent years a number of likelihood-based methods (Williamson and Slatkin 1999; Anderson *et al*. 2000; Wang 2001; Berthier *et al*. 2002) have been proposed for estimating *N*_{e}. In general, results for these methods are roughly comparable to the standard temporal method for moderate allele frequencies but are less biased and more precise when many low-frequency alleles are considered. These methods are all computationally demanding and were not evaluated here. However, all depend on estimates of allele-frequency change in samples taken at two or more points in time. Therefore, they should all be affected in the same general way by the various age-structure biases considered in this article. This conjecture should be tested empirically. The empirical Bayesian method proposed by Tallmon *et al*. (2004) should be able to deal with overlapping generations by modifying life-history parameters of the modeled species.

#### Fluctuating population size:

Our results confirmed the basic elements of Felsenstein's (1971) model for populations that change in size at a constant rate, regardless of whether they are increasing or decreasing. However, many populations fluctuate around a “mean” population size. Waples (2005) evaluated performance of the temporal method for semelparous, age-structured species following the Pacific salmon model, but comparable analyses have not been done for iteroparous species. The model recently developed by Engen *et al*. (2005) for defining *N*_{e} in populations with overlapping generations that fluctuate in size could form the basis for such an evaluation in the future.

#### Conclusions:

Results discussed above lead to the following conclusions regarding application of the temporal method to iteroparous, age-structured species:

- The Jorde–Ryman method (Jorde and Ryman 1995) is perhaps the best option if appropriate samples and demographic data are available. Ideally, samples from a number of consecutive cohorts can be analyzed, so that an overall mean estimate of *F* (and associated estimate of *N*_{e}) can be calculated that accounts for small variations over time in population size and demographic parameters (Jorde and Ryman 1996; Palm *et al*. 2003). If single-cohort samples cannot be collected directly, it might be possible to reconstruct them by ageing individuals. Jorde and Ryman (1995) suggested that if it is not possible to reconstruct *consecutive* cohorts, the correction factor *C* might be modified to reflect a different timing of the samples. Some uncertainty in demographic parameters might not seriously affect the results, depending on the species' life history (Jorde and Ryman 1995). Biases associated with small samples can be minimized by using at least 50 individuals in each cohort.
- If the available data do not allow use of the Jorde–Ryman model, sampling biases can be minimized (and precision enhanced) by taking samples spaced far apart in time (at least three to five generations, preferably more). This in fact has been the assumption adopted by some authors who have used the temporal method with age-structured species, and our results show that it can be a reasonable one in at least some cases. Recent improvements in the ability to extract DNA from historical material can provide opportunities for retrospective analyses that span large numbers of generations (Hauser *et al*. 2002; Johnson *et al*. 2004; Poulsen *et al*. 2006).
- In some cases (*e.g*., for species with type I survivorship), estimates of *N*_{e} using the standard temporal method for closely spaced, random samples might not be severely biased, but precision of such estimates is unlikely to be satisfactory for most applications. Therefore, when generations overlap and samples are closely spaced in time, standard temporal estimates of *N*_{e} should be interpreted with extreme caution. This is particularly true because in these cases the choice of which estimator of *F* to use can have a profound effect on the estimate of *N*_{e}. The minor differences between the two estimators of *F* have less impact on the estimate of *N*_{e} when a longer time elapses between samples and the drift signal is stronger. As a precaution, those employing the temporal method should compute estimates of *N*_{e} using both estimators of *F* and use considerable caution in interpreting scenarios under which the estimate of effective size depends heavily on the estimator of *F*.
- Large samples of individuals are important, not only for precision but also to help minimize bias arising from various violations of implicit sampling assumptions.
- Understanding as much as possible about the species' life history is important to design a sampling strategy to minimize potential biases.

## APPENDIX A

Consider a population with constant demographic parameters that produces *N*_{1} newborns per time period. The cohort of newborns can be partitioned into *x* demographic classes of individuals on the basis of age at death. The number of individuals living to age *x*, reproducing, and then dying before the next time interval is *N*_{1}*l*_{x}*d*_{x}. The average lifetime reproductive success of individuals that die at the end of year *x* is . Population size is constant, so the mean reproductive success of the entire cohort of *N*_{1} individuals is = 2, which is equal to the weighted mean for each age.

Now let σ_{x}^{2} be the variance in lifetime reproductive success of individuals that die at the end of age *x*, and let α_{x} = *l*_{x}*d*_{x} be the fraction of the initial cohort that reaches age *x* and then dies. It can be shown (see online Appendix A in Waples 2006 for details, and see Hill 1972 for a related treatment based on continuous time intervals) that the overall variance in reproductive success for the entire cohort is (A1)

If the distribution of reproductive success is Poisson within each age class, so that σ_{x}^{2} = , this simplifies to and (from Equation 2) (A2)
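The variance partition described above can be sketched with the law of total variance. The symbols $\bar{k}_x$ (mean lifetime reproductive success of individuals that die at the end of age *x*), $\bar{k}$ (mean for the whole cohort, equal to 2), and $V_k$ (overall variance) are notation assumed here for illustration, so the expressions below may differ in form from Equations A1 and A2. They use only the definitions of α_{x} and σ_{x}^{2} given in the text and the fact that the α_{x} sum to 1.

```latex
% Within-class plus among-class variance (law of total variance)
V_k = \sum_x \alpha_x \sigma_x^2 + \sum_x \alpha_x \left(\bar{k}_x - \bar{k}\right)^2,
\qquad \bar{k} = \sum_x \alpha_x \bar{k}_x = 2 .

% Poisson reproductive success within each class: \sigma_x^2 = \bar{k}_x
V_k = \bar{k} + \sum_x \alpha_x \left(\bar{k}_x - \bar{k}\right)^2
    = 2 + \sum_x \alpha_x \bar{k}_x^2 - 4
    = \sum_x \alpha_x \bar{k}_x^2 - 2 .
```

Under these assumptions, the overall variance exceeds the Poisson value of 2 whenever classes differ in mean lifetime reproductive success, which is why variation in age at death contributes to the overall variance.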

## APPENDIX B

The objective is to calculate *E*(*N*_{e(t)}), which is the effective size that describes genetic change between times *t* and *t* + 1. This can be accomplished by treating *N*_{e} as a continuous function of time. Effective size is directly proportional to total reproductive value of the population, which increases at the rate λ per time unit (Felsenstein 1971). Therefore, *N*_{e} can be described by the exponential growth model (B1), where *N*_{e}(*F*_{0}) is the effective size in the generation immediately preceding population growth (506.2 in our human model after adjusting to stable age structure in a growing population). *N*_{e}(*F*_{t}) describes the “average” rate of change over the generation that starts at time *t* and ends at time *t* + *T*; however, since the population is growing geometrically, the actual effective size must be smaller early in the generation and larger later in the generation. Let us define an instantaneous effective size, denoted by *N*_{e}\*_{(t)}, that describes the actual rate of change in the population at a point *t* in time. *N*_{e}\* increases exponentially over the interval *t* to *t* + *T*, and it can be shown that *N*_{e}\* = *N*_{e}(*F*_{t}) at the midpoint of the interval. That is, *N*_{e}\*_{(t+T/2)} = *N*_{e}(*F*_{t}), which is also equal to the geometric mean of the *N*_{e}\* values over the period of the generation. It follows that the instantaneous *N*_{e} for any time period *t* can be calculated as (B2). That is, instantaneous *N*_{e} at time *t* is just *N*_{e}(*F*) from one-half generation earlier.

We want to find *E*(*N*_{e(t)}), which is analogous to *N*_{e}(*F*_{t}) except that it describes genetic change over the next time period rather than the next generation. The above logic indicates that *E*(*N*_{e(t)}) is just the instantaneous *N*_{e} for the point midway between times *t* and *t* + 1: (B3)
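The chain of reasoning in this appendix can be written out as a short derivation. The sketch assumes that the growth model takes the simple geometric form $N_e(F_t) = N_e(F_0)\lambda^t$, with *t* and the generation length *T* measured in the same time units as λ. The expressions follow from the verbal definitions above and may differ in form from Equations B1–B3.

```latex
% Assumed form of the exponential growth model
N_e(F_t) = N_e(F_0)\,\lambda^{t}

% Instantaneous effective size: N_e(F) from one-half generation earlier
N_e^{*}(t) = N_e(F_{t - T/2}) = N_e(F_0)\,\lambda^{\,t - T/2}

% Effective size for the interval t to t + 1: instantaneous value at the midpoint
E\!\left(N_{e(t)}\right) = N_e^{*}\!\left(t + \tfrac{1}{2}\right)
                         = N_e(F_0)\,\lambda^{\,t + (1 - T)/2}
```

As a consistency check under the same assumption, the geometric mean of $N_e^{*}$ over the interval *t* to *t* + *T* is $N_e(F_0)\lambda^{-T/2}\lambda^{(2t + T)/2} = N_e(F_0)\lambda^{t}$, which matches $N_e(F_t)$ as stated in the text.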

## Acknowledgments

We thank Per Erik Jorde, Nils Ryman, and two anonymous reviewers for insightful comments on an earlier draft of the manuscript, and we are grateful to Shuichi Kitada and Toshihide Kitakado for stimulating discussions.

## Footnotes

Communicating editor: J. Wakeley

- Received August 25, 2006.
- Accepted October 18, 2006.

- Copyright © 2007 by the Genetics Society of America