Abstract
In the past, moment and likelihood methods have been developed to estimate the effective population size (N_{e}) on the basis of the observed changes of marker allele frequencies over time, and these have been applied to a large variety of species and populations. Such methods invariably make the critical assumption of a single isolated population receiving no immigrants over the study interval. For most populations in the real world, however, migration is not negligible and can substantially bias estimates of N_{e} if it is not accounted for. Here we extend previous moment and maximumlikelihood methods to allow the joint estimation of N_{e} and migration rate (m) using genetic samples over space and time. It is shown that, compared to genetic drift acting alone, migration results in changes in allele frequency that are greater in the short term and smaller in the long term, leading to under and overestimation of N_{e}, respectively, if it is ignored. Extensive simulations are run to evaluate the newly developed moment and likelihood methods, which yield generally satisfactory estimates of both N_{e} and m for populations with widely different effective sizes and migration rates and patterns, given a reasonably large sample size and number of markers.
ONE of the major applications of population genetics theory has been in the estimation of genetic and demographic properties of populations. Quantities such as the mutation rate, the migration rate, and the recombination rate are all in principle estimable from population genetic data, at least as products with the effective population size.
The variance effective population size (N_{e}) is an important quantity in evolutionary biology, as a description of the rate at which genetic variance changes due to genetic drift. A variety of methods have been developed to estimate N_{e} (Schwartzet al. 1999), including predictive equations based on life history data (reviewed in Caballero 1994; Wang and Caballero 1999), the lethal allelism approach (Dobzhansky and Wright 1941), and a linkage disequilibrium approach (Hill 1981). Several authors have developed methods to estimate N_{e} for a population from temporal genetic data, that is, data on the frequencies of marker alleles taken at two or more time points from the same populations. In principle, if other population genetic processes such as mutation, migration, and selection are negligible in their effects on allele frequency change over the time frame of the study, then the observed changes in allele frequency can be assumed to be due to the effects of genetic drift. Using what is known about the effects of genetic drift, then, we could estimate N_{e} from such temporal data. This socalled “temporal approach” was pioneered by Krimbas and Tsakas (1971), with subsequent substantial statistical refinements by others (e.g., Pamilo and VarvioAho 1980; Nei and Tajima 1981; Pollak 1983; Tajima and Nei 1984; Waples 1989; Williamson and Slatkin 1999; Andersonet al. 2000; Wang 2001a). These methods have been applied to a large variety of species and populations (e.g., Funket al. 1999; Labateet al. 1999; Fiumera et al. 1999, 2000; Kantanenet al. 1999; Turneret al. 1999; see also many earlier references cited in Williamson and Slatkin 1999).
A key assumption made by all these approaches, however, is that the only factor involved in allele frequency change is genetic drift. All systematic forces (selection, mutation, and migration) are assumed to be absent. In our view, it is often legitimate to ignore the effects of mutation on the timescale involved in most empirical studies. It also may be reasonable to neglect the effects of selection because direct selection on most markers may be unlikely to be strong enough to cause substantial changes in their frequencies. (The effects of indirect selection acting on other linked loci on the allele frequency change of the marker locus may arguably be included as a “drift” effect and legitimately included in a N_{e} term.) However, for most populations the effects of migration are not negligible. A major weakness of these temporal methods is that they ignore the effects of migration, which can substantially bias estimates of N_{e}, either upward or downward (see below). The purpose of this article is to include the effects of migration in the estimation of demographic parameters from temporal genetic data.
Migration can affect the estimation of N_{e} in two ways. In the short term, migration can change allele frequencies quite quickly. Given that the temporal methods estimate N_{e} by measuring the pace at which allele frequencies change, migration causes a population to behave as if strong drift was changing allele frequencies very rapidly, resulting in an estimate of N_{e} that is too low. In the longer term, the problem is reversed. Constant migration and drift would cause the populations to approach an equilibrium level of genetic differentiation and the rate of change of allele frequencies in a deme to slow down to approach that predicted by the effective size of the whole metapopulation. Thus the effective population size of the deme would be substantially overestimated in the long term. These effects are discussed later in this article in more mathematical detail.
This problem can be avoided by simultaneously estimating N_{e} and the migration rate, which is the approach taken in this article. It turns out that while both migration and drift affect allele frequency changes in a population, they do so differently in terms of the direction and magnitude of these changes. Under migration, the allele frequency of a focal population would tend to become increasingly similar to that of the source population. Under drift, however, the allele frequency of a focal population changes purely at random, irrespective of that of the source population. Furthermore, the magnitude of allele frequency change of a focal population depends on the allele frequency difference between the focal and source populations under migration. Migration has no effect on alleles that happen to be at the same frequencies both in the deme in question and in the populations providing those migrants, but migration is expected to change allele frequencies substantially in cases when the source and recipient demes have very different allele frequencies. Genetic drift, on the other hand, is not affected by differences in allele frequencies among demes, but depends only on the local allele frequency. These differences in the patterns of changes of allele frequencies due to migration and drift allow us to estimate them separately, so long as enough data are from multiple loci. Fortunately, these kinds of data are often available with modern biochemical methods.
The estimation of migration rate and migrant number has a long history. The oldest and most widely used genetic method assumes that populations are at equilibrium in a system approximated by Wright’s (1931) island model and then uses estimates of Wright’s F_{ST} to calculate 1/(4N_{e} m + 1), where m is the local immigration rate (see Slatkin 1987). This method has been roundly criticized (Whitlock and McCauley 1999), in particular because of the unrealistic assumptions it requires about the demographic structure of the species. Other approaches include genetic stock identification (Milneret al. 1985), assignment methods (Rannala and Mountain 1997), twolocus descent measures (Vitalis and Couvet 2001), coalescent methods (Nielsen and Wakeley 2001), isolationbydistance methods (see review by Rousset 2001), and migration matrixlikelihood models (Beerli and Felsenstein 2001). The first two approaches require detailed knowledge about the possible sources of migrants and do not allow for genetic drift. The latter four approaches assume that the populations are at an equilibrium between the effects of genetic drift and migration. There is a need for a method that does not make restrictive assumptions about genetic and demographic equilibria, yet allows simultaneous estimation of the effects of drift and migration.
Here we extend previous temporal methods for estimating the N_{e} of a single isolated population so that they can be used to estimate N_{e} and m jointly for a metapopulation. We first develop an expectation based on a moment estimation approach, which allows an approximate description of the behavior of the estimator and the system. We follow this by developing a likelihood method based on that of Williamson and Slatkin (1999), Anderson et al. (2000), and Wang (2001a). Extensive simulations are run to check the performance of the moment and likelihood estimators in terms of precision and accuracy for the estimation of N_{e} and m. Two metapopulation models (a focal small population receiving immigrants from an infinitely large source population and two small populations connected by migration) are considered, and the applications of the methods to more complicated metapopulation models (e.g., a small number of demes in island or steppingstone migration models) were checked by simulations. As an example for the application of our methods, a spatial and temporal genetic data set on Triturus newts (Jehleet al. 2001) was reanalyzed by our methods for the simultaneous estimation of migration and drift.
DEFINITIONS AND ASSUMPTIONS
We use temporal data to estimate the effective size (N_{e}) and immigration rate (m) of a partially isolated population. For brevity, we refer to the population for which parameters are being estimated as the focal population and that from which the migrants come as the source population. Data and parameters for the source population are given the subscript A, and those for the focal population, B. In the first section, we assume that the source population is infinitely large and therefore not changing genetically through the time frame of the study. In the section after that, we assume that both populations are finite and jointly estimate the properties of both. In the latter case, source and focal populations are obviously reciprocal. Throughout we assume diploidy. For data from haploid species or loci, the N_{e} estimated by these models would be double the actual effective size.
For the infinitesource population model, a minimum of three samples is required to estimate N_{e} and m of the focal population: two temporally spaced ones from the focal population and one taken at any time from the source. For the two small population models, a minimum of four samples is necessary, two temporally spaced ones taken concurrently from each population. Let us denote the size of these samples S, with subscripts A or B to describe whether the sample is from the source or focal population, respectively, and a subscript t to describe the generation in which the sample is taken. Thus S_{B}_{,0} is the number of individuals sampled from the focal population in the initial generation. We use x to refer to the relative frequency of a particular allele in a given sample. Therefore x is an estimator of the true frequency (p) of that allele at that time in that population. For convenience, we define q = 1  p. These quantities, x, p, and q, are subscripted in the same way as S.
Several summary statistics of these quantities turn out to be useful. We define Δp_{AB} ≡ p_{A}_{,0}  p_{B}_{,0} to be the initial difference between the allele frequencies of the two populations and Δp_{B,t} ≡ p_{B,t}  p_{B}_{,0} to be the change in allele frequency by generation t in deme B. In the derivations of the momentbased estimators, we also use δ_{B}_{,}_{i} to refer to the change in generation i due to genetic drift of the allele frequency in population B. Each of the quantities can refer to population A if the subscript B is replaced by A.
We make several assumptions throughout. These are that the alleles under study are not under direct selection, that mutation is a negligible source of allele frequency change, that potential sources of migrants to the focal populations can be identified a priori, that all samples are randomly taken from their populations, and that the sampling event does not change the available pool of reproductive individuals. The latter can be realized by sampling after reproduction, sampling with replacement, or sampling from a population much larger than the sample. This sampling scheme therefore corresponds to Waples’s (1989) plan 1.
Table 1 summarizes most of the terms used in this article.
IMMIGRATION FROM AN INFINITESOURCE POPULATION
Theoretical background: Imagine that the focal population is receiving migrants at rate m per generation from an infinite, unchanging source population. For effectively neutral alleles, changes in allele frequency over a short period of time are a function of the local effective population size and migration rate. Mutations are negligible compared to drift and migration because of their rare occurrence and the short time span. Under these conditions, and assuming that migration precedes population regulation, the allele frequency at generation i is given by
Moment approximations: N_{e} and m affect the first and second moments of the change in allele frequency over time. Under certain circumstances, we can use these moments to estimate m and N_{e}. We do this in a twostep process, first estimating m from the mean change in allele frequencies at all loci and then using this value in the estimation of N_{e} from the second moment of allele frequency change.
Estimating the migration rate: The change in allele frequency is given by Equation 4 above. Taking the expectation of Δp_{B}_{,}_{t} we find
To predict the optimal sampling strategy and to approximate standard errors, we estimate the error variance of an estimate of m. The expected error in estimating a ratio is an approximate function of the error variance of estimating the numerator and the denominator and the error covariance between the two (see Lynch and Walsh 1998, Appendix 1). The error in estimating (1  (1  m)^{t}) can be approximated from the error in estimating Δp_{B}_{,}_{t}/Δp_{AB}. The error variances (V_{err}[]) and error covariance (COV_{err}) are given by
Using these last equations and Equation A1.19b from Lynch and Walsh (1998), we can find the expected error variance for the estimate of (1  (1  m))^{t}:
Estimating N_{e}: Under the assumption that the total change in allele frequency per generation is not large (i.e., N_{e} is not very small and m is not very large), each of the terms in the equation describing allele frequency change is roughly independent of the others. The variance in allele frequency change at generation i due to genetic drift within the focal population is V[δ_{B}_{,}_{i}] = p_{B}_{,}_{i}q_{B}_{,}_{i}/2N_{e,}_{B}. If the allele frequency is not changing rapidly, then we can assume that the value of pq is approximately the same each generation. A reasonable estimator of the mean value of pq is
In general we do not have exact information about the change in population allele frequencies, but only an estimate based on sample data. Because of sampling variance,
Choosing the right value of t: With a large separation in time between the first and second samples, estimates of m become problematic. The estimate of (1  (1  m)^{t}) becomes more and more similar for different values of m as t gets larger (lim_{t}_{→∞}[δ(1  (1  m)^{t})/δm] = 0), yet the error variance in estimating this quantity increases linearly with t. Furthermore, as t increases, the allele frequency change approaches a distribution determined largely by the relative sizes of the effective size and m, rather than by the absolute values of either. When t is large enough that the populations reach an equilibrium between genetic drift and migration, then mˆ → 0, irrespective of the actual true migration rate. This is clear by inspecting the singlelocus estimator of m,
If the time interval is too short and m too small, then → there is less power to measure m because too few migration events may have occurred. Larger samples with more informative markers are therefore necessary to obtain a good estimate of m when (1  m)^{t} is close to one or zero.
Similarly, the estimation of N_{e} is also biased if the sampling interval is too long. As t → ∞, we have mˆ 0 shown above and Equation 10 reduces to
As a numerical example, Figure 1A shows the average estimates of 1/(2N_{e,}_{B}) and m over 10,000 replicates plotted against sampling interval t, and Figure 1B shows the coefficients of variation (CV) of the estimates when t = 112 where the estimation is unbiased. (Methods for these simulations are given in the appendix.) The true values being estimated are 1/(2N_{e,}_{B}) = 0.005 (N_{e,}_{B} = 100) and m = 0.2 for the focal population, which is assumed to be in a metapopulation containing 10 other populations of the same size (N_{e} = 100) with an island migration model. The effective size of the source population is therefore ∼1000, sufficiently large although not infinite. As can be seen, estimates of both 1/(2N_{e,}_{B}) and m are close to the true values when t is small, but approach the true values for the whole species, 0.00045 (N_{e} = 1112, calculated using the known parameters) and 0, with increasing t. The maximum number of generations allowed for valid joint estimates of 1/(2N_{e,}_{B}) and m and the number of generations required to reach the equilibrium so that the species N_{e} is estimated depend on m, the effective sizes of the focal and source populations, and the initial genetic differentiation when the first sample was taken (see below).
For this specific numerical example, the estimation is unbiased when ∼t < 12. The precision of the estimators changes considerably over this short sampling period (Figure 1B). As expected, the best estimates are obtained at an intermediate value of t (t = 24 for the example). The optimal sampling interval depends, again, on m, the effective size of the focal population and the initial genetic differentiation when the first sample was taken. With decreasing m and/or increasing initial differentiation, for example, the optimal t is expected to increase.
In experimental design, it is difficult to determine beforehand the optimal interval between sampling for maximum power of the temporal method, because the power depends partially on the parameter being estimated (see e.g., Equation 7). Without any prior knowledge of the demographic history of the populations in question, it is reasonable to consider a short rather than a long sampling interval. While a too long sampling interval results in biased estimates of N_{e} and m, a too short sampling interval only decreases the precision of the estimates. With powerful statistical methods (e.g., maximum likelihood shown below) and large sample size and marker information, the estimation precision can be improved greatly even if t is as short as one generation.
The consequences of assuming that m = 0: Other methods using temporal data for estimating the effective size have assumed that the immigration rate is zero. We can use the equations above to investigate the effects of making this assumption when it is in fact false.
When m → 0, the expected change in allele frequency is zero, M = t, and Equation 12 reduces to
When m > 0, the variance in allele frequency change over the interval between samples is a function of both genetic drift and migration. This can be either greater or smaller than that expected by drift alone, depending on, among other parameters, t. In the short term, the change in allele frequency due to the joint effects of migration and drift will be larger than that expected by drift alone. If t = 1, for example, we have
Figure 2 shows the biases of N_{e} estimation when a false assumption of m = 0 is made. The true parameter values are N_{e} = 100 and m = 0.1, and the populations are assumed to be either completely differentiated (no previous migration, case A) or at driftmigration equilibrium (case B, where F_{st} ≈ 0.02) when the initial sample is taken. Ignoring migration results in an overestimation of 1/(2N_{e}) when sampling interval (t) is small and underestimation when t is large. Compared with case A, the initial overestimation of 1/(2N_{e}) in case B from ignoring migration is much smaller, because the populations are more similar genetically (Δp_{AB} small) and therefore migrants have less effect on allele frequency changes. With a larger m and smaller N_{e}, the initial overestimation of 1/(2N_{e}) can be dramatic even when the populations are initially at equilibrium (data not shown). For both cases, the estimation is reasonably good for a large range of sampling interval when immigration is accounted for. However, the sampling interval cannot be too long. Otherwise, 1/(2N_{e}) is increasingly underestimated with t, approaching the true value for the whole species.
Likelihood model: Let us first consider a single locus with two alleles, and we focus on a particular allele. For brevity in this subsection, subscript B is dropped for the focal population. To estimate m and N_{e} of the focal population jointly, at least two temporally spaced samples from the focal and a single sample from the source population (A) are necessary. Because the three sampling events are independent, the probability of getting a set of samples with allele frequencies denoted by x is
The likelihood (L) of a particular set of values of N_{e} and m is the probability of observing the data given those values. We have, therefore,
The first term in Equation 16, Pr[p_{t} p_{0}, p_{A}, N_{e}, m], can be calculated by using the transition matrix T (Ewens 1979). The probability of a transition from the jth configuration of the parental population at time t  1 to the ith configuration of the offspring population at time t is
For loci with more than two alleles, the calculation of the likelihood function is problematic. The number of possible configurations, C = (2N_{e} + k  1)!/(2N_{e}! (k  1)!) for a diploid locus with k alleles (Feller 1950; Williamson and Slatkin 1999), increases rapidly with k, prohibiting the evaluation of the exact likelihood due to the limitations of computer memory and processing speed. For a moderate value of N_{e}, even k = 3 poses a serious computational problem for the likelihood method. Anderson et al. (2000) proposed a Monte Carlo method to evaluate the likelihood function with data on multiallelic markers. Their method does not need much storage but is highly demanding computationally and may not converge when k is large (say, k = 10) or some alleles are not observed in all samples. To simplify the likelihood computation for multiallelic loci, we follow our previous approach (Wang 2001a), which transforms a kallele locus into k biallelic “loci,” each having one of the k alleles with all the other alleles pooled. The overall loglikelihood is approximated by the sum of the loglikelihoods across loci multiplied by the factor of (k  1)/k to account for the dependence among such converted loci.
The likelihood method can also use more than two samples from the focal population simultaneously to obtain a mean N_{e} and an average m during the whole sampling period. If samples are taken from the focal population at generations t0, t1, t2,..., tg, then the joint likelihood of N_{e} and m is
TWO FINITE DEMES
Up to this point, we have considered the demographic properties of a single deme that receives immigrants from an effectively infinite source, that is, a source so large that over the time course of the study it is not changing appreciably in allele frequency. While this is likely to be a reasonable approximation in many cases, there are examples in nature where only a few relatively small demes serve as the only source of migrants to each other. This case is more difficult, both analytically and computationally, but here we consider the case of two small demes as a beginning to such an analysis.
The allele frequency at a neutral locus will change over a single generation in the two populations according to
Moment estimator: We can find a moment estimator for the two migration rates for this twofinitedeme case, if we consider the change in the difference in allele frequency between the demes over time. We consider the simple case in which the samples are separated by only a single generation, and so we drop the subscript notation for time. Taking the expectations of Equation 19 above and remembering that the expected values of δ_{A} and δ_{B} are zero, we find
Finding an estimator of the effective size of each deme is also possible. The variance of the change in allele frequency over time due to drift for population A is
For the case of t > 1, moment estimators for both m and N_{e} become much more complicated (especially the latter) and are not shown here.
Likelihood estimator: The likelihood of the effective sizes (N_{e,}_{A}, N_{e,}_{B}) and immigration rates (m_{A}, m_{B}) of demes A and B given the temporal data (x_{A}_{,0}, x_{A}_{,}_{t}, x_{B}_{,0}, x_{B}_{,}_{t}) is
In principle, the exact likelihood could be worked out for any set of the four parameter values, but in practice this would be computationally intensive for anything but small population sizes and small sampling interval (t). The total number of configuration combinations that need to be considered in Equation 22 is ((2N_{e,}_{A} + 1) (2N_{e,}_{B} + 1))^{t} for a biallelic locus, which increases exponentially with sampling interval t. Therefore, we use the transition matrix method for the exact calculation of the likelihood for samples separated by only a single generation (t = 1) and use a Monte Carlo approach to approximate the likelihood for samples with t > 1.
Equation 22 could easily be extended to more than two samples from each deme with little increase in computational demands. It could also be extended to more than two demes, but the computational intensity increases prohibitively with the number of demes, because of the exponential increases in configuration combinations and the number of parameters for joint estimation.
SIMULATION RESULTS
Infinite source: Methods for these simulations are summarized in the appendix. For focal populations with wide ranges of effective size and immigration rates, the maximumlikelihood and moment estimates of 1/(2N_{e}) and m from simulations are summarized in Table 2. The sample sizes used in the simulations are large, but not unrealistic. Indeed, a large sample size is important for the temporal methods; otherwise, the apparent changes in allele frequency would be dominated by sampling error rather than by genetic drift. For most natural populations, the actual census population sizes could be orders of magnitude larger than the effective sizes (Frankham 1995), making both large samples and significant genetic drift possible.
The maximumlikelihood estimator gives unbiased estimates of 1/(2N_{e}) in all cases considered. The coefficient of variation of the estimates is roughly onequarter, with 95% of the estimates being in the range of ∼50 and 150% of the true values. The moment estimator underestimates 1/(2N_{e}) slightly when m is large and sample size is small relative to N_{e}. Increasing sample size can decrease the bias dramatically. For the case of N_{e} = 100 and m = 0.5 in Table 2, for example, the mean ± SD of the moment estimates of 1/(2N_{e}) improves to 0.9 ± 0.3 (relative to the true value) when the sample size is 500 individuals. The estimates from the moment estimator are generally more dispersed than those from the maximumlikelihood estimator.
The maximumlikelihood method overestimates m slightly only when m is small in very small populations (N_{e} = 10). This is expected because maximumlikelihood estimates have a lower bound at zero. Increasing sample size or marker information (number of loci and alleles) in a sample decreases the bias and increases precision. In contrast to 1/(2N_{e}), both the moment and the maximumlikelihood methods give similar estimates of m. However, some moment estimates of m are negative when m is small.
In general, both maximumlikelihood and moment methods give reasonable estimates of N_{e} and m. The accuracy and precision of the two estimators are similar for m, but the likelihood method is better for N_{e}. This is partially because moment methods must use the estimate of m in estimating N_{e}. Such a twostep estimation procedure is expected to be inferior to the joint estimation of m and N_{e} from the likelihood approach, because the estimation error of m is not accounted for in estimating N_{e}. Also, when m is large and therefore the changes of allele frequencies are rapid, the approximation to the mean value of pq in deriving the moment estimator of N_{e} may be no longer good enough.
The distribution of the estimates of N_{e} [or 1/(2N_{e})] generally has a long tail, with a significant proportion of the estimates being much larger (smaller) than the true value. This is similar to the observation made on the case of a single isolated population (Nei and Tajima 1981; Waples 1989; Wang 2001a). For the particular parameter set of N_{e} = 100, m = 0.1, the distributions of N_{e} and m estimates over 10,000 replicates are shown in Figure 3. Again, there is very little difference in the distribution of estimates of m between the two techniques, the mean ± SD being 0.114 ± 0.048 for maximum likelihood and 0.115 ± 0.049 for the moment estimator. For N_{e} estimation, however, maximum likelihood is much more accurate and precise than the moment approach. The mean ± SD of estimated 1/(2N_{e}) is 0.0049 ± 0.0020 for maximum likelihood and 0.0041 ± 0.0025 for the moment estimator. The latter gives a significant proportion of extremely large N_{e} estimates.
Typically, the likelihood surface is smooth and has a single maximum. This means that a single arbitrary starting point is sufficient to obtain maximumlikelihood estimates using Powell’s quadratic convergence method (Presset al. 1996). Of course the starting point affects the time spent in the search process, and the values provided by the moment estimates usually act as a good starting point close to the point with the maximum likelihood. For a particular replicate with N_{e} = 100 and m = 0.1 in Table 2, the likelihood curve is shown in Figure 4. The maximumlikelihood estimates of N_{e} and m are 102 and 0.107, respectively, with 95% confidence intervals being 65172 and 0.0380.191, respectively.
Multiple samples over time can increase the estimation precision for the average 1/(2N_{e}) and m during the entire sampling period. Table 3 summarizes the estimates of 1/(2N_{e}) and m, relative to the true parameter values of 1/(2N_{e}) = 0.01 and m = 0.2, from maximumlikelihood and moment estimators using two to six temporal samples from the focal population. As expected, increasing the number of samples has no influence on the accuracy, but increases the precision dramatically, as indicated by the decreasing SD and the width between the 2.5 and 97.5 percentiles. The likelihood method gives unbiased estimation of 1/(2N_{e}) and m, while the moment estimator results in underestimation of 1/(2N_{e}) and overestimation of m. The poor accuracy of the moment estimator is partially due to the highly polymorphic loci (10 alleles per locus) and small sample sizes, which lead to a high proportion of rare alleles in the samples (Wang 2001a).
Quasiinfinite source, the island model and the steppingstone model: In reality, the source population is not infinite in size, although it could be very large. Moreover, there can be different numbers of source populations for a focal population with migration in different patterns. As long as we can identify the source populations a priori, however, the parameters for the focal population can be estimated approximately using the methods developed for the infinitesource model by treating the source populations collectively as the single “infinite” source. Here we explore numerical examples to show that the infinitesource model is very robust and works well for small source populations with various models of migration.
Consider a metapopulation consisting of 21 populations of equal effective size N_{e}. Each population receives, at each generation, a proportion of m migrants taken equally from each of the remaining 20 populations in the island model or from each of the two neighboring populations in the circular steppingstone model. For the estimation of m and N_{e} of a focal population, we take two samples separated by two generations from the focal and each of four source populations (the island model) or each of the two neighboring populations (the steppingstone model). Each sample had 50 individuals and two alleles per locus for N_{e} = 10 and 100 individuals and eight alleles per locus for N_{e} = 100. The estimates of 1/(2N_{e}) and m relative to their true values are summarized in Table 4 for maximumlikelihood and moment estimators. In all cases considered, maximum likelihood provided reasonable estimates of both 1/(2N_{e}) and m, while the moment estimator gives satisfactory estimation of 1/(2N_{e}) and m only when drift is not very weak relative to migration. Otherwise, 1/(2N_{e}) is underestimated, similar to the situation of an infinite large source population shown in Table 2.
We also tested the infinitesource model with the extreme case of two small populations of equal N_{e} and m, shown in Table 5. Although the source population (A) is small, the estimation of both 1/(2N_{e}) and m of the focal population (B) is still not bad, at least within a factor of two. The moment estimator gives similar results (not shown). Like the results shown above, the worst case occurs with an extremely high migration rate, where populations can quickly reach the driftmigration equilibrium. In this case, increasing sample size improves the estimation considerably. For the situation of N_{e} = 100 and m = 0.5 in Table 5, for example, the mean (relative) estimate ± SD is 0.969 ± 0.204 for 1/(2N_{e}) and 0.809 ± 0.204 for m when the sample size is increased to 400 individuals.
Twodeme models: As a numerical example for the case of two demes with sampling interval t = 1, Figure 5 shows the frequency distributions of the estimates of N_{e,}_{A}, m_{A}, N_{e,}_{B}, and m_{B}. As can be seen, both estimators give reasonably good estimates of effective size and migration rate, the precision of the likelihood estimator being always higher than that of the moment estimator.
For the case of sampling interval t > 1, the moment estimator for N_{e} becomes quite complicated and the likelihood estimator becomes computationally intensive. In principle, however, both methods are not limited by the sampling interval, except for the case of very large t where populations are approaching an equilibrium between drift and migration (Figures 1 and 2). At intermediate values of t, both methods should actually be more powerful than at t = 1 (Figure 1B). Because of the computational demands of the Monte Carlo approach, we have run only a few replicates for t > 1 for the likelihood method, yielding reasonable estimates of both effective size and migration rate, as expected (data not shown).
APPLICATION TO A REAL DATA SET
Jehle et al. (2001) used microsatellite markers to investigate the spatial and temporal genetic differentiation of syntopic newts (Triturus cristatus, T. marmoratus). Individuals from three ponds in western France, situated 410 kilometers apart and inhabited by both species, were sampled and genotyped for eight microsatellite loci. Here we focus on T. marmoratus and consider the subpopulations in ponds 1 and 2 only for which temporal samples are available. The subpopulation in pond 3 is ignored because it is much farther away from the other two and has only a single sample. The first sample was collected as sacrificed larvae in 1989 (pond 1) or 1986 (pond 2), and the second was taken nondestructively from both larvae and adults in both ponds in 1998. The allele counts for five polymorphic loci were converted from the original data on allele frequencies and sample sizes listed in the Appendix of Jehle et al. (2001), and the data for adults and larvae in 1998 were combined for the likelihood analysis. The generation interval was estimated to be 6 years, and the sampling interval is therefore ∼1.5 and 2 generations for ponds 1 and 2, respectively (Jehleet al. 2001). Because of the unsynchronized sampling, it is not possible to apply our method for the twosmallpopulation model. However, we can use our method for the infinitesource model as an approximation to estimate the effective size and immigration rate for one pond, with the other pond acting as the source population and samples from it pooled. Since only integer numbers for sampling intervals are allowed in likelihood estimation, the analysis for pond 1 was carried out by using a sampling interval of 2, and the estimates of N_{e} and
The relative profile loglikelihood curves for the effective size and migration rate of subpopulations in ponds 1 and 2 are shown in Figure 6. The maximumlikelihood estimates are N_{e} = 16.3 and m = 0.39 for pond 1 and N_{e} = 18.0 and m = 0.19 for pond 2. If isolation between ponds is assumed, the likelihood N_{e} estimates are 17.1 and 21.9 for ponds 1 and 2, respectively, both being somewhat larger than those allowing migration.
The estimated migration rates between ponds are high. Although newts are found to migrate as juveniles and adults during breeding seasons (Haywardet al. 2000), there is no direct estimate of migration rate between populations in the literature to support or reject the estimates obtained herein. However, we argue that the migration rate could be high for this population. The average observed number of distinctive alleles per microsatellite locus is seven, which seems to be too large to be maintained in isolated populations with N_{e} at ∼1020. Interspecies introgression may also be partially responsible for the high migration rate estimates. On average ∼4% of adult newts from these ponds are F_{1}hybrids between T. cristatus and T. marmoratus (Arntzen and Wallis 1991). Although hybrids were identified morphologically for adults and genetically with diagnostic microsatellites for larvae (Jehleet al. 2000) and excluded from the analysis (Jehleet al. 2001), it is inevitable that some hybrids may be undetected and included in the samples. Coincidentally, pond 2 contains few T. cristatus (Jehleet al. 2001), and the immigration rate for T. marmoratus in this pond is estimated to be much lower than that in pond 1, which contains ∼75 T. cristatus individuals.
DISCUSSION
The estimation of demographic properties of a species from its genetic characteristics is now an old topic. Many methods have been generated by statistical geneticists to estimate population sizes, migration rates, population growth rates, and many other interesting and useful ecological parameters. Some of these methods work quite well most of the time, and some are less successful. All share one property: they make simplifying assumptions about the possible range of demographies that can be considered, and the methods proposed here are no exception. It should be our goal, though, to make only the assumptions that are biologically tenable and avoid those we know to be both important and untrue. One such untenable assumption in previous temporal methods is that there is no migration into the local populations under study. The intent of this article is twofold: to assess and correct for the population genetic effects of immigration on estimates of the effective population size and to estimate the rate of migration itself.
Techniques that use data over a time series are relatively rare in statistical genetics, and this is unfortunate, as it ignores information available from the evolutionary process itself. One exception to this rule is that the effective population size has most often been estimated by a variety of techniques that use temporal data. Two or more samples separated by one or more generations allow us to measure how much allele frequencies change over time. With a model for how evolution might be occurring, we can then predict something about the evolutionary properties of the species. Previous estimators of N_{e} from temporal data have made a significant assumption in this translation from measuring the pattern to inferring the process: they have assumed that all evolutionary processes other than genetic drift are not occurring in the populations under study. This may be reasonable for some genetic markers with respect to selection and mutation, but in most cases the effect of migration into a population is not negligible. What we have seen here is that this assumption can give estimates of N_{e} with substantial bias (Figure 2). Yet it is possible to correct for this error and in the process estimate another of the elusive demographic properties of populations, the immigration rate.
In this article, we have presented techniques based on both moment estimation and maximum likelihood. As a general rule, the maximumlikelihood method was more successful, having higher accuracy and precision, although in most cases the moment approach was not much worse. The likelihood approach, as usual, is much more computationally intensive, although with modern computers this limitation would not be prohibitive. In most cases, the methods require a large amount of marker information for reasonably good estimates of migration rate and effective size. Like the case of a single isolated population (no migration), the amount of information depends on the total number of independent alleles across loci (total number of alleles  number of loci) in the samples, the sample size, and the number of samples (Wang 2001a). Unlike the single isolated population case, however, the amount of information and thus the power of the methods are no longer simply proportional to sampling interval t, but depend on t in a complicated way. Other things being equal, the power of the methods first increases with t, reaches a maximum at an intermediate t value, and thereafter decreases with t (Figure 1). If t is too large, the methods tend to estimate the N_{e} for the whole species rather than for the local population (Figures 1 and 2). Because of these and the fact that the optimal t depends partly on the parameters being estimated, we recommend a short (say, two to four generations) sampling interval in practice. If we have some prior information indicating a low migration rate, however, t should be larger to increase the estimation precision. In general, the numbers of individuals and markers required are not outside the range of the better studies of population structure. Most of the results given here, which allow at least precision to the right order of magnitude, use a few hundreds of individuals genotyped for 1020 loci for a total number of 20140 alleles across loci. It should be noted that most simulations in the present study were run assuming the equilibrium level of initial genetic differentiation among populations. The power of the methods can be substantially higher than that indicated by these simulations if, in practice, the populations are more differentiated when initial samples are taken.
Many methods have been developed for estimating the effective size, migration rate, or their product N_{e}m. Although it seems desirable to compare these different methods for their precision and accuracy, such a comparison is not possible because they use different types of information and estimate different genetic properties of the populations (see, e.g., Schwartzet al. 1999). For the estimation of effective size, methods employing a single sample generally use the (equilibrium) distribution of some genetic property, such as linkage disequilibrium (Hill 1981), identity disequilibrium (Vitalis and Couvet 2001), or coalescence time (e.g., Beaumont 1999; Beerli and Felsenstein 2001) to estimate the longterm effective population size. In contrast, methods employing two or more temporal samples (e.g., Nei and Tajima 1981; Pollak 1983; Waples 1989; Williamson and Slatkin 1999; Andersonet al. 2000; Wang 2001a; this article) use the allele frequency changes over time under a given genetic model to estimate the shortterm average effective size over the sampling interval. An exception is the heterozygosity excess method (Pudovkinet al. 1996; Luikart and Cornuet 1999), which uses a single sample but estimates the effective size of the population just one generation before the sampling. This method, however, has a low precision and applies only to very small populations under strictly random mating.
For the estimation of migration, most previous approaches estimate the composite quantity of effective number of migrants, 4N_{e}m, assuming an equilibrium among mutation, migration, and drift (e.g., Slatkin 1987; Beerli and Felsenstein 2001). This quantity largely determines the genetic differentiation among populations at equilibrium, but tells us little about the specific effect of either genetic drift (effective size) or migration. These techniques also suffer from a long list of untenable assumptions (Whitlock and McCauley 1999). So far the only approach available to estimate effective size and migration rate jointly is based on the estimates of one and twolocus identity probabilities (Vitalis and Couvet 2001). It requires only a single sample, but assumes populations at equilibrium and with known parameters for the mating system and linkage relationship among markers. For large populations, tightly linked markers with known recombination rate are necessary for substantial identity disequilibrium and therefore reliable N_{e} estimates from this method. In practice, parameters describing mating system and recombination are unknown and, in the most favorable case, estimated from data and the estimates can suffer from large sampling errors. In contrast, our temporal method must use at least two temporal samples from the focal population, but it does not make any assumptions about mating system and genetic status (at equilibrium or not) of the populations when sampling, and it applies to small as well as large populations. We do assume independence among marker loci, but violation of the assumption affects only the precision, not the accuracy of the estimation. With tightly linked markers, the information of the data and thus precision might be overestimated, but no bias should result from dependence among loci.
Another advantage of our temporal approach is that it is applicable to a wide range of population structures and migration patterns, e.g., arbitrary numbers of demes of various sizes in an island or steppingstone model, as long as the migration sources are known a priori from another source of information. Simulation results indicate that the methods developed assuming an infinitely large source are robust to violations of the assumption. For the case of a single small source population, or for sampling only a few from many small source populations, the methods under the infinitesource model still yield satisfactory estimates for both immigration rate and effective size for the focal population.
A computer program for the infinitesource model, MLNE (maximumlikelihood estimate of N_{e}), which implements the maximumlikelihood and moment estimation of both N_{e} and m, is available for free download from http://www.zoo.cam.ac.uk/ioz/software.htm.
APPENDIX: SIMULATION AND LIKELIHOOD SEARCH METHODS
Simulations: Simulations were run to check the performance of the moment and likelihood methods over a wide range of the parameter values. The changes in allele frequency in a diploid population of effective size N_{e} and immigration rate m were simulated by Monte Carlo. For each replicate, the initial allele frequencies for each locus were drawn from a uniform distribution. Except where specified, some generations (required to reach ∼95% equilibrium F_{ST} predicted by using the true parameter values) were run to allow populations to reach quasiequilibrium between drift and migration before samples were taken binomially (sampled with replacement) for estimating the parameters. For equilibrium populations, obviously the number of alleles at a locus observed (maintained) in a focal population covaries with N_{e} and m. To compare the performance of the estimation methods across wide ranges of parameter values of N_{e} and m, we used a small number of alleles per locus in most simulations. For populations with very small m and N_{e}, mutations (at rate 0.001) were incurred to maintain polymorphism. The methods are, however, applicable to nonequilibrium populations and should be slightly more powerful if the genetic differentiation among populations is larger than the equilibrium value, and vice versa.
Since both the likelihood and the moment methods estimate 1/(2N_{e}) rather than N_{e} directly, and it is 1/(2N_{e}) that is important in most applications, we reportthe estimates of 1/(2N_{e}) and m from the simulations. Any unbiased estimate of 1/(2N_{e}) with sampling error would be biased for estimating N_{e}, and vice versa. To reduce running time, a replicate was terminated once the likelihood estimate of N_{e} was determined to be larger than the threshold value of 10N_{e}, and the estimate of 1/(2N_{e}) is regarded as zero (N_{e} infinite) from this replicate. For the moment estimator, a negative estimate of N_{e} is taken as N_{e} → ∞ and the estimate of 1/(2N_{e}) is zero. For a given set of parameters, the means and standard deviations of 1/(2N_{e}) and m estimates from each estimator were obtained over a large number of replicates (depending on N_{e} and thus the computational demand). As a measure of the dispersion of the estimates, the 2.5 and 97.5 percentiles of the distribution of estimates of each parameter from all replicates were also listed for each estimator.
For more than two samples from the focal population, a single estimate of the (harmonic) mean effective size and (arithmetic) mean immigration rate can be obtained from the likelihood estimator. From the moment estimator, N_{e} and m for each period between two adjacent samples can be calculated, and the harmonic and arithmetic means of these over different periods are used as the estimates. More appropriately, averaging moment estimates should take sampling variances into account, which can be obtained by resampling.
Likelihood searches: Depending on the data, the likelihood function defined above may have multiple maxima. To search for the global maximum, we could use techniques employing a Monte Carlo method, such as the simulated annealing method (Presset al. 1996; Wang 2001b). However, such methods are computationally intensive and therefore not appropriate for simulation studies where large numbers of replicates are to be run. We choose to use Powell’s quadratically convergent method (Presset al. 1996), which can find quickly a local maximum from a given starting point. Widely varying starting values of N_{e} and m, including the estimates from the moment estimator, are used and the largest likelihood is picked as the global maximum. For large N_{e}, we start the maximumlikelihood searching from the initial point of moment estimates if they are valid or a random nearby point if not. The performance of the maximumlikelihood estimator indicated by the simulations for large N_{e} can therefore be slightly conservative.
Acknowledgments
We thank Mark Beaumont, Cort Griswold, Sally Otto, Dolph Schluter, and two anonymous reviewers for their careful reading of and helpful comments on the manuscript. Funding was provided by a grant from the Natural Science and Engineering Research Council (Canada).
Footnotes

Communicating editor: J. B. Walsh
 Received July 24, 2002.
 Accepted October 25, 2002.
 Copyright © 2003 by the Genetics Society of America