Abstract

Using the island model of population demography, I report that the demographic parameters migration rate and effective population size can be jointly estimated with equilibrium probabilities of identity in state calculated using a sample of genotypes collected at a single point in time from a single generation. The method, which uses moment-type estimators, applies to dioecious populations in which females and males have identical demography and monoecious populations with no selfing and requires that offspring genotypes are sampled following reproduction and prior to migration. I illustrate the estimation procedure using the infinite-island model with no mutation and the finite-island model with three kinds of mutation models. In the infinite-island model with no mutation, the estimators can be expressed as simple functions of estimates of the F-statistic parameters FIT and FST. In the finite-island model with mutation among k alleles, mutation rate, migration rate, and effective population size can be simultaneously estimated. The estimates of migration rate and effective population size are somewhat robust to violations in assumptions that may arise in empirical applications such as different kinds of mutation models and deviations from temporal equilibrium.

POPULATION geneticists recognize that the demographic characteristics of populations, such as migration rates and population sizes, affect population genetic structure (Wright 1951). Accordingly, many population genetic studies have investigated how demographic properties might be inferred from genetic measurements in populations (e.g., Slatkin 1985; Waples 1989; Pudovkin  et al. 1996; Beerli and Felsenstein 2001; Vitalis and Couvet 2001; Wang and Whitlock 2003; Robledo-Arnuncio  et al. 2006). In parallel, the cultivation of genomic resources in species that are amenable to field study has facilitated the application of genetic methodologies to estimate demographic rates in natural populations.

The island model of Wright (1951) is an important model in population genetics. Under a simple version of this model, an infinite number of demes, each having population size N, exchange migrants at rate m under the assumption that migrants into a deme come from any of the other demes with equal probability. In the absence of mutation and other evolutionary forces, genetic polymorphism is maintained within demes via a balance between genetic drift and migration. An important feature of the infinite-island model is that, at temporal equilibrium, the magnitude of genetic differentiation among demes, FST, is approximated by
\[F_{\mathrm{ST}}{\approx}\frac{1}{1{+}4mN},\]
where the approximation is intended to apply for small values of the migration rate, m (Wright 1951). An important statistical consequence of the above result is that the product parameter mN, the number of individuals migrating and reproducing per generation, may be estimated using data on FST (Slatkin 1985), but the parameters m and N cannot be estimated individually in this way. Although the indiscriminate application of the infinite-island model to interpret genetic data in terms of demographic rates has been discouraged (Whitlock and McCauley 1999), the island model continues to support a variety of theoretical and empirical investigations (e.g., Vitalis and Couvet 2001; Balloux et al. 2003; Hänfling and Weetman 2006).
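As a small illustration of this identifiability problem, the sketch below inverts Wright's approximation to recover the product mN from FST. The function names `wright_fst` and `nm_from_fst` are hypothetical, and the formula is only the small-m approximation quoted above, not an exact result.

```python
def wright_fst(m: float, n: float) -> float:
    """Wright's (1951) small-m approximation to F_ST in the infinite-island model."""
    return 1.0 / (1.0 + 4.0 * m * n)


def nm_from_fst(fst: float) -> float:
    """Invert F_ST ~ 1 / (1 + 4 m N) for the product m N."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("F_ST must lie in (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


# Two different (m, N) pairs with the same product give the same F_ST,
# so F_ST alone cannot separate m from N.
fst_a = wright_fst(0.10, 10)
fst_b = wright_fst(0.05, 20)
print(fst_a, fst_b, nm_from_fst(fst_a))
```

Both parameter pairs map to the same value of FST, which is why only the product mN is recoverable from this relationship.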

There is continuing interest in statistical approaches that estimate both migration rate, m, and effective population size, N, from genetic data, including methods that are applicable to a sample taken from a single generation at a single point in time (Beerli and Felsenstein 2001; Vitalis and Couvet 2001; Wang and Whitlock 2003). Here, I report results that show how the island model of dioecious or monoecious populations can be used to simultaneously estimate migration rate, m, and effective population size, N, using a sample of selectively neutral markers taken from a single generation at a single point in time. In particular, at temporal equilibrium under the infinite-island model with no mutation, the demographic parameters m and N can be estimated using data on FIT and FST (Wright 1951). At temporal equilibrium under the finite-island model with a k-allele mutation scheme, the demographic parameters m and N, as well as the mutation rate, u, can be jointly estimated using data on probabilities of identity in state.

THE INFINITE-ISLAND MODEL WITH NO MUTATION

I first describe key results for the infinite-island model of a selectively neutral locus with no mutation; theoretical details are in the  appendix. Under the infinite-island model with no mutation, an infinite number of demes, each having effective population size N, exchange migrants at rate m. The results that follow apply to monoecious populations with no selfing (N adults) and dioecious populations when males and females have identical demography (N adults composed of N/2 females and N/2 males). This model is appropriate for highly fecund organisms with localized mating, including species of invertebrates, amphibians, fishes, and plants.

Population genetic structure can be characterized using probabilities of gene identity (e.g., Maruyama 1970; Maynard  Smith 1970; Nei and Feldman 1972; Crow and Aoki 1984; Epperson 1999; Rousset 2001; Vitalis 2002). Accordingly, let Q1(t), Q2(t), and Q3(t) be the probabilities (summed over k alleles at one locus) that genes within individuals, between individuals within a deme, and between individuals between demes are the same allele at time t, respectively. Hence, Q1(t), Q2(t), and Q3(t) are probabilities of identity in allelic state. A key idea in the following theory is that the interpretation of the probabilities of identity can depend on the timing of the sampling of genotypes within the sequence of demographic events that defines the life cycle (Vitalis 2002). I first assume that the sampling of genotypes follows a premigration census in the sense that genotypes are sampled from offspring immediately following reproduction and prior to migration. This kind of sampling is appropriate for highly fecund organisms with localized mating in which many offspring may be available following reproduction for genotyping. Under the premigration census in the infinite-island model with no mutation, the probabilities of identity at temporal equilibrium satisfy
\begin{eqnarray*}&&Q_{1(t{+}1)}{=}Q_{1(t)}{=}(1{-}m)^{2}Q_{2(t)}{+}{[}1{-}(1{-}m)^{2}{]}Q_{3(t)}\\&&Q_{2(t{+}1)}{=}Q_{2(t)}{=}\frac{1}{2N}(1{+}Q_{1(t)}){+}(1{-}m)^{2}\left(1{-}\frac{1}{N}\right)Q_{2(t)}\\&&{+}{[}1{-}(1{-}m)^{2}{]}\left(1{-}\frac{1}{N}\right)Q_{3(t)}\\&&Q_{3(t{+}1)}{=}Q_{3(t)}{=}Q_{3(t)}.\end{eqnarray*}
(1)
Equation 1 is the same as Equation A1.4 of Vitalis (2002) when assuming an infinite number of demes with no mutation and no sex-specific dispersal in the latter. Although recursions similar to Equation 1 have been presented and analyzed (Maynard  Smith 1970; Vitalis and Couvet 2001; Vitalis 2002; Balloux  et al. 2003; see the  appendix for details), previous work seems to have overlooked the idea that Equation 1 can be used to jointly estimate m and N. Indeed, at temporal equilibrium, the parameters FIT and FST are distinct and are given by
\begin{eqnarray*}&&F_{\mathrm{IT}}{=}\frac{Q_{1}{-}Q_{3}}{1{-}Q_{3}}{=}\frac{(1{-}m)^{2}}{{[}1{-}(1{-}m)^{2}{]}2N{+}(1{-}m)^{2}}\\&&{\approx}\frac{(1{-}2m)}{(1{-}2m){+}4mN}\\&&F_{\mathrm{ST}}{=}\frac{Q_{2}{-}Q_{3}}{1{-}Q_{3}}{=}\frac{1}{{[}1{-}(1{-}m)^{2}{]}2N{+}(1{-}m)^{2}}\\&&{\approx}\frac{1}{(1{-}2m){+}4mN},\end{eqnarray*}
where the approximation omits terms proportional to \(m^{2}\) [the approximation is given here solely to connect these findings to Wright's (1951) classic result that \(F_{\mathrm{ST}}{\approx}1/(1{+}4mN)\)]. Equation 14 in Vitalis (2002) assuming no mutation, an infinite number of demes, and no sex-specific dispersal is the same as the equation for FST given above, but Vitalis (2002) does not report an expression for FIT. Thus, migration rate, m, and effective population size, N, can be expressed in terms of FIT and FST, without approximation, via
\begin{eqnarray*}&&m{=}1{-}\sqrt{\frac{F_{\mathrm{IT}}}{F_{\mathrm{ST}}}}{=}1{-}\sqrt{\frac{Q_{1}{-}Q_{3}}{Q_{2}{-}Q_{3}}}\\&&N{=}\frac{1{-}F_{\mathrm{IT}}}{2(F_{\mathrm{ST}}{-}F_{\mathrm{IT}})}{=}\frac{1{-}Q_{1}}{2(Q_{2}{-}Q_{1})}.\end{eqnarray*}
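The step from Equation 1 to these expressions is short. As a sketch, drop the time index at equilibrium and write \(a{=}(1{-}m)^{2}\) (a shorthand introduced here only for this derivation). The Q1 recursion gives the first line below, and substituting the Q1 recursion into the Q2 recursion gives the second:
\begin{eqnarray*}&&Q_{1}{-}Q_{3}{=}a(Q_{2}{-}Q_{3})\quad\Rightarrow\quad F_{\mathrm{IT}}{=}(1{-}m)^{2}F_{\mathrm{ST}}\\&&Q_{2}{=}\frac{1{+}Q_{1}}{2N}{+}\left(1{-}\frac{1}{N}\right)Q_{1}\quad\Rightarrow\quad Q_{2}{-}Q_{1}{=}\frac{1{-}Q_{1}}{2N}.\end{eqnarray*}
Solving the first line for m and the second for N, and dividing numerators and denominators by \(1{-}Q_{3}\) where needed, returns the expressions for m and N given above.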
Hence, the above expressions can be used to estimate m and N via the moment-based estimators
\begin{eqnarray*}&&{\hat{m}}{=}1{-}\sqrt{\frac{{\hat{F}}_{\mathrm{IT}}}{{\hat{F}}_{\mathrm{ST}}}}{=}1{-}\sqrt{\frac{{\hat{Q}}_{1}{-}{\hat{Q}}_{3}}{{\hat{Q}}_{2}{-}{\hat{Q}}_{3}}}\\&&{\hat{N}}{=}\frac{1{-}{\hat{F}}_{\mathrm{IT}}}{2({\hat{F}}_{\mathrm{ST}}{-}{\hat{F}}_{\mathrm{IT}})}{=}\frac{1{-}{\hat{Q}}_{1}}{2({\hat{Q}}_{2}{-}{\hat{Q}}_{1})},\end{eqnarray*}
(2)
where \({\hat{F}}_{\mathrm{IT}},{\hat{F}}_{\mathrm{ST}},{\hat{Q}}_{1(t)},{\hat{Q}}_{2(t)},\) and \({\hat{Q}}_{3(t)}\) denote estimates of FIT, FST, Q1(t), Q2(t), and Q3(t), respectively. Methods for estimating FIT, FST, Q1(t), Q2(t), and Q3(t) are discussed by Rousset (2001). Equation 17 in Vitalis (2002) gives an estimator for sex-specific dispersal rates similar to the estimator of m in Equation 2, but, importantly, the former requires estimates of FST from a sequence of samples taken pre- and postmigration, rather than estimates of FIT and FST from a single sample as in Equation 2. Fontanillas et al. (2004) also give estimators for sex-specific dispersal based on the idea of Vitalis (2002) that require estimates of FST from samples taken pre- and postmigration. Vitalis (2002) and Fontanillas et al. (2004) do not report estimators of effective population size.
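The estimators in Equation 2 are simple enough to write down directly. The following is a minimal sketch, assuming the three probabilities of identity have already been estimated; the function name `estimate_m_and_n` is hypothetical, and no correction is made for sampling error.

```python
import math


def estimate_m_and_n(q1: float, q2: float, q3: float) -> tuple[float, float]:
    """Moment estimators of Equation 2 from estimated probabilities of identity.

    q1: identity of the two genes within an individual
    q2: identity of genes from different individuals in the same deme
    q3: identity of genes from individuals in different demes
    Raises ZeroDivisionError if F_ST = 0 or F_ST = F_IT.
    """
    f_it = (q1 - q3) / (1.0 - q3)
    f_st = (q2 - q3) / (1.0 - q3)
    ratio = f_it / f_st
    if ratio < 0.0:
        raise ValueError("F_IT / F_ST is negative; no real solution for m")
    m_hat = 1.0 - math.sqrt(ratio)
    n_hat = (1.0 - f_it) / (2.0 * (f_st - f_it))
    return m_hat, n_hat


# Round trip: build equilibrium values from known m and N, then recover them.
m, n, q3 = 0.10, 20, 1.0 / 8.0
denom = (1.0 - (1.0 - m) ** 2) * 2 * n + (1.0 - m) ** 2
q1 = q3 + (1.0 - q3) * (1.0 - m) ** 2 / denom
q2 = q3 + (1.0 - q3) / denom
m_hat, n_hat = estimate_m_and_n(q1, q2, q3)
print(m_hat, n_hat)
```

With exact equilibrium values as input, the function returns the values of m and N used to build them, up to floating-point error. With real data the estimates can be negative or undefined when the sampled identities violate Q1 < Q2.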

Estimates of m and N can be calculated from multiple loci by calculating \({\hat{Q}}_{1(t)},{\hat{Q}}_{2(t)},\) and \({\hat{Q}}_{3(t)}\) over loci (or, equivalently, \({\hat{F}}_{\mathrm{IT}}\) and \({\hat{F}}_{\mathrm{ST}}\) over loci). Interestingly, if offspring genotypes are sampled following migration using a postmigration census scheme (Vitalis 2002), then the recursions for the probabilities of identity are different from those for the premigration census with the consequence that FIT = FST. Hence, the parameters m and N cannot be jointly estimated in this way using a postmigration census.
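One way to obtain the multilocus probabilities of identity from premigration genotype data is sketched below. This is a naive plug-in calculation based on counting identical gene pairs, not necessarily one of the estimators discussed by Rousset (2001). The data layout (a list of demes, each a list of two-allele tuples), the function names, and the unweighted averaging over loci are all assumptions made for illustration.

```python
from collections import Counter

Genotype = tuple[str, str]   # the two alleles carried by one diploid individual
Deme = list[Genotype]


def identity_probabilities(demes: list[Deme]) -> tuple[float, float, float]:
    """Plug-in estimates of (Q1, Q2, Q3) at one locus from genotypes grouped by deme."""
    n_ind = sum(len(d) for d in demes)
    q1 = sum(a == b for d in demes for a, b in d) / n_ind

    counts = [Counter(allele for g in d for allele in g) for d in demes]
    within_same = within_pairs = 0
    for d, c in zip(demes, counts):
        genes = 2 * len(d)
        hom = sum(a == b for a, b in d)
        # identical ordered gene pairs, minus self-pairs, minus pairs within an individual
        within_same += sum(v * v for v in c.values()) - genes - 2 * hom
        within_pairs += genes * (genes - 2)
    q2 = within_same / within_pairs      # needs at least one deme with two individuals

    total = Counter()
    for c in counts:
        total.update(c)
    all_genes = sum(total.values())
    between_same = sum(v * v for v in total.values()) - sum(
        v * v for c in counts for v in c.values())
    between_pairs = all_genes ** 2 - sum((2 * len(d)) ** 2 for d in demes)
    q3 = between_same / between_pairs    # needs at least two demes
    return q1, q2, q3


def multilocus_identity(loci: list[list[Deme]]) -> tuple[float, ...]:
    """Average the per-locus estimates (an assumed, unweighted way of combining loci)."""
    per_locus = [identity_probabilities(demes) for demes in loci]
    return tuple(sum(q[i] for q in per_locus) / len(per_locus) for i in range(3))


example = [[("A", "A"), ("A", "B")], [("B", "B"), ("A", "B")]]
print(identity_probabilities(example))
```

The tiny example has two demes of two individuals each; half of the individuals are homozygous, so the first returned value is 0.5.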

To verify the recursions in Equation 1, and thus that the estimators in Equation 2 work as intended, I simulated genotype data under the infinite-island model with no mutation for dioecious and monoecious (with no selfing) populations at temporal equilibrium over a range of migration rates (0.02, 0.05, 0.10, and 0.20) and effective population sizes (10, 20, and 50). In the simulations, individual genotypes were tracked forward in time using a Monte Carlo implementation of the probability model defined by the life cycle using a premigration census. For each replicate simulation, diploid genotypes were initialized using random pairs of alleles, the life cycle was iterated until the system reached temporal equilibrium, and offspring genotypes were sampled prior to migration. I numerically solved the analytical recursions in Equation 1 to identify, in advance of the stochastic simulations, a sufficient number of generations required for the system of probabilities of identity to reach equilibrium to a precision of \(10^{-4}\) (equilibrium to four decimal places; 1000 generations is sufficient for all parameter combinations under the infinite-island model). Means of the probabilities of identity calculated over replicate simulations are in close agreement with those calculated from the analytical recursions. The simulations were carried out using 20 independent eight-allele loci (with equally frequent alleles) at which 50 offspring were genotyped from each of 20 demes. Simulated data were combined over loci to calculate \({\hat{Q}}_{1(t)},{\hat{Q}}_{2(t)},\) and \({\hat{Q}}_{3(t)}\). Negative estimates of N (equivalent to infinite-valued estimates of N) and m were set equal to 1000 and zero, respectively.
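The deterministic part of this procedure can be sketched as follows: iterate Equation 1 from genotypes formed by random pairs of alleles (so that Q1 = Q2 = Q3 initially) and count the generations needed to come within \(10^{-4}\) of the closed-form equilibrium. The function names are hypothetical, and measuring precision as the largest absolute distance from equilibrium is an assumption about how "precision" is defined. The last helper applies the stated rule for negative estimates.

```python
def equilibrium_equation_1(m: float, n: int, q3: float) -> tuple[float, float]:
    """Closed-form equilibrium (Q1, Q2) implied by the F_IT and F_ST expressions."""
    a = (1.0 - m) ** 2
    denom = (1.0 - a) * 2 * n + a
    return q3 + (1.0 - q3) * a / denom, q3 + (1.0 - q3) / denom


def generations_to_equilibrium(m: float, n: int, q3: float,
                               precision: float = 1e-4,
                               max_gen: int = 100_000) -> int:
    """Generations of Equation 1 needed to reach equilibrium to the given precision."""
    a = (1.0 - m) ** 2
    q1_eq, q2_eq = equilibrium_equation_1(m, n, q3)
    q1 = q2 = q3                      # random pairs of alleles at generation 0
    for gen in range(max_gen + 1):
        if abs(q1 - q1_eq) < precision and abs(q2 - q2_eq) < precision:
            return gen
        mix = a * q2 + (1.0 - a) * q3
        # Both updates use the previous generation's values.
        q1, q2 = mix, (1.0 + q1) / (2.0 * n) + (1.0 - 1.0 / n) * mix
    raise RuntimeError("recursion did not converge within max_gen generations")


def truncate_estimates(m_hat: float, n_hat: float,
                       n_cap: float = 1000.0) -> tuple[float, float]:
    """Set a negative m estimate to zero and a negative N estimate to 1000."""
    return max(m_hat, 0.0), (n_cap if n_hat < 0.0 else n_hat)


# Eight equally frequent alleles give Q3 = 1/8.
print(generations_to_equilibrium(m=0.02, n=50, q3=1.0 / 8.0))
```

Running the function over the grid of m and N values gives the burn-in length to use before sampling genotypes in the stochastic simulations.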

The simulations of the infinite-island model show, given sufficient data collected using a premigration census, that estimates of migration rate and effective population size using Equation 2 are close to their true values for both dioecious (Figure 1A; Table 1) and monoecious populations (Figure 1B; Table 1). The precision of the estimates of N decreases with increasing N, and the precision of the estimates of m decreases with increasing m and N. Additional simulation results are in supplemental Table S1 at http://www.genetics.org/supplemental/.

Figure 1.—

Infinite-island model: estimates of migration rate and effective population size. (A) Estimates of migration rate, m, and effective population size, N, under the infinite-island model with no mutation for different values of the migration rate and effective population size parameters for dioecious populations. (B) Estimates of migration rate, m, and effective population size, N, under the infinite-island model with no mutation for different values of the migration rate and effective population size parameters for monoecious populations. The medians (open circles) and interquartile ranges (error bars) of estimates of migration rate and effective population size from 400 replicate simulations are plotted for each pair of model values of migration rate and effective population size. Solid circles denote parameter values used to simulate the data.

TABLE 1

Medians (5th, 95th percentiles) of the estimates of migration rate and effective population size using genotype data simulated under the infinite-island model at different values of migration rate, m, and effective population size, N, based on 400 replicate simulations for each pair of model migration rate and effective population size parameters for dioecious and monoecious model populations

| N | m | Dioecious \({\hat{m}}\) | Dioecious \({\hat{N}}\) | Monoecious \({\hat{m}}\) | Monoecious \({\hat{N}}\) |
|---|---|---|---|---|---|
| 10 | 0.02 | 0.019 (0.013, 0.027) | 9.91 (7.80, 13.22) | 0.019 (0.014, 0.027) | 9.73 (8.16, 12.46) |
| 10 | 0.20 | 0.200 (0.158, 0.248) | 9.96 (8.76, 11.88) | 0.199 (0.162, 0.244) | 10.05 (8.90, 11.29) |
| 20 | 0.02 | 0.019 (0.012, 0.028) | 19.62 (15.04, 30.07) | 0.019 (0.012, 0.027) | 19.78 (15.33, 29.79) |
| 20 | 0.20 | 0.202 (0.152, 0.262) | 19.77 (16.47, 25.14) | 0.199 (0.147, 0.258) | 20.04 (16.36, 25.14) |
| 50 | 0.02 | 0.019 (0.005, 0.032) | 51.39 (30.95, 172.5) | 0.019 (0.005, 0.033) | 50.90 (30.23, 165.7) |
| 50 | 0.20 | 0.194 (0.090, 0.322) | 50.18 (33.46, 101.9) | 0.200 (0.095, 0.322) | 49.66 (34.20, 95.17) |

N and m are the parameter values used in the simulations; the remaining columns are parameter estimates.




THE FINITE-ISLAND MODEL WITH k-ALLELE MUTATION

I now describe key results for the finite-island model with mutation following the k-allele mutation model; theoretical details are in the  appendix. Under the finite-island model, s demes, each having effective population size N, exchange migrants at rate m, and genes can mutate into other alleles after gamete production according to the k-allele mutation model. The results that follow again apply to monoecious populations with no selfing and dioecious populations when males and females exhibit identical demography (including identical rates of mutation).

Under the premigration census in the finite-island model with k-allele mutation, the probabilities of identity at temporal equilibrium satisfy
\begin{eqnarray*}&&Q_{1(t{+}1)}{=}Q_{1(t)}{=}U_{2}{+}(U_{1}{-}U_{2}){[}M_{1}Q_{2(t)}{+}(1{-}M_{1})Q_{3(t)}{]}\\&&Q_{2(t{+}1)}{=}Q_{2(t)}{=}\frac{U_{1}}{2N}{+}\left(1{-}\frac{1}{2N}\right)U_{2}{+}\frac{1}{2N}(U_{1}{-}U_{2})Q_{1(t)}\\&&{+}M_{1}\left(1{-}\frac{1}{N}\right)(U_{1}{-}U_{2})Q_{2(t)}\\&&{+}(1{-}M_{1})\left(1{-}\frac{1}{N}\right)(U_{1}{-}U_{2})Q_{3(t)}\\&&Q_{3(t{+}1)}{=}Q_{3(t)}{=}U_{2}{+}(U_{1}{-}U_{2}){[}M_{2}Q_{2(t)}{+}(1{-}M_{2})Q_{3(t)}{]},\end{eqnarray*}
(3)
where
\begin{eqnarray*}&&M_{1}{=}(1{-}m)^{2}{+}\frac{m^{2}}{s{-}1}\\&&M_{2}{=}\frac{1{-}M_{1}}{s{-}1}\\&&U_{1}{=}(1{-}u)^{2}{+}\frac{u^{2}}{k{-}1}\\&&U_{2}{=}\frac{1{-}U_{1}}{k{-}1}.\end{eqnarray*}
Equation 3 is the same as Equation A1.4 of Vitalis (2002) when the latter is modified to have k-allele mutation (rather than infinite-allele mutation) and no sex-specific dispersal. Although recursions similar to Equation 3 have been presented and analyzed (Maynard  Smith 1970; Vitalis and Couvet 2001; Vitalis 2002; Balloux  et al. 2003; see  appendix for details), the subsequent estimation of u, m, and N based on Equation 3 seems not to have been recognized in previous work. Indeed, Equation 3 can be used to jointly estimate the mutation rate, u, migration rate, m, and effective population size, N, by solving the system of equations
\begin{eqnarray*}&&{\hat{Q}}_{1(t)}{=}U_{2}{+}(U_{1}{-}U_{2}){[}M_{1}{\hat{Q}}_{2(t)}{+}(1{-}M_{1}){\hat{Q}}_{3(t)}{]}\\&&{\hat{Q}}_{2(t)}{=}\frac{U_{1}}{2N}{+}\left(1{-}\frac{1}{2N}\right)U_{2}{+}\frac{1}{2N}(U_{1}{-}U_{2}){\hat{Q}}_{1(t)}\\&&{+}M_{1}\left(1{-}\frac{1}{N}\right)(U_{1}{-}U_{2}){\hat{Q}}_{2(t)}\\&&{+}(1{-}M_{1})\left(1{-}\frac{1}{N}\right)(U_{1}{-}U_{2}){\hat{Q}}_{3(t)}\\&&{\hat{Q}}_{3(t)}{=}U_{2}{+}(U_{1}{-}U_{2}){[}M_{2}{\hat{Q}}_{2(t)}{+}(1{-}M_{2}){\hat{Q}}_{3(t)}{]}\end{eqnarray*}
(4)
for u, m, and N. The values of u, m, and N that satisfy Equation 4, denoted by \({\hat{u}},{\hat{m}},\) and \({\hat{N}},\) respectively, are the respective moment-based estimators of u, m, and N. The estimates \({\hat{u}},{\hat{m}},\) and \({\hat{N}}\) are calculated assuming that the number of demes, s, and the number of possible alleles, k, are known. Estimates can be calculated from multiple loci by summing the left- and right-hand sides of Equation 4 over loci (k may be locus specific in the \(U_{1}\) and \(U_{2}\) terms in the right-hand side of Equation 4) and calculating the parameter values that solve the moment equations obtained by setting the left-hand sum over loci equal to the right-hand sum over loci.
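For a single locus (or several loci sharing the same k), the three moment equations can also be solved by hand. The sketch below is an illustrative alternative to a numerical solver, not the procedure used for the results reported here. It relies on two identities that follow from the definitions after Equation 3, namely \(U_{1}{+}(k{-}1)U_{2}{=}1\) and \(M_{1}{+}(s{-}1)M_{2}{=}1\), and the elimination steps are an assumption that should be checked against a round trip. The function name `solve_equation_4` is hypothetical, and the square roots assume u < (k − 1)/k and m < (s − 1)/s.

```python
import math


def solve_equation_4(q1: float, q2: float, q3: float,
                     s: int, k: int) -> tuple[float, float, float]:
    """Solve the moment equations of Equation 4 for (u, m, N) by elimination.

    Raises ValueError or ZeroDivisionError when the sampled identities admit
    no real solution (for example q2 == q1 or q2 == q3).
    """
    # Step 1: combine the Q1 and Q3 equations using M1 + (s - 1) M2 = 1;
    # M1 and M2 drop out, leaving one linear equation for U2.
    b = q2 + (s - 1) * q3
    u2 = (q2 - q1) / (k * b - s)
    d = 1.0 - k * u2                      # d = U1 - U2 = (1 - k u / (k - 1))^2
    u_hat = (k - 1) / k * (1.0 - math.sqrt(d))

    # Step 2: with U2 and d known, the Q1 equation gives M1 and hence m.
    m1 = ((q1 - u2) / d - q3) / (q2 - q3)
    gap = (s * m1 - 1.0) / (s - 1)        # M1 - M2 = (1 - s m / (s - 1))^2
    m_hat = (s - 1) / s * (1.0 - math.sqrt(gap))

    # Step 3: the Q2 equation is linear in 1 / (2N) once U2 and d are known.
    n_hat = (d * (1.0 + q1) + 2.0 * u2 - 2.0 * q1) / (2.0 * (q2 - q1))
    return u_hat, m_hat, n_hat
```

A convenient check is to iterate Equation 3 to equilibrium with known u, m, and N and confirm that the function returns those values. Setting u = 0 and letting s grow large reduces the last line to the estimator of N in Equation 2.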

To verify the recursions in Equation 3, and thus that the estimators \({\hat{u}},{\hat{m}},\) and \({\hat{N}}\) based on Equation 4 work as intended, I simulated genotype data under the finite-island model with k-allele mutation for dioecious and monoecious (with no selfing) populations at temporal equilibrium over a range of migration rates (0.02, 0.05, 0.10, and 0.20) and effective population sizes (10, 20, and 50) at three different mutation rates (0.001, 0.0005, and 0.0001). These values for the mutation rate are consistent with values used in similar simulation studies (e.g., Vitalis and Couvet 2001; Wang and Whitlock 2003; Excoffier et al. 2005) and estimates from empirical data (Estoup et al. 2001; Lai and Sun 2003; Excoffier et al. 2005). Simulations were executed as described above for the infinite-island model: individual genotypes were tracked forward in time using a Monte Carlo implementation of the probability model defined by the life cycle using a premigration census for s demes. I numerically solved the analytical recursions in Equation 3 to identify a sufficient number of generations for the system to reach equilibrium to a precision of \(10^{-4}\) (5000 generations for u = 0.001, 6000 generations for u = 0.0005, and 15,000 generations for u = 0.0001). Models with smaller values of u and m and larger values of N require more generations to reach equilibrium. Means of the probabilities of identity calculated over replicate simulations are in close agreement with those calculated from the analytical recursions. The simulations were carried out using 30 independent eight-allele loci (with equally frequent alleles; hence, k = 8) at which 100 offspring were genotyped from each of 20 demes. Simulated data were combined over loci to calculate \({\hat{Q}}_{1(t)},{\hat{Q}}_{2(t)},\) and \({\hat{Q}}_{3(t)},\) and the estimators \({\hat{u}},{\hat{m}},\) and \({\hat{N}}\) defined by Equation 4 were calculated numerically using a nonlinear least-squares procedure in the software application MATLAB (The MathWorks). Estimates of u, m, and N were constrained according to 0 ≤ \({\hat{u}}\) ≤ 1, 0 ≤ \({\hat{m}}\) ≤ 1, and 2 ≤ \({\hat{N}}\) ≤ 1000, respectively, to obtain realistic estimates and to account for the possibility of infinite-valued estimates of N (cf. Waples 1989; Williamson and Slatkin 1999; Wang and Whitlock 2003).
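The burn-in calculation for the finite-island model can be sketched in the same spirit: iterate Equation 3 from Q1 = Q2 = Q3 = 1/k, locate the fixed point, and count the generations needed to come within \(10^{-4}\) of it. The function names, the starting values, and the use of the largest absolute distance as the precision measure are assumptions made for illustration.

```python
def step_equation_3(q, u, m, n, s, k):
    """Advance (Q1, Q2, Q3) by one generation under Equation 3."""
    q1, q2, q3 = q
    m1 = (1.0 - m) ** 2 + m ** 2 / (s - 1)
    m2 = (1.0 - m1) / (s - 1)
    u1 = (1.0 - u) ** 2 + u ** 2 / (k - 1)
    u2 = (1.0 - u1) / (k - 1)
    d = u1 - u2
    mix1 = m1 * q2 + (1.0 - m1) * q3     # M1-weighted mix in the Q1 and Q2 equations
    mix2 = m2 * q2 + (1.0 - m2) * q3     # M2-weighted mix in the Q3 equation
    new_q1 = u2 + d * mix1
    new_q2 = (u1 / (2.0 * n) + (1.0 - 1.0 / (2.0 * n)) * u2
              + d * q1 / (2.0 * n) + (1.0 - 1.0 / n) * d * mix1)
    new_q3 = u2 + d * mix2
    return new_q1, new_q2, new_q3


def equilibrium_equation_3(u, m, n, s, k, tol=1e-14, max_gen=1_000_000):
    """Iterate Equation 3 until successive generations differ by less than tol."""
    q = (1.0 / k,) * 3
    for _ in range(max_gen):
        new_q = step_equation_3(q, u, m, n, s, k)
        if max(abs(a - b) for a, b in zip(q, new_q)) < tol:
            return new_q
        q = new_q
    raise RuntimeError("recursion did not converge within max_gen generations")


def generations_to_equilibrium(u, m, n, s, k, precision=1e-4):
    """Generations needed for all three identities to be within precision of equilibrium."""
    q_star = equilibrium_equation_3(u, m, n, s, k)
    q = (1.0 / k,) * 3
    gen = 0
    while max(abs(a - b) for a, b in zip(q, q_star)) >= precision:
        q = step_equation_3(q, u, m, n, s, k)
        gen += 1
    return gen


print(generations_to_equilibrium(u=0.001, m=0.05, n=20, s=20, k=8))
```

Repeating the call across the grid of u, m, and N values indicates how the required burn-in grows as u and m decrease and N increases.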

The simulations of the finite-island model show, given sufficient data collected using a premigration census, that estimates of migration rate and effective population size using Equation 4 are close to their true values for both dioecious (Figure 2, A–C; Table 2) and monoecious populations (Figure 2D; Table 2). The precision of the estimates of N decreases with increasing values of N, and the precision of the estimates of m decreases with increasing values of m and N. The precision of the estimates of m and N decreases with lower mutation rates. Estimates of mutation rate, given sufficient data, are similarly close to their true values (Table 2). The precision of the estimates of u decreases with decreasing values of u and increasing values of N. Additional simulation results are in supplemental Table S2 at http://www.genetics.org/supplemental/.

Figure 2.—

Finite-island model: estimates of migration rate and effective population size. (A) Estimates of migration rate, m, and effective population size, N, under the finite-island model for dioecious populations with u = 0.001. (B) Estimates of migration rate, m, and effective population size, N, under the finite-island model for dioecious populations with u = 0.0005. (C) Estimates of migration rate, m, and effective population size, N, under the finite-island model for dioecious populations with u = 0.0001. (D) Estimates of migration rate, m, and effective population size, N, under the finite-island model for monoecious populations with u = 0.0005. The medians (open circles) and interquartile ranges (error bars) of estimates of migration rate and effective population size from 400 replicate simulations are plotted for each pair of model values of migration rate and effective population size. Solid circles denote parameter values used to simulate the data.

TABLE 2

Medians (5th, 95th percentiles) of the estimates of mutation rate, migration rate, and effective population size using genotype data simulated under the finite-island model with the k-allele mutation model at different values of mutation rate, u, migration rate, m, and effective population size, N, based on 400 replicate simulations for each pair of model migration rate and effective population size parameters for dioecious (u = 0.001, u = 0.0005, and u = 0.0001) and monoecious (u = 0.0005) model populations

N and m are the parameter values used in the simulations; the remaining columns are parameter estimates.

Dioecious populations: k-allele mutation model, u = 0.001

| N | m | \({\hat{u}}{\times}10^{3}\) | \({\hat{m}}\) | \({\hat{N}}\) |
|---|---|---|---|---|
| 10 | 0.02 | 0.993 (0.693, 1.343) | 0.019 (0.013, 0.027) | 9.97 (8.36, 12.53) |
| 10 | 0.20 | 1.003 (0.716, 1.425) | 0.201 (0.155, 0.260) | 10.02 (8.60, 11.80) |
| 20 | 0.02 | 1.005 (0.675, 1.348) | 0.020 (0.014, 0.026) | 20.09 (15.83, 26.87) |
| 20 | 0.20 | 1.005 (0.732, 1.371) | 0.202 (0.152, 0.257) | 19.92 (16.72, 24.95) |
| 50 | 0.02 | 0.989 (0.523, 1.507) | 0.019 (0.011, 0.029) | 50.83 (35.17, 88.58) |
| 50 | 0.20 | 0.971 (0.569, 1.524) | 0.196 (0.112, 0.305) | 50.82 (35.10, 83.54) |

Dioecious populations: k-allele mutation model, u = 0.0005

| N | m | \({\hat{u}}{\times}10^{4}\) | \({\hat{m}}\) | \({\hat{N}}\) |
|---|---|---|---|---|
| 10 | 0.02 | 4.937 (3.299, 7.462) | 0.019 (0.014, 0.028) | 10.09 (8.12, 12.76) |
| 10 | 0.20 | 5.100 (3.287, 7.125) | 0.200 (0.144, 0.270) | 9.89 (8.50, 12.52) |
| 20 | 0.02 | 5.038 (3.355, 7.123) | 0.020 (0.014, 0.026) | 19.97 (15.30, 27.83) |
| 20 | 0.20 | 4.914 (3.413, 7.067) | 0.198 (0.145, 0.265) | 19.86 (16.50, 25.72) |
| 50 | 0.02 | 5.018 (2.529, 8.093) | 0.020 (0.010, 0.030) | 49.36 (33.45, 98.02) |
| 50 | 0.20 | 5.034 (2.521, 7.748) | 0.199 (0.098, 0.314) | 49.98 (34.53, 92.92) |

Dioecious populations: k-allele mutation model, u = 0.0001

| N | m | \({\hat{u}}{\times}10^{4}\) | \({\hat{m}}\) | \({\hat{N}}\) |
|---|---|---|---|---|
| 10 | 0.02 | 0.970 (0.423, 1.677) | 0.019 (0.010, 0.028) | 9.94 (7.59, 16.63) |
| 10 | 0.20 | 0.962 (0.354, 1.734) | 0.200 (0.105, 0.327) | 9.97 (7.47, 15.78) |
| 20 | 0.02 | 0.977 (0.444, 1.610) | 0.019 (0.010, 0.030) | 19.68 (13.78, 35.61) |
| 20 | 0.20 | 1.025 (0.446, 1.824) | 0.198 (0.106, 0.339) | 19.92 (13.19, 35.27) |
| 50 | 0.02 | 0.919 (0.096, 1.809) | 0.018 (0.002, 0.034) | 53.17 (29.94, 447.1) |
| 50 | 0.20 | 0.933 (0.250, 1.736) | 0.191 (0.056, 0.374) | 51.81 (31.04, 172.7) |

Monoecious populations: k-allele mutation model, u = 0.0005

| N | m | \({\hat{u}}{\times}10^{4}\) | \({\hat{m}}\) | \({\hat{N}}\) |
|---|---|---|---|---|
| 10 | 0.02 | 4.965 (3.299, 6.878) | 0.019 (0.014, 0.026) | 10.07 (8.69, 12.34) |
| 10 | 0.20 | 5.046 (3.356, 7.129) | 0.198 (0.151, 0.260) | 10.08 (8.77, 11.84) |
| 20 | 0.02 | 4.985 (3.266, 7.065) | 0.020 (0.013, 0.027) | 19.83 (15.65, 27.23) |
| 20 | 0.20 | 4.836 (3.499, 7.004) | 0.196 (0.143, 0.266) | 20.17 (16.13, 26.19) |
| 50 | 0.02 | 5.108 (2.216, 7.818) | 0.020 (0.009, 0.030) | 49.68 (33.53, 110.1) |
| 50 | 0.20 | 4.863 (2.877, 7.566) | 0.201 (0.113, 0.309) | 50.55 (35.33, 84.27) |


Examples of the sampling distributions of the parameter estimates are shown in Figure 3 for dioecious populations with u = 0.0005, m = 0.05, and N = 20 for samples of 100 genotypes from 20 demes genotyped at 30 eight-allele loci. The distributions of \({\hat{u}}\) and \({\hat{m}}\) are approximately symmetrical (Figure 3, A and B), and the distribution of \({\hat{N}}\) is positively skewed (Figure 3C). The estimates of m and N are strongly negatively correlated and fall along the line defined by \({\hat{m}}{\hat{N}}\) = mN (Figure 3D). The estimates of u and m are positively correlated, and the estimates of u and N are negatively correlated (the correlation between \({\hat{u}}\) and \({\hat{N}}\) is strong, but not as strong as that exhibited by \({\hat{m}}\) and \({\hat{N}}\)). These results suggest that the product parameter mN is well estimated, but that the individual estimates of u, m, and N are more difficult to identify precisely from data.

Figure 3.—

Sampling distributions of parameter estimates. (A) Sampling distribution of estimates of mutation rate, u. (B) Sampling distribution of estimates of migration rate, m. (C) Sampling distribution of estimates of effective population size, N. (D) Estimates of effective population size, N, graphed against estimates of migration rate, m. Estimates are from data simulated under the finite-island model for dioecious populations with k-allele mutation using parameters u = 0.0005, m = 0.05, and N = 20. Triangles denote values of the parameters.

In empirical situations, k, the number of possible alleles at a locus, might be expected to vary across loci. In this case, estimates of u, m, and N may be constructed from the moment equations defined by summing the left- and right-hand sides of Equation 4 over loci and setting the left-hand sum equal to the right-hand sum. Simulations of dioecious populations in the finite-island model (100 individuals genotyped from each of 20 demes) with k-allele mutation (u = 0.0005) for 30 loci having a random mixture of 4, 8, or 12 possible alleles show, given sufficient data collected using a premigration census, that estimates of mutation rate, migration rate, and effective population size are close to their true values and have similar properties to those calculated when all loci have the same number of possible allelic states (simulation results are in supplemental Table S3 at http://www.genetics.org/supplemental/).

ROBUSTNESS OF THE ESTIMATION PROCEDURES

The extent to which a statistical procedure yields useful parameter estimates typically depends on the appropriateness of the assumptions to a given data set. Accordingly, I assessed the properties of the parameter estimates when some of the assumptions of the models are violated.

I used the k-allele mutation model to describe the mutation process in the estimation procedure described above. The infinite-allele model (Kimura and Crow 1964) can be employed using Equation 4 by setting k to a large (and effectively infinite) value. An alternative mutation model that may be appropriate for some markers such as microsatellite sequences is the stepwise mutation model (Ohta and Kimura 1973). The stepwise mutation model assigns a length-based ordering to alleles and posits that mutation occurs between allelic states that are adjacent in the ordering. Accordingly, I applied the estimation procedure to data generated using the simulation approach described above for the finite-island model except that I implemented mutation according to two simple kinds of stepwise mutation models: a stepwise mutation model with no bounds on allele length (unbounded stepwise mutation model; Ohta and Kimura 1973) and a stepwise mutation model with lower and upper bounds on allele length (bounded stepwise mutation model; constrained to eight allelic states). I simulated genotype data under the finite-island model with dioecious populations over a range of migration rates (0.02, 0.05, 0.10, and 0.20) and effective population sizes (10, 20, and 50) at a mutation rate of u = 0.0005. In each generation, every gene mutates to an adjacent allele with probability u, moving to either neighboring allele with equal probability. In the bounded stepwise mutation model, genes at the lower bound mutate toward the upper bound and genes at the upper bound mutate toward the lower bound. The simulations were carried out using 30 independent eight-allele loci (with equally frequent alleles) at which 100 offspring were genotyped from each of 20 demes after 6000 generations. Simulated data were combined over loci to calculate \({\hat{Q}}_{1(t)}\), \({\hat{Q}}_{2(t)}\), and \({\hat{Q}}_{3(t)}\). I set k = 10,000 to mimic the infinite-allele mutation model for parameter estimation using data generated from the unbounded stepwise mutation model, and I set k = 8 for parameter estimation using data generated from the bounded stepwise mutation model.
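
The three mutation processes compared above can be sketched as per-gene update rules. The Python below is only an illustration under stated assumptions, not the simulation code used for the results: the function names are hypothetical, alleles are coded as integers, the k-allele rule assumes the usual convention that a mutating gene changes to one of the other k − 1 alleles with equal probability, and the bounded rule reads "mutate toward the other bound" as a forced one-step move inward whenever a gene at a bound mutates.

```python
import random


def mutate_k_allele(allele, k, u, rng):
    """k-allele model: with probability u, switch to one of the other k - 1 alleles."""
    if rng.random() >= u:
        return allele
    # Draw uniformly among the k - 1 states that differ from the current one.
    new = rng.randrange(k - 1)
    return new if new < allele else new + 1


def mutate_stepwise_unbounded(allele, u, rng):
    """Unbounded stepwise model: with probability u, move one step up or down."""
    if rng.random() >= u:
        return allele
    return allele + rng.choice((-1, 1))


def mutate_stepwise_bounded(allele, k, u, rng):
    """Bounded stepwise model on states 0..k-1 (assumed reading of the bounds)."""
    if rng.random() >= u:
        return allele
    if allele == 0:          # lower bound: the only adjacent state is above
        return 1
    if allele == k - 1:      # upper bound: the only adjacent state is below
        return k - 2
    return allele + rng.choice((-1, 1))
```

Passing a seeded `random.Random` instance as `rng` keeps replicate simulations reproducible; with k = 8 the bounded rule corresponds to the eight allelic states used in the simulations described above.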

The accuracy (when comparing the medians of replicate estimates to their parametric values) of estimates of migration rate and effective population size for data generated under the stepwise mutation models is similar to that observed under the k-allele mutation model (Figure 4; Table 3), despite the fact that the k-allele mutation model is the assumed mutation process in Equation 4. In particular, the medians of the estimates of m and N are close to their respective parametric values. The estimates of m and N for data generated under the stepwise mutation models (Figure 4; Table 3) exhibit levels of precision similar to, but slightly lower than, the estimates of those parameters for data generated under the equivalent k-allele mutation model (Figure 2B; Table 2). Hence, the estimates of m and N based on Equation 4 are somewhat robust to violations in the assumptions of the mutation model. In contrast, the estimates of mutation rate, when summarized using their medians, are negatively biased (Table 3), suggesting that the mutation rate estimates are sensitive to violations in the assumptions of the mutation model in the estimation procedure. Additional simulation results are in supplemental Table S4 at http://www.genetics.org/supplemental/.

Figure 4.—

Finite-island model: stepwise mutation models. (A) Estimates of migration rate, m, and effective population size, N, under the finite-island model with unbounded stepwise mutation for different values of the migration rate and effective population size parameters for dioecious populations. (B) Estimates of migration rate, m, and effective population size, N, under the finite-island model with bounded stepwise mutation for different values of the migration rate and effective population size parameters for dioecious populations. The medians (open circles) and interquartile ranges (error bars) of estimates of migration rate and effective population size from 400 replicate simulations are plotted for each pair of model values of migration rate and effective population size. Solid circles denote parameter values used to simulate the data.

TABLE 3

Medians (5th, 95th percentiles) of the estimates of mutation rate, migration rate, and effective population size using genotype data simulated under the finite-island model with stepwise mutation models (employing either unbounded or bounded ranges in allele lengths; u = 0.0005) at different values of migration rate, m, and effective population size, N, based on 400 replicate simulations for each pair of model migration rate and effective population size parameters for dioecious model populations


Columns N and m give the parameter values; the remaining columns give the parameter estimates.

Dioecious populations: unbounded stepwise mutation model, u = 0.0005

| N | m | \({\hat{u}}{\times}10^{4}\) | \({\hat{m}}\) | \({\hat{N}}\) |
|---|---|---|---|---|
| 10 | 0.02 | 3.712 (2.619, 5.420) | 0.020 (0.014, 0.028) | 10.00 (8.02, 13.15) |
| 10 | 0.20 | 4.189 (2.862, 6.108) | 0.197 (0.152, 0.268) | 10.06 (8.36, 12.26) |
| 20 | 0.02 | 3.600 (2.385, 4.965) | 0.020 (0.013, 0.028) | 19.79 (15.21, 28.11) |
| 20 | 0.20 | 3.798 (2.431, 5.489) | 0.202 (0.136, 0.278) | 19.91 (15.87, 28.06) |
| 50 | 0.02 | 2.898 (1.316, 4.757) | 0.020 (0.009, 0.031) | 51.23 (32.43, 107.5) |
| 50 | 0.20 | 3.066 (1.492, 4.841) | 0.198 (0.092, 0.333) | 50.63 (33.15, 108.2) |

Dioecious populations: bounded stepwise mutation model, u = 0.0005

| N | m | \({\hat{u}}{\times}10^{4}\) | \({\hat{m}}\) | \({\hat{N}}\) |
|---|---|---|---|---|
| 10 | 0.02 | 3.409 (2.254, 5.292) | 0.020 (0.013, 0.027) | 9.95 (7.88, 13.48) |
| 10 | 0.20 | 3.731 (2.335, 5.678) | 0.202 (0.145, 0.264) | 9.97 (8.31, 12.78) |
| 20 | 0.02 | 3.286 (1.967, 4.810) | 0.020 (0.013, 0.029) | 19.73 (14.78, 30.14) |
| 20 | 0.20 | 3.306 (1.968, 5.042) | 0.197 (0.129, 0.286) | 20.22 (15.31, 29.01) |
| 50 | 0.02 | 2.652 (1.183, 4.584) | 0.019 (0.008, 0.032) | 53.26 (32.71, 119.4) |
| 50 | 0.20 | 2.861 (1.224, 4.730) | 0.203 (0.092, 0.327) | 50.36 (33.12, 103.7) |


The assumption of temporal equilibrium in the probability of identity recursion equations is used to estimate mutation rate, migration rate, and effective population size. To assess this assumption, I applied the estimation procedure to genotype data simulated under nonequilibrium conditions using the infinite-island model with no mutation and the finite-island model with mutation for dioecious populations with parameter values m = 0.05 and N = 20. For the finite-island model, I simulated data under the k-allele mutation model, the unbounded stepwise mutation model, and the bounded stepwise mutation model using a mutation rate of u = 0.0001 (the value of the mutation rate requiring the most generations to reach temporal equilibrium). At these parameter values, the infinite-island model is very close to temporal equilibrium after 50 generations, whereas the finite-island model with k-allele mutation requires ∼10,000 generations to reach equilibrium to four decimal places. Diploid genotypes were initialized using random pairs of alleles, the life cycle was iterated for 10, 20, 50, 100, or 200 generations, and genotypes were then sampled prior to migration (100 offspring genotyped at 30 eight-allele loci in each of 20 demes).
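
The approach to temporal equilibrium in the infinite-island case can be explored by iterating the appendix recursions directly. The sketch below assumes the premigration-census recursions for the infinite-island model with no mutation, read as a time recursion (left-hand sides at t + 1, right-hand sides at t), and assumes a starting point Q1 = Q2 = Q3 = 1/8, which is what random pairing of eight equally frequent alleles would give. The function names are hypothetical, and a very long run is used as a stand-in for the equilibrium values.

```python
def step(q1, q2, q3, m, n):
    """Apply the premigration-census recursions once (infinite islands, no mutation)."""
    a = (1.0 - m) ** 2                     # the (1 - m)^2 term of the recursions
    mixed = a * q2 + (1.0 - a) * q3        # identity for genes drawn from two different adults
    q1_next = mixed
    q2_next = (1.0 + q1) / (2.0 * n) + (1.0 - 1.0 / n) * mixed
    return q1_next, q2_next, q3            # Q3 does not change in this model


def iterate(generations, q_start, m, n):
    """Start from Q1 = Q2 = Q3 = q_start and iterate for the given number of generations."""
    q1 = q2 = q3 = q_start
    for _ in range(generations):
        q1, q2, q3 = step(q1, q2, q3, m, n)
    return q1, q2
```

Comparing `iterate(t, 1/8, 0.05, 20)` for t = 10, 20, 50, 100, and 200 with `iterate(5000, 1/8, 0.05, 20)` gives a rough picture of how far each sampling time is from equilibrium under these assumptions.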

The accuracy of the estimates of mutation rate, migration rate, and effective population size, measured using the median of replicate estimates relative to their parametric values, increases as the number of generations increases from 10 to 200 for both the infinite- and the finite-island models (Figure 5). The estimates of u under the finite-island models are strongly positively biased at 10–200 generations (Figure 5A). In contrast, the estimates of m under the infinite-island model and the finite-island model with unbounded stepwise mutation are positively biased at 10 generations, but are nearly unbiased at ≥20 generations (Figure 5B). The estimates of m under the finite-island models with k-allele and bounded stepwise mutation models are negatively biased, but the bias is relatively small, especially at ≥100 generations (Figure 5B). The estimates of N exhibit the least bias among the estimated parameters: there is essentially no bias in the infinite-island model and the finite-island model with unbounded stepwise mutation at ≥10 generations, a small negative bias in the finite-island models with k-allele and bounded stepwise mutation models at 10 and 20 generations, and very little bias at ≥50 generations (Figure 5C). The precision (measured via 5th and 95th percentiles in Figure 5) of the estimates of m and N is high relative to estimates of those parameters under equilibrium conditions (see supplemental Table S2 at http://www.genetics.org/supplemental/: u = 0.0001, m = 0.05, N = 20). Hence, the estimation procedures do not require equilibrium conditions to provide reasonable estimates of migration rate and effective population size, and the nonequilibrium conditions explored here actually increase the precision of these estimates. In contrast, the nonequilibrium conditions examined here result in inaccurate estimates of the mutation rate. Simulations suggest that at least 4000 generations under the finite-island model with k-allele mutation are required to obtain accurate estimates of the mutation rate when u = 0.0001, m = 0.05, and N = 20.

Figure 5.—

Estimation under nonequilibrium conditions. (A) Estimates of mutation rate, u, (on the log scale) as a function of number of generations using data generated under finite-island models with different mutation models for dioecious populations. (B) Estimates of migration rate, m, as a function of number of generations using data generated under the infinite-island model with no mutation and finite-island models with different mutation models for dioecious populations. (C) Estimates of effective population size, N, as a function of number of generations using data generated under the infinite-island model with no mutation and finite-island models with different mutation models for dioecious populations. The medians (symbols) and 5th and 95th percentiles (error bars) of estimates from 400 replicate simulations are plotted for each model at generations 10, 20, 50, 100, and 200. Dashed lines denote the parameter values u = 0.0001, m = 0.05, and N = 20. Symbols have been offset along the x-axis to improve clarity.

The estimation of parameters using Equation 4 assumes that k, the number of possible allelic states at a locus, is known. In practice, k might be estimated using the total number of alleles observed in all of the data for each locus, and there would be uncertainty in its value. Accordingly, using data generated under the finite-island models (100 offspring genotyped at 30 eight-allele loci in each of 20 demes after 6000 generations; hence k = 8) with k-allele mutation, unbounded stepwise mutation, and bounded stepwise mutation (u = 0.0005, m = 0.05, N = 20), I calculated estimates of mutation rate, migration rate, and effective population size using values for k in Equation 4 equal to 2, 4, 8, 16, and 100.

The medians of the estimates of migration rate and effective population size are close to their parametric values over the range of assumed values of k (2, 4, 8, 16, and 100), but the medians of the estimates of mutation rate deviate from the parametric values for most of the assumed values of k, with most cases exhibiting negative bias. The precision (measured via 5th and 95th percentiles) of the estimates of m and N is similar over the range of assumed values of k with the exception that estimates of m are slightly less precise for k = 2 under the k-allele mutation model. Estimates of u for k = 2 are quite variable and some are very near zero; otherwise the precision of the estimates of u is similar across the different assumed values of k. Hence, estimates of m and N are robust to uncertainty in the value of k, and, in contrast, estimates of u are more sensitive to deviations from the parametric value of k.

DISCUSSION

Population geneticists have actively studied the idea that demographic parameters such as migration rate and effective population size might be estimable from genetic data (e.g., Slatkin 1985; Waples 1989; Pudovkin  et al. 1996; Beerli and Felsenstein 2001; Vitalis and Couvet 2001; Wang and Whitlock 2003; Robledo-Arnuncio  et al. 2006). Using the classic island model (Wright 1951), I report that migration rate and effective population size can be jointly estimated from probabilities of identity using neutral markers in dioecious or monoecious populations when offspring genotypes are collected prior to migration from a single generation at a single point in time. The life cycle and sampling model are appropriate for highly fecund organisms with localized mating, including species of invertebrates, amphibians, fishes, and plants; hence the method has the potential for broad taxonomic utility.

The estimation procedure works because, in a dioecious population (or a monoecious population with no selfing) in which offspring genotypes are sampled prior to migration, Q1(t) ≠ Q2(t). This inequality provides additional information that is not available for other mating systems and sampling schemes that result in Q1(t) = Q2(t) (e.g., random selfing resulting in random pairing of gametes from all adults during mating, including pairing of gametes from the same adult; Maruyama 1970; Nei and Feldman 1972; Nagylaki 1983; Crow and Aoki 1984; Epperson 1999) or Q1(t) very nearly equal to Q2(t) (e.g., the finite-island model under a postmigration census). Previous studies in population ecology (e.g., Caswell 2001) and genetics (e.g., Nagylaki 1983; Waples 1989; Vitalis 2002) have recognized that the mating system and/or timing of sampling can affect the interpretation of demographic quantities, but the application of these ideas to the present scenario of joint estimation of migration rate and effective population size using a sample from a single generation collected at a single point in time seems not to have been analyzed in prior investigations. Several studies have examined recursions for probabilities of identity in state that are similar to those used here, but these studies do not identify the estimation procedures developed here. In the appendix, I outline how various recursions for probabilities of identity that have been studied (Maruyama 1970; Maynard Smith 1970; Nei and Feldman 1972; Nagylaki 1983; Crow and Aoki 1984; Epperson 1999; Vitalis and Couvet 2001; Vitalis 2002; Balloux et al. 2003) can be derived under the pre- and postmigration census schemes, thus helping to explain the different forms of these equations that occur in the literature.
Indeed, Vitalis and Couvet (2001) estimate m and N using probabilities of identity, and their Equation 5 with no selfing is equivalent to the infinite-island model with k-allele mutation under a premigration census as defined here; but Vitalis and Couvet (2001) assume an infinite-island model with mutation among an infinite number of alleles, and they use the approximation \(F_{\mathrm{ST}}{\approx}1/(1{+}4mN)\) along with a two-locus identity measure (a fourth-moment quantity) rather than the single-locus quantities FIT and FST or Q1(t), Q2(t), and Q3(t) (all second-moment quantities).

Like other genetic methods for estimating demographic parameters (e.g., Waples 1989; Pudovkin et al. 1996; Williamson and Slatkin 1999; Beerli and Felsenstein 2001; Vitalis and Couvet 2001; Wang and Whitlock 2003), the procedure described here will typically require considerable data to recover accurate and precise estimates. The accuracy and precision achieved will, in general, depend on several factors, including the values of the parameters u, m, and N, as well as the number of demes, number of individuals genotyped, number of loci used, and the appropriateness of the model to the empirical system under investigation. Previous studies (Waples 1989; Pudovkin et al. 1996; Williamson and Slatkin 1999; Vitalis and Couvet 2001; Wang and Whitlock 2003) have shown that effective population size is more difficult to estimate as N increases, and my simulation results are consistent with the findings in these earlier studies. Indeed, my results suggest that Q1(t) and Q2(t) become increasingly similar as N increases, with the consequence that estimates of N approach infinity as \({\hat{Q}}_{1(t)}\) and \({\hat{Q}}_{2(t)}\) become equal. Further, negative estimates of N are possible with Equation 2 if \({\hat{Q}}_{1(t)}\) > \({\hat{Q}}_{2(t)}\). Because effective population size influences genetic quantities via the function 1/N in standard models, it is not surprising that larger population sizes are more difficult to estimate with genetic data because this involves estimating an effect of magnitude ∼1/N, a small number for even moderately sized N. Hence, genetic-based methods for estimating effective population size work best for small populations, and the numbers of loci and individuals genotyped must increase with increasing N to maintain a given level of precision (Waples 1989; Pudovkin et al. 1996; Williamson and Slatkin 1999; Vitalis and Couvet 2001; Wang and Whitlock 2003). Accordingly, the method presented here is most likely to be useful for populations with a metapopulation structure defined by many small demes. Detailed guidance on the accuracy and precision of the estimators for specific empirical scenarios can be obtained using simulations. Source code used to simulate data under the infinite-island model with no mutation and the finite-island model with k-allele mutation is available at http://www.genetics.org/supplemental/.
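
The dependence of the estimate of N on the difference between the two identity probabilities can be made explicit in the simplest case. As an illustrative rearrangement, assume the infinite-island model with no mutation under a premigration census and take the equilibrium recursions given in the appendix; the result below is a sketch under those assumptions and is not claimed to be the exact form of Equation 2. Substituting the recursion for Q1(t) into the recursion for Q2(t) gives
\begin{eqnarray*}&&Q_{2(t)}=\frac{1}{2N}(1+Q_{1(t)})+\left(1-\frac{1}{N}\right)Q_{1(t)}\\&&Q_{2(t)}-Q_{1(t)}=\frac{1-Q_{1(t)}}{2N}\\&&N=\frac{1-Q_{1(t)}}{2(Q_{2(t)}-Q_{1(t)})}.\end{eqnarray*}
Because 1 − Q1(t) cannot exceed 1, the difference Q2(t) − Q1(t) is at most 1/(2N), which is 0.01 when N = 50. Sampling error of similar magnitude in the estimated identity probabilities can therefore drive the denominator toward zero, where the estimate of N diverges, or make it negative.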

The simulation results suggest that the estimates of migration rate and effective population size are somewhat robust to violations of the model assumptions. Reasonable estimates of m and N can be obtained for loci exhibiting stepwise mutation (Ohta and Kimura 1973), under nonequilibrium conditions, or if the number of possible allelic states is not precisely known. In contrast, estimates of the mutation rate, u, are sensitive to violations in the assumptions and can be quite biased in these settings. The precision of the estimates of m and N is higher for data simulated with a high mutation rate and for data simulated under nonequilibrium conditions, suggesting that the procedure works better at higher levels of genetic diversity. The estimation procedure under the finite-island model requires that the number of demes, s, be known, but it does not require that all demes be sampled because all demes are identical in island models. In non-island models, the set of demes that is exchanging migrants generally must be known to estimate migration rates among the demes (Beerli and Felsenstein 2001; Wang and Whitlock 2003; Slatkin 2005).

Many studies investigate the estimation of demographic parameters from genetic data (Slatkin 1985; Waples 1989; Pudovkin  et al. 1996; Wang and Whitlock 2003; Robledo-Arnuncio  et al. 2006); however, few methods exist for jointly estimating parameters like migration rate and effective population size from genetic data collected from a sample taken from a single generation at a single point in time (Beerli and Felsenstein 1999, 2001; Vitalis and Couvet 2001). For example, the product parameter mN can be estimated from single-generation data on FST under the infinite-island model (Slatkin 1985), and effective population size alone can be estimated from multiple samples on allele frequencies from two or more generations (Waples 1989) or from a single sample of offspring assuming unrelated parents using heterozygote excess (Pudovkin  et al. 1996). Vitalis (2002) and Fontanillas  et al. (2004) use two samples (both pre- and postmigration samples) to estimate migration rate alone using F-statistics under the infinite-island model with sex-specific dispersal. Extending the idea in Waples (1989), if allele frequency data are available from multiple samples from multiple generations from two or more demes, then migration rate and effective population size can be jointly estimated (Wang and Whitlock 2003). Using data from a single generation, the method of Beerli and Felsenstein (2001) estimates the deme-specific product parameters 4uN and m/u for s demes under a general migration scheme under the assumption that effective population size is sufficiently large so that the coalescent model of genetic drift is appropriate and that m and u are sufficiently small so that the quantities mN and uN remain finite as N goes to infinity. 
In a two-deme version of their coalescent procedure, Beerli and Felsenstein (1999) initially estimate 4uN and m/u using moment estimators based on the probability-of-identity equations of Nei and Feldman (1972), which can be derived for randomly mating monoecious populations under a postmigration census scheme. The method of Vitalis and Couvet (2001) uses one- and two-locus probabilities of identity to estimate m and N under the infinite-island model with infinite-allele mutation and random selfing, assuming that u = 0 and m is sufficiently small so that the approximation \(F_{\mathrm{ST}}{\approx}1/(1{+}4mN)\) might be valid. In a somewhat different demographic scenario that tackles the same issues, if migration occurs via the dispersal of male gametes and genotype data are available from offspring and their mothers (e.g., pollen dispersal with genotype data from seeds and their mother plant), then the gamete dispersal curve can be estimated independently of effective population density by making use of probabilities of identity, and an approximate estimate of effective population density can also be calculated (Robledo-Arnuncio et al. 2006). The method of Robledo-Arnuncio et al. (2006) is nonequilibrium in the sense that it does not model mutation and it estimates dispersal in the most recent generation assuming that parents are unrelated. Under a demographic model of admixture of previously separated demes (vs. demes exhibiting continuous migration and drift), computationally intensive Bayesian procedures based on coalescent models have been used to estimate demographic parameters (e.g., the admixture proportion) that are consistent with a set of observed summary statistics, including estimates of FST (Estoup et al. 2001; Excoffier et al. 2005). Under the standard coalescent model, only the product parameter uN is estimable unless additional information on u (or N) is available (Estoup et al. 2001) or the sampling scheme and demography mimic samples taken from the same deme over different generations (cf. Waples 1989; Excoffier et al. 2005). The results from these studies illustrate the challenges of estimating demographic parameters from genetic data.

The method I describe here requires only a sample from a single generation at a single point in time; it can jointly estimate mutation rate, migration rate, and effective population size; it is relatively simple computationally and, given the parametric model, need not make assumptions concerning the values of parameters that might be estimated; but, at present, it has not been developed to accommodate more general demographic situations. However, it may be possible to extend the method to include other demographic and genetic scenarios, such as a time series of samples (Wang and Whitlock 2003), stepping-stone dispersal, more general migration models (e.g., Beerli and Felsenstein 2001), deme-specific effective population sizes, and other mutation models (e.g., Lai and Sun 2003). More general forms of the model can lead to additional (but still linear) recursions for the probabilities of identity in state, but if the probability of identity within individuals remains different from the probability of identity among individuals within demes in these more general settings, then information may be available to jointly estimate migration rates and effective population sizes in more detailed models.

APPENDIX

I consider parametric expressions involving Q1(t), Q2(t), and Q3(t), the probabilities of identity in allelic state within one selectively neutral locus at time t, in an s-deme finite-island model with genetic drift, migration, and mutation. I adopt part of the derivation strategy of Nagylaki (1983) and outline the life cycle for the models that I consider. Starting with N diploid, monoecious adults in each of s demes, reproduction begins with each adult producing a large (i.e., infinite) number of haploid gametes. The allele in each gamete may then mutate into a different allele according to a general mutation model. The reproduction phase is completed by the random pairing of gametes from different adults within demes; no offspring are produced using two gametes from the same adult (i.e., no selfing occurs). Offspring then migrate among demes so that, following migration, a fraction m of the individuals in a deme are migrants and a fraction 1 − m of the individuals are residents. Population regulation completes the life cycle with N offspring chosen at random within each deme to compose the adults that will produce the next generation. The life cycle just described is the diploid dispersion life cycle considered by Nagylaki (1983). The equilibrium results that follow also apply to dioecious populations when males and females are equal in number (the populations within demes are regulated to N/2 females and N/2 males so that the total adult effective population size is N), migrate at the same rate, and experience the same mutation model.
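
The life cycle just described can be sketched as one generation of a simulation. The Python below is a minimal illustration under several assumptions and is not the source code referred to in the DISCUSSION: the function name and arguments are hypothetical, populations are monoecious, the infinite gamete pool is replaced by a finite number of offspring per deme (`n_offspring`), each adult slot in the next generation is filled by a migrant with probability m rather than by fixing the migrant fraction exactly, and mutation is supplied by the caller (the default leaves alleles unchanged).

```python
import random


def next_generation(demes, m, n_offspring, rng, mutate=lambda allele, rng: allele):
    """Advance one generation of an s-deme island model.

    demes is a list of s lists, each holding N diploid adults as (allele, allele)
    tuples. Returns (offspring, new_adults), where offspring is the premigration
    census from which genotypes would be sampled.
    """
    s = len(demes)
    n = len(demes[0])

    # Reproduction: each offspring receives one gamete from each of two
    # different adults in the same deme, so no selfing occurs.
    offspring = []
    for adults in demes:
        brood = []
        for _ in range(n_offspring):
            parent_a, parent_b = rng.sample(adults, 2)
            brood.append((mutate(rng.choice(parent_a), rng),
                          mutate(rng.choice(parent_b), rng)))
        offspring.append(brood)

    # Migration and population regulation: each of the N adult slots is filled
    # from another deme (chosen with equal probability) with probability m,
    # and from the resident offspring otherwise.
    new_adults = []
    for i in range(s):
        others = [j for j in range(s) if j != i]
        adults = []
        for _ in range(n):
            source = rng.choice(others) if rng.random() < m else i
            adults.append(rng.choice(offspring[source]))
        new_adults.append(adults)
    return offspring, new_adults
```

Iterating this function and recording `offspring` at the final generation corresponds to the premigration census; recording genotypes after the migration step instead would correspond to a postmigration census.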

Because demographic and genetic measures can depend on the timing of sampling within the life cycle (e.g., Waples 1989; Caswell 2001; Vitalis 2002), I consider two census schemes, a premigration census and a postmigration census. Assuming pre- and postmigration census schemes for dioecious populations with sex-specific dispersal following the infinite-allele mutation model in a finite number of demes, Vitalis (2002) gives recursions for probabilities of identity by descent (premigration, Equation A1.4; postmigration, Equation A1.1; Vitalis 2002) that can be readily modified to obtain the results that follow. Note that the second and third columns in the matrix A following Equation A1.1 in Vitalis (2002) should have terms like (1 − 2/N) consistent with Equation 4 in that article, rather than terms like (1 − 1/N).

First, I consider the premigration census under the infinite-island model. Under the premigration census, the sampling of offspring occurs immediately following reproduction and prior to migration. Let Q1(t), Q2(t), and Q3(t) be the probabilities (summed over k alleles at one locus) that two genes sampled at time t are the same allele when they are drawn from within an individual, from different individuals within a deme, and from individuals in different demes, respectively. Under the premigration census scheme, Equation A1.4 of Vitalis (2002) modified for an infinite number of demes with no mutation and no sex-specific dispersal yields, at temporal equilibrium, the recursions
\begin{eqnarray*}&&Q_{1(t{+}1)}{=}Q_{1(t)}{=}(1{-}m)^{2}Q_{2(t)}{+}{[}1{-}(1{-}m)^{2}{]}Q_{3(t)}\\&&Q_{2(t{+}1)}{=}Q_{2(t)}{=}\frac{1}{2N}(1{+}Q_{1(t)}){+}(1{-}m)^{2}\left(1{-}\frac{1}{N}\right)Q_{2(t)}{+}{[}1{-}(1{-}m)^{2}{]}\left(1{-}\frac{1}{N}\right)Q_{3(t)}\\&&Q_{3(t{+}1)}{=}Q_{3(t)}{=}Q_{3(t)}.\end{eqnarray*}
Solving for Q1(t) and Q2(t) yields
\begin{eqnarray*}&&Q_{1(t)}{=}\frac{(1{-}m)^{2}}{1{-}(1{-}m)^{2}(1{-}(1/2N))}\left(\frac{1}{2N}\right){+}\frac{{[}1{-}(1{-}m)^{2}{]}}{1{-}(1{-}m)^{2}(1{-}(1/2N))}Q_{3(t)}\\&&Q_{2(t)}{=}\frac{1}{1{-}(1{-}m)^{2}(1{-}(1/2N))}\left(\frac{1}{2N}\right){+}\frac{{[}1{-}(1{-}m)^{2}{]}}{1{-}(1{-}m)^{2}(1{-}(1/2N))}\left(1{-}\frac{1}{2N}\right)Q_{3(t)}.\end{eqnarray*}
Accordingly, the parameters FIT and FST (Wright 1951) are given by
\begin{eqnarray*}&&F_{\mathrm{IT}}{=}\frac{Q_{1}{-}Q_{3}}{1{-}Q_{3}}{=}\frac{(1{-}m)^{2}}{{[}1{-}(1{-}m)^{2}{]}2N{+}(1{-}m)^{2}}{\approx}\frac{(1{-}2m)}{(1{-}2m){+}4mN}\\&&F_{\mathrm{ST}}{=}\frac{Q_{2}{-}Q_{3}}{1{-}Q_{3}}{=}\frac{1}{{[}1{-}(1{-}m)^{2}{]}2N{+}(1{-}m)^{2}}{\approx}\frac{1}{(1{-}2m){+}4mN},\end{eqnarray*}
where the approximation omits terms proportional to \(m^{2}\). Equation 14 in Vitalis (2002), assuming no mutation, an infinite number of demes, and no sex-specific dispersal, is the same as the equation for FST given above, but Vitalis (2002) does not report an expression for FIT. Thus, migration rate, m, and effective population size, N, can be expressed exactly in terms of FIT and FST via
\begin{eqnarray*}&&m{=}1{-}\sqrt{\frac{F_{\mathrm{IT}}}{F_{\mathrm{ST}}}}{=}1{-}\sqrt{\frac{Q_{1}{-}Q_{3}}{Q_{2}{-}Q_{3}}}\\&&N{=}\frac{1{-}F_{\mathrm{IT}}}{2(F_{\mathrm{ST}}{-}F_{\mathrm{IT}})}{=}\frac{1{-}Q_{1}}{2(Q_{2}{-}Q_{1})}.\end{eqnarray*}
Equation 16 in Vitalis (2002) gives an expression for sex-specific dispersal rates similar to the expression for m given here, but, importantly, the former is a function of FST from a sequence of samples taken pre- and postmigration, rather than a function of FIT and FST for a single sample as given here. Vitalis (2002) does not report an expression for N.
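As a numerical illustration of the premigration expressions above, the following Python sketch computes the exact equilibrium FIT and FST for chosen values of m and N and then inverts them with the two estimator formulas. It is a minimal sketch under the infinite-island, no-mutation assumptions of this section; the function names and the parameter values are hypothetical and are not taken from the original analysis.

```python
import math

def f_stats_premigration(m, N):
    """Exact equilibrium (F_IT, F_ST) under the premigration census,
    infinite-island model, no mutation."""
    a = (1.0 - m) ** 2                  # (1 - m)^2
    denom = (1.0 - a) * 2.0 * N + a     # [1 - (1 - m)^2] 2N + (1 - m)^2
    return a / denom, 1.0 / denom

def estimate_m_N(f_it, f_st):
    """Invert the expressions above:
    m = 1 - sqrt(F_IT / F_ST),  N = (1 - F_IT) / (2 (F_ST - F_IT))."""
    if not (0.0 < f_it < f_st):
        raise ValueError("the inversion requires 0 < F_IT < F_ST")
    m = 1.0 - math.sqrt(f_it / f_st)
    N = (1.0 - f_it) / (2.0 * (f_st - f_it))
    return m, N

# Illustrative (arbitrary) parameter values.
f_it, f_st = f_stats_premigration(m=0.05, N=100)
m_hat, N_hat = estimate_m_N(f_it, f_st)
print(f_it, f_st, m_hat, N_hat)
```

With parametric values of FIT and FST the inversion recovers m and N up to floating-point error; with sample estimates, FIT may exceed FST by chance, which is why the sketch rejects such inputs rather than returning a negative or undefined value.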
Under the postmigration census scheme, Equation A1.1 of Vitalis (2002) modified for an infinite number of demes with no mutation and no sex-specific dispersal yields, at temporal equilibrium, the recursions
\begin{eqnarray*}&&Q_{1(t{+}1)}{=}Q_{1(t)}{=}Q_{2(t)}\\&&Q_{2(t{+}1)}{=}Q_{2(t)}{=}(1{-}m)^{2}\frac{1}{2N}(1{+}Q_{1(t)}){+}(1{-}m)^{2}\left(1{-}\frac{1}{N}\right)Q_{2(t)}{+}{[}1{-}(1{-}m)^{2}{]}Q_{3(t)}\\&&Q_{3(t{+}1)}{=}Q_{3(t)}{=}Q_{3(t)}.\end{eqnarray*}
Solving for Q1(t) and Q2(t) gives
\[Q_{1(t)}{=}Q_{2(t)}{=}\frac{(1{-}m)^{2}}{1{-}(1{-}m)^{2}(1{-}(1/2N))}\left(\frac{1}{2N}\right){+}\frac{{[}1{-}(1{-}m)^{2}{]}}{1{-}(1{-}m)^{2}(1{-}(1/2N))}Q_{3(t)}.\]
In this case, because Q1(t) = Q2(t), the parameters FIT and FST are given by
\begin{eqnarray*}&&F_{\mathrm{IT}}{=}\frac{Q_{1}{-}Q_{3}}{1{-}Q_{3}}{=}F_{\mathrm{ST}}\\&&F_{\mathrm{ST}}{=}\frac{Q_{2}{-}Q_{3}}{1{-}Q_{3}}{=}\frac{(1{-}m)^{2}}{{[}1{-}(1{-}m)^{2}{]}2N{+}(1{-}m)^{2}}{\approx}\frac{(1{-}2m)}{(1{-}2m){+}4mN},\end{eqnarray*}
where the approximation omits terms proportional to \(m^{2}\). Equation 12 in Vitalis (2002), assuming no mutation and an infinite number of demes, is the same as the equation for FST given above, but Vitalis (2002) does not report an expression for FIT. Hence, unlike the premigration census, the parameters m and N cannot be uniquely determined from FIT and FST under a postmigration census.
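The lack of identifiability under the postmigration census can be illustrated numerically. The sketch below, which assumes the infinite-island, no-mutation expression for FST given above, finds several different (m, N) pairs that produce exactly the same FST (and therefore the same FIT). The function names are hypothetical, the parameter values are arbitrary, and N is treated as a real number purely for illustration.

```python
def f_st_postmigration(m, N):
    """Equilibrium F_ST (= F_IT) under the postmigration census,
    infinite-island model, no mutation."""
    a = (1.0 - m) ** 2
    return a / ((1.0 - a) * 2.0 * N + a)

def N_for_target(f_st, m):
    """Population size that reproduces a target F_ST for a chosen m,
    obtained by solving the expression above for N."""
    a = (1.0 - m) ** 2
    return a * (1.0 - f_st) / (2.0 * f_st * (1.0 - a))

# One target value of F_ST, generated from arbitrary parameter values.
target = f_st_postmigration(m=0.05, N=100)

# Different migration rates, each paired with the N that matches the target.
pairs = [(m, N_for_target(target, m)) for m in (0.01, 0.05, 0.20)]
recovered = [f_st_postmigration(m, N) for m, N in pairs]
print(pairs, recovered)
```

Every pair returns the same FST, so a single postmigration sample constrains only a one-dimensional curve in the (m, N) plane.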
Under the finite-island model with a k-allele mutation scheme, the number of demes, s, is finite, and each gene occupies one of k allelic states, mutates with probability u per generation, and, given a mutation event, mutates to each of the other k − 1 alleles with equal probability. Under a premigration census, modifying Equation A1.4 of Vitalis (2002) to assume the finite-island model with a k-allele mutation scheme with no sex-specific dispersal yields, at temporal equilibrium, the recursions
\begin{eqnarray*}&&Q_{1(t{+}1)}{=}Q_{1(t)}{=}U_{2}{+}(U_{1}{-}U_{2}){[}M_{1}Q_{2(t)}{+}(1{-}M_{1})Q_{3(t)}{]}\\&&Q_{2(t{+}1)}{=}Q_{2(t)}{=}\frac{U_{1}}{2N}{+}\left(1{-}\frac{1}{2N}\right)U_{2}{+}\frac{1}{2N}(U_{1}{-}U_{2})Q_{1(t)}\\&&{+}M_{1}\left(1{-}\frac{1}{N}\right)(U_{1}{-}U_{2})Q_{2(t)}{+}(1{-}M_{1})\left(1{-}\frac{1}{N}\right)(U_{1}{-}U_{2})Q_{3(t)}\\&&Q_{3(t{+}1)}{=}Q_{3(t)}{=}U_{2}{+}(U_{1}{-}U_{2}){[}M_{2}Q_{2(t)}{+}(1{-}M_{2})Q_{3(t)}{]},\end{eqnarray*}
where
\begin{eqnarray*}&&M_{1}{=}(1{-}m)^{2}{+}\frac{m^{2}}{s{-}1}\\&&M_{2}{=}\frac{1{-}M_{1}}{s{-}1}\\&&U_{1}{=}(1{-}u)^{2}{+}\frac{u^{2}}{k{-}1}\\&&U_{2}{=}\frac{1{-}U_{1}}{k{-}1}.\end{eqnarray*}
The above equations, being linear, can be solved explicitly for the equilibrium values of Q1(t), Q2(t), and Q3(t). However, because I could not identify a simple form for the resulting expressions, I do not list them here. Under a postmigration census, modifying Equation A1.1 of Vitalis (2002) to assume the finite-island model with a k-allele mutation scheme with no sex-specific dispersal yields, at temporal equilibrium, the recursions
\begin{eqnarray*}&&Q_{1(t{+}1)}{=}Q_{1(t)}{=}U_{2}{+}(U_{1}{-}U_{2})Q_{2(t)}\\&&Q_{2(t{+}1)}{=}Q_{2(t)}{=}M_{1}\left[\frac{U_{1}}{2N}{+}\left(1{-}\frac{1}{2N}\right)U_{2}\right]{+}(1{-}M_{1})U_{2}{+}\frac{M_{1}}{2N}(U_{1}{-}U_{2})Q_{1(t)}\\&&{+}M_{1}\left(1{-}\frac{1}{N}\right)(U_{1}{-}U_{2})Q_{2(t)}{+}(1{-}M_{1})(U_{1}{-}U_{2})Q_{3(t)}\\&&Q_{3(t{+}1)}{=}Q_{3(t)}{=}M_{2}\left[\frac{U_{1}}{2N}{+}\left(1{-}\frac{1}{2N}\right)U_{2}\right]{+}(1{-}M_{2})U_{2}{+}\frac{M_{2}}{2N}(U_{1}{-}U_{2})Q_{1(t)}\\&&{+}M_{2}\left(1{-}\frac{1}{N}\right)(U_{1}{-}U_{2})Q_{2(t)}{+}(1{-}M_{2})(U_{1}{-}U_{2})Q_{3(t)}.\end{eqnarray*}
Hence, at temporal equilibrium under a postmigration census, Q1(t) = U2 + (U1 − U2)Q2(t), which is nearly equal to Q2(t) because the mutation rate, u, is typically very small (so that U1 is close to 1 and U2 is close to 0). Thus, the joint estimation of u, m, and N under a postmigration census is expected to be difficult.
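Although no simple closed form is listed for the finite-island premigration equilibrium, the values are easy to obtain numerically. The sketch below iterates the premigration recursions with k-allele mutation to their fixed point, rather than solving the linear system algebraically. It is an illustration only: the function names, the starting point, the tolerance, and the parameter values are my own assumptions.

```python
def mutation_migration_terms(m, s, u, k):
    """M1, M2, U1, U2 as defined for the finite-island, k-allele model."""
    M1 = (1 - m) ** 2 + m ** 2 / (s - 1)
    M2 = (1 - M1) / (s - 1)
    U1 = (1 - u) ** 2 + u ** 2 / (k - 1)
    U2 = (1 - U1) / (k - 1)
    return M1, M2, U1, U2

def premigration_step(Q, m, N, s, u, k):
    """Apply the premigration recursions once to Q = (Q1, Q2, Q3)."""
    Q1, Q2, Q3 = Q
    M1, M2, U1, U2 = mutation_migration_terms(m, s, u, k)
    D = U1 - U2
    new_Q1 = U2 + D * (M1 * Q2 + (1 - M1) * Q3)
    new_Q2 = (U1 / (2 * N) + (1 - 1 / (2 * N)) * U2
              + D * Q1 / (2 * N)
              + (1 - 1 / N) * D * (M1 * Q2 + (1 - M1) * Q3))
    new_Q3 = U2 + D * (M2 * Q2 + (1 - M2) * Q3)
    return new_Q1, new_Q2, new_Q3

def premigration_equilibrium(m, N, s, u, k, tol=1e-14, max_iter=1_000_000):
    """Iterate the recursions until successive values differ by less than tol."""
    Q = (1.0, 1.0, 1.0)  # arbitrary starting point
    for _ in range(max_iter):
        new_Q = premigration_step(Q, m, N, s, u, k)
        if max(abs(a - b) for a, b in zip(new_Q, Q)) < tol:
            return new_Q
        Q = new_Q
    raise RuntimeError("recursions did not converge")

# Illustrative (arbitrary) parameter values.
params = dict(m=0.05, N=100, s=10, u=1e-3, k=10)
Q1, Q2, Q3 = premigration_equilibrium(**params)
print(Q1, Q2, Q3)
```

Under the premigration census the equilibrium values satisfy Q3 < Q1 < Q2 for these parameter values; the gap between Q1 and Q2 is the feature of the data that distinguishes m from N.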
In the case of dioecious populations (assuming the locus is not sex-linked), probabilities of identity within and between individuals must be specified for male and female pairs of genes so that \(Q_{1(t)}^{\mathrm{F}}\) and \(Q_{1(t)}^{\mathrm{M}}\) are the probabilities that genes are identical in state within female and male individuals; \(Q_{2(t)}^{\mathrm{FF}}\), \(Q_{2(t)}^{\mathrm{MM}}\), and \(Q_{2(t)}^{\mathrm{FM}}\) are the probabilities that genes are identical in state between two females, between two males, and between a female and a male within a deme; and \(Q_{3(t)}^{\mathrm{FF}}\), \(Q_{3(t)}^{\mathrm{MM}}\), and \(Q_{3(t)}^{\mathrm{FM}}\) are the probabilities that genes are identical in state between two females, between two males, and between a female and a male for individuals in different demes (e.g., Vitalis 2002). Under a premigration census, modifying Equation A1.4 of Vitalis (2002) to have the k-allele mutation model with no sex-specific dispersal yields, at temporal equilibrium, the dioecious population recursions
\begin{eqnarray*}&&Q_{1(t{+}1)}^{\mathrm{F}}{=}Q_{1(t{+}1)}^{\mathrm{M}}{=}U_{2}{+}M_{1}(U_{1}{-}U_{2})Q_{2(t)}^{\mathrm{FM}}{+}(1{-}M_{1})(U_{1}{-}U_{2})Q_{3(t)}^{\mathrm{FM}}\\&&Q_{2(t{+}1)}^{\mathrm{FF}}{=}Q_{2(t{+}1)}^{\mathrm{MM}}{=}Q_{2(t{+}1)}^{\mathrm{FM}}{=}\frac{U_{1}}{2N}{+}\left(1{-}\frac{1}{2N}\right)U_{2}{+}\frac{1}{2N}(U_{1}{-}U_{2})\left(\frac{1}{2}\right)(Q_{1(t)}^{\mathrm{F}}{+}Q_{1(t)}^{\mathrm{M}})\\&&{+}\frac{1}{2}{[}M_{1}(U_{1}{-}U_{2})Q_{2(t)}^{\mathrm{FM}}{+}(1{-}M_{1})(U_{1}{-}U_{2})Q_{3(t)}^{\mathrm{FM}}{]}\\&&{+}M_{1}\left(\frac{1}{4}{-}\frac{1}{2N}\right)(U_{1}{-}U_{2})(Q_{2(t)}^{\mathrm{FF}}{+}Q_{2(t)}^{\mathrm{MM}})\\&&{+}(1{-}M_{1})\left(\frac{1}{4}{-}\frac{1}{2N}\right)(U_{1}{-}U_{2})(Q_{3(t)}^{\mathrm{FF}}{+}Q_{3(t)}^{\mathrm{MM}})\\&&Q_{3(t{+}1)}^{\mathrm{FF}}{=}Q_{3(t{+}1)}^{\mathrm{MM}}{=}Q_{3(t{+}1)}^{\mathrm{FM}}{=}U_{2}{+}M_{2}\left(\frac{1}{4}\right)(U_{1}{-}U_{2})(Q_{2(t)}^{\mathrm{FF}}{+}Q_{2(t)}^{\mathrm{MM}}{+}2Q_{2(t)}^{\mathrm{FM}})\\&&{+}(1{-}M_{2})\left(\frac{1}{4}\right)(U_{1}{-}U_{2})(Q_{3(t)}^{\mathrm{FF}}{+}Q_{3(t)}^{\mathrm{MM}}{+}2Q_{3(t)}^{\mathrm{FM}}).\end{eqnarray*}
Hence, at temporal equilibrium the probabilities of identity for dioecious populations are identical to those in monoecious populations (with no selfing) when males and females have identical demography.
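The equivalence of the dioecious and monoecious equilibria can be checked numerically. The sketch below iterates the dioecious premigration recursions from a deliberately asymmetric starting point and compares the result with the equilibrium of the monoecious premigration recursions for the same parameters. The function names, the starting values, the tolerance, and the parameter values are hypothetical choices made for illustration.

```python
def terms(m, s, u, k):
    """M1, M2, U1, U2 for the finite-island, k-allele model."""
    M1 = (1 - m) ** 2 + m ** 2 / (s - 1)
    M2 = (1 - M1) / (s - 1)
    U1 = (1 - u) ** 2 + u ** 2 / (k - 1)
    U2 = (1 - U1) / (k - 1)
    return M1, M2, U1, U2

def monoecious_step(Q, m, N, s, u, k):
    """Monoecious (no selfing) premigration recursions for (Q1, Q2, Q3)."""
    Q1, Q2, Q3 = Q
    M1, M2, U1, U2 = terms(m, s, u, k)
    D = U1 - U2
    return (U2 + D * (M1 * Q2 + (1 - M1) * Q3),
            U1 / (2 * N) + (1 - 1 / (2 * N)) * U2 + D * Q1 / (2 * N)
            + (1 - 1 / N) * D * (M1 * Q2 + (1 - M1) * Q3),
            U2 + D * (M2 * Q2 + (1 - M2) * Q3))

def dioecious_step(q, m, N, s, u, k):
    """Dioecious premigration recursions.
    q = (Q1F, Q1M, Q2FF, Q2MM, Q2FM, Q3FF, Q3MM, Q3FM)."""
    q1f, q1m, q2ff, q2mm, q2fm, q3ff, q3mm, q3fm = q
    M1, M2, U1, U2 = terms(m, s, u, k)
    D = U1 - U2
    n1 = U2 + M1 * D * q2fm + (1 - M1) * D * q3fm
    n2 = (U1 / (2 * N) + (1 - 1 / (2 * N)) * U2
          + D / (2 * N) * 0.5 * (q1f + q1m)       # within-female and within-male identity
          + 0.5 * (M1 * D * q2fm + (1 - M1) * D * q3fm)
          + M1 * (0.25 - 1 / (2 * N)) * D * (q2ff + q2mm)
          + (1 - M1) * (0.25 - 1 / (2 * N)) * D * (q3ff + q3mm))
    n3 = (U2 + M2 * 0.25 * D * (q2ff + q2mm + 2 * q2fm)
          + (1 - M2) * 0.25 * D * (q3ff + q3mm + 2 * q3fm))
    return (n1, n1, n2, n2, n2, n3, n3, n3)

def iterate_to_equilibrium(step, state, tol=1e-14, max_iter=1_000_000):
    """Repeat step() until successive states differ by less than tol."""
    for _ in range(max_iter):
        new_state = step(state)
        if max(abs(a - b) for a, b in zip(new_state, state)) < tol:
            return new_state
        state = new_state
    raise RuntimeError("recursions did not converge")

# Illustrative (arbitrary) parameter values and starting points.
p = dict(m=0.05, N=100, s=10, u=1e-3, k=10)
mono = iterate_to_equilibrium(lambda Q: monoecious_step(Q, **p), (1.0, 1.0, 1.0))
dio = iterate_to_equilibrium(lambda q: dioecious_step(q, **p),
                             (0.9, 0.1, 0.8, 0.2, 0.5, 0.7, 0.3, 0.4))
expected = (mono[0],) * 2 + (mono[1],) * 3 + (mono[2],) * 3
max_diff = max(abs(a - b) for a, b in zip(dio, expected))
print(max_diff)
```

After a single generation the sex-specific probabilities within each class coincide, and from then on the dioecious recursions follow the monoecious ones, so the two equilibria agree up to the convergence tolerance.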

Probabilities of gene identity have been analyzed extensively in the population genetics literature. I briefly summarize previous results in the context of the models that I have presented here. Equation 2-1 of Maruyama (1970) and Equation 1 of Nei and Feldman (1972) can be derived in the present context by assuming a finite-island model, mutation among an infinite number of alleles, and monoecious populations with random mating [including random selfing; hence, Q1(t) = Q2(t)] under a postmigration census. Equation 78 of Nagylaki (1983) with zero selfing can be derived by assuming a finite-island model, mutation among an infinite number of alleles, and monoecious populations with no selfing under a postmigration census. Equation 5 of Crow and Aoki (1984) can be derived by assuming a finite-island model, mutation among k alleles, and monoecious populations with random mating [including random selfing; hence, Q1(t) = Q2(t)] under a postmigration census. Equation 2 of Epperson (1999) can be derived by assuming a finite-island model with general between-deme migration rates, no mutation, and monoecious populations with random mating under a postmigration census. Recursions for Q1(t) and Q3(t) [a one-generation recursion for Q2(t) is not presented] in Maynard Smith (1970) can be derived by assuming a finite-island model, mutation among an infinite number of alleles, and monoecious populations with no selfing under a premigration census. Equation 5 of Vitalis and Couvet (2001) with zero selfing can be derived by assuming an infinite-island model, mutation among k alleles, and monoecious populations with no selfing under a premigration census. Equations A1.4 and A1.1 of Vitalis (2002), assuming pre- and postmigration census schemes, respectively, can be derived assuming dioecious populations with sex-specific dispersal following the infinite-allele mutation model in a finite number of demes. Finally, the juvenile life stage recursions of Balloux et al. (2003) with no selfing and no clonal reproduction can be derived by assuming a finite-island model, mutation among an infinite number of alleles, and monoecious populations with no selfing under a premigration census. Although many previous studies have analyzed probabilities of gene identity, I am not aware of any study that has identified the connection between the census scheme and the procedures for estimating mutation rate, migration rate, and effective population size as I have outlined them here.

Footnotes

Communicating editor: R. Nielsen

Acknowledgement

I thank Mark Holder, Steve Hudman, John Kelly, Rasmus Nielsen, Bruce Weir, and two anonymous reviewers for assistance, conversations, and/or comments concerning this research. I acknowledge funding from the University of Kansas and the National Science Foundation (DEB 06-09722).

References

Balloux, F., L. Lehmann and T. de Meeûs, 2003 The population genetics of clonal and partially clonal diploids. Genetics 164: 1635–1644.

Beerli, P., and J. Felsenstein, 1999 Maximum-likelihood estimation of migration rates and effective population numbers in two populations using a coalescent approach. Genetics 152: 763–773.

Beerli, P., and J. Felsenstein, 2001 Maximum likelihood estimation of a migration matrix and effective population size in n subpopulations by using a coalescent approach. Proc. Natl. Acad. Sci. USA 98: 4563–4568.

Caswell, H., 2001 Matrix Population Models: Construction, Analysis, Interpretation. Sinauer Associates, Sunderland, MA.

Crow, J. F., and K. Aoki, 1984 Group selection for a polygenic behavioral trait: estimating the degree of population subdivision. Proc. Natl. Acad. Sci. USA 81: 6073–6077.

Epperson, B. K., 1999 Gene genealogies in geographically structured populations. Genetics 152: 797–806.

Estoup, A., I. J. Wilson, C. Sullivan, J. Cornuet and C. Moritz, 2001 Inferring population history from microsatellite and enzyme data in serially introduced cane toads, Bufo marinus. Genetics 159: 1671–1687.

Excoffier, L., A. Estoup and J. Cornuet, 2005 Bayesian analysis of an admixture model with mutations and arbitrarily linked markers. Genetics 169: 1727–1738.

Fontanillas, P., E. Petit and N. Perrin, 2004 Estimating sex-specific dispersal rates with autosomal markers in hierarchically structured populations. Evolution 58: 886–894.

Hänfling, B., and D. Weetman, 2006 Concordant genetic estimators of migration reveal anthropogenically enhanced source-sink population structure in the river sculpin, Cottus gobio. Genetics 173: 1487–1501.

Kimura, M., and J. Crow, 1964 The number of alleles that can be maintained in a finite population. Genetics 49: 725–738.

Lai, Y., and F. Sun, 2003 The relationship between microsatellite slippage mutation rate and the number of repeat units. Mol. Biol. Evol. 20: 2123–2131.

Maruyama, T., 1970 Effective number of alleles in a subdivided population. Theor. Popul. Biol. 1: 273–306.

Maynard Smith, J., 1970 Population size, polymorphism, and the rate of non-Darwinian evolution. Am. Nat. 104: 231–237.

Nagylaki, T., 1983 The robustness of neutral models of geographic variation. Theor. Popul. Biol. 24: 268–294.

Nei, M., and M. W. Feldman, 1972 Identity of genes by descent within and between populations under mutation and migration pressures. Theor. Popul. Biol. 3: 460–465.

Ohta, T., and M. Kimura, 1973 A model of mutation appropriate to estimate the number of electrophoretically detectable alleles in a finite population. Genet. Res. 22: 201–204.

Pudovkin, A. I., D. V. Zaykin and D. Hedgecock, 1996 On the potential for estimating the effective number of breeders from heterozygote-excess in progeny. Genetics 144: 383–387.

Robledo-Arnuncio, J. J., F. Austerlitz and P. E. Smouse, 2006 A new method of estimating the pollen dispersal curve independently of effective density. Genetics 173: 1033–1045.

Rousset, F., 2001 Inferences from spatial population genetics, pp. 239–269 in Handbook of Statistical Genetics, edited by D. J. Balding, M. Bishop and C. Cannings. John Wiley & Sons, New York.

Slatkin, M., 1985 Gene flow in natural populations. Annu. Rev. Ecol. Syst. 16: 393–430.

Slatkin, M., 2005 Seeing ghosts: the effect of unsampled populations on migration rates estimated for sampled populations. Mol. Ecol. 14: 67–73.

Vitalis, R., 2002 Sex-specific genetic differentiation and coalescence times: estimating sex-biased dispersal rates. Mol. Ecol. 11: 125–138.

Vitalis, R., and D. Couvet, 2001 Estimation of effective population size and migration rate from one- and two-locus identity measures. Genetics 157: 911–925.

Wang, J., and M. C. Whitlock, 2003 Estimating effective population size and migration rates from genetic samples over space and time. Genetics 163: 429–446.

Waples, R. S., 1989 A generalized approach for estimating effective population size from temporal changes in allele frequency. Genetics 121: 379–391.

Whitlock, M. C., and D. E. McCauley, 1999 Indirect measures of gene flow and migration: FST≠1/(4Nm+1). Heredity 82: 117–125.

Williamson, E. G., and M. Slatkin, 1999 Using maximum likelihood to estimate population size from temporal changes in allele frequencies. Genetics 152: 755–761.

Wright, S., 1951 The genetical structure of populations. Ann. Eugen. 15: 323–354.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
