Originally published as Genetics Published Articles Ahead of Print on December 1, 2008.

Genetics, Vol. 181, 615-629, February 2009, Copyright © 2009
doi:10.1534/genetics.108.094342

Coalescence Times and FST Under a Skewed Offspring Distribution Among Individuals in a Population

Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138

1 Corresponding author: 4100 Biological Laboratories, 16 Divinity Ave., Cambridge, MA 02138.
E-mail: eldon@fas.harvard.edu

Manuscript received July 23, 2008. Accepted for publication November 25, 2008.

ABSTRACT

Estimates of gene flow between subpopulations based on FST (or NST) are shown to be confounded by the reproduction parameters of a model of skewed offspring distribution. Genetic evidence of population subdivision can be observed even when gene flow is very high, if the offspring distribution is skewed. A skewed offspring distribution arises when individuals can have very many offspring with some probability. This leads to high probability of identity by descent within subpopulations and results in genetic heterogeneity between subpopulations even when Nm is very large. Thus, we consider a limiting model in which the rates of coalescence and migration can be much higher than for a Wright–Fisher population. We derive the densities of pairwise coalescence times and expressions for FST and other statistics under both the finite island model and a many-demes limit model. The results can explain the observed genetic heterogeneity among subpopulations of certain marine organisms despite substantial gene flow.


NATURAL populations of organisms are often subdivided by geography. Individuals may or may not migrate between these subpopulations. Modeling gene flow between subpopulations can be traced back to WRIGHT (1931), whose island model describes a population subdivided into discrete, local subpopulations by geography, with limited migration between individual subpopulations. With the advent of techniques to characterize genetic variation, several measures of population subdivision have been proposed on the basis of probabilities of identity (see ROUSSET 2002 for a review). These include WRIGHT's (1951) FST, NEI's (1982) {gamma}ST, and LYNCH and CREASE's (1990) NST.

The quantity FST can, under the assumption of equilibrium, be used to estimate levels of gene flow from allozyme data (WRIGHT 1951). The quantity {gamma}ST can be calculated from DNA sequence data and is equivalent to FST if the mutation rate is very low (SLATKIN 1991). Levels of gene flow between subpopulations can thus also be estimated from DNA sequence data. However, SLATKIN (1991) argues that FST is appropriate for allozyme data, whereas the gene genealogy-based method of SLATKIN and MADDISON (1989) is appropriate for DNA sequence data. As FST continues to be used in investigations of population structure and, recently, as a tool for identifying loci under selection (e.g., MURRAY and HARE 2006), we are concerned with FST and related measures below.

We derive expressions for FST and NST under the island model of population subdivision with symmetric migration (NAGYLAKI 1980; STROBECK 1987) and skewed offspring distribution among individuals in a population. When the offspring distribution is skewed, individuals have some nonnegligible probability of having very many offspring. The population model of skewed offspring distribution we adopt in this work can result in an ancestral process with asynchronous multiple mergers (ELDON and WAKELEY 2006). An ancestral process with asynchronous multiple mergers, or {Lambda}-coalescent, was introduced by PITMAN (1999) and also derived by SAGITOV (1999) from a CANNINGS (1974) model. In a {Lambda}-coalescent, any number of ancestral lines can coalesce at once to a single ancestor. In contrast, the Kingman coalescent (KINGMAN 1982a,b) allows only two lines to coalesce each time. For a single population, the ancestral process obtained from the population model of ELDON and WAKELEY (2006), and employed in this work, is a special case of the {Lambda}-coalescent of PITMAN (1999) and SAGITOV (1999).

A type III survivorship curve and high fecundity characterize a diverse group of organisms (e.g., many plants and marine animals). Prime examples are marine species with broadcast spawning, including Atlantic cod (Gadus morhua; ÁRNASON 2004), Pacific oysters (Crassostrea gigas; BECKENBACH 1994; HEDGECOCK 1994a), and red drum (Sciaenops ocellatus; TURNER et al. 2002). A model of skewed offspring distribution, in which individuals can have very many offspring with a nonnegligible probability, may therefore apply better in such cases than the Wright–Fisher (FISHER 1930; WRIGHT 1931) or Moran (MORAN 1958, 1962) models.

Genetic observations from these species also argue against the standard population models. First, genetic diversity is observed to be much lower than expected on the basis of population size for some marine populations (HEDGECOCK et al. 1982; NEI and GRAUR 1984; AVISE et al. 1988; AVISE 1994). In particular, low effective to actual population size ratios have been reported for Atlantic cod (ÁRNASON 2004), red drum (TURNER et al. 2002), and the Pacific oyster (HEDGECOCK 1994a), and this has been explained by high variance in offspring distribution (CROW and KIMURA 1970; HEDRICK 2005). Second, models of skewed offspring distribution predict a large number of singleton variants (ELDON and WAKELEY 2006; SARGSYAN and WAKELEY 2008), a feature observed, for example, in Pacific oysters (BOOM et al. 1994), Atlantic cod (ÁRNASON 2004), and some hydrothermal vent taxa (WON et al. 2003; HURTADO et al. 2004; JOHNSON et al. 2006; YOUNG et al. 2008).

Genetic heterogeneity on a small spatial scale has been observed for many marine populations, including the purple sea urchin (Strongylocentrotus purpuratus; EDMANDS et al. 1996), even though planktonic larvae disperse over wide-ranging habitats (JOHNSON and BLACK 1984; WATTS et al. 1990; HEDGECOCK 1994b; DAVID et al. 1997). A range of explanations has been proposed for the observed heterogeneity (see BURTON 1983; PALUMBI 1994). Our aim is to address, by analytic methods, the problem concerning the genetic population structure of a highly fecund species with potentially highly skewed offspring distribution, like the Atlantic cod (ÁRNASON et al. 2000).

We obtain the probability distributions of pairwise coalescence times, and expressions for FST, for both the finite island and a many-demes limit model. Our main result is that evidence of population subdivision can be observed in genetic data even if the usual migration rate Nm is very large. In essence, a skewed offspring distribution leads to high probabilities of identity by descent within subpopulations and thus high FST. Therefore, patterns in genetic data indicating population subdivision cannot be taken to indicate low levels of gene flow in a population with a skewed offspring distribution. In fact, estimates of migration rate based on FST (or NST) are confounded by the reproduction parameters of our model of skewed offspring distribution. These results may explain the genetic heterogeneity among subpopulations of some marine species like the purple sea urchin (S. purpuratus; EDMANDS et al. 1996), despite the potential for wide dispersal of long-lived planktotrophic larvae (BURTON 1983; PALUMBI 1994).


METHODS AND RESULTS
Throughout we are concerned with neutral genetic diversity at a single nonrecombining locus in a haploid population. As usual, N is the population size. The results should hold for a diploid population with gametic migration if we replace N with 2N. The population model we consider is a modification of the well-known Moran model of reproduction (MORAN 1958, 1962). In the Moran model, a single randomly chosen individual reproduces each time step. To keep the population size constant a randomly chosen individual, but not the offspring, dies to make room for the offspring.

In our model, which was first presented in ELDON and WAKELEY (2006), a single randomly chosen individual (the parent) reproduces each time step. With probability 1 – {varepsilon} the parent has one offspring. Alternatively, with probability {varepsilon} the parent has {psi}N – 1 offspring (a large reproduction event) with 0 < {psi} < 1. To keep population size constant when a large reproduction event occurs, a total of {psi}N – 1 individuals die to make room for the new offspring. In our model the parent always persists. The parameter {psi} represents the fraction of the population that is replaced by the offspring of the parent. ELDON and WAKELEY (2006) show that this modified Moran model of overlapping generations gives rise to a coalescent process that allows for asynchronous multiple mergers of ancestral lines, i.e., is of the same type as the ancestral process considered by PITMAN (1999) and SAGITOV (1999).
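The reproduction step described above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the original study: the function name moran_step, the use of integer labels to track ancestry, and the rounding of {psi}N to an integer are all assumptions made for the example.

```python
import random


def moran_step(pop, eps, psi, rng):
    """Advance a list of ancestry labels by one time step (illustrative sketch).

    pop : list of labels, one per individual (population size N = len(pop))
    eps : probability of a large reproduction event
    psi : fraction of the population replaced in a large event (0 < psi < 1)
    rng : a random.Random instance
    """
    n = len(pop)
    parent = rng.randrange(n)  # a single randomly chosen parent

    # One offspring with probability 1 - eps, psi*N - 1 offspring otherwise.
    # Rounding psi*N down to an integer is an assumption of this sketch.
    if rng.random() < eps:
        n_offspring = max(1, int(psi * n) - 1)
    else:
        n_offspring = 1

    # The parent always persists, so deaths are drawn from everyone else.
    others = [i for i in range(n) if i != parent]
    dying = rng.sample(others, n_offspring)

    new_pop = list(pop)
    for i in dying:
        new_pop[i] = pop[parent]  # each offspring takes the place of a dead individual
    return new_pop


rng = random.Random(1)
start = list(range(100))  # 100 individuals with distinct labels
after_large = moran_step(start, eps=1.0, psi=0.5, rng=rng)
largest_family = max(after_large.count(label) for label in set(after_large))
print(len(after_large), largest_family)
```

With distinct starting labels, the largest family after a large event contains the parent and its {psi}N – 1 offspring, which is the fraction {psi} of the population referred to in the text.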

For ease of presentation, we define the following quantities: N{gamma}, cN, {lambda}{gamma}, and IA. The quantity N{gamma} is the coalescence timescale in our model. The coalescence timescale is proportional to the number of time steps, on average, it takes for two individuals to coalesce (in a single population). It depends on the value of {varepsilon}, which we assume has the form {varepsilon} {equiv} 2{phi}/N^{gamma} for some constants {phi} and {gamma} with 0 < {phi}, {gamma} < {infty}. In our model, the coalescence timescale is N^{gamma}/2 time steps when 0 < {gamma} < 2. In the usual Moran model, the timescale is N^2/2 time steps, which is also the timescale in our model when {gamma} ≥ 2.

For a single population, ELDON and WAKELEY (2006) show that different coalescent processes result depending on {gamma}. Multiple mergers of ancestral lines are allowed in the coalescent process when 0 < {gamma} ≤ 2, while Kingman's coalescent (KINGMAN 1982a,b) results when {gamma} > 2. The probability that two individuals do coalesce in a single time step is denoted by cN and depends on {varepsilon}. The rate {lambda}{gamma} of coalescence of two individuals is obtained from cN by "speeding up" time by a factor of N{gamma}. When 0 < {gamma} ≤ 2, {lambda}{gamma} depends on the reproduction parameters {phi} and {psi}. In mathematical notation, N{gamma} is expressed as N{gamma} = min(N^{gamma}, N^2), and the coalescence probability cN is

Formula 1(1)
For notational convenience, we also define the indicator function IA as

IA = 1 if A holds, and IA = 0 otherwise.
For example, I{gamma}<2 = 1 if {gamma} < 2, and zero otherwise. In our model a large reproduction event occurs when the number of offspring of the parent equals {psi}N – 1. These events occur with probability {varepsilon}. Our choice of {varepsilon} = 2{phi}/N{gamma} results in the coalescence timescale being N{gamma}. The rate {lambda}{gamma} of coalescence is then

Formula 2(2)
The coalescence rate {lambda}{gamma} is a key quantity in nearly all of our results below.
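As a hedged illustration of how {lambda}{gamma} arises from the model, the sketch below compares a rescaled one-step coalescence probability with its presumed limit. The formula images for Equations 1 and 2 are not reproduced in this copy, so both expressions in the code are assumptions. First, cN is taken to be the chance that two sampled individuals both belong to the parent's family (the parent plus its offspring). Second, the limiting rate is taken to be 1 for {gamma} ≥ 2 plus {phi}{psi}^2 for {gamma} ≤ 2, which agrees with the values quoted in the text ({lambda}{gamma} = 1 when {gamma} > 2, and {lambda}{gamma} = {psi}^2 when {phi} = 1 and 0 < {gamma} < 2). The function names are hypothetical.

```python
def timescale(N, gamma):
    """Assumed coalescence timescale N_gamma = min(N**gamma, N**2)."""
    return min(N ** gamma, N ** 2)


def pair_coalescence_prob(N, gamma, phi, psi):
    """Assumed one-step coalescence probability c_N for two individuals.

    Two lines coalesce when both sampled individuals fall in the parent's
    family, which has size 2 in an ordinary step and psi*N in a large event.
    """
    eps = 2.0 * phi / N ** gamma  # probability of a large reproduction event

    def both_in_family(size):
        return size * (size - 1) / (N * (N - 1))

    return (1.0 - eps) * both_in_family(2) + eps * both_in_family(psi * N)


def limiting_rate(gamma, phi, psi):
    """Assumed limit lambda_gamma, built from the two indicator terms."""
    rate = 0.0
    if gamma >= 2:
        rate += 1.0  # ordinary Moran coalescence is visible on this timescale
    if gamma <= 2:
        rate += phi * psi ** 2  # large reproduction events are visible
    return rate


N = 10 ** 6
for gamma in (1.5, 2, 3):
    rescaled = timescale(N, gamma) / 2 * pair_coalescence_prob(N, gamma, 1.0, 0.5)
    print(gamma, rescaled, limiting_rate(gamma, 1.0, 0.5))
```

Time is measured here in units of N{gamma}/2 time steps, matching the timescale stated above; for large N the rescaled probability and the assumed limit should agree closely.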

Model of subdivision:

We now consider the finite island model of population subdivision with the simplifying assumption that migration does not change the sizes of the subpopulations (NAGYLAKI 1980; STROBECK 1987; HERBOTS 1997). Reproduction in all the subpopulations follows the modified Moran model described above. The discrete-time ancestral process for a sample of size 2 is a Markov chain with transition probabilities given in Equation A1 in the APPENDIX.

We are concerned with small migration rates, specifically those on the order of 1/N{gamma} time steps. This means that a single individual resides in the same subpopulation for 2N{gamma} time steps, on average, before migrating to a different subpopulation. When 0 < {gamma} < 2, each individual resides in the same subpopulation for only N{gamma} time steps, on average. This time can be much shorter (when 0 < {gamma} < 1) than the usual average of N time steps assumed in Wright–Fisher populations. In other words, a large number of individuals migrate during N time steps when 0 < {gamma} < 1. We let m denote the probability that a single individual resided in a different subpopulation in the previous time step and model m as m = m{gamma} {equiv} {kappa}/(2N{gamma}) in which {kappa} is a finite constant (0 < {kappa} < {infty}).

To illustrate the difference between our migration rate {kappa} and the usual migration rate Nm, let M* {equiv} N^2 m{gamma} denote a migration rate scaled in units of N^2 time steps (or N generations). This corresponds to the usual "Nm" in the Wright–Fisher model. Substituting for m{gamma} gives M* = {kappa}N^2/(2N{gamma}). If, for example, {gamma} = 3/2, then M* = {kappa}N^(1/2)/2. When {gamma} < 2 the migration rate M* is very high; i.e., M* → {infty} as N → {infty} since {kappa} is finite. However, in our modified model of reproduction coalescence also occurs on the timescale of N^(3/2) time steps (or N^(1/2) generations when {gamma} = 3/2) and thus "counteracts" the effects of high migration rate.
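The scaling argument can be checked numerically. The sketch below uses the definitions m{gamma} = {kappa}/(2N{gamma}) and M* = N^2 m{gamma} given in the text; taking N{gamma} = min(N^{gamma}, N^2) is an assumption inferred from the timescales stated earlier, and the function name is hypothetical.

```python
def scaled_migration_rate(N, gamma, kappa):
    """Migration rate M* in units of N**2 time steps (the usual 'Nm' scale)."""
    n_gamma = min(N ** gamma, N ** 2)   # assumed coalescence timescale
    m_gamma = kappa / (2.0 * n_gamma)   # per-time-step migration probability
    return N ** 2 * m_gamma


# With gamma = 3/2 and kappa fixed, M* grows like the square root of N,
# whereas for gamma >= 2 it stays at kappa / 2.
for N in (10 ** 2, 10 ** 4, 10 ** 6):
    print(N, scaled_migration_rate(N, 1.5, 1.0), scaled_migration_rate(N, 2, 1.0))
```

The point of the comparison is that a fixed {kappa} corresponds to an ever larger conventional migration rate as N grows when {gamma} < 2.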

The main results of this work concern expected coalescence times (Equations 3 and 5) and FST-like measures (Equations 10–12). We also derive the densities of the coalescence times (see APPENDIX). The densities are used to derive distribution functions for the number of segregating sites between two sequences (see the APPENDIX), which in turn yield expressions for FST-like measures including mutation (Equations 13 and 14).

The distributions of the coalescence times are functions of {lambda}{gamma}:

DNA sequences differ because they have accumulated mutations from the time of their most recent common ancestor until they are sampled. By assuming a very low mutation rate, SLATKIN (1991) derived an expression for FST in terms of expected values of coalescence times. The time until two genes coalesce is therefore a fundamental quantity in theoretical work on structured populations. Given two genes sampled from a structured population, two different coalescence times arise that are of interest: the time T0 until two genes sampled from the same subpopulation coalesce and time T1 until two genes sampled from different subpopulations reach a common ancestor. The densities of T0 and T1 were previously derived under the structured coalescent by TAKAHATA (1988) and NATH and GRIFFITHS (1993) in the case of two subpopulations and by HERBOTS (1997) for any finite number of subpopulations.

Given the transition rates in Equation A2, we can obtain the distributions of the coalescence times T0 and T1 (see the APPENDIX). Figure 1 shows the distributions of T0 and T1, respectively, as functions of time for different values of {psi} (the fraction of the population replaced by the offspring of a single individual). As {psi} increases (i.e., tends to 1), the coalescence times T0 and T1 become very short.


FIGURE 1.— The densities of the times to coalescence for two genes sampled from the same (T0), or different (T1), subpopulations as functions of time for different values of {psi} when the number of subpopulations D = 3 and {phi} = {kappa} = 1. The coalescence timescale is N^2/2 in a and c and N^{gamma}/2 with 0 < {gamma} < 2 in b and d. The solid lines in a and c are the densities obtained under the standard coalescent (i.e., {gamma} > 2).

The expected value and variance of T0 are both less than the corresponding quantities for T1. Specifically,

Formula 3(3)
and

Formula 4(4)
Equation 3 holds a key result, namely that E(T0) is always less than E(T1).

The significance of the result in Equation 3 is best understood by an example. When {gamma} < 2, say {gamma} = 3/2, then the timescale is N{gamma} = N^(3/2), and {lambda}{gamma} = {psi}^2 (assuming {phi} = 1). Our migration parameter is then {kappa} = m{gamma}N{gamma} = m{gamma}N^(3/2). Migration is scaled in units of N^2 time steps in a standard Moran population. If we let M* {equiv} N^2 m{gamma} be a scaled migration rate in units of N^2 time steps, then if m{gamma} is of order 1/N^(3/2) as above, M* becomes very high in a large population. Specifically, since {gamma} = 3/2, M* is of order N^(1/2), so that M* → {infty} (as N → {infty}), since {kappa} is a constant. The result in Equation 3 says that even when M* → {infty} one will still see evidence of population structure in DNA sequence data, since coalescence occurs on a timescale of N^(3/2) time steps (in a large population) when {gamma} = 3/2. In fact, M* → {infty} as N → {infty} whenever 0 < {gamma} < 2.

Similarly, the variance of T0 is always less than the variance of T1. In addition, the expected value and variance of T0 are inversely proportional to {lambda}{gamma} and thus will be small when the probability of large reproduction events is close to one. The expressions for E(T0) and E(T1) (Equation 3) obtained under the usual reproduction models (NEI and FELDMAN 1972; LI 1976; GRIFFITHS 1981) can be recovered by assuming that large reproduction events occur on a longer timescale ({gamma} > 2) than usual (e.g., Wright–Fisher) sampling, in which case {lambda}{gamma} = 1. The variances of T0 and T1 were first derived by HEY (1991) under the structured coalescent and can be recovered in the same way from Equation 4.
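The ordering E(T0) < E(T1) can be illustrated with a small Monte Carlo sketch of the two-lineage ancestral process. The transition rates of Equation A2 are not reproduced in this copy, so the rates used here are assumptions: two lines in the same subpopulation coalesce at rate {lambda}{gamma} and are separated by migration at total rate {kappa}, and two lines in different subpopulations experience migration events at total rate {kappa}, each of which brings them together with probability 1/(D – 1). The last assumption reproduces the reunion rate {kappa}/(D – 1) mentioned in the text. Under these assumed rates a first-step analysis gives means of D/{lambda}{gamma} and D/{lambda}{gamma} + (D – 1)/{kappa}. All function names are hypothetical.

```python
import random


def sample_coalescence_time(same_deme, lam, kappa, D, rng):
    """Draw one pairwise coalescence time in rescaled time units (sketch)."""
    t = 0.0
    together = same_deme
    while True:
        if together:
            total = lam + kappa            # competing events: coalesce or separate
            t += rng.expovariate(total)
            if rng.random() < lam / total:
                return t                   # the two lines coalesce
            together = False               # one line migrates away
        else:
            t += rng.expovariate(kappa)    # wait for the next migration event
            if rng.random() < 1.0 / (D - 1):
                together = True            # the migrant lands in the other line's deme


def mean_coalescence_time(same_deme, lam, kappa, D, n, seed=0):
    """Average of n simulated coalescence times."""
    rng = random.Random(seed)
    draws = (sample_coalescence_time(same_deme, lam, kappa, D, rng) for _ in range(n))
    return sum(draws) / n


lam, kappa, D = 1.0, 1.0, 3
print("simulated E(T0):", mean_coalescence_time(True, lam, kappa, D, 5000))
print("simulated E(T1):", mean_coalescence_time(False, lam, kappa, D, 5000))
print("first-step means:", D / lam, D / lam + (D - 1) / kappa)
```

Lowering lam (for example by choosing a small {psi} when 0 < {gamma} < 2) lengthens both times, while the gap between them depends only on {kappa} and D under these assumptions.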

A many-demes limit:

The structured coalescent simplifies under certain migration mechanisms when the number of subpopulations is taken to be much greater than the sample size of DNA sequences (WAKELEY 1998). The convergence of the ancestral process under a many-demes limit (i.e., when D → {infty}) follows from the work of MÖHLE (1998), which shows how events in a stochastic process that occur on different timescales can be separated (see the APPENDIX for a more detailed description). We consider the ancestral process in the limit N → {infty} and D → {infty}. Switching the order of the limits leads to the same coalescent process (see the APPENDIX).

The limit process of two genes sampled from a population subdivided into very many subpopulations (D → {infty}), each of which is very large (N → {infty}), is of the form P*etG* in which P* and G* are given by Equations A16 and A19, respectively. The form of P* tells us that the ancestral process immediately enters the continuous-time process if the two genes are sampled from two different demes. If the two genes are sampled from the same subpopulation, they coalesce with probability Formula 4 or enter the continuous-time process by moving to different subpopulations with probability Formula 4. In the continuous-time process the two lines wait with exponential time with rate Formula 4 on a timescale of DN{gamma} time steps until they coalesce. The ancestral process under the many-demes limit model (Equation A19) differs from the limit process obtained when the number of subpopulations is finite (Equation A2), in that G* has a zero entry for the transition where the two alleles enter the same subpopulation, after having been separated. When D < {infty}, the corresponding rate is {kappa}/(D – 1) (Equation A2). Ancestral lines can coalesce, however, only if they reside in the same subpopulation. The matrix B* (Equation A18) ensures that the two lines do arrive in the same subpopulation.

Again we are interested in the coalescence times T0 and T1 of two genes sampled from the same, or different, subpopulations, respectively. The distribution of T0 is a mixture distribution (see APPENDIX), and we obtain

Formula 5(5)
and

Formula 6(6)
The expressions for the expected value and variance of T0 and T1 obtained under the many-demes limit model (Equations 5 and 6) are functions of {lambda}{gamma} and {kappa} in the same way as the corresponding expected values and variances (Equations 3 and 4) obtained for a finite number of subpopulations. In particular, we always expect a shorter coalescence time for two ancestral lines sampled from the same subpopulation than if they were sampled from different subpopulations.

Deriving FST and NST:

The quantity FST is commonly used to assess levels of population subdivision. The inbreeding coefficient of an individual relative to a collection of subpopulations, FIT, can be attributed to nonrandom mating within a subpopulation (FIS) and to differences among subpopulations (FST; WRIGHT 1951). Two sequences are identical by descent if they have not experienced mutation from the time of their most recent common ancestral sequence until they are sampled. If we let f0 and f denote the probability of identity by descent of two genes sampled from the same subpopulation (f0) and at random from the collection of subpopulations (f), we can express FST as

Formula 7(7)
(NEI 1973). By the definition of FST in terms of inbreeding coefficients (as in Equation 7), FST depends on the mutation rate (µ). By forcing µ to be very low SLATKIN (1991) derived an approximation of FST that is a function of expectation of coalescence times and is given by

Formula 8(8)
in which T is the coalescence time of two lines randomly sampled from the collection of subpopulations, T0 is the time to coalescence of two lines from the same subpopulation, and µ is the mutation rate.

To obtain an expression for FST in terms of coalescence times under skewed offspring distribution, we can proceed by first obtaining the expected coalescence time E(T) of two genes randomly sampled from the collection of subpopulations, which is readily obtained from Equations 3 and A10 and is given by

Formula 9(9)
When the number of subpopulations D is finite, the general form of FST is

Formula 10(10)
For example, when 0 < {gamma} < 2, the rate of coalescence is {lambda}{gamma} = {psi}^2 (with {phi} = 1) and Equation 10 gives Formula 10. The expression for FST in Equation 10 has the same form as the one derived by SLATKIN (1991) under the standard coalescent. The key difference is that, under skewed offspring distribution, FST is a function of the rate {lambda}{gamma} (Equation 2) of coalescence and thus a function of the reproduction parameters {phi} and {psi}. The result that SLATKIN (1991) obtained can be recovered from Equation 10 by taking {gamma} > 2, in which case {lambda}{gamma} = 1 (recall that the probability of large reproduction events {propto} 1/N^{gamma}).

When the number of subpopulations D → {infty}, we obtain from Equation 10

Formula 11(11)
In Equation 11 we have taken two limits: N → {infty} and D → {infty}. Switching the order of the limits gives the same limit result for FST in Equation 11.

Following WRIGHT (1951), the value of FST has often been used to estimate levels of gene flow. Figure 2 shows the estimate of the migration rate {kappa}, obtained from Equation 11, as a function of {psi} for different values of FST (Figure 2a) and {phi} (Figure 2b) and for two different values of {lambda}{gamma}. Since FST is a function of {psi} and {phi}, so is any estimate of gene flow based on FST, as Figure 2 clearly shows.
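The confounding of migration estimates by the reproduction parameters can be made concrete with a short sketch. The formula images for Equations 3, 8, 10, and 11 are not reproduced in this copy, so the expressions below are assumptions: the expected times take the standard finite-island form E(T0) = D/{lambda}{gamma} and E(T1) = D/{lambda}{gamma} + (D – 1)/{kappa}, FST is computed as (E(T) – E(T0))/E(T) following the low-mutation approximation attributed to SLATKIN (1991), and the many-demes limit is taken to be {lambda}{gamma}/({lambda}{gamma} + {kappa}). The last form matches the limiting value of about 1/(1 + {kappa}) at {psi} close to 1 that is quoted in the DISCUSSION. The function names are hypothetical.

```python
def fst_finite(lam, kappa, D):
    """Assumed finite-island F_ST from expected pairwise coalescence times."""
    e_t0 = D / lam                             # same subpopulation
    e_t1 = D / lam + (D - 1) / kappa           # different subpopulations
    e_t = e_t0 / D + (D - 1) / D * e_t1        # two genes sampled at random
    return (e_t - e_t0) / e_t


def fst_many_demes(lam, kappa):
    """Assumed many-demes limit of F_ST."""
    return lam / (lam + kappa)


def kappa_from_fst(fst, lam):
    """Invert the many-demes expression to estimate the migration rate kappa."""
    return lam * (1.0 - fst) / fst


# For 0 < gamma < 2 the text gives lambda_gamma = phi * psi**2 (psi**2 when phi = 1),
# so the same observed F_ST maps to very different estimates of kappa as psi changes.
observed_fst = 0.1
phi = 1.0
for psi in (0.1, 0.5, 0.9):
    lam = phi * psi ** 2
    print(psi, kappa_from_fst(observed_fst, lam))
```

Read the other way, a fixed {kappa} yields a low FST when {psi} is small and a high FST when {psi} is close to 1, which is the dependence the figure is described as showing.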


FIGURE 2.— The estimate of the migration rate {kappa} from Equation 11 as a function of {psi}. (a) The estimate when FST = 0.1 (solid line), FST = 0.2 (dashed line), and FST = 0.5 (dotted line). (b) The estimate with FST = 0.1 and {phi} = 1 (solid line), {phi} = 2 (dashed line), and {phi} = 5 (dotted line).

LYNCH and CREASE (1990) used the number of pairwise sequence differences of DNA sequences to estimate levels of genetic heterogeneity. In that context, LYNCH and CREASE (1990) introduced the quantity NST that has the form Formula 11 in which Formula 11 and Formula 11 are the average number of pairwise differences between sequences sampled from different, or the same, subpopulations, respectively. If mutation rate is constant and mutations occur according to the infinite-sites model (WATTERSON 1975), then NST estimates Formula 11 (SLATKIN 1993). Using the results obtained for expected coalescence times (Equation 3), we obtain Formula 11 as in Equation 11 for the many-demes limit model of population subdivision and

Formula 12(12)
when D < {infty}. The effect of skewed offspring distribution is the same on NST as it is on FST. Under the infinite-sites mutation model we do not need an assumption of small mutation rate to obtain an expression of NST in terms of coalescence times, unlike the case for FST. As NST is defined, the mutation parameter cancels out (SLATKIN 1993).

Number of segregating sites between pairs of sequences:

By the definition of FST in terms of probabilities of identity by descent (Equation 7), FST depends on mutation. ELDON and WAKELEY (2006) show that the limit process (as Formula 12) of our model of skewed offspring distribution predicts nonzero levels of genetic variation only when {gamma} > 1. If we (as in ELDON and WAKELEY 2006) let µ denote the probability of mutation for each offspring in a single time step, we define the mutation rate {theta} as Formula 12 (and {gamma} > 1). We can include mutation in an expression for FST by first obtaining the probability distributions of the number of segregating sites, under the infinite-sites model (WATTERSON 1975), between two genes given a model of population subdivision with migration. Let K0 denote the number of segregating sites between two genes sampled from the same subpopulation and K denote the number of segregating sites between two genes sampled randomly from the collection of subpopulations. The distributions of K0 and K are derived in the APPENDIX, along with the distribution of the number of segregating sites K1 between two genes sampled from different subpopulations. Then by the definition of FST given in Equation 7 we obtain

Formula 13(13)
When D → {infty},

Formula 14(14)
From Equation 14 we conclude that mutation can affect FST only if {theta} is large relative to {lambda}{gamma}. The expression for FST in Equation 14 has the same form as the one derived by WILKINSON-HERBOTS (1998) and by NEI (1975) and TAKAHATA (1983) by other methods, under the Wright–Fisher model, including mutation. In Figure 3, FST from Equation 14 is graphed as a function of {psi} for different values of {theta} and {kappa}. The interpretation of Figure 3 is that FST, as a function of {psi}, can vary considerably when the timescale of coalescence (and migration) is in units of N{gamma}/2 generations with 1 < {gamma} < 2 (Figure 3, b and d).


FIGURE 3.— The quantity FST from Equation 14 as a function of {psi} (with {phi} = 1) for different values of {theta}, {kappa}, and rate of coalescence ({lambda}{gamma}). Solid lines, {theta} = 10; dashed lines, {theta} = 1; dotted lines, {theta} = 0.1.

Nei's genetic distance d:

Not all indicators of separation between populations depend on {lambda}{gamma}. NEI's (1972) genetic distance is more appropriate for estimating divergence time between species, and FST-like quantities are more suitable for inferring population structure within species (SLATKIN 1991). NEI's (1972) genetic distance measure is given by Formula 14 in which f0 and f1 are the probabilities of identity by descent of two genes sampled from the same or different subpopulations, respectively, and we add the subscript N to remind us that time is discrete. If we now assume that 0 < µE(ti) < 1 for i = 0, 1, then using the Maclaurin series expansion of the logarithmic function Formula 14 we obtain Formula 14 (previously obtained by SLATKIN 1991) in which t0 and t1 are the coalescence times for two genes sampled from the same, or different, subpopulations, respectively. To obtain an expression of d for continuous time, we assume that the product Formula 14 converges to a constant {theta} as Formula 14 (and {gamma} > 1). Rewriting the approximation for dN gives

Formula 15(15)
which has the continuous-time limit

Formula 16(16)
However, using the expressions for E(T1) and E(T0) (Equation 3), we obtain Formula 16 and so NEI's (1972) genetic distance is independent of {lambda}{gamma}. Another way of deriving an expression for d is to note that the probability of identity by descent of two genes is the same as the probability that no mutations occur from the time they are sampled until they reach a common ancestor. Thus fi = P(Ki = 0) for i = 0, 1. We can therefore write

Formula 17(17)
for any model of population subdivision. For the many-demes limit model under consideration,

Formula 17
Using either the limit approach (Equation 16) or the substitution approach (Equation 17) in the many-demes limit model, and assuming small {theta}/{kappa} (i.e., 0 < {theta}/{kappa} < 1), d is of the form {theta}/{kappa}. The same result is obtained for a finite number of subpopulations. Indeed, when D is finite, we obtain from Equations A28 and A29

Formula 18(18)
Thus, if 0 < ({theta}/2)(D – 1)/{kappa} < 1, we have from Equations 17 and 18

Formula 19(19)
Even if ({theta}/2)(D – 1)/{kappa} > 1, we have from Equations 17 and 18 that d is not a function of {lambda}{gamma}. Thus NEI's (1972) genetic distance can be used to estimate divergence times of species even if one or both species have skewed offspring distribution, since d is proportional to the time of separation of two populations (NEI 1972; SLATKIN 1991).


DISCUSSION
Some organisms, for example Pacific oysters (BECKENBACH 1994; HEDGECOCK 1994a) and Atlantic cod (BEKKEVOLD et al. 2002; ÁRNASON 2004), may exhibit skewed offspring distribution among individuals in a population. Both BECKENBACH (1994) and HEDGECOCK (1994a) describe the reproductive mode of oysters, for example, as a lottery, in which only the offspring of a few lucky females survive. Oyster and cod females have very high reproductive potential, as they may produce millions of eggs in a single spawning (MAY 1967; STRATHMANN 1987; CHAMBERS and WAIWOOD 1996; KJESBU et al. 1996). The Wright–Fisher model does not capture the skewed offspring distribution possibly exhibited by organisms with high fecundities and high early mortality. The models of PITMAN (1999) and SAGITOV (1999), and later of ELDON and WAKELEY (2006) and SARGSYAN and WAKELEY (2008) for overlapping generations in a single population, incorporate the skewness and may thus better apply to organisms with highly fecund individuals and sweepstakes-style recruitment. By deriving distributions of coalescence times for two genes sampled from a subdivided population, we show how skewed offspring distribution confounds estimates of migration rate between subpopulations when based on FST-like measures of population subdivision.

An important result of this work is that FST depends not only on the migration rate {kappa} but also on the parameters ({psi} and {phi}) of our model of large offspring numbers. Demographic processes such as population size fluctuations, founder effects, or skewed offspring distribution have been thought to increase genetic differentiation among subpopulations. As defined and calculated from genetic data, common indicators of population subdivision then take on high values, thus suggesting low levels of migration (BOILEAU et al. 1992; WHITLOCK 1992; SLATKIN 1993; HEDGECOCK 1994a). Our main conclusions are twofold. First, FST is shown to depend on the parameters controlling the size ({psi}) and frequency ({phi}) of large reproduction events (the probability that the offspring of a single individual replace a fraction {psi} of the population is {varepsilon} = 2{phi}/N^{gamma}) and can thus indicate high or low levels of genetic heterogeneity depending on {psi} and {phi}. To illustrate, consider the expression for FST derived under the many-demes limit model without mutation (Equation 11), and let coalescence occur on a timescale of N^{gamma}/2 time steps (i.e., 0 < {gamma} < 2). In that case the rate of coalescence is {lambda}{gamma} = {psi}^2 (taking {phi} = 1), and the rate of large reproduction events is high. Fixing {kappa}, we see that FST ranges from very low (when {psi} is low) to {approx}1/(1 + {kappa}) (when {psi} {approx} 1). Second, to the extent that FST (or NST) is used in estimating levels of gene flow, these estimates are confounded by {lambda}{gamma} and thus by {phi} and {psi}. Also, migration in our model is not the usual Nm quantity, but is given by {kappa} = m{gamma}N^{gamma}. This means that even when Nm is very large, we may still observe genetic heterogeneity, since the rate of large reproduction events is also large. In a population where individuals can have very many offspring, gene flow is not the only demographic force that influences genetic heterogeneity.
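The dependence of FST on {psi} at fixed {kappa} can be illustrated numerically. The sketch below assumes that, in the many-demes limit without mutation, FST takes the form {lambda}{gamma}/({lambda}{gamma} + {kappa}) with {lambda}{gamma} = {psi}^2 (the case {phi} = 1). This form is an assumption chosen because it reproduces the limiting values quoted in the text; it is not copied from Equation 11. The function name and the value of {kappa} are hypothetical.

```python
def fst_many_demes(psi: float, kappa: float) -> float:
    """FST under the ASSUMED many-demes form lambda / (lambda + kappa).

    lambda = psi**2 is the coalescence rate for phi = 1, as in the text.
    """
    if not 0.0 < psi <= 1.0:
        raise ValueError("psi must lie in (0, 1]")
    if kappa <= 0.0:
        raise ValueError("kappa must be positive")
    lam = psi ** 2
    return lam / (lam + kappa)


if __name__ == "__main__":
    kappa = 10.0  # hypothetical scaled migration rate
    for psi in (0.05, 0.25, 0.5, 1.0):
        print(f"psi = {psi:4.2f}  FST = {fst_many_demes(psi, kappa):.4f}")
```

Under this assumed form, the same {kappa} is compatible with FST anywhere between nearly 0 and 1/(1 + {kappa}), which is the sense in which {psi} confounds an estimate of migration based on FST alone.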

The coalescence times T0 and T1 (for genes sampled from the same or different subpopulations, respectively) are fundamental quantities of the ancestral process of genes in subdivided populations. The time during which DNA sequences accumulate mutations is determined by T0 and T1. As we have shown, the coalescence times depend on the skewness of the offspring distribution through the rate {lambda}{gamma} (Equation 2) of coalescence. By deriving the distributions of T0 and T1 for two genes in a structured population, we have obtained insight into how skewed offspring distribution shapes the genetics of structured populations. Since T0 and T1 are functions of {phi} and {psi} through the rate of coalescence {lambda}{gamma}, all the quantities of interest in regard to investigation of the genetics of structured populations, including expected values, number of segregating sites, and indicators of population subdivision, are functions of {lambda}{gamma}.

One such insight is that genetic heterogeneity can be observed in genetic data even if gene flow is very high by the usual standard (Formula 19). EDMANDS et al. (1996) found significant genetic heterogeneity among subpopulations of the purple sea urchin S. purpuratus sampled along the coast of California and Baja California. The ecology and physiology of S. purpuratus indicate the capacity for highly skewed offspring distribution: external fertilization and very high fecundity. Despite a planktonic larval period of several weeks (STRATHMANN 1978), and thus a potential for high dispersal, both allozyme and mtDNA sequence data revealed genetic differentiation, even over short distances (EDMANDS et al. 1996). We have shown that, regardless of the timescale of migration, E(T0) < E(T1). Genetic heterogeneity can, therefore, be observed in DNA sequence data even if gene flow is very high, in a population with skewed offspring distribution. Population turnover in a metapopulation model when demes that become extinct are recolonized by one or a few individuals can also lead to increased FST (WADE and MCCAULEY 1988; WHITLOCK and MCCAULEY 1990; PANNELL 2003). Indeed, a model of metapopulation structure that allows only one founder for every deme that is recolonized necessarily results in a coalescent process with multiple mergers, if the founder can have many offspring.

In summary, we consider coalescence times in a subdivided population with sweepstakes-style recruitment. The expected coalescence time for two genes sampled from the same subpopulation is always less than that for two genes sampled from different subpopulations, even when migration occurs on a very short timescale. Estimates of migration rate based on FST are confounded by the rate {lambda}{gamma} of coalescence, since FST-like measures of genetic heterogeneity are functions of the reproduction parameters of our model of skewed offspring distribution. These results underscore the importance of choosing an appropriate limit process for the population under consideration.


APPENDIX

The discrete-time transition matrix:

The probability transition matrix {Pi}N (Equation A1) for the ancestral process over one time step, for an arbitrary fixed number D ≥ 2 of subpopulations, has three states: (1) two lines in the same deme but not coalesced, (2) two lines in different demes, and (3) the two lines have coalesced. We do not distinguish between subpopulations. The transition probabilities in {Pi}N are derived under the assumption that migration does not alter the subpopulation sizes (NAGYLAKI 1980; STROBECK 1987; HERBOTS 1997). We let m denote the single backward migration fraction. The matrix {Pi}N is

Formula A1(A1)
in which cN is the coalescence probability and

Formula A1

Formula A1

Formula A1

Formula A1

Formula A1

Formula A1

Formula A1

Formula A1

Formula A1
The matrix in Equation A1 is a generalization of the matrix for the same migration mechanism (cf. WAKELEY 2008) obtained under the usual Wright–Fisher model of reproduction. In Equation A1, the probability of coalescence is cN, instead of the usual 1/N, as is the case for the haploid Wright–Fisher model. The corresponding continuous-time process has rate matrix G given by

Formula A2(A2)

Distribution functions of the coalescence times when D is finite:

In this section we derive the distribution functions for the coalescence times T0 and T1 when the number of subpopulations is finite. Given the distributions of T0 and T1 we can determine the distribution of the coalescence time T of two genes sampled at random from the collection of subpopulations. The distributions of these coalescence times allow us to derive expressions for FST with or without mutation.
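As a rough numerical companion to the derivation, the sketch below computes expected pairwise coalescence times by first-step analysis on the three-state process. The rates are assumptions made for illustration and are not taken from Equation A2: two lines in the same deme coalesce at rate lam and separate at rate kappa, and two lines in different demes enter the same deme at rate kappa/(D – 1). FST is computed in the coalescence-time form of SLATKIN (1991), (E(T) – E(T0))/E(T), which is also an assumption about the statistic intended. All function names are hypothetical.

```python
def expected_coalescence_times(lam: float, kappa: float, D: int):
    """Return (E[T0], E[T1]) for the ASSUMED three-state process.

    Assumed rates: same deme -> coalesced at lam, same deme -> different
    demes at kappa, different demes -> same deme at kappa / (D - 1).
    """
    if D < 2 or lam <= 0.0 or kappa <= 0.0:
        raise ValueError("need D >= 2 and positive rates")
    mu = kappa / (D - 1)           # rate of returning to the same deme
    p_mig = kappa / (lam + kappa)  # first event from 'same deme' is a migration
    # First-step equations:
    #   E0 = 1/(lam + kappa) + p_mig * E1
    #   E1 = 1/mu + E0
    # Substituting the second equation into the first and solving for E0:
    e0 = (1.0 / (lam + kappa) + p_mig / mu) / (1.0 - p_mig)
    e1 = 1.0 / mu + e0
    return e0, e1


def fst_from_times(e0: float, e1: float, D: int) -> float:
    """FST = (E[T] - E[T0]) / E[T], with T a 1/D : (1 - 1/D) mixture of T0 and T1."""
    et = e0 / D + (1.0 - 1.0 / D) * e1
    return (et - e0) / et


if __name__ == "__main__":
    e0, e1 = expected_coalescence_times(lam=1.0, kappa=2.0, D=5)
    print(e0, e1, fst_from_times(e0, e1, 5))
```

Under these assumed rates E(T0) simplifies to D/lam, so it does not depend on kappa, whereas E(T1) exceeds E(T0) by (D – 1)/kappa for every positive kappa.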

We can use the rate matrix (Equation A2) to obtain the density functions for T0 and T1. Using Laplace transforms (see HERBOTS 1997) we obtain

Formula A3(A3)
in which

Formula A4(A4)
and

Formula A5(A5)
for i = 1, 2. To obtain the density function of T1, we note that T1 can be represented as a sum of two independent random variables, T1 = Y1 + T0, where Y1 is an exponential random variable with rate {kappa}{gamma}/(D – 1). By direct calculation,

Formula A6(A6)
The form of the density functions fT0 and fT1 (Equations A3 and A6) immediately yields the cumulative distribution functions FT0 and FT1 for T0 and T1, respectively. Namely, writing {lambda}i = –ri for i = 1, 2,

Formula A7(A7)
and

Formula A8(A8)
in which

Formula A8
Let T denote the time to coalescence for two genes sampled at random from the collection of subpopulations. Then, with probability 1/D the two genes are sampled from the same subpopulation, and with probability 1 – 1/D they are sampled from different subpopulations. The cumulative distribution function (c.d.f.) FT of T is then given by

\[ F_T(t) = \frac{1}{D}\,F_{T_0}(t) + \left(1 - \frac{1}{D}\right) F_{T_1}(t) \tag{A9} \]
in which FTi denotes the c.d.f. of Ti for i = 0, 1. The expected value of T is

\[ E(T) = \frac{1}{D}\,E(T_0) + \left(1 - \frac{1}{D}\right) E(T_1) \tag{A10} \]
in which E(T0) and E(T1) are given by Equation 3 and the variance of T is

Formula A11(A11)
in which Var(T0) and Var(T1) are given by Equation 4. Note that although the expected value of T lies between E(T0) and E(T1), Equation A11 tells us that the variance of T may not lie between the variance of T0 and T1. If Formula A11, then Var(T) > Var(T1).
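The remark about the variance follows from the mixture representation of T alone. The identity below is a standard property of two-component mixtures, written in the notation of this section rather than copied from Equation A11; the shorthand \Delta = E(T_1) - E(T_0) is introduced only for this sketch.

```latex
\begin{align*}
E(T^2) &= \frac{1}{D}\,E(T_0^2) + \left(1-\frac{1}{D}\right)E(T_1^2), \\
\mathrm{Var}(T) &= E(T^2) - E(T)^2
  = \frac{1}{D}\,\mathrm{Var}(T_0) + \left(1-\frac{1}{D}\right)\mathrm{Var}(T_1)
  + \frac{1}{D}\left(1-\frac{1}{D}\right)\Delta^2, \\
\mathrm{Var}(T) > \mathrm{Var}(T_1)
  &\iff \left(1-\frac{1}{D}\right)\Delta^2 > \mathrm{Var}(T_1) - \mathrm{Var}(T_0).
\end{align*}
```

Because the last term in the variance is nonnegative, Var(T) can exceed Var(T1) whenever the gap between E(T0) and E(T1) is large relative to the difference between the two variances.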

A many-demes limit model:

In this section we derive the ancestral process for two genes in the many-demes limit (D → ∞) and as N → ∞. Since now D → ∞, the single-generation backward transition matrix, following MÖHLE (1998), can be written

Formula A12(A12)
in which the matrices A and B are given below. The matrix A describes the probabilities of the transitions that occur on the timescale of single time steps. The matrix B/D describes transitions that occur on a timescale of D time steps, thus forming the continuous-time part of the ancestral process. The limit process (as D → ∞) is then given by {Pi}(t) = P exp(tPBP), in which P = lim A^r (as r → ∞) describes the equilibrium process of the events that occur on the timescale of single time steps (MÖHLE 1998). The ancestral history of a sample is first adjusted by an instantaneous process described by P and then enters a continuous-time process described by the rate matrix PBP. Given the ancestral process, we can derive the distributions of T0 and T1.

The instantaneous matrix A is

Formula A13(A13)
with eigenvalues {lambda}1 = {lambda}2 = 1 and

Formula A13
and we observe that Formula A13. By calculating the corresponding eigenvectors (not shown) we obtain the limit matrix P by diagonalizing A,

Formula A14(A14)
in which

Formula A15(A15)
and in the limit Formula A15 we obtain

Formula A16(A16)
When D → ∞ but holding N finite, the ancestral process is given by P exp(tPBP), in which the matrix B is given by

Formula A17(A17)
and we obtain

Formula A18(A18)
The rate matrix G* = P*B*P* then takes the general format

Formula A19(A19)
The ancestral process in the limit D → ∞ and N → ∞ is P* exp(tG*) and immediately yields the density functions for T0 and T1 as follows. The time T1 to coalescence for two lines sampled from different demes follows in each case ({gamma} > 2, {gamma} = 2, and 0 < {gamma} < 2) an exponential distribution with rate Formula A19. Now consider the time T0 to coalescence for two lines sampled from the same subpopulation. Going back in time, two lines in the same subpopulation can either coalesce with probability Formula A19 or they enter the continuous-time process with probability Formula A19, in which case one of the two lines migrates to a different subpopulation. Thus T0 follows a mixture distribution with cumulative distribution function

Formula A20(A20)
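The mixture structure just described can be sketched in code. The sketch treats the probability of immediate coalescence (p_coal) and the rate of the continuous-time process (rate) as free inputs, because their expressions are not reproduced here; both names are hypothetical. Time is measured on the slow timescale, so that a coalescence in the instantaneous phase counts as T0 = 0.

```python
import math


def cdf_t1(t: float, rate: float) -> float:
    """c.d.f. of T1: exponential with the given rate."""
    if t < 0.0:
        return 0.0
    return 1.0 - math.exp(-rate * t)


def cdf_t0(t: float, p_coal: float, rate: float) -> float:
    """c.d.f. of T0: a point mass p_coal at zero mixed with the law of T1."""
    if t < 0.0:
        return 0.0
    return p_coal + (1.0 - p_coal) * cdf_t1(t, rate)


def mean_t1(rate: float) -> float:
    return 1.0 / rate


def mean_t0(p_coal: float, rate: float) -> float:
    # Only the lines that escape immediate coalescence contribute waiting time.
    return (1.0 - p_coal) / rate
```

On this timescale 1 – E(T0)/E(T1) equals p_coal for any rate, so the probability of immediate coalescence within a deme directly sets the contrast between within-deme and between-deme coalescence times.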

Order of limits irrelevant in the many-demes limit model:

In this section we show that the same ancestral process is obtained irrespective of the order of the limits D → ∞ and N → ∞. We have already derived the process when first D → ∞ and then N → ∞. Now we show that the same ancestral process is obtained when first N → ∞ and then D → ∞.

As N → ∞, we obtain the ancestral process described by the rate matrix

Formula A21(A21)
We remark that A^n = (–a)^(n–1) A for n ≥ 1, with a = {kappa} + {lambda}{gamma}. Since A is a rate matrix, we have

Formula A22(A22)
Equation A22 immediately gives us the instantaneous matrix

Formula A23(A23)
Equations A21 and A23 then give us the rate matrix G = PBP after first taking the limit N → ∞ and then D → ∞.
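The steps from the power identity to the instantaneous matrix and the rate matrix can be written out as follows. This is a sketch under explicit assumptions, not a transcription of Equations A21 to A23: the states are ordered (same deme, different demes, coalesced), and the fast and slow parts of the rate matrix are assumed to have the forms of A and B shown in the first line, with a = \kappa + \lambda_\gamma.

```latex
\begin{align*}
A &= \begin{pmatrix} -a & \kappa & \lambda_\gamma \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},
\qquad
B = \begin{pmatrix} 0 & 0 & 0 \\ \kappa & -\kappa & 0 \\ 0 & 0 & 0 \end{pmatrix},
\qquad
A^n = (-a)^{n-1} A \quad (n \ge 1), \\
e^{tA} &= I + \sum_{n \ge 1} \frac{t^n}{n!} (-a)^{n-1} A
        = I + \frac{1 - e^{-at}}{a}\, A, \\
P &= \lim_{t \to \infty} e^{tA} = I + \frac{1}{a} A
   = \begin{pmatrix} 0 & \kappa/a & \lambda_\gamma/a \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \\
PBP &= \frac{\kappa \lambda_\gamma}{\kappa + \lambda_\gamma}
   \begin{pmatrix} 0 & -\kappa/a & \kappa/a \\ 0 & -1 & 1 \\ 0 & 0 & 0 \end{pmatrix}.
\end{align*}
```

Under these assumptions, two lines in different demes coalesce at rate \kappa\lambda_\gamma/(\kappa + \lambda_\gamma) on the slow timescale, and two lines in the same deme first coalesce instantaneously with probability \lambda_\gamma/a or separate with probability \kappa/a.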

By similar arguments we can show that the ancestral process does indeed result in coalescence regardless of initial state. Indeed,

Formula A24(A24)
which gives, writing b = {kappa}{lambda}{gamma}/({kappa} + {lambda}{gamma}),

Formula A25(A25)
The equilibrium distribution (as t → ∞) is then

Formula A26(A26)
and we obtain that two genes do reach a common ancestor regardless of initial state—i.e.,

Formula A27(A27)

Number of segregating sites:

We now consider the number of segregating sites, under the infinite-sites model (WATTERSON 1975), between two genes in the two models of population subdivision discussed previously. First, consider the model with a finite number of subpopulations. Let K0 and K1 denote the number of segregating sites for two genes sampled from the same or different subpopulations, respectively. We let {theta} denote the scaled mutation rate and note that, given a specific length of time t, the number of segregating sites is Poisson distributed with mean {theta}t/2. The probability distribution for the number of segregating sites for two genes sampled from the same subpopulation is

Formula A28(A28)
in which A1 and A2 are given in Equation A4, and {lambda}i = –ri from Equation A5.
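Each exponential component of such a distribution arises by mixing a Poisson count over an exponential waiting time. The sketch below checks this building block in the simplest case, assuming a single exponential coalescence time with a hypothetical rate parameter `rate` (not the full two-term density of Equation A3): if K given T = t is Poisson with mean {theta}t/2, then K is geometric with success probability rate/(rate + {theta}/2). The closed form is compared with direct numerical integration.

```python
import math


def pmf_closed_form(k: int, theta: float, rate: float) -> float:
    """Geometric pmf obtained by integrating Poisson(theta*t/2) over Exp(rate)."""
    p = rate / (rate + theta / 2.0)
    return p * (1.0 - p) ** k


def pmf_by_quadrature(k: int, theta: float, rate: float,
                      t_max: float = 40.0, steps: int = 40000) -> float:
    """Midpoint-rule integral of the Poisson pmf times the exponential density."""
    h = t_max / steps
    half = theta / 2.0
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        poisson = math.exp(-half * t) * (half * t) ** k / math.factorial(k)
        density = rate * math.exp(-rate * t)
        total += poisson * density * h
    return total


if __name__ == "__main__":
    for k in range(4):
        print(k, pmf_closed_form(k, 2.0, 1.0), pmf_by_quadrature(k, 2.0, 1.0))
```

For this single-exponential case the mean of K is {theta}/(2 rate), so a lower coalescence rate translates directly into more segregating sites.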

The probability distribution for the number of segregating sites between two genes sampled from different subpopulations is

Formula A29(A29)
in which Bi = Ai{kappa}/({kappa} + (D – 1)ri) for i = 1, 2. The expected value and variance of K0 are both less than the corresponding quantities for K1. Indeed,

Formula A30(A30)
and

Formula A31(A31)
The probability mass function of the number of segregating sites K for two genes sampled at random from the collection of subpopulations is

Formula A32(A32)
in which Ci = Ai/D + (D – 1)Bi/D for i = 1, 2 and Formula A32. One can write K ~ (1/D)K0 + (1 – 1/D)K1; i.e., K is a mixture distribution. In fact, K is distributed as K0 with probability 1/D and as K1 with probability 1 – 1/D. Hence,

\[ E(K) = \frac{1}{D}\,E(K_0) + \left(1 - \frac{1}{D}\right) E(K_1) \tag{A33} \]
and so E(K0) < E(K) < E(K1). However, the variance of K (Equation A34), which can be obtained in the same way as the variance of the time to coalescence of two genes sampled at random from the collection of subpopulations (Equation A11), may not lie between the variance of K0 and K1. The variance of K is

Formula A34(A34)

Number of segregating sites under the many-demes limit model:

Under the many-demes limit model of population subdivision, the probability mass function for the number of segregating sites K1 between two genes sampled from different subpopulations is

Formula A35(A35)
The expected value and the variance of the number of segregating sites between two genes sampled from different subpopulations are then

Formula A36(A36)

Formula A37(A37)
The probability mass function for K0, the number of segregating sites between two genes sampled from the same subpopulation, is (k = 0, 1, ...)

Formula A38(A38)
which gives expected value

Formula A39(A39)
and the variance of K0 is

Formula A40(A40)


ACKNOWLEDGEMENTS
We thank two anonymous reviewers for helpful comments and suggestions.


LITERATURE CITED

ÁRNASON, E., 2004 Mitochondrial cytochrome b variation in the high-fecundity Atlantic cod: trans-Atlantic clines and shallow gene genealogy. Genetics 166: 1871–1885.

ÁRNASON, E., P. H. PETERSEN, K. KRISTINSSON, H. SIGURGÍSLASON and S. PÁLSSON, 2000 Mitochondrial cytochrome b DNA sequence variation of Atlantic cod from Iceland and Greenland. J. Fish Biol. 56: 409–430.

AVISE, J. C., 1994 Molecular Markers, Natural History and Evolution. Chapman & Hall, New York.

AVISE, J. C., R. M. BALL and J. ARNOLD, 1988 Current versus historical population sizes in vertebrate species with high gene flow: a comparison based on mitochondrial DNA lineages and inbreeding theory for neutral mutations. Mol. Biol. Evol. 5: 331–344.

BECKENBACH, A. T., 1994 Mitochondrial haplotype frequencies in oysters: neutral alternatives to selection models, pp. 188–198 in Non-Neutral Evolution, edited by B. GOLDING. Chapman & Hall, New York.

BEKKEVOLD, D. M., M. HANSEN and V. LOESCHCKE, 2002 Male reproductive competition in spawning aggregations of cod (Gadus morhua L.). Mol. Ecol. 11: 91–102.

BOILEAU, M. G., P. D. N. HEBERT and S. S. SCHWARTZ, 1992 Non-equilibrium gene frequency divergence: persistent founder effects in natural populations. J. Evol. Biol. 5: 25–39.

BOOM, J. D. G., E. G. BOULDING and A. T. BECKENBACH, 1994 Mitochondrial DNA variation in introduced populations of Pacific oyster, Crassostrea gigas, in British Columbia. Can. J. Fish. Aquat. Sci. 51: 1608–1614.

BURTON, R. S., 1983 Protein polymorphisms and genetic differentiation of marine invertebrate populations. Mar. Biol. Lett. 4: 193–206.

CANNINGS, C., 1974 The latent roots of certain Markov chains arising in genetics: a new approach. Adv. Appl. Probab. 6: 260–290.

CHAMBERS, R. C., and K. G. WAIWOOD, 1996 Maternal and seasonal differences in egg sizes and spawning characteristics of captive Atlantic cod, Gadus morhua. Can. J. Fish. Aquat. Sci. 53: 1986–2003.

CROW, J. F., and M. KIMURA, 1970 Introduction to Population Genetics Theory. Harper & Row, New York.

DAVID, P., M. PERDIEU, A. PERNOT and P. J. JARNE, 1997 Fine-grained spatial and temporal population genetic structure in the marine bivalve Spisula ovalis L. Evolution 51: 1318–1322.

EDMANDS, S., P. E. MOBERG and R. S. BURTON, 1996 Allozyme and mitochondrial DNA evidence of population subdivision in the purple sea urchin Strongylocentrotus purpuratus. Mar. Biol. 126: 443–450.

ELDON, B., and J. WAKELEY, 2006 Coalescent processes when the distribution of offspring number among individuals is highly skewed. Genetics 172: 2621–2633.

FISHER, R. A., 1930 The Genetical Theory of Natural Selection. Clarendon Press, Oxford.

GRIFFITHS, R. C., 1981 The number of heterozygous loci between two randomly chosen completely linked sequences of loci in two subdivided population models. J. Math. Biol. 12: 251–261.

HEDGECOCK, D., 1994a Does variance in reproductive success limit effective population sizes of marine organisms?, pp. 122–134 in Genetics and Evolution of Aquatic Organisms, edited by A. BEAUMONT. Chapman & Hall, London.

HEDGECOCK, D., 1994b Temporal and spatial genetic structure of marine animal populations in the California Current. Calif. Coop. Ocean Fish. Invest. Rep. 35: 73–81.

HEDGECOCK, D., M. TRACEY and K. NELSON, 1982 Genetics, pp. 297–403 in The Biology of Crustacea, Vol. 2, edited by L. G. ABELE. Academic Press, New York.

HEDRICK, P. W., 2005 Large variance in reproductive success and the Ne/N ratio. Evolution 59: 1596–1599.

HERBOTS, H. M., 1997 The structured coalescent, pp. 231–255 in Progress of Population Genetics and Human Evolution, edited by P. DONNELLY and S. TAVARÉ. Springer, New York.

HEY, J., 1991 A multi-dimensional coalescent process applied to multi-allelic selection models and migration. Theor. Popul. Biol. 39: 30–48.

HURTADO, L. A., R. A. LUTZ and R. C. VRIJENHOEK, 2004 Distinct patterns of genetic differentiation among annelids of eastern Pacific hydrothermal vents. Mol. Ecol. 13: 2603–2615.

JOHNSON, M. S., and R. BLACK, 1984 Pattern beneath the chaos: the effect of recruitment on genetic patchiness in an intertidal limpet. Evolution 38: 1371–1383.

JOHNSON, S. B., C. R. YOUNG, W. J. JONES, A. WARÉN and R. C. VRIJENHOEK, 2006 Migration, isolation, and speciation of hydrothermal vent limpets (Gastropoda; Lepetodrilidae) across the Blanco Transform Fault. Biol. Bull. 210: 140–157.

KINGMAN, J. F. C., 1982a The coalescent. Stoch. Proc. Appl. 13: 235–248.

KINGMAN, J. F. C., 1982b On the genealogy of large populations. J. Appl. Probab. 19A: 27–43.

KJESBU, O. S., P. SOLEMDAL, P. BRATLAND and M. FONN, 1996 Variation in annual egg production in individual captive Atlantic cod (Gadus morhua). Can. J. Fish. Aquat. Sci. 53: 610–620.

LI, W., 1976 Distribution of nucleotide difference between two randomly chosen cistrons in a subdivided population: the finite island model. Theor. Popul. Biol. 10: 303–308.

LYNCH, M., and T. J. CREASE, 1990 The analysis of population survey data on DNA sequence variation. Mol. Biol. Evol. 7: 377–394.

MAY, A. W., 1967 Fecundity of Atlantic cod. J. Fish. Res. Brd. Can. 24: 1531–1551.

MÖHLE, M., 1998 A convergence theorem for Markov chains arising in population genetics and the coalescent with selfing. Adv. Appl. Probab. 30: 493–512.

MORAN, P. A. P., 1958 Random processes in genetics. Proc. Camb. Philos. Soc. 54: 60–71.

MORAN, P. A. P., 1962 Statistical Processes of Evolutionary Theory. Clarendon Press, Oxford.

MURRAY, M. C., and M. P. HARE, 2006 A genomic scan for divergent selection in a secondary contact zone between Atlantic and Gulf of Mexico oysters, Crassostrea virginica. Mol. Ecol. 15: 4229–4242.

NAGYLAKI, T., 1980 The strong migration limit in geographically structured populations. J. Math. Biol. 9: 101–114.

NATH, H. B., and R. C. GRIFFITHS, 1993 The coalescent in two colonies with symmetric migration. J. Math. Biol. 31: 841–852.

NEI, M., 1972 Genetic distance between populations. Am. Nat. 106: 283–292.

NEI, M., 1973 Analysis of gene diversity in subdivided populations. Proc. Natl. Acad. Sci. USA 70: 3321–3323.

NEI, M., 1975 Molecular Population Genetics and Evolution. Elsevier, New York.

NEI, M., 1982 Evolution of human races at the gene level, pp. 167–181 in Human Genetics, Part A: The Unfolding Genome, edited by B. BONNÉ-TAMIR, P. COHEN and R. N. GOODMAN. Alan R. Liss, New York.

NEI, M., and M. FELDMAN, 1972 Identity of genes by descent within and between populations under mutation and migration pressure. Theor. Popul. Biol. 3: 460–465.

NEI, M., and D. GRAUR, 1984 Extent of protein polymorphism and the neutral mutation theory. Evol. Biol. 17: 73–118.

PALUMBI, S. R., 1994 Genetic divergence, reproductive isolation, and marine speciation. Annu. Rev. Ecol. Syst. 25: 547–572.

PANNELL, J. R., 2003 Coalescence in a metapopulation with recurrent local extinction and recolonization. Evolution 57: 949–961.

PITMAN, J., 1999 Coalescents with multiple collisions. Ann. Probab. 27: 1870–1902.

ROUSSET, F., 2002 Inbreeding and relatedness coefficients: What do they measure? Heredity 88: 371–380.

SAGITOV, S., 1999 The general coalescent with asynchronous mergers of ancestral lines. J. Appl. Probab. 36: 1116–1125.

SARGSYAN, O., and J. WAKELEY, 2008 A coalescent process with simultaneous multiple mergers for approximating the gene genealogies of many marine organisms. Theor. Popul. Biol. 74: 104–114.

SLATKIN, M., 1991 Inbreeding coefficients and coalescence times. Genet. Res. 58: 167–175.

SLATKIN, M., 1993 Isolation by distance in equilibrium and non-equilibrium populations. Evolution 47: 264–279.

SLATKIN, M., and W. P. MADDISON, 1989 A cladistic measure of gene flow inferred from the phylogenies of alleles. Genetics 123: 603–613.

STRATHMANN, M. F., 1987 Reproduction and Development of Marine Invertebrates of the Northern Pacific Coast. University of Washington Press, Seattle.

STRATHMANN, R. R., 1978 The length of pelagic period in echinoderms with feeding larvae from the northeastern Pacific. J. Exp. Mar. Biol. Ecol. 34: 23–27.

STROBECK, C., 1987 Average number of nucleotide differences in a sample from a single subpopulation: a test for population subdivision. Genetics 117: 149–153.

TAKAHATA, N., 1983 Gene identity and genetic differentiation of populations in the finite island model. Genetics 104: 497–512.

TAKAHATA, N., 1988 The coalescent in two partially isolated diffusion populations. Genet. Res. Camb. 52: 213–222.

TURNER, T. F., J. P. WARES and J. R. GOLD, 2002 Genetic effective size is three orders of magnitude smaller than adult census size in an abundant, estuarine-dependent marine fish (Sciaenops ocellatus). Genetics 162: 1329–1339.

WADE, M. J., and D. E. MCCAULEY, 1988 Extinction and recolonization: their effects on the genetic differentiation of local populations. Evolution 42: 995–1005.

WAKELEY, J., 1998 Segregating sites in Wright's island model. Theor. Popul. Biol. 53: 166–174.

WAKELEY, J., 2008 Coalescent Theory: An Introduction. Roberts & Company, Greenwood Village, CO.

WATTERSON, G. A., 1975 On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7: 256–276.

WATTS, R. J., M. S. JOHNSON and R. BLACK, 1990 Effects of recruitment on genetic patchiness in the urchin Echinometra mathaei in Western Australia. Mar. Biol. 105: 145–151.

WHITLOCK, M. C., 1992 Temporal fluctuations in demographic parameters and the genetic variance among populations. Evolution 46: 608–615.

WHITLOCK, M. C., and D. E. MCCAULEY, 1990 Some population genetic consequences of colony formation and extinction: genetic correlations within founding groups. Evolution 44: 1717–1724.

WILKINSON-HERBOTS, H. M., 1998 Genealogy and subpopulation differentiation under various models of population structure. J. Math. Biol. 37: 535–585.

WON, Y., C. R. YOUNG, R. A. LUTZ and R. C. VRIJENHOEK, 2003 Dispersal barriers and isolation among deep-sea mussel populations (Mytilidae: Bathymodiolus) from eastern Pacific hydrothermal vents. Mol. Ecol. 12: 169–184.

WRIGHT, S., 1931 Evolution in Mendelian populations. Genetics 16: 97–159.

WRIGHT, S., 1951 The genetical structure of populations. Ann. Eugen. 15: 323–354.

YOUNG, C. R., S. FUJIO and R. C. VRIJENHOEK, 2008 Directional dispersal between mid-ocean ridges: deep-ocean circulation and gene flow in Ridgeia piscesae. Mol. Ecol. 17: 1718–1731.