Genetics, Vol. 164, 789-795, June 2003, Copyright © 2003

Selection in a Subdivided Population With Local Extinction and Recolonization

Joshua L. Cherry
Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138

Corresponding author: Joshua L. Cherry, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bldg. 45, Bethesda, MD 20894. E-mail: cherry@oeb.harvard.edu

Communicating editor: N. TAKAHATA


*  ABSTRACT

In a subdivided population, local extinction and subsequent recolonization affect the fate of alleles. Of particular interest is the interaction of this force with natural selection. The effect of selection can be weakened by this additional source of stochastic change in allele frequency. The behavior of a selected allele in such a population is shown to be equivalent to that of an allele with a different selection coefficient in an unstructured population with a different size. This equivalence allows use of established results for panmictic populations to predict such quantities as fixation probabilities and mean times to fixation. The magnitude of the quantity $N_e s_e$, which determines fixation probability, is decreased by extinction and recolonization. Thus deleterious alleles are more likely to fix, and advantageous alleles less likely to do so, in the presence of extinction and recolonization. Computer simulations confirm that the theoretical predictions of both fixation probabilities and mean times to fixation are good approximations.


THE consequences of population subdivision for evolution depend on the nature of gene flow between subpopulations. Gene flow might be restricted to ordinary migration, but might also include extinction of subpopulations followed by recolonization. Extinction and recolonization affect not only the amount of neutral variation maintained in the population, but also the efficacy of natural selection compared to stochastic change in allele frequency.

In the absence of extinction and recolonization, subdivision increases the effective population size. Nonetheless, fixation probabilities of alleles subject to genic selection are unaffected by subdivision under fairly general conditions (MARUYAMA 1970, 1974). These facts can in some cases be reconciled with the aid of the notion of effective selection coefficient ($s_e$; CHERRY and WAKELEY 2003).

Extinction and recolonization lower effective population size (SLATKIN 1977; MARUYAMA and KIMURA 1980; WHITLOCK and BARTON 1997). When sufficiently strong they can reduce effective size below the actual population size. Extinction and recolonization can affect the probability of fixation of alleles subject to selection (BARTON 1993).

BARTON (1993) has derived expressions for the fixation probability of a favored allele initially present in a single copy in an infinite population with extinction and recolonization. Here I present results that cover deleterious as well as advantageous alleles, apply to any initial allele frequency, and do not require the assumption of an infinite population size. These results provide not only fixation probabilities but also a complete description of the trajectory of the frequency of a selected allele. This description is equivalent to that of an unstructured population with a different size and a different selection coefficient.


*  MODEL AND RESULTS

To obtain results for a finite island model of subdivision, I first consider a model with a single sink population and a source population with constant allele frequency. The results translate to a quasi-equilibrium for subpopulations in a finite island model.

Consider a haploid population that receives migrants from a source population and is also subject to extinction and subsequent recolonization by a single individual from that source population. Suppose that a locus has two allelic forms, a and A, and that there is no further mutation. Denote by $x$ the frequency of the A allele in the sink population, and let $\bar{x}$ be its (constant) frequency in the source population. Let $m$ be the rate of migration and let $\lambda$ be the probability of extinction and recolonization in any generation. Assume for the moment that there is no selection.

If there were no extinction and recolonization, the equilibrium distribution of the allele frequency $x$ would be well approximated by a β-distribution (WRIGHT 1931). This follows from a diffusion approximation. The case with extinction and recolonization may not be amenable to a diffusion approach because extinction/recolonization events drastically change the allele frequency in the course of a generation. However, we need not be concerned with the exact distribution for the present purposes. Because we are interested only in the mean and variance of the change in allele frequency in a generation, the main concern is to find the expected value of $x(1 - x)$, which is half the expected heterozygosity. The mean change in allele frequency due to selection is approximately proportional to this quantity. The component of the variance of this change that is due to ordinary drift is also approximately proportional to this quantity. The additional variance due to extinction/recolonization events can be calculated separately and combined with this.

It is convenient to think in terms of the "virtual heterozygosity" $H$, the expected value of $2x(1 - x)$. This quantity can be interpreted as the probability that two copies of the locus, drawn uniformly and independently from the population with replacement, are in different allelic states. We can write a recursion for $H$ that depends on the mean allele frequency. For the two copies of the gene to be in different allelic states, it is necessary that there has not just been an extinction/recolonization event (probability $1 - \lambda$) and also that the same copy of the locus has not been sampled twice (probability $1 - 1/N$). If these criteria are met there are three possibilities to consider. If neither sampled allele is a migrant, the probability that the alleles are different is simply the value of $H$ in the previous generation. If exactly one is a migrant, the probability depends on the value of $E[x]$ in the previous generation and is equal to $E[x](1 - \bar{x}) + (1 - E[x])\bar{x}$. If both are migrants, this probability is that of picking two different alleles from the source population, $2\bar{x}(1 - \bar{x})$. Putting this all together gives

$$H_t = (1 - \lambda)\left(1 - \frac{1}{N}\right)\left[(1 - m)^2 H_{t-1} + 2m(1 - m)\bigl(E[x](1 - \bar{x}) + (1 - E[x])\bar{x}\bigr) + 2m^2\,\bar{x}(1 - \bar{x})\right] \qquad (1)$$

where $H_t$ is the heterozygosity at time $t$ and $E[x]$ refers to the expectation in the previous generation. At equilibrium $E[x] = \bar{x}$. The equilibrium condition for $H$ is therefore

$$H = (1 - \lambda)\left(1 - \frac{1}{N}\right)\left[(1 - m)^2 H + \bigl(1 - (1 - m)^2\bigr)\,2\bar{x}(1 - \bar{x})\right] \qquad (2)$$

Solution for $H$ gives

$$H = \frac{(1 - \lambda)(1 - 1/N)\bigl[1 - (1 - m)^2\bigr]}{1 - (1 - \lambda)(1 - 1/N)(1 - m)^2}\,2\bar{x}(1 - \bar{x}) \qquad (3)$$
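The recursion and its fixed point can be checked numerically. The following is a minimal sketch, assuming the recursion has the form described in the text (no extinction event, two distinct copies sampled, then the three migrant cases) and holding $E[x]$ at the source frequency; the function names are hypothetical and not from the original.

```python
def next_h(h_prev, mean_x, xbar, n, m, lam):
    """One generation of the recursion for the virtual heterozygosity H.

    h_prev : H in the previous generation
    mean_x : E[x] in the previous generation
    xbar   : allele frequency in the source population
    n, m, lam : deme size, migration rate, extinction/recolonization probability
    """
    neither_migrant = (1 - m) ** 2 * h_prev
    one_migrant = 2 * m * (1 - m) * (mean_x * (1 - xbar) + (1 - mean_x) * xbar)
    both_migrants = m ** 2 * 2 * xbar * (1 - xbar)
    return (1 - lam) * (1 - 1 / n) * (neither_migrant + one_migrant + both_migrants)


def equilibrium_h(xbar, n, m, lam):
    """Closed-form fixed point of the recursion when E[x] equals xbar."""
    keep = (1 - lam) * (1 - 1 / n)   # no extinction and two distinct copies
    stay = (1 - m) ** 2              # neither sampled copy is a migrant
    return keep * (1 - stay) * 2 * xbar * (1 - xbar) / (1 - keep * stay)


def iterate_h(h0, xbar, n, m, lam, generations):
    """Iterate the recursion with E[x] held at its equilibrium value xbar."""
    h = h0
    for _ in range(generations):
        h = next_h(h, xbar, xbar, n, m, lam)
    return h


if __name__ == "__main__":
    print(iterate_h(0.5, 0.3, 100, 0.001, 0.001, 5000))
    print(equilibrium_h(0.3, 100, 0.001, 0.001))
```

Starting from any initial value, the iterated heterozygosity should settle on the closed-form value, which is one way to confirm that the algebra leading to the equilibrium expression is internally consistent.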

The quantity $F_{ST}$, which is later interpreted as the fractional loss of heterozygosity due to subdivision, is defined by $H = (1 - F_{ST})\,2\bar{x}(1 - \bar{x})$. From Equation 3 it follows that

$$1 - F_{ST} = \frac{(1 - 1/N)\bigl[1 - (1 - m)^2\bigr]}{1 - (1 - 1/N)(1 - m)^2} \cdot \frac{(1 - \lambda)\bigl[1 - (1 - 1/N)(1 - m)^2\bigr]}{1 - (1 - \lambda)(1 - 1/N)(1 - m)^2} \approx \frac{2Nm}{2Nm + N\lambda + 1} \qquad (4)$$

where the approximate equality holds for small $m$, large $N$, and small $\lambda$. The first factor in the exact form would give $1 - F_{ST}$ if there were no extinction and recolonization and is approximately equal to $2Nm/(2Nm + 1)$, a familiar approximation for $1 - F_{ST}$ for an island model with ordinary migration (WRIGHT 1940; DOBZHANSKY and WRIGHT 1941). The second factor represents the additional loss of heterozygosity due to extinction/recolonization events; for $\lambda = 0$ it equals one, and it is less than one for $\lambda > 0$. The approximate equality is equivalent to a case of the expression for $F_{ST}$ given by WHITLOCK and BARTON (1997, Equation 21), with their $k$ equal to 1/2. This approximation makes it clear that extinction and recolonization decrease heterozygosity and that the expression for $1 - F_{ST}$ is close to $2Nm/(2Nm + 1)$ when $\lambda = 0$. Note that $1 - F_{ST}$ is independent of $\bar{x}$, which might not have been obvious from the start.
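To get a feel for how close the approximation is, the exact and approximate expressions for $1 - F_{ST}$ can be compared directly. This is an illustrative sketch: it assumes the exact form follows from the equilibrium heterozygosity derived above and that the approximation is $2Nm/(2Nm + N\lambda + 1)$; the function names are hypothetical.

```python
def one_minus_fst(n, m, lam):
    """Exact 1 - F_ST implied by the equilibrium heterozygosity."""
    keep = (1 - lam) * (1 - 1 / n)   # no extinction and two distinct copies
    stay = (1 - m) ** 2              # neither sampled copy is a migrant
    return keep * (1 - stay) / (1 - keep * stay)


def one_minus_fst_approx(n, m, lam):
    """Approximation for small m, large N, and small lambda."""
    return 2 * n * m / (2 * n * m + n * lam + 1)


if __name__ == "__main__":
    for lam in (0.0, 0.001, 0.01):
        exact = one_minus_fst(100, 0.001, lam)
        approx = one_minus_fst_approx(100, 0.001, lam)
        print(f"lambda={lam}: exact={exact:.5f} approx={approx:.5f}")
```

With $N = 100$ and $m = 0.001$, the two expressions agree to within about one percent of each other, and both fall as $\lambda$ increases.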

The sink population described above serves as a model for a subpopulation in a finite island model. In this model $D$ demes ("islands"), each consisting of $N$ individuals, exchange migrants among themselves and also serve as sources for recolonization. Let $\bar{x}$ now refer to the frequency of the A allele in the population as a whole (the mean of the within-deme allele frequencies). In any generation, the population as a whole in effect serves as a source population for any particular deme, with an allele frequency $\bar{x}$. In the finite island model, in contrast to the source-sink model, the value of $\bar{x}$ changes over time. However, when the number of demes is large, $\bar{x}$ changes only slowly compared to the rate of equilibration of within-deme allele frequencies, unless selection is very strong (see below), and so a quasi-equilibrium is attained (WRIGHT 1931; DOBZHANSKY and WRIGHT 1941). This is so because the stochastic change in the population as a whole is the mean of the $D$ independent within-deme stochastic changes. The quasi-equilibrium distribution of within-deme allele frequency approaches the equilibrium distribution that holds for the sink population of a source-sink model. Thus $F_{ST}$, the fractional loss of heterozygosity due to subdivision, is given approximately by Equation 4. This expression will allow derivation of a diffusion approximation for the behavior of an allele in the population as a whole when the number of demes is large.

The above treatment of the quasi-equilibrium neglected selection. Selection raises two issues. First, if selection is very strong, $\bar{x}$ may change so rapidly that a quasi-equilibrium is not approached. Second, selection will alter the (quasi-) equilibrium distribution of allele frequency, even in the source-sink model, where a true equilibrium is reached. If selection is weak compared to stochastic change in allele frequency within a subpopulation, it has negligible effect on the (quasi-) equilibrium distribution of allele frequency. The mean change due to selection is proportional to $s$, and the variance due to ordinary drift is proportional to $1/N$. Thus a sufficient condition for directional change to be weaker than stochastic change within a deme is

$$|Ns| \ll 1 \qquad (5)$$

which is conservative because it does not take into account the stochastic change due to extinction and recolonization. This condition also guarantees that the rate of change of $\bar{x}$ is small enough that a quasi-equilibrium is approached. The per-generation change in $\bar{x}$ due to selection is at most about $s\bar{x}(1 - \bar{x})$ (this would be the change if all demes had the same allele frequency). The maximum value of this change is therefore $s/4$. As is evident from Equation 1, heterozygosity in a deme decays toward its equilibrium on a timescale of $N$ generations or faster (migration and extinction/recolonization hasten this decay). Thus condition (5) is also sufficient for the quasi-equilibrium to be approached. Because stochastic change in the whole population is a much weaker force than stochastic change in a subpopulation, selection may be sufficiently weak in the sense that $|Ns| \ll 1$ and yet strongly affect the trajectory of allele frequency in the population as a whole. I assume that this condition holds in all that follows.

Let $x_i$ be the allele frequency in the $i$th deme. The mean change in allele frequency in this deme due to selection is approximately $s x_i(1 - x_i)$ in one generation. The mean in the entire population is the mean of the $s x_i(1 - x_i)$ over all $i$, or approximately $s(1 - F_{ST})\bar{x}(1 - \bar{x})$. This mean has the same form as that in a panmictic population, but with $s$ replaced by $(1 - F_{ST})s$. Thus the effective selection coefficient $s_e$ is given by

$$s_e = (1 - F_{ST})\,s \qquad (6)$$

An analogous treatment of the variance in $\Delta\bar{x}$ will yield the effective population size $N_e$. This variance is approximated by a simple function of the expected variance of within-deme change in allele frequency. To obtain this quantity I first derive an expression for this variance conditional on within-deme allele frequency and then take the mean across the quasi-equilibrium distribution of allele frequency.

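Suppose that $x$ is the allele frequency within a deme in one generation and $x'$ is the frequency in the next. The variance of $\Delta x$, the change in allele frequency, is equal to the variance of $x'$, the second moment of $x'$ about $E[x']$. Because the mean change in allele frequency in a single generation is small for $s, m, \lambda \ll 1$, $E[x'] \cong x$. The distribution of $x'$ is a mixture of two components, one corresponding to binomial sampling and another corresponding to extinction and recolonization. Conditional on no extinction, the second moment about $E[x']$ is $(1/N)E[x'](1 - E[x'])$ or approximately $(1/N)x(1 - x)$. Conditional on extinction and recolonization, $x'$ is 1 with probability $\bar{x}$ and 0 with probability $1 - \bar{x}$, so the second moment about $E[x']$ is $\bar{x}(1 - E[x'])^2 + (1 - \bar{x})E[x']^2 \cong \bar{x}(1 - x)^2 + (1 - \bar{x})x^2$. Thus the variance in $x'$ as a function of $x$ is given by

$$\mathrm{Var}(x' \mid x) \approx (1 - \lambda)\frac{1}{N}x(1 - x) + \lambda\left[\bar{x}(1 - x)^2 + (1 - \bar{x})x^2\right] \qquad (7)$$

Because $1 - \lambda \cong 1$, this simplifies to

$$\mathrm{Var}(x' \mid x) \approx \frac{1}{N}x(1 - x) + \lambda\left[\bar{x}(1 - x)^2 + (1 - \bar{x})x^2\right] \qquad (8)$$

The expected value of this expression follows readily from our expressions for $E[x]$ and $E[x(1 - x)]$. Using $E[x] = \bar{x}$ and $E[x(1 - x)] = (1 - F_{ST})\bar{x}(1 - \bar{x})$ we obtain

$$E\bigl[\mathrm{Var}(x' \mid x)\bigr] \approx \left[\frac{1 - F_{ST}}{N} + \lambda(1 + F_{ST})\right]\bar{x}(1 - \bar{x}) \qquad (9)$$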
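The step from the conditional variance to its expectation relies on one identity: for any distribution of $x$ with mean $\bar{x}$, if $F_{ST}$ is defined through $E[x(1 - x)] = (1 - F_{ST})\bar{x}(1 - \bar{x})$, then $E[\bar{x}(1 - x)^2 + (1 - \bar{x})x^2] = (1 + F_{ST})\bar{x}(1 - \bar{x})$. The sketch below checks this numerically on an arbitrary discrete distribution; the function name and the example distribution are hypothetical and chosen only for illustration.

```python
def extinction_term_identity(freqs, weights):
    """Return both sides of the identity for a discrete distribution of x.

    freqs   : possible within-deme allele frequencies
    weights : non-negative weights (normalized internally to probabilities)
    """
    total = sum(weights)
    probs = [w / total for w in weights]
    xbar = sum(p * x for p, x in zip(probs, freqs))
    # E[x(1 - x)] defines F_ST for this distribution.
    mean_het = sum(p * x * (1 - x) for p, x in zip(probs, freqs))
    fst = 1 - mean_het / (xbar * (1 - xbar))
    # Left side: expected second moment after an extinction/recolonization event.
    lhs = sum(
        p * (xbar * (1 - x) ** 2 + (1 - xbar) * x ** 2)
        for p, x in zip(probs, freqs)
    )
    # Right side: the form that appears in the expected variance.
    rhs = (1 + fst) * xbar * (1 - xbar)
    return lhs, rhs


if __name__ == "__main__":
    print(extinction_term_identity([0.0, 0.2, 0.5, 1.0], [3, 1, 2, 4]))
```

Because the identity holds for any distribution with the stated mean, the exact shape of the quasi-equilibrium distribution does not need to be known to obtain the expected variance.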

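Therefore $V_{\Delta\bar{x}}$, the variance in the change in population-wide allele frequency $\bar{x}$, is given by

$$V_{\Delta\bar{x}} \approx \frac{1}{D}\left[\frac{1 - F_{ST}}{N} + \lambda(1 + F_{ST})\right]\bar{x}(1 - \bar{x}) \qquad (10)$$

This variance is proportional to $\bar{x}(1 - \bar{x})$, as it is in a panmictic population. The definition of variance effective population size is given by $V_{\Delta\bar{x}} = \bar{x}(1 - \bar{x})/N_e$. Therefore

$$N_e = \frac{ND}{(1 - F_{ST}) + N\lambda(1 + F_{ST})} \qquad (11)$$

with $F_{ST}$ given by Equation 4. $N_e$ can be larger or smaller than the actual population size $ND$, depending on the parameters $N$, $m$, and $\lambda$.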
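As a worked example, the effective parameters can be computed for a given set of $N$, $D$, $m$, $\lambda$, and $s$. This sketch assumes $s_e = (1 - F_{ST})s$ and $N_e = ND/[(1 - F_{ST}) + N\lambda(1 + F_{ST})]$, with the exact equilibrium expression for $1 - F_{ST}$; the function name is hypothetical.

```python
def effective_parameters(n, d, m, lam, s):
    """Return (F_ST, s_e, N_e) for a finite island model with
    extinction and recolonization by a single haploid founder."""
    keep = (1 - lam) * (1 - 1 / n)
    stay = (1 - m) ** 2
    g = keep * (1 - stay) / (1 - keep * stay)   # 1 - F_ST (exact form)
    fst = 1 - g
    s_e = g * s                                  # effective selection coefficient
    n_e = n * d / (g + n * lam * (1 + fst))      # effective population size
    return fst, s_e, n_e


if __name__ == "__main__":
    fst, s_e, n_e = effective_parameters(n=100, d=100, m=0.001, lam=0.001, s=3e-4)
    print(f"F_ST = {fst:.4f}, s_e = {s_e:.3e}, N_e = {n_e:.0f}")
    # Without extinction, N_e exceeds the census size N * D.
    print(effective_parameters(100, 100, 0.001, 0.0, 3e-4)[2])
```

For $N = 100$, $D = 100$, $m = 0.001$, $\lambda = 0.001$, and $s = 3 \times 10^{-4}$, this gives $s_e$ near $4.57 \times 10^{-5}$ and $N_e$ near $2.97 \times 10^4$. Setting $\lambda = 0$ gives an $N_e$ above the census size $ND$, while a larger value such as $\lambda = 0.01$ pushes it below.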

These results show that a selected allele in a subdivided population with extinction and recolonization behaves much like an allele with a different selection coefficient in a panmictic population with a different size. This follows from the fact that both the mean and the variance of the change in allele frequency are approximately proportional to $\bar{x}(1 - \bar{x})$, as they are in a panmictic population (expressions for this mean and variance completely determine the diffusion approximation). The parameters of the equivalent panmictic population, $s_e$ and $N_e$, are given by Equation 6 and Equation 11. In the presence of extinction and recolonization, the value of $N_e s_e$, which determines fixation probability, is different from its value in a panmictic population. The magnitude of this product is decreased by extinction and recolonization. Specifically,

$$N_e s_e = \frac{(1 - F_{ST})\,NDs}{(1 - F_{ST}) + N\lambda(1 + F_{ST})} \qquad (12)$$
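The dependence of the product on the extinction rate can be illustrated with a short sweep over $\lambda$. This is a sketch under the assumption that the product equals $(1 - F_{ST})NDs/[(1 - F_{ST}) + N\lambda(1 + F_{ST})]$; the function name and the particular $\lambda$ values are hypothetical choices for illustration.

```python
def ne_se(n, d, m, lam, s):
    """Product of effective size and effective selection coefficient."""
    keep = (1 - lam) * (1 - 1 / n)
    stay = (1 - m) ** 2
    g = keep * (1 - stay) / (1 - keep * stay)   # 1 - F_ST
    fst = 1 - g
    return n * d * s * g / (g + n * lam * (1 + fst))


if __name__ == "__main__":
    for lam in (0.0, 0.001, 0.01, 0.1):
        print(f"lambda = {lam}: N_e s_e = {ne_se(100, 100, 0.001, lam, 1e-3):.4f}")
```

At $\lambda = 0$ the product reduces to $NDs$, the value for a population of the same total size without structure, and its magnitude shrinks steadily as $\lambda$ grows.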


*  COMPUTER SIMULATIONS

To test the approximations used above, I ran computer simulations and compared the results to theoretical predictions. In these simulations the state of the population is represented by an array of $D$ integers, each corresponding to a deme. Each integer indicates the number of copies of allele A in the deme and hence ranges from 0 to $N$. Each generation the new value for each deme is determined as follows. With probability $\lambda$ the deme undergoes extinction and recolonization, after which the new number of A alleles is $N$ with probability $\bar{x}$ and 0 with probability $1 - \bar{x}$. With probability $1 - \lambda$ the deme does not go extinct and the new number is chosen from a binomial distribution. The index parameter $n$ of this binomial (number of "trials") is equal to $N$. The probability parameter $p$ (probability of "success") is determined by the current allele frequency in the deme $x_i$, the population-wide mean allele frequency $\bar{x}$, the migration rate $m$, and the selection coefficient $s$. Let $p^* = (1 - m)x_i + m\bar{x}$. This would be the expected allele frequency in the $i$th deme in the next generation if there were no selection. Therefore $p = (1 + s)p^*/(1 + sp^*)$.
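The per-generation update described above might be sketched as follows. This is not the author's simulation code: it is a minimal illustration using only the Python standard library, in which the function names are hypothetical, the post-migration frequency is taken to be $(1 - m)x_i + m\bar{x}$, and the binomial draw is implemented naively as a sum of $N$ Bernoulli trials.

```python
import random


def selected_frequency(x_i, xbar, m, s):
    """Binomial success probability for a deme: migration, then selection."""
    p_star = (1 - m) * x_i + m * xbar          # expected frequency without selection
    return (1 + s) * p_star / (1 + s * p_star)


def next_generation(counts, n, m, lam, s, rng):
    """Advance every deme by one generation.

    counts : list of D integers, copies of allele A in each deme (0..n)
    rng    : a random.Random instance
    """
    xbar = sum(counts) / (n * len(counts))     # population-wide allele frequency
    new_counts = []
    for count in counts:
        if rng.random() < lam:
            # Extinction, then recolonization by a single founder.
            new_counts.append(n if rng.random() < xbar else 0)
        else:
            p = selected_frequency(count / n, xbar, m, s)
            # Binomial(n, p) draw as a sum of Bernoulli trials.
            new_counts.append(sum(rng.random() < p for _ in range(n)))
    return new_counts


if __name__ == "__main__":
    rng = random.Random(2003)
    demes = [50] * 100                         # D = 100 demes at frequency 1/2
    for _ in range(10):
        demes = next_generation(demes, n=100, m=0.001, lam=0.001, s=3e-4, rng=rng)
    print(sum(demes) / (100 * 100))
```

The states in which every deme has 0 copies or every deme has $N$ copies are absorbing under this update, so repeated calls can be run until loss or fixation to estimate fixation probabilities and times.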

Figure 1 compares the distribution of allele frequencies among many independent simulation runs to theoretical predictions at two time points. In the simulations $N = 100$, $D = 100$, $s = 3 \times 10^{-4}$, $m = 0.001$, $\lambda = 0.001$, and the initial allele frequency was 1/2 (each deme initially contained 50 A and 50 a alleles). The predictions shown were obtained by iteration of the transition matrix for a Wright-Fisher population of 100 individuals. The selection coefficient for this population was chosen such that the product of the population size and the selection coefficient was equal to $N_e s_e$, with $s_e$ and $N_e$ given by Equation 6 and Equation 11. Time was scaled to account for the difference in effective size between the small Wright-Fisher population and the subdivided population: the number of generations in the Wright-Fisher population was smaller than that in the simulations by a factor of $N_e/100$. The plots demonstrate excellent agreement between the predictions and the simulation results, confirming that the trajectory of allele frequency in the structured population is similar to that in a panmictic population with parameters given by Equation 6 and Equation 11.



Figure 1. Predicted and observed distributions of allele frequencies at two times. In all simulations $N = 100$, $D = 100$, $s = 3 \times 10^{-4}$, $m = 0.001$, $\lambda = 0.001$, and the initial allele frequency was 1/2. For these parameters $N_e = 29{,}668$, about half of what it would be with no extinction and recolonization, and $s_e = 4.57 \times 10^{-5}$. The histograms (bars) represent the results of 500,000 simulation runs. The predictions ("stair" plots) are based on a Wright-Fisher population of size 100, with a selection coefficient of 0.01357. (A) The distribution after 5044 generations. The prediction is based on 17 generations of the small Wright-Fisher population. (B) The distribution after 10,087 generations. The prediction is based on 34 generations of the Wright-Fisher population.
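Predictions of this kind come from iterating a Wright-Fisher transition matrix. The following sketch shows one way that could be done in plain Python; it assumes a haploid Wright-Fisher model in which an allele at frequency $x$ is sampled with probability $(1 + s)x/(1 + sx)$, and the function names are hypothetical rather than taken from the original.

```python
from math import comb


def wf_transition_matrix(n, s):
    """Transition matrix of a haploid Wright-Fisher population of size n.

    Entry [i][j] is the probability of going from i to j copies of allele A.
    """
    rows = []
    for i in range(n + 1):
        x = i / n
        p = (1 + s) * x / (1 + s * x)          # frequency after selection
        rows.append(
            [comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(n + 1)]
        )
    return rows


def evolve(dist, matrix, generations):
    """Propagate a probability distribution over allele counts."""
    size = len(dist)
    for _ in range(generations):
        dist = [
            sum(dist[i] * matrix[i][j] for i in range(size)) for j in range(size)
        ]
    return dist


if __name__ == "__main__":
    matrix = wf_transition_matrix(100, 0.01357)
    start = [0.0] * 101
    start[50] = 1.0                             # initial allele frequency 1/2
    after_17 = evolve(start, matrix, 17)        # corresponds to panel A
    mean_freq = sum(j / 100 * pj for j, pj in enumerate(after_17))
    print(f"mean allele frequency after 17 generations: {mean_freq:.4f}")
```

The generation counts follow from the time scaling in the legend: 5044 generations of the structured population divided by $N_e/100 \approx 296.7$ is about 17, and 10,087 divided by the same factor is about 34.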

Predictions of fixation probabilities and fixation times follow from a combination of the theory presented here with classical diffusion results. Substitution of $N_e$ and $s_e$ for $N$ and $s$ in a familiar expression for fixation probability (WRIGHT 1931) gives this probability as

$$\frac{1 - e^{-2N_e s_e \bar{x}_0}}{1 - e^{-2N_e s_e}}$$

where $\bar{x}_0$ is the initial allele frequency in the population and $s_e$ and $N_e$ are given by Equation 6 and Equation 11. Similarly, use of an expression given by KIMURA and OHTA (1969, Equation 17), with $s_e$ substituted for $s$ and adjustments made for haploidy, gives predictions of mean times to fixation.
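The fixation probability is straightforward to evaluate numerically. The sketch below assumes the haploid diffusion form $(1 - e^{-2N_e s_e \bar{x}_0})/(1 - e^{-2N_e s_e})$ and uses `math.expm1` to avoid loss of precision when the exponent is small; the function name is hypothetical.

```python
from math import expm1


def fixation_probability(ne_se, x0):
    """Diffusion approximation for the fixation probability.

    ne_se : product of effective size and effective selection coefficient
    x0    : initial population-wide allele frequency
    """
    if ne_se == 0:
        return x0                                # neutral case
    # (1 - exp(-a*x0)) / (1 - exp(-a)) written with expm1 for accuracy.
    return expm1(-2 * ne_se * x0) / expm1(-2 * ne_se)


if __name__ == "__main__":
    total = 100 * 100                            # N * D
    x0 = 1 / total                               # a single initial copy
    # Advantageous allele with N_e s_e = 10 (s = 1e-3, no structure).
    print(fixation_probability(10.0, x0) / x0)   # relative to neutral
```

Dividing by the neutral fixation probability $\bar{x}_0$ gives the relative fixation probability. For a single initial copy with $N_e s_e = 10$ this is close to $2N_e s_e = 20$, as expected for a favored allele.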

Table 1 and Table 2 compare such theoretical predictions to simulation results for $s = 10^{-3}$, $N = 100$, $D = 100$, and various values of $m$ and $\lambda$. Table 1 shows predicted and observed probabilities of fixation of an allele initially present as a single copy, relative to the neutral fixation probability. All of the predictions are close to the observed quantities (largest deviation is 7%). Table 2 shows predicted and observed mean times to fixation. The predictions are very close to the observations, differing by at most 2%. Simulations with $s = 3 \times 10^{-4}$ or $s = 10^{-4}$ also demonstrate a good agreement between prediction and observation (data not shown).


 
Table 1. Predicted and observed relative fixation probabilities for $s = 10^{-3}$, $D = 100$, and $N = 100$


 
Table 2. Predicted and observed fixation times for $s = 10^{-3}$, $D = 100$, and $N = 100$

Table 3 and Table 4 show relative fixation probabilities for selectively disfavored alleles. Table 3 gives results for $s = -3 \times 10^{-4}$. The predictions are all very close to the observations (within 3%). Table 4 shows results for a more strongly disfavored allele ($s = -10^{-3}$), for which fixation probabilities would be minuscule without extinction and recolonization. For some sets of parameters, no fixations were observed in the simulations. This is consistent with the very low predicted values of fixation probability. For the other cases, the agreement between theory and observation is good (maximum deviation is 15%). Without extinction and recolonization, or without population structure altogether, the relative fixation probability would be $4 \times 10^{-8}$. Thus the theory captures the more than seven orders of magnitude change in fixation probability due to extinction and recolonization. Predictions of fixation times are also good for both values of selection coefficient (maximum deviation 3%), as are predictions for $s = -10^{-4}$ (data not shown).
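The scale of this effect can be reproduced with a few lines of arithmetic. The sketch below combines the expressions for $1 - F_{ST}$, $N_e$, and $s_e$ with the diffusion fixation probability for a single initial copy. The parameter pair $m = 0.001$, $\lambda = 0.01$ is an illustrative choice and is not claimed to be a row of Table 4; the function name is hypothetical.

```python
from math import expm1


def relative_fixation_probability(n, d, m, lam, s):
    """Fixation probability of a single copy, relative to the neutral value 1/(n*d)."""
    keep = (1 - lam) * (1 - 1 / n)
    stay = (1 - m) ** 2
    g = keep * (1 - stay) / (1 - keep * stay)    # 1 - F_ST
    n_e = n * d / (g + n * lam * (2 - g))        # note 1 + F_ST = 2 - g
    ne_se = n_e * g * s                          # s_e = (1 - F_ST) * s
    x0 = 1 / (n * d)
    if ne_se == 0:
        return 1.0
    return expm1(-2 * ne_se * x0) / expm1(-2 * ne_se) / x0


if __name__ == "__main__":
    without = relative_fixation_probability(100, 100, 0.001, 0.0, -1e-3)
    with_ext = relative_fixation_probability(100, 100, 0.001, 0.01, -1e-3)
    print(f"no extinction:   {without:.2e}")
    print(f"lambda = 0.01:   {with_ext:.2f}")
    print(f"ratio:           {with_ext / without:.1e}")
```

With $\lambda = 0$ the result is about $4 \times 10^{-8}$, matching the value quoted in the text. With the illustrative $\lambda = 0.01$ the deleterious allele fixes at roughly 60% of the neutral rate, a difference of more than seven orders of magnitude.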


 
Table 3. Predicted and observed relative fixation probabilities for $s = -3 \times 10^{-4}$, $D = 100$, and $N = 100$


 
Table 4. Predicted and observed relative fixation probabilities for $s = -10^{-3}$, $D = 100$, and $N = 100$

Table 5 and Table 6 show results for a smaller number of demes (D = 30). Even with so few demes, predictions of fixation probabilities are close to the simulation results (all within 6%). Fixation times (not shown) are also in excellent agreement with predictions.


 
Table 5. Predicted and observed relative fixation probabilities for $s = 10^{-3}$, $D = 30$, and $N = 100$


 
Table 6. Predicted and observed relative fixation probabilities for $s = -10^{-3}$, $D = 30$, and $N = 100$


*  DISCUSSION

Local extinction and recolonization affect the fates of alleles in subdivided populations. The results presented here describe the trajectory of the frequency of an allele under selection in the presence of this force when the number of demes is large and a deme is recolonized by a single haploid individual. The behavior of the subdivided population was shown to be equivalent to that of a certain panmictic population. The size of this equivalent panmictic population ($N_e$) is different from the actual size of the subdivided population and can be larger or smaller than this actual size. The selection coefficient in the equivalent panmictic population ($s_e$) is different from the actual selection coefficient: it is always smaller in magnitude than the actual selection coefficient and has the same sign. Expressions for $s_e$ and $N_e$ are given by Equation 6 and Equation 11. These allow application of established results for panmictic populations to subdivided populations with extinction and recolonization.

The product $N_e s_e$ determines the probability of fixation of an allele. Extinction and recolonization reduce the magnitude of this quantity. Thus selection plays less of a role in determining the fate of an allele in the presence of extinction and recolonization. This reflects the fact that extinction and recolonization are additional stochastic forces that can overwhelm the directional effects of selection. This result is consistent with those obtained by BARTON (1993) for a favored allele in an infinite population, but applies much more generally.

Computer simulations confirm that these theoretical results make good predictions. Fixation probabilities observed in simulations are close to theoretical predictions. Predicted mean times to fixation also agree well with simulation results.

The results presented here are based on the assumption that an empty deme is recolonized by a single haploid founder. Results can be quite different when recolonization involves more than one founding allele. For example, $F_{ST}$ is raised by extinction and recolonization with just one founding allele, but can be lowered by extinction and recolonization when there are multiple founders (TAKAHATA 1994; WHITLOCK and BARTON 1997). Thus with multiple founders $|s_e|$ can be raised rather than lowered by extinction and recolonization, although $|s_e|$ can never be larger than $|s|$.

BARTON (1993) noted an apparent discrepancy between the $N_e$ that applies to fixation probabilities of selected alleles and the $N_e$ that is relevant to maintenance of neutral variation. However, the value of $N_e$ given for fixation probabilities was based on the assumption that fixation probability depends on $N_e s$. The distinction between $s$ and $s_e$, and hence between $N_e s$ and $N_e s_e$, resolves the apparent discrepancy: the ratio of $s_e$ to $s$ is $1 - F_{ST}$, which approximately equals the ratio of the two values of $N_e$ given by BARTON (1993) for the model analyzed here. Thus the concept of effective selection coefficient allows the use of a single value of $N_e$ to describe both the behavior of neutral variation and the probability of fixation of an allele under selection.

The results presented here have implications for the interpretation of population-genetic data. Attempts have been made to estimate selection coefficients from molecular data. For example, BULMER (1991) and HARTL et al. (1994) estimated the selective cost of a nonoptimal codon in Escherichia coli as on the order of $10^{-9}$ to $10^{-8}$. What such studies actually estimate is the effective selection coefficient $s_e$. This quantity is of interest because it dictates the population-genetic behavior of, in this example, alleles differing by synonymous changes. However, if one is interested in the physiology of bacteria and the magnitude of the decrease in growth rate due to nonoptimal codons, the quantity of interest is $s$. Just as $N_e$ can differ by orders of magnitude from the actual population size, $s_e$ might be radically different from $s$. The difference between $s_e$ and $s$ might explain the discrepancy noted by BULMER (1991) between estimates of "selection coefficients" and predictions based on physiological considerations.

Knowledge of the actual population size could be used to obtain estimates of $s$ if there were no extinction and recolonization, since $N_e s_e$ is unaffected by that type of population structure. In the presence of extinction and recolonization the actual population size is of no help, since $N_e s_e$ is in this case altered by population structure. However, knowledge of $F_{ST}$ could be used to relate $s_e$ to $s$ via Equation 6.


*  ACKNOWLEDGMENTS

I thank Christina Muirhead and Jon Wilkins for comments on the manuscript. This work was supported by National Science Foundation grant DEB-9815367 to John Wakeley.

Manuscript received December 13, 2002; Accepted for publication March 4, 2003.


*  LITERATURE CITED

BARTON, N. H., 1993 The probability of fixation of a favoured allele in a subdivided population. Genet. Res. 62:149-157.

BULMER, M., 1991 The selection-mutation-drift theory of synonymous codon usage. Genetics 129:897-907.

CHERRY, J. L. and J. WAKELEY, 2003 A diffusion approximation for selection and drift in a subdivided population. Genetics 163:421-428.

DOBZHANSKY, T. and S. WRIGHT, 1941 Genetics of natural populations. V. Relations between mutation rate and accumulation of lethals in populations of Drosophila pseudoobscura. Genetics 26:23-51.

HARTL, D. L., E. N. MORIYAMA, and S. A. SAWYER, 1994 Selection intensity for codon bias. Genetics 138:227-234.

KIMURA, M. and T. OHTA, 1969 The average number of generations until fixation of a mutant gene in a finite population. Genetics 61:763-771.

MARUYAMA, T., 1970 On the fixation probability of mutant genes in a subdivided population. Genet. Res. 15:221-225.

MARUYAMA, T., 1974 A simple proof that certain quantities are independent of the geographical structure of population. Theor. Popul. Biol. 5:148-154.

MARUYAMA, T. and M. KIMURA, 1980 Genetic variability and effective population size when local extinction and recolonization of subpopulations are frequent. Proc. Natl. Acad. Sci. USA 77:6710-6714.

SLATKIN, M., 1977 Gene flow and genetic drift in a species subject to frequent local extinctions. Theor. Popul. Biol. 12:253-262.

TAKAHATA, N., 1994 Repeated failures that led to the eventual success in human evolution. Mol. Biol. Evol. 11:803-805.

WHITLOCK, M. C. and N. H. BARTON, 1997 The effective size of a subdivided population. Genetics 146:427-441.

WRIGHT, S., 1931 Evolution in Mendelian populations. Genetics 16:97-159.

WRIGHT, S., 1940 Breeding structure of populations in relation to speciation. Am. Nat. 74:232-248.



