A Diffusion Approximation for Selection and Drift in a Subdivided Population
Joshua L. Cherry and John Wakeley
Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138
Corresponding author: Joshua L. Cherry, Cambridge, MA 02140. E-mail: cherry{at}oeb.harvard.edu
Communicating editor: Y.-X. FU
| ABSTRACT |
|---|
The population-genetic consequences of population structure are of great interest and have been studied extensively. An area of particular interest is the interaction among population structure, natural selection, and genetic drift. At first glance, different results in this area give very different impressions of the effect of population subdivision on effective population size ($N_e$), suggesting that no single value of $N_e$ can completely characterize a structured population. Results presented here show that a population conforming to Wright's island model of subdivision with genic selection can be related to an idealized panmictic population (a Wright-Fisher population). This equivalent panmictic population has a larger size than the actual population; i.e., $N_e$ is larger than the actual population size, as expected from many results for this type of population structure. The selection coefficient in the equivalent panmictic population, referred to here as the effective selection coefficient ($s_e$), is smaller than the actual selection coefficient ($s$). This explains how the fixation probability of a selected allele can be unaffected by population subdivision despite the fact that subdivision increases $N_e$, for the product $N_e s_e$ is not altered by subdivision.
THE genetic consequences of population structure (subdivision of the population, or population viscosity) have been of interest to population geneticists since the beginnings of the field.
Natural selection in a structured population provides what seems like an example of the inadequacy of a single $N_e$ as a descriptor of a population.
Much theoretical work has been done on the effective size of structured populations. In addition to work on simple migration models with no selection (cited above), there have been many efforts to calculate effective size under more general conditions.
The model of population structure considered here is WRIGHT's (1931) island model. Specifically, the version in which a large number of demes ("islands") exchange migrants is considered.
A diffusion approximation is given here for the combined process of genetic drift and genic selection under the island model of subdivision. This diffusion is equivalent to that describing a certain ideal (Wright-Fisher) population. The size of the equivalent Wright-Fisher population ($N_e$, by definition) is larger than that of the actual population. However, this equivalent population also has a smaller selection coefficient, referred to here as the effective selection coefficient $s_e$. The product of population size and selection coefficient is the same for the actual population and the equivalent ideal population, as required for consistency with MARUYAMA's (1970b) result.
| MODEL AND RESULTS |
|---|
Consider a population consisting of $D$ demes, each containing $N$ haploid individuals. Migration occurs at rate $m$. This means that under strict neutrality the parent of an individual would come from within the deme with probability $1 - m$ and from the population at large with probability $m$. With selection, the sampling of alleles from these potential parents is biased toward more fit alleles in the usual way. Two alleles are considered, and further mutation is neglected. One allele has a fitness of $1 + s$ relative to the other. The frequency of this allele in the $i$th deme is denoted by $x_i$, and $\bar{x}$ denotes the mean frequency among demes, i.e., the overall frequency of the allele in the entire population.
For a diffusion approximation we need expressions for the per-generation mean and variance of the change in allele frequency as functions of that allele frequency. These are well established for populations with no structure. The mean change in a panmictic population, $M_{\delta x}$, is given approximately by

$$M_{\delta x} = s x (1 - x),$$

where $x$ is the allele frequency. The variance $V_{\delta x}$ is given approximately by $x(1 - x)/N$ for a haploid Wright-Fisher population consisting of $N$ individuals. For other population models, the effective population size $N_e$ takes the place of $N$ and the variance is given approximately by

$$V_{\delta x} = \frac{x(1 - x)}{N_e}.$$

This equation is essentially the definition of the variance effective population size.
For our subdivided population, no single allele frequency completely describes the population. A variable of obvious interest is the overall allele frequency $\bar{x}$. However, a particular value of $\bar{x}$ can be realized in many different ways. At one extreme, all demes could have the same allele frequency ($x_i = \bar{x}$). At another, a fraction $\bar{x}$ of the demes could have allele frequency one, while the rest have frequency zero. Between these extremes lie a myriad of possibilities. Nonetheless, we can still hope to write a diffusion for $\bar{x}$. The key is that for a particular value of $\bar{x}$, and for given values of $m$ and $N$, we can know roughly what distribution of within-deme allele frequencies to expect when $D$ is large. Most importantly for the present purposes, we can write an expression for the expected value of $x_i(1 - x_i)$, where $x_i$ is a within-deme allele frequency, as a function of $\bar{x}$.
The change in overall allele frequency $\bar{x}$ is the mean of the changes in the $x_i$. The variance in one of these changes is $x_i(1 - x_i)/N$. The variance in the change in $\bar{x}$, the mean of the $x_i$, is equal to

$$\frac{1}{D^2}\sum_i \frac{x_i(1 - x_i)}{N} = \frac{1}{ND}\,\frac{1}{D}\sum_i x_i(1 - x_i).$$

The average of a large number of identically distributed random variables will be close to their common expected value. Thus, for large $D$, the above will be close to

$$\frac{1}{ND} E[x_i(1 - x_i)].$$
Two forces, migration and selection, contribute to the mean change within a deme. Strictly speaking, these forces interact in a way that depends on the order in which selection and migration occur. However, under the usual assumptions that $m \ll 1$ and $s \ll 1$, these components of change can be treated separately. The component due to migration has mean $m(\bar{x} - x_i)$. This quantity sums to zero across demes (migration does not change the overall allele frequency in this model). In a particular deme, the mean change due to selection is $s x_i(1 - x_i)$. Thus the mean change in overall allele frequency is approximately

$$\frac{1}{D}\sum_i s x_i(1 - x_i) \approx s E[x_i(1 - x_i)].$$

Thus we would have the desired expressions for both the mean and the variance of the change in $\bar{x}$ if we had $E[x_i(1 - x_i)]$ as a function of $\bar{x}$.
Imagine that the migrants received by a deme had an allele frequency $\bar{x}$ whose value was fixed for all time. Under these conditions, it is known that the allele frequency in the deme would reach an equilibrium distribution that is given approximately by the probability density

$$C e^{2Nsx} x^{a-1} (1 - x)^{b-1}, \tag{1}$$

where $a = 2Nm\bar{x}$, $b = 2Nm(1 - \bar{x})$, and $C$ is a normalization constant.
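As a check on the form of this equilibrium density, it can be recovered from the within-deme mean and variance of change using Wright's general formula for a stationary distribution. The sketch below assumes $M(x) = m(\bar{x} - x) + s x(1 - x)$ and $V(x) = x(1 - x)/N$ for a single deme with $\bar{x}$ held fixed; the symbol $\phi$ is introduced here only for illustration and is not the article's notation.

```latex
\begin{align*}
\phi(x) &\propto \frac{1}{V(x)} \exp\!\left( 2 \int \frac{M(x)}{V(x)}\, dx \right) \\
\frac{M(x)}{V(x)} &= N s + N m \left( \frac{\bar{x}}{x} - \frac{1 - \bar{x}}{1 - x} \right) \\
2 \int \frac{M(x)}{V(x)}\, dx &= 2 N s x + 2 N m \bar{x} \ln x + 2 N m (1 - \bar{x}) \ln(1 - x) \\
\phi(x) &\propto e^{2 N s x}\, x^{2 N m \bar{x} - 1}\, (1 - x)^{2 N m (1 - \bar{x}) - 1}
\end{align*}
```

The exponents in the last line are $a - 1$ and $b - 1$ with $a = 2Nm\bar{x}$ and $b = 2Nm(1 - \bar{x})$, matching the definitions given above.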
In reality, $\bar{x}$ changes over time. However, if these changes are sufficiently slow, then a deme will be exposed to roughly the same value of $\bar{x}$ for some time and would be approximately at the equilibrium for that value. Will changes in $\bar{x}$ be sufficiently slow?

Drift within a deme is a more rapid process than drift within the population as a whole. Within-deme drift has a characteristic time of $N$ generations, whereas for the population as a whole this time is at least $ND$ generations (this is the limit for high migration; subdivision makes population-wide drift even slower). Migration only speeds the approach of a deme to its equilibrium distribution. Specifically, the deviation of $E[x_i(1 - x_i)]$ from its equilibrium value decreases by a factor of $(1 - 1/N)(1 - m)^2$ each generation. This follows from the recursion relation for $E[x_i(1 - x_i)]$ under the constant-$\bar{x}$ assumption, with the condition that $E[x_i] = \bar{x}$ at the outset. Thus, in the absence of selection, the population will be in a state of quasi-equilibrium, with the distribution of within-deme allele frequencies given above.
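The decay factor quoted above can be recovered from a one-generation recursion. The sketch below assumes that migration (with $\bar{x}$ held constant) precedes binomial sampling of $N$ individuals and that selection is ignored; the shorthand $H_t = E[x_i(1 - x_i)]$ in generation $t$ and the fixed point $H^{*}$ are introduced here only for illustration.

```latex
\begin{align*}
x_i' &= (1 - m)\, x_i + m \bar{x} \\
E[x_i'(1 - x_i')] &= \left[ 1 - (1 - m)^2 \right] \bar{x}(1 - \bar{x}) + (1 - m)^2 H_t \\
H_{t+1} &= \left( 1 - \frac{1}{N} \right) E[x_i'(1 - x_i')] \\
H_{t+1} - H^{*} &= \left( 1 - \frac{1}{N} \right) (1 - m)^2 \left( H_t - H^{*} \right)
\end{align*}
```

The second line uses $E[x_i] = \bar{x}$ and $E[x_i^2] = \bar{x} - H_t$. For example, with $N = 100$ and $m = 0.01$ the factor is about 0.97, so a deviation shrinks by roughly 3% per generation.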
Very strong selection might change $\bar{x}$ so rapidly that this quasi-equilibrium approximation does not hold. However, a simple limit on the magnitude of $s$ guarantees that this approximation works well. The average change in $\bar{x}$ due to selection is at most $s\bar{x}(1 - \bar{x})$ per generation (population structure slows the rate below this, as will be clear from what follows), which cannot be greater in magnitude than $|s|/4$. If $|s|$ is small compared to $1/N$, little change in $\bar{x}$ will occur during the $N$ generations that it takes for within-deme drift to occur, and the quasi-equilibrium will hold.
Under this same condition ($|Ns| \ll 1$), selection is not a strong force in determining the equilibrium distribution of within-deme allele frequencies (Equation 1). This distribution becomes approximately the same as that expected under neutrality. This is a β-distribution whose probability density function (pdf) is

$$\frac{\Gamma(a + b)}{\Gamma(a)\,\Gamma(b)}\, x^{a-1} (1 - x)^{b-1}, \tag{2}$$

with $a = 2Nm\bar{x}$ and $b = 2Nm(1 - \bar{x})$. The same approximation has been used in earlier work.
The assumption that $|Ns| \ll 1$, which is made in all of what follows, does not restrict us to especially weak selection. It allows that $NDs$, the quantity that determines fixation probabilities, can be quite large in magnitude. Indeed if $|Ns|$ is not small compared to 1, $|NDs|$ will be quite large for even moderate $D$. In that case, an infinite-population model would describe the population well: the more fit allele, when not initially rare, would almost certainly go to fixation via a nearly deterministic path, and an advantageous allele present in a single copy would fix with a probability of $2s$.
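The role of $NDs$ in fixation probabilities can be illustrated with the standard diffusion formula for a haploid Wright-Fisher population, $u(p) = (1 - e^{-2Nsp})/(1 - e^{-2Ns})$. This formula is not given in the text above; it is assumed here, with the total size $ND$ standing in for $N$. The function and parameter names are illustrative only.

```python
import math


def fixation_probability(p0, total_size, s):
    """Approximate fixation probability of an allele at initial frequency p0.

    Assumes the haploid diffusion result
        u(p0) = (1 - exp(-2*N*s*p0)) / (1 - exp(-2*N*s)),
    with N taken to be the total population size N*D.
    """
    two_ns = 2.0 * total_size * s
    if abs(two_ns) < 1e-12:
        return p0  # neutral limit: fixation probability equals initial frequency
    # expm1 keeps precision when the exponent is small
    return math.expm1(-two_ns * p0) / math.expm1(-two_ns)


# Single new copy in D = 1000 demes of N = 100 haploids, s = 0.001
D, N, s = 1000, 100, 0.001
u = fixation_probability(1.0 / (N * D), N * D, s)
print(u)  # close to 2*s when N*D*s is large
```

With these values $NDs = 100$, and the result is within a fraction of a percent of $2s$, in line with the infinite-population limit described above.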
We can now write the expected value of $x_i(1 - x_i)$ as a function of $\bar{x}$, using knowledge of the moments of a β-distribution (Equation 2). The first moment of a β-distribution is $a/(a + b)$, and the second moment is $a(a + 1)/[(a + b)(a + b + 1)]$. For the β family member of interest, $a = 2Nm\bar{x}$ and $b = 2Nm(1 - \bar{x})$. Substitution and simplification yield

$$E[x_i(1 - x_i)] = \frac{2Nm}{2Nm + 1}\,\bar{x}(1 - \bar{x}). \tag{3}$$
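Spelled out, the substitution leading to Equation 3 runs as follows. Only the β-distribution moments quoted above and the identity $a + b = 2Nm$ are used.

```latex
\begin{align*}
E[x_i] &= \frac{a}{a + b} = \bar{x} \\
E[x_i^2] &= \frac{a(a + 1)}{(a + b)(a + b + 1)} = \frac{\bar{x}\,(2Nm\bar{x} + 1)}{2Nm + 1} \\
E[x_i(1 - x_i)] &= \bar{x} - \frac{\bar{x}\,(2Nm\bar{x} + 1)}{2Nm + 1}
                 = \frac{2Nm}{2Nm + 1}\,\bar{x}(1 - \bar{x})
\end{align*}
```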
Thus, the mean of the within-deme quantity $x_i(1 - x_i)$ is proportional to $\bar{x}(1 - \bar{x})$. The proportionality constant $2Nm/(2Nm + 1)$ is a familiar expression for $1 - F_{ST}$ under an island model, where $F_{ST}$ is the fractional decrease in heterozygosity due to subdivision. The mean change in $\bar{x}$ is given approximately by

$$M_{\delta\bar{x}} \approx \frac{2Nm}{2Nm + 1}\, s\,\bar{x}(1 - \bar{x}).$$

The variance is given approximately by

$$V_{\delta\bar{x}} \approx \frac{2Nm}{2Nm + 1}\, \frac{\bar{x}(1 - \bar{x})}{ND},$$

which is similar to an expression used in earlier work. Comparing this variance with the panmictic form $x(1 - x)/N_e$ identifies the effective population size as

$$N_e = ND\left(1 + \frac{1}{2Nm}\right).$$

The selection coefficient in the equivalent population, referred to here as the effective selection coefficient and denoted by $s_e$, is given by

$$s_e = \frac{s}{1 + \frac{1}{2Nm}}.$$

The product $N_e s_e$ is equal to $DNs$, as required for consistency with MARUYAMA's (1970b) conclusion that subdivision does not affect fixation probability in this model. Nonetheless, the fact that $N_e$ is larger than the actual population size $ND$, and $s_e$ is smaller than $s$, means that changes in allele frequency happen more slowly than in a panmictic population, as established by previous investigations.
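As a small worked example, the two effective parameters can be computed directly from $D$, $N$, $m$, and $s$. The function name is hypothetical, and the parameter values are of the kind used in the article's simulations.

```python
def effective_parameters(D, N, m, s):
    """Return (N_e, s_e) for the island model.

    N_e = N*D*(1 + 1/(2*N*m)) and s_e = s / (1 + 1/(2*N*m)),
    so the product N_e * s_e equals N*D*s.
    """
    factor = 1.0 + 1.0 / (2.0 * N * m)
    n_e = N * D * factor
    s_e = s / factor
    return n_e, s_e


n_e, s_e = effective_parameters(D=100, N=100, m=0.01, s=0.001)
print(n_e, s_e, n_e * s_e)
```

Here $2Nm = 2$, so subdivision raises the effective size by a factor of 1.5 (from 10,000 to 15,000) and lowers the effective selection coefficient by the same factor, leaving the product at $NDs = 10$.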
| COMPUTER SIMULATIONS |
|---|
The approximations used above for the among-deme distribution of allele frequencies can be tested by comparison of the theoretical predictions to the results of computer simulations. In these simulations the state of the population is represented by an array of $D$ integers, each corresponding to a deme. Each integer indicates the number of copies of the allele in the deme and hence ranges from 0 to $N$. Each generation, the value for each deme is drawn from a binomial distribution. The index parameter $n$ of this binomial is equal to $N$. The probability parameter $p$ is determined by the current allele frequency in the deme $x_i$, the population-wide mean allele frequency $\bar{x}$, the migration rate, and the selection coefficient. Let $x_i' = (1 - m)x_i + m\bar{x}$. This would be the mean allele frequency in the $i$th deme in the next generation if there were no selection. With selection, we have $p = (1 + s)x_i'/(1 + s x_i')$.
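The simulation scheme just described can be sketched in a few lines. This is a minimal illustration, not the authors' program: it uses only the Python standard library, draws each binomial as $N$ Bernoulli trials for simplicity, and all function and variable names are hypothetical.

```python
import random


def next_generation(counts, N, m, s, rng):
    """Advance an island-model population by one generation.

    counts[i] is the number of copies of the allele in deme i (0..N).
    """
    D = len(counts)
    x_bar = sum(counts) / (N * D)  # population-wide allele frequency
    new_counts = []
    for c in counts:
        x_i = c / N
        x_prime = (1 - m) * x_i + m * x_bar          # frequency after migration
        p = (1 + s) * x_prime / (1 + s * x_prime)    # bias due to genic selection
        # Binomial(N, p) draw implemented as N independent Bernoulli trials
        new_counts.append(sum(1 for _ in range(N) if rng.random() < p))
    return new_counts


# Small illustrative run: 50 demes of 20 individuals, starting at frequency 1/2
rng = random.Random(42)
population = [10] * 50
for _ in range(10):
    population = next_generation(population, N=20, m=0.01, s=0.001, rng=rng)
print(sum(population) / (20 * 50))  # overall allele frequency after 10 generations
```

A production simulation with $D = 1000$ and thousands of generations would more likely use a vectorized binomial sampler; the per-trial loop is chosen here only to keep the sketch dependency-free.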
Fig 1 shows the distribution of allele frequencies among demes in one particular generation of a simulation. The parameter values used in the simulation were $D = 1000$, $N = 100$, $m = 0.01$, and $s = 0.001$. The β density function given by Equation 2, with $\bar{x}$ equal to the actual overall allele frequency, should approximate this distribution. This function is also shown in Fig 1, and it agrees well with the observed distribution.
The only aspect of this distribution that is directly relevant to the diffusion is the mean value of the $x_i(1 - x_i)$. Fig 2 compares the observed mean of the $x_i(1 - x_i)$ to the value predicted on the basis of the observed $\bar{x}$. This predicted value is given by Equation 3. Each plotted point represents the predicted and observed values at a time point. The data come from many independent simulations, with $D = 100$, $N = 100$, $m = 0.01$, and $s = 0.001$. The observed values agree well with the predictions. This confirms that the mean of the $x_i(1 - x_i)$ is given, to a good approximation, by a function of $\bar{x}$.
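The comparison underlying Fig 2 amounts to computing two numbers from the array of deme counts at a given time point. The sketch below shows one way to do this; the function names are hypothetical and the counts in the example are made up for illustration, not taken from the simulations.

```python
def observed_mean_pq(counts, N):
    """Mean of x_i * (1 - x_i) over demes, from allele counts per deme."""
    return sum((c / N) * (1 - c / N) for c in counts) / len(counts)


def predicted_mean_pq(counts, N, m):
    """Prediction of Equation 3, based on the observed overall frequency."""
    x_bar = sum(counts) / (N * len(counts))
    two_nm = 2.0 * N * m
    return two_nm / (two_nm + 1.0) * x_bar * (1.0 - x_bar)


# Made-up counts for five demes of N = 100 individuals
example_counts = [10, 40, 55, 70, 25]
print(observed_mean_pq(example_counts, 100))
print(predicted_mean_pq(example_counts, 100, m=0.01))
```

With only five demes the two values need not agree closely; the prediction relies on $D$ being large.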
Another computational test of the analytic approximation involves the evolution of the distribution of the overall allele frequency $\bar{x}$ over time. The diffusion approximation developed here relates this distribution to that describing a certain panmictic population. This can be related to a much smaller population by a scaling of time. This is convenient because it is feasible to obtain an exact numerical solution for this smaller population by repeated application of its transition matrix. This is an alternative to numerical integration of previously published expressions, with $\delta$-functions at the boundaries. Fig 3 compares the resulting prediction for the distribution of $\bar{x}$ to the results of many simulations after 5000 generations. The parameters were $D = 100$, $N = 100$, $m = 0.01$, and $s = 0.001$, and the initial allele frequency was 1/2. The theoretical prediction is in excellent agreement with the simulation results.
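The transition-matrix calculation can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the size of the smaller population is not given in the text, so $n = 150$ is hypothetical, and the usual diffusion scaling is assumed (the product of size and selection coefficient is preserved, and generations are rescaled by $n/N_e$). All names are illustrative.

```python
import math


def wf_transition_matrix(n, s):
    """Transition matrix of a haploid Wright-Fisher population of size n.

    matrix[j][k] is the probability of k copies next generation given
    j copies now, with genic selection applied before binomial sampling.
    """
    matrix = []
    for j in range(n + 1):
        x = j / n
        p = (1 + s) * x / (1 + s * x)
        matrix.append([math.comb(n, k) * p**k * (1 - p) ** (n - k)
                       for k in range(n + 1)])
    return matrix


def evolve(dist, matrix, generations):
    """Apply the transition matrix repeatedly to a probability vector."""
    size = len(dist)
    for _ in range(generations):
        dist = [sum(dist[j] * matrix[j][k] for j in range(size))
                for k in range(size)]
    return dist


# Assumed scaling: N_e = 15000, s_e = 0.001/1.5, 5000 generations,
# mapped onto a hypothetical population of n = 150.
n_e, s_e, t = 15000.0, 0.001 / 1.5, 5000
n = 150
s_small = n_e * s_e / n          # keeps n * s_small equal to N_e * s_e
t_small = round(t * n / n_e)     # 50 generations in the small population

start = [0.0] * (n + 1)
start[n // 2] = 1.0              # initial allele frequency of 1/2
final = evolve(start, wf_transition_matrix(n, s_small), t_small)
print(final[0], final[n])        # probability of loss and of fixation so far
```

The entries `final[0]` and `final[n]` are the probability masses accumulated at the two absorbing boundaries; the interior entries give the distribution of the allele frequency among populations that are still polymorphic.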
A closely related distribution is that of the absorption time, the time until fixation or extinction of an allele. Fig 4 compares the distribution predicted on the basis of $N_e$ and $s_e$ to simulation results for $D = 100$, $N = 100$, $m = 0.001$, $s = 0.0001$, and an initial allele frequency of 1/2. Again there is excellent agreement between the prediction and the outcome of the simulations. As an additional test, the mean of these absorption times can be compared to that predicted by the diffusion approximation. Diffusion theory gives the mean absorption time in a Wright-Fisher population as a certain integral.
Table 1 compares the observed and predicted mean absorption times for a variety of parameter values with an initial allele frequency of 1/2. For D = 100 and N = 100, the simulation results are in excellent agreement with the predictions. All of the observed values are slightly smaller than the predictions, but only by at most a few percent. With D = 1000 and a mere 10 individuals per deme, the simulation results are again close to the theoretical predictions, even with strong selection and weak migration. For smaller numbers of demes this agreement deteriorates somewhat as migration becomes weak, as expected because the predictions involve the assumption that D is large. However, even with as few as 10 demes, the observed means differ from the predictions by <20%.
Table 2 shows results for alleles starting out at a single copy. For the higher migration rates the mean absorption times are in accord with the predicted values. For the lower migration rates the observed means are smaller than the predictions. This phenomenon has nothing to do with selection; it occurs even in its absence. It reflects the fact that extinction, the usual fate of an allele present in a single copy, occurs very quickly. When migration is weak, quasi-equilibrium cannot be achieved this rapidly. In the limit of very low migration, extinction almost always occurs before the allele can spread to other demes, and the time to loss is similar to that in a population of size $N$. A number more informative than the mean absorption time is the mean time until fixation (conditional on eventual fixation rather than on loss). Observed values of this quantity are compared to predictions in Table 2.
In all of the simulations presented above, except where $s = 0$, $|NDs| \geq 1$ ($NDs$ ranges from 1 to 100). Therefore selection has a significant effect on the fate of the allele in the population as a whole. Thus the simulations test the ability of the theory to account for selection; had $|NDs|$ been small, the deviation of the results from the strictly neutral case would be insignificant, and the simulations would test only whether the theory worked well under neutrality. The results demonstrate that the theoretical approximations work well in the presence of significant selection, so long as $|Ns|$ is small compared to one ($Ns$ ranges from 0.01 to 0.1 in the simulations).
| DISCUSSION |
|---|
Despite the fact that the state of a subdivided population cannot be described by a single allele frequency, the trajectory of the overall allele frequency $\bar{x}$ in an island model can be approximated by a diffusion equivalent to that describing a panmictic population.
The equivalent panmictic population has parameter values that are different from their counterparts in the island model. As is the case for many population-genetic models, the size of the hypothetical equivalent Wright-Fisher population, $N_e$, is different from (in this case greater than) the size of the actual population. This state of affairs is familiar in population genetics and is the reason for defining effective population size in the first place. A less familiar aspect of the present result is that the selection coefficient in the hypothetical equivalent population is different from that in the actual population. This motivates the definition, by analogy to effective population size, of the effective selection coefficient $s_e$. In the present case, $s_e < s$. Specifically, subdivision lowers the effective selection coefficient by the same factor by which it raises effective population size. This factor depends on the product of deme size and migration rate ($Nm$), which determines the extent of differentiation among the demes.
The effects of subdivision on $N_e$ and $s_e$ both result from the fact that the expected value of the within-deme quantity $x_i(1 - x_i)$ is smaller than the quantity $\bar{x}(1 - \bar{x})$, which would be relevant if the population were panmictic. Both the mean and the variance of the change in allele frequency are approximately proportional to the mean of $x_i(1 - x_i)$. Subdivision therefore slows down drift and selection by the same factor. While $s_e$ is directly proportional to the mean change in allele frequency, $N_e$ is inversely proportional to the variance, so it changes by the same factor but in the opposite direction. The quantity $N_e s_e$, which is the ratio of the mean to the variance of the change in allele frequency, is unaffected by subdivision. Therefore, as expected from the results of MARUYAMA (1970b), the fixation probability of a selected allele is not altered by subdivision.
| ACKNOWLEDGMENTS |
|---|
We thank Jon Wilkins for helpful discussions. This work was supported by National Science Foundation grant DEB-9815367 to J.W.
Manuscript received February 19, 2002; Accepted for publication October 25, 2002.
| LITERATURE CITED |
|---|
CABALLERO, A., 1994 Developments in the prediction of effective population size. Heredity 73:657-679.
CHESSER, R. K., O. E. RHODES, JR., D. W. SUGG, and A. SCHNABEL, 1993 Effective sizes for subdivided populations. Genetics 135:1221-1232.
DOBZHANSKY, T. and S. WRIGHT, 1941 Genetics of natural populations. V. Relations between mutation rate and accumulation of lethals in populations of Drosophila pseudoobscura. Genetics 26:23-51.
EWENS, W. J., 1979 Mathematical Population Genetics. Springer-Verlag, Berlin.
GILLESPIE, J. H., 1998 Population Genetics: A Concise Guide. The Johns Hopkins University Press, Baltimore.
GREGORIUS, H. R., 1991 On the concept of effective number. Theor. Popul. Biol. 40:269-283.
KIMURA, M., 1955a Solution of a process of random genetic drift with a continuous model. Proc. Natl. Acad. Sci. USA 41:144-150.
KIMURA, M., 1955b Stochastic processes and the distribution of gene frequencies under natural selection. Cold Spring Harbor Symp. Quant. Biol. 20:33-53.
KIMURA, M. and T. OHTA, 1969 The average number of generations until fixation of a mutant gene in a finite population. Genetics 61:763-771.
MARUYAMA, T., 1970a Effective number of alleles in a subdivided population. Theor. Popul. Biol. 1:273-306.
MARUYAMA, T., 1970b On the fixation probability of mutant genes in a subdivided population. Genet. Res. 15:221-225.
MARUYAMA, T., 1972a Distribution of gene frequencies in a geographically structured finite population. I. Distribution of neutral genes and of genes with small effect. Ann. Hum. Genet. 35:411-423.
MARUYAMA, T., 1972b Distribution of gene frequencies in a geographically structured population. 3. Distribution of deleterious genes and genetic correlation between different localities. Ann. Hum. Genet. 36:99-108.
MARUYAMA, T., 1972c Distribution of gene frequencies in a geographically structured population. II. Distribution of deleterious genes and of lethal genes. Ann. Hum. Genet. 35:425-432.
MARUYAMA, T., 1974 A simple proof that certain quantities are independent of the geographical structure of population. Theor. Popul. Biol. 5:148-154.
MARUYAMA, T. and M. KIMURA, 1980 Genetic variability and effective population size when local extinction and recolonization of subpopulations are frequent. Proc. Natl. Acad. Sci. USA 77:6710-6714.
NEI, M. and N. TAKAHATA, 1993 Effective population size, genetic diversity, and coalescence time in subdivided populations. J. Mol. Evol. 37:240-244.
NORDBORG, M., 1997 Structured coalescent processes on different time scales. Genetics 146:1501-1514.
RANNALA, B., 1996 The sampling theory of neutral alleles in an island population of fluctuating size. Theor. Popul. Biol. 50:91-104.
ROUSSET, F., 2001 Inferences from spatial population genetics, pp. 239-269 in Handbook of Statistical Genetics, edited by D. J. BALDING, M. J. BISHOP and C. CANNINGS. John Wiley & Sons, Chichester, England.
SANTIAGO, E. and A. CABALLERO, 1995 Effective size of populations under selection. Genetics 139:1013-1030.
SLATKIN, M., 1977 Gene flow and genetic drift in a species subject to frequent local extinctions. Theor. Popul. Biol. 12:253-262.
SLATKIN, M., 1981 Fixation probabilities and fixation times in a subdivided population. Evolution 35:477-488.
SLATKIN, M., 1991 Inbreeding coefficients and coalescence times. Genet. Res. 58:167-175.
TAKAHATA, N., 1991 Genealogy of neutral genes and spreading of selected mutations in a geographically structured population. Genetics 129:585-595.
WANG, J. and A. CABALLERO, 1999 Developments in predicting the effective size of subdivided populations. Heredity 82:212-226.
WHITLOCK, M. C. and N. H. BARTON, 1997 The effective size of a subdivided population. Genetics 146:427-441.
WRIGHT, S., 1931 Evolution in Mendelian populations. Genetics 16:97-159.
WRIGHT, S., 1939 Statistical Genetics in Relation to Evolution (Actualites Scientifiques et Industrielles, 802: Exposes de Biometrie et de la Statistique Biologique XIII). Hermann et Cie, Paris.
WRIGHT, S., 1940 Breeding structure of populations in relation to speciation. Am. Nat. 74:232-248.
WRIGHT, S., 1969 Evolution and the Genetics of Populations, Vol. 2: The Theory of Gene Frequencies. University of Chicago Press, Chicago.





(xi = 


