- THIS ARTICLE
-
Abstract
- Full Text (PDF)
- Alert me when this article is cited
- Alert me if a correction is posted
- SERVICES
- Similar articles in this journal
- Similar articles in PubMed
- Alert me to new issues of the journal
- Download to citation manager
- Reprints & Permissions
- CITING ARTICLES
- Citing Articles via HighWire
- Citing Articles via Google Scholar
- GOOGLE SCHOLAR
- Articles by Wakeley, J.
- Search for Related Content
- PUBMED
- PubMed Citation
- Articles by Wakeley, J.
Polymorphism and Divergence for Island-Model Species
John Wakeleyaa Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138
Corresponding author: John Wakeley, 16 Divinity Ave., Cambridge, MA 02138., wakeley{at}fas.harvard.edu (E-mail)
Communicating editor: N. TAKAHATA
| ABSTRACT |
|---|
Estimates of the scaled selection coefficient,
of Sawyer and Hartl, are shown to be remarkably robust to population subdivision. Estimates of mutation parameters and divergence times, in contrast, are very sensitive to subdivision. These results follow from an analysis of natural selection and genetic drift in the island model of subdivision in the limit of a very large number of subpopulations, or demes. In particular, a diffusion process is shown to hold for the average allele frequency among demes in which the level of subdivision sets the timescale of drift and selection and determines the dynamic equilibrium of allele frequencies among demes. This provides a framework for inference about mutation, selection, divergence, and migration when data are available from a number of unlinked nucleotide sites. The effects of subdivision on parameter estimates depend on the distribution of samples among demes. If samples are taken singly from different demes, the only effect of subdivision is in the rescaling of mutation and divergence-time parameters. If multiple samples are taken from one or more demes, high levels of within-deme relatedness lead to low levels of intraspecies polymorphism and increase the number of fixed differences between samples from two species. If subdivision is ignored, mutation parameters are underestimated and the species divergence time is overestimated, sometimes quite drastically. Estimates of the strength of selection are much less strongly affected and always in a conservative direction.
ONE of the primary goals of population genetics has been to measure and to understand the role of natural selection in shaping variation within and between species. Now that molecular technologies allow genetic variation to be assayed with relative ease, this goal seems within reach. A number of different approaches to studying selection have been proposed (![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
An obvious shortcoming of these methods is that they assume the species under study are panmictic, i.e., not geographically or otherwise subdivided. It is well known that this assumption is incorrect for many species (![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
The diffusion result is obtained using Theorem 3.3 in ![]()
![]()
![]()
![]()
This work makes a connection between the PRF theory and work on the robustness of the coalescent process to population structure (![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
The main result presented here, besides the existence of the diffusion (9) below, is that, if mutations are introduced at a constant rate per generation and sites segregate independently of one another, the PRF results of ![]()
| THEORY |
|---|
A population or species is assumed to be subdivided into D demes of equal size N. The organisms are assumed to be haploid, but the results will hold for diploid organisms if N is replaced with 2N, if selection is additive, and if migration is gametic. The island model of migration (![]()
![]()
![]()
![]()
-1. The next section establishes the diffusion approximation for the frequency of the mutant allele as D
, but DsD remains finite. The migration rate can vary between 0 and 1 (0 < m
1) and N is assumed to be finite. This is in contrast to the usual assumption that Nm is finite as N goes to infinity.
Considering the number of mutants within each deme, it is apparent that there are exactly N + 1 kinds of demes. Each deme that begins a generation with i copies of the mutant will have mutant frequency
![]() |
(1) |
after migration and selection, where x is the frequency of the mutant in the total population. The next generation within the deme will be produced by randomly sampling N haploid individuals from this distribution. Thus, a deme that contains i copies of the mutant now has probability
![]() |
(2) |
of having j copies at the start of the following generation. Because limD
sD = 0, it is often necessary to consider only one part of Pij:
![]() |
(3) |
The notation o(sD) used in Equation 1 and below means that limD
o(sD)/sD = 0. Thus Pij = P*ij + o(1). The process of drift, described by Equation 2, happens independently within each deme.
Limiting allele frequency dynamics at a single locus:
Let ZDi(t) record the fraction of demes that contain i copies of the mutant and zi(t) be a particular realization of this random variable. Thus, ZD(t) is a Markov chain whose state space consists of all possible configurations of the D demes among the N + 1 mutant-count classes. Appendix A proves a diffusion result for ZD(t) as D goes to infinity and DsD remains finite. Briefly, this is done by using Equation 1 of ![]()
![]()
Ni=0iZDi(t)/N. The random variable XD(t) records the frequency of the mutant in the total population or the average frequency of the mutant among demes (x above). Next, let YD(t) = ZDi(t) -
i(t) be the deviation of ZDi(t) from the equilibrium prediction
i(t). For a given Pij(t), this equilibrium satisfies
![]() |
(4) |
It exists because P* = <P*ij> is ergodic and has a finite number of states. We can set
Ni=0
i(t) = 1, and
i(t) becomes the equilibrium prediction for ZDi(t).
The nature of the diffusion approximation (9) below is that the migration and drift within demes equilibrate quickly in comparison to the rate of drift and selection in the total population. The results show that, to a sufficient order of approximation, demes can be considered to always be at a stochastic equilibrium
j (0
j
N) with respect to migration and drift for a given x. The fraction of demes that have j copies of the mutant converges on
j =
Ni=1
iP*ij, where P*ij is given by Equation 3 with x constant. The distribution
is very well approximated by the hypergeometric distribution
![]() |
(5) |
which is a special case of the multivariate Poly(A) distribution; see Equation 40.13 in ![]()
![]()
![]()
![]()
![]()
The form of Equation 5 was obtained by selecting parameters of a hypergeometric distribution that gave the same mean and variance of allele counts among demes as Equation 4, namely
![]() |
(6) |
![]() |
(7) |
which were obtained using (4) together with the moments of the binomial distribution (P*). Equation 5 is the exact solution to (4) when N
2. In addition, as required by (4): when m approaches 1,
i becomes a binomial distribution with parameters N and x; and as m approaches 0, we have
0 = 1 - x,
N = x, and
j = 0 for 1
j
N - 1. Finally, if xi = j/N is the frequency in some deme i, then as N grows but 2Nm = M remains constant, (5) converges on the well-known ß-distribution result
![]() |
(8) |
which ![]()
![]()
0.007 and the relative error is never >
5%.
|
Appendix A shows that, in the limit as D goes to infinity, the change in x by drift and selection is so much slower than that by migration and drift within demes that the collection of demes is always at the equilibrium
i, which depends on N and m, and of course x. By Theorem 3.3 of ![]()
![]() |
(9) |
in which
= N limD
DsD. Time is measured in units of ND/(1 - F) generations, where F is the fixation coefficient, in this case given by Equation A13 in Appendix A. Thus, the diffusion of x is identical to the usual Wright-Fisher diffusion with genic selection, with the exception that it occurs on a timescale longer than that of the panmictic case by the factor 1/(1 - F). Thus, all the well-known predictions of that model apply; e.g., see chapter 5 of ![]()
![]()
when D
and limD
2NDmD = M, so that F = 1/(M + 1), but with limD
ND/D = 0 (![]()
![]()
![]()
Many independently segregating loci:
If we posit an infinite number of loci, i.e., nucleotide sites, which can sustain mutations and which each evolve according to the diffusion of the previous section independently, then the PRF results of ![]()
![]() |
(10) |
The subscripts in Equation 10 refer to "amino acid replacement" and "synonymous" following ![]()
F
1. The other effect, of course, is to distribute variation among demes as described in the previous section. In addition, the parameter tdiv in ![]()
![]()
Rewriting SAWYER and HARTL's (1992) Equation 13 and Equation 14 in terms of the present notation gives
![]() |
(11) |
![]() |
(12) |
![]() |
(13) |
![]() |
(14) |
for the expected numbers of fixed and polymorphic, synonymous, and replacement differences in two species. When a sample is taken from the two species, as in ![]()
Assume that we have taken a random sample of n sequences from d different demes in one of the species, such that n1, n2, ... , nd are the sample sizes from each deme. We can write in general that the expected number of sites that show i1, i2, ... , id copies of the mutant base in the sample (0
ik
nk) is given by
![]() |
(15) |
where j = a, s. The probability h(ik|x, nk), that ik copies of the mutant base are in the sample of nk items from the kth sampled deme, is an average over the within-deme distribution of allele frequencies:
![]() |
(16) |
If N is large and m correspondingly small, we may wish to use the large-deme approximation:
![]() |
(17) |
That is, when N is large we can approximate the hypergeometric probability that the sample contain ik copies of the mutant allele (present in j copies in the deme) with a binomial distribution and the allele count distribution
j with WRIGHT's (1931) continous ß-distribution of allele frequences, g(xk|x).
Because we have assumed an infinite number of independently segregating sites with collective mutation rates given by (10), the PRF model (![]()
![]()
![]()
![]()
| RESULTS |
|---|
The first result to note is that if each sample is taken from a different deme, the methods of ![]()
dk=1ik <
dk=1nk gives SAWYER and HARTL's (1992) Equation 15 and Equation 19 but with the scaled mutation rates that apply here:
s and
a. Similarly, SAWYER and HARTL's (1992) Equation 17 and Equation 18 are derived by considering the chance that ik = 1 for all k. In sum, inferences about selection coefficients, mutation rates, and divergence times are entirely robust to (island-model) population subdivision when each sample is taken from a different deme.
Inferences from single-deme samples:
At the opposite extreme, consider the case in which all samples are drawn from the same deme within each species. Note that we assume, as in ![]()
). Let n1 and n2 denote the sample sizes from the two species. For this sample, the expected numbers of fixed-synonymous (Ks), fixed-replacement (Ka), polymorphic-synonymous (Ss), and polymorphic-replacement (Sa) sites are given by
![]() |
(18) |
![]() |
(19) |
![]() |
(20) |
![]() |
(21) |
in which H(x, n) = 1 - h(n|x, n) - h(0|x, n). The results from Limiting allele frequency dynamics at a single locus are used to compute h(n|x, n) and h(0|x, n). Namely,
![]() |
(22) |
This same equation can be used to compute h(0|x, n) = h(n|1 - x, n).
Fig 2 plots the expected values of Ks, Ka, Ss, and Sa as functions of the migration rate when n1 = n2 = 10 and N = 100 and for three different values of
: -2, 0, and 2. The results are as expected for single-deme samples. When m = 1, they are the same as in a panmictic population. As m decreases, samples from single demes tend to be closely related, so the numbers of polymorphisms will decrease and the numbers of (apparent) fixation events will increase. This is true regardless of whether
is positive, zero, or negative, although the relative magnitudes of the four quantities depend strongly on
. The curves for E(Ks) and E(Ss) are, of course, identical for all values of
. The results that would be obtained by assuming limN
2Nm = M and using Equation 8 and Equation 17 would be similar to what is shown in Fig 2 if M were varied from 0.02 to 200.
|
To understand the effects of (island-model) population subdivision for the extreme case of single-deme samples, we can use the "data" of Fig 2 to fit the parameters of SAWYER and HARTL's (1992) panmictic model. Fig 3 shows that estimates of
are remarkably robust to subdivision, even in this case, where the effects of subdivision should be strongest. Again, if samples were taken singly from different demes, there would be no error in using the panmictic model. For single-deme samples there is some error when the migration rate is low, but even in the extreme case of m = 10-4 (2Nm = 0.02) the estimates are off only by
25%. However, the level of error will be greater for larger samples (see DISCUSSION) and when the absolute value of
is larger. An additional effect is that the error in estimating
is conservative in that the bias is toward neutrality regardless of whether
is positive or negative. Fig 4 shows the effect on the other parameters: tdiv,
s, and
a. As should be expected from Fig 2, mutation rates are underestimated and the divergence time is overestimated when the migration rate is small. The error in estimating these other parameters is much more extreme than that for
. In addition, there is a small effect of
on estimates of
a.
|
|
The expected number of neutral segregating sites:
Under neutrality, the results presented here agree with those found using a coalescent approach in ![]()
![]()
![]()

2Nm = M. We make the same assumption here and further assume that this occurs in such a way that the diffusion result still holds (see Limiting allele frequency dynamics at a single locus). Then we can use g(xk|x) and h*(ik|x, nk) in expression (15) to show that the expected number of synonymous segregating sites is equal to
s
n-1i=1 1/i when all n sampled are taken from separate demes. This was found in ![]()
Consider the number of segregating sites in a sample of n sequences, all from the same deme. From the coalescent approach we have
![]() |
(23) |
(![]()
![]() |
(24) |
and this is shown in Appendix B to be equivalent to (23).
| DISCUSSION |
|---|
The results presented above can be understood in terms of a sample-size effect of subdivision, one that depends on how the sample is distributed among demes. In the limit of a large number of demes, the history of a sample under neutrality has two distinct phases: the scattering phase and the collecting phase described in ![]()
n'
n (![]()
= 2. Thus, the values on the right-hand side of Fig 5 are identical to those on the right-hand side of Fig 2A. Although scales of the horizontal axes are not the same, the effect of smaller migration rate is qualitatively similar to that of smaller sample size. The reason that the values on the left-hand sides of the two panels are different is that the average value of n' at the left in Fig 2A is equal to 1.06, which is considerably smaller than the practical lower limit of 2 in Fig 5. Instead, the values on the left-hand side of Fig 5 can be compared to those in Fig 2A for log10(m) = -2.67, or m = 0.00215, which (with N = 100) gives E[n']
2.
|
This work shows that inferences about natural selection made from DNA polymorphism and divergence data are robust to population subdivision (Fig 3) as long as the migration rate is not too low. This is remarkable in view of the strong effects subdivision has on numbers of polymorphisms, shown in Fig 2, but is understandable in terms of the effect of subdivision on
s,
a, and tdiv. Except for the weak dependence of
a estimates on
(Fig 4), subdivision and migration act equally on selected and neutral variation. In both cases, fixation events are overestimated and polymorphisms underestimated when the migration rate is small. This causes mutation rates to be substantially underestimated and divergence times grossly overestimated if subdivision is ignored, but these effects compensate one another and allow relatively accurate estimates of selection even if subdivision is ignored. Often
will be the focus of study, but if
s,
a, and tdiv are also of interest, it would be useful to have a framework for simultaneous inferences about migration rates, selection coefficients, and these other parameters. The theory presented above is a first step toward this goal.
It is important to note that inferences about natural selection made from allele frequencies at polymorphic sites will be robust to subdivision only in the case of samples taken singly from different demes. Otherwise, the distribution of samples among demes will cause some frequency classes to be overrepresented, resulting in biased inferences. Even when all the samples are taken from the same deme, restricted migration can mimic the effect of positive
on allele frequencies (![]()
![]()
![]()
![]()
| ACKNOWLEDGMENTS |
|---|
I thank Dan Hartl, Thomas Nagylaki, Stanley Sawyer, and Clifford Taubes for helpful discussions of the work. I am also grateful to Sabin Lessard for seeing that deme sizes need not be large for the large-number-of-demes coalescent to hold. This work was supported by grants DEB-9815367 and DEB-0133760 from the National Science Foundation.
Manuscript received August 6, 2002; Accepted for publication October 14, 2002.
| APPENDIX A |
|---|
Again, ZDi(t) is the fraction of demes that contain i copies of the mutant. Let ZDij(t) record the fraction of demes that contain i copies of the mutant and are descended from a deme that contained j copies of the mutant in the previous generation. Of course, ZDi(t) =
Nj=0ZDij(t), and we can study the behavior of ZDi(t) by considering the simpler behavior of ZDij(t). In particular, conditional on the state of the system z(t) at time t,
![]() |
(A1) |
and ZDij(t + 1) and ZDkl(t + 1) are independent for all i and k, and j
l. Thus, we have conditional moments
![]() |
(A2) |
![]() |
(A3) |
![]() |
(A4) |
All the higher central moments of the ZDi(t + 1) are o(1/D).
Now let XD(t) =
Ni=0 iZDi(t)/N, and YDi(t) = ZDi(t) -
i(t). The diffusion result follows from these results (derived below) for changes over one generation:
![]() |
(A5) |
![]() |
(A6) |
![]() |
(A7) |
![]() |
(A8) |
![]() |
(A9) |
in which t has been suppressed, x =
Ni=0 izi/N, and yi = zi -
i, and
![]() |
(A10) |
![]() |
(A11) |
![]() |
(A12) |
The fixation index is given by
![]() |
(A13) |
It is clear from Equation A12 that c(x, 0) = 0 for all x
(0, 1). If, in addition, the zero solution of the difference equation
![]() |
(A14) |
is globally asymptotically stable, then the diffusion (9) holds (![]()
i and that Equation A14 is equivalent to Y(k + 1, x, y) = Y(k, x, y)P*. Proof of Equation A14 follows from the ergodicity of the stochastic matrix P*, i.e., that limk
P*(k)ij =
j, along the same lines as the proof in ![]()
The derivation of Equation A9 follows from Equation A4. For Equation A5 we have
![]() |
(A15) |
![]() |
(A16) |
![]() |
(A17) |
Putting in qj from Equation 1 and simplifying give
![]() |
(A18) |
which gives (A5) if we put zi = yi +
i on the right and simplify using Equation 7.
For Equation A6 we have
![]() |
(A19) |
![]() |
(A20) |
![]() |
(A21) |
![]() |
(A22) |
![]() |
(A23) |
![]() |
(A24) |
Again, putting in qj and simplifying, this becomes Equation A6.
For Equation A7 we have
![]() |
(A25) |
![]() |
(A26) |
As in (A20) above, the second sum on the right in (A26) is equal to E[XD(1) - x|z], which, from (A5), is o(1). Expanding and considering the third and fourth central moments of ZDi(1) gives the result (A7).
In the derivations of (A8) and (A9) below I assume that the exact solution of (4) is sufficiently close to (5) that the latter can be used in place of the exact solution. More precisely, I assume that
![]() |
(A27) |
where the coefficients rk depend on N, i, m, and x. This is certainly true for Equation 5, and because (5) and the exact solution of (4) are nearly identical in form (see Fig 1 and associated text), it should also be true of the exact solution although the coefficients rk will be different.
For Equation A8 we have
![]() |
(A28) |
![]() |
(A29) |
Using (A27), the second term on the right in Equation A29 becomes
![]() |
(A30) |
Then by the same argument that gave (A7), using (A1), it can be shown that these higher moments are also o(1/D). Because of this, and putting in
i =
Nj=0
jP*ji, Equation A29 becomes
![]() |
(A31) |
![]() |
(A32) |
which is equal to (A8).
For Equation A9 we have
![]() |
(A33) |
![]() |
(A34) |
using (3.12) in ![]()
This completes the derivation of (A5A9), showing that Theorem 3.3 in ![]()
| APPENDIX B |
|---|
Beginning with Equation 24, and then putting in g(x|x) and
s(x), we have
![]() |
(B1) |
![]() |
(B2) |
![]() |
(B3) |
Using the identity
![]() |
(B4) |
we obtain
![]() |
(B5) |
![]() |
(B6) |
which is the same as Equation 23 since the first term (n' = 1) is equal to zero.
| LITERATURE CITED |
|---|
ABRAMOWITZ, M., and I. A. STEGUN, 1965 Handbook of Mathematical Functions. Dover, New York.
AKASHI, H., 1999 Inferring the fitness effects of DNA mutations from polymorphism and divergence data: statistical power to detect directional selection under stationarity and free recombination. Genetics 151:221-238.
BUSTAMANTE, C. D., J. WAKELEY, S. SAWYER, and D. L. HARTL, 2001 Directional selection and the site-frequency spectrum. Genetics 159:1779-1788.
BUSTAMANTE, C. D., R. NIELSEN, S. A. SAWYER, K. M. OLSEN, and M. D. PURUGGANAN et al., 2002 The cost of inbreeding in Arabidopsis.. Nature 416:531-534.[Medline]
CHARLESWORTH, B., 2001 Effect of life history and mode of inheritance on neutral genetic variation. Genet. Res. 77:153-166.[Medline]
CHERRY, J. L. and J. WAKELEY, 2003 A diffusion approximation for selection and drift in a subdivided population. Genetics 163:421-428.
DONNELLY, P., M. NORDBORG, and P. JOYCE, 2001 Likelihoods and simulation methods for a class of nonneutral population genetic models. Genetics 159:853-867.
ETHIER, S. N. and T. NAGYLAKI, 1980 Diffusion approximations of Markov chains with two timescales and applications to population genetics. Adv. Appl. Prob. 12:14-49.
EWENS, W. J., 1979 Mathematical Population Genetics. Springer-Verlag, Berlin.
EWENS, W. J., 1990 Population genetics theorythe past and the future, pp. 177227 in Mathematical and Statistical Developments of Evolutionary Theory, edited by S. LESSARD. Kluwer Academic Publishers, Amsterdam.
FAY, J. C., G. J. WYCKOFF, and C.-I WU, 2002 Testing the neutral theory of molecular evolution with genomic data from Drosophila.. Nature 415:1024-1026.[Medline]
FISHER, R. A., 1930 The Genetical Theory of Natural Selection. Clarendon Press, Oxford.
FU, X.-Y. and W.-H. LI, 1993 Statistical tests of neutrality of mutations. Genetics 133:693-709.[Abstract]
HARTL, D. L., E. N. MORIYAMA, and S. A. SAWYER, 1994 Selection intensity for codon bias. Genetics 138:227-234.[Abstract]
HUDSON, R. R. and N. L. KAPLAN, 1988 The coalescent process in models with selection and recombination. Genetics 120:831-840.
HUDSON, R. R., M. KREITMAN, and M. AGUADE, 1987 A test of neutral molecular evolution based on nucleotide data. Genetics 116:153-159.
JOHNSON, N. L., S. KOTZ and N. BALAKRISHNAN, 1997 Discrete Multivariate Distributions. Wiley, New York.
KIMURA, M., 1969 The number of heterozygous nucleotide sites maintained in a finite population due to the steady flux of mutations. Genetics 61:893-903.
LATTER, B. D. H., 1973 The island model of population differentiation: a general solution. Genetics 73:147-157.
MARUYAMA, T., 1970 Effective number of alleles in a subdivided population. Theor. Popul. Biol. 1:273-306.[Medline]
MCDONALD, J. H. and M. KREITMAN, 1991 Adaptive protein evolution at the adh locus in Drosophila.. Nature 351:652-654.[Medline]
MÖHLE, M., 1998 Robustness results for the coalescent. J. Appl. Prob. 35:438-447.
MÖHLE, M., 2001 Forward and backward diffusion approximations for haploid exchangeable population models. Stoch. Proc. Appl. 95:133-149.
MORAN, P. A. P., 1959 The theory of some genetical effects of population subdivison. Austr. J. Biol. Sci. 12:109-116.
MORAN, P. A. P., 1962 Statistical Processes of Evolutionary Theory. Clarendon Press, Oxford.
NAGYLAKI, T., 1980 The strong-migration limit in geographically structured populations. J. Math. Biol. 9:101-114.[Medline]
NEUHAUSER, C. and S. M. KRONE, 1997 The genealogy of samples in models with selection. Genetics 145:519-534.[Abstract]
NIELSEN, R., 2001 Statistical tests of neutrality in the age of genomics. Heredity 86:641-647.[Medline]
NORDBORG, M., 1997 Structured coalescent processes on different time scales. Genetics 146:1501-1514.[Abstract]
NOTOHARA, M., 1993 The strong migration limit for the genealogical process in geographically structured populations. J. Math. Biol. 31:115-122.
RANNALA, B., 1996 The sampling theory of neutral alleles in an island population of fluctuating size. Theor. Popul. Biol. 50:91-104.[Medline]
ROTHMAN, E. D., C. F. SING, and A. R. TEMPLETON, 1974 A model for the analysis of population structure. Genetics 78:934-960.
SAWYER, S. A. and D. L. HARTL, 1992 Population genetics of polymorphism and divergence. Genetics 132:1161-1176.[Abstract]
SLATKIN, M., 1985 Gene flow in natural populations. Annu. Rev. Ecol. Syst. 16:393-430.
SLATKIN, M. and G. BERTORELLE, 2001 The use of intraallelic variability for testing neutrality and estimating population growth rate. Genetics 158:865-874.
SMITH, N. G. and A. EYRE-WALKER, 2002 Adaptive protein evolution in Drosophila.. Nature 415:1022-1024.[Medline]
TAJIMA, F., 1989 Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.




































































