Changing Effective Population Size and the McDonald-Kreitman Test
Adam Eyre-Walker, Centre for the Study of Evolution and School of Biological Sciences, University of Sussex, Brighton BN1 9QG, United Kingdom
Corresponding author: Adam Eyre-Walker
Communicating editor: G. B. GOLDING
| ABSTRACT |
|---|
Artifactual evidence of adaptive amino acid substitution can be generated within a McDonald-Kreitman test if some amino acid mutations are slightly deleterious and there has been an increase in effective population size. Here I investigate the conditions under which this occurs. I show that fairly small increases in effective population size can generate artifactual evidence of positive selection if there is no selection upon synonymous codon use. This problem is exacerbated by the removal of low-frequency polymorphisms. However, selection on synonymous codon use restricts the conditions under which artifactual evidence of adaptive evolution is produced.
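THE McDonald-Kreitman (MK; MCDONALD and KREITMAN 1991) test compares the levels of polymorphism and divergence at two classes of site, usually synonymous and nonsynonymous.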
Let us imagine that we have multiple alleles of a gene from within a species and a single outgroup sequence, of the same gene, from a different species. We count the number of sites at which we have a synonymous (Ps) or a nonsynonymous polymorphism (Pn) and the number of sites at which there has been a synonymous (Ds) or a nonsynonymous (Dn) substitution. Under the neutral theory of molecular evolution, in which all mutations are either neutral or strongly deleterious, it is not difficult to show that Pn/Ps = Dn/Ds. This forms the basis of the MK test of neutral evolution. In a number of data sets violations of the equality Pn/Ps = Dn/Ds have been demonstrated. Possibly the most interesting of these are those in which Dn/Ds > Pn/Ps, since this is consistent with adaptive amino acid substitution. If there has been adaptive amino acid substitution, the proportion of substitutions that were adaptive can be estimated as
$$\alpha = 1 - \frac{D_s P_n}{D_n P_s} \tag{1}$$

α can vary between -∞ and 1. Negative values can be produced by sampling error or violations of the model; in particular, the segregation of slightly deleterious amino acid mutations can cause negative α-values.
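As a minimal sketch of the estimator of the proportion of adaptive substitutions, α = 1 - (Ds Pn)/(Dn Ps), the function below computes α from the four counts of an MK table. The function name `alpha_mk` and the counts in the usage line are hypothetical and chosen only for illustration.

```python
def alpha_mk(dn, ds, pn, ps):
    """Estimate alpha = 1 - (Ds * Pn) / (Dn * Ps) from MK-table counts.

    dn, ds: nonsynonymous and synonymous substitutions (divergence).
    pn, ps: nonsynonymous and synonymous polymorphisms.
    """
    if dn == 0 or ps == 0:
        raise ValueError("Dn and Ps must be non-zero to estimate alpha")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts, for illustration only.
print(alpha_mk(dn=40, ds=60, pn=10, ps=30))  # 1 - (60*10)/(40*30) = 0.5
```

With an excess of nonsynonymous polymorphism relative to divergence, for example `alpha_mk(10, 60, 10, 30)`, the estimate is negative, which is the pattern expected when slightly deleterious amino acid mutations are segregating.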
It has been argued that artifactual evidence of adaptive amino acid substitution can be obtained in an MK test if some nonsynonymous mutations are slightly deleterious and the current effective population size is larger than the long-term effective population size.
| THE MODEL |
|---|
Let us consider the divergence between two species and let us imagine that the effective population size was Ne for a proportion δ of the time for which the two species diverged and that for the rest of the time it was λNe, the current effective population size of the species from which the polymorphism data were obtained (Fig 1). Consider a mutation upon which the strength of selection is s (if s > 0 the mutation is advantageous and if s < 0 it is deleterious), where the homozygote has an advantage (disadvantage) of 2s and the mutation is semidominant. ![]()
x is
![]() |
(2) |
|
(see also ![]()
![]() |
(3) |
where S = 4Nes, U = 4Neu, and u is the mutation rate per generation.
![]()
![]() |
(4) |
Since 2Nu mutations enter the population each generation, the rate of substitution of mutations of selection strength s is
![]() |
(5) |
Hence, the rate of divergence between the two species in our model is
![]() |
(6) |
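The substitution rate can be illustrated numerically. The sketch below assumes the standard diffusion result for a semidominant mutation, in which the fixation probability relative to a neutral mutation is S/(1 - e^(-S)) with S = 4Nes, and it assumes that divergence is a time-weighted average of the rates at the two population sizes (a proportion δ at Ne, the remainder at λNe). The function names are hypothetical, and the expressions are an assumption about the form of the missing equations rather than a transcription of them.

```python
import math


def relative_fixation(S):
    """Fixation probability relative to a neutral mutation: S / (1 - exp(-S)).

    Assumes the standard diffusion approximation with S = 4*Ne*s.
    Very large negative S (below about -700) would overflow expm1.
    """
    if abs(S) < 1e-8:
        return 1.0  # neutral limit
    return S / -math.expm1(-S)


def relative_divergence(S, lam, delta):
    """Divergence rate relative to neutral sites under the two-phase model.

    delta: proportion of divergence time spent at Ne (assumed weighting).
    lam:   fold change in Ne for the remaining proportion 1 - delta.
    """
    return delta * relative_fixation(S) + (1.0 - delta) * relative_fixation(lam * S)


print(round(relative_fixation(-2.0), 3))             # about 0.313
print(relative_divergence(-2.0, lam=10, delta=0.5))  # lower than with no change
```

Under these assumptions a mutation with S = -2 fixes at roughly 0.3 times the neutral rate, and an increase in population size for part of the divergence lowers the rate further.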
A simple model:
Let us assume for simplicity that all synonymous mutations are neutral and that nonsynonymous mutations are either deleterious or slightly deleterious; then the estimated proportion of amino acid substitutions that have been driven by adaptive evolution is
![]() |
(7) |
an expression that is solely a function of λ, δ, and S, since the mutation parameters, U and u, cancel out.
Let us begin by assuming that the effective population size was Ne until very recently, such that δ = 1. Fig 2 shows the estimate of α for mutations of different selection coefficients as a function of λ. As expected, α is negative when there has been no change in Ne (λ = 1); however, as λ increases, α becomes less negative until it eventually becomes positive (note that α decreases if the population size has decreased). A change in the population size has little consequence for mutations with |S| < 0.1, since α ≈ 0 unless the increase in effective population size has been very large. Similarly, a change in population size has little effect for mutations with |S| >> 4, since α < 0 unless the change in Ne is very large. It is worth noting here that mutations for which |S| > 4 are unlikely to contribute much to evolution since their probabilities of fixation are very low; a mutation of S = -4 has a fixation probability that is 6% that of a neutral mutation.
Table 1 gives the critical value of λ above which α > 0. The threshold increases as a function of both the strength of selection and the sample size, although it is not heavily dependent upon the latter. The increase in Ne needs to be at least threefold to generate artifactual evidence of positive selection for a sample of 10 sequences.
If we now assume that the increase in effective population size occurred sometime in the past (i.e., δ < 1), then α is again underestimated for deleterious mutations (Fig 2). Small increases in Ne actually decrease α, but larger increases generate artifactual evidence of positive selection. If λ* is the value of λ that gives α > 0 when δ = 1, the threshold for δ < 1 is at least λ*/δ, although it is much greater than this for mutations of small effect (Table 1). Note that a decrease in effective population size can increase α, but it never leads to artifactual evidence of adaptive evolution.
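The behavior described above can be explored with a short numerical sketch. It rests on assumptions that are not taken from the missing equations: the stationary frequency density of a semidominant mutation is taken to be proportional to (1 - e^(-S(1-x))) / ((1 - e^(-S)) x (1-x)), polymorphism is sampled at the current size λNe, divergence is a time-weighted average over the two population sizes (a proportion δ at Ne), and synonymous mutations are neutral. All function names are hypothetical.

```python
import math


def relative_fixation(S):
    """Fixation probability relative to neutral, S / (1 - exp(-S))."""
    if abs(S) < 1e-8:
        return 1.0
    return S / -math.expm1(-S)


def polymorphism_weight(S, n, steps=20000):
    """Expected polymorphism in a sample of n sequences, up to a constant.

    Integrates the assumed frequency density times the probability that a
    sample of n is polymorphic, 1 - x**n - (1-x)**n, by the midpoint rule.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        if abs(S) < 1e-8:
            density = 1.0 / x  # neutral limit
        else:
            density = math.expm1(-S * (1.0 - x)) / (math.expm1(-S) * x * (1.0 - x))
        total += density * (1.0 - x**n - (1.0 - x) ** n) * h
    return total


def alpha_hat(lam, delta, S, n):
    """Apparent alpha when all nonsynonymous mutations have strength S (<= 0)."""
    poly_ratio = polymorphism_weight(lam * S, n) / polymorphism_weight(0.0, n)
    div_ratio = delta * relative_fixation(S) + (1.0 - delta) * relative_fixation(lam * S)
    return 1.0 - poly_ratio / div_ratio


print(alpha_hat(lam=1, delta=1, S=-1.0, n=10))   # negative: no change in Ne
print(alpha_hat(lam=10, delta=1, S=-1.0, n=10))  # positive: recent tenfold increase
```

In this sketch a modest increase in Ne with δ = 0.1 gives a lower value than no change at all, for example `alpha_hat(1.5, 0.1, -1.0, 10)` against `alpha_hat(1, 0.1, -1.0, 10)`, in line with the statement that small increases in Ne can decrease α when the change occurred in the past.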
Although I phrased the model in terms of an increase in effective population size, one can equally think of the population as going through a bottleneck during the divergence between the species; in fact, it is more sensible to think in terms of a population bottleneck when δ is small. For example, δ = 0.1 corresponds to a bottleneck of 10% of the total divergence time. Furthermore, it is worth noting that this bottleneck could have occurred at any time during the divergence of the species; it does not have to have been in the lineage leading to the outgroup taxon.
Excluding rare variants:
The fact that slightly deleterious mutations can make the McDonald-Kreitman test highly conservative, when the population size has not changed, led ![]()
![]()
becomes larger, it occurs more readily, and the range of selection coefficients that readily yield an overestimation of α is broader. This latter effect is slightly deceptive, since mutations with selection coefficients 4Nes < -4 contribute little to evolution because their fixation probabilities are very low. If we assume that the change in Ne occurred sometime in the past, then larger increases in Ne are needed to generate artifactual evidence of adaptive evolution; exactly the same relationship holds as above: if λ* is the value of λ that gives α > 0 when δ = 1, the threshold for δ < 1 is at least λ*/δ.
Selection on synonymous codon use:
We have so far assumed that synonymous mutations are neutral, but there is evidence in many species that synonymous codon use is subject to selection (![]()
![]() |
(8) |
Fig 4 shows the effect of increasing Ss relative to Sn for two values of Sn. As one might expect, there is artifactual evidence of adaptive amino acid substitution if |Ss| > |Sn| even with no change in effective population size, and there is no artifactual evidence of adaptive evolution when Ss = Sn. However, if |Ss| < |Sn|, selection upon synonymous codon use reduces the likelihood of producing artifactual evidence of positive selection. This is true whether the increase in population size occurred very recently (i.e., δ = 1) or in the past (δ < 1, data not shown).
However, assuming that synonymous mutations are unconditionally deleterious is unrealistic. It is more usual to model synonymous codon bias as being a balance among mutation, selection, and genetic drift (![]()
![]()
![]() |
(9) |
(![]()
![]()
N*e. However, we assume that the divergence during the increased population size is sufficiently small that the equilibrium frequency of A1 does not change. Under this model the probability of detecting an A1 or A2 polymorphism is
![]() |
(10) |
and the rate of evolution is
![]() |
(11) |
If we assume that the strength of selection acting upon synonymous codon use is Ss and that the strength of selection acting against nonsynonymous mutations, which are all assumed to be deleterious, is Sn, then
![]() |
(12) |
Let us begin by assuming that the effective population size was Ne until very recently, such that δ = 1. If |Ss| > |Sn|, then artifactual evidence of adaptive evolution results even if the effective population size has not increased or decreased; if the population size increases substantially, then the artifactual evidence of adaptive amino acid substitution vanishes. When |Ss| < |Sn|, selection on synonymous codon use reduces the effect of increases in effective population size. If the change in effective population size occurred sometime in the past, selection on synonymous codon use greatly reduces the chance of detecting artifactual evidence of positive selection (Fig 5); in fact, α decreases with increases in effective population size for some parameter combinations because synonymous sites undergo adaptive evolution when the population size increases. For higher values of Sn, α is greatly underestimated unless Ss is of very similar magnitude (results not shown).
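Why synonymous sites undergo adaptive evolution after an increase in population size can be seen from the equilibrium frequency of the preferred codon. The sketch below assumes the standard mutation-selection-drift result that, at equilibrium, the proportion of sites fixed for the preferred codon is 1 / (1 + (u/v) e^(-S)), where S is the scaled advantage of the preferred codon and u/v is the ratio of mutation rates away from and toward it. The function name and the parameter `mutation_bias` are hypothetical.

```python
import math


def preferred_codon_frequency(S, mutation_bias=1.0):
    """Equilibrium proportion of sites fixed for the preferred codon.

    S:             scaled selection (4*Ne*s) favoring the preferred codon.
    mutation_bias: u/v, mutation rate away from the preferred codon divided
                   by the rate toward it (assumed 1.0 unless stated).
    """
    return 1.0 / (1.0 + mutation_bias * math.exp(-S))


S = 1.0    # hypothetical strength of selection at the ancestral size Ne
lam = 3.0  # hypothetical threefold increase in Ne
before = preferred_codon_frequency(S)
after = preferred_codon_frequency(lam * S)
print(round(before, 3), round(after, 3))  # equilibrium rises after the increase
```

Because the equilibrium at λNe is higher than at Ne, there is a period after the increase during which advantageous preferred codons are fixed faster than unpreferred ones, which inflates synonymous divergence.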
Now let us consider the case where the ancestral population size was λNe, the population size the species has been for many generations, but that it occasionally goes through bottlenecks (or sudden expansions) of size Ne. Under this model the probability of detecting an A1 or A2 polymorphism is
![]() |
(13) |
and the rate of evolution is
![]() |
(14) |
The estimated proportion of amino acid substitutions driven by positive selection is
![]() |
(15) |
These equations differ from Equations 10, 11, and 12 only in that X(S) has been replaced by X(λS).
It does not seem sensible to consider the case where δ = 1, because synonymous codon bias would not have equilibrated at the population size of λNe under this model; I have therefore considered the cases where the species have spent 50 and 10% of their divergence bottlenecked (λ > 1; or as an expanded population). In this model α is underestimated unless |Ss| is similar in value to |Sn| (Fig 6).
| DISCUSSION |
|---|
As McDonald and Kreitman originally pointed out in their seminal paper (![]()
50-fold.
The range of selection coefficients over which an increase in effective population generates artifactual evidence of positive selection may seem rather small, but mutations with 4Nes values <-4 contribute little to substitution anyway, since their fixation probabilities are very low (<6% that of a neutral mutation). The critical quantity is the proportion of mutations with -0.1 > 4Nes > -4 out of the mutations with 0 > 4Nes > -4. This we do not know, but we can make some inferences. The level of constraint in protein-coding genes, as measured by the ratio of the nonsynonymous to the synonymous substitution rate, is <0.3 in most species (e.g., see ![]()
![]()
-2 (calculated using Equation 5). However, it seems very likely that some mutations are actually much more deleterious than this, which means that on average 4Nes > -2 for slightly deleterious mutations. In fact, a substantial proportion of mutations could be effectively neutral at all population sizes, in which case they will provide no artifactual evidence of adaptive evolution. To investigate this further, let us assume that the strength of selection is exponentially distributed,
![]() |
(16) |
where S̄ is the average strength of selection. Under this distribution of fitness effects, the average probability of detecting a new mutation in a sample of sequences and the rate of evolution are
![]() |
(17) |
and
![]() |
(18) |
respectively. If we assume that nonsynonymous mutations are deleterious but synonymous mutations are neutral, then the estimated proportion of amino acid substitutions that are adaptive is
![]() |
(19) |
and the ratio of the nonsynonymous to the synonymous substitution rate is
![]() |
(20) |
Equation 19 and Equation 20 are functions of λ, δ, and S̄, since the mutation parameters cancel out. We can solve Equation 20 to find the value of S̄ that gives a certain nonsynonymous over synonymous ratio for values of λ and δ. As a guide let us assume that λ = 1; then to obtain a nonsynonymous over synonymous ratio (ω) of 0.3 requires that S̄ = -3.99; for ω = 0.2, S̄ = -6.74; and for ω = 0.1, S̄ = -15.0. Fig 7 shows the consequences of changing the effective population size when S̄ = -6.74; qualitatively similar results are obtained with S̄ = -3.99 and S̄ = -15.0. The model behaves in a similar way to one in which all mutations are mildly deleterious (4Nes ≈ -1): modest changes in Ne can generate artifactual evidence of positive selection, but the change in Ne needs to be larger if the change occurred sometime in the past.
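The three values of the mean strength of selection quoted above can be checked with a short calculation for the case of no change in population size. The sketch assumes that the relative fixation probability of a deleterious mutation is |S| / (e^|S| - 1) and that |S| is exponentially distributed with mean |S̄|; averaging over the distribution then gives ω = a Σ 1/(k + a)², summed over k = 1, 2, ..., with a = 1/|S̄|. The function name and the series truncation are choices made for this illustration.

```python
def omega_exponential_dfe(mean_S, terms=10000):
    """Nonsynonymous/synonymous rate ratio for exponentially distributed S.

    Assumes constant population size, neutral synonymous mutations, and
    deleterious nonsynonymous mutations with mean scaled strength mean_S.
    """
    if mean_S == 0:
        raise ValueError("mean_S must be non-zero")
    a = 1.0 / abs(mean_S)
    # Expanding |S| / (exp(|S|) - 1) as a geometric series and integrating
    # term by term against the exponential density gives sum 1/(k + a)^2.
    series = sum(1.0 / (k + a) ** 2 for k in range(1, terms + 1))
    series += 1.0 / (terms + a + 0.5)  # integral approximation of the tail
    return a * series


for mean_S in (-3.99, -6.74, -15.0):
    print(mean_S, round(omega_exponential_dfe(mean_S), 2))
```

Under these assumptions the three means give ratios close to 0.3, 0.2, and 0.1, respectively.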
Although it is relatively easy to generate artifactual evidence of positive selection when there is no selection on synonymous codon use, this is generally not the case when there is selection. The behavior depends upon which population size synonymous codon use has equilibrated at. If we assume that synonymous codon bias equilibrated at some ancestral population size and that the population size has subsequently increased in the lineage from which we have sampled polymorphism, then selection on synonymous codon use reduces the effect of increasing population size; i.e., if α is overestimated, the bias is not as great as if there were no selection on synonymous codon use. Furthermore, there is often no artifactual evidence of adaptive amino acid substitution for any parameter combination if the change occurred sometime in the past, since under these conditions adaptive evolution occurs at synonymous sites, leading to α < 0. Adaptive evolution occurs at synonymous sites because the equilibrium frequency of the preferred codon is lower in the ancestral population size than it will be eventually at the increased population size, and there is therefore a period during which advantageous preferred codons are fixed. In contrast, if synonymous codon bias equilibrated at the current size of the population that has been sampled for polymorphism data, but there have been bottlenecks during the divergence of the species, then synonymous codon bias increases α, but α is still often negative. Only if the strength of selection on synonymous mutations is similar to the strength of selection on nonsynonymous mutations do we get artifactual evidence of positive selection. Of course, artifactual evidence of adaptive amino acid substitution is produced if the strength of selection is greater upon synonymous mutations than on nonsynonymous mutations, but this is true whether or not the population size has changed.
Do these results have any implications for our estimates of adaptive evolution? The proportion of amino acid substitutions that have been fixed by adaptive evolution has been estimated in Drosophila (![]()
However, there are several reasons for believing that α has not been overestimated. First, although D. simulans and D. melanogaster have expanded out of Africa, the effective population size of the non-African population appears to be lower than that of the African population ( ![]()
35% of all amino acid substitutions in the divergence in humans and old-world monkeys were adaptive, using single-nucleotide polymorphism data from humans. However, humans have undergone population size expansion and there is no clear evidence of selection on synonymous codon use (![]()
In summary, an increase in effective population size can generate artifactual evidence of adaptive amino acid substitution if there are slightly deleterious amino acid mutations; the conditions are quite permissive if there is no selection upon synonymous codon use, but the conditions become more restrictive if there is selection at synonymous sites.
| ACKNOWLEDGMENTS |
|---|
Thanks to Nicolas Bierne for helpful discussion and the Biotechnology and Biological Sciences Research Council and Royal Society for support.
Manuscript received May 21, 2002; Accepted for publication September 30, 2002.
| LITERATURE CITED |
|---|
AKASHI, H., 1996 Molecular evolution between Drosophila melanogaster and D. simulans: reduced codon bias, faster rates of amino acid substitution, and larger proteins in D. melanogaster. Genetics 144:1297-1307.
AKASHI, H., 1999 Inferring the fitness effects of DNA mutations from polymorphism and divergence data: statistical power to detect directional selection under stationarity and free recombination. Genetics 151:221-238.
AKASHI, H. and S. W. SCHAEFFER, 1997 Natural selection and the frequency distributions of "silent" DNA polymorphism in Drosophila. Genetics 146:295-307.
ANDOLFATTO, P., 2001 Contrasting patterns of X-linked and autosomal nucleotide variation in Drosophila melanogaster and Drosophila simulans. Mol. Biol. Evol. 18:279-290.
BEGUN, D., 2001 The frequency distribution of nucleotide variation in Drosophila simulans. Mol. Biol. Evol. 18:1343-1352.
BEGUN, D. and C. F. AQUADRO, 1993 African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature 365:548-550.
BULMER, M., 1991 The selection-mutation-drift theory of synonymous codon usage. Genetics 129:897-907.
CHARLESWORTH, B., 1994 The effect of background selection against deleterious mutations on weakly selected, linked variants. Genet. Res. 63:213-227.
CHEN, F.-C. and W.-H. LI, 2001 Genomic divergences between humans and other hominoids and the effective population size of the common ancestor of humans and chimpanzees. Am. J. Hum. Genet. 68:444-456.
EYRE-WALKER, A. and L. D. HURST, 2001 The evolution of isochores. Nat. Rev. Genet. 2:549-555.
EYRE-WALKER, A., P. D. KEIGHTLEY, N. G. C. SMITH, and D. GAFFNEY, 2002 Quantifying the slightly deleterious model of molecular evolution. Mol. Biol. Evol. 19:2142-2149.
FAY, J., G. J. WYCKOFF, and C.-I WU, 2001 Positive and negative selection on the human genome. Genetics 158:1227-1234.
FAY, J., G. J. WYCKOFF, and C.-I WU, 2002 Testing the neutral theory of molecular evolution with genomic data from Drosophila. Nature 415:1024-1026.
JENKINS, D. L., C. A. ORTORI, and J. F. Y. BROOKFIELD, 1995 A test for adaptive change in DNA sequences controlling transcription. Proc. R. Soc. Lond. Ser. B 261:203-207.
KEIGHTLEY, P. D. and A. EYRE-WALKER, 2000 Deleterious mutations and the evolution of sex. Science 290:331-333.
KIMURA, M., 1983 The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, UK.
KLIMAN, R., 1999 Recent selection on synonymous codon usage in Drosophila. J. Mol. Evol. 49:343-351.
LI, W.-H., 1987 Models of nearly neutral mutations with particular implications for the nonrandom usage of synonymous codons. J. Mol. Evol. 24:337-345.
MCDONALD, J. H. and M. KREITMAN, 1991 Adaptive evolution at the Adh locus in Drosophila. Nature 351:652-654.
NACHMAN, M. W., W. M. BROWN, M. STONEKING, and C. F. AQUADRO, 1996 Nonneutral mitochondrial DNA variation in humans and chimpanzees. Genetics 142:953-963.
OTTO, S. P. and M. C. WHITLOCK, 1997 The probability of fixation in populations of changing size. Genetics 146:723-733.
SAWYER, S. A. and D. L. HARTL, 1992 Population genetics of polymorphism and divergence. Genetics 132:1161-1176.
SHARP, P. M., C. J. BURGESS, A. T. LLOYD, and K. J. MITCHELL, 1992 Selective use of termination and variation in codon choice, pp. 397-425 in Transfer RNA in Protein Synthesis, edited by D. L. HATFIELD, B. J. LEE, and R. M. PIRTLE. CRC Press, Boca Raton, FL.
SMITH, N. G. C. and A. EYRE-WALKER, 2002 Adaptive protein evolution in Drosophila. Nature 415:1022-1024.
WISE, C. A., M. SRAML, and S. EASTEAL, 1998 Departure from neutrality at the mitochondrial NADH dehydrogenase subunit 2 gene in humans, but not in chimpanzees. Genetics 148:409-421.
WRIGHT, S., 1938 The distribution of gene frequencies under irreversible mutation. Proc. Natl. Acad. Sci. USA 24:253-259.