Genetics, Vol. 151, 221-238, January 1999, Copyright © 1999

Inferring the Fitness Effects of DNA Mutations From Polymorphism and Divergence Data: Statistical Power to Detect Directional Selection Under Stationarity and Free Recombination

Hiroshi Akashi
Section of Evolution and Ecology, University of California, Davis, California 95616

Corresponding author: Hiroshi Akashi, Haworth Hall, University of Kansas, Lawrence, KS 66045-2106. E-mail: hakashi{at}falcon.cc.ukans.edu

Communicating editor: A. G. CLARK


*  ABSTRACT

The fitness effects of classes of DNA mutations can be inferred from patterns of nucleotide variation. A number of studies have attributed differences in levels of polymorphism and divergence between silent and replacement mutations to the action of natural selection. Here, I investigate the statistical power to detect directional selection through contrasts of DNA variation among functional categories of mutations. A variety of statistical approaches are applied to DNA data simulated under Sawyer and Hartl's Poisson random field model. Under assumptions of free recombination and stationarity, comparisons that include both the frequency distributions of mutations segregating within populations and the numbers of mutations fixed between populations have substantial power to detect even very weak selection. Frequency distribution and divergence tests are applied to silent and replacement mutations among five alleles of each of eight Drosophila simulans genes. Putatively "preferred" silent mutations segregate at higher frequencies and are more often fixed between species than "unpreferred" silent changes, suggesting fitness differences among synonymous codons. Amino acid changes tend to be either rare polymorphisms or fixed differences, consistent with a combination of deleterious and adaptive protein evolution. In these data, a substantial fraction of both silent and replacement DNA mutations appear to affect fitness.


The evolutionary fate of a DNA sequence mutation is governed by genetic drift, demographic processes, and natural selection acting directly on the mutation or indirectly through its effect on closely linked mutations. Distinguishing among the roles of each of these factors in patterning within- and between-species genetic variation is a central goal of population genetics (LEWONTIN 1974). In particular, the shape of the distribution of fitness effects of mutations remains a contentious issue (KIMURA 1983; GILLESPIE 1991; OHTA 1992; TAKAHATA 1996).

A number of approaches attempt to infer evolutionary processes by comparing patterns of DNA variation from a given genetic region to those predicted under a specified evolutionary model (e.g., WATTERSON 1978; STROBECK 1987; TAJIMA 1989; HUDSON et al. 1992, 1994; FU and LI 1993; BRAVERMAN et al. 1995; SIMONSEN et al. 1995; FU 1996, 1997; KELLY 1997). The most common null model assumes an infinite number of mutable sites, no natural selection, no recombination, a stationary frequency distribution of segregating mutations, and a Wright-Fisher demographic model. Rejection of the null hypothesis can be caused by a number of departures from these assumptions including changes in population size, population subdivision, genetic linkage to adaptive and deleterious mutations, and selection on the mutations themselves (STROBECK 1987; HUDSON et al. 1992; BRAVERMAN et al. 1995; CHARLESWORTH et al. 1995; SIMONSEN et al. 1995; FU 1996, 1997). However, distinguishing between the contributions of demographic history and natural selection to a given departure from the null can be difficult. For example, patterns of neutral DNA sequence variation closely linked to a site that has undergone a recent adaptive substitution or "selective sweep" are similar to those in an expanding population (SIMONSEN et al. 1995). Alternatively, patterns of neutral DNA variation linked to a site at which a polymorphism is maintained by balancing selection can be similar to sequence variation sampled from subdivided populations (HUDSON 1990).

A second class of approaches compares patterns of DNA variation between two or more genetic regions (HUDSON et al. 1987; MCDONALD 1996, 1998; HEY 1997). Such comparisons attempt to distinguish between the effects of demographic history, which should have a roughly equal impact throughout the genome, and natural selection, whose effect may be more localized. Although the statistical power of this approach to detect particular scenarios of evolution has not been investigated, regional differences in levels of polymorphism, or in the frequency distributions of mutations, could result from either balancing selection elevating linked neutral variation or directional selection reducing neutral polymorphism. However, between-region comparisons neither identify the particular site(s) under selection nor address what fraction of segregating or fixed mutations affects fitness.

Comparisons of evolutionary patterns between categories of mutations interspersed within a genetic region attempt to identify the direct action of natural selection. If the classes of mutations (such as replacement and silent changes) are randomly interspersed within a genetic region, then population level effects and selection at linked sites are expected to have a roughly equivalent impact on mutations in the two classes (HUDSON 1993). Thus, differences in the frequency distributions of polymorphic mutations (SAWYER et al. 1987) and in the ratios of polymorphic and fixed mutations (MCDONALD and KREITMAN 1991; TEMPLETON 1996; AKASHI 1997a) should reflect differences in the fitness effects of the mutations. A number of claims of adaptive (MCDONALD and KREITMAN 1991; EANES et al. 1993; LONG and LANGLEY 1993; KAROTAM et al. 1995; KING 1998), deleterious (SAWYER et al. 1987; BALLARD and KREITMAN 1994; NACHMAN et al. 1994, 1996; RAND et al. 1994; AKASHI 1996; TEMPLETON 1996; WISE et al. 1998), and balancing (WAYNE et al. 1996) selection on amino acid variants, mutation-selection-drift at silent sites (BALLARD and KREITMAN 1994; AKASHI 1995, 1997a; AKASHI and SCHAEFFER 1997), and deleterious effects of transposable element insertions (GOLDING et al. 1986) rely on such comparisons.

Although a growing number of studies are inferring evolutionary processes from comparisons among interspersed mutations, the sensitivity and robustness of this approach to detecting selection have not been examined. Here, I investigate the statistical power to detect directional selection through comparisons of patterns of variation between putative fitness classes of DNA mutations. All results are obtained under Sawyer and Hartl's Poisson random field model (SAWYER and HARTL 1992; HARTL et al. 1994) under assumptions of stationarity, free recombination, and independent fitness effects (results under departures from these assumptions will be addressed in a separate study). The performance of a number of statistical tests is compared over a wide range of selection intensities and sample sizes of DNA sequences. Such tests are applied to DNA sequence data from eight Drosophila simulans genes to determine the contribution of natural selection to silent and protein evolution.


*  NATURAL SELECTION AND THE EXPECTED CONFIGURATIONS OF MUTATIONS

Kimura and Ohta treat gene frequency changes within populations and the accumulation of fixed differences between populations as two facets of an underlying process of evolution under relatively constant mutation rates, effective population sizes, and (for some mutations) directional selection (KIMURA and OHTA 1971; KIMURA 1983). Positive selection increases the probability that a mutation will rise in frequency in each generation, whereas negative selection has the opposite effect. However, except for very strong selection, the time scale of the process is on the order of Ne, the effective population size, generations. Inferring the fitness effects of mutations by measuring the evolutionary trajectories (frequency changes) of individual mutations in laboratory or natural populations requires relatively strong deterministic forces (s > 0.001; DYKHUIZEN and HARTL 1983). In large populations, much weaker selection (s > 1/Ne) can have an important impact in long-term evolution (FISHER 1930; WRIGHT 1931; KIMURA 1962; OHTA 1973) but cannot be measured directly. For such mutations, the trajectories of the mutations, and thus their fitness effects, can be inferred by sampling two aspects of the evolutionary process. A "snapshot" of evolution can be obtained by comparing alleles from within a population; the data consist of the number of segregating sites and their frequencies in the sample. The numbers of mutations "fixed" between an outgroup and the most recent common ancestor of the population sample can be inferred by examining sequences from closely related species.

Consider an aligned set of DNA sequences from m individuals from a population and at least one sequence from an outgroup. Assume an infinite number of mutable sites in these sequences so that all mutations occur at unique sites. Assume first that ancestral and derived nucleotides can be determined at sites that vary in the sample and at sites that differ between the sample and the outgroup. At a given variable site, the nonancestral nucleotide will be found in r = 1 to m of the sequences. The distribution of nonancestral nucleotides falling into the r frequency classes will be referred to as the "configuration" of mutations [to distinguish the pattern from the "frequency distribution" of mutations that is often used to describe polymorphic mutations (r = 1 to m - 1)]. Mutations in frequency class m will be referred to as "fixed" between the sample and the outgroup. In the absence of information about the ancestral and derived nucleotides at variable sites, the configuration can be "folded-over" by pooling each pair of frequency classes r = i and r = m - i for all integers, 1 <= i <= m/2.
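The folding operation described above can be illustrated with a minimal sketch. The function name `fold_configuration` and the list layout (index r - 1 holds the count for frequency class r, with the last entry holding the fixed class r = m) are assumptions made for illustration and are not part of the original analysis.

```python
def fold_configuration(config):
    """Fold an unfolded configuration of mutation counts.

    config[r - 1] is the number of variable sites at which the nonancestral
    nucleotide occurs in r of the m sampled sequences (r = 1..m), so
    len(config) == m and config[-1] holds the fixed differences (r = m).
    Returns (folded_polymorphism_counts, fixed_count).
    """
    m = len(config)
    fixed = config[-1]
    folded = []
    for i in range(1, m // 2 + 1):
        j = m - i
        if i == j:
            # For even m the middle class pairs with itself; count it once.
            folded.append(config[i - 1])
        else:
            # Without an outgroup, r = i and r = m - i cannot be told apart.
            folded.append(config[i - 1] + config[j - 1])
    return folded, fixed


# Hypothetical counts for m = 5 sequences: classes r = 1..4 plus fixed (r = 5).
folded, fixed = fold_configuration([10, 4, 3, 2, 6])
print(folded, fixed)
```

For m = 5 the sketch pools r = 1 with r = 4 and r = 2 with r = 3, leaving two polymorphic classes plus the fixed class.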

Figure 1 illustrates the quantitative effects of directional selection on the expected configurations of mutations. Positive directional selection skews the configuration toward a larger proportion of mutations at high frequencies within the population or fixed in the sample (Figure 1B). Negative directional selection has the opposite effect: a greater proportion of nonancestral nucleotides segregate at low frequencies (Figure 1A). However, note that under positive selection, the total expected number of variable sites in the sample increases as a function of Nes (Figure 1D), whereas under negative selection, the opposite is true (Figure 1C).



Figure 1. Expected configurations of neutral and selected mutations. The expected numbers of newly arisen mutations at frequency classes r = 1 to m in a sample of sequences were calculated according to SAWYER and HARTL (1992) and HARTL et al. (1994). Data are shown for m = 5 sequences and tdiv = 0.6. a and b show the expected proportion of variable sites in the sample at different frequencies under negative and positive selection, respectively. c and d show the proportion of mutable sites at which variants are expected to be segregating at different frequencies or fixed in the sample under negative and positive selection, respectively. Note that the scales for the y-axis differ for c and d. For m = 5, pooling classes r = 1 and r = 4, and r = 2 and r = 3 would give a folded-over distribution with three frequency classes. Superscript f denotes "fixed" difference class (r = m). In this figure, and in the following figures, the scales of unlabeled x- and y-axes are equivalent to those of graphs in the same columns and rows, respectively.

GOLDING et al. (1986) and SAWYER et al. (1987) were the first to infer differences in the average fitness effects of mutations by contrasting within- and between-species variation among functional classes of DNA changes. SAWYER et al. (1987) suggested a simple comparison of the frequency distributions of silent and replacement polymorphisms in a 2 x 2 contingency table. Their data did not include an outgroup sequence, so the analyses were confined to polymorphism data with unknown ancestral and derived states (folded configurations). They divided segregating mutations into two frequency classes, "singletons" (r = 1 and r = m - 1) and other frequency classes (1 < r < m - 1). A departure from homogeneity in these classes among silent and replacement changes was interpreted as evidence for differences in the fitness effects of the mutations.

MCDONALD and KREITMAN (1991) included between-species variation in a similar test of homogeneity among frequency classes for silent and replacement mutations. Their 2 x 2 contingency table compares the numbers of polymorphic mutations, pooled across frequency classes (1 <= r < m), and the numbers of fixed differences (r = m). Because directional selection has a strong impact on the fixed differences class (Figure 1), including this information is likely to increase the sensitivity of the statistical approach to detect selection. However, by pooling all polymorphic mutations into a single category, McDonald and Kreitman's test sacrifices information from the frequency distribution of segregating mutations.

TEMPLETON (1996) combined the approaches of SAWYER et al. (1987) and MCDONALD and KREITMAN (1991) by expanding the contrast to a test of homogeneity across three frequency classes. The numbers of singleton polymorphisms (r = 1 and r = m - 1), polymorphisms at intermediate frequencies (1 < r < m - 1), and fixed differences (r = m) were compared between silent and replacement mutations. The statistical test examines information from both the folded frequency distribution of segregating mutations and the numbers of fixed differences, but information for polymorphic mutations segregating at frequencies greater than one is lost by pooling such variants into a single category.
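The three pooling schemes described above differ only in how the frequency classes of a configuration are collapsed into table cells. The following sketch builds one row of each contingency table from a configuration, using the folded singleton definition (r = 1 and r = m - 1) given in the text. The function names and the example counts are hypothetical, and the sketch assumes m >= 3 so that the two singleton classes are distinct.

```python
def singleton_row(config):
    """Singletons vs. other polymorphisms (pooling of SAWYER et al. 1987)."""
    m = len(config)
    assert m >= 3, "sketch assumes at least three sampled sequences"
    return [config[0] + config[m - 2], sum(config[1:m - 2])]


def polymorphism_divergence_row(config):
    """Pooled polymorphisms vs. fixed differences (MCDONALD and KREITMAN 1991)."""
    return [sum(config[:-1]), config[-1]]


def three_class_row(config):
    """Singletons, intermediate polymorphisms, fixed differences (TEMPLETON 1996)."""
    m = len(config)
    assert m >= 3, "sketch assumes at least three sampled sequences"
    return [config[0] + config[m - 2], sum(config[1:m - 2]), config[-1]]


# Hypothetical configurations for m = 5 (classes r = 1..5; r = 5 is fixed).
silent = [10, 4, 3, 2, 6]
replacement = [12, 1, 0, 1, 2]
for build in (singleton_row, polymorphism_divergence_row, three_class_row):
    print(build.__name__, build(silent), build(replacement))
```

Each pair of rows forms a 2 x 2 or 2 x 3 table to which a test of homogeneity can then be applied.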

The statistical power to detect selection through these approaches could, in principle, be enhanced by including more information from the sample and by employing a statistical test that is more sensitive to deviations caused by the particular alternative hypotheses of interest (AKASHI 1997a). Because even very weak selection affects the expected proportion of mutations in each frequency class, treating each class (1 <= r < m) as a distinct category may increase the sensitivity of the approach. In addition, outgroup sequences can be used to infer ancestral and derived states at variable positions so that r = i and r = m - i classes do not have to be pooled. Finally, under the Poisson random field model, directional selection has a strong effect on the means of the configurations of mutations (Figure 1). Statistical comparisons that are more sensitive to differences in the location of distributions may be more powerful for detecting fitness effects of mutations than tests of homogeneity.


*  CONFIGURATION TESTS BETWEEN NEUTRAL AND SELECTED MUTATIONS

SAWYER et al. (1987) identified the fitness effects of amino acid mutations by comparing the configurations of silent and replacement mutations within coding regions in DNA. Assuming neutral evolution at silent sites, a configuration of amino acid mutations skewed toward an excess of rare polymorphisms reflects deleterious amino acid mutations, whereas an excess of common or fixed amino acid differences supports adaptive protein evolution. The analyses below investigate the statistical power to detect selection through such comparisons between neutral and selected mutations.

Sawyer and Hartl's Poisson random field model allows relatively straightforward simulation of DNA variation data under directional selection. The model assumes a Wright-Fisher population of haploid individuals, an infinite number of mutable sites, a stationary frequency distribution of segregating mutations, and independent evolution at all sites (free recombination and independent fitness effects of mutations). Under these assumptions, the numbers of mutations in each frequency class in the configuration (r = 1 to m) are independent Poisson random variables whose means can be calculated according to the equations of SAWYER and HARTL (1992) and HARTL et al. (1994). These means are a function of six parameters: Ne, the species effective population size; u, the expected number of mutations per nucleotide site, per generation; l, the number of aligned nucleotide sites; m, the number of alleles sampled from a given population; tdiv, the time of divergence between the population sampled and the outgroup (scaled to Ne generations); and s, the selective effect of mutations. Note that the number of "alleles" refers to the number of chromosomes sampled from a population regardless of whether any pairs of the DNA sequences are identical or not.

In the following power tests, Ne and u were fixed and the other parameters were varied over a range of interest. Ne = 10^6 (KREITMAN 1983) and u = 10^-9 (MORIYAMA 1987; ROWAN and HUNT 1991) correspond to rough estimates for these parameters in Drosophila. Statistical power was examined for m = 5, 10, 25, and 50 alleles and for l = 500, 1000, 2500, and 5000 mutable sites. Selection coefficients were varied between -100 <= Nes <= 100, and the time of divergence between the sampled alleles and the outgroup was varied between tdiv = 0.6, 1.2, 2.4, and 4.8. The lower tdiv value was that estimated from intron polymorphism and divergence data in D. simulans since its split with its sister species, D. melanogaster (see AKASHI and SCHAEFFER 1997). The upper value corresponds to ~10% expected divergence at neutral sites. For a given set of parameter values, mutations at half of the sites were neutral and the other half were selected. Simulations were also conducted for 20% neutral and 80% selected sites and for 80% neutral and 20% selected sites. In all simulations, selection coefficients were uniform for all mutations within each category.

For each set of parameter values, the expected values of the numbers of neutral and selected mutations in each frequency class in the configuration were calculated according to the equations of Table 2 of SAWYER and HARTL (1992) and Equation 2 of HARTL et al. (1994). The algorithm of PRESS et al. (1992, p. 293) was used to generate 1000 simulated data sets by sampling integers from Poisson distributions with these expected values as their means. All simulations were written in the C computer language and run on Macintosh and Pentium desktop computers.
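The sampling step can be sketched as follows. This is not the original C code: it is a minimal Python illustration that substitutes Knuth's multiplication method for the routine of PRESS et al., which is adequate only for small to moderate means. The expected values passed in are hypothetical placeholders, not values computed from the Sawyer-Hartl equations, and the function names are assumptions.

```python
import math
import random


def poisson_draw(mean, rng):
    """Draw one Poisson deviate by Knuth's multiplication method.

    Suitable for small to moderate means; exp(-mean) underflows to zero
    for means above roughly 700, so larger means need another method.
    """
    limit = math.exp(-mean)
    k = 0
    product = rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def simulate_configurations(expected, n_sets=1000, seed=0):
    """Generate n_sets simulated configurations.

    expected[r - 1] is the expected number of mutations in frequency
    class r; each class is sampled as an independent Poisson variable.
    """
    rng = random.Random(seed)
    return [[poisson_draw(mu, rng) for mu in expected] for _ in range(n_sets)]


# Hypothetical expected counts for m = 5 frequency classes (r = 1..5).
expected_counts = [4.0, 2.0, 1.3, 1.0, 3.5]
data_sets = simulate_configurations(expected_counts, n_sets=1000, seed=1)
print(data_sets[0])
```

Sampling each class independently reflects the Poisson random field assumption that counts in different frequency classes are independent.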


 

 
Table 1. Statistical tests comparing the configurations of mutations


 

 
Table 2. Frequency distributions and divergence of DNA mutations in D. simulans

A variety of statistical tests was applied to each of the simulated data sets. For polymorphism (frequency distribution) data, the tests examined were as follows: the 2 x 2 test of independence of SAWYER et al. (1987) with frequency classes r = 1 and 1 < r < m, a 2 x (m - 1) test of independence for all the frequency classes of polymorphic mutations, and a Mann-Whitney U-test (MWU) for all the frequency classes (see Table 1 for abbreviations). For unfolded distributions, ancestral and derived states were assumed to be inferred without error.
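A Mann-Whitney U-test on two configurations treats each mutation as one observation whose value is its frequency class, so ties within a class are extensive. The sketch below computes U with midranks and a two-tailed probability from the tie-corrected normal approximation. The use of the normal approximation is an assumption of this illustration (the text does not state how probabilities for the MWU tests were obtained), and the function name is hypothetical.

```python
import math


def mwu_from_configurations(counts_a, counts_b):
    """Mann-Whitney U for two classes of mutations.

    counts_x[r - 1] is the number of mutations of class x in frequency
    class r. Returns (U for class a, z, two-tailed P), where U counts the
    (a, b) pairs in which the class-a mutation lies in the higher
    frequency class, with tied pairs counted as one half.
    """
    n_a, n_b = sum(counts_a), sum(counts_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both classes need at least one mutation")
    n = n_a + n_b
    rank_sum_a = 0.0
    tie_term = 0.0
    below = 0  # observations in lower frequency classes
    for a, b in zip(counts_a, counts_b):
        t = a + b
        if t:
            midrank = below + (t + 1) / 2.0  # shared rank of tied observations
            rank_sum_a += a * midrank
            tie_term += t ** 3 - t
            below += t
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    mean_u = n_a * n_b / 2.0
    var_u = n_a * n_b / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return u_a, 0.0, 1.0  # all observations tied in a single class
    z = (u_a - mean_u) / math.sqrt(var_u)
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))
    return u_a, z, p_two_tailed


# Hypothetical polymorphism counts for m = 5 (classes r = 1..4).
print(mwu_from_configurations([10, 4, 3, 2], [14, 1, 0, 1]))
```

A negative z indicates that the first class of mutations sits at lower frequencies than the second.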

For the tests of homogeneity, the probability of the data under the null hypothesis of independence was estimated through a Monte Carlo approach. For each simulated 2 x n table, the product of each cell value and its natural logarithm was summed across all cells to give a test statistic. The test statistic was also calculated for 1000 randomized tables. In a generalization of the Fisher exact test, the joint probability of cell values was assumed to be the joint hypergeometric probabilities of the cells under homogeneity for the same marginal values as the simulated table (see Appendix 1). For each simulated table, 1000 random tables were generated from these joint hypergeometric probabilities. The fraction of these random tables with a test statistic equal to, or greater than, that observed in the sample was used as the estimate of the two-tailed probability of the observed data under the null hypothesis of homogeneity in their configurations (the procedure follows that of B. ENGELS, personal communication). In these simulations, the first column of the table is generated under the assumption of selective neutrality. Rejection of the null hypothesis indicates that the test has detected significant selective effects on the distribution of cell counts in the second column of the table.
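The Monte Carlo procedure can be sketched as follows. Random tables with the same marginal totals are generated here by shuffling frequency-class labels among the two columns, which samples cell counts from the joint hypergeometric distribution; this shuffling shortcut, the function names, and the small floating-point tolerance are assumptions of the illustration and may differ from the procedure in Appendix 1.

```python
import math
import random


def xlogx_statistic(col_a, col_b):
    """Sum of x * ln(x) over all cells, taking 0 * ln(0) as 0."""
    return sum(x * math.log(x) for x in list(col_a) + list(col_b) if x > 0)


def monte_carlo_homogeneity(col_a, col_b, n_random=1000, seed=0):
    """Estimate the probability of a 2 x n table under homogeneity.

    col_a[k] and col_b[k] are the counts of the two classes of mutations
    in frequency class k. Returns the fraction of random tables (same
    marginals) whose statistic is at least as large as the observed one.
    """
    rng = random.Random(seed)
    n_classes = len(col_a)
    class_totals = [col_a[k] + col_b[k] for k in range(n_classes)]
    labels = [k for k in range(n_classes) for _ in range(class_totals[k])]
    n_a = sum(col_a)
    observed = xlogx_statistic(col_a, col_b)
    hits = 0
    for _ in range(n_random):
        rng.shuffle(labels)
        rand_a = [0] * n_classes
        for k in labels[:n_a]:  # first n_a mutations go to column a
            rand_a[k] += 1
        rand_b = [class_totals[k] - rand_a[k] for k in range(n_classes)]
        # Tolerance guards against rounding when statistics are equal.
        if xlogx_statistic(rand_a, rand_b) >= observed - 1e-9:
            hits += 1
    return hits / n_random


# Hypothetical 2 x 3 table: singletons, intermediate, fixed.
print(monte_carlo_homogeneity([12, 7, 6], [13, 1, 2], n_random=1000, seed=1))
```

With the marginal totals held fixed, the sum of x ln(x) over cells is smallest when the two columns are proportional, so large values indicate departures from homogeneity in either direction.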

TAJIMA's (1989) D-test was also applied to the simulated nonneutral class of variation. This procedure compares polymorphism data to expectations under an equilibrium, neutral, no recombination model. The critical values of the test statistic were taken from TAJIMA's (1989) tables. The D-test differs from the other statistical tests described above because the test compares a single class of mutations to a null model and because the critical values that are commonly employed assume no recombination rather than free recombination. Tajima's test was included for comparison because it is often applied to data from recombining regions of DNA (e.g., TAJIMA 1989; MORIYAMA and POWELL 1996) and because its power to detect the fitness effects of mutations has not been addressed.
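Because Tajima's D depends only on the number of segregating sites and the average number of pairwise differences, it can be computed directly from the polymorphic part of a configuration. The sketch below uses the standard formulas for D; the function name and input layout are assumptions of this illustration, and significance would still have to be assessed against published critical values.

```python
import math


def tajima_d(polymorphic_counts):
    """Tajima's D from the frequency classes of segregating sites.

    polymorphic_counts[r - 1] is the number of segregating sites at which
    one nucleotide occurs in r of the m sequences (r = 1..m - 1); fixed
    differences are excluded. Requires m >= 4 and at least one site.
    """
    m = len(polymorphic_counts) + 1
    s = sum(polymorphic_counts)
    if m < 4 or s == 0:
        raise ValueError("need m >= 4 sequences and at least one segregating site")
    # Average pairwise differences: a site in class r differs in r * (m - r) pairs.
    n_pairs = m * (m - 1) / 2.0
    pi = sum(c * r * (m - r)
             for r, c in enumerate(polymorphic_counts, start=1)) / n_pairs
    a1 = sum(1.0 / i for i in range(1, m))
    a2 = sum(1.0 / i ** 2 for i in range(1, m))
    b1 = (m + 1) / (3.0 * (m - 1))
    b2 = 2.0 * (m * m + m + 3) / (9.0 * m * (m - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (m + 2) / (a1 * m) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical counts for m = 10: an excess of rare variants gives D < 0.
print(tajima_d([10, 0, 0, 0, 0, 0, 0, 0, 0]))
```

The statistic is unchanged by folding, since classes r and m - r contribute the same number of pairwise differences.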

For the tests restricted to polymorphism data, the statistical power to detect both negative and positive selection is shown in Figure 2 and Figure 3. The power to reject the null hypothesis generally increases as a function of the absolute value of Nes, but decreases for large negative values (Figure 3). The cause of this pattern is apparent from Figure 1; although the location of the distribution of mutations continues to change as selection becomes stronger, the sample size of nonneutral polymorphisms decreases to zero. For 25 alleles of 1250 neutral and selected sites, however, the power to detect even very strong negative selection is considerable. Among tests of homogeneity, the 2 x 2 test is more sensitive to negative selection, whereas the 2 x (m - 1) test is generally more powerful for Nes > 0. This appears to reflect the lack of information in the higher frequency classes for mutations under negative selection (Figure 1). Tajima's D-test performs poorly for small numbers of alleles, but is quite sensitive to negative directional selection when the number of sampled alleles is large. Positive directional selection has a smaller impact on the frequency distribution of mutations (Figure 1) and was not detectable by the Tajima D-test under any of the parameter values examined.



Figure 2. Power of polymorphism configuration tests between neutral and weakly selected mutations. The y-axis plots the proportion of tests that reject fitness equivalence, P < 0.05, among 1000 simulated data sets for each value of Nes. See Table 1 for abbreviations for tests. Because each point on the graph reflects a proportion, p, from a random binomial sample of size n = 1000, the 95% confidence intervals for the true values are approximately $\pm 2\sqrt{p(1 - p)/n}$. The relative order of the power of these tests (and those of Figures 3-6) was similar for the simulations under unequal numbers of neutral and selected sites described in the text.



Figure 3. Power of polymorphism configuration tests between neutral and deleterious mutations. Plots are equivalent to those of Figure 2 for strongly deleterious mutations.

The polymorphism tests show different sensitivities to changes in the examined numbers of alleles and numbers of sites. Increasing the number of sites has a larger impact on the power of tests of independence and fdMWU tests, whereas the Tajima test gains considerably from increasing the number of sampled alleles. For the parameter ranges considered, the fdMWU test is at least as powerful, and is often considerably more powerful, than the other polymorphism tests for detecting both positive and negative directional selection.

The expected proportion of variable sites in the fixed differences class (r = m) is very sensitive to selection (Figure 1). Thus, adding divergence data to comparisons of the configurations of mutations is likely to add substantial power to detect natural selection. Four such tests were applied to simulated data. Three different tests of independence were employed: the 2 x 2 polymorphism (1 <= r < m) and divergence (r = m) test of MCDONALD and KREITMAN (1991), the 2 x 3 singleton (r = 1), intermediate frequency (1 < r < m), and divergence (r = m) test of TEMPLETON (1996), and the 2 x m frequency distribution and divergence test for unpooled frequency classes. These tests were performed as Monte Carlo tests of homogeneity (MCH) as described above. Mann-Whitney U-tests were also applied to the m frequency classes (see Table 1 for abbreviations for tests). The equivalent tests and the fdMWU test were also performed on folded distributions (ancestral and derived states not inferred). The power of statistical tests that include divergence data is shown in Figure 4 and Figure 5.



Figure 4. Power of polymorphism and divergence configuration tests between neutral and weakly selected mutations. The y-axis plots the proportion of tests that reject fitness equivalence, P < 0.05, among 1000 simulated data sets for each value of Nes. See Table 1 for abbreviations for tests. TEMPLETON (1996) also suggested a "young vs. old" mutations 2 x 2 test of independence between singletons (r = 1 and r = m - 1) and all other frequency classes (1 < r < m - 1 and r = m). The power of this test was examined but the results are not shown. This test is sensitive to strong deleterious selection, especially for small sample sizes, but has little power to detect positive selection coefficients. For some parameters, the power of this test decreases with increasing numbers of alleles.



Figure 5. Power of polymorphism and divergence configuration tests between neutral and deleterious mutations. The plots are equivalent to those of Figure 4 for strongly deleterious mutations.

As expected, inclusion of the fixed difference class adds a great deal of power to detect natural selection, especially for Nes > 0. Among the tests of independence, the 2 x 2 test is relatively insensitive to deleterious evolution but is roughly equivalent to the 2 x 3 test for positive selection. The 2 x m test of homogeneity was considerably less powerful than the 2 x 3 test over almost the entire range of parameters examined. Apparently, the higher frequency cells increase the degrees of freedom in the statistical test but contribute little to the test statistic both because the expected and observed values are not sufficiently different and because the values in the cells are small (Figure 1). For a small number of sites and a large number of alleles, the fdMWU test can be more sensitive to negative selection than tests of independence that include divergence. However, the fddMWU test is either equivalent to, or more powerful than, all the other tests over all parameter values examined. The difference in power is most notable when the number of alleles is large and when selection coefficients are small. For tests that include divergence comparisons, increasing the number of sampled sites generally has a much larger impact on statistical power than increasing the number of sampled alleles (Figure 4 and Figure 5). Examining the unfolded distributions of newly arisen mutations can have a substantial effect on power when the number of sampled alleles is small and when selection is weak (Figure 6).



Figure 6. Configuration tests between neutral and selected mutations for folded and unfolded distributions. The y-axis plots the proportion of tests that reject fitness equivalence, P < 0.05, among 1000 simulated data sets for each value of Nes. The statistical power of the fddMWU test for folded (f) and unfolded (uf) configurations of mutations is shown. For unfolded configurations, ancestral and derived states are assumed to be inferred with complete accuracy. See Table 1 for abbreviations for tests.

It is important to note, however, that the results above hold only for the given model of evolution under the parameters examined. Although these findings probably hold for parameters within the range investigated, the superior power of the MWU test over the homogeneity tests is due to the particular deviation from the null investigated here. Under uniform selection, the location of the configuration of mutations undergoes a unilateral shift as a function of selection intensity (Figure 1). However, other alternative models may show differences in the configurations of mutations that may result in smaller deviations in their means. The choice of tests, therefore, depends on the particular alternatives under consideration. If a model predicts differences in the locations of distributions, then MWU tests may provide the greatest statistical power to detect selection. In the absence of a particular alternative hypothesis, Templeton's sidMCH test is a general method to test for departures from the null hypothesis of equivalent configurations among classes of mutations. (Similar comparisons that combine the frequency distribution of polymorphic mutations and the number of fixed differences could also enhance the sensitivity of between-region comparisons of DNA variation, but the power of such tests has not been investigated.)

The evolutionary distance, tdiv, between the alleles sampled from within a population and the outgroup sequence can have a large impact on the power to detect selection. Figure 7 shows the effect of times of divergence on the pdMCH and fddMWU tests. Increasing tdiv increases the sample sizes of fixed differences resulting in an increase in the power of all the tests that include this information and a decrease in the differences among these tests. However, these results assume accurate counting of the numbers of substitutions (under the infinite sites model, the number of diverged sites equals the number of substitutions). In practice, the number of substitutions is inferred given the number of differences between extant sequences and an evolutionary model that determines the appropriate correction for the number of sites that have undergone multiple substitutions. At higher levels of divergence, estimation of substitution rates can be quite sensitive to the assumed model of evolution (NEI and GOJOBORI 1986; GOLDMAN and YANG 1994; RZHETSKY and NEI 1995; INA 1996; MUSE 1996). Accurate estimation requires knowledge of transition probabilities among different nucleotides (or codons) and of how these probabilities vary among sites and over time. In the absence of such knowledge, shorter divergence times allow more reliable inference of the numbers of evolutionary fixations at the expense of some statistical power. In addition, ancestral and derived states at variable nucleotide positions can be inferred with greater confidence when levels of evolutionary divergence among the sequences are low (COLLINS et al. 1994; FRUMHOFF and REEVE 1994; YANG et al. 1995; SCHLUTER et al. 1997; ZHANG and NEI 1997).



Figure 7. The effect of tdiv on the power of configuration tests between neutral and selected mutations. The y-axis plots the proportion of tests that reject fitness equivalence, P < 0.05, among 1000 simulated data sets for each value of Nes. Results for the pdMCH and fddMWU tests are shown for tdiv = 0.6, 1.2, 2.4, and 4.8. See Table 1 for abbreviations for tests.

These simulation data suggest that, under the assumptions of the Sawyer-Hartl model, comparisons of the configurations of neutral and selected mutations have considerable statistical power to detect even very weak positive and negative selection. This inference of selection is dependent on two steps. A difference in the configurations of two categories of mutations suggests different distributions of their fitness effects. If one of the categories of mutations evolves neutrally, then, under a constant directional selection model, the sign of the fitness effects of the second class of mutations can be inferred from the location of its distribution relative to that for the neutral class. An excess of rare polymorphisms, relative to the neutral class, suggests negative selection coefficients, whereas too many fixed differences suggest adaptive evolution. The assumption of neutrality at silent sites in coding regions is critical to such inferences of selection in protein evolution. The following section employs simulation data to examine the statistical power to detect a particular model of selection at silent sites and compares the configurations of putative fitness classes of silent DNA mutations in D. simulans.


*  CONFIGURATION TESTS OF MUTATION-SELECTION-DRIFT

Patterns of codon usage in a number of organisms are consistent with natural selection discriminating among synonymous codons to enhance the efficiency and/or the accuracy of protein synthesis (reviewed in IKEMURA 1985; ANDERSSON and KURLAND 1990; SHARP et al. 1995). Under "major codon preference," codon usage bias is maintained by a balance among the forces of mutation pressure, genetic drift, and natural selection favoring translationally superior major codons (SHARP and LI 1986; LI 1987; BULMER 1988, 1991). The simplest evolutionary model of this scenario considers twofold redundant codons in a haploid organism (LI 1987; BULMER 1991). Mutations occur at rates v from nonmajor codons to major codons and u in the opposite direction. Major codons confer selective advantage s.

[Scheme: nonmajor codon <-> major codon, with mutation rate v toward the major codon and rate u away from it.]

Consider a locus consisting of a number of such sites. The proportion of major codons at the locus is determined by u/v, the ratio of the mutation rates, and Nes, the product of effective population size and selection coefficient. If these parameters remain relatively constant, then the proportion of major codons at the locus will reach a steady state (i.e., equal numbers of forward and backward substitutions).

Major codon preference predicts two fitness classes of silent mutations, "preferred" mutations from nonmajor to major codons and "unpreferred" mutations in the opposite direction (AKASHI 1995). A purely mutational model of codon bias requires differences in the forward and backward mutation rates (FREESE 1962; SUEOKA 1962, 1988), but does not predict differences in the evolutionary configurations of mutations in the two directions, because both are neutral.

The statistical power to detect major codon preference at silent sites can be examined under the Sawyer-Hartl model. LI 1987 and BULMER 1991 give expressions for the steady-state proportion of major codons in a given gene under assumptions of constant Nes and independent evolution among sites. This proportion is determined by four parameters: u and v, the forward and backward mutation rates; s, the selective advantage of major codons; and Ne, the species effective population size. For these simulations, per-site mutation rates were set to u = 1.2 x 10^-9 and v = 0.8 x 10^-9. The ratio of the mutation rates, u/v = 1.5, gives an equilibrium mutational base composition of 60% A + T, the average base composition of putatively neutrally evolving introns in D. melanogaster (SHIELDS et al. 1988; MORIYAMA and HARTL 1993). This base composition is also consistent with substitution patterns in presumably "dead-on-arrival" non-LTR transposable elements in Drosophila (D. PETROV, personal communication). The proportion of sites encoding major codons was calculated from Equation 6 of BULMER 1991 given the parameters, u, v, Ne, and s. The numbers of major and nonmajor codons were determined by this proportion and the number of sites in the locus, l, and were assumed fixed (no stochastic variance) for a given set of parameter values. Per-locus mutation rates to unpreferred and preferred mutations are the product of per-site mutation rates and the numbers of major and nonmajor codons, respectively. Note that, under this model, per-locus preferred and unpreferred mutation rates are a function of the strength of selection.
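The dependence of the per-locus mutation rates on selection can be illustrated with a short calculation. The sketch below is not taken from the text: it assumes the commonly cited haploid mutation-selection-drift equilibrium, P = v e^S / (v e^S + u) with S = 2Nes, as a stand-in for the expression referred to above, and the function names are hypothetical. Under that assumption, Nes = 0 gives 40% major codons, and Nes = 0.5, 1.0, and 3.0 give roughly 64, 83, and over 99%.

```python
import math

def major_codon_proportion(u, v, ne_s):
    """Equilibrium proportion of major codons.

    Assumed form (not quoted from the paper): P = v*e^S / (v*e^S + u),
    with S = 2*Ne*s for a haploid model.
    u: mutation rate from major to nonmajor codons
    v: mutation rate from nonmajor to major codons
    """
    weight = v * math.exp(2.0 * ne_s)
    return weight / (weight + u)

def per_locus_rates(u, v, ne_s, n_sites):
    """Per-locus rates of preferred and unpreferred mutations.

    Unpreferred mutations arise at major codons (rate u per site);
    preferred mutations arise at nonmajor codons (rate v per site).
    """
    p_major = major_codon_proportion(u, v, ne_s)
    n_major = n_sites * p_major
    n_nonmajor = n_sites - n_major
    preferred = v * n_nonmajor
    unpreferred = u * n_major
    return preferred, unpreferred

U, V = 1.2e-9, 0.8e-9  # per-site rates used in the simulations
for ne_s in (0.0, 0.5, 1.0, 3.0):
    p = major_codon_proportion(U, V, ne_s)
    pref, unpref = per_locus_rates(U, V, ne_s, 1000)
    print(f"Nes={ne_s:3.1f}  major={p:.3f}  preferred={pref:.3e}  unpreferred={unpref:.3e}")
```

As selection strengthens, fewer nonmajor codons remain to mutate, so the per-locus preferred mutation rate falls while the unpreferred rate rises.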

Figure 8 shows the expected configurations of unpreferred and preferred mutations under major codon preference. The numbers of mutations in each frequency class in the configuration, r = 1 to m, can be determined from Sawyer and Hartl's sampling equations given m, the number of alleles examined, tdiv, the time of divergence between the species sampled and the outgroup, and the four parameters discussed above. Even very weak selection can skew the configurations of the two classes of silent mutations. As selection increases the proportion of major codons in a given locus, differences in the configurations of the proportion of mutations in the frequency classes become more pronounced (Figure 8, a–c) but the expected numbers of preferred mutations decrease (Figure 8, d–f). This decrease in the per-locus preferred mutation rate will result in a loss of statistical power to detect differences between the configurations of preferred and unpreferred mutations. However, under major codon preference, observed levels of codon bias in Drosophila require selection coefficients in the range of ~0 < |Nes| < 3. The analyses below examine the statistical power to detect differences in the configurations of the two classes of silent mutations under such a parameter range.
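A rough numerical sketch of how selection skews the polymorphic part of a configuration is given below. It assumes the standard Poisson random field density of derived-allele frequencies, theta * (1 - e^(-2*gamma*(1-x))) / ((1 - e^(-2*gamma)) * x * (1-x)), with gamma standing for Nes and theta a scaled per-locus mutation rate chosen so that the neutral expectation for class r is theta/r. The scaling convention, the function name, and the midpoint integration are assumptions of this sketch; the fixed class (r = m), which also depends on tdiv, is not computed.

```python
import math

def expected_polymorphic_counts(theta, gamma, m, steps=2000):
    """Expected numbers of sites with r derived copies (r = 1..m-1) in m alleles.

    Integrates binomial sampling against the assumed frequency density
    using a simple midpoint rule on (0, 1).
    """
    counts = {}
    h = 1.0 / steps
    for r in range(1, m):
        total = 0.0
        for i in range(steps):
            x = (i + 0.5) * h
            if abs(gamma) < 1e-9:
                sel = 1.0 - x  # neutral limit of the selection term
            else:
                sel = (1.0 - math.exp(-2.0 * gamma * (1.0 - x))) / (1.0 - math.exp(-2.0 * gamma))
            # sel / (x (1-x)) times x^r (1-x)^(m-r), with powers simplified
            total += sel * x ** (r - 1) * (1.0 - x) ** (m - r - 1)
        counts[r] = theta * math.comb(m, r) * total * h
    return counts

for gamma in (-1.0, 0.0, 1.0):
    counts = expected_polymorphic_counts(theta=1.0, gamma=gamma, m=5)
    total = sum(counts.values())
    shares = ", ".join(f"r={r}: {c / total:.2f}" for r, c in counts.items())
    print(f"gamma={gamma:+.1f}  {shares}")
```

With these assumptions, negative gamma concentrates segregating mutations in the singleton class and positive gamma shifts them toward higher frequency classes, which is the qualitative pattern the configuration tests exploit.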



Figure 8. Expected configurations of preferred and unpreferred mutations under major codon preference. The expected numbers of newly arisen mutations at frequency classes r = 1 to m in a sample of sequences were calculated according to SAWYER and HARTL 1992 and HARTL et al. 1994. Data are shown for m = 5 sequences and tdiv = 0.6. a, b, and c show the expected proportion of variable sites in the sample at different frequencies under Nes = ±0.5, ±1.0, and ±3.0, which correspond to major codon usages of 65, 85, and 100%, respectively. d, e, and f show the proportion of sites at which variants are expected to be segregating at different frequencies or fixed in the sample. Superscript f denotes "fixed" difference class (r = m).

DNA variation data were simulated for preferred and unpreferred mutations under the Sawyer-Hartl Poisson random field model. Assuming stationary frequency distributions and independent evolution at all sites, the numbers of sampled preferred and unpreferred mutations in each frequency class are independent Poisson random variables. Simulations were conducted for the parameters described above and l = 500, 1000, 2500, and 5000 mutable sites and m = 5, 10, 25, and 50 alleles. Selection coefficients between major and nonmajor codons were varied between 0 <= Nes <= 6, and the time of divergence between the sampled alleles and the outgroup was varied between tdiv = 0.6, 1.2, 2.4, and 4.8. A total of 1000 sample configurations of preferred and unpreferred mutations were simulated for each set of parameters. The pdMCH, sidMCH, and fddMCH tests of independence, and the fdMWU and fddMWU tests were applied to each simulated data set.
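The sampling step described above can be sketched as follows: given expected counts for each frequency class, each simulated configuration is a vector of independent Poisson draws. The expected counts used here are made-up placeholders, not values from the paper, and the Poisson sampler is Knuth's multiplication method, chosen only because it needs nothing beyond the standard library.

```python
import math
import random

def poisson_draw(mean, rng):
    """Draw one Poisson variate (Knuth's method; adequate for small means)."""
    limit = math.exp(-mean)
    k = 0
    product = rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def simulate_configuration(expected_counts, rng):
    """One simulated configuration: independent Poisson counts per frequency class."""
    return [poisson_draw(mean, rng) for mean in expected_counts]

rng = random.Random(1999)
# Hypothetical expected counts for classes r = 1..4 and the fixed class (m = 5).
expected_unpreferred = [12.0, 4.0, 2.5, 1.5, 6.0]
expected_preferred = [3.0, 1.5, 1.0, 1.0, 5.0]

for _ in range(3):
    print(simulate_configuration(expected_unpreferred, rng),
          simulate_configuration(expected_preferred, rng))
```

Repeating such draws 1000 times per parameter set and applying a test to each pair of configurations gives the proportion of rejections, which is the power estimate plotted in the figures.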

Figure 9 compares the statistical power of these five methods to detect mutation-selection-drift. For all tests, the power to detect selection increases initially with Nes but falls off as major codon usage reaches 100%. Each of the statistical methods shows some power to detect weak selection. The relative power of the different tests is similar to that for the neutral vs. selected mutations tests. The frequency distribution test is generally less powerful than tests that include divergence data. Among the latter category, 2 x 3 tests of independence are considerably more powerful than both 2 x 2 and 2 x m tests. Overall, however, the fddMWU test is either indistinguishable from, or more powerful than, all the other tests over the parameter ranges considered. The gain in power is greatest when the number of sampled alleles is large. For the same total number of aligned nucleotides, increasing the number of sampled sites has a greater impact on statistical power than increasing the numbers of alleles.



Figure 9. Power of polymorphism and divergence configuration tests between preferred and unpreferred mutations. The y-axis plots the proportion of tests that reject fitness equivalence, P < 0.05, among 1000 simulated data sets for each value of the proportion of major codons at mutable sites. Because the direction of the deviations in the configurations of preferred and unpreferred mutations is predicted under major codon preference, one-tailed probabilities were calculated for these tests. For Monte Carlo homogeneity tests, the proportion of randomized tables with both a higher average frequency of preferred than unpreferred codons and an equal or larger test statistic than the data was taken as the probability of a given configuration. The five tests are those from Figure 6. The analyses are similar to those shown in AKASHI 1997A (Figure 5) but the per-site mutation rates are more conservative (lower), the 2 x m test of independence is included, Monte Carlo homogeneity tests replace G-tests, and power to reject the null is plotted as a function of levels of major codon usage rather than selection intensity.

These power analyses suggest that configuration comparisons, given enough mutations, can detect natural selection near its limit of efficacy. The configurations of preferred and unpreferred synonymous codons have been compared in DNA sequence data from D. simulans (AKASHI 1997A) and the results are reiterated in Figure 10. Major codon preference predicts frequency distributions and divergence skewed toward higher values for advantageous preferred silent mutations than for deleterious unpreferred mutations. Equivalent configurations constitute the null hypothesis in this comparison and neutrality of both classes of mutations (the purely mutational model) is a subset of this null. Figure 10 shows the configurations of preferred and unpreferred mutations pooled from five alleles from each of eight D. simulans genes found in the literature or in GenBank (Table 2). Methods to identify major codons and infer ancestral and derived states for silent mutations are given in AKASHI 1995. Equivalent data from D. melanogaster are not shown because other lines of evidence indicate a reduction in the efficacy of selection at silent sites in this lineage (AKASHI 1995, 1996). Although only five alleles were analyzed in D. simulans, close to 2500 silent sites were examined across the eight genes in Table 2. If codon bias is maintained under mutation-selection-drift in this lineage, then frequency distribution and divergence comparisons should have a high probability of rejecting fitness equivalence between preferred and unpreferred mutations.



Figure 10. The configurations of preferred, unpreferred, and replacement mutations in D. simulans. The proportions of 101 unpreferred (black), 37 preferred (gray), and 22 replacement (striped) mutations segregating at the given frequencies or fixed among five alleles of each of eight D. simulans genes are shown. Pooled data are from eight D. simulans genes from Table 2. Interestingly, 11 of the 12 amino acid "fixations" in D. simulans have occurred in the Zw gene, whereas the singleton polymorphisms are distributed more evenly among the eight genes; the distribution of selection coefficients may differ among genes. Superscript f denotes "fixed" difference class (r = m).

In D. simulans, the configurations of preferred and unpreferred mutations are similar to those expected under weak selection (Figure 8B). The 37 preferred mutations are segregating at higher frequencies and are more often fixed than the 101 unpreferred changes (Mann-Whitney U-test, z = 3.12, P = 0.0009, one-tailed). The other statistical tests were also significant at the 5% level; frequency distributions are skewed toward higher values (Mann-Whitney U-test, z = 1.71, P = 0.044), ratios of polymorphism to divergence are lower (Fisher's exact test, P = 0.007), and the ratios of singleton, intermediate frequency, and fixed differences are skewed toward higher values for preferred than for unpreferred mutations (Monte Carlo homogeneity test, P = 0.015). These patterns are both consistent with major codon preference and difficult to explain in the absence of selection (AKASHI 1997A).


*  NONNEUTRAL SILENT SITES AND TESTS OF NATURAL SELECTION IN PROTEIN EVOLUTION

SAWYER et al. 1987 and MCDONALD and KREITMAN 1991 proposed comparisons of configurations between a putatively neutrally evolving (silent) and a potentially selected (replacement) class of mutations. A number of claims of adaptive protein evolution depend on such an assumption of neutral evolution of synonymous mutations (MCDONALD and KREITMAN 1991; EANES et al. 1993; LONG and LANGLEY 1993; KAROTAM et al. 1995; KING 1998). Configuration tests, however, support weak selection at silent sites in Drosophila (AKASHI 1995, 1997A; AKASHI and SCHAEFFER 1997), and a number of other patterns of silent DNA evolution in Drosophila are consistent with mutation-selection-drift (SHIELDS et al. 1988; SHARP and LI 1989; KLIMAN and HEY 1993, 1994; MORIYAMA and HARTL 1993; AKASHI 1994; MORIYAMA and POWELL 1997; POWELL and MORIYAMA 1997). Although major codon preference could, in principle, affect interpretations of comparisons between silent and replacement mutations (AKASHI 1995), the magnitude of such effects has not been evaluated.

Computer simulations were conducted to determine whether neutral protein evolution and selection at silent sites could mimic patterns that have been attributed to adaptive amino acid substitutions. DNA variation data were generated under a combination of the two scenarios described in the sections above. A given locus consists of two categories of mutations; half of the sites evolve under major codon preference and the other half evolve neutrally. Preferred and unpreferred mutations at major codon preference sites are pooled into a single category representing silent mutations and compared to the neutral class, representing replacement mutations. (This does not assume that all protein mutations are neutral. Amino acid positions at which mutations are strongly selected against do not contribute to variation and would not be counted in the number of mutable replacement sites in these simulations.) DNA variation data were generated as described above for the same mutational parameters, effective population size, times of divergence, and sample sizes. pdMCH tests of homogeneity and fdMWU and fddMWU tests were applied to 1000 simulated configurations of silent and replacement mutations for each set of parameters.

Figure 11 shows the fraction of statistical tests that reject equivalence of the configurations of nonneutral silent and neutral replacement mutations. Under the parameters examined, these tests show some power to reject the null when major codons reach a frequency of about 70 to 80% (Nes ≈ 1). At higher levels of major codon usage (stronger selection), the tests can be quite sensitive to mutation-selection-drift. At tdiv = 0.6, the fdMWU test is more powerful than the pdMCH test, and the fddMWU test is generally most powerful. At higher levels of divergence, all tests become more sensitive to major codon preference and the pdMCH test outperforms the fdMWU test (data not shown).



Figure 11. Power of polymorphism and divergence configuration tests between neutral mutations and mutations under major codon preference. The y-axis plots the proportion of tests that reject fitness equivalence, P < 0.05, among 1000 simulated data sets for each value of major codon usage. See text for simulation parameters and Table 1 for abbreviations for tests.

The impact of mutation-selection-drift on silent/replacement configuration comparisons is very sensitive to the strength of selection. If major codon preference accurately describes silent evolution, then silent/replacement comparisons may be valid in low codon bias genes. However, when selection intensity (and the proportion of major codons) is high, the majority of silent mutations are deleterious unpreferred changes (Figure 8, d–f), and the means of the configurations of silent mutations tend to be lower than those of the neutral expectation. The null model can be rejected at a high rate, and the relative locations of the configurations are consistent with neutral evolution at silent sites and adaptive amino acid substitutions. For the data examined in this study, the average codon bias across the eight D. simulans genes is ~80% major codons; comparisons of replacement mutations to pooled silent mutations are difficult to interpret.

Separate contrasts of replacement mutations to preferred and unpreferred silent changes may shed some light on mechanisms of protein evolution (AKASHI 1995). Under major codon preference, the configurations of preferred and unpreferred mutations reflect evolution under small positive and negative selection coefficients, respectively. Thus, configurations of replacement mutations skewed toward a larger fraction of high frequency variants and fixed differences than preferred mutations reflect adaptive protein evolution, whereas configurations skewed toward an excess of low frequency amino acid mutations relative to unpreferred silent mutations support deleterious protein evolution.

The configurations of replacement mutations as well as preferred and unpreferred silent mutations among the five alleles of eight D. simulans genes are shown in Figure 10. Surprisingly, roughly half of the variable replacement sites are singleton polymorphisms and the other half are fixed in the samples of five D. simulans alleles. Because no prediction had been made for the shape of the configuration of replacement mutations, TEMPLETON's (1996) sidMCH test was applied to the data. The configuration of replacement mutations is significantly different from that of both preferred (P = 0.028, two-tailed) and unpreferred (P < 0.001) silent changes. Although the number of replacement mutations in these data is small, this configuration does not appear to conform to the predictions for protein evolution under uniform selection coefficients (including neutral evolution).

The excesses of rare amino acid polymorphisms and fixed differences can be explained by relaxing the assumption of uniform Nes. One possibility is a combination of a large fraction of slightly deleterious amino acid changes and heterogeneity in effective population size over time. Lower effective population sizes in the past would have allowed slightly deleterious mutations to go to fixation, whereas more effective selection in larger current populations keeps deleterious polymorphisms at low frequencies. Such a nonequilibrium scenario was suggested by OHTA 1993 to explain lower ratios of polymorphism to divergence for replacement than silent evolution at the Drosophila Adh locus. However, because major codon preference is very sensitive to small changes in Nes (AKASHI 1996), population-level phenomena should impact silent DNA mutations as well as amino acid changes. Roughly equal numbers of preferred and unpreferred silent fixations in the D. simulans lineage (Table 2) argue against large, or prolonged, fluctuations in effective population size.

Heterogeneity in selection coefficients, either across sites or across time, could also account for the configurations of amino acid mutations in D. simulans. One possibility is that selection coefficients vary among amino acid positions; low frequency polymorphisms are deleterious mutations that rarely go to fixation, whereas fixed differences in the sample reflect occasional adaptive amino acid substitutions. In this scenario, the polymorphic and fixed mutations in the sample are not a result of a single process of evolution under constant parameters but reflect a combination of the evolutionary dynamics of multiple fitness classes of mutations.

Selection coefficients varying across time, rather than among DNA sites, could also explain the deficiency of intermediate frequency amino acid polymorphisms (HARTL and DYKHUIZEN 1985). Most amino acid mutations are deleterious and are unlikely to reach appreciable frequencies within populations, but occasional environmental changes cause a subset of polymorphisms to become adaptive and rapidly go to fixation. Although the configuration of amino acid mutations in D. simulans is intriguing, more rigorous inference of mechanisms of protein evolution will require DNA sequence data both for a larger number of genes in this species and equivalent data from other lineages. In addition, interpreting these data may require predictions for the configurations of amino acid mutations under more complex models of evolution than those considered here.


*  DISCUSSION

Under the Sawyer-Hartl Poisson random field model, comparisons of the configurations of functional categories of DNA mutations can have considerable power to detect even very weak directional selection on classes of DNA mutations. These findings cannot be generalized beyond evolution under the parameter ranges considered and under the Sawyer-Hartl assumptions of stationarity, free recombination, and independent fitness effects of all mutations. Given these assumptions, configuration comparisons that include information from both frequency distributions of polymorphic mutations and numbers of fixed differences confer the greatest power to detect the fitness effects of mutations. The Mann-Whitney U-test, which is sensitive to differences in the locations of distributions, is a more powerful statistical approach to detect uniform selection coefficients than contingency tests of homogeneity. Accumulating DNA variation data for a large number of mutations with similar fitness effects is critical to the power of these tests. Configuration tests suggest that among eight D. simulans genes, a large fraction of both silent and replacement mutations affect fitness. Some limitations to this approach and these findings are discussed below.

Robustness of configuration comparisons:
Under the Sawyer-Hartl assumptions, the numbers of observed mutations in each frequency class for each category of mutations are independent Poisson random numbers. Under these conditions, the test statistics of both Monte Carlo homogeneity tests and Mann-Whitney U-tests will be appropriately distributed under the null hypothesis of equivalent configurations of mutations. However, independent evolution at all sites, a stationary frequency distribution of mutations, and random sampling from a panmictic population are clearly not biologically realistic assumptions for many DNA sequence studies. One of the most appealing features of configuration comparisons is their claimed robustness to these assumptions. In the special case of no recombination, each of the alleles in a sample will be related by a single genealogy. If mutations have occurred at a constant (and low) rate on this genealogy, then the numbers of mutations from each category on each branch of the genealogy will be independent Poisson random numbers, regardless of whether the particular genealogy is sampled from an equilibrium, panmictic population (SAWYER et al. 1987; HUDSON 1993). A similar argument has been made for intermediate levels of recombination. If the two classes of mutations are randomly interspersed in a genetic region (with respect to differences in evolutionary histories of subregions in the data), then configuration tests should be robust to departures from stationarity and panmixis (MCDONALD and KREITMAN 1991; HUDSON 1993). The effects of such departures from the Sawyer-Hartl assumptions on the distribution of the test statistics of configuration comparisons have not been confirmed. It is unclear whether genetic linkage, nonstationarity, or nonrandom population sampling can lead to false rejection of the null model under equivalent distributions of fitness effects (Type I error). The robustness and power of configuration tests under violations of the Sawyer-Hartl assumptions are not addressed here.

The analyses above have implicitly assumed that per-locus mutation rates have remained constant over the time periods examined. Particular scenarios of variable mutation rates can produce differences in the configurations of mutations identical to those resulting from natural selection (EYRE-WALKER 1997). Consistent differences in configuration tests in independent lineages can distinguish the effects of mutational processes from those of selection (AKASHI 1997B).

Interpreting departures from equivalent configurations of mutations:
If configuration comparisons are robust to departures from the assumptions of the Poisson random field model, then such methods provide a general approach for inferring the distribution of fitness effects for various classes of mutations. However, the relationship between evolutionary configurations and the fitness effects of mutations must be treated with caution. The null hypothesis of these tests is equivalent configurations for the classes of mutations. This condition is satisfied when the distributions of selection coefficients for the two classes of mutations are equivalent (neutrality for both classes of mutations is one scenario that satisfies this null). However, the converse does not necessarily hold; it is possible that different distributions of selection coefficients can give rise to the same configuration of mutations. Thus, rejection of the null can be interpreted as evidence for differences in the fitness effects of mutations, but similar configurations do not necessarily imply similar fitness effects. "Tests of neutrality" is not an appropriate description of configuration comparisons because the null hypothesis includes, but is not limited to, neutral evolution for both classes of mutations.

Given a departure from equivalent configurations for two or more classes of mutations, further inference (of the sign and magnitude of selection coefficients) requires additional information or assumptions. Under a directional selection model with uniform selection coefficients, the relative locations of the configurations identify the relative magnitudes of the fitness effects for the two classes of mutations. For example, if one class of mutations is known to evolve neutrally, then configurations skewed toward an excess of high frequency and fixed variants for a second class suggest adaptive evolution. However, the same pattern could arise from weak deleterious effects of the first class of mutations and neutral evolution for the second class. The assumptions underlying such inferences should be made explicit.

A number of studies have attempted to infer the absolute intensity of selection (the magnitude of Nes) from the configuration of mutations (SAWYER et al. 1987; SAWYER and HARTL 1992; HARTL et al. 1994; AKASHI 1995; AKASHI and SCHAEFFER 1997; NACHMAN 1998). These studies have found maximum-likelihood estimates for Nes given the observed ratios of polymorphic and fixed differences or the observed frequency distribution of polymorphic mutations. In addition to the Sawyer-Hartl assumptions of free recombination and stationarity, these studies have imposed an additional assumption of uniform Nes for all mutations in a given category. Surprisingly, each of the studies has found maximum-likelihood estimates of |Nes| ≈ 1. However, none of the studies has tested the fit of a distribution of selection coefficients to the data (such nested hypotheses can be tested through likelihood-ratio tests). For example, a number of studies have interpreted higher polymorphism/divergence ratios for replacement than for silent mutations as evidence for slightly deleterious protein mutations with uniform Nes ≈ -1. It is possible that a combination of relatively strongly deleterious and neutral (or even adaptive) mutations could explain the data equally well. Examination of each frequency class in the configuration of mutations under the Sawyer-Hartl maximum-likelihood method may help in distinguishing between such scenarios. The patterns of DNA variation in D. simulans suggest that distributions including both positive and negative fitness effects should be considered for both silent and protein variation.

Defining putative fitness classes of mutations:
Configuration comparisons can be applied to any categories of interspersed DNA mutations. Fitness effects of mutations in introns and noncoding regions, insertion/deletion events, and mutations in regulatory elements can be assessed through such methods. However, the power of this approach depends critically on the ability to identify putative fitness classes of mutations. At silent sites, a simple model of selection for translational efficiency predicts differential fitness effects of forward and backward DNA mutations; configuration comparisons provide a straightforward and powerful test of such a prediction. Comparison of pooled silent changes to a neutral class of variation has considerably less power to reveal selection (Figure 11). Unfortunately, defining putative fitness classes of protein variation is difficult; replacement mutations are often pooled into a single category. THORNE et al. 1996 and TEMPLETON 1996 have attempted to subdivide amino acid positions by their role in protein structure, but such subdivisions correspond to functional categories, rather than putative fitness classes. A combination of biochemical and ecological studies (reviewed in TAKAHATA 1996; GOLDING and DEAN 1998) could allow more informative use of configuration comparisons. The relative lack of biological models that predict the fitness effects of particular mutations may be the strongest limitation in current applications of configuration tests to reveal mechanisms of protein evolution. However, one of the critical determinants of the power of configuration tests is the number of sites evolving under a particular model of evolution. If protein adaptation occurs by a small number of amino acid substitutions at a few key sites (PERUTZ 1983; YOKOYAMA 1997; GOLDING and DEAN 1998), then tests of parallelism/convergence (GOLDMAN 1993; ZHANG and KUMAR 1997) or the maximum-likelihood tests of NIELSEN and YANG 1998 may be more powerful than configuration tests for identifying positive directional selection in protein evolution.


*  ACKNOWLEDGMENTS

I am grateful to John Gillespie, Dave Cutler, Chuck Langley, and Laura Rose for many valuable discussions during the course of this work. Two anonymous reviewers contributed a great deal to improving this manuscript. H.A. was a National Science Foundation/Alfred P. Sloan Foundation Postdoctoral Fellow in Molecular Studies of Evolution.

Manuscript received May 12, 1998; Accepted for publication September 23, 1998.


*  APPENDIX 1

Generating random 2 x n tables for Monte Carlo tests of homogeneity:
Assume a 2 (columns) x n (rows) contingency table with column sums c1, c2 and row sums r1, r2, ... , rn. Into an array of length l = c1 + c2 (which also equals r1 + r2 + ... + rn), randomly insert r1 copies of the integer 1, r2 copies of the integer 2, ... , and rn copies of the integer n. The counts (the numbers of 1's, 2's, ... , n's) for the first c1 entries of this array form the first column of the 2 x n table, and the counts for the remaining c2 entries form the second column. To generate each further table, randomly permute the array of l integers and reconstitute the table. Such random tables will have the same row and column sums as the observed (or simulated) table and the correct joint hypergeometric distribution (BILL ENGELS, personal communication).
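The permutation procedure above can be sketched in a few lines. This is an illustrative sketch rather than the code used for the analyses in this article; the function name `random_table` and the example margins are hypothetical, and row labels run from 0 to n - 1 instead of 1 to n.

```python
import random

def random_table(col_sums, row_sums, rng=random):
    """Draw one random 2 x n table with the given margins.

    col_sums: (c1, c2); row_sums: [r1, ..., rn].
    Returns a list of n (first column, second column) count pairs.
    """
    c1, c2 = col_sums
    if c1 + c2 != sum(row_sums):
        raise ValueError("row sums and column sums must have the same total")
    # Array of length l = c1 + c2 holding r_i copies of row label i.
    labels = [i for i, r in enumerate(row_sums) for _ in range(r)]
    # A uniform random permutation of the labels.
    rng.shuffle(labels)
    # Counts among the first c1 entries form the first column.
    first = [0] * len(row_sums)
    for lab in labels[:c1]:
        first[lab] += 1
    # The remaining c2 entries form the second column.
    second = [r - f for r, f in zip(row_sums, first)]
    return list(zip(first, second))

# Hypothetical margins, for illustration only.
table = random_table((5, 7), [3, 4, 5], random.Random(1))
print(table)
```

Repeating the draw many times and computing the homogeneity statistic on each table gives the Monte Carlo null distribution against which the observed statistic is compared.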


*  LITERATURE CITED

AKASHI, H., 1994 Synonymous codon usage in Drosophila melanogaster: Natural selection and translational accuracy. Genetics 136:927-935.

AKASHI, H., 1995 Inferring weak selection from patterns of polymorphism and divergence at "silent" sites in Drosophila DNA. Genetics 139:1067-1076.

AKASHI, H., 1996 Molecular evolution between Drosophila melanogaster and D. simulans: reduced codon bias, faster rates of amino acid substitution, and larger proteins in D. melanogaster. Genetics 144:1297-1307.

AKASHI, H., 1997a Codon bias evolution in Drosophila: population genetics of mutation-selection-drift. Gene 205:269-278.

AKASHI, H., 1997b Distinguishing the effects of mutational biases and natural selection on DNA sequence variation [letter]. Genetics 147:1989-1991.

AKASHI, H. and S. W. SCHAEFFER, 1997 Natural selection and the frequency distributions of "silent" DNA polymorphism in Drosophila. Genetics 146:295-307.

ANDERSSON, S. G. E. and C. G. KURLAND, 1990 Codon preferences in free-living microorganisms. Microbiol. Rev. 54:198-210.

BALLARD, J. W. and M. KREITMAN, 1994 Unraveling selection in the mitochondrial genome of Drosophila. Genetics 138:757-772.

BRAVERMAN, J. M., R. R. HUDSON, N. L. KAPLAN, C. H. LANGLEY, and W. STEPHAN, 1995 The hitchhiking effect on the site frequency spectrum of DNA polymorphisms. Genetics 140:783-796.

BULMER, M., 1988 Are codon usage patterns in unicellular organisms determined by selection-mutation balance? J. Evol. Biol. 1:15-26.

BULMER, M., 1991 The selection-mutation-drift theory of synonymous codon usage. Genetics 129:897-907.

CHARLESWORTH, D., B. CHARLESWORTH, and M. T. MORGAN, 1995 The pattern of neutral molecular variation under the background selection model. Genetics 141:1619-1632.

COLLINS, T. M., P. H. WIMBERGER, and G. J. P. NAYLOR, 1994 Compositional bias, character-state bias, and character-state reconstruction using parsimony. Syst. Biol. 43:482-496.

DYKHUIZEN, D. E. and D. L. HARTL, 1983 Functional effects of PGI allozymes in Escherichia coli. Genetics 105:1-18.

EANES, W. F., M. KIRCHNER, and J. YOON, 1993 Evidence for adaptive evolution of the G6pd gene in the Drosophila melanogaster and Drosophila simulans lineages. Proc. Natl. Acad. Sci. USA 90:7475-7479.

EYRE-WALKER, A., 1997 Differentiating between selection and mutational bias. Genetics 147:1983-1987.

FISHER, R. A., 1930 The distribution of gene ratios for rare mutations. Proc. R. Soc. Edinb. Sect. B 50:204-219.

FREESE, E., 1962 On the evolution of base composition of DNA. J. Theor. Biol. 3:82-101.

FRUMHOFF, P. C. and H. K. REEVE, 1994 Using phylogenies to test hypotheses of adaptation: a critique of some current proposals. Evolution 48:172-180.

FU, Y.-X., 1996 New statistical tests of neutrality for DNA samples from a population. Genetics 143:557-570.

FU, Y.-X., 1997 Statistical tests of neutrality of mutations against population growth, hitchhiking, and background selection. Genetics 147:915-925.

FU, Y.-X. and W.-H. LI, 1993 Statistical tests of neutrality of mutations. Genetics 133:693-709.

GILLESPIE, J. H., 1991 The Causes of Molecular Evolution. Oxford University Press, New York.

GOLDING, B. G. and A. M. DEAN, 1998 The structural basis of molecular adaptation. Mol. Biol. Evol. 15:355-369.

GOLDMAN, N., 1993 Simple diagnostic statistical tests of models for DNA substitution. J. Mol. Evol. 37:650-661.

GOLDMAN, N. and Z. YANG, 1994 A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 11:725-736.

GOLDING, G. B., C. F. AQUADRO, and C. H. LANGLEY, 1986 Sequence evolution within populations under multiple types of mutation. Proc. Natl. Acad. Sci. USA 83:427-431.

HARTL, D. L., and D. E. DYKHUIZEN, 1985 The neutral theory and the molecular basis of preadaptation, pp. 107–124 in Population Genetics and Molecular Evolution, edited by T. OHTA and K. AOKI. Japan Sci. Soc. Press, Tokyo.

HARTL, D. L., E. N. MORIYAMA, and S. SAWYER, 1994 Selection intensity for codon bias. Genetics 138:227-234.

HEY, J., 1997 Mitochondrial and nuclear genes present conflicting portraits of human origins. Mol. Biol. Evol. 14:166-172.

HUDSON, R. R., 1990 Gene genealogies and the coalescent process, pp. 1–44 in Oxford Series in Ecology and Evolution, Vol. 7, edited by D. FUTUYMA and J. ANTONOVICS. Oxford University Press, Oxford.

HUDSON, R. R., 1993 Levels of DNA polymorphism and divergence yield important insights into evolutionary processes. Proc. Natl. Acad. Sci. USA 90:7425-7426.

HUDSON, R. R., M. KREITMAN, and M. AGUADÉ, 1987 A test of neutral molecular evolution based on nucleotide data. Genetics 116:153-159.

HUDSON, R. R., M. SLATKIN, and W. P. MADDISON, 1992 Estimation of levels of gene flow from DNA sequence data. Genetics 132:583-589.

HUDSON, R. R., K. BAILEY, D. SKARECKY, J. KWIATOWSKI, and F. J. AYALA, 1994 Evidence for positive selection in the superoxide dismutase (Sod) region of Drosophila melanogaster. Genetics 136:1329-1340.

IKEMURA, T., 1985 Codon usage and tRNA content in unicellular and multicellular organisms. Mol. Biol. Evol. 2:13-34.

INA, Y., 1996 Pattern of synonymous and nonsynonymous substitutions: an indicator of mechanisms of molecular evolution. J. Genet. 75:91-115.

KAROTAM, J., T. M. BOYCE, and J. G. OAKESHOTT, 1995 Nucleotide variation at the hypervariable esterase 6 isozyme locus of Drosophila simulans. Mol. Biol. Evol. 12:113-122.

KELLY, J. K., 1997 A test of neutrality based on interlocus associations. Genetics 146:1197-1206.

KIMURA, M., 1962 On the probability of fixation of mutant genes in a population. Genetics 47:713-719.

KIMURA, M., 1983 The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge.

KIMURA, M. and T. OHTA, 1971 Protein polymorphism as a phase of molecular evolution. Nature 229:467-469.

KING, L. M., 1998 The role of gene conversion in determining sequence variation and divergence in the Est-5 gene family in Drosophila pseudoobscura. Genetics 148:305-315.

KLIMAN, R. M. and J. HEY, 1993 Reduced natural selection associated with low recombination in Drosophila melanogaster. Mol. Biol. Evol. 10:1239-1258.

KLIMAN, R. M. and J. HEY, 1994 The effects of mutation and natural selection on codon bias in the genes of Drosophila. Genetics 137:1049-1056.

KREITMAN, M., 1983 Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila melanogaster. Nature 304:412-417.

LEWONTIN, R. C., 1974 The Genetic Basis of Evolutionary Change. Columbia University Press, New York.

LI, W.-H., 1987 Models of nearly neutral mutations with particular implications for nonrandom usage of synonymous codons. J. Mol. Evol. 24:337-345.

LONG, M. and C. H. LANGLEY, 1993 Natural selection and the origin of jingwei, a chimeric processed functional gene in Drosophila. Science 260:91-95.

MCDONALD, J. H., 1996 Detecting non-neutral heterogeneity across a region of DNA sequence in the ratio of polymorphism to divergence. Mol. Biol. Evol. 13:253-260.

MCDONALD, J. H., 1998 Improved tests for heterogeneity across a region of DNA sequence in the ratio of polymorphism to divergence. Mol. Biol. Evol. 15:377-384.

MCDONALD, J. H. and M. KREITMAN, 1991 Adaptive protein evolution at the Adh locus in Drosophila. Nature 351:652-654.

MORIYAMA, E. N., 1987 Higher rates of nucleotide substitution in Drosophila than in mammals. Jpn. J. Genet. 63:139-147.

MORIYAMA, E. N. and D. L. HARTL, 1993 Codon usage bias and base composition of nuclear genes in Drosophila. Genetics 134:847-858.

MORIYAMA, E. N. and J. R. POWELL, 1996 Intraspecific nuclear DNA variation in Drosophila. Mol. Biol. Evol. 13:261-277.

MORIYAMA, E. N. and J. R. POWELL, 1997 Codon usage bias and tRNA abundance in Drosophila. J. Mol. Evol. 45:514-523.

MUSE, S. V., 1996 Estimating synonymous and nonsynonymous substitution rates. Mol. Biol. Evol. 13:105-114.

NACHMAN, M. W., 1998 Deleterious mutations in animal mitochondrial DNA. Genetica 102(103):61-69.

NACHMAN, M. W., S. N. BOYER, and C. F. AQUADRO, 1994 Nonneutral evolution at the mitochondrial NADH dehydrogenase subunit 3 gene in mice. Proc. Natl. Acad. Sci. USA 91:6364-6368.

NACHMAN, M. W., W. M. BROWN, M. STONEKING, and C. F. AQUADRO, 1996 Nonneutral mitochondrial DNA variation in humans and chimpanzees. Genetics 142:953-963.

NEI, M. and T. GOJOBORI, 1986 Simple methods for estimating the numbers of synonymous and non-synonymous nucleotide substitutions. Mol. Biol. Evol. 3:418-426.

NIELSEN, R. and Z. YANG, 1998 Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148:929-936.

OHTA, T., 1973 Slightly deleterious mutant substitutions in evolution. Nature 246:96-98.

OHTA, T., 1992 The nearly neutral theory of molecular evolution. Annu. Rev. Ecol. Sys. 23:263-286.

OHTA, T., 1993 Amino acid substitution at the Adh locus of Drosophila is facilitated by small population size. Proc. Natl. Acad. Sci. USA 90:4548-4551.

PERUTZ, M. F., 1983 Species adaptation in a protein molecule. Mol. Biol. Evol. 1:1-28.

POWELL, J. R. and E. N. MORIYAMA, 1997 Evolution of codon usage bias in Drosophila. Proc. Natl. Acad. Sci. USA 95:7784-7790.

PRESS, W. H., S. A. TEUKOLSKY, W. T. VETTERLING and B. P. FLANNERY, 1992 Numerical Recipes in C: The Art of Scientific Computing, Ed. 2. Cambridge University Press, Cambridge.

RAND, D. M., M. DORFSMAN, and L. M. KANN, 1994 Neutral and nonneutral evolution of Drosophila mitochondrial DNA. Genetics 138:741-756.

ROWAN, R. G. and J. A. HUNT, 1991 Rates of DNA change and phylogeny from the DNA sequences of the alcohol dehydrogenase gene for five closely related species of Hawaiian Drosophila. Mol. Biol. Evol. 8:49-70.

RZHETSKY, A. and M. NEI, 1995 Tests of applicability of several substitution models for DNA sequence data. Mol. Biol. Evol. 12:131-151.

SAWYER, S. A. and D. L. HARTL, 1992 Population genetics of polymorphism and divergence. Genetics 132:1161-1176.

SAWYER, S. A., D. E. DYKHUIZEN, and D. L. HARTL, 1987 Confidence interval for the number of selectively neutral amino acid polymorphisms. Proc. Natl. Acad. Sci. USA 84:6225-6228.

SCHLUTER, D., T. PRICE, A. O. MOOERS, and D. LUDWIG, 1997 Likelihood of ancestral states in adaptive radiation. Evolution 51:1699-1711.

SHARP, P. M. and W.-H. LI, 1986 An evolutionary perspective on synonymous codon usage in unicellular organisms. J. Mol. Evol. 24:28-38.

SHARP, P. M. and W.-H. LI, 1989 On the rate of DNA sequence evolution in Drosophila. J. Mol. Biol. 28:398-402.

SHARP, P. M., M. AVEROF, A. T. LLOYD, G. MATASSI, and J. F. PEDEN, 1995 DNA sequence evolution: the sounds of silence. Philos. Trans. R. Soc. Lond. B Biol. Sci. 349:241-247.

SHIELDS, D. C., P. M. SHARP, D. G. HIGGINS, and F. WRIGHT, 1988 "Silent" sites in Drosophila genes are not neutral: evidence of selection among synonymous codons. Mol. Biol. Evol. 5:704-716.

SIMONSEN, K. L., G. A. CHURCHILL, and C. F. AQUADRO, 1995 Properties of statistical tests of neutrality for DNA polymorphism data. Genetics 141:413-429.

STROBECK, C., 1987 Average number of nucleotide differences in a sample from a single subpopulation: a test for population subdivision. Genetics 117:149-153.

SUEOKA, N., 1962 On the genetic basis of variation and heterogeneity of DNA base composition. Proc. Natl. Acad. Sci. USA 48:582-592.

SUEOKA, N., 1988 Directional mutation pressure and neutral molecular evolution. Proc. Natl. Acad. Sci. USA 85:2653-2657.

TAJIMA, F., 1989 Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.

TAKAHATA, N., 1996 Neutral theory of molecular evolution. Curr. Opin. Genet. Dev. 6:767-772.

TEMPLETON, A. R., 1996 Contingency tests of neutrality using intra/interspecific gene trees: the rejection of neutrality for the evolution of the mitochondrial cytochrome oxidase II gene in the Hominoid primates. Genetics 144:1263-1270.

THORNE, J. L., N. GOLDMAN, and D. T. JONES, 1996 Combining protein evolution and secondary structure. Mol. Biol. Evol. 13:666-673.

WATTERSON, G. A., 1978 The homozygosity test of neutrality. Genetics 88:405-417.

WAYNE, M. L., D. CONTAMINE, and M. KREITMAN, 1996 Molecular population genetics of ref(2)P, a locus which confers viral resistance in Drosophila. Mol. Biol. Evol. 13:191-199.

WISE, C. A., M. SRAML, and S. EASTEAL, 1998 Departure from neutrality at the mitochondrial NADH dehydrogenase subunit 2 gene in humans, but not in chimpanzees. Genetics 148:409-421.

WRIGHT, S., 1931 Evolution in Mendelian populations. Genetics 16:97-159.

YANG, Z., S. KUMAR, and M. NEI, 1995 A new method of inference of ancestral nucleotide and amino acid sequences. Genetics 141:1641-1650.

YOKOYAMA, S., 1997 Molecular genetic basis of adaptive selection: examples from color vision in vertebrates. Annu. Rev. Genet. 31:315-336.

ZHANG, J. and S. KUMAR, 1997 Detection of convergent and parallel evolution at the amino acid sequence level. Mol. Biol. Evol. 14:527-536.

ZHANG, J. and M. NEI, 1997 Accuracies of ancestral amino acid sequences inferred by the parsimony, likelihood, and distance methods. J. Mol. Evol. 44:S139-S146.



