Genetics, Vol. 162, 1725-1735, December 2002, Copyright © 2002

Molecular Population Genetics of Xdh and the Evolution of Base Composition in Drosophila

David J. Begun (a, b) and Penn Whitley (a)
a Section of Integrative Biology, University of Texas, Austin, Texas 78712
b Section of Evolution and Ecology, University of California, Davis, California 95616

Corresponding author: David J. Begun, University of California, Davis, CA 95616. E-mail: djbegun@ucdavis.edu

Communicating editor: W. STEPHAN


*  ABSTRACT

Few loci have been measured for DNA polymorphism and divergence in several species. Here we report such data from the protein-coding region of xanthine dehydrogenase (Xdh) in 22 species of Drosophila. Many of our samples were from closely related species, allowing us to confidently assign substitutions to individual lineages. Surprisingly, Xdh appears to be fixing more A/T mutations than G/C mutations in most lineages, leading to evolution of higher A/T content in the recent past. We found no compelling evidence for selection on protein variation, though some aspects of the data support the notion that a significant fraction of amino acid polymorphisms are slightly deleterious. Finally, we found no convincing evidence that levels of silent heterozygosity are associated with rates of protein evolution.


THE nucleotide substitution process may be affected by many biological and statistical phenomena. The complexity of the process is probably a major reason why our understanding of it is only rudimentary. Discussion of the possible effect of effective population size (NE) on evolutionary rates has been ongoing for many decades (WRIGHT 1931, 1932; FISHER 1958; KIMURA 1983; OHTA 1992; GILLESPIE 1999, 2000, 2001), yet has resulted in little clarity. For example, the neutral model of molecular evolution posits that there is no effect of population size on protein evolution, whereas simple models of adaptive protein evolution predict a positive correlation between substitution rate and population size (e.g., KIMURA 1983). Models invoking a significant contribution of slightly deleterious alleles to evolution may predict a negative correlation between population size and rates of protein evolution (e.g., OHTA 1992). Finally, recent theoretical analyses suggest that, contrary to previous results, substitution rates for sites under selection need not be strongly dependent on the population size (CHERRY 1998; GILLESPIE 1999, 2000, 2001). In spite of this long history of theoretical analysis, there have been few attempts to empirically estimate effective sizes through studies of DNA polymorphism and then to investigate whether patterns of nucleotide substitution are related to these estimated population sizes.

Molecular population genetic data from Drosophila melanogaster and D. simulans have revealed several interesting differences between these lineages, many of which have been interpreted in terms of differences in NE (CHOUDHARY and SINGH 1987; AQUADRO et al. 1988; reviewed in MORIYAMA and POWELL 1996). For example, simulans is more polymorphic at silent sites than melanogaster (AQUADRO et al. 1988; MORIYAMA and POWELL 1996). Under the neutral model, this is consistent with the idea that NE is larger in simulans than in melanogaster. D. melanogaster harbors proportionally more amino acid polymorphism (relative to silent polymorphism) than does simulans and has fixed proportionally more amino acid mutants (relative to silent fixations) than has simulans (AKASHI 1995, 1996; MORIYAMA and POWELL 1996; TAKANO 1998; but see BEGUN 1996; ANDOLFATTO 2001). The greater proportional protein polymorphism and divergence in melanogaster has been interpreted as resulting from reduced efficacy of selection against slightly deleterious amino acid variants in this species (AKASHI 1995, 1996). Compared to simulans, melanogaster has also fixed a larger proportion of presumably slightly deleterious, unpreferred silent mutations (AKASHI 1995, 1996). This difference, too, has been attributed to weaker selection against deleterious alleles in the melanogaster lineage (AKASHI 1995, 1996). A smaller NE in melanogaster than in simulans would be a plausible explanation for weaker selection against deleterious mutations in the former.

However, the fact that differences in NE may be consistent with patterns of nucleotide variation in melanogaster and simulans provides only weak supporting evidence for the population-size hypothesis because the species may differ in many ways besides population size. This leaves room for an agnostic position on the cause of the genomic differences between these two species and motivates the study presented here: an analysis of silent and replacement variation at a single locus in population samples from several Drosophila species. We specifically selected groups of closely related species for most of our analyses. There were two primary motivations for this sampling strategy. First, low sequence divergence between species allows us to infer individual substitutions occurring on single evolutionary lineages. This reduces our dependence on uncertain models of the substitution process and allows us to test hypotheses by counting polymorphic and fixed mutations instead of estimating parameters of molecular evolution over longer time periods. Second, our sampling strategy provides an opportunity to investigate the connection between polymorphism and divergence for a homologous region of DNA in many species. We selected xanthine dehydrogenase (Xdh, corresponding to the ry locus of D. melanogaster) for this study. One factor motivating this choice was that allozyme data (e.g., Figure 9.3 of KIMURA 1983) and DNA sequence data (RILEY et al. 1992) suggest that Xdh is highly polymorphic at the protein level in several Drosophila species. Therefore, we expected to observe sufficient numbers of amino acid polymorphisms and fixations to provide powerful tests of null hypotheses. Furthermore, Xdh shows moderate levels of codon bias in some previously sequenced Drosophila species (RODRIGUEZ-TRELLES et al. 1999), suggesting that we could expect to observe large numbers of silent mutations as well.


*  MATERIALS AND METHODS

Samples:
Flies/DNA used for isolating Xdh alleles, and the number of alleles isolated from each species, are given in Table 1.


 
Table 1. Samples used in the survey of Xdh polymorphism and divergence

PCR, cloning, and sequencing:
Several PCR primers or degenerate PCR primers were designed from regions of the Xdh protein that were highly conserved among D. melanogaster, D. pseudoobscura, and Bombyx mori. This region spanned residues 142–150 or 206–215 of the D. melanogaster Xdh protein for forward primers or residues 737–749 of the melanogaster protein for reverse primers. Several combinations of primer pairs were used to amplify Xdh fragments from different species. For species in which PCR was only marginally effective, a single allele was isolated from a gel-purified PCR product, cloned, and sequenced. These data were subsequently used to design species-specific primers. DNA isolated from single wild-caught flies was used in PCR for species provided by W. Etges. For lines established by us or provided by the Species Center, DNA was isolated from single flies taken from isofemale lines. PCR was carried out on DNA from a multifly prep provided by J. Powell (GLEASON and POWELL 1997) for some of the willistoni group data. In all cases, PCR was carried out using a high-fidelity polymerase (Boehringer Mannheim, Indianapolis). The resulting product was cloned in the TOPO Zero Blunt vector (Invitrogen, San Diego). A single clone was sequenced from each PCR reaction. Sequencing was carried out by using primers that annealed to vector sequence followed by primer walking based on primers designed from sequence data from individual species or groups of closely related species. The total number of nucleotides sequenced per individual varied slightly as a consequence of technical difficulties sequencing some templates or from variation in locations of PCR primers. The vast majority of our alleles start at the residue corresponding to amino acid 216 of the melanogaster protein and end at the residue corresponding to 723 of the melanogaster protein. Thus, ~1500 bases were sequenced for each allele. This encompasses ~38% of the 1335-amino-acid-long melanogaster Xdh protein.
The error rate of the polymerase is given as ~3 × 10^-7 (Boehringer-Mannheim product literature), so we expect very few errors in the ~1.8 × 10^5 bp (120 alleles × ~1500 bp/allele) of reported sequence. DnaSP (v. 3.51, ROZAS and ROZAS 1999) was used for most analysis of sequence polymorphism. Analyses of intraspecific variation were restricted to sites that were sequenced in all alleles. Sequences can be found under GenBank accession nos. AF543072, AF543189.
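The arithmetic behind these figures can be checked with a short sketch. The residue range, protein length, allele count, and error rate come from the text; the function names and the `doublings` parameter are hypothetical, and the 30 doublings used in the last line are purely an assumption for illustration (the text does not say how the quoted error rate should be scaled by amplification).

```python
# Sanity check of the sequencing arithmetic quoted in the text.
FIRST_RESIDUE = 216      # first melanogaster residue covered by most alleles
LAST_RESIDUE = 723       # last melanogaster residue covered by most alleles
PROTEIN_LENGTH = 1335    # length of the melanogaster Xdh protein (residues)
N_ALLELES = 120          # alleles sequenced
BP_PER_ALLELE = 1500     # approximate bases per allele
ERROR_RATE = 3e-7        # polymerase error rate from the product literature


def region_summary(first, last, protein_length):
    """Return (bases covered, fraction of the protein covered)."""
    codons = last - first + 1          # inclusive residue range
    return codons * 3, codons / protein_length


def expected_errors(rate, total_bp, doublings=1):
    """Expected erroneous bases; `doublings` is a hypothetical scaling factor."""
    return rate * total_bp * doublings


region_bp, fraction = region_summary(FIRST_RESIDUE, LAST_RESIDUE, PROTEIN_LENGTH)
total_bp = N_ALLELES * BP_PER_ALLELE

print(region_bp, round(fraction, 2))   # 1524 0.38
print(total_bp)                        # 180000
print(expected_errors(ERROR_RATE, total_bp))
print(expected_errors(ERROR_RATE, total_bp, doublings=30))  # assumed 30 doublings
```

Under either reading the expected number of erroneous bases is small relative to the number of polymorphisms analyzed, which is the point the text makes.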

Polarizing mutations:
In most cases, phylogenetic relationships based on previously published data (summarized in POWELL 1997) were used to determine ancestral states for polymorphic and fixed mutations. The affinis/athabasca/azteca clade is unresolved in previous analyses. We found strong support for an affinis/athabasca sister relationship, which we assumed for our analyses. Lack of appropriate outgroup data precluded inferences of fixations on some lineages. Specific phylogenetic relationships used for our analyses are in the Appendix.


*  RESULTS

Heterozygosity:
Table 2 shows estimates of silent and replacement heterozygosity at the Xdh locus for each of 22 species of Drosophila. Divergence among closely related species for silent and replacement sites is shown in Table 3. Very few sites are polymorphic in each of two sister taxa (with the exception of pseudoobscura and persimilis), suggesting that polymorphism data from each species can be considered as independent. Silent θ varies from a high of 0.064 to a low of 0.008; replacement θ varies from a high of 0.0080 to a low of 0.0014. Average heterozygosity is not significantly heterogeneous across species groups for silent sites (Kruskal-Wallis test, P = 0.09, species for which n < 4 are omitted) or replacement sites (Kruskal-Wallis test, P = 0.60, species for which n < 4 are omitted). Replacement and silent heterozygosity should be positively correlated across species under a neutral model of evolution with a homogeneous mutation rate. Our estimates of silent and replacement θ for 17 species were positively correlated (Spearman's ρ = 0.56), and the correlation was significantly different from zero (P = 0.02; Figure 1). The ratio of replacement to silent θ was not significantly heterogeneous across the repleta, obscura, virilis, and willistoni species groups (Table 2 and Table 4; Kruskal-Wallis test, P = 0.35).
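As an illustration of how a per-site heterozygosity estimate of this kind can be obtained from a sample of aligned alleles, the following is a minimal sketch. The text does not state which estimator of θ was used, so the sketch assumes Watterson's estimator (segregating sites divided by the harmonic sum a_n); the function names and the toy alignment are hypothetical and are not data from this study.

```python
def harmonic_sum(n):
    """a_n = sum of 1/i for i = 1 .. n-1, where n is the number of alleles."""
    return sum(1.0 / i for i in range(1, n))


def watterson_theta(alleles):
    """Per-site Watterson estimate of theta from equal-length aligned alleles."""
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two alleles")
    length = len(alleles[0])
    if any(len(a) != length for a in alleles):
        raise ValueError("alleles must be aligned to the same length")
    # A site is segregating if more than one state is present in the sample.
    segregating = sum(1 for column in zip(*alleles) if len(set(column)) > 1)
    return segregating / harmonic_sum(n) / length


# Toy alignment (hypothetical): 4 alleles, 10 sites, 2 segregating sites.
toy = ["ACGTACGTAC",
       "ACGTACGTAC",
       "ACGAACGTAC",
       "ACGTACGTTC"]
theta_toy = watterson_theta(toy)
print(theta_toy)
```

In practice silent and replacement sites would be tallied separately, with the number of silent or replacement sites in the denominator rather than the full alignment length.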



Figure 1. Scatterplot of silent vs. replacement heterozygosity at the Xdh locus in 17 Drosophila species in which at least four alleles were sampled.


 
Table 2. Silent and replacement heterozygosity at Xdh in Drosophila


 
Table 3. Silent and replacement divergence per site at Xdh for closely related species pairs


 
Table 4. Silent and replacement polymorphisms in different species groups

Frequency spectrum:
Table 5 shows summaries of the frequency spectrum of polymorphism as estimated by Tajima's D (TAJIMA 1989) for silent and replacement sites for the 17 species in which at least four alleles were sampled. Negative and positive D values indicate skews toward rare or intermediate frequency alleles, respectively. Xdh data show a strong trend toward negative Tajima's D for both silent and replacement polymorphisms, though none of the individual tests is significant after adjusting the critical value for multiple tests. Unfortunately, there are multiple explanations for this result. First, small sample sizes tend to produce negative D values even under the neutral, equilibrium model (TAJIMA 1989). Second, sites under purifying selection may have negative D values. Third, neutral sites in expanding populations may be skewed toward rare variants. Fourth, multiple forms of selection may produce skews toward rare alleles at linked neutral sites (GILLESPIE 2000). Finally, we cannot rule out the possibility that Taq polymerase errors during PCR contribute to the trend toward rare alleles, though Taq errors are expected to have only a small effect on the data set. Tajima's D values for replacement polymorphisms are significantly more negative than the corresponding values for silent polymorphisms (sign test, P = 0.01).
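For readers who want to see how the sign of D relates to the frequency spectrum, here is a minimal sketch of Tajima's D computed from aligned alleles using the standard constants from TAJIMA (1989). The function name and the two toy alignments are hypothetical; this is an illustration rather than the DnaSP implementation used for the analyses.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(alleles):
    """Tajima's D for equal-length aligned alleles; None if it is undefined."""
    n = len(alleles)
    seg_sites = sum(1 for column in zip(*alleles) if len(set(column)) > 1)
    if n < 3 or seg_sites == 0:
        return None
    # Mean number of pairwise differences (pi, summed over sites).
    pairs = list(combinations(alleles, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Hypothetical sample in which every variant is a singleton (rare alleles).
rare = ["AAAAAAAA", "TAAAAAAA", "ATAAAAAA", "AATAAAAA", "AAATAAAA"]
# Hypothetical sample in which every variant is at intermediate frequency.
intermediate = ["AAAA", "AAAA", "TTTT", "TTTT"]

print(tajimas_d(rare))          # negative: skew toward rare alleles
print(tajimas_d(intermediate))  # positive: skew toward intermediate frequencies
```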


 
Table 5. Tajima's D for silent and replacement polymorphisms

Substitutions:
Substitutions between pairs of closely related species can be assigned to individual lineages under parsimony given an outgroup and a single mutation in the history of the site under consideration. Inference of the ancestral state for a pair of species should be reliable at the low level of sequence divergence observed for many of our species pairs (Table 3). Numbers of silent and replacement substitutions occurring on individual lineages are given in Table 6. The contingency table of 16 species and silent vs. replacement fixations is only marginally significantly heterogeneous (G-test, P = 0.043). The ratio of replacement to silent fixations is not significantly heterogeneous across the repleta, obscura, virilis, and willistoni species groups (G-test, P = 0.83). Similarly, the ratio of replacement to silent fixations is not significantly different from the ratio of replacement to silent polymorphisms for any species group.
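The parsimony rule described above can be sketched as follows. This is a simplified illustration that assumes one representative sequence per species (so polymorphism is ignored) and a single outgroup; the function names, the lineage labels "A" and "B", and the toy sequences are all hypothetical.

```python
from collections import Counter


def assign_lineage(state_a, state_b, state_out):
    """Assign a fixed difference between sister species A and B to a lineage.

    Returns "A" or "B" when one mutation explains the site, or None when the
    site is not a difference or the outgroup matches neither species.
    """
    if state_a == state_b:
        return None            # not a fixed difference
    if state_out == state_a:
        return "B"             # A retains the ancestral state; B changed
    if state_out == state_b:
        return "A"             # B retains the ancestral state; A changed
    return None                # needs more than one mutation: unresolved


def count_fixations(seq_a, seq_b, seq_out):
    """Tally fixed differences per lineage across an alignment."""
    counts = Counter()
    for a, b, out in zip(seq_a, seq_b, seq_out):
        if a != b:
            counts[assign_lineage(a, b, out) or "unresolved"] += 1
    return counts


# Toy sequences (hypothetical): site 3 maps to lineage B, site 6 is unresolved.
fixation_counts = count_fixations("ACGTAC", "ACTTAA", "ACGTAG")
print(dict(fixation_counts))
```

Sites left unresolved by this rule are simply excluded from the lineage-specific counts, which is one reason low divergence between sister taxa makes the approach more reliable.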


 
Table 6. Silent and replacement mutations in Drosophila lineages

Correlation between heterozygosity and protein divergence:
If silent mutations are neutral, silent site heterozygosity can be used as an estimator of NE. Under this premise we could investigate whether effective population size is correlated with the proportion of replacement to silent fixations along a lineage. We proceed under the assumption that silent heterozygosity is highly positively correlated with NE, though we acknowledge that selection may result in a complex relationship between silent heterozygosity and NE (e.g., AKASHI 1999; GILLESPIE 1999, 2000; MCVEAN and CHARLESWORTH 1999). Figure 2 shows a scatterplot of silent θ vs. the ratio of replacement to silent fixations; each point represents the heterozygosity and the ratio of replacement to silent fixations for a single species. We find no evidence for a correlation between these two variables (Spearman's ρ = -0.12, P = 0.66). However, the fact that several lineages have fixed a small number of mutations (Table 6) would inflate the sampling variance of the ratio of replacement to silent fixations, thereby reducing our power to detect any underlying relationship between this ratio and other variables. We consider eight independent pairwise comparisons of silent and replacement divergence to reduce the severity of this problem (Table 3). For each species pair we compared average silent heterozygosity to the ratio of replacement to silent divergence. There is no significant correlation (Spearman's ρ = 0.17, P = 0.66). However, data from the willistoni/equinoxialis pair appear to be quite different from the other groups in that the level of polymorphism is unusually high. Data from the seven remaining species pairs reveal a strong, nearly significant positive correlation (Spearman's ρ = 0.75, P = 0.066) between average silent heterozygosity and protein divergence (Figure 3).
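The rank correlations reported here can be reproduced in outline with a small pure-Python sketch of Spearman's ρ (the Pearson correlation of average ranks, which handles ties). The function names and the four data points are hypothetical and do not correspond to any species in Tables 2, 3, or 6.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical values: silent theta and replacement/silent fixation ratio.
silent_theta = [0.008, 0.020, 0.030, 0.064]
fixation_ratio = [0.10, 0.20, 0.40, 0.30]
rho = spearman_rho(silent_theta, fixation_ratio)
print(rho)
```

With samples as small as seven or eight species pairs, the P value for ρ is best taken from exact tables or a permutation test rather than a large-sample approximation.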



Figure 2. Silent θ vs. the ratio of replacement to silent fixations. Fixations were assigned to individual lineages by parsimony. Only species for which at least four alleles were sampled are included.



Figure 3. Scatterplot of average silent θ vs. the ratio of replacement to silent divergence per site for pairs of sister taxa. Data are from Table 2, with data from willistoni/equinoxialis omitted.

Evolution of codon bias and base composition:
Preferred and unpreferred mutants are hypothesized to be slightly beneficial alleles and slightly deleterious alleles, respectively. Silent polymorphisms and fixations were categorized as preferred or unpreferred and polarized using parsimony (AKASHI 1995; Table 7). This approach assumes that putative fitness categories derived from melanogaster data can be extended to distantly related species. This assumption seems reasonable because general patterns of codon usage are well conserved among melanogaster, virilis, and pseudoobscura (AKASHI and SCHAEFFER 1997; MCVEAN and VIEIRA 1999; KREITMAN and ANTEZANA 2000), three species that diverged early in the history of the genus. Nevertheless, we compared codon usage [effective number of codons (ENC); WRIGHT 1990] in a high-bias gene (mercatorum, ENC = 34.71) vs. a low-bias gene (willistoni, ENC = 54.52) to determine whether patterns of codon usage in our Xdh sequences are consistent with patterns observed more generally in Drosophila genes. Of the 21 codon families where preferred and unpreferred codons were identified in melanogaster (AKASHI 1995), the preferred codon was used more often in the high-bias than in the low-bias Xdh sequence for 19 families. Furthermore, ENC is highly correlated with GC content at third positions of codons at Xdh (Figure 4). Both observations suggest that patterns of codon usage at Xdh across species are similar to genomic patterns of codon usage in melanogaster.
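The GC content at third codon positions plotted in Figure 4 is straightforward to compute from an in-frame coding sequence. The sketch below is only an illustration: the function name and the toy sequence are hypothetical, and the sequence is assumed to start at the first position of a codon.

```python
def gc3_percent(cds):
    """Percentage of G or C at third codon positions of an in-frame sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3       # ignore a trailing partial codon
    third_positions = cds[2:usable:3]
    if not third_positions:
        raise ValueError("sequence shorter than one codon")
    gc = sum(1 for base in third_positions if base in "GC")
    return 100.0 * gc / len(third_positions)


# Toy sequence (hypothetical): codons ATG GCC AAA TTT, third bases G, C, A, T.
print(gc3_percent("ATGGCCAAATTT"))   # 50.0
```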



Figure 4. Percentage G or C at third positions of Xdh codons vs. the overall level of codon bias (ENC).


 
Table 7. Preferred and unpreferred mutants and codon bias (ENC) at Xdh in Drosophila lineages

Under an equilibrium model of codon bias evolution, one expects equal numbers of unpreferred and preferred fixations. Our fixation data, pooled across lineages (Table 7), clearly deviate from this expectation (binomial probability, P < 0.001). This result is not attributable to a small number of lineages that deviate strongly from the equilibrium prediction, but rather results from a general trend toward unpreferred fixations in most lineages—14 of 18 lineages have fixed more unpreferred than preferred mutations.

Similarly, there is a strong trend in the direction of excess unpreferred polymorphisms. Seventeen of 18 species have greater numbers of unpreferred than preferred polymorphisms. Under the neutral equilibrium model, the proportion of unpreferred to preferred polymorphisms should be equal to the proportion of unpreferred to preferred fixations. Comparison of unpreferred and preferred polymorphisms and fixations in simulans and melanogaster showed that simulans has a significant excess of unpreferred polymorphisms, while the polymorphic and fixed mutants in melanogaster were compatible with the strictly neutral model (AKASHI 1996, 1999). Though Xdh fixations show a highly significant deviation from the equilibrium expectation (Table 7), joint consideration of the polymorphic and fixed mutations does not reject the neutral model for silent sites at Xdh. This is similar to the pattern seen in data from several melanogaster genes (AKASHI 1995).

A caveat regarding these conclusions comes from separate analysis of data from the repleta group, which are consistent with the aforementioned results primarily because of the very large number of unpreferred fixations in the eremophila lineage (Table 7). If these data are omitted, the remaining repleta group data appear to differ from data from other species groups. Polymorphic and fixed, preferred and unpreferred mutations are significantly heterogeneous (P < 0.001). Furthermore, there is no excess of unpreferred fixations (25 preferred vs. 37 unpreferred, binomial probability, P = 0.08). Of the 8 non-eremophila repleta lineages, only 4 (Table 6) have fixed more unpreferred than preferred mutations [i.e., all 4 lineages (of 18 total lineages) that have not fixed more unpreferred mutations are from the repleta group].
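The binomial comparison of preferred and unpreferred fixations can be illustrated with the one set of counts given explicitly in the text (25 preferred vs. 37 unpreferred). The sketch below computes an exact upper-tail probability under the equilibrium expectation of equal numbers in the two categories; the function name is hypothetical, and whether the reported probabilities are one-tailed or two-tailed is not stated in the text, so both are shown.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """Exact probability of observing k or more successes in n trials."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


preferred, unpreferred = 25, 37          # counts quoted in the text
total = preferred + unpreferred

one_tailed = binomial_upper_tail(unpreferred, total)
two_tailed = min(1.0, 2 * one_tailed)    # symmetric because p = 0.5

print(round(one_tailed, 3))
print(round(two_tailed, 3))
```

The one-tailed value comes out near 0.08, which agrees with the probability quoted above for these counts.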

All preferred melanogaster codons end in G or C (AKASHI 1995). Thus, our fixation data should reveal an accumulation of A/T. Mutations were categorized as G or C to A or T and A or T to G or C. Numbers of fixed mutations in the two categories (pooled across lineages) for silent and replacement sites are given in Table 8. At equilibrium for base composition, one expects equal numbers of fixations in the two categories. There is a significant excess of A/T fixations (binomial probability, P < 0.001) at silent sites. Replacement sites show the same pattern, though the replacement fixations show only a marginally significant deviation from the equilibrium expectation (perhaps because of the smaller number of observations compared to silent mutations).
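The two mutation categories used here can be expressed as a small classifier over polarized (ancestral, derived) base pairs. This is a minimal sketch: the function name, the category labels, and the example list of fixations are hypothetical, and changes that leave G/C content unaltered (G to C, A to T, and their reverses) are placed in a third category that does not enter the comparison.

```python
from collections import Counter

GC_BASES = {"G", "C"}
AT_BASES = {"A", "T"}


def classify_mutation(ancestral, derived):
    """Classify a polarized mutation by its effect on base composition."""
    ancestral, derived = ancestral.upper(), derived.upper()
    if ancestral in GC_BASES and derived in AT_BASES:
        return "GC->AT"
    if ancestral in AT_BASES and derived in GC_BASES:
        return "AT->GC"
    return "no_change"     # G<->C or A<->T: base composition is unaffected


# Hypothetical polarized fixations as (ancestral, derived) pairs.
fixations = [("G", "A"), ("C", "T"), ("A", "G"), ("G", "C"), ("C", "A")]
tally = Counter(classify_mutation(a, d) for a, d in fixations)
print(dict(tally))
```

The counts in the first two categories are what a binomial test against the equal-numbers expectation would then compare.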


 
Table 8. Fixations along Drosophila lineages

A smaller NE in melanogaster than in simulans was proposed as a possible explanation for the much higher proportion of unpreferred fixations in the former (AKASHI 1996). If this were a general explanation, we might expect to observe a negative correlation between ENC and silent heterozygosity; species with histories of larger NE should have higher levels of codon bias (and thus smaller ENC). Figure 5 shows a scatterplot of ENC vs. silent heterozygosity for our Xdh samples; we see no evidence of a negative correlation between these two variables (Spearman's ρ = 0.29, P = 0.25). In fact, the two willistoni group species have very high levels of silent heterozygosity, yet show low levels of codon bias at Xdh.



Figure 5. Codon bias (ENC) vs. silent θ for 17 species in which at least four alleles were sampled.


*  DISCUSSION

Silent substitutions at Xdh are biased toward unpreferred mutations. The two best-studied Drosophila species, melanogaster and simulans, have each fixed more unpreferred than preferred mutations (AKASHI 1995; TAKANO 1998; BEGUN 2001; MCVEAN and VIEIRA 2001). Although additional data are required to make strong inferences about global properties of silent substitutions in flies, the Xdh data presented here, data from several genes in melanogaster and simulans, and data from the saltans group of Drosophila (RODRIGUEZ-TRELLES et al. 1999) suggest that accumulation of unpreferred fixations may be common in Drosophila genes. Since unpreferred mutations end in A or T, the accumulation of unpreferred mutations is equivalent to a change in base composition toward A or T. Data suggesting that replacement (RODRIGUEZ-TRELLES et al. 1999, 2000) or noncoding (TAKANO-SHIMIZU 2001) fixations may also be skewed toward A or T suggest that evolution of base composition may not be restricted to silent sites.

We can entertain at least three kinds of explanations for evolution of A/T content. First, lineages may evolve in response to a change of A/T mutation bias in an ancestor. Second, lineages may evolve in response to selection favoring A/T (this requires independent selection favoring A/T in different lineages). Finally, A/T accumulation may reflect fixation of slightly deleterious (unpreferred) mutations by genetic drift. This final hypothesis may seem unlikely for our Xdh data because it invokes reduction of fitness in several Drosophila lineages that are biologically and historically distinct. Nevertheless, global factors (e.g., temperature) reducing Drosophila population sizes at some point during the last several million years could have promoted fixation of very slightly deleterious alleles, as suggested by AKASHI (1996, 1999) for the melanogaster lineage. Data of the type described here from several genes and from noncoding DNA (RODRIGUEZ-TRELLES et al. 1999, 2000; TAKANO-SHIMIZU 2001) may help distinguish between some alternatives. For example, sequences from virilis (BERGMAN and KREITMAN 2001) and the saltans group (RODRIGUEZ-TRELLES et al. 1999, 2000) suggest that these lineages may be evolving toward higher A/T content (relative to melanogaster), even in non-protein-coding regions. Such results weaken the hypothesis that accumulation of A/T at silent sites in exons can be explained simply by weaker selection against borderline deleterious mutations.

Although there are broad generalizations to be gleaned from the data, it also seems clear that there may be strong lineage effects on patterns of base composition evolution. For example, eremophila appears to have experienced an atypically large excess of unpreferred fixations relative to the other repleta lineages. This historical inference is consistent with the fact that eremophila Xdh is the least biased of the repleta group samples. Comparison of the repleta group to other species groups also supports the importance of lineage effects. Excluding eremophila, the repleta group shows no significant accumulation of unpreferred fixations. This is in contrast to pooled data from the other species groups, which show a highly significant excess of unpreferred fixations. Patterns of codon bias are consistent with this inference in that codon bias of repleta species (mean ENC = 39.8) is greater than that of other groups (mean obscura ENC = 45.3; mean virilis ENC = 44.2; mean willistoni ENC = 55.5). Major lineage effects can also be inferred simply by noting that A/T content at third positions among species for this region of Xdh ranges from 16.5 to 54.9%. Given that there was a single common ancestor with a particular A/T content, the wide range of current A/T contents is indicative of a heterogeneous substitution process (e.g., RODRIGUEZ-TRELLES et al. 1999, 2000; TAKANO-SHIMIZU 2001). It is difficult to speculate on the causes of these lineage effects given the limited data. For example, the small amount of data for most lineages means that we are still unable to assess the potential significance of lineage × locus interactions (TAKANO-SHIMIZU 2001). Data from other loci and other types of nucleotide sites will be important for clarifying Drosophila substitution patterns.

Previously collected data on polymorphism and divergence at silent sites suggested the presence of excess unpreferred (i.e., putatively borderline deleterious) polymorphisms in simulans, but not in melanogaster (AKASHI 1995, 1996). Similar to the pattern previously observed for melanogaster (AKASHI 1996), the contingency table of pooled, polymorphic, and fixed silent mutations at Xdh (Table 7) provides no support for heterogeneity of polymorphisms and fixations. However, consideration of the different species groups or overall levels of codon bias of Xdh genes does provide some evidence for nonneutral dynamics of silent mutations. Table 9 shows the number of polymorphic and fixed, unpreferred and preferred mutations in high-bias species (spenceri, arizonae, mulleri, mojavensis, aldrichi, leonis, hydei) vs. low-bias species (americana, azteca, athabasca, affinis, willistoni, equinoxialis, eremophila). Allele configurations for the high-bias species are significantly heterogeneous (G-test, P = 0.001) in the direction of excess unpreferred polymorphisms, as was previously observed in analyses of simulans sequences (AKASHI 1995). Similarly, polymorphic and fixed mutants for the seven low-bias genes are significantly heterogeneous (P = 0.026). However, if data from eremophila are omitted, the polymorphic and fixed silent mutations in the remaining six low-bias species are no longer significantly heterogeneous. The numbers of fixed preferred vs. unpreferred mutations deviate from equilibrium expectation for low-bias species (binomial probability, P < 0.001; omitting eremophila, P = 0.005). However, fixations for the high-bias species do not deviate from the expectation (binomial probability, P = 0.26). Thus, although there does appear to be a general accumulation of unpreferred fixations, the effect seems to be greater in lower-bias lineages.
Unfortunately, inferences about high-bias species are compromised by the fact that they all belong to the repleta group. Therefore, we cannot determine whether patterns associated with these genes result from lineage effects or effects related to overall levels of codon bias.
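The heterogeneity tests on polymorphic vs. fixed, preferred vs. unpreferred mutations are G-tests on 2 × 2 contingency tables. A minimal sketch of such a test is given below; it uses no continuity or Williams correction (the text does not say whether one was applied), obtains the P value from the chi-square distribution with one degree of freedom, and runs on hypothetical counts rather than the entries of Table 9. The function name is also hypothetical.

```python
from math import erfc, log, sqrt


def g_test_2x2(table):
    """G statistic and P value (1 d.f.) for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if observed > 0:               # an empty cell contributes nothing
                g += 2.0 * observed * log(observed / expected)
    g = max(g, 0.0)                        # guard against rounding below zero
    # Survival function of chi-square with 1 d.f. is erfc(sqrt(g / 2)).
    return g, erfc(sqrt(g / 2.0))


# Hypothetical counts: rows = polymorphic, fixed; columns = unpreferred, preferred.
skewed = [[30, 10], [10, 30]]
proportional = [[10, 20], [20, 40]]

print(g_test_2x2(skewed))
print(g_test_2x2(proportional))
```

A table whose rows have the same proportions gives G near zero and P near 1, whereas an excess of unpreferred polymorphisms relative to fixations inflates G.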


 
Table 9. Unpreferred and preferred mutations in lineages with high-bias and low-bias Xdh genes

Under a neutral model with homogeneous mutation rates, the fraction of new silent and replacement mutants that are neutral is expected to be the same across populations of different effective sizes. As predicted under this model, we observed a positive correlation between silent and replacement heterozygosity. However, other aspects of the data suggest the possibility of different dynamics of protein vs. silent variants. First, compared to silent polymorphisms, replacement polymorphisms consistently show greater skew toward rare alleles. Second, there is a marginally significant negative correlation between silent heterozygosity and Tajima's D for replacement polymorphisms (Figure 6, Spearman's ρ = -0.51, P = 0.04). In other words, populations that harbor more silent variation tend to have amino acid polymorphisms that are more highly skewed toward rare alleles. This is the pattern one might expect if silent polymorphism were more highly correlated with NE than was replacement polymorphism and if selection against slightly deleterious amino acid mutations were more effective in larger populations (AKASHI 1999). Nevertheless, given the weakness of the data and the possibility that demographic phenomena or several forms of selection (GILLESPIE 1999) may produce negative Tajima's D values, we should be reluctant to favor a particular explanation for the frequency spectrum data.



Figure 6. Tajima's D for replacement polymorphisms vs. silent θ for 17 species in which at least four alleles were sampled.

The analyses presented here are, to our knowledge, the first attempt to investigate whether nucleotide substitution rates differ along lineages leading to more polymorphic vs. less polymorphic species (but see SKIBINSKI and WARD 1982 for a similar analysis of allozyme data). We found no convincing correlation between silent heterozygosity and rates of protein evolution, though a subset of the data revealed a nonsignificant trend for more polymorphic species pairs having higher rates of protein evolution (Figure 3). There are several possible interpretations for the absence of a strong relationship between heterozygosity and protein evolution. First, the neutral model predicts no association between population size and rate of evolution. Second, the correlation between NE in the recent vs. more distant past may be weak. For example, closely related species such as simulans and melanogaster (e.g., MORIYAMA and POWELL 1996) or simulans and sechellia (HEY and KLIMAN 1993; KLIMAN et al. 2000) have different levels of nucleotide variation, though they recently evolved from common ancestral populations of a certain size. Third, as noted earlier, recent theoretical results suggest there may be no major effect of population size on substitution rates, even if substitutions are influenced by natural selection (CHERRY 1998; GILLESPIE 1999, 2000, 2001). Fourth, the relatively small numbers of amino acid fixations observed in many lineages and/or the large variances on estimates of θ could limit our power to detect a correlation. Finally, Xdh amino acid variants could be under strong directional selection only in some lineages and/or at some times. It seems that the attempt to rule out certain models of molecular evolution using associations of heterozygosity and replacement substitution rates will be difficult.


*  FOOTNOTES

Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under accession nos. AF543072, AF543189.


*  ACKNOWLEDGMENTS

We thank the individuals listed in Table 1 for providing flies or DNA. J. Gillespie, C. Langley, and two anonymous reviewers provided useful comments. This work was supported by the National Institutes of Health, the National Science Foundation, and a Sloan Young Investigator Award in Molecular Evolution.

Manuscript received January 21, 2002; Accepted for publication September 19, 2002.


*  APPENDIX


 

 
APPENDIX


*  LITERATURE CITED

AKASHI, H., 1995 Inferring weak selection from patterns of polymorphism and divergence at silent sites in Drosophila. Genetics 139:1067-1076.

AKASHI, H., 1996 Molecular evolution between Drosophila melanogaster and D. simulans: reduced codon bias, faster rates of amino acid substitution, and larger proteins in D. melanogaster. Genetics 144:1297-1307.

AKASHI, H., 1999 Inferring the fitness effects of DNA mutations from polymorphism and divergence data: statistical power to detect directional selection under stationarity and free recombination. Genetics 151:221-238.

AKASHI, H. and S. W. SCHAEFFER, 1997 Natural selection and the frequency distributions of "silent" DNA polymorphism in Drosophila. Genetics 146:295-307.

ANDOLFATTO, P., 2001 Contrasting patterns of X-linked and autosomal nucleotide variation in Drosophila melanogaster and Drosophila simulans. Mol. Biol. Evol. 18:279-290.

AQUADRO, C. F., K. M. LADO, and W. A. NOON, 1988 The rosy region of Drosophila melanogaster and Drosophila simulans. I. Contrasting levels of naturally occurring DNA restriction map variation and divergence. Genetics 119:875-878.

BEGUN, D. J., 1996 Population genetics of silent and replacement variation in Drosophila simulans and D. melanogaster: X/autosome differences? Mol. Biol. Evol. 13:1405-1407.

BEGUN, D. J., 2001 The frequency distribution of nucleotide variation in Drosophila simulans. Mol. Biol. Evol. 18:1343-1352.

BERGMAN, C. M. and M. KREITMAN, 2001 Analysis of conserved noncoding DNA in Drosophila reveals similar constraints in intergenic and intronic sequences. Genome Res. 11:1335-1345.

CHERRY, J. L., 1998 Should we expect substitution rate to depend on population size? Genetics 150:911-919.

CHOUDHARY, M. and R. S. SINGH, 1987 A comprehensive study of genic variation in Drosophila melanogaster. III. Variations in genetic structure and their causes between Drosophila melanogaster and its sibling species Drosophila simulans. Genetics 117:697-710.

FISHER, R. A., 1958 The Genetical Theory of Natural Selection. Dover Publications, New York.

GILLESPIE, J. H., 1999 The role of population size in molecular evolution. Theor. Popul. Biol. 55:145-156.

GILLESPIE, J. H., 2000 Genetic drift in an infinite population: the pseudohitchhiking model. Genetics 155:909-919.

GILLESPIE, J. H., 2001 Is the population size of a species relevant to its evolution? Evolution 55:2161-2169.

GLEASON, J. M. and J. R. POWELL, 1997 Interspecific and intraspecific comparisons of the period locus in the Drosophila willistoni sibling species. Mol. Biol. Evol. 14:741-753.

HEY, J. and R. M. KLIMAN, 1993 Population genetics and phylogenetics of DNA sequence variation at multiple loci within the Drosophila melanogaster species complex. Mol. Biol. Evol. 10:804-822.

KIMURA, M., 1983 The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, UK.

KLIMAN, R. M., P. ANDOLFATTO, J. A. COYNE, F. DEPAULIS, M. KREITMAN et al., 2000 The population genetics of the origin and divergence of the Drosophila simulans complex species. Genetics 156:1913-1931.

KREITMAN, M. and M. ANTEZANA, 2000 Population and evolutionary genetics of codon usage in Drosophila, pp. 82–101 in Evolutionary Genetics: From Molecules to Morphology, edited by R. SINGH and C. KRIMBAS. Cambridge University Press, Oxford.

MCVEAN, G. A. T. and B. CHARLESWORTH, 1999 A population genetic model for the evolution of synonymous codon usage: patterns and predictions. Genet. Res. 74:145-158.

MCVEAN, G. A. T. and J. VIEIRA, 1999 The evolution of codon preference in Drosophila: a maximum-likelihood approach to parameter estimation and hypothesis testing. J. Mol. Evol. 49:63-75.

MCVEAN, G. A. T. and J. VIEIRA, 2001 Inferring parameters of mutation, selection and demography from patterns of synonymous site evolution in Drosophila. Genetics 157:245-257.

MORIYAMA, E. and J. R. POWELL, 1996 Intraspecific nuclear DNA variation in Drosophila. Mol. Biol. Evol. 13:261-277.

OHTA, T., 1992 The nearly neutral theory of molecular evolution. Annu. Rev. Ecol. Syst. 23:263-286.

POWELL, J. R., 1997 Progress and Prospects in Evolutionary Biology: The Drosophila Model. Oxford University Press, New York.

RILEY, M. A., S. R. KAPLAN, and M. VEUILLE, 1992 Nucleotide polymorphism at the xanthine dehydrogenase locus in Drosophila pseudoobscura. Mol. Biol. Evol. 9:56-69.

RODRIGUEZ-TRELLES, F., R. TARRIO, and F. J. AYALA, 1999 Switch in codon bias and increased rates of amino acid substitution in the Drosophila saltans species group. Genetics 153:339-350.

RODRIGUEZ-TRELLES, F., R. TARRIO, and F. J. AYALA, 2000 Fluctuating mutation bias and the evolution of base composition in Drosophila. J. Mol. Evol. 50:1-10.

ROZAS, J. and R. ROZAS, 1999 DnaSP 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174-175.

SKIBINSKI, D. O. F. and R. D. WARD, 1982 Correlations between heterozygosity and evolutionary rate of proteins. Nature 298:490-492.

TAJIMA, F., 1989 Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.

TAKANO, T. S., 1998 Rate variation of DNA sequence evolution in the Drosophila lineages. Genetics 149:959-970.

TAKANO-SHIMIZU, T., 2001 Local changes in GC/AT substitution biases and in crossover frequencies on Drosophila chromosomes. Mol. Biol. Evol. 18:606-619.

WRIGHT, F., 1990 The effective number of codons used in a gene. Gene 87:23-39.

WRIGHT, S., 1931 Evolution in Mendelian populations. Genetics 16:97-159.

WRIGHT, S., 1932 The roles of mutation, inbreeding, crossbreeding and selection in evolution. Proceedings of the Sixth International Congress in Genetics, Vol. 1, pp. 356–366.



