Population, Evolutionary and Genomic Consequences of Interference Selection
Josep M. Comeron (a, b) and Martin Kreitman (a)
a Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637
b Department of Biological Sciences, University of Iowa, Iowa City, Iowa 52242
Corresponding author: Josep M. Comeron, University of Iowa, 433 Biology Bldg., Iowa City, IA 52242. E-mail: josep-comeron{at}uiowa.edu
Communicating editor: N. TAKAHATA
| ABSTRACT |
|---|
Weakly selected mutations are most likely to be physically clustered across genomes and, when sufficiently linked, they alter each other's fixation probability, a process we call interference selection (IS). Here we study the population genetic and evolutionary consequences of IS on the selected mutations themselves and on adjacent selectively neutral variation. We show that IS reduces levels of polymorphism and increases low-frequency variants and linkage disequilibrium, in both selected and adjacent neutral mutations. IS can account for several well-documented patterns of variation and composition in genomic regions with low rates of crossing over in Drosophila. IS cannot be described simply as a reduction in the efficacy of selection and effective population size in standard models of selection and drift. Rather, IS can be better understood with models that incorporate a constant "traffic" of competing alleles. Our simulations also allow us to make genome-wide predictions that are specific to IS. We show that IS will be more severe at sites in the center of a region containing weakly selected mutations than at sites located close to the edge of the region. Drosophila melanogaster genomic data strongly support this prediction, with genes without introns showing significantly reduced codon bias in the center of coding regions. As expected, if introns relieve IS, genes with centrally located introns do not show reduced codon bias in the center of the coding region. We also show that reasonably small differences in the length of intermediate "neutral" sequences embedded in a region under selection increase the effectiveness of selection on the adjacent selected sequences. Hence, the presence and length of sequences such as introns or intergenic regions can be a trait subject to selection in recombining genomes. In support of this prediction, intron presence is positively correlated with a gene's codon bias in D. melanogaster. Finally, the study of temporal dynamics of IS after a change of recombination rate shows that nonequilibrium codon usage may be the norm rather than the exception.
THE general concept of effective population size (Ne), due to [...] and Neµ (ß), respectively, include effective population size (see Table 1). However, polymorphism levels are not constant across the genome but rather are correlated with recombination rates, suggesting that additional factors may be influencing Ne.
Further investigation of whether Ne can be viewed as varying across a genome takes advantage of its consequences for the effectiveness of weak selection. Theory predicts that the evolutionary dynamics of mutations whose selective effects are on the order of the reciprocal of population size (i.e., scaled selection intensities of 0.25–2.5) are very sensitive to small shifts in Ne.
Several models of strong selection (scaled selection intensity >> 1) have been proposed to explain why Ne is reduced in regions of low recombination: (i) the hitchhiking (HH) and pseudo-HH (pHH) models, which invoke frequent positive Darwinian selection [...]
However, strong selection is not a requirement for this effect.
Weakly selected mutations, taken individually, are not expected to have a measurable effect on population parameters or on the tree topology of linked neutral mutations.
We also investigate two other consequences of IS under low rates of recombination. First, because genes (exons and regulatory regions) are embedded in a matrix of generally less severely constrained DNA, IS may occur at sites with well-defined boundaries along the DNA. Therefore, we study the expected consequences of IS along such intervals under selection. We hypothesize that the effects of IS should not be uniform across a gene, and we use simulations to generate predictions that can be tested with Drosophila genomic data. Second, neutral sequences located between groups or clusters of selected sites under weak/moderate selection (e.g., introns within coding regions of genes) can be viewed as modifiers of recombination and hence can alter the effectiveness of selection.
| MATERIALS AND METHODS |
|---|
Forward computer simulations:
A Wright-Fisher model was simulated with N diploid individuals (2N chromosomes) as previously described [...] N, respectively (see Table 1 for definitions). The number of recombination events per meiosis is not restricted (assuming no chiasma interference) to avoid underestimating the effect of recombination in long sequences when the scaled recombination rate is high. The mutation process allows only two allelic states at a site, and reversible mutation is permitted, mimicking the mutation process between preferred (p) and unpreferred (u) codons. Mutation rates from p to u and vice versa are w and v, respectively, where [...] is the mutational bias. Unless otherwise indicated, we applied a mutation rate of [...], with [...]. The previous assumption of only two allelic states [...]
The fitness of each individual is based only on the selected sequence. The selection differential between p (preferred allele) and u (unpreferred allele) in the selected sequence is +s; fitness is multiplicative over sites and mutations are semidominant in their effect on fitness. Each new generation is obtained by first choosing N individuals (2N chromosomes) with probability proportional to their relative fitness. The next generation is constituted by randomly pairing the 2N chromosomes (each composed of the selected and adjacent neutral sequence), which are possibly mutated and/or recombined, to form N new diploid individuals.
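The generation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: the function names, the coding of preferred alleles as 1 and unpreferred as 0, the per-copy fitness factor (1 + s/2), the placement of selected sites first and neutral sites after them on each chromosome, and the ordering of recombination before mutation are all choices made here for clarity.

```python
import random

def fitness(individual, s, n_selected):
    """Multiplicative, semidominant fitness computed from selected sites only."""
    hap_a, hap_b = individual
    # Count preferred alleles (coded 1) over the selected part of both chromosomes.
    copies = sum(hap_a[:n_selected]) + sum(hap_b[:n_selected])
    # Assumed scaling: every preferred copy multiplies fitness by (1 + s/2).
    return (1.0 + s / 2.0) ** copies

def recombine(hap_a, hap_b, r, rng):
    """Cross over independently at each junction with probability r."""
    out_a, out_b = [], []
    swapped = False
    for i in range(len(hap_a)):
        if i > 0 and rng.random() < r:
            swapped = not swapped  # the number of crossovers is not restricted
        out_a.append(hap_b[i] if swapped else hap_a[i])
        out_b.append(hap_a[i] if swapped else hap_b[i])
    return out_a, out_b

def mutate(hap, p_to_u, u_to_p, rng):
    """Reversible two-state mutation between preferred (1) and unpreferred (0)."""
    out = []
    for allele in hap:
        rate = p_to_u if allele == 1 else u_to_p
        out.append(1 - allele if rng.random() < rate else allele)
    return out

def next_generation(population, s, r, p_to_u, u_to_p, n_selected, rng):
    """Return N new diploids built from N fitness-weighted parents."""
    weights = [fitness(ind, s, n_selected) for ind in population]
    # Choose N individuals with probability proportional to relative fitness.
    parents = rng.choices(population, weights=weights, k=len(population))
    chromosomes = []
    for hap_a, hap_b in parents:
        rec_a, rec_b = recombine(hap_a, hap_b, r, rng)
        chromosomes.append(mutate(rec_a, p_to_u, u_to_p, rng))
        chromosomes.append(mutate(rec_b, p_to_u, u_to_p, rng))
    # Randomly pair the 2N chromosomes into N new individuals.
    rng.shuffle(chromosomes)
    return [(chromosomes[i], chromosomes[i + 1])
            for i in range(0, len(chromosomes), 2)]
```

Passing a seeded `random.Random` instance as `rng` keeps runs reproducible. A list-of-lists representation is slow for the sequence lengths and run times discussed in the text, so it should be read as a description of the logic rather than a practical simulator.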
As indicated in RESULTS, the time to reach base composition equilibrium under a mutation-selection-drift (MSD) balance and IS (e.g., codon usage) when [...] may take on the order of approximately 100–250 N generations. Accordingly, our study of IS at equilibrium begins after a minimum of 250 N generations to assure base composition equilibrium. Each independent population realization was analyzed every N generations for a minimum of 1000 N generations. Population parameters were estimated in 20 independent samples. All estimates of population and evolutionary parameters (heterozygosity, nucleotide diversity, frequency skew, fixation rates) as well as the frequency of preferred codons (P) were obtained by studying the same number of sites, 250, regardless of the total number of sites in the sequence when L [...] 250. These studied sites were homogeneously distributed across the sequence, unless explicitly indicated, to assure an average estimate of the parameters across the region. In every simulation both the selected and neutral sequences were analyzed. The ranges of parameter values we investigated were 0 to 0.4 for the scaled recombination rate, 0.25 to 2.5 for the scaled selection intensity, and 125 to 2500 for L. The ranges of recombination rates under study are representative of most eukaryotes, including D. melanogaster. Assuming Ne of approximately 1 × 10^6 for D. melanogaster [...] scaled recombination rate < 0.05, or < 0.1 after taking into account gene conversion [...] scaled recombination rate [...] 0.004 may contain approximately 15% of genes using rates of crossing over; these genomic regions are defined by the cytological bands 1A–2C/20C–20F (X chromosome), 21A/38B–40F/41A–44B/60D–60F (chromosome 2), 61A–61B/76B–80F/81A–84E (chromosome 3), and the complete fourth chromosome. When the contribution of gene conversion to the total recombination is taken into account, a scaled recombination rate [...] 0.004 may apply to >10% of D. melanogaster genes.
To evaluate the relative change in the effectiveness of selection on the selected sequences caused by varying the parameters (changing recombination rates and L, and the presence and length of intermediate regions), we compared the estimated value of the selection parameter on the basis of the observed P. This estimate follows directly from the probability of fixation of p and u, and from P at equilibrium under the infinitely many sites model and free recombination [...]
Linkage disequilibrium (LD) was estimated as the average over all pairwise comparisons of polymorphic sites by using D' (LD-D').
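As an illustration of the LD measure named above, the following sketch computes a signed D' for one pair of biallelic sites and averages it over all pairs of polymorphic sites. The data layout (a list of equal-length 0/1 haplotypes), the function names, and the choice to keep the sign (so that repulsion between alleles coded 1 gives negative values) are assumptions made here, not details confirmed by the text.

```python
from itertools import combinations

def d_prime(haplotypes, i, j):
    """Signed D' between sites i and j; None if either site is monomorphic."""
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n
    p_b = sum(h[j] for h in haplotypes) / n
    p_ab = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    d = p_ab - p_a * p_b
    # Dmax depends on the sign of D (standard Lewontin normalization).
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return None
    return d / d_max

def mean_d_prime(haplotypes):
    """Average D' over all pairs of polymorphic sites; None if no such pair."""
    n_sites = len(haplotypes[0])
    values = []
    for i, j in combinations(range(n_sites), 2):
        value = d_prime(haplotypes, i, j)
        if value is not None:
            values.append(value)
    return sum(values) / len(values) if values else None
```

With this coding, a sample in which the alleles coded 1 at two sites never occur on the same chromosome yields D' = -1, which is the kind of repulsion association discussed in RESULTS.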
Analyses of the D. melanogaster genome:
We studied the complete D. melanogaster genome.
Heterogeneous codon bias across exons:
Two groups of genes were investigated. The first group (659 genes) was composed of all genes with a single long exon (>1000 bp or >333 amino acids). The second group (187 genes) included all genes with long coding regions (>333 amino acids) interrupted by introns and satisfying two criteria: (i) all (one or more) introns should be centrally located, dividing the coding region into two comparable regions (i.e., introns located between 30 and 70% of the relative total length of the coding region), and (ii) at least one intron should be >100 bp. The synonymous codon usage bias was measured using the frequency of GC-ending codons (GC3), the frequency of GC-ending codons in four-fold degenerate amino acids (GC4), and the frequency of preferred codons in D. melanogaster.
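A minimal sketch of the GC3 measure, and of a comparison between lateral and central sections of a coding region, is given below. It assumes an in-frame DNA string and equal-sized sections defined by codon count; how the sections were actually delimited in the study, and whether start and stop codons were excluded, are not stated here, so those details are assumptions.

```python
def gc3(coding_seq):
    """Fraction of codons ending in G or C; None for sequences under 3 bases."""
    seq = coding_seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    third_bases = [seq[i + 2] for i in range(0, usable, 3)]
    if not third_bases:
        return None
    return sum(1 for base in third_bases if base in "GC") / len(third_bases)

def gc3_by_section(coding_seq, n_sections=3):
    """GC3 for consecutive sections containing (nearly) equal codon counts."""
    n_codons = len(coding_seq) // 3
    bounds = [k * n_codons // n_sections for k in range(n_sections + 1)]
    return [gc3(coding_seq[3 * bounds[k]:3 * bounds[k + 1]])
            for k in range(n_sections)]
```

For three sections, the first and last values correspond to the lateral regions and the middle value to the central region.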
Codon bias and the proportion of selected sites in a gene:
The analysis was carried out using all 7499 complete genes (out of 9172) with introns. As a proxy for the relative number (or density) of selected sites in a gene, we used the proportion of the length of the coding region (PLCR) in a gene, measured as the ratio between the length of the coding region and the length of the coding region plus the total length of the introns.
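The PLCR definition above reduces to a one-line ratio. The sketch below uses hypothetical argument names and adds only a basic input check.

```python
def plcr(coding_length, intron_lengths):
    """Proportion of the length of the coding region (PLCR) in a gene.

    coding_length: total coding length in bp.
    intron_lengths: iterable with the length in bp of each intron.
    """
    if coding_length <= 0:
        raise ValueError("coding_length must be positive")
    total_intron = sum(intron_lengths)
    return coding_length / (coding_length + total_intron)
```

For example, a gene with 1500 bp of coding sequence and a single 500-bp intron has PLCR = 1500 / 2000 = 0.75, while a gene with no introns has PLCR = 1.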
The recombination rate for each gene in the D. melanogaster genome was estimated as previously described.
| RESULTS |
|---|
Effects of IS on population and evolutionary parameters at selected and adjacent neutral sequences
Effectiveness of selection:
We investigated the effectiveness of selection on weakly selected mutations by analyzing the proportion of preferred mutations at equilibrium (P) [...] are nearly linearly related to Ne, the relationship between P and the selection parameter (Nes) is strongly nonlinear. For instance, a 5% increase in P represents a 50, 21, and 27% increase in the selection parameter when the original P is 0.5, 0.6, and 0.9, respectively. Therefore, although our simulations measure shifts in P with changes of parameters affecting IS, the magnitude of these shifts is better reflected by the change in the selection parameter needed to account for the results under a no-interference model (SS-MSD). As Fig 1 shows, the effectiveness of selection, as measured by this parameter, decreases as the recombination rate decreases, and this effect increases with L (see also [...] N < 2.5).
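To make the nonlinear relationship between P and the selection parameter concrete, the sketch below uses the widely cited Li–Bulmer form for codon usage at mutation-selection-drift equilibrium, P = 1 / (1 + κ e^(-S)). This is an assumption: the equation used in the article did not survive in this copy, the symbols S (a scaled selection coefficient, some multiple of Nes depending on the ploidy and dominance convention) and κ (the ratio of the p-to-u to the u-to-p mutation rate) are named here for illustration, and no attempt is made to reproduce the percentages quoted in the text.

```python
import math

def equilibrium_p(scaled_selection, kappa):
    """Expected frequency of preferred codons under the assumed Li-Bulmer form."""
    return 1.0 / (1.0 + kappa * math.exp(-scaled_selection))

def scaled_selection_from_p(p, kappa):
    """Invert the relationship: S = ln(kappa * P / (1 - P))."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(kappa * p / (1.0 - p))

def relative_change_in_selection(p, fold, kappa):
    """Relative change in S implied by multiplying P by `fold`.

    Undefined (division by zero) when the starting S is exactly zero,
    e.g. p = 0.5 with kappa = 1.
    """
    before = scaled_selection_from_p(p, kappa)
    after = scaled_selection_from_p(p * fold, kappa)
    return (after - before) / before
```

Because S depends on the logit of P, the same proportional shift in P maps to different proportional shifts in S depending on the starting P and on the assumed κ, which is why the text reports changes in the selection parameter rather than in P itself.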
Polymorphism levels:
We studied the effect of IS on polymorphism levels, as measured by heterozygosity, in selected and neutral sequences. Under single-site models of weak selection (SS-MSD), the expectations are clear. A general reduction of the intensity of selection predicts a relative increase of heterozygosity at selected sites, bringing it closer to heterozygosity at neutral sites. On the other hand, a reduction in Ne will cause a direct reduction in neutral heterozygosity. The expected net consequence of reducing Ne, and hence the intensity of selection, for mutations under SS-MSD is a reduction of heterozygosity at selected sites, because the reduction of neutral heterozygosity is always greater than the expected increase of selected polymorphism due to reduced selection, although this decrease is not expected to be proportional to the reduction of Ne. For strong selection, a moderate reduction of Ne would not alter heterozygosity at selected sites.
Our results (Fig 2) show that heterozygosity at selected sites is below the levels expected on the basis of the imposed strength of selection acting on these mutations under SS-MSD. For all combinations of selection intensity and recombination rate, heterozygosity at selected sites decreases as the number of sites under selection (L) increases [...] (e.g., heterozygosity at selected sites for [...] is 70–75% of that for [...], both for [...] and [...]), and this impact decreases, but is still noticeable, for very weak selection and high recombination (e.g., [...] and [...]). For [...], we also studied whether an even higher recombination rate [...] would completely eliminate IS. The results show that [...] does eliminate most IS on heterozygosity at selected sites when L [...] 125 compared to SS-MSD expectations, but IS is still detectable for larger L.
Heterozygosity is also reduced in the adjacent neutral sequences as a result of linkage to sites under weak selection. The most extreme reductions in neutral variation are observed for [...] (Fig 2A), where increasing either L or the selection intensity in the selected sequence substantially reduces neutral heterozygosity. The impact that L has on neutral heterozygosity increases with the intensity of selection. For example, when [...], neutral heterozygosity is approximately 75 and 50% of that observed when [...] for [...] and [...], respectively. When [...] and L is large, there is a tendency to observe similar levels of polymorphism in selected and adjacent neutral mutations, which is most evident for [...] (when L [...] 2500), implying that linked selectively neutral sites and sites under weak selection may not be distinguishable by this criterion. Increasing L reduces neutral heterozygosity even when the recombination rate is high (Fig 2B and C). This observed reduction is similar for different selection intensities when recombination is highest (scaled recombination rate of 0.4), likely reflecting a recombination rate threshold for IS.
Divergence and divergence to polymorphism ratio:
Under SS-MSD equilibrium the rate of fixation or divergence of selected mutations rapidly decreases with increasing selection intensity. We focused on the rate of divergence in the absence of recombination (scaled recombination rate of 0) to illustrate the effect of IS on this evolutionary parameter (Fig 3A). As expected, selection acting at linked sites does not influence divergence for neutral mutations. The fixation rate of mutations under selection increases with L for any given selection intensity, indicative of a reduction in the effectiveness of selection due to IS.
The SS-MSD model also predicts that the divergence:polymorphism ratio (Div/Pol) for weakly selected mutations decreases with increasing selection intensity because weak selection has stronger effects in reducing the rate of fixation than the level of polymorphism. Fig 3B shows the Div/Pol ratio for the region containing mutations under selection, again for [...]. For the case of [...], Div/Pol decreases with selection but to a lesser degree than that expected for an SS-MSD case. For the intermediate case of [...], Div/Pol is only barely affected by selection. More exceptional is the situation in which the number of sites under selection is moderate to large (e.g., [...]): in these cases Div/Pol increases not only relative to single-site expectations but also relative to neutral expectations. This trend results from two opposing effects of IS on selected mutations, increasing divergence and reducing polymorphism. Neutral sites (Fig 3C) show a consistent increase in the Div/Pol ratio with increasing IS, caused by the effect that IS has in reducing levels of linked neutral polymorphism.
Mutation frequency spectrum:
The SS-MSD models predict that, as selection intensity increases, weakly selected mutations will become less abundant and allele frequencies at polymorphic sites will decrease compared to neutral expectations. Fig 4 plots Tajima's D statistic, a measure of the skew of the allele frequency spectrum relative to the neutral expectation, for selected and neutral sequences under complete linkage. In the selected sequence, a more negative Tajima's D is observed with increasing selection intensity, as expected [...]. This trend does not hold, however, when selection intensity and L are large (i.e., [...]), where Tajima's D becomes unaffected or even less negative. This is not surprising, because Tajima's D statistic is not entirely independent of the number of segregating sites: it tends toward zero as the number of segregating sites in a sample becomes small for any given (nonneutral) frequency of variants. Therefore, IS increases the relative frequency of rare variants (hence it induces a negative Tajima's D) but IS also decreases the number of segregating sites, thus biasing Tajima's D estimates closer to zero when IS and its reduction of the number of segregating sites are severe.
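For readers who want to see how the statistic behaves, here is a compact sketch of Tajima's D using the standard published constants. The input format (a list of equal-length 0/1 haplotypes) and the function name are assumptions for illustration; the article does not describe its own implementation.

```python
import math

def tajimas_d(haplotypes):
    """Tajima's D for a sample of 0/1 haplotypes.

    Returns None when there are no segregating sites or the sample is
    too small (fewer than 3 chromosomes) for the variance term to be positive.
    """
    n = len(haplotypes)
    if n < 3:
        return None
    n_sites = len(haplotypes[0])
    seg_sites = 0
    pi = 0.0  # mean number of pairwise differences
    for site in range(n_sites):
        k = sum(h[site] for h in haplotypes)
        if 0 < k < n:
            seg_sites += 1
            pi += 2.0 * k * (n - k) / (n * (n - 1))
    if seg_sites == 0:
        return None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1.0) / (3.0 * (n - 1.0))
    b2 = 2.0 * (n * n + n + 3.0) / (9.0 * n * (n - 1.0))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2.0) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    if variance <= 0:
        return None
    return (pi - seg_sites / a1) / math.sqrt(variance)
```

An excess of rare variants makes the pairwise-difference estimate smaller than the estimate based on the number of segregating sites, so D is negative; an excess of intermediate-frequency variants makes it positive.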
The frequency spectrum of neutral mutations departs from the neutral equilibrium expectation, showing an excess of low-frequency alleles when IS occurs in the adjacent selected sequence. Tajima's D becomes more negative as the number of selected sites or the selection intensity on these sites increases. When IS is strongest, the skew toward low frequencies becomes similar for both selected and neutral mutations, a trend we have also encountered for heterozygosity.
IS also influences the allele frequency of variants in the selected sequences when recombination is highest [...], while the frequency spectrum of neutral variants in adjacent sequences remains mostly unaffected. The skew toward low-frequency variants in the selected sequences increases with L, although to a lesser degree than when the scaled recombination rate is [...] 0.004, and this effect intensifies with increasing selection intensity.
IS and linkage disequilibrium:
[...] N increases LD (with repulsion associations) in the selected sequences to a greater extent than in the adjacent neutral sequences (see Fig 5B for [...]). When recombination is very high [...], LD-D' also varies in the selected sequences for different selection intensities, but not in the adjacent neutral sequences.
However, the conditions that increase negative LD are the very same conditions for which there is an excess of low-frequency variants (with more negative Tajima's D estimates). Because most measures of LD correlate, to some degree, with the frequency of the mutations under study [...] N (and IS) increases. Therefore, the observed increase in LD with IS, as measured by D', is not only the consequence of a shift in the frequency spectrum.
Temporal dynamics after a change of recombinational environment:
Genome-wide and/or gene-specific changes in recombination rates may be common in many evolutionary systems, and so it is important to study the time needed to reach new equilibria. For instance, in Drosophila there is extensive gene order shuffling within chromosomal arms between species.
We studied the number of generations needed to reach equilibrium after a change in the recombination rate under IS conditions. For simplicity we assumed a population at equilibrium under the initial conditions and the instantaneous fixation of a randomly chosen allele in a new recombinational environment. Such a situation might apply to a gene located near the breakpoint of a chromosomal rearrangement that was quickly driven to fixation (possibly by natural selection). In accord with expectations, most population and evolutionary parameters reach their new equilibria much sooner than codon usage (Fig 7A and B). Under neutrality, a few Ne generations (approximately 4Ne for diploid individuals) after a hitchhiking event are sufficient to achieve near-equilibrium levels of polymorphism [...] heterozygosity at selected sites or Tajima's D increases with selection intensity, requiring approximately 20–40N generations when [...], while this time is close to 4N generations when the selection intensity is [...] 0.5.
Multilocus parameters, such as those that estimate codon usage, take a very large number of generations to reach a new equilibrium following perturbation. Indeed, codon usage requires approximately 100–250N or approximately 1–2.5/µ generations to reach the new equilibrium. This required time is not strongly dependent on the number of sites under selection, but it is dependent, as expected, on µ (data not shown). Two other features of codon bias evolution following a change of recombination environment were also observed. First, the change in codon bias is faster when the number of preferred mutations is increasing (i.e., changing from low to high recombination) than when the number is decreasing. Second, the weaker the selection, the longer the period required to reach the new base composition equilibrium in either direction. These two features can be easily explained by three factors: (i) the average time to fixation of weakly advantageous mutations is shorter than that expected for weakly deleterious mutations, (ii) the speed to fixation of preferred mutations increases with selection intensity, and (iii) mutational pressure toward unpreferred mutations is higher when the ancestral sequence has higher P.
We also observed a tendency for population and evolutionary parameters to "overshoot" their equilibrium values when a sequence changes from no recombination to a high recombination rate (stronger for [...] than for [...]), but this effect is not detectable in the opposite direction. This overshoot creates a transient situation more nearly resembling a neutral equilibrium. Under the conditions we investigated, population parameters are closest to the neutral expectations approximately 4–5N generations after the fixation of a single sequence in an environment with high recombination. For instance, in the case of [...], heterozygosity at selected sites changes through time from zero (right after the fixation event) to the new IS equilibrium under high recombination (after approximately 40N generations) with an intermediate state 50% higher than that finally observed at equilibrium.
IS and its effect across regions under uniform selection: Consider an interval of a recombining genome in which segregating sites under selection result in IS, and further assume this interval is embedded in a region containing few additional mutations under selection. Since the magnitude of the IS effect acting at a particular site in this interval will be governed by the interactions between the segregating sites located on both sides of that site, it is reasonable to expect that IS will be stronger at sites embedded in the middle of a region under selection than at sites located close to an edge of this region. This situation may apply to many protein-coding regions in eukaryotic genomes, but it would be pertinent to any group of physically clustered sites under weak selection surrounded by largely unconstrained sites. Here, we investigate the magnitude of the "center" vs. "edge" effect for plausible rates of recombination and selection. The issue under scrutiny is whether IS differs measurably between the center and edge of a region under selection when both mutation rates and selection coefficients are uniformly distributed across the region.
We studied population and evolutionary parameters across sequences with 2500 sites under uniform selection, recombination, and mutation, with emphasis on a lateral and the central region of 250 sites each. Two indicators of IS are depicted in Fig 8 for the central and lateral regions: the proportion of preferred codons (P) and the divergence to polymorphism (Div/Pol) ratio. As expected, no effect is seen for the no-recombination case. For intermediate recombination rates, IS differs between regions, with the central region showing stronger IS (central regions have lower P and higher Div/Pol ratio than lateral regions). This heterogeneous distribution of IS across regions (the center effect) decreases when recombination is very high, but it can still be seen for high recombination [...] when selection intensity is weak [...].
The effect of neutral sequences between regions under selection: We studied whether or not small changes in the overall recombination rate between two regions under selection, caused only by a change in the physical distance between them, have a detectable effect on the overall IS. Here, the simulation procedure allowed us to generate a variable number of neutral sites (i.e., middle or "spacer" sequence) between two sequences under selection. The two regions under selection were identical, with equal numbers of selected sites, selection coefficients per site, and mutation and recombination rates per site. Mutation and recombination rates per site in the spacer sequence are the same as in the flanking selected sequences, but the mutations were selectively neutral. Thus, with respect to IS the presence and length of the intermediate neutral region alters only the number of recombination events between the two selected regions; it does not change directly any parameter on the flanking selected sequences.
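The way a spacer changes the recombination experienced between the two selected regions can be illustrated with a small calculation. The sketch assumes, as an illustration only, that crossovers occur independently at each junction between adjacent sites with the same per-junction probability (no interference), and it reports both the expected number of crossovers and the probability of an odd number of crossovers (the recombination fraction) between the last site of the first selected region and the first site of the second. Function and parameter names are hypothetical.

```python
def junctions_between_blocks(spacer_length):
    """Number of site-to-site junctions separating the two selected blocks."""
    return spacer_length + 1

def expected_crossovers(r_per_junction, spacer_length):
    """Mean number of crossovers per meiosis between the two blocks."""
    return r_per_junction * junctions_between_blocks(spacer_length)

def recombination_fraction(r_per_junction, spacer_length):
    """Probability of an odd number of crossovers between the two blocks.

    For m independent junctions this is (1 - (1 - 2r)^m) / 2.
    """
    m = junctions_between_blocks(spacer_length)
    return 0.5 * (1.0 - (1.0 - 2.0 * r_per_junction) ** m)
```

With no spacer the two blocks are separated by a single junction, so the recombination fraction equals the per-junction rate; each added neutral site adds one more junction, so the value rises with spacer length without altering any parameter within the selected blocks themselves.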
We studied intermediate rates of recombination [...] for the case of a neutral sequence located between two sequences each of 500 selected sites. Fig 9A depicts the relative change of the effectiveness of selection (i.e., the selection parameter estimated from P) caused by the presence and length of the spacer sequence. The results show that the length of an intermediate neutral region has a detectable effect. In all cases, longer spacers lead to an increase of the effectiveness of selection (a reduction in IS) on the adjacent selected mutations. As an illustration, for the case of [...] and [...] the presence of a 1000-bp-long region in the middle of the selected sequence is equivalent to a relative increase of 7.4% in the overall fitness associated with the selected sequences (i.e., a gain of 2.3% preferred codons). A substantial fraction of the potential increment in fitness in regions of moderate to high recombination is achieved with short/intermediate sequences (<1000 bp), while for regions of more severely restricted recombination longer sequences are required to produce an equivalent increment in fitness. The maximum relative gain in fitness is higher for [...] than for [...] for the two rates of recombination investigated, as expected (see Fig 1).
Empirical tests of IS based on D. melanogaster's genome
Distribution of codon bias within genes:
As indicated in our simulations, IS is expected to be stronger in the center of regions under IS than in the margins of these regions. This leads to the first test prediction: Codon usage bias, a measure of the effectiveness of selection, will be lower in the middle of coding regions of genes than in the amino- or carboxy-terminal regions. Comparing codon bias levels within genes eliminates expression level and gene length as factors that can alter codon usage.
We restricted our attention to the set of genes in the D. melanogaster genome composed of single long exons (>333 amino acids; see MATERIALS AND METHODS), a total of 659 genes. The frequency of GC at the third position of codons (GC3) was used as a measure of codon usage bias [...], P < 1 × 10^-6), with a lower GC3 in the central region. A similar result is obtained when the average GC3 of the two lateral sections is compared to the central section [...] or when each lateral region is compared separately to the central section ([...], respectively). On average, the lower GC3 content in the central region of coding regions is equivalent to a reduction in the selection parameter of approximately 10% on synonymous mutations compared to lateral regions.
IS simulations also show that the intensity of IS increases with the length of the gene region (and hence number of sites) subject to weak selection. This leads to the second test prediction: The relative reduction in codon bias in the center of a gene will be positively correlated with the length of the coding region. Consistent with this prediction, we find a highly significant positive correlation between the length of the coding region and the difference of GC3 between lateral and central regions in the same set of 659 genes analyzed above (Spearman's correlation [...]). Quantitatively similar results are obtained with the frequency of preferred codons and with GC content at fourfold degenerate sites (data not shown).
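The rank correlation used for this test can be sketched without external libraries. The version below assigns average ranks to ties and then computes a Pearson correlation on the ranks; it returns only the coefficient, not a P value, and the variable names in the usage note are hypothetical.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; None if either variable has zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rank correlation coefficient between paired observations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and have at least 2 values")
    return pearson(average_ranks(x), average_ranks(y))
```

In the setting described above, `x` would hold the coding-region length of each gene and `y` the difference between lateral and central GC3 for the same gene.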
According to our simulations, the presence of neutrally evolving sequences placed in the center of a region subject to weak selection can relieve the IS effect. This leads to the third test prediction: Centrally located introns will ameliorate the effect of IS in the central region of genes that contain them. To test this prediction, we compared codon bias in these same 659 genes, which lack introns, with comparable genes with introns located in the center of the coding regions (see MATERIALS AND METHODS). Fig 10B shows the results for the 187 genes obtained from the genome database satisfying these criteria. For these genes, there is no apparent reduction of GC3 in the middle of the coding regions. Accordingly, we do not detect significant heterogeneity of GC3 between the three regions [...] or a difference between the GC3 content of the central and the two lateral regions (P > 0.15). In addition, the central sections of coding regions of genes with central intron(s) have a significantly higher GC3 content than the equivalent central sections in genes without introns (Mann-Whitney U-test, [...]) whereas both lateral sections show similar frequencies (P = 0.61 and P = 0.09). Therefore, the lower GC3 frequency in the middle of the coding region in genes without introns cannot be the result of general relaxed selection on codon bias in the central part of coding regions.
Proportion of selected sites in a gene and codon bias: According to our simulations, the presence of neutral sequences embedded in a region under selection causes an increase in the effectiveness of selection on adjacent selected sequences, the length of such neutral sequences being positively correlated with the increment of the effectiveness of selection. We investigated, therefore, a fourth test prediction: Codon bias will be positively correlated with measures of a gene's intron length and number. As a first approximation, we studied the relationship between measures of codon bias (e.g., GC3) and total intron length in the set of all genes with confirmed intron/exon structure. There is a weak positive relationship between GC3 and total intron length, both using all introns (R = 0.040, P = 0.0007) and after eliminating the small fraction of introns with detectable remnants of TE elements (R = 0.041, P = 0.0005).
We also studied the relationship between codon bias and measures of the proportion of sites under selection in a gene. As a simple measure of the density of sites under selection in a gene, we used the PLCR in a gene when embedded introns are included (see MATERIALS AND METHODS). The prediction under IS is again explicit: Codon bias (as measured by GC3) will decrease as PLCR increases. The analysis of all 7499 genes with introns reveals a significantly negative relationship between GC3 and PLCR (R = -0.136, P < 1 × 10^-6); equivalent results are obtained using other measures of codon bias. Fig 11, a display of GC3 when genes are grouped with respect to PLCR into five sets of equal sample size, shows that the effect may be stronger when PLCR is medium/high.
Gene length may have a confounding effect on the relationship between GC3 and PLCR because the length of the coding region is negatively related to codon bias.
Gene length and intron presence: IS predictions of the favorable consequences of intermediate or spacer sequences forecast that intron length will increase with the length of the coding region. The average length of introns increases with the total length of the coding region (R = 0.219, P < 1 × 10^-6). This relationship is not attributable to differences in either recombination rates or gene expression levels and remains significant (P < 1 × 10^-6) after controlling for these variables. A greater number of introns is also observed in long genes (R = 0.53, P < 1 × 10^-6), although this observation could be connected to causes other than IS.
Intergenic distance and gene length: In addition to having longer (and a greater number of) introns in relation to the length of a coding region, is there also evidence for greater intergenic distance as a function of the length of the coding regions of adjacent genes? This result is expected under a scenario where longer intergenic regions are favored when, otherwise, IS between adjacent genes would be enhanced, i.e., when the lengths of the neighboring coding regions increase. To address this seventh test prediction, we investigated the length of intergenic regions separating well-defined genes (see MATERIALS AND METHODS). The results, displayed in Fig 12, reveal a positive relationship between the length of the 6271 intergenic sequences investigated and the length of the flanking coding regions (R = 0.097, P < 1 × 10⁻⁶); a positive relationship is also observed in regions of high recombination (>3 × 10⁻⁸/bp/generation; R = 0.104, P < 1 × 10⁻⁶, n = 2367). Alternative explanations for this observation based on functional considerations might also be proposed; for example, genes with longer coding regions, if they are functionally more complex, might require tighter gene regulation (and hence longer noncoding regions). But we are unaware of any explicit empirical support for this class of alternative explanations. If our interpretation of this correlation is correct, it would suggest that IS between adjacent genes might not be negligible in most of the range of recombination rates in Drosophila. This relationship is not an indirect consequence of the effect that recombination rates might have on both parameters: The length of the intergenic regions decreases with increasing recombination rates (R = -0.034, P = 0.008) but no relationship is detected between the length of coding regions and recombination (P > 0.40).
| DISCUSSION |
|---|
The Hill-Robertson effect, broadly defined, considers the reduction in the efficacy of selection as an indirect consequence of selection at a linked locus. This effect is generally interpreted as being equivalent to a reduction in Ne.
Previous investigation of IS under plausible conditions of recombination, selection, and mutation allowed us to integrate two empirical observations about codon bias in Drosophila genes not easily explained by single-site models of selection.
IS and its evolutionary consequences:
Polymorphism levels in the selected sequence (πs) are, in general terms, decreased by either increasing the number of selected sites or reducing recombination. When selection increases, πs becomes less affected by changes in recombination rates while linked neutral polymorphism (πn) and other parameters such as codon bias or rates of fixation vary substantially. This more modest response of πs to an increment of IS when selection increases is not surprising because the expected net reduction of πs due to smaller Ne under SS-MSD also decreases when selection increases (see RESULTS).
IS reduces polymorphism levels in adjacent neutral sequences, and the effect increases with any parameter that contributes to IS. The effect that IS has on πn is maximum for total linkage but it is also measurable when recombination occurs. Therefore, IS may be a contributing factor in the reduction of neutral polymorphism levels in regions of low recombination observed in a variety of organisms. But it is unlikely that IS alone can cause extreme reduction in levels of polymorphism in regions of low recombination. When IS is greatest (i.e., complete linkage and large L), πs and πn may become similarly reduced, making the distinction between selected and linked neutral sequences uncertain.
Consistent with previous studies and with the idea that IS reduces Ne and the efficacy of selection, we find that polymorphism levels (both πs and πn) decrease and the fixation rates of weakly selected mutations increase with IS. But IS also increases the skew in the frequency spectrum of mutations under weak selection (i.e., increasing the proportion of low-frequency variants). Furthermore, the study of linked neutral sequences shows that IS also creates a skew in the frequency spectrum of neutral mutations away from the equilibrium neutral distribution. These results reveal complexities in the evolutionary dynamics of IS that cannot be rationalized as being equivalent to a reduction in Ne in standard formulations of weak selection at equilibrium.
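The skew toward low-frequency variants discussed above is commonly summarized with Tajima's D, which is negative when rare variants are in excess. The following is a minimal sketch of the standard statistic, assuming biallelic segregating sites and a hypothetical input format (one allele count per segregating site); it is not the code used in the simulations.

```python
from math import sqrt


def tajimas_d(n, allele_counts):
    """Tajima's D for a sample of n sequences.

    allele_counts: for each segregating (biallelic) site, the number of
    sampled sequences carrying one of the two alleles (1 .. n-1).
    """
    S = len(allele_counts)
    if n < 4 or S == 0:
        raise ValueError("need n >= 4 and at least one segregating site")
    # Mean number of pairwise differences, summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in allele_counts)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))
```

A sample in which every segregating site is a singleton yields a negative value, whereas sites at intermediate frequency yield a positive value.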
IS predicts a skew of allele frequencies away from the neutral equilibrium and to lower, nonpolarized, frequencies for both selected and linked neutral mutations. In common with the HH and pHH models, this skew will be most discernible in genomic regions with reduced recombination, causing a positive correlation between Tajima's D and rates of recombination.
In general, the application of the SS-MSD model to estimate the compound parameter γ (Nes) when IS is present will lead to an underestimate of γ for a given level of polymorphism or rate of divergence but it will lead to an overestimate of γ on the basis of the frequency spectrum of mutations. Consequences of IS are also expected to be highly heterogeneous among genes and across genomic regions, on the basis of differences in recombination rates, gene densities, gene sizes, and gene structures. Hence, IS would cause large variances in many population and evolutionary parameters. This heterogeneity across genomes may be useful for differentiating IS from demographic causes, for which more homogeneously distributed effects are expected.
Variability in the intensity of IS has population genetics and evolutionary consequences that can easily be misinterpreted as indicating differences in selective regimes among genes. IS causes an increase in the rate of divergence of mutations under MSD, where the stronger the selection (within the range 0.25 to 2.5 for the scaled selection parameter), the more conspicuous the effect of IS on the rate of divergence. As a result, differences in rates of substitution, both Ks and Ka (and Ka/Ks ratio), between genes can potentially be explained by variable IS with constant selection. Equivalently, variable IS will alter Div/Pol ratios without requiring differences in selection coefficients.
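For intuition about why reduced efficacy of selection raises the rate of divergence of weakly deleterious mutations, a single-site analogue can be sketched with Kimura's diffusion approximation for the fixation probability of a new semidominant mutation. This is a sketch under stated assumptions (Ne equal to the census size N, heterozygous effect s, function names hypothetical) and it describes the single-site case only, not the multisite interference studied here.

```python
from math import expm1


def fixation_probability(N, s):
    """Kimura's diffusion approximation for a new semidominant mutation
    (initial frequency 1/(2N), heterozygous effect s), assuming Ne = N."""
    if s == 0:
        return 1.0 / (2 * N)
    # (1 - exp(-2s)) / (1 - exp(-4Ns)), written with expm1 for accuracy.
    return expm1(-2.0 * s) / expm1(-4.0 * N * s)


def relative_substitution_rate(N, s):
    """Substitution rate relative to a neutral mutation (neutral = 1)."""
    return 2 * N * fixation_probability(N, s)
```

For a fixed negative s, lowering N moves the relative rate toward the neutral value of 1, which is the single-site counterpart of the increase in divergence attributed to IS in the text.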
Temporal dynamics after a change of recombinational environment:
Gene rearrangements within chromosomal arms are a recurrent characteristic of Drosophila micro- and macroevolution. Recombination is not homogeneously distributed along Drosophila chromosomes.
Our simulations indicate that the period of nonequilibrium base composition (codon usage) after a drastic change in recombination environment may be 100-250 Ne generations (i.e., equivalent to >10-25 million years in most Drosophila species) when [...]. This suggests that nonequilibrium codon usage caused by frequent change in recombination rates may be the norm rather than the exception, at least in Drosophila evolution. Furthermore, the period required to reach the new (multisite) base composition equilibrium is expected to be longer for genes undergoing a reduction in codon bias than for those in which it is increasing. The nonequilibrium scenario is apparent in analyses of fixed synonymous mutations along D. melanogaster/D. simulans lineages, with codon bias declining both in D. melanogaster and, to a lesser degree, in D. simulans.
Population and evolutionary parameters, such as polymorphism levels and their frequencies or rates of evolution, reach estimates representing the new equilibrium faster than those of codon usage.
The results also have a practical implication when the compound parameter γ is estimated from divergence data. Estimates of γ using divergence data of weakly selected mutations will tend to indicate their lower boundaries, mostly suggesting a shrinking Ne in nearly all lineages because periods with reduced recombination will contribute most of the substitutions. This effect will be in addition to the previously described general underestimation of γ based on rates of fixation because of IS. The observed strong and relatively fast effects that changing recombination rates have on the rates of fixation of weakly selected mutations are congruent with the suggestion [...].
Heterogeneous effect of IS across selected regions:
The magnitude of IS is expected to be heterogeneously distributed across regions under uniform selection intensity and mutation rate as a consequence of the different number of weakly selected sites surrounding each site (i.e., density of selected sites within a genetic distance). This prediction was confirmed using simulations that show that consequences of IS are stronger in the central regions of sequences under selection compared to lateral regions, and the effect, we believe, is empirically detectable in Drosophila genes (see below).
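The geometric intuition behind this prediction can be sketched with a toy count of how many other selected sites lie within a fixed distance of each site. It assumes a contiguous, uniformly spaced block of selected sites and a hard distance cutoff (both simplifications, and the function name is hypothetical); it is not a substitute for the simulations.

```python
def linked_selected_neighbors(n_sites, window):
    """For each of n_sites contiguous selected sites, count the other
    selected sites lying within `window` sites on either side."""
    counts = []
    for i in range(n_sites):
        left = max(0, i - window)
        right = min(n_sites - 1, i + window)
        counts.append(right - left)  # neighbors, excluding the site itself
    return counts
```

Sites near either edge have roughly half as many close neighbors as sites in the center, which mirrors the expectation of stronger IS in central regions.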
Enhanced IS in central regions of sequences under selection has the consequence that many population and evolutionary parameters will also be heterogeneously distributed across sequences. These spatial differences might be incorrectly interpreted as indicating variable selective regimes although constancy (spatial and temporal) in selection coefficients might be the case. Two examples can be briefly given. First, heterogeneously distributed IS across sequences will generate Div/Pol values higher in the central regions of sequences under constant selection, suggestive of past action of positive selection or relaxed constraints in these central regions. Second, stronger IS will also tend to generate higher substitution rates and altered Ka/Ks ratios in central regions of genes compared to the edges of genes, and this can easily be misinterpreted as variable selective constraints across the gene.
U-shaped distribution of codon bias across long exons in D. melanogaster:
Long undisrupted coding sequences in D. melanogaster have central sections with significantly reduced codon bias compared to lateral sections of the same coding region (Fig 9). The study and comparison of central and lateral sections of the same gene allow us to exclude many factors implicated as drivers of differential codon bias, including gene expression, gene length, genomic recombinational environment, and possible mutational differences associated with recombination or transcription rates. The comparison of codon bias in central regions in genes without introns and in genes with introns centrally located further allows us to rule out general relaxed constraints on codon bias in the central parts of proteins.
This observation fits with the outcome of the simulations that produce a U-shaped distribution of the effectiveness of selection and codon bias across sequences under selection, caused by stronger IS in central regions (Fig 4), and it is not predicted by other models of codon bias. For instance, models of selection on codon bias due to translational accuracy [...]. This last observation supports the proposal that IS might be detectable in genes across the entire range of recombination in D. melanogaster.
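The comparison of central and lateral sections described above can be illustrated with a short sketch that computes GC3 (the G+C content at third codon positions) separately for consecutive sections of an in-frame coding sequence. Splitting into three equal sections is an assumption made here for illustration, and the function names are hypothetical; the section boundaries used in the actual analysis may differ.

```python
def gc3(coding_seq):
    """Fraction of G or C at third codon positions of an in-frame sequence."""
    third_positions = coding_seq.upper()[2::3]
    if not third_positions:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third_positions) / len(third_positions)


def gc3_by_section(coding_seq, n_sections=3):
    """GC3 for n_sections consecutive blocks of whole codons."""
    n_codons = len(coding_seq) // 3
    values = []
    for k in range(n_sections):
        start = (k * n_codons // n_sections) * 3
        end = ((k + 1) * n_codons // n_sections) * 3
        values.append(gc3(coding_seq[start:end]))
    return values
```

A U-shaped pattern corresponds to a lower value in the middle section than in the two lateral sections of the same uninterrupted coding region.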
Our analysis of genes without introns confirms a prior report indicating a higher GC3 in 5' sections of coding regions of D. melanogaster genes than in 3' sections.
Neutral regions as modifiers of recombination between selected regions:
Granted that the evolutionary consequence of neutral intermediate sequences as modifiers of recombination is very small for plausible lengths (i.e., <1000 bp), simulations show that under realistic rates of selection, recombination, and mutation, the presence of neutral intermediate (or spacer) sequences may have a measurable effect on the overall magnitude of IS in adjacent sequences under selection. Even very small increments in the number of recombination events between two regions under selection, obtained by increasing the physical distance between the two selected regions, can reduce IS when the recombination rate (per physical unit) is moderate/low. This reduction in IS instigates an increase of the effectiveness of selection together with the decline of all properties associated with allele perturbation or traffic. Therefore, in regions of reduced recombination, reasonably long intermediate sequences may be favored as a counterbalance to the reduced effectiveness of selection caused by tight linkage. Congruent with the simulation results, in D. melanogaster the presence of introns is associated with an increase in the effectiveness of selection. This result is observed using either the absolute length of introns or a measure of the relative length of introns, taking into account the length of the coding region. The difference in codon bias in central regions of coding regions between genes with introns centrally located and genes without introns also supports this interpretation.
Thus, genomic data in D. melanogaster support the hypothesis that the presence and length of "junk" DNA between clusters of selected sites may itself be a selective trait.
Whether IS is restricted mainly to intervals containing single genes (both coding and regulatory regions) or, conversely, whether neighboring genes have detectable effects on each other will depend on the effective recombination rate between genes (involving physical distance and recombination rate per site) and gene lengths. Because the distance between genes in most eukaryotic species is usually several kilobases, IS between most adjacent genes will likely be negligible except for species/genomic regions with very low recombination rates. However, the results showing a positive relationship between intergenic distance and the length of the flanking coding regions suggest that, in D. melanogaster, IS may be influencing the size of intergenic sequences in some instances. Under this perspective, the study of IS and of the evolution of recombination and their effects on the effectiveness of selection should also incorporate the enormous plasticity that genomes have, involving gene structure, intron size, and gene density. Our studies on IS suggest that the apparent lack of biological function associated with many intronic or intergenic sequences might not always imply that they are devoid of evolutionary function.
Conclusions:
Mutations of weak selective effect, when they are sufficient in number and are tightly linked, reduce the overall efficacy of selection (e.g., cause an increase in the fixation rate of selected mutations) and, under the model we investigated, the consequences include a reduction of polymorphism. We found that the reduction of polymorphism extends not only to the sites under selection but to linked neutral mutations as well. These effects are similar to those seen in single-site models of selection, such as MSD, when Ne is reduced. Other consequences of IS, however, cannot be easily related to simpler models of selection, and perhaps these represent the unique signatures of IS. They include the increase in occurrence of rare alleles and negative linkage disequilibrium in both selected and linked neutral mutations.
The discovery of a genome-wide relationship between noncoding polymorphism levels and the recombination rate in Drosophila stimulated the investigation of models involving common forms of natural selection and the influence these forms of selection have on linked neutral variability. As with definitely deleterious mutation and BGS, weakly selected mutation and IS are also likely to be omnipresent throughout a species' genome. Both, therefore, have the potential for explaining variation in polymorphism levels associated with recombination rates. Likely, IS is distinguishable from models involving strong selection (BGS and HH/pHH models) in that consequences of clusters of weakly selected sites may have a much finer and patchier distribution across genomes than those caused by definitively selected mutations, and linked neutral variability will be reduced under IS only in small regions surrounding these clusters. Complete genome polymorphism studies of Drosophila, similar to those contemplated in humans, may allow us to distinguish IS effects from those caused by strongly selected mutations.
Perhaps the most exciting findings presented here are the empirical tests of IS. The seven predictions we generated for testing IS are, we believe, highly specific to IS and therefore are strong tests of this theory. Indeed the regularities we discovered in Drosophila genome architecture were investigated because we had a theory that led us to their predicted existence. Having made these discoveries now, we hope these genome-wide patterns stimulate the search for alternative explanations. A priori one might have thought that the attempt to find empirical support for a theory about weakly interacting mutations might be confined to the population genetic realm. Here we show that genome features may also be highly relevant to our population genetic theory. What this means is that genome features that have previously been attributed to ancient events and accidents of history may actually be features retained and sculpted by a common form of selection, underscoring the hidden treasures present in genome data. This trend, using genome-wide data to investigate population genetic models of selection, will only accelerate as additional species are sequenced.
| ACKNOWLEDGMENTS |
|---|
We thank R. Hudson and A. Llopart for helpful discussions and suggestions. We also thank the anonymous reviewers and N. Takahata for useful comments. This research was supported by a National Institutes of Health grant GM-39355 to M.K.
Manuscript received August 15, 2001; Accepted for publication February 14, 2002.
| APPENDIX |
|---|
EFFECT OF SMALL NUMBER OF SIMULATED INDIVIDUALS ON ESTIMATES OF IS
The size of the simulated populations is usually very small compared to the actual natural populations due to computational time constraints when a large number of mutable sites and/or recombination events are under study. According to diffusion theory of weak selection, equivalent equilibria are expected for different numbers of individuals (or chromosomes) as long as the products Ns (γ) and Nµ (ß) are kept constant and ß << 1.
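The rescaling implied by this argument can be sketched as simple arithmetic: when the simulated population is smaller than the natural one, the per-generation selection coefficient and mutation rate are scaled up so that the products Ns and Nµ are unchanged. The function and parameter names below are hypothetical, and the sketch does not address the small-population biases discussed in the rest of this appendix.

```python
def rescale_for_simulation(n_natural, s_natural, mu_natural, n_sim):
    """Return (s_sim, mu_sim) such that n_sim * s_sim == n_natural * s_natural
    and n_sim * mu_sim == n_natural * mu_natural."""
    factor = n_natural / n_sim
    return s_natural * factor, mu_natural * factor


def scaled_products(n, s, mu):
    """Return the compound parameters (N*s, N*mu) for a given population."""
    return n * s, n * mu
```

After rescaling, the value of Nµ should still be checked against the requirement that it be much smaller than 1.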
Following [...] 5-10% of the population. As [...] are also underestimated when the sample represents a large fraction of the total population and N is not very large (i.e., <250 diploid individuals).
Fig 1 shows the effect of small populations on the study of IS for the case of L = 2500 linked sites under weak selection. Two parameters used to evaluate IS (see text) are depicted: the frequency of preferred codons (P) and Tajima's D. Under the applied selective model (semidominance and multiplicative over sites; see MATERIALS AND METHODS), the effects of IS are quantitatively altered by the use of very small populations, causing the tendency to overestimate the reduction of the effectiveness of selection due to IS. In particular, small populations generate sequences with smaller P and selected mutations segregate at frequencies closer to those expected under neutrality. These trends are less conspicuous, or even absent, when the causes of IS are reduced (e.g., L < 500).
Altogether, these results indicate the need for generating large population sizes to obtain a precise picture of the outcome of subtle interactions between drift, multilocus selection, linkage, and mutation. The population size, N, required for studies of IS should represent a compromise between accuracy of results and pragmatic simulation times. Note that the computational time needed to study populations near equilibrium may increase exponentially with N and the failure to allow such a period of time might produce imprecise or biased outcomes. The population size should be adapted to the sample size, mutation rate, number of sites under selection, selection coefficients, and likely the selective scheme to be scrutinized.
| LITERATURE CITED |
|---|
ADAMS, M. D., S. E. CELNIKER, R. A. HOLT, C. A. EVANS, and J. D. GOCAYNE et al., 2000 The genome sequence of Drosophila melanogaster.. Science 287:2185-2195
AGUADÉ, M., and C. H. LANGLEY, 1994 Polymorphism and divergence in regions of low recombination in Drosophila, pp. 6776 in Non-Neutral Evolution: Theories and Molecular Data, edited by B. GOLDING. Chapman & Hall, New York.
AGUADÉ, M., N. MIYASHITA, and C. H. LANGLEY, 1989 Reduced variation in the yellow-achaete-scute region in natural populations of Drosophila melanogaster.. Genetics 122:607-615
AKASHI, H., 1994 Synonymous codon usage in Drosophila melanogaster: Natural selection and translational accuracy. Genetics 136:927-935[Abstract].
AKASHI, H., 1995 Inferring weak selection from patterns of polymorphism and divergence at "silent" sites in Drosophila DNA. Genetics 139:1067-1076[Abstract].
AKASHI, H., 1996 Molecular evolution between Drosophila melanogaster and D. simulans: reduced codon bias, faster rates of amino acid substitution, and larger proteins in D. melanogaster.. Genetics 144:1297-1307[Abstract].
AKASHI, H. and S. W. SCHAEFFER, 1997 Natural selection and the frequency distributions of "silent" DNA polymorphism in Drosophila. Genetics 146:295-307[Abstract].
ANDOLFATTO, P. and M. NORDBORG, 1998 The effect of gene conversion on intralocus associations. Genetics 148:1397-1399
ANDOLFATTO, P. and M. PRZEWORSKI, 2000 A genome-wide departure from the standard neutral model in natural populations of Drosophila. Genetics 156:257-268
ANDOLFATTO, P. and M. PRZEWORSKI, 2001 Regions of lower crossing over harbor more rare variants in African populations of Drosophila melanogaster.. Genetics 158:657-665
AQUADRO, C. F., D. J. BEGUN and E. C. KINDAHL, 1994 Selection, recombination and DNA polymorphism in Drosophila, pp. 4656 in Non-Neutral Evolution: Theories and Molecular Data, edited by B. GOLDING. Chapman & Hall, New York.
ASHBURNER, M., 1989 Drosophila: A Laboratory Handbook. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
BEGUN, D. J., 2001 The frequency distribution of nucleotide variation in Drosophila simulans.. Mol. Biol. Evol. 18:1343-1352
BEGUN, D. J. and C. F. AQUADRO, 1992 Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster.. Nature 356:519-520[Medline].
BENNETZEN, J. L. and B. D. HALL, 1982 Codon selection in yeast. J. Biol. Chem. 257:3026-3031
BIRKY, C. W. and J. B. WALSH, 1988 Effects of linkage on rates of molecular evolution. Proc. Natl. Acad. Sci. USA 85:6414-6418
BRAVERMAN, J. M., R. R. HUDSON, N. L. KAPLAN, C. H. LANGLEY, and W. STEPHAN, 1995 The hitchhiking effect on the site frequency spectrum of DNA polymorphisms. Genetics 140:783-796[Abstract].
BULMER, M., 1991 The selection-mutation-drift theory of synonymous codon usage. Genetics 129:897-907[Abstract].
CARVALHO, A. B. and A. G. CLARK, 1999 Intron size and natural selection. Nature 401:344[Medline].
CHARLESWORTH, B., 1994 The effect of background selection against deleterious mutations on weakly selected, linked variants. Genet. Res. 63:213-227[Medline].
CHARLESWORTH, B., 1996 Background selection and patterns of genetic diversity in Drosophila melanogaster.. Genet. Res. 68:131-149[Medline].
CHARLESWORTH, B., M. T. MORGAN, and D. CHARLESWORTH, 1993 The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289-1303[Abstract].
COMERON, J. M., 2001 What controls the length of noncoding DNA? Curr. Opin. Genet. Dev. 11:652-659[Medline].
COMERON, J. M. and M. AGUADÉ, 1996 Synonymous substitutions in the Xdh gene of Drosophila: heterogeneous distribution along the coding region. Genetics 144:1053-1062[Abstract].
COMERON, J. M. and M. KREITMAN, 1998 The correlation between synonymous and nonsynonymous substitutions in Drosophila: mutation, selection, or relaxed constraints? Genetics 150:767-775
COMERON, J. M. and M. KREITMAN, 2000 The correlation between intron length and recombination in Drosophila: dynamic equilibrium between mutational and selective forces. Genetics 156:1175-1190
COMERON, J. M., M. KREITMAN, and M. AGUADÉ, 1999 Natural selection on synonymous sites is correlated with gene length and recombination in Drosophila. Genetics 151:239-249
CROW, J. F., and M. KIMURA, 1970 An Introduction to Population Genetics Theory. Harper & Row, New York.
DURET, L. and D. MOUCHIROUD, 1999 Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila and Arabidopsis.. Proc. Natl. Acad. Sci. USA 96:4482-4487
DVORÁK, J., M. C. LUO, and Z. L. YANG, 1998 Restriction fragment length polymorphism and divergence in the genomic regions of high and low recombination in self-fertilizing and cross-fertilizing Aegilops species. Genetics 148:423-434
EWENS, W. J., 1979 Mathematical Population Genetics. Springer-Verlag, Berlin.
EYRE-WALKER, A., 1996 Synonymous codon bias is related to gene length in Escherichia coli: selection for translational accuracy? Mol. Biol. Evol. 13:864-872[Abstract].
EYRE-WALKER, A. and M. BULMER, 1993 Reduced synonymous substitution rate at the start of enterobacterial genes. Nucleic Acids Res. 21:4599-4603
FELSENSTEIN, J., 1974 The evolutionary advantage of recombination. Genetics 78:737-756
GALLEGO, P., E. JUAN, and M. PAPACEIT, 1999 Chromosomal homologies between Drosophila melanogaster and D. funebris determined by in-situ hybridization. Chromosome Res. 7:331-339[Medline].
GILLESPIE, J. H., 1997 Junk ain't what junk does: neutral alleles in a selected context. Gene 205:291-299[Medline].
GILLESPIE, J. H., 2000 Genetic drift in an infinite population. The pseudohitchhiking model. Genetics 155:909-919
GOLDING, G. B., 1997 The effect of purifying selection on genealogies, pp. 271285 in Progress in Population Genetics and Human Evolution, edited by P. DONNELLY and S. TAVARE. Springer-Verlag, New York.
GRANTHAM, R., C. GAUTIER, M. GOUY, M. JACOBZONE, and R. MERCIER, 1981 Codon catalog usage is a genome strategy modulated for gene expressivity. Nucleic Acids Res. 9:R43-74.
GROSJEAN, H. and W. FIERS, 1982 Preferential codon usage in prokaryotic genes: the optimal codon-anticodon interaction energy and the selective codon usage in efficiently expressed genes. Gene 18:199-209[Medline].
HAMBLIN, M. T. and C. F. AQUADRO, 1999 DNA sequence variation and the recombinational landscape in Drosophila pseudoobscura: a study of the second chromosome. Genetics 153:859-869
HILL, W. G. and A. ROBERTSON, 1966 The effect of linkage on the limits to artificial selection. Genet. Res. 8:269-294[Medline].
HILL, W. G. and A. ROBERTSON, 1968 Linkage disequilibrium in finite populations. Theor. Appl. Genet. 38:226-231.
HILLIKER, A. J. and A. CHOVNICK, 1981 Further observations on intragenic recombination in Drosophila melanogaster.. Genet. Res. 38:281-296[Medline].
HILLIKER, A. J., G. HARAUZ, A. G. REAUME, M. GRAY, S. H. CLARK, and A. CHOVNICK, 1994 Meiotic gene conversion tract length distribution within the rosy locus of Drosophila melanogaster. Genetics 137:1019-1026[Abstract].
HILTON, H., R. M. KLIMAN, and J. HEY, 1994 Using hitchhiking genes to study adaptation and divergence during speciation within the Drosophila melanogaster species complex. Evolution 48:1900-1913.
HUDSON, R. R., 1987 Estimating the recombination parameter of a finite population model without selection. Genet. Res. 50:245-250[Medline].
HUDSON, R. R. and N. L. KAPLAN, 1995 Deleterious background selection with recombination. Genetics 141:1605-1617[Abstract].
HUDSON, R. R., K. BAILEY, D. SKARECKY, J. KWIATOWSKI, and F. J. AYALA, 1994 Evidence for positive selection in the superoxide dismutase (Sod) region of Drosophila melanogaster.. Genetics 136:1329-1340[Abstract].
IKEMURA, T., 1981 Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system. J. Mol. Biol. 151:389-409[Medline].
KAPLAN, N. L., R. R. HUDSON, and C. H. LANGLEY, 1989 The "hitchhiking effect" revisited. Genetics 123:887-899
KELLY, J. K., 1997 A test of neutrality based on interlocus associations. Genetics 146:1197-1206[Abstract].
KIM, Y. and W. STEPHAN, 2000 Joint effects of genetic hitchhiking and background selection on neutral variation. Genetics 155:1415-1427
KIRBY, D. A. and W. STEPHAN, 1995 Haplotype test reveals departure from neutrality in a segment of the white gene of Drosophila melanogaster.. Genetics 141:1483-1490[Abstract].
KLIMAN, R. M., 1999 Recent selection on synonymous codon usage in Drosophila. J. Mol. Evol. 49:343-351[Medline].
KLIMAN, R. M. and A. EYRE-WALKER, 1998 Patterns of base composition within the genes of Drosophila melanogaster. J. Mol. Evol. 46:534-541[Medline].
KLIMAN, R. M. and J. HEY, 1993 Reduced natural selection associated with low recombination in Drosophila melanogaster.. Mol. Biol. Evol. 10:1239-1258[Abstract].
KRAFT, T., T. SALL, I. MAGNUSSON-RADING, N. O. NILSSON, and C. HALLDEN, 1998 Positive correlation between recombination rates and levels of genetic variation in natural populations of sea beet (Beta vulgaris subsp. maritima). Genetics 150:1239-1244
KRESS, H., 1993 The salivary gland chromosomes of Drosophila virilis: a cytological map, pattern of transcription and aspects of chromosome evolution. Chromosoma 102:734-742[Medline].
KURLAND, C. G., 1987 Strategies for efficiency and accuracy in gene expression. 1. The major codon preference: a growth optimization strategy. Trends Biochem. Sci. 12:126-128.
LANGLEY, C. H., Y. N. TOBARI, and K. I. KOJIMA, 1974 Linkage disequilibrium in natural populations of Drosophila melanogaster.. Genetics 78:921-936
LANGLEY, C. H., J. MACDONALD, N. MIYASHITA, and M. AGUADÉ, 1993 Lack of correlation between interspecific divergence and intraspecific polymorphism at the suppressor of forked region in Drosophila melanogaster and Drosophila simulans. Proc. Natl. Acad. Sci. USA 90:1800-1803.
LANGLEY, C. H., B. P. LAZZARO, W. PHILLIPS, E. HEIKKINEN, and J. M. BRAVERMAN, 2000 Linkage disequilibria and the site frequency spectra in the su(s) and su(w(a)) regions of the Drosophila melanogaster X chromosome. Genetics 156:1837-1852.
LEMEUNIER, F. and M. A. ASHBURNER, 1976 Relationships within the melanogaster species subgroup of the genus Drosophila (Sophophora). II. Phylogenetic relationships between six species based upon polytene chromosome banding sequences. Proc. R. Soc. Lond. Ser. B Biol. Sci. 193:275-294.
LEWONTIN, R. C., 1964 The interaction of selection and linkage. I. General considerations; heterotic models. Genetics 49:49-67.
LEWONTIN, R. C., 1974 The Genetic Basis of Evolutionary Change. Columbia University Press, New York.
LEWONTIN, R. C., 1988 On measures of gametic disequilibrium. Genetics 120:849-852.
LEWONTIN, R. C. and K. KOJIMA, 1960 The evolutionary dynamics of complex polymorphisms. Evolution 14:458-472.
LI, W.-H., 1987 Models of nearly neutral mutations with particular implications for nonrandom usage of synonymous codons. J. Mol. Evol. 24:337-345.
LINDSLEY, D. L. and L. SANDLER, 1977 The genetic analysis of meiosis in female Drosophila melanogaster. Philos. Trans. R. Soc. Lond. B Biol. Sci. 277:295-312.
LOZOVSKAYA, E. R., D. A. PETROV, and D. L. HARTL, 1993 A combined molecular and cytogenetic approach to genome evolution in Drosophila using large-fragment DNA cloning. Chromosoma 102:253-266.
LUDWIG, M., N. PATEL, and M. KREITMAN, 1998 Functional conservation of even-skipped stripe 2 enhancer in Drosophila. Development 125:949-958.
MARTÍN-CAMPOS, J. M., J. M. COMERÓN, N. MIYASHITA, and M. AGUADÉ, 1992 Intraspecific and interspecific variation at the y-ac-sc region of Drosophila simulans and Drosophila melanogaster. Genetics 130:805-816.
MAYNARD SMITH, J. and J. HAIGH, 1974 The hitch-hiking effect of a favorable gene. Genet. Res. 23:23-35.
MCVEAN, G. A. and B. CHARLESWORTH, 2000 The effects of Hill-Robertson interference between weakly selected mutations on patterns of molecular evolution and variation. Genetics 155:929-944.
MCVEAN, G. A. and J. VIEIRA, 2001 Inferring parameters of mutation, selection and demography from patterns of synonymous site evolution in Drosophila. Genetics 157:245-257.
MORIYAMA, E. N. and D. L. HARTL, 1993 Codon usage bias and base composition of nuclear genes in Drosophila. Genetics 134:847-858.
MORIYAMA, E. N. and J. R. POWELL, 1996 Intraspecific nuclear DNA variation in Drosophila. Mol. Biol. Evol. 13:261-277.
MORIYAMA, E. N. and J. R. POWELL, 1998 Gene length and codon usage bias in Drosophila melanogaster, Saccharomyces cerevisiae and Escherichia coli. Nucleic Acids Res. 26:3188-3193.
NACHMAN, M. W., 1997 Patterns of DNA variability at X-linked loci in Mus domesticus. Genetics 147:1303-1316.
NACHMAN, M. W., V. L. BAUER, S. L. CROWELL, and C. F. AQUADRO, 1998 DNA variability and recombination rates at X-linked loci in humans. Genetics 150:1133-1141.
NEUHAUSER, C. and S. M. KRONE, 1997 The genealogy of samples in models with selection. Genetics 145:519-534.
NICOLAS, A., 1998 Relationship between transcription and initiation of meiotic recombination: toward chromatin accessibility. Proc. Natl. Acad. Sci. USA 95:87-89.
OHTA, T., 1972 Evolutionary rate of cistrons and DNA divergence. J. Mol. Evol. 1:150-157.
OHTA, T., 1995 Synonymous and nonsynonymous substitutions in mammalian genes and the nearly neutral theory. J. Mol. Evol. 40:56-63.
OHTA, T. and M. KIMURA, 1971 On the constancy of the evolutionary rate of cistrons. J. Mol. Evol. 1:18-25.
PAPACEIT, M. and A. PREVOSTI, 1989 Differences in chromosome A arrangement between Drosophila madeirensis and Drosophila subobscura. Experientia 45:310-312.
PERLITZ, M. and W. STEPHAN, 1997 The mean and variance of the number of segregating sites since the last hitchhiking event. J. Math. Biol. 36:1-23.
PETROV, D. A. and D. L. HARTL, 1998 High rate of DNA loss in the Drosophila melanogaster and Drosophila virilis species groups. Mol. Biol. Evol. 15:293-302.
PETROV, D. A., E. R. LOZOVSKAYA, and D. L. HARTL, 1996 High intrinsic rate of DNA loss in Drosophila. Nature 384:346-349.
PINSKER, W. and D. SPERLICH, 1984 Cytogenetic mapping of enzyme loci on chromosomes J and U of Drosophila subobscura. Genetics 108:913-926.
POWELL, J. R. and E. N. MORIYAMA, 1997 Evolution of codon usage bias in Drosophila. Proc. Natl. Acad. Sci. USA 94:7784-7790.
PRZEWORSKI, M., B. CHARLESWORTH, and J. D. WALL, 1999 Genealogies and weak purifying selection. Mol. Biol. Evol. 16:246-252.
PRZEWORSKI, M., R. R. HUDSON, and A. DI RIENZO, 2000 Adjusting the focus on human variation. Trends Genet. 16:296-302.
ROBERTSON, A., 1961 Inbreeding in artificial selection programmes. Genet. Res. 2:189-194.
SEGARRA, C. and M. AGUADÉ, 1992 Molecular organization of the X chromosome in different species of the obscura group of Drosophila. Genetics 130:513-521.
SHARP, P. M. and W.-H. LI, 1987 The rate of synonymous substitution in enterobacterial genes is inversely related to codon usage bias. Mol. Biol. Evol. 4:222-230.
SHARP, P. M. and W.-H. LI, 1989 On the rate of DNA sequence evolution in Drosophila. J. Mol. Evol. 28:398-402.
SHIELDS, D. C., P. M. SHARP, D. G. HIGGINS, and F. WRIGHT, 1988 "Silent" sites in Drosophila genes are not neutral: evidence of selection among synonymous codons. Mol. Biol. Evol. 5:704-716.
SLATKIN, M., 2000 Balancing selection at closely linked, overdominant loci in a finite population. Genetics 154:1367-1378.
STEPHAN, W., 1994 Effects of genetic recombination and population subdivision on nucleotide sequence variation in Drosophila ananassae, pp. 57-66 in Non-Neutral Evolution: Theories and Molecular Data, edited by B. GOLDING. Chapman & Hall, New York.
STEPHAN, W. and C. H. LANGLEY, 1989 Molecular genetic variation in the centromeric region of the X chromosome in three Drosophila ananassae populations. I. Contrasts between the vermilion and forked loci. Genetics 121:89-99.
STEPHAN, W. and C. H. LANGLEY, 1998 DNA polymorphism in Lycopersicon and crossing over per physical length. Genetics 150:1585-1593.
TACHIDA, H., 2000 Molecular evolution in a multisite nearly neutral mutation model. J. Mol. Evol. 50:69-81.
TAKANO-SHIMIZU, T., 1999 Local recombination and mutation effects on molecular evolution in Drosophila. Genetics 153:1285-1296.
TAKANO-SHIMIZU, T., 2001 Local changes in GC/AT substitution biases and in crossover frequencies on Drosophila chromosomes. Mol. Biol. Evol. 18:606-619.
TRUE, J. R., J. M. MERCER, and C. C. LAURIE, 1996 Differences in crossover frequency and distribution among three sibling species of Drosophila. Genetics 142:507-523.
WHITING, J. H., JR., M. D. PHILEY, J. L. FARMER, and D. E. JEFFERY, 1989 In situ hybridization analysis of chromosomal homologies in Drosophila melanogaster and Drosophila virilis. Genetics 122:99-109.
WRIGHT, S., 1931 Evolution in Mendelian populations. Genetics 16:97-159.
WRIGHT, S., 1938 Size of population and breeding structure in relation to evolution. Science 87:430-431.
ZENG, L.-W., J. M. COMERON, B. CHEN, and M. KREITMAN, 1998 The molecular clock revisited: the rate of synonymous vs. replacement change in Drosophila. Genetica 102(103):369-382.