Genetics, Vol. 170, 481-485, May 2005, Copyright © 2005
doi:10.1534/genetics.104.037333
Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3JT, United Kingdom
3 Corresponding author: Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, King's Bldgs., W. Mains Rd., Edinburgh, EH9 3JT, United Kingdom.
E-mail: brian.charlesworth{at}ed.ac.uk
Hill-Robertson interference occurs when several genetically linked sites are undergoing selection at the same time (HILL and ROBERTSON 1966; GORDO and CHARLESWORTH 2001). When advantageous alleles initially arise in the population, they will usually not be associated with each other, because mutations appear at random in separate individuals. In the absence of recombination, one advantageous mutation will therefore tend to displace all the others (FISHER 1930; MULLER 1932). In the presence of recombination, advantageous alleles can be combined together to generate the optimal genotype. A similar argument can be made for the effects of recurrent deleterious mutations on the spread of advantageous alleles (FISHER 1930; CHARLESWORTH 1994; PECK 1994; ORR 2000). Selection is thus expected to be more efficient in the presence of recombination than in its absence. Hence, selective events occurring in one region of the genome would be facilitated if there were an enhancer of recombination nearby. Introns could act as such enhancers, because they increase the chance that a crossover occurs between sites in two different exons by spacing them apart, thereby allowing more efficient selection on variants in different coding regions of the same gene (COMERON and KREITMAN 2000).
If such interference is important, we would expect the efficacy of selection to be greater in genes with larger introns, all else being equal. Comeron and Kreitman designed a test based on the effect of intron size on selection on codon usage in Drosophila melanogaster (COMERON and KREITMAN 2002). This type of selection is believed to be very weak (Nes ≈ 1, where Ne is the effective population size and s the selection coefficient against a mutation to a nonoptimal codon), and is thus particularly prone to generating interference effects, because the chance that several synonymous sites are segregating in the same gene at the same time is high (GORDO and CHARLESWORTH 2001). They found that the average level of codon bias among genes drawn from the D. melanogaster genome sequence was not affected by the presence/absence of introns. They then examined codons located in the middle of the gene (called "central" codons), which are more subject to interference because they have more neighboring codons. COMERON and KREITMAN (2002) found that the level of codon bias for these central codons was slightly but significantly increased in genes with central introns, compared with genes lacking such introns, in agreement with the interference hypothesis.
Introns could also reduce interference between weakly selected mutations at amino acid sites within the same gene. This implies that the rate of nonsynonymous substitutions per site (dN) would be influenced by intron size. However, the extent to which dN is influenced by purifying selection vs. positive selection is unclear (AKASHI 1999; HURST 2002). If protein sequence evolution is caused mainly by the fixation of advantageous, weakly selected mutations (positive selection), the correlation between dN and intron size should be positive. In contrast, if it is driven by the fixation by drift of weakly selected, deleterious mutations (purifying selection), the correlation should be negative (HURST 2002). To distinguish between these two alternatives and test for an effect of intron size on dN, we used 630 orthologous gene pairs from D. melanogaster and D. yakuba (from DOMAZET-LOSO and TAUTZ 2003) to estimate dN using PAML with default parameters (GOLDMAN and YANG 1994). Further details are given in MARAIS et al. (2004) (the data set can be downloaded at http://biomserv.univ-lyon1.fr/~marais/dataIntronSize/). We examined the correlation between dN values and intron size in D. melanogaster. We used only the 570 genes that are likely to be located in regions of high recombination in this species (MARAIS et al. 2004), since genes in regions of low recombination (near the centromeres or telomeres and on chromosome 4) are known to accumulate transposable elements in their noncoding regions, and their intron sizes may have unusual evolutionary dynamics (BARTOLOMÉ et al. 2002; RIZZON et al. 2002). The results did not differ significantly when gene pairs from regions of low recombination were included (data not shown).
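The kind of rank correlation between dN and intron size described here can be illustrated with a minimal sketch. It is not the analysis code used for this study: the function names and the toy values are hypothetical, and only the coefficient is computed (no P value). Ties receive average ranks, and the coefficient is the Pearson correlation of the ranks.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks of `values`, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rs(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical toy values, NOT data from this study.
    total_intron_size_bp = [0, 120, 450, 900, 2300]
    dn = [0.050, 0.041, 0.030, 0.032, 0.012]
    print(spearman_rs(total_intron_size_bp, dn))
```

A negative coefficient from such a calculation corresponds to the expectation under purifying selection, and a positive one to the expectation under positive selection. In practice a library routine such as `scipy.stats.spearmanr` would also supply a P value.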
We find that dN is (1) almost two times lower in genes with introns than in genes without introns (a nonparametric Kolmogorov test was significant with P = 0.02, see Figure 1A) and (2) negatively correlated with total intron size (Spearman nonparametric correlation coefficient Rs = -0.19, P < 10^-4, see Figure 1B). Figure 1C shows that there are clear-cut differences in the mean dN values among intron size categories and that the observed correlation is not due to the effects of outliers. A similar relation was found between dN/dS and intron size (Rs = -0.10, P < 10^-4), where dS is the rate of synonymous substitution per site. This eliminates the possibility that a correlation between point mutation rates (reflected in dS) and deletion rates (potentially affecting intron size) explains the results. Total intron size is influenced by both individual intron size and the number of introns. We have also defined a new index (relative distance between sites, RDS), which gives a better measure of the effect of introns on the distance between codons within a gene. It is the sum of the pairwise distances (in bases) for all codons within a gene, divided by the sum of pairwise distances for all codons within the coding sequence with the introns spliced out. A gene without introns would have RDS = 1, and a gene with introns would have RDS > 1, with a value that depends on the number, position, and size of these introns. We find a slightly stronger correlation of dN with RDS than with intron size (Rs = -0.24, P < 10^-4).
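The RDS definition can be made concrete with a short sketch. This is one possible reading of the verbal definition rather than the implementation used for the analysis. It assumes that each codon is located by the coordinate of its first base, that a gene is described by hypothetical lists of coding exon lengths and intervening intron lengths, and that untranslated regions are ignored.

```python
from itertools import accumulate


def pairwise_distance_sum(positions):
    """Sum of (x_j - x_i) over all pairs i < j, for positions sorted ascending."""
    n = len(positions)
    # Each sorted x_k is added k times and subtracted (n - 1 - k) times.
    return sum(x * (2 * k - n + 1) for k, x in enumerate(positions))


def relative_distance_between_sites(exon_lengths, intron_lengths):
    """RDS: pairwise codon distances in the gene over those in the spliced CDS.

    exon_lengths   -- coding lengths of successive exons, in bases (assumed input)
    intron_lengths -- lengths of the introns between them, in bases (assumed input)
    """
    if len(intron_lengths) != len(exon_lengths) - 1:
        raise ValueError("expected exactly one intron between adjacent exons")
    n_codons = sum(exon_lengths) // 3
    if n_codons < 2:
        raise ValueError("need at least two codons to define a distance")

    exon_ends = list(accumulate(exon_lengths))     # exclusive ends, CDS coordinates
    spliced = [3 * i for i in range(n_codons)]     # codon starts, introns removed

    genomic = []
    exon_index = 0
    offset = 0  # total intron length upstream of the current codon
    for pos in spliced:
        while pos >= exon_ends[exon_index]:
            offset += intron_lengths[exon_index]
            exon_index += 1
        genomic.append(pos + offset)

    return pairwise_distance_sum(genomic) / pairwise_distance_sum(spliced)


if __name__ == "__main__":
    print(relative_distance_between_sites([12], []))       # intronless gene
    print(relative_distance_between_sites([6, 6], [30]))   # central intron
    print(relative_distance_between_sites([3, 9], [30]))   # same intron near the 5' end
```

Under these assumptions an intronless gene gives exactly 1, and the same 30-base intron yields a larger value when it is central (5.0) than when it sits near one end of the coding sequence (4.0). This shows how the index depends on intron position as well as on intron number and size.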
An alternative explanation involves the presence in introns of regulatory elements controlling gene expression. In particular, if the most conserved genes have more such elements (since the levels of expression of such genes may need to be more precisely controlled), we would expect a negative relationship between dN and intron size. Regulatory elements are more frequent in the first introns than in other introns in mammals (MAJEWSKI and OTT 2002; KEIGHTLEY and GAFFNEY 2003; CHAMARY and HURST 2004) and possibly also in Drosophila (DURET 2001). In agreement with this, first introns are almost two times larger than other introns in vertebrates and Drosophila (DURET 2001), which is also true for our data set (first introns, mean of 518 bp; others, mean of 294 bp; P < 10^-4 on a Kolmogorov test). Second, we found that their size is significantly positively correlated with expression level (Rs = 0.22, P < 10^-4), whereas other introns do not show a significant correlation (Rs = 0.10, NS). This is confirmed by an analysis of the whole genome (Figure 2). Third, by analyzing a previously published alignment of 163 introns from D. melanogaster and D. simulans (HALLIGAN et al. 2004), we found that intron divergence is significantly negatively correlated with size for first introns (Rs = -0.29, P = 0.03); although there is a negative correlation for the other introns, it is not significant (Rs = -0.14, NS).
Our observations do not appear to support the interference hypothesis, but do not allow us to rule it out. The extent of Hill-Robertson interference between amino acid sites under selection is not very well understood, either theoretically or empirically. Some recent work suggests that such interference may explain an apparent relationship between recombination rate and dN in a comparison of D. melanogaster and D. simulans (BETANCOURT and PRESGRAVES 2002), but the meaning of this relationship has been recently challenged (MARAIS and CHARLESWORTH 2003; MARAIS et al. 2004). If there is little or no interference between amino acid sites within the same gene, introns would have an effect only on the efficacy of selection on synonymous sites within a gene. However, this effect is very weak. Previous work shows that introns are associated with a change in mean frequency of optimal codons from 64 to 68% and only for a subset of codons (the central ones, see above). This is in agreement with the very weak correlation between intron size and recombination rates reported previously (CARVALHO and CLARK 1999; COMERON and KREITMAN 2000) and suggests that interference explains only a very small fraction of variability in intron size in eukaryotic genomes.
We have found that intron size is globally negatively correlated with expression level in Drosophila, as reported for other eukaryotes (CASTILLO-DAVIS et al. 2002). However, when we split introns into first introns vs. the others, we found that first intron size is significantly positively correlated with expression level. This does not disagree with the hypotheses of selection for reducing the cost of transcribing introns (CARVALHO and CLARK 1999; CASTILLO-DAVIS et al. 2002) and selection against large introns in active chromosomal domains (PRACHUMWAT et al. 2004), which were proposed to explain the negative relationship between intron size and expression level. It means simply that Drosophila first introns do not follow the general trend, probably because these introns are enriched in regulatory elements, which appear to be more frequent in highly expressed genes. However, this does not seem to be the case in humans, where first introns are smaller in ubiquitously expressed genes than in narrowly expressed genes, although the difference is much smaller than that for other introns (COMERON 2004). Further investigation is needed to understand this difference between Drosophila and humans.
Our results suggest that genes with more slowly evolving amino acid sequence (low dN) may also have more regulatory elements, particularly in their first introns, and that this generates the observed relationship between dN and intron size. It has already been shown that highly conserved genes have special expression patterns. DURET and MOUCHIROUD (2000) showed that these genes are much more broadly expressed than others in mammals. They suggested that this is because mutations in housekeeping genes affect more tissues than mutations in tissue-specific genes and will therefore have larger effects on fitness. This would cause them to be much more constrained, although other explanations are possible (AKASHI 2001, 2003). More recently, CASTILLO-DAVIS et al. (2004) have shown that protein sequence divergence is correlated with that of cis-regulatory elements. To measure the latter, they defined a new index, dSM (the fraction of both noncoding sequences that does not contain a region of significant alignment), and computed it for a set of aligned genomic sequences from C. elegans and C. briggsae. They showed that (1) it correlates positively with expression differences between nematodes, (2) shared sequences correspond to experimentally known motifs for gene expression, and (3) dSM is large in nonpromoter intergenic regions. They then observed that dSM and dN are positively correlated in nematodes, suggesting that selective pressures on gene expression and protein sequence evolution are coupled (CASTILLO-DAVIS et al. 2004). A similar conclusion has been reached on different grounds for Drosophila (NUZHDIN et al. 2004). This is entirely consistent with our observations and with the idea that selection for the presence of regulatory elements can affect the evolution of intron size.
| FOOTNOTES |
|---|
2 Present address: Wildlife Conservation Research Unit, Department of Zoology, University of Oxford, Tubney House, Abingdon Rd., Tubney, Abingdon, OX13 5QL, United Kingdom.
| LITERATURE CITED |
|---|
AKASHI, H., 1999 Within- and between-species DNA sequence variation and the footprint of natural selection. Gene 238: 39-51.
AKASHI, H., 2001 Gene expression and molecular evolution. Curr. Opin. Genet. Dev. 11: 660-666.
AKASHI, H., 2003 Translational selection and yeast proteome evolution. Genetics 164: 1291-1303.
BARTOLOMÉ, C., X. MASIDE and B. CHARLESWORTH, 2002 On the abundance and distribution of transposable elements in the genome of Drosophila melanogaster. Mol. Biol. Evol. 19: 926-937.
BERGMAN, C. M., and M. KREITMAN, 2001 Analysis of conserved noncoding DNA in Drosophila reveals similar constraints in intergenic and intronic sequences. Genome Res. 11: 1335-1345.
BETANCOURT, A. J., and D. C. PRESGRAVES, 2002 Linkage limits the power of natural selection in Drosophila. Proc. Natl. Acad. Sci. USA 99: 13616-13620.
CARVALHO, A. B., and A. G. CLARK, 1999 Intron size and natural selection. Nature 401: 344.
CASTILLO-DAVIS, C. I., S. L. MEKHEDOV, D. L. HARTL, E. V. KOONIN and F. A. KONDRASHOV, 2002 Selection for short introns in highly expressed genes. Nat. Genet. 31: 415-418.
CASTILLO-DAVIS, C. I., D. L. HARTL and G. ACHAZ, 2004 Cis-regulatory and protein evolution in orthologous and duplicate genes. Genome Res. 14: 1530-1536.
CHAMARY, J. V., and L. D. HURST, 2004 Similar rates but different modes of sequence evolution in introns and at exonic silent sites in rodents: evidence for selectively driven codon usage. Mol. Biol. Evol. 21: 1014-1023.
CHARLESWORTH, B., 1994 The effect of background selection against deleterious mutations on weakly selected, linked variants. Genet. Res. 63: 213-227.
COMERON, J. M., 2001 What controls the length of noncoding DNA? Curr. Opin. Genet. Dev. 11: 652-659.
COMERON, J. M., 2004 Selective and mutational patterns associated with gene expression in humans: influences on synonymous composition and intron presence. Genetics 167: 1293-1304.
COMERON, J. M., and M. KREITMAN, 2000 The correlation between intron length and recombination in Drosophila: dynamic equilibrium between mutational and selective forces. Genetics 156: 1175-1190.
COMERON, J. M., and M. KREITMAN, 2002 Population, evolutionary and genomic consequences of interference selection. Genetics 161: 389-410.
DOMAZET-LOSO, T., and D. TAUTZ, 2003 An evolutionary analysis of orphan genes in Drosophila. Genome Res. 13: 2213-2219.
DURET, L., 2001 Why do genes have introns? Recombination might add a new piece to the puzzle. Trends Genet. 17: 172-175.
DURET, L., and D. MOUCHIROUD, 2000 Determinants of substitution rates in mammalian genes: expression pattern affects selection intensity but not mutation rate. Mol. Biol. Evol. 17: 68-74.
FISHER, R. A., 1930 The Genetical Theory of Natural Selection. Clarendon Press, Oxford.
GOLDMAN, N., and Z. YANG, 1994 A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 11: 725-736.
GORDO, I., and B. CHARLESWORTH, 2001 Genetic linkage and molecular evolution. Curr. Biol. 11: R684-R686.
HALLIGAN, D. L., A. EYRE-WALKER, P. ANDOLFATTO and P. D. KEIGHTLEY, 2004 Patterns of evolutionary constraints in intronic and intergenic DNA of Drosophila. Genome Res. 14: 273-279.
HILL, W. G., and A. ROBERTSON, 1966 The effect of linkage on limits to artificial selection. Genet. Res. 8: 269-294.
HURST, L. D., 2002 The Ka/Ks ratio: diagnosing the form of sequence evolution. Trends Genet. 18: 486.
KEIGHTLEY, P. D., and D. J. GAFFNEY, 2003 Functional constraints and frequency of deleterious mutations in noncoding DNA of rodents. Proc. Natl. Acad. Sci. USA 100: 13402-13406.
MAJEWSKI, J., and J. OTT, 2002 Distribution and characterization of regulatory elements in the human genome. Genome Res. 12: 1827-1836.
MARAIS, G., and B. CHARLESWORTH, 2003 Genome evolution: recombination speeds up adaptive evolution. Curr. Biol. 13: R68-R70.
MARAIS, G., and G. PIGANEAU, 2002 Hill-Robertson interference is a minor determinant of variations in codon bias across Drosophila melanogaster and Caenorhabditis elegans genomes. Mol. Biol. Evol. 19: 1399-1406.
MARAIS, G., T. DOMAZET-LOSO, D. TAUTZ and B. CHARLESWORTH, 2004 Correlated evolution of synonymous and nonsynonymous sites in Drosophila. J. Mol. Evol. 59: 771-779.
MATTICK, J. S., 2001 Non-coding RNAs: the architects of eukaryotic complexity. EMBO Rep. 2: 986-991.
MAXWELL, E. S., and M. J. FOURNIER, 1995 The small nucleolar RNAs. Annu. Rev. Biochem. 64: 897-934.
MULLER, H. J., 1932 Some genetic aspects of sex. Am. Nat. 66: 118-138.
NUZHDIN, S. V., M. L. WAYNE, K. L. HARMON and L. M. MCINTYRE, 2004 Common patterns of evolution of gene expression level and protein sequence in Drosophila. Mol. Biol. Evol. 21: 1308-1317.
ORR, H. A., 2000 The rate of adaptation in asexuals. Genetics 155: 961-968.
PECK, J. R., 1994 A ruby in the rubbish: beneficial mutations, deleterious mutations and the evolution of sex. Genetics 137: 597-606.
PETROV, D. A., 2002 DNA loss and evolution of genome size in Drosophila. Genetica 115: 81-91.
PETROV, D. A., T. A. SANGSTER, J. S. JOHNSTON, D. L. HARTL and K. L. SHAW, 2000 Evidence for DNA loss as a determinant of genome size. Science 287: 1060-1062.
PRACHUMWAT, A., L. DEVINCENTIS and M. F. PALOPOLI, 2004 Intron size correlates positively with recombination rate in Caenorhabditis elegans. Genetics 166: 1585-1590.
RIZZON, C., G. MARAIS, M. GOUY and C. BIÉMONT, 2002 Recombination rate and the distribution of transposable elements in the Drosophila melanogaster genome. Genome Res. 12: 400-407.