Letter to the Editor
Corresponding author: Csaba Pál, Collegium Budapest, Institute for Advanced Study, Szentháromság u. 2, H-1014 Budapest, Hungary. E-mail: cspal{at}colbud.hu
The rate of protein evolution shows considerable variation among genes. This variation is thought to reflect differences in the proportion of the sequence that is critical to fulfill given functions.
Analysis of the molecular evolution of proteins in yeast allows us to address these issues, not least because of the availability of large-scale microarray expression data from the yeast Saccharomyces cerevisiae. Additionally, there are extensive sequence data from the same species and close relatives. Furthermore, the presence of numerous duplicates in yeast allows us to compare rates of evolution for functionally related and sequence-related genes. These duplicates appear for the most part to be the product of a genome duplication approximately 100 million years ago that was followed by a period of gene loss.
METHODS
The dataset used in this article consists of the remnants of the ancient genome duplication, including 376 gene pairs organized into 55 gene clusters.
Protein sequence alignments were carried out with CLUSTAL W version 1.4.
Whole genome transcription data from ![]()
We analyzed the rate of protein evolution in the duplicated Saccharomyces genes using, as a reference, sequences from a related organism, Candida albicans. This organism is a close relative of S. cerevisiae and is known to have branched off from the Saccharomyces lineage before the genome duplication in yeast.
The criteria used to define homologs were as follows: (1) BLASTP expectation value, E < 10⁻²⁰, for both genes of a pair; (2) highest score for both duplicated S. cerevisiae genes; and (3) the duplicate-to-duplicate distance is smaller than the duplicate-to-Candida distances. The alignment method and the calculation of protein distances were the same as above. Only gene pairs with dA < 1.5 were retained.
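The three criteria and the distance cutoff can be sketched as a simple filter. This is only an illustration under stated assumptions: the record type `CandidatePair` and its field names are hypothetical, criterion (2) is read as "the Candida gene is the top-scoring hit for both duplicates", and the dA cutoff is assumed to apply to the duplicate-to-duplicate distance.

```python
from dataclasses import dataclass


@dataclass
class CandidatePair:
    """Hypothetical record for one duplicate pair and its candidate Candida homolog."""
    evalue_a: float   # BLASTP E-value, duplicate A vs. Candida gene
    evalue_b: float   # BLASTP E-value, duplicate B vs. Candida gene
    top_hit_a: bool   # Candida gene is the highest-scoring hit for duplicate A
    top_hit_b: bool   # Candida gene is the highest-scoring hit for duplicate B
    d_dup: float      # duplicate-to-duplicate protein distance
    d_out_a: float    # duplicate A to Candida protein distance
    d_out_b: float    # duplicate B to Candida protein distance


def passes_criteria(pair, max_evalue=1e-20, max_distance=1.5):
    """Return True if the pair satisfies criteria (1)-(3) and the distance cutoff."""
    # (1) Both duplicates must match the Candida gene with E < 10^-20.
    if not (pair.evalue_a < max_evalue and pair.evalue_b < max_evalue):
        return False
    # (2) The Candida gene must be the best hit for both duplicates (assumed reading).
    if not (pair.top_hit_a and pair.top_hit_b):
        return False
    # (3) The duplicates must be closer to each other than to the outgroup.
    if not (pair.d_dup < pair.d_out_a and pair.d_dup < pair.d_out_b):
        return False
    # Distance cutoff, assumed here to apply to the duplicate-to-duplicate distance.
    return pair.d_dup < max_distance
```

Applying `passes_criteria` to every candidate and keeping the pairs that return `True` would give the set used for outgroup comparisons.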
RESULTS
Do highly expressed genes evolve at a low rate? There is a statistically highly significant negative association between gene expression level and protein distance (Figure 1; Pearson correlation, r = -0.584, P < 10⁻⁶). It was shown that codon usage bias [assessed by the codon adaptation index (CAI)] strongly correlates with mRNA abundance in yeast.
In the first approach, we used a subset of the whole sample containing only gene pairs within the same broad functional category. When only genes coding for enzymes of carbohydrate, amino acid, or nucleotide metabolism are analyzed, the significant correlation between gene expression level and the rate of protein evolution remains (Spearman rank correlation, N = 29, r = -0.485, P = 0.0071). This controls for functional equivalence but not for sequence similarity.
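For readers unfamiliar with the statistic, a Spearman rank correlation is a Pearson correlation computed on ranks. The following is a minimal standard-library sketch, not the software used for the analysis; the function names are illustrative, and ties are given average ranks.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Here `x` would hold expression levels and `y` protein distances for the gene pairs in one functional category; a negative value indicates that more highly expressed genes have smaller distances.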
In a second, more rigorous approach, we examined the relative rate of protein evolution of each member of the duplicated pair when compared with the same Candida ortholog. We could then ask whether the slower evolving of the two duplicates also has the higher expression level. We found a significant negative correlation between the relative difference in gene expression levels and the relative difference in protein distances (P < 0.01, r = -0.215; see Figure 2). The more highly expressed duplicate evolves at a lower rate in 87 out of 150 gene pairs, but this excess is not statistically significant (sign test, P = 0.06).
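The sign test on 87 of 150 pairs can be reproduced with an exact binomial calculation. The sketch below assumes a two-sided test against a null probability of 0.5, which the text does not state explicitly; the function name is illustrative.

```python
from math import comb


def sign_test_two_sided(successes, n):
    """Exact two-sided sign test: P-value for `successes` out of `n` under p = 0.5."""
    # Use the larger of the two counts so the tail is always the upper one.
    k = max(successes, n - successes)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double the tail for a two-sided test, capping at 1.
    return min(1.0, 2 * upper_tail)


p_value = sign_test_two_sided(87, 150)
print(f"sign test P = {p_value:.3f}")
```

Under these assumptions the result should fall close to the reported P = 0.06, that is, just short of the conventional 0.05 threshold.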
Given that most of the duplicated genes in yeast have highly overlapping functions (![]()
The results above are, however, of less utility if the duplicates were not all derived at the same time as a result of a genome duplication event. Were they derived independently at numerous times, low divergence between two genes might indicate recent duplication rather than slow evolution. A further problem with the use of these duplicates is the possibility of gene conversion. Highly expressed genes may be more prone to frequent gene conversion events, leading to reduced sequence divergence between the pairs.
In a first test of these possibilities, we compared the duplicate-to-duplicate distance with the average duplicate-to-outgroup distance. Under the assumption of frequent gene conversion events (or independent duplication events), the two distances should be uncoupled, for the following reason. Let us assume that after the duplication event, one copy has accumulated some mutations. As long as the divergence between the duplicates is small, one copy may convert the other. This process would retard the rate of divergence between the duplicates, while divergence of the duplicates from orthologous sequences may remain relatively unaffected. If some sequences have undergone gene conversion but most have not, those that have been homogenized should appear as outliers (below the best-fit line) in the regression of duplicate-to-duplicate distance against duplicate-to-outgroup distance.
A BLAST search was used to find C. albicans genes whose sequences were most similar to the 315 duplicated genes in yeast (for details see METHODS). With the resulting 160 gene pairs with appropriate outgroup sequences, a remarkably strong correlation was found between the duplicate-to-duplicate distance and the average of the duplicate-to-outgroup distances (Pearson correlation, r = 0.870, P < 10⁻⁶; see Figure 3). To search for possible outliers, the standardized residuals (SR) from the regression line were computed for all data points. Only 3 gene pairs were detected with suspiciously low duplicate-to-duplicate distance, defined as SR < -2, corresponding to a 0.05 significance level (see Table 1A).
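The outlier screen can be illustrated with an ordinary least-squares fit followed by a threshold on the standardized residuals. This sketch makes an assumption the text does not specify: each residual is standardized by dividing it by the residual standard deviation with n - 2 degrees of freedom, without a leverage correction. Function names are illustrative.

```python
from math import sqrt


def standardized_residuals(x, y):
    """Fit y = intercept + slope * x by least squares; return slope, intercept, SR list."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation with n - 2 degrees of freedom (assumed convention).
    resid_sd = sqrt(sum(r * r for r in residuals) / (n - 2))
    return slope, intercept, [r / resid_sd for r in residuals]


def low_outliers(x, y, threshold=-2.0):
    """Indices of points lying far below the regression line (SR < threshold)."""
    _, _, sr = standardized_residuals(x, y)
    return [i for i, s in enumerate(sr) if s < threshold]
```

With `x` as the average duplicate-to-outgroup distance and `y` as the duplicate-to-duplicate distance, `low_outliers(x, y)` would list the pairs whose duplicates are unexpectedly similar, which are the candidates for gene conversion.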
The slope of the regression equation (approximately 0.6) also provides an estimate of the timing of the genome duplication event. The two species probably diverged approximately 140-330 million years ago. This yields an estimate of approximately 0.82 × 10⁸ years ago for the duplication, in accordance with the previous estimate of 10⁸ years.
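The dating argument can be made explicit. If protein distance accumulates at a roughly constant rate, the ratio of duplicate-to-duplicate distance to duplicate-to-outgroup distance (the regression slope) approximates the ratio of the duplication age to the species divergence time. The sketch below is a back-of-the-envelope calculation under that molecular-clock assumption, using the rounded slope of 0.6 and the divergence range of 140 to 330 million years; the function name is illustrative.

```python
def duplication_age_range(slope, divergence_low_myr, divergence_high_myr):
    """Scale the species divergence time by the regression slope (clock assumption)."""
    return slope * divergence_low_myr, slope * divergence_high_myr


low, high = duplication_age_range(0.6, 140.0, 330.0)
print(f"duplication age: about {low:.0f} to {high:.0f} million years")
```

With the rounded slope, the lower bound comes out near 84 million years, close to the quoted 0.82 × 10⁸ years; the small difference presumably reflects rounding of the slope.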
Another way to detect gene conversion is to examine the number of synonymous substitutions per site (KS) between the duplicates. Under the assumptions that synonymous sites evolve neutrally and no gene conversion occurs between the duplicates, KS is expected to be roughly constant. Unfortunately, in the case of yeast, the first assumption is likely to be invalid. Codon usage shows a strong bias in highly expressed genes, most likely resulting from selection for translational efficiency.
DISCUSSION
We found that (1) highly expressed genes evolve slowly in yeast, (2) this is not due simply to these genes being functionally or sequence related, and (3) it is not due to gene conversion. Additionally, we can conclude that the pattern is not likely to be the result of mutational patterns alone. In yeast, increased transcription rates are associated with increased mutation rates.
Our results show that tissue-specific gene expression differences cannot fully account for the slower evolution of highly expressed genes. One possible explanation for the results is that, as previously postulated (![]()
As regards selection on the half-life of proteins or mRNA, the logic could run in many ways. Gene dosage requirements might impose strong selection pressure to reduce the rate of degradation of protein or mRNA. If a gene product is required at a high expression level, selection might also favor longer persistence of the protein product and might also make the mRNA less likely to decay before being translated. Conversely, the product might be expressed at a high rate because it has a high decay rate. This might be because its optimal enzymatic configuration is unstable. Selection might preserve the configuration that optimizes the enzymatic abilities ahead of ensuring stability of the structure. In reality a trade-off of these parameters is expected.
Analysis of the data casts little extra light on this possibility. Unfortunately, current data on protein degradation rates in yeast are too scarce for a comparative analysis. In contrast, data on mRNA half-life are available for a large number of yeast genes.
Regardless of the uncertainty in interpretation, the results presented here indicate that in looking for the reasons that highly expressed genes evolve slowly, we need not concentrate exclusively on breadth of expression. The absolute rate of expression appears to be of some importance. That is not to say that in mammals tissue breadth has no effect, but it seems unlikely to be the only effect.
ACKNOWLEDGMENTS
We are grateful to Deborah Charlesworth, Ken Wolfe, Laurent Duret, and the two anonymous referees for their helpful suggestions. We also thank the European Science Foundation TBA program for providing a travel grant to C.P.
Manuscript received October 26, 2000; Accepted for publication March 19, 2001.
LITERATURE CITED
ALTSCHUL, S. F., T. L. MADDEN, A. A. SCHAFFER, J. ZHANG, and Z. ZHANG et al., 1997 Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.
BERBEE, M. L., and J. W. TAYLOR, 1993 Chapter 4, pp. 67-78 in The Fungal Holomorph: Mitotic, Meiotic and Pleomorphic Speciation in Fungal Systematics, edited by D. R. REYNOLDS and J. W. TAYLOR. CAB International, Wallingford, UK.
COGHLAN, A., and K. H. WOLFE, 2000 Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae. Yeast 16:1131-1145.
DATTA, A., and S. JINKS-ROBERTSON, 1995 Association of increased spontaneous mutation rates with high levels of transcription in yeast. Science 268:1616-1619.
DURET, L., and D. MOUCHIROUD, 2000 Determinants of substitution rates in mammalian genes: expression pattern affects selection intensity but not mutation rate. Mol. Biol. Evol. 17:68-85.
FELSENSTEIN, J., 1989 PHYLIP (phylogeny inference package) version 3.2. Cladistics 5:164-166.
HASTINGS, K. E. M., 1996 Strong evolutionary conservation of broadly expressed protein isoforms in the troponin I gene family and other vertebrate gene families. J. Mol. Evol. 42:631-640.
HOLSTEGE, F. C., E. G. JENNINGS, J. J. WYRICK, T. I. LEE, and C. J. HENGARTNER et al., 1998 Dissecting the regulatory circuitry of a eukaryotic genome. Cell 95:717-728.
KUMA, K., N. IWABE, and T. MIYATA, 1995 Functional constraints against variations on molecules from the tissue level: slowly evolving brain-specific genes demonstrated by protein kinase and immunoglobulin supergene families. Mol. Biol. Evol. 12:123-130.
LI, W. H., 1993 Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J. Mol. Evol. 36:96-99.
LI, W. H., 1997 Molecular Evolution. Sinauer, Sunderland, MA.
MOREY, N. J., C. N. GREENE, and S. JINKS-ROBERTSON, 2000 Genetic analysis of transcription-associated mutation in Saccharomyces cerevisiae. Genetics 154:109-120.
PAMILO, P., and N. O. BIANCHI, 1993 Evolution of the Zfx and Zfy genes: rates and interdependence between the genes. Mol. Biol. Evol. 10:271-281.
PESOLE, G., M. LOTTI, L. ALBERGHINA, and C. SACCONE, 1995 Evolutionary origin of nonuniversal CUGSer codon in some Candida species as inferred from a molecular phylogeny. Genetics 141:903-907.
PETES, T. D., and C. W. HILL, 1988 Recombination between repeated genes in microorganisms. Annu. Rev. Genet. 22:147-168.
SEOIGHE, C., and K. H. WOLFE, 1999a Updated map of duplicated regions in the yeast genome. Gene 238:253-261.
SEOIGHE, C., and K. H. WOLFE, 1999b Yeast genome evolution in the post-genome era. Curr. Opin. Microbiol. 2:548-554.
THOMPSON, J. D., D. G. HIGGINS, and T. J. GIBSON, 1994 CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.
WILLIAMS, E. J. B., and L. D. HURST, 2000 The proteins of linked genes evolve at similar rate. Nature 407:900-903.
WOLFE, K. H., and D. C. SHIELDS, 1997 Molecular evidence for an ancient duplication of the entire yeast genome. Nature 387:708-713.