Codon Usage Bias Covaries With Expression Breadth and the Rate of Synonymous Evolution in Humans, but This Is Not Evidence for Selection
Araxi O. Urrutia and Laurence D. Hurst
Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, United Kingdom
Corresponding author: Laurence D. Hurst, Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath BA2 7AY, United Kingdom. E-mail: l.d.hurst@bath.ac.uk
| ABSTRACT |
|---|
In numerous species, from bacteria to Drosophila, evidence suggests that selection acts even on synonymous codon usage: codon bias is greater in more abundantly expressed genes, the rate of synonymous evolution is lower in genes with greater codon bias, and there is consistency between genes in the same species in which codons are preferred. In contrast, in mammals, while nonequal use of alternative codons is observed, the bias is attributed to the background variance in nucleotide concentrations, reflected in the similar nucleotide composition of flanking noncoding and exonic third sites. However, a systematic examination of the covariants of codon usage controlling for background nucleotide content has yet to be performed. Here we present a new method to measure codon bias that corrects for background nucleotide content and apply this to 2396 human genes. Nearly all (99%) exhibit a higher amount of codon bias than expected by chance. The patterns associated with selectively driven codon bias are weakly recovered: Broadly expressed genes have a higher level of bias than do tissue-specific genes, the bias is higher for genes with lower rates of synonymous substitutions, and certain codons are repeatedly preferred. However, while these patterns are suggestive, the first two patterns appear to be methodological artifacts. The last pattern reflects in part biases in usage of nucleotide pairs. We conclude that we find no evidence for selection on codon usage in humans.
DOES selection act on mutations within exons that do not alter the amino acid sequence of the coded protein? Originally it was asserted that these synonymous mutations must be neutral.
In mammalian genomes, codon usage bias is also observed.
Nonetheless, in one case there has been a claim that a highly expressed set of genes (histones) does show codon usage that deviates from background nucleotide content in flanking regions.
In this study we present the results of an analysis of codon usage bias in a sample of over 2000 human genes, designed to ask whether codon usage bias in mammals can be explained by background nucleotide content alone or whether such parameters as expression breadth might also be important. To achieve this we developed a tool to measure codon usage bias that corrects for nucleotide biases.
| METHODS |
|---|
Sequences from 2396 genes were included in the sample. Accession numbers were obtained from the DURET and MOUCHIROUD (2000) database and sequences were retrieved with ACNUC.
Randomization tests:
Random sequences were generated for each gene, conserving both the base content at first, second, and third codon sites and the gene length. Start and stop codons were removed before randomization. All randomized sequences that contained an internal stop codon were discarded. The procedure was repeated until a total of 1000 random sequences were obtained.
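The randomization can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes the input coding sequence already has its start and stop codons removed, and the function names `randomize_gene` and `randomized_set` are hypothetical. Bases are shuffled separately within each codon position, so base content per site and gene length are conserved, and any candidate with an internal stop codon is discarded and redrawn.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}

def randomize_gene(cds, rng):
    """Shuffle bases within each codon position; redraw if a stop codon appears."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    # columns[k] holds every base found at codon position k (0, 1, 2)
    columns = [[c[k] for c in codons] for k in range(3)]
    while True:
        shuffled = [rng.sample(col, len(col)) for col in columns]
        candidate = ["".join(triplet) for triplet in zip(*shuffled)]
        if not STOP_CODONS.intersection(candidate):
            return "".join(candidate)

def randomized_set(cds, n=1000, seed=0):
    """Return n randomized sequences (1000 in the text) for one gene."""
    rng = random.Random(seed)
    return [randomize_gene(cds, rng) for _ in range(n)]
```

Shuffling within positions, rather than sampling bases with replacement, keeps the base counts at each site exactly equal to those of the real gene.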
Effective number of codons tests:
Effective number of codons (ENC) values were obtained for all sequences and values of original sequences were compared with the distribution of random sequences. As the ENC index has a cutoff at 61 and all sequences with greater values are adjusted to 61, the variance of the distribution was estimated on the basis of the median instead of the mean and by using only the lower half of the distribution.
Defining amino acids:
In all tests, nondegenerate amino acids (methionine and tryptophan) were not taken into account. For the majority of the amino acids all their alternative codons have the same bases at the first and second site. The exceptions are serine, arginine, and leucine, each encoded by six alternative codons. Each of these amino acids was treated in all tests as two independent amino acids, one of twofold degeneracy and one of fourfold degeneracy.
Background nucleotide bias model expectations:
To obtain expected proportions for each alternative codon correcting for background nucleotide content, all codons were split into three groups according to the number of different nucleotides (two, three, or four) that could appear at the third site without changing the amino acid encoded. The group of twofold degeneracy was further divided into two groups, those where the choice is between T and C and those ending in A or G. The expected proportions of each alternative codon for a given amino acid were derived from all the other sites with the same degree of degeneracy or greater (i.e., excluding the amino acid being analyzed). For example, for twofold degenerate amino acids that could use thymine or cytosine at the third site, expectations were calculated from all the other twofold degenerate amino acids with the same choice of third-site nucleotides, together with all the fourfold degenerate amino acids, by calculating the relative frequencies of thymine and cytosine. For isoleucine, expectations were calculated from the relative frequencies of adenine, cytosine, and thymine in fourfold degenerate amino acids. Finally, for all the fourfold degenerate amino acids, only the distributions of nucleotides at the third sites of other fourfold degenerate sites were used for calculating expectations.
To minimize the uncertainty in the expected values, all cases with <30 sites to base the expectations on were eliminated. It should be noted that by using this model as null expectation, we are not taking into account the codon bias caused by dinucleotide biases.
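The fourfold case of this null model can be sketched as follows. The sketch assumes an uppercase, in-frame coding sequence; the function name `fourfold_expectations` and the identification of fourfold families by their first two bases are illustrative choices, not taken from the authors' script. Expected third-site proportions for one family are taken from all other fourfold families in the same gene, and cases with fewer than 30 informative sites are dropped.

```python
from collections import Counter

# First two bases of the eight fourfold degenerate families:
# Val, Pro, Thr, Ala, Gly, and the fourfold parts of Ser, Arg, Leu.
FOURFOLD_PREFIXES = {"GT", "CC", "AC", "GC", "GG", "TC", "CG", "CT"}
MIN_SITES = 30  # threshold stated in the text

def fourfold_expectations(cds, target_prefix):
    """Expected third-site proportions for one fourfold family,
    estimated from the other fourfold families of the same gene."""
    counts = Counter()
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        prefix, third = codon[:2], codon[2]
        if prefix in FOURFOLD_PREFIXES and prefix != target_prefix:
            counts[third] += 1
    total = sum(counts.values())
    if total < MIN_SITES:
        return None  # too few sites to base an expectation on
    return {base: counts[base] / total for base in "ACGT"}
```

The twofold and isoleucine cases would follow the same pattern, pooling the appropriate twofold and fourfold sites and renormalizing over the nucleotides that are allowed at the third site.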
Probability of observed bias:
Proportions of observed and expected codon usage for each amino acid were represented in terms of the minimal number of binomial variables. For amino acids with two alternative codons, codon usage is represented by one variable, the frequency of one codon over the number of times the amino acid is present. For three alternative codons A, B, and C, codon usage can be represented with two variables: (a) the proportion of codon A over the total number of times the amino acid is present and (b) the proportion of codon B over the sum of frequencies of codons B and C. For amino acids with four alternative codons A, B, C, and D, the proportions of codon usage are represented by three variables: (a) the proportion of codons A + B over the frequency of the amino acid, (b) the proportion of codon A over the sum of frequencies of codons A + C, and (c) the proportion of B over the sum of frequencies of codons B + D. Under this method, the distribution of codon usage of a gene, and the expected one, can be represented by 38 binomial variables. All sequences in which not all the variables could be assessed were excluded from analysis (leaving n = 1629). To estimate the probability of the bias observed for each gene under the null hypothesis, the deviation from expectation for each variable was represented in terms of the number of standard deviations away from the mean (z). The standard deviation for a binomial variable can be defined as

$$\sigma = \sqrt{N p (1 - p)}$$

where $N$ is the number of observations and $p$ is the expected proportion.

The squared z values for each of the 38 variables were calculated and then summed to obtain the overall score (x),

$$x = \sum_{i=1}^{38} z_i^2$$

Assuming that the binomial variables are normally distributed, the probability of occurrence of the observed bias can then be calculated with a χ² distribution of 38 d.f. that has the following probability density function:

$$f(x) = \frac{x^{\nu/2 - 1}\, e^{-x/2}}{2^{\nu/2}\, \Gamma(\nu/2)}, \qquad \nu = 38$$
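The scoring step can be sketched in a few lines. The sketch assumes each binomial variable is supplied as a triple (observed count, number of observations, expected proportion) with the expected proportion strictly between 0 and 1, and that z is computed on counts with the binomial standard deviation. The function names are hypothetical. The upper-tail probability uses the closed form that is exact for χ² distributions with an even number of degrees of freedom, which covers the 38 d.f. used here.

```python
import math

def z_score(successes, trials, p_expected):
    """Distance of the observed count from expectation, in binomial SDs."""
    sd = math.sqrt(trials * p_expected * (1.0 - p_expected))
    return (successes - trials * p_expected) / sd

def chi2_sf_even_df(x, df):
    """P(X >= x) for a chi-square variable; closed form valid for even df."""
    if df % 2:
        raise ValueError("this closed form requires an even number of d.f.")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total

def bias_probability(variables):
    """Sum of squared z values and its chi-square tail probability.
    The d.f. equals the number of variables (38 per gene in the text)."""
    x = sum(z_score(k, n, p) ** 2 for k, n, p in variables)
    return x, chi2_sf_even_df(x, len(variables))
```

For a real gene, `variables` would hold the 38 binomial variables described above; a small P value means the codon usage deviates from the background nucleotide expectation by more than chance alone would produce.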

Analysis of overall bias:
All sequences were concatenated into a single large sequence. Observed and expected codon distributions were obtained as previously described for individual genes. The probability of the observed bias or greater from background nucleotide bias model expectations for each amino acid with n alternative codons was estimated using two standard methods of goodness of fit that approximate the χ² distribution with n - 1 d.f.:

A. χ² test:

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

B. G-test:

$$G = 2 \sum_{i=1}^{n} O_i \ln \frac{O_i}{E_i}$$

where $O_i$ and $E_i$ are the observed and expected counts of codon $i$.

Probabilities were estimated for the values obtained by comparing with cumulative χ² distributions.
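Both goodness-of-fit statistics are short to compute. The sketch below assumes observed and expected values are given as counts for the alternative codons of one amino acid (expected counts being the expected proportions multiplied by the total); the function names are illustrative.

```python
import math

def chi_square_stat(observed, expected):
    """Pearson chi-square statistic on counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def g_stat(observed, expected):
    """G statistic (log-likelihood ratio); zero observed counts contribute 0."""
    return 2.0 * sum(o * math.log(o / e)
                     for o, e in zip(observed, expected) if o > 0)

# Example: a fourfold amino acid seen 100 times, equal expected usage.
observed = [30, 10, 40, 20]
expected = [25.0, 25.0, 25.0, 25.0]
chi2_value = chi_square_stat(observed, expected)
g_value = g_stat(observed, expected)
```

Either value is then compared with a χ² distribution with n - 1 d.f. (3 in this example).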
Dinucleotide analysis:
Dinucleotide proportions were obtained for sites first to second, second to third, and third to first for each gene. Expected proportions for a given dinucleotide (Ed) at sites s2 and s3 given the nucleotide content in the sequence at each site were calculated as

$$E_d = p(n_2)\, p(n_3)$$

where p(n2) and p(n3) are the proportions of the second and third nucleotides of the dinucleotide at sites s2 and s3. Dinucleotide bias (DnB) was estimated for each gene by

where $O_{d,s2,s3}$ is the observed proportion of the dinucleotide.
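The observed and expected dinucleotide proportions for sites 2 and 3 can be tabulated as in the sketch below, which assumes an uppercase, in-frame coding sequence and takes the expectation as the product of the per-site nucleotide proportions. The function name is hypothetical, and the sketch stops at the observed and expected proportions; it does not compute the DnB index itself.

```python
from collections import Counter

def dinucleotide_2_3(cds):
    """Map each dinucleotide to (observed, expected) proportions
    at codon sites 2 and 3."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    n = len(codons)
    site2 = Counter(c[1] for c in codons)
    site3 = Counter(c[2] for c in codons)
    observed = Counter(c[1:] for c in codons)
    table = {}
    for a in "ACGT":
        for b in "ACGT":
            expected = (site2[a] / n) * (site3[b] / n)
            table[a + b] = (observed[a + b] / n, expected)
    return table
```

The first-to-second pair works the same way; the third-to-first pair needs the third base of one codon and the first base of the next.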
| RESULTS |
|---|
Background nucleotide content alone does not explain the codon bias observed in mammalian genes:
The extent of codon usage bias in human genes is dominantly dictated by the nucleotide content of the chromosomal region within which the gene finds itself.
α = 0.05 (see METHODS).
However, because the ENC has a cutoff at 61 it has limited use for sequences with low codon usage bias. In addition, randomizing the first and second positions could potentially influence the distribution of ENC values in the random sequences. We therefore performed a second test to estimate the probability of the bias observed, by comparing the proportions of alternative codons of each sequence from the same sample of genes to expected proportions based on the nucleotide content at the third sites of each sequence. If the bias from equiprobability of a gene can be explained by the nucleotide content of that gene, then the null expectation would be that frequencies for each codon should match proportions of bases at the third site of all the amino acids with the same degree of degeneracy in that gene. To estimate the probability of obtaining the observed bias or greater under the null hypothesis for each gene, the observed and expected frequencies of codon usage for each amino acid were represented in terms of the minimal set of binomial variables, which is a method to approximate a multinomial distribution (see METHODS). The probability of obtaining the observed bias or greater under null expectation was estimated by summing the squared z values of distances of observed from expected for each binomial variable and comparing this with a standard
χ² distribution (see METHODS). Significant deviations from expected (defined as P < 0.01) were found for 99% of the genes in the sample (data not shown). We conclude that "background" nucleotide content explains some, but by no means all, of the observed codon usage bias. Dinucleotide biases are also known to affect codon usage bias.
Prior methods for assessing codon usage bias have limitations:
There are several methods to measure codon usage bias; however, many of them require a known set of preferred codons estimated from highly expressed genes.
Maximum-likelihood codon bias is a new method for determining codon usage bias correcting for background nucleotide content:
Given the limitations of the available methods, we chose to develop an alternative method that is easy to obtain and not sensitive to amino acid biases. We wanted a method that could measure the degree of nonrandomness in the use of alternative codons that is minimally affected by the presence of rare amino acids. In addition the method should allow testing of a variety of null hypotheses for codon distribution (i.e., not just equiprobability of occurrence); in this article we use this method to correct for background nucleotide content, but it can be used to correct for dinucleotide biases as well.
The use of alternative codons can be thought of as an ensemble of several random variables, one per amino acid, each with two to six possible different outcomes or codons (amino acids encoded by only one codon cannot have codon usage bias), and each outcome with an associated probability of appearance. Each specific distribution of outcomes is a vector and the codon bias for one amino acid is the distance of the observed vector from the expected one. However, to obtain an index of codon usage bias for a complete gene, the biases of individual amino acids have to be added in a sensible way. Different amino acids within a gene vary in two aspects: frequency within a sequence and their degree of degeneracy. If an amino acid is rare, then the observed distribution is more likely to be far from the expected just by chance; therefore the bias of a rare amino acid should be downscaled to have less impact on the overall index of codon bias. The different amino acids also vary in the number of alternative codons by which they are encoded and this should also be taken into account when biases from different amino acids are to be added.
Taking into account the two aspects discussed above, we developed a new method that is easy to calculate and allows us to test different models to explain codon usage bias. The bias of an individual amino acid BA with frequency NA of level of degeneracy T, having the observed Oc and expected Ec proportions for each alternative codon, is obtained by

The bias for a gene Bg can then be obtained by summing over all amino acids,

where A is the number of amino acids contributing to the index.
All genes where more than five amino acids were missing or no index could be estimated were removed from all comparisons (leaving n = 2387). We named the method maximum-likelihood codon bias (MCB): the contribution of each amino acid to the index is weighted by an estimate of the likelihood of bias occurring in that amino acid, given its frequency and degree of degeneracy. Nevertheless, MCB is not a maximum-likelihood method in a strict sense. We believe this method would be useful for interspecies comparisons by allowing correction for differences in nucleotide composition. Importantly, MCB is minimally affected by biases in amino acid content of different degrees of degeneracy (r² = 0.4%; Fig 3B) and appears to effectively remove the influence of background GC content (compare Fig 2A and Fig 2B).
It should be noted that with any procedure that estimates the distance from randomness, the size of the sample of events affects the variance that is expected; since the length of genes varies it is expected that this would influence the MCB values that are obtained. Therefore it is important to carefully study the relation of gene length with the variables that are being tested against codon usage values. A script for calculating MCB is available from the authors.
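A minimal sketch of an index of this kind is given below. The functional form is an assumption, not confirmed by the text here: the per-amino-acid bias is taken as a chi-square-like sum over the T alternative codons, B_A = Σ (O_c − E_c)² / E_c on proportions, each B_A is weighted by log N_A so that rare amino acids count for less, and the weighted values are averaged over the A contributing amino acids. The natural logarithm and the function names are also assumptions.

```python
import math

def amino_acid_bias(observed, expected):
    """Assumed form of B_A: sum over codons of (O_c - E_c)^2 / E_c,
    with O_c and E_c given as proportions."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def mcb(families):
    """Assumed form of B_g: mean over amino acids of B_A * log(N_A).
    families is a list of (N_A, observed proportions, expected proportions)."""
    terms = [amino_acid_bias(obs, exp) * math.log(n)
             for n, obs, exp in families if n > 0]
    return sum(terms) / len(terms) if terms else float("nan")

# Two twofold amino acids: one unbiased (N = 10), one biased (N = 20).
example = [
    (10, [0.50, 0.50], [0.5, 0.5]),
    (20, [0.75, 0.25], [0.5, 0.5]),
]
index = mcb(example)
```

Because the expected proportions are an input, the same function can be reused with a different null model, for example one that also corrects for dinucleotide biases.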
MCB covaries with breadth of expression and rates of synonymous substitution:
Expected distributions for each codon family were derived from the base composition of all third sites with the same or greater level of degeneracy within a given sequence (according to METHODS) and MCB values were obtained for all genes in the data sample. If the residual biases in codon usage, once correcting for nucleotide content, are due to selection then we could expect (a) higher bias in more broadly expressed genes, (b) consistently preferred codons, or (c) an inverse correlation with levels of synonymous substitutions (Ks).
We assessed the effect of breadth of expression on codon usage bias in our sample (see METHODS). Breadth of expression is not a direct measure of expression rate and therefore we may not necessarily be analyzing the key parameter. Nonetheless, the breadth of expression is known to covary with the intensity of purifying selection acting on the nonsynonymous sites.
T mutations in "breathing DNA") since the MCB method already takes into account gene-specific background nucleotide concentrations.
An inverse correlation between codon usage bias and rates of silent site substitutions has been observed in bacteria.
The covariance with expression breadth and synonymous substitution rates is also found when the Karlin and Mrazek method is used as the measure of codon bias (breadth of expression, P < 0.001, r² = 0.9%; synonymous substitution rate using Li93, P = 0.002, r² = 0.4%). Although both the MCB and Karlin and Mrazek measures correlate significantly with synonymous substitution rates and breadth of expression, these weak correlations should be interpreted cautiously.
Codon preferences:
The above results are suggestive of a role for selection. If selection is to explain the above effects then we should also expect to see certain codons repeatedly being favored among genes. To investigate if the observed biases were favoring specific codons over others we performed an overall analysis of the whole sample by concatenating all genes into one large sequence. If the biases are due to factors specific for individual genes these should cancel each other out in the whole sample. The proportions for each alternative codon were obtained and compared with expectations from the nucleotide biases. Significant differences from expectations were observed for all of the amino acids that have two or more synonymous codons using the two tests of goodness of fit (P < 0.001, see METHODS).
A more conservative test is to investigate the consistency of the direction of the biases for individual genes. If there is no significant tendency favoring a particular set of codons, then a codon is expected to be overrepresented in one-half of the genes in which it deviates from expectation. The majority of the codons show significantly less heterogeneity than expected by chance, and some were biased in one direction in 90% of the genes (Table 1). The above results are consistent with selective pressures favoring specific codons. Were this the result of selection, we can predict that tRNA levels should be more highly skewed for the amino acids showing bias than for those showing little bias, as has been shown for other species.
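The direction-consistency test can be illustrated with an exact two-sided sign test. This is a sketch of one standard way to ask whether a codon is overrepresented more often than one-half of the time; the specific test and the function name are assumptions rather than the authors' stated procedure.

```python
from math import comb

def sign_test_p(over, under):
    """Two-sided exact binomial probability that a 50:50 process
    gives a split at least as uneven as over:under."""
    n = over + under
    k = max(over, under)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

A codon overrepresented in 900 of 1000 genes, as in the most extreme cases mentioned above, gives a vanishingly small probability under the 50:50 expectation.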
Expression breadth and synonymous substitution patterns are most probably due to gene length effects:
The above results are suggestive of selection possibly playing a role in codon usage bias in humans. However, as stated earlier, genes of different length are likely to have different MCB values owing to the nature of the method. Indeed, if we randomize our sequences and measure the mean MCB for 1000 simulants for each of our genes, we find that the MCB, on average, is higher for shorter genes. This is to be expected of any statistic that employs a multinomial distribution and applies equally to the method of Karlin and Mrazek.
Importantly, it so happens that in our data set longer genes have a slightly higher rate of synonymous substitutions and are not expressed in as broad a range of tissues. Therefore, plotting mean MCB for the randomized genes against breadth of expression for the real gene, we still find a weak positive correlation of the order of magnitude reported for the real genes (P < 0.001, r² = 4.0%). Likewise we find in the mean MCB vs. Ks regression a weak negative correlation of about the order reported for the real genes [Li93, P = 0.001, r² = 0.6%; Tamura and Nei method (TN93), P = 0.002, r² = 0.5%]. Moreover, when we subtract the average bias of the random sequences from the bias of the real sequences, the correlation with breadth of expression disappears and that with rates of substitution weakens considerably (expression, P = 0.348, r² = 0.01%; Ks Li93, P = 0.014, r² = 0.03%). Therefore, the most conservative interpretation of our data is that MCB does covary with expression breadth and Ks, but this is likely to be because larger genes tend to have lower expression breadth and higher rates of silent site substitution. The data appear not to support the hypothesis that covariance is due to selection on codon usage per se. It should be noted that for 96% of the sequences the MCB value of the real data was higher than the mean value for the random sequences.
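The length effect is easy to reproduce with a small simulation. The sketch below does not use MCB itself; it uses a generic chi-square-like deviation on proportions as a stand-in, with hypothetical function names, to show that a purely random sample of codons looks more "biased" when it is small, and how a mean random deviation can be subtracted from an observed value.

```python
import random

def deviation(counts, expected):
    """Chi-square-like distance between observed proportions and expectation."""
    n = sum(counts)
    return sum((c / n - e) ** 2 / e for c, e in zip(counts, expected))

def mean_random_deviation(n_sites, expected, reps=1000, seed=0):
    """Average deviation of random samples of n_sites codons drawn
    from the expected proportions (no real bias present)."""
    rng = random.Random(seed)
    k = len(expected)
    total = 0.0
    for _ in range(reps):
        draws = rng.choices(range(k), weights=expected, k=n_sites)
        counts = [draws.count(i) for i in range(k)]
        total += deviation(counts, expected)
    return total / reps

def length_corrected(observed_bias, n_sites, expected):
    """Observed bias minus the mean bias expected by chance at this length."""
    return observed_bias - mean_random_deviation(n_sites, expected)

short_gene = mean_random_deviation(30, [0.25] * 4, reps=300)
long_gene = mean_random_deviation(300, [0.25] * 4, reps=300)
```

With four equally likely codons the chance deviation scales roughly as 3/n, so the 30-site sample shows about ten times the apparent bias of the 300-site sample even though neither is biased.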
Dinucleotide effects and preferred codons:
We are left trying to understand why there is such a large residual variance in codon usage after background nucleotide content is taken into account. One possibility is that the biases are caused by mutation biases or selection associated with particular dinucleotides. We performed a dinucleotide analysis on the whole sample (see METHODS) and also found that the sequences of the sample show significant biases in the appearance of dinucleotides from the expectations based on nucleotide content variations, consistent with previous observations.
It is not the case, however, that dinucleotide effects can explain the totality of the residual bias. If there are significant biases that cannot be explained by dinucleotide effects, then this should be revealed by comparing the relative frequencies of the codons that encode amino acids with the same degree of degeneracy and that share the second nucleotide in their codons. So, for example, if dinucleotide biases explain codon usage bias, then the relative frequency of A-ending codons among the codons that specify glutamine (CAA, CAG) should be the same as the relative frequency of the A-ending codons among the codons that specify glutamic acid (GAA, GAG).
For each gene, the relative frequencies for each codon were calculated with respect to the other codons that encode the same amino acid. Those amino acids whose codons have the same nucleotide at the second site and that have the same type of degeneracy were grouped. Three such groups can be formed: (1) tyrosine, histidine, asparagine, and aspartic acid; (2) glutamine, lysine, and glutamic acid of twofold degeneracy; and (3) proline, threonine, and alanine of fourfold degeneracy. Within each group of amino acids, subgroups of those codons that have the same nucleotides at the second and the third sites were formed. Within each subgroup the relative frequencies of codons were compared against each other with Mann-Whitney tests. A total of 21 comparisons were made (for amino acids with twofold degree of degeneracy, only one subgroup of codons was formed since the second subgroup is complementary). In this test, GC content variations do not affect the comparisons because the relative frequencies for each amino acid are calculated with respect to the other codons that encode the same amino acid. Assuming that there are no diamino acid biases or sources of bias other than dinucleotide effects, we can expect the relative proportions of codons of different amino acids to be nearly identical (i.e., not significantly different), since they are expected to interact with similar proportions of nucleotides at the first position of the next codon. The largest difference was found in the comparison of the codons CAA and GAA (encoding glutamine and glutamic acid, respectively), with mean frequencies of 0.24 and 0.38, respectively. Of all 21 possible comparisons within subgroups, only the comparison of the codons TAT and AAT (encoding tyrosine and asparagine, respectively) was not significant with an α value of 0.05. All but 4 were significantly different with an α value of 0.01.
Some of the differences observed might be due to the existence of trinucleotide biases, diamino acid biases, or any more elaborated mutation patterns. These results show, however, that dinucleotide effects cannot alone account for all of the observed distribution of codons in human genes.
| DISCUSSION |
|---|
We found that codon usage bias in mammalian genes is not completely explained by background nucleotide content variation. We therefore developed a method to study the influence of other variables on codon usage. Unlike other methods ours appears to be insensitive to influence from rare amino acids. When we apply this method to a sample of human sequences, correcting expected distributions for background nucleotide content, we find that codon usage bias covaries with breadth of expression and is inversely correlated with the rate of synonymous substitution. This could suggest selective pressures related to translation efficiency, as has been conjectured.
We also observe that there are codons that are consistently over- or underrepresented. This pattern can be explained in part by dinucleotide biases that also influence codon usage. However, we have also shown that not all the bias can be explained by such a simple mutational bias. While the cause of the remaining bias is uncertain, we find no support for the hypothesis that codon usage bias is due to selection.
Can we be sure that selection does not affect codon usage in mammals? While the above results would tend to suggest an absence of selection, as might be assumed to be the dominant position, several caveats must be noted. First, the dearth of TA dinucleotides may be a result of selection, as we discussed.
Third, we need to understand how to resolve the present findings with the result that there are dramatic increases in the amount of gene expression observed when foreign sequences, to be expressed in mammalian cells, are modified to avoid having rare codons. One possibility is that negative results are not reported and therefore we are left only with the cases in which the increase in expression could be due to the change in some synonymous sites rather than the effect of codon usage per se. On the other hand, this observation could indeed be indicative of selective pressures related to translation efficiency acting on codon usage distributions. However, because we are using a method that measures distance from random use, rather than the degree in which optimal codons are used, we might not have adequate resolution to detect the patterns. Using a method to measure codon bias based on the degree of use of optimal codons, but correcting for the background nucleotide bias, could allow recovery of evidence of weak selective pressures acting on coding sequences in mammals. In the meantime, we may conclude that codon usage bias covaries with expression breadth and the rate of synonymous evolution in humans but that this is not evidence for selection.
| ACKNOWLEDGMENTS |
|---|
We thank Laurent Duret, Brian Charlesworth, Jody Hey, and two anonymous referees for comments on an earlier version of the manuscript. This work was funded by a grant from Conacyt to A.O.U. and by the Biotechnology and Biological Sciences Research Council (BBSRC) and the Royal Society to L.D.H.
Manuscript received May 22, 2001; Accepted for publication August 15, 2001.
| LITERATURE CITED |
|---|
BERNARDI, G., 1995 The human genome: organization and evolutionary history. Annu. Rev. Genet. 29:445-476.
BERNARDI, G., D. MOUCHIROUD and C. GAUTIER, 1997 Isochores and synonymous substitutions in mammalian genes, pp. 197-208 in DNA and Protein Sequence Analysis, edited by M. J. BISHOP and C. J. RAWLINGS. IRL Press, Oxford.
BEUTLER, E., T. GELBART, J. H. HAN, J. A. KOZIOL, and B. BEUTLER, 1989 Evolution of the genome and the genetic-code: selection at the dinucleotide level by methylation and polyribonucleotide cleavage. Proc. Natl. Acad. Sci. USA 86:192-196.
DEBRY, R. W. and W. F. MARZLUFF, 1994 Selection on silent sites in the rodent H3 histone gene family. Genetics 138:191-202.
DUNN, K. A., J. P. BIELAWSKI, and Z. H. YANG, 2001 Substitution rates in Drosophila nuclear genes: implications for translational selection. Genetics 157:295-305.
DURET, L. and N. GALTIER, 2000 The covariation between TpA deficiency, CpG deficiency, and G + C content of human isochores is due to a mathematical artifact. Mol. Biol. Evol. 17:1620-1625.
DURET, L. and L. D. HURST, 2001 The elevated GC content at exonic third sites is not evidence against neutralist models of isochore evolution. Mol. Biol. Evol. 18:757-762.
DURET, L. and D. MOUCHIROUD, 1999 Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila, Arabidopsis. Proc. Natl. Acad. Sci. USA 96:4482-4487.
DURET, L. and D. MOUCHIROUD, 2000 Determinants of substitution rates in mammalian genes: expression pattern affects selection intensity but not mutation rate. Mol. Biol. Evol. 17:68-74.
EYRE-WALKER, A. C., 1991 An analysis of codon usage in mammals: selection or mutation bias. J. Mol. Evol. 33:442-449.
GOUY, M. and C. GAUTIER, 1982 Codon usage in bacteria: correlation with gene expressivity. Nucleic Acids Res. 10:7055-7074.
GOUY, M., F. MILLERET, C. MUGNIER, M. JACOBZONE, and C. GAUTIER, 1984 Acnuc: a nucleic-acid sequence data-base and analysis system. Nucleic Acids Res. 12:121-127.
HANAI, R. and A. WADA, 1988 The effects of guanine and cytosine variation on dinucleotide frequency and amino-acid composition in the human genome. J. Mol. Evol. 27:321-325.
IKEMURA, T. and K. WADA, 1991 Evident diversity of codon usage patterns of human genes with respect to chromosome-banding patterns and chromosome-numbers: relation between nucleotide-sequence data and cytogenetic data. Nucleic Acids Res. 19:4333-4339.
KANAYA, S., Y. YAMADA, Y. KUDO, and T. IKEMURA, 1999 Studies of codon usage and tRNA genes of 18 unicellular organisms and quantification of Bacillus subtilis tRNAs: gene expression level and species-specific diversity of codon usage based on multivariate analysis. Gene 238:143-155.
KARLIN, S. and J. MRAZEK, 1996 What drives codon choices in human genes? J. Mol. Biol. 262:459-472.
KING, J. L. and T. H. JUKES, 1969 Non-Darwinian evolution. Science 164:788-798.
LANDER, E. S., L. M. LINTON, B. BIRREN, C. NUSBAUM, and M. C. ZODY et al., 2001 Initial sequencing and analysis of the human genome. Nature 409:860-921.
LEVY, J. P., R. R. MULDOON, S. ZOLOTUKHIN, and C. J. LINK, 1996 Retroviral transfer and expression of a humanized, red-shifted green fluorescent protein gene into human tumor cells. Nat. Biotechnol. 14:610-614.
LI, W.-H., 1993 Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J. Mol. Evol. 36:96-99.
MARAIS, G., D. MOUCHIROUD, and L. DURET, 2001 Does recombination improve selection on codon usage? Lessons from nematode and fly complete genomes. Proc. Natl. Acad. Sci. USA 98:5688-5692.
MORIYAMA, E. N. and J. R. POWELL, 1997 Codon usage bias and tRNA abundance in Drosophila. J. Mol. Evol. 45:514-523.
POWELL, J. R. and E. N. MORIYAMA, 1997 Evolution of codon usage bias in Drosophila. Proc. Natl. Acad. Sci. USA 94:7784-7790.
SHARP, P. M. and W. H. LI, 1987 The rate of synonymous substitution in enterobacterial genes is inversely related to codon usage bias. Mol. Biol. Evol. 4:222-230.
SHARP, P. M., T. M. F. TUOHY, and K. R. MOSURSKI, 1986 Codon usage in yeast: cluster-analysis clearly differentiates highly and lowly expressed genes. Nucleic Acids Res. 14:5125-5143.
SHARP, P. M., M. AVEROF, A. T. LLOYD, G. MATASSI, and J. F. PEDEN, 1995 DNA-sequence evolution: the sounds of silence. Philos. Trans. R. Soc. Lond. Ser. B 349:241-247.
STENICO, M., A. T. LLOYD, and P. M. SHARP, 1994 Codon usage in Caenorhabditis elegans: delineation of translational selection and mutational biases. Nucleic Acids Res. 22:2437-2446.
WELLS, K. D., J. A. FOSTER, K. MOORE, V. G. PURSEL, and R. J. WALL, 1999 Codon optimization, genetic insulation, and an rtTA reporter improve performance of the tetracycline switch. Transgenic Res. 8:371-381.
WRIGHT, F., 1990 The effective number of codons used in a gene. Gene 87:23-29.
ZHOU, J., W. J. LIU, S. W. PENG, X. Y. SUN, and I. FRAZER, 1999 Papillomavirus capsid protein expression level depends on the match between codon usage and tRNA availability. J. Virol. 73:4972-4982.
ZOLOTUKHIN, S., M. POTTER, W. W. HAUSWIRTH, J. GUY, and N. MUZYCZKA, 1996 A "humanized" green fluorescent protein cDNA adapted for high-level expression in mammalian cells. J. Virol. 70:4646-4654.












