Genetics, Vol. 149, 37-44, May 1998, Copyright © 1998

How Optimized Is the Translational Machinery in Escherichia coli, Salmonella typhimurium and Saccharomyces cerevisiae?

Xuhua Xia
Evolutionary Genetics Group, Department of Ecology and Biodiversity, The University of Hong Kong, Hong Kong, People's Republic of China

Corresponding author: Xuhua Xia, Evolutionary Genetics Group, Department of Ecology & Biodiversity, The University of Hong Kong, Pokfulam Road, Hong Kong, xxia{at}hkusua.hku.hk (E-mail).

Communicating editor: A. G. CLARK


*  ABSTRACT

The optimization of the translational machinery in cells requires the mutual adaptation of codon usage and tRNA concentration, and the adaptation of tRNA concentration to amino acid usage. Two predictions were derived based on a simple deterministic model of translation which assumes that elongation of the peptide chain is rate-limiting. The highest translational efficiency is achieved when the codon recognized by the most abundant tRNA reaches the maximum frequency. For each codon family, the tRNA concentration is optimally adapted to codon usage when the concentration of different tRNA species matches the square-root of the frequency of their corresponding synonymous codons. When tRNA concentration and codon usage are well adapted to each other, the optimal content of all tRNA species carrying the same amino acid should match the square-root of the frequency of the amino acid. These predictions are examined against empirical data from Escherichia coli, Salmonella typhimurium, and Saccharomyces cerevisiae.


SYNONYMOUS codon usage differs among different genomes (GRANTHAM et al. 1980, 1981; MORIYAMA and HARTL 1993; MARTIN 1995; XIA 1996), among different genes within the same genome (GOUY and GAUTIER 1982; IKEMURA 1985, 1992; SHARP and LI 1986, 1987; SHARP et al. 1988), and even among different segments of the same gene (AKASHI 1994). Three hypotheses have been proposed to account for this variation of synonymous codon usage (or various components of the variation): the mutation bias hypothesis (MARTIN 1995), the transcription-maximization hypothesis (XIA 1996) and the translational efficiency hypothesis (IKEMURA 1981; KIMURA 1983; ROBINSON et al. 1984; KURLAND 1987a,b; BULMER 1987, 1988, 1991).

Of these three hypotheses, the translational efficiency hypothesis (hereafter referred to as TEH) is the most general and has received the most empirical support. Stated verbally, the hypothesis holds that there is strong selection for an increased rate of protein synthesis, and that a coding strategy that increases the rate of translation initiation and peptide elongation (and consequently the rate of protein synthesis) is favored by natural selection. The hypothesis is supported by three independent lines of evidence. First, the frequency of codon usage is positively correlated with tRNA availability (IKEMURA 1981, 1982, 1985, 1992; GOUY and GAUTIER 1982). Second, the degree of codon usage bias is related to the level of gene expression, with highly expressed genes exhibiting greater codon bias than lowly expressed genes (BENNETZEN and HALL 1982; IKEMURA 1985; SHARP and DEVINE 1989; SHARP et al. 1988). Third, mRNA consisting of preferred codons is translated faster than mRNA artificially modified to contain rare codons (ROBINSON et al. 1984; SORENSEN et al. 1989).

The many published models of TEH fall into two classes: initiation models and elongation models. Initiation models assume that the initiation of translation is rate-limiting (e.g., LILJENSTRÖM and VON HEIJNE 1987; BULMER 1991; XIA 1996), whereas elongation models assume that the elongation of the peptide chain is rate-limiting (e.g., VARENNE et al. 1984; BULMER 1987). Empirical data and theoretical considerations suggest that both initiation and elongation are rate-limiting.

The model presented here is strictly a deterministic elongation model. I present it because previous elongation models have not been set out clearly and their expectations are often only vaguely specified, which has resulted in some confusion. For example, KIMURA (1983) assumed that the translational efficiency is maximized when the proportion of different synonymous codons matches exactly the proportion of isoaccepting tRNAs. The assumption is unwarranted: as shown later, the translational efficiency under perfect matching is the same as in the presumably less adaptive scenario in which different tRNA species are present in equal amounts and codon usage drifts freely in any direction.

Another reason for presenting the model is to relate amino acid usage to the availability of tRNA species carrying different amino acids. From an evolutionary point of view, one would intuitively expect an efficient translational machinery to have more tRNA carrying the more frequently used amino acids, but this intuition has been neither formally established nor rejected.

Below I present the elongation model, from which a few specific predictions concerning mutual adaptation between tRNA content and codon usage are derived. Also derived is a relationship between tRNA content and amino acid usage. Empirical data from Escherichia coli, Salmonella typhimurium and Saccharomyces cerevisiae were used to test the predictions.


*  THE ELONGATION MODEL, ITS PREDICTIONS, AND EMPIRICAL TESTS

Consider the time required to translate a single codon coding for amino acid i (AAi, i = 1, 2, ..., 20). Designate this codon as SCij (j = 1, 2, ..., ni, where ni is the number of synonymous codons for AAi). Let r be the rate of aminoacyl-tRNA diffusing to the A site of the ribosome during translation, Pi be the probability that the arriving aminoacyl-tRNA carries AAi ($\sum_{i=1}^{20} P_i = 1$), and pij be the conditional probability that the aminoacyl-tRNA recognizes the synonymous codon SCij, given that the tRNA carries AAi ($\sum_{j=1}^{n_i} p_{ij} = 1$ for each given i). Let tl be the time spent in linking the right amino acid to the elongating protein chain, and tr be the time spent in rejecting each wrong aminoacyl-tRNA. Now the total time spent in translating SCij is

$$t_{ij} = \frac{1}{r P_i p_{ij}} + t_l + t_r\left(\frac{1}{P_i p_{ij}} - 1\right) \qquad (1)$$
where the first term on the right-hand side of the equation is the time needed for an aminoacyl-tRNA carrying the right amino acid and the right cognate anti-codon to arrive at the A site of the ribosome and the third term represents time spent in rejecting all the wrong aminoacyl-tRNA prior to the arrival of the right aminoacyl-tRNA. Similar formulations can be found in VARENNE et al. (1984) and BULMER (1987). The total time (T) required to translate L codons (total elongation time) can be shown to be

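$$T = \sum_{i=1}^{20}\sum_{j=1}^{n_i} f_{ij}\,t_{ij} = L\,(t_l - t_r) + \left(\frac{1}{r} + t_r\right) Y \qquad (2)$$
where
$$Y = \sum_{i=1}^{20}\sum_{j=1}^{n_i}\frac{f_{ij}}{P_i p_{ij}} = \sum_{i=1}^{20}\frac{N_i}{P_i}\sum_{j=1}^{n_i}\frac{Q_{ij}}{p_{ij}}$$
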
The term fij is the frequency of synonymous codon j for amino acid i in the mRNA molecule ($\sum_{i=1}^{20}\sum_{j=1}^{n_i} f_{ij} = L$), Ni is the number of codons for amino acid i ($\sum_{i} N_i = L$; $\sum_{j=1}^{n_i} f_{ij} = N_i$), and Qij is the proportion of synonymous codon j among the codons for amino acid i in the mRNA molecule (Qij = fij/Ni). Note that Qij is a property of the mRNA whereas Pi and pij are properties of the tRNA pool, with Pi being the proportion of tRNA carrying AAi, and pij being the fraction of tRNA that recognizes synonymous codon j among all tRNA species that carry AAi.

Our objective is to find the condition, i.e., the relationship among Qij, Pi and pij, that minimizes T. Because tl, tr and L are not dependent on Qij, Pi and pij, they are treated as constants. Thus, minimizing T in Equation 2 is equivalent to minimizing Y. Specifically, we are interested in three relationships. First, given the relative availability of different tRNA (Pi and pij), find what pattern of codon usage (Qij) in the mRNA would minimize Y. Second, given the pattern of codon usage (Qij), find what values for Pi and pij would minimize Y. Third, given amino acid usage, find the distribution of Pi that would minimize Y. Intuitively, we would expect frequently used amino acids to correspond to large Pi values, but the exact relationship has not been derived, let alone tested against empirical evidence.
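The bookkeeping behind Equations 1 and 2 can be sketched numerically. The Python below is a minimal illustration, assuming the per-codon time is the mean wait 1/(r Pi pij), plus tl, plus tr times the expected number of wrong arrivals, 1/(Pi pij) - 1. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
def codon_time(P_i, p_ij, r, t_l, t_r):
    """Expected time to translate one codon SCij (assumed form of Equation 1)."""
    hit = P_i * p_ij                 # chance that an arriving aa-tRNA is the right one
    wait = 1.0 / (r * hit)           # mean wait until the right aa-tRNA arrives
    wrong = 1.0 / hit - 1.0          # mean number of wrong aa-tRNAs rejected first
    return wait + t_l + t_r * wrong


def y_term(f, P, p):
    """Y = sum over i, j of f_ij / (P_i * p_ij)."""
    return sum(f_ij / (P[i] * p[i][j])
               for i, row in enumerate(f) for j, f_ij in enumerate(row))


def total_time(f, P, p, r, t_l, t_r):
    """Total elongation time T: codon counts f_ij weighted by per-codon time."""
    return sum(f_ij * codon_time(P[i], p[i][j], r, t_l, t_r)
               for i, row in enumerate(f) for j, f_ij in enumerate(row))


# Hypothetical toy mRNA with two amino acids (not data from the paper).
f = [[6, 2], [4]]               # codon counts f_ij; L = 12 codons
P = [0.6, 0.4]                  # share of the tRNA pool carrying each amino acid
p = [[0.75, 0.25], [1.0]]       # isoacceptor shares within each amino acid
r, t_l, t_r = 10.0, 0.05, 0.01  # arbitrary rate and time constants

L = sum(sum(row) for row in f)
T = total_time(f, P, p, r, t_l, t_r)
Y = y_term(f, P, p)
# T is linear in Y, so minimizing T amounts to minimizing Y.
T_from_Y = L * (t_l - t_r) + (1.0 / r + t_r) * Y
```

Under these assumptions `T` and `T_from_Y` agree, which is why the rest of the argument can work with Y alone.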

Adaptation of codon usage to tRNA content:
Suppose that an mRNA molecule specifies N residues of the same amino acid with n synonymous codons, and that the associated frequency distribution of synonymous codons is Qj ($\sum_j Q_j = 1$). For simplicity, we assume that there are also n types of tRNA species for the amino acid, with each type recognizing just one of the n synonymous codons. The proportion of the n types of tRNA species is pj ($\sum_j p_j = 1$), and P is the proportion of the total tRNA pool that carries this amino acid (Pi in Equation 2). Now we have Y (which is the term to be minimized) below:

$$Y = \frac{N}{P}\sum_{j=1}^{n}\frac{Q_j}{p_j} \qquad (3)$$

First consider what values Qj should take when pj = 1/n for all j. One might intuitively think that, to make full use of the equal availability of the n types of tRNA, Qj should match pj and should all be equal to 1/n. This is false. When pj values are equal, Y is equal to nN/P no matter what value Qj takes, as long as Qj values sum to 1. Thus, Qj is a neutral character when pj values are all equal. I reiterate this point because some confusion has been introduced by KIMURA (1983), who wrongly assumed that the highest translational efficiency is achieved when the relative frequencies of synonymous codons exactly match those of the cognate tRNAs.

When pj values are not equal, then the smallest Y is achieved when the codon recognized by the most abundant tRNA becomes fixed, with the consequent loss of other synonymous codons. To see this more clearly, we re-write Equation 3 as follows:

$$Y = \frac{N}{P}\cdot\frac{\sum_{j=1}^{n} Q_j \prod_{k \neq j} p_k}{\prod_{j=1}^{n} p_j} \qquad (4)$$

If p1 is the largest of all pj values, then the first product term ($\prod_{k \neq 1} p_k$), i.e., the one associated with Q1, in the numerator of Equation 4 is the smallest of all the product terms. Minimization of Y in Equation 4 therefore requires that Q1 equal 1 and that all other Qj values equal zero. This means that whenever the availability of different tRNA species (pj) for an amino acid is different, the codon usage of this amino acid should evolve towards increasing the frequency of the synonymous codon that is recognized by the most abundant cognate tRNA species. The minimum of Y achievable through adaptation of codon usage to tRNA content is

$$Y_{\min} = \frac{N}{P\,p_M} \qquad (5)$$
where pM designates the most abundant tRNA species for the amino acid. Ymin reaches its minimum value when pM = 1, which requires not only the adaptation of codon usage to tRNA content, but also adaptation of tRNA content to extremely biased codon usage.
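A short numerical sketch may make the two claims above concrete: codon usage is neutral when tRNA proportions are equal, and fixation of the codon read by the most abundant tRNA gives the smallest Y otherwise. It assumes Y for a single amino acid has the form (N/P) times the sum of Qj/pj; the function name and the proportions are hypothetical.

```python
def y_single(Q, p, N=1.0, P=1.0):
    """Y for one amino acid: (N / P) * sum_j Q_j / p_j (Equation 3 as assumed here)."""
    return (N / P) * sum(q / pj for q, pj in zip(Q, p))


equal_p = [1 / 3, 1 / 3, 1 / 3]
skewed_p = [0.6, 0.3, 0.1]          # hypothetical tRNA proportions

# With equal tRNA proportions, any codon usage gives Y = n * N / P.
y_equal_a = y_single([0.2, 0.3, 0.5], equal_p)
y_equal_b = y_single([1.0, 0.0, 0.0], equal_p)

# With unequal proportions, matching usage (Q = p) only recovers the baseline n,
# whereas fixing the codon read by the most abundant tRNA gives N / (P * p_M).
y_matched = y_single(skewed_p, skewed_p)
y_fixed = y_single([1.0, 0.0, 0.0], skewed_p)
```

Here `y_matched` equals the baseline value of 3, while `y_fixed` is 1/0.6, so exact matching of codon usage to tRNA proportions buys nothing over the equal-tRNA case.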

For the special case with n = 2, Y in Equation 3 can be written as

$$Y = \frac{N}{P}\left(\frac{Q_1}{p_1} + \frac{1 - Q_1}{1 - p_1}\right) \qquad (6)$$

The term within the parentheses is plotted against p1 and Q1 (Figure 1). Two conclusions can be drawn. First, when pj values are all equal to 1/n, i.e., when p1 = 0.5 in Figure 1 for n = 2, then Qj can take any value between 0 and 1 without affecting translational efficiency, and Y is relatively small. We will call this condition with equal pj values the baseline condition. Second, for unequal pj values, i.e., for p1 ≠ 0.5 in Figure 1, Y values will be larger than that in the baseline condition whenever Qj values are smaller than pj values for pj > 1/n, e.g., when Q1 = 0.8 and p1 = 0.9 in Figure 1, or larger than pj values for pj < 1/n, e.g., when Q1 = 0.9 and p1 = 0.1 in Figure 1, in which case the reduction in translational efficiency, i.e., the increase in Y, is especially large (Figure 1). Y will be the same as that in the baseline condition when Qj exactly matches pj, e.g., when Q1 = p1 in Figure 1. The baseline condition therefore seems to guarantee a relatively small Y value over a wide fluctuation of Qj values. Y will be smaller than the baseline condition only when Qj values are larger than pj values for pj > 1/n, e.g., when Q1 = 0.9 and p1 = 0.8 in Figure 1, or smaller than pj values for pj < 1/n, e.g., when Q1 = 0.1 and p1 = 0.2 in Figure 1.



 
Figure 1. —Change in translational time in relation to Q1 (the proportion of codon 1 in a two-codon family) and p1 (the proportion of tRNA species recognizing codon 1). Y' is the term within the parenthesis in Equation 6. The bolded plane perpendicular to Y' represents the baseline condition, i.e., the Y' value for p1 = 0.5. The downward arrows designate areas where Y' is smaller than it is in the baseline condition.

We have now reached a specific and intuitively appealing prediction, that codon-usage bias should be more extreme than the bias in tRNA content. If pj is larger than 1/n, then Qj should be larger than pj; if pj is smaller than 1/n, then Qj should be smaller than pj. If this is not the case, then the translational efficiency is lower than that for the baseline condition.
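The worked examples quoted for Figure 1 can be checked directly. This is a minimal sketch, assuming the bracketed term of Equation 6 is Q1/p1 + (1 - Q1)/(1 - p1); the function name `y_prime` is hypothetical.

```python
def y_prime(Q1, p1):
    """Bracketed term of Equation 6 for a two-codon family."""
    return Q1 / p1 + (1.0 - Q1) / (1.0 - p1)


BASELINE = y_prime(0.5, 0.5)        # equals 2 for any Q1 when p1 = 0.5

# The four (Q1, p1) pairs quoted in the text for Figure 1.
examples = {
    "Q1=0.8, p1=0.9": y_prime(0.8, 0.9),
    "Q1=0.9, p1=0.1": y_prime(0.9, 0.1),
    "Q1=0.9, p1=0.8": y_prime(0.9, 0.8),
    "Q1=0.1, p1=0.2": y_prime(0.1, 0.2),
}
for label, value in examples.items():
    verdict = "slower" if value > BASELINE else "faster"
    print(f"{label}: Y' = {value:.3f} ({verdict} than baseline {BASELINE:.1f})")
```

The first two pairs, where codon bias is less extreme than tRNA bias or points the wrong way, fall above the baseline; the last two, where codon bias exceeds tRNA bias, fall below it.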

An empirical test of this prediction has several requirements. First, we need codon families in which a codon will not be recognized by both the common and the rare tRNA, otherwise Qj would be impossible to calculate in any meaningful way. Among the 23 codon families, i.e., when we split each of the six-member codon families for Leu, Ser, and Arg into two, only six meet this criterion (Table 1). Second, we need codon usage of genes that are highly expressed, otherwise we should not expect any mutual adaptation between tRNA content and codon usage bias. IKEMURA (1992) compiled codon usage of presumably highly expressed genes in E. coli, S. typhimurium, and S. cerevisiae (five, three and five genes, respectively), which are used to generate Table 1.


 

 
Table 1. Adaptation of codon usage to tRNA content in E. coli, S. typhimurium and S. cerevisiae

For all three species, the QM values are always larger than the pM values (Table 1). This guarantees that the resulting Y is smaller than that in the baseline condition. The adaptation of codon usage to tRNA content in the highly expressed genes in the three unicellular species is almost perfect (the optimal is when QM = 1), suggesting that the effect of mutation on codon usage bias must be very weak for these genes. However, if we ignore the expressivity of the genes and pool the codon usage of all genes in the gene bank, then most QM values are smaller than the pM values (data not shown), suggesting that, for most genes, the translational efficiency is lower than that in the baseline condition.

Adaptation of tRNA to codon usage:
When Qj values are fixed, e.g., when codon bias is maintained by mutation bias, the values that pj should take to minimize Y can be found as follows. We first re-write Y in Equation 3:

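$$Y = \frac{N}{P}\left(\sum_{j=1}^{n-1}\frac{Q_j}{p_j} + \frac{Q_n}{1 - \sum_{j=1}^{n-1} p_j}\right) \qquad (7)$$

The condition that minimizes Y is found by taking partial derivatives of Y with respect to pj, and setting the partial derivatives to zero. This yields

$$\frac{Q_1}{p_1^{2}} = \frac{Q_2}{p_2^{2}} = \cdots = \frac{Q_n}{p_n^{2}} \qquad (8)$$

Expressed in another way, the condition implies

$$\frac{p_j}{p_k} = \sqrt{\frac{Q_j}{Q_k}} \qquad (9)$$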

i.e., the bias in tRNA availability for an amino acid should not be as dramatic as that in codon usage. In other words, selection driving tRNA adaptation to codon usage guarantees that tRNA bias will not be as extreme as codon bias. Results similar to Equation 9 have been derived before (BULMER 1987).
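The square-root rule can be illustrated with a small calculation. The sketch below assumes that, for fixed codon usage Qj, the quantity to minimize is the sum of Qj/pj subject to the pj summing to 1, and that the optimum is pj proportional to the square root of Qj. The codon usage values and function names are hypothetical.

```python
import math


def cost(Q, p):
    """Sum of Q_j / p_j, the part of Y that depends on the tRNA proportions."""
    return sum(q / pj for q, pj in zip(Q, p))


def optimal_p(Q):
    """tRNA proportions proportional to sqrt(Q_j), normalized to sum to 1."""
    roots = [math.sqrt(q) for q in Q]
    total = sum(roots)
    return [x / total for x in roots]


Q = [0.64, 0.36]                    # hypothetical fixed codon usage
p_opt = optimal_p(Q)                # sqrt gives 0.8 : 0.6, i.e. about 0.571 : 0.429

best = cost(Q, p_opt)
matched = cost(Q, Q)                # p copies Q exactly
nudged = cost(Q, [0.60, 0.40])      # a nearby alternative
```

In this example the optimal share of the major tRNA (about 0.571) is smaller than the share of its codon (0.64), so the tRNA bias is less extreme than the codon bias, and both `matched` and `nudged` cost more than `best`.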

The relationship between p and Q in Equation 9 can also be written as $p = a\sqrt{Q}$, where a is a constant. IKEMURA (1992) plotted an equivalent measure of Q versus an equivalent measure of p (Figure 3 in IKEMURA 1992) for a highly expressed gene in E. coli (groEL), and the result confirmed the predicted quadratic relationship between p and Q (Q proportional to the square of p).




 
Figure 2. —The availability of tRNA carrying a certain amino acid increases linearly with the square-root of the frequency of the amino acid in (A) E. coli, (B) S. typhimurium and (C) S. cerevisiae. The 20 Ni values are from Table 1 in IKEMURA (1992). Corresponding Pi values are from Table 2 in IKEMURA (1992) for the two prokaryotic species and Table 3 in IKEMURA (1982) for the yeast. Ni values were presented as the number per 1000 codons.



 
Figure 3. —The slightly negative relationship between pM (the proportion of the most abundant tRNA among all tRNA species carrying the same amino acids) and Ncodon (the number of codons for each amino acid). (A) E. coli, (B) S. typhimurium.

We should now note that the baseline condition depicted in Figure 1 is not stable because, with all pj values equal to 1/n, Qj values can drift to any value without affecting translational efficiency (Equation 3 and Figure 1). When Qj values differ from 1/n, there will then be selection favoring adaptation of tRNA content to codon usage (Equation 9), which would drive pj values away from 1/n. Note that this selection pressure will not drive pj values more extreme than Qj values (Equation 9), otherwise the selection would result in a less efficient translational machinery. The resulting unequal pj values, in turn, create selection pressure for codon usage adaptation (Equation 5).

Evolution of tRNA in response to amino acid usage:
To my knowledge, none of the TEH models linked tRNA availability to amino acid usage. The 20 amino acids are not used equally in proteins, and we intuitively would expect those frequently used amino acids to be carried by more tRNA than those rarely used amino acids. To better visualize the effect of amino acid usage on Pi, which is the proportion of tRNA species carrying amino acid i in the total tRNA pool, we write Y in Equation 2 in the expanded form:

$$Y = \sum_{i=1}^{20}\frac{N_i}{P_i}\sum_{j=1}^{n_i}\frac{Q_{ij}}{p_{ij}} \qquad (10)$$
where Ni is the total number of codons for amino acid i. When codon usage is perfectly adapted to tRNA availability for each amino acid, which is approximately true based on empirical data in Table 1, Y becomes

$$Y = \sum_{i=1}^{20}\frac{N_i}{P_i\,p_{Mi}} \qquad (11)$$
according to Equation 5. The minimization of Y requires

$$\frac{P_i}{P_j} = \sqrt{\frac{N_i\,p_{Mj}}{N_j\,p_{Mi}}} \qquad (12)$$
where Pi and Pj designate the proportion of tRNA carrying amino acids i and j, respectively; and Ni and Nj are the number of amino acids i and j, respectively. When tRNA concentration for each amino acid is well adapted to codon usage, all pM values approach 1 and become nearly equal, so that Equation 12 becomes

$$\frac{P_i}{P_j} = \sqrt{\frac{N_i}{N_j}} \qquad (13)$$

This relationship has not been recognized previously.
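The minimization step behind the square-root relationship can be sketched with a Lagrange multiplier. This is an illustrative derivation rather than the paper's own working: it assumes Y takes the form $\sum_i N_i/(P_i\,p_{Mi})$ once codon usage is fully adapted, with the constraint that the Pi sum to 1, and the multiplier $\lambda$ is a symbol introduced here.

```latex
\begin{align*}
\mathcal{L} &= \sum_{i=1}^{20} \frac{N_i}{P_i\,p_{Mi}}
              + \lambda\Bigl(\sum_{i=1}^{20} P_i - 1\Bigr) \\
\frac{\partial \mathcal{L}}{\partial P_i}
  &= -\frac{N_i}{P_i^{2}\,p_{Mi}} + \lambda = 0
  \quad\Longrightarrow\quad
  P_i = \sqrt{\frac{N_i}{\lambda\,p_{Mi}}} \\
\frac{P_i}{P_j} &= \sqrt{\frac{N_i\,p_{Mj}}{N_j\,p_{Mi}}}
  \;\xrightarrow{\;p_{Mi}\,\approx\,p_{Mj}\;}\;
  \sqrt{\frac{N_i}{N_j}}
\end{align*}
```

The multiplier cancels in the ratio, which is why only relative amino acid usage and relative pM values matter.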

Empirical data for testing the above prediction are readily available. The Pi values can be derived from data in Table 2 in IKEMURA (1992) for E. coli and S. typhimurium, and from Table 3 in IKEMURA (1982) for S. cerevisiae (whose tRNA data are incomplete). IKEMURA (1992) also compiled the codon usage of 937 E. coli genes, 130 S. typhimurium genes, and 581 S. cerevisiae genes, from which one can derive Ni values in Equation 13. The 20 pairs of Pi and $\sqrt{N_i}$ values are plotted in Figure 2, A–C, for E. coli, S. typhimurium, and S. cerevisiae, respectively. The fit is quite remarkable.

Such a seemingly straightforward interpretation, however, has a major difficulty. The argument requires that all pM values be either approximately one (which should hold only for highly expressed genes), or approximately equal (which we have no reason to expect), so as to cancel each other out. Only a few loci are deemed highly expressed, yet 937 loci from E. coli, 130 from S. typhimurium and 581 from S. cerevisiae were used for Figure 2. Why should lowly expressed genes contribute to the linear relationship? The simplifying assumption, that pM ≈ 1, seems unjustified. It is therefore necessary to work out the relationship between P and N when the assumption of pM ≈ 1 does not hold.

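I propose the following equation, which is more general than Equation 13 and does not require pM ≈ 1, to describe the relationship between P and N:

$$P_i = a N_i^{\,b} \qquad (14)$$

If the parameter b equals 1/2, then Equation 14 reduces to Equation 13. Given Equation 14, we have

$$\frac{P_i}{P_j} = \left(\frac{N_i}{N_j}\right)^{b} = \sqrt{\frac{N_i\,p_{Mj}}{N_j\,p_{Mi}}} \qquad (15)$$

After some algebraic manipulation, we obtain

$$b = \frac{1 + Z}{2} \qquad (16)$$
where

$$Z = \frac{\ln(p_{Mj}/p_{Mi})}{\ln(N_i/N_j)} \qquad (17)$$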

As expected, the relationship between P and N depends on the magnitude of Z, which in turn depends on the relationship between pM and N. If pM is independent of N and approaches 1, then Z = 0, and $P = aN^{1/2}$, which is Equation 13. If pM and N are positively correlated, then Z < 0. If Z lies within (-1, 0), then P will increase with N at a decreasing rate. If Z = -1, then there will be no relationship between P and N, which we know to be false from Figure 2. If Z < -1, then P will decrease with N at a decreasing rate, which we also know to be false from Figure 2. If pM and N are negatively correlated, then Z > 0. If Z is between 0 and 1, then P will increase with N at a decreasing rate. If Z = 1, then P will increase linearly with N, rather than with the square-root of N as predicted from Equation 13. If Z > 1, then P will increase with N at an increasing rate.

There seems to be a slightly negative relationship between pM and N for data from the two prokaryotic species (Figure 3), although it is not statistically significant. It is not possible to obtain good pM values for S. cerevisiae because some of its tRNA species remain unquantified. Based on the relationship between pM and N for the two prokaryotic species, we expect Z in Equations 16 and 17 to be slightly larger than 0. Consequently, the coefficient b in $P = aN^{b}$ should be slightly larger than 1/2. The b values that provide the best fit to the data points in Figure 2, A–C, are 1.10, 0.99 and 0.88, respectively, i.e., Z = 1.20, 0.98 and 0.76, respectively. Equation 14, however, does not fit the empirical data significantly better than Equation 13.
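One way to estimate b from paired (N, P) values is an ordinary least-squares fit on the log-log scale. The text does not say how the best-fit b values were obtained, so the fitting method below is an assumption, as is the relation Z = 2b - 1, which is inferred from the reported pairs of b and Z. The data points are synthetic and the function names are hypothetical.

```python
import math


def fit_power_law(N, P):
    """Least-squares fit of log P = log a + b * log N; returns (a, b)."""
    x = [math.log(v) for v in N]
    y = [math.log(v) for v in P]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b


def z_from_b(b):
    """Invert b = (1 + Z) / 2."""
    return 2.0 * b - 1.0


# Synthetic check: points generated exactly from P = 0.01 * sqrt(N).
N_demo = [10, 20, 40, 80, 160]
P_demo = [0.01 * math.sqrt(n) for n in N_demo]
a_hat, b_hat = fit_power_law(N_demo, P_demo)

# Z values implied by the b estimates reported in the text.
z_reported = [round(z_from_b(b), 2) for b in (1.10, 0.99, 0.88)]
```

On the synthetic points the fit recovers b = 1/2 (Z = 0), the case in which the square-root relationship holds exactly.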

Translational efficiency and translational accuracy:
Translational accuracy has recently been suggested to be an important factor related to codon usage bias (BULMER 1991; AKASHI 1994). This proposal received empirical substantiation from a study of protein-coding genes in Drosophila that revealed differences in codon usage among different regions of the same gene. For example, gene regions of greater amino acid conservation tend to exhibit more dramatic codon usage bias than do regions of lower amino acid conservation (AKASHI 1994).

Translational efficiency and translational accuracy are inextricably coupled in their effect on codon usage bias. To reduce translational error, one needs to reduce the number of wrong aminoacyl-tRNA species that have to be rejected before the arrival of the right aminoacyl-tRNA. Equation 1 shows this number to be

$$\frac{1}{P_i p_{ij}} - 1 \qquad (18)$$
for each codon translated. To translate an mRNA with L codons, the total number of wrong aminoacyl-tRNA species that translational machinery needs to reject is

$$\sum_{i=1}^{20}\sum_{j=1}^{n_i} f_{ij}\left(\frac{1}{P_i p_{ij}} - 1\right) = Y - L \qquad (19)$$
where Y is exactly the same as the Y in Equation 2. To minimize the number of translational errors, we minimize Y, which leads to exactly the same predictions that we have already attributed to TEH. The rationale for separating the effect of maximizing translational accuracy on codon usage bias from that of maximizing translational efficiency is discussed later.
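The per-codon rejection count can be checked by summing the underlying distribution. This sketch assumes that arrivals are independent and that each arriving aminoacyl-tRNA is the right one with probability Pi pij, so that the number of wrong arrivals before the first right one is geometric with mean 1/(Pi pij) - 1. The function names and the example probabilities are hypothetical.

```python
def expected_rejections(P_i, p_ij, terms=5000):
    """Mean count of wrong aa-tRNAs before the first right one, summed term by term."""
    hit = P_i * p_ij
    # P(exactly k wrong arrivals, then a right one) = (1 - hit)**k * hit
    return sum(k * (1.0 - hit) ** k * hit for k in range(terms))


def closed_form(P_i, p_ij):
    """The same mean in closed form: 1 / (P_i * p_ij) - 1."""
    return 1.0 / (P_i * p_ij) - 1.0


series = expected_rejections(0.5, 0.5)   # hit probability 0.25
exact = closed_form(0.5, 0.5)
```

Summing this quantity over all L codons gives Y - L, so reducing the number of rejections and reducing elongation time both come down to minimizing the same Y.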


*  DISCUSSION

Validity of the model:
Protein synthesis is a multi-step process including initiation of transcription, elongation of mRNA chain, initiation of translation, and elongation of the peptide chain. Opinions differ concerning which step might be rate-limiting. XIA 1996 Down argued that the rate of protein synthesis depends much on the rate of initiation of translation. He reasoned that the rate of initiation depends on the encountering rate between ribosomes and mRNA, which in turn depends on the concentration of ribosomes and mRNA. Thus, patterns of codon usage that increase transcriptional efficiency should increase mRNA concentration, which in turn would increase the initiation rate and the rate of protein synthesis. He presented a model predicting that the most frequently used ribonucleotide at the third codon sites in mRNA molecules should be the same as the most abundant ribonucleotide in the cellular matrix where mRNA is transcribed. This prediction is supported by several lines of evidence. That the initiation step is rate-limiting has also been suggested by other studies, e.g., LILJENSTRÖM and VON HEIJNE 1987; BULMER 1991.

While not denying the possibility that initiation of translation may be rate-limiting, the model presented here explicitly assumes that the elongation of the peptide chain is rate-limiting. There is a substantial amount of empirical evidence supporting this assumption (PEDERSEN 1984; BONEKAMP et al. 1985; BONEKAMP and JENSEN 1988; WILLIAMS et al. 1988). In particular, mRNA consisting of preferred codons is translated faster than mRNA artificially modified to contain rare codons (ROBINSON et al. 1984; SORENSEN et al. 1989). That elongation is a rate-limiting process has also been suggested on the basis of theoretical considerations (LILJENSTRÖM and VON HEIJNE 1987).

BULMER (1991), however, argued that initiation rather than elongation is rate-limiting. He reasoned that, for elongation to be rate-limiting, there would have to be so many ribosomes that they bind to all free mRNA molecules as soon as the latter become available for binding. Since ribosomes form the largest part of the protein translational machinery (and are therefore likely to be costly and time-consuming to make), it would be inefficient to saturate the system with them. He summarized empirical evidence that seems to suggest that ribosomes are far from saturating the system. For example, there is an average of 225 bases per ribosome in a polysome (INGRAHAM et al. 1983), and each ribosome covers only about 30 bases (KOZAK 1983). BULMER (1991) interprets this to mean that it is very rare for more than one ribosome to compete for the free binding site of the mRNA. Thus, there is no need for the ribosome to travel down the length of the mRNA in a hurry, i.e., there is little benefit associated with more efficient elongation.

There are two weaknesses in such arguments. First, KOZAK's (1983) study does not necessarily mean that a ribosome needs to clear only 30 bases to free the initiation site for the binding of the next ribosome. Second, even if the ribosome needs to move only 30 bases to free the initiation site, there is still some probability that more than one ribosome will arrive at the free initiation site. Only one of the arriving ribosomes would have a chance to bind to the initiation site, while the rest would have to be turned away. An increased elongation rate would reduce the occurrence of such events.

In addition to the assumption that elongation is rate-limiting, the model also assumes that either r, i.e., the rate of aminoacyl-tRNA diffusing to the A site of the ribosome during translation, is not extremely large, or tr, i.e., the time spent in rejecting each wrong aminoacyl-tRNA, is not negligibly small. These seem to be reasonable assumptions, although BILGIN et al. (1988) suggested that tr might indeed be very small.

Relative importance of translational efficiency and accuracy on codon usage bias:
Although the model of maximizing translational accuracy and that of maximizing translational efficiency produce the same set of predictions, it is still possible to separate the effect of maximizing translational accuracy on codon usage bias from that of maximizing translational efficiency. For example, a protein gene could have arginine codons in different domains of different functional importance. Being in the same protein gene, these arginine codons are subject to the same selection pressure exerted by maximizing translational efficiency, and consequently should have the same codon usage bias according to the model of maximizing translational efficiency. However, those arginine codons located in the functionally important domains are subject to greater selection pressure exerted by maximizing translational accuracy than those located in the functionally unimportant domains. Consequently, the former codons will be more biased towards using the optimal codon than the latter. Some preliminary findings along this line of reasoning have already been reported (AKASHI 1994).

The reasoning above leaves one question unanswered. Why is it necessary to invoke translational efficiency to account for codon usage bias? Can't we attribute all the codon usage bias to the effect of maximizing translational accuracy and forget about translational efficiency? The answer is that the effect of maximizing translational accuracy is insufficient to account for the observed codon usage bias. For example, highly expressed genes exhibit greater codon bias than lowly expressed genes, but the former are not necessarily more conservative than the latter (greater conservativeness presumably implies greater demand for accuracy). We can rank protein genes according to their conservativeness, or rank them according to their expressivity, and find out which ranking explains codon usage bias better. Preliminary results (unpublished) suggest that the expressivity is the more important of the two.

It should be noted that the within-gene variation in codon usage bias found in Drosophila (AKASHI 1994) does not seem to be general. For example, it is not observed in E. coli and S. typhimurium (HARTL et al. 1994). More empirical studies are needed to assess the effect of maximizing translational accuracy on codon usage bias.

How optimized is the translational machinery?
From our results, we can say that codon usage in those highly expressed genes is almost as optimal as possible, with the QM values larger than pM values and almost equal to one. However, for the majority of genes, the QM values are smaller than pM (data not shown), which implies that the translational efficiency for the majority of the genes is less than in the seemingly less adaptive scenario when different tRNA species are present in equal amounts and codon usage drifts freely in any direction.

We should note that selection for codon adaptation to tRNA content operates on individual genes, whereas selection for the adaptation of tRNA content to codon usage operates at the genome level. Thus, although Equation 5 suggests that the optimal condition is when both the most abundant tRNA and its cognate codon become fixed, Equation 9 shows that selection for tRNA adaptation to codon usage will always lag behind codon usage bias.

The most remarkable feature from the model is the prediction relating amino acid usage (Ni) to tRNA content (Pi), which is strongly supported by empirical evidence (Figure 2). A more extensive study is underway to confirm the generality of the relationship.


*  ACKNOWLEDGMENTS

I thank L. CHOY and A. K. F. LO for assistance and members of University of Hong Kong Ecology and Evolutionary Genetics Group (Y. SADOVY, in particular) for discussion and comments. Part of the work was done when I was at the Museum of Natural Science of Louisiana State University. I benefited greatly from referees' comments. This project is supported by a Research Grant Council grant (HKU 7233/98M) from Hong Kong Government and a Committee on Research and Conference Grants grant (335/023/0022) from The University of Hong Kong to X. XIA, and a National Science Foundation grant (DEB95-27583) to MARK S. HAFNER and X. XIA.

Manuscript received September 23, 1997; Accepted for publication January 21, 1998.


*  LITERATURE CITED

AKASHI, H., 1994  Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 136:927-935[Abstract].

BENNETZEN, J. L. and B. D. HALL, 1982  Codon selection in yeast. J. Biol. Chem. 257:3026-3031[Abstract/Free Full Text].

BILGIN, N., M. EHRENBERG, and C. KURLAND, 1988  Is translation inhibited by noncognate ternary complexes? FEBS-LETT. 233:95-99[Medline].

BONEKAMP, F. and K. F. JENSEN, 1988  The AGG codon is translated slowly in E. coli even at very low expression levels. Nucleic Acids Res. 16:3013.

BONEKAMP, F., H. D. ANDERSEN, T. CHRISTENSEN, and K. F. JENSEN, 1985  Codon-defined ribosomal pausing in Escherichia coli detected by using the pyrE attenuator to probe the coupling between transcription and translation. Nucleic Acids Res. 13:4113-4123.

BULMER, M., 1987  Coevolution of codon usage and transfer RNA abundance. Nature 325:728-730.

BULMER, M., 1988  Codon usage and intragenic position. J. Theor. Biol. 133:67-71.

BULMER, M., 1991  The selection mutation drift theory of synonymous codon usage. Genetics 129:897-907.

GOUY, M. and C. GAUTIER, 1982  Codon usage in bacteria: correlation with gene expressivity. Nucleic Acids Res. 10:7055-7064.

GRANTHAM, R., C. GAUTIER, M. GOUY, R. MERCIER, and A. PAVE, 1980  Codon catalog usage and the genome hypothesis. Nucleic Acids Res. 8:49-79.

GRANTHAM, R., C. GAUTIER, M. GOUY, M. JACOBZONE, and R. MERCIER, 1981  Codon catalog usage is a genome strategy modulated for gene expressivity. Nucleic Acids Res. 9:43-79.

HARTL, D. L., E. N. MORIYAMA, and S. A. SAWYER, 1994  Selection intensity for codon bias. Genetics 138:227-234.

IKEMURA, T., 1981  Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system. J. Mol. Biol. 151:389-409.

IKEMURA, T., 1982  Correlation between the abundance of yeast transfer RNAs and the occurrence of the respective codons in protein genes. J. Mol. Biol. 158:573-597.

IKEMURA, T., 1985  Codon usage and tRNA content in unicellular and multicellular organisms. Mol. Biol. Evol. 2:13-34.

IKEMURA, T., 1992 Correlation between codon usage and tRNA content in microorganisms, pp. 87–111 in Transfer RNA in Protein Synthesis, edited by D. L. HATFIELD, B. J. LEE, and R. M. PIRTLE. CRC Press, Boca Raton, FL.

INGRAHAM, J. L., O. MAALØE and F. C. NEIDHARDT, 1983 Growth of the bacterial cell. Sinauer, Sunderland, Mass.

KIMURA, M., 1983 The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, UK.

KOZAK, M., 1983  Comparison of initiation of protein synthesis in procaryotes, eucaryotes, and organelles. Microbiol. Rev. 47:1-43.

KURLAND, C. G., 1987a  Strategies for efficiency and accuracy in gene expression. 1. The major codon preference: a growth optimization strategy. Trends Biochem. Sci. 12:126-128.

KURLAND, C. G., 1987b  Strategies for efficiency and accuracy in gene expression. 2. Growth optimized ribosomes. Trends Biochem. Sci. 12:169-171.

LILJENSTRÖM, H. and G. VON HEIJNE, 1987  Translation rate modification by preferential codon usage: intragenic position effects. J. Theor. Biol. 124:43-55.

MARTIN, A. P., 1995  Metabolic rate and directional nucleotide substitution in animal mitochondrial DNA. Mol. Biol. Evol. 12:1124-1131.

MORIYAMA, E. N. and D. L. HARTL, 1993  Codon usage bias and base composition of nuclear genes in Drosophila. Genetics 134:847-858.

PEDERSEN, S., 1984  Escherichia coli ribosomes translate in vivo with variable rate. EMBO J. 3:2895.

ROBINSON, M., R. LILLEY, S. LITTLE, J. S. EMTAGE, and G. YAMAMOTO et al., 1984  Codon usage can affect efficiency of translation of genes in Escherichia coli. Nucleic Acids Res. 12:6663-6671.

SHARP, P. M. and K. M. DEVINE, 1989  Codon usage and gene expression level in Dictyostelium discoideum: highly expressed genes do "prefer" optimal codons. Nucleic Acids Res. 17:5029-5038.

SHARP, P. M. and W. H. LI, 1986  An evolutionary perspective on synonymous codon usage in unicellular organisms. J. Mol. Evol. 24:28-38.

SHARP, P. M. and W. H. LI, 1987  The codon adaptation index: a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 15:1281-1295.

SHARP, P. M., E. COWE, D. G. HIGGINS, D. C. SHIELDS, and K. H. WOLFE et al., 1988  Codon usage patterns in Escherichia coli, Bacillus subtilis, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Drosophila melanogaster and Homo sapiens: a review of the considerable within-species diversity. Nucleic Acids Res. 16:8207-8211.

SORENSEN, M. A., C. G. KURLAND, and S. PEDERSEN, 1989  Codon usage determines translation rate in Escherichia coli. J. Mol. Biol. 207:365-377.

VARENNE, S., J. BUC, R. LLOUBES, and C. LAZDUNSKI, 1984  Translation is a nonuniform process: effect of tRNA availability on the rate of elongation of nascent polypeptide chains. J. Mol. Biol. 180:549-576.

WILLIAMS, D. P., D. RIGIER, D. AKIYOSHI, F. GENBAUFFE, and J. R. MURPHY, 1988  Design, synthesis and expression of a human interleukin-2 gene incorporating the codon usage bias found in highly expressed Escherichia coli genes. Nucleic Acids Res. 16:10453-10467.

XIA, X., 1996  Maximizing transcription efficiency causes codon usage bias. Genetics 144:1309-1320.



