## Abstract

Recent work has shown that expression level is the main predictor of a gene's evolutionary rate and that more highly expressed genes evolve slower. A possible explanation for this observation is selection for proteins that fold properly despite mistranslation, in short selection for translational robustness. Translational robustness leads to the somewhat paradoxical prediction that highly expressed genes are extremely tolerant to missense substitutions but nevertheless evolve very slowly. Here, we study a simple theoretical model of translational robustness that allows us to gain analytic insight into how this paradoxical behavior arises.

THE increasing availability of whole-genome sequences from many different species has revealed a surprising fact: Different genes within the same organism evolve at dramatically different rates. For example, the evolutionary rates of the fastest- and slowest-evolving genes in *Saccharomyces cerevisiae* are separated by three orders of magnitude (Drummond *et al.* 2005). Because the dominant force shaping genomewide patterns of evolutionary rate is most likely stabilizing selection, the evolutionary rates of genes should correlate with quantities that measure how important or otherwise constrained a gene is (Kimura 1983; Ohta 1992). A wide array of such quantities have been proposed, shown to correlate with evolutionary rate, and subsequently disputed. Examples include a gene's dispensability or essentiality (Hurst and Smith 1999; Hirsh and Fraser 2001; Jordan *et al.* 2002; Pal *et al.* 2003; Zhang and He 2005; Wall *et al.* 2005), its number of interaction partners (Fraser *et al.* 2002; Bloom and Adami 2003; Jordan *et al.* 2003; Hahn *et al.* 2004; Agrafioti *et al.* 2005), its length (Marais and Duret 2001), or its centrality in the protein interaction network (Hahn and Kern 2005). However, it seems that most importantly, the expression level (Pal *et al.* 2001; Rocha and Danchin 2004), or perhaps more accurately the frequency of translation events (Drummond *et al.* 2005, 2006), influences evolutionary rate.

Drummond *et al.* (2005) have recently introduced a theory for why highly expressed genes evolve slowly. Translation is error prone, and inactivated or misfolded proteins resulting from mistranslation impose substantial costs on the cell (Bucciantini *et al.* 2002), costs that increase with expression level. One way in which the cost associated with a highly expressed gene is reduced is translational accuracy (Akashi 1994, 2001), whereby the gene is encoded with optimal codons whose translation is less error prone than the translation of other codons. Translational accuracy can explain why the rate of synonymous evolution *d*_{S} is correlated with expression level, codon adaptation index, or protein abundance. However, it cannot explain why the rate of nonsynonymous evolution *d*_{N} shows an even stronger correlation with these quantities (Drummond *et al.* 2005). Selection for translational accuracy can reduce the translational error rate by a factor of 4–9 (Precup and Parker 1987), but even optimally coded genes that are highly expressed may produce a large amount of erroneous polypeptides. Therefore, Drummond *et al.* (2005) suggest that highly expressed genes should be under additional selection to be tolerant to translation errors, that is, the polypeptides produced from these genes should fold properly even if they were erroneously translated. Recent work on protein biochemistry has shown that proteins can differ widely in their tolerance to missense substitutions and that properly chosen point mutations can dramatically increase the tolerance of a protein to additional substitutions (Bloom *et al.* 2005). Drummond *et al.*'s hypothesis, referred to as selection for translational robustness, predicts a constraint on the nonsynonymous rate of evolution, whereas selection for translational accuracy predicts primarily a constraint on the synonymous rate of evolution. We must assume that both selective constraints will operate on genes that are frequently translated.

Genes that are highly tolerant to translational missense errors must also, by definition, be highly tolerant to missense substitutions. However, the translational robustness hypothesis predicts that these genes will nevertheless be strongly conserved under evolution. An example of a gene that exhibits this paradoxical behavior is Rubisco, an extremely abundant protein that fixes carbon dioxide during photosynthesis. Rubisco is strongly conserved across phyla, but appears to tolerate most missense substitutions in the laboratory without loss of fold (Spreitzer 1993; Kellogg and Juliano 1997).

The purpose of this article is to put the translational robustness hypothesis into precise mathematical terms and to demonstrate how highly expressed genes can evolve to be tolerant to missense substitutions and yet remain strongly conserved under evolution.

## MATERIALS AND METHODS

#### Model:

We consider the evolution of a gene encoding a protein of length *L*. Each site in the protein can be in one of two states, *neutral* or *nonneutral*. We denote the number of neutral sites in the protein by *n* (see Table 1 for definitions of variables). A mutation at a neutral site of a folded protein leaves the protein folded, but changes the site from neutral to nonneutral. A mutation at a nonneutral site of a folded protein will usually cause misfolding and consequent loss of function, but with a small probability α, such a mutation will not affect folding but turn the site into a neutral one. For simplicity, we assume that once an amino acid sequence loses the ability to fold, it is impossible to mutate it back into a folded state. The rationale behind this assumption is that the likelihood of a mutation restoring fold to an unfolded amino acid sequence is so low as to be negligible. Our model is a reasonable abstraction of a thermodynamic framework of protein evolution that has recently been shown to have good predictive power for both simulated and real proteins (Bloom *et al.* 2005; Wilke *et al.* 2005). The key insight of this framework is that a protein's tolerance to substitutions is closely related to the protein's stability—more stable proteins can withstand more missense substitutions—and that therefore proteins can change from being highly fragile to being highly robust to mutations and vice versa through the accumulation of stabilizing or destabilizing mutations. In this sense, a mutation from a nonneutral to a neutral site in our model corresponds to a stabilizing mutation, and the opposite mutation corresponds to a destabilizing mutation. Thus, our model captures the following key aspects of protein biochemistry: (i) Homologous proteins can vary widely in their tolerance to mutations, and individual point mutations can increase or decrease this tolerance; (ii) mutations that increase a protein's tolerance are much rarer than mutations that decrease its tolerance; (iii) highly tolerant proteins are extremely rare, while moderately tolerant proteins are abundant; and (iv) nonfunctional mutant proteins are likely to be misfolded.

The gene is expressed at a level that leads to the synthesis of *x* polypeptides. For simplicity, we assume that the total number of polypeptides translated per gene is proportional to the gene's expression level and that the constant of proportionality is 1. Thus, *x* is also the expression level, measured in mRNA molecules per cell. Under translation, each site is mistranslated with probability τ (we neglect premature termination of the translation process). The probability that a single mRNA molecule is mistranslated and leads to a misfolded protein is 1 − [1 − τ(1 − α)]^{L}^{−n} ≈ τ(*L* − *n*), where the approximation holds for . Let *f* = τ(*L* − *n*) be the fraction of synthesized proteins that misfold. We assume that the expression level is regulated such that the number of folded proteins per gene, the protein abundance *A*, is held constant, regardless of the translational error rate. Then, *A* = *x*(1 − *f*). The total number of misfolded polypeptides per gene follows as *xf* = *Af*/(1 − *f*). Finally, we assume that each misfolded polypeptide imposes a cost *c* on the cell, so that the total cost of a gene translated at abundance level *A* is *cAf*/(1 − *f*). We turn this cost into a fitness value by assuming that each misfolded protein has the same relative effect on fitness. Then we can write the overall fitness of a gene with *n* neutral sites as(1)Without loss of generality, we use *c* = 1 henceforth.

#### Simulations:

We implemented a stochastic simulation of *N* sequences reproducing in discrete, nonoverlapping generations. We employed standard Wright–Fisher sampling, that is, the probability that a sequence in generation *t* + 1 is the offspring of a sequence at generation *t* is directly proportional to the latter's fitness.

We measured the evolutionary rate along the line of descent from the most recent common ancestor (MRCA) of the final population backward in time, as described by Wilke (2004). Briefly, we let the simulated population evolve until the birth time of the population's MRCA, designated *t*_{0}, exceeded a fixed equilibration time *t*_{equil} plus a time window *t*_{meas}, *t*_{0} > *t*_{equil} + *t*_{meas}. All quantities were measured on the equilibrated population during this latter time window. For all results reported, we chose *t*_{equil} = *t*_{meas} = 400,000, *U* = 0.001, τ = 0.001, and *L* = 100. At all parameter settings, we carried out 100 replicas and averaged results over all replicas.

## ANALYTICAL RESULTS

#### Solution based on Sella–Hirsh theory:

We can calculate the steady-state solution of our model using the analogy between evolutionary biology and statistical physics recently demonstrated by Sella and Hirsh (2005). The theory of Sella and Hirsh (2005) is applicable whenever the product of population size and mutation rate is ≪1, *NU* ≪ 1. In this regime, the population is essentially homogeneous at all times and can be represented at any given point in time by a single sequence. We say that the population is in state *i* if the dominant sequence in the population is sequence *i*. The key insight of Sella and Hirsh (2005) is that the probability *p _{i}* to find the population in state

*i*is proportional to a function

*F*(

*i*) (also called a Boltzmann factor) that depends only on the fitness of sequence

*i*, the population size, and details of the mutation process. Thus, it follows that(2)where the sum runs over all possible sequences

*j*. Once we have the probabilities

*p*, we can calculate all observable quantities of interest, such as mean fitness and mean evolutionary rate, using standard probability theory (see also below).

_{i}As the fitness of a sequence in our model depends only on the sequence's number of neutral sites *n*, it is useful to lump all sequences with the same *n* into a single class and calculate the probability *p _{n}* that the population is in a state represented by any sequence with

*n*neutral sites. Since there are such sequences, we introduce , where

*F*(

*n*) is the appropriate Boltzmann factor for a single sequence with

*n*neutral sites, and then calculate

*p*as . In our case, (

_{n}*n*) is given by(3)with

*w*as defined in Equation 1. The second term in the exponential takes into account the asymmetry in the mutation process, that is, mutations that increase

_{n}*n*are a factor α less likely to occur than mutations that decrease

*n*(Sella and Hirsh 2005, supplemental text).

With the formalism outlined in the previous paragraphs, we can calculate the expected number of neutral sites in the steady-state *E*[*n*] as(4)and the expected fitness as(5)Note that the expected values are not taken over the population (which is assumed to be homogeneous), but over a long time window in the steady state. Next we can calculate the evolutionary rate *K*, that is, the expected number of amino acid substitutions per unit time that accumulate along the line of descent in an equilibrated population. We find(6)where is the probability of fixation of sequence *j* in background *i* (Sella and Hirsh 2005),(7)[Note that and .]

We can simplify these expressions in the special cases that *A* is either very large or very small. After inserting Equation 1 into Equation 3, we have(8)where we have also made the approximation *N* − 1 ≈ *N*. We continue to use this approximation throughout the rest of this article. From Equation 8, we can see that the behavior of the system changes drastically depending on whether the product *NA* is large or small. However, since the population size *N* is the same for all genes in a species while each gene's corresponding protein abundance *A* can vary over many orders of magnitude, in the following we assume that *N* is fixed and consider the limits of very small and very large *A*.

For , the first term in the exponential disappears, and *F*′(*n*) becomes simply . Thus, we find that(9)We cannot obtain similarly simple expressions for *E*[*w*] and *K* in this limit, but do so in the next subsection, using a different method.

For , we have to distinguish between the cases *n* = *L* and *n* < *L*. For *n* < *L*, the first term in the exponential in Equation 8 becomes much larger in magnitude than the second term, which is a constant. Thus, we can neglect the second term and have(10)This expression tends to 0 for large *A*. For *n* = *L*, we have *F*′(*L*) = α* ^{L}*, independent of

*A*. All terms but the

*n*=

*L*term disappear, and we have

*E*[

*n*] =

*L*,

*E*[

*w*] =

*w*, and(11)

_{L}#### Approximate solution:

Sella–Hirsh theory yields an exact solution for our model. However, the resulting expressions are somewhat unwieldy and do not lead to simple analytical expressions for intermediate *A*. Therefore, we now derive an approximate solution to our model.

For small τ, we can approximate *w _{i}* ≈ exp[−

*A*τ(

*L*−

*i*)] and find ln(

*w*/

_{j}*w*) =

_{i}*A*τ(

*j*−

*i*). This approximation is equivalent to neglecting the small number of additional translation events required to replace polypeptides that misfold. The probability of fixation follows from Equation 7 as(12)The idea of the approximate solution is that in the steady state, mutations that increase the number of neutral sites and those that decrease it are in perfect balance. Therefore, the number of neutral sites in the steady state,

*n**, satisfies(13)According to Equation 12, and are independent of

*n*. We introduce π

_{+}and π

_{−}, the probabilities of fixation for a mutation that increases or decreases

*n*by 1, respectively, and find(14)(15)After inserting these expressions into Equation 13 and solving for

*n**, we obtain(16)With this result, the expected fitness in the steady state becomes(17)and the evolutionary rate

*K*follows as(18)In the appendix, we derive an expression for

*K*as a function of

*n**, rather than as a function of

*A*as we have done here.

In the limit *A*→0, we have π_{+} = π_{−} = 1/*N* and *n** = α*L*/(α + 1). [Note that this expression is identical to the result found through Sella–Hirsh theory (Equation 9), if we equate *n** with *E*[*n*].] Therefore, in this limit,(19)In the limit , we have π_{+} = 1, π_{−} = *e*^{−2NAτ}, and *n** = *L*. Therefore, in this limit,(20)The expressions for *n** and *K* again agree with the results found through Sella–Hirsh theory, if we assume 1 − τ ≈ 1 in Equation 11.

#### Limitations on the number of neutral sites:

Certain residues may never tolerate any substitutions, such as the active-site serine of a serine protease or the heme-binding histidines of hemoglobin. Under the assumption that *m* sites can never be neutral, we can write *L* = *l* + *m*, and for small τ the fitness *w _{n}* is approximately(21)In other words, all fitness values are rescaled by a factor depending on τ but not on

*n*. Equation 7 reveals that such a rescaling leaves the fixation probabilities unchanged. Therefore, the approximate solution remains unchanged except that we replace

*L*by

*l*(=

*L*−

*m*) everywhere and add a factor

*e*

^{−Aτm}to Equation 17.

For Sella–Hirsh theory, if we assume that τ*m* is negligibly small, the Boltzmann factor *F*′(*n*) gains a similar leading factor, which, as Equations 4 and 5 make clear, also drops out, this time through the normalization term. In the case of *E*[*n*], the result is only that *L* must again be replaced by *l* = *L* − *m*, while the expected value *E*[*w*] also gains a leading factor *e*^{−Aτm}, just as in the approximate case.

In short, when there are *m* never-neutral sites, the main effects are to lower the population's fitness by a factor *e*^{−Aτm} and to reduce the expected number of neutral sites *E*[*n*] and the evolutionary rate *K* roughly as though the evolving gene were shorter by *m* residues.

## SIMULATION RESULTS

First, we studied the rate of evolution *K* as a function of the protein abundance *A*, for various population sizes (Figure 1). We found that *K* levels off for . The asymptotic value at *A* = 0 is *K*(0) = α*U* (Equation 19). For increasing *A*, *K* first increases and then rapidly drops off to zero for even larger *A*. The main effect of the population size *N* is to determine at what level of *A* this drop off initiates. With increasing *N*, the evolutionary rate *K* seems to be simply shifted to the left, toward lower *A* (Figure 1). We can make this statement more precise by considering the large-*A* limit of our approximate solution, Equation 20. This limit shows that the evolutionary rates *K*(*N*, *A*) and *K*(*aN*, *a*^{−1}*A*) are related through *K*(*N*, *A*) ≈ *a*^{−1}*K*(*aN*, *a*^{−1}*A*), where *a* is an arbitrary constant. Therefore, if we increase *N* by a factor of *a*, the resulting curve *K*(*A*) appears on a log-log plot to be shifted upward and to the left by an amount of log(*a*). The upward shift cannot be noticed, because it exists only for very large *A*, and thus the effect of increasing *N* seems to be to simply shift the *K*(*A*) curve to the left.

Second, we studied the effect of varying α on *K*(*A*) (Figure 2). The variable α influences mainly the asymptotic limit of *K*(*A*) for small *A* [with *K*(*A*) increasing with α], but does not affect how quickly *K*(*A*) decays for large *A*.

Third, we studied the behavior of the expected fitness and the expected number of neutral sites for varying levels of *A*. The expected fitness is ∼1 for both very low and very high *A*, but drops below 1 for the intermediate values of *A* for which *K*(*A*) starts to decay (Figure 3a). The expected number of neutral sites is α*L*/(α + 1) for very small *A* and increases to 1 for large *A* (Figure 3b). We can understand the different evolutionary regimes of low *A* and high *A* as follows: For low *A*, since very little protein is synthesized, the cost associated with misfolded proteins following erroneous translation is negligible. Therefore, the expected fitness in this regime is 1. Since the cost of misfolding is negligible, the number of neutral sites is not under selection in this regime and it settles to the value at which the mutations increasing *n* and those decreasing *n* exactly balance each other. For high *A*, on the other hand, the cost of translation-induced misfolding is tremendous. Therefore, at high *A*, the population converges to the single optimal sequence with *n* = *L* (or *n* = *l* if some sites can never be neutral). Every mutation that reduces *n* by even 1 is highly deleterious and therefore will virtually never go to fixation, even in a small population. For *l* = *L*, the optimal sequence (which has *n* = *l*) pays no cost whatsoever under mistranslation, and the expected fitness is again 1. For intermediate *A*, the cost of translation-induced misfolding is significant but not debilitating. As a result, *n* is elevated in comparison to its low-*A* limit, but the expected fitness falls below 1.

Finally, the calculation in the appendix predicts that the evolutionary rate should be independent of *N* if we plot it as a function of the expected number of neutral sites *E*[*n*] rather than a function of *A*. Figure 4 shows that this prediction is indeed accurate. We find that there are two distinct regimes for the evolutionary rate. For small *E*[*n*], the evolutionary rate increases with *E*[*n*]. This increase is caused by the increased availability of neutral mutations with growing *E*[*n*]. However, even though we can calculate what the evolutionary rate would be for arbitrarily small *E*[*n*], in equilibrium *E*[*n*] will never be below its limiting value for small *A*, α*L*/(α + 1). For large *E*[*n*], the behavior of *K* is reversed, and now it decreases with increasing *E*[*n*]. The decay comes about because in this regime, even though there are many mutations that do not disrupt the fold of a properly translated protein, these mutations increase the amount of mistranslated, misfolded proteins and thus are selected against. The quantity *E*[*n*] can get arbitrarily close to *L* (or *l* for *m* > 0), and therefore *K* can decay to almost zero if *A* is sufficiently large.

Throughout this study, we found good agreement among the numerical simulations, Sella–Hirsh theory, and our simple analytical approximation. Some discrepancies appeared between theory and simulations for the largest population size (*N* = 1000) and for very small α in conjunction with large *A*. We attribute the former to a violation of the condition , which must be satisfied for both Sella–Hirsh theory and our approximate solution. We carried out our simulations with a mutation rate of *U* = 0.001, which means that *NU* = 1 for *N* = 1000. The latter discrepancies are caused by insufficient equilibration time. For large *A*, the number of neutral sites *n* always approaches *L*, irrespective of population size or α. However, the smaller α is, the lower the probability that a mutation occurs that increases *n*. Therefore, the equilibration time needed at large *A* grows without bound as α decreases. We did additional simulations in this regime and found that the simulation results approached the predicted quantities with increasing equilibration time (data not shown).

## DISCUSSION

We have developed a simple model that describes the slowdown of the rate of evolution of highly expressed genes under selective pressure for translational robustness. We have studied the model with numerical simulations and have solved the model exactly using Sella–Hirsh theory. We have also developed a simple analytic approximation that is in excellent agreement with the predictions from Sella–Hirsh theory and is valid for the entire range of possible parameter values (as long as ).

The model abstracts a previous thermodynamic model of protein mutational tolerance introduced by Bloom *et al.* (2005) in which mutations may leave unperturbed or destabilize the protein's native structure (common) or stabilize it (uncommon). Increases in stability tend to increase the number of sites at which substitutions can be tolerated, so-called neutral sites, while decreases in stability usually cause misfolding or decrease the number of neutral sites. In this work, we have modeled neutral sites directly. In doing so, we allow only stepwise changes in neutral sites, sacrificing treatment of large stability changes that might radically alter the number of neutral sites and the potential stability dependence of mutational effects to gain analytical tractability.

Our results show a clear example of the paradox cited in the Introduction (Figure 4): Given selection against the costs of protein misfolding, genes simultaneously become more mutationally tolerant (larger number of neutral sites, *n*) but evolve slower as if fewer mutations were tolerated. The paradox's resolution requires disentangling two kinds of mutational tolerance, one of which captures the likelihood of loss of protein function due to a mutation while the other quantifies the fitness cost of that mutation, the cost that ultimately determines evolutionary rate. When protein misfolding imposes minimal fitness costs, as is the case with low-expression proteins, the proportion of mutations that preserve protein function governs the rate of evolution (Figure 4, left). However, at high expression levels, fitness costs of mutations that preserve protein function grow large and can become the dominant determinant of evolutionary rate (Figure 4, right).

This observation has an important corollary. When selection for translational robustness is weak, functional loss is likely the main determinant of fitness costs. Thus, our results suggest that evolutionary conservation of sites in low-expression proteins may be more likely to indicate functional importance than similar conservation at sites in high-expression proteins.

Our simple model produces an exponential decline in evolutionary rate with increasing expression level, whereas in yeast, a power law better describes this relationship (Drummond *et al.* 2005). Several possibilities may explain the discrepancy. First, our model assumes a symmetric binomial distribution of the number of neutral sites, but the distribution for real proteins may be skewed or heavy tailed. Second, the cost of additional misfolded proteins may not be independent of the number of already misfolded proteins. For example, misfolded proteins form toxic aggregates (Bucciantini *et al.* 2002), and aggregation is not a linear function of protein concentration. Finally, differences in protein structure and function between high- and low-expression proteins may play a role. Drummond *et al.* (2005) have previously examined differences between functionally and structurally similar paralogs and found a similar power-law relationship. However, more subtle but important differences may separate paralogs and influence their evolution.

Our results here demonstrate that profound differences in protein evolutionary rates can arise even in the absence of functional and structural differences and when variables such as protein length, the translation error rate, and the underlying distribution of the number of neutral sites are held constant. In real genomes, of course, all these features vary and some, perhaps all, are under selection. The value of the model is its utility in explaining why highly expressed proteins evolve slowly across taxa (Drummond *et al.* 2005).

Interestingly, our model reveals two evolutionary-rate regimes (Figures 1 and 2), one in which rates remain relatively constant with increasing protein production and another in which rates decline precipitously. In yeast, virtually all genes appear to fall in the latter regime, as the evolutionary rate declines almost immediately from low to high expression (Drummond *et al.* 2005), raising the possibility of genome-level selection in this direction. If yeast protein synthesis levels reflect organismal needs and cannot be freely modulated, as seems likely, and protein synthesis costs dominate cellular energy consumption, as evidence suggests (Princiotta *et al.* 2003), the remaining genome-level target for selection is on the fitness cost per misfolded protein, *c*. Decreasing *c* pushes genes away from the decline to where cost differences become negligible, whereas increasing *c* pushes genes toward the decline, amplifying the cost difference between high- and low-expression proteins. One way to decrease *c* is to drive translation errors down to negligible levels. Another is to maintain a quality-control apparatus (*e.g.*, chaperones and proteases) with so much excess bandwidth that cost differences associated with variability in protein misfolding become negligible. Evidence suggests that both strategies for decreasing *c* impose significant fitness penalties. Decreasing translational error rates can be easily accomplished, often with a single ribosomal mutation (Alksne *et al.* 1993), but ribosomal accuracy and growth rate are often negatively correlated, presumably through the intrinsic speed/accuracy trade-off inherent in ribosomal proofreading (Kurland 1992). Maintenance of a chaperone fleet large enough to dilute out misfolding cost differences would divert enormous cellular resources for little benefit, and the massive induction of chaperones after heat shock suggests that cellular chaperone levels do not have much remaining bandwidth under normal conditions. Overall, it seems plausible that the steep decline observed in yeast's evolutionary-rate–expression relationship reflects a balance favoring a relatively high cost per misfolded protein *c*. Costly translational accuracy and quality control machinery may be reduced so long as the increased errors and reduced folding assistance can be compensated for; translational robustness provides that compensation—essentially for free—but is ultimately limited by mutation pressure away from robust sequences and by the fundamental intolerance of proteins to at least some errors.

Our model distinguishes between the number of polypeptides produced per gene, *x*, and the abundance of functional proteins, *A*, yet our approximate solution essentially equates these quantities with only minor accuracy loss. The approximation works for two reasons. First, misfolded polypeptides impose a negligible cost for low-abundance proteins, while for high-abundance proteins, misfolded polypeptides are rare because of selection for translational robustness. We expect these nontrivial results to hold for many organisms. Second, in our model, the number of translation events *x*, the primary determinant of the number of mistranslated proteins, is estimated accurately by *A*, a situation unlikely to hold for most organisms. Protein abundance reflects a balance of ongoing translation and turnover (Greenbaum *et al.* 2003), such that a high abundance can result either from moderate translational frequency and long protein half-life or from rapid translation and short half-life. Because half-lives can vary over orders of magnitude (Hargrove and Schmidt 1989), abundance and translation frequency may only weakly correlate in real organisms. Among protein abundance, mRNA expression level, and translation frequency, we hypothesize that the latter, even though difficult to measure, will best predict evolutionary rate.

Bürger *et al.* (2005) recently studied a question closely related to this article, asking why phenotypic mutation rates (corresponding to the translational error rate in this article) are much higher than genotypic mutation rates. Within the framework of their model, Bürger *et al.* (2005) found very little pressure for reduction of phenotypic mutation rates below a certain threshold. Even though we keep the translational error rate constant in our model, we can consider a change in the number of neutral sites *n* as a change in the phenotypic mutation rate and thus compare our results to those of Bürger *et al.* (2005). In contrast to their conclusions, we find that the fraction of neutral sites, *n*/*L*, quickly rises to the maximum possible for highly expressed genes, thus reducing the phenotypic mutation rate to zero except when some sites cannot be made neutral. We believe that the differences in results are caused by differences in the way in which we and Bürger *et al.* (2005) treated costs of erroneously translated proteins in our models. Bürger *et al.* (2005) consider the total cost of protein synthesis, but do not consider additional penalties imposed by misfolded proteins, not only for their recognition and cleanup by the quality-control system but also for their innate toxicity (Bucciantini *et al.* 2002). Clearly, if we neglect these unique costs, then the only pressure to reduce the phenotypic mutation rate is to reduce the cost of synthesis for misfolded proteins, and this pressure will be weak if this cost is only a small proportion of the total cost of protein synthesis. In our model, on the other hand, we have focused exclusively on costs of misfolded proteins apart from their synthesis costs, implicitly assuming that the total cost of protein synthesis is approximately equal to the cost of synthesis of functional proteins and that the benefit of the functional proteins will pay for their synthesis. We believe that there is indeed a strong selective pressure to reduce the phenotypic mutation rate for highly expressed genes, but that it is cheaper for cells to evolve translationally robust genes than to evolve highly accurate transcription and translation machinery.

Can translational robustness really be obtained cheaply? Drummond *et al.* (2005) have suggested that increased protein stability both confers mutational tolerance and constrains sequence evolution. Increasing protein stability, that is, decreasing the free energy of folding Δ*G*_{f}, provides a plausible mechanism for obtaining translational robustness for numerous reasons. First, increased stability leads to increased mutational tolerance and can be obtained by point mutations (Bloom *et al.* 2005). Second, the stability-increase mechanism is sufficiently general to encompass proteins of diverse functions and to operate in a wide range of organisms. Third, stability is free in the sense that obtaining a protein with lower Δ*G*_{f} requires only a chance mutation. While many researchers have noted an apparent trade-off between protein stability and enzymatic activity, it is crucial to emphasize that this trend may be statistical rather than intrinsic: Because both high activity and high stability are rare properties, mutations that improve both are exceedingly unlikely unless both are constrained (Giver *et al.* 1998). Selection for translational robustness provides precisely that dual constraint, and because many millions of mutations may be screened over evolutionary time, the very few resulting in highly expressed proteins with increased stability (conferring tolerance to translation errors) and uncompromised activity will be found. Finally, the very rarity of such stabilizing mutations provides a measure of the scarcity of highly stable proteins available for exploration by evolutionary drift. If increased stability is a dominant response to the need for mutational tolerance in highly expressed proteins, it will restrict drift and slow evolution relative to less-constrained low-expression proteins.

## APPENDIX

Here we derive an expression for the evolutionary rate *K* as a function of the number of neutral sites *n** rather than the protein abundance *A*. For the remainder of this appendix, we drop the superscript from *n** for simplicity. We begin by noting that Equation 16 implies(A1)Further, note that we can write, for sufficiently large *N*,(A2)Therefore, *e*^{−2NAτ} = α(*L* − *n*)/*n*. We can solve this expression for *A* and find(A3)After inserting Equation A3 into the definition of π_{−}, we obtain(A4)We expand this expression for large *N*, replacing the exponential in the numerator by the first two terms of its Taylor series, and find(A5)Now, after inserting Equations A1 and A5 into Equation 6, we obtain for the expected evolutionary rate(A6)Note that this expression is independent of the population size *N*. Even though we have derived it under the assumption that *N* is large, we find that it works very well even for moderate population sizes of ≤100.

## Acknowledgments

D.A.D. acknowledges, with gratitude, the support of Frances Arnold. C.O.W. was supported by National Institutes of Health (NIH) grant AI 065960 and D.A.D. was supported by NIH National Research Service award 5 T32 MH19138.

## Footnotes

Communicating editor: N. S. Wingreen

- Received September 21, 2005.
- Accepted February 15, 2006.

- Copyright © 2006 by the Genetics Society of America