Originally published as Genetics Published Articles Ahead of Print on September 2, 2005.
Genetics, Vol. 172, 197-206, January 2006, Copyright © 2006
doi:10.1534/genetics.105.046599
Why Are Phenotypic Mutation Rates Much Higher Than Genotypic Mutation Rates?
Reinhard Bürger*,1, Martin Willensdorfer and Martin A. Nowak
* Department of Mathematics, University of Vienna, 1090 Vienna, Austria and
Program for Evolutionary Dynamics, Department of Mathematics and Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138
1 Corresponding author: Department of Mathematics, University of Vienna, Nordbergstrasse 15, A-1090 Vienna, Austria.
E-mail: reinhard.buerger{at}univie.ac.at
The evolution of genotypic mutation rates has been investigated in numerous theoretical and experimental studies. Mutations, however, occur not only when copying DNA, but also when building the phenotype, especially when transcribing DNA to RNA and translating RNA to protein. Here we study the effect of such phenotypic mutations. We find a maximum phenotypic mutation rate, umax, that is compatible with maintaining a certain function of the organism. This may be called a phenotypic error threshold. In addition, we find a minimum phenotypic mutation rate, umin, with the property that there is (nearly) no selection pressure to reduce the rate of phenotypic mutations below this value. If there is a cost for lowering the phenotypic mutation rate, then umin is close to the optimum phenotypic mutation rate that maximizes the fitness of the organism. In our model, there is selective pressure to decrease the rate of genotypic mutations to zero, but to decrease the rate of phenotypic mutations only to a positive value. Despite its simplicity, our model can explain part of the huge difference between genotypic and phenotypic mutation rates that is observed in nature. The relevant data are summarized.
THE evolution of mutation rates by natural selection has attracted the attention of evolutionary biologists for many decades (STURTEVANT 1937), and a large number of models have been developed to understand various aspects of the evolution of mutation rates (SNIEGOWSKI et al. 2000). In contrast to the 1930s, a substantial body of empirical data about mutation rates at many levels (per base pair, per gene, or genomic) and for many different organisms is now available (DRAKE et al. 1998). For instance, mutation rates per base pair per replication in microbes with DNA chromosomes range from ~7 × 10⁻⁷ down to ~7 × 10⁻¹¹. There is a strong negative correlation with the genome size, so that the mutation rates per genome differ only by about a factor of two for the organisms cited in DRAKE et al.'s (1998) Table 4. Mutation rates per base pair estimated from specific loci in higher eukaryotes are in the range 2 × 10⁻¹⁰ to 5 × 10⁻¹¹ (DRAKE et al.'s 1998 Table 5). Per locus mutation rates also vary widely, even within an organism, with an approximate range from 10⁻⁴ to 10⁻⁶.
In addition to these "genotypic" mutations, organisms are also confronted with what we call "phenotypic" mutations. These are the errors that occur when a DNA-coded gene is transcribed to mRNA and subsequently translated to protein. First measurements of phenotypic mutation rates, in particular, Escherichia coli RNA polymerase error rates, were obtained by SPRINGGATE and LOEB (1975). Soon afterward, EDELMANN and GALLANT (1977) measured the cysteine misincorporation rate for the E. coli protein flagellin. These early studies indicated that phenotypic mutation rates are orders of magnitude larger than genotypic mutation rates. Later studies of E. coli (ELLIS and GALLANT 1982) estimated a global phenotypic error rate of 4.5 × 10⁻⁴ per codon and confirmed the difference between phenotypic and genotypic mutation rates. Studies in yeast yield similar results (SHAW et al. 2002). In contrast to genotypic mutation rates, there does not seem to be a significant difference between eukaryotic and prokaryotic phenotypic mutation rates. Several proofreading and quality control mechanisms exist that increase the accuracy of transcription and translation (THOMAS et al. 1998; IBBA and SÖLL 1999; WITHEY and FRIEDMAN 2002). Apparently, however, there is not enough evolutionary pressure to increase the accuracy of the transcription and translation apparatus to DNA replication standards.
Apart from the fact that the huge differences between genotypic and phenotypic mutation rates are puzzling, such high phenotypic mutation rates conceivably could pose a problem because the production of functional protein requires the absence of a deleterious mutation event during transcription and translation. Therefore, cells with a higher phenotypic mutation rate must produce more molecules of this protein than cells with a lower rate. If the production of protein is associated with costs, a selective pressure to reduce the phenotypic mutation rate might be expected. The problem may be exacerbated when more genes have to be transcribed and expressed to increase the "fitness" of a cell, or rather of a single-cell organism, above its current value. In modification of STURTEVANT (1937), who asked "Why does the mutation rate not evolve to zero?" we ask "Why does the phenotypic mutation rate not evolve to lower levels, whereas the genotypic mutation rate has evolved?"
To address this question, we develop a mathematical model of a large population of single-cell organisms with a DNA chromosome, in which genotypic and phenotypic mutations occur. All mutations are assumed to be deleterious. For motivation, let us start by investigating the following simple case. Suppose that a certain gene can perform a phenotypic function if at least k error-free proteins (actually, molecules of the same protein) have been produced. The function leads to a fitness advantage s. If the gene is not transcribed and translated, hence the function not performed, the fitness is f0. Each protein molecule that is produced causes costs c. The probability that a protein is error free is given by 1 − u. Thus, u is the phenotypic mutation rate. Then, the expected fitness of an organism that produces m copies of the protein is given by
![]() | (1) |
![]() |
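As a minimal numerical sketch of the expected fitness just described, the following Python function assumes that the displayed expression has the form f0 − cm + s·Pr(at least k of m molecules are error free), with a binomially distributed number of error-free molecules. This form is inferred from the definitions in the preceding paragraph; the function name and the example parameter values are illustrative and not taken from the article.

```python
from math import comb

def expected_fitness(m, k, u, s, c, f0=1.0):
    """Expected fitness when m protein molecules are produced (assumed form).

    Each molecule is error free with probability 1 - u, the function needs
    at least k error-free molecules, and every molecule costs c.
    """
    # Probability that at least k of the m molecules are error free.
    p_function = sum(
        comb(m, i) * (1.0 - u) ** i * u ** (m - i) for i in range(k, m + 1)
    )
    return f0 - c * m + s * p_function

# Hypothetical parameter values, for illustration only.
print(expected_fitness(m=12, k=10, u=0.05, s=0.1, c=0.001))
```

Producing fewer than k molecules leaves only the cost term, which is why the function returns f0 − cm whenever m < k.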
Taking genotypic mutations of rate µ per gene into account and assuming that only genes without a mutation can produce functional protein (thus, all mutations considered are detrimental), we need
![]() | (2) |
If µ ≪ 1, inequality (2) holds if
![]() |
The mean number of error-free molecules produced is m(1 − u), which needs to be ≥ k. Therefore, we have
![]() | (3) |
![]() | (4) |
In the following sections, we make this argument more precise by elaborating on a more detailed model that includes an arbitrary number of genes. We show in particular that natural selection leads to phenotypic mutation rates that are much higher than genotypic mutation rates. Specifically, we address (and partially solve) the following questions:
- When is it beneficial to transcribe and express a new set of genes that bring about a selective advantage, but production of protein is costly?
- What is the optimum number of protein molecules to be produced if k and the other parameters (genotypic and phenotypic mutation rates, fitness advantage, and costs) are given?
- Is there an evolutionary explanation for why phenotypic mutation rates are so much higher than genotypic mutation rates?
- What are the consequences of costs associated with higher fidelity of protein production?
For each gene n (1 ≤ n ≤ L) under consideration, a number mn of protein molecules is produced. There is no recombination between loci. Errors occur both during DNA replication (i.e., cell division) and during transcription of DNA to RNA and subsequent translation into protein. We call errors of the first kind genotypic mutations and those of the second kind phenotypic mutations. If µn and un denote the genotypic and phenotypic mutation rates at locus n, respectively, then
$$Q = \prod_{n=1}^{L} (1 - \mu_n)$$
is the probability that DNA is produced without error, and
![]() | (5) |
We assume that a certain number of mutation-free protein molecules can increase the fitness of a cell because then a beneficial phenotypic function can be performed, but production of protein has costs. The costs per protein molecule produced by gene n are cn > 0. They reduce the fitness of the cell. More precisely, we assume that every gene, n, must produce at least kn mutation-free copies of the protein so that the fitness of the cell is increased by an amount s > 0. If even one of the genes produces fewer mutation-free protein molecules than required, no such fitness increase occurs. To formalize these assumptions, let i = (i1, ... , iL) and denote by fi the fitness of a cell that has $i_n$ error-free and $m_n - i_n$ erroneous (protein) molecules produced by gene n (n = 1, ... , L) as well as an error-free DNA (at all loci). If we denote the total costs of protein production by
$$c_{tot} = \sum_{n=1}^{L} c_n m_n$$
and assume that the costs reduce the fitness of a cell by an additive amount, we obtain for the (Malthusian) fitnesses
$$f_i = \begin{cases} \tilde{f}_0 + s & \text{if } i_n \ge k_n \text{ for all } n, \\ \tilde{f}_0 & \text{otherwise,} \end{cases} \tag{6}$$
where $\tilde{f}_0 = f_0 - c_{tot}$. This kind of selection involves strong epistasis and is similar to what is called truncation selection in population genetics. Because mutated DNA will always produce mutated RNA, the fitness of cells with mutated DNA is $\tilde{f}_0$. Throughout, we require that $\tilde{f}_0 > 0$. Cells that do not express these genes, because they do not exist or are not activated, have fitness f0.
Let xi denote the relative frequency of cells that have error-free DNA and $i_n$ error-free protein molecules from locus n. Further, let y be the frequency of cells whose DNA carries at least one mutation at one of the L loci. The probability that a cell produces $i_n$ error-free molecules from each locus n is
$$R_i = \prod_{n=1}^{L} \binom{m_n}{i_n} (1 - u_n)^{i_n} u_n^{m_n - i_n}.$$
We emphasize that the numbers i of functional molecules produced by an offspring are independent of the numbers j produced by its parent. Because we assume that the population size is large enough to ignore stochastic fluctuations, it follows that the dynamics of cell frequencies are given by the system of $1 + \prod_{n=1}^{L} (m_n + 1)$ differential equations
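$$\dot{x}_i = Q R_i \sum_j f_j x_j - x_i \bar{w}, \tag{7a}$$
$$\dot{y} = (1 - Q) \sum_j f_j x_j + \tilde{f}_0\, y - y\, \bar{w}, \tag{7b}$$
where $\bar{w} = \sum_j f_j x_j + \tilde{f}_0\, y$ is the mean fitness. We note that the system of differential equations given by (7a) and (7b) can be written as the replicator equation $\dot{z} = Az - z\bar{w}$, where z = (x, y) and x has the components xi, and the square matrix A (one row for each cell type i and one row for y) has entries $A_{ij} = Q R_i f_j$, $A_{iy} = 0$, $A_{yj} = (1 - Q) f_j$, and $A_{yy} = \tilde{f}_0$.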
Let
. Then, the matrix A has the eigenvalues
![]() | (8) |
0, and 0, which has multiplicity
. The equilibrium solution corresponding to
, the equilibrium mean fitness, is uniquely determined and locally stable if and only if
or, equivalently, if
. In this case, it attracts all solutions with initial value y > 0 because (7a) and (7b) are equivalent to the linear system
(THOMPSON and MCBRIDE 1974; BÜRGER 2000). The equilibrium frequencies of cell types are readily shown to be
![]() | (9) |
, then all solutions converge to y = 1. This occurs, for instance, if s = 0.
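The convergence statement can be checked numerically. The sketch below integrates the single-locus dynamics with a simple Euler scheme and compares the resulting mean fitness with Q times (baseline fitness + s·P), which is the leading eigenvalue implied by the matrix entries given above. The parameter values, step size, and function names are assumptions chosen for illustration; the baseline fitness is taken to be f0 − cm.

```python
from math import comb

def simulate_mean_fitness(m=6, k=4, u=0.1, mu=0.01, s=0.5, c=0.01,
                          f0=1.0, dt=0.01, steps=20000):
    """Euler-integrate the single-locus dynamics; return (mean fitness, y)."""
    q_dna = 1.0 - mu                 # probability of error-free DNA (Q)
    f_base = f0 - c * m              # fitness without the beneficial function
    # r[i]: probability that an offspring produces i error-free molecules.
    r = [comb(m, i) * (1.0 - u) ** i * u ** (m - i) for i in range(m + 1)]
    f = [f_base + s if i >= k else f_base for i in range(m + 1)]
    x = [1.0 / (m + 2)] * (m + 1)    # cells with error-free DNA, by type i
    y = 1.0 / (m + 2)                # cells with mutated DNA
    for _ in range(steps):
        output = sum(fj * xj for fj, xj in zip(f, x))   # sum_j f_j x_j
        w_bar = output + f_base * y                     # mean fitness
        x = [xi + dt * (q_dna * ri * output - xi * w_bar)
             for xi, ri in zip(x, r)]
        y = y + dt * ((1.0 - q_dna) * output + f_base * y - y * w_bar)
    output = sum(fj * xj for fj, xj in zip(f, x))
    return output + f_base * y, y

def predicted_mean_fitness(m=6, k=4, u=0.1, mu=0.01, s=0.5, c=0.01, f0=1.0):
    """Leading eigenvalue Q * (f_base + s * P) for a single locus."""
    p = sum(comb(m, i) * (1.0 - u) ** i * u ** (m - i)
            for i in range(k, m + 1))
    return (1.0 - mu) * (f0 - c * m + s * p)

w_sim, y_eq = simulate_mean_fitness()
print(w_sim, predicted_mean_fitness(), y_eq)
```

The Euler scheme is used only because it keeps the example dependency free; any standard ODE integrator would serve the same purpose.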
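Let
$$P_n = \sum_{j=k_n}^{m_n} \binom{m_n}{j} (1 - u_n)^{j} u_n^{m_n - j}$$
denote the probability that in a cell at least kn molecules produced by gene n are error free. Because these are precisely the cells with fitness $\tilde{f}_0 + s$ and $\sum_{i \ge k} R_i = \prod_{n=1}^{L} P_n$, where i ≥ k means $i_n \ge k_n$ for all n, we obtain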
![]() | (10) |
, hence an explicit expression for the mean fitness
, could have been derived without resorting to the full dynamics (7). However, uniqueness and global stability of the corresponding solution can be inferred only from the complete evolutionary dynamics.
has to be compared.
For an analytical treatment we assume that all loci are equivalent; i.e., cn ≡ c, mn ≡ m, kn ≡ k, un ≡ u, for all n. Therefore, we have ctot = cLm and $\tilde{f}_0 = f_0 - cLm$. Furthermore, if we denote by P ≡ Pn the probability that at least k error-free protein molecules per gene are produced, we have
![]() | (11) |
![]() | (12) |
![]() | (13) |
![]() | (14) |
A population of single-cell organisms that transcribe and translate the set of genes under consideration has an increased equilibrium mean fitness relative to one that does not produce this protein if and only if
![]() | (15) |
depends on various parameters.
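For readers who want to experiment with the equal-loci case, here is a hedged Python sketch. It assumes that the equilibrium mean fitness takes the form (1 − µ)^L (f0 − cLm + sP^L) whenever this exceeds the baseline f0 − cLm, and equals the baseline otherwise (the case in which all solutions converge to y = 1). This form is inferred from the surrounding definitions; the function names and all parameter values in the example are hypothetical.

```python
import math

def prob_at_least_k(m, k, u):
    """P: probability that at least k of m molecules are error free."""
    if k <= 0:
        return 1.0
    if k > m or u >= 1.0:
        return 0.0
    if u <= 0.0:
        return 1.0
    log_ok, log_err = math.log1p(-u), math.log(u)
    total = 0.0
    for i in range(k, m + 1):
        # Binomial term evaluated in log space to avoid overflow for large m.
        log_term = (math.lgamma(m + 1) - math.lgamma(i + 1)
                    - math.lgamma(m - i + 1)
                    + i * log_ok + (m - i) * log_err)
        total += math.exp(log_term)
    return min(total, 1.0)

def equilibrium_mean_fitness(m, k, u, mu, L, s, c, f0=1.0):
    """Assumed equilibrium mean fitness for L equivalent loci."""
    q_dna = (1.0 - mu) ** L
    f_base = f0 - c * L * m
    candidate = q_dna * (f_base + s * prob_at_least_k(m, k, u) ** L)
    # If the candidate does not exceed f_base, mutated cells take over.
    return max(candidate, f_base)

def confers_advantage(m, k, u, mu, L, s, c, f0=1.0):
    """True if expressing the genes raises mean fitness above f0."""
    return equilibrium_mean_fitness(m, k, u, mu, L, s, c, f0) > f0

# Hypothetical parameter values, for illustration only.
print(equilibrium_mean_fitness(600, 500, 0.05, 2e-4, 1, 0.1, 5e-5))
print(confers_advantage(600, 500, 0.5, 2e-4, 1, 0.1, 5e-5))
```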
Figure 1 displays the equilibrium mean fitness as a bivariate function of m and u. More precisely, it shows
to clearly display the parameter combinations that confer a fitness advantage to the population (in the figures, f0 = 1). Most distinctive is the steep "wall" surrounding the "mesa-like mountain." It signifies a threshold-like increase of the equilibrium mean fitness as m increases above a critical value or u decreases below a critical value. For each given u, there is a value m that maximizes the mean fitness (see The optimum number of protein molecules). The linear decrease of the mean fitness in m is caused by the costs, which are proportional to cm. If m is too large [in this case m ≥ 1996, cf. (16)], then the mean fitness is below f0. Moreover, there is a maximum value of u, above which the mean fitness is below f0 for every m. Its value is ≈0.73; see (20).
Figure 2 displays the equilibrium mean fitness as a function of log10u for several different parameter combinations. The threshold-like dependence on u is distinctive, as is the fact that the mean fitness is effectively constant for values of u below the threshold. This is investigated and explained in Selection on mutation rates.
Error thresholds and other necessary conditions for performing an advantageous function:
From inequality (15), simple conditions for some of the parameters can be derived that must be satisfied so that incorporating or maintaining a set of genes may confer a fitness advantage to the population. Because we always have P ≤ 1, the following simple upper bound for m is easily deduced from (15) and (14) by setting P = 1:
![]() | (16) |
![]() | (17) |
k. The approximation is valid if
. For smaller values of s, there is no parameter combination that provides a selective advantage to a single-cell population with these genes activated. A simpler, but less accurate, estimate than (17) for the necessary selective advantage is
![]() | (18) |
By simple rearrangement of (14), we obtain from (15) the following upper bound on the genotypic mutation rate per gene:
![]() | (19) |
1 − (1 + s/f0)^(−1/L), and the right-hand side is attained if u = 0 (hence P = 1) and c = 0.
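The two bounds of this subsection can be evaluated with a few lines of Python. The sketch assumes that the advantage condition reads (1 − µ)^L (f0 − cLm + sP^L) > f0 and sets P = 1; the article's own expressions (16) and (19) may be written in a different or approximated form. The parameter values are hypothetical and were chosen only because they give a bound on m close to the value 1996 quoted for Figure 1.

```python
def max_molecules(s, c, mu, L, f0=1.0):
    """Upper bound on m from Q * (f0 - c*L*m + s) > f0, i.e., with P = 1."""
    q_dna = (1.0 - mu) ** L
    return (s - f0 * (1.0 - q_dna) / q_dna) / (c * L)

def max_genotypic_mutation_rate(s, L, f0=1.0):
    """Upper bound on mu, attained for u = 0 (P = 1) and c = 0."""
    return 1.0 - (1.0 + s / f0) ** (-1.0 / L)

# Hypothetical parameter values, for illustration only.
print(max_molecules(s=0.1, c=5e-5, mu=2e-4, L=1))
print(max_genotypic_mutation_rate(s=0.1, L=1))
print(max_genotypic_mutation_rate(s=0.1, L=20))
```

Comparing the last two calls illustrates how the admissible genotypic mutation rate per gene shrinks as more genes are involved.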
No such simple and precise formula exists for the phenotypic error threshold because P is a complicated function of u and m (but see Selection on mutation rates). However, an upper bound on the phenotypic mutation rate u per gene can be derived, above which the mean fitness of a cell population is less than f0 for every m. It is given by
![]() | (20) |
The optimum number of protein molecules:
Figure 1 shows that for a given phenotypic mutation rate u, there is an optimum number, mopt, of molecules to be produced. Indeed, this is intuitive because in the absence of phenotypic mutation, m = k molecules should be produced to minimize the costs. With phenotypic mutation, k or more molecules have to be produced to obtain a correctly expressed gene. We can restrict attention to phenotypic mutation rates u < umax. Since we have
by (8), and because Q is independent of m, it is sufficient to find the m that maximizes
.
Let us first assume
. Then the binomial distribution (12) can be approximated by a Poisson distribution with mean m(1 − u), and we obtain from (13)
![]() | (21) |
assumes its maximum at m = k if uk < c/s. [Note that (17) implies that
if
, so the assumption
is automatically satisfied if a set of genes can be added at all and
.] Thus, for very small phenotypic mutation rates the fitness is maximized at m = k.
If u is sufficiently large, the binomial distribution (12) can be approximated by a normal distribution. [This is accurate for all possible m if ku(1 − u) ≥ 5.] Then the very accurate approximation
![]() | (22) |
(APPENDIX B). In particular, the true mopt is always >k/(1 − u). Figure 3 displays mopt/k as a function of k for various parameter combinations. Figure 3 and (22) demonstrate that the optimum number of protein molecules to be produced is only slightly larger than k, unless u is very high or k very small. In fact, we have mopt/k → 1/(1 − u) as k → ∞; cf. the Introduction, before Equation 3. The convergence, however, is slow, so that k must be on the order of a few hundred before 1/(1 − u) becomes an accurate approximation for mopt/k. It is also important to note that mopt is independent of L and depends only very weakly on c and s; larger c/s slightly decreases mopt. The mean fitness at mopt, however, depends strongly on L, s, and c. Even though the derivation of (22) assumes ku(1 − u) ≥ 5, (22) remains accurate if uk < 1 and correctly predicts that mopt → k as u → 0. Additional numerical results (not presented) show that the relative error of the approximation (22) rarely exceeds 5% and often is much lower. Finally, we point out that the minimum possible m is only slightly smaller than the optimum m given by (22); this is best seen from Figure 1.
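The optimum can also be located by direct search, which gives a way to sanity-check any closed-form approximation. The sketch below maximizes sP^L − cLm over integer m, which suffices because Q and f0 do not depend on m. It is a brute-force illustration, not the approximation (22); the default search range of k to 2k and the example parameter values are assumptions.

```python
import math

def prob_at_least_k(m, k, u):
    """Probability that at least k of m molecules are error free."""
    if k <= 0:
        return 1.0
    if k > m or u >= 1.0:
        return 0.0
    if u <= 0.0:
        return 1.0
    log_ok, log_err = math.log1p(-u), math.log(u)
    total = 0.0
    for i in range(k, m + 1):
        log_term = (math.lgamma(m + 1) - math.lgamma(i + 1)
                    - math.lgamma(m - i + 1)
                    + i * log_ok + (m - i) * log_err)
        total += math.exp(log_term)
    return min(total, 1.0)

def optimal_m(k, u, s, c, L=1, m_max=None):
    """Integer m in [k, m_max] that maximizes s * P**L - c * L * m."""
    if m_max is None:
        m_max = 2 * k            # assumed search range; widen for large u
    best_m, best_value = k, -math.inf
    for m in range(k, m_max + 1):
        value = s * prob_at_least_k(m, k, u) ** L - c * L * m
        if value > best_value:
            best_m, best_value = m, value
    return best_m

# Hypothetical parameter values, for illustration only.
k, u = 500, 0.08
m_opt = optimal_m(k, u, s=0.1, c=5e-5)
print(m_opt, m_opt / k, 1.0 / (1.0 - u))
```

The printed ratio mopt/k can be compared with 1/(1 − u) to see how far a given k is from the large-k limit.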
Selection on mutation rates:
Here, we explore how the equilibrium mean fitness depends on the genotypic and phenotypic mutation rates. Its dependence on the genotypic mutation rate is very simple because it is proportional to (1 − µ)^L. Therefore, the larger the number of genes involved, the more advantageous is a low genotypic mutation rate. Even for a single gene, there is significant selection for reducing the genotypic mutation rate well below 10⁻² in any cell population of size 10³ or higher because selection dominates random genetic drift if population size times selective advantage exceeds ~10. With L = 20 genes, we have (1 − µ)^20 ≈ 0.99 if µ ≈ 5 × 10⁻⁴. Thus, in cell populations of size as small as 10³ there is already significant selection pressure for reducing the mutation rate below 5 × 10⁻⁴.
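The arithmetic in this paragraph is easy to reproduce. The short sketch below evaluates the factor (1 − µ)^L and the product of population size and fitness loss; the function name is illustrative, and treating 1 − (1 − µ)^L as the relevant selective disadvantage is a simplifying assumption made for this example.

```python
def error_free_dna_factor(mu, L):
    """Factor (1 - mu)**L by which genotypic mutation scales mean fitness."""
    return (1.0 - mu) ** L

factor = error_free_dna_factor(mu=5e-4, L=20)
loss = 1.0 - factor                 # relative fitness loss, about 1%
population_size = 1000
print(factor, loss, population_size * loss)
```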
In contrast, for the phenotypic mutation rate, there is hardly any selection pressure to reduce it to such low levels (even in extremely large populations). Indeed, Figure 2 shows that the equilibrium mean fitness becomes nearly independent of u as u gets smaller than ~10⁻¹. There are two reasons for this. First, a reduction in genotypic mutation rate affects fitness in a structurally different way than a reduction in phenotypic mutation rate. In general, the fitness increase caused by a reduction in µ is only weakly dependent on s because it is proportional to the (typically) much larger term f0. However, fitness changes induced by u are always proportional to s because they enter mean fitness through changes in P, the probability that at least k error-free proteins are produced; see Equation 14. The second reason is that P^L has a sigmoid, often nearly threshold-like, shape. The cumulative binomial probability P is extremely close to its maximum value 1 if the mean number of correctly produced molecules, m(1 − u), exceeds k by only a few standard deviations. Hence, if m is sufficiently large so that the binomial distribution can be approximated by a Gaussian, then the equilibrium mean fitness
approaches its maximum value as u → 0 approximately as fast as
approaches zero as x → ∞.
A simple explicit, but approximate, expression for the minimal mutation rate below which the equilibrium mean fitness is nearly independent of u can be obtained by approximating the binomial cumulative probability P by a Gaussian. To this aim, let q be a small positive number and let d denote the (1 − q)^(1/L) quantile of the standard normal distribution. Then, we have 1 − q ≤ P^L ≤ 1 if $m(1 - u) - k \ge d\sqrt{mu(1 - u)}$. Solving for u yields
![]() | (23) |
. If
for all u, for instance, because s is too small or c too large, then there will be no selection pressure at all to reduce u because activating the set of genes automatically leads to a fitness disadvantage.]
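The Gaussian argument above can be turned into a small calculation. The sketch assumes that umin is the value of u at which m(1 − u) − k equals d standard deviations √(mu(1 − u)), where d is the (1 − q)^(1/L) quantile of the standard normal distribution, and it solves the resulting quadratic for 1 − u. Whether the article's expression (23) is written in exactly this form is not confirmed here; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def u_min(m, k, q, L=1):
    """Gaussian estimate of the u below which P**L stays above 1 - q."""
    if m < k:
        raise ValueError("m must be at least k")
    d = NormalDist().inv_cdf((1.0 - q) ** (1.0 / L))
    d2 = d * d
    # With v = 1 - u, the condition m*v - k = d*sqrt(m*v*(1 - v)) becomes
    # (m + d^2) v^2 - (2k + d^2) v + k^2 / m = 0 after squaring.
    a = m + d2
    b = 2.0 * k + d2
    disc = b * b - 4.0 * a * k * k / m
    v = (b + sqrt(disc)) / (2.0 * a)   # larger root, so that m*v >= k
    return 1.0 - v

print(u_min(m=571, k=500, q=1e-4))   # compare with the value 0.0817 quoted later in the text
print(u_min(m=600, k=500, q=1e-4))
```

For m = k the estimate is zero, reflecting that no phenotypic error can be tolerated when exactly k molecules are produced.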
Figure 4 displays umin for a single gene as a function of k for several parameter combinations. It shows that umin is very small only if either k is very small or m is only slightly larger than k. The latter a priori requires small phenotypic mutation rates. It is also of interest to note that if we assume that m is a fixed multiple of k, i.e., m = ak, and let k tend to infinity, then
and umin → 1 − 1/a. Thus, if many more proteins are produced than required (a large), then umin will be close to 1 if k is large. If, on the other hand, m is not much larger than k [as suggested by expression (22) for the optimal m], then umin will be relatively small. For example, if a = 1.2 as in some of the graphs in Figures 4 and 5, then umin approaches 1 − 1/1.2 ≈ 0.17 for large k. These considerations strongly suggest that, on the basis of our model, smaller phenotypic mutation rates are not likely to evolve. It is also notable, although obvious from the derivation, how weakly umin depends on q and how much larger than sq it is under most conditions. The latter is important because for a single gene the corresponding µmin would be sq. Hence, µmin ≪ umin. For L genes, µmin would be correspondingly lower, i.e., µmin = 1 − (1 − sq)^(1/L).
The above argument, that the selective pressure to reduce the phenotypic mutation rate below umin is less than sq, depends on the assumption that m is given and constant. It does not, however, involve any costs for reducing the mutation rate. Such costs are investigated below. Theoretically, the phenotypic mutation rate could evolve to zero, or at least to much lower levels than given by umin, if m and u could be optimized simultaneously. This can be seen from Figure 1 and would correspond to evolution along the top of the (curved) ridge. Substantial bivariate optimization, and evolution to very low values of u, does not appear to be a very likely scenario because it would require extreme fine tuning of m and u. If, for given u, m is only slightly larger than mopt = mopt(u) (about 2% is sufficient), then the selective advantage to reduce u is already vanishingly small (the population is on the gentle slope, which is completely flat in the direction of increasing u). If, in contrast, only a few protein molecules fewer than mopt are produced, then the fitness decrease is substantial (the population "drops off the steep wall"). It seems questionable whether mechanisms for the required simultaneous fine tuning of both m and u exist, in particular, because mopt is a population property, not a property of the cell. It would require that a cell knows quite exactly how many error-free molecules it produces.
The critical mutation rate:
We have already derived an approximation for the phenotypic error threshold, i.e., the maximum mutation rate umax above which the set of genes cannot be maintained. Here we take a closer look at the distinctive threshold-like dependence of the equilibrium mean fitness on u (Figure 2) and investigate how it depends on m and the other parameters. This threshold-like dependence is a characteristic feature of our model and has a simple explanation. Let us approximate P by the Gaussian cumulative distribution function with mean m(1 − u) and variance mu(1 − u). Then, P switches from a value close to 0 to a value close to 1 near the mean m(1 − u). This transition occurs within about two standard deviations of the mean. For a single gene, this implies that the transition occurs if m(1 − u) ≈ k, whence u ≈ 1 − k/m follows. For the parameter values of Figure 2, a and b, this yields u ≈ 0.17, or log10u ≈ −0.78, a reasonably good approximation to the critical value ucrit, defined as the value of u at which the equilibrium mean fitness equals f0. If there is more than one gene, then the transition occurs near
and becomes sharper as L increases. Approximating P by the corresponding Gaussian cumulative distribution, i.e.,
, where
, we obtain the critical value ucrit by solving
![]() | (24) |
Figure 5 displays the critical phenotypic mutation rate ucrit (solid lines), calculated numerically by solving
, for four selective coefficients as a function of k, and compares it with the minimum phenotypic mutation rate umin (dashed lines), calculated for two choices of q. For the two smaller values of s, the curves for ucrit end when
, i.e., when expression of the set of genes causes a fitness reduction for all u.
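A numerical version of this calculation can be sketched with bisection. The code assumes that ucrit is the value of u at which (1 − µ)^L (f0 − cLm + sP^L) equals f0, with P the exact binomial tail probability; this is one plausible reading of the definition, and the parameter values are hypothetical. With m = 600 and k = 500 the result can be compared with the rough estimate 1 − k/m ≈ 0.17.

```python
import math

def prob_at_least_k(m, k, u):
    """Probability that at least k of m molecules are error free."""
    if k <= 0:
        return 1.0
    if k > m or u >= 1.0:
        return 0.0
    if u <= 0.0:
        return 1.0
    log_ok, log_err = math.log1p(-u), math.log(u)
    total = 0.0
    for i in range(k, m + 1):
        log_term = (math.lgamma(m + 1) - math.lgamma(i + 1)
                    - math.lgamma(m - i + 1)
                    + i * log_ok + (m - i) * log_err)
        total += math.exp(log_term)
    return min(total, 1.0)

def critical_u(m, k, mu, L, s, c, f0=1.0, tol=1e-10):
    """Bisection for the u at which expressing the genes stops paying off.

    Returns None if there is no advantage even at u = 0.
    """
    q_dna = (1.0 - mu) ** L

    def advantage(u):
        p = prob_at_least_k(m, k, u)
        return q_dna * (f0 - c * L * m + s * p ** L) - f0

    lo, hi = 0.0, 1.0
    if advantage(lo) <= 0.0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if advantage(mid) > 0.0:
            lo = mid                 # still advantageous: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameter values, for illustration only.
print(critical_u(m=600, k=500, mu=2e-4, L=1, s=0.1, c=5e-5))
print(1.0 - 500 / 600)
```

Bisection works here because P, and hence the advantage, decreases monotonically in u.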
The role of costs for reducing the phenotypic mutation rate:
So far, all arguments have assumed that no costs are associated with lower mutation rates. Here, we briefly explore the consequences of such costs. To illustrate the (quite obvious) effects, let us assume that the cost of producing a single protein molecule is c(1 +
/u), where
0.
Figure 6 displays the mean equilibrium fitness
as function of log10u and m with
= 0.01. Thus, the costs are slowly increasing with decreasing u. With
= 0 there are no costs for reducing the phenotypic mutation rate and Figure 1 would be obtained (except for the different scaling of the u-axis). The fitness optimum is near (m, u) = (571, 0.0805) with
(log10 0.0805 = −1.094). We note that 0.0805 is close to umin(m = 571, k = 500, q = 10⁻⁴) = 0.0817.
If the cost increase is even slower, for instance logarithmic, then the optimal u is somewhat smaller (results not shown). In the presence of costs, most of the quantities derived above are much more difficult to compute, and we have not developed an analytical theory that takes into account costs for fidelity.
The average protein lengths for E. coli, Saccharomyces cerevisiae, and Homo sapiens are 317, 496, and 499 amino acids, respectively. Let us be conservative and use 500 for the average protein length. For a 500-amino-acid-long protein and a phenotypic mutation rate of 4.5 × 10⁻⁴ mistranslations per amino acid, we will have ≈0.23 incorrectly synthesized proteins per synthesized protein. However, only a fraction of these 23% will carry amino acid substitutions that render them nonfunctional.
Exhaustive amino acid substitution assays on HIV-1 protease (LOEB et al. 1989), T4 lysozyme (RENNELL et al. 1991), and Lac repressors (MARKIEWICZ et al. 1994) showed that 59, 12, and 34% of the examined amino acid substitutions were deleterious (summarized in SAUNDERS and BAKER's 2002 Table 1). If we choose 35%, the average of these three values, as the fraction of amino acid substitutions that are deleterious, we have a phenotypic mutation rate of 0.08 deleterious mutations per synthesized protein.
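The back-of-the-envelope numbers in the last two paragraphs can be reproduced as follows. The sketch uses only the values quoted in the text (protein length 500, error rate 4.5 × 10⁻⁴ per amino acid, and deleterious fractions of 59, 12, and 34%); the function names are illustrative. It also prints the probability of at least one error per protein, which is slightly lower than the expected number of errors because a protein can carry more than one error.

```python
def expected_errors_per_protein(length, error_rate):
    """Expected number of mistranslated amino acids per protein."""
    return length * error_rate

def prob_at_least_one_error(length, error_rate):
    """Probability that a protein contains one or more mistranslations."""
    return 1.0 - (1.0 - error_rate) ** length

errors = expected_errors_per_protein(500, 4.5e-4)       # 0.225, quoted as ~0.23
deleterious_fraction = (59 + 12 + 34) / 3 / 100          # average of the assays
deleterious_rate = errors * deleterious_fraction         # about 0.08

print(errors, prob_at_least_one_error(500, 4.5e-4))
print(deleterious_fraction, deleterious_rate)
```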
Because a substantial fraction, if not the vast majority, of genotypic mutations are detrimental (KEIGHTLEY and EYRE-WALKER 1999; KEIGHTLEY and LYNCH 2003), deleterious per-locus mutation rates can be expected to be between 10⁻⁴ and 10⁻⁷ (see Introduction), where the upper bound is likely to be an overestimate. In any case, deleterious genotypic mutation rates are several orders of magnitude smaller than phenotypic ones.
In addition to the (deleterious) phenotypic mutation rate we also have to consider the number of protein molecules produced per cell. Early studies showed that only a few hundred proteins account for most of the protein content of a cell and that most of the proteins are present in low copy numbers (O'FARRELL 1975). Low copy numbers range from a few proteins per cell to several hundred. For instance, the Lac repressor, a regulatory protein, is thought to be "occurring in about ten copies per gene" (GILBERT and MULLER-HILL 1966). E. coli DNA photolyase, a DNA repair enzyme, has a copy number of ~10 to 20 molecules per cell (HARM et al. 1968). High copy number proteins can have abundances of many thousand molecules per cell (GYGI et al. 1999). A recent study shows that the costs associated with the production of protein may be substantial, and that they increase faster than linearly with the amount of protein produced (DEKEL and ALON 2005).
Our main results are, first, that there is basically no selective pressure to reduce the phenotypic mutation rate per gene below a minimum value, umin, which is rarely <0.05, and often near 0.1. This compares surprisingly well with the 8% deleterious mutations per synthesized protein calculated above.
In contrast, and despite the simplicity of the model, there is selective pressure to reduce the genotypic mutation rate to much lower levels, one order of magnitude at least. If several genes have to be expressed to increase fitness, the difference becomes larger. Second, for given parameters, there is a critical phenotypic mutation rate, ucrit, above which the fitness of the population is actually reduced if the set of genes is expressed. Unless the potential fitness increase, s, is very high relative to the costs, c, and k very small, umin is not much smaller than ucrit, in particular, if several genes are involved. Both umin and ucrit depend only very weakly on c and s. No simple formulas for ucrit and umin are available. Their (approximate) calculation involves computation of quantiles of the normal distribution. However, and this is the third result, there is a simple formula for the maximum phenotypic mutation rate, umax, above which there is a fitness disadvantage for expressing the genes under consideration for any number m of actually produced protein molecules. This can be interpreted as a phenotypic error threshold. Unless kc/s is very small or m is only slightly larger than k, umax differs from umin by less than a factor of 10. Fourth, we show that for all other parameters given there exists an optimum number of protein molecules to be produced, mopt, in the sense that the mean fitness of the population is maximized. We derive a simple and very accurate approximation for mopt. Unless the phenotypic mutation rate is very high or k is small, mopt is not much larger than k and nearly independent of the selective advantage and the costs. It is independent of the number of loci and of the genotypic mutation rate.
The formal reason for the absence of a selective pressure to reduce the phenotypic mutation rate to such low levels as that of the genotypic mutation rate is that the two types of mutation rates enter mean fitness,
in (14), in qualitatively different ways; this is discussed in Selection on mutation rates. A more intuitive reason is that as soon as only a few more than the optimum number, mopt, of protein molecules are produced, the selective pressure to reduce the phenotypic mutation rate vanishes because the function can be fulfilled anyway. In such a situation, there is, however, weak selective pressure to reduce the number of actually produced molecules m to mopt. In principle, simultaneous evolution of m and u could lead to much lower phenotypic mutation rates. However, as argued in Selection on mutation rates this would require extreme fine tuning of these processes (in particular, m has to be adjusted extremely closely to mopt) and, thus, seems unlikely. This, together with the role of genetic drift, will be the topic of future investigation.
The above results do not involve any costs for reducing the phenotypic or genotypic mutation rate. If there are costs for reducing the phenotypic mutation rate, the parameter range in which a fitness advantage can be realized by incorporating the set of genes is substantially reduced (or even annihilated if the costs are too high), and fitness is maximized at an intermediate phenotypic mutation rate. Unless the costs are high, this maximum is close to umin, as given by (23).
Our model, hence the conclusions, rests on a number of assumptions. We assumed that, if there is more than one locus, all loci are completely equivalent. In reality, this will not be the case because loci can differ in any of the parameters. It appears to be of most interest to study cases in which the number of required error-free protein molecules, k, and the actually produced number, m, vary among loci. We have not yet studied such a scenario.
Our most critical assumption concerns the dependence of fitness on the number of protein molecules produced. Many fitness functions other than our step-like function (6) are conceivable. For instance, fitness could increase smoothly as the number of error-free proteins increases. We have not studied such a scenario. However, it appears quite reasonable to assume that the performance of an at least moderately complex function requires many genes to interact in an appropriate manner. There may be many possibilities of modeling such gene interaction, but none has been studied in the present context.
The following example shows that there are fitness functions that can induce strong selection toward low phenotypic mutation rates. Assume that k error-free proteins are needed to increase fitness by s, but that cells that produce one or more erroneous molecules do not have this fitness advantage. Also assume, as in our model, costs c for producing a protein molecule. Then using the previous notation, we have
![]() | (25) |
It is trivial to show that this yields the condition
![]() | (26) |
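As a hedged illustration of why such an all-or-nothing fitness function keeps selecting for accuracy, consider the simplest case in which a cell produces exactly m = k molecules and genotypic mutation is ignored. Both simplifications are assumptions made here for illustration, and the expressions below are a sketch in the notation of the text (s, c, k, u), not necessarily the exact form of (25) and (26).

```latex
\begin{align*}
\Pr[\text{all } k \text{ molecules error-free}] &= (1-u)^{k}, \\
\text{expected net gain} &= s\,(1-u)^{k} - c\,k, \\
s\,(1-u)^{k} - c\,k > 0 \;&\Longleftrightarrow\; u < 1 - \left(\frac{c\,k}{s}\right)^{1/k}
  \qquad (\text{assuming } c\,k < s), \\
\frac{\partial}{\partial u}\left[\,s\,(1-u)^{k} - c\,k\,\right] &= -\,s\,k\,(1-u)^{k-1} < 0
  \qquad (0 \le u < 1).
\end{align*}
```

Because the derivative is strictly negative for every u < 1, the expected gain keeps rising as u falls. Under these assumptions there is therefore no positive rate below which selection for accuracy vanishes, in contrast to the behavior under the step-like function (6).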
![]() | (A1) |
No general explicit approximation for P is available. However, P can be approximated by the cumulative distribution function of the normal distribution with mean m(1 − u) and variance mu(1 − u). Therefore, P = P(m, k, u) is close to 1 (
0.97) if
and starts to decline rapidly as m becomes smaller. The inequality
is satisfied if and only if
![]() | (A2) |
If we approximate the left-hand side of (A1) by P = 1 and the right-hand side by (
+ f0µ)/s [which is accurate to order O(µ)], we obtain the desired (approximate) upper bound by solving
![]() | (A3) |
![]() | (A4) |
Numerical evaluation of the true upper bound shows that this provides an excellent approximation if k
10. By ignoring terms of order c^2 and higher, we obtain (20). In general, (20) is nearly as good as (A4), but slightly smaller. A similar procedure yields (20) if L > 1.
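The normal approximation used above can be checked numerically. The following is a minimal sketch, assuming that P(m, k, u) denotes the probability that at least k of m independently produced molecules are error-free, each with probability 1 − u. The function names `p_exact` and `p_normal` and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def p_exact(m: int, k: int, u: float) -> float:
    """Exact Pr[X >= k] for X ~ Binomial(m, 1 - u).

    Assumption: X counts the error-free molecules among m produced.
    """
    q = 1.0 - u
    return sum(math.comb(m, j) * q**j * u ** (m - j) for j in range(k, m + 1))


def p_normal(m: int, k: int, u: float) -> float:
    """Normal approximation with mean m(1 - u) and variance m u (1 - u).

    No continuity correction is applied (a simplification of this sketch).
    """
    mean = m * (1.0 - u)
    sd = math.sqrt(m * u * (1.0 - u))
    z = (k - mean) / sd
    # Upper tail of the standard normal, written with the error function.
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    k, u = 100, 0.05  # illustrative values, not taken from the article
    for m in (100, 104, 108, 112, 116):
        print(m, round(p_exact(m, k, u), 4), round(p_normal(m, k, u), 4))
```

Running the loop shows how quickly P falls away from 1 once m approaches k from above, which is the behavior the approximation in this appendix relies on.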
5, then the binomial distribution (12) can be accurately approximated by a normal distribution. By partial differentiation of (13) with respect to m, we obtain that the fitness is maximized at the largest solution m of
![]() | (B1) |
![]() | (B2) |
If A
2, which is satisfied if (approximately)
, we have erf(A)
0.995, and the terms [1 − erf(A)]^(L−1) and 2^(L−1) cancel.
Because of the rapid decline of exp(−A^2) for m > k, we can approximate
by (ku)^(1/2) and obtain an excellent approximation for the solution of (B1) by solving exp(−A^2) =
for m, where
. Ignoring terms of order u/k and smaller, we arrive at (22).
BÜRGER, R., 2000 The Mathematical Theory of Selection, Recombination, and Mutation. Wiley, Chichester, UK.
DEKEL, E., and U. ALON, 2005 Optimality and evolutionary tuning of the expression level in protein. Nature 436: 588–592.
DRAKE, J. W., B. CHARLESWORTH, D. CHARLESWORTH and J. F. CROW, 1998 Rates of spontaneous mutation. Genetics 148: 1667–1686.
EDELMANN, P., and J. GALLANT, 1977 Mistranslation in E. coli. Cell 10: 131–137.
EIGEN, M., and P. SCHUSTER, 1977 The hypercycle: a principle of natural self-organization. A. emergence of the hypercycle. Naturwissenschaften 64: 541–565.
ELLIS, N., and J. GALLANT, 1982 An estimate of the global error frequency in translation. Mol. Gen. Genet. 188: 169–172.
GILBERT, W., and B. MULLER-HILL, 1966 Isolation of the Lac repressor. Proc. Natl. Acad. Sci. USA 56: 1891–1898.
GYGI, S. P., Y. ROCHON, B. R. FRANZA and R. AEBERSOLD, 1999 Correlation between protein and mRNA abundance in yeast. Mol. Cell. Biol. 19: 1720–1730.
HARM, W., H. HARM and C. S. RUPERT, 1968 Analysis of photoenzymatic repair of UV lesions in DNA by single light flashes. II. In vivo studies with Escherichia coli cells and bacteriophage. Mutat. Res. 6: 371–385.
IBBA, M., and D. SÖLL, 1999 Quality control mechanisms during translation. Science 286: 1893–1897.
KEIGHTLEY, P. D., and A. EYRE-WALKER, 1999 Terumi Mukai and the riddle of deleterious mutation rates. Genetics 153: 515–523.
KEIGHTLEY, P. D., and M. LYNCH, 2003 Toward a realistic model of mutations affecting fitness. Evolution 57: 683–685.
LOEB, D. D., R. SWANSTROM, L. EVERITT, M. MANCHESTER, S. E. STAMPER et al., 1989 Complete mutagenesis of the HIV-1 protease. Nature 340: 397–400.
MARKIEWICZ, P., L. G. KLEINA, C. CRUZ, S. EHRET and J. H. MILLER, 1994 Genetic studies of the lac repressor. XIV. Analysis of 4000 altered Escherichia coli lac repressors reveals essential and non-essential residues, as well as "spacers" which do not require a specific sequence. J. Mol. Biol. 240: 421–433.
O'FARRELL, P. H., 1975 High resolution two-dimensional electrophoresis of proteins. J. Biol. Chem. 250: 4007–4021.
RENNELL, D., S. E. BOUVIER, L. W. HARDY and A. R. POTEETE, 1991 Systematic mutation of bacteriophage T4 lysozyme. J. Mol. Biol. 222: 67–88.
SAUNDERS, C. T., and D. BAKER, 2002 Evaluation of structural and evolutionary contributions to deleterious mutation prediction. J. Mol. Biol. 322: 891–901.
SCHUSTER, P., and W. FONTANA, 1999 Chance and necessity in evolution: lessons from RNA. Physica D 133: 427–452.
SHAW, R. J., N. D. BONAWITZ and D. REINES, 2002 Use of an in vivo reporter assay to test for transcriptional and translational fidelity in yeast. J. Biol. Chem. 277: 24420–24426.
SNIEGOWSKI, P. D., P. J. GERRISH, T. JOHNSON and A. SHAVER, 2000 The evolution of mutation rates: separating causes from consequences. BioEssays 22: 1057–1066.
SPRINGGATE, C. F., and L. A. LOEB, 1975 On the fidelity of transcription by Escherichia coli ribonucleic acid polymerase. J. Mol. Biol. 97: 577–591.
STURTEVANT, A. H., 1937 Essays on evolution. I. On the effects of selection on mutation rate. Q. Rev. Biol. 12: 467–477.
THOMAS, M. J., A. A. PLATAS and D. K. HAWLEY, 1998 Transcriptional fidelity and proofreading by RNA polymerase II. Cell 93: 627–637.
THOMPSON, C. J., and J. L. MCBRIDE, 1974 On Eigen's theory of the self-organization of matter and the evolution of biological macromolecules. Math. Biosci. 21: 127–142.
WITHEY, J. H., and D. I. FRIEDMAN, 2002 The biological roles of trans-translation. Curr. Opin. Microbiol. 5: 154–159.
Communicating editor: M. K. UYENOYAMA