The Evolution of Mutator Genes in Bacterial Populations: The Roles of Environmental Change and Timing
Mark M. Tanaka, Carl T. Bergstrom, Bruce R. Levin


Recent studies have found high frequencies of bacteria with increased genomic rates of mutation in both clinical and laboratory populations. These observations may seem surprising in light of earlier experimental and theoretical studies. Mutator genes (genes that elevate the genomic mutation rate) are likely to induce deleterious mutations and thus suffer an indirect selective disadvantage; at the same time, bacteria carrying them can increase in frequency only by generating beneficial mutations at other loci. When clones carrying mutator genes are rare, however, these beneficial mutations are far more likely to arise in members of the much larger nonmutator population. How then can mutators become prevalent? To address this question, we develop a model of the population dynamics of bacteria confronted with ever-changing environments. Using analytical and simulation procedures, we explore the process by which initially rare mutator alleles can rise in frequency. We demonstrate that subsequent to a shift in environmental conditions, there will be relatively long periods of time during which the mutator subpopulation can produce a beneficial mutation before the ancestral subpopulations are eliminated. If the beneficial mutation arises early enough, the overall frequency of mutators will climb to a point higher than when the process began. The probability of producing a subsequent beneficial mutation will then also increase. In this manner, mutators can increase in frequency over successive selective sweeps. We discuss the implications and predictions of these theoretical results in relation to antibiotic resistance and the evolution of mutation rates.

IN the face of fluctuating environments, organisms that reproduce asexually have a reduced capacity to generate variation compared to sexual organisms. Bacteria can occasionally undergo genetic exchange due to the wide variety of accessory genetic elements available to them: integrons and gene cassettes, transposons, plasmids, and phages. These elements allow bacteria to exchange genes not only within the narrow bounds of a given species, but also across far wider phylogenetic divides than those typically bridged by sexual eukaryotes (Levin and Bergstrom 2000; Ochmanet al. 2000). As the rapid spread of antibiotic resistance by horizontal transfer so strikingly and frighteningly illustrates, bacteria can and often will acquire beneficial genes from external sources, so long as these genes exist somewhere in the prokaryotic world (Levin and Bergstrom 2000). We have suggested elsewhere that spatially and temporally varying environments may be the driving forces behind the evolution and maintenance of many mobile genetic elements (Bergstromet al. 2000).

What happens when bacteria are faced with novel or fluctuating environmental conditions and no external sources of beneficial genes are available? In such circumstances, evolution must resort to a second source of variation: de novo generation by mutation. But elevated mutation rates come at a cost: they create deleterious changes in the genome. In fact, the classical theory of mutation rate evolution concluded that in sufficiently stable environments, mutation rates evolve to be quite low (Kimura 1960, 1967; Leigh 1970, 1973; Eshel 1973a,b; Painter 1975; Gillespie 1981; Liberman and Feldman 1986; Altenberg and Feldman 1987). In general, we do indeed observe low mutation rates and elaborate mechanisms of DNA repair, which presumably evolved to keep mutation rates low (Drakeet al. 1998; Miller 1998; Sniegowskiet al. 2000). And, until recently, all was in place: intuition, formal theory, and experimental observations all supported the proposition that although higher rates of mutation can appear occasionally, mutation rates in bacterial populations would remain low and mutator alleles (alleles elevating the genomic mutation rate) would not be selected.

In recent years, however, researchers have begun to uncover high frequencies of mutator genes in natural or clinical (LeClercet al. 1996; Maticet al. 1997; Bucciet al. 1999; Oliveret al. 2000; Björkholmet al. 2001) and experimental (Maoet al. 1997; Sniegowskiet al. 1997; Funchainet al. 2000; Giraudet al. 2001) bacterial populations. What is happening in these populations? A broad but not mechanistic answer from the classical theory is simply that those populations probably faced environments that fluctuated rapidly enough to favor mutator genes and that such fluctuations may be very common for bacteria. These observations prompted theorists to revisit the subject of bacterial mutation rates to understand it in closer detail (Taddeiet al. 1997; Gerrish 1998; Gerrish and Lenski 1998; de Visseret al. 1999; Fieldet al. 1999; Johnson 1999a,b; Tenaillon et al. 1999, 2000; Masel and Bergman 2003). We propose an alternative to the models developed in this recent renaissance of mutator gene evolution theory and offer a resolution to the following problem.

Figure 1.

—Timing of events during multiple sweeps: simulation with parameters from Table 1. Top, frequencies of mutators (solid line) and nonmutators (dashed line) over time. Thick vertical lines indicate environmental shifts. Bottom, frequencies for the same run of the parent subpopulations from which mutations arise. Thin lines, “unevolved” lineages; thick lines, evolved lineages. Again, solid lines are for mutators and dashed lines for nonmutators. The dynamics are described in Details of the simulation in the appendix.

Both experiments and theory suggest that the fate of mutator alleles will depend critically on their initial frequencies. Whatever their long-term advantages may be, mutators will be at a disadvantage when initially rare. Cox and Gibson (1974), Chao et al. (1983), and Chao and Cox (1983) offer a simple and elegant explanation. For example, consider a rare mutator allele in a large bacterial population. Suppose the mutator gene increases the mutation rate substantially (say, 100-fold) but is initially present at a very low frequency (say, 1 × 10-5). In such a population, beneficial mutations will first appear in a nonmutator individual 99.9% of the time simply because the latter are present in vastly greater numbers. Furthermore, the mutator subpopulation will take, on average, 1000 times as long as the larger nonmutator subpopulation to acquire the same beneficial mutation. Experimental studies confirm this basic intuition: in Escherichia coli populations exposed to novel environmental conditions, mutators are unable to invade when rare (Cox and Gibson 1974; Chao and Cox 1983; Chaoet al. 1983; Giraudet al. 2001).

Given this daunting accord between theory and observation, how do mutators ever reach high enough frequencies to be detected, let alone to have an appreciable effect on the process of bacterial evolution in response to environmental variation? In this study we show that one answer lies in the relative timing of the generation of beneficial mutations on mutator and nonmutator backgrounds. While the small mutator subpopulation is unlikely to be the first to generate a beneficial mutation, it is likely to generate this beneficial mutation before most similarly sized subsets of the nonmutator subpopulation. This relative timing advantage turns out to be sufficient to allow mutators to ascend in many cases. We begin with a simulation model in which the environment changes stochastically at a low rate (simulation model). In analytical model we present an analytical approximation to this first model, which examines the details of a single selective sweep. We then discuss the implications of this theoretical study for the problems of acquired antimicrobial resistance and how this theory can be tested experimentally.


We begin with a simulation model of the population dynamics of mutator genes. Consider a large population of bacteria, composed of both nonmutator and mutator cells. The mutator type is rare; it starts at a frequency of 10-5. Although this may seem like a high frequency of mutators (amounting to 105 mutator cells in a population of 1010), it is a reasonable number at mutation-selection balance (Johnson 2000).

Both the mutator and nonmutator strains can adapt or evolve to the current environmental conditions by generating a beneficial mutation. Prior to a shift in environmental conditions, we assume that there is a single advantage-conferring phenotype that the nonmutator and mutator subpopulations have the opportunity to acquire through mutation. Mutator cells generate the mutation at a higher rate (μm) than nonmutator cells (μn). This gives us four possible genotypes: “unevolved nonmutators,” “unevolved mutators,” “evolved nonmutators,” and “evolved mutators.” We track the number of cells of each genotype over time t, labeling these n(t), m(t), N(t), and M(t), respectively.

The advantage conferred by the beneficial mutation is expressed as a growth advantage b relative to cells without that mutation. We assume bacteria with mutator genes have an intrinsic disadvantage (s; because of the higher rate of generating deleterious mutations) both in the ancestral state and when carrying the beneficial mutation. Thus evolved mutators have the growth advantage b - s over unevolved nonmutators. With probability 0.002 per generation, the environment changes; each time this occurs, the evolved lineages lose their fitness advantage (and mutators retain their fitness costs). We provide the details of the dynamics in the appendix, in Details of the simulation.

View this table:

Parameter values used in calculations

In Figure 1, we follow the changes in the nonmutator and mutator populations during the course of a simulation in which the mutators eventually dominated the population. The parameter values used for this simulation are presented in Table 1 and are based on the estimates provided in Chao and Cox (1983). In this run, the first beneficial mutation occurred in the nonmutator subpopulation and, while that mutant began to rise in frequency, the mutator subpopulation declined slowly due to its intrinsic disadvantage. Before the first environmental change, a beneficial mutant appeared in the mutator population, so that by the time of the environment shift, the frequency of the mutator was boosted higher than its initial frequency. Consequently, the probability of a beneficial mutation occurring in the mutator subpopulation increased dramatically. Environmental changes continued to occur, and each time the mutators acquired a beneficial mutation, their absolute numbers as well as their frequency increased; with each increase in size there was a corresponding increase in the probability of acquiring another beneficial mutation. Eventually, the size of the mutator subpopulation increased to the point at which, because of the higher mutation rate, this subpopulation became more likely than the nonmutator subpopulation to acquire a beneficial mutation following an environmental change.

While the mutator population eventually dominated in the simulation depicted in Figure 1, success of the mutator gene occurred in only ∼20% of the 1000 simulation runs performed with this parameter set and initial conditions. In 69% of the runs the subpopulation bearing the mutator gene declined to a frequency of 10-10 before the end of the simulation at generation 2500. The mutator persisted without dominating the population in the remaining 11% of runs. Of course, if the mutator-bearing subpopulation were initially larger, the frequency of successful outcomes would have been greater. Conversely, if the mutator subpopulation had been smaller, then even fewer runs would have ended with mutators present at high frequency.

View this table:

Essential notation used in analytical model

Several features of the trajectory shown in Figure 1 are common to virtually all trajectories in which mutators reached high frequencies. First, the mutator need not be the first to generate the benefical mutation. It just has to acquire that mutation before its frequency declines to the point where it is too rare to get the beneficial mutation at all. Second, the mutator subpopulation reaches high frequency not in a single selective sweep, but rather in a series of smaller steps, during the course of a number of changes in environmental states. Third, although the mutator can climb in frequency in steps, we note that it occasionally declines in frequency.


To gain a more detailed understanding of the dynamics of this process, we construct a simple analytical model that focuses on the dynamics observed during a single environmental cycle as simulated in the previous section. While this approach will not capture everything about the full process as it extends over a series of environmental changes, it will be helpful in characterizing the probability that a mutator phenotype ascends. Our notation is summarized in Table 2 and the process is diagrammed in Figure 2. As in the simulation, we assume the population is composed of nonmutator and mutator cells. The major differences between the simulation and this analytical model are that in the latter (i) time is continuous rather than discrete, (ii) the number of unevolved nonmutator cells n is approximated as constant, and (iii) the numbers of evolved mutators and nonmutators (M and N) increase exponentially.

Figure 2.

—Timing of events during a single selective sweep. A typical time cycle is shown, in which the evolved nonmutator has arisen before the evolved mutator. (A) The case where nμnfn < 1. (B) The case where nμnfn > 1. The dashed lines indicate the trajectories of mutators that would exactly recover their original density by the end of the time cycle. That is, these are mutator lineages that “break even.” See text and Table 2 for descriptions of notation and time intervals.

We limit our analysis to the period of time beginning with the occurrence of an environmental shift and ending once a beneficial mutant from the nonmutator pool has exponentially increased in number (at rate b) until it reaches size n, at time t*. We call this period a time cycle. The end of the time cycle here corresponds in the simulation model to the time at which the evolved nonmutator reaches an appreciable fraction (e.g., one-half) of the total population. Concomitantly, the unevolved nonmutators slowly decline in the simulation; therefore, as an approximation in this section, we let n remain constant during the time cycle. In Definition of time cycle (appendix) we justify this assumption by showing that the number of unevolved nonmutators stays large (>n/2) throughout the time cycle. Hence, there is no need to normalize the dynamics to keep the population size constant, and the selective advantages enjoyed by the evolved types are the fitnesses relative to the unevolved nonmutator.

Rate of appearance of successful mutants: Let λn be the rate at which beneficial mutations destined to survive drift appear in the subpopulation of unevolved nonmutators. Similarly, λm is the rate at which beneficial mutations surviving drift appear among unevolved mutators. On average, λn=nμnfn (1) and λm=meμmfm, (2) where fn is the probability of fixation of each new evolved nonmutator at the time of its appearance, fm is the probability of fixation of each new evolved mutator, and me is the effective population size of mutators during a time cycle of average length. We postpone discussing our choice of me until after describing the dynamics of the various cell types.

We use the probability of fixation of a mutant as approximated by diffusion theory (Kimura 1983), 1e2α1e2Nevα, (3) where α is the fitness advantage of the mutant over the dominant strain in the population and Nev is the variance effective population size. Because the number of unevolved nonmutators (n) is set to be large throughout the time cycle, α= b in the case of evolved nonmutators, and α= b - s for evolved mutators. For bacterial cells, which replicate by binary fission, the distribution of the number of offspring is 0 or 2 with equal probability, the variance in offspring number is 1 (Johnson and Gerrish 2002), and Nev equals the very large total number of cells. Hence, as reasonable approximations we can write fn=1e2bfm=1e2(bs). (4)

Dynamics: Let W be the time at which the first evolved nonmutator that avoids immediate loss while rare appears in the population, and let X be the time at which the first evolved mutator avoiding loss appears. We model the appearance of the beneficial mutation in the nonmutator subpopulation in two different ways depending on the magnitude of the supply of successful beneficial mutations per generation. If λn = nμnfn < 1 (case 1), the waiting time for a mutation that survives drift is an exponentially distributed random variable. Alternatively, if λn = nμnfn > 1 (case 2), the nonmutator subpopulation immediately generates multiple (nμnfn) beneficial mutations that avoid loss by drift. In this case, W = 0. Evolved mutators that survive drift are assumed to proliferate at rate b - s until the time cycle ends.

Case 1: nμnfn1 < 1. We assume the waiting times until successful mutants arise are distributed as follows: W ∼ Exp(λn) and X ∼ Exp(λm). The dynamics are described by m(t)=m0est (5) N(t)={eb(tW)ifWt0if0t<W} (6) M(t)={e(bs)(tX)ifXt0if0t<X.} (7) We note that this system slightly underestimates the rate of increase of successful mutants because in the cases where the beneficial mutation rises faster than expected by chance due to drift, the allele is more likely to persist (Otto and Barton 1997).

Case 2: nμnfn > 1. Here, multiple beneficial mutations arise in the first generation and survive drift. In fact, there is an ongoing supply of successful beneficial mutants after the first generation. We can approximate the dynamics of N with the equation dNdt=bN+nμnfn. (8) Using the initial condition N(0) = nμnfn, the solution to this differential equation is N(t)=nμnff(1+1b)ebtnμnfnb, (9) which we use to replace Equation 6 in this case. The appearance and increase of evolved mutators is modeled in the same way as in case 1.

We stress that the frequencies of the evolved cells will not always change in exponential fashion. However, if the evolved mutator arises after this exponential period, this will inevitably lead to the dramatic loss of mutator numbers. In this case we can thus assume that M is zero at the end of the cycle. Alternatively, if the evolved mutator arises before the evolved nonmutator, the approximations may not be an adequate description of the subsequent dynamics, but this is not a concern because at that point, we already know what we wanted to find out—that the mutator has ascended.

We can now return to the effective population size of unevolved mutators me, which we model by using the harmonic mean of m(t) [with m(t) as given by Equation 5] over the time cycle of average length (): me=m0stest1. (10) We use me instead of the initial number of mutators m0 because m(t) declines over time, along with the supply rate of beneficial mutations among mutator cells. It is a conservative assumption in that it does not favor mutators.

The average length of a time cycle is the average time it takes for the successful beneficial mutation to arise [E(W) = 1/λn] plus the time it takes for that mutation to reach size n. That is, t={1nμnfn+ln(n)bifnμnfn<11b[ln(1+bμnfn)ln(1+b)]ifnμnfn>1.} (11)

Log change in number of mutators: We now turn to the statistical properties of the gain made by the mutator subpopulation during the course of a given selective sweep. Define R = ln(m(t*)) - ln(m0); this is the change in frequency of the mutator over the course of a single time cycle, as measured on a logarithmic scale. If for a given cycle R > 0, the mutator has increased in frequency. Using the waiting time distributions of W and X, we can derive the probability density function of R (see the appendix: The distribution of R).

The expected log change in mutator number is E(R)=(bs)[t(1λm+v)], (12) where is the average length of the time cycle, as given above, and where is the time from the appearance of the evolved mutator until the end of a cycle such that the mutator strain would exactly recover its original population size. From (7) and Figure 2 we can write = ln(m0)/(b - s). The quantity (1/λm + ) is the average time for an evolved mutator to appear plus the amount of time necessary for the mutator to break even. For the parameter values considered (Table 1), this quantity is ⪢ because on average the mutator generates the benefical mutation much later than would be necessary for the mutator to break even. Hence, with b - s > 0 by assumption, E(R) will almost always be negative. Should the average time for the evolved mutator to appear exactly equal the time required for an evolved mutator to break even, then we would have E(R) = 0. When mutators are rare, if we consider only the mean change in mutator population size, mutators face a dire fate. Figure 3 shows E(R) for the parameter values in Table 1 as a function of the initial mutator subpopulation size m0.

Figure 3.

— The expected log increase E(R) of the mutator over the course of a selective sweep, as a function of the initial size of the mutator subpopulation m0. The function is given by Equation 12. The parameter values used for this function are given in Table 1.

Probabilities of increase and jackpot: While the average change in frequency, E(R), suggests a general failure of the mutator to establish itself, if we examine the probability of success, a different story emerges. To illustrate this, we consider two probabilities. Let PI = Pr(R > 0) be the probability that the mutator has increased in number after the selective sweep, and let PJ = Pr(R > rJ) be the probability that the mutator has increased to at least the “jackpot size” by the end of the selective sweep. We define the jackpot size, J, as the population size of mutators such that each of the two subpopulations, mutator and nonmutator, has an equal chance of producing the next beneficial mutation. In other words, the jackpot size is m such that λmn; so J = (nμn fn (est¯ - 1))/(st¯ μmfm) and rJ = ln(λnm).

These two probabilities are easily computed from the cumulative density function (CDF) of R, given in The distribution of R in the appendix. Generally, Pr(R>r)=1Pr(X>W)eλm(wvr(bs)), (13) where in case 1 (nμnfn < 1) Pr(X > W) =λn/(λm + λn) and = ln(n)/b, and in case 2 (nμnfn > 1) Pr(X > W) = 1 and is as given in the second part of Equation 11. Note that is the duration of time from the appearance of the evolved nonmutator until the end of a time cycle.

We next explore how PI and PJ are affected by key parameters. Figure 4 shows these two quantities as functions of the initial number of mutators m0, the fitness advantage of the beneficial mutation b, and the mutator strength gmn. We used two different values for n, requiring the two alternative cases: when n = 108, nμnfn < 1 (Figure 4, left column), and when n = 1010, nμnfn > 1 (Figure 4, right column). To check the analytical model against the simulation model, we ran the simulation multiple times over a single time cycle and kept track of the fraction of times the mutator succeeded. The simulation was run for many different parameter values as shown in the figure.

In natural populations, the frequency of mutators can be very low if selective sweeps in nonmutators periodically eliminate mutators. On the other hand, if mutation rates toward mutators are high, the mutator frequency can be quite high: as high as 2% at “mutation-indirect selection balance” (Johnson 2000). These considerations, along with uncertainties inherent in measuring these mutation rates, suggest that a broad range of mutator frequencies should be considered: we use frequencies of ∼10-6 to ∼10-2.

A comparison between Figure 3 and the top two graphs of Figure 4 shows that while the expected log gain of the mutator strain can be extremely low when its initial numbers are low, the probability of making a gain is nonnegligible for exactly the same parameter values. Indeed, when n = 108, for a large proportion of the time the mutator increases, it also hits the jackpot. Although each particular mutator cell is overwhelmingly likely to be lost, the probability that at least one mutator cell in that subpopulation will succeed is substantial. We note the distinction between rarity in numbers and rarity in frequency. In our model, mutators are more likely to produce beneficial mutations when they are in large numbers, even when their frequency is low in a large total population. However, mutators also benefit from being relatively common because nonmutators are then less able to drive them out though selective sweeps.

Curiously, the success of mutators is not greatly affected by the fitness advantage b conferred by the mutations that they facilitate (see middle two graphs of Figure 4). This relative insensitivity results from the balance of two countervailing factors. On the one hand a rare mutator is better able to survive genetic drift when b is high, i.e., when beneficial mutations have a large effect. On the other hand, the larger the effect of these beneficial mutations, the faster the evolved nonmutator is able to sweep through the population—and thus the shorter the time window during which the mutator has the opportunity to generate its own beneficial mutation.

Figure 4.

—Comparison between simulation and analytical approximation. Left column, n = 108 (nμnfn < 1); right column, n = 1010 (nμnfn > 1). The vertical axis represents the probability that by the end of a time cycle the mutator strain has increased in frequency (PI) or hit the jackpot or better (PJ). Each point represents 1000 simulations. The variable g is the mutator strength, i.e., μmn, which is varied while μn is kept constant. Apart from the varied parameters, the parameter values are given in Table 1.

In our model, the strength of the mutator gmn has the effect of monotonically increasing mutator success. We note, however, that this effect is an artifact of the way our model is set up. In reality, increasing g would also increase the deleterious effects of the mutator (i.e., s would also increase), so that we would expect something like a bell-shaped function, as discussed by Orr (2000). An extension of this model should take this phenomenon into account by modeling the deleterious effects explicitly.

In this section we have analyzed only the dynamics that occur during the period where the beneficial mutation has yet to reach fixation in the population. By doing so, we have been able to estimate the probabilities that mutators make a net gain following an environmental shift. This allows us to explain the puzzles with which we started: how the mutators often manage to increase in frequency despite the dual challenges that they are not the first to acquire the beneficial mutation, and that in the expectation, they will undergo a large decline in frequency. We have not treated the dynamics that occur subsequent to fixation of the beneficial allelle (on whatever combination of mutator or nonmutator background), as this phase of the dynamics is not central to explaining these puzzles. Nonetheless, the reader may note that if the cost of accelerated mutation rate s is sufficiently small and the rate of environmental change is not too slow, genotype frequencies will not change appreciably once the beneficial mutation has swept through the population on any combination of mutator and nonmutator background. In this case, the dynamics simulated in simulation model will be closely approximated by the dynamics treated here, followed by a period of no change in genotype frequencies that extends until the next environmental shift.


Mutators can succeed: We have demonstrated that in continually changing environments where the adaptation to any given environmental state is independent of the adaptation to other states, mutator genes can climb to prominence by associated selection (i.e., by hitchhiking) with the linked beneficial mutations they generate. We have shown that even when mutators are extremely rare, there is a substantial probability that they will increase in frequency. We see this nonequilibrium analysis as a reminder that the average behavior of a process is sometimes an inadequate descriptor of the system as a whole. Although mutators are often lost in our model, in reality there would be recurrent mutation to the mutator state, and even if they are occasionally lost, organisms carrying a mutator gene would eventually dominate. “Eventually,” however, could be a very long time so that we should be unsurprised to sometimes observe low mutator frequencies even in systems that do conform to the assumptions of our model.

While our model answers the questions we raised, we recognize its limitations. First, we have not considered a number of phenomena that would also influence the dynamics of this process. Most prominent of these are multiple mutations in a given environment (Taddeiet al. 1997; Tenaillonet al. 1999), clonal interference (Gerrish and Lenski 1998), and genetic exchange (Johnson 1999b; Tenaillonet al. 2000). We discuss some of these issues in more detail below. We expect that our qualitative conclusions about the role of timing of mutations in the success of mutator genes will hold with more general variants of this model.

Second, it is not clear how commonly environmental changes and fitness relationships of the type considered in this model occur in natural populations of bacteria. How often does a mutation augment fitness in one environment while remaining selectively neutral or deleterious in another? Whatever the natural prevalence of such scenarios, we suggest that the all-too-common human intervention of antibiotic treatment and prophylaxis approximates the scenario that we have modeled. The necessary conditions are as follows: (1) the intensity of antibiotic-mediated selection is less than absolute (resistant bacteria have an advantage but sensitive bacteria can also grow); (2) resistance can arise by single mutations; (3) different antibiotics are used sequentially; and (4) resistance to one drug is independent of resistance to other drugs in the sequence. Other environmental conditions may meet these criteria as well. How robust our results are to deviations from the scenario considered in this particular incarnation of the model is a question that can and should be addressed theoretically. As we consider later, the properties of this model lead to predictions that can be tested empirically.

Third, while we have modeled environmental changes, we have not explored the effect of the rate of these changes. Classical theory (Kimura 1967; Levins 1967) has suggested that in the long run this rate should affect the success of mutators. When the rate is too low, the opportunities for creating innovations vanish, and low mutation rates are favored. When the rate of environmental fluctuation is too high, rare mutators will never have the chance to increase in frequency. According to these models, increased mutation rates are favored when the rate of fluctuation is appropriately matched to the dynamics of fixation. In our model, the times between environmental shifts are long enough to allow roughly one fixation event. The consequences of this assumption should be examined further in a more general model. For further discussion, see Travis and Travis (2002). Again, we do not know to what extent these conditions occur in nature.

Comparisons with other models: In this study we have focused on how mutator genes rise in frequency when they are initially rare. An alternative explanation for this has been provided by Taddei et al. (1997) and Tenaillon et al. (1999). In their model, mutators ascend in frequency because they increase the probability of producing adaptive characters that require multiple mutations. The way this process works can be seen with a simple thought experiment in which two mutations are acquired in rapid succession. Suppose that, in a population of 1010 bacteria, the frequency of mutators is 10-5 and the more-fit phenotype requires two independent mutations, each occurring at rate 10-8 in the nonmutator subpopulation and 10-5 in the mutator subpopulation. Assuming that neither mutation is present in the initial population (an assumption that may be unlikely, depending on the nature of the mutation in question), we see that—despite the 105-fold difference in their population sizes—the mutator subpopulation will generate the two adaptive mutations at ∼10 times the rate of the nonmutator subpopulation: (105 × 10-5 × 10-5)/(1010 × 10-8 × 10-8) = 10. Here, we emphasize that mutators are able to succeed even in cases where only a single genetic change is required for the adaptive phenotype. The mutator subpopulation does not have to be the first to acquire the adaptive phenotype, so long as it acquires this phenotype substantially sooner than the nonmutator subpopulation relative to its initial population size.

Although their study was experimental rather than theoretical, Mao et al. (1997) also present an alternative model for the ascent of mutator genes when they are initially rare. How that model works can also be seen with a numerical thought experiment. Assume complete selection for resistance to an antibiotic (no sensitive cells survive) and mutation rates of 10-8 and 10-5 for nonmutators and mutators, respectively. If a population of 1010 cells are plated on agar containing an antibiotic, there would be ∼100 resistant nonmutator colonies (1010 × 10-8) and 1 mutator colony (105 × 10-5). Thus, in one passage, if all colonies were of the same size, the frequency of mutator cells would have gone from 10-5 to 10-2. In a second passage with another antibiotic with the same total population size and mutation rates for resistance, the community would be dominated by bacteria carrying mutator genes. That is, there would once again be ∼100 colonies of resistant nonmutators (1010 × 10-8) and 1000 (108 × 10-5) colonies of resistant mutators. The model considered here is similar (and was in fact inspired by their experiments) except that in the present model, selection for resistance is not absolute.

In their study of bacterial population genetics, Gerrish and Lenski (1998) examined, among other things, the substitution rate of beneficial mutations in an asexual population. Their model takes into account multiple beneficial mutants that compete with each other and hinder the fixation process—an effect they call clonal interference. Of particular relevance to this investigation is their finding that while the substitution rate increases with mutation rate, this increase slows down due to clonal interference. As a result, mutation rates are not expected to increase indefinitely. A law of diminishing returns holds. The Gerrish and Lenski analysis also considers a distribution of values for the selective advantage of mutants, which we do not address here.

Frequency-dependence and dependence on initial conditions: Our theoretical model is very similar to the experimental models of Chao and Cox (1983) and Chao et al. (1983), which consider single adaptive changes. In both systems, the success of mutators depends on the initial frequency of mutators in the population. Moreover, in the data of Chao et al. there appears to be a threshold frequency (sometimes called the Edge of Chao’s), above which fixation of mutators will take place, and below which mutators will be lost. This phenomenon should not be confused with frequency-dependent selection in the classical sense, as defined, for example, by Lewontin (1958) or Levin (1988). Under disruptive frequency-dependent selection, a frequency exists below which fixation is impossible, and above which fixation is certain.

In the case of mutator dynamics, no systematic force produces a threshold of this kind. The apparent threshold observed in the earlier empirical studies can be explained by models such as ours, in which the probability of mutator success as a function of the logarithm of initial mutator frequency is sigmoidal. When considering a very wide range of initial frequencies the transition between failure and success may seem sharp when in fact it is not. Another way of putting it is that if we consider the probability of increase per cell, then this probability does not change much over frequency. In fact, it will decrease slightly with increasing mutator frequency, because increasing the mutator frequency also increases the number of able competitors.

Predictions: At this juncture we cannot say very much about the generality of this model for the evolution of mutator genes or the prevalence of the necessary conditions in natural populations of bacteria. We can say, however, that this model has the considerable virtue of making specific predictions that can be tested empirically. One way to test the quantitative as well as the qualitative predictions derived from the analysis of this model is with experimental populations of bacteria containing predetermined numbers of mutator and nonmutator cells already well adapted to chemostat or serial transfer culture. Using antibiotics or other bacteriocidal or bacteriostatic chemicals at concentrations below the minimum inhibitory concentrations for a population of mutators and nonmutators, the sensitive bacterial population would be able to grow, and selection would favor resistant mutants. By manipulating the concentration of the antibiotics, the intensity of this selection could be varied. By transferring washed cells into fresh medium with different antibiotics or other chemicals that are detrimental to bacteria, it would be possible to mimic the conditions specified in the model for the ascent of mutator genes. With independent estimates of the mutation rates to resistance to the chemicals and the intensity of selection, one should be able to use the model to predict (1) the probability that the mutator will ascend and (2) the dynamics of its ascent. We could follow the latter using mutator strains with selectable markers.

Another prediction of this model (and other similar models) could be evaluated retrospectively as well as experimentally. If antibiotic-mediated selection of the sort considered in this model is the reason for the ascent of the mutator strain, then there should be a positive association between antibiotic resistance and mutation rates. In fact, the mutator strain is more likely than the nonmutator strain to be multiply resistant. The findings of Oliver et al. (2000) with regard to Pseudomonas aeruginosa in cystic fibrosis patients are suggestive of this effect. Other models (Taddeiet al. 1997; Tenaillonet al. 1999) are also consistent with this observation.

Why mutation rates are low: It is possible that a broad array of environmental conditions favoring the evolution of higher mutation rates exists in nature. And, as seen above, it is relatively easy to produce models to account for the evolution of genes that increase the overall mutation rate. The next question that must be addressed is why mutation rates in bacterial populations are as low as empirical studies have found them to be. We see four general hypotheses for the relative rarity of mutator genes in natural populations of bacteria:

  1. While their form may vary, almost all of the models for the evolution of increased mutation rates require continuous adaptation to novel or varying environmental conditions. It may well be that the natural habitats that bacteria face are relatively stable. Novel conditions and the environmental changes may be relatively rare. Defective mismatch repair and other mutator genes do not ascend because of their intrinsic disadvantage.

  2. As a consequence of intense bottlenecks, bacteria with high rates of mutation randomly fix deleterious alleles and decline in frequency. For clever experimental demonstrations of this effect in bacteria, see Funchain et al. (2000) and Andersson and Hughes (1996).

  3. For selection to favor mutator genes, it is essential that the mutator and the beneficial mutations it generates maintain their linkage relationship. Consequently, if recombination occurs frequently enough, it would break down these essential linkage relationships and the frequency of the mutator gene will wane (Leigh 1970; Tenaillonet al. 2000).

  4. Bacteria switch from habitat to habitat. While populations of bacteria carrying mutator genes may adapt more rapidly to a particular habitat than populations with lower rates of mutation, that evolution may force these mutator populations into a rapid overspecialization in one habitat and a corresponding disadvantage in another (Giraudet al. 2001).


Details of the simulation: The population sizes of the four types at time t are as follows: n(t) and m(t) are the numbers of unevolved nonmutators and unevolved mutators, respectively, while N(t) and M(t) are the numbers of evolved nonmutators and evolved mutators, respectively. Let the total population size be constant. Cells do not mutate from the nonmutator to the mutator phenotype, or vice versa.

At each time step (generation), a beneficial mutation may arise in each subpopulation. We calculate the probabilities of these events as follows. In the case of the nonmutators, the per-cell mutation rate per generation is μn, so the number of mutations approximates a Poisson distribution with parameter μnn(t). Similarly, the number of mutations that occurs in the mutator population is approximately Poisson with parameter μmm(t). Selection is included as indicated in the main text. Beneficial mutants that newly arise are subject to loss by genetic drift. In each generation, the number of evolved mutators that pass to the next generation is sampled using a Poisson distribution with parameter (1 + b - s)M(t). Similarly, the number of evolved nonmutators that form the next generation is found by a Poisson distribution with parameter (1 + b)N(t).

The equations describing the simulation are therefore Kn(t+1)=n(t)Vn(t)KN(t+1)=N(t)+Vn(t)Km(t+1)=(1s)m(t)Vm(t)KM(t+1)=M(t)+Vm(t), (A1) where N(t)Poisson[(1+b)×N(t)]M(t)Poisson[(1+bs)×M(t)]Vn(t)Poisson[μnn(t)]Vm(t)Poisson[μmm(t)], and where K is the normalizer given by the sum of the right-hand sides of Equations A1 divided by the total population size (n(t) + N(t) + m(t) + M(t)), which we held constant over time. If N(t) or M(t) exceed the preset value of 100, then we allow them to continue growing without drift. That is, the equations for the evolved lineages become the deterministic expressions: N(t + 1) = [(1 + b)N(t) + Vn(t)]/K and M(t + 1) = [(1 + b - s)M(t) + Vm(t)]/K. We note that the equations above are approximate, since as N(t) approaches the n(t), the fitness of M(t) must shift to be relative to evolved nonmutators rather than to unevolved nonmutators.

When the environment changes, n(t) in the new environment is assigned the value n(t) + N(t) from the previous environment, and similarly, m(t) is changed to m(t) + M(t) evaluated before the change. N(t) and M(t) are then reset to zero. The initial conditions, as given in the main text, are N(0) = M(0) = 0, m(0) = m0, and n(0) = 1010 - m0.

Definition of time cycle: The time cycle used in the analytical model is defined as the period following an environmental shift, during which a beneficial mutation appears in the nonmutator subpopulation and proliferates until it reaches size n. We narrow our focus on cases in which environmental cycles appear after this event. For this analysis to be interpreted in the long run, we also assume that the next environmental shift occurs soon enough after the end of the time cycle so that the gain made by mutators is not substantially deteriorated through their cost s.

The length of the time cycle t* is the time it takes for a beneficial mutant destined to survive drift to appear in the nonmutator subpopulation (W) plus the time it takes for that mutant cell to sweep through the population (). For the purposes of comparing the simulation and the analytical models, let the (varying) number of unevolved nonmutator cells in the simulation be denoted by (t), and we continue to denote the constant number of cells in the analytical model by n. Similarly, let the number of evolved nonmutators in the simulation model be (t). Under the assumption that mutators (m + M) remain relatively rare during the time cycle, the dynamics of (t) and (t) are similar to a diallelic haploid selection model whose dynamics are approximately logistic (Crow and Kumura 1970, p. 192), leaving aside genetic drift. Thus, with n as the carrying capacity we can write N^(t)n1+((nN0)N0)eb(tW)fort>W, (A2) where N0 is the initial number of evolved mutators.

Case 1: (nμnfn < 1). If the mutant were to grow exponentially (Equation 6), then the time taken for N(t) to reach n from N0 = 1 is = ln (n)/b. Substituting this quantity for time in (A2) gives (t*) = n2/(2n - 1) ≈ n/2. That is, the end of the time cycle in the analytical model roughly corresponds in the simulation model to the time at which the unevolved nonmutators ( = n - ) decline to n/2. Because the dynamics are approximately logistic, (t) spends most of the time period close to n.

Case 2: (nμnfn > 1). To compare Equation 9 with (t), we need to choose N0 such that the two functions are comparable in the long term (as t* is approached). We therefore use N(t) = nμn fn(1 + 1/b)ebt instead of (9) and set N0 = nμnfn(1 + 1/b). The time cycle in this case has the deterministic length given by the second part of Equation 11. Substituting this time period in (A2) simplifies to N^(t)=n1+(1N0n)(b(μnfn+b)). Since μn fnb/(1 + b), (t*)n/2. Again, we conclude that during the time cycle, (t) slowly declines from n to ∼n/2.

The distribution of R: We first present the model for cases where nμnfn < 1. Consider the two exponentially distributed random variables: X ∼ Exp(λm) and W ∼ Exp(λn). To find the probability density function (PDF) of U = X - W, we perform the following convolution: fU(u)={ufX(x)fW(xu)dxifu00fX(x)fW(xu)dxifu<0.} (A3) The integration limits are given by the constraint that the variables W and X must both be greater than zero.

We therefore obtain fU(u)={fU1(u)=λmλnλm+λneλmuifu0fU2(u)=λmλnλm+λneλnuifu<0.} (A4) From Equation 7, ln(M(t))=(bs)(tX)=(bs)((tW)U). (A5) The term t* - W = is the time taken for a successful beneficial mutation in nonmutators to reach size n, which equals the constant ln(n)/b (from Equation 6). Thus R=(bs)U+bsbln(n)ln(m0), (A6) or U=Rk2k1, (A7) where k1=(bs)<0,by assumption, (A8) and k2=bsbln(n)ln(m0). (A9) We now transform the distribution of U (A4) using (A6) to produce the PDF of the log gain made by the mutator in a given time cycle (R): fR(r)={fR1(r)=λmλn(k1)(λm+λn)eλm(rk2)k1ifrk2fR2(r)=λmλn(k1)(λm+λn)eλn(rk2)k1ifr>k2.} (A10 The cumulative density function (CDF) of R is given by FR(r)={rfR1(y)dyifrk2k2fR1(y)dy+k2rfR2(y)dyifr>k2}={λnλm+λneλm(wvr(bs))ifrk21λmλm+λneλn(wvr(bs))ifr>k2.} (A11 Because we are generally interested in rare mutators (small m0), we focus on the first case of this equation (rk2). Equation 13 is based on this first case.

The case where nμnfn > 1 follows the same reasoning but is simpler to derive because W = 0. In this case, the length of the time cycle is deterministic and also equal to the duration of the selective sweep: t* = t = = [ln(1 + (bnfn)) - ln(1 + b)]/b. We next transform the waiting time until the evolved mutator appears to produce the PDF of the log gain: fR(r)=λm(bs)exp[λm(wln(m0)bsrbs)]. (A12 We can again find the CDF and thus write Pr(R>r)=1eλm(wvr(bs)). (A13


We thank Sally Otto, Olivier Tenaillon, Toby Johnson, and a reviewer for their astute comments on this work. We also thank Lin Chao, Jeffrey Miller, Paul Sniegowski, Lauren Ancel, Phil Gerrish, Michael Lachmann, Roland Regoes, and François Taddei for helpful discussions and Marc Lipsitch for providing humor with an edge. This work was supported by National Institutes of Health grants AI40662 and GM33682 to B.R.L. and a University of NSW Faculty Research Grant to M.M.T.


  • We dedicate this article to the memory of Johanna Björkman, an extraordinary scientist and human being.

  • Communicating editor: S. P. Otto

  • Received December 9, 2002.
  • Accepted March 7, 2003.


View Abstract