## Abstract

Mutator and antimutator alleles often arise and spread in both natural microbial populations and laboratory evolution experiments. The evolutionary dynamics of these mutation rate modifiers are determined by indirect selection on linked beneficial and deleterious mutations. These indirect selection pressures have been the focus of much earlier theoretical and empirical work, but we still have a limited analytical understanding of how the interplay between hitchhiking and deleterious load influences the fates of modifier alleles. Our understanding is particularly limited when clonal interference is common, which is the regime of primary interest in laboratory microbial evolution experiments. Here, we calculate the fixation probability of a mutator or antimutator allele in a rapidly adapting asexual population, and we show how this quantity depends on the population size, the beneficial and deleterious mutation rates, and the strength of a typical driver mutation. In the absence of deleterious mutations, we find that clonal interference enhances the fixation probability of mutators, even as they provide a diminishing benefit to the overall rate of adaptation. When deleterious mutations are included, natural selection pushes the population toward a stable mutation rate that can be suboptimal for the adaptation of the population as a whole. The approach to this stable mutation rate is not necessarily monotonic: even in the absence of epistasis, selection can favor mutator and antimutator alleles that “overshoot” the stable mutation rate by substantial amounts.

DNA replication occurs with extremely high fidelity, despite taking place in the noisy environment of the cell. For example, laboratory strains of *Escherichia coli* produce a point mutation at a rate of roughly one nucleotide per 10 billion copied (Wielgoss *et al.* 2011; Lee *et al.* 2012), which implies that hundreds of generations can elapse before a single mutation is introduced into the genome. To achieve such low error rates, bacteria and eukaryotes employ a complex array of cellular machinery, which must be maintained by natural selection.

One explanation for the low observed error rates is that the genome encodes a large number of functions, all of which are essential for survival. Low mutation rates could then emerge from hard selection against these lethal errors. But, in practice, observed mutation rates lie far below the levels that would quickly result in extinction. This is evident from the fact that mutator strains, whose mutation rates are 10- to 1000-fold higher than the wild type, can be propagated for thousands of generations in the laboratory without significant loss of viability (McDonald *et al.* 2012; Wiser *et al.* 2013). This suggests that mutation rates are not maintained solely by hard selection, but also by more direct evolutionary competition between strains with different mutation rates. These variants feel the effects of natural selection indirectly, by being linked to other mutations that directly influence fitness.

Strains with higher mutation rates are more likely to be linked to deleterious mutations, and will therefore experience an effective fitness cost. This cost can be measured in head-to-head competitions between mutator and wild-type strains (Tröbner and Piechocki 1981; Chao and Cox 1983; Chao *et al.* 1983; Giraud *et al.* 2001; Thompson *et al.* 2006; Gentile *et al.* 2011). In the absence of beneficial mutations, natural selection will therefore act to decrease the mutation rate, until it is eventually balanced by genetic drift, mutation, or other physiological costs. A large body of previous theoretical work has explored these dynamics (Kimura 1967; Liberman and Feldman 1986; Dawson 1998, 1999; Johnson 1999a; Lynch 2008; Desai and Fisher 2011; Lynch 2011; Soderberg and Berg 2011; James and Jain 2015).

When beneficial mutations are available, lineages with higher mutation rates are also more likely to produce and hitchhike with a successful beneficial variant. In the absence of deleterious mutations, natural selection will therefore act to increase the mutation rate, until the supply of beneficial mutations is eventually exhausted. In accordance with these expectations, mutator alleles are often found to spontaneously arise and spread in rapidly adapting microbial populations in the laboratory (Sniegowski *et al.* 1997; Notley-McRobb *et al.* 2002; Shaver *et al.* 2002; Pal *et al.* 2007; Voordeckers *et al.* 2015), and are often correlated with pathogenic lifestyles in the wild (LeClerc *et al.* 1996; Matic *et al.* 1997; Oliver *et al.* 2000; Bjorkholm *et al.* 2001; Denamur *et al.* 2002; Giraud *et al.* 2002; Richardson *et al.* 2002; Prunier *et al.* 2003; Watson *et al.* 2004; del Campo *et al.* 2005; Labat *et al.* 2005).

In a few cases, laboratory populations that have previously fixed a mutator allele have been shown to evolve a lower mutation rate later in the experiment, suggesting that mutation rate evolution is an ongoing process (Tröbner and Piechocki 1984; Notley-McRobb *et al.* 2002; McDonald *et al.* 2012; Turrientes *et al.* 2013; Wielgoss *et al.* 2013). Because these populations often continue to increase in fitness during this process (Wielgoss *et al.* 2013), the success of mutator and antimutator alleles must depend on the interplay between beneficial and deleterious mutations, rather than the complete cessation of adaptation. However, our understanding of this interplay remains incomplete. A few simulation studies have shown how these effects can, in principle, favor either increases or decreases in mutation rates, depending on the specific population parameters (Taddei *et al.* 1997; Tenaillon *et al.* 1999, 2000; Travis and Travis 2002). Some analytical progress has also been made in the case where beneficial mutations are rare, and occur one-by-one in discrete selective sweeps (Leigh 1970, 1973; Painter 1975; Gillespie 1981; Ishii *et al.* 1989; Kessler and Levine 1998; Johnson 1999b; Tanaka *et al.* 2003; Andre and Godelle 2006; Wylie *et al.* 2009; Desai and Fisher 2011). However, our understanding is much more limited in larger populations, where beneficial mutations are more common, and multiple adaptive lineages must compete with each other for fixation. This is the regime where the indirect effects of linked selection are likely to be maximized, and it is also the most relevant for understanding the microbial evolution experiments where mutation rate modifiers have been observed to spread. In this clonal interference regime, even the most basic questions about this process remain unanswered: exactly how large can the deleterious load be before it effectively selects against a mutator allele? And how does this depend on the population size, the supply of beneficial mutations, and the magnitude of the mutator phenotype?

Here, we address these questions in the context of a simple model of adaptation, which is motivated by recent microbial evolution experiments. In particular, we focus on a regime in which deleterious mutations have a negligible impact on the rate of adaptation, even though they may have a large effect on the fates of mutation rate modifiers. Within this model, we calculate the fixation probability of an allele that changes the mutation rate by a factor *r* (with corresponding to mutator alleles and corresponding to antimutators). We consider both small changes in mutation rate (), as well as substantial changes (where ).

We find that clonal interference amplifies the beneficial advantage of mutator alleles, even as it constrains the overall rate of adaptation. For large *r* this can be a substantial effect, resulting in >100-fold increases in the probability of fixation. However, clonal interference also amplifies the effects of the deleterious load, resulting in a much narrower window of mutator favorability. We show that this window depends most strongly on the strengths of typical driver mutations, and relatively weakly on the rate at which beneficial mutations arise. With the steady production of such modifier alleles, the mutation rate will evolve toward a stable point at which neither mutator nor antimutator alleles are favored. Surprisingly, we find that the approach to this stable point is not necessarily monotonic, even in the absence of epistasis. Instead, alleles that overshoot the stable point can be positively selected, and selection must then act to change the mutation rates in the opposite direction.

## Model

Our goal is to understand how beneficial and deleterious mutations influence the fates of mutation rate modifiers in rapidly adapting populations. We focus on the simplest possible model in which we can address this question. Specifically, we consider an asexual haploid population of constant size *N* (our analysis also applies to diploids when all mutations have intermediate dominance effects). We assume beneficial and deleterious mutations occur at genome-wide rates and , respectively. We define as the total mutation rate, and as the ratio of beneficial to deleterious mutation rates. We focus on adapting populations, and consider both the successional-mutations regime (where beneficial mutations are rare), and the clonal interference regime.

For most of our analysis, we assume that beneficial mutations all confer the same fitness advantage , though we also comment on how our results can be extended to the case where beneficial mutations have a distribution of fitness effects. We also assume that there is no macroscopic epistasis among beneficial mutations [as defined by Good and Desai (2015)], so that and remain fixed as the population adapts. On sufficiently long timescales, this assumption may eventually break down, so we also comment on the effects of changing or in the *Discussion*.

We make no specific assumptions about the fitness effects of deleterious mutations, other than that they are purgeable, *i.e.*, deleterious mutations are sufficiently costly that they are unlikely to fix, and do not significantly reduce the overall rate of adaptation of the population. We describe the technical conditions required for this to be true, and the qualitative implications of nonpurgeable deleterious mutations in more detail in the *Discussion*.

Our goal is to analyze the fate of a modifier allele that changes the overall mutation rate *U* by a factor *r* (while leaving *ϵ* and unchanged). For simplicity, we assume that this allele has no direct influence on fitness, although in the *Discussion* we show how this assumption can be relaxed. These modifier alleles are produced at rate *μ* from the wild-type background, and we assume that *μ* is sufficiently small that the modifiers can be treated independently. In this case, the substitution rate of the modifier is , where is the fixation probability of a single modifier individual. A natural measure of how selection “favors” or “disfavors” the modifier can be obtained from the scaled fixation probability, . When , the allele is favored by selection, since it substitutes more rapidly than a neutral allele; conversely, the allele is disfavored when This scaled fixation probability will be our primary focus in the analysis below. In particular, we are interested in determining the parameters where transitions between favored () and unfavored ().

We are primarily interested in situations applicable to microbial populations, and particularly to laboratory evolution experiments. This motivates certain technical assumptions we describe in the analysis below. It also motivates our focus on asexual populations where recombination can be neglected. This asexual case is particularly relevant because, when recombination is absent, modifiers of mutation rate remain perfectly linked to the beneficial or deleterious mutations they generate, and hence experience the strongest indirect selection pressures. In principle, we could also apply our analysis to physically linked regions within sexual genomes (“linkage blocks”) that are unlikely to recombine on the appropriate timescales (Neher *et al.* 2013; Good *et al.* 2014; Weissman and Hallatschek 2014). However, the scale of these linkage blocks is a complex problem, and understanding these effects is beyond the scope of this study.

In the remaining sections, we calculate as a function of the underlying model parameters, and then we use these predictions to analyze the dynamics of mutation rate evolution.

### Data availability

The authors state that all data necessary for confirming the conclusions presented in the article are represented fully within the article.

## The Successional Mutations Regime

To gain intuition into the forces that determine , it is useful to start by considering the simplest case, where deleterious mutations can be neglected, and adaptation proceeds by a sequence of discrete selective sweeps. This “successional mutations” regime applies whenever beneficial mutations are sufficiently rare that they seldom segregate within the population at the same time. This will be true provided that (Desai and Fisher 2007). In this regime, beneficial mutations are produced at rate and establish with probability . Selective sweeps then occur as a Poisson process with rate(1)When a successful beneficial mutation establishes, it starts to grow deterministically and requires roughly generations to complete its sweep. By definition, this time is small compared to the time between sweeps, so the rate of adaptation is just . A modifier allele would increase or decrease this rate by factor of *r*, if it was able to fix. We now analyze the dynamics of this process.

When a modifier allele first arises, it starts at an initial frequency , and it will then drift neutrally until the next selective sweep occurs. Most of the time, the lineage will fluctuate to extinction within the first few generations. However, with probability , it will survive for generations, and reach size (Fisher 2007). In the absence of selective sweeps, the modifier lineage can only fix by drifting to fixation; this happens with probability , and occurs over a timescale of order generations. Thus, depending on how this drift timescale *N* compares with the sweep timescales and , two distinct types of fixation dynamics can emerge.

When and are both much larger than *N*, then the fate of the modifier lineage will be determined long before the next selective sweep occurs. Drift is the dominant evolutionary force, and .

In contrast, if either or are small compared to *N*, the fixation process is instead controlled by genetic draft. In this case, the modifier lineage will fix only if it is lucky enough to produce and hitchhike with the next selective sweep. Since we are primarily interested in understanding these effects, we will focus on this regime for the remainder of our analysis.

The next selective sweep is produced by two competing Poisson processes, corresponding to the beneficial mutants from the wild type and modifier lineages. These respectively produce sweeps at rates(2a)(2b)The probability that the next sweep is produced by the modifier lineage is then simply(3)where the angled brackets denote an average is over the random lineage trajectory . In the regime we are considering, the modifier lineage will remain rare until the next selective sweep occurs. The trajectory is therefore described by the Langevin dynamics,(4)where is a Brownian noise term (Gardiner 1985). The solution of Equation (3) can be complicated, since both the overall rate of sweeps and their probability of arising in the modifier lineage depend on the random trajectory . We present an exact solution of these equations in Appendix A, using techniques developed by Weissman *et al.* (2009).

For the present purposes, however, it will be more useful to focus on an approximation. If the modifier lineage stays sufficiently rare that it does not influence the overall rate of sweeps, then we can approximate . Equation (3) then reduces to(5)In other words, the fixation probability is given by the average size of the lineage at the time of the next sweep, and this time is unaffected by the presence of the modifier. In this case, the modifier lineage is neutral and , so that(6)Thus, we see that the modifier fixation probability (like the proportional change in the rate of adaptation) is independent of the population size, the mutation rate, and the fitness benefits of the driver mutations.

To derive Equation (6), we made the approximation that . Substituting our expressions for and in Equation (2), we see that this is equivalent to the assumption that However, since and *t* are both random variables, we cannot determine the validity of this assumption based on the average values alone. We must instead consider the *typical* dynamics that contribute to the average in Equation (5): with probability , the modifier lineage survives for generations and reaches size . This ensures that the average size of the lineage is just , as expected. It also shows that the typical values of will remain small compared to one provided that . For simplicity, we will assume that this condition holds for the remainder of our analysis. For moderate *r*, this is actually the same assumption we already made in focusing on the genetic-draft regime. For large *r*, it imposes an upper limit on the range of modifier effects that we consider. However, this is not a fundamental problem for the analysis; we consider such large-effect modifiers in Appendix A.

### Incorporating purgeable deleterious mutations

A similar picture applies in the presence of strongly deleterious mutations. In particular, we assume that the typical fitness costs are larger than , so that a driver mutation can fix only in a background that is free of deleterious mutations. If the costs are also much larger than , then the vast majority of the population will be of this mutation-free type. We refer to these as purgeable mutations, and we will assume that all deleterious mutations are purgeable in the analysis that follows.

When a new beneficial mutation occurs, the wild type will be in mutation-selection balance with respect to its deleterious mutations, so the mean fitness of the population is . Even if it arises on a mutation-free background, the new beneficial lineage will continue to produce loaded individuals at rate . This exactly cancels the growth advantage of the mean fitness of the population, so that the establishment probability is still just . Since most genetic backgrounds are in this unloaded state, the overall rate of sweeps is unchanged from before. This means that if a modifier allele were to fix, it would still change the rate of adaptation by a simple factor of *r*.

However, while the deleterious mutations do not affect the rate of adaptation, they can still have a large effect on the modifier lineage while it is rare. The modifier produces doomed lineages at rate , which is not completely cancelled by the mean fitness of the wild type. Instead, mutator alleles will feel an effective cost of magnitude , while antimutators will experience an effective fitness advantage. Generalizing Equation (4), the mutation-free portion of the modifier lineage will then evolve as(7)Beneficial mutations produced by this lineage will likewise have an effective fitness , and will establish with probability

(8)In exactly the same manner as before, the next selective sweep is produced by two competing Poisson processes with rates(9a)(9b)and the fixation probability is given by the average in Equation (3). If we again make the assumption that the modifier lineage has a negligible influence on the overall sweep rate (), this reduces to(10)In this case, the average lineage size is now , and we obtain(11)The validity of this expression depends on our assumption that . In the same manner as above, we can establish the region of validity of this approximation by considering the typical dynamics behind the averages in Equation (10). These will depend on the sign of .

For a mutator allele (), the modifier lineage will feel an effective fitness cost . If this cost is small compared to , then selection will barely have time to influence before the next sweep occurs, and mutators will continue to fix with probability . On the other hand, if , then this fitness cost exponentially suppresses the probability that the mutator survives until the next sweep. In particular, the mutator will survive for generations with probability , and, provided that it does so, it will no longer grow indefinitely, but will instead reach a maximum size of order (Desai and Fisher 2007). If the time between sweeps was deterministic, this would simply suppress the fixation probability by a factor . However, this is not actually the case, since the next sweep is generated by the random Poisson process described above. In particular, the random nature of this process will occasionally produce a sweep that occurs much earlier than . Such anomalously early sweeps are uncommon, occurring with probability , but they are far more common than a mutator lineage that survives for a typical sweep time. Combining these facts, we find that the mutator fixation probability is dominated by anomalously early sweep times for which , which leads to a power law decay . For a strong-effect mutator (), this power-law decay exactly balances the lineage’s increased beneficial mutation rate, so that the *r*-dependence of is mediated primarily through the reduced establishment probability, .

In contrast, antimutators will have an effective fitness advantage relative to the wild type, of magnitude . If , this fitness advantage will still have a negligible influence on . However, when , the antimutator will establish with probability , and will start to grow deterministically as . Again, if the time to the next sweep was deterministic, this would result in a simple exponential enhancement of . But because *t* is exponentially distributed, it interacts with the exponential growth of the lineage in a strong way. In particular, as approaches 1, the average in Equation (11) will be dominated by anomalously late sweep times for which is no longer rare, and our approximation that breaks down. Thus, Equation (11) will only be valid provided that . Outside this regime, the antimutator lineage will appear to sweep to fixation on its own, without the help of an additional driver mutation.

The fixation probability in Equation (11) is illustrated in Figure 1 for several choices of parameters. Its overall shape is determined by the key parameter(12)which depends only on *N*, , and , and is independent of the overall mutation rate *U*. Depending on the value of , Equation (11) takes on one of two characteristic shapes. If , the fixation probability is a monotonically decreasing function of *r* (see Figure 1), and mutators will never be favored to invade. Instead, antimutators will be positively selected. Note that this happens even though the antimutator actually causes a *decrease* in the overall rate of adaptation.

On the other hand, for , the fixation probability takes on a paraboloid shape (Figure 1). It crosses at exactly two points: once at and once at(13)When , modifiers will be favored in the range , and disfavored elsewhere (*i.e.*, some mutators will be favored). Conversely, when , modifiers will be favored in the range (*i.e.*, some antimutators will be favored). It is also useful to rewrite in terms of the mutation rate of the modifier lineage:(14)Note that this is always positive (since ), and it is independent of *U*.

### Dynamics of mutation rate evolution

We are now in a position to analyze how mutation rates evolve over time. Imagine that we start with some particular values of *N*, *U*, *ϵ*, and . This will correspond to a particular value of and . If , mutators are disfavored and antimutator alleles will be positively selected instead. After an antimutator allele fixes, both and are reduced, but is unchanged, so natural selection will continue to select for lower mutation rates until this force is balanced by drift, mutational pressure, or other physiological costs.

On the other hand, if and , mutators will be favored provided that their resulting mutation rate is less than . If a mutator allele fixes, and will be increased, but and will remain constant. Thus, natural selection will continue to favor increased mutation rates until *U* reaches a special value,(15)which is stable against further changes in the mutation rate. If instead the initial mutation rate , antimutators will be favored provided that their resulting mutation rate is . Natural selection will then continue to favor decreased mutation rates until *U* reaches . We describe these dynamics in more detail in the *Discussion*.

Of course, this analysis crucially depends on the assumption that any mutation rate increases will still lie within the successive mutations regime. This will be true provided that . But, in order for a nonzero stable mutation rate to exist in the first place, we previously required that These two conditions can be satisfied only if(16)This is an extremely small region of parameter space, and it grows increasingly narrow as increases. As a result, the stable mutation rate in Equation (15) is actually a rapidly varying function of *N*, , and *ϵ*. For larger values of *ϵ* that violate the stringent constraints in Equation (16), mutator alleles will generically drive mutation rates into regimes where selective sweeps begin to interfere with each other. We now turn to an analysis of this case.

## The Clonal Interference Regime

When multiple beneficial mutations segregate at the same time, many potential drivers are lost due to competition with other, fitter genetic backgrounds. This reduces the rate of *successful* drivers in a way that depends on the relative values of and . In the regime most relevant for our current study, Desai and Fisher (2007) have shown that is reduced to(17)which is valid provided that , and . Similar expressions can be obtained for other parameter regimes, all of which share the weak dependence on (Fisher 2013). Since adaptation is no longer mutation-limited, one might guess that mutators will be less strongly favored in this regime. However, previous simulation studies (Tenaillon *et al.* 1999) and heuristic reasoning (Desai and Fisher 2011) suggest that the opposite can actually be true: clonal interference enhances the fixation of mutator alleles, even as they provide a diminishing overall benefit for the rate of adaptation.

In the following subsection, we introduce a traveling-wave formalism for calculating in the presence of clonal interference. Before doing so, however, it will be useful to consider this process from a heuristic perspective. This will allow us to identify the key forces and parameters involved, and will provide intuition for the more rigorous analysis that follows.

### Heuristic analysis

In the absence of deleterious mutations, clonal interference alters the dynamics of fixation in two main ways. First, successful mutations can only occur in the most highly fit genetic backgrounds in the population. These individuals lie in the extreme right tail, or “nose,” of the population fitness distribution, which steadily increases in fitness as the population adapts. In the regime described by Equation (17), the relative fitness of the nose is given by(18)which is much larger than the size of a single driver mutation. These individuals have already acquired more adaptive mutations than the average individual, but they are not yet destined to fix. The reason is that there are still enough individuals in the nose that they will collectively produce *multiple* additional driver mutations in the next generations, and these will occur in a relatively narrow time window (Desai and Fisher 2007). By definition, all but one of these mutations must eventually be outcompeted. But this means that the process of fixation within the nose takes place over *multiple* establishment intervals, each of length .

Since , genetic drift is not directly relevant for most of this fixation process. Random fluctuations in the lineage sizes are still important, but these are now driven by *genetic draft*, which arises from slight differences in the relative order of the next round of driver mutations. During most of this process, the lineages founded by different driver mutations make up less than of the total size of the nose. However, approximately once every establishments, an anomalously early driver mutation will occur and reach an fraction of the nose (Desai *et al.* 2013). This roughly coincides with a fixation event. In order to fix, a lineage must therefore (i) arise in the nose, (ii) persist for additional establishment intervals, and (iii) be lucky enough to hitchhike with the special “jackpot” driver event.

Mutation-rate modifiers can be incorporated into this picture in a straightforward way. The key quantities and *q* are only weakly dependent on the mutation rate. Provided that , the modifier versions of these parameters will be similar to the wild type, and the competition between these lineages will play out according to the same dynamics as above. The modifier is no more or less likely to arise in the nose compared to a neutral mutation. However, provided that it does occur in one of these special genetic backgrounds, a mutator lineage is *r* times more likely than a neutral lineage to generate an additional driver, while an antimutator lineage is *r* times less likely to do so. Since the modifier lineage must generate additional drivers to fix, its overall fixation probability is given by(19)Thus, we see that clonal interference increases the fixation probability of mutators by factors of *r*. This increase can be substantial for large *r*, even when (see Figure 2). From our discussion above, we see that this increase is primarily driven by the fact that *multiple* additional drivers are required for fixation.

In the presence of deleterious mutations, a mutator lineage again feels an effective cost of , so it will tend to decline in frequency relative to the other individuals in the nose. In particular, when the next burst of driver mutations arises, the mutator lineage will have decreased in frequency by a factor of , which makes it that much less likely to survive to the next round. Similarly, an antimutator lineage will have increased in frequency by an analogous factor of . The overall fixation probability then becomes(20)We discuss the regimes of validity of this expression in our more rigorous analysis below. For the purposes of our heuristic analysis, it will be more useful to focus on the implications of Equation (20).

Similar to the selective sweeps case, the direction of selection in Equation (20) is again determined by the product . There are three characteristic regimes of behavior. When , mutators will be favored provided that is less than a maximum value,(21)which corresponds to the largest value of *r* for which Conversely, when , antimutators will be favored above a minimum value(22)Similar behavior is obtained for is close to one, except that is now given by(23)When , the range of favorable modifiers vanishes, and mutators and antimutators are both selected against. The evolutionarily stable mutation rate is therefore given by(24)Note that this is actually an implicit relation for , since depends on . Substituting our expression for in Equation (17), and solving for , we find that(25)where the region of validity for Equation (17) [and hence Equation (25)] now becomes(26)This is still a restrictive parameter range for *ϵ*, although it is much broader than Equation (16) in the successive mutations regime, and it grows larger with increasing .

Provided that these conditions are met, we see that the stable mutation rate in Equation (25) is only weakly dependent on *N* and *ϵ*, and is much more strongly influenced by . This contrasts with the behavior we found in the successive mutations regime. In addition, we see that, unlike in the successive mutations regime, the stable mutation rate () does not necessarily coincide with the maximally permitted mutator or antimutator allele () when . This can have important consequences for the dynamics of mutation rate evolution in this regime. If the mutation rate starts far above or below , natural selection can favor modifier alleles that overshoot the stable value by a substantial amount, which can lead to a nonmonotinic approach to the stable point. We will revisit these dynamics in more detail in the *Discussion*.

### Formal analysis

We now turn to a more formal derivation of the results described above. To do so, we make use of “traveling-wave” formalism developed in previous work (Tsimring *et al.* 1996; Rouzine *et al.* 2003; Neher *et al.* 2010; Hallatschek 2011; Neher and Shraiman 2011; Good *et al.* 2012; Fisher 2013; Good and Desai 2014). This formalism focuses on the distribution of relative fitness within the population, which we denote by , and the fixation probability of a (wild-type) individual with relative fitness *x*, which we denote by . In the absence of mutator or antimutator alleles, we and others have previously shown that and satisfy the partial differential equations,(27a)(27b)where is the average rate of adaptation (Good and Desai 2014). Note that Equation (27) implicitly assumes that the deleterious mutations are purgeable, *i.e.*, they do not fix, and constitute a negligible fraction of the population. The rate of adaptation (or equivalently, ) is set by the self-consistency condition,(28)which is just another way of saying that the fixation probability of a neutral mutation must be equal to . For a detailed derivation of these equations, see Good and Desai (2014). Note that, because we have assumed that deleterious mutations are purgeable, enters into Equations (27) and (28) only as an overall shift in the mean fitness of the population. If we measure fitnesses using the shifted variable (*i.e.*, fitness relative to the average *unloaded* individual), then the evolution equations revert back to the purely beneficial case. We will adopt this convention from now on.

Even in the simplified model of Equation (27), there are many possible parameter regimes that one may consider (Fisher 2013). Here we will focus on a particular approximate solution (based on ideas introduced in earlier studies by Tsimring *et al.* (1996), Neher *et al.* (2010), and Hallatschek (2011)), which is thought to be relevant for many microbial evolution experiments (Desai *et al.* 2007; Perfeito *et al.* 2007; Wiser *et al.* 2013; Barroso-Batista *et al.* 2014). In this regime, the fitnesses of unloaded individuals are approximately normally distributed,(29)with a sharp cutoff at . This corresponds to the “nose” of the fitness distribution described above. Meanwhile, the survival probability can be approximated by the piecewise form(30)In this context, can also be thought of as an *interference threshold*. Lineages that are at relative fitness will fix provided they survive drift and establish; this occurs with probability . Below , the fixation probability drops off rapidly because lineages can establish but still be lost to interference. Once a lineage is more than below , it can essentially never catch up with the remaining lineages at the nose, and it will therefore have a negligible fixation probability. The location of is determined by the auxiliary condition(31)After substituting our expressions for and into Equation (28), the self-consistency condition becomes(32)Together with Equation (31), this completely determines *v* and as a function of the underlying parameters. Asymptotic formulae for these quantities are given in Equations (17) and (18); more accurate estimates can be obtained by solving Equations (31) and (32) numerically.

We have previously shown that this solution applies whenever , , and (Good and Desai 2014). The first of these conditions says that the parents of successful lineages must be highly fit, *i.e.*, that clonal interference is indeed present. The second condition, which also implies the third, says that the vast majority of individuals in the population have roughly the same number of driver mutations, *i.e.*, mutation pressure is not too strong. In terms of the underlying parameters, these conditions become , as we stated after Equation (17) above.

Once we have set up this traveling-wave formalism, mutation-rate modifiers can be added in a straightforward way. Since the fate of any allele is determined while it is rare, we can neglect the effects of the modifier on the wild-type population, so that and *v* are unchanged from above. When a modifier allele occurs, it will arise on a genetic background whose fitness is drawn from . We then introduce a new function, , describing the fixation probability of the modifier allele as a function of its initial fitness. This function satisfies a similar equation as , except with a different mutation rate:(33)Thus, in addition to the increase in , we see that a mutator lineage experiences an effective fitness cost corresponding to its increased deleterious load. This cost enters as a shift in the overall location of , which we can account for by defining a shifted function, . After this transformation, Equation (33) is of the same form as Equation (27b). Then, provided that the modifier mutation rate is still in the same regime as , the solution for the shifted function will have the same form as ,(34)except with a different interference threshold, , which satisfies(35)Like the wild-type interference threshold , the location of is independent of . Since mutators are favored when , we expect that whenever In other words, we expect that mutators have a lower interference threshold, since these lineages generate beneficial mutations more rapidly and, once in the nose, are therefore less likely to be lost to clonal interference.

In order for this solution to apply, it must satisfy the same conditions as the wild-type population: and . This will be true provided that is still close to , or, equivalently, that the fractional difference is small compared to 1. Given this assumption, we can divide Equation (35) by Equation (31), and solve for (36)Since in this regime, we see that this approximation will hold provided that . This places an upper limit on the range of modifier effects that we can consider. But since , this includes many realistically large modifiers of order

Given our solution for , we can compute the marginal fixation probability of the modifier by averaging over all the fitness backgrounds that the modifier could have arisen on:(37)We evaluate this integral in Appendix B. We find that(38)where we have employed the short-hand notation , , and . In the asymptotic regime we are considering, and *ν* are both small parameters. To leading order, Equation (38) converges to the simpler expression from our heuristic analysis, which we now recognize as the technical statement that(39)However, since the errors are multiplied by a large number *q* and exponentiated, the full version in Equation (38) is often required for quantitative accuracy.

We illustrate the full expression in Equation (38), and compare it to forward-time simulations in Figure 2 and Figure 3. The accuracy is generally good, although there are some systematic deviations for large *r* when the effects of the deleterious load are particularly costly. These are ultimately the result of two factors: (i) inaccuracies in our approximate solution for for , and (ii) the fact that is starting to approach . More accurate approximations for derived by Fisher (2013) could be used to improve the quantitative accuracy for these parameters; we leave this for future work.

## Discussion

In rapidly adapting populations, the fates of mutator and antimutator alleles depend on a careful balance between the benefits of hitchhiking and the cost of a deleterious load. Our analysis shows how the fixation probabilities of these mutation-rate modifiers depend on the population size, the beneficial and deleterious mutation rates, and the strength of selection. By searching for parameters where mutators and antimutators are both disfavored by selection, we can identify particular mutation rates, , that are stable under future evolution. At these mutation rates, the benefits of hitchhiking are exactly balanced by the costs of the deleterious load, so that neither mutators nor antimutators are favored. We now consider how a population will approach the stable mutation rate, and the implications of this mutation rate evolution for the rate of adaptation within the population. Finally, we describe the approximations we have made throughout our analysis and the resulting limitations of our approach.

### Approach to the stable mutation rate

Consider a situation in which the population starts with a mutation rate , and fixes a sequence of modifier alleles with mutation rates . What are the typical values of this mutation rate trajectory, under the hypothesis that each step was favored by selection?

In both the successional mutations and clonal interference regimes, we found that depends only on *N*, , and *ϵ*. This implies that the mutation rate will always evolve toward this unique stable point: . However, the manner in which the population approaches can be strongly influenced by clonal interference.

In the successional mutations regime, selection favors a monotonic approach to . Consider, *e.g.*, a mutation rate . In this case, we have shown that mutators will be able to fix, provided that the new mutation rate lies in the range . When one of these alleles fixes, the new mutation rate will be given by , and the process will repeat itself. Mutators will still be favored to fix provided that , which will lead to a , and so on. Thus, evolution will favor a monotonic sequence of mutator alleles until , after which the mutation rate will be stable to further changes. An exactly analogous conclusion holds if starts above here, selection will tend to fix antimutator alleles that lie in the range until . In both cases, mutator or antimutator alleles that “overshoot” the stable mutation rate are always disfavored. This is true even if such alleles would move the mutation rate closer to the stable rate; *e.g.*, an allele that takes a population from would not be positively selected.

When clonal interference is present, the approach to the stable mutation rate is more complex. This is easiest to see if we rewrite Equation (20) in terms of , *U*, and . Provided that , the quantities and will stay roughly constant over the relevant range of mutation rates, and we can approximate . This yields a simple heuristic formula,(40)which depends only on the scaled mutation rates , , and *q*. A more accurate (but more cumbersome) version can be obtained by substituting the full expressions for and into Equation (38). We illustrate this full expression in Figure 4 for a particular combination of and *ϵ*, although the important qualitative features are already contained in Equation (40).

If the population starts at a mutation rate , then only modest changes in the mutation rate will be favorable. However, if the population starts at a mutation rate , Equation (40) shows that mutators will be positively selected, provided that . While the most strongly beneficial modifier has , the range of favored mutators also includes values of that are *larger* than by a factor This means that selection can favor mutator alleles that overshoot the stable mutation rate by a substantial amount. If a mutator allele of this maximal strength fixes, the new mutation rate will be , and selection will immediately favor the fixation of antimutator alleles, despite the fact that the location of has not changed (see Figure 4 and Figure 5). If , these antimutator alleles can overshoot the stable mutation rate in the other direction, by a factor In the example above, this could then lead to , which is larger than the initial mutation rate but still much smaller than . Thus, while the population will still approach the stable mutation rate over time, this approach does not have to be monotonic, even in the absence of epistasis or environmental variation. Moreover, the functional form of Equation (40) implies that this behavior is highly asymmetric: an antimutator allele can overshoot by a larger amount (and with a larger ) than a mutator allele with the same value of . This can also be seen in Figure 4 and Figure 5.

### Application to a long-term evolution experiment in *E. coli*

Several studies have observed the spread of mutator alleles in microbial populations in the laboratory. One of the best-studied examples is Lenski’s long-term evolution experiment (the LTEE), where 12 replicate populations of *E. coli* have been propagated under constant conditions for more than generations (Lenski *et al.* 1991, 2015). Mutator phenotypes have fixed in six of the 12 populations over the course of this experiment, and have typically increased the mutation rate by -fold. The rate of adaptation in the mutator populations increased less than twofold, suggesting that clonal interference plays an important role (Wiser *et al.* 2013). In one of the 12 replicates, the dynamics of mutation rate evolution have been studied in more detail. Wielgoss *et al.* (2013) have shown that, soon after the fixation of the mutator allele, an antimutator phenotype fixed, which lowered the mutation rate by a factor of approximately two. The antimutator phenotype had two independent origins, suggesting that both it and the original mutator allele were favored by selection. Given these findings, it is interesting to compare various explanations for this behavior in the context of our theoretical analysis above.

One potential explanation for these findings is suggested by the “overshooting” behavior illustrated in Figure 4 and Figure 5. Although the precise evolutionary parameters of these populations are not known, Figure 5 shows that it is at least feasible to overshoot the optimum by a factor of 2 when the population starts at a mutation rate of . Note, however, that is not necessarily large in the reverse direction, so it remains to be seen whether this effect could efficiently select for lower mutation rates on the rapid timescales observed.

Another potential explanation for the mutation-rate reversal, initially suggested by Wielgoss *et al.* (2013), is that long-term epistasis reduced the effective advantage of hitchhiking later in the experiment, thereby shifting the location of below its initial value. Our analysis above does not account for these effects directly, since it assumes a constant value of *ϵ* and . However, as long as any epistasis manifests itself on timescales that are longer than the fixation time, we can still account for these effects in a crude way by examining how changes after a shift in or . For example, epistasis could lead to a reduction in *ϵ*, reflecting a diminished supply of beneficial mutations (or an increasing supply of deleterious mutations) as the population adapts. Due to the presence of clonal interference, Equation (25) suggests that rather drastic reductions in *ϵ* are required to cause a twofold reduction in . Alternatively, diminishing-returns epistasis could reduce the magnitude of , while leaving the beneficial mutation rate unchanged. Equation (25) suggests that this has much stronger effect on . Wiser *et al.* (2013) have recently proposed and fit a concrete model for how declines with fitness in the LTEE. These estimates suggest that declines by in the 4000 generations that separate the mutator and antimutator alleles. This would seem to be too small to account for a reduction in , although the magnitudes are sufficiently close that a careful comparison with simulations is required. This would be an interesting avenue for future work.

A third potential explanation is a direct fitness benefit to either the mutator or antimutator allele. While such benefits are difficult to measure directly (due to the confounding effects of the deleterious load), they can be incorporated into our model in a straightforward way. We can simply replace in our formulae for , where is the direct benefit of the modifier allele. Since the effect of the deleterious load is typically of order , even a small direct benefit () can shift the balance between hitchhiking and load in important ways.

### The evolution of mutation rates and the average rate of adaptation

Because we have assumed that deleterious mutations are purgeable, increasing *U* always increases the rate of adaptation, even though these increases may be small in the presence of clonal interference. Our results therefore suggest that mutation rates will evolve toward a stable mutation rate that is less than what would be optimal for the population (which, in this case, is technically ). Of course, as we increase mutation rates, the assumption that deleterious mutations are purgeable will eventually fail. Somewhere above this point, there will be an optimal mutation rate that maximizes the rate of adaptation [see, *e.g.*, Orr (2000)]. To confirm that the stable mutation rate is indeed below , we must verify that is still in the purgeable deleterious mutations regime. For the example in Figure 5, we can verify this directly, since the rate of adaptation continues to increase as the mutation rate passes through . This behavior will hold more generally provided that the typical cost of a deleterious mutation, , is much larger than . Since is typically smaller than in the regimes that we consider, this will indeed be the case whenever .

In these cases, the stable mutation rate is below the optimal mutation rate, which implies that the dynamics of the evolutionary process can sometimes favor changes in mutation rate that slow the adaptation of the population. Antimutator alleles can be favored even when their fixation will ultimately reduce the overall rate of adaptation. Conversely, mutator alleles can be disfavored even when their fixation would have increased the rate of adaptation.

Although our results are limited to the purgeable regime, a similar distinction between and may also apply even when deleterious mutations are no longer purgeable. We cannot prove this conjecture in our present framework, but it is an interesting hypothesis for future work.

### Limitations to our analysis

We have made a number of key assumptions throughout our analysis. Most crucially, we have assumed that deleterious mutations are purgeable. For this to be true, two conditions must be met: the deleterious mutations cannot fix, nor can they affect the overall rate of adaptation by reducing the fixation probabilities of beneficial mutations. This is a slightly stronger condition than the “ruby in the rough” approximation used by earlier authors (Charlesworth 1994; Peck 1994). Our previous work on the effects of deleterious mutations in adapting populations provides a detailed analysis of when these conditions will hold, and of the effects of deleterious mutations on adaptation when the purgeable assumption fails (Good and Desai 2014). This earlier work suggests that deleterious mutations are likely to be purgeable in most microbial evolution experiments, though recent experimental work hints that this may not always be the case (McDonald *et al.* 2016). Here, we first summarize the conditions under which deleterious mutations are purgeable, and then describe the potential effects of deleterious mutations when this condition is violated.

In our earlier work, we showed that deleterious mutations with fitness cost will typically not fix provided that , where is the coalescence timescale (Good and Desai 2014). Since in all the regimes that we consider, this will be true provided that is not much smaller than . Deleterious mutations can also hinder the fixation of beneficial mutations that arise in less-fit backgrounds. However, provided that , deleterious variants will be rapidly eliminated from the population, and most genetic backgrounds will be free of deleterious mutations. Assuming typical values of and for microbial evolution experiments (Wloch *et al.* 2001), this condition will often be met.

When deleterious mutations have small enough fitness costs that they are not purgeable, our quantitative results all break down. However, sufficiently weakly deleterious mutations cannot affect mutator or antimutator dynamics on the relevant timescales, so they are effectively neutral from the point of view of mutation rate evolution. On the other hand, sufficiently strongly deleterious mutations are purgeable. Thus, it is only some range of deleterious mutations between these limits whose effects are both important to mutation rate evolution and not described by our analysis. Since these mutations are less deleterious than purgeable mutations, they must, by definition, be less unfavorable to mutator alleles (and less favorable to antimutators). Thus, they do not affect the fate of a modifier of mutation rates as strongly as a corresponding purgeable mutation would. This suggests that we can qualitatively understand the effects of nonpurgeable deleterious mutations in adapting populations by weighting them less heavily than purgeable ones, so that the total effective deleterious mutation rate is actually somewhat less than . Of course, this is only a rough intuition. To analyze the effects of nonpurgeable deleterious mutations more fully, we need to include them in our traveling wave framework. Our earlier work explains how these mutations affect , , , and *v*, which provides a potential starting point for these calculations (Good and Desai 2014). More generally, we may wish to consider the evolution of mutation rates in the limit where nonpurgeable mutations become so important that the population no longer adapts, and instead Muller’s ratchet plays a crucial role. These are interesting but complex directions for future work.

In addition to these key limitations of our analysis, we have also made a number of more technical assumptions. For example, we have assumed that beneficial mutations all provide the same fitness benefit, . In reality, beneficial mutations will have a range of different fitness effects, drawn from some distribution. However, earlier work has shown that in this case the evolutionary dynamics can often be summarized using a single effective beneficial fitness effect, and corresponding effective beneficial mutation rate (Desai and Fisher 2007; Good *et al.* 2012; Fisher 2013). Thus our analysis can be applied to this situation using the appropriate effective and .

In our analysis of clonal interference, we focused on a regime in which . The latter condition implies that clonal interference is not exceptionally strong, and is often a good approximation for microbial evolution experiments at wild-type mutation rates. For some large effect mutators, we start to approach the boundary of this regime, and our quantitative expressions become less accurate (see Figure 5). In principle, we could extend our analysis to the “high-speed” [] or “mutational-diffusion” regimes [] by using analogous solutions for and derived by Fisher (2013) or Hallatschek (2011). We leave this for future work.

Throughout our analysis, we have assumed that modifier effect sizes can be large but not exceptionally so [*e.g.*, in the clonal interference regime, ]. Since , this includes many realistically large mutators of order , and, in practice, our expressions appear to work reasonably well even when (see Figure 3). For sufficiently large *r*, we may also encounter situations where modifiers switch from one regime to another (*e.g.*, from the clonal interference to successive mutations regime, or from to ). Our quantitative predictions for break down in all of these cases. However, the evolutionarily stable mutation rates are defined by the behavior of in the local neighborhood of , so the location of is not influenced by this problem. If the population starts sufficiently far from , then the fixation probabilities of the first few are not well-described by our analysis, but we know that the mutation rate must still approach (in a possibly nonmonotonic manner). Eventually, the mutation rate will become close enough to that our expressions start to apply, and the remaining steps of the mutation trajectory can be predicted.

Finally, we have assumed throughout that modifier mutations are sufficiently rare that their fates are determined independently. In other words, we have neglected clonal interference *between* different modifier mutations. For deleterious or weakly beneficial modifiers, this will often be a good approximation. However, for strongly beneficial modifiers, this requires that the establishment time between successive mutators is large compared to the fixation time. In the clonal interference regime, , and this condition becomes . This can sometimes be violated for strongly beneficial mutator alleles with a large target size (*e.g.*, loss-of-function mutations in multiple genes). In these cases, our quantitative predictions become inaccurate, although the overall direction of selection [*i.e.*, whether or ] will remain unchanged.

### Conclusions

Our analysis has explained how the interplay between beneficial and deleterious mutations in adapting populations creates indirect selection pressures on modifiers of mutation rates, and we have shown how these indirect selection pressures affect the fates of mutator and antimutator alleles. Our analysis of the successional mutations regime follows the logic of earlier work (Andre and Godelle 2006; Desai and Fisher 2011), balancing the probability a new beneficial mutation arises in a mutator, or antimutator, background with the effects of the modifier allele on the accumulation of deleterious load. We have also studied rapidly adapting populations where clonal interference is widespread. Our approach to this question builds on the traveling wave framework we and others have recently introduced (Neher *et al.* 2010; Hallatschek 2011; Neher and Shraiman 2011; Good *et al.* 2012; Fisher 2013; Good and Desai 2014). In this framework, analyzing the fate of an allele modifying mutation rate is similar in spirit to calculating the fixation probability of any lineage: we must solve the same equation for , except instead of a mutation changing the fitness *x*, it changes the mutation rate *U*. This general framework can also be applied to analyze indirect selection pressures that act on other modifiers of the evolutionary process. For example, we could analyze the fate of a mutation that changes the distribution of fitness effects of new mutations by solving for for an allele that modifies . Our analysis does not need to be limited to adapting populations, since the traveling wave framework applies whenever interference selection is widespread, even if the population is not adapting on average. Of course, in practice, some modifiers and parameter regimes will lead to equations for that are analytically tractable, while others will not. Thus further work is needed to more fully understand both the limits and promise of this approach.

## Acknowledgments

We thank Ivana Cvijović for useful discussions and helpful comments on the manuscript. Simulations in this article were run on the Odyssey cluster supported by the Faculty of Arts and Sciences Division of Science Research Computing Group at Harvard University. This work was supported in part by the Simons Foundation (grant 376196), grant PHY 1313638 from the National Science Foundation, and grant GM-104239 from the National Institutes of Health.

## Appendix A: Exact Solution of Equation (3)

In this section, we evaluate the integral

where and , and satisfies the Langevin dynamics(A2)with . Using integration by parts, we can rewrite this integral as(A3)where we have used the short-hand . To evaluate this integral, we must first solve for the generating function for the *weight* , which is defined by(A4)This can be done using techniques described in Weissman *et al.* (2009). Briefly, we first introduce the joint generating function,(A5)By Taylor expanding and using the Langevin dynamics in Equation (A2), we can show that obeys the partial differential equation(A6)which can be solved using the method of characteristics. The characteristic curves satisfy(A7)and the solution is given by(A8)The marginal generating function for can then be obtained by setting (A9)Substituting this expression into the integral in Equation (A3), we find that(A10)(A11)where is the digamma function. Asymptotic expansions for small and large arguments yield

## Appendix B: Derivation of Equation (38)

To obtain an expression for , we must evaluate the integral

using the expressions for and in Equations (29) and (34) in the main text. This is straightforward, although the algebra is somewhat tedious. Since and are piecewise functions, the form of the integrand will depend on whether is greater or less than and .

We first consider the case where . In this case, the mutator interference threshold is reached before the nose, so that(B2)After multiplying both sides by , we obtain(B3)Evaluating the remaining integrals and substituting , we find that(B4)where . The last two terms are only relevant for extremely high deleterious mutation rates where . After neglecting these terms, we obtain Equation (11) in the main text.

Now we consider the case where . In this case, the edge of the fitness distribution is reached before the mutator interference threshold. If , then the edge of the fitness distribution never even makes it to the region where is positive, and On the other hand, if lies between and , then(B5)Multiplying by , and evaluating the integrals, we obtain(B6)In practice, the differences between Equations (B4) and (B6) do not become very pronounced until is already quite small. For simplicity, we will therefore use Equation (B4) for the full range of *r*.

## Appendix C: Forward-Time Simulations

We validate several of our key approximations by comparing our predictions with the results of forward-time simulations similar to the standard Wright-Fisher model. In these simulations, the population is divided into fitness classes depending on the number of beneficial mutations (), and the number of deleterious mutations (), that each individual possesses. The simulation starts with a homogeneous population of *N* individuals with and To form the next generation, each individual produces a Poisson number of identical offspring with mean , a Poisson number of beneficial offspring () with mean , and a Poisson number of deleterious offspring () with mean , where is a constant recalculated at each generation to ensure that the expected number of offspring for the entire population is *N*. Starting from the initial population, all simulations are allowed to “burn-in” for generations before any measurements are made.

To measure the fixation probability of a modifier allele, we modify this basic algorithm so that new modifier offspring are produced from wild-type individuals at rate *μ*. These modifier lineages reproduce the same way as wild-type individuals except with and . Further modifications to the mutation rate, or reversion to the wild type, are not allowed. To minimize the effects of initial conditions, *μ* is artificially fixed at zero until the burn-in period has elapsed. We then record the number of generations, *T*, between the end of the burn-in period, and the fixation of the modifier phenotype. In the limit that , this is related to the fixation probability through the relation

To ensure that *μ* is chosen to be small enough for Equation (C1) to apply, we repeat this process times, with a sequence of modifier mutation rates , and a sequence of fixation times . The mutation rate at step *i* is chosen based on the previous , so that the predicted value of is generations:(C2)The value is chosen because, for the parameters considered here, , and the mutation rates are capped at to minimize the correlated effects of Muller’s ratchet (Neher and Shraiman 2012). The sequence is started with , and is allowed to “burn-in” for 10 iterations before 's are recorded. The fixation probability is calculated from the maximum likelihood estimator,(C3)A copy of our implementation in Python is available at https://github.com/benjaminhgood/mutator_simulations.

## Footnotes

*Communicating editor: S. I. Wright*

- Received July 7, 2016.
- Accepted September 13, 2016.

- Copyright © 2016 by the Genetics Society of America