A fundamental problem of asexual adaptation is that beneficial substitutions are not efficiently accumulated in large populations: Beneficial mutations often go extinct because they compete with one another in going to fixation. It has been argued that such clonal interference may have led to the evolution of sex and recombination in well-mixed populations. Here, we study clonal interference, and mechanisms of its mitigation, in an evolutionary model of spatially structured populations with uniform selection pressure. Clonal interference is much more prevalent with spatial structure than without, due to the slow wave-like spread of beneficial mutations through space. We find that the adaptation speed of asexuals saturates when the linear habitat size exceeds a characteristic interference length, which becomes shorter with smaller migration and larger mutation rate. The limiting speed is proportional to μ1/2 and μ1/3 in linear and planar habitats, respectively, where the mutational supply μ is the product of mutation rate and local population density. This scaling and the existence of a speed limit should be amenable to experimental tests as they fall far below predicted adaptation speeds for well-mixed populations (that scale as the logarithm of population size). Finally, we show that not only recombination, but also long-range migration is a highly efficient mechanism of relaxing clonal competition in structured populations. Our conservative estimates of the interference length predict prevalent clonal interference in microbial colonies and biofilms, so clonal competition should be a strong driver of both genetic and spatial mixing in those contexts.
ONE of the most basic questions of evolutionary biology that can be studied in controlled evolution experiments is: How fast do microbial populations adapt to new environments by the accumulation of beneficial mutations? Traditionally, it was thought that the accumulation process is limited by the supply of beneficial mutations (Novick and Szilard 1950; Atwood et al. 1951). If a rare beneficial mutation arises and becomes sufficiently frequent, it will expand rapidly until it is present in all individuals of the population. After the completion of such a selective sweep, the population is stationary again until the next beneficial mutation arises. The accumulation rate of beneficial mutation should thus be controlled by the appearance of new beneficial mutations, which is proportional to the population size and the mutation rate (Sniegowski and Gerrish 2010). This scenario of “periodic selection” crucially rests on the assumption that mutation rates are so small that beneficial mutations occur strictly sequentially. Evolution experiments of the last decade have shown, however, that beneficial mutation rates in microbes can be as large as 5 × 10−5 per genome per generation in the bacterium Escherichia coli with typical fitness effects in the range of a few percent (Desai et al. 2007; Perfeito et al. 2007; Sniegowski and Gerrish 2010). These high mutation rates have severely restricted the parameter range of periodic selection. Multiple selective sweeps, simultaneously in progress, are characteristic of a well-mixed microbial population containing >104 cells (see, e.g., estimates in Appendix D). In the past, most evolution experiments had effective population sizes exceeding this threshold, including the Lenski experiment, in which the bacterium E. coli was evolved for >50,000 generations to adapt to minimal medium (De Visser et al. 1999; Miralles et al. 1999; Barrick et al. 2009). Accordingly, these experiments did not reveal a linear relation between adaptation speed and population size, but instead a much weaker dependence indicative of a mechanism of “diminishing returns” (De Visser et al. 1999; Shaver et al. 2002). The reason for this behavior is that multiple beneficial mutations arise on different genetic backgrounds simultaneously and compete with one another for sweeping through the population. As a consequence, only a small number of arising beneficial mutations can go to fixation, and most of them are lost in the competition between clones. It was first noted by Fisher (1930) and Muller (1932) that sex relaxes clonal competition and speeds up the process of adaptation as it allows beneficial mutations to be combined in a single genome even if they first appeared in different lineages. Today, this Fisher–Muller advantage of sex is one of the most important explanation for why most organisms engage in some form of genetic exchange.
In recent years, extensive research efforts have been invested in modeling clonal interference (Rouzine et al. 2003; Desai and Fisher 2007; Park and Krug 2007; Hallatschek 2011) with the goal to quantitatively explain adaptation speeds and genetic diversities measured for well-mixed microbial populations (Gerrish and Lenski 1998; Desai et al. 2007). Although theoretical models depend on largely unknown effect distributions of beneficial mutations, they all predict that key quantities of interest, including the speed of adaptation and the genetic diversity, depend only weakly (logarithmically) on population size and mutation rates. While these predictions shape our current view of adaptation in large asexual populations, they primarily apply to well-mixed test tube populations of microbes. Many naturally occurring microbial populations attach to surfaces in the form of biofilms exhibiting a pronounced spatial structure (Tolker-Nielsen and Molin 2000; Watnick and Kolter 2000). Even in laboratories microbes are routinely grown on agar plates by which they acquire spatial structure. More complicated spatial structure is relevant to the evolution of viral populations such as influenza and severe acute respiratory syndrome (SARS), which spread in space via human transportation networks (Hufnagel et al. 2004; Eggo et al. 2011). Moreover, there is increasing evidence that spatial structure is also important to the evolutionary processes involved in cancer progression (Greaves et al. 2006; Merlo et al. 2006; Salk et al. 2009). One might thus wonder whether clonal interference is relevant to spatially extended populations and, if so, whether its effect on adaptation differs from that in well-mixed populations.
Although some simulation studies have considered the case of structured populations with one individual per deme (Gordo and Campos 2006; Gonçalves et al. 2007), a crucial fact about spatially continuous populations has been ignored so far: Beneficial mutations spread in the form of waves, first described by Fisher (1937) and Kolmogorov et al. (1937). Since these adaptation waves spread at a constant speed, mutant clones grow linearly with time and hence much slower than they would if they grew in well-mixed populations. The slowness of selective sweeps has the consequence that growing clones are much more likely to interfere than in well-mixed populations. The link between slow adaptive waves and the potential importance of clonal interference was in fact anticipated by Fisher (1937) in his seminal article on “The wave of advance of advantageous genes.” Therein, he considers the concrete example of a mutation with selective advantage s = 1% spreading along a continuously occupied shoreline. He estimates that spreading over 100 miles might take 10,000–100,000 generations and concludes that “. . . at any one time, the number of such waves of selective advance, simultaneously in progress, must be large” (Fisher 1937, p. 367). This suggests to consider a scenario of clonal interference characterized by many interfering adaptation waves, as illustrated in Figure 1B.
In the following, we will use simulations and analytical arguments to answer the following key questions: What is the effect of interfering Fisher waves on the speed of adaptation and the genetic diversity in an asexual population? Are there simple mechanisms of mitigating clonal interference and thus accelerating adaptation? Are these effects relevant to microbial colonies and biofilms, and perhaps measurable in evolution experiments?
Model of Interfering Adaptation Waves
To investigate clonal interference in spatially extended populations, our model is set up to allow for an efficient simulation of many adaptation waves simultaneously in progress. The habitat is modeled by a lattice with periodic boundary conditions and is always fully populated. Each lattice site records the genetic identity of the locally dominating clone in a subpopulation. This approximate description of the gene pool of the whole population, which accelerates our simulations, is appropriate if subpopulations are strongly dominated by single clones; see Appendix A and Discussion for an explicit range of validity.
We first consider a linear habitat, where the genetic state of the population is represented by a lattice of length L. Natural selection is implemented so that adaptation waves run across sites and shift the spatial extent of different clones: In each generation, the clone at site i is replaced by the neighboring clone at site j with a probability proportional to 1 + sgn(ΔW)c(|ΔW|), where c is the speed of a Fisher wave driven by a fitness difference ΔW = Wj − Wi, where the sign of ΔW defines the direction of the wave.
In effect, the replacement rule generates adaptation waves traveling at an average velocity given by c = c(ΔW). The function c(ΔW) is chosen to represent the classical Fisher wave speed (with migration rate m = ), valid for large populations with negligible genetic drift (Fisher 1937; Kolmogorov et al. 1937). Alternatively, we use the wave speed given by c ∼ 2ρmΔW for strong noise (Doering et al. 2003; Hallatschek and Korolev 2009) with ρm = 1, where ρ represents the population density (see Appendix A for further details).
To simulate planar habitats, we represent the population on a hexagonal lattice of size L × L. Natural selection is implemented such that the clone at site i is replaced by one of the neighboring clones j with a probability proportional to its fitness, Wj (normalized by the fitness summed over all neighbors). This replacement rule generates adaptation waves traveling at an average velocity with a migration rate of m = , consistent with the classical Fisher–Kolmogorov wave theory (see also Appendix A).
New beneficial mutations appear at a rate μ per site (μ thus represents the product of population density ρ and beneficial mutation rate per genome) and survive genetic drift with probability 2s (Maruyama 1974), where s is the selective fitness advantage. When a mutation becomes established at position i, the fitness Wi of the i-th clone is updated according to Wi(t) = Wi(t − 1) ⋅ (1 + s). Effects of epistasis are absent, and the habitat is homogeneous such that selective pressures are the same throughout the habitat. The selective fitness advantage s of a mutation is chosen to be either constant s0 or drawn from a distribution with mean value s0. We choose either an exponential distribution or a hump-shaped gamma distribution with shape parameter k = 2, which are both frequently used in theoretical evolution models (Eyre-Walker and Keightley 2007). See inset in Figure 3 for the shape of these distributions.
In our simulations of interfering Fisher waves, we vary the following parameters: the beneficial mutation rate μ per site, the habitat size L, the relation between speed c of traveling waves and fitness differences (linear habitat only), and the mean selective fitness advantage s0.
Speed limit for asexuals
Starting from a population devoid of genetic variation, our model generates a population of genotypes that differ in the amount and type of beneficial mutations they carry. After a transient period, in which genetic diversity builds up, the mean fitness in the population increases at a steady pace by the fixation of beneficial mutations. We quantify the speed V of adaptation by the mean fitness increase per generation.
Figure 2 shows the adaptation speed as a function of the habitat size L for various mutation rates and selection coefficients. For small habitat sizes, the adaptation speed is linear in the habitat size, which indicates that adaptation is limited by the occurrence of mutations: Doubling the habitat size, and thus the influx of beneficial mutations, doubles the adaptation speed. We thus recover the classical regime of periodic selection, in which beneficial mutations sweep strictly sequentially, as illustrated in Figure 1A. In this regime, the dynamics of individual sweeps are irrelevant, and the mean fitness increase per generation is given by , just as in the corresponding well-mixed case. However, as the habitat size exceeds a characteristic length scale Lc, we observe that the adaptation speed begins to saturate due to clonal interference. For very large systems, the adaptation speed approaches a limiting value Vmax.
These results call for an explanation of the “interference” scale Lc at which clonal interference sets in and of how the magnitude of the “speed limit” Vmax depends on the parameters, in particular mutation rates and Fisher wave speed. It turns out that—for constant fitness effects—relatively simple estimations can be given for both quantities.
The condition for clonal interference, and hence the characteristic scale Lc, can be determined by comparing two important timescales. The first timescale estimates how long it takes for a single adaptation wave to run across the length L of the habitat. This fixation time is given by tfix ≈ L/c0, where the traveling speed c0 depends on the selection coefficient s0. The second timescale is the waiting time tmut for a new beneficial mutation to become established, which is inversely proportional to both the establishment probability 2s and the rate μbLd at which beneficial mutations appear. The latter depends on the dimension d of the habitat (d = 1 and d = 2 for a linear and a planar habitat, respectively). We thus obtain the mutation waiting time tmut ≈ (2s0Ldμ)−1. Now, the mode of adaptation depends on the relative size of tfix and tmut: If the fixation time is smaller than the mutation waiting time (tmut < tfix), we expect periodic selection. In the opposite case, we expect clonal competition. The crossover from periodic selection to clonal interference occurs just when tfix ∼ tmut, implying a characteristic ”interference scale” of(1)
Next, we use this estimate to find an approximate expression for the adaptation speed limit Vmax observed in Figure 2. To this end, it is convenient to express the adaptation speed in the following general form:(2)The factor on the right-hand side is the rate at which beneficial mutations accumulate in the absence of clonal interference. The function F represents the probability that a mutation reaches fixation once established. When mutations arise sequentially, each established mutation also reaches fixation; hence F ∼ 1 for small habitats, L ≪ Lc. On the other hand, for large habitats (L ≫ Lc) clones interfere and F becomes very small. How small? Given that the adaptation speed saturates at large system sizes, as inferred from our simulation results in Figure 2, we must require that the habitat length L drops out of Equation 2 for large habitats. This can occur only if F(L/Lc) ∼ (Lc/L)d for large L. Hence, we estimate the speed limit for large systems by(3)It can be seen from Equation 2 that the scaled adaptation speed V/V0 with V0 = s0c0/L should be a unique function of the scaled mutation rate μ/μ0 with μ0 = c0/(2s0Ld+1). Therefore, if we plot our data sets for different parameters in one figure using axes V/V0 and μ/μ0, they should all collapse on a single master curve. The resulting scaling plots for linear and planar habitats are shown in Figure 3. Data corresponding to constant selection coefficients (Figure 3, black line) do indeed collapse on a single master curve, even though we varied μ, s0, and the type of adaptation wave (weak/strong genetic drift). Note also the transition from a linear regime to the sublinear regimes with power law exponents and in one and two dimensions, respectively. These exponents are consistent with our prediction in Equation 3.
Moreover, Figure 3 displays results for simulations in which the fitness effects of new mutations were drawn from an exponential and a gamma distribution (with shape parameter k = 2), respectively. Note that although data for the different distributions follow our scaling predictions individually, they are slightly shifted with respect to each other. This indicates that the prefactors of the adaptation speed depend on the tails of the distribution. Broader distributions tend to yield larger adaptation speeds, apparently because they will more frequently give rise to the sampling of unusually large-s clones that outcompete average-s clones. As a consequence, fixations are more frequent and the adaptation speed is higher (see also Appendix F). The effect is stronger in the clonal interference than in the periodic selection regime.
Both the existence of a speed limit Vmax for large habitats and its dependence on mutation rates contrast with the well-mixed case, where the adaptation speed depends logarithmically on both population size and mutation rates (see Discussion).
To further characterize the clonal interference regime, we have measured fixation times and fitness correlations in linear habitats as a function of habitat size; see Appendixes B and C. We find that tfix ∼ L3/2 for clonal interference and that the fitness variance scales linearly with the size of the habitat. Both observations suggest that our model shares some universal features with certain types of crystal growth models that have been extensively studied in physics (Kardar et al. 1986).
Genetic and spatial mixing
We investigate two strategies of mitigating clonal interference and increasing adaptation speeds. The classical solution to clonal interference is recombination, which allows us to combine beneficial mutations arising on different genetic backgrounds. For well-mixed populations, it has been predicted that the adaptation speed increases with the recombination rate as V ∝ r2 for large populations where clones interfere (Neher et al. 2010). This result demonstrates that increasing recombination rates can strongly speed up evolution. The effectiveness of this strategy can be appreciated from Figure 4A, which displays the adaptation speed as a function of habitat size for various recombination rates. To generate these data, we have incorporated recombination in our simulations as follows. Recombinants are produced at a rate r per site from two (haploid) individuals randomly selected from the nearest neighborhood in the parent generation (selfing is allowed). Recombinant genotypes are formed from the parental genotypes by one-point recombination: The parent chromosomes are paired and a point is randomly chosen by which both chromosomes are split. The mutations to both sides of the recombination point are then exchanged between the two chromosomes, forming the recombinant genotype (see Appendix A for further details).
Next, we study spatial mixing as an alternative way of mitigating clonal interference. Spatial mixing is implemented so that any given clone competes at a small rate ml with a randomly chosen clone rather than with its immediate neighbor (see Appendix A for details of the implementation). The effect of these long-range jumps on the adaptation speed is shown in Figure 4B. Long-range migration has no effect on the periodic selection regime, because it does not change the limiting supply of beneficial mutations. In the clonal interference regime, however, adaptation speeds are continually increasing as a function of jump rate. The effect is large: Allowing for long-range jumps only every 1000 generations approximately yields a fivefold increased adaptation speed in the clonal interference regime.
Long-range migration seems to be an efficient way to mitigate clonal interference, which has significant consequences for evolution (see Discussion). The basic mechanism can be understood as follows: Long-range migration allows clones to replicate themselves at different locations, which enables them to effectively grow at a faster rate. In other words, a clone that has led to a second “seed” by means of long-range migration will grow twice as fast as a clone with only one seed. A third seed triples the growth rate, etc. In effect, when clonal interference is absent, one obtains exponential growth with the rate(4)which depends on the selection coefficient through the wave velocity relation c0 = c(s0). A derivation for this rate is given in Appendix E. Note that the resulting exponential growth of mutants mirrors selective sweeps in well-mixed populations, however, with a quite different effective selection coefficient se (see also Discussion). Importantly, the resulting fast Malthusian growth accelerates the fixation process of beneficial mutations. As a consequence, mutations that would interfere without long-range migration are now more likely to arise sequentially. Consistent with this view, Figure 4 shows that the linear regime of periodic selection is extended to a larger parameter range as the long-range migration rate is increased. Of course, for sufficiently large habitat sizes, clonal interference also occurs in the presence of long-range migration. However, the number of simultaneously competing clones is reduced due to the faster sweeps. This leads to generally larger adaptation rates than for clonal interference without long-range migration, as seen in Figure 4.
A crucial feature of adaptation in spatially extended populations is that advantageous mutations spread through the population by means of traveling waves (Fisher 1937; Kolmogorov et al. 1937). This wave-like motion implies that clones grow at a constant speed, in strong contrast to the accelerating logistic growth characteristic of well-mixed populations. We have established a model of adaptation by waves to investigate the rate at which beneficial mutations are accumulated in spatially extended populations. Our model represents the gene pool of local subpopulations by the locally dominating genotype. This representation is appropriate when clonal interference is absent within these subpopulations. This sets an upper bound to the beneficial mutation rate (see Appendix A). When the beneficial mutation rate is still larger, one obtains a hybrid scenario of clonal interference on a local well-mixed scale and clonal interference by adaptation waves (E. A. Martens and O. Hallatschek, unpublished results). With the assumptions of a homogeneous habitat (constant selection pressure), short-range migration, and negligible epistasis, our model is a first step toward understanding clonal interference in spatially structured populations, such as microbial colonies or biofilms. We expect that similar phenomena to those reported here may be observed in other rapidly evolving populations. Many natural populations are, however, characterized by more complex migration patterns than the stepping-stone model used in our work, for instance, influenza or SARS (Hufnagel et al. 2004; Kaluza et al. 2010; Eggo et al. 2011). For concrete predictions, the existing model needs to be extended to include these migration networks. There is also increasing evidence that spatial structure is relevant in evolutionary processes related to cancer progression (Salk et al. 2009), suggesting that similar modeling approaches may apply (Martens et al. 2011). In the following, we estimate under which conditions clonal interference might be observable in microbial colonies and discuss our key predictions and how they may be verified in microbial evolution experiments.
Clonal interference generally occurs when fixation times are larger than the waiting time for new beneficial mutations to arise in the population. Since adaptation waves travel at a constant speed and thus advance much more slowly than Malthusian sweeps, fixation times are much larger in spatially extended than in well-mixed populations. As a consequence, the regime of clonal interference is inflated in structured populations compared to well-mixed ones. The condition for clonal interference is made explicit by comparing the typical fixation time of a single beneficial mutation with the waiting time for it to arise. We found that clonal interference occurs when the habitat size is larger than a characteristic size Lc. The interference scale Lc not only defines a crossover, but also measures the typical distance an adaptation wave travels freely before it collides with another wave that arose independently of the first one. The tug-of-war between adaptation waves and the role of the interference scale Lc can be discerned from the simulation results in Figure 5.
Our simple quantitative expression in Equation 1 for the interference scale was found to increase with the speed of adaptation waves and decrease with the rate of beneficial mutations per site. A similar characteristic length appears in the context of soft sweeps, where μ has to be interpreted as the establishment rate for a particular beneficial mutation (Ralph and Coop 2010). For L > Lc, multiple clones carrying the same mutation collide and subdivide the habitat into patches of size Lc (see Figure 5A). This situation is similar to our case of s = const. when μ is interpreted as the occurrence rate of a particular adaptation; in our model, however, waves belonging to different adaptations keep interfering with each other until one clone has reached fixation in the habitat.
To see whether clonal interference is relevant to microbial colonies, we estimate the order of magnitude of Lc (for details see Appendix D). We assume fitness effects on the order of 1% and mutation rates in the range of 10−6–10−4 per genome and generation. Given the large variance of the thickness of biofilms and microbial colonies, we consider a range of cell densities between 1 and 1000 per square cell diameter. This yields a mutation supply rate μ that lies within the range of 10−6 and 10−1 per square cell diameter. For s = 1%, we estimate a deterministic Fisher wave speed c0 in the range of 0.1–10 cell diameters per generation by assuming that microbes in dense colonies disperse between 1 and 100 cell diameters per generation. Thus, we estimate Lc = (c0/(2s0μ))1/3 to lie within 4 and 800 cell diameters. The characteristic interference length scale Lc = (c0/(2s0μ))1/2 for a linear habitat could be relevant for expanding microcolonies. For instance, it has been shown that the expansion of a colony is driven by a thin layer of pioneer cells (Hallatschek and Nelson 2010; Nadell et al. 2010). This pioneer population at the expanding front evolves in an effectively linear habitat. Applying similar estimates to the above to the context of an expanding microbial colony, we estimate an interference scale between 50 and 500 cell diameters. Importantly, the length scales both for planar and for linear habitats are much smaller than typical sizes of microbial colonies or biofilms. We thus expect clonal interference to be a widespread phenomenon—rather than the exception—in growing dense cellular clusters that conserve their spatial structure over time.
Our most important predictions for the regime of clonal interference may be summarized as follows: When the habitat size exceeds a characteristic length Lc, multiple adaptation waves are simultaneously in progress at any one time because beneficial mutations frequently arise on different genetic backgrounds. Which of the competing clones prevails and reaches fixation depends on a tug-of-war between interfering Fisher waves. Remarkably, we find that in this “frustrated” situation, increasing the habitat size does not only lead to an increased loss of beneficial mutation: The adaptation speed actually even saturates to a limiting value Vmax, which is seen for both constant and distributed selective coefficients (see Appendix F). This speed limit of adaptation and its dependence on parameters were not revealed in a previous study (Gordo and Campos 2006), seemingly because too small habitats were simulated. A simple approximate expression for the speed limit is provided by Equation 3. In particular, we found that Vmax is proportional to μ1/2 and μ1/3 in linear and planar habitats, respectively. This scaling was shown to hold for various distributions of fitness effects, including constant and exponentially distributed fitness effects, for very noisy and for deterministic Fisher waves. We may thus conclude that the predicted power laws are very robust toward details of the underlying model.
Both the speed limit and the robust power law scaling with the mutation rate are in stark contrast to well-mixed standard models, which predict that adaptation speeds never saturate and depend logarithmically on both population size and mutation rates. An experimental comparison of adaptation in well-mixed and spatially structured habitats should therefore yield qualitatively and quantitatively different results. Experiments have already found qualitative differences in the adaptation dynamics of well-mixed and structured E. coli populations (Habets et al. 2006, 2007; Perfeito et al. 2008). Our model suggests that the speed limit of adaptation, Vmax, may be detected and thus quantified by varying the size of the microbial colony. The power law dependence of the limiting adaptation speed on both mutation rate and population density could be tested by comparing wild type with a mutator strain (Shaver et al. 2002) and by manipulating the thickness of the colony, respectively.
The prevalence of clonal interference and the associated speed limit raises the question of how clonal interference may be overcome in spatially extended populations. Certainly, genetic exchange [in bacteria through lateral gene transfer (Cooper 2007)] provides a mechanism that lessens clonal interference as it allows one to bring together beneficial mutations that arose on different genetic backgrounds; see Figure 5B. This hypothesis, originally formulated by Fisher (1930) and Muller (1932), is corroborated by our simulation results in Figure 4A. However, our simulations also show that spatial mixing is a very efficient alternative mechanism of mitigating clonal interference in structured populations. Spatial mixing increases the growth rate of fitter clones and thus lowers fixation times. Lowering fixation times increases the likelihood for beneficial mutations to occur sequentially on the same background, and therefore clonal interference is attenuated. In our simulations, spatial mixing was introduced by allowing for a small rate of long-range jumps. These jumps enable clones to replicate themselves at distant locations in the habitat, which was shown to lead to accelerated clonal growth, as illustrated in Figure 5C. When clones do not interfere, the growth is exponential. The rate of this exponential growth, or effective selection coefficient se, was found to depend on both the actual selection coefficient and the rate of long-range jumps (cf. Equation 4). Importantly, we found that, due to these fast Malthusian sweeps, even very small rates of long-range jumps strongly increased adaptation speeds in the clonal interference regime (cf. Figure 4B). Thus, long-range migration seems highly beneficial in spatially extended populations with homogeneous selection pressures. Nevertheless, it remains to be seen in future work whether alleles conferring the ability of long-range migration are actually selected, even if large jumps impose a danger (for instance, a long-range jump could be fatal to a cell if the cell is swept away to a harmful environment). This evolutionary mechanism could be relevant for biofilm-forming bacteria and select for occasional switching from the biofilm state to the more efficiently dispersing planctonic phenotype.
In summary, we have analyzed a model of interfering waves. Our model suggests that clonal interference is widespread in biofilms and leads to markedly different adaptation dynamics than in well-mixed populations. In particular, the speed limit Vmax and its power law dependence on mutation rates should be quantifiable in evolving microbial colonies. Finally, we found spatial mixing to be highly beneficial in structured populations as it relaxes clonal interference and speeds up adaptation.
We thank P. Sniegowski, A. deVisser, J. Krug, F. Stiewe, S. Boekhoff, and J. Nullmeier for helpful discussions.
Appendix A: Model Description
We represent the habitat by a linear lattice of length L in one dimension and a planar hexagonal lattice of side length L in two dimensions. In both cases we choose periodic boundary conditions. Each site represents the identity of the dominating clone (represented by one haploid genome). The habitat is homogeneous and the selection pressure is the same everywhere. The habitat is fully populated at any time. Generations are nonoverlapping, such that the genome at site i descends from the genomes of the neighboring sites of the parent generation.
The algorithm stores at each lattice site the genotype of the locally dominating clone, which represents a subpopulation of K individuals in a well-mixed deme. This representation is valid when (i) clones within the deme do not interfere and (ii) the growth of a clone within the deme generates expanding clonal waves. These conditions are quantified as follows. If a beneficial mutation becomes established, it grows into an expanding wave after a timescale of test = (1/s). During this establishment time, the clone spreads diffusively over a length scale(A1)
The establishment time test can be estimated by test ≈ log(Kξs0)/s0, which is the time it takes for a beneficial mutation to fix in a well-mixed population of size Kξ. As a consequence, the length scale ξ is determined self-consistently via(A2)Up to logarithmic corrections, we thus have test ≈ 1/s0 and , which corresponds to the width of a Fisher wave (Fisher 1937). For Fisher waves to occur, we have to require that the habitat size L > ξ and that ξ ≫ 1, i.e., that wave fronts are much broader than a single deme size. The representation in terms of locally dominating genotypes at each lattice site is appropriate only when clones do not interfere on the length scale ξ. In other words, our model is valid when the time between the establishment of two successive beneficial mutations in a habitat region of size ξ, estimated by (KξUb2s0)−1, is larger than the expected establishment time test. This condition is equivalent to(A3)which—up to logarithmic corrections—amounts to . This condition defines an upper bound for the mutation rate, given by(A4)
Simulations are started with a population devoid of genetic variation. As time passes, beneficial mutations occur at a constant rate μ per site and generation. The selective fitness advantage s of a mutation is chosen to be either constant or drawn from a distribution with mean value s0. We choose either an exponential distribution or a hump-shaped gamma distribution with shape parameter k = 2, which are both frequently used in theoretical evolution models (Eyre-Walker and Keightley 2007).
Fitness is updated according to Wi → Wi(1 + s) when a mutation with effect s is drawn at site i (hence epistasis is neglected). To avoid numerical overflow we compute the logarithmic fitness. The speed of adaptation on the population is defined by the rate of change of logarithmic fitness; i.e., , where , where N is the number of sites.
Simulation of the Linear Habitat
For the simulation of the linear habitat, we devised an algorithm that enables us to control the speed of advancing clonal waves. This allows us to check the robustness of our scaling hypotheses toward the different wave velocity relations for strong selection, (Fisher 1937; Kolmogorov et al. 1937) (m represents the migration rate between sites), and for weak selection, c ∼ 2ρmΔW, respectively (Doering et al. 2003; Hallatschek and Korolev 2009). Natural selection is implemented such that waves run across sites with the velocity c = c(ΔW) (ΔW is the fitness difference between the clones on both sides of the wave front) and alter the spatial extent of different clones. We use an auxiliary lattice every half generation, shifted by half a lattice site, as shown in Figure A1. As a consequence, an individual at location i in the offspring generation (t + ) may descend from only two individuals in the parent generation (t), i.e., the neighbor to the left or to the right (see Figure A1). The fitness difference between the left and right sites is then defined as ΔW = Wl − Wr. The lattice formed by full and half generations forms a hexagonal lattice in space–time coordinates.
In Figure A1, a more advantageous clone is shown on the left (red) and a less fit wild type on the right (white). The fitter clone is more likely to reproduce and therefore, on average, tends to move to the right. The genome to be passed on to site i in the offspring generation t + descends from either the left or the right immediate neighbor in the parent generation t, depending on its fitness difference ΔW. To determine from which site the genome is passed on, we assign two probabilities pl and pr to the left and right parents. The probabilities must sum to unity; i.e., pl + pr = 1. To obtain a resulting motion of the advantageous clone according to a given velocity relation c(ΔW), we let the difference of the two probabilities be equal to the wave velocity; i.e., |c(ΔW)| = |pl − pr|. As a consequence, the advantageous clone moves with an average velocity of c(ΔW).
To confirm that our algorithm reproduces the correct wave velocity relations, we have measured ensemble averages of the wave velocity c in the absence of clonal interference; see Figure A2A. For the deterministic Fisher wave speed the migration rate is m = 1/4, and for the noisy Fisher wave speed the product of migration rate and population density is ρm = 1.
For the case of nonstructured populations in the absence of clonal interference, it is known that the probability of fixation of a mutant with a small fitness effect s is equal to 2s (Haldane 1927; Kimura 1962). Maruyama showed that the fixation probability is the same for spatially structured populations (Maruyama 1974), under the assumptions of a structurally subdivided population with symmetric migration and haploid individuals that evolve according to the Moran process in each subpopulation. This result was later also confirmed in simulations of spatially structured habitats (Gordo and Campos 2006; Gonçalves et al. 2007) (see also Drossel 2001 and Patwa and Wahl 2008 for detailed reviews of fixation probabilities). However, when we choose to use the linear wave velocity relation for weak selection, we find that our algorithm yields a fixation probability that deviates from 2s.
We therefore made an adjustment to our simulation to compensate for this deviation, such that mutations fixate with probability 2s. To accomplish this, we imposed a survival probability on every new mutation. We first measured the fixation probability for a given velocity, Pfix(c). Given that a new mutation must survive with probability 2s, we determine a survival probability via Psurv(s, c) = 2s/Pfix(c). Accordingly, in the adjusted algorithm, we allow only new mutations to survive the next generation with probability Psurv. The resulting fixation probability approximates the required fixation probability of 2s well for small s, as shown in Figure A2B.
Spatial mixing is implemented so that any given clone competes at a small rate ml with a randomly chosen clone rather than with its immediate neighbor. Such a long-range migration event is indicated by the blue arrow in Figure A1. Accordingly, we determine the probabilities for passing on the genomes from the two competing sites by calculating the fitness difference ΔW between the randomly chosen site and a neighboring site.
The implementation of recombination is shown in Figure A3. Recombinants are produced at a rate r per site from two (haploid) individuals randomly selected from the nearest neighborhood in the parent generation (selfing is allowed) by determining the probabilities pl and pr, according to the rules described above without recombination. Recombinant genotypes are formed from the parental genotypes A and B by one-point recombination: The parent chromosomes are paired and a point xexc is randomly chosen by which both chromosomes are split. The mutations to both sides of the recombination point are then exchanged between the two chromosomes, forming the recombinant genotypes A1B2 and A2B1. One of these two genomes is chosen randomly and passed on to the offspring generation.
To perform recombination of two genomes, we need to keep track of all mutations and any time. Doing so is computationally expensive when many clones are sweeping simultaneously in the population. In this situation, the algorithm exceeds the complexity of (L) that is expected otherwise and scales also with the number of present clones. As a consequence, simulation times explode when we try to obtain results deep in the clonal interference regime, i.e., for L ≫ Lc.
Simulation of the Planar Habitat
For planar habitats we do not directly control the wave velocity relation, but use a simplified scheme. The habitat is represented using a hexagonal planar lattice. As in the simulation of the linear habitat, we use an auxiliary shifted lattice at every half time step (t + ), as shown in Figure A4. Thus we have three sites in the parent generation t surrounding the offspring site at t + . Selection is performed such that the offspring site inherits the genome of one of the neighboring sites proportional to their fitness differences. This means that we need to compute three probabilities pk = Wk/(W1 + W2 + W3) for inheriting the genome from the parent at site k.
We measured the velocity relation c(ΔW) resulting from this algorithm, shown in Figure A5A. The relation matches the behavior expected from a deterministic Fisher wave, , with a prefactor k = . We also measured the fixation probabilities, shown Figure A5B. The fixation probability assumes the expected behavior of 2s for small selective coefficients (Maruyama 1974; Gordo and Campos 2006).
Appendix B: Fixation Times in the Clonal Interference Regime
The time to fixation in the periodic selection regime is given by tfix ∼ L/c. For the clonal interference regime, we expect that in general, fixation times should be larger, but it is not obvious how much larger they should be. For the linear habitat we measured the mean fixation time in dependence of the relative habitat size L/Lc, shown in Figure B1. Simulations have been carried out for a large range of habitat sizes L, using the deterministic wave speed and s = const. Different data sets for a range of beneficial mutation rates μ and selective coefficients s were generated (see Figure B1 legend). Rescaling the fixation time, tfixc/Lc, we find that all data sets collapse on a single master curve and recover the following power law behavior:(B1)
This result suggests the following fixation scenario: On length scales smaller than Lc, the growth of clones is ballistic; that is, they grow linearly in time at a constant speed. As soon as clones reach the size Lc, they start to collide with neighboring clones, which may have higher or lower fitness depending on the mutations they have accumulated in the past. As a consequence, the size of a clone follows a random walk on length scales larger than Lc. The variance of the size of the winning clones increases over time until it becomes of order L2, upon which fixation occurs. Now, if the random walk of the clone size was a normal random walk (similar to, e.g., Brownian motion), we would expect the clone size variance to grow linear with time. Hence, fixation should occur after a time of order L2. By contrast, we find much earlier fixation after a time of order L3/2. This indicates that the random walk of the clone size is superdiffusive and (with the dynamical exponent of ) belongs to the Kardar-Parisi-Zhang universality class that describes a wide range of surface growth models (Kardar et al. 1986). Indeed, if fitness as a function of time is interpreted as the height of a crystal, then our asexual model with constant selective advantage s0 and deterministic adaptation waves maps to the so-called polynuclear growth model (Prähofer and Spohn 2000).
Appendix C: Fitness Correlation Function and Interference Scale LC
We also measured the space-dependent mean-squared fitness variance (MSFV),(C1)
which correlates the log fitness W at a location separated by a distance x. The angle brackets indicate that the average is taken over all positions ξ in the habitat. As shown in Figure C1, the MSFV is linear in distance for large habitats and obeys the scaling relation(C2)depending on the mean selective fitness advantage s0 and the interference scale Lc. Since Equation C2 predicts , it is consistent with the view that clones expand over a characteristic length Lc until they collide with neighboring clones, which differ in log fitness (typically) by s0. Furthermore, Equation C2 allows us to calculate the global variance in log fitness as(C3)in large systems. The fitness variance in large systems is thus proportional to the total habitat size and is inversely proportional to the interference scale Lc.
Appendix D: Estimates of Critical Habitat Sizes for Clonal Interference
Mutation Rate and Selective Coefficient in Microbes
While many studies provide estimates for the mutation rates in microbes, fewer results are available regarding the beneficial mutation rate U (per generation and per genome) (Elena et al. 1996; Desai et al. 2007; Perfeito et al. 2007) and their selective coefficients. Some results that are relevant to us are compiled in Table D1 (for a review see Sniegowski and Gerrish 2010).
Some studies suggest mutation rates as low as U ∼ 10−9 or 10−8 per genome per generation (Gerrish and Lenski 1998; Imhof and Schlötterer 2001; Rozen et al. 2002). However, these studies were carried out for large effective population sizes (Ne ∼ 107) that had not been previously exposed to the used growth environment (and adapted accordingly). Under such circumstances, one expects that the supply rate NU is rather high; at the same time, effects of clonal interference are predicted to be very strong. In this case, not taking into account effects of genetic drift and clonal interference would lead to an underestimate of μ and an overestimate of the average effect of beneficial mutations (Sniegowski and Gerrish 2010).
More recent studies suggest that the rate of beneficial mutations is much higher. Experiments with E. coli (Perfeito et al. 2007) and Saccharomyces cerevisiae (Desai et al. 2007) find for smaller effective population sizes (Ne ∼ 104)—thus avoiding clonal interference—beneficial mutation rates of U ∼ 10−5. Data from previous studies yielding the lower estimates of U ∼ 10−9 (Elena et al. 1996; Gerrish and Lenski 1998) have been reanalyzed (Sniegowski and Gerrish 2010) by using a likelihood model that accounts for both drift and clonal interference. One then obtains a much higher value of U ∼ 10−5, which is in line with recent estimates (Wloch et al. 2001; Desai et al. 2007; Perfeito et al. 2007).
Considering an estimate for the rate of all mutations (i.e., including neutral and deleterious mutations) of order 10−3 (Drake et al. 1998) means that beneficial mutations make up ∼1% of the total number of mutations. Still, the matter is not quite that simple. Even if we restrict our attention to habitats on petri dishes, we may expect that different environments may yield different U. Moreover, spatial structure may also play a role. Thus, to estimate Lc, we use a range of U ∼ 10−6 … 10−4 for the mutation rate. For the selective coefficient we use s0 = 0.01, which compares well with the estimates in Table D1.
Characteristic Population Size for Well-Mixed Populations
The crossover condition between periodic selection and clonal interference regimes is tfix ∼ tmut, which allows us to determine a characteristic population size above which clonal interference is expected. For a nonstructured population, the waiting time for successful mutations is(D1)where the fixation probability is π = 2s0. In nonstructured populations the fixation time differs from the one in spatial structures and is given by(D2)(Crow and Kimura 1970). The crossover condition becomes the implicit equation 2NU log(Ns/2) ∼ 1. Using a selective coefficient of order s0 ∼ 0.01 and a mutation rate in the range of U = 10−6 … 10−4, we find a lower and an upper bound for the characteristic population size as follows:(D3)(D4)
Upper and Lower Boundary for Lc in a Planar Habitat
To estimate the interference scale Lc for microbial populations, we are going to need estimates for the population density ρ and the wave velocity c. In the forthcoming, we measure length in units of cell length, which for the rod-shaped E. coli are of order 3 μm and for S. cerevisiae of order 5–10 μm.
Provided that cells are packed densely, the absolute lower boundary of the population density must be ρ = 1 per square cell lengths. To estimate the population density in a planar population of E. coli, we refer to experimental data (Perfeito et al. 2007), where bacteria are grown on a petri dish as a lawn (diameter 6 cm = 60,000 cell lengths, i.e., a radius of R = 30,000 cell lengths). The area of a petri dish occupies ∼A = πR2 ∼ 109 square cell lengths. The population in this experiment grows to a final size of Nf = 1010. Thus the population density ρ ≡ Nf/A is in this case ρ ∼ 10 per square cell lengths. However, microbial colonies on rich medium can certainly grow to larger thicknesses. For instance, a thickness of 1000 cell diameters is not uncommon for a colony of S. cerevisiae. To account for a range of situations we consider a range of densities bounded by(D5)(D6)
Estimating the Wave Velocity
The wave velocity is a function not only of the selective coefficient s0 and the population density ρ, but also of the diffusion constant m. The relevant diffusion constant in two dimensions is given by(D7)where ΔX is the expected dislocation distance to occur per generation. For a densely packed population of bacteria, it is reasonable to assume that ΔX is at least of the order of 1 cell length and at most of the order of 10 cell lengths (far displacement is possible in street formations of rod-shaped E. coli); i.e., ΔX = 1 … 100 cells. Thus, we expect m = 0.25 … 2500 square cell diameters per generation. To estimate the Fisher wave velocity we use the deterministic wave velocity relation . We find a lower bound of cmin = 0.1 and for the upper bound cmax = 10 cells per generation. In conclusion, we find the following upper and lower bounds for the interference scale Lc in a planar habitat:(D8)(D9)These are the estimates that we provide in the Discussion of the main text.
Appendix E: Long-Range Migration
As discussed in the main text, small rates of long-range jumps drastically increase the adaptation speed V. Figure E1 shows the increase of the adaptation speed V relative to the adaptation speed without long-range migration, .
Long-range jumps generate “seeds” with identical genomes far off the ancestral clone and thus multiply the growth rate of the ancestral clone. Long-range migration has no effect on the periodic selection regime, because it does not change the limiting supply of beneficial mutations. However, generating seeds may reduce the time for the clone to fixate. As a consequence, effects of clonal interference are mitigated (and thus, the adaptation speed is increased). Moreover, the range of parameters over which periodic selection occurs, characterized by tfix ≪ tmut, becomes larger with increasing rates of long-range migration.
We briefly investigate a simple model that explains how long-range migration may lead to an exponential growth of a clone. We assume that waves propagate deterministically (i.e., on straight lines) and that jumps occur before the first (ancestral) clone reaches fixation. Furthermore we assume that at any time there is no more than one mutation present in the habitat (i.e., absence of clonal interference). To effectively generate a seed, an individual must jump from a region already occupied by the ancestral clone to an unoccupied region in the habitat. We denote the average number of such clonal seeds, present at time t, by the variable . Furthermore, is the average length in a linear habitat (or the area in a planar habitat) that is occupied by the clone. The growth of a clone follows from the following considerations: First, the rate of change of the number of seeds is given by the product of the spatial occupation of the clone, , and the rate at which long-range migration events occur per site, i.e., ml. Second, the rate at which the spatial occupation increases, i.e., the effective growth rate of the clone, , is twice the velocity of a single wave front times the number of waves present in the population. We denote the velocity of the ancestral clone by c0. These considerations are summarized in the following ordinary differential equations:(E1)(E2)
At time t = 0, just after a mutation has occurred in the habitat, space is unoccupied, , and the number of clonal waves is . With this initial condition, the exact solution of Equations E1 and E2 is(E3)where we define the effective growth rate(E4)This is Equation 3 in the Results section of the main text.
The growth rate se is related to the time it takes for a second mutation to appear. Solving for the condition n(t2) = 2 yields(E5)
The growing number of seeds eventually results in fixation of the clone once space is fully occupied by the clone; i.e., . This condition yields a characteristic timescale for the fixation process,(E6)
Equation E3 describes exponential growth at rate se when . The solution is a reasonable approximation until . For larger times, the assumption that new seeds are born into unoccupied regions cannot be maintained. When the seeds start merging, the effective wave velocity also slows down because the number of traveling wave fronts decreases until the growth of the clone comes to a complete halt.
Appendix F: Selective Fitness Distributions and Their Effects
We have discussed above that the speed of adaptation saturates for large habitats when clones interfere. Figure F1 demonstrates the same saturation phenomenon when the selective fitness is exponentially distributed. We expect that the saturation behavior documented for constant and exponentially distributed selection coefficients occurs for all subexponential fitness effect distributions. At this point, it is unclear how superexponential distributions might affect the adaptation dynamics.
Compared to the case of the constant selection coefficient, a nonzero variance of the distribution of fitness effects g(s) increases the speed of adaptation. For periodic selection, this can be easily understood as follows. The expectation value of the adaptation speed is given by 〈V〉 = 〈NUb 2s2〉 or, more precisely, by(F1)
The mean value of the distribution is Because of the identity , we have that(F2)This implies that a nonzero variance increases the adaptation speed even for the periodic selection regime. When s = const., the variance is zero, but we have for the exponential and for the gamma distribution (in our simulations, we chose a shape parameter of k = 2). The shifted values of the adaptation speed 〈V〉 seen in Figure 3 can thus be explained by the differing variance values of the various selective fitness distributions.
- Received April 29, 2011.
- Accepted September 5, 2011.
- Copyright © 2011 by the Genetics Society of America
Available freely online through the author-supported open access option.