# The Genetic Basis of Phenotypic Adaptation I: Fixation of Beneficial Mutations in the Moving Optimum Model

- Michael Kopp
^{1}and - Joachim Hermisson

- Mathematics and Biosciences Group, Max F. Perutz Laboratories and Faculty of Mathematics, University of Vienna, A-1030 Vienna, Austria

- 1
*Corresponding author:*University of Vienna, Dr. Bohr-Gasse 9, A-1030 Vienna, Austria. E-mail: michael.kopp{at}univie.ac.at

## Abstract

We study the genetic basis of adaptation in a moving optimum model, in which the optimal value for a quantitative trait increases over time at a constant rate. We first analyze a one-locus two-allele model with recurrent mutation, for which we derive accurate analytical approximations for (i) the time at which a previously deleterious allele becomes beneficial, (ii) the waiting time for a successful new mutation, and (iii) the time the mutant allele needs to reach fixation. On the basis of these results, we show that the shortest total time to fixation is for alleles with intermediate phenotypic effect. We derive an approximation for this “optimal” effect, and we show that it depends in a simple way on a composite parameter, which integrates the ecological parameters and the genetic architecture of the trait. In a second step, we use stochastic computer simulations of a multilocus model to study the order in which mutant alleles with different effects go to fixation. In agreement with the one-locus results, alleles with intermediate effect tend to become fixed earlier than those with either small or large effects. However, the effect size of the fastest mutations differs from the one predicted in the one-locus model. We show how these differences can be explained by two specific effects of multilocus genetics. Finally, we discuss our results in the light of three relevant timescales acting in the system—the environmental, mutation, and fixation timescales—which define three parameter regimes leading to qualitative differences in the adaptive substitution pattern.

WHEN a population adapts to a changing environment, what is the genetic basis of this process? For example, does adaptation occur in small or large steps? These questions have been asked ever since the debate on micro- *vs.* macromutationalism in the early days of evolutionary theory (Provine 2001). Subsequently, the genetics of adaptation have become the subject of different modeling approaches (see Orr 2005a,b for review). Considerable work exists, in particular, in the context of Fisher's geometric model (*e.g.*, Fisher 1930; Kimura 1983; Orr 1998; Welch and Waxman 2005; Martin and Lenormand 2006), Gillespie's mutational landscape model (*e.g.*, Gillespie 1983, 1984; Orr 2002), and various models of so-called “adaptive walks” on rugged fitness landscapes (*e.g.*, Kauffman and Levin 1987; Kauffman 1993). These approaches treat organismal adaptation as a process in a high-dimensional phenotype or genotype space characterized by pleiotropy and epistasis. Another common feature is that all these models assume a constant selection pressure. Yet, many organisms live in environments that are continually changing (*e.g.*, Hairston *et al.* 2005; Thompson 2005; Parmesan 2006; Perron *et al.* 2008). This fact is acknowledged in the so-called moving optimum model, which assumes that the selectively favored phenotype changes over time and, hence, experiences a mixture of stabilizing and directional selection.

The moving optimum model was originally devised in the field of quantitative genetics, where its primary aim was to study adaptation at the phenotypic level, without an explicit focus on the underlying genetics. In this context, the model has been used, for example, to study the ability of populations to avoid extinction in the face of global change (*e.g.*, Lynch *et al.* 1991; Lynch and Lande 1993; Bürger and Lynch 1995; Nunney 2003), to study the role of temporal variation for the maintenance of genetic variation (Bürger 2000; Bürger and Gimelfarb 2002), and to understand the evolution of sex and recombination (Bürger 1999; Waxman and Peck 1999). Only recently, several studies have started to investigate how a moving optimum affects the dynamics of adaptive substitutions (Bello and Waxman 2006; Collins *et al.* 2007; Kopp and Hermisson 2007; Sato and Waxman 2008). A key result is that, under conditions of slow environmental change, mutations with small effect tend to become fixed earlier than those with large effect (Bello and Waxman 2006; Collins *et al.* 2007; Kopp and Hermisson 2007). This is in direct contrast to what is found under constant selection after a sudden change in the environment (*e.g.*, Orr 1998, 2002, 2005a; Kim and Orr 2005).

In a previous note (Kopp and Hermisson 2007), we identified three parameter regimes in the moving optimum model that lead to qualitative differences in the dynamics of adaptive substitutions. These regimes are defined by three timescales that may dominate the adaptive process: the ecological timescale, which is given by the (inverse) speed of the optimum; the mutation timescale, which determines the waiting time for a new beneficial allele; and the time that it takes for this allele to rise to fixation. In particular, we found that small-effect mutations are fixed before large ones if the ecological timescale is dominating, that is, if the speed of adaptation is limited primarily by the speed of environmental change (see also Collins *et al.* 2007 for similar findings). In Kopp and Hermisson (2007), this result was obtained from an approximate calculation in a minimal model with two haploid biallelic loci. Here, we extend the previous analysis in two main ways. First, we significantly improve our approximation for the total fixation time of a given allele under moving-optimum selection. Second, we analyze the order of adaptive substitutions in a full haploid or diploid multilocus model. We confirm the preliminary results from the two-locus study, but show that regime boundaries depend on both the ecological parameters and the genetic architecture of the trait. The current article focuses on the order of substitutions over relatively short timescales. The complementary question about which mutations dominate long-term evolution will be treated in a separate study.

## THE MODEL

#### Assumptions on fitness:

Consider the evolution of a quantitative trait *z* that is under stabilizing selection with respect to a moving optimum *z*_{opt} = *z*_{opt}(*t*). We assume that the optimum increases over time at a constant rate *v*; that is,(1)Note that, unlike in Kopp and Hermisson (2007), the optimum increases indefinitely. Assuming Gaussian stabilizing selection, the fitness of an individual with phenotype *z* at time *t* is given by(2)where σ > 0 determines the strength of selection. The selection coefficient of a mutant with phenotype α in a wild-type population with phenotype 0 at time *t* is(3)As long as selection is weak (σ or |*z* − *z*_{opt}| small), Equation 3 can be approximated by(4a)where(4b)Thus, selection for α increases approximately linearly over time.

#### Assumptions on genetics:

The aim of this article is to model the first steps of the adaptive process in the moving optimum model. We assume that the trait *z* is influenced by *L* additive, unlinked, haploid or diploid loci. Each locus has two alleles, which we refer to as the wild-type and the mutant allele, respectively. The contribution of the wild-type allele to the phenotype *z* is 0, and the contribution of the mutant allele is α_{i} (). The α_{i} are referred to as (locus) mutational effects. In the one-locus case, we suppress the index *i* and simply write α_{1} = α. Mutations from the wild-type to the mutant allele (and vice versa) occur recurrently at rate μ per locus. In our calculations, we usually use the population mutation parameter θ = 2*N*μ, where *N* is the population size. Environmental variation is not modeled explicitly, but is subsumed in the selection parameter σ (*e.g.*, Bürger 2000). Our analytical results are compared to and extended by stochastic computer simulations, which are described in appendix a.

## RESULTS

In the following, we first derive an approximation for the expected time to fixation of a mutant allele at a single haploid locus. Consequences of diploidy are considered next. We then build on these results and complement them with computer simulations to determine the order of fixed substitutions in a model with multiple loci of unequal effect.

#### Expected time to fixation of a single mutant allele:

Consider a monomorphic population with the wild-type phenotype *z* = 0 and a mutant allele (arising by recurrent mutation at rate μ) that alters the phenotype to *z* = α. At time *t* = 0, the frequency of the mutant allele is *p* = 0. Our basic question is: How long does it take, on average, for the mutant allele to go to fixation?

Some care is needed to formulate a meaningful notion of “fixation.” Problems may arise, in particular, from the final phase of the fixation process where mutant frequencies are close to *p* = 1. In fact, a mutant may never reach *p* = 1 in models with back mutation (and, of course, in natural populations). More generally, in the context of adaptation, the importance of the accidental time point when a mutation finally reaches full fixation at *p* = 1 is questionable. To avoid these problems, and following previous work (Bello and Waxman 2006; Kopp and Hermisson 2007), we instead calculate the time until the mutant allele reaches majority status (*p* ≥ ) and, thus, dominates the locus genetics. For ease of terminology, we nevertheless refer to this state as fixation. (Additional simulations with a fixation criterion of *p* ≥ 0.9 yielded results that are qualitatively and quantitatively similar to those reported below.)

Denote the total time to fixation (*i.e.*, to *p* = ) by . Following Kopp and Hermisson (2007), this time can be subdivided into three periods,(5)where(6)is the *lag time* until the mutant allele becomes beneficial (*i.e.*, has a positive selection coefficient, see Equation 4a). *T*_{w} is the *waiting time* for a successful mutation (*i.e.*, for the appearance of a mutant allele that is not subsequently lost due to drift). Finally, *T*_{f} is the (narrow-sense) *fixation time* of this successful mutation (*i.e.*, the time needed for the frequency to increase from *p* = 1/(2*N*) to *p* = ). To calculate the waiting and fixation times, we need to distinguish two different cases. In the first case, fixation occurs from a mutant allele that appears after the end of the lag time, so that the waiting time is positive. We refer to this case as fixation from a “new” mutation. In the second case, fixation occurs from a mutant allele that is already segregating in the population at the end of the lag time, and the waiting time is zero.

##### Case 1—fixation from a new mutation:

In contrast to the lag time (Equation 6), the waiting time *T*_{w} is a random variable. It can be approximated as resulting from an inhomogeneous Poisson process with time-dependent rate θ*s*(*t*). The probability density function of *T*_{w} is(7a)(Kopp and Hermisson 2007), which has mean(7b)and variance(7c)(Note that, here, we use the symbol *T*_{w} for both the random variable and its realization.) This approximation assumes that, during the short time period in which the fate of a new mutation is determined, the selection coefficient does not change significantly. This is the case if or, with *s* = λ*T*_{w} (see below) and Equation 7b, if (from diffusion theory; J. Hermisson, unpublished results). The approximation is best if , such that mutations appear so late that selection is already strong. For , adaptation occurs from an already segregating allele, and *T*_{w} = 0 anyway. The approximation therefore performs reasonably well in the whole parameter space.

The fixation time *T*_{f} also is a random variable, but its variance is small relative to that of the waiting time (as is well known for constant selection, *cf.* Etheridge *et al.* 2006). Therefore, we treat it as deterministic. However, for fixation from a new mutation, still depends on the stochastic waiting time, because *T*_{w} determines the selection pressure during the fixation phase. In appendix b, we derive(8a)where(8b)The expected total time to fixation from a new mutation, , is given by(9)Since (as a function of *T*_{w}) is approximately linear around (not shown), Equation 9 can be approximated by(10)Inserting from (7b) into (8b), the value of *s** becomes(11)which, for small θ is approximately equal to .

##### Case 2—fixation from an already segregating allele:

If θ is large, there is a high probability that fixation occurs from a mutant allele that has appeared already during the lag time. In this case, *T*_{w} = 0 and the fixation time is given by the same expression as (8a), but with(12)(see appendix b). The expected total time to fixation from an already segregating allele, , is simply(13)In summary, combining Equations 8a, 10, 13, A7b, and A9, the total time to fixation can be estimated as(14a)where(14b)A necessary condition for the validity of this approximation is *s** > 2μ + 1/*N*. In appendix b, we show that the probability of fixation from an already segregating allele is(15)Thus, the expected total time to fixation from either a new or a segregating allele is(16)Numerical analysis of Equation 16 shows that the total time to fixation for a mutant with a given effect decreases with increasing *v*, σ, and θ. This means that adaptation is fastest if the environment changes quickly, selection is strong, and the populationwide mutation rate is high. The predictions from Equation 16 are in excellent agreement with results from stochastic simulations (Figure 1). Visible deviations occur only for the combination of large *v* and small θ (*v* = 0.01, θ = 0.02 in Figure 1), where the predicted time to fixation is slightly shorter than the observed one. The reason is that, for these parameters, the selection coefficient may become quite strong before a beneficial allele arrives. In this case, assuming a fixation probability of 2*s*(*t*) (as is done in the derivation of Equation 7a) is an overestimation. The true fixation probabilities are lower and, hence, the true waiting times are longer. With the more accurate approximation for the fixation probability, 1 − exp(−2*s*(*T*_{w} + *T*_{ℓ})), the waiting time distribution becomesThis equation provides a very good fit to the simulated waiting time distribution, but cannot easily be used in further analytical derivations.

For given, *v*, σ, and θ, the total time to fixation is a U-shaped function of α (Figure 1). That is, mutations with intermediate effect become fixed faster than mutations with either small or large effects. The reason is that mutations with small effects are only weakly selected for and, thus, have long waiting times (due to a low fixation probability) and long fixation times. In contrast, mutations with large effects are deleterious for a long time and, therefore, have long lag times.

The value of α that minimizes the total time to fixation is denoted by α*. In appendix c, we derive an analytical approximation (Equation C2), which elucidates how α* depends on the model parameters (see Figure 2). To a good approximation,(17a)(where ∝ denotes proportional) and, if θ is small,(17b)Thus, fast environmental change favors relatively large mutations, whereas a large populationwide mutation rate and strong selection both favor relatively small mutations. The reason is that fast environmental change leads to the buildup of a large lag between the wild-type phenotype and the optimal phenotype, which can then be bridged by a large mutation. In contrast, high mutation rates and strong selection lead to the fixation of small mutations before the lag becomes large. For a mutation of size α* we show in appendix c that the lag time is approximately one-third of the total time to fixation, suggesting that none of the three timescales is dominating the other two.

##### The diploid case:

So far, we have assumed that the locus under study is haploid. Here, we show how the previous analysis can be extended to the diploid case. We assume that heterozygotes have phenotype α > 0 and that mutant homozygotes have phenotype β ≥ α. This allows for various degrees of (partial) dominance, but not for over- or underdominance (nor for complete recessiveness). Furthermore, we define μ as the diploid (twice the haploid) mutation rate (such that the number of new mutations per generation remains *N*μ). Since, initially, the mutant allele occurs only in heterozygotes, the lag time and the waiting time are identical to those in the haploid case (*i.e*., Equations 6 and 7 remain valid). The same is not true for the fixation time, however, because the further spread of the mutant allele depends on both the heterozygous and the homozygous fitnesses. Nevertheless, there is a simple argument that allows us to derive a rough approximation based on the haploid result.

The mutant allele can invade the population once the heterozygotes have higher fitness than the wild type, which is the case for *t* > *t*_{1} = α/(2*v*). However, the wild-type allele can be replaced only once the mutant homozygotes have higher fitness than the heterozygotes, that is, for *t* > *t*_{2} = (α + β)/(2*v*). For *t* between *t*_{1} and *t*_{2}, heterozygotes have the highest fitness, and selection favors a stable polymorphism of the two alleles. During this time, the frequency of the mutant allele cannot exceed the equilibrium frequency of the polymorphism, which increases from 0 at *t*_{1} to 1 at *t*_{2}. At time *t*_{3} = β/(2*v*), in particular, the equilibrium frequency reaches (our criterion for fixation). Therefore, the total time to fixation can be estimated as the maximum of β/(2*v*) and of the time predicted for the haploid model (Equation 16). As seen in Figure 3, this approximation works surprisingly well for most parameter values, and also the minimum α* remains largely unchanged. Visible deviations from simulation results occur only for intermediate values of α that are close to the point where the line β/(2*v*) intersects with the haploid prediction. Note that for β = 0 (full dominance) the diploid and the haploid result coincide (confirmed by simulations, not shown).

#### The order of fixations in the multilocus case:

In the single-locus model, we have seen that a moving optimum favors fast fixation of mutations with an intermediate phenotypic effect, and that the size of the fastest mutation depends on the mutation rate and on the ratio of the speed of environmental change to the strength of selection. What does this result predict (qualitatively and quantitatively) for the order of fixed mutations in a full multilocus model, in which alleles with different effects “compete” against each other? Figure 4 shows simulation results for a model with 40 biallelic loci and uniformly distributed mutational effects α_{i}. Details of two example runs are shown in Figure 5. We see that the characteristic U shape from Figure 1 is indeed preserved (Figure 4, A–C). Mutant alleles with intermediate effects are fixed earlier than those with either small or large effects. Not surprisingly, the full pattern is seen only if the range of mutational effects is sufficiently large (Figure 4, D–F): If only small (large) alleles are available, then only the left (right) branch of the U appears, leading to a monotonic and almost linear decrease (increase) of fixed effects over time.

Despite the qualitative agreement with the single-locus results (Figure 1), there are two quantitative differences that require explanation. First, the size of the fastest alleles is shifted to the left, that is, to effects less than α*. And second, the pattern assumes a Y shape, rather than a U shape, if mutation rates are high (Figure 4C). These differences can be explained by two effects of multilocus genetics: a “sampling effect” for small θ and an “interaction effect” for large θ.

##### Sampling effect:

The sampling effect arises from a combination of two factors: For small θ, the waiting time has a large variance (Equation 7c), and small mutations have a shorter lag time than large mutations. As a consequence, even though small mutations might have a high expected total time to fixation, their minimal possible time is quite low. If the number of loci is large, there is a high probability that one of the small mutations will fix before the lag time of the larger mutations has passed. Thus, the sampling effect leads to an “advantage” for small mutations, which is strongest if the waiting time is the dominating timescale.

Since the sampling effect is a purely statistical phenomenon, it can still be estimated within a single-locus framework. In appendix d, we use our results from the one-locus model to derive an approximation for the distribution of the first fixation in the multilocus model. This approximation neglects epistatic interactions between mutant alleles at different loci (see below). Nevertheless, it is in good agreement with simulation results, as long as θ is not too large (compare dashed lines and shaded histograms in Figure 4, A–C). The reason is that, for small θ, narrow sense fixation times are small, and different alleles rarely segregate simultaneously (see Figure 5A). Note also that the mode of the distribution of the first fixation coincides well with the base of the U in Figure 4, A and B.

##### Interaction effect:

If θ is large, the probability of mutant alleles segregating simultaneously is high (see Figure 5B). These alleles interact with each other, because stabilizing selection entails epistasis for fitness: Each mutant allele that increases in frequency brings the population mean closer to the optimum and, thereby, decreases the selection coefficient of the other alleles. In particular, segregating small mutations can delay the fixation of large mutations by effectively prolonging their lag time.

The interaction effect has two main consequences: First, the size of the first fixation is smaller than predicted from the one-locus model, even if the sampling effect is negligible. This can be seen from Figure 4C, where the prediction for the first step according to Equation D5 (dashed line) is close to α*, but the observed distribution (shaded bars) is considerably smaller. Second, the interaction effect causes a delay in the fixation of some intermediate and all large mutations, thus transforming the U-shaped into a Y-shaped pattern (Figure 4, C and E). Figure 5B gives a more detailed view of this process. The initial decrease in effect sizes (the stem of the Y) arises because, at the beginning of the simulation, several alleles with small and intermediate effect start to increase in frequency more or less simultaneously. During their increase, these alleles suppress other mutant alleles, since their combined effect keeps the population mean close to the optimum. At about the “branching point,” the bulk of the initial alleles have become fixed, and only two classes of alleles remain: those with very small effect, which have a very long fixation time and slowly increase in frequency “in the background,” and those with large effects and a long lag time, whose frequency up to this point was almost zero and who now, one by one, quickly go to fixation.

The interaction effect depends on the cosegregation of beneficial alleles at multiple loci and is strongest if both small and large mutations are competing for fixation. This is demonstrated in Figure 4, C–F. The Y-shaped pattern is seen only if the effects of available mutations cover a large range (Figure 4, C and E). In contrast, if all mutations are large, they rarely cosegregate at higher frequencies. No delay in fixation is observed, and the size of fixed mutations increases almost linearly over time (Figure 4F). Finally, if all mutations are small, epistatic interactions between them are even smaller. Cosegregating alleles hardly interfere, and the size of fixed mutations decreases linearly over time (Figure 4D).

##### Robustness of results:

To test the robustness of the above results, we conducted a large number of supplementary simulations that consider additional factors (Figure 6).

Distribution of locus effects: Both theoretical and empirical evidence suggests that small beneficial mutations are more common than large ones (

*e.g.*, Bürger 2000; Barton and Keightly 2002; Orr 2005a; Eyre-Walker and Keightley 2007). As seen in Figure 4, a limited range of mutational effects can have a large influence on the order of fixations. However, if the range is sufficiently large, the exact shape of the distribution from which mutational effects are drawn seems to matter little. For example, results qualitatively similar to those in Figure 4 are obtained if the distribution of mutational effects is exponential instead of uniform (Figure 6A).Diploidy: As in the single-locus case, haploid and diploid genetics lead to qualitatively identical results (Figure 6B).

Variable mutation rate: So far, we have assumed that the per-locus mutation rate μ = 10

^{−5}, and we have varied the population size*N*to obtain various values of θ = 2*N*μ. Choosing a different value for μ has little effect in most cases (Figure 6C), showing that the crucial parameter is θ, not*N*or μ individually. Substantial differences occur only if μ is extremely high (≥10^{−3}in Figure 6C). In this case, recurrent mutation may become more important than selection in determining the dynamics of allele frequencies (μ >*s**/2, see remarks regarding Equations 14 and B5). This leads to a “boost” for small mutations, which are under the weakest selection and have the shortest lag times. As a consequence, the U shape disappears, and the size of fixed mutations increases monotonically over time (Figure 6C3).Linkage: The multilocus simulations presented above assume linkage equilibrium (see appendix a). However, additional simulations using individual-based modeling (Figure 6D; M. Kopp and J. Hermisson, unpublished results) show that the results are robust in the presence of moderate linkage (recombination rates ≥0.01, Figure 6D1). Strong linkage in combination with large θ, in contrast, leads to an increase in stochasticity and an increase in the size of the first fixation (Figure 6, D2 and D3). Such a shift is also known from other models of adaptation and has been described in the context of clonal interference (

*e.g.*, Gerrish and Lenski 1998; Park and Krug 2007) and the Hill–Robertson effect (Hill and Robertson 1966; Barton 1995). Individual-based simulations also show that our results are not biased by the quadratic approximation (A2).Standing genetic variation: The simulations above were started with frequency

*p*= 0 for all mutant alleles at time*t*= 0, but genetic variation was allowed to accumulate during the lag time (see fixation from “already segregating alleles” above). In another set of simulations, we let the population equilibrate to the initial conditions before starting to move the optimum. This allows for the buildup of standing genetic variation (Hermisson and Pennings 2005; Barrett and Schluter 2008). It turns out that the order of fixations is only slightly affected, with small loci benefiting more from standing variation than large ones (which are more deleterious initially). The reason for the small effect is that, for large Θ, most fixations occur from mutations that arise during the lag time. However, this is already accounted for in our model (Equation 15), and adding genetic variation that is present even before the lag time does not change the qualitative picture.Speed of environmental change: As in the one-locus case, we performed simulations with

*v*= 0.01 and*v*= 0.0001 and obtained qualitatively identical results (not shown). For*v*> 0.01, small populations sometimes are unable to follow the moving optimum and, instead, go to extinction (Bürger and Lynch 1995). While the persistence or extinction of populations in the face of environmental change is an important topic (Lynch*et al.*1991; Lynch and Lande 1993; Bürger and Lynch 1995; Nunney 2003) with obvious implications for conservation, it is not the focus of this study.

## DISCUSSION

The speed and pattern of phenotypic adaptation depend on a combination of factors that are external or internal to an organism. While new selection pressures on a phenotypic trait result from changes in the external environment, the response to selection depends on the details of the internal genetics, that is, on the rates and effects of new mutations. The moving optimum model with an explicit genetics is a first attempt to combine these various factors in a single modeling framework and to analyze their relative roles in the adaptive process.

In this article, we have focused on a specific issue: Assume that mutations with various effects on the trait compete to be recruited in the adaptive process: What is the expected order of adaptive fixations? Which size of mutations—small, intermediate, or large—is fixed first under the given environmental and genetic conditions? We have taken a two-step approach to address this question. In the first step, we consider a single mutant allele. We have derived a highly accurate approximation for how the expected total time to fixation of this allele depends on the internal (genetic) and external (environmental) factors. In the second step, we examine how these results are altered by multilocus effects in the context of a polygenic trait.

In both cases, we find that the fastest mutations are those with intermediate size. They have the shortest time to fixation in the one-locus model (Figures 1 and 3), and they are the first ones to reach fixation in the multilocus case (Figures 4 and 5). But what exactly does “intermediate” mean in this context? In the following, we give a qualitative and quantitative answer to this question by applying the notion of timescales and evolutionary regimes.

#### Fixation of a single mutation:

Our analysis for the total time to fixation of a single allele is based on the insight that fixation in the moving optimum model is controlled by three different timescales (Kopp and Hermisson 2007). The first timescale is ecological. It is given by the (inverse) speed of the trait optimum and governs the strength of selection. The other two timescales are genetic and measure the waiting time for a successful beneficial mutation and the time for the mutant allele to rise to fixation. Accordingly, the total time to fixation can be subdivided into three components that correspond to these timescales: The lag time until the allele becomes beneficial as a result of environmental change, the waiting time for a successful mutation (which may be zero if fixation occurs from an already segregating allele), and the (narrow-sense) fixation time.

The three timescales partition the genetic and environmental parameter space into three evolutionary regimes (Kopp and Hermisson 2007): In the *environmentally limited regime*, the total time to fixation of an allele is dominated by the lag time; in the *mutation-limited regime*, it is dominated by the waiting time; and in the *fixation time-limited regime*, it is dominated by the fixation time. The latter two regimes may be summarized as *genetically limited*. The principal distinction between them is the degree to which adaptation is rather stochastic (mutation-limited regime, populationwide mutation rate θ ≲ 1) or rather deterministic (fixation time-limited regime, θ ≳ 1). For our present issue, however (that is, the order of adaptive substitutions), what matters most is the distinction between the two genetically limited regimes and the environmentally limited regime.

As seen in Figure 1, the total time to fixation is shortest for mutations with intermediate effect α = α*. It increases with α for α > α* and decreases with α for α < α*. The limiting factor is different for small and large mutations, respectively. Mutations with small effects are under weak selection and, therefore, have low fixation probabilities and long fixation times. They are thus genetically limited (either by mutation or by fixation time, depending on θ). In contrast, mutations with large effects have increasingly long lag times before they become beneficial and are environmentally limited. For mutations of size α*, the lag time is approximately one-third of the total time to fixation (*i.e.*, it is equal to the mean of the waiting time and the fixation time), suggesting that fixation is fast because the genetic and environmental timescales are balanced. Because of all this, it makes sense to say that α* parameterizes the boundary between the two genetically limited regimes and the environmentally limited regime. The fastest mutations are located at this boundary.

In consequence, α* provides a natural scale for measuring allelic effects (see appendix c). Its numerical value decides which effect sizes are “small” or “large” in the context of the adaptive process. Of course, this classification is a purely relative one. Whether a mutation of a given absolute effect will be counted as small or large depends on the value of α* and, hence, on the ecological and genetic parameters.

Indeed, the relationship between α* and the model parameters is very simple (Equation 17). To a good approximation, α* depends on the speed of environmental change relative to the strength of selection, *v*/σ, and on the populationwide mutation rate θ. If θ is small, α* depends only on a single composite parameter(18)(see Equation 17b), which summarizes the genetic and ecological factors and should be amenable to empirical study. In nature, γ is expected to vary over several orders of magnitude. The speed of environmental change *v* experienced by a population depends on the ecological scenario and on the generation time, and it may vary from virtually imperceptible to extremely rapid. Similarly, the effective population size (and hence θ) differs greatly among species. Thus, for a given ecological change, α* will usually be larger for microbes (with large populations and short generation times) than, for example, mammals. On the other hand, the dependence of α* on γ as a third root is rather weak (Equation 17). For example, a 10-fold increase in the speed of environmental change leads to only a 2.14-fold increase in α*. For the parameter values covered in Figure 2, α* is on the order of 0.1 or 1. If all phenotype-related variables (α, σ, *v*) are measured in units of the environmental standard deviation, this is well within the range of mutational effects estimated from empirical data, which vary between 10^{−2} and 1 (Bürger 2000, p. 264). In addition, the distribution of mutational effects is thought to be highly leptokurtic; that is, mutations with large effects are overrepresented relative to a Gaussian distribution (Bürger 2000, p. 264). Therefore, it is likely that the majority of mutations will be smaller than α*, but a nonnegligible part is bound to be larger. This means, in particular, that the fastest mutations are likely to be available in the population.

#### The order of fixations at multiple loci:

In the single-locus model, we have argued that the fastest mutations are those at the boundary of the environmentally limited regime. A similar analysis can also be applied to the order of fixations in the multilocus model. However, to do so, we first need to define the notion of evolutionary regimes in a multilocus context. We will do so on a per-locus basis. Thus, in a multilocus model with unequal locus effects, one part of the loci may belong to one regime (*e.g.*, environmentally limited), whereas others fall into a different regime (*e.g.*, mutation limited). Note that this definition deviates slightly from the one used in Kopp and Hermisson (2007), where regimes were defined for the whole model, not for single loci.

Using only the timescale arguments from the single-locus case, the results shown in Figure 4 can be interpreted in the following way:

If all loci are in one of the genetically limited regimes, large-effect loci adapt before small-effect loci. This corresponds to the classical pattern known for adaptation under constant selection (

*e.g.*, Orr 1998, 2002; Kim and Orr 2005).If all loci are in the environmentally limited regime, loci with small effect are favored over ones with larger effect. This confirms the pattern described by Bello and Waxman (2006), Kopp and Hermisson (2007), and Collins

*et al.*(2007).In a general model with loci in both regimes, we see a branching pattern in the order of fixations, where loci with an intermediate effect fix first and mutations with either small or large effects follow up later. Thus, the linear patterns described under i and ii can be seen as special cases, which only arise if the trait has a genetic architecture with a restricted range.

Despite the qualitative agreement with the single-locus results, however, Figure 4 suggests that the boundary between the genetically limited and the environmentally limited regimes is shifted to the left, that is, to locus effects less than α*. This can be explained by two genuine multilocus effects that are acting at small or large mutation rates, respectively:

For small θ, a

*sampling effect*causes the average size of the first fixations to be smaller than α* (the fastest mutation in the one-locus model). The reason is that the waiting time for the first successful mutation in a “race” at multiple loci is shorter than the average waiting time at a single locus. This advantage is larger for loci with small effect, for which the waiting time is the dominating timescale. The sampling effect can still be captured analytically by using results from the one-locus model and ignoring interactions between loci (appendix d).For large mutation rates (θ ≳ 1), the fixation pattern is affected by interactions between cosegregating alleles. Figure 4, C and E, shows that the split into two branches is again shifted to smaller locus effects and sets in only after an initial phase (Y shape). The reason is epistatic interactions: For large θ, mutations with small effect are dominated by the (narrow sense) fixation time. At any given time, therefore, there will be multiple segregating mutations that have not yet reached fixation. These mutations contribute to a narrowing of the gap between the mean phenotype and the optimum phenotype and reduce the selection pressure on mutations at other loci. In particular, they increase the lag time for mutations with larger effect, which leads to a shift in the pattern of early fixations toward smaller loci.

#### Discussion of the modeling approach:

Previous modelers have used different approaches for studying the genetics of adaptation (Orr 2005a). Given the complex nature of the adaptive process, it is not surprising that different models have focused on different aspects. The adaptive walk and mutational landscape models view evolution as a search process on a high-dimensional fitness surface (Gillespie 1983, 1984; Kauffman and Levin 1987; Kauffman 1993; Orr 2002). Here, the key factor is epistasis, which makes the landscape rugged. Models of clonal interference focus on the effects of linkage and recombination on the fixation process (Barton 1995; Gerrish and Lenski 1998; Kim and Orr 2005; Park and Krug 2007). None of the above models contains an explicit phenotype. In contrast, Fisher's geometric model explores the consequences of pleiotropy in a high-dimensional genotype–phenotype map (Fisher 1930; Orr 1998; Welch and Waxman 2005; Martin and Lenormand 2006). In our moving optimum model, finally, the key point is the inclusion of a changing environment, that is, of ecology.

So far, the specifics of the alternative approaches—epistasis, pleiotropy, and linkage—are largely ignored in our model. However, all these factors are potentially important for a full understanding of the genetic basis of adaptation. Since the moving optimum model can be readily extended to include all these aspects, it provides a natural basis for an integrative approach toward the genetics of phenotypic adaptation.

Extending our model in this way might well lead to changes in some of our predictions. For example, many of our results depend on the fact that large mutations initially overshoot the optimum and only have a chance at fixation after a certain amount of time. This conclusion is inevitable in our model, where the phenotype space has only a single dimension. However, starting with Fisher (1930), many authors have stressed the necessity to view adaptation as an inherently multidimensional problem. In a high-dimensional phenotype space, it is conceivable that a mutation overshooting the optimum in a particular dimension is nevertheless selected for due to beneficial effects in other dimensions (*i.e.*, due to pleiotropic side effects). Thus, large mutations might become fixed earlier than it is possible in the present model.

Another way to compare the various models is with respect to the included timescales. In the models of Kauffman, Gillespie, and Orr, only the mutation timescale is relevant. Kim and Orr (2005) allow for an increased mutation rate, which introduces the fixation time as a second important scale. The interaction of the mutation and fixation timescales also determines the outcome of clonal interference models. Our approach adds as a third component the ecological timescale. As we have shown, the inclusion of this scale can change the predicted pattern of fixations substantially.

Models of constant selection predict that large mutations fix before small mutations (regardless of whether mutational effects are measured in terms of phenotype or fitness; Orr 2005a,b). Our model contains this result as a special case if all alleles are in the mutation-limited regime (*i.e.*, if their effects are sufficiently small). However, in the general case, it is mutations of intermediate size that fix first. If all mutations are in the environmentally limited regime, we can even observe a complete reversal of the pattern, with small mutations fixing before large ones.

So far, the classical prediction of large mutations fixing first has largely been confirmed by studies of microbial experimental evolution (Elena and Lenski 2003). However, almost all of these studies have subjected microbes to selection toward a fixed new optimum. One exception is the recent work by Perron *et al.* (2008), who exposed bacteria to increasing concentrations of an antibiotic. However, these authors did not resolve the size and order of individual substitutions. Also, even though the selection pressure increased over time, the selection target itself (resistance) was fixed. In other words, selection was directional with increasing intensity, not stabilizing with a moving optimum, as assumed in our study. This probably leads to predictions different from the ones derived here, as large mutations (conferring a high degree of resistance) are likely to be selected for from the very beginning (unless there are strong trade-offs involved). The above example shows that gradual environmental change can imply different scenarios, which call for different modeling approaches.

In this article, we have focused on the order of adaptive substitutions. Another important question is the *distribution* of fixed mutations over a longer bout of adaptation (Orr 1998, 2002, 2005a,b). However, while the present biallelic model can successfully predict the size of the first fixation (Equation D5), it is less suited for studying the distribution of subsequent adaptive steps. The reason is that we assume only a limited number of mutant alleles, all of which eventually go to fixation. As a consequence, these mutations are no longer available for future steps. For long-term adaptation, a more realistic approach is to use a continuum-of-alleles model (*e.g.*, Bürger 1999), in which the effects of new mutations at each step are picked from a (fixed) distribution. The main question in such a model is how the (long-term) distribution of fixed substitutions is related to the distribution of new mutations. This will be the subject of a separate study.

Key concepts and findings of the present study include the role of timescales and evolutionary regimes and the dependence of regime boundaries on the ecological parameters and the genetic architecture. They are not the results of specific model assumptions, but were developed and refined in a stepwise process from a single haploid locus to the full diploid multilocus case. We therefore expect that these findings also remain valid in a general context and can serve as guide for future model extensions.

## APPENDIX A: SIMULATION METHODS

Here, we describe the computer simulations. We assume that the population has a constant size *N*, individuals are hermaphroditic, and generations are nonoverlapping. Simulations start at time *t* = 0, where the population is assumed to be monomorphic for the wild-type alleles at each locus. Each generation is modeled in two steps (see Kopp and Hermisson 2007). First, we use deterministic equations to calculate the *expected* genotype frequencies after selection, (free) recombination, and mutation. Then, we use stochastic multinomial sampling to obtain the *actual* genotype frequencies in a finite population subject to genetic drift (*e.g*., Gillespie 1993). This means that the next generation is obtained by drawing *N* individuals (with replacement) from the frequency distribution calculated in step one. (In the diploid case, 2*N* haplotypes are drawn, assuming Hardy–Weinberg proportions before selection.)

As the number of loci (and hence the number of genotypes) increases, this model rapidly becomes computationally intractable. For the multilocus case, we therefore used a weak-selection approximation: As long as the population mean phenotype is sufficiently close to the optimum, linkage disequilibria can be neglected, and fitness can be approximated by the quadratic function(A1)This makes it possible to model the evolution of allele frequencies (instead of genotype frequencies) directly, using the equations(A2)(*e.g.*, Bürger 2005), where *p _{i}* is the frequency of the mutant allele at locus

*i*, Δ

*p*is the change in

_{i}*p*from one generation to the next, and is the population mean phenotype. When using this approximation, sampling is performed on alleles instead of genotypes.

_{i}## APPENDIX B: DERIVATION OF THE FIXATION TIME *T*_{F}

Here, we derive the narrow-sense fixation time *T*_{f}. For ease of notation, we define an alternative timescale , which measures time since the start of the fixation process (so that ). The selection coefficient on this timescale is given by(B1)Because depends on the waiting time, so does *T*_{f}. Once the fixation process has started (at ), the dynamics of the mutant allele frequency *p* can be approximated deterministically by the differential equation(B2)Here, the first term on the right-hand side describes logistic growth due to selection, the second term describes the input from mutation (both forward and backward), and the third term describes the contribution of genetic drift to the frequency increase of a mutation that is conditioned to go to fixation (Ewens 2004). Numerical solutions of this equation are in excellent agreement with simulation results (not shown). Unfortunately, Equation B2 does not have a closed analytical solution. To derive a tractable approximation, we neglect selection as long as it is weaker than mutation and drift, and we neglect mutation and drift when, together, they are weaker than selection. As a consequence, the fixation time is subdivided into two components,(B3)where *T*_{f,1} is the time span dominated by mutation and drift, and *T*_{f,2} is the time span dominated by selection. In the following, we need to distinguish between fixation from a new mutation and fixation from an already segregating allele.

#### Fixation from a new mutation:

For , we neglect selection (and back mutations) and assume that *p* initially increases due to mutation and drift alone, yielding . Mutation and drift remain dominant until . Using (B1), this yields(B4)At this time, the frequency of the mutant allele is(B5a)with(B5b)This approximation is valid as long as *p** is small. At the very least, it must be <0.5, which is true if *s** > 2μ + 1/*N*.

For , we neglect drift and mutation, such that Equation B2 becomes . With the initial condition , this can be solved to yield(B6a)where(B6b)and(B6c)and the condition (our criterion for fixation) leads to(B7)Equation 8a follows from (B5b) and (B3).

#### Fixation from an already segregating allele:

If fixation occurs from an already segregating allele, *T*_{w} = 0 and thus and . The expected frequency of the mutant allele at the end of the lag time, *p*(0), can be calculated as follows: During the lag time (*i.e*., for ), *p* is small, and Equation B2 can be approximated by . [Note that, here, the drift term 1/(2*N*) is omitted, because we are not conditioning on eventual fixation.] This can be solved to yield(B8)With the initial condition *p*(−∞) = 0, we find(B9)For , we follow the same approach as for fixation from a new mutation. For , *p* increases linearly according to until *p* = *p** (Equation B5a), leading to(B10) is again given by (B7), but with . Finally, the probability of fixation from an already segregating allele can be estimated as(B11)(J. Hermisson, unpublished data), where *p*(0) is the expected frequency of the mutant allele at the end of the lag time. Inserting (B9) into (B11) directly leads to (15).

## APPENDIX C: ANALYTICAL APPROXIMATION FOR α*

Here, we derive an analytical approximation for α*, the value of α that minimizes the total time to fixation in the one-locus model. Ignoring the dependence on α in the logarithmic term in Equation 14a (*e.g*., by setting α = 1 in this term only), the total time to fixation can be written as(C1a)where the constant *C* is given by(C1b)with(C1c)(C1d)In the following, we suppress the dependence on parameters in our notation. has a minimum at(C2)This approximation for α* is very accurate for small θ, but shows small deviations from the true value if θ is large (see Figure 2). Furthermore, plugging (C2) into (C1a) shows that, for a mutation of size α*, the total time to fixation, , is given by(C3)where the first term corresponds to the lag time and the second term to the sum of waiting time and fixation time. Therefore, within the limits of the approximation, the lag time of the fastest mutation is exactly one-third of the total time to fixation.

#### Scaling relationships:

We can use the above results to derive several useful scaling relationships. First, plugging (C1b) to (C1d) into (C1a) and ignoring the logarithmic term shows that α* is proportional to . For small θ, , and Equation C2 reduces to , which, if we again ignore the logarithmic term, is proportional to .

If we continue to focus on the case with small θ, we further see from (11) that . Thus, using (7b) and (8a), . For small θ, this sum is dominated by ; that is, *T*_{f} is small relative to *T*_{w} (and negatively correlated with it). Thus, the total time to fixation is determined by two components: the lag time *T*_{ℓ}, which is proportional to α/*v*, and the waiting time *T*_{w} (or rather, the sum of *T*_{w} + *T*_{f}), which is proportional to . However, if we choose to measure mutational effects in units of *v*/(σθ)—that is, we measure them relative to α*—we find that both *T*_{ℓ} and *T*_{w} + *T*_{f} can be measured in units of . At this scale, *T*_{ℓ} is proportional to α, and *T*_{w} + *T*_{f} is proportional to , but both are independent of the other parameters. In other words, α* defines a characteristic mutational step size and a characteristic time for adaptation.

## APPENDIX D: THE DISTRIBUTION OF THE FIRST FIXATION IN THE MULTILOCUS MODEL

Here, we derive the predicted distribution of the first fixation in the multilocus model. Assume that there are *L* loci with two alleles each. What is the probability that the mutant allele at locus *i* is the first to reach fixation? In the following, we present an approximation that is an extension of the two-locus approach by Kopp and Hermisson (2007). The key simplification is that allele frequency changes at different loci are assumed to be independent (neglecting epistatic interactions for fitness).

We first need to combine the cases of fixation from a new mutation and from an already segregating allele. By a slight abuse of notation, we use *T*_{w} = 0 to indicate fixation from a segregating allele. We also introduce additional indexes for the focal locus. Using (4b), (7a), and (15), the waiting time distribution (7a) at locus *i* then becomes(D1)Note that *g _{i}* has a discontinuity at

*T*

_{w,i}= 0. In particular, . The corresponding cumulative distribution function is(D2)For a given waiting time

*T*

_{w,i}, the total time to fixation is(D3)where(D4)Assume the mutant allele at locus

*i*has waiting time

*T*

_{w,i}. Then a mutant allele at another locus

*j*will fix first only if its waiting time is less than (

*T*

_{w,i}), which can be obtained by numerically solving the equation

_{i}(

*T*

_{w,i}) =

_{j}(

*T*

_{w,j}) for

*T*

_{w,j}. [More precisely, we need to distinguish three cases: If

_{i}(

*T*

_{w,i}) <

_{j}(0), then the allele at locus

*j*can never fix before the one at locus

*i*, even if it starts from an already segregating allele. Thus, can be set to a negative value (which has zero probability). If , then the allele at locus

*j*needs to start from a segregating allele and thus . Finally, if , then the allele at locus

*j*can start from a new mutation, and is the solution to the above equation.] Finally, the probability π

_{i}that the mutant allele at locus

*i*fixes before all other mutant alleles is given by(D5)

## Acknowledgments

We thank N. Barton, R. Bürger, C. Rueffler, A. de Visser, and two anonymous reviewers for helpful comments on the manuscript. M.K. and J.H. are members of the Mathematics and BioSciences Group at the University of Vienna, which is funded by a grant from the Vienna Science and Technology Fund to J.H.

## Footnotes

Communicating editor: M. W. Feldman

- Received December 16, 2008.
- Accepted February 24, 2009.

- Copyright © 2009 by the Genetics Society of America