# The Genetic Basis of Phenotypic Adaptation II: The Distribution of Adaptive Substitutions in the Moving Optimum Model

- Michael Kopp
^{1}and - Joachim Hermisson

- Mathematics and Biosciences Group, Max F. Perutz Laboratories and Faculty of Mathematics, University of Vienna, A-1030 Vienna, Austria

- 1
*Corresponding author:*University of Vienna, Dr. Bohr-Gasse 9, A-1030 Vienna, Austria. E-mail: michael.kopp{at}univie.ac.at

## Abstract

We consider a population that adapts to a gradually changing environment. Our aim is to describe how ecological and genetic factors combine to determine the genetic basis of adaptation. Specifically, we consider the evolution of a polygenic trait that is under stabilizing selection with a moving optimum. The ecological dynamics are defined by the strength of selection, , and the speed of the optimum, ; the key genetic parameters are the mutation rate Θ and the variance of the effects of new mutations, ω. We develop analytical approximations within an “adaptive-walk” framework and describe how selection acts as a sieve that transforms a given distribution of new mutations into the distribution of adaptive substitutions. Our analytical results are complemented by individual-based simulations. We find that (i) the ecological dynamics have a strong effect on the distribution of adaptive substitutions and their impact depends largely on a single composite measure , which combines the ecological and genetic parameters; (ii) depending on γ, we can distinguish two distinct adaptive regimes: for large γ the adaptive process is mutation limited and dominated by genetic constraints, whereas for small γ it is environmentally limited and dominated by the external ecological dynamics; (iii) deviations from the adaptive-walk approximation occur for large mutation rates, when different mutant alleles interact via linkage or epistasis; and (iv) in contrast to predictions from previous models assuming constant selection, the distribution of adaptive substitutions is generally not exponential.

AN important aim for both empirical and theoretical evolutionary biologists is to better understand the genetics of adaptation (*e.g.*, Orr 2005a). For example, among the multitude of mutations that arise in a population, which ones are eventually fixed and contribute to evolutionary change? That is, given a *distribution of new mutations*, what is the *distribution of adaptive substitutions* (or fixed mutations)? Here, distribution means the probability distribution of the effects of mutations on either the phenotype or the fitness of their carriers. In principle, both the distribution of new mutations and the distribution of adaptive substitutions can be measured empirically, the former from mutation accumulation experiments (Eyre-Walker and Keightley 2007) and the latter from QTL (*e.g.*, Bradshaw *et al.* 1998) or experimental evolution (Elena and Lenski 2003) studies. However, as only a small subset of all mutations is beneficial, such measurements are difficult. Therefore, a large role in studying the genetics of adaptation has to be played by theoretical modeling.

In recent years, several different approaches have emerged for modeling the process of adaptation. Considerable work exists, in particular, in the context of Fisher's geometric model (*e.g.*, Fisher 1930; Kimura 1983; Orr 1998; Welch and Waxman 2005; Martin and Lenormand 2006), Gillespie's mutational landscape model (*e.g.*, Gillespie 1983, 1984; Orr 2002), various models of so-called “adaptive walks” on rugged fitness landscapes (*e.g.*, Kauffman and Levin 1987; Kauffman 1993), and models of clonal interference in asexual populations (*e.g.*, Gerrish and Lenski 1998; Park and Krug 2007). Together, these models have yielded several robust predictions. For example, both Fisher's geometric model and the mutational landscape model predict that the distribution of adaptive substitutions should be approximately exponential (with respect to either phenotype or fitness) (Orr 1998, 2002, 2005a,b). This means that most substitutions have little effect, but that a significant fraction of the overall evolutionary change is due to a small number of substitutions with large effects. These results are in qualitative agreement with empirical data (Orr 2005a; Elena and Lenski 2003) and have shed new light on the classical debate about micro- *vs.* macromutationalism (Fisher 1930; Provine 2001).

One way to look at adaptation is to view selection as a sieve that transforms the distribution of new mutations into the distribution of adaptive substitutions (Turner 1981; Orr and Betancourt 2001). This perspective emphasizes the role of environmental factors and directly leads to the question of how different selective regimes (sieves) affect the adaptive process. Yet, almost all studies to date have focused on the simplest possible ecological scenario: a population that, after a sudden change in the environment, is now under constant stabilizing selection.

In reality, however, environmental change is often gradual rather than sudden (*e.g.*, Hairston *et al.* 2005; Thompson 2005; Parmesan 2006; Perron *et al.* 2008). To account for this possibility, several authors (Bello and Waxman 2006; Collins *et al.* 2007; Kopp and Hermisson 2007; Sato and Waxman 2008; Kopp and Hermisson 2009) have recently turned to the so-called moving optimum model, which was originally devised in the field of quantitative genetics (*e.g.*, Lynch *et al.* 1991; Lynch and Lande 1993; Bürger and Lynch 1995; Bürger 1999; Waxman and Peck 1999; Bürger and Gimelfarb 2002; Nunney 2003; Jones *et al.* 2004). In this model, the selectively favored value of a quantitative trait changes over time, such that the trait is under a mixture of stabilizing and directional selection. An important aspect of the moving optimum model is that it introduces an additional timescale (the timescale of environmental change), which is absent in the previous models.

In a recent article (Kopp and Hermisson 2009) and a previous note (Kopp and Hermisson 2007), we have used the moving optimum model to investigate the time to fixation of a single mutation and the order in which mutations of different phenotypic effect go to fixation. However, the fastest mutations in the short term are not necessarily those that dominate evolution in the long term. The present article focuses on this long-term evolution, which can be characterized by the distribution of adaptive substitutions.

## MODEL AND METHODS

We investigate the evolution of a quantitative trait *z* that is under stabilizing selection with respect to a moving optimum *z*_{opt}(*t*) (see Table 1 for a summary of our notation). At time *t*, the fitness of phenotype *z* is(1)where σ > 0 determines the strength of selection. We assume that the optimum increases linearly over time; that is,(2)where *v* is the speed of the optimum and δ(0) is its initial value. As we assume that, at *t* = 0, the mean phenotype of the population is 0, δ(0) equals the initial *phenotypic gap*. In general, we define the phenotypic gap at time *t* as(3)Note that δ has also been called the phenotypic lag (*e.g.*, Bürger and Lynch 1995). Here, we prefer the term gap to avoid confusion with the “lag time” for mutant alleles (Kopp and Hermisson 2009; see discussion), which is large if the initial gap is small.

The trait *z* is determined additively by *L* haploid or diploid loci with a continuum of possible alleles. Environmental variation is not modeled explicitly, but can be thought of as being subsumed in the selection parameter σ (*e.g.*, Bürger 2000). At each locus, new alleles appear by mutation at rate *u* (meaning that, in the diploid case, the mutation rate per individual allele is *u*/2), following a random walk mutation model: The allelic value of a mutant allele deviates from that of the parent allele by an amount α drawn from a probability distribution *p*(α), which is referred to as the *distribution of new mutations*. We assume that *p*(α) is symmetric around 0 with variance ω^{2} and that it decreases monotonically with |α|. Unless otherwise stated, we choose a reflected exponential distribution,(4)for our numerical examples. In appendix a, we also present some limited analytical results for other distributions (uniform, Gaussian, quartic Gaussian, and Gamma). It is convenient to use the mutational standard deviation ω as the unit for trait measurements and to rescale all dependent parameters and variables accordingly. Formally, this is done by requiring ω = 1 in (4). On the same scale, the speed of the optimum *v* is measured in mutational standard deviations per generation. Assuming a constant population size *K*, new mutations appear at rate *KLu*. In the following, we use the standard parameter Θ = 2*KLu* as a measure for the population- and genomewide mutation rate. The above model is analyzed by analytical approximations and stochastic simulations. We first describe the analytical methods.

## ANALYTICAL METHODS: THE ADAPTIVE-WALK APPROXIMATION

The aim of this article is to investigate the *distribution of adaptive substitutions* ϕ(α), that is, the distribution of the effects of those mutations that eventually go to fixation and contribute to adaptation. For an analytical treatment and heuristic understanding, we need some simplifying assumptions. In particular, we assume that a new mutation reaches fixation if and only if it survives stochastic loss while rare. Thus, the fate of a new mutation (fixation or loss) is decided immediately at its origin, and we assume that a successful mutation instantly reaches fixation. As a result, the population is nearly always monomorphic, and adaptation occurs in well-defined, quasi-instantaneous steps. A sequence of such steps is called an *adaptive walk* (see Figures 1A and 2A for examples). The adaptive-walk approximation is a standard tool for studying the adaptive process (*cf.* Kauffman 1993; Orr 2005a). Due to the assumption of instantaneous fixation, it ignores interactions between cosegregating alleles, which may arise from linkage or epistasis. The scope and limits of the adaptive-walk approximation are investigated in the simulation section below.

A single iteration of the adaptive walk can be characterized by three elements (Figure 1A): (i) the *initial gap* δ_{i}, which equals the phenotypic gap δ(*t*) at the end of the previous step (see Equation 3); (ii) the *final gap* δ_{f}, that is, the phenotypic gap at the beginning of (*i.e.*, right before) the current step; and (iii) the *step size* α. For given δ_{i}, both δ_{f} and α are random variables, with the distribution of the δ_{f} depending on δ_{i} and the distribution α depending on δ_{f}. The update between consecutive steps in terms of these elements is given by(5)where is the initial gap for the next step (Figure 1A).

Ultimately, the adaptive walk depends on the distribution of new mutations and on their fixation probabilities. In the moving optimum model, the latter depend explicitly on time. Using Haldane's approximation (Haldane 1927) and neglecting chance fixations of deleterious mutations, the fixation probability of a single new mutation at time *t* is(6)where *s*(α, *t*) is the selection coefficient of a mutation with effect α that appears at time *t*. (Conditions for the validity of this approximation are discussed in Kopp and Hermisson 2009.) Assuming weak selection, the selection coefficients in the moving optimum model are given by(7)(see Figure 1B). Here, and the overbar are used as equivalent notations for population averages, and denotes the marginal fitness of a mutation with effect α. To the leading order considered here, the selection coefficient depends on time only through the phenotypic gap δ(*t*) (Equation 3). Only mutations between 0 and 2δ(*t*) have a positive selection coefficient. In particular, for positive δ(*t*), only positive steps (α > 0) are selected for and for negative δ(*t*) only negative ones (Figure 1B). As the fixation probability for mutant alleles is determined by the heterozygote fitness, our analysis applies to both haploid and diploid populations. In the following, we derive analytical expressions for the distributions of the key variables of the adaptive walk (see Equation 5).

#### The distribution of the final gap δ_{f}:

The final gap δ_{f} is the phenotypic gap right before an adaptive step. Obviously, it is determined by the initial gap δ_{i} and by the time at which the step occurs. However, it is convenient to frame our analysis directly in terms of δ_{f}, with the time between steps entering only implicitly. In appendix a, we derive the distribution of δ_{f} conditioned on δ_{i}. In particular, we obtain the complementary distribution function *F*_{γ}(δ_{f} | δ_{i}), that is, the probability that the final gap is greater than δ_{f} for given δ_{i}. We find *F*_{γ}(δ_{f} | δ_{i}) = 1 for δ_{f} < δ_{i} and(8a)for δ_{f} ≥ δ_{i}, where(8b)and(8c)

Explicit expressions for *g*(δ) with particular choices of *p*(α) are given in appendix a. Note that the dependence of *F* on the ecological and genetic model parameters can be summarized in the single composite parameter γ. As we discuss below, this parameter plays a key role for the characteristics of the adaptive process. Differentiation of (8) yields the corresponding density function, which is(9)for δ_{f} ≥ δ_{i} and *f*_{γ}(δ_{f} | δ_{i}) = 0 for δ_{f} < δ_{i}.

#### The conditional distribution of the step size α:

Assume now that a step occurs when the total gap equals δ_{f}. The probability that it has size α is proportional to the fixation probability of the respective mutation, 2*p*(α)*s*(α | δ_{f}), where(10)is the selection coefficient given δ_{f} (see Equations 6 and 7). As only mutations with a positive selection coefficient (α between 0 and 2δ_{f}) can result in a substitution, we obtain the density of α given δ_{f} as(11)

An adaptive walk following Equation 5 can be easily simulated by specifying the initial gap for the first step and then iteratively applying (*i.e.*, drawing from) Equations 8 and 11. For further progress, however, we need to say something about the distribution of the initial gap values δ_{i} over the course of the adaptive walk.

#### The distribution of the initial gap δ_{i}:

The sequence of initial gaps δ_{i} can be viewed as a Markov chain. From (9) and (11), and using (5), the distribution of the initial gap, ρ(δ_{i}), evolves according to(12a)where the transition density *R* is given by(12b)The integration runs over all final gap values δ_{f} that are larger than the old initial gap δ_{i} and in absolute value larger than the new initial gap . Since the transition density *R* depends only on γ [and on the distribution of new mutations, *p*(α)], this must also hold for any *n*-step distribution of δ_{i} and, in particular, for the equilibrium distribution (δ_{i}). In extension, the same is true for the distributions of δ_{f} and α. In the limit γ → 0, we obtain *R*_{γ→0}( | δ_{i}) ∝ *p*(δ_{i} – )( – ()^{2}) for || ≤ |δ_{i}| and *R*_{γ→0}( | δ_{i}) = 0 otherwise. The equilibrium distribution for the initial gap ρ_{γ→0}^{(∞)}(δ_{i}) then converges to a point measure with weight 1 at δ_{i} = 0 (*i.e.*, the gap becomes infinitely small).

#### The unconditional distribution of the step size α:

We can express the unconditional distribution of step sizes α as(13a)where(13b)is the conditional distribution of α given the initial gap δ_{i}. To derive the unconditional distribution ϕ_{γ}(α), we need to decide on a distribution ρ_{γ} for the initial gap. In general, it is not possible to solve Equation 12a for the equilibrium distribution . For an analytical approximation of ϕ_{γ}(α), we instead focus on the distribution of δ_{i} at the beginning of the adaptive process. In particular, we assume that the population is initially (at the time the optimum starts moving) perfectly adapted. Then the distribution of the initial gap before the first step, (δ_{i}), is a point measure with weight 1 at δ_{i} = 0, and (13) evaluates to(14)for α ≥ 0 and (α) = 0 for α < 0. For a refined approximation, we can apply Equation 12 to (δ_{i}) and derive the distribution of the initial gap before the second step, (δ_{i}), and from that, the distribution of the size of the second step, (α). As we show below, and are good approximations for the equilibrium distribution .

#### The distribution of fitness effects at the time of fixation:

In addition to the phenotypic effects of adaptive substitutions, we might also be interested in their distribution of fitness effects (see Gillespie 1983; Gerrish and Lenski 1998; Orr 2002, 2006; Rozen *et al.* 2002; Martin and Lenormand 2008), where fitness is measured at the time the substitution takes place. To calculate this distribution, we first need to derive the distribution of fitness effects of new mutations for a given final gap δ_{f}. Due to the symmetry of the fitness function (10), there are generally two α-values for any selection coefficient *s* (corresponding to under- and overshooting of the optimum),(15)provided that . The cumulative distribution function (cdf) of the fitness effects of new mutations given δ_{f} is(16)where *A* and *S* are random variables for the step size and the selection coefficient, respectively, and *P*(α) is the cdf of the phenotypic effects of new mutations [the integral of *p*(α)]. Differentiation yields the corresponding density(17)for *s* ≤ σ and 0 otherwise. Here, the denominator is the magnitude of the partial derivates of α_{1,2} with respect to *s*. In analogy to Equation 13b, and focusing on the first step of the adaptive walk (*i.e.*, δ_{i} = 0), the distribution of fitness effects of adaptive substitutions is then(18)for 0 < *s* ≤ σ and 0 otherwise. Further results for the distribution of fitness effects are given in appendix c.

## SIMULATION METHODS

To further study adaptation in the moving optimum model, we developed an individual-based stochastic simulation program (written in C++, available upon request) similar to the one described in Bürger (2000, p. 273). The program simulates individuals whose genotypes are characterized by the allelic values at *L* additive genetic loci, which may be haploid or diploid. The sex ratio is always 1:1, and males and females are not modeled explicitly. Time is discrete and generations are nonoverlapping. Each generation, the following steps are performed:

*Viability selection*: For each individual, the phenotype is determined by summing over all allelic values, and the fitness is calculated according to Equation 1. With probability 1 –*w*, the individual dies and is removed from the population.*Population regulation*: If less than*K*individuals have survived selection, all of them proceed to reproduction. Otherwise,*K*individuals are selected randomly, and the rest is removed from the population. Thus,*K*is the carrying capacity of the environment.*Reproduction*: The selected individuals form random mating pairs, and each pair produces four offspring. This procedure ensures that the effective size of a well-adapted population is equal to*K*(Bürger 2000, p. 274). The genotype of the offspring is derived from the parent genotypes by segregation (in the diploid case), mutation (see model description), and recombination. Recombination between pairs of “adjacent” loci occurs at rate*r*≤ 0.5.

In a polymorphic population, an allele can be called fixed if the population has been taken over by that allele or its descendants (*e.g.*, Park and Krug 2007). Therefore, the program keeps track of the genealogical relationship between the alleles at a given locus. A substitution is recorded whenever there is a change in the root of such an “allele tree” (*i.e.*, when the surviving alleles get a new most recent common ancestor).

In all simulations, the initial population contained *K* identical individuals with phenotype 0. In the diploid case, the individuals were homozygous at each locus. The initial value of the optimum *z*_{opt}(0) = δ(0) was likewise set to zero. The simulations were iterated until reaching 1000 substitutions.

For some parameter combinations, the mean population fitness drops permanently below 0.5, such that reproduction (two offspring per individual) cannot compensate for the losses from viability selection. In this case, the population cannot keep track with the moving optimum and is doomed to extinction. Population extinction in the moving optimum model has been discussed in detail by Lynch *et al.* (1991), Bürger and Lynch (1995), and Nunney (2003). In the present study, however, we focused on parameter combinations where the population is able to maintain itself.

We also performed a limited number of simulations in which the optimum stopped moving after reaching a maximal value *z*_{max} (see Kopp and Hermisson 2007). The aim here was to determine the distribution of substitutions until the population had adapted to the new optimum. Unlike in the simulations with an indefinitely moving optimum, we counted a fixation event whenever the frequency of a mutant allele exceeded 0.9 for the first time. (The reason is that the ultimate replacement of all copies of an ancestral allele might take much longer than the movement of the optimum.) Simulations were stopped once the following criteria were met: (i) The optimum has reached the final value; (ii) the mean population fitness has exceeded a value of 0.999 at least once; and (iii) after i and ii are fulfilled, one final step has been recorded, or no step has occurred within 10,000 generations. For each parameter combination, we then determined the average distribution of adaptive substitutions over 100 replicated simulations.

## RESULTS

In the following, we first present analytical results based on the adaptive-walk approximation (5). Later, we compare the approximations to the results obtained from individual-based simulations.

#### Analytical results:

Figure 2 shows the distribution of adaptive substitutions in the adaptive-walk approximation for two different values of the parameter γ (Equation 8c) and for different stages of the adaptive process. The shaded areas show the equilibrium distribution , obtained from numerical iteration of Equations 8 and 11. is bimodal with a large positive and a very small negative part (indicating that backward steps are rare). In addition, the solid line in Figure 2B shows the distribution of the first step, , which is very close to for α ≥ 0, but does not extend to α < 0. Backward steps are, however, included in the distribution of the second step, , which is virtually indistinguishable from the equilibrium distribution (Figure 2C). In the following, we often neglect backward steps and use as a useful first approximation to .

Comparing the left and right columns of Figure 2, we see that ϕ_{γ} depends strongly on the composite parameter γ = *v*/(σΘ). To understand the role of this key parameter, it is instructive to rewrite Equation 8c as(19)Here, *v* and σ are our usual model parameters, which are measured in units of the mutational standard deviation ω (see model description), and and are the unscaled counterparts. The unscaled version in (19) makes explicit the strong dependence of γ on the width of the distribution of new mutations (γ ∼ 1/ω^{3}). Since γ itself is dimensionless, it does not depend on a particular scale. It relates two key determinants of the model behavior: While the numerator captures the dynamics of the environment, the denominator can be understood as a measure for the “adaptive potential” of the population, that is, its ability to follow the moving optimum. The adaptive potential includes both genetic factors (the mutation rate and the distribution of new mutations) and the static component of the selective environment (the selection strength σ). If γ is small, the environment changes slowly relative to the adaptive potential. This suggests that the adaptive process will be determined primarily by the environmental dynamics. In contrast, for large γ, the adaptive process is determined mainly by genetic factors, especially by the distribution of new mutations (see below). To make these ideas more explicit, we now formalize the concept of the selective sieve (see Introduction).

#### The selective sieve:

According to Equation 13, the distribution of adaptive substitutions is the product of three factors:(20a)Here, *p*(α) is the distribution of new mutations, which provides the raw material for adaptation. Ψ(α, γ) constitutes the *selective sieve*, which expresses the average fixation probability over the adaptive process and transforms new mutations into substitutions. As a function of α, the selective sieve splits into two basic components (compare Equation 13):(20b)(20c)(where ∝ indicates proportionality; all factors not depending on α are ignored).

The first component Ψ_{s} is the *static sieve*, which is independent of the ecological and genetic dynamics. It reflects the fact that (in our additive model) mutations with small phenotypic effects can have only small fitness effects and, hence, are often lost due to genetic drift. In consequence, the static sieve favors large adaptive steps, which is the classical argument raised by Kimura (1983) in the context of Fisher's geometric model.

The second sieve component Ψ_{d} is the *dynamic sieve*. Unlike the static sieve, it depends on γ and accounts for the dynamic changes in both the population and the environment. It is the dynamic sieve that removes mutations overshooting the optimum by more than the final phenotypic gap δ_{f} (see Figure 1B). Thus, in contrast to the static sieve, the dynamic sieve favors small mutations over large ones.

The three factors in Equation 20 combine to determine the shape of the distribution of adaptive substitutions (Figure 3). Clearly, the substitution rate of small mutations is limited by the static sieve Ψ_{s}. As new mutations with small phenotypic effects have low fixation probabilities, ϕ_{γ}(α) decreases linearly to zero for |α| → 0. This holds even if small mutations are abundant (as is the case for the exponential distribution of new mutations), as long as their density is bounded at α = 0 (since also the dynamic sieve Ψ_{d} is always bounded at α = 0).

In contrast, the substitution rate of large mutations is limited by both the dynamic sieve Ψ_{d} and the distribution of new mutations *p*(α). As shown in Figure 3, the relative importance of these two factors depends on γ. For small γ (Figure 3A), the right-hand tail of ϕ_{γ}(α) is determined mainly by the dynamic sieve, which sharply cuts off the right-hand tail of *p*(α). In contrast, for large γ (Figure 3B), the dynamic sieve is weak (*i.e.*, Ψ_{d} decreases only slowly with α), and the right-hand tail of ϕ_{γ}(α) is determined primarily by *p*(α). For intermediate γ, both factors are important. In appendix b, we show that they interact in a complex way, whose details depend on the fourth moment of *p*(α).

#### Genetic *vs.* environmental limitation:

The way in which γ determines the relative importance of the dynamic sieve and the distribution of new mutations in shaping the distribution of adaptive substitutions is a key result of this article. Indeed, the two examples shown in Figure 3 can be generalized to describe two different modes of adaptive evolution, or adaptive regimes, in the moving optimum model. In accordance with our interpretation of the parameter γ (see above, Equation 19), we say that the adaptive process (which can be characterized by the distribution of adaptive substitutions) is *environmentally limited* if γ is small and the right-hand tail of ϕ_{γ}(α) is mainly determined by the dynamic sieve. In contrast, we say that the adaptive process is *mutation limited* if γ is large and the right-hand tail of ϕ_{γ}(α) is mainly determined by the distribution of new mutations. These two cases are best viewed as the ends of a continuum. However, they can be studied in a “pure” form by considering the limits of very small or very large γ, in which only one of the two factors, Ψ_{d} or *p*(α), is operating.

##### Small γ—environmentally limited regime:

If , the speed of environmental change is small relative to the adaptive potential of the population, and the dynamic sieve is maximally efficient. As described after Equation 12b, the distribution of the initial gap converges to δ_{i} ≈ 0. Similarly, the final gap δ_{f} does not get much larger than δ_{i}, and the population tracks the optimum closely. In consequence, only very small adaptive steps are possible, and the distribution ϕ_{γ}(α) becomes independent of the shape of *p*(α). In the limit γ → 0, we find Ψ_{d}(α) → 0 for α ≠ 0, and the entire weight of ϕ_{γ=0}(α) is trivially concentrated at α = 0. For finitely small γ, we can approximate *p*(α) by a uniform distribution with the same mutational density Θ*p*(0) at α = 0. The distribution of adaptive substitutions then satisfies(21)An exact expression for is given in appendix a (Equation A18). In particular, the mean step size (*i.e.*, the expectation of ϕ_{γ}) in this limit can be expressed as(22)where Γ is the gamma function.

##### Large γ—mutation-limited regime:

In the opposite extreme of large γ, the environment changes fast relative to the adaptive potential of the population. As a consequence, the phenotypic gap δ_{f} will typically be large, and large mutational steps can contribute to the adaptive process. In this case, the dynamic sieve is not effective, and ϕ_{γ}(α) is limited for large α by the distribution of new mutations *p*(α). In the limit γ → ∞, Ψ_{d}(α) is constant. Only the static component of the sieve, Ψ_{s}(α), remains in place, and the distribution of adaptive substitutions satisfies(23)

Normalization yields the solution(24)where μ is the average size of a *positive* new mutation (α > 0) under the distribution *p*(α). Since Equation 24 is independent of the initial gap δ_{i}, it holds for all steps in the adaptive walk. For an exponential distribution of new mutations, (24) is a Gamma distribution with shape parameter 2 (Rozen *et al.* 2002). Furthermore, since , the mean step size evaluates to(25)

If the optimum increases indefinitely (as is assumed here), the limit γ → ∞ (*e.g.*, *v* → ∞) might appear unrealistic. However, Equation 24 also holds if the optimum stops moving once the phenotypic gap has reached a value δ_{f} sufficiently larger than μ. In this case, γ → ∞ corresponds to a sudden change in the environment (see *Simulation results* below). Indeed, an alternative derivation of ϕ_{∞}(α) is as a limit of the conditional distribution ϕ(α | δ_{f}) (Equation 11) for large δ_{f} (in which case *s* ∝ α). It is worth noting that Equation 24 corresponds to the distribution of the first substitution predicted by Kimura (1983) for Fisher's geometric model (see also Orr 1998).

For an exponential distribution of new mutations, the distribution of adaptive substitutions is well approximated by the limiting distribution if γ 0.1 and by the limiting distribution ϕ_{∞} if γ 10 (Figure 4). For γ ≈ 1, the distribution of adaptive substitutions is between those extremes, and the mean step size is less than that predicted by either of the two limits (Figure 4F). A more in-depth analysis of the transition between the environmentally and mutation-limited regimes is given in appendix b.

#### The distributions of the first steps and in equilibrium:

Finally, we can explain why the equilibrium distribution of adaptive substitutions is usually well approximated by the distributions of the first and second steps (see Figure 2). The explanation depends on whether the adaptive process is environmentally or mutation limited. This is because and have the same limit distributions for either γ → ∞ or , respectively (see above).

If γ is small and evolution is environmentally limited, the distribution of adaptive substitutions is strongly influenced by the dynamic sieve Ψ_{d}. In particular, the conditional distribution ϕ_{γ}(α | δ_{i}) (Equation 13b) depends strongly on the initial phenotypic gap δ_{i} (Figure 5A1). However, in this case, the gap typically remains small (Figure 5B1) and can be approximated by δ_{i} = 0.

In the opposite case of large γ (mutation-limited regime), the initial (and final) phenotypic gap can be substantial (Figure 5B2). Equation 13 shows that the only effect of the distribution of the initial gap, ρ(δ_{i}), on ϕ_{γ}(α) is through the dynamic sieve Ψ_{d}. As discussed above, however, the dynamic sieve is not effective in the mutation-limited case, and ϕ(α) is limited by the distribution of new mutations. Similarly, the value of δ_{i} has little influence on the conditional distribution ϕ_{γ}(α | δ_{i}) (Figure 5A2).

In summary, for small γ the phenotypic gap is small and can be approximated by δ_{i} = 0, whereas for large γ, the gap is large but has little effect on the distribution of adaptive substitutions. In consequence, the distribution of step sizes during an adaptive walk is not confounded by transient nonequilibrium effects. As long as the decisive parameter γ is kept constant, the equilibrium distribution is reached almost immediately and is well predicted by the distribution of the first (or second) step.

#### Simulation results:

In the following, we compare the analytical predictions from the adaptive-walk approximation to results from individual-based simulations. In contrast to the previous section, we use the unscaled parameters and introduced in Equation 19. Also the substitution effects are measured on a given constant scale, that is, not relative to ω. Using this constant scale makes explicit that an increase in γ due to a decrease in ω (see Equation 19) leads to a decrease in the average absolute substitution effect , whereas the relative effect α always increases with γ (compare Figures 4 and 6).

Figure 6 shows that the predictions from the adaptive-walk approximation are very accurate for populations experiencing a low influx of mutations (small Θ). A comparison with Figure 2 shows that backward steps are even rarer than in the adaptive-walk approximation. Therefore, the equilibrium distribution of adaptive substitutions is well predicted by the distribution of the first step, ϕ_{γ}^{(1)} (Equation 14). This is true for various shapes of the distribution of new mutations (including Gaussian, Gamma, and uniform distributions, data not shown) and in both the environmentally limited (large ω, small γ) and mutation-limited (small ω, large γ) regimes. Furthermore, the result holds in both haploid and diploid populations.

In contrast, if the influx of mutations into the population is high (large Θ), the observed distribution of adaptive substitutions deviates from the adaptive-walk prediction (Figure 7), and the direction of the deviation depends on the genetic architecture. For given Θ, substitutions are larger than predicted if the number of loci, *L*, is small but smaller than predicted if the number of loci is large (assuming the total individual mutation rate *Lu* is kept constant).

To explain these findings, recall that, in the adaptive-walk approximation, fixation (or loss) of mutant alleles is instantaneous and, therefore, adaptation occurs in well separated steps. In populations with large Θ, this assumption is violated. Instead, at any given time, different mutant alleles are likely to be segregating simultaneously. These alleles interact with each other and compete for fixation. The important point is that they do so in two different ways. One is based on linkage and the other one on epistasis.

##### Linkage effects:

If two mutations get established in different genetic backgrounds, the only way for them to go to fixation together is by a recombination event that brings them onto the same background. Otherwise, one of them will eventually outcompete the other. This effect has been described as the Hill–Robertson effect for sexual populations (Hill and Robertson 1966; Barton 1995) and as clonal interference for asexual populations (*e.g.*, Gerrish and Lenski 1998; Park and Krug 2007). In Figure 7, the linkage effect is strongest for small *L*, because decreasing *L* (while keeping Θ constant) amounts to reducing recombination. The effect of recombination is also investigated in Figure 8, where the recombination rate *r* between adjacent loci is varied.

Both Figures 7 and 8 show that, in the moving optimum model, increased linkage (interference) favors mutations with large phenotypic effect. Two factors contribute to this phenomenon. First, interference leads to a reduced substitution rate (because some mutations are lost even after they have survived drift), which increases the average gap between the mean phenotype and the optimum and leads to mutations with larger average effect being recruited. Second, a mutation with large phenotypic effect is more likely to outcompete an interfering small one than vice versa, even if both mutations initially have the same selective advantage. The reason is that, with a moving optimum, the selection coefficient of the large mutation will increase faster (see Equation 7).

As shown in Figure 9, the importance of the linkage effect increases with the rate of environmental change *ṽ*, but is relatively insensitive to the selection strength . This can be explained as follows: The amount of interference depends on the probability of cosegregating mutations, which is determined by the substitution rate (the number of substitutions per time) and the average fixation time *T*_{f} (*i.e.*, the time a mutation takes to increase from a single copy to frequency 1, *cf.* Kopp and Hermisson 2009). Increasing *ṽ* leads to a strong increase in the substitution rate, because adaptation needs to cover a larger phenotypic distance per time, whereas the mean step size increases only relatively weakly (see Figure 4F). It also leads to a decrease in the average fixation time (because overall selection is stronger), but this is not enough to compensate for the increased substitution rate. In consequence, more alleles are segregating at any given time, and the opportunity for interference increases. In contrast, if *ṽ* is constant and increases, the substitution rate increases only slightly (because the average step size decreases), and this effect approximately cancels with the decreased fixation time due to stronger selection. Because the linkage effect is (more or less) independent of , it also has no strong relation to γ and can be equally strong in the environmentally and mutation-limited regimes.

##### Epistasis effects:

The second way in which cosegregating alleles interfere is via epistasis. More precisely, the stabilizing selection component of the moving optimum model induces negative epistasis for fitness. Each mutant allele that brings the population mean phenotype closer to the optimum reduces the fitness of all other mutant alleles with the same sign. In other words, each segregating mutation that increases in frequency effectively reduces the speed of environmental change “perceived” by other mutations. This reduction in the effective *ṽ* favors small mutations and leads to a reduced average step size. The effect can also be summarized as follows: Several small mutations can be substituted together with overlapping fixation time intervals, while several large mutations cannot. The epistasis effect increases with increasing Θ, but does not strongly depend on *ṽ* and (data not shown). For tight linkage (small *L* or *r*), it tends to be overshadowed by the linkage effect, which works in the opposite direction. Therefore, the epistasis effect is best seen if the number of independent loci is large (see Figure 7C with *L* = 100).

#### Limited increase of the optimum:

So far, we have assumed that the optimum increases indefinitely, which is, of course, an idealization of biological reality. To test the robustness of our results for shorter bouts of adaptation, we now assume that the optimum stops after reaching a maximal value *z*_{max} (see *Simulation methods*). To avoid complications caused by interference between cosegregating alleles (see above), we focus on populations with small Θ.

In the model with an indefinitely moving optimum, we have shown that the distribution of step sizes changes little over the course of the adaptive process. Indeed, a key result is that the long-term equilibrium distribution is very similar to the distribution of the first steps (Figure 2). In contrast, with a limited increase of the optimum, the size of the later steps is bound to decrease as the population approaches the final optimum. This is, indeed, observed in our simulations (right-hand columns in Figure 10).

We can now ask for how long the optimum must move so that the distribution of adaptive steps is accurately predicted by the adaptive-walk approximation (which assumes an indefinite movement). As illustrated in Figure 10, this is the case if the ratio of the total movement of the optimum to the predicted average size of a single step, , is sufficiently large. Most steps will then occur during the moving phase, and the final steps during the constant phase have no significant effect on the overall distribution (Figure 10C). Our simulations indicate that a value of is sufficient for a good fit with the prediction from Equation 14. This result is independent of whether the ratio is large because *z*_{max} is large or because is small (additional results not shown).

The sudden change scenario assumed in most previous studies of adaptation can be seen as a limiting case of the moving optimum with *v* → ∞. In *Analytical results*, we have argued that the limiting distribution ϕ_{∞}(α) (Equation 24) is an approximation for the sudden change scenario, provided the jump of the optimum (which is equivalent to the final phenotypic gap δ_{f}) is large enough. This is illustrated in Figure 11. For a small jump, there is an excess of small and negative steps relative to the prediction from Equation 24, but for a large jump, the fit is very good. Consequently, even the extreme case of a sudden environmental change is well approximated in the moving optimum framework if (and thus the total number of adaptive steps) is moderately large.

## DISCUSSION

Adaptive evolution is the product of two factors: (i) genetic variation, which ultimately derives from mutation, and (ii) the selective environment, which “sieves” the genetic variation and turns a subset of new mutations into adaptive substitutions. Both factors are inherently dynamic: Mutation is a stochastic process, and new selection pressures arise as a result of environmental change. The aim of this study (and the recent study by Kopp and Hermisson 2009) was to develop a model of adaptation that accounts for the dynamics at both levels and to analyze their combined effects. A natural framework for such a study is provided by the moving optimum model, which combines simple genetics (a single polygenic trait under stabilizing selection) with a very basic “ecology” (the change of the optimum at constant speed).

The genetic basis of adaptation can be characterized by the (expected) distribution of adaptive substitutions, ϕ(α), that is, by the distribution of the phenotypic effects of those mutations that are fixed during a bout of adaptation. Here, we have focused on how this distribution depends on the genetic and environmental parameters. In the moving optimum model, the genetic parameters comprise the population and trait-wide mutation rate Θ and the distribution of new mutational effects *p*(α). The selective environment is characterized by the selection strength σ and the speed of the optimum *v*. To derive analytical results, we have used an adaptive-walk approach, which assumes immediate fixation of adaptive alleles. In addition, we performed individual-based simulations, which cover the whole fixation process and account for potential interactions between cosegregating mutant alleles.

#### The shape of the distribution:

Most adaptive substitutions are in the forward direction of the moving optimum (*e.g.*, Figure 6). Although backward steps after a previous “overshooting” (Sato and Waxman 2008) are possible, they are very rare (Figure 2). The distribution of forward substitutions is unimodal: Most substitutions are of an intermediate effect, while both very small and very large steps are rare. Furthermore, the distribution is remarkably stable over the course of the adaptive process. That is, if the optimum of a perfectly adapted population starts moving, already the distribution of the first step is very similar to the equilibrium distribution after many steps. As a technical consequence, we can use the distribution of the first step (which is easier to derive) as a valid approximation for ϕ(α). Furthermore, all predictions about ϕ(α) also hold for short-term adaptation, where the movement of the optimum stops after a limited phenotypic distance. Our simulations show that the distribution of the first step approximates the distribution over an entire adaptive bout if the optimum changes by at least five times the average step size (Figure 10).

#### The selective sieve:

The distribution of adaptive substitutions can be decomposed into three factors: ϕ(α) = *p*(α)Ψ_{s}(α)Ψ_{d}(α) (Equation 20), where *p*(α) is the distribution of new mutations, and Ψ_{s} and Ψ_{d} are two *sieve functions*, which describe the effect of selection (Figure 3). The *static sieve* Ψ_{s} ∝ |α| reflects the fact that, independently of the ecological dynamics, alleles with small phenotypic effect have small fixation probabilities. As a consequence, ϕ(α) always approaches zero for α → 0. The *dynamic sieve* Ψ_{d}(α) captures the ecological and genetic dynamics. It cuts off mutations with large phenotypic effects when these would overshoot the moving optimum by more than the current phenotypic gap. The unimodal shape of ϕ(α) arises from the combined action of the two sieves.

#### The parameter γ:

To a large degree, the combined effect of the genetic and environmental dynamics on ϕ(α) depends on a single composite measure γ, which determines the properties of the dynamic sieve Ψ_{d}. In arbitrary units of the trait, γ is given by(Equation 19), where the distribution *p*(α) of new mutations enters through its standard deviation ω. The numerator describes the rate of environmental change, whereas the denominator can be understood as a measure of the adaptive potential of the population to follow the optimum.

#### Adaptive regimes:

Depending on the value of γ, we can distinguish two adaptive regimes, which form the endpoints of a continuum (Figure 4). The *mutation-limited* regime (large γ) is characterized by a large average gap between the population mean trait value and the optimum. As a consequence, the population follows the optimum with large steps, whose size is restricted only by the distribution of new mutations *p*(α). In the limit of γ → ∞, the dynamic sieve Ψ_{d} is “flat,” and the distribution of adaptive substitutions is independent of the ecological dynamics: ϕ(α) ∝ ∣α∣ *p*(α) (Equation 24). This limiting distribution corresponds to the one suggested by Kimura (1983) for Fisher's geometric model. In contrast, in the *environmentally limited* regime (small γ), the population tracks the optimum with only a narrow gap. Large mutations are usually not adaptive and are cut off by the dynamic sieve (which in this case is “steep”). Therefore, evolutionary change occurs in many small steps. For very small γ, ϕ(α) becomes independent of the shape of the distribution of new mutations and depends only on the selective sieve: ϕ(α) ∝ Ψ(α) (Equation 21).

The transition between the mutation and environmentally limited regimes occurs roughly over an interval γ ∈ [0.1, 10]. Biologically, γ is determined by the population size, the size of the mutational target, and the rate of environmental change (in mutational standard deviations per generation). In nature, these parameters can vary over many orders of magnitude. Small values of γ are expected, for example, for unicellular organisms with large population sizes and short generation times or under the gradual climatic shifts of geologic history. More generally, environmental limitation will almost necessarily play a role if the population is able to track a moving optimum for a long period of time. In our model, extinction risk is high already for γ ≥ 10 (simulation results not shown; *cf.* Lynch *et al.* 1991; Bürger and Lynch 1995; Nunney 2003). Purely mutation-limited adaptation is therefore possible only over short time spans.

#### Interaction effects:

As long as the frequency of adaptive substitutions is low or moderate, the results of the adaptive-walk approximation are fully confirmed by individual-based simulations. For strong “adaptive traffic,” however, we observe deviations due to interactions between cosegregating beneficial alleles (which are not accounted for in the adaptive-walk approximation). Interaction effects are most relevant for high mutation rates (Θ > 1), but are also favored by a fast moving optimum (which requires many substitutions per unit time). Therefore, the importance of interactions does not depend on γ and can be large in both the mutation-limited and the environmentally limited regime. We find two different types of interactions, which have opposite effects:

Linkage disequilibria between cosegregating alleles lead to Hill–Robertson (or clonal) interference, which induces a shift of ϕ(α) toward larger steps. This effect is well known from previous studies (

*e.g.*, Barton 1995). It can be strong if linkage is very tight (*r*< 0.01) or, equivalently, if the trait is affected by mutations at only a few loci (*L*< 10).In contrast, interactions through negative epistasis for fitness (due to stabilizing selection) favor smaller alleles over larger ones. The reason is that multiple small alleles can have overlapping fixation times (because epistatic interactions between them are weak), whereas multiple large ones cannot.

#### Comparison to Kopp and Hermisson (2007, 2009):

The present article, which focuses on the long-term distribution of adaptive substitutions, complements two earlier studies on the moving optimum model (Kopp and Hermisson 2007, 2009), in which we investigated the fixation *time* of individual mutations and the *order* of substitutions over short bouts of adaptation. In the following, we summarize the picture arising from all three studies and highlight the similarities and differences between them.

All three articles stress the importance of the ecological timescale. They show that the rate of environmental change has a significant impact on the pattern of adaptive substitutions. If environmental change is sufficiently slow, the ecological timescale will influence or even dominate the adaptive process, causing the size of the most relevant mutations to decrease. In both the present study by and the recent study by Kopp and Hermisson (2009), the impact of the environment can be determined by a single composite parameter γ, which relates the rate of environmental change to the adaptive potential of the population. On the technical side, analytical results in all three articles build on a single-locus framework, in which interactions among segregating alleles are ignored. Simulations show that this approximation is valid in a large part of the biologically relevant parameter space.

However, there are also notable differences among the three articles. Most fundamentally, they address different questions. In Kopp and Hermisson (2007, 2009), we investigated the *order* of fixations and aimed at identifying the *fastest* substitution, that is, the substitution that initiates the adaptive response. In contrast, the present article considers the *distribution* of substitutions and, in particular, the *most frequent* substitution over a complete adaptive bout. Although the fastest and the most frequent substitution are often similar, they are not identical. The fastest substitution depends on the total *time* to fixation, whereas the most frequent one depends mainly on the (average) fixation *probability*. This difference has important consequences for the definition and demarcation of adaptive regimes.

When studying the order of substitutions (Kopp and Hermisson 2007, 2009), mutations with different effects can be seen as participating in a “race to fixation.” The outcome of this race depends on the total time to fixation, which has three components: the lag time until the allele becomes beneficial; the waiting time for the appearance of a successful mutation (which, in turn, depends on the mutation rate and the fixation probability); and the narrow sense fixation time, that is, the time the successful allele needs to sweep through the population. Accordingly, we distinguished three adaptive regimes with regard to the race-to-fixation problem (Kopp and Hermisson 2009): the environmentally limited regime, in which the total time to fixation is dominated by the lag time; the mutation-limited regime, in which it is dominated by the waiting time; and the fixation time-limited regime, in which it is dominated by the fixation time. Small mutations are either mutation or fixation time limited (*i.e.*, they have long waiting or fixation times), whereas large mutations are environmentally limited (long lag times). The fastest mutations are those of intermediate size, which are located at the boundary to the environmentally limited regime.

Just as the fixation time in the previous articles can be decomposed into the lag time, the waiting time, and the fixation time, the distribution of adaptive substitutions can be decomposed into three factors: the distribution of new mutations *p*(α), the static sieve Ψ_{s}, and the dynamic sieve Ψ_{d} (Equation 20). In part, these decompositions are analogous. In particular, the lag time is implicit in the dynamic sieve and can be seen as the reason for why large mutations are cut off. However, there are also two important differences.

First, for the distribution of adaptive substitutions studied in the present article, the narrow sense fixation time is unimportant (and, consequently, is ignored in the adaptive-walk approximation). Once a mutation is picked up by selection and destined to go to fixation (which depends on the fixation probability), it does not matter how long the allele needs to actually reach frequency 1. In consequence, there is no fixation time-limited regime with respect to the distribution problem. (In fact, the fixation time still plays an indirect role by influencing the probability of interference between cosegregating alleles. With some right, one could therefore define an “interference regime” for the distribution problem to complement the fixation time-limited regime in the race-to-fixation problem. Note, however, that, in the race problem, the fixation time is relevant even in the absence of interaction effects.)

Second, in the present article, the distribution *p*(α) of new mutations enters as a (necessary) new player influencing the distribution problem. While, in the previous article, all mutations participating in the race to fixation were equipped with the same (recurrent) mutation rate θ, the traitwide mutation rate Θ in the present article is unevenly distributed across mutations with different effect. Since the waiting time for a successful mutation depends on both the fixation probability (*i.e.*, the static sieve) and the mutation rate, there are now two ways in which the substitution rates of two mutations can differ as a result of differential “mutation limitation”: either due to differences in the fixation probability or due to differences in the mutation rate. The former possibility (differential fixation probabilities) is present in both articles and explains the pattern for small mutations (long fixation time, low substitution rate). The latter possibility (differential mutation rate) occurs only with a nonuniform distribution of new mutations and contributes to limiting the substitution rate of large mutations in the present article. However, for which kind of mutations (intermediate or large ones) this second kind of mutation limitation is more or less important than environmental limitation (*i.e.*, the dynamic sieve) depends on the precise shape of *p*(α) (appendix b, Figure B1). In consequence, determining the limiting factor for the substitution rate of alleles with specific effect (as was done for the fixation time in Kopp and Hermisson 2009) can be complex (see appendix b). Instead, for the distribution problem, it proved more useful to define the adaptive regimes with respect to the whole adaptive process, that is, the entire distribution of substitutions. Indeed, the two regimes are best characterized via the limiting distributions given by Equations 21 and 24, respectively.

The nonuniform distribution of new mutations is also the reason why there is no direct correspondence between the summary parameters γ appearing in the two studies. In Kopp and Hermisson (2009), a parameter γ = *v*/(σθ) determines the effect of the fastest mutation α*, which defines the boundary to the environmentally limited regime. Unlike γ in the present study, it depends on the per-locus mutation rate θ, not on the per-trait mutation rate Θ. For nonuniform *p*(α), these two parameters cannot be straightforwardly translated into each other. Nevertheless, the two kinds of γ play essentially similar roles in describing the transition between mutation-limited and environmentally limited evolution.

#### Comparison to other models of adaptation and future directions:

Previous studies of adaptation to a single sudden change in the environment suggest that the distribution of adaptive substitutions should often be roughly exponential (Orr 1998, 2005a). In particular, an exponential distribution of adaptive substitutions has been obtained for both Fisher's geometric model (Orr 1998) and Gillespie's mutational landscape model (Orr 2002), although these two models are different in many aspects. For example, mutational effects are measured as phenotypic distances in Fisher's model and as fitness effects in the mutational landscape model. Yet, there are two main reasons why the two models show similar results (Orr 2005b): First, in both models, the distribution of new mutations is biased toward small mutations. Second, in both models, each adaptive step leads to a rescaling of the original situation: The population is closer to the optimum (in Fisher's model), or the likelihood of further beneficial mutations has decreased (in the mutational landscape model). As a consequence, the initial steps lead to sizable increases in fitness, whereas the later steps are mere “fine tuning.” It is this second point that is fundamentally different in the moving optimum model. For this reason, Orr (2005b) predicted that the distribution of adaptive substitutions in a moving optimum version of Fisher's geometric model should resemble that of the first step (originally calculated by Kimura), which is unimodal rather than exponential. This is, indeed, what was found in the present study. A similar unimodal distribution is also found for the fitness effects of adaptive substitutions (see appendix c, Figure C1).

However, a more important point seems to be what exactly determines the shape of the distribution of the first step. Here, we have shown that a “Kimura-type” distribution (Equation 24) arises only in the mutation-limited regime, where the mutational step size is limited by genetic constraints. In the environmentally limited regime, in contrast, the average mutational step size is no longer small relative to the distance to the (moving) optimum, and the distribution of adaptive substitutions is strongly influenced by the dynamic sieve. This leads to a significant decrease in average step size and also to a change in the shape of the distribution. In particular, the right-hand tail of ϕ(α) is proportional to exp(–*C*α^{2}) in the environmentally limited regime (see appendix b, Equation B1).

Interestingly, we usually do not find an exponential distribution of adaptive substitutions even in the case of a sudden and limited jump of the optimum (Figure 11). Indeed, if the jump is sufficiently large relative to the expected size of a single step, the overall distribution of step sizes is well approximated by the unimodal Kimura-type distribution (Equation 24). As seen in Figure 11C2, the mean size of subsequent steps declines only slightly, and the overall distribution of step sizes is almost identical to that predicted for the first step. The formal reason is that the distribution of new mutations *p*(α) introduces a fixed trait scale into the problem (the standard deviation ω) that does not rescale with the distance of the population to the optimum. As a consequence, the self-similarity arguments evoked by Orr (1998, 2005a) will generally not apply. Self-similarity is maintained, however, for the special case of a uniform distribution *p*(α), and, in this case, the distribution of the absolute values of step sizes approximately follows an exponential (in the limit of strong selection and weak mutation to avoid interference, results not shown). For the general case, Orr (1998) suggested to rescale the distribution of new mutations after each step with the distance to the optimum (*i.e.*, to consider percentage changes with respect to the distance to the optimum). Obviously, this procedure reinstalls self-similarity. However, the assumption implies that the effect not only of *substitutions*, but also of *new mutations* decreases to zero as the population approaches the optimum. This seems to be biologically implausible at least for our model (but also for Fisher's geometric model considered by Orr 1998).

The current moving optimum model has been designed as a minimal model of adaptation to gradual environmental change. In its present form, it does not yet reflect the high dimensionality and nonlinearity of the genotype-to-phenotype-to-fitness map, which has rightfully been viewed as an essential characteristic of whole-organism adaptation (Fisher 1930; Orr 2005a,b). Future extensions should include multilocus phenomena, such as pleiotropy (as in Fisher's model) and epistasis (as in the mutational landscape model), and strive for a better understanding of the effects of linkage (as in clonal interference models). Another important topic is the role of standing genetic variation (Orr and Betancourt 2001; Hermisson and Pennings 2005; Barrett and Schluter 2008) in adaptation to gradual environmental change. Most generally, for a full understanding of adaptation, it will be necessary to take into account both the genetic complexities of real organisms and a realistic ecology.

So far, almost no data exist for testing the predictions of the moving optimum model. In particular, experimental evolution studies with microbes have almost exclusively employed constant selection regimes. Notable exceptions are provided by Collins and Bell (2004) and Perron *et al.* (2008), but a recent study by Collins and De Meaux (2009) is the first to (partly) resolve the evolutionary dynamics at the genetic level. For the future, we hope that more experiments will include gradual variation in environmental factors.

In microbial evolution experiments, it is often not possible to measure quantitative traits that mediate the relationship between genotype and fitness. Nevertheless, key assumptions and predictions of the moving model can also be tested in the absence of phenotypic data. For example, a basic prediction is that adaptation involves more substitutions in gradually changing environments than under constant selection. The number of substitutions might be estimated from DNA sequence data or by marker-based approaches as in Imhof and Schlötterer (2001). Furthermore, a hallmark of the moving optimum scenario is a reversal of fitness effects over time (*i.e.*, selection favors first small and then large mutations). Thus, fitness assays should reveal evidence for antagonistic pleiotropy (fitness trade-offs) over the environments encountered during the experiment, just as previous studies have found antagonistic pleiotropy in environments not encountered by the population (Hughes *et al.* 2007a,b).

#### Conclusions:

In the present study and the two previous articles (Kopp and Hermisson 2007, 2009), we have shown that the adaptive process is strongly influenced by the dynamics of the selective environment. For adaptation to gradual and long-term environmental change, this influence is almost unavoidable. In particular, if the environment changes slowly, environmental factors are more important than genetic factors. We believe that this observation is robust also for other (more complex) models and may change or even reverse predictions obtained under the assumption of constant selection (*e.g.*, small instead of large steps first, unimodal rather than exponential distribution of substitutions). Thus, our most general conclusion from this series of studies is also the most important one: Ecology matters for the genetic basis of adaptation.

## APPENDIX A: THE DISTRIBUTION OF THE “FINAL GAP” δ_{F} IN THE ADAPTIVE-WALK APPROXIMATION

As outlined in the main text, a single step of an adaptive walk in the moving optimum model can be characterized by three elements: The *initial gap* δ_{i} is the distance of the population mean phenotype to the optimum right after the previous step, the *final gap* δ_{f} is this same distance right before the new step, and α is the *step size*. Here, we derive the distribution of the final gap δ_{f} for a given initial gap δ_{i}. This distribution is a central piece in our derivation of the distribution of step sizes ϕ(α).

Without loss of generality, we assume that, at time *t* = 0, the population is monomorphic with phenotype *z*(0) = 0. Selection follows the moving optimum model (1) with an optimal phenotype *z*_{opt}(*t*) = *vt* + δ_{i}, where the initial phenotypic gap δ_{i} can be positive or negative. New mutations occur at rate Θ/2 = *KLu*. Their effect α on the focal trait follows the distribution *p*(α).

It is convenient to discuss the positive (α > 0) and negative (α < 0) part of the mutation spectrum (*i.e*., forward and backward steps) separately. Consider, first, the simple case that all mutations have the same effect α > 0 (*i.e*., there is only a single mutational step size) and that the initial gap δ_{i} = 0. The final gap δ_{f} is a random variable, which depends on the time at which the step occurs. However, the following analysis is framed directly in terms of δ_{f}, with time entering only implicitly. Furthermore, in abuse of notation, we use the same symbol δ_{f} for both the random variable and its realization. Then the distribution of the final gap can be derived as follows.

According to Equation 10, the selection coefficient for a given δ_{f} is *s*(α | δ_{f}) = ασ(2δ_{f} – α). The fixation probability of a new mutation can be approximated as 2*s*(α | δ_{f}) for δ_{f} > α/2 and 0 for δ_{f} < α/2 (Equation 6). Let *F*_{α}(δ_{f} | δ_{i} = 0) be the probability that the final gap is larger than δ_{f}; *i.e.*, 1 – *F*_{α}(δ_{f} | 0) is the cumulative distribution function of the final gap. Clearly, *F*_{α}(δ_{f} | 0) = 1 for δ_{f} ≤ α/2. For δ_{f} > α/2, the rate of new mutations that appear in the interval [δ_{f}, δ_{f} + ε] and are destined for fixation is . *F*_{α}(δ_{f} | 0) thus satisfies the differential equation(A1)which has the solution(A2)(*cf.* Kopp and Hermisson 2007). *F*_{α}(δ_{f} | 0) depends on the basic model parameters only through the combination γ = *v*/(σΘ). Unlike in the main text, however, the dependence on γ is omitted in the following for ease of notation.

Next, we again allow for adaptive steps of different size, but assume for the moment that the distribution of new mutations, *p*(α), is discrete. For δ_{i} = 0, the probability *F*_{+}(δ_{f} | 0) that no positive step of any size occurs while the gap is less than δ_{f} is(A3)

In the general case of a continuous distribution of new mutations, *p*(α) is a density, and the sum must be replaced by an integral,(A4)with(A5)Equation A4 is the complementary distribution function of δ_{f} for δ_{i} = 0. (Note that, for δ_{i} < 0, the argument of the function *g* may be negative. In this case, integration in (A5) goes from 0 to a negative value. This changes the sign of the integral, as we then use the convention that . Since the integrand in *g* is proportional to β, the two minuses cancel for β < 0, and *g* is always positive.)

To derive the distribution of δ_{f} for general values of the initial gap δ_{i}, we use the fact that the probability for no fixation can be decomposed according to *F*_{+}(δ_{1} + δ_{2} | 0) = *F*_{+}(δ_{1} | 0)*F*_{+}(δ_{1} + δ_{2} | δ_{1}). Here, the right-hand side is the probability that no fixation occurs while the phenotypic gap is between 0 and δ_{1}, times the probability that no fixation occurs while the gap is between δ_{1} and δ_{2}. These two events are independent (*i.e*., the probabilities are multiplicative), because the appearance of a mutation destined for fixation is a (inhomogeneous) Poisson process (see also Figure A1). In the above decomposition, δ_{1} can be negative, but δ_{2} should be positive. For an arbitrary initial gap δ_{i}, we thus obtain *F*_{+}(δ_{f} | δ_{i}) = 1 for δ_{f} < δ_{i} and(A6)

For negative δ_{i}, we also need to consider the possibility of backward steps, which have to occur while the final gap δ_{f} is still negative. We can then use the symmetry that the fixation probability of an allele with effect α at gap value δ_{f} is the same as for an allele with effect –α at gap –δ_{f}. If also the distribution of new mutations is symmetric, *p*(–α) = *p*(α), we can express the distribution function for backward steps by the distribution function for forward steps as *F*_{–}(δ_{f} | δ_{i}) = 1 for δ_{f} < δ_{i} and(A7)(illustrated in Figure A1, B and C). Combining Equations A6 and A7, the probability that no step of any size and direction occurs before the gap has reached δ_{f} is given by(A8)For δ_{f} ≥ δ_{i}, this yields the compact formula given in Equation 8. The three principal cases (δ_{i}, δ_{f} ≥ 0, δ_{i}, δ_{f} < 0, and δ_{i} < 0, δ_{f} ≥ 0) are illustrated in Figure A1.

#### Specific distributions of new mutations:

For specific distributions of new mutations, some of the integrals in the above equations can be evaluated explicitly (*e.g*., by using Mathematica). In particular, we can often obtain closed solutions for the function *g*(δ) (Equation A5).

##### Reflected exponential distribution:

For the reflected exponential distribution (4) used throughout most of this article, Equation A5 evaluates to(A9)where .

##### Gaussian distribution:

For a Gaussian distribution of new mutations (with mean 0 and standard deviation ω = 1), we obtain(A10)where Erf(.) is the error function.

##### Quartic Gaussian distribution:

For the quartic Gaussian distributionemployed in Figure B1, *g*(δ) evaluates to(A11)

##### Reflected Gamma distribution:

A reflected Gamma distribution has two free parameters. If we rescale the standard deviation to ω = 1, there is still one parameter left,(A12)and Equation A5 evaluates to(A13)with and *q* = 2 + 3η + η^{2}.

##### Uniform distribution:

Finally, for a uniform distribution of new mutations, which can be viewed as a limiting case if environmental limitation is strong (see main text), additional derivations are possible. Let *p*_{u}(α) = *p*_{o}. As a probability distribution, *p*_{u} must be normalized and should therefore be defined on a finite range. However, we assume that this range is large enough (relative to the typical phenotypic gap δ_{f}) to cover all mutations with a chance at fixation (*i.e*., with a positive selection coefficient). Adaptation is then always environmentally limited, and the boundaries of *p*_{u} can be ignored. Keeping *p*_{o} as a free parameter, Equation A5 becomes(A14)

The conditional distribution of step sizes given the final gap δ_{f} (Equation 11) evaluates to(A15)for α between 0 and 2δ_{f}. Since ϕ(α | δ_{f}) is symmetric around the δ_{f}, the mean step size equals the mean gap value before the step, . The density of δ_{f} becomes(A16)Using this relation, we can derive the mean gap and the mean and the distribution of the step size α for any given initial gap δ_{i}. In particular, for δ_{i} = 0,(A17)where Γ(.) is the gamma function, and(A18)where Erfc is the complementary error function and Γ(·, ·) is the (upper) incomplete gamma function. For an arbitrary mutation distribution *p*(α), we can use Equation A18 with the choice as an approximation for in the case of small values of γ (see Equation 21).

## APPENDIX B: TRANSITION BETWEEN ENVIRONMENTALLY AND MUTATION-LIMITED ADAPTATION

In the main text, we have described the transition between environmentally and mutation-limited adaptation as a transition of the distribution of adaptive substitutions, ϕ(α), between the two limiting distributions (Equation 21) and ϕ_{γ→∞} (Equation 24). We have also noted that the substitution rate of small mutations is limited by the static sieve Ψ_{s} (*i.e*., the fixation probability), whereas the substitution rate of large mutations is limited by either the dynamic sieve Ψ_{d} (in the environmentally limited regime) or the distribution of new mutations, *p*(α) (in the mutation-limited regime). In Figure B1, we further illustrate this transition and provide additional details.

The shading in Figure B1 indicates which of the above factors is limiting the substitution rate of mutations with a given phenotypic effect α. To demarcate the boundaries, we used the following (heuristic) criteria: The increasing part of ϕ(α) is always attributed to the static sieve, since Ψ_{s} is the only factor of Equation 20 that increases with α. The decreasing part is attributed to either Ψ_{d} or *p*(α). Which of these factors is more important can be decided by comparing their effect on the local slope of ϕ(α). Ψ_{d} is limiting ifOtherwise, *p*(α) is limiting.

For an exponential distribution of new mutations, the substitution rate of alleles with intermediate effect tends to be limited by *p*(α), whereas sufficiently large alleles are always limited by Ψ_{d}. For increasing γ, the boundary between the two regions is shifted to the right, until virtually all mutations are limited by *p*(α) (Figure B1, A–C). (Since the tail of the exponential distribution extends to mutations of arbitrary size, there is formally always a part dominated by Ψ_{d}, but this is of no practical relevance.)

In general, however, the observed pattern depends on the shape of *p*(α). For any distribution *p*(α) with existing first to third moments, the dynamic sieve has the limiting behavior(B1)which corresponds to the tail behavior of a Gaussian distribution. Therefore, the slope of ϕ_{γ} for sufficiently large α will be dominated by Ψ_{d} if the distribution of new mutations is leptokurtic (*i.e.*, has a heavier tail than a Gaussian), as is the case for the exponential distribution. In contrast, for a platykurtic distribution of new mutations, the right-hand tail of ϕ_{γ} is primarily shaped by *p*(α), and the dynamic sieve is most important for intermediate α. Therefore, in this case, the boundary between the domains of the two limiting factors moves from right to left as γ increases (Figure B1, D–F). In both cases, however, the entire decreasing part of ϕ(α) is (practically) dominated by Ψ_{s} if γ is sufficiently small (environmentally limited regime) and by *p*(α) if γ is sufficiently large (mutation-limited regime). (It should be noted, however, that the shading in Figure B1 does not mean that the nonhighlighted factor is irrelevant. For example, even though the dynamic sieve does not seem to play a role in Figure B1C, the distribution ϕ(α) is still very different from the limiting distribution ϕ_{∞} (Figure 4C). Indeed, Figure 4F suggests that for γ = 1 the adaptive process is right in the transition region between the two limiting regimes.)

## APPENDIX C: THE DISTRIBUTION OF FITNESS EFFECTS IN THE ADAPTIVE WALK APPROXIMATION

In this appendix, we investigate the distribution of fitness effects of adaptive substitutions, χ_{γ}(*s*). Unlike the distribution of phenotypic effects, this distribution is unimodal and strictly positive. Figure C1 shows how it depends on γ. The following points are noteworthy.

In the mutation-limited regime (*i.e*., for moderate and large γ), the equilibrium distribution of fitness effects, (*s*), is not well predicted by the distribution of the first step, (*s*) (Equation 18). The reason is that, unlike the distribution of phenotypic effects, the distribution of fitness effects depends strongly on the (final) phenotypic gap δ_{f}, which tends to increase after the first step. For the same reason, χ_{γ}(*s*) does not converge to a limiting distribution as γ → ∞, but increases without bounds. Note, however, that for large finite γ (or large *z*_{max} in the case of a sudden environmental change), the distribution of fitness effects is always proportional to the distribution of phenotypic effects ϕ_{∞}(α). This is because for , the relationship between α and *s* is effectively linear.

In the environmentally limited regime (small γ), the distribution of fitness effects converges to a limiting distribution , which can be derived by replacing *p*(α) with a uniform distribution of the same mutational density at α = 0, in analogy to the derivation of the limiting distribution for phenotypic effects. However, behaves differently from in at least two ways. First, the mean of is *smaller*, not larger, than the mean of χ_{γ}(*s*). This is because, with a uniform distribution of new mutations, many substitutions overshoot the optimum, and these have high α but low *s*. Second, the convergence toward the limiting distribution is slower for fitness effects than for phenotypic effects. For example, for γ ≤ 0.1, the fitness distributions χ_{γ}(*s*) and are markedly different (Figure C1, A and B), whereas the corresponding phenotypic distributions ϕ_{γ}(α) and are surprisingly similar (Figure 4, A and B), given that *p*(α) declines significantly (*i.e*., is clearly nonuniform) over the range of ϕ_{γ}(α) (see Figure 2B1). How is it possible that mutations with the same phenotype distribution have different fitness distributions in the two cases? The answer is that, with a uniform distribution of new mutations, the overall density of new mutations is higher and, hence, the time to the next adaptive step is shorter than with an exponential distribution. This results in a decreased phenotypic gap, which compensates for the increased probability of overshooting the optimum, leading to a cancellation of opposite effects with respect to phenotypes but not fitnesses.

## Acknowledgments

Two anonymous reviewers provided helpful comments on a previous version of the manuscript. The authors are members of the Mathematics and BioSciences Group at the University of Vienna, which is funded by a grant from the Vienna Science and Technology Fund to J.H.

## Footnotes

Communicating editor: M. W. Feldman

- Received June 12, 2009.
- Accepted October 2, 2009.

- Copyright © 2009 by the Genetics Society of America