## Abstract

We model selection at a locus affecting a quantitative trait (QTL) in the presence of genetic variance due to other loci. The dynamics at the QTL are related to the initial genotypic value and to the background genetic variance of the trait, assuming that background genetic values are normally distributed, under three different forms of selection on the trait. Approximate dynamics are derived under the assumption of small mutation effect. For similar strengths of selection on the trait (*i.e*., gradient of directional selection β) the way background variation affects the dynamics at the QTL critically depends on the shape of the fitness function. It generally causes the strength of selection on the QTL to decrease with time. The resulting neutral heterozygosity pattern resembles that of a selective sweep with a constant selection coefficient corresponding to the early conditions. The signature of selection may also be blurred by mutation and recombination in the later part of the sweep. We also study the race between the QTL and its genetic background toward a new optimum and find the conditions for a complete sweep. Overall, our results suggest that phenotypic traits exhibiting clear-cut molecular signatures of selection may represent a biased subset of all adaptive traits.

The recent improvements in methods to detect positive selection from its molecular signature on neutral polymorphism (Nielsen 2005) and the great amount of information thus generated provide an unprecedented opportunity for evolutionary biologists to improve their understanding of the genetics of adaptation and of the recent adaptive history of species, in particular in humans (Nielsen *et al*. 2007).

On the one hand, genome scans search for recent (Glinka *et al*. 2003; Akey *et al*. 2004; Nielsen *et al*. 2005; Williamson *et al*. 2007) or ongoing (Voight *et al*. 2006; Tang *et al*. 2007) selection by genotyping many markers distributed throughout the genome and using the properties of the polymorphism pattern (heterozygosity, frequency spectrum) to reject neutrality either through a parametric model-based approach or by just picking outliers in the distribution (for the caveats of the latter method, see Teshima *et al*. 2006). This kind of study typically reveals many positive results. The next step is then to search databases for the functions of the identified genes (Voight *et al*. 2006; Tang *et al*. 2007), to go from the genotype to the phenotype, *i.e*., to answer the question “Which phenotypic traits were recently adaptive in the lineage leading to the species or population under study?” Yet, the relationship between the strength of selection on a gene and the selection on the trait is most often not explicitly defined in those studies. Clearly, a gene that was under strong selection must have affected a trait that was itself strongly selected. Nevertheless, not all traits that were recently under strong directional selection necessarily show strong signatures of selection at the gene level. If there were some systematic biases, it would be useful to identify and quantify them, to be able to interpret molecular signatures of selection in terms of phenotypic selection.

On the other hand, hitchhiking mapping methods focus on smaller candidate regions (Schlotterer 2003) that were previously identified either by functional genetics or by genome scans (Thornton *et al*. 2007). The aim here is to confirm recent positive selection in the region, as well as to localize the target of selection more precisely (Kim and Stephan 2002). The selection coefficient may also be estimated on the basis of the pattern of either the frequency spectrum (Kim and Stephan 2002) or the heterozygosity (Wang *et al*. 1999; Schlenke and Begun 2004; Olsen *et al*. 2006) in the genomic region. The selection coefficient thus estimated is a summary of the dynamics of the allele under selection, but it has no clear biological meaning; it measures only the propensity of the allele to grow in frequency. Yet in biological reality a mutation affects a phenotypic trait, and the trait is subject to selection. To better understand the genetics of adaptation, we would need to be able to interpret the selection coefficient estimated at the gene level in terms of parameters of selection at the trait level, especially in the context of complex quantitative traits (Falconer and Mackay 1996; Lynch and Walsh 1998).

There are two main views of how adaptation occurs as a genetic mechanism. The first one, periodic selection (Atwood *et al*. 1951), considers that adaptation proceeds by the successive fixations of beneficial mutations that sweep through the population one after the other. This view is supported by examples of experimental evolution with microbes (Elena *et al*. 1996; Elena and Lenski 2003). In microbes, the large population sizes make purifying selection very efficient, which strongly limits the amount of slightly deleterious polymorphisms [the so-called nearly neutral mutations (Ohta 1992)] that can be maintained by drift. As a consequence, phenotypic traits have little genetic variance in such organisms, except the variance generated anew by mutation. Moreover, asexuality induces clonal interference, which is expected to prevent simultaneous segregation at several loci (Gerrish and Lenski 1998). Note, however, that the process of clonal interference is partly challenged by recent theories of “travelling waves” that take into account the possibility that several beneficial mutations occur in the same lineage (Desai and Fisher 2007; Desai *et al*. 2007), so periodic selection may not apply to asexual populations under a high beneficial mutation rate.

The second view of adaptation is that of quantitative genetics, in which there is simultaneous selection at many loci that contribute to adaptive traits (polygenic selection). Such a vision is based on studies of the response to selection (Falconer and Mackay 1996) and of the maintenance of genetic variation (Barton and Keightley 2002). It is more relevant when linkage is loose, such that selective interferences between loci are low—allowing for simultaneous sweeps at many loci—and when selection is not very efficient, so that phenotypic traits can accumulate substantial genetic variation. Therefore it is intended for sexual organisms of reasonably low population sizes. At its most extreme, this view leads to the infinitesimal model designed by Fisher (1918), in which many loci of small effect contribute to a trait, such that the genetic values are normally distributed, an approach that has led to many fascinating developments in evolutionary quantitative genetics (Walsh 2007). In practice, the periodic selection and quantitative genetics views are probably two extremes of a continuum, and their main interest is that they provide fairly good approximations of more complex real genetic systems under certain conditions.

It is striking that the theory behind signatures of positive selection, namely genetic hitchhiking (Maynard Smith and Haigh 1974), is clearly one of periodic selection, whereas such signatures are searched for in sexual species (since hitchhiking mapping is dependent on the recombination rate) that often exhibit substantial genetic variation for many traits. For such species (essentially plants or animals), the quantitative genetic approach is widely used and recognized as efficient to predict the response to selection, at least in the short term, for cultivated as well as natural populations (Falconer and Mackay 1996).

Here, our aim is to extend the theory of hitchhiking to the context of a locus affecting a quantitative trait that also harbors background genetic variation due to other loci. Our hope is to help draw clearer conclusions regarding selection on adaptive traits based on molecular signatures of selection. For instance, is it just the strength of phenotypic selection that matters or are there other parameters that affect the outcome of a selective sweep? If so, then what types of traits are more likely to exhibit molecular signatures of selection? When the selection coefficient of a gene can be estimated, what can be inferred about the strength of selection on the trait affected by that gene?

First, we recall the classic theory of hitchhiking and the way that selection affects the neutral polymorphism in such a context. Then, we use a model of selection on a quantitative trait controlled by one focal locus and a distribution of background genetic values—contributed to by many other loci—to calculate the trajectory of a beneficial mutation in such a case. We show that the strength of selection on a quantitative trait locus and its signature on neutral polymorphism can be strongly affected by the background genetics of the trait and that the outcome critically depends on the shape of the fitness function.

## MODEL

#### The classical hitchhiking model:

We first summarize the classical model of genetic hitchhiking as initially developed by Maynard Smith and Haigh (1974) and further studied by Stephan *et al*. (1992) and Barton (1998). Consider a diploid, randomly mating population and two biallelic loci *A* and *B* with recombination rate *r* between them. The first locus, with alleles *A*_{1} and *A*_{2} in respective frequencies *p* and *q* = 1 − *p*, is under additive positive selection, such that the relative fitnesses of the genotypes *A*_{2}*A*_{2}, *A*_{2}*A*_{1}, and *A*_{1}*A*_{1} are 1, 1 + *s*, and 1 + 2*s*, respectively. The other locus is neutral, with two alleles *B*_{1} and *B*_{2} in frequencies *u* and 1 − *u*, respectively. In all the following, the subscript 0 denotes initial conditions.

At the selected locus, the change in the frequency *p* of allele *A*_{1} in one generation is given by Equation 1. Since this expression depends on the mean fitness of the population, it is more convenient to write the recursion using the ratio ρ = *p*/*q* of allelic frequencies, which yields Equation 2, where the prime denotes the value in the next generation, and, for small *s*, Equation 3.
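For the additive fitness scheme stated above (1, 1 + *s*, 1 + 2*s*), the recursions can be worked out explicitly. The following is a sketch derived only from those fitnesses, not necessarily in the form of the article's numbered equations; the symbols w_1, w_2 (marginal fitnesses of alleles *A*_{1} and *A*_{2}) and \bar{w} (mean fitness) are notation assumed here.

```latex
\begin{align}
w_1 &= p(1+2s) + q(1+s) = 1 + s + sp, \qquad
w_2 = p(1+s) + q = 1 + sp, \\
\bar{w} &= p\,w_1 + q\,w_2 = 1 + 2sp, \\
\Delta p &= \frac{p\,(w_1 - \bar{w})}{\bar{w}} = \frac{s\,p\,q}{\bar{w}}, \\
\rho' &= \rho\,\frac{w_1}{w_2} = \rho\left(1 + \frac{s}{1+sp}\right)
\approx (1+s)\,\rho \quad (s \ll 1), \\
\rho_t &\approx (1+s)^{t}\,\rho_0 \approx e^{st}\,\rho_0 .
\end{align}
```

The mean fitness cancels in the ratio w_1/w_2, which is why the recursion on ρ is simpler than the one on *p*.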

Hence, the selection model used in hitchhiking theory (as well as in many standard population genetic models) considers an exponential increase of the ratio of allelic frequencies at the selected locus. This is also equivalent to a logistic growth of the allele frequency *p*.
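The equivalence between exponential growth of ρ and logistic growth of *p* can be checked numerically. The following is a minimal sketch, assuming the additive fitnesses 1, 1 + *s*, 1 + 2*s* given above; the function names and the parameter values are illustrative choices, not taken from the article.

```python
import math

def next_p(p, s):
    """One generation of additive selection with fitnesses 1, 1+s, 1+2s."""
    w1 = 1 + s + s * p        # marginal fitness of allele A1
    w_bar = 1 + 2 * s * p     # mean fitness of the population
    return p * w1 / w_bar

def trajectory(p0, s, generations):
    """Exact deterministic frequencies p_0, p_1, ..., p_generations."""
    ps = [p0]
    for _ in range(generations):
        ps.append(next_p(ps[-1], s))
    return ps

s, p0, T = 0.01, 0.001, 1000          # illustrative values
ps = trajectory(p0, s, T)

rho_exact = ps[-1] / (1 - ps[-1])               # exact ratio after T generations
rho_approx = (p0 / (1 - p0)) * (1 + s) ** T     # exponential growth of the ratio
p_logistic = rho_approx / (1 + rho_approx)      # same thing, read as a logistic in p

print(f"exact p = {ps[-1]:.4f}, logistic approximation = {p_logistic:.4f}")
```

The exact ratio grows slightly more slowly than (1 + *s*)^{t}, because the exact per-generation factor is 1 + *s*/(1 + *sp*); the gap is of second order in *s*.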

It can be shown easily that the change in the frequency of allele *B*_{1} at the neutral locus is given by Equation 4, where *D* is the linkage disequilibrium between loci *A* and *B*, defined as *f*(*A*_{1}*B*_{1}) − *pu* [where *f*(*A*_{1}*B*_{1}) is the frequency of the *A*_{1}*B*_{1} haplotype]. The quantity *D*/*pq* is a measure of the statistical association between the loci, from the perspective of the selected locus. Using indicator variables to denote the presence of allele *A*_{1} at locus *A* and of allele *B*_{1} at locus *B*, *D* is homologous to the covariance between loci *A* and *B*, and *D*/*pq* is homologous to the coefficient of regression (covariance divided by the variance of one of the variables) of locus *B* on locus *A*. It can be shown (see, *e.g*., Barton 2000) that *D*/*pq* is also equal to the difference between the frequencies of allele *B*_{1} in the selected (*A*_{1}) and in the unselected (*A*_{2}) genetic backgrounds at locus *A* and that it changes only because of recombination, regardless of the strength of selection, as expressed by Equation 5.
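The claim that *D*/*pq* decays through recombination alone can be illustrated with a two-locus haplotype recursion. The sketch below assumes the standard discrete-generation recursion (selection on diploids, then recombination in double heterozygotes); the function names and parameter values are hypothetical. In this discrete diploid version the per-generation factor matches 1 − *r* only up to a term of order *rs*^{2}, which the comparison at the end makes visible.

```python
def step(x11, x12, x21, x22, s, r):
    """One generation for haplotypes A1B1, A1B2, A2B1, A2B2.

    Locus A has additive fitnesses 1, 1+s, 1+2s; locus B is neutral.
    """
    p = x11 + x12
    w1 = 1 + s + s * p         # marginal fitness of A1-bearing haplotypes
    w2 = 1 + s * p             # marginal fitness of A2-bearing haplotypes
    w_bar = 1 + 2 * s * p      # mean fitness
    D = x11 * x22 - x12 * x21  # linkage disequilibrium
    rec = r * (1 + s) * D      # recombinants come from A1/A2 heterozygotes (fitness 1+s)
    return ((x11 * w1 - rec) / w_bar,
            (x12 * w1 + rec) / w_bar,
            (x21 * w2 + rec) / w_bar,
            (x22 * w2 - rec) / w_bar)

def association(x11, x12, x21, x22):
    """D/(pq): frequency of B1 among A1 haplotypes minus that among A2 haplotypes."""
    p = x11 + x12
    return x11 / p - x21 / (1 - p)

s, r, T = 0.05, 0.01, 200          # illustrative values
p0, u_bg = 0.001, 0.3              # A1 arises on a B1 haplotype; B1 frequency 0.3 elsewhere
state = (p0, 0.0, u_bg * (1 - p0), (1 - u_bg) * (1 - p0))

a0 = association(*state)
for _ in range(T):
    state = step(*state, s, r)
aT = association(*state)
predicted = a0 * (1 - r) ** T      # decay by recombination only

print(f"D/pq after {T} generations: {aT:.5f} (recombination-only prediction {predicted:.5f})")
```

Setting `s = 0` in the same run gives the prediction exactly, which is one way to see that selection contributes almost nothing to the decay of the association.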

Assuming *s* ≪ 1, the changes in frequencies are well approximated by continuous-time processes, and the total change in frequency at the neutral locus is given by Equation 6, which is equivalent to Equation 1 (second line) in Barton (2000). The function *t*(*p*) is the inverse dynamics of the mutation *A*_{1}, *i.e*., the time needed for this mutation to reach a given frequency *p*, starting from a threshold frequency ε defined below. Note that the argument leading to Equation 6 is purely deterministic, in that it does not take into account changes in frequencies resulting from genetic drift. It is a well-known result of the hitchhiking literature that the trajectory of a beneficial mutation in a finite population is very well approximated by its deterministic expectation in an infinite population, provided the frequency *p* is sufficiently distant from the absorbing edges 0 and 1; *i.e*., *p* ∈ [ε, 1 − ε] (see, *e.g*., Stephan *et al*. 1992 or Barton 1998). However, at very low (*p* < ε) or very high (*p* > 1 − ε) frequencies, the trajectory of a beneficial mutation is mainly controlled by genetic drift. The threshold frequency ε is thus defined as the frequency above which the deterministic effect of selection overwhelms the stochastic effect of drift on the trajectory of the beneficial mutation in time. Using ε instead of 1/(2*N*_{e}) as the starting frequency of the mutation thus allows us to partially account for stochasticity in a deterministic framework: it acts as a “filter” to focus only on the mutations that can indeed reach high frequencies. In practice, simulation results show that a value of ε = 1/(4*N*_{e}*s*) performs well (Kim and Stephan 2002). Note also that conditional on final fixation, a beneficial mutation reaches ε rapidly, so it is a good approximation to assume that no recombination occurred in this time lapse (Barton 1998).

The initial association depends on the starting conditions. When the mutation *A*_{1} appears in a single copy, it can be associated either with *B*_{1} (with probability *u*_{0}), leading to an initial association *D*_{0}/*p*_{0}*q*_{0} ≈ 1 − *u*_{0}, or with *B*_{2} (with probability 1 − *u*_{0}), leading to *D*_{0}/*p*_{0}*q*_{0} ≈ −*u*_{0}. Combining these two events, the expected total reduction of heterozygosity at the neutral locus is, from Equation 6, given by Equation 7, where *E* denotes the expectation over all possible starting conditions. This equation can be understood from the standpoint of the set of haplotypes carrying the beneficial allele *A*_{1}. At the beginning, *A*_{1} starts in one copy, and the haplotype that carries it harbors no genetic diversity at other loci. During the selective sweep, diversity is introduced on these haplotypes through recombination, and the total amount of regained diversity depends on the trajectory in time of the selective sweep, that is, on the dynamics of *A*_{1}. Note that formulated this way, Equation 7 is quite general and applies whenever the dynamics of the beneficial mutation are sufficiently slow that they can be approximated by a continuous process. Equation 7 indicates that the full trajectory of a beneficial mutation is sufficient and necessary information for predicting its hitchhiking effect. In the simple case studied by Maynard Smith and Haigh (1974), the reverse dynamics is, from Equation 3, given by Equation 8, which yields Equation 9 for the reduction of heterozygosity, where *B* is Euler's incomplete beta function. Assuming *r* ≪ *s*, *s* ≪ 1, and ε ≪ 1, the reduction of heterozygosity can be approximated by Equation 10, which corresponds to the simplest approximation proposed in Stephan *et al*. (1992).
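The reverse dynamics under a constant selection coefficient can be sketched directly from the exponential growth of ρ. The derivation below assumes the continuous-time form ρ(*t*) = ρ(0)*e*^{st} with the sweep started at frequency ε; it is a worked illustration and need not match the article's Equation 8 term for term.

```latex
\begin{align}
\rho(t) &= \rho(0)\,e^{st}, \qquad \rho(0) = \frac{\varepsilon}{1-\varepsilon}, \\
t(p) &= \frac{1}{s}\,\ln\!\left[\frac{p\,(1-\varepsilon)}{\varepsilon\,(1-p)}\right], \\
t(1-\varepsilon) &= \frac{2}{s}\,\ln\!\left(\frac{1-\varepsilon}{\varepsilon}\right)
\approx \frac{2}{s}\,\ln(4N_e s)
\quad \text{for } \varepsilon = \frac{1}{4N_e s} \ll 1 .
\end{align}
```

The last line gives the deterministic duration of the sweep between the two thresholds ε and 1 − ε; it depends on population size only logarithmically.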

In this article, contrary to the usual approach for selective sweeps (Maynard Smith and Haigh 1974; Stephan *et al*. 1992), we do not define a constant selection coefficient *a priori* for *A*_{1}. Instead, we focus on a mutation affecting a quantitative trait and define selective pressure at the level of the trait only. Our aim is to describe the dynamics of the beneficial mutation in the presence of background genetic variation in a form similar to Equations 2 and 3. We thus wish to identify the factors affecting the dynamics of the mutation and to assess the range of parameter values for which our model converges to the classical hitchhiking model. We also wish to characterize the molecular signature left on neutral polymorphism by selection at a quantitative trait locus, to understand what information is provided by genome scans for traces of selection or hitchhiking mapping. We mainly use a deterministic argument as in Maynard Smith and Haigh (1974), since it allows us to characterize major trends analytically, but our results may be good approximations for stochastic populations when the frequency of the beneficial mutation is neither very low nor very high. We also discuss the qualitative and quantitative changes that may be introduced in a stochastic framework.

#### Lande's model—simultaneous selection on a focal gene and on background variation:

We use the model proposed by Lande (1983) to describe the effect of selection on a focal gene affecting a quantitative trait and on background genetic variation for this trait contributed by other loci. Our aim is to predict the complete trajectory of a beneficial mutation starting in one copy at the focal locus (*i.e*., of a hard selective sweep) as a function of the genetic and selective parameters of the trait. For the focal locus we use the same notations as in the previous section, with the exception that allele *A*_{1} now has an additive effect *a* on the trait, and no selection coefficient is defined *a priori*. Lande's (1983) model assumes normally distributed genetic values in the background, with mean *m* and variance σ^{2}, which is standard practice in quantitative genetics (Falconer and Mackay 1996). It also assumes a large population size (that is, the model is deterministic), no linkage, and no interaction of the focal locus with the genetic background, such that the distribution of genetic values within each genotypic class at locus *A* is also normal, with the same variance σ^{2} as in the entire population, and the mean shifted by 0, *a*, or 2*a* for genotypes *A*_{2}*A*_{2}, *A*_{1}*A*_{2}, or *A*_{1}*A*_{1}, respectively. This latter assumption is a good approximation as long as the number of individuals in each genotypic class is not too small (>50, see discussion). The absolute fitness of a phenotype with value *z* is *W*(*z*), and we neglect environmental variance for the sake of clarity, so the phenotypic and genotypic values are identical.

Lande (1983) showed that in this context, since the distribution of genetic values is a linear combination of identical normal distributions, the change in the mean genetic background value *m* of the trait follows the same equation as in the case of a single normal distribution. Namely, the change in *m* in one generation is given by Equation 11, where β is the gradient of directional selection on the trait (Lande 1976), which measures the slope of the log mean fitness landscape at the position of the population. Under random mating, the mean fitness of the population is simply given by Equation 12, which combines the mean fitnesses, across all genetic backgrounds, of individuals with each genotype *A*_{i}*A*_{j} at locus *A*. The change in frequency of *A*_{1} at the focal locus can be described using Wright's (1969) equation (Equation 13).

Note that Equation 1 of the classical model described in the previous section is a particular case of Equation 13. Lande's (1983) model allows describing the changes in frequencies at a gene in a polygenic context in a simple way (without the use of complex multigenic recursions), by combining results from quantitative and population genetics. Note that as long as the normal approximation holds, any locus in the background could be the focal locus. In our context, however, the focal locus is chosen to be one where a new mutation has been introduced in one copy very recently.

In this article, we want to study the dynamics at the quantitative trait locus *A* under contrasted forms of selection on the trait. Indeed, there is no *a priori* reason to focus on one specific type of function. Studies aiming at estimating the shapes of fitness functions showed that those can vary substantially between traits (Schluter 1988). Moreover, simply assessing the relative prevalence of directional *vs*. stabilizing or disruptive selection in the wild is often difficult because of statistical issues (Kingsolver *et al*. 2001). We thus chose to study three simple cases that could induce adaptation and to use them to compare the dynamics of selective sweeps under markedly distinct selective pressures on the trait. We used a linear directional fitness function (*W*_{l}), an exponential fitness function (*W*_{e}), and a Gaussian stabilizing selection (*W*_{g}). In all cases, a parameter ω > 0 quantifies the strength of selection, and the three functions are given by Equation 14. In the Gaussian fitness function, *z* is taken to be the distance to the genetic optimum, without loss of generality. Note that ω measures the width of the bell for the Gaussian fitness function, so the intensity of selection increases with 1/ω instead of ω for this fitness function (contrary to the other two). This notation was used for the sake of homogeneity with standard quantitative genetic literature. These three examples of fitness functions are not meant to be completely realistic, but rather to encompass three clearly different shapes, which together can well approximate many real fitness functions in the vicinity of the current state of the population (see discussion).
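How the shape of the fitness function sets the gradient of directional selection can be explored numerically. The sketch below is built on assumed functional forms, chosen to match the verbal description (linear, exponential, Gaussian with width ω) and not confirmed against the article's Equation 14: *W*_{l}(*z*) = 1 + ω*z*, *W*_{e}(*z*) = exp(ω*z*), and *W*_{g}(*z*) = exp(−*z*^{2}/2ω^{2}). It averages each function over a normal background and takes the slope of the log mean fitness; all names and values are illustrative.

```python
import math

def w_linear(z, omega):        # ASSUMED form: W_l(z) = 1 + omega * z
    return 1 + omega * z

def w_exponential(z, omega):   # ASSUMED form: W_e(z) = exp(omega * z)
    return math.exp(omega * z)

def w_gaussian(z, omega):      # ASSUMED form: W_g(z) = exp(-z^2 / (2 omega^2))
    return math.exp(-z * z / (2 * omega * omega))

def mean_fitness(w, m, sigma, omega, n=4001, span=8.0):
    """Average w(z) over z ~ Normal(m, sigma^2) with a trapezoid rule."""
    if sigma == 0:
        return w(m, omega)
    lo = m - span * sigma
    h = 2 * span * sigma / (n - 1)
    total = 0.0
    for i in range(n):
        z = lo + i * h
        dens = math.exp(-0.5 * ((z - m) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * w(z, omega) * dens
    return total * h

def gradient(w, m, sigma, omega, dm=1e-4):
    """beta = d ln(mean fitness) / d m, by central finite difference."""
    up = mean_fitness(w, m + dm, sigma, omega)
    down = mean_fitness(w, m - dm, sigma, omega)
    return (math.log(up) - math.log(down)) / (2 * dm)

m, sigma = 10.0, 1.0           # illustrative background mean and standard deviation
print("linear     :", gradient(w_linear, m, sigma, 0.1))       # omega / (1 + omega m)
print("exponential:", gradient(w_exponential, m, sigma, 0.05)) # omega, whatever m is
print("gaussian   :", gradient(w_gaussian, m, sigma, 5.0))     # -m / (omega^2 + sigma^2)
```

Under these assumed forms, only the Gaussian gradient depends on the background variance, and only the exponential gradient is independent of the current mean.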

#### A measure of selection on a gene based on its dynamics:

The quantity used to describe the dynamics of the mutation is the growth rate ς of the ratio of allelic frequencies ρ = *p*/*q*; that is, ς = ρ′/ρ − 1. This approach stems from Fisher (1930), who underlined the interest of quantifying selection by measuring changes in the ratio of allelic frequencies rather than in the frequencies themselves, using an argument of geometric increase. Note that in the limit where selection can be approximated by a continuous process, ς is the rate of change of log(ρ) in time, as in Fisher (1930, p. 34). The term ς does not change with the mean fitness of the population, so it is often used in experimental evolution to measure selection on genes on the basis of allele frequency change (Lenski *et al*. 1991; Perfeito *et al*. 2007). In the simple case presented in the first section, it also equals the selection coefficient obtained by identifying Equations 13 and 1, but this is not always so. In the general case, ς captures the influence of parameters on the dynamics of the beneficial mutation more accurately than that coefficient, as we will see later. We calculated the change in frequency Δ*p* using Equation 13 and then calculated ς as in Equation 15. The full expression of ς depends on the details of the genetic architecture of the trait as well as on the type of selection acting on this trait. We then used ς to derive the complete dynamics of beneficial mutations and to calculate the expected signature of selection on neutral polymorphism. We focused on selection on new mutations (*i.e*., on “hard” selective sweeps) and not on “soft” sweeps where a beneficial mutation that was already segregating in the population starts being selected after a change in the environment (Innan and Kim 2004; Hermisson and Pennings 2005; Przeworski *et al*. 2005). Soft sweeps are expected to leave weaker and more complex signatures on neutral polymorphism.

## RESULTS

In the following, we derive the equations for the dynamics of a mutation affecting a quantitative trait under distinct types of selection, focusing on hard selective sweeps, *i.e*., on beneficial mutations that start in one copy and eventually reach fixation. Subscripts l, e, and g are used throughout to refer to results obtained under linear, exponential, and Gaussian fitness functions, respectively. We also use an asterisk to denote approximated results under the assumption that the effect of the mutation is small relative to the mean value of the trait; that is, that |*a*| ≪ |*m*|.

#### Dynamics without background variation:

We first describe the dynamics of the mutation in the absence of background genetic variation (σ^{2} = 0). The general derivations are presented in the appendix. The growth rate of the focal mutation in the absence of background genetic variation is approximately given by Equation 16. It can already be seen from Equation 16 that the way selection on a trait translates into selection on a gene affecting this trait crucially depends on the shape of the fitness function, as already discussed in Kimura and Crow (1978). Under an exponential fitness function, the selective pressure on the mutation does not depend on the mean background genetic value *m*, as emphasized in Lande (1983). In contrast, under linear and Gaussian fitness functions, selection on the mutation depends not only on its genetic effect *a* and on parameters of the selection function (*b*, ω), but also on the present mean genetic state of the population (*m*). Note that the influence of parameters on the dynamics of the quantitative trait locus is captured accurately by the term ς that we use here, but not by other definitions of the selection coefficient. For instance, using the coefficient obtained by identifying Equations 13 and 1 instead of ς, one could think that the mean genetic value *m* influences selection at locus *A* even in the case of an exponential fitness function, whereas recursions show that this is actually not the case (not shown).

Our focus here is on hard selective sweeps, *i.e*., on new beneficial mutations that reach fixation. Under linear and exponential fitness functions, all mutations with *a* > 0 can sweep to fixation. Under the Gaussian fitness function, adaptation obviously takes place only if the population is not at the optimum, that is, if *m* ≠ 0. This can occur after a recent change in the environment, which rapidly shifted the optimal genetic value of the trait. If so, a mutation can reach fixation if it allows the population to get closer to the optimum, that is, if |*m* + 2*a*| < |*m*|. This is equivalent to stating that (*m* + 2*a*)^{2} < *m*^{2}, which leads to the condition *a*(*a* + *m*) < 0. Note that if the mutation effect is small relative to the mean genetic value, the only condition is that *a* and *m* have opposite signs, as can be seen directly from Equation 16.

In the absence of background genetic variation, and assuming |*a*| ≪ |*m*|, the approximate growth rate of ρ does not change in time. We can then write the dynamics of ρ in a form similar to Equation 3 (Equation 17). In this context, the constant growth rate is equal to the *s* defined in the classical hitchhiking model and can be estimated by hitchhiking mapping. Moreover, using the definition of the directional selection gradient in Equation 11 and using the same formalism as previously, we note that the relation in Equation 18 holds for all three fitness functions. This means that, for a given strength of selection (*i.e*., gradient of directional selection) on an adaptive trait and in the absence of background genetic variance, the strength of selection on a weak-effect mutation affecting this trait is proportional to its genetic value for the trait.
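The proportionality between selection on a weak-effect mutation and its effect on the trait can be checked by comparing the exact growth rate of ρ with the product *a*β. The sketch below sets σ^{2} = 0 and reuses assumed fitness forms (*W*_{l} = 1 + ω*z*, *W*_{e} = exp(ω*z*), *W*_{g} = exp(−*z*^{2}/2ω^{2})), which are illustrative guesses at the article's functions; the helper names and numerical values are likewise hypothetical.

```python
import math

OMEGA_L, OMEGA_E, OMEGA_G = 0.1, 0.05, 5.0   # illustrative strengths of selection

def w_linear(z):                # ASSUMED form
    return 1 + OMEGA_L * z

def w_exponential(z):           # ASSUMED form
    return math.exp(OMEGA_E * z)

def w_gaussian(z):              # ASSUMED form
    return math.exp(-z * z / (2 * OMEGA_G ** 2))

def growth_rate(w, m, a, p):
    """Exact one-generation growth rate of rho = p/q with no background variance.

    Genotypes A2A2, A1A2, A1A1 have trait values m, m + a, m + 2a.
    """
    q = 1 - p
    w_a1 = p * w(m + 2 * a) + q * w(m + a)   # marginal fitness of allele A1
    w_a2 = p * w(m + a) + q * w(m)           # marginal fitness of allele A2
    return w_a1 / w_a2 - 1

def log_slope(w, m, dz=1e-5):
    """Selection gradient with sigma^2 = 0: slope of ln W at z = m."""
    return (math.log(w(m + dz)) - math.log(w(m - dz))) / (2 * dz)

a, p = 0.01, 0.3                # small effect, arbitrary allele frequency
results = {}
for name, w, m in [("linear", w_linear, 10.0),
                   ("exponential", w_exponential, 10.0),
                   ("gaussian", w_gaussian, -10.0)]:   # a and m of opposite signs
    exact = growth_rate(w, m, a, p)
    approx = a * log_slope(w, m)
    results[name] = (exact, approx)
    print(f"{name:12s} exact = {exact:.6e}   a*beta = {approx:.6e}")
```

With |*a*| a thousandth of |*m*|, the two columns agree to within a fraction of a percent for all three shapes, and the exact value barely moves when `p` is changed.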

#### Dynamics with background variation:

##### Changes in the frequency of the mutation and in the mean genetic background in one generation:

In the presence of background genetic variation for the trait (σ^{2} > 0), the expressions for the growth rate of ρ in one generation remain unchanged under linear and exponential selection. Under Gaussian stabilizing selection and assuming |*a*| ≪ |*m*|, the expression for ς_{g} with background genetic variation is approximately given by Equation 19.

Hence, among our chosen fitness functions, background genetic variance has a direct effect on the change of frequency at the focal locus only in the case of Gaussian stabilizing selection. In this case, the growth rate of the ratio ρ of frequencies at the selected locus in one generation decreases with increasing σ^{2}. With either exponential or linear selection, background genetic variance has no direct effect on the change in frequency at the focal locus. Nevertheless, the genetic variance does affect the dynamics of the mean genetic background value, which in turn influences selection at the focal locus under linear as well as under Gaussian selection. In contrast, selection at the focal locus is never affected by the genetic background of the trait under exponential selection.

The change in the mean value of the trait in one generation is given by Equation 11, where the exact expressions for the directional selection gradients are given in the appendix. Under the small-effect approximation, these selection gradients are given by Equation 20. Note that under this small-effect assumption, we still find, for any generation, the relation of Equation 18 between the growth rate of ρ and the selection gradient, as in the absence of background genetic variation. Using a similar small-effect approximation, Kimura and Crow (1978) derived a formula that is close to the latter, except that they used a different definition of the selection coefficient and defined the effect of the gene as (1 − *p*)*a* (after rearranging and reformulating with our notations). Their prediction therefore concerned the change in the frequency of *A*_{1} itself. This is not strictly equivalent to the relation of Equation 18, which proves more useful for calculating the complete trajectory of the mutation, as we will see below. Note that their result can also be obtained in our case from Equation 13.

Importantly, the approximate directional selection gradients in Equation 20 are independent of the frequency of the mutation at the focal locus, contrary to the exact expressions presented in the appendix. This uncouples the dynamics of the gene from those of the mean genetic background, which allows calculating first the trajectory in time of the mean genetic background *m* and then using it to derive the trajectory of the focal mutation, using Equations 16 and 19.
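The direct effect of background variance under stabilizing selection can be made explicit with a short calculation. The sketch below assumes the common Gaussian form W_g(z) = exp(−z²/2ω²), which may differ from the article's parameterization, and writes \bar{W}(\mu) for the mean fitness of a genotypic class whose genetic values are normal with mean μ and variance σ² (notation assumed here).

```latex
\begin{align}
\bar{W}(\mu) &= \int W_g(z)\,\mathcal{N}(z;\mu,\sigma^2)\,dz
= \sqrt{\frac{\omega^2}{\omega^2+\sigma^2}}\;
\exp\!\left[-\frac{\mu^2}{2(\omega^2+\sigma^2)}\right], \\
\frac{\bar{W}(m+a)}{\bar{W}(m)} &=
\exp\!\left[-\frac{2am + a^2}{2(\omega^2+\sigma^2)}\right]
\approx 1 - \frac{a\,m}{\omega^2+\sigma^2}
\quad (|a| \ll |m|,\ |am| \ll \omega^2+\sigma^2), \\
\beta &= \frac{\partial \ln \bar{W}}{\partial m} \approx -\frac{m}{\omega^2+\sigma^2},
\qquad
\varsigma_g \approx -\frac{a\,m}{\omega^2+\sigma^2} = a\,\beta .
\end{align}
```

Under this assumed form, background variance simply widens the effective fitness function from ω² to ω² + σ², which lowers both the gradient on the trait and the growth rate at the focal locus while leaving their ratio equal to *a*.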

##### Complete trajectory for a mutation of small effect:

To calculate the trajectory of the beneficial mutation *A*_{1}, we first need to know the complete dynamics of the trait. To obtain it, we assume that the background genetic variance remains constant over time, as in Lande (1983). This assumption may not be very realistic on the longer term (Turelli 1988). However, under the infinitesimal model (*i.e*., for a background composed of many loci with small effects as assumed here) it remains a reasonable approximation over the shorter term (*i.e*., in the beginning of the sweep), and it allows a treatment of the problem that may remain reasonably robust when σ^{2} changes through time. In any case, considering a constant variance may be a conservative assumption as to the effects of background variation on the dynamics of the focal locus, as we shall see in the discussion. We also assume that |*a*| ≪ |*m*| all along the selective sweep, even in the case of Gaussian stabilizing selection, although as the population approaches the optimum, |*m*| may ultimately decrease to zero. We assess the conditions that allow a selective sweep at the focal locus to complete under stabilizing selection in the next section.

Using Equations 11 and 20 and under the assumptions above, the dynamics of the mean genetic background for all three fitness functions are given by Equation 21, where time 0 is taken when the mutation *A*_{1} is at the threshold frequency ε defined with Equation 6, and notation is as previously. The results for the Gaussian and exponential functions are exact, whereas the one for linear selection is based on a continuous-time approximation. With a linear fitness function, the mean genetic value of the trait increases as a square-root function of time; that is, it increases more and more slowly with time. This can also be seen from the gradient of directional selection (Equation 20), which takes *m* in its denominator. This stems from a well-known property of linear fitness functions, which generate negative epistasis for fitness among mutations in the same direction (see, *e.g*., Tenaillon *et al*. 2007); here, as *m* increases, there is less and less advantage in increasing it any further. Under exponential selection, the mean value of the trait increases linearly with time, as a consequence of the constant gradient of selection. Finally, under Gaussian stabilizing selection, the absolute distance to the optimum decreases exponentially with time, at rate σ^{2}/ω^{2}.
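The three regimes for the mean genetic background (square-root, linear, and exponential approach to the optimum) can be reproduced by iterating the response to selection with a constant variance. The sketch below assumes the small-effect gradients that follow from the hypothetical forms *W*_{l} = 1 + ω*z*, *W*_{e} = exp(ω*z*), and *W*_{g} = exp(−*z*^{2}/2ω^{2}); the closed forms it compares against are derived from those assumptions, not copied from Equation 21.

```python
import math

OMEGA_L, OMEGA_E, OMEGA_G = 0.1, 0.05, 5.0   # illustrative strengths of selection
SIGMA2 = 1.0                                 # background variance, held constant

def beta_linear(m):        # gradient under the ASSUMED W_l(z) = 1 + omega * z
    return OMEGA_L / (1 + OMEGA_L * m)

def beta_exponential(m):   # gradient under the ASSUMED W_e(z) = exp(omega * z)
    return OMEGA_E

def beta_gaussian(m):      # gradient under the ASSUMED W_g(z) = exp(-z^2 / (2 omega^2))
    return -m / (OMEGA_G ** 2 + SIGMA2)

def iterate_mean(m0, beta, generations):
    """Iterate m' = m + sigma^2 * beta(m), one step per generation."""
    m = m0
    for _ in range(generations):
        m += SIGMA2 * beta(m)
    return m

T = 200
m_lin = iterate_mean(10.0, beta_linear, T)
m_exp = iterate_mean(10.0, beta_exponential, T)
m_gau = iterate_mean(-10.0, beta_gaussian, T)    # m is the distance to the optimum

# Closed forms under the same assumptions
lin_closed = (math.sqrt((1 + OMEGA_L * 10.0) ** 2
                        + 2 * OMEGA_L ** 2 * SIGMA2 * T) - 1) / OMEGA_L  # continuous-time
exp_closed = 10.0 + SIGMA2 * OMEGA_E * T                                 # exact
gau_closed = -10.0 * (OMEGA_G ** 2 / (OMEGA_G ** 2 + SIGMA2)) ** T       # exact

print(f"linear      {m_lin:.4f} vs {lin_closed:.4f}")
print(f"exponential {m_exp:.4f} vs {exp_closed:.4f}")
print(f"gaussian    {m_gau:.6f} vs {gau_closed:.6f}")
```

With these assumed forms the per-generation decay under Gaussian selection is σ^{2}/(ω^{2} + σ^{2}), which is close to σ^{2}/ω^{2} when the background variance is small relative to ω^{2}.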

Knowing *m*(*t*), the growth rate of the ratio of allelic frequencies can be expressed as a function of time by combining Equations 20 and 21 and by recalling the relation of Equation 18, which leads to Equation 22. This shows that the effect of background genetic variation on the dynamics of a locus affecting an adaptive quantitative trait crucially depends on the type of fitness function that governs selection on this trait. Under exponential selection, the growth rate of the mutation is constant and independent of background genetic variation. Under Gaussian stabilizing selection, it decreases exponentially with time and with the genetic variance σ^{2}. Finally, if the fitness function is linear, it has a more complicated decreasing dynamics, proportional to 1/√*t*.

In our model, as we assume no linkage and no epistasis, there is no covariance between the focal locus and the genetic background, so the part of variance explained by the locus is simply . This term cannot be easily related to the dynamics of the focal mutation; for instance, it does not affect it at all under an exponential fitness function. Therefore, the weight of a given QTL for a selected trait (defined as the proportion of the total variance explained by the QTL) is not necessarily a strong determinant of the signature of selection that it will show at the molecular level, and the absolute genetic value *a* of the QTL may be more informative in that respect. Nevertheless, the weight of the QTL may also be correlated to some extent with the strength of the signature of selection since it is an increasing function of *a*.

From Equation 22, the full trajectory of the mutation at the focal locus can be calculated using, for *t* > 0, Equation 23. The term *S*(*t*) is the cumulative growth rate of the mutation *A*_{1}, that is, the total increase of log(ρ) at time *t*, resulting from the action of selection over all generations since the frequency of the *A*_{1} allele exceeded ε. For each type of fitness function (and assuming that the dynamics for the linear fitness function can be approximated by a continuous-time process, *i.e*., replacing the sum with an integral), the cumulative growth rate *S*(*t*) is given by Equation 24, where is the initial gradient of directional selection under the linear fitness function. These expressions are compared to the product *s*(*t* − 1) with a constant selection coefficient *s*. The dynamics in Equations 23 and 24 can then be easily translated into that of the frequency of the beneficial mutation by noting that ρ = *p*/(1 − *p*), which leads to Equation 25.
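The conversion from a cumulative growth rate to an allele frequency can be sketched as follows. This is a minimal illustration rather than Equation 25 itself: it assumes only that log(ρ) starts at log[ε/(1 − ε)] and increases by *S*(*t*), with ρ = *p*/(1 − *p*). The function names and the two series of per-generation growth rates are hypothetical, chosen to contrast a constant coefficient with a geometrically decaying one.

```python
import math

def cumulative_growth(rates):
    """Total increase of log(rho) given per-generation growth rates."""
    return sum(rates)

def frequency_from_growth(S, eps):
    """Frequency p reached after a cumulative growth S, starting from p = eps."""
    rho0 = eps / (1.0 - eps)       # initial ratio of allelic frequencies
    rho = rho0 * math.exp(S)       # log(rho) has increased by S
    return rho / (1.0 + rho)       # back-transform rho = p / (1 - p)

if __name__ == "__main__":
    eps = 0.001
    constant = [0.05] * 200                            # constant selection coefficient
    decaying = [0.05 * 0.98 ** k for k in range(200)]  # coefficient shrinking by 2% per generation
    print(frequency_from_growth(cumulative_growth(constant), eps))
    print(frequency_from_growth(cumulative_growth(decaying), eps))
```

In this toy comparison the decaying series accumulates far less growth over the same number of generations, so the mutation remains at a much lower frequency.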

Figure 1 shows the approximate and exact dynamics of the beneficial mutation under a linear fitness function. The presence of standing variation can substantially affect these dynamics. For instance, in Figure 1A, the time to fixation (defined by the frequency 1 − ε) is tripled compared to the situation without background variation. Even if |*a*| > |*m*| at the start of the selective sweep, *m* increases with time so that the weak-effect approximation (assuming *a* ≪ *m*) gets better as the selective sweep proceeds. On the contrary, under stabilizing selection, the approximation performs well only when *m*_{0} is substantially greater (in absolute value) than the effect of the mutation, *a* (Figure 2A). Indeed, if |*a*| is initially close to |*m*|, the approximation will worsen as |*m*| decreases under selection for an optimum (Figure 2B). Note, however, that even in cases where the approximation performs poorly, it still describes the dynamics more accurately than assuming no background genetic variation at all.

##### Conditions for a complete selective sweep under stabilizing selection:

The results above were obtained assuming |*a*| ≪ |*m*| all along the selective sweep. However, under stabilizing selection, *m* tends to 0 as the trait approaches the optimum, so this assumption may be violated in the course of the sweep. Furthermore, if the mean genetic value approaches the optimum too quickly because of the background genetic variation, the mutation at the focal locus will obviously become deleterious and eventually disappear from the population. To put it another way, mutations starting in higher frequencies (pooled here in the genetic background) will be more likely to reach fixation first (thus reducing the distance to the phenotypic optimum) and to prevent the spread of new mutations (represented by the focal locus). Hence there are necessary conditions on the parameters of the system that allow the possibility of a complete hard sweep at the focal locus, that is, a beneficial mutation that starts in one copy and reaches fixation. Unfortunately, the range of parameters that defines these conditions is also the one for which the assumption |*a*| ≪ |*m*| is not valid, so there is interdependency between the dynamics at the focal locus and those of the mean genetic background, which prevents us from finding an exact solution. Nevertheless, a criterion for fixation can be built from the approximated system of Equations 21 and 24, and its accuracy can be tested with numerical examples. This criterion needs to be conservative (in the sense that it must lead to a parameter range that bounds the actual one) and informative enough (the obtained range must not be too large relative to the actual range). On the basis of numerical results under various parameter values, the criterion that we used was that the frequency *p* of the mutation reaches before the population gets to a distance 2*a* of the optimum, that is, before *a*(*m* + 2*a*) becomes positive. 
Using the simplified system in Equations 21 and 24 and assuming ε ≪ 1, this leads to Equations 26a and 26b, where *m*_{0} is the initial distance to the optimum. Hence, there is a maximum value of the background genetic variance σ^{2} above which the mutation cannot reach fixation, however strong its own effect may be, because the population reaches the optimum too quickly as a consequence of the response to selection by the background. Importantly, this maximum value does not depend on the intensity of stabilizing selection on the trait, quantified by 1/ω^{2}. This is because ω affects both the dynamics of the QTL and those of the mean background genetic value in the same manner and hence does not influence their competition. The maximum variance that allows a complete selective sweep at the QTL depends only on the ratio of the squared initial distance *m*_{0} over the frequency ε that determines the beginning of the quasi-deterministic phase of the sweep. If σ^{2} is below this maximum value, the effect *a* of the focal mutation must still be above a given threshold (in absolute value) for the mutation to be able to spread to fixation in the population. This threshold depends on the ratio of the background variance over the squared initial distance to the optimum, as well as on ε, but is again independent of ω. Obviously, the initial conditions must also verify the condition *a*(*a* + *m*_{0}) < 0 mentioned earlier in the absence of background variation. Figure 3A shows an example of the threshold value for *a* as a function of σ. Note that this threshold tends to infinity as σ approaches its maximal value. A selective sweep at the focal mutation can be completed only if the background genetic variation is very low and if the mutation effect is of the same order of magnitude as the background genetic standard deviation, that is, if the mutation has a very strong effect on the trait compared to other loci. Since the chosen criterion was defined *ad hoc*, we used numerical recursions to test whether it was close to the actual conditions for a complete selective sweep in the context of our model. The conditions obtained under our criterion were in good agreement with the actual behavior of the system, as shown in Figure 3, B and C. When |*a*| is below the threshold value defined in Equations 26, the frequency *p* of the focal mutation does not reach and eventually decreases back to zero as the mean background genetic value reaches the optimum. In contrast, when the effect of the mutation is slightly above the threshold (in absolute value), the mutation manages to exceed a frequency of and eventually reaches fixation (despite a slowing down of its dynamics), whereas the mean genetic background goes back down, to the value *m* = −2*a*. Hence the background genetic variance of a trait has a strong negative impact on the probability of a complete selective sweep at a QTL under stabilizing selection, since it allows the population to adapt and reach its optimum without using any new mutation. Importantly, by assuming a constant background variance as we did in our simple model we likely underestimated the impact of background genetic variance on the focal locus (see discussion), so our results (Equations 26) are conservative in that respect.
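The race between the focal mutation and its genetic background can be explored with a short deterministic recursion. The sketch below is an assumption-laden illustration, not the recursion used for Figure 3: it assumes a Gaussian fitness function exp[−*z*^{2}/(2ω^{2})] with the optimum at 0, genotypic effects 0, *a*, and 2*a*, a constant and normally distributed background with variance σ^{2}, random mating, and a response of the background mean equal to σ^{2} times the derivative of log mean fitness with respect to *m*. All function names and parameter values are hypothetical.

```python
import math

def class_mean_fitness(m, copies, a, omega2, sigma2):
    """Mean fitness of a genotype carrying `copies` A1 alleles, averaged over a
    normal background with mean m and variance sigma2 (Gaussian fitness, optimum 0)."""
    v = omega2 + sigma2
    return math.sqrt(omega2 / v) * math.exp(-((m + copies * a) ** 2) / (2.0 * v))

def simulate_race(a, m0, sigma2, omega2=1.0, eps=1e-3, generations=2000):
    """Iterate allele frequency p and background mean m; return their final values."""
    p, m = eps, m0
    v = omega2 + sigma2
    for _ in range(generations):
        q = 1.0 - p
        freqs = (q * q, 2.0 * p * q, p * p)   # Hardy-Weinberg genotype frequencies
        w = [class_mean_fitness(m, k, a, omega2, sigma2) for k in range(3)]
        w_bar = sum(f * wk for f, wk in zip(freqs, w))
        # Directional gradient: derivative of log mean fitness with respect to m
        beta = -sum(f * wk * (m + k * a)
                    for k, (f, wk) in enumerate(zip(freqs, w))) / (v * w_bar)
        p = p * (p * w[2] + q * w[1]) / w_bar  # standard one-locus selection recursion
        m += sigma2 * beta                     # response of the background mean
    return p, m

if __name__ == "__main__":
    print(simulate_race(a=0.2, m0=-1.0, sigma2=0.005))  # low background variance
    print(simulate_race(a=0.2, m0=-1.0, sigma2=1.0))    # high background variance
```

With these illustrative values, a low background variance lets the mutation sweep while the background mean settles near −2*a*, whereas a high variance brings the population to the optimum first and the mutation is eventually lost.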

Note also that these results are based on a deterministic argument. In a finite population, when the mutation *A*_{1} is in a very low number of copies, its frequency will also fluctuate stochastically, so it is very likely to be lost by genetic drift. This stochastic sieve has been studied thoroughly, and the probability of fixation of mutations can be calculated under various selective contexts (Crow and Kimura 1970; Ewens 2004), including that where there is competition between beneficial mutations at several loci (Barton 1995; Gerrish and Lenski 1998). Here, we assume that the stochastic sieve has been passed successfully (*i.e*., *p* > ε), as mentioned earlier; hence this section focuses specifically on selective sweeps that are stopped because the optimum phenotype was reached (*m* = 0) through selection at other loci, regardless of any stochastic effects. Stochastic effects should even worsen the situation for the focal locus, as we develop in the discussion.

#### Signatures of selection:

As shown in Equation 7, the expected reduction of heterozygosity *R*_{H} for a neutral locus located at a recombination distance *r* from a locus under positive selection can be found analytically by calculating the integral . The obtained pattern of neutral polymorphism can then be compared to the one produced by a selective sweep of constant *s*, thus mimicking the empirical approach where Equation 7 is fitted to an empirical polymorphism pattern to infer the selection coefficient *s* (assumed constant). This estimated *s* can be thought of as an “effective selection coefficient” *s*_{e} for the hitchhiking effect in a context of varying selection coefficient, *i.e*., the constant value of *s* that would lead to the same signature of selection as the one observed. Using the approximated Equation 10, this leads to Equation 27.

Unfortunately, when replacing with the inverse dynamics obtained from Equation 7, the resulting integral *I* cannot be solved. Alternatively, we can numerically compute the increase of polymorphism through recombination on the haplotype that carries the beneficial mutation. This is done by using the exact or approximated dynamics for the selected locus (Equations 24 and 25) and then converting them into the change in frequency at the neutral locus using Equation 4. The final polymorphism pattern can then be compared to an expected pattern assuming a constant selection coefficient, for instance, by fitting Equation 10 to the heterozygosity pattern. Figure 4 shows an example of such an estimation of the effective selection coefficient under Gaussian stabilizing selection (based on Equation 27). The actual and fitted polymorphism curves are shown in Figure 4A. The pattern of reduction of heterozygosity *R*_{H} with a changing selection coefficient (implied by the Gaussian stabilizing selection with background genetic variance) is quite similar to that expected under a constant selection coefficient. As seen in Figure 4B, the actual selection coefficient decreases exponentially with time, as is also apparent from Equation 24. The selection coefficient *s*_{e} estimated through the reduction of heterozygosity corresponds to the *s* in the early phase of the selective sweep, as expected. Indeed, the hitchhiking effect is mostly concentrated in the beginning of a selective sweep, when the mutation is in low frequency (Barton 1998). Yet this “effective” selection coefficient is substantially lower than the initial selection coefficient (about half of its value in our example).

Since the above results neglect the mutation events at the neutral locus during the selective sweep, we also ran coalescence simulations of a selective sweep with a decreasing selection coefficient. To do so, we modified the program “ssw” by Yuseob Kim (available online at http://yuseobkim.net/YuseobPrograms.html), by replacing the expected trajectory of the beneficial mutation with the approximate dynamics of Equations 24 and 25. We then applied Kim and Stephan's (2002) composite-likelihood-ratio test to estimate the selection coefficient involved in the selective sweep. This method uses the site-frequency spectrum (SFS) of neutral mutations (*i.e*., the proportion of mutations that are found at each frequency in a sample) to infer parameters of a selective sweep together with the (composite) likelihood of selection *vs*. neutrality. As seen in Figure 4B, the selection coefficient thus estimated is far below the *s*_{e} predicted through the reduction of heterozygosity and neglecting mutation during the selective sweep. This is because a decreasing selection coefficient induces the mutation *A*_{1} to remain at high frequencies for a longer time before it can fix. During this time lapse, (i) mutation restores some of the neutral genetic diversity in the population and (ii) genetic drift causes very frequent neutral variants to fix, thus decreasing their proportions in the SFS. Hence at the time of fixation of the beneficial mutation, the selective sweep looks older than a selective sweep with a constant *s*. We also used another method in which the likelihood-ratio test is calculated from the linkage disequilibria between neutral sites (Kim and Nielsen 2004). Specifically, this method detects a peculiar pattern of linkage disequilibrium generated by complete selective sweeps, in which the linkage disequilibrium is strong between loci located on the same side of the selected locus, but is very low between loci located on each side of the selected locus (Kim and Nielsen 2004; Stephan *et al*. 2006). However, this method failed to detect selection in our example, because the linkage disequilibrium was quickly broken down at the end of the selective sweep (not shown).
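For readers unfamiliar with the site-frequency spectrum mentioned above, the following minimal sketch shows how it can be tabulated from a sample. It is unrelated to the “ssw” program or to the composite-likelihood-ratio test: the function name and the toy 0/1 haplotype matrix (1 denoting the derived allele) are hypothetical, and only segregating sites are counted.

```python
from collections import Counter

def site_frequency_spectrum(haplotypes):
    """Proportion of segregating sites at each derived-allele count 1..n-1.
    `haplotypes` is a list of n equal-length sequences of 0/1 values."""
    n = len(haplotypes)
    counts = [sum(column) for column in zip(*haplotypes)]  # derived count per site
    segregating = [c for c in counts if 0 < c < n]         # drop monomorphic sites
    if not segregating:
        return {}
    tally = Counter(segregating)
    total = len(segregating)
    return {k: tally.get(k, 0) / total for k in range(1, n)}

if __name__ == "__main__":
    sample = [
        [1, 0, 0, 1],
        [1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 0],
    ]
    print(site_frequency_spectrum(sample))
```

A sweep that looks older, as described in the text, would show fewer high-frequency derived variants in such a spectrum than a recent sweep with a constant selection coefficient.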

## DISCUSSION

We studied the dynamics of a beneficial mutation at a gene affecting a quantitative trait that also harbors background genetic variance contributed by other loci. To do so, we used a purely deterministic model in the range of allelic frequencies where stochastic effects can be neglected. We showed that, if the effect of a mutation at the focal locus is small relative to the mean genetic background value of the trait, the full trajectory of the mutation (and of the mean background) can be derived under various fitness functions and related to the genetic and selective parameters of the trait. We also found conditions for a complete selective sweep at a quantitative trait locus under stabilizing selection. The selection coefficient of the beneficial mutation decreases in time because of background genetic variance under two of the three fitness functions studied (linear and Gaussian). The deterministic reduction of heterozygosity at a linked neutral locus is mostly influenced by the selection coefficient in the early generations of the selective sweep, as expected from Barton (1998), but the effective selection coefficient for the hitchhiking effect can be much lower than the one at the starting conditions.

In the following, we first discuss the limits of the model that we used and the robustness of our results to departures from our assumptions, and then we discuss some possible improvements and applications.

#### Potential limits of the model:

In this study, we neglected the possible changes in the variance of the trait for the sake of clarity. In fact, selection modifies the genetic variance of the population at each generation following , where the gradient of quadratic selection on the trait γ measures the mean curvature of the adaptive landscape experienced by the population (Lande and Arnold 1983). Genetic drift decreases the genetic variance at rate (1 − 1/(2*N*_{e})) per generation, whereas mutation increases it by an amount . The combined effect of all these factors on the change in variance depends on the parameter values and on the type of selection (fitness function) operating on the trait. However, assuming a constant variance may be a good approximation in the short term, *i.e*., in the early generations of the selective sweep where the hitchhiking effect is strongest. Moreover, under stabilizing selection, and assuming that the population was at mutation–selection–drift equilibrium before the environmental change—that is, it remained at the optimum for many generations—Burger and Lynch (1995) showed that the environmental change actually increases the variance of the trait under selection (see their Figure 6). Hence, if the population was at equilibrium prior to the environmental change (which seems a reasonable assumption), our results obtained with a constant variance are conservative and actually underestimate the reduction of the selection coefficient of a new mutation affecting a quantitative trait due to selection on the genetic background of this trait.

As explained in the text, the fitness functions used here were chosen for illustrative purposes, but also because they can be good approximations to real fitness functions in the vicinity of the present state of a population. The linear and Gaussian fitness functions are widely used to describe directional and stabilizing selective pressures on traits, respectively, because of their simplicity and because they are good approximations to more complicated functions under a wide range of parameter values (see Lande 1976 for a discussion of the Gaussian approximation for general stabilizing selection). The exponential fitness function is less frequent in models of directional selection as it seems more extreme than the linear one (but see Lande 1983). Its prevalence in natural populations is difficult to assess, since it has rarely been tested explicitly on empirical data. Nevertheless Schluter (1988), using a nonparametric approach, demonstrated empirically that exponential-like fitness functions occur in natural populations. Moreover, these functions have the interesting property that they combine a strong positive slope (gradient of directional selection) with a positive curvature [positive quadratic selection gradient (Lande and Arnold 1983)]. Since a positive curvature is in general interpreted as a sign of disruptive selection, finding this feature together with a strong directional gradient may seem like an empirical paradox. Schluter (1988) emphasized that such a pattern may very well be caused by exponential-like fitness functions. This type of fitness function may thus be important in the study of evolutionary biology and is also quite illustrative, so we considered it as well.

Also, we assumed that the distribution of background genetic values was the same in every genetic class at the QTL. However, in a finite population, even a very large one, when the frequency of the beneficial mutation is very low (or very high), the number of individuals in a particular genotypic class at the focal locus is small, so this class may harbor a background genetic variance different from those of the other genotypic classes. This is an important stochastic factor, since a new mutation may be associated by chance with very beneficial or very deleterious alleles at other loci, instead of experiencing all possible genotypic values at other loci. This can be accounted for by noting that the background genetic variance in a genotypic class of *n* individuals is distributed like the sample variance for a sample of size *n* (since we assume no linkage and no epistasis). Its expectation is [(*n* − 1)/*n*]σ^{2}, which for large *n* tends to σ^{2}, the variance of the entire population, and its coefficient of variation (*i.e*., the standard deviation divided by the mean) is , a decreasing function of *n*. This coefficient of variation depends only on the number *n* of individuals in the genotypic class, regardless of the total population size. We can thus define *n*_{α} as the threshold number of individuals in a given genotypic class at locus *A*, above which the coefficient of variation of the background genetic variance is below a tolerance level α. The definition of ε in Equation 7 can be extended to represent the frequency above which not only do the dynamics become quasi-deterministic without background variation (Stephan *et al*. 1992), but also all genotypes harbor background genetic variances close to that in the entire population (that is ε > *n*_{α}/*N*, where *N* is the population size).
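The sampling argument above can be made concrete with a small sketch. It relies on a standard result that is assumed here to match the expressions missing from the text: for *n* normally distributed values, the sample variance has a coefficient of variation √[2/(*n* − 1)], so a tolerance α is met once *n* exceeds 1 + 2/α^{2}. The function names are hypothetical, and the population size in the usage example is illustrative only.

```python
import math

def cv_sample_variance(n):
    """Coefficient of variation of the sample variance of n normal values
    (standard chi-square result; assumes normally distributed background values)."""
    return math.sqrt(2.0 / (n - 1))

def threshold_class_size(alpha):
    """Class size at which the coefficient of variation equals the tolerance alpha."""
    return 1.0 + 2.0 / alpha ** 2

def min_epsilon(alpha, pop_size):
    """Smallest frequency epsilon such that epsilon * N reaches the threshold class size."""
    return threshold_class_size(alpha) / pop_size

if __name__ == "__main__":
    alpha = 0.1
    print(threshold_class_size(alpha))   # individuals needed in a genotypic class
    print(min_epsilon(alpha, 10**6))     # corresponding frequency for N = 10^6 (illustrative)
```

Because the threshold depends only on α, the corresponding frequency ε shrinks in proportion to 1/*N* as the population size grows.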

Another stochastic effect that might matter in the competition between several loci affecting the trait is the reduction of the effective size of the population. Indeed, selection decreases the effective population size by creating variance in family size (Santiago and Caballero 1995), thus reducing the efficacy of selection. Since the most frequent alleles contribute most to the genetic variance in fitness (all other things being equal), they also have the strongest effect in decreasing the effective population size for other loci. Hence they should experience a positive feedback that further increases their deterministic advantage of starting at higher frequencies. A comprehensive treatment of this stochastic competition between unlinked mutations is worth investigating, but it is beyond the scope of this article.

Finally, we have assumed that the focal locus is completely unlinked to the loci in the genetic background and that there is no epistasis between the locus and its genetic background for the trait considered. Yet, epistatic interactions for quantitative traits (defined as departures from additivity) have been repeatedly reported in QTL analyses (Carlborg *et al*. 2003; Kroymann and Mitchell-Olds 2005). Epistasis could affect our results by modifying the mean effect of the mutation over all the genetic backgrounds, thus changing *a* in our model. However, there is no reason to think that epistasis is systematically biased toward positive or negative values, such that the pooled effect of all loci in the background would result in a systematic decrease or increase of *a*. Hence as long as each genotype at the focal locus is represented by many individuals, and the genetic background consists of many loci, epistatic interactions should sum up to zero. Linkage is difficult to incorporate in the framework that we used; it would introduce partial covariance between the focal locus and the genetic background, resulting in an apparent bias for *a*. Contrary to epistasis, this bias may be sustained over several generations, which may influence the outcome of the selective sweep. In any case, the presence of a second selective sweep in the vicinity of the focal selective sweep may substantially affect the polymorphism pattern in the region as compared to a single selective sweep (Chevin *et al*. 2008). Therefore, the present model is *a priori* best suited for fully sexual species and traits for which QTL are spread over the genome.

#### Potential developments and applications:

In practice, both the focal mutation and the genetic background may affect several traits under selection. Pleiotropy can alter predictions regarding selection at the focal locus based on a single character (Otto 2004), and genetic covariances in the background can lead to changes of the mean genetic value of the focal trait as an indirect consequence of selection on other traits (Lande 1979). Multivariate approaches are now a standard tool in evolutionary genetics (Walsh 2007) and have been applied recently to relate the phenotypic effect of pleiotropic mutations to their fitness effects, in the absence of background genetic variation (Martin and Lenormand 2006a). To understand how selection affects pleiotropic mutations in the presence of other polymorphic loci, our approach may be generalized by treating the focal mutation as a vector of effects on several traits, in the presence of a background genetic variance–covariance *G*-matrix, as in Agrawal *et al*. (2001). Even in this context, the simple results presented here may be good approximations in cases where the focal mutation has unbalanced effects and influences mainly one trait (low effective dimensionality) and where there is little background genetic covariance between this trait and other traits under selection.

Our results may have important implications regarding the methods used to detect signatures of selection. Teshima and Przeworski (2006) showed that dominance can modify the trajectory of a beneficial mutation (relative to the case of additive selective advantage) in a way that dramatically influences the signature that a selective sweep leaves on neutral variation. In our general context, background genetic variance can provoke a significant decrease in the selection coefficient over time. Note that a decrease of the selection coefficient, as described here under linear or Gaussian fitness function, may also occur under other types of fitness functions, for instance, those that increase until they reach a “plateau.” When we considered only the reduction of heterozygosity and neglected neutral mutations during the selective sweep, the two types of signatures (constant selection coefficient or decreasing selection coefficient) were barely distinguishable, but the effective selection coefficient in the case of a decreasing *s* was inferior to the initial *s*. When we included mutation during the selective sweep and looked at the frequency spectrum of mutations, the selection coefficient estimated was much lower, and we failed to detect selection through its effect on linkage disequilibrium. Hence at the time the beneficial mutation reaches fixation, a selective sweep with decreasing selection coefficient is very similar to an old selective sweep. In this context, it may thus be more efficient to use methods that search for an ongoing selective sweep (as in, *e.g*., Voight *et al*. 2006), since the methods that assume that the beneficial mutation is fixed may have a low power as a consequence of the decrease of the selection coefficient in time. It may also be possible to estimate the decrease of the selection coefficient of a mutation. 
This could be done, for instance, by modifying methods that jointly infer the selection coefficient and the age of a selective sweep, such as that of Przeworski (2003).

#### Conclusion:

There has been marked interest recently in the population genetic theory of adaptation. Besides the theoretical developments stemming from Fisher's geometrical model (Orr 2005), experimental evolution with microbes (Elena and Lenski 2003) provides examples of how adaptation proceeds in controlled laboratory conditions. Specifically, empirical estimates of the distribution of the fitness effects of mutations (reviewed in Eyre-Walker and Keightley 2007), and in particular of beneficial ones, aim at quantifying the raw material for adaptation. However, the applicability of fitness effects measured in the laboratory to natural populations has been questioned only in a few studies to date. It is not clear whether the selection coefficients estimated under specific controlled conditions can be directly translated into another context. Fundamentally, the underlying questions are as follows: Can we assign a constant selection coefficient to a given mutation? And if not, to what extent is this selection coefficient determined by other factors?

First, the environment that the population experiences affects the fitness effect of a mutation. Using a multivariate model of stabilizing selection and comparing it with data from the literature, Martin and Lenormand (2006b) showed that the effect of a change of environment on the distribution of fitness effects of mutations follows a predictable trend. The selection coefficient of a mutation may also depend on the genetic environment in which it occurs. Most empirical studies of the distribution of the fitness effects of mutations focus on single mutations in a given reference background (Eyre-Walker and Keightley 2007), so that they are only informative about the process of adaptation from rare *de novo* mutations where only one mutation can sweep at a time. When several mutations segregate simultaneously at several loci, it is not clear what their selection coefficients will become, *i.e*., what the effect of a variable genetic background is on allele frequency changes. Theoretical predictions (Martin *et al*. 2007), confirmed by experimental results (Elena and Lenski 1997; Sanjuan *et al*. 2004), indicate that epistasis for fitness between pairs of mutations can be substantial. In sexual species with smaller population sizes, such as higher eukaryotes, quantitative traits under selection can exhibit substantial variation caused by many loci (Falconer and Mackay 1996; Barton and Keightley 2002). In such a situation, the selection coefficient of an allele, defined either at the locus level only or as a departure from the mean fitness of the population (as in Kimura and Crow 1978 or Barton and Turelli 1991), can be informative only about selection acting in one generation. The actual dynamics of a mutation all along its trajectory are more complex. 
As we show here with a simple model, the dynamics of a beneficial mutation affecting a quantitative trait under selection depend not only on its own effect, but also on the mean and variance of the genetic background for the trait and on the strength of selection on this trait. Moreover, the relative importance of each of these parameters crucially depends on the shape of the adaptive landscape. In any case, the selection coefficient that matters for the dynamics of a gene cannot be related in a simple manner to the proportion of genetic variance explained by this gene [a quantity often used to quantify the QTL effect in empirical studies (Lynch and Walsh 1998)].

To better understand the meaning of molecular signatures of selection in humans or in model species of higher eukaryotes (such as fruit flies, Arabidopsis, etc.), it is thus essential to empirically assess how adaptation proceeds in those species. If the model of periodic selection (Atwood *et al*. 1951; Elena *et al*. 1996) applies, then theoretical results from the population genetics of adaptation and experimental evolution on microbes can be helpful in understanding adaptation in those species, too. In contrast, if the response to selection is essentially multigenic, selection at specific loci may strongly vary in time and depend on the background genetics of the trait. If so, the effective selection coefficient for molecular signatures of selection would be only partly informative about the actual advantage of the mutation while it segregated in the population. Moreover, some types of selection on traits (*i.e*., shapes of the fitness function) would be overrepresented in detectable selective sweeps, such that some categories of adaptive traits would be systematically missed by genome scans. On the other hand, specific models of phenotypic selection such as the ones proposed here provide alternative null models of variable selection coefficients that could be tested with molecular data.

## APPENDIX

In this appendix, we derive the exact and approximate dynamics of the frequency *p* of the *A*_{1} mutation at the focal locus and of the mean genetic background value *m*.

We first calculate the mean fitness in the population under the three fitness functions. The mean fitness of each genotypic class *A*_{*i*}*A*_{*j*} is , where *f*(.) is the Gaussian distribution with mean 0 and variance σ^{2} and α_{*ij*} = 2 if {*i*, *j*} = {1, 1}, α_{*ij*} = 1 if {*i*, *j*} = {1, 2} or {2, 1}, and α_{*ij*} = 0 if {*i*, *j*} = {2, 2}. Then the mean fitness in the population is, according to Equation 12, given by Equation A1, where *q* = 1 − *p* and . The change in frequency of the allele *A*_{1} can be calculated using Equation 13, which gives Equation A2.

The growth rate of the ratio of allelic frequencies ρ = *p*/*q* can then be found following Equation 15, which, after some rearrangement, leads to Equation A3.
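As an illustration of the genotypic-class mean fitness used at the start of this appendix, the Gaussian case can be written out explicitly. The expression below is a sketch under the assumption that the fitness function is exp[−*z*^{2}/(2ω^{2})] with the optimum at 0 and that a genotype with α_{*ij*} copies of *A*_{1} has genotypic value *m* + α_{*ij*}*a* + *g*, where *g* follows *f*; the notation may differ from that of the original equations.

```latex
\bar{W}_{ij}
  = \int_{-\infty}^{\infty}
      \exp\!\left(-\frac{(m + \alpha_{ij} a + g)^{2}}{2\omega^{2}}\right) f(g)\, dg
  = \sqrt{\frac{\omega^{2}}{\omega^{2} + \sigma^{2}}}\,
    \exp\!\left(-\frac{(m + \alpha_{ij} a)^{2}}{2(\omega^{2} + \sigma^{2})}\right)
```

Under this assumption the background variance enters only through the sum ω^{2} + σ^{2}, which is consistent with the remark in this appendix that the Gaussian case is the only one in which the one-generation selection coefficient depends on σ^{2}.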

Under linear and Gaussian fitness functions, the dynamics are slightly frequency dependent, since *p* is present in the expression of . Note that this is also the case in the classical selection model (see Equation 2). We now introduce an approximation for mutations of small effect (similar to considering *s* ≪ 1 in Equation 3): the effect of the focal mutation on the trait, *a*, is assumed to be small relative to the mean background genetic value of the trait (discounting the effect of the mutation), *m*. The results obtained under this approximation are denoted by an asterisk (“*”) in the rest of this article. When |*a*| ≪ |*m*| (where “| |” denotes the absolute value), the growth rate of the focal mutation becomes

(A4)

which is now independent of the frequency *p* of the mutation. Note that only in the case of Gaussian selection does the selection coefficient in one generation depend on the background genetic variance σ^{2}.
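The small-effect argument can be checked numerically in the Gaussian case. The sketch below is not a transcription of Equation A4. It assumes a Gaussian fitness function exp(−*z*^{2}/(2ω^{2})) with its optimum at 0 and width ω^{2} (both hypothetical choices), a genotypic value of *m* + α_{ij}*a*, random mating, and a growth rate measured as ln(ρ′/ρ). Under these assumptions the class mean fitness is a standard Gaussian integral, and expanding it to first order in *a* gives a growth rate of −*am*/(ω^{2} + σ^{2}).

```python
import math

def gaussian_class_mean_fitness(g, sigma2, omega2):
    """Closed-form average of exp(-(g + x)^2 / (2*omega2)) over x ~ N(0, sigma2)."""
    v = omega2 + sigma2
    return math.sqrt(omega2 / v) * math.exp(-g * g / (2.0 * v))

def exact_log_growth(p, m, a, sigma2, omega2):
    """ln(rho'/rho) from the marginal fitnesses of A1 and A2 (random mating assumed)."""
    q = 1.0 - p
    w11, w12, w22 = (gaussian_class_mean_fitness(m + k * a, sigma2, omega2)
                     for k in (2, 1, 0))
    return math.log((p * w11 + q * w12) / (p * w12 + q * w22))

def small_effect_log_growth(m, a, sigma2, omega2):
    """First-order growth rate when |a| << |m|; it does not depend on p."""
    return -a * m / (omega2 + sigma2)

# Hypothetical values: mean background two units below the optimum, small a.
m, a, sigma2, omega2 = -2.0, 0.01, 1.0, 10.0
approx = small_effect_log_growth(m, a, sigma2, omega2)
for p in (0.01, 0.5, 0.99):
    exact = exact_log_growth(p, m, a, sigma2, omega2)
    print(f"p = {p:4.2f}  exact = {exact:.6f}  small-effect = {approx:.6f}")
```

With these values the exact growth rate varies only slightly with *p* and stays close to the approximation. Increasing `sigma2` lowers both, which illustrates why the background variance matters under Gaussian selection.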

The change in the mean background value is controlled by the gradient of directional selection on the trait, as shown in Equation 11. Using the expressions for the mean fitness in (A1), those gradients are

(A5)

In the case of exponential selection, the directional selection gradient is constant and depends neither on the frequency of the mutation at the focal locus nor on the mean genetic background or the amount of genetic variance for the trait. In contrast, for the linear and Gaussian fitness functions, the selection gradient depends on the frequency *p* of the focal mutation. There is thus complete interdependency between the dynamics of the focal mutation and that of the genetic background for the trait: (i) the focal mutation influences the mean fitness, which changes the selective pressure on the trait (β), and (ii) the change in the mean background genetic value *m* changes the dynamics of the mutation, characterized by the relative growth rate ς. Therefore, there is no simple solution to the full system of equations. Nevertheless, under the small-effect assumption (|*a*| ≪ |*m*|) the gradients of directional selection become

(A6)

which are all independent of the frequency of the *A*_{1} allele. This allows calculating the trajectory in time of the mean genetic background *m* first, and then using it to find the full dynamics of the beneficial mutation at the focal locus.
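The two-step procedure described here can be sketched as follows for the Gaussian case. This is an illustration under stated assumptions, not the article's derivation: the fitness function is taken to be Gaussian with optimum 0 and width ω^{2}, the small-effect gradient is taken to be −*m*/(ω^{2} + σ^{2}), the background mean is assumed to change by σ^{2}β each generation with σ^{2} held constant, and the growth rate of ρ is taken to be −*am*/(ω^{2} + σ^{2}) on the log scale. All function names and parameter values are hypothetical.

```python
import math

def background_trajectory(m0, sigma2, omega2, generations):
    """Step 1: iterate the mean background value, ignoring the focal locus."""
    v = omega2 + sigma2
    path = [m0]
    for _ in range(generations):
        beta = -path[-1] / v                   # assumed small-effect gradient
        path.append(path[-1] + sigma2 * beta)  # assumed response: sigma2 * beta
    return path

def mutation_trajectory(p0, a, m_path, sigma2, omega2):
    """Step 2: accumulate ln(rho) along the background path, then convert to p."""
    v = omega2 + sigma2
    log_rho = math.log(p0 / (1.0 - p0))
    freqs = [p0]
    for m in m_path[:-1]:
        log_rho += -a * m / v                  # assumed small-effect growth rate
        freqs.append(1.0 / (1.0 + math.exp(-log_rho)))
    return freqs

# Hypothetical scenario: background starts two units below the optimum.
m_path = background_trajectory(m0=-2.0, sigma2=1.0, omega2=10.0, generations=50)
p_path = mutation_trajectory(p0=0.01, a=0.1, m_path=m_path, sigma2=1.0, omega2=10.0)
for t in (0, 10, 25, 50):
    print(f"t = {t:2d}  m = {m_path[t]:8.4f}  p = {p_path[t]:.5f}")
```

In this sketch the background mean approaches the optimum geometrically, so the per-generation growth of ρ shrinks over time and the total gain in ln(ρ) is bounded. This is one way to picture the race between the focal mutation and its genetic background.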

## Acknowledgments

We thank Emmanuelle Porcher, Guillaume Martin, Russell Lande, and three anonymous reviewers for helpful comments and criticisms on earlier versions of this manuscript. L.-M.C. is supported by a bourse de doctorat pour ingénieurs from the Centre National de la Recherche Scientifique.

## Footnotes

Communicating editor: J. Wakeley

- Received July 4, 2008.
- Accepted September 17, 2008.

- Copyright © 2008 by the Genetics Society of America