Abstract
We study the trajectory of an allele that affects a polygenic trait selected toward a phenotypic optimum. Furthermore, conditioning on this trajectory we analyze the effect of the selected mutation on linked neutral variation. We examine the well-characterized two-locus two-allele model but we also provide results for diallelic models with up to eight loci. First, when the optimum phenotype is that of the double heterozygote in a two-locus model, and there is no dominance or epistasis of effects on the trait, the trajectories of selected mutations rarely reach fixation; instead, a polymorphic equilibrium at both loci is approached. Whether a polymorphic equilibrium is reached (rather than fixation at both loci) depends on the intensity of selection and the relative distances to the optimum of the homozygotes at each locus. Furthermore, if both loci have similar effects on the trait, fixation of an allele at a given locus is less likely when it starts at low frequency and the other locus is polymorphic (with alleles at intermediate frequencies). Weaker selection increases the probability of fixation of the studied allele, as the polymorphic equilibrium is less stable in this case. When we do not require the double heterozygote to be at the optimum we find that the polymorphic equilibrium is more difficult to reach, and fixation becomes more likely. Second, increasing the number of loci decreases the probability of fixation, because adaptation to the optimum is possible by various combinations of alleles. Summaries of the genealogy (height, total length, and imbalance) and of sequence polymorphism (number of polymorphisms, frequency spectrum, and haplotype structure) next to a selected locus depend on the frequency that the selected mutation approaches at equilibrium. 
We conclude that multilocus response to selection may in some cases prevent selective sweeps from being completed, as described in previous studies, but that conditions causing this to happen strongly depend on the genetic architecture of the trait, and that fixation of selected mutations is likely in many instances.
To improve our understanding of the genetics of adaptation, recent approaches of molecular population genetics and genomics have attempted to detect signatures of positive selection in the genome (selective sweeps) (Stephan 2010). Typically, these studies reveal many genes or gene regions that may have been under positive selection. However, the relationship between the genes under selection and associated traits usually remains unknown. Here we follow the opposite direction by starting with phenotypes and working toward the genotypes. A phenotype may be determined by a multitude of genes as well as the environment. Multilocus population genetics has been developed in the last decades to describe the evolution of multilocus systems and phenotypes (Bürger 2000). Different types of selection, such as directional, stabilizing or disruptive selection, modify the genetic constitution of the population and favor either extreme or intermediate genotypic values of the trait. In this study we focus on stabilizing selection, which drives a trait toward a phenotypic optimum. We investigate the trajectory of an allele affecting the phenotypic trait (from low frequency up to an equilibrium value). We are particularly interested in exploring the parameter range of trajectories that fix and therefore might generate selective sweeps.
Historically, there has been great interest in the maintenance of genetic variability under stabilizing selection, because stabilizing selection is assumed to operate on traits in various organisms, for example, the coat color in mice (Vignieri et al. 2010), human facial features (Perrett et al. 1994), plant defense mechanisms (Mauricio and Rausher 1997), enhancer elements in Drosophila (Ludwig et al. 2000), and vocalization in frogs and toads (Gerhardt 1994); see also Endler (1986, Chap. V) for examples and discussion. Furthermore, it has been suggested that this type of selection exhausts genetic variation (Fisher 1930; Robertson 1956).
By contrast, many quantitative traits exhibit high levels of genetic variability. This contradiction motivated researchers to study the role of mutation (Lande 1975; Turelli 1984; Gavrilets and Hastings 1994; Bürger 1998), overdominance (Bulmer 1973; Gillespie 1984), migration (Tufto 2000), frequency-dependent selection through intraspecific competition for some resource (Bürger 2002; Bürger and Gimelfarb 2004), genotype–environment interaction (Gillespie and Turelli 1989), pleiotropy (Hill and Keightley 1988; Barton 1990; Zhang and Hill 2002), and epistasis (Zhivotovsky and Gavrilets 1992). Additionally, much work has been devoted to exploring the ability of stabilizing selection to maintain genetic variability of quantitative traits that are controlled by multiple loci in the absence of mutation. Theoretical work has focused mainly on two-locus models, but models with more than two loci have also been analyzed.
Surprisingly, predictions about genetic variability depend profoundly on the number of loci. The two-locus model predicts that genetic variability may remain in the population due to stabilizing selection per se. On the other hand, in models with more than two loci the amount of genetic variability maintained by stabilizing selection is smaller. The reason is that the optimum can be approached very closely by various homozygous genotypes (Bürger 2000, Chap. VI) when there are more than two loci that control the trait. For the two-locus model, and assuming a symmetric viability model, such that the double heterozygous genotype is optimal and the fitness values of the remaining eight genotypes are symmetric about the optimum (e.g., Bodmer and Felsenstein 1967; Karlin and Feldman 1970), it has been shown that there are nine equilibria (Bürger 2000), seven of which can be stable but not simultaneously. Those seven equilibria split into four classes (Bürger and Gimelfarb 1999): they can be polymorphic for both loci or one of them or totally monomorphic. Because analytical solutions of the two-locus model are available, analysis of this model plays an important role in our study.
To our knowledge, the first effort that bridges quantitative trait evolution and selective sweeps was made by Chevin and Hospital (2008). Their work was based on a seminal article by Lande (1983). Lande’s model focuses on one locus of major effect on the trait and treats the remaining loci of minor effects as genetic background for this locus. It is assumed that heritable background variation is maintained at a constant amount by polygenic mutation and recombination (Lande 1975, 1983); also, the various loci that affect the trait are unlinked and there are no epistatic interactions. Chevin and Hospital (2008) used Lande’s model to infer the deterministic trajectory of a beneficial mutation that affects a quantitative trait in the presence of background genetic variability. They studied both directional and stabilizing selection and showed that fixation takes longer than in the classical one-locus model (i.e., when genetic variability in the background is absent). In the case of stabilizing selection their approach (based on Lande’s model) suggests that the occurrence of selective sweeps at quantitative trait loci (QTL) is expected to be very rare. In contrast to Chevin and Hospital (2008), the present study assumes an explicit number of loci that determine the trait, as was done by Bodmer and Felsenstein (1967), Karlin and Feldman (1970), and Bürger (2000, Chap. VI). Therefore, the assumption of constant variability in the genetic background is relaxed since the genetic background is modeled explicitly.
We analyze the evolution of the deterministic multilocus model and also its stochastic analog assuming a finite constant effective population size. The focus is on the properties of the trajectory of a new mutation at a certain locus (hereafter called the focal locus) that affects the trait under selection. We also examine the parameters (such as the recombination rate and the contribution of the alleles to the phenotype) that affect the fixation probability of the new mutation. Finally, conditioning on the trajectory we generate coalescent simulations and examine the properties of the genealogy and the associated polymorphism patterns. Results are presented for the classical two-locus two-allele model, but we also extend the analysis up to an eight-locus two-allele model.
Methods
The general model
We consider a diploid population of size N and a quantitative trait under selection. The quantitative trait is controlled by l diallelic loci with no epistatic interactions on the phenotype. There is no dominance of allelic effects on the trait (but there may be for their effects on fitness). The alleles at the locus i are labeled as
The population evolves forward in time from
Recursion equations of the multilocus model
As mentioned above, the two-locus two-allele model has been widely used and will serve here as a reference point. We are particularly interested in the work of Willensdorfer and Bürger (2003) who explore the equilibrium properties of the two-locus two-allele model for Gaussian selection under the assumption of a symmetric fitness function for which the double heterozygous genotype is optimal and the fitness values of the remaining eight genotypes are symmetric about the optimum (see also Table S1). The analysis of Willensdorfer and Bürger (2003) provides the existence and stability criteria for the equilibrium points of the model. The fitnesses of the nine possible genotypes are shown in Table S1B. Let x1, x2, x3, and x4 represent the frequencies of the gametes
Willensdorfer and Bürger (2003) parametrize the model so that the effect of alleles
While Equation 1 describes the evolution of two-locus two-allele models, for more loci the following classical equation (see Gimelfarb 1998 and references therein) describes the evolution of gametic frequencies,
Lowess procedure
To capture the effect of the parameters on the equilibria in the system, we used the Lowess or Loess (locally weighted scatterplot smoothing) function. Lowess is a method that fits lines to scatterplots (Cleveland 1979). Lowess fits low-degree (usually 1 or 2) polynomial curves to localized subsets of the data. Thus, a global function (and therefore a model) is not required, and great flexibility can be gained. Here, we fit Lowess functions to data that assume two values (binary data): class 0 or class 1.
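The idea of smoothing binary class labels can be illustrated with a minimal sketch. The code below is not the procedure used for the figures; it is a simplified local linear fit with tricube weights in the spirit of Cleveland (1979), without the robustness iterations of the full Lowess algorithm. The function names, the `frac` argument, and the toy data are assumptions introduced only for illustration.

```python
import math


def tricube(u):
    """Tricube kernel: (1 - |u|^3)^3 for |u| < 1, and 0 otherwise."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_linear_smooth(x, y, frac=0.5):
    """Fit a weighted straight line around every x value.

    Simplified Lowess-like smoother (assumption: no robustness iterations).
    `frac` is the fraction of points that defines each local neighborhood.
    """
    n = len(x)
    k = max(2, min(n, math.ceil(frac * n)))
    fitted = []
    for x0 in x:
        dist = [abs(xi - x0) for xi in x]
        h = sorted(dist)[k - 1]  # bandwidth: distance to the k-th nearest point
        if h > 0.0:
            w = [tricube(d / h) for d in dist]
        else:
            # Degenerate neighborhood (many ties): use only coincident points.
            w = [1.0 if d == 0.0 else 0.0 for d in dist]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0.0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Toy binary data: class 0 for small x, class 1 for large x.
x = [i / 19 for i in range(20)]
y = [0] * 10 + [1] * 10
curve = local_linear_smooth(x, y, frac=0.5)
```

For binary labels the smoothed value at a given parameter value can be read as the local proportion of class 1 trajectories. A local linear fit is not constrained to the interval [0, 1], so values slightly outside this interval may occur near sharp transitions.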
Coalescent simulations and SNP summary statistics
Assume a sample of k individuals from a present-day population (t = 0). Given the trajectory of the
Four statistics of coalescent trees have been used. First, h (the height of the coalescent tree) measures the scaled time from the present to the MRCA of the sample; second, l is the total length of the coalescent; third, two quantities,
Furthermore, we used SNP summary statistics to describe the polymorphism patterns in a present-day sample, as we move along the sequence alignment away from the
Results
Trajectories of the L11 allele at the focal locus L1
Two-locus two-allele model with symmetric fitness function:
The first goal of this analysis is to illustrate the effect of the parameters of Willensdorfer and Bürger’s (2003) model on the fixation of the focal
Deterministic model:
For the deterministic two-locus two-allele model with symmetric fitness matrix, we numerically solve the system of recursions described in Equation 1 and record the frequency of the
Fixation of the allele is possible and this fixation may occur fast. These trajectories are similar to the trajectories obtained from the classical selective sweep theory. There is, however, a subset of trajectories that remain polymorphic for the focal locus. Furthermore, there is a class of trajectories that shows nonmonotonic behavior. The frequency initially increases and then may decrease to some equilibrium value.
To construct a trajectory we draw uniformly a value from the six-dimensional space described in Table 1. Since the parameter values are drawn at random, trajectories may become extinct, reach a polymorphic equilibrium point, or fix, depending on the parameter values. After 10,000 generations we record the final frequency of the trajectory. The proportions of parameter values that lead to a polymorphic equilibrium with frequency in the intervals (0, 0.5) and (0.5, 1) are 0.113 and 0.029, respectively. Similarly, the proportions of trajectories that end at the frequencies 0, 0.5, and 1 are 0.401, 0.415, and 0.041, respectively. Thus, for the given parameter values the vast majority of trajectories either approach 0 (extinction) or remain polymorphic at frequency 0.5 (Figure S1A). Although it is not clear whether the frequencies reached after 10,000 generations represent equilibrium values, the fact that >40% of the trajectories remain polymorphic at frequency 0.5 is not surprising since the double heterozygote genotype is optimal for the symmetric fitness model.
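The procedure of iterating the deterministic model and recording the final frequency can be sketched as follows. This is a minimal illustration and not the code used in the study: it uses the standard textbook two-locus two-allele selection recursion with recombination, and it assumes a parametrization in which the two alleles at a locus contribute +a/2 and −a/2 (or +b/2 and −b/2) to the genotypic value and in which fitness is the Gaussian function exp(−s(G − optimum)²). The names `a`, `b`, `s`, `optimum`, `r`, and the tolerance used in `classify` are assumptions of this sketch and need not match the parametrization of Willensdorfer and Bürger (2003).

```python
import math

# Gamete order (labels are illustrative): A1B1, A1B2, A2B1, A2B2.
ETA = (1, -1, -1, 1)


def gametic_values(a, b):
    """Additive gametic values; assumed allelic contributions are +/- a/2 and +/- b/2."""
    return [(a + b) / 2, (a - b) / 2, (-a + b) / 2, (-a - b) / 2]


def fitness_matrix(a, b, optimum, s):
    """Gaussian fitness of the diploid formed by gametes i and j (no dominance, no epistasis)."""
    g = gametic_values(a, b)
    return [[math.exp(-s * (g[i] + g[j] - optimum) ** 2) for j in range(4)]
            for i in range(4)]


def next_generation(x, w, r):
    """One generation of selection followed by recombination at rate r."""
    marginal = [sum(w[i][j] * x[j] for j in range(4)) for i in range(4)]
    mean_w = sum(x[i] * marginal[i] for i in range(4))
    d = x[0] * x[3] - x[1] * x[2]  # linkage disequilibrium
    # w[0][3] is the fitness of the double heterozygote (equal to w[1][2] here).
    return [(x[i] * marginal[i] - ETA[i] * r * w[0][3] * d) / mean_w
            for i in range(4)]


def final_state(x0, a, b, optimum, s, r, generations=10_000):
    """Iterate the recursion; return the focal-allele frequency and the gamete frequencies."""
    w = fitness_matrix(a, b, optimum, s)
    x = list(x0)
    for _ in range(generations):
        x = next_generation(x, w, r)
    return x[0] + x[1], x


def classify(freq, tol=1e-3):
    """Assign a final frequency to a class (the tolerance is an arbitrary choice)."""
    if freq < tol:
        return "extinct"
    if freq > 1.0 - tol:
        return "fixed"
    return "polymorphic"


# Example: focal allele at frequency 0.01, second locus at 0.5, no initial
# linkage disequilibrium, and an optimum beyond the most extreme genotype.
x_start = [0.005, 0.005, 0.495, 0.495]
p_final, x_final = final_state(x_start, a=0.5, b=0.5, optimum=2.0, s=0.1,
                               r=0.5, generations=2000)
```

In this example the optimum lies outside the range of attainable genotypic values, so selection is effectively directional and the focal allele is expected to fix. Setting `optimum=0.0` places the double heterozygote at the optimum, which corresponds to the symmetric case under the assumed parametrization.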
To identify the factors that determine the fixation of the
Willensdorfer and Bürger (2003) proved that two conditions are required for the fixation or extinction of
In addition to
The impact of initial frequencies and allelic effects on determining the class of the trajectory for the comparison of the fixation class vs. the polymorphic class. Dark gray and black points depict trajectories that reach fixation, whereas light-colored points show trajectories that stay polymorphic. In dark gray , whereas in black
. In A and B, light-colored points are on the line y = 0, while dark gray and black points are on the line y = 1. In C, light gray points are represented as pluses (+). The focal allele is
. The curves in A and B represent the Lowess smoothing function for the data. There are two classes of trajectories: class 0 that denotes polymorphic equilibria and class 1 that represents fixation. As shown in A, for very small values of
the probability of a trajectory from the fixation class is small. In B, the initial frequency of
,
, shows nonmonotonic behavior: small and large values of
make the fixation of
possible. In C we can see how the contributions of the alleles interact. When
or
, then it is possible to obtain trajectories that reach fixation.
This finding may be explained as follows. Let the initial frequency of the
From this initial state, which is suboptimal, the population will move toward the optimal genotypes. Given that allele
Thus in this case there is a competition between alleles similar to the situation in other sweep models with multiple loci (e.g., Kirby and Stephan 1996). Given that the initial frequency of the
Furthermore, comparing the lowest frequencies for the
The parameters
Initial mean distance from the optimum and background genetic variance affect the probability of fixation for the allele. Class 0 represents extinction of
and class 1 represents fixation of
. Results are similar when class 0 represents the polymorphic equilibrium at frequency 0.5. (A) Small values of initial mean distance from the optimum disfavor fixation of the
allele because
is rarely beneficial. (B) Large values of initial background genetic variance disfavor fixation of
because it implies intermediate frequencies of
(and
), which has been shown (Figure 1B) to reduce the probability of fixation for
.
Background genetic variance at time
Stochastic model:
Next we study the behavior of the stochastic model when the fitness matrix is symmetric. The population size N = 10,000. The simulation parameters are similar to the deterministic two-locus two-allele model with symmetric fitness matrix. We use the average frequency of the last 500 generations,
The relation of six parameters to the class of the trajectory (fixed vs. polymorphic). The curve in each subfigure represents the Lowess smoothing function of the data. (A) The initial frequency of the allele. (B) The initial frequency of the
allele. (C) The parameter
, which defines the strength of selection. (D) The contribution of the allele
to the genotypic value. (E) The contribution of
to the genotypic value. (F) The parameter
. Black points represent class 1 (trajectories that result in the fixation of
), whereas gray points represent class 0 (trajectories that result in a polymorphic state for
).
The role of
Comparing the fixation class with the extinction class the following results have been obtained under the stochastic model. The roles of
Finally, it should be mentioned that symmetry itself is not solely responsible for the frequency of the trajectory classes we observed in our simulations. Obviously, if the model is symmetric but the fitness of the heterozygote is sub-optimal (this may occur when the genotypic values are
Two-locus two-allele model with general fitness function:
We relax the assumption of symmetry of the fitness matrix in the deterministic two-locus two-allele model using a general fitness scheme. The parameter space is given in Table 1. Essentially, the difference between this model and the symmetric fitness model is that there is no restriction on the relations between the contributions of the alleles. Thus, the effects
The shape of the trajectories in this model is similar to the symmetric fitness matrix model. The number of trajectories where the
An informative quantity for disentangling trajectories in which
Initial background genetic variance does not appear to have the same effect as in the symmetric model. In the symmetric model, we observed that the proportion of trajectories that reach fixation decreases as the initial genetic variance increases and that for large values of initial background genetic variance the proportion of trajectories that reach fixation diminishes (Figure 2B). In the general model, we observe only a slight decrease of the proportion of fixed trajectories as the initial background genetic variance increases.
The comparison between the symmetric and the general fitness model may be open to discussion because of the different sampling spaces for the allelic effects. In particular, for the two-locus two-allele model, we uniformly sample random values from
Biallelic models of up to eight loci with general fitness function:
We study the effect of the number of loci on the trajectory of a focal mutation at locus
We first analyzed the effect of the number of loci on the percentage of fixed trajectories in the deterministic model. For nonsymmetric fitness matrices, the maximum number of fixed trajectories occurs for two loci and then decreases as the number of loci increases (Figure 4). When the initial frequency of the
Effect of the number of loci on the proportion of trajectories that reach fixation for the deterministic model. The general fitness model has been studied. Parameter values used in the simulations are given in Table 1. In the legend box the number denotes the initial frequency of . For example, 0.001 illustrates the proportion of fixed trajectories for the general fitness matrix when the initial frequency is less than 0.001.
The effect of the number of loci on the percentage of fixed trajectories in the stochastic model is as follows. Due to drift the probability of a polymorphic equilibrium is reduced, and the population evolves mostly toward the absorbing states. For small initial frequencies of the
Effect of the number of loci on the frequency of trajectories that reach fixation for the stochastic model. Fixation is possible even for small initial frequencies of and the proportion of fixed trajectories decreases as the number of loci increases.
To examine how much the frequency of fixed trajectories is reduced by the presence of other loci under selection (beyond the effect of drift), we compare the proportion of fixed trajectories in multilocus models to a single-locus model with genotypes AA, Aa, aa, and constant selection coefficient
The ratio of the proportion of fixed trajectories/vs. the number of loci for the general model. The probability
is calculated as described in the main text. The initial frequency of
is 0.0001, and the
value is 100.
Under the general model, for more than two loci the effect of the initial background genetic variance (
The ratio of the effect of the focal allele on the phenotype to the mean initial trait value:
Another quantity of interest for multilocus models is the ratio of the effect of the focal allele to the mean initial trait value (henceforth denoted as ϕ). The results shown here refer to the deterministic model. ϕ is interesting because the mean initial phenotypic value may increase with the number of loci (results not shown). For the symmetric fitness model we observe the following patterns for ϕ: for the two-locus two-allele model the variance of ϕ is smaller for trajectories that reach fixation than trajectories that reach an equilibrium at frequency 0.5 or vanish (Figure S3A). The variance of ϕ is smaller for trajectories that reach equilibrium at a frequency in (0.5, 1) than trajectories that reach equilibrium at a frequency in (0, 0.5). Furthermore, ϕ is strictly negative for trajectories that fix and takes small absolute values. Negative ϕ implies that the signs of the initial mean trait value and the effect of
For the general fitness model, under the two-locus two-allele case the variance of ϕ is greater than for the symmetric two-locus two-allele model, and ϕ may be either positive or negative (Figure S4A). Thus, even if the effect of the focal locus on the phenotype is large compared to the initial mean phenotypic value, fixation of the focal allele might still occur. Furthermore, fixation of the focal allele is possible even when its effect is initially deleterious (i.e., ϕ is positive). Finally, trajectories that go to fixation or extinction have greater variance of ϕ than trajectories that stay polymorphic. For models with more than two loci, results are similar to the two-locus two-allele model (Figure S4B).
Coalescent simulations conditioning on the trajectory of the L11 allele
In this section we describe our coalescent simulations that were performed to obtain (i) the genealogies and (ii) the neutral polymorphism patterns in the neighborhood of the focal locus. The results are approximate because of two reasons. First, conditioning on the frequency of one allele implies that the coalescent rates of all genotypes that carry this allele are equal. However, in the case of multilocus models this is not true. For example, the coalescent rate of the
Given a trajectory, coalescent simulations require specifying the time point from which the backward process is considered. That means, the genealogies will be strikingly different if the backward process initiates 100 or 5000 generations after the onset of the
Backward simulations have been performed using either a modified version of the software mbs (Teshima and Innan 2009) or the msms software (Ewing and Hermisson 2010). Our mbs algorithm implements the infinite site model, in contrast to the original software, and it calculates and outputs statistics related to the coalescent trees, such as the height, the total length, and the balance of the coalescent. msms is more efficient. Both algorithms produce equivalent results. For the coalescent simulations we have used parameters related to human data. Assuming that the mutation rate
Summary statistics for the coalescent trees as a function of the distance from the locus. The solid line refers to the equilibrium frequency 1 (fixation), the dashed line refers to the equilibrium frequency in [0.9, 1), the dotted line refers to the frequency in [0.3, 0.4), and the thick gray solid line to neutral simulations with the same parameter values. The dash-dotted line presents the results for the nonmonotonic trajectories. Note that the results for the nonmonotonic trajectories overlap completely with the neutral curves. Each point is based on 2500 simulated trajectories. A coalescent tree was generated by conditioning on each trajectory. Trajectories are generated from the two-locus two-allele stochastic model assuming the multidimensional parameter space in Table S1. (A) The height of the tree is shown. (B) The total length of the coalescent. (C and D) The statistics and
, respectively.
Furthermore, we have used molecular population genetics summary statistics to describe the properties of the polymorphisms in the proximity of the
Tajima’s D (Tajima 1989) is negative over the whole region for frequencies >0.9. For fixed trajectories, Tajima’s D becomes less negative closely to the
Discussion
Overview
In this study, we explore selective sweeps in multilocus two-allele models of a quantitative trait. Selection works on the phenotype based on a Gaussian fitness function. The Gaussian function seems an appropriate choice for many quantitative traits (Endler 1986; Willensdorfer and Bürger 2003), because it naturally formalizes the concept of the evolution toward an optimum value. Furthermore, it is sufficiently flexible to allow for modeling both stabilizing and directional selection. Stabilizing selection is modeled by assuming that the optimal genotypic value is located between the extreme genotypic values that an individual may obtain. Directional selection can be modeled by assuming that the optimum is more extreme than the genotypic values that the individual may have. Therefore, the allele frequencies shift toward the direction of fixation of the most extreme genotype favored by selection.
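The distinction between the two selection regimes can be made concrete with a short sketch. It assumes, purely for illustration, that the alleles at locus i contribute +a_i/2 and −a_i/2 to the genotypic value, so that attainable genotypic values lie between −Σa_i and +Σa_i, and that fitness is exp(−s(G − optimum)²). The function names and the treatment of an optimum that coincides exactly with an extreme genotype are assumptions of this sketch.

```python
import math


def gaussian_fitness(g, optimum, s):
    """Gaussian fitness of genotypic value g; s is the assumed selection intensity."""
    return math.exp(-s * (g - optimum) ** 2)


def selection_regime(effects, optimum):
    """Classify the regime from the position of the optimum.

    effects: per-locus differences a_i between the two homozygotes' allelic
    contributions (assumed parametrization, see the lead-in).
    An optimum exactly at an extreme genotype is counted as directional here.
    """
    g_max = sum(abs(a) for a in effects)
    g_min = -g_max
    if g_min < optimum < g_max:
        return "stabilizing"
    return "directional"
```

For instance, `selection_regime([0.5, 0.5], 0.0)` places the optimum at the double heterozygote and is classified as stabilizing, whereas `selection_regime([0.5, 0.5], 2.0)` places it beyond the most extreme genotype and is classified as directional.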
Previous studies (Bodmer and Felsenstein 1967; Karlin and Feldman 1970) suggest that multiple equilibrium points exist in two-locus two-allele models with a Gaussian fitness function and provide conditions for their existence and stability. However, the trajectories of the alleles toward the equilibrium points have not been explored. This study focuses on the trajectory of an allele that is initially at low frequency and moves toward an equilibrium point.
An important result of our analysis shows that selective sweeps that initiate from a very low frequency of the
Relaxing the assumption of symmetric
Assuming an effective population size N = 10000 we also explore the effects of random genetic drift. Genetic drift increases the proportion of the trajectories that reach monomorphic states (Figure S1C). This is expected because genetic drift pushes the model toward its absorbing states. Therefore, selection needs to be sufficiently strong to maintain the polymorphic state of the trajectory. This is illustrated clearly in Figure 3, where the
When more than two loci are modeled, the proportion of trajectories that reach fixation decreases as the number of loci increases (Figures 4, 5, and 6). The proportion of trajectories that become extinct increases, whereas that of trajectories that remain polymorphic decreases. This is in agreement with the results of Bürger (2000) who shows that when the trait is determined by more than four loci the monomorphic equilibrium points become more likely.
Model limitations
We presented a study of selective sweeps in multilocus models based on numerical calculations and computer simulations. We studied the effects of recombination rates between loci, locus contributions to the phenotypic value, selection intensity
Furthermore, by using Lowess smoothing functions we study the effects of parameters independently of each other. Clearly, several parameters interact (e.g., initial allelic frequencies and
Additionally, we demonstrated that a significant percentage of trajectories may reach fixation under our simulation parameter values. This result might not hold for multilocus models in general, since we omit mutations during simulations and we study only a single trait.
A further limitation of our analysis might be that different simulated models assume different numbers of free parameters. For example, as mentioned above, we sample allelic effects in
Coalescent trees for trajectories that approach fixation are similar to coalescent trees of classical selective sweeps
Conditioning on the trajectory of the
Coalescent trees for trajectories that stay polymorphic at intermediate or low frequencies resemble neutrality
When the trajectories do not reach fixation, then a part or all signatures of a selective sweep become invisible, depending on the equilibrium frequency of the trajectory. For example, when the equilibrium frequency is between 0.9 and 1, then the height of the coalescent tree equals the neutral expectations, because ancestral alleles (
Depending on the parameter values, a large fraction of trajectories is maintained at some equilibrium value and does not reach fixation. For these trajectories analysis of incomplete sweeps (Sabeti et al. 2002; Voight et al. 2006; Tang et al. 2007) may be useful. There is, however, an essential difference between incomplete sweeps and sweeps in multilocus models that were studied in this article. Incomplete sweeps are on the way to fixation, whereas the sweeps studied here remain at equilibrium frequency. Therefore, the signatures of selection will be visible only in the cases in which the equilibrium frequency has been reached recently. If the trajectory has remained at the equilibrium level (either polymorphic or monomorphic for the focal allele) for too long, the signatures of selection will have faded away due to recombination.
Our results indicate that detection of selection from polymorphism patterns in multilocus models may be hard. When the focal allele fixes in the population, then the statistical tools that are used to detect sweeps in one-locus two-allele models may be useful (e.g., Kim and Stephan 2002; Nielsen et al. 2005; Pavlidis et al. 2010). This is also true for equilibrium points close to fixation. Even if the patterns appear to be different from those of fixed trajectories, the direction of perturbations is similar to the classical sweep models, and therefore the same statistical tools may be used. However, for smaller equilibrium frequencies some or all signatures of selection studied in this article disappear.
A hallmark of multilocus two-allele models is the class of nonmonotonic trajectories. This class of trajectories is absent from one-locus two-allele models. These trajectories quickly approach a certain frequency, but eventually they decline either to extinction or to some other equilibrium frequency. The difference between their maximum frequency and the equilibrium frequency may be quite large. In the simulated data sets, we observed differences even larger than 0.5. However, the polymorphism and coalescent patterns seem to be very similar to the neutral expectations. Thus, these trajectories may be completely invisible using the summary statistics studied in this article. Summarizing the results, it may be claimed that the statistical tools that have been developed to detect selective sweeps may detect only a small proportion of the multilocus selection cases, namely only those cases that result in fixed trajectories or equilibrium trajectories close to fixation. Tools that are used for detecting incomplete sweeps may be useful when the trajectory has reached its equilibrium frequency very recently. For trajectories that have reached their equilibrium frequency further in the past, we expect that recombination will destroy the signatures of selection. In fact, the results imply that positive or stabilizing selection may occur at a much higher rate than previous studies that analyze selective sweeps report (e.g., Li and Stephan 2006). However, the majority of the cases remain undetectable since both the coalescent trees (as summarized here) and the polymorphism summary statistics do not deviate from neutrality.
Comparison of the present study to Chevin and Hospital (2008)
To our knowledge, the only previous study of selective sweeps at QTL is that of Chevin and Hospital (2008). They assume an infinite number of unlinked and independent loci that control a trait. Moreover, they assume that the variability in the genetic background remains constant during the selective phase and that the effect of the focal locus on the trait value is small compared to the effect of the genetic background. These assumptions enable them to solve the trajectory of a new allele analytically for linear, exponential, and Gaussian fitness functions. Chevin and Hospital (2008) focus mainly on the trajectories that reach fixation, but they also study trajectories that initially increase and vanish eventually. Under their model polymorphic equilibria are also possible for certain initial conditions (Luis-Miguel Chevin, personal communication). The results of Chevin and Hospital (2008) indicate that trajectories of new alleles evolve slightly more slowly than classical selective sweeps and that, given that the allele will be fixed, selective sweeps look slightly older than the classical one-locus selective sweep.
In our study, we model the genetic basis of a quantitative trait explicitly, taking into account that a finite number of loci makes the model mathematically intractable. Therefore, computer simulations were employed to study the trajectory of a new beneficial allele. The contribution of alleles may be arbitrary as well as the recombination fraction between the loci. We provide information about the role of various parameters on the fixation of the trajectories, but we also study extensively the trajectories that remain polymorphic. The present study may thus be considered complementary to the study of Chevin and Hospital (2008) for finite multilocus models, providing information about the trajectories of new alleles and the polymorphism patterns generated by selective sweeps in multilocus models. Furthermore, our study complements the deterministic analyses of Willensdorfer and Bürger (2003) and Gimelfarb (1998) by including random genetic drift. We expect that the results of this study as well as those of the previous theoretical investigations will be essential for the development of software for the detection of selective sweeps in multilocus models.
Acknowledgments
We are very grateful to two anonymous reviewers and Luis-Miguel Chevin for their valuable suggestions. This work has been supported by grants from the Deutsche Forschungsgemeinschaft Research Unit 1078 to W.S. (Ste 325/12) and D.M. (Me 3134/3) and a grant from the Volkswagen-Foundation to P.P. (I/824234).
Footnotes
Communicating editor: L. M. Wahl
- Received February 14, 2012.
- Accepted June 10, 2012.
- Copyright © 2012 by the Genetics Society of America