## Abstract

Large populations may contain numerous simultaneously segregating polymorphisms subject to natural selection. Since selection acts on individuals whose fitness depends on many loci, different loci affect each other’s dynamics. This leads to stochastic fluctuations of allele frequencies above and beyond genetic drift—an effect known as *genetic draft*. Since recombination disrupts associations between alleles, draft is strong when recombination is rare. Here, we study a facultatively outcrossing population in a regime where the frequency of outcrossing and recombination, *r*, is small compared to the characteristic scale of fitness differences σ. In this regime, fit genotypes expand clonally, leading to large fluctuations in the number of recombinant offspring genotypes. The power law tail in the distribution of the latter makes it impossible to capture the dynamics of draft by an effective neutral model. Instead, we find that the fixation time of a neutral allele increases only slowly with the population size but depends sensitively on the ratio *r*/σ. The efficacy of selection is reduced dramatically and alleles behave “quasi-neutrally” even for *Ns*≫1, provided that |*s*| < *s*_{c}, where *s*_{c} depends strongly on *r*/σ, but only weakly on population size *N*. In addition, the anomalous fluctuations due to draft change the spectrum of (quasi)-neutral alleles from *f*(ν) ∼ ν^{−1}, corresponding to drift, to ∼ ν^{−2}. Finally, draft accelerates the rate of two-step adaptations through deleterious intermediates.

THE genetic diversity of a population is determined by mutation, selection, recombination, and genetic drift, *i.e.*, the stochasticity inherent in reproduction. Understanding how genetic diversity depends on these elements of evolutionary dynamics is central to population genetics, since it allows us to make inferences about the past history and to predict how rapidly populations can adapt.

Population genetic inference focuses on the diversity at putatively neutral sites and assumes that the history of these sites is described by the neutral “coalescent” (Kingman 1982). Coalescent theory models the genealogy of asexual organisms or nonrecombining segments of a genome by positing that lineages merge at random, backward in time, due to common ancestry. Under this assumption, the mean time to the most recent common ancestor, *T*_{C}, of the extant *N* individuals, is 2*N* generations. The coalescence timescale is very important, since the genetic diversity of the population is given by the number of mutations that occur in all lineages descending from the most recent common ancestor of the population. Genetic diversity is therefore controlled by *T*_{C} and hence, under the assumption of neutral evolution, proportional to *N*. [Coalescent theory has been extended to weak selection (Krone and Neuhauser 1997) and recombination (Hudson 1983; Griffiths and Marjoram 1996).]

However, the prediction that neutral genetic diversity is proportional to *N* is at odds with observations: Population sizes of different organisms differ by many orders of magnitude, while genetic variation among organisms is fairly constant (Lewontin 1974). To resolve this “paradox of variation”, Maynard Smith and Haigh (1974) suggested that selection acting on linked loci might reduce diversity at a neutral locus. Rapid fixation of a novel mutation at a linked locus will perturb the allele frequencies. These perturbations can bring alleles to fixation and, more generally, reduce the coalescence time and hence the average genetic diversity (Kaplan *et al.* 1989; Barton 1998; Gillespie 2001). Such “hitchhiking” of neutral alleles on linked selected loci will dominate over genetic drift in large populations. Since hitchhiking leads to larger perturbations for more closely linked loci, one expects genetic variation to correlate with the recombination rate, as is indeed observed in *Drosophila* (Begun and Aquadro 1992; Stephan and Mitchell 1992; Andolfatto and Przeworski 2001; Sella *et al.* 2009).

A related effect was described earlier by Hill and Robertson (1966), who studied the reduction in the fixation probability of a novel beneficial mutation because of selection acting at a linked locus. This effect is commonly known as Hill–Robertson interference (Felsenstein 1974). Hitchhiking and Hill–Robertson interference are different aspects of the same phenomenon, one focusing on the effects of linked selection on genetic diversity and the other on the efficacy of selection. While hitchhiking and Hill–Robertson effects are primarily associated with positive selection for novel alleles, purifying selection against deleterious mutations also affects genetic diversity. The elimination of (neutral) alleles linked deleterious mutations is known as background selection. The lower the recombination rates are, the larger is the target for linked deleterious mutations, resulting in stronger background selection (Charlesworth *et al.* 1993; Hudson and Kaplan 1995; Nordborg *et al.* 1996).

Most models used to study Hill–Robertson and hitchhiking effects between positively selected mutations consider only two loci. Deleterious mutations, however, are expected to be much more frequent, and background selection models typically consider many mutations with small deleterious effects. A systematic study of the effect of interference between many weakly selected sites in a mutation/selection/drift equilibrium was presented by McVean and Charlesworth (2000), who used computer simulations of a model of codon bias evolution. They showed that linkage-dependent interference between a large number of weakly selected sites has substantial effects on genetic diversity, fixation probability of mutations, and the degree of adaptation measured as the frequency of preferred codons. This and subsequent computational studies reinforced the understanding that the Hill–Roberson effect reduces the effectiveness of selection and made clear that a quantitative understanding of Hill–Robertson effects in multilocus systems requires an analysis that goes beyond two-locus models (Comeron and Kreitman 2002; Seger *et al.* 2010); see Comeron *et al.* (2008) for a recent review.

It is common to interpret the effect of linked selection in terms of increased variance in offspring number. In this interpretation, linked selection can be thought of as a stochastic force analogous to genetic drift and is often referred to as *genetic draft*—a term coined by Gillespie (2000). Following Hill and Robertson (1966) and Felsenstein (1974), the increased variance in offspring number is often captured by a reduction in the “effective population” size, *N*_{e}, in a neutral model (which means enhanced drift and accelerated coalescence). It has, however, been noted that a rescaled neutral model does not consistently explain all observables (Charlesworth *et al.* 1993; Braverman *et al.* 1995; Fay and Wu 2000; McVean and Charlesworth 2000; Barton and Etheridge 2004; Seger *et al.* 2010) and that different effective population sizes need to be defined depending on the question and timescale of interest (Ewens 2004; Karasov *et al.* 2010). Furthermore, the dependence of *N*_{e} on the actual population size and other relevant parameters is not understood (Wiehe and Stephan 1993; Gillespie 2000; Lynch 2007).

Here, we provide analytic results on the effect of draft in a stochastic multilocus evolution model. Instead of a mutation/selection equilibrium considered in McVean and Charlesworth (2000), our focus here is a continuously adapting and facultatively sexual population, like human immunodeficiency virus (HIV) in coevolution with the host’s immune system. Our model and its relation to the biology of HIV are described in more detail below. The scope of the model, however, extends beyond HIV and is equally applicable to scenarios where background selection is dominant. Many important and well-studied organisms such as influenza, yeast, and plants are facultatively sexual. Rice, for example, is a partly selfing species and strong selection has acted during its domestication (Caicedo *et al.* 2007). While dominance effects can render the selfing of diploid organisms more complicated than facultatively sexual propagation of haploid organisms (Charlesworth *et al.* 1991; Kelly and Williamson 2000), our analysis still provides a null model on top of which dominance effects can be investigated.

Using computer simulations of an adapting population, we first demonstrate how quantities such as the coalescence time, the fixation probability of beneficial or deleterious mutations, and the allele frequency spectra depend on the rate of outcrossing relative to selection. We also show that our *in silico* observations cannot be described by a neutral model with a reduced effective population size. This is because single genotypes can, through clonal expansion, give rise to a wildly fluctuating number of recombinant genotypes. The distribution is so broad that its variance diverges, which in turn makes an effectively neutral diffusion limit impossible. To provide an analytic understanding of the simulation results, we use a branching process model that allows us to study the stochastic dynamics of novel mutations (neutral, beneficial, and deleterious) as they spread through the population. We calculate fixation probabilities and the typical time to fixation, *T*_{fix} (and more generally, the probability of attaining *n* copies after time *T*), for a new mutant allele, making explicit the dependence on the rate of recombination, fitness variance, and the population size. An important consequence of genetic draft is a qualitatively different frequency spectrum of rare alleles, which we also calculate analytically. Finally, we show that empirical HIV allele frequency spectra are in agreement with our theoretical prediction, confirming the relevance of our model to the dynamics of HIV adaptation.

## Model

Our model is inspired by the intrapatient evolution of HIV. We first outline briefly the biology of HIV and then describe our computational model and the branching process approximation used to study the phenomena analytically.

### Intrapatient evolution of HIV

After successful infection, the virus proliferates quickly to levels of >10^{6} viral genomes per milliliter plasma. This acute phase lasts for several weeks, after which the immune system of the host reduces the number of viral genomes per milliliter plasma to ∼100−10^{4}. The following chronic infection can last for many years and is largely asymptomatic up to the onset of AIDS. However, even in the chronic phase of infection the total number of viruses produced per day is estimated to be ∼10^{10}, while the generation time is ∼2 days (Perelson *et al.* 1996; Markowitz *et al.* 2003). The viral population is subject to clearance by cytotoxic T-lymphocyte (CTL)–mediated cell death and antibody-mediated destruction of virus (Paul 2008), which puts the virus under constant selection to change its proteins. CTL epitopes are found throughout the HIV genome (Korber *et al.* 2007) and escape mutations are often selected for with coefficients of a few percent per generation (Asquith *et al.* 2006). Antibodies bind primarily to the products of the envelope gene. In response to antibody recognition, numerous escape mutants arise in the envelope gene and spread through the population driven by selection coefficients of 1% to a few percent (Williamson 2003; Neher and Leitner 2010). Additional, even stronger, selection pressure is put on the viral population by drug treatment, resulting in rapid emergence of resistance during suboptimal therapy (Larder *et al.* 1989; Hedskog *et al.* 2010). Several hundred mutations are implicated in drug resistance (Rhee *et al.* 2003).

HIV carries two copies of its RNA genome (10^{4} bp), from which one cDNA is produced and integrated into the host cell genome. Recombination in HIV occurs through frequent template switching of the reverse transcriptase in the process of cDNA (Levy *et al.* 2004) production. In other words, the two genomes of the “diploid” virion are combined to produce a “haploid” cDNA, from which all the viral proteins and the next generations genomes are produced. The rate of recombination is limited by co-infection of host cells with several viruses, which is necessary to produce a heterozygous virus (see Figure 1A). Estimates suggest an effective recombination rate of ≈1.5 × 10^{−5} per base per generation (Neher and Leitner 2010), corresponding to a co-infection rate ≤10%. Viruses therefore undergo clonal amplification most of the time, while different parts of the genome are only weakly linked in the event of “outcrossing” (heterozygote formation, followed by template switching).

### Computational model

On the basis of the discussion above our model must include the following elements: a large population, selection at many polymorphic loci, a constant supply of beneficial mutations, and facultative mating with substantial reassortment in the case of outcrossing. Models combining all of these elements have already been established (Rouzine and Coffin 2005, 2010; Neher *et al.* 2010) and a similar model is used here. In our simulation a (nearly) constant variance of fitness in the population, σ^{2}, is maintained by a constant supply of beneficial mutations. This results in a continuously adapting population, with a rate of adaptation given approximately by the variance in fitness. Of course, one does not expect constant selection to persist indefinitely (see Mustonen and Lässig 2010 for a discussion of evolution in changing environments): steady conditions on the timescale of fixation of a single allele would suffice for the validity of our analysis.

To simulate population dynamics with the ingredients described above, we implemented discrete time (Fisher–Wright) dynamics of individuals with haploid genomes and a large number of loci that additively define the (log-)fitness *x* of an individual. Each individual contributes a Poisson-distributed number of gametes to the next generation with mean where is the current mean fitness and α is adjusted to keep the population size approximately constant at *N*. To implement facultative mating, a fraction *r* of the gametes are paired at random, the alleles at all of their loci reassorted, and the two resulting recombinant offspring are placed into the next generation. The remaining 1 − *r* fraction of gametes is placed unchanged into the next generation. New beneficial mutations with effect size *s*_{0} (*Ns*_{0}≫1) are introduced into a randomly chosen individual at rate *NU*_{b}≫1. After equilibration, the fitness distribution in the population is approximately Gaussian with a nearly constant variance and steadily increasing mean fitness. Into this adapting population wave, we introduce additional mutations, neutral, deleterious, or beneficial, and track their dynamics. In particular, we record after what time these mutations reach certain frequency thresholds and the fitness of genotypes on which successful mutations originate. We also measure the allele frequency spectra and the cumulative number of individuals that carried a particular mutation before it goes extinct. A further, and more detailed, description of the implementation is given in *Appendix B*.

Our focus here is on rapidly adapting populations with many concurrent sweeps responsible for the fitness variance σ^{2}. HIV, as any other organism, also suffers from deleterious mutations that vastly outnumber the beneficial mutations. Deleterious mutations also contribute to fitness variation and increase σ^{2}. For the present investigation, it does not matter whether fitness variation is due to sweeping beneficial mutations, deleterious mutations subject to purifying selection, or a combination of the two. The two extreme scenarios are illustrated in Figure 1B. We shall see that the fate of novel mutations depends on how the fitness of clones changes with respect to the mean fitness of the population. It is irrelevant whether this fitness difference decreases because the mean fitness increases or because the fitness of the clone decreases due to deleterious mutations.

### Mathematical model

To understand the results of the computer simulations, we analyze analytically an idealized model, which describes the propagation of a new allele (under selection, drift, and occasional recombination onto a different genetic background) within a population with a fitness distribution approximated by a Gaussian traveling wave. This is justified if many beneficial mutations are sweeping through the population at the same time (which requires *NU*_{b}≫1 and is true for HIV) and is supported by our numerical simulations. Gaussian fitness distributions are expected in large asexual populations (Tsimring *et al.* 1996; Rouzine *et al.* 2003; Desai and Fisher 2007) and in the context of background selection [the deterministic equilibrium distribution in the case of multiplicative selection is Poisson (Haigh 1978), which is well approximated by a Gaussian in the limit of interest]. Recombination stabilizes the Gaussian form if outcrossing rates are much larger than the inverse of the coalescence times (Cohen *et al.* 2005; Rouzine and Coffin 2005); see supporting information, Figure S1. Selection moves this Gaussian distribution toward higher fitness at a rate *v* equal to the variance in fitness σ^{2} (Fisher’s “fundamental theorem” of natural selection).

#### Branching process approximation:

The fate of a novel allele is decided mainly at early times when the allele is rare and the finite population size constraint can be neglected. In this case the dynamics of the novel allele are governed by a stochastic branching process that in addition to the birth/death events also accounts for the recombination process that transfers the novel allele to different genetic backgrounds with different fitness *x*′ (Barton 1995a; Neher *et al.* 2010). Within the branching process approximation the probability of finding *n* copies of the allele (with effect *s* on fitness) anywhere in the population at time *T*, given that there were *k* copies on genotypes with fitness *x* at time *t*, obeys the equation(1)(see File S1), where is the birth rate, *D* = 1, the rate at which individuals die (used to set the unit of time), and *r* is the outcrossing rate. The first term describes the probability flux out of state (*k*, *t*). The second term corresponds to a birth, which happens with rate *kB* and in which case the final state (*n*, *T*) is reached with probability *p*(*n*, *T* | *k* + 1, *t*, *x*). Analogously after a death, the probability of reaching (*n*, *T*) is *p*(*n*, *T* | *k* − 1, *t*, *x*). The last term describes the process of outcrossing with the offspring fitness distribution parameterized by a recombination function *K _{xx}*

_{′}, which depends on the details of the model.

In a model of outcrossing where offspring are produced by mating two individuals at random and reassorting all of their genes, the recombination kernel *K _{xx}*

_{′}, after averaging over the mating partner, is given by(2)In this model, known as the

*infinitesimal model*, the distribution of the offspring fitness is centered around half the fitness of the parent carrying the focal allele, measured relative to the instantaneous mean fitness (Bulmer 1980; Barton 2009) (the model was referred to as the

*full recombination model*in Neher

*et al.*2010). (Note that the variance of offspring fitness relative to parental mean is σ

^{2}/2. The variance of 3σ

^{2}/4 in Equation 2 is the result of averaging over the mate.) We assume that individuals are polygamous;

*i.e.*, each offspring is the result of an independent mating event.

For most of the analytic calculations, we employ an even simpler recombination model, where offspring fitness is simply a random sample from the population, independent of the parents:(3)In this *communal recombination* model new offspring are assembled from the genetic diversity in the entire population; *i.e.*, the novel genotype is drawn from a product distribution with the current allele frequencies in the population (Barton 2009; Neher *et al.* 2010). Note, however, that this model does not lead to instantaneous elimination of correlations between loci (linkage disequilibrium), since only a fraction of the population outcrosses in a single generation. The infinitesimal and communal models have been shown to yield very similar results in Neher *et al.* (2010). We confirm this again by simulation of both models. Even though this model is a drastic idealization, it might resemble rather closely the recombination process in HIV. Since recombination in HIV depends on co-infection of T cells with several viruses, most recombination occurs in centers with high concentrations of virus (Jung *et al.* 2002). In these centers, lineages might undergo several successive outcrossing events, effectively producing a cloud of offspring that contain genetic material from a large number of parents.

## Results

### Simulation results

Figure 2 shows simulation results for the fixation time of neutral mutations and the fixation probability of beneficial and deleterious mutations for populations of different size and different outcrossing rates. At low *r*/σ, fixation times (Figure 2A) are reduced by orders of magnitude below the neutral expectation of 2*N* and reach the latter only for *r*/σ≫1, an effect already observed in Charlesworth *et al.* (1992). The scaling of the data in Figure 2A reveals that fixation times are not proportional to the population size at low *r*/σ, in which case the curves would lie on top of each other. Genetic draft has an equally dramatic effect on the efficacy of selection, which is shown in Figure 2B. The fixation probabilities of beneficial and deleterious mutations are only slightly perturbed from *N*^{−1} if *r*/σ < 1, even though |*Ns*| ranges up to 80. The fact that genetic draft reduces fixation times and the efficacy of selection is of course well known and it is customary to describe this effect by a neutral model with a reduced population size, *i.e.*, treating draft as if it were just an enhancement of genetic drift. This, however, is often not appropriate, as draft causes fluctuations of a very different nature from those of genetic drift. To illustrate the problem, we define 2*N*_{e} as the fixation time of neutral mutations *T*_{fix}. With this definition, a mutation with effect size *s* should fix with probability However, the observed fixation probability does not follow this expectation if *r* ≤ σ, as shown in Figure 3A (note the logarithmic scale).

Another striking feature where a neutral model with a reduced effective population size fails to capture the effects of draft is the allele frequency spectrum, shown in Figure 3B for various ratios of *r*/σ. For small *r*/σ, the frequency spectrum *f*(ν) of neutral alleles falls off much more rapidly than the prediction of the diffusion theory: it decays as ν^{−2} with frequency instead of ν^{−1}. This effect is often referred to as “excess of rare variants” Braverman *et al.* (1995), but should, in fact, more aptly be called a “lack of common variants”. Very similar dependencies of the fixation time, fixation probabilities, and allele frequency spectra on *r*/σ are observed if fitness variation is not due to multiple selective sweeps but due to purifying selection. Simulation results for such a background selection scenario are shown in Figure S2 and Figure S3.

To rationalize the observed behavior we solved analytically a simplified branching process model describing the dynamics of alleles in an adapting population; see *Mathematical model*. This branching process solution shows explicitly how observables depend on population parameters and elucidates the difference between draft and drift.

### Analytic results for the branching process model

The key to understanding the qualitatively different behavior at low *r*/σ compared to *r*/σ≫1 is the fact that genotypes with fitness *x* can expand clonally if *i.e.*, if their growth rate exceeds the rate at which they outcross. In contrast to the case of *r*/σ≫1, for *r*/σ ≤ 1 the condition for clonal expansion is fulfilled for a substantial fraction of the genotypes in a finite population. A genotype establishes with probability *x* − *r* [setting ] and is clonally amplified so that its copy number subsequently grows as(4)where *T* is the time since establishment (Desai and Fisher 2007). The term accounts for the increasing mean fitness, of the rest of the population, competing with the clone. After establishment, a fit genotype therefore gives rise to a clone whose size has a Gaussian profile in time, as illustrated in Figure 4A. During its lifetime, an individual clone spawns an average of recombinant genotypes. For *x* − *r*≫σ one finds which increases rapidly with increasing *x* − *r*. In the *Discussion*, we develop this intuition into a simplified model of the genealogy of clones. Figure 4B shows the copy number of the mutant allele on different genetic backgrounds as a function of time in the continuous-time branching process model. The observed stochastic trajectory shows the features embodied in the simplified effective model: an anomalously fit genotype gives rise to many recombinant offspring genotypes, resulting in large fluctuations in the copy number of the mutant allele.

The dynamics of the distribution *p*(*n*, *T* | *x*) of a mutant allele, given that it arose on a genotype with fitness *x* (*k* = 1, *t* = 0), can be described by the branching process defined in Equation 1. We proceed by analyzing the fixation probability, the typical time to fixation, and the allele frequency spectra for neutral, beneficial, and deleterious mutations for the case *r*/σ ≤ 1. In the opposite limit of rapid outcrossing, the effect of selection during the lifetime of a genotype is small and Equation 1 can be analyzed perturbatively in σ/*r*. This perturbation analysis was performed in Neher *et al.* (2010). It was shown that the fixation probabilities of beneficial mutations are reduced by a factor 1 − ασ^{2}/*r*^{2}, where α is a numerical factor of order one and depends on the details of the recombination model. This is consistent with the classical argument of Robertson (1961) showing that the cumulative effect of draft is proportional to the square of the degree of linkage (Santiago and Caballero 1998; Barton 2009). The analysis and derivation of the results are mainly relegated to *Appendix A*; here we present and discuss the results.

### Extinction and survival of neutral mutations

The fundamental quantity describing the stochastic dynamics of an allele is the distribution, *p*(*n*, *T*), of the number of copies that exist *T* generations after the founding mutation. Note that the mutation could have arisen on genotypes with different background fitness *x*, and *p*(*n*, *T*) is the average over all possible *x* at time *T* = 0. Of particular importance is the survival probability *P*_{S}(*T*) = 1 − *p*(0, *T*). Figure 5 shows *P*_{S}(*T*) for a neutral mutation for different outcrossing rates in the communal recombination model, obtained by a numerical solution of Equation A1. Initially, *P*_{S}(*T*) decays as (1 + *T*)^{−1}, regardless of the outcrossing rate. This is the same behavior seen in homogenous or completely neutral populations. Selection on the heterogenous backgrounds starts influencing the fate of the allele at times *T* > σ^{−1}, *i.e.*, the reciprocal of the typical selection differentials in the population. After this initial transient, the survival probability depends strongly on the relative magnitude of the outcrossing rate and the fitness variance in the population, decaying faster for smaller *r*/σ. We show in *Appendix A* (Equation A8 and *Short-time solution* section), that the time dependence of the survival probability is given by(5)Consistent with our previous argument, we find that for *r*≫σ neutral mutations are unaffected by selection on the background, reproducing the familiar result *P*_{S}(*T*)∼ *T*^{−1}. However, a strong deviation from this neutral behavior occurs as soon as *r*/σ approaches 1. In the regime with *r* < σ, the ratio *r*/σ enters the dynamics of *P*_{S}(*T*) as a prefactor of log^{2} *T* in the exponent, resulting in faster extinction at lower *r*. (Yet another regime appears at even smaller *r* for and is discussed in File S1). Equation 5 was derived using the communal recombination model. In Figure A1 we show, via simulation of the branching process, that the infinitesimal model exhibits the same asymptotic dependence on parameters.

### Dynamics of surviving alleles and fixation times

Given that a new allele is not lost after *T* generations, what is the mean number of its copies? The answer to this question is surprisingly simple and will inform us about the scaling of fixation times of neutral alleles. The unconditional expected number of copies after time *T* is obtained by multiplying Equation 1 by *n*, summing over *n*, and averaging over the distribution of *x* at *T* = 0. One finds the simple expression(6)which reduces to in the neutral case. Here, denotes the average over *p*(*n*, *T*), while the overbar denotes the average over the fitness of the founding genotype. Hence, the expected number of copies depends neither on recombination nor on the speed of adaptation. It depends only on the intrinsic fitness effect of the novel allele. However, is a product of the survival probability, times the average *n* conditioned on survival. For neutral mutations, the mean *n* conditioned on survival is therefore the reciprocal of the survival probability, and when *P*_{S}(*T*) is of order *N*^{−1}, a nonextinct allele has reached copy numbers of ∼*N* and has essentially fixed. The first passage time, *T*_{fp}, to allele frequency ν, *i.e.*, to *νN* copies, is given asymptotically by(7)We see that the time to reach allele frequency ν scales very differently with *N* in the two regimes. Instead of diffusive dynamics with diffusion constant ∼*N*^{−1}, the rapid turnover of individuals by selection at *r* < σ results in large fluctuations, which can rapidly propel a mutation to large numbers. Simulation results confirming the validity of the argument that inverse survival probability can be used as a proxy for typical copy numbers, as well as the adequacy of modeling allele frequency dynamics by a branching process, are shown in Figure A1.

A different approach to calculate fixation times in a model of continuous adaptation with facultative outcrossing has been developed in Rouzine and Coffin (2007, 2010). Rouzine and Coffin use a coalescence approach based on the clonal structure of the population to calculate fixation times and obtain the same exponential dependence on

### Allele frequency spectra

In addition to changing the dependence of the fixation time on the population parameters, draft also results in a qualitatively different allele frequency spectrum. This has been noted previously (Braverman *et al.* 1995; Fay and Wu 2000; Gillespie 2000; McVean and Charlesworth 2000) but to our knowledge never calculated explicitly. Figure 3B shows the distribution of derived neutral allele frequencies ν = *n*/*N* at different outcrossing rates measured in individual-based computer simulations. At high outcrossing rates, we observe *f*(ν) ∼ ν^{−1} as expected for mutations subject to random drift alone. However, when draft is dominant, *f*(ν) decays more steeply as(8)This steep decay persists up to a crossover frequency that depends on *r*/σ and the population size. Beyond this crossover frequency, the finite population size constraint becomes important. The calculation of the allele frequency spectra is detailed in *Appendix A*, *Distribution of the allele copy number* and the result for the copy number distribution is given in Equation A22.

The *f*(ν) ∼ ν^{−2} behavior can be understood heuristically by considering the sizes of clones containing the allele. Let us consider the asexual case with *r* = 0. To calculate the copy number distribution of a novel allele, *T* generations after it originated, we have to average over the initial fitness *x* of the genotype that it could have arisen on. Given that the allele arose on a genetic background with fitness *x*, its average number after *T* generations is given by Equation 4. Since we know that the distribution of *x* is we can calculate the expected distribution of *n* after time *T* within the deterministic approximation of Equation 4. One finds that The allele frequency spectrum is an average over alleles of different age, *T*, and the above expression therefore has to be integrated over *T*, which yields *f*(*n*) ∼ *n*^{−2}. In a facultatively sexual population with *r* < σ, allele frequency spectra are also driven by the clonal expansion of individual genotypes. However, in contrast to the asexual case, additional clones are constantly seeded through outcrossing so that the novel allele resides in many different clones of different ages, similar to the average over asexual clones of different ages. For this reason, both the copy number spectrum of individual alleles and the frequency spectrum averaged over alleles of different ages have a leading behavior *p*(*n*, *T*) ∼ *n*^{−2} and *f*(ν) ∼ ν^{−2}. This reasoning is confirmed by the branching process calculation given in *Appendix A*.

In addition to a much steeper spectrum at low frequencies, the spectra of neutral and deleterious mutations are nonmonotonic and increase again at frequencies near one (Fay and Wu 2000). On a linear chromosome with isolated sweeps that affect tightly linked neutral variants, this effect is easy to understand: A tightly linked variant is brought close to fixation, resulting in a pile-up of derived variants at frequencies close to one. In our case with many unlinked sweeps in a facultatively mating population the effect is less intuitive, but ultimately of similar origin.

### Selection efficiency and quasi-neutrality

The rapid and erratic dynamics of alleles incurred by selection on other loci affect not only the time to fixation, but also the ability of selection to prune deleterious mutations and fix beneficial mutations. Mutations behave essentially neutrally, as long as selection on the intrinsic effect of the allele is outweighed by fluctuations. A beneficial mutation has established when it has risen to sufficient numbers that future extinction is improbable and the allele goes to fixation deterministically. Without draft, a beneficial mutation is established when it has reached copy numbers of ∼*s*^{−1}. With additional fluctuations through draft, however, the allele is more likely to go extinct and has to rise to much higher numbers to establish. Similarly, draft can propel deleterious mutations to copy numbers they would never reach under drift alone. This reduction of the efficacy of selection is a hallmark of Hill–Robertson interference (Hill and Robertson 1966) and was mainly studied in obligate sexuals (Barton 1994; McVean and Charlesworth 2000). Here, we explore these effects in facultative outcrossers.

Since an allele that has survived longer is likely to be present in more copies, the influence of stochastic fluctuations and hence the rate at which the mutant allele goes extinct decrease over time. This is explicit in the time derivative of the log survival probability, which in the regime *r* < σ is given by(9)(see *Appendix A*, *Beneficial and deleterious mutations*). The first term *s* accounts for the deterministic bias due to the intrinsic effect of the allele, which causes accelerated extinction if *s* < 0 and preferential survival if *s* > 0 (compare with Figure 5B). The second term accounts for the stochastic forces on the allele, whose importance decreases with *T*. The time after which the allele’s fate is dominated by selection on its intrinsic fitness can be roughly determined by equating the two terms in Equation 9. In the regime of intermediate outcrossing rates *r* < σ, we find this crossover time to be *T** ≈ σ^{2}log(*r*/*s*)/*r*^{2}*s*. If fixation of a neutral mutation occurs before that time, selection has little effect and the fate of the mutation is dominated by fluctuations until fixation. Hence, we find a window of quasi-neutrality for mutations with selection coefficient smaller than *s*_{c}(10)within which the fixation probability of a mutation is *p*_{fix} ∼ *N*^{−1}. Beneficial mutations with effects larger than *s*_{c} have a chance of fixation as already found in Neher *et al.* (2010). Propagation of deleterious mutations with |*s*| < *s*_{c} is also dominated by fluctuations, resulting in allele frequency spectra similar to neutral mutations. Only for |*s*| > *s*_{c} does the allele frequency spectrum *f*(ν) decay exponentially as expected for deleterious mutations. Fixation of deleterious mutations is exponentially suppressed with Simulation results, qualitatively showing this behavior, are presented in Figures 2B and 3.

A related phenomenon is observed in the context of a linear chromosome in the limit where only one sweep affects a polymorphism at any given time. Barton (1994) showed that a beneficial allele that is subject to many *sequential*, weakly linked interfering sweeps has little chance of fixation if its selection coefficient is below a critical value. The critical value reported by Barton is *s*_{c} ∼ σ^{2}/*R*, where σ^{2}/*R* is the fitness variance of sweeps per map length. In this approximation, the frequency of the focal allele is repeatedly reduced by a factor that depends on the degree of linkage and the strength of the interfering sweeps: this results in an effective reduction in the growth rate of the frequency of the focal allele to *s* − *s*_{c}. In the case of numerous *concurrent* sweeps that we consider, the situation is different because the focal allele is spread over many different backgrounds subject to selection. The older the allele is, the more likely it is to be spread over many backgrounds, attenuating the effect of interference. Hence we find an additional dependence of *s*_{c} on the typical fixation timescale and the extent of recombination that occurs during this time.

### Stochastic tunneling and complex adaptations

Consider an adaptation process where two mutations at the same locus are needed to confer a fitness advantage, but either mutation in isolation is neutral or deleterious. The rate of this process is given by the probability per unit time that the second mutation occurs in an individual already carrying the first mutation. More generally, the probability that the double-mutant genotype never existed up to a time *T* is(11)where μ is the mutation rate and *n _{i}*(

*t*) is the number of individuals carrying mutation

*i*. The deleterious single mutants form a transient subpopulation in the large background population, which we refer to as a mutant “bubble”. The important quantity determining the rate of such secondary events is the distribution of the cumulative number of single mutants

*i.e.*, the integrated bubble size. For homogeneous background populations, where genetic drift dominates the dynamics of neutral alleles, this quantity has been calculated by Iwasa

*et al.*(2004), Weissman

*et al.*(2009, 2010), and Lynch (2010). In most cases of interest the rate of secondary events is small, so that the probability of tunneling in a single bubble is low. Hence tunneling takes longer than the lifetime of a typical bubble and only the long time limit of

*w*(

*T*) is important. In analogy to the results presented above, we calculate the moment-generating function Φ(

*z*) = of the bubble size

*w*, where denotes the average over the bubble size distribution

*P*(

*w*). The calculation given in

*Appendix A*,

*The double mutation probability:*

*stochastic tunneling*yields for the Φ(

*z*)(12)The result for strongly deleterious mutations is the same as in homogenous populations (Weissman

*et al.*2009). The crossover between the two regimes occurs essentially at the quasi-neutrality threshold, with

*N*replaced by

*z*

^{−1}. The generating function Φ(

*z*), which is also the Laplace transform of

*P*(

*w*), is of immediate practical relevance, since it is exactly the probability that a secondary event with rate

*z*does not occur within a single bubble. From the Laplace transform, we calculate the distribution of

*P*(

*w*), which in the draft-dominated regime we find to be(13)The latter has to be compared to the result for drift alone:

*P*(

*w*) ∼

*w*

^{−3/2}. Both of these behaviors are clearly seen in the simulation results shown in Figure 6. Draft causes large neutral bubbles to become rarer, but also dramatically enhances the probability of large bubbles for deleterious (quasi-neutral) alleles.

## Discussion

The importance of hitchhiking and draft as a source of fluctuations was first discussed by Maynard Smith and Haigh (1974) as a possible explanation for the apparent insensitivity of genetic variation on the census population size, dubbed the paradox of variation (Lewontin 1974). Draft is expected to dominate over genetic drift in very large populations and hence is the factor determining the level of genetic diversity and the efficacy of selection (Barton 1995b; Gillespie 2001). It was shown to have appreciable effects on genetic diversity in *Drosophila* (Sella *et al.* 2009). Draft is expected to be even more pronounced in facultatively outcrossing organisms, where alleles stay associated with each other for longer periods of time. Such organisms include many plants, fungi, nematodes, and viruses, including the human pathogens influenza and HIV.

We have studied draft in continuously adapting facultatively outcrossing populations, using direct computer simulation of a model inspired by intrapatient evolution of HIV. We explain the simulation results by analyzing a simplified branching process model, which is amenable to analytic calculations that elucidate how fixation time, the efficiency of selection, and allele frequency spectra depend on fundamental parameters of the population dynamics. The simplification that makes these analytic results possible is a “large number” limit: due to the large number of concurrent sweeps, the distribution of fitness in the population assumes a Gaussian form and translates toward higher fitness with constant velocity. If the outcrossing rate, *r*, is comparable to or smaller than typical fitness differences, σ, between individuals in the population, clonal expansion of genotypes results in a heavy-tailed distribution of recombinant offspring with dramatic effects. In particular, we find that alleles with selection coefficients smaller than behave *quasi-neutrally*: their fate is dominated by fluctuations and is nearly independent of the intrinsic allele fitness *s*. Yet because selection acting on the allele is masked not by random drift, but by *draft*—transient associations with chromosomal “backgrounds” with different fitness—the behavior is quantitatively different from true neutrality. The conventional dynamics dominated by genetic drift are realized only in the limit of no linkage or complete neutrality (including the background). To emphasize this distinction we refer to the draft-dominated regime as *quasi-neutrality*.

In addition to a reduced efficacy of selection, draft reduces fixation and coalescence times, which instead of growing linearly with the population size, increase much more slowly. The probability that a given neutral locus is polymorphic with an allele at intermediate frequency is roughly given by the product of the neutral mutation rate, μ, and the coalescence time, *T*_{C}. Hence we expect the neutral diversity Details of the expected heterozygosity, however, depend on the allele frequency spectrum. The latter is also the most informative quantity that can be evaluated from a static population snapshot. We have shown that in our model, allele frequency spectra decay as which is much steeper than ν^{−1} predicted by drift. As a consequence, in contrast to the classic neutral theory (Ewens 2004), one should expect the number of polymorphic loci to increase almost linearly (rather than logarithmically) with the sample size as one samples the population deeper and deeper.

### Why draft is different from drift

Draft and hitchhiking are often accounted for by an effective neutral model with reduced population size *N*_{e} (Hill and Robertson 1966; Felsenstein 1974). This is possible if draft does not change the offspring distribution qualitatively but only increases its variance. The central limit theorem then guarantees the existence of a meaningful diffusion limit for the allele frequency dynamics with the diffusion constant that could, if one were so inclined, be interpreted as the inverse effective population size. This diffusion limit, however, is not possible in our case since the number of recombinant genotypes, ξ, that descend from a single clone has a very broad distribution with power-law tails and diverging variance. To see this let us consider a genotype seeded by recombination at fitness *x*≫σ above the mean, which establishes with probability *x* − *r*. After establishment, its copy number *n*(*t*) initially grows almost deterministically; see Equation 4. Over time, the mean fitness increases such that the growth rate eventually turns negative and the genotype goes extinct. During its lifetime, the clone produces recombinant offspring. Since the initial fitness *x* of the genotype is drawn from a Gaussian distribution the mean and second moment of the distribution of the number of recombinant offspring are given by and respectively. While the mean exists, increases so rapidly with *x* that the second moment and hence the variance are divergent. Of course, the finite population cuts off the offspring distribution and strictly speaking prevents the divergence of the variance. However, this cutoff is on the order of the population size and is irrelevant for the dynamics of rare alleles and the existence of a diffusion limit. Below this cutoff, one finds that the distribution of ξ behaves asymptotically as(14)The implications of this heavy-tailed distribution of genotypes that descend from a single clone are best understood by slight abstraction of the model. Assume for a moment population dynamics where clones are seeded simultaneously and their recombinant offspring start growing only after the last clone of the previous round has died, as illustrated in Figure 4A. This corresponds to an effective discrete generation scheme, only that one generation corresponds to the rise and fall of a genotype, rather than a single individual. Each clone spawns a Poisson-distributed number of recombinant genotypes with mean ξ, where the ξ are drawn from the heavy-tailed distribution *P*(ξ). The generating function of the distribution of the number of recombinant offspring is found to be Starting from a single genotype, we calculate the distribution *P _{n}*(

*m*) of the number of genotypes carrying the mutant allele after

*n*effective generations, which has the generating function(15)where and ° denotes functional composition;

*i.e.*,

*f*°

*g*(

*x*) =

*f*(

*g*(

*x*)). From this result, we derive a difference equation for Φ

*(λ),(16)where it is assumed that Φ*

_{n}*(λ) is small. This difference equation is the discrete analog of Equation A6, which is derived in*

_{n}*Appendix A*for the continuous-time model and from which most of our results follow. Thus the dynamics of mutations in a rapidly adapting population can be viewed as population genetics of genotypes, rather than of individuals, with a power-law tailed offspring distribution with diverging variance. This feature makes the description by an effective neutral Fisher–Wright model impossible, since no diffusion limit exists when increments have such a long-tailed distribution. A similar effect occurs in Gillespie’s pseudo-hitchhiking model (Gillespie 2000), where a hitchhiking event can bring an allele to instantaneous fixation.

Coalescent processes with broad and skewed offspring distributions are an active field of research; see, for example, Mohle and Sagitov (2001) and Schweinsberg (2003). It is known that broad offspring distributions can result in simultaneous mergers of multiple lineages and have dramatic effects on the time to the most recent common ancestor, which can increase sublinearly with the population size. Whether and how our interpretation of the adapting population in terms of a coarse-grained coalescent corresponds to a known universality class of coalescent models remains to be seen. Coalescent models with broad offspring distributions have been applied to diversity data of pacific oyster (Eldon and Wakeley 2006) and our results suggest that they might apply to a larger class of phenomena.

### Adaptation *vs.* purifying selection

Discussions of Hill–Robertson interference typically focus on the effect of either linked beneficial or deleterious alleles: the former is referred to as hitchhiking (Maynard Smith and Haigh 1974; Kaplan *et al.* 1989) and the latter as “background selection” (Charlesworth *et al.* 1993; Nordborg *et al.* 1996). Our approach addresses the effect of variation in genetic background fitness independent of its origin. Thus, while our analysis was set up in the context of continuous adaptation driven by many simultaneous selective sweeps, it is equally applicable to fitness variation dominated by deleterious mutations. The balance between deleterious mutations and selection in large populations gives rise to a steady Poisson distribution of fitness (Haigh 1978; Charlesworth *et al.* 1993). If the deleterious mutation rate *U*_{d} is much larger than the selection coefficient *s*_{0} of mutations, the Poisson distribution is very close to Gaussian, as in the case of the adapting populations considered here. While the origin of fitness variation is very different in these two cases, the effect on the fate of mutations is similar. In the adaptation scenario, a genotype carrying the mutations in question stays at a certain point along the fitness axis while the mean fitness is increasing. When deleterious mutations dominate, the mean fitness is constant and set by a mutation–selection balance, while the fitness of asexual offspring of a particular genotype decreases due to accumulating weakly deleterious mutations. Since the dynamics are determined by fitness relative to mean fitness, there is little difference between these scenarios from the point of view of the focal mutation. We have simulated scenarios where fitness variation is due to purifying selection and found that the stochastic dynamics of novel mutations in a purifying selection scenario are very similar to the case where fitness variation is due to sweeping mutations. The analogs of Figures 2 and 3 for background selection are shown in Figure S2 and Figure S3.

In the asexual case, the most relevant quantity for background selection is the size of the least-loaded class given by , from which all future individuals descend and within which the dynamics are essentially neutral (Haigh 1978; Charlesworth *et al.* 1993). With recombination, beneficial and deleterious mutations can decouple from each other and individuals in the bulk of the distribution have a chance of contributing to the future population, albeit with a smaller and smaller chance as the distance from the fittest class increases. In the limit of rapid recombination, this effect can be described by a reduction in the effective population, which has been studied by Hudson and Kaplan (1995) and Nordborg *et al.* (1996). Our results imply that background selection changes the population genetics more dramatically when the outcrossing rate is small compared to the standard deviation in fitness.

In any real-world scenario, there will be contributions to fitness variation from beneficial as well as deleterious mutations. In fact, in HIV the fitness variation due to deleterious mutations is expected to be substantial: with a mutation rate of ∼0.2 per genome replication (Mansky and Temin 1995) and an average effect size of a mutation of *s*_{0} = − 0.01, we have . The outcrossing rate is of the same order of magnitude (Neher and Leitner 2010), so our results are expected to apply for deleterious mutations alone.

### Sequential *vs.* multiple sweeps

To illustrate the applicability of isolated *vs.* multiple-sweep regimes of selection we compare sweep frequencies in *Drosophila* and in HIV. Genetic divergence between different *Drosophila* species suggests a rate of amino acid substitution of about one every 200 generations (see Sella *et al.* 2009 for a review). While these estimates come with a rather large statistical and methodological uncertainty, they nevertheless indicate that sweeps are frequent, but do not interfere. Each sweep “occupies” a stretch of ∼*s*/ρ nucleotides (ρ is the recombination rate per nucleotide) on a chromosome for ∼*s*^{−1} generations. Since they come in at a rate of 0.005 per generation spread over the total map length, they should on average be far apart: expect on average <1% of genome length to be subject to draft at any given time. However, sites under weak selection might still cause substantial Hill–Robertson interference (McVean and Charlesworth 2000).

On the other hand, the multiple-sweep regime is likely to be relevant to organisms with a facultative sexual life cycle. In particular, pathogens like HIV are under constant selection pressure, resulting in a high rate of selective sweeps. Selection in HIV evolution is best characterized in the *pol* gene, the target of most antiretroviral drugs, and in the *env* gene, the target of neutralizing antibodies. On the order of 100 codons are implicated in drug resistance (Rhee *et al.* 2003; Chen *et al.* 2004) and several such mutations compete and sweep during the evolution of drug resistance in a single patient. The *env* gene frequently builds up a nucleotide diversity of 3–4% (Shankarappa *et al.* 1999), with frequent adaptive amino acid substitutions (Williamson 2003; Neher and Leitner 2010). The effective recombination rate is ∼10^{−5} per base and generation (Levy *et al.* 2004; Neher and Leitner 2010). With typical selection strength of a few percent per generation, the characteristic length of the segment affected by the sweep is ∼1 kb, which is comparable to the size of the evolving genes: hence the dynamics are in the multiple-sweep regime. While *pol* and *env* genes constitute only a fraction of the HIV genome, CTL epitopes are found throughout the HIV genome (Korber *et al.* 2007) so that even more sweeps associated with CTL escapes are expected (Asquith *et al.* 2006). In a recent study, Hedskog *et al.* (2010) characterized the evolution of the *pol* gene of HIV in six longitudinally sampled patients during antiretroviral treatment. Using their data, we measured the site frequency spectrum of derived alleles, shown in Figure 7. The allele frequency spectrum is indeed much steeper than expected for neutral evolution and compatible with ν^{−2}, rather than the neutral expectation of ν^{−1}. Furthermore, there is little difference between the spectra of silent and nonsynonymous mutations, consistent with our notion of quasi-neutrality. Note, however, that steep allele frequency spectra proportional to ν^{−2} are also expected in expanding populations. While the sequences used in Figure 7 are from chronically infected patients, changes in drug therapy resulted in shrinking and expanding population sizes and the effects of this expansion and draft cannot be disentangled at present.

In a recent study, Rouzine and Coffin (2010) present an analysis of adaptation of HIV, using a model similar to the one used here and in Neher *et al.* (2010). Rouzine and Coffin study the problem of selection on standing genetic variation and in addition to the speed of adaptation, they compute the rate of coalescence, which in their model is controlled by the probability that two individuals in the population are genetically identical, *i.e.*, are part of the same clone. In agreement with our result in Equation 7, they find that the coalescence rate is where *x*_{max} is the fitness of the fittest individuals in the population (relative to ). The quantity *x*_{max} corresponds to in our Equation 7. Furthermore, Rouzine and Coffin (2010) study how benefical alleles at low frequency can be lost due to the lack of recombination. Our work complements Rouzine and Coffin (2010), as we study the stochastic dynamics of new alleles arising from mutations, rather than the evolution of the fitness distribution, given standing variation. On the other hand, Rouzine and Coffin (2010) account for correlations between loci by allowing the distribution of recombinants to be broader than the population distribution (Rouzine and Coffin 2005)—an effect absent in our model. In the range of recombination rates where our results apply, the latter deviations are small.

Our analytic results are derived using a branching process approximation that assumes that the novel allele is a small fraction of the total population and the finiteness of the population does not yet affect its dynamics. The influence of the finite population size is apparent in Figure 3B, causing deviations from the branching process predictions at large frequencies. The branching process approach can be complemented by an effective theory for macroscopic frequencies, similar in spirit to diffusion theory for random drift. This theory, however, has to account for the very broad distribution of clone sizes. Throughout the article, we have assumed that the speed of adaptation is identical to the variance in fitness; *i.e.*, we have assumed that effects on fitness of different mutations are purely additive. If interactions between mutations contribute to fitness, the variance in fitness tends to be larger than the speed of adaptation, since coadapted combinations are destroyed by recombination. Including genetic interactions within our framework involves decoupling *v* and σ, as well as changing the recombination functions. Genetic interaction might increase the importance of draft significantly, as interactions have a tendency to result in clonal population structures and couple the loci along the genome beyond physical linkage (Neher and Shraiman 2009). Another simplification we have made is to assume that all loci segregate independently in an outcrossing event. Our analysis can be extended to account for linear chromosomes by considering a hierarchy of recombination rates for chromosomal segments at different distances. These and other extensions are interesting projects, which we leave for future work.

Several large-scale resequencing projects are underway with the goal to characterize genetic diversity in populations of humans, *Drosophila*, and *Arabidopsis* at great depth. These upcoming data will allow us to measure allele frequency spectra to much greater depth, similar to the example from HIV shown in Figure 7, and reveal the mechanisms shaping genetic diversity in natural populations.

## Acknowledgments

The authors acknowledge stimulating discussions with Daniel Fisher and Michael Lynch, thank Sally Otto and Nick Barton for a critical reading of the manuscript, and thank Jan Albert for providing the HIV polymorphism data prior to publication. This research was supported by the National Science Foundation under grant PHY05-51164 (to R.A.N. and B.I.S.), by National Institutes of Health grant R01 GM-086793 (to B.I.S.), and by the Harvey L. Karp Discovery Award (to R.A.N.).

## Appendix A

### Derivation of the results

To analyze the dynamics of the probability distribution, it is useful to consider the generating function . From Equation 1, one obtains (see File S1) the following equation for ,(A1)with boundary condition ϕ(λ, *T*, *T*, *x*) = 1 − λ. The survival probability is given by evaluating ϕ(λ, *T*, *t*, *x*) at λ = 0, while the moments and asymptotic behavior of *p*(*n*, *T*, *t*) are determined by the behavior of ϕ(λ, *T*, *t*, *x*) in the vicinity of λ = 1. We have to solve for ϕ(λ, *T*, *t*, *x*) in these two limits, which show up repeatedly below. It is convenient to remove the explicit dependence on the initial condition and the velocity *v* = σ^{2} by rescaling ψ(*T*, *t*, *x*) = ϕ(λ, *T*, *t*, *x*)/(σε) with ε = 1 − λ. The two limits of interest are now ε≪1 (λ ≈ 1) and ε ≈ σ^{−1}≫1 (λ ≈ 0). After rescaling , , and χ = σ^{−1}*x* − σ*T* and τ = σ(*T* − *t*), Equation A1 takes the form(A2)where ψ(τ, χ) is the rescaled generating function of the copy number distribution of a mutation that initially occurred on the genome with rescaled background fitness χ. The distribution of rescaled fitness χ in the population is Gaussian around the moving mean fitness . Averaging Equation A2 over *P*(χ, τ) yields an equation for the scaled generating function (A3)For beneficial mutations, Φ(τ) approaches a steady state where the two terms on the right balance each other. This long time limit was used as a solvability condition in Neher *et al.* (2010) to determine the fixation probability of beneficial mutations. The steady state is independent of the initial condition *ɛ* and therefore equals the rescaled survival probability. However, if , no such steady state exists, since neutral and deleterious mutations have zero chance of fixation in an infinite population. To calculate the probability that a mutation reaches *n* copies after time τ, we need to find the full time-dependent solution of Equation A3. For simplicity, we use the communal recombination model throughout and show that the infinitesimal model yields similar results by simulations of the latter. For the communal recombination model, the recombination contribution in Equation A2 reduces to .

In analogy to Neher *et al.* (2010), we solve Equation A2 in the two asymptotic regimes of large positive and large negative growth rate For θ < 0 the nonlinearity in Equation A2 can be neglected since ψ(τ, χ)≪ε^{−1}. Conversely, at large θ the dominant balance in Equation A2 is θψ(τ, χ) = εψ^{2}(τ, χ). The two asymptotic solutions are(A4)The crossover between the two solutions occurs in a narrow region at θ = Θ_{c}(τ), where Θ_{c}(τ) is approximately the point where the two solutions cross. The evolution of Φ(τ) is governed by Equation A3, which has to be self-consisted with Equation A4. We first solve for Φ(τ) assuming that τ is large. In this case the dominant contribution to the integral in Equation A4 will come from a well-defined τ′≫0 that is determined by the maximum of at τ′ = τ − θ. Φ(τ) will turn out to change slowly and its time derivative can be neglected when determining the maximum of the exponent. Hence,(A5)This solution is valid below Θ_{c}(τ), where ψ(τ) crosses over the linear saturated form θ/*ɛ*. Substituting this solution into Equation A3 yields(A6)We first consider neutral mutations, , in which case , confirming that log Φ(τ) changes slowly as long as (the limit of yields qualitatively different answers). Solving Equation A6 for Θ_{c}(τ) and differentiating with respect to τ yields an equation for ∂* _{τ}*Θ

_{c}(τ), which is readily solved for Θ

_{c}(τ) in the case ,(A7)where Θ

_{0}is the initial condition at τ = τ

_{0}and

*W*(

*x*) is Lambert’s

*W*,

*i.e.*, the solution of

*W*(

*x*)

*e*

^{W}^{(}

^{x}^{)}=

*x*. The approximation of

*W*(

*x*) by a logarithm is accurate only at a very large time since there are significant log log

*x*corrections. The solution for the rescaled generating function then reads(A8)where the last terms include only the exponential factors. Apart from the rescaling, this solution is independent of the initial condition

*ɛ*. The dependence on

*ɛ*will reemerge through the matching to the short-time solution. Note that Φ(τ) satisfies asymptotically for large τ,(A9)which means that the relaxation rate of Φ decreases with time, but more slowly than it does in the case of pure drift (in the absence of draft), in which case ∂

_{τ}log Φ(τ) = −1/τ.

The long-time solution in Equation A8 has to be matched to the solution at small τ where the initial condition cannot be neglected.

#### Short-time solution

To analyze the short-time behavior, it is useful to split ψ(τ, χ) = ψ_{r}(τ, χ) + ψ_{0}(τ, χ) into a contribution fed by recombination ψ_{r}(τ, χ) and the contribution originating from the initial condition ψ_{0}(τ, χ), where ψ_{r}(0, χ) is initially zero and ψ_{0}(0, χ) = 1. In the communal recombination model, the term describing the production of novel recombinants reduces to and the two contributions evolve according to(A10)Recombination enters the equation for ψ_{0} only through a reduction of the growth rate and the solution for ψ_{0}(τ, χ) is simply(A11)The integral acts as a source for ψ_{r}(τ, χ). The initial behavior of Φ_{0}(τ) is quite different for *ɛ* ≫ 1 and *ɛ*≪1 and we analyze these cases separately below.

#### Survival probability of neutral mutations

Here, we present the steps necessary to arrive at the result for the survival probability (Equation 5). The survival probability is given by the generating function of *p*(*n*, τ), evaluated at λ = 0. In our rescaled variables, the survival probability therefore corresponds to Φ(τ) with *ɛ* = σ^{−1}≫1. The long-time solution for Φ(τ) given in Equation A8 contains the free parameters τ_{0} and Θ_{0}, defined by matching with the short-time solution that depends explicitly on the initial condition (see above). To establish the matching, we trace the short-time solution for *ɛ*≫1 through several regimes until the initial condition is forgotten. At small times τ≫1, we have(A12)This rapid initial decay corresponds to the early loss of the allele due to neutral processes. Selection starts to matter only after τ ∼ 1, yielding an accelerated decay of Φ_{0}(τ):(A13)The Gaussian decay of the survival probability of unrecombined alleles is due to the increasing mean fitness, *i.e.*, the adapting population wave moving past the genotype the mutation initially arose on. Hence Φ_{0}(τ) does not contribute to the long-time behavior, which is dominated by the part of the solution driven by recombination. We therefore have to calculate how much weight is transferred from ψ_{0} to ψ_{r} via the term in Equation A10. In the case *ɛ*≫1, the dominant balance for ψ_{r}(τ, χ) is , resulting in an initially steady . After the initial condition has decayed and Φ_{0}(τ)≪Φ_{r}(τ), the solution ψ_{r}(τ, χ) acquires the form of Equation A4 with the only memory of the initial condition in Φ_{r}(τ). The crossover time is obtained by equating Φ_{0}(τ) with Φ_{r}(τ), which yields . Using this τ_{0} and (compare with Equation A4), we can match the short-time solution to the long-time solution in Equation A8.

To illustrate the validity of the approximations made in the analysis we solved Equation A2 numerically. Fig. A1 shows the square sqrt of the logarithm of the rescaled survival probability, plotted as a function of . Data for different outcrossing rates collapse onto the same curve, indicating that the expression in Equation A8 captures the dependence on .

The above analysis assumed the communal recombination model, *i.e.*, a model where the offspring fitness is independent of the parental fitness. To demonstrate that the communal recombination model captures the behavior of the infinitesimal model, we compare results obtained from stochastic simulations of the branching process for the two models in Figure A1. The infinitesimal model exhibits the same scaling collapse, indicating that the dependence on the parameters is similar. The two models do, however, show quantitative differences, which can be interpreted as less efficient recombination in the infinitesimal model. This is expected, as recombination in the infinitesimal model only halves the correlation with either parent, while this correlation is completely destroyed in the communal recombination model. Note, however, that the dependence of *P*_{S} on is strong, so that a rescaling of substantially changes the absolute numbers.

#### Beneficial and deleterious mutations

To analyze the behavior of Φ(τ) in the case of , we track back to Equation A6, focusing on the case where :(A14)This equation highlights the fact that as long as , the dynamics of Φ(τ) are dominated by fluctuations and are therefore similar to the neutral case. For a negative , the decay of Φ(τ) is accelerated, while a positive slows down and eventually halts the decay of Φ(τ); see Figure 5. The crossover to the behavior dominated by the intrinsic fitness effect occurs when the two terms are equal. Comparing with Equation A9 yields the crossover condition(A15)Before this crossover, the effect of selection on is a simple attenuation or amplification of , where Ψ(τ) is approximately the neutral solution, with the slight difference that the crossover point . The correction to the crossover point is still small when selection on the intrinsic fitness effect *s* starts to dominate the dynamics of Φ(τ).

#### Distribution of the allele copy number

The probability *p*(*n*, *T*) of finding *n* copies of the allele after time *T* can be calculated from the generating function by contour integration in the complex λ-plane:(A16)The contour CΧ has to encircle the origin and no other singularities of . For *n* > 0, we can evaluate the contour integral by deforming the contour to run alongside the singularities away from zero, which are tightly controlled by for *n*≫1. To do so, we have to study the analytic properties of or equivalently of where *ɛ* = 1 − λ. For negative *ɛ* = 1 − λ and *s* ≥ 0, ψ(χ, τ) has a singularity at Θ_{c}, the value separating the regime of small θ where the linear equation is valid and the saturated regime where ψ(τ) = θ/*ɛ*; compare Equation A4. Φ(*t*) is the integral of ψ(τ, χ) over χ and therefore has a branch cut for negative real *ɛ*. Otherwise, Φ(τ) is an analytic function of *ɛ*. We can use this branch cut to evaluate the contour integral in Equation A16. Substituting Φ(τ) into Equation A16 in changing the integration variable to *ɛ*, we have(A17)where is the contour encircling the branch cut. For large *n* the dominant contributions to the integral come from *ɛ*≪1 and we have to find a solution for Φ(τ) for small ɛ.

To this end, we backtrack to Equation A11 and integrate the evolution equation for Φ(τ) for *ɛ*≪1. We find

For intermediate , the matching between the short-term solution and the long-time solution occurs at , after which we have(A19)For large Θ_{0}, and the branch-cut integral becomes(A20)The exponent of the integrant peaks when(A21)For large *n*, the dominant balance is between the first term of Equation A21 and the remainder, which requires that and hence ɛ ≈ *n*^{−1}. Using the asymptotic copy number distribution becomes(A22)The leading dependence of *p*(*n*, τ) on *n* is *p*(*n*, τ) ∼ *n*^{−2}. Note, however, that the cross-term from the exponent yields another subdominant factor . By contrast, the corresponding expression in the high recombination regime is *p*(*n*, *T*) ∼ *T*^{−2}*e*^{−}^{n}^{/}* ^{T}*.

The allele frequency spectrum, *f*(*n*), *i.e.*, the probability to find *n* copies of a derived allele irrespective of its age, is given by the time average of *p*(*n*, τ) over mutations that arose at different times τ in the past. It depends on how quickly the survival probability decays, as well as on the shape of *p*(*n*, τ). For large *n* one can expand the logarithm in the exponent of Equation A22 and integrate *p*(*n*, τ) over τ to obtain(A23)Even though *p*(*n*, τ) decays much faster for (exponential) than for (algebraically), the allele frequency spectrum is steeper for than for . The long tail of the latter is due to the contribution of very old alleles with very flat exponential *p*(*n*, τ). For , the clonal amplification and rapid extinction of most alleles give rise to steep allele frequency spectra.

#### The double mutation probability: stochastic tunneling

To calculate tunneling probabilities (Iwasa *et al.* 2004; Weissman *et al.* 2009), we need to know the distribution of treated as a stochastic variable, *i.e.*, the size of transient “bubbles” of mutant individuals in a large population. In analogy to Equation 1, the distribution *p*(*w*, *T*|*k*, *t*, *x*) for a mutation being present on background *x* in *k* copies obeys the following equation:(A24)Instead of the generating function, we now consider the Laplace transform . Because of the convolution property of Laplace transforms and the fact that the fates of the *k* alleles present at time *t* are independent of each other, we have . As before, we scale rates by σ^{−1} and times by σ. The equation for then is(A25)where . At τ = σ(*T* − *t*) = 0, *w* = 0 and therefore ϕ(τ, *z*, χ) = 0. If tunneling is rare and happens only on timescales longer than the lifetime of bubbles, we do not need to know the full time-dependent solution for ϕ(τ, *z*, χ) but can send τ to infinity. The above equation can again be solved by matching the asymptotic solutions at large negative and positive θ, which in the steady state read(A26)The crossover point Θ_{c} is determined by . The “solvability condition” obtained by integrating Equation A25 with regard to the Gaussian distribution of χ yields(A27)For small , the integral on the right-hand side is dominated by the vicinity of the crossover point Θ_{c}. Combining with the matching condition, one finds for Φ(*z*)(A28)Substituting into the crossover condition we obtain an equation for Θ_{c}:(A29)In the neutral case , the location of the crossover is at which implies for Φ(*z*)(A30)In the case that the intrinsic effect of the mutation dominates the denominator of Equation A29 one finds(A31)The result for Φ(*z*) given by Equation A28 and its different limits is directly relevant to the calculation of the probability of secondary events, *e.g.*, an additional mutation as discussed in the earlier in *Stochastic tunneling and complex adaptations*.

For completeness we also calculate the probability distribution of that is given by the inverse Laplace transform of Φ(*z*),(A32)where CΧ is the contour encircling the branch cut. For example, in the intermediate asymptotic regime when , we have(A33)(A34)where we used an approximation valid for *w*≫1. This can be compared to the “draftless” result, obtained also in the limit: in that case , the inverse Laplace transform of which gives *P*(*w*) ∼ *w*^{−3/2} (Weissman *et al.* 2009).

## Appendix B

### Computer simulation methods

#### Fisher–Wright simulation

To efficiently simulate the dynamics of large populations, we keep track of classes of individuals with identical genomes, which are encoded by bitstrings. The fitness *x* of all individuals of a class is simply the sum of the contributions from all loci. In our discrete generation scheme, a pool of gametes is produced to which each class contributes a Poisson-distributed number of gametes with mean . Here, *c* is the size of the class, is the mean fitness in the population, and keeps the population size *N* approximately constant at . Fitness defined as growth rate in the continuous-time model naturally assumes this exponential form in a discrete generation model, simply by integration over one generation.

To implement facultative mating, a fraction *r* of the gametes are paired up and from each pair two offspring are produced by assigning at random the genes of the parents to either offspring. The remaining 1 − *r* fraction of gametes is copied into the next generation without recombination; *i.e.*, they have gone through an asexual reproduction cycle.

The genome of each class has a fixed length *L* = 1024 and new mutations are introduced at a locus whenever this locus is monomorphic, *i.e.*, whenever a previous mutation either went extinct or fixed. For each monomorphic locus, an individual is chosen at random and the mutation is introduced in this single individual. This scheme of keeping a fixed number of loci polymorphic has the advantage of making optimal use of the computational resources, *i.e.*, keep every locus polymorphic while maintaining an overall low mutation rate. The mutation rate, however, becomes a dependent quantity that fluctuates around a value that depends on other parameters of the population: *NU*_{b} is simply the average number of loci that become monomorphic in one generation; see Neher *et al.* (2010). For large *L*, the number of mutations introduced in each generation is much greater than one and does not fluctuate greatly.

To establish a steadily adapting population, 80% of the loci are kept polymorphic with beneficial mutations with effect size *s*_{0} which is rescaled each generation such that the overall fitness variance is σ^{2} = 0.0025. This rescaling is done to be able to specify *r*/σ and *s*/σ explicitly. Since fluctuations of σ in large populations with many polymorphic loci are small, this does not change the properties of the dynamics and the same results are obtained letting σ be freely determined by the population. The remaining 20% of the loci are used to study the fate of mutations with fixed effect size *s*. Measurements are performed after an equilibration period of 20,000 generations at intervals of 100 generations. The source code is available from the authors on request.

#### Simulation of the branching process

The simulation of the population dynamics is complemented by a simulation of the branching process that can be directly compared to analytic calculations. We simulate the process described by Equation 1, using an event-driven algorithm (Gillespie 1977). The simulation keeps track of all individuals that currently carry the allele in question. For each individual, the time *t* + Δ*t* of the next event is determined by drawing a Δ*t* from an exponential distribution with parameter *B*(*t*) + *D* + *r*, where *B* and *D* are the birth and death rates, respectively. At time *t* + Δ*t*, a birth, death, or recombination event is performed with probabilities proportional to *B*(*t* + Δ*t*), *D*, and *r*, respectively. In the case of death, the individual is deleted; in the case of birth, it is duplicated with the exact same fitness; and in the case of recombination, its fitness is redrawn according to the recombination kernels in Equations 2 and 3. (Note that the waiting-time distribution for the next event is not exactly exponential, since the birth rate is time dependent. This, however, amounts only to a correction of order σ^{2}≪1.) For the recombination function of the communal model, we need not simulate Equation 1 but can solve numerically for the generating function of the *p*(*n*, *T*, *t*), given in Equation 17. To this end, we discretize the fitness *x* and solve for the vector ψ(τ, *x*), using the ODE solver of SciPy (Oliphant 2007). The results obtained through stochastic simulation of the communal model agree with the numerical solution for the generating function, as it has to be.

- Received March 20, 2011.
- Accepted May 25, 2011.

- Copyright © 2011 by the Genetics Society of America