Originally published as Genetics Published Articles Ahead of Print on May 15, 2006.

Genetics, Vol. 173, 1793-1811, July 2006, Copyright © 2006
doi:10.1534/genetics.106.058586

The Hill–Robertson Effect and the Evolution of Recombination

Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom

1 Corresponding author: Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, King's Bldgs., W. Mains Rd., Edinburgh EH9 3JT, United Kingdom.
E-mail: denis.roze{at}ed.ac.uk

Manuscript received March 25, 2006. Accepted for publication May 4, 2006.

ABSTRACT

In finite populations, genetic drift generates interference between selected loci, causing advantageous alleles to be found more often on different chromosomes than on the same chromosome, which reduces the rate of adaptation. This "Hill–Robertson effect" generates indirect selection to increase recombination rates. We present a new method to quantify the strength of this selection. Our model represents a new beneficial allele (A) entering a population as a single copy, while another beneficial allele (B) is sweeping at another locus. A third locus affects the recombination rate between selected loci. Using a branching process model, we calculate the probability distribution of the number of copies of A on the different genetic backgrounds, after it is established but while it is still rare. Then, we use a deterministic model to express the change in frequency of the recombination modifier, due to hitchhiking, as A goes to fixation. We show that this method can give good estimates of selection for recombination. Moreover, it shows that recombination is selected through two different effects: it increases the fixation probability of new alleles, and it accelerates selective sweeps. The relative importance of these two effects depends on the relative times of occurrence of the beneficial alleles.


ONE of the first general hypotheses proposed to explain the maintenance of sex and recombination states that sex increases the rate of adaptation, by allowing beneficial mutations initially present on different chromosomes to be combined onto the same chromosome (MORGAN 1913; FISHER 1930; MULLER 1932). The first analytical treatment of the question was done in the 1960s and led to some debate: using a simple calculation, CROW and KIMURA (1965) showed that the rate of incorporation of beneficial mutations is indeed higher in sexual than in asexual populations, whereas a deterministic model made by MAYNARD SMITH (1968) showed no difference between sexual and asexual populations. FELSENSTEIN (1974) pointed out that this disagreement stemmed from the different assumptions made by these models: Crow and Kimura represented a finite population in which advantageous alleles first arise as single copies, while Maynard Smith considered the case of an infinite population, in which deleterious alleles at two loci are maintained at mutation–selection equilibrium and become advantageous after a change in the environment. Because the model assumes no epistasis, the two loci always stay in linkage equilibrium, and recombination has no effect on the change in genotype frequencies. By contrast, in a finite population such as the one modeled by Crow and Kimura, selection and random drift combine to generate some linkage disequilibrium even in the absence of epistasis. For example, consider the case where a new beneficial allele enters the population while another beneficial allele at a second locus is already present. The new mutation can occur either on the good background at the second locus, in which case the linkage disequilibrium (LD) is positive, or on the bad background, in which case the linkage disequilibrium is negative. A simple calculation shows that, on average, this initial linkage disequilibrium is zero. However, when both advantageous alleles arise in coupling (LD > 0) they tend to increase in frequency rapidly, while when they are in repulsion (LD < 0) they compete against each other, which slows their rate of increase. Because situations in which LD < 0 tend to last longer, the initial variance in linkage disequilibrium generates, on average, negative linkage disequilibrium. The same "Hill–Robertson effect" occurs when the variance in LD is generated by drift due to finite population size in a single population (HILL and ROBERTSON 1966; FELSENSTEIN 1974) or in an arbitrary number of interconnected populations (MARTIN et al. 2006). The negative linkage disequilibrium generated by the Hill–Robertson effect reduces the rate of incorporation of advantageous alleles and limits the speed of adaptation. BARTON (1995b) quantified this effect by calculating the reduction in fixation probability of an advantageous allele when selection occurs simultaneously at another linked locus.

Increasing the recombination rate between loci decreases the strength of the Hill–Robertson effect and thus increases the rate of fixation of beneficial alleles. Simulation and analytical models have shown that this generates indirect selection on modifier genes that increase recombination rates (FELSENSTEIN and YOKOHAMA 1976; OTTO and BARTON 1997, 2001; ILES et al. 2003; BARTON and OTTO 2005; MARTIN et al. 2006). Recombination modifiers are not mere theoretical constructs, but have been identified in a number of experimental studies. In particular, selection experiments have shown that it is possible to artificially increase recombination within genomes, either by direct selection on recombination rates or as a by-product of selection for another trait (e.g., KIDWELL 1972; CHARLESWORTH and CHARLESWORTH 1985a,b; KOROL and ILIADI 1994; KOROL et al. 1994; and references in OTTO and BARTON 2001). In some cases, genetic factors affecting recombination rates have been mapped, revealing a variety of effects: some factors have a local effect, increasing recombination only over a small region of the genome (which can be located on the same or on a different chromosome), while others affect the overall recombination rate of whole chromosomes. Recombination rates are also known to evolve in nature: for example, TRUE et al. (1996) found that the genetic map of the island endemic Drosophila mauritiana is 1.8 times longer than the map of its sister species D. melanogaster, with a higher frequency of crossovers along each chromosome. Rapid adaptation to a new environment may have selected for a higher recombination rate, but a broader survey is needed to establish whether there is a correlation between adaptation and recombination rate.

The first analytical study on the evolution of recombination due to the Hill–Robertson effect was by OTTO and BARTON (1997). Their model represents two loci under directional selection: at the first locus, a new beneficial allele arises as a single copy, while at the second locus another beneficial allele is already established and sweeps through the population; finally, a third locus affects the recombination rate between the selected loci. Because recombination decreases interference between the selected loci, the new beneficial allele has a higher fixation probability when it occurs on the high-recombination background. The frequency of the modifier allele coding for higher recombination is then expected to increase by hitchhiking, on average, as the new beneficial allele goes to fixation. Otto and Barton's analysis proceeds in two steps: first, they derive the probability of fixation of the new advantageous allele, arising on each of the four possible backgrounds at the two other loci; then, they use a deterministic calculation to obtain the change in frequency of the recombination modifier, given that the new beneficial allele reaches fixation. This approach, however, has several limitations. In particular, although interference between selected loci due to the Hill–Robertson effect is taken into account in the calculation of fixation probabilities, it does not enter the deterministic calculation of the hitch given by the new beneficial allele to the modifier, given that it fixes. This last calculation indeed assumes linkage equilibrium between the selected loci. There are, however, three possible sources of LD, which, as discussed by Otto and Barton, are probably the cause of the quantitative differences they observe between analytical and simulation results. The first is the initial disequilibrium generated by the mutation event producing the new beneficial allele; although this disequilibrium is zero on average, it is sometimes positive (when the new allele arises in coupling with the other beneficial allele) and sometimes negative (when it arises in repulsion), and the effect of this disequilibrium on the change in frequency of the modifier does not cancel out. This initial disequilibrium is taken into account in some of the numerical calculations of Otto and Barton (dotted lines in Figures 4–9 of OTTO and BARTON 1997), but this still leaves important departures from simulation results in several cases. The second source of disequilibrium is random drift, which can have strong effects while the new allele is still rare. The effect of this disequilibrium is analyzed in a more recent model (BARTON and OTTO 2005), by considering small fluctuations around the deterministic increase of the new allele. This analysis, however, does not apply while the allele is present in just a few copies (in which case drift can generate important departures from the deterministic trajectory), which is just when the effect on the modifier is strongest. Finally, a third source of disequilibrium comes from the fact that when the new beneficial allele happens to reach fixation, it is more likely to be in coupling with the beneficial allele at the other selected locus. Therefore, the disequilibrium on the paths that lead to fixation of the new allele tends to be more positive than the disequilibrium observed on the paths leading to loss of the allele.

In this article, we show that these restrictions can be overcome by finding the distribution of numbers of the new mutation across the four backgrounds, after it has become established but while it is still rare. Since this initial stage involves the independent replication of rare mutants, it can be described by a branching process; the solution can then be taken forward deterministically, to find the net expected increase of the modifier allele. This assumes that population size is sufficiently large that the effects of drift can be neglected once alleles are established within the population. As we will see, this method involves solving numerically a set of differential equations; therefore, one has to resort to a computer to evaluate the strength of selection on the modifier, as with simulation methods. However, our method has two main advantages over simulations: first, it can stand in place of the millions of replicate simulations needed to get accurate estimates of the change in frequency of the modifier; second (and more importantly), it provides us with an understanding of the scaling relationships among the parameters. We will see that our method provides accurate predictions over a wide range of parameter values. It also captures two different effects of recombination modifiers: they affect the fixation probability of new advantageous alleles and also the speed of selective sweeps. These two effects both generate indirect selection at the modifier locus; we will see that their relative importance depends on the relative times of occurrence of beneficial alleles. Finally, we use our method to estimate the overall strength of selection on a modifier affecting recombination over the whole genome, when recurrent beneficial mutations occur throughout the genome. This shows that the rate of adaptive substitutions has to be relatively high for interactions between selected alleles to generate a substantial force on recombination modifiers. Although the estimated rates of adaptive amino acid substitutions for the Drosophila and the human lineages do not seem compatible with such high values, it is still possible that most adaptive changes occurred in relatively short periods, during which indirect selection on recombination may have been important. Despite this requirement for a high substitution rate, stochastic forces may still appear as a more likely candidate to explain the widespread occurrence of recombination than deterministic forces generated by epistasis among loci. Indeed, epistasis can select for increased recombination rates only when it is negative and not too variable among loci (BARTON 1995a; OTTO and FELDMAN 1997), whereas available data do not show any clear tendency for epistasis to follow this pattern. Finally, we will see that our model does not give accurate results for all possible parameter values and initial conditions: it best describes the case where a new, relatively weakly selected allele occurs when a more strongly selected allele is already sweeping through the population. We have also worked on another method that better describes the case where a relatively strongly selected allele occurs while a more weakly selected one is already established and the case where two beneficial alleles occur at about the same time (N. H. BARTON and D. ROZE, unpublished results). In these cases, recombination has little influence on fixation probabilities, but still has an effect in accelerating selective sweeps. Finally, the method of BARTON and OTTO (2005) deals with the effects of small fluctuations due to drift, once alleles are abundant. Together, these methods thus give us a complete understanding of the different regimes under which recombination can be favored.


GENERAL METHOD
The general setting is the same as in OTTO and BARTON (1997). We consider a haploid population of size N; mating is random, and generations are discrete and nonoverlapping. We assume that selection occurs at two loci, favoring allele A over allele a at the first locus and allele B over allele b at the second locus. We call sA and sB the selective advantages of A and B and assume no epistasis, so that the fitnesses of the different genotypes are given by

Formula 1(1)
We assume that Formula 1, and Formula 1, so that the fate of alleles A and B is decided while they are still rare. We then assume that a third locus, with two alleles M and m, affects the recombination rate between the two selected loci (recombination modifier). The recombination rate between loci (A, a) and (B, b) equals rAB(1 – {delta}r), rAB, and rAB(1 + {delta}r) in mm, Mm, and MM individuals, respectively; {delta}r thus measures the modifier effect, and for simplicity we suppose that this effect is additive. The recombination rate between loci (M, m) and (A, a) is denoted rMA. We present the model in the case where loci are in order (M, m)–(A, a)–(B, b); the method applies more generally, however, and we also give results for different orderings of the three loci.

We call t the time at which allele A enters the population (as a single copy). We suppose that, at this time, allele B is already established (although it can still be rare). We see later that this assumption is not crucial and that in many cases we can still obtain accurate results even if B is not established at t. We also assume that allele M is frequent enough that stochastic fluctuations can be neglected at the (M, m) locus and suppose that, at time t, the (M, m) and (B, b) loci are in linkage equilibrium (thus, we neglect any associations between M and B that may have been generated by previous sweeps at yet other loci).

We decompose the spread of allele A into two phases: during the first phase, between times t and t*, A either is lost from the population by drift or becomes established (i.e., reaches a high enough number of copies so that it will almost certainly fix). In the following we assume that A is still rare in the population at t*, so that between t and t*, the replication of each copy of A does not depend on the presence of other A alleles in the population, in which case the change in number of A's can be described by a branching process. After t* we assume that A, if not lost, increases in frequency on a deterministic trajectory. As we said before, allele B at the other selected locus is supposed to be already established at t, and we assume that the frequency of B stays on a deterministic trajectory all the time. We suppose that t* is sufficiently far that B will have reached fixation at t*. We assume that the frequency of the modifier allele M is not too small, so that M changes in frequency solely due to its effect on the spread of allele A: when M helps A to recombine onto the good B background, it then tends to be carried along during the spread of A after time t*, by hitchhiking. Before t*, the effect of A on the frequency of the modifier is very small (because A remains at low frequency). Thus, we assume that the frequency of M remains constant between t and t*.

In summary, the only stochastic process in our model concerns the change in number of copies of A between times t and t*. During this time, the frequency of M does not change, while B increases in frequency on a deterministic trajectory, until fixation. After t*, B is fixed, A increases on a deterministic trajectory if it is still present, and M changes in frequency due to the linkage disequilibrium established with A during the first phase. Note that these different assumptions all stem from the assumption that population size is large and that M and B are common enough at t that the effects of drift at these loci can be neglected.

In APPENDIX A, we show that the expected total change in frequency of the recombination modifier after time t* is given by

Formula 2(2)
where {rho} = rMA/sA, S is the total number of A alleles at t*, and D is the linkage disequilibrium between loci (M, m) and (A, a) at t*, multiplied by population size N. To express the expected change in frequency of the modifier, we thus need to calculate the expectation of S{rho}–1D over the probability distribution of the number of copies of A on the different possible backgrounds at t*. We now present a method to derive this distribution.

In the following, we use index j to denote a possible genetic background on which allele A may be found (in our three-locus model, the possible backgrounds are thus mb, mB, Mb, and MB). As we said before, we assume that A is present as a single copy at time t; however, let us assume for the moment that the initial number of copies of A is arbitrary and call nj;0 the number of copies of A on background j at t. We denote by n0 the vector of all nj;0. We then call nj the number of copies of A on background j at time t* and n the vector of all nj. Finally, {phi}(n) is the probability that the number of copies of A on the different backgrounds at t* is equal to n. Although it is difficult to derive the distribution {phi} directly, we show below that it is possible to obtain its Fourier transform {psi}, defined using the vector of dummy variables Formula 2,

Formula 3(3)
where {iota} is the imaginary number Formula 3, and where the integral is a multiple integral over all possible values of n. Note that although n takes only discrete values, we treat it here as a continuous variable (since we integrate over n, rather than summing); this will later allow us to use a diffusion approximation. The Fourier transform {psi} is thus a function of variables {omega}j, and Formula 3 is the vector of all {omega}j's (which has the same dimension as n). Considering {psi} rather than {phi} proves useful, because we can make use of the convolution theorem, which states that the Fourier transform of the distribution of a sum of independent random variables is equal to the product of the Fourier transforms of the individual distributions of these variables. Because we assume that copies of allele A replicate independently between t and t*, the distributions of the number of copies originating from each initial A allele are independent. Since n is the sum of these numbers, {psi} must take the form

Formula 4(4)
where Formula 4 is the Fourier transform of the distribution of the number of copies of A at t* originating from a single copy, present on background j at t (the functions {psi}j thus do not depend on the nj;0's). Equation 4 can also be written

Formula 5(5)
where each Pj equals Formula 5. Note that instead of the Fourier transform, we could also use the Laplace transform or the generating function of {phi}, which also satisfies the convolution theorem. However, in the following we need to invert the transform, and for this the Fourier transform proves more convenient.

When t = t*, the distribution {phi} is reduced to a single point at n = n0. The Fourier transform of such a distribution is given by Formula 5, giving at t = t*

Formula 6(6)
for all j. We then use a diffusion approximation to express the Pj's as a function of t, for a given value of t*. A backward diffusion on {psi} is given by

Formula 7(7)
The term Formula 7 in Equation 7 represents the expected increase in nj at t, assuming that all nj;0 are small relative to N (the term {sigma}jk represents the expected change in nj accounting for possible movement from background k and can be expressed in terms of recombination rates and selection coefficients). Assuming a Poisson number of offspring per parent, the variance in the change of nj at t is approximately nj;0 (for Formula 7), while the covariance between the changes in nj and nk is negligible for all j != k. Substituting Equation 5 into Equation 7 then yields

Formula 8(8)
(e.g., HARRIS 1963, chap. V.15). Equation 8, together with the boundary condition (6), can be used to express the Pj's, and thus {psi}, for arbitrary t. In general, Equation 8 will have to be solved numerically. One can then recover the distribution {phi} by taking an inverse Fourier transform or derive directly moments of {phi} from its Fourier transform {psi}; this is illustrated in the next two-locus example. Because we use the diffusion approximation, we have to assume that selection and recombination are weak (so that the nj's change slowly over time). However, we will see that this is not a major restriction of our model. Finally, we note that Equation 8 is identical to the equation giving fixation probabilities of mutations occurring on different backgrounds in BARTON (1995b) and OTTO and BARTON (1997), except that boundary conditions differ. As we will see later, fixation probabilities can be recovered from the limit of the Fourier transform {psi} as {omega} tends to infinity (this is because the distribution {phi} has a spike at zero, corresponding to the loss of the A allele, which is described by the limit of {psi} for large {omega}). In the next section, we illustrate our method by considering the case of two selected loci (without a recombination modifier).


TWO-LOCUS CASE
We consider here the effect of selection at the (B, b) locus on the distribution of the number of copies of A at t*, in the absence of a recombination modifier. Because we assume that B is fixed at t*, allele A can be found only on background B at that time. The distribution {phi} is thus a function of a single variable (the number of A alleles at t*, which will be on background B), which we simply denote n. The Fourier transform of {phi}(n) is given by

Formula 9(9)
We also know from the convolution theorem that {psi} must take the form

Formula 10(10)
(Equation 5), where PB and Pb are functions of {omega}, and where nB;0 and nb;0 are the number of copies of A on the B and b backgrounds at time t, respectively. At t = t*, the distribution {phi} is reduced to a single point at n = nB;0, and its Fourier transform is thus given by Formula 10. This, with Equation 10, gives the boundary conditions

Formula 11(11)
at t = t* (Equation 11 is a special case of Equation 6, constrained so that allele A is present only on the B background at t*). From now on, we assume that A enters the population as a single copy at t. This copy will be on background B with probability pB (which is the frequency of B at t) or on background b with probability pb (which is the frequency of b). Equation 10 can thus be written

Formula 12(12)
Indeed, Formula 12 is the Fourier transform {psi}({omega}) conditional on the initial copy of A being on background B (in which case nB;0 = 1 and nb;0 = 0 in Equation 10), while Formula 12 is {psi}({omega}) conditional on the initial copy of A being on background b. Assuming that selection is relatively weak at the (B, b) locus, pB and pb are given by the logistic equations

Formula 13(13)
where t = 0 is at the midpoint of the sweep of allele B.

The {sigma}-coefficients that appear in Equation 8 are easily obtained. The expected number of offspring copies produced by an A allele present on a b background at time t is (1 + sA)/(1 + sB pB) {approx} 1 + sAsB pB. Of these, a proportion rAB pB moves to the B background, while a proportion 1 – rAB pB stays on the b background. The expected number of copies produced by an A allele present on a B background at t is (1 + sA)(1 + sB)/(1 + sBpB) {approx} 1 + sA + sBpb; of these, a proportion rABpb moves to the b background, while a proportion 1 – rABpb stays on the B background. This gives (using Equation 8)

Formula 14(14)
with

Formula 15(15)
These are the same as Equations 5a and 5b in BARTON (1995b). Equation 14, with boundary condition (11), can be solved numerically using the NDSolve routine of Mathematica (notebook available on request), giving {psi} (Equation 12).

To recover the distribution {phi}(n) from the Fourier transform {psi}({omega}), it is useful to decompose {phi} into two parts: a distribution that is different from zero only when n = 0 (loss of allele A) and that is equal to {phi}(0) at this point [this distribution is thus given by {phi}(0){delta}(n), where {delta} is Dirac's "delta function") and the rest of the distribution {phi} for n > 0 that we approximate by a continuous function on the range n ≥ 1. The Fourier transform of the first part (which we call {psi}0) is a constant: {psi}0({omega}) = {phi}(0) for all {omega} [because the Fourier transform of {delta}(n) is 1], while the Fourier transform of the second part (which we call {psi}>0) tends to zero as {omega} tends to infinity (because the Fourier transform of a piecewise continuous function tends to zero as {omega} tends to infinity, e.g., LIGHTHILL 1958). Because {psi} is the sum of {psi}0 and {psi}>0, its limit as {omega} tends to infinity is thus {phi}(0): therefore, we can obtain the probability that allele A is extinct at time t* from the value of {psi} for large {omega}. The rest of the distribution {phi} (for n > 0) can be recovered by inverting the Fourier transform {psi}>0. In APPENDIX B, we show that this can be done for each value of n by calculating a sum of nmax terms, where nmax is an arbitrary value such that the probability that allele A is present in a number of copies greater than nmax at time t* can safely be ignored (remember that we assume that A is still rare at t*). One obtains

Formula 16(16)
and where {psi}({infty}) is the limit of {psi} as {omega} tends to infinity (which can be determined by evaluating {psi} for a very large value of {omega} or from Equation 6 in BARTON 1995b, giving fixation probabilities). The number of terms to be evaluated can then be reduced by using some symmetries of {psi} (see APPENDIX B). In Figure 1 we compare this solution with simulation results, for sA = 0.01, sB = 0.1, rAB = 0.02, t = –100, t* = 100, and N = 106 (and where pB is given by Equation 13). Note that the population size N does not appear in the derivation, but appears in the simulation program, which represents a Wright–Fisher model with multinomial sampling in each generation (our simulation program is written in C++ and is available on request). Figure 1 shows that Equation 16 accurately describes the probability distribution {phi}(n).


Figure 1
View larger version (12K):
In this window
In a new window
Download PPT slide
 
FIGURE 1.—

Two-locus model: probability distribution of the number of copies of allele A at time t* (n), in the absence of recombination modifier, from the analytical model (using Equation B6 with nmax = 5000, solid curve) and from simulations (shaded points). Parameter values are sA = 0.01, sB = 0.1, rAB = 0.02, t = –100, t* = 100, and N = 106. In the simulations, the distribution is obtained from 108 replications.

 
Moments of the distribution {phi} can be obtained directly from {psi}, without having to invert the Fourier transform (in essence, the method is similar to Ohta and Kimura's method for deriving moments using the diffusion approximation, e.g., OHTA and KIMURA 1969, 1971). For example, we show in APPENDIX B that the variance of {phi} is obtained by solving numerically a system of three differential equations, derived from Equation 14. Figure 2 shows the logarithm of the variance at time t* = 100, as a function of the time t at which allele A first entered the population for sA = 0.01, sB = 0.1, rAB = 0.02 (solid line), and rAB = 0.04 (dashed line). The dotted line shows the logarithm of the variance in the absence of interference (in the one-locus model), given by Formula 16, with {Delta}t = t* – t (e.g., FELLER 1951). Figure 2 illustrates the fact that increasing recombination between the selected loci decreases stochastic effects at the (A, a) locus (the variance is lower with rAB = 0.04 than with rAB = 0.02); this generates an indirect selective pressure to increase recombination rates, which we quantify in the next section.


Figure 2
View larger version (8K):
In this window
In a new window
Download PPT slide
 
FIGURE 2.—

Logarithm of the variance of the number of copies of allele A at time t* = 100, as a function of the time t when allele A first appeared. Curves correspond to the model predictions, circles and squares to simulation results (variance observed among 107 replications). Solid curve, filled circles: two-locus model, sA = 0.01, sB = 0.1, rAB = 0.02, N = 106. Dashed curve, squares: two-locus model, sA = 0.01, sB = 0.1, rAB = 0.04, N = 106. Dotted curve: single selected locus (no interference), sA = 0.01, N = 106.

 


CHANGE IN FREQUENCY AT THE MODIFIER LOCUS
We now introduce the recombination modifier locus (M, m). As we said before, because N is large we can assume that the frequency of allele M does not change between times t and t*; we denote this frequency pM, while pm is the frequency of allele m. Furthermore, since we suppose that allele B is fixed at t*, allele A can be found only on backgrounds MB and mB at t*. Calling nMB and nmB the numbers of copies of A on these two backgrounds at t*, the Fourier transform of the probability distribution Formula 16 is given by

Formula 17(17)
From the convolution theorem, we also know that {psi} must take the form

Formula 18(18)
(Equation 5), where the sum is over all possible initial backgrounds j = mb, mB, Mb, and MB, and where the Pj's are functions of {omega}MB and {omega}mB. At t = t*, we have boundary conditions

Formula 19(19)
From Equation 8, one then obtains the system

Formula 20(20)
where s and s+ are given by Equation 15, while pB and pb are given by Equation 13. For example, the expected number of copies produced by an allele A present on background MB at t is 1 + sA + sBpb (to the first order in selection coefficients). Of these, a proportion rAB(1 + {delta}r x pM)pb will move to the Mb background—since the expected recombination rate between selected loci, experienced by a chromosome carrying the M allele, is rAB(1 + {delta}r x pM)—a proportion rMApm will move to the mB background, and the rest will stay on the MB background (note that because we need to express the {sigma}-coefficients of Equation 8 only to first order in selection coefficients and recombination rates, we can neglect double-recombination events). Allowing the modifier to affect the recombination rate rMA would not affect the equations above. Specifically, we may denote the recombination rates between loci (M, m) and (A, a) in mm, Mm, and MM individuals by rMA(1 {delta}rMA), rMA, and rMA(1 + {delta}rMA), respectively. However, because recombination between (M, m) and (A, a) would affect genotype frequencies only when it occurs in Mm heterozygotes, {delta}rMA would not enter into the equations.

The system (20), with boundary conditions (19), can be solved numerically to obtain the Pj's for arbitrary t. In the case where A is present as a single copy at t, and assuming linkage equilibrium between the (M, m) and (B, b) loci at t, the expression for the Fourier transform {psi} becomes (from Equation 18)

Formula 21(21)
Using Equations 1921, one can express {psi} for any time t of occurrence of the A mutation.

In the following, we use scaled parameters and variables; this allows us to factor out selection coefficients and obtain more general results, in terms of the ratio of selection coefficients at both loci. In the diffusion limit, {omega}j and Pj are of order sA for all backgrounds j. We define the scaled variables Formula 21 and Formula 21 as

Formula 22(22)
We then use the scaled parameters

Formula 23(23)
The frequency of allele B as a function of time thus becomes Formula 23. Using these scaled parameters and variables, Equations 20 become

Formula 24(24)

From Equation 2, the change in frequency of the modifier after T* can be obtained from Formula 24, where {rho} = rMA/sA = {rho}MA/{theta}, S is the total number of A alleles at T*, and D is the linkage disequilibrium between loci (M, m) and (A, a) at T*, multiplied by population size N (see APPENDIX A). The variables S and D are given by

Formula 25(25)
In APPENDIX C, we present a method to derive Formula 25 from the Fourier transform {psi}, in the case of a modifier of small effect ({delta}r small), and when linkage between the modifier and the (A, a) locus is sufficiently tight ({rho} < 1). We then calculate the selection gradient on the modifier allele M, defined as

Formula 26(26)
The expected change in frequency of the modifier over the whole process is given by

Formula 27(27)
Interestingly, APPENDIX C shows that sM is independent of the initial frequency of the modifier (pM) and can be written in the form Formula 27, where Formula 27 is a function of T, {theta}, {rho}AB, {rho}MA, and NsA. A measure of the strength of selection at the modifier locus, generated by the Hill–Robertson effect at the selected loci, is thus given by Formula 27. As shown in APPENDIX C, obtaining Formula 27 for a given set of parameter values involves solving numerically a system of differential equations (Equations C15C22) and performing a numerical integration (Equation C23). These operations have been implemented in a Mathematica notebook, which is available on request. Modifying our model to represent different orderings of the three loci is straightforward and leads to minor changes to our equations (see end of APPENDIX C).

Figure 3 shows a test of our method against simulations, for {theta} = 0.1, {rho}MA = 0.01, {rho}AB = 0.2, and NsA = 104 (in the simulations sB = 0.1, {delta}r = 0.5, N = 106, and the initial frequency of the modifier is pM = 0.5). In the simulation program, we start from a population where allele B is at frequency pB = 1/(1 + eT) (the deterministic frequency of B at time T), allele M is at frequency pM, and B and M are in linkage equilibrium. Allele A is introduced as a single copy in a random background; then for each generation we calculate the expected change in genotype frequencies and sample the new frequencies according to a multinomial distribution. We stop when allele A has disappeared or has reached fixation and measure the change in frequency of M over the whole process. Figure 3 shows that our method provides a good prediction of selection on the modifier, as long as T is not too negative (bottom, dashed curve). The discrepancy for T < ~–10 comes from the fact that the model assumes that allele B is established at T. However, when T is too negative, the initial frequency of B is too low for the allele to be established; this is shown in Figure 3, top, which plots the deterministic trajectory of allele B (solid curve) and its probability of fixation as a function of T (dashed curve), where we calculate the fixation probability from the standard one-locus diffusion formula

Formula 28(28)
where {sigma}A = NsA (e.g., CROW and KIMURA 1970). When T < ~–15, Figure 3 shows that allele B has a high chance of being lost from the population, in which case selection on the modifier disappears. If we multiply the expression for the change in frequency of the modifier given by Equation C23 with the probability of fixation of B (given by Equation 28), we obtain a result that matches the simulations very well (Figure 3, bottom, solid curve). To generate the curves of Figure 3, we set the value of time T* to 50; however, other values of T* between 30 and 350 lead to indistinguishable curves. When T* < 30, allele A can be still present in the population but not established (which leads to an overestimation of the change in frequency of the modifier), while when T* > 350, allele A, when still present in the population, has already reached a substantial frequency (which may lead either to an underestimation or to an overestimation of the change in frequency of the modifier, depending on the parameter values, as the model ignores changes in modifier frequency before T*, but tends to overestimate the change after T*). The fact that the results are insensitive to T* over a large range of values shows that the method is consistent. In Figure 4, we show that our model also gives very good results for different values of the selection coefficients sA and sB. Finally, Figure 5 compares results obtained using our method and Otto and Barton's method for sA = sB = 0.1, rMA = 0.001 (Figure 4A), and rMA = 0.01 (Figure 4B), corresponding to the values used in Figures 7 and 9 of OTTO and BARTON (1997). Figure 5 shows that the present method provides a better prediction of the change in frequency of the modifier. In Figure 5A, it can be seen that both the model and simulations give a bimodal curve; this effect is explained in the next section.


Figure 3
View larger version (9K):
In this window
In a new window
Download PPT slide
 
FIGURE 3.—

(Top) Deterministic increase in frequency of allele B (solid curve) and probability of fixation of B (dashed curve, from Equation 28) from initial frequency pB (given by Equation 13) as a function of time T. (Bottom) Scaled selection gradient on the modifier Formula 28, as a function of T. The dashed curve shows the results obtained from Equation C23, and the solid curve shows the results obtained from (C23) multiplied by the probability of fixation of B (from Equation 28). Dots show simulation results (average over 107 replications); error bars measure ±1.96 SE. Parameter values are: {theta} = 0.1, {rho}MA = 0.01, {rho}AB = 0.2, and NsA = 104; in the simulations sB = 0.1, {delta}r = 0.5, with N = 106 and pM = 0.5, while in the analytical model the splice point T* is set to 50.

 

Figure 4
View larger version (8K):
In this window
In a new window
Download PPT slide
 
FIGURE 4.—

Scaled selection gradient on the modifier Formula 28 times the probability of fixation of B as a function of T. Curves: results obtained from Equation C23, multiplied by the probability of fixation of B (from Equation 28). Circles and squares: simulation results (average over 107 replications); error bars measure ±1.96 SE (when the error bars do not appear, they are smaller than the dots). (A) sA = 0.01 and sB = 0.1 (solid curve, solid circles), 0.05 (dashed curve, squares), 0.02 (dotted curve, open circles). (B) sB = 0.1 and sA = 0.01 (solid curve, solid circles), 0.02 (dashed curve, squares), 0.05 (dotted curve, open circles). In each case rMA = 0.001, rAB = 0.02, N = 106. In the simulations {delta}r = 0.5 and pM = 0.5, while in the analytical model the splice point T* is set to 50.

 

Figure 5
View larger version (6K):
In this window
In a new window
Download PPT slide
 
FIGURE 5.—

Scaled selection gradient on the modifier Formula 28 times the probability of fixation of B as a function of T. Solid curves, results obtained from Equation C23; dotted curves, results obtained from Equations 710 and 19 in OTTO and BARTON (1997); dashed lines, results obtained by the method of OTTO and BARTON (1997), taking into account the initial linkage disequilibrium between selected loci. Dots: simulations results (average over 2 x 107 replications), error bars measure ±1.96 SE (in B, the error bars are smaller than the size of dots). Parameter values: sA = 0.1, sB = 0.1, rAB = 0.02, N = 106, and rMA = 0.001 (A) and 0.01 (B). In the simulations, {delta}r = 0.5 and pM = 0.5.

 


NET SELECTION GRADIENT OVER RECURRENT SWEEPS
What will be the net selection gradient for a modifier increasing the total map length of a genome, under a constant input of beneficial mutations? We assume here that the modifier increases the recombination rate uniformly over the genome and that adaptive sweeps are sufficiently rare that we can neglect interactions among more than two beneficial alleles. More specifically, we derive an expression for the net selection gradient to the order {Lambda}2, where {Lambda} is the rate of adaptive substitutions, and neglect terms of higher order in {Lambda}. Our model allows us to consider two classes of mutations. One can assume, for example, that some strongly selected alleles enter the population at a given rate (sufficiently low that the effect of interactions among these mutations can be neglected), while weakly selected mutations occur at a higher rate and have a reduced fixation probability due to the recurrent sweeps of strongly selected alleles. The interaction between strongly and weakly selected alleles will select for higher recombination rates. Interactions among weakly selected alleles will also select for more recombination, but as the effect of these interactions is weaker (e.g., BARTON 1995b), they may be neglected (this should be checked, however, as this depends on how much more numerous weakly selected alleles are). To obtain the average effect of strong substitutions on weak ones, we can consider the case where allele B is strongly selected and allele A weakly selected. Equation C23 then gives the rate of increase of the modifier, given that B fixes, and for given positions of the (A, a) and (B, b) loci. The selection gradient, conditional on fixing both alleles, is then given by sM/Pfix(A), where Pfix(A) is the fixation probability of allele A, which can be obtained from Equation 6 in BARTON (1995b). Because Pfix(A) can be written as Formula 28, where Formula 28 is a function of T, {theta}, and {rho}AB, while Formula 28, the selection gradient given the fixation of both alleles is Formula 28, which depends on T, {theta}, {rho}MA, {rho}AB, and NsA. Averaging over time gives the net selection gradient, for given positions of alleles A and B:

Formula 29(29)
The net selection gradient, under constant rates of substitutions {Lambda}A and {Lambda}B for weakly and strongly selected alleles, and for all possible positions of selected loci, is then obtained by integrating over the genetic map,

Formula 30(30)
where R is the total map length (because only selected loci that are sufficiently tightly linked to the modifier contribute significantly to its increase in frequency, we can take +{infty} as the integration limit when integrating over the genetic map). Equations 29 and 30 finally give

Formula 31(31)
where the triple integral depends on {theta} and on NsA. Moreover, we expect that Formula 31 should scale roughly with Formula 31. Indeed, from Equation C23, Formula 31 takes the form Formula 31, multiplied by a function of {rho}MA, {rho}AB, and {theta}, say Formula 31. Expanding f using a Taylor series and integrating over {rho} = {rho}MA/{theta} gives

Formula 32(32)
which is dominated by the first term when Formula 32 is large. This slow decrease in the strength of selection on the modifier as population size increases probably reflects the fact that what matters most in a large population are stochastic effects occurring when selected alleles are present in small numbers of copies, which are not affected much by population size. Finally, it is important to note that the calculation of Formula 32 given above considers only pairwise interactions among selected loci and thus cannot be applied to the case where more than two selective sweeps overlap in time. However, as is discussed later, rates of amino acid substitutions in the human and Drosophila lineages are rather low, suggesting that interactions between more than two sweeps may be quite rare.

Equation 31 involves the selection gradient for the modifier, conditional upon the fixation of allele B (Formula 32). Moreover, the integral over T includes cases where allele A enters the population before allele B. In the previous section, we compared simulation results with the model's predictions for the unconditional selection gradient for the modifier, Formula 32. If we now condition upon the fixation of B, selection on the modifier can be substantial even when allele A occurs before allele B, but is still segregating at the time where B occurs. This is shown, for example, in Figure 6, which plots the selection gradient for the modifier, conditional upon the fixation of allele B (Formula 32). The solid and dashed curves give the model predictions for {rho}MA = 0.01 and {rho}MA = 0.005 (respectively), obtained from Equation C23; dots represent simulation results, for sB = 0.1. In the simulation program, allele B appears as a single copy at the time where pB = 1/N according to the deterministic expression (13); this time is ~ –13.8 for the parameter values used in Figure 6. For values of T < –13.8, allele A enters the population before allele B and appears on a random background at the modifier locus. Then, at time –13.8, allele B enters the population on a random background. The program stops when both selected loci are fixed, and the change in frequency of the modifier is measured among those cases where B has reached fixation. For values of T > –13.8, allele B enters the population before allele A; again, the program runs until both loci are fixed, and the change in frequency of the modifier is measured only among cases where B is fixed.


Figure 6
View larger version (10K):
In this window
In a new window
Download PPT slide
 
FIGURE 6.—

Scaled selection gradient for the modifier (Formula 32) as a function of the time of appearance of allele A (T), conditional on the fixation of allele B. Curves: results obtained from Equation C23, with {rho}MA = 0.01 (solid line) and {rho}MA = 0.005 (dashed line). Dotted curve: result obtained from Equations 710 and 19 in OTTO and BARTON (1997), with {rho}MA = 0.01. Dots: simulation results (average over 6 x 107 replications), with {rho}MA = 0.01 (solid circles) and {rho}MA = 0.005 (open circles); error bars measure ±1.96 SE. In the simulations, allele B enters the population as a single copy (on a random background) at time T = –13.8. Other parameters: {theta} = 0.1, {rho}AB = 0.2, NsA = 104; in the simulations, sB = 0.1, {delta}r = 0.5, N = 106, and pM = 0.5. In the analytical model the splice point T* is set to 50.

 
Several aspects of Figure 6 deserve attention. First, the model does not provide a good prediction of the change in frequency of the modifier for very negative values of T, that is, when allele A appears long before allele B. This is due to the fact that the model assumes that, at a time T* after the sweep of allele B, allele A is either lost or established, but still at low frequency in the population. However, for very negative values of T, A may have reached a high frequency at time T* or even be fixed. For example, for T = –125, simulations indicate that the frequency of A at time T* = 50, given that A will fix, is ~0.99. In that case, the branching process approximation (independent replication of the different copies of A) has ceased to apply long before T* and probably even before the sweep of B. In this case, most of the change at the modifier locus occurs before T*, while the model assumes no change before T*. However, the model greatly overestimates the change after T*, because it overestimates Formula 32 at T* (this is confirmed by simulations), and also because Equation A7 assumes that allele A is still in low frequency at T*. Another potential problem is that the model allows pB > 0 before allele B actually appears in the population (that is, before time –13.8). However, this does not seem to cause problems when A appears at time –50, for example. In any case, it appears unlikely that these problems can be solved by simple adjustments of our model: indeed, the discrepancy occurs when the sweeps of A and B overlap, which requires using the full three-locus recursions for describing the deterministic phase. We are currently working on an alternative method to treat this case, as is discussed next.

A second important result shown by Figure 6 is that the curves giving the selection gradient on the modifier as a function of T are bimodal (this can also be seen in Figure 5A and is confirmed by the simulations). This is due to the fact that the modifier has two different effects on the spread of allele A, whose relative importance depends on the time of appearance of A relative to B. When allele A appears after or shortly before allele B, its probability of fixation is reduced by the sweep of B. By reducing this interference, the recombination modifier increases the fixation probability of A and then benefits from the sweep of A (by hitchhiking). However, for values of T < ~ –40, the sweep of allele B has very little effect on the probability of fixation of A (this can be seen, for example, in Figure 4 of OTTO and BARTON 1997): A is either lost or already established when B enters the population. In this case, the modifier does not affect the probability of fixation of A, but still has an effect in accelerating the sweep of A, by helping it to recombine onto the B background. This effect also generates an indirect selective force on the modifier and explains the leftmost maxima of the curves of Figure 6. Because the two peaks are separated by a time ~Formula 32 (which is the time needed for allele A to become established, in the absence of allele B), they will move apart as N increases (this is confirmed by simulations—not shown). It is possible in principle to calculate the advantage of the modifier via accelerating sweeps (left peak) using a deterministic calculation; indeed, one can relatively easily describe changes in genotype frequencies when A and B are rare and then solve for the change when they become common using eight coupled differential equations, describing evolution at the three loci. Such a calculation is presently being assembled (N. H. BARTON and D. ROZE, unpublished data).

Finally, Figure 6 shows that increasing the recombination rate {rho}MA can have different qualitative effects on the rate of increase of the modifier, depending on the time of appearance of A relative to B (this can also be seen by comparing Figure 5A and 5B). When A appears after or shortly before B, increasing {rho}MA decreases the rate of increase of the modifier, by decreasing the effect of hitchhiking. When A appears long before B, however, increasing {rho}MA has the opposite effect. Before the time when B enters the population, the only effect of recombination between loci (M, m) and (A, a) is to decrease the variance in linkage disequilibrium DMA. By using the method of BARTON and OTTO (2005), we found that the only effect of an initial variance in DMA is to reduce the genetic variance at the modifier locus, pMpm. Indeed, in the absence of selection at locus (B, b), the expected change in genetic variance at the modifier locus, over one generation, is given by

Formula 33(33)
This corresponds to the classical result that selection tends to reduce genetic diversity at linked loci. Because the expected change in frequency of the modifier, at any generation, is proportional to the genetic variance at the modifier locus (at this generation), the variance in DMA tends to reduce the rate of increase of the modifier. Increasing the recombination rate {rho}MA decreases the variance in DMA and thus decreases this effect.

Because the present method does not deal well with the case where allele A occurs a long time before allele B (as shown by Figure 6), it has to be coupled to another method describing this case better before we can use Equation 31 to obtain the net selection gradient for the modifier under recurrent sweeps (N. H. BARTON and D. ROZE, unpublished data). Nevertheless, we can still obtain some results for the case where {theta} = 1 (sA = sB). Indeed, simulations indicate that the method gives accurate predictions as long as A occurs after B (e.g., Figure 5). However, in the opposite case (when A occurs before B), and when sA = sB, we can simply swap A and B and continue to use the same method. However, to do so we need to incorporate the fact that allele B starts from frequency 1/N; we thus need to specify the values of N and sA, while Equation 31 depends only on the product NsA. We performed numerically the triple integration of Equation 31 for the case where sA = sB = 0.05 and obtained sM R2/{Lambda}2 {approx} 0.151, 0.126, and 0.102 for N = 105, N = 106, and N = 107, respectively. These results confirm the fact that the selection gradient roughly scales with Formula 33 for large Formula 33; indeed, Formula 33 gives 1.29, 1.36, and 1.34 for N = 105, 106, and 107, respectively. The values given above lead to exceedingly small selection gradients, using current estimates for {Lambda} and R for Drosophila and humans. From comparisons between D. simulans and D. yakuba, the rate of adaptive amino acid substitutions has been estimated at one every 450 generations (SMITH and EYRE-WALKER 2002). This estimate has since been decreased to one every 800 generations (BIERNE and EYRE-WALKER 2004). The total map length R is estimated to be ~3.6 for D. simulans (TRUE et al. 1996). For humans, the rate of adaptive amino acid substitution is estimated to be one every 200 years since the divergence with old-world monkeys (FAY et al. 2001), which, using a generation time of 25 years (EYRE-WALKER and KEIGHTLEY 1999) gives {Lambda} = Formula 33 per generation. The total map length is ~R = 34 for humans. In both cases, the ratio {Lambda}2/R2 is very small (of order 10–5 for humans and 10–7 for Drosophila). Combining the present method with the method of N. H. BARTON and D. ROZE (unpublished data) that studies the effect of recombination modifiers in accelerating sweeps and the method of BARTON and OTTO (2005) that deals with the effects of small departures from deterministic trajectories once alleles are common, we should be able to obtain estimates over a range of selection coefficients and population sizes. However, it appears unlikely that the selection gradient for recombination modifiers will be large, except during periods of frequent selective sweeps or in genomes with very tight linkage.


DISCUSSION
Apart from the idea that recombination brings a mechanistic advantage to individuals in terms of DNA repair (e.g., BERNSTEIN et al. 1988), the different hypotheses proposed to explain the maintenance of high rates of recombination in higher organisms state that recombination is advantageous because it reduces LD among selected loci (KONDRASHOV 1993; BARTON and CHARLESWORTH 1998; OTTO and LENORMAND 2002). Different possible sources of LD have been identified. First, LD may be generated by epistatic interactions among loci. In this case, increased recombination is selected when epistasis is weakly negative (FELDMAN et al. 1980; KONDRASHOV 1982, 1988; BARTON 1995a) and not too variable among pairs of loci (OTTO and FELDMAN 1997) or when epistasis fluctuates over short periods of time (CHARLESWORTH 1976; BARTON 1995a). However, experimental evidence does not indicate any strong trend for epistasis to follow this pattern, and it seems likely that the variance in epistasis is high in most cases (e.g., SEAGER and AYALA 1982; SEAGER et al. 1982; DE VISSER et al. 1996, 1997a,b; ELENA and LENSKI 1997; ELENA 1999; DE LA PEÑA et al. 2000; WLOCH et al. 2001). A second possible source of LD is environmental heterogeneity, in which case recombination is selected when directional selection at different loci covaries negatively between habitats (LENORMAND and OTTO 2000). Finally, LD may be generated by drift and directional selection: in a finite population, some negative disequilibrium does develop, on average, between selected loci, even in the absence of epistasis (HILL and ROBERTSON 1966; FELSENSTEIN 1974). By decreasing the effect of these interactions among loci, recombination increases the rate of fixation of advantageous alleles (BARTON 1995b). In this article, we presented a new method to quantify the strength of selection acting on a recombination modifier, due to this last effect.

Although the idea that sex and recombination increase the rate of adaptation has a long history in population genetics, theoretical models addressing the question of the effect of interference among loci on the evolution of recombination have been developed relatively late. OTTO and BARTON (1997) calculated the expected change in frequency of a modifier allele affecting the recombination rate between two selected loci in a finite population, using a method derived from branching processes. This model has led to important insights, but it does not always provide good quantitative estimates of the strength of selection at the modifier locus. This is due to the fact that interference among selected loci is neglected in part of the analysis: the linkage disequilibrium generated by the Hill–Robertson effect is taken into account in the calculation of probabilities of fixation of selected alleles, but not in the calculation of the hitch given by a selected allele to the recombination modifier, given that it fixes. In a more recent work, BARTON and OTTO (2005) used a different method to calculate the effect of linkage disequilibrium generated by drift on the change in frequency of a recombination modifier; however, their analysis assumes small departures from deterministic trajectories and thus does not apply when selected alleles start from a small number of copies (which is the case where drift will have the strongest effect, when population size is large).

In this article, we have presented a new method that overcomes these difficulties, in the case where population size is sufficiently large. The essence of the method lies in the calculation of the probability distribution of the number of copies of a new selected allele on different genetic backgrounds, once it is established but while it is still rare. Assuming a large population size, one can then neglect further stochastic fluctuations and express the change in frequency of the recombination modifier in terms of this probability distribution. Because the new allele remains rare during the establishment phase, the change in number of its copies on the different backgrounds can be described by a branching process, which generates a system of differential equations that can be solved numerically to obtain the Fourier transform of the probability distribution. This method fully takes into account the linkage disequilibrium between selected loci generated by the initial mutation event, by genetic drift during the establishment phase, and by the fact that the allele tends to be found more often in coupling with the other selected allele on the paths leading to its fixation than on the paths leading to its loss. We used this method to calculate the full probability distribution of the number of copies of a new selected allele when selection also occurs at a linked locus and found a good match with simulation results (Figure 1). To obtain an expression for the change in frequency of the recombination modifier, we coupled