## Abstract

Whether recombination decelerates or accelerates a population's response to selection depends, at least in part, on how fitness-determining loci interact. Realistically, all genomes likely contain fitness interactions both with positive and with negative epistasis. Therefore, it is crucial to determine the conditions under which the potentially beneficial effects of recombination with negative epistasis prevail over the detrimental effects of recombination with positive epistasis. Here, we examine the simultaneous effects of diverse epistatic interactions with different strengths and signs in a simplified model system with independent pairs of interacting loci and selection acting only on the haploid phase. We find that the average form of epistasis does not predict the average amount of linkage disequilibrium generated or the impact on a recombination modifier when compared to results using the entire distribution of epistatic effects and associated single-mutant effects. Moreover, we show that epistatic interactions of a given strength can produce very different effects, having the greatest impact when selection is weak. In summary, we observe that the evolution of recombination at mutation–selection balance might be driven by a small number of interactions with weak selection rather than by the average epistasis of all interactions. We illustrate this effect with an analysis of published data from *Saccharomyces cerevisiae*. Thus, to draw conclusions on the evolution of recombination from experimental data, it is necessary to consider the distribution of epistatic interactions together with the associated selection coefficients.

Purging deleterious mutations is one of the main explanations for the evolution of recombination. The menace of deleterious mutations is as ubiquitous as recombination itself: Every organism experiences at all times the opposing forces of selection and mutation. Selection drives populations to the peaks of the fitness landscape, maximizing the average fitness, while mutation spreads them out across the sequence space, generating deleterious mutations. Recombination potentially increases the efficiency of selection by combining good alleles with good genetic backgrounds and bad alleles with bad genetic backgrounds. This results in an indirect selection for high recombination rates, because subpopulations with high recombination rates will be more localized around the fitness peaks. However, the only effect of recombination is to weaken statistical associations between different loci. Therefore, recombination increases the chance of good alleles and good backgrounds occurring together only if the good alleles are predominantly found within bad backgrounds. Such a statistical association is termed a negative linkage disequilibrium.

Why should linkage disequilibria be negative? In this article we focus on statistical associations between loci due to fitness interactions. In a two-locus (*A*_{1}, *A*_{2})/two-allele (0, 1) model, fitness interactions can be measured by epistasis, which is given by

$$\varepsilon = w_{00}\,w_{11} - w_{01}\,w_{10},$$

where *w*_{xy} is the fitness of the haploid genotype with *A*_{1} = *x* and *A*_{2} = *y*. If epistasis is positive, the double mutant has a higher fitness than expected from the fitness of the single mutants (measured on a multiplicative scale). Conversely, if epistasis is negative the double mutant has a lower fitness than expected from the fitness of the single mutants. Therefore, as long as no other effects generate linkage disequilibria, the extreme genotypes are expected to be under- (over-) represented for negative (positive) epistasis. Indeed, in an infinite panmictic population at mutation–selection balance linkage disequilibrium always has the same sign as epistasis (Eshel and Feldman 1970). Thus, for two loci, negative epistasis leads to negative linkage disequilibria and hence recombination increases the efficiency of selection.

In two-locus/two-allele haploid models there is only one epistatic interaction. Thus the above reasoning is limited to cases where all epistatic interactions have the same sign. It is no surprise, however, that in real genomes both positive and negative epistatic interactions have been found (Mukai 1964; Kitagawa 1967; de Visser *et al.* 1997; Elena and Lenski 1997; West *et al.* 1998; Elena 1999; de la Peña *et al.* 2000; Peters and Keightley 2000; Wloch *et al.* 2001; Bonhoeffer *et al.* 2004). Although it is well understood how epistasis affects the evolution of recombination in a two-locus system, the effects of several epistatic interactions with different signs and strengths remain largely unexplored. Using an analytical approximation, Barton (1995) found that if the recombination modifier is tightly linked to loci under direct selection, a high recombination rate is favored whenever the mean epistasis is negative. Using the same method Otto and Feldman (1997) showed that, due to the recombinational load, variance in epistasis reduces the parameter space in which recombination is favored. However, due to the underlying assumptions these theoretical results are limited to cases with weak selection and weak epistasis.

In this article we examine how linkage disequilibrium and a recombination modifier behave in a multilocus system with a broad range of epistasis. We assume that genomes consist of independent pairs of loci, which is the simplest way to account for several different epistatic interactions. Furthermore, we assume that selection occurs only in the haploid phase. First we examine how the shape of the linkage-disequilibrium distribution relates to the shape of the underlying epistasis distribution. Possible incongruities between the epistasis and the linkage-disequilibrium distributions are crucial, because fitness interactions affect the evolution of recombination only via the linkage disequilibrium they produce. Next we determine the strength of the linkage disequilibrium and the selection on the recombination modifier produced by a single fitness interaction in a two-locus system. Considering single interactions is important for our original question because the effect of epistatic interactions in a two-locus system may indicate the strength of their effect in a multilocus system. In contrast to Otto and Feldman (1997), we assume small recombination rates both between the selected loci themselves and between the modifier and the selected loci. Last, we consider a model consisting of two independent pairs of loci and a recombination modifier. This is the simplest case in which more than one fitness interaction simultaneously affects the evolution of recombination, and it helps to clarify how the results based on two-locus systems can be extended to the multilocus case.

## METHODS

#### Model:

Our model describes the effect of recombination in an infinite population at mutation–selection balance. The state of the population is given by the frequencies *f*_{i}, where *i* is a haplotype of the genome. The genome consists of *L* biallelic loci, with the alleles at locus *k* denoted by *A*_{k} = 0 and *A*_{k} = 1. We assume discrete generations. In each generation there is selection, mutation, and recombination. For simplicity, we consider selection to occur only in the haploid phase. Therefore the frequencies after selection are given by

$$f_i^{s} = \frac{w_i}{\bar{w}}\, f_i,$$

where *w*_{i} is the fitness of the haplotype *i* and $\bar{w} = \sum_i w_i f_i$ is the mean fitness. Unless stated otherwise, we assume that backward mutations occur at the same rate as forward mutations. Thus, the frequencies after mutation are

$$f_i^{m} = \sum_j m^{h(i,j)}\,(1-m)^{L-h(i,j)}\, f_j^{s},$$

where *m* is the mutation rate and *h*(*i*, *j*) is the Hamming distance, which measures the number of loci at which haplotypes *i* and *j* differ. To simplify the recursion relations, we allow a maximum of one crossover event per replication. Between any pair of adjacent loci, crossover occurs with probability *r*/(*L* − 1), such that the total probability of a crossover is *r* (*i.e.*, crossover interference is complete). Assuming random mating, we obtain

$$f_i^{r} = \sum_{j,l} R_{i,j,l}\, f_j^{m} f_l^{m},$$

where the recombination coefficient *R*_{i,j,l} is nonzero provided that one can construct genotype *i* from genotypes *j* and *l* using one crossover and *R*_{i,j,l} = 0 in all other cases. We obtain the haplotype frequencies of successive generations by recursively equating *f*_{i} to $f_i^{r}$.

Locus *A*_{1} assumes the role of a modifier of recombination rate. That is, locus *A*_{1} has no impact on the fitness function *w*_{i} but affects recombination in the following manner: Recombination takes place with rate *r* = *r*_{0} when both genomes have allele 0 at the modifier locus, with rate *r* = *r*_{0} + δ*r* when only one has allele 1, and with rate *r* = *r*_{0} + 2δ*r* if both have allele 1. Otto and Feldman (1997) found that for loose linkage variance in epistasis inhibits the evolution of recombination. In contrast, we focus here mainly on tight linkage. In most cases we have chosen *r*_{0} = 0 and δ*r* very small, typically 0.01 or 0.001. Note that the crossing-over probability between the modifier locus *A*_{1} and its neighbor *A*_{2} is the same as between any other pair of adjacent loci. Furthermore, the mutation rate for the modifier locus is zero.

Simulating the model, we typically let the system run to steady state with the modifier switched off (*i.e.*, *A*_{1} = 0 for all genomes) and then change the modifier from allele 0 to allele 1 in half of the genomes. After this, we can distinguish between three consecutive phases (see Figure 1): a rather short perturbation phase caused by the abrupt change of a large number of modifier alleles, a linear phase in which the modifier frequency increases or decreases with a constant slope, and finally, when either the 0 or the 1 allele reaches fixation, a long saturation phase, where the steepness of the curve decreases continuously. We call the slope of the modifier frequency in the linear phase the strength of selection acting on the modifier. Similar results were obtained by calculating the leading eigenvalue of the local stability matrix using the *Mathematica* package of Otto and Feldman (1997).
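The life cycle and the simulation protocol described above can be sketched in a few lines of Python. This is a minimal illustration for one modifier locus plus a single pair of selected loci, not the code used for the published results. The function names (`build_model`, `recombine`, `generation`, `modifier_trajectory`) are hypothetical, and two details are assumptions: a crossover offspring takes the loci left of the breakpoint from the first parent and the rest from the second, and the slope of the linear phase is estimated by a least-squares fit over the second half of the recorded trajectory.

```python
import itertools
import numpy as np

def build_model(w_pair, m):
    """Haplotypes, fitnesses, and mutation matrix for a modifier plus one pair.

    Locus 0 is the neutral, non-mutating modifier; loci 1 and 2 are selected.
    w_pair maps the alleles (a1, a2) at the selected loci to a fitness.
    """
    haps = list(itertools.product((0, 1), repeat=3))
    w = np.array([w_pair[h[1:]] for h in haps])
    M = np.zeros((len(haps), len(haps)))
    for a, i in enumerate(haps):
        for b, j in enumerate(haps):
            if i[0] == j[0]:  # the modifier allele never mutates
                h = (i[1] != j[1]) + (i[2] != j[2])  # Hamming distance
                M[a, b] = m**h * (1 - m)**(2 - h)
    return haps, w, M

def recombine(f, haps, r0, dr):
    """Random mating with at most one crossover per replication."""
    n_loci = len(haps[0])
    index = {h: k for k, h in enumerate(haps)}
    out = np.zeros_like(f)
    for a, j in enumerate(haps):
        for b, l in enumerate(haps):
            pair = f[a] * f[b]
            r = r0 + dr * (j[0] + l[0])  # rate set by both modifier alleles
            out[a] += (1 - r) * pair     # no crossover: offspring copies j
            for cut in range(1, n_loci):  # each interval: r / (n_loci - 1)
                out[index[j[:cut] + l[cut:]]] += r / (n_loci - 1) * pair
    return out

def generation(f, haps, w, M, r0, dr):
    f_s = w * f / np.dot(w, f)  # selection in the haploid phase
    f_m = M @ f_s               # mutation at the selected loci
    return recombine(f_m, haps, r0, dr)

def modifier_trajectory(w_pair, m, r0, dr, burn_in=2000, steps=200):
    """Modifier-allele frequency after switching it on in half the genomes."""
    haps, w, M = build_model(w_pair, m)
    f = np.zeros(len(haps))
    f[0] = 1.0  # start from the wild type with the modifier switched off
    for _ in range(burn_in):
        f = generation(f, haps, w, M, r0, dr)
    # haplotypes 0-3 carry modifier allele 0, 4-7 allele 1, same backgrounds
    f = np.concatenate([0.5 * f[:4], 0.5 * f[:4]])
    freq = np.empty(steps)
    for t in range(steps):
        f = generation(f, haps, w, M, r0, dr)
        freq[t] = f[4:].sum()
    # fit the slope over the second half, skipping the perturbation phase
    t = np.arange(steps // 2, steps)
    slope = np.polyfit(t, freq[steps // 2:], 1)[0]
    return freq, slope
```

Calling `modifier_trajectory({(0, 0): 1.0, (1, 0): 0.9, (0, 1): 0.9, (1, 1): 0.8}, m=0.01, r0=0.0, dr=0.01)` returns the modifier frequency over time together with the fitted slope. With `dr=0` the modifier is neutral and its frequency stays at 0.5, which is a convenient sanity check.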

#### Fitness landscape:

To reduce complexity, we focus here on an idealized genome structure that decomposes into independent pairs of loci. This means that the genome has the form *A*_{1}*A*′_{1}*A*_{2}*A*′_{2}, … , *A*_{l}*A*′_{l}. The term independent refers to our assumption that there are no epistatic interactions between pairs of loci, save for the pairs of interacting loci *A*_{k}*A*′_{k} (*k* = 1, … , *l*), which are always adjacent in our simulations. Hence, the fitness of the whole genome can be written as the product of the fitness values of the pairs of interacting loci:

$$w(A_1A'_1 \ldots A_lA'_l) = \prod_{k=1}^{l} w^{(k)}(A_kA'_k).$$

Epistatic interactions are assumed to occur only within pairs of interacting loci, where epistasis is defined as

$$\varepsilon_k = w^{(k)}(00)\,w^{(k)}(11) - w^{(k)}(01)\,w^{(k)}(10).$$

The linkage disequilibrium between different pairs of interacting loci vanishes at mutation–selection balance, because there are no epistatic interactions between them. Thus the steady-state frequencies are given by

$$f(A_1A'_1 \ldots A_lA'_l) = \prod_{k=1}^{l} f^{(k)}(A_kA'_k),$$

where $f^{(k)}$ are the steady-state frequencies of the two-locus system with fitness function $w^{(k)}$ and mutation rate *m*. The only nonvanishing linkage disequilibria are those within pairs of interacting loci, $D(A_k, A'_k)$, which are the same as the linkage disequilibria in the two-locus systems given by $w^{(k)}$. Hence, if we are interested only in the distribution of linkage disequilibria, we can solve the steady-state equations for *l* two-locus systems instead of one *L*-locus system. This combines the simplicity of two-locus genomes with the possibility of multilocus genomes to account for the simultaneous presence of diverse epistatic interactions (Phillips *et al.* 2000).

## RESULTS

#### Distributions of epistasis and linkage disequilibrium:

Epistatic interactions affect the evolution of recombination by generating linkage disequilibria. Recombination reduces the magnitude of these linkage disequilibria, which, depending on the sign of linkage disequilibrium, can increase or decrease the population's response to selection. This process will have an overall beneficial effect only if the beneficial effects of reducing negative disequilibria dominate the detrimental effects of reducing positive linkage disequilibria. Which effect prevails depends on the linkage-disequilibrium distribution.

We examined how the form of this distribution relates to the form of the epistasis distribution in systems consisting of independent pairs of loci. To this end the fitness values of each pair were drawn from a given probability distribution. We considered different distributions, but with the constraint that they produced symmetric epistasis distributions with mean zero. Examples are given in Figure 2. Note that although the epistasis distributions are symmetric with mean zero (Figure 2, A and C), the corresponding distributions of linkage disequilibria are not symmetric and their mean differs significantly from zero (Figure 2, B and D).

In Figure 2B there are far more strongly negative than strongly positive linkage disequilibria, while the converse is true for Figure 2D. The main difference between the two cases is that in Figure 2, A and B, pairs with negative epistasis are associated with weak selection, while in Figure 2, C and D, this is the case for pairs with positive epistasis. Since the level of linkage disequilibrium produced by an epistatic interaction of a given strength depends inversely on the selection coefficient (see next section), this leads to skewed linkage-disequilibrium distributions.

What does this mean for recombination? Recombination increases (decreases) the response to selection whenever linkage disequilibrium is negative (positive). The magnitude of this change depends monotonically on the magnitude of the linkage disequilibrium. Hence, for the linkage disequilibrium shown in Figure 2B we expect that recombination increases the response to selection. Yet, for other fitness distributions we observed predominantly positive linkage disequilibria (see Figure 2D), in which case we expect recombination to decrease the response to selection. Thus one cannot conclude whether positive or negative linkage disequilibria have a greater impact in general. However, two important points should be noted: First, the shape of the linkage-disequilibrium distribution does not in general reflect the shape of the epistasis distribution. Second, this incongruity can crucially affect the predicted fate of a modifier for recombination.
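The incongruity between the two distributions can be reproduced with a small numerical sketch. The fitness values below are hypothetical and are not the distributions underlying Figure 2: every epistasis magnitude occurs once with each sign, so the epistasis distribution is symmetric with mean zero, but one sign is tied to weak selection on the single mutants. The helper `steady_state_D` assumes *r* = 0, *w*_{1} = 1, and no back mutation, and uses the closed-form steady state that follows from the life cycle (selection, then mutation) under those assumptions.

```python
import numpy as np

def steady_state_D(w2, w3, w4, m):
    """LD at mutation-selection balance for r = 0, no back mutation, w1 = 1.

    Requires w2, w3 < 1 - m and w4 < (1 - m)**2 so the wild type persists.
    """
    a = m / (1 - w2 - m)                                    # f2 / f1
    b = m / (1 - w3 - m)                                    # f3 / f1
    c = (m * (w2 * a + w3 * b) + m**2) / ((1 - m)**2 - w4)  # f4 / f1
    f1 = 1 / (1 + a + b + c)
    return f1**2 * (c - a * b)                              # f1*f4 - f2*f3

def mirrored_pairs(magnitudes, w_weak=0.98, w_strong=0.8, weak_sign=-1):
    """Hypothetical pairs (w2, w3, w4): each |eps| appears with both signs.

    Pairs whose epistasis has sign `weak_sign` get weakly selected single
    mutants (w_weak); the mirrored pairs get strongly selected ones.
    """
    pairs = []
    for e in magnitudes:
        for sign in (-1, 1):
            ws = w_weak if sign == weak_sign else w_strong
            pairs.append((ws, ws, ws**2 + sign * e))
    return pairs

m = 1e-4  # assumed mutation rate
pairs = mirrored_pairs(np.linspace(0.001, 0.01, 10))
eps = np.array([w4 - w2 * w3 for w2, w3, w4 in pairs])
D = np.array([steady_state_D(w2, w3, w4, m) for w2, w3, w4 in pairs])
print("mean epistasis:", eps.mean())
print("mean linkage disequilibrium:", D.mean())
```

With negative epistasis tied to weak selection (`weak_sign=-1`), the mean linkage disequilibrium is negative even though the mean epistasis is zero; setting `weak_sign=+1` reverses the skew, mirroring the qualitative contrast between Figure 2B and Figure 2D.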

#### Effects of epistasis in two-locus systems:

To interpret the above result, we examined the effects of epistasis in a two-locus/two-allele system. First, we calculated the strength of the linkage disequilibrium from a single epistatic interaction at mutation–selection balance in the absence of recombination (*i.e.*, *r* = 0). The genotype frequencies at mutation–selection balance are given by the normalized leading eigenvector of the mutation–selection matrix, whose entries are

$$M_{ij} = m^{h(i,j)}\,(1-m)^{2-h(i,j)}\, w_j .$$

The corresponding linkage disequilibrium is a complicated expression of the fitnesses *w*_{i} and the mutation rate *m*, but we can obtain a simple approximation by neglecting back mutations.

Assuming furthermore that the strength of selection on both single and double mutants is considerably larger than the mutation rate [*i.e.*, 1 − *w*_{i} ≫ *m* for *i* = 2, 3, 4], we obtain the expression

$$D \approx \frac{m^{2}\,\varepsilon}{(1-w_2)(1-w_3)(1-w_4)},$$

with the epistasis ε = *w*_{4} − *w*_{2}*w*_{3}.
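The quality of this approximation can be checked numerically. The sketch below builds the two-locus mutation–selection matrix with symmetric mutation, takes its leading eigenvector as the equilibrium, and compares the resulting linkage disequilibrium with the weak-mutation formula *D* ≈ *m*²ε/[(1 − *w*_{2})(1 − *w*_{3})(1 − *w*_{4})]. The function names and the genotype ordering (00, 10, 01, 11) are choices made for this illustration.

```python
import numpy as np

GENOTYPES = [(0, 0), (1, 0), (0, 1), (1, 1)]  # order matches w = (w1, w2, w3, w4)

def mutation_selection_matrix(w, m):
    """Entry (i, j): selection on genotype j followed by mutation j -> i."""
    A = np.empty((4, 4))
    for i, gi in enumerate(GENOTYPES):
        for j, gj in enumerate(GENOTYPES):
            h = (gi[0] != gj[0]) + (gi[1] != gj[1])  # Hamming distance
            A[i, j] = m**h * (1 - m)**(2 - h) * w[j]
    return A

def equilibrium_frequencies(w, m):
    """Normalized eigenvector belonging to the largest eigenvalue."""
    vals, vecs = np.linalg.eig(mutation_selection_matrix(w, m))
    v = np.real(vecs[:, np.argmax(np.real(vals))])
    return v / v.sum()

def exact_D(w, m):
    f = equilibrium_frequencies(w, m)
    return f[0] * f[3] - f[1] * f[2]

def approx_D(w, m):
    eps = w[0] * w[3] - w[1] * w[2]  # epistasis on a multiplicative scale
    return m**2 * eps / ((1 - w[1]) * (1 - w[2]) * (1 - w[3]))
```

For fitnesses such as `w = np.array([1.0, 0.9, 0.9, 0.7])` and `m = 1e-4`, where selection is much stronger than mutation, `exact_D(w, m)` and `approx_D(w, m)` should agree closely; the agreement is expected to deteriorate as any 1 − *w*_{i} approaches *m*.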

To investigate the effect of recombination, we consider an approximation of the steady-state equations for small recombination rates *r* (see appendix). From this we obtain expressions for the change of linkage disequilibrium, Δ*D*, and for the change of mean fitness, $\Delta\bar{w}$, between the steady states with recombination rates *r* and 0, respectively. These expressions show that *D*, Δ*D*, and $\Delta\bar{w}$ are proportional to the epistasis ε and inversely proportional to the squared strength of selection on the single mutant (1 − *w*_{2}) and the strength of selection on the double mutant (1 − *w*_{4}). As expected, epistasis does have an effect on the resulting linkage disequilibrium. However, the strength of this effect is weighted by the strength of selection. In particular, epistasis has the strongest effect on linkage disequilibrium when selection on either the single or the double mutant becomes weak (*i.e.*, if *w*_{2} or *w*_{4} is close to 1).
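A direct numerical check of the small-*r* behavior is to iterate the two-locus recursion to its steady state with and without recombination. The sketch below assumes no back mutation, *w*_{1} = 1, and frequencies censused after recombination; the function names are hypothetical. Under these assumptions the steady-state equation for the wild type suggests $\Delta\bar{w} \approx -rD$ to lowest order in *m* and *r*, which the code can be used to probe.

```python
import numpy as np

def step(f, w, m, r):
    """One generation: selection, one-way mutation (0 -> 1), recombination."""
    f1, f2, f3, f4 = w * f / np.dot(w, f)   # frequencies after selection
    fm = np.array([
        (1 - m)**2 * f1,
        (1 - m) * (f2 + m * f1),
        (1 - m) * (f3 + m * f1),
        f4 + m * (f2 + f3) + m**2 * f1,
    ])
    D = fm[0] * fm[3] - fm[1] * fm[2]       # LD before recombination
    return fm + r * D * np.array([-1.0, 1.0, 1.0, -1.0])

def steady_state(w, m, r, generations=3000):
    f = np.array([1.0, 0.0, 0.0, 0.0])      # start from the wild type
    for _ in range(generations):
        f = step(f, w, m, r)
    return f

def summary(w, m, r):
    """Linkage disequilibrium and mean fitness at the steady state."""
    f = steady_state(w, m, r)
    return f[0] * f[3] - f[1] * f[2], float(np.dot(w, f))
```

Comparing `summary(w, m, 0.0)` with `summary(w, m, r)` for a pair with negative epistasis, for example `w = np.array([1.0, 0.9, 0.9, 0.7])`, `m = 1e-4`, and `r = 0.01`, shows a smaller magnitude of linkage disequilibrium and a slightly higher mean fitness with recombination.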

Importantly, epistasis of the same level can induce very different absolute levels of linkage disequilibria. This is the reason why one can obtain a strongly skewed distribution of linkage disequilibria from a symmetric distribution of epistatic interactions, as observed in Figure 2. In Figure 2, A and B, negative epistatic interactions induce on average stronger disequilibria than positive epistatic interactions of the same absolute magnitude. As can be seen in Figure 3A the above conclusions for the linkage disequilibrium *D* also hold if we allow for back mutations (the corresponding plots for Δ*D* and $\Delta\bar{w}$ look similar).

Considering only fitnesses leading to large linkage disequilibria, we can distinguish three regions in the parameter space. In region 1 selection is strong on the single mutants but weak on the double mutant (*i.e.*, *w*_{4} is close to 1 while *w*_{2} and *w*_{3} are not). In region 2 selection is weak on the single mutants but strong on the double mutant (*i.e.*, *w*_{2} and *w*_{3} are close to 1 while *w*_{4} is not). Finally, in region 3 selection is weak on both single and double mutants (*i.e.*, *w*_{2}, *w*_{3}, and *w*_{4} are all close to 1). We refer to these regions as compensatory mutations, synthetic lethals, and overall weak selection, respectively, although these terms have been used in the literature without this strict quantitative definition. In the remaining region the linkage disequilibrium is several orders of magnitude smaller. For a genome consisting of independent pairs of loci this implies that the bulk of epistatic interactions leads to a vanishingly small linkage disequilibrium and is therefore irrelevant for the evolution of recombination. Instead, relatively few epistatic interactions may dominate the behavior of a recombination modifier.

Next we go one step further and determine the relation between epistasis and strength of selection acting on a modifier of recombination. Because we are unable to derive analytical expressions for the strength of selection on the modifier for the entire parameter space, we show numerical simulations in Figure 3B. Figure 3B shows that the conclusions are qualitatively the same for the strength of selection as for linkage disequilibrium. In particular, synthetic lethals, compensatory mutations, and overall weak selection exert a far stronger (indirect) selection on the modifier than all other fitness values. Qualitatively, these results also hold for loose linkage between selected loci, but the sensitivity of the modifier to the strength of selection on individual loci decreases rapidly as the linkage between the selected loci decreases (see Figure 4).

Two points are worth further consideration. First, although selection on the modifier is positive for negative epistasis that is not too strong relative to selection on the single mutant, it becomes negative as the selection on the single mutant vanishes (see Figure 5). This is in accordance with previous theory (see, for example, Equation 1, box 1 in Otto and Lenormand 2002).

Second, we always consider systems at mutation–selection equilibrium. This approach makes sense only if we can assume that this state is reached within a reasonable time, *i.e.*, on a timescale that is considerably shorter than the timescale of changes in the fitness landscape. If this is not the case, we expect adaptive and purging dynamics to interfere. Therefore the above selection coefficients would most probably overestimate the impact of deleterious mutations. Figure 6 shows the time it takes to reach 90% of the steady-state level of the linkage disequilibrium (starting from a homogeneous wild-type population) for different fitness values. As expected, the weaker the selection, the longer the equilibration time. Hence, the linkage disequilibrium takes a long time to establish in those regions of the fitness landscape where the strength of selection on the modifier is largest.
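The dependence of the equilibration time on the strength of selection can be illustrated with a short iteration. The sketch below assumes *r* = 0 and symmetric mutation, starts from a homogeneous wild-type population, and reports the first generation at which the linkage disequilibrium reaches 90% of its final value. The function names are hypothetical, and the value after a long run is used as a stand-in for the steady state.

```python
import numpy as np

GENOTYPES = [(0, 0), (1, 0), (0, 1), (1, 1)]  # order matches w = (w1, w2, w3, w4)

def mutation_matrix(m):
    """Symmetric two-locus mutation matrix built from Hamming distances."""
    M = np.empty((4, 4))
    for i, gi in enumerate(GENOTYPES):
        for j, gj in enumerate(GENOTYPES):
            h = (gi[0] != gj[0]) + (gi[1] != gj[1])
            M[i, j] = m**h * (1 - m)**(2 - h)
    return M

def ld_trajectory(w, m, generations):
    """Linkage disequilibrium over time, starting from the wild type (r = 0)."""
    M = mutation_matrix(m)
    f = np.array([1.0, 0.0, 0.0, 0.0])
    D = np.empty(generations)
    for t in range(generations):
        f = M @ (w * f / np.dot(w, f))   # selection, then mutation
        D[t] = f[0] * f[3] - f[1] * f[2]
    return D

def equilibration_time(w, m, generations=20000, fraction=0.9):
    """First generation at which D reaches `fraction` of its final value."""
    D = ld_trajectory(w, m, generations)
    reached = np.nonzero(D / D[-1] >= fraction)[0]
    return int(reached[0]) + 1
```

For example, `equilibration_time(np.array([1.0, 0.99, 0.99, 0.97]), 1e-4)` can be compared with `equilibration_time(np.array([1.0, 0.8, 0.8, 0.6]), 1e-4)`; the weakly selected pair takes considerably longer to approach its steady state.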

#### Impact of two fitness interactions on a recombination modifier:

Last, we discuss five-locus systems (that is, four selected loci plus one modifier locus) to investigate how the above results for the modifier extend to systems with more than one fitness interaction. We considered two independent pairs of loci, with fitnesses *w*_{1} = 1, *w*_{2} = *w*_{3}, *w*_{4} for the first pair and analogously parameterized fitnesses for the second pair. We compare this five-locus system with two three-locus systems that we obtain by linking a recombination modifier to each of the two pairs. In our simulations we varied the fitness values over the parameter space in steps of 0.02 and found that in all cases the strength of selection on a modifier linked to two pairs is given to a good approximation by the sum of the selection coefficients acting on modifiers linked to each of the pairs separately, in agreement with the analytical result for selection on the modifier presented in Barton (1995). For the level of linkage used in Figure 3 the relative error was always <1%.

This means that, at least for two independent pairs, adding up selection coefficients is a reasonable way of bookkeeping interactions between more than two loci. Thus, it is the distribution of the two-locus selection coefficients on the modifier rather than the distribution of epistasis that determines the fate of the recombination modifier (in other words, the key question is where in Figure 3 the interacting loci lie). Therefore, except for the special case where the epistasis distribution does not depend on the strength of selection (*i.e.*, the same epistasis is measured for every selection coefficient), the epistasis distribution can lead to erroneous conclusions about the fate of a recombination modifier. To illustrate this, consider the following extreme example: A genome is made up of a modifier and two independent pairs. The first pair is a synthetic lethal (*w*_{1} = 1, *w*_{2} = *w*_{3} = 0.99, *w*_{4} = 0.9). The second pair has variable positive epistasis and strong selection on both the single and the double mutant. From the position of the pairs in Figure 3 one expects that the negatively interacting pair (the synthetic lethal) exerts a much stronger influence than the positively interacting pair, such that a high recombination rate is selected regardless of the average epistasis. Figure 7 shows the selection coefficient on a modifier as a function of the epistasis of the second pair, ε_{2}. As expected, the selection coefficient on the modifier is positive (favoring recombination) even if the positive epistasis of the second pair is considerably stronger than the negative epistasis of the first pair (ε_{1} = 0.9 − 0.99^{2} = −0.0801, represented by the dashed line). The modifier is selected against only if the positive epistasis ε_{2} is so large that selection on the double mutant becomes weak. Note that we could have also chosen an example where there is selection against recombination although the average epistasis is negative.
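The asymmetry in this example can be made concrete with a few lines of arithmetic. The sketch below uses the weak-mutation approximation *D* ≈ *m*²ε/[(1 − *w*_{2})(1 − *w*_{3})(1 − *w*_{4})] as a rough proxy for the weight of each pair; it does not compute the selection coefficient on the modifier itself. The mutation rate and the fitness values of the second pair are hypothetical, chosen only so that its positive epistasis exceeds the magnitude of ε_{1}.

```python
def epistasis(w2, w3, w4):
    """Epistasis on a multiplicative scale with w1 = 1."""
    return w4 - w2 * w3

def approx_D(w2, w3, w4, m):
    """Weak-mutation approximation of the linkage disequilibrium (r = 0)."""
    return m**2 * epistasis(w2, w3, w4) / ((1 - w2) * (1 - w3) * (1 - w4))

m = 1e-4                     # assumed mutation rate
lethal = (0.99, 0.99, 0.9)   # synthetic lethal pair from the text
strong = (0.5, 0.5, 0.45)    # hypothetical strongly selected pair

eps_1 = epistasis(*lethal)
eps_2 = epistasis(*strong)
D_1 = approx_D(*lethal, m)
D_2 = approx_D(*strong, m)
print("epistasis:", eps_1, eps_2)
print("approximate linkage disequilibria:", D_1, D_2)
```

Although the positive epistasis of the hypothetical second pair is larger in magnitude than ε_{1}, the approximate linkage disequilibrium generated by the synthetic lethal is larger by several orders of magnitude, in line with the expectation that the weakly selected pair dominates.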

#### Epistatic effects in *Saccharomyces cerevisiae*:

Our models suggest that the level of recombination might be governed by a small number of epistatic interactions and that considering epistatic interactions independently of the corresponding levels of selection can lead to erroneous conclusions. Whether this caveat is relevant for the interpretation of natural systems depends mainly on the covariance between single- and double-mutant fitnesses.

As a preliminary examination of this question, we used data on epistatic interactions among mutants in *S. cerevisiae* from Szafraniec *et al.* (2003). The growth rate estimates were converted to fitness estimates of single and double mutants (Figure 8A; data kindly provided by R. Korona); we restricted our attention to the 33 pairs of mutants with fitnesses less than that of the wild type. We ignored measurement error and assumed that the heterozygous mutant fitnesses could be used as proxies for haploid mutant fitnesses. For each pair of mutants, the selection induced on a modifier of recombination was calculated numerically from the *Mathematica* package of Otto and Feldman (1997). Figure 8B illustrates the results, assuming a recombination rate between loci of 0.1. A rare modifier that increases recombination by 10% is selected against according to most pairs of loci. Most importantly, selection against recombination was 16.9 times stronger when the fitness effects of each pair of mutants were considered separately and then averaged (−2.0 × 10^{−12}) than when the average fitnesses were used (−1.2 × 10^{−13}). Similar results were obtained when recombination rates of 0.0001, 0.01, or 0.4 were used or when the mutant fitness effects were magnified by a factor of 50. This result demonstrates the importance of considering distributions of fitness effects; relying on average fitness effects generated results that were incorrect by more than an order of magnitude.

## DISCUSSION

Recombination has the potential to increase genotypic variation at mutation–selection balance and hence the population's resistance against mutational decay. However, this effect takes place only under a restricted set of conditions. For a two-locus system, linkage disequilibrium must be negative. A negative linkage disequilibrium in turn is induced by negative epistasis (in the absence of other factors such as drift). Thus, in a two-locus system and an infinite population, recombination increases the response to selection whenever epistasis is negative. Things are more complicated for multilocus systems, particularly because both negative and positive epistases can be present simultaneously. In this article we investigated the effects of many epistatic interactions on the evolution of recombination.

We found that in general the distribution of linkage disequilibria does not reflect the distribution of epistasis. In particular, a symmetric or even positively skewed epistasis distribution can produce a clearly negatively skewed linkage-disequilibrium distribution. Because it is linkage disequilibrium and not epistasis on which recombination acts, knowledge of epistasis alone is insufficient for drawing conclusions about the nature of selection on a recombination modifier.

The incongruity between the linkage-disequilibrium distribution and the epistasis distribution is due to the fact that epistasis does not fully determine linkage disequilibrium. To understand this, we determined what linkage disequilibria and selection coefficients on the modifier can result from a given epistasis in a two-locus system. We found that epistatic interactions of the same strength can lead to very different linkage disequilibria and selection strengths on the modifier. In particular, interactions with weak selection on either single mutant or on the double mutant showed much stronger effects than interactions with strong selection on all mutants. This suggests that in a multilocus system the behavior of a modifier of recombination may be dominated by a small number of interactions, whereas the majority of interactions are irrelevant. In contrast to this disproportionate importance of weakly selected pairs, we found that the time a system requires to reach mutation–selection balance increases steeply as selection diminishes. This could imply that the system is governed by interactions with weak but not too weak selection.

To apply the above findings to systems containing more than one fitness interaction, we considered the simplest such model, *i.e.*, two independent pairs linked to a modifier. We found that the selection coefficient on the modifier was given approximately by the sum of the selection coefficients for each pair if they were considered separately. Thus, in contrast to epistasis, averaging selection coefficients should provide useful information about the fate of a modifier. In particular, this result implies that in systems containing two interactions, a modifier that increases recombination can be favored even if the average epistasis is positive.

Our results indicate that using the average level of epistasis can generate highly misleading predictions about the overall evolutionary force acting on a modifier of recombination. By summing the selection induced on a modifier of recombination over all pairs of interacting loci, we can generate more accurate predictions from empirical data on the fitness effects of single and double mutants. Whether these more accurate predictions will act in favor of or against the evolution of recombination is an open empirical question, as it depends on the covariance between single- and double-mutant effects. Using estimates of the selection coefficients from *S. cerevisiae* confirmed that averaging epistatic effects results in inaccurate estimates of the selection on a modifier of recombination. In this example, a single pair of weakly selected loci dominated all other fitness interactions. The strong influence of pairs of weakly selected loci has an additional important consequence for the interpretation of experimental data: As can be seen in Figures 3 and 5 the selection coefficient on the modifier depends very sensitively on *w*_{2,3} if selection is weak. This implies that small measurement errors in the fitness values of these weakly selected loci can have a strong impact on the estimated selection on the recombination modifier.

A large number of studies have attempted to draw conclusions on the evolution of recombination based on the distribution of epistasis (West *et al.* 1998; Elena 1999; de la Peña *et al.* 2000; Peters and Keightley 2000; Bonhoeffer *et al.* 2004). However, we have shown that to interpret experimental data in the context of the evolution of recombination, it is necessary to consider not only the distribution of epistatic interactions, but also the associated selection coefficients.

## APPENDIX: EFFECTS OF A SMALL RECOMBINATION RATE

For a two-locus/two-allele system with fitnesses (*w*_{1} = 1, *w*_{2}, *w*_{3}, *w*_{4}), recombination rate *r*, mutation rate *m*, and no back mutation, the recursion relations follow from the life cycle described in the *Model* section. We obtain the steady-state equations by setting the frequencies of successive generations equal in the recursion relations.

For *r* = 0 the steady-state equations are simple to solve (see *Effects of epistasis in two-locus systems*) and yield the steady-state frequencies *f*^{0} (for *m* < 1 − *w*_{2}).

For *r* ≠ 0 the steady-state frequencies take the form

$$f = f^{0} + r\,x,$$

with *x* = *O*[1]. For small *r*, we insert this expression in the steady-state equations and keep only terms linear in *r* (there are no terms independent of *r*, because *f*^{0} solves the steady-state equations for *r* = 0). Because *x* occurs in the equations only as *x* × *r*, this results in a linear equation for *x*, which can easily be solved.

The change in linkage disequilibrium, Δ*D*, and the change in mean fitness, $\Delta\bar{w}$, then follow to first order in *r* from *f*^{0} and *x*. If we insert the solution for *x*, Δ*D* and $\Delta\bar{w}$ become complicated functions of *w*_{i} and *m*. However, by keeping only lowest-order terms in *m*, we obtain the expressions for Δ*D* and $\Delta\bar{w}$ referred to in *Effects of epistasis in two-locus systems*.
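The steps of this expansion can be sketched as follows. This is a reconstruction under stated assumptions, namely that the life cycle is selection, mutation, and recombination, that frequencies are censused after recombination, and that *r* ≪ 1 − *w*_{4}; the resulting expressions may differ in form from those referred to above. The symbol $D^{m}$, the linkage disequilibrium after mutation and before recombination, is introduced only for this sketch.

$$
\begin{aligned}
&\text{Steady state for } r = 0:\quad
\bar{w}^{0} = (1-m)^{2},\qquad
f_2^{0} = \frac{m f_1^{0}}{1 - w_2 - m},\qquad
f_3^{0} = \frac{m f_1^{0}}{1 - w_3 - m},\\
&\qquad f_4^{0} = \frac{m\,(w_2 f_2^{0} + w_3 f_3^{0}) + m^{2} f_1^{0}}{(1-m)^{2} - w_4},
\qquad f_1^{0} + f_2^{0} + f_3^{0} + f_4^{0} = 1,\\[4pt]
&\text{so that, to lowest order in } m:\quad
D^{0} = f_1^{0} f_4^{0} - f_2^{0} f_3^{0} \approx \frac{m^{2}\,\varepsilon}{(1-w_2)(1-w_3)(1-w_4)},
\qquad \varepsilon = w_4 - w_2 w_3 .\\[4pt]
&\text{Wild type for } r \neq 0:\quad
f_1 = \frac{(1-m)^{2}}{\bar{w}}\, f_1 - r D^{m}
\;\Longrightarrow\;
\bar{w} = \frac{(1-m)^{2} f_1}{f_1 + r D^{m}}
\;\Longrightarrow\;
\Delta\bar{w} \approx -\,r\,D^{0} .\\[4pt]
&\text{Double mutant, to order } m^{2}:\quad
f_4\,(1 - w_4) \approx m\,(w_2 f_2 + w_3 f_3) + m^{2} - r D^{m},
\qquad D^{m} = \frac{D}{1-r}\\
&\qquad\Longrightarrow\;
D \approx D^{0}\,\frac{1 - w_4}{(1 - w_4) + r/(1-r)}
\;\Longrightarrow\;
\Delta D \approx -\,\frac{r\,D^{0}}{1 - w_4} .
\end{aligned}
$$

In this sketch both Δ*D* and $\Delta\bar{w}$ are proportional to ε and grow as selection on the single or the double mutant weakens, and recombination raises the equilibrium mean fitness exactly when the linkage disequilibrium is negative.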

## Acknowledgments

We thank Lucy Crooks, Marcel Salathe, and Martin Ackermann for helpful discussions. R.D.K. and S.B. gratefully acknowledge support from the Swiss National Foundation.

## Footnotes

Communicating editor: H. G. Spencer

- Received November 4, 2005.
- Accepted March 13, 2006.

- Copyright © 2006 by the Genetics Society of America