## Abstract

GC-biased gene conversion (gBGC) is a recombination-associated process mimicking selection in favor of G and C alleles. It is increasingly recognized as a widespread force in shaping the genomic nucleotide landscape. In recombination hotspots, gBGC can lead to bursts of fixation of GC nucleotides and to accelerated nucleotide substitution rates. It was recently shown that these episodes of strong gBGC could give spurious signatures of adaptation and/or relaxed selection. There is also evidence that gBGC could drive the fixation of deleterious amino acid mutations in some primate genes. This raises the question of the potential fitness effects of gBGC. While gBGC has been metaphorically termed the “Achilles' heel” of our genome, we do not know whether interference between gBGC and selection merely has practical consequences for the analysis of sequence data or whether it has broader fundamental implications for individuals and populations. I developed a population genetics model to predict the consequences of gBGC on the mutation load and inbreeding depression. I also used estimates available for humans to quantitatively evaluate the fitness impact of gBGC. Surprising features emerged from this model: (i) Contrary to classical mutation load models, gBGC generates a fixation load independent of population size and could contribute to a significant part of the load; (ii) gBGC can maintain recessive deleterious mutations for a long time at intermediate frequency, in a similar way to overdominance, and these mutations generate high inbreeding depression, even if they are slightly deleterious; (iii) since mating systems affect both the selection efficacy and gBGC intensity, gBGC challenges classical predictions concerning the interaction between mating systems and deleterious mutations, and gBGC could constitute an additional cost of outcrossing; and (iv) if mutations are biased toward A and T alleles, very low gBGC levels can reduce the load. 
A robust prediction is that the gBGC level minimizing the load depends only on the mutational bias and population size. These surprising results suggest that gBGC may have nonnegligible fitness consequences and could play a significant role in the evolution of genetic systems. They also shed light on the evolution of gBGC itself.

GC-biased gene conversion (gBGC) is increasingly recognized as a widespread force in shaping genome evolution. In different species, gene conversion occurring during double-strand break recombination repair is thought to be biased toward G and C alleles. In heterozygotes, GC alleles undergo a kind of molecular meiotic drive that mimics selection (reviewed in Marais 2003). This process can rapidly increase the GC content, especially around recombination hotspots (Spencer *et al*. 2006), and, more broadly, can affect genome-wide nucleotide landscapes (Duret and Galtier 2009a). For instance, it is thought to play a role in shaping isochore structure evolution in mammals (Galtier *et al*. 2001; Meunier and Duret 2004; Duret *et al*. 2006) and birds (Webster *et al*. 2006). Direct experimental evidence of gBGC mainly comes from studies in yeast (Birdsell 2002; Mancera *et al*. 2008; but see Marsolier-Kergoat and Yeramian 2009) and humans (Brown and Jiricny 1987). However, associations between recombination and the nucleotide landscape and frequency spectra biased toward GC alleles provide indirect evidence in very diverse organisms (Table 1).

The impact of gBGC on noncoding sequences and synonymous sites has been studied in depth, especially because of confounding effects with selection on codon usage (Marais *et al*. 2001). More recently, Galtier and Duret (2007) pointed out that gBGC may also interfere with selection when affecting functional sequences. They argued that gBGC could leave spurious signatures of adaptive selection and proposed to extend the null hypothesis of molecular evolution. Indeed, gBGC can lead to a ratio of nonsynonymous (*d*_{N}) over synonymous (*d*_{S}) substitutions above one (Berglund *et al*. 2009; Galtier *et al*. 2009), *i.e.*, a typical signature of positive selection (Nielsen 2005). This hypothesis has been widely debated for human-accelerated regions (HARs). These regions are extremely conserved across mammals but show evidence of accelerated evolution along the human lineage, which has been interpreted as evidence of positive selection (Pollard *et al*. 2006a,b; Prabhakar *et al*. 2006, 2008). On the contrary, other authors argued that patterns observed in HARs, such as the AT → GC substitution bias, the absence of a selective sweep signature, or the propensity to occur within or close to recombination hotspots, are more likely explained by gBGC rather than positive selection (Galtier and Duret 2007; Berglund *et al*. 2009; Duret and Galtier 2009b; but see also Pollard *et al*. 2006a who also suggested that gBGC might play a role in HARs evolution). It is thus crucial to take gBGC into account when interpreting genomic data.

Moreover, Galtier and Duret (2007) initially suggested that gBGC hotspots could contribute to the fixation of slightly deleterious AT → GC mutations and could represent the Achilles' heel of our genome. This hypothesis was reinforced later in primates, with evidence of gBGC-driven fixation of deleterious mutations in proteins (Galtier *et al*. 2009). A similar result was also found in some grass species, whose genomes are also supposed to be affected by gBGC (Table 1; Glémin *et al*. 2006). Haudry *et al*. (2008) compared two outcrossing and two selfing grass species and showed that GC-biased genes exhibit higher *d*_{N}/*d*_{S} ratio in outcrossing than in selfing lineages. The reverse pattern would be expected under pure selective models because of the reduced selection efficacy in selfers (Charlesworth 1992; Glémin 2007). This pattern is in agreement with a genomic Achilles' heel associated with outcrossing, while gBGC is inefficient in selfing species because they are mainly homozygous.

Twenty years ago, Bengtsson (1990) already pointed out that biased conversion can generally affect the mutation load. The mutation load is the reduction in the mean fitness of a population due to mutation accumulation, which could lead to population extinction if it is too high (Lynch *et al*. 1995). At that time, Bengtsson concluded that “it is impossible to know if biased conversion plays a major role in determining the magnitude of the mutation load in organisms such as ourselves, but the possibility must be considered and further investigated” (Bengtsson 1990, p. 186). One can now propose that gBGC is such a widespread biased conversion process. It thus appears timely to thoroughly investigate the fitness consequences of gBGC through its potential effects on the dynamics of deleterious mutations. The fitness consequences of gBGC were also pointed out as a major future issue to be addressed by Duret and Galtier (2009a). In addition to the load, deleterious mutations have many other evolutionary consequences (for review see Charlesworth and Charlesworth 1998). They are thought to be the main determinant of inbreeding depression, *i.e.*, the reduction in fitness of inbred individuals compared to outbred ones. They also play a key role in the evolution of genetic systems (sexual reproduction and recombination, inbreeding avoidance mechanisms, ploidy cycles), of senescence, or in the degeneration of nonrecombining regions, such as Y chromosomes. So far, we know little, if anything, about how gBGC might affect these processes.

In his seminal work, Bengtsson (1990) did not address several important points. First, he did not include genetic drift in his model. Nearly neutral mutations, for which drift and selection are of similar intensities, are the most damaging ones because they can drift to fixation, unlike strongly deleterious mutations that are maintained at low frequency (Crow 1993; Lande 1994, 1998). While gBGC intensities are rather weak (Birdsell 2002; Spencer *et al*. 2006), they could markedly affect the fate of nearly neutral mutations (see also Galtier *et al*. 2009). Second, Bengtsson did not study the effect of gene conversion on inbreeding depression, while he showed that recessive mutations, mostly involved in inbreeding depression, are the most affected by gene conversion. Third, he did not envisage systematic GC bias with its opposite effects on A/T and G/C deleterious alleles. Fourth, while he noted that selfing affects both the efficacy of selection and that of conversion, he did not fully investigate the effect of mating systems. On one hand, selfing is efficient in purging strongly deleterious mutations causing inbreeding depression. However, since selfing is expected to increase drift, weakly deleterious mutations can fix in selfing species, contributing to the so-called “drift load” (Charlesworth 1992; Glémin 2007). Self-fertilizing populations are thus expected to exhibit low inbreeding depression and high drift load. On the other hand, gBGC, and thus its cost, vanishes as the selfing rate and homozygosity increase (Marais *et al*. 2004). gBGC could thus challenge classical views on mating systems and it was even speculated that gBGC could affect their evolution (Haudry *et al*. 2008).

Here I present a population genetics model that includes mutation, selection, drift, and gBGC, which extends previous studies (Gutz and Leslie 1976; Lamb and Helmi 1982; Nagylaki 1983a,b; Bengtsson 1990). I specifically examine how gBGC can affect inbreeding depression and the mutation load. I also focus on the effect of mating system, which is especially interesting with regard to the interaction between biased conversion and selection. Finally, I discuss how these results could give insight into how gBGC evolved.

## MODEL AND RESULTS

#### General formulation:

Consider a single biallelic locus with weak (*W* = A or T) and strong (*S* = G or C) alleles. The life cycle is as follows: *N* diploid adults produce gametes after mutation and conversion events. Heterozygous individuals produce *S* alleles with probability (1 + *b*)/2 and *W* alleles with probability (1 − *b*)/2, where *b* is the gBGC coefficient (disparity coefficient *sensu* Nagylaki 1983a). Fertilization occurs and zygotes experience selection to give adults, which experience regulation to *N* individuals. If the *S* (resp. *W*) allele is deleterious, the relative fitnesses of *WW* (resp. *SS*), *WS*, and *SS* (resp. *WW*) genotypes are 1, 1 − *hs*, and 1 − *s*, respectively, where *s* is the selection coefficient and *h* the dominance coefficient. The *W* allele mutates at rate *u* to the *S* allele, and the reverse mutation occurs at rate *v* = λ*u*, where λ is the mutation bias from *S* to *W* alleles, which ranges from 2 to 4.5 in many different organisms (Lynch 2007). I assume that *u* and *v* are much smaller than the selection and conversion coefficients (*u*, *v* ≪ *s*, *hs*, and *b*). For simplicity, I also assume that the gBGC intensity is constant. However, strong gBGC events are thought to be associated with short-lived recombination hotspots, at least in humans (McVean *et al*. 2004; Myers *et al*. 2005). I thus implicitly assume that gBGC/selection dynamics are shorter than the recombination hotspot lifespan. The validity of this assumption is discussed below. All parameters are summarized in Table 2.

#### Deterministic equilibrium—three selection regimes:

Analysis of the deterministic case (*N* = ∞) highlights the different behaviors of the model. Bengtsson (1990) gave the full equation of the deterministic change in allele frequency. Under the assumption of weak selection and weak gBGC (*s* ≪ 1 and *b* ≪ 1), changes due to mutation, selection, and gBGC can be computed separately. Note that a full treatment of the system, without approximations, is given by A. Popa, S. Glémin, E. Popa, D. Mouchiroud, and C. Gautier (unpublished data). The deterministic change in frequency of the deleterious allele (*x*) can thus be approximated by Equation 1, which sums the separate changes due to mutation (Δ*x*_{mutation}), selection (Δ*x*_{selection}), and gBGC (Δ*x*_{gBGC}). In these terms, *W̄* is the mean fitness of the population and *F* is Wright's fixation index, *F*_{IS}.

Note that in Δ*x*_{selection}, the approximation holds either for *s* ≪ 1 or *x* ≪ 1. It therefore remains valid for lethal alleles, for which *x* ≪ 1. For weak selection, the *F* coefficient can be used for neutral loci, that is, *F* = σ/(2 − σ), where σ is the selfing rate (Caballero and Hill 1992). For strong selection, *F* depends on selfing rate and selection parameters (Glémin 2003). Given the approximation for Δ*x*_{selection}, the effect of gBGC is equivalent to modifying the selection and dominance coefficients, as given by Equation 2. Conventional population genetics results thus hold using this new parameterization. In panmixia (*F* = 0), gBGC is thus fully equivalent to modifying selection patterns. However, under inbreeding, the analogy is not complete because selection and dominance coefficients depend on *F*.
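The terms of the frequency change can be written out from the life cycle described above. The following is a derivation sketch under the stated weak-selection and weak-gBGC assumptions, not a transcription of the original displayed equations; the primed symbols $s'$ and $h'$ are names introduced here for the equivalent selection and dominance coefficients. For a deleterious *S* allele at frequency $x$:

$$
\begin{aligned}
\Delta x &\approx \Delta x_{\text{mutation}} + \Delta x_{\text{selection}} + \Delta x_{\text{gBGC}},\\
\Delta x_{\text{mutation}} &= u(1-x) - vx,\\
\Delta x_{\text{selection}} &= -\frac{s\,x(1-x)\,\bigl[h + F - hF + (1-F)(1-2h)\,x\bigr]}{\bar{W}}
 \approx -s\,x(1-x)\,\bigl[h + F - hF + (1-F)(1-2h)\,x\bigr],\\
\Delta x_{\text{gBGC}} &= b\,(1-F)\,x(1-x).
\end{aligned}
$$

The gBGC term follows from the heterozygote frequency $2x(1-x)(1-F)$ multiplied by the transmission excess $b/2$. Matching the sum of the selection and gBGC terms to a pure selection model with coefficients $s'$ and $h'$ then gives, in this sketch,

$$
s' = s - \frac{2b(1-F)}{1+F}, \qquad h's' = hs - \frac{b(1-F)}{1+F}.
$$

When *W* is the deleterious allele, the sign of $b$ is reversed and the mutation term becomes $v(1-x) - ux$. Setting the bracketed selection-plus-gBGC factor to zero at $x = 0$ and $x = 1$ recovers the two bounds on $b$ quoted for the intermediate equilibrium, which is a useful consistency check on these expressions.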

Equilibrium frequencies are then obtained by solving Δ*x* = 0 (see appendix a). The case for which *W* is the deleterious allele is straightforward because both gBGC and selection favor the *S* allele. The deleterious allele is maintained at a lower frequency than under mutation/selection equilibrium (Equation 3; see also Bengtsson 1990). The first case (with *S* being the deleterious allele), the one underlying the Achilles' heel hypothesis, is much less trivial. It leads to three selection regimes (Figures 1 and 2). If gBGC is weak relative to selection (*b* ≪ *s*), the equilibrium frequency is given by Equation 4a: the deleterious *S* allele is maintained at low frequency, but slightly higher than at mutation/selection equilibrium. In contrast, if gBGC is strong (*b* ≫ *s*), the deleterious allele is close to fixation (Equation 4b). Finally, if gBGC and selection are of similar intensity, the two forces can maintain the deleterious allele at intermediate frequency (Equation 4c). The condition that ensures the existence of this last equilibrium point is (*h* + *F* − *hF*)*s*/(1 − *F*) ≤ *b* ≤ (1 − *h* + *hF*)*s*/(1 − *F*), and analysis of the derivative shows that this equilibrium is stable between the two limits (Figure 2). In panmixia, this case is fully equivalent to overdominance. Below a given selfing threshold, it is well known that the equilibrium is not stable if overdominance is asymmetrical (Kimura and Ohta 1971). However, in the present case, the equilibrium is always stable because equivalent selection parameters depend on *F* and the range of the overdominant-like domain narrows as inbreeding increases (Figure 2). In the special case of fully recessive alleles in panmixia (*F* = 0), the equilibrium is given by Equation 4d, which reduces to the well-known equilibrium without gBGC when *b* = 0 (Haldane 1937). As long as gBGC is small (*b* < 0.1), Equation 4d also holds for lethal alleles and gives results very close to those of Bengtsson (1990); see also A. Popa, S. Glémin, E. Popa, D. Mouchiroud, and C. Gautier, unpublished results.
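The three regimes can be explored numerically by iterating a weak-force recursion for a deleterious *S* allele. The sketch below is only illustrative: the recursion is derived from the model description (mutation, selection, and a gBGC term proportional to heterozygote frequency), the function names and parameter values are hypothetical, and the interior root neglects mutation. The regime boundaries are the existence condition quoted above.

```python
def delta_x(x, s, h, b, F=0.0, u=1e-8, lam=2.0):
    """Approximate one-generation change in the frequency x of a deleterious S allele."""
    v = lam * u  # S -> W mutation rate
    mutation = u * (1.0 - x) - v * x
    selection = -s * x * (1.0 - x) * (h + F - h * F + (1.0 - F) * (1.0 - 2.0 * h) * x)
    gbgc = b * (1.0 - F) * x * (1.0 - x)  # only heterozygotes undergo biased conversion
    return mutation + selection + gbgc


def regime(s, h, b, F=0.0):
    """Classify the deterministic regime from the existence condition (requires F < 1)."""
    lower = (h + F - h * F) * s / (1.0 - F)
    upper = (1.0 - h + h * F) * s / (1.0 - F)
    if b < lower:
        return "low frequency"
    if b > upper:
        return "near fixation"
    return "intermediate"


def interior_equilibrium(s, h, b, F=0.0):
    """Interior root of the selection + gBGC terms, mutation neglected (requires h != 1/2)."""
    return (b * (1.0 - F) - s * (h + F - h * F)) / (s * (1.0 - F) * (1.0 - 2.0 * h))


def iterate(s, h, b, F=0.0, x0=0.01, generations=200_000):
    """Iterate the recursion from frequency x0 and return the final frequency."""
    x = x0
    for _ in range(generations):
        x += delta_x(x, s, h, b, F)
    return x


if __name__ == "__main__":
    s, h = 0.01, 0.2  # illustrative values only
    for b in (0.0005, 0.004, 0.02):
        print(b, regime(s, h, b), iterate(s, h, b))
```

With *s* = 0.01 and *h* = 0.2 in panmixia, the intermediate regime spans 0.002 ≤ *b* ≤ 0.008, so the three values of *b* in the usage loop fall in the three different regimes.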

#### Dynamics in finite populations and time properties:

In finite populations, the first two moments of the stationary probability distribution of allele frequency are required to compute the mean load and mean inbreeding depression (*e.g.*, Glémin *et al*. 2003). The work of Nagylaki (1983a) can be extended to incorporate the effect of dominance and inbreeding using the classical diffusion theory and the reparameterization given by Equation 2. The Wright (1937) stationary distribution, Φ(*x*), is given by Equation 5a if *S* is the deleterious allele and by Equation 5b if *W* is the deleterious allele. In these expressions, *K* is an integration constant ensuring that Φ integrates to 1 over the interval [0, 1], and *N*_{e} = α*N*/(1 + *F*), where *N* is the effective population size of the equivalent population without inbreeding (Pollak 1987; Nordborg 1997), and α (α = 1 in panmixia; 0 < α ≤ 1 under inbreeding) is a coefficient introduced to summarize the additional reduction in effective population size due to hitchhiking and bottleneck effects associated with inbreeding (Glémin 2007).

There is no general analytical solution for *K*. However, for the special case of genic selection in panmixia (*F* = 0 and *h* = 1/2), the moments of Φ are given by Equation 6a if *S* is the deleterious allele and by Equation 6b if *W* is the deleterious allele, where Γ is the gamma function and _{1}*F*_{1} is the regularized confluent hypergeometric function (Abramowitz and Stegun 1970). Analytical expressions cannot be given in the general case, and numerical integrations are used (see appendix b).

The probability of fixation of a new mutation is also useful (see appendix a). General analytical expressions can be obtained but they are formidable and not given here. However, the probability of fixation of a new mutation under genic selection in panmixia simply reduces to Equation 7a if *S* is the deleterious allele and to Equation 7b if *W* is the deleterious allele. Approximations hold for |*s* − *b*| ≪ 1 and |*s* + *b*| ≪ 1, respectively (see also Nagylaki 1983a).
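For genic selection in panmixia, the interplay between selection and gBGC can be illustrated with Kimura's classical diffusion formula for the fixation probability of a new mutant. The sketch below assumes that gBGC and codominant selection combine into a single net per-allele coefficient (*b* − *s*/2 for a deleterious *S* mutant, −*b* − *s*/2 for a deleterious *W* mutant); this combination, the function name, and the parameter values are assumptions of the example and are not claimed to match the exact form of the paper's fixation-probability equations.

```python
import math


def fixation_probability(s, b, N, Ne=None, s_allele_deleterious=True):
    """Diffusion approximation for a new mutant at initial frequency 1/(2N).

    Assumes genic selection (h = 1/2) in panmixia, with gBGC acting as an
    additional genic force on the mutant allele.
    """
    if Ne is None:
        Ne = N
    # Net advantage of the mutant allele: gBGC helps S mutants and hinders W mutants.
    sigma = (b - s / 2.0) if s_allele_deleterious else (-b - s / 2.0)
    p0 = 1.0 / (2.0 * N)
    scaled = 4.0 * Ne * sigma
    if abs(scaled) < 1e-9:
        return p0  # effectively neutral
    if scaled < -700.0:
        return 0.0  # avoids floating-point overflow for strongly disfavored mutants
    return math.expm1(-scaled * p0) / math.expm1(-scaled)


if __name__ == "__main__":
    N = 10_000
    for b in (0.0, 0.0002, 0.002):
        print(b, fixation_probability(s=0.001, b=b, N=N))
```

In this sketch a deleterious *S* mutation with *s* = 2*b* behaves as if it were neutral, which is why mutations of that size matter most for the fixation load discussed later.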

The last quantities of interest are sojourn and fixation times. Time properties are needed to evaluate whether, on average, the allele life span is much shorter than the recombination hotspot life span. This is implicitly assumed by modeling a constant gBGC intensity. In the panmictic codominant case, the absorption time has been already derived by Nagylaki (1983a). Conversion only modifies the selection intensity (genic selection), and *S* deleterious alleles quickly lead to either extinction or fixation. The mean absorption time of a new mutant is always shorter than that under pure genetic drift (see Equations 28–33 in Nagylaki 1983a). On the contrary, as the deterministic analysis showed, gBGC can maintain recessive deleterious alleles at intermediate frequencies for a long time by mimicking overdominance. Conventional diffusion results can also be applied (*e.g.*, Ewens 2004) to compute the mean sojourn and fixation times of a new mutation (see appendix a). In the general case, these expressions have no analytical expression and numerical integrations are performed (see appendix b). However, under the “overdominance-like” regime, deleterious alleles rarely reach fixation. It is thus more interesting to compute the mean time until the mutant allele reaches the deterministic equilibrium frequency (*x*_{eq}) for the first time. I used the expression of Kimura and Ohta (1973) that gives the mean time to reach a given frequency for the first time, starting from frequency *p*_{0} < *x*_{eq}. For fully recessive mutations (*h* = 0) in panmixia, the equilibrium frequency is well approximated by *x*_{eq} = *b*/(*b* + *s*) (Equation 4d with *s* > *b* ≫ *u*). Once again, no analytical expression is available and numerical integrations are performed. Finally, under the overdominance-like regime, the allele can persist for a long time if it is not initially lost (*e.g.*, Takahata 1990). 
To evaluate this conditional persistence, I also computed the mean sojourn time of an allele starting from frequency *x*_{eq}. The numerical results are presented in Tables 3 and 4.

On average, gBGC makes *S* deleterious alleles persist longer in populations. However, the time to reach the deterministic equilibrium frequency for the first time is still short. Therefore, computing the equilibrium frequency on the basis of constant *b* is reliable despite the short life span of recombination hotspots. On the contrary, once the allele reaches the equilibrium frequency it will persist for a very long time. If it is not initially lost, a deleterious *S* allele will thus persist in the population over the hotspot life span.

#### Impacts of gBGC on the mutation load:

I now consider how gBGC affects the mutation load. The load is defined by Equation 8. Here we need to consider both sites for which either *S* or *W* alleles are deleterious, in proportions *q* and 1 − *q*, respectively, so that the total load is the correspondingly weighted sum of the two contributions (Equation 9). For nonsynonymous positions and noncoding functional regions, it seems reasonable to assume that *q* = 1/2. However, for synonymous positions under selection for codon usage, *q* will depend on the distribution of preferred codons. As GC-ending codons are more often preferred than AT-ending ones (Duret and Mouchiroud 1999; Wang and Roossinck 2006), *q* can be close to 0.

##### gBGC and the load structure in panmictic populations:

In infinite panmictic populations, without gBGC, the mutation load is equal to twice the mutation rate (Haldane 1937). In finite populations, deterministic results hold for strongly deleterious mutations (*s* ≫ 1/*N*_{e}). Slightly deleterious alleles (*s* ≲ 1/*N*_{e}) can drift to fixation, causing a much higher load (“fixation load”), *i.e.*, *L* ≈ *s*, than that caused by strongly deleterious mutations segregating at low frequency (“segregating load”) (Figure 3a and see Kimura *et al*. 1963). Consequently, the relative part of the fixation load (or “drift load,” see the distinction below) and the segregating load depends on both the effective population size and the distribution of selection coefficients.

gBGC affects both the magnitude and the structure of the mutation load. In infinite populations, and more generally for strongly deleterious alleles (*N*_{e}*s* ≫ 1), the deterministic load is obtained by replacing *x* by *x*_{eq}, given by Equations 4, in Equations 8 and 9, which yields Equations 10a–10c for the three regimes. Under the first condition (*b* ≪ *hs*), gBGC increases the load if *b* > *hs*(1 − 2*q*/(*q* + λ − *q*λ)). Under the other conditions, the load due to *W* deleterious mutations is negligible and gBGC always increases the load. Using λ = 2 and *q* = 1/2, gBGC increases the load due to mutations for which *hs* < 3*b*. Given that *b* is quite low (Spencer *et al*. 2006), gBGC increases the load only due to rather weakly deleterious mutations.
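The threshold *b* > *hs*(1 − 2*q*/(*q* + λ − *q*λ)) can be checked with a few lines of arithmetic. The sketch below assumes panmixia, low-frequency equilibria of roughly *u*/(*hs* − *b*) for deleterious *S* alleles and λ*u*/(*hs* + *b*) for deleterious *W* alleles, and a per-site load of about 2*hs* times the equilibrium frequency. These approximations and the function names are assumptions of the example rather than the paper's load equations.

```python
def relative_load(hs, b, q=0.5, lam=2.0):
    """Load with gBGC divided by the load without gBGC, low-frequency regime only."""
    if not 0.0 <= b < hs:
        raise ValueError("the low-frequency approximation requires 0 <= b < hs")
    # The mutation rate u and the factor 2 cancel in the ratio.
    with_gbgc = q * hs / (hs - b) + (1.0 - q) * lam * hs / (hs + b)
    without_gbgc = q + (1.0 - q) * lam
    return with_gbgc / without_gbgc


def threshold_b(hs, q=0.5, lam=2.0):
    """gBGC level above which the load is increased, as quoted in the text."""
    return hs * (1.0 - 2.0 * q / (q + lam - q * lam))


if __name__ == "__main__":
    b = 0.001  # illustrative value
    for hs in (0.0025, 0.003, 0.004):
        print(hs, threshold_b(hs), relative_load(hs, b))
```

With λ = 2 and *q* = 1/2 the threshold is *hs*/3, so the ratio crosses 1 exactly at *hs* = 3*b*, in line with the statement above.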

In finite populations, the load must be integrated over the Φ distribution, which leads to Equation 11. *E*_{Φ}[*x*] and *E*_{Φ}[*x*^{2}] can be computed numerically (appendix b) and introduced in Equation 11.

For genic selection (*h* = 1/2) in panmixia (*F* = 0), the approximations given by Equations 12a and 12b can be obtained (see appendix c), where *L*_{det} is the deterministic load given by Equations 10. Interestingly, these approximations are very robust and hold for a wide range of parameters (see appendix d for numerical validation). The analysis of Equations 12 and numerical explorations show that (i) if *N*_{e}*b* < 1, the effect of gBGC is negligible (but see below) and the mutations maximizing the load (either *W* or *S*) are of the order of *s* ≈ 1/*N*_{e}, and (ii) if *N*_{e}*b* > 1, the mutations maximizing the load are *S* mutations with *s* ≈ 2*b* (see Figure 3 and appendix c). gBGC increases the fraction of the fixation load beyond the drift load *sensu stricto*, *i.e.*, the load due to deleterious mutations fixed by drift only.
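The two cases (i) and (ii) can be reproduced with a simple weak-mutation sketch of the fixation load. The example assumes genic selection in panmixia and that, at a site where *S* is deleterious, the odds of being fixed for *W* rather than *S* are λ·exp(2*N*_{e}(*s* − 2*b*)). This expression is an assumption of the sketch (a standard mutation-drift-selection balance with gBGC added as a genic force), not a transcription of the paper's approximations, and the parameter values are illustrative.

```python
import math


def fixation_load_S(s, b, Ne, lam=2.0):
    """Expected load at a site where S is deleterious: s times P(site fixed for S)."""
    exponent = 2.0 * Ne * (s - 2.0 * b)
    if exponent > 700.0:
        return 0.0  # S is essentially never fixed; also avoids overflow
    return s / (1.0 + lam * math.exp(exponent))


def s_maximizing_load(b, Ne, lam=2.0, s_max=2e-3, steps=2000):
    """Grid search for the selection coefficient contributing the largest load."""
    grid = [s_max * (i + 1) / steps for i in range(steps)]
    return max(grid, key=lambda s: fixation_load_S(s, b, Ne, lam))


if __name__ == "__main__":
    b = 0.0002
    for Ne in (1_000, 100_000):
        s_star = s_maximizing_load(b, Ne)
        print(Ne, Ne * b, s_star, s_star / (2.0 * b), s_star * Ne)
```

When *N*_{e}*b* is well above 1 the maximizing coefficient sits just below 2*b*, whereas for *N*_{e}*b* below 1 it is of the order of 1/*N*_{e}.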

##### Minimum load:

As explained above, if mutations are biased toward *W* alleles, under certain circumstances, gBGC can decrease the load (see appendix c). A minimum is thus reached for low gBGC levels. In panmictic populations, the gBGC level that minimizes the segregating load is given by Equation 13. While gBGC can greatly increase the fixation load, a very low gBGC level can also reduce it, and the gBGC level that minimizes the fixation load is given by Equation 14. Both expressions clearly show that gBGC can minimize the total load only if λ > 1, *i.e.*, if mutation is biased toward *W* alleles. Interestingly, the gBGC level that minimizes the fixation load does not depend on *s* but instead only on *N*_{e} and λ. Since Equation 13 depends on *h* and *s*, it is hard to predict which gBGC level will minimize the load for a given distribution of *h* and *s*. However, gBGC will greatly increase the fixation load as soon as *b* > ln(λ)/2*N*_{e} (see appendix c). In most cases, Equation 14 is thus a very good approximation of the gBGC level that minimizes the total load, whatever the distribution of mutational effects is. This rationale was confirmed by numerical analyses (appendix d).
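The existence of a load-minimizing gBGC level can be illustrated with a weak-mutation sketch of the fixation load. The example assumes genic selection in panmixia, equal proportions of *S*- and *W*-deleterious sites (*q* = 1/2), and stationary odds of fixation set by mutation bias, gBGC, and selection. Under those assumptions, which belong to this sketch and are not quoted from the paper, the minimum falls at ln(λ)/(4*N*_{e}) whatever the value of *s*, and the load returns to its gBGC-free value at ln(λ)/(2*N*_{e}).

```python
import math


def fixation_load(b, s, Ne, lam=2.0, q=0.5):
    """Mean fixation load per site, averaging over S- and W-deleterious sites."""
    z = math.exp(4.0 * Ne * b) / lam   # odds S:W at a neutral site (bias and gBGC)
    a = math.exp(2.0 * Ne * s)         # odds penalty against the deleterious allele
    load_S = s * z / (z + a)           # S deleterious: s times P(site fixed for S)
    load_W = s / (1.0 + a * z)         # W deleterious: s times P(site fixed for W)
    return q * load_S + (1.0 - q) * load_W


def b_minimizing_load(s, Ne, lam=2.0, q=0.5, b_max=1e-4, steps=1000):
    """Grid search for the gBGC coefficient giving the smallest fixation load."""
    grid = [b_max * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda b: fixation_load(b, s, Ne, lam, q))


if __name__ == "__main__":
    Ne, lam = 10_000, 2.0  # illustrative values
    print("predicted minimum:", math.log(lam) / (4.0 * Ne))
    for s in (1e-4, 3e-4):
        print(s, b_minimizing_load(s, Ne, lam))
```

For other values of *q* the symmetry behind this result is lost and the location of the minimum shifts, so the sketch should be read as a check of the qualitative claim only.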

##### Quantitative effects of gBGC on the load:

To quantitatively assess the theoretical effects predicted above, I computed the total load in a genome in which gBGC hotspots cover 3% of the sequence, with *b* = 0.0002 on average [according to estimates in humans (Spencer *et al*. 2006), and see also Galtier *et al*. 2009], and which is exposed to a flow of deleterious mutations with a gamma-shaped distribution of selection coefficients with mean *s*_{mean} = 0.0325 and shape parameter = 0.23 [according to estimates in humans (Eyre-Walker *et al*. 2006), and see also Galtier *et al*. 2009]. For simplicity, fitness is assumed to be multiplicative across loci. However, recombination and conversion are not distributed similarly in all organisms. For instance, recombination is probably not localized in hotspots in *Caenorhabditis elegans* (Rockman and Kruglyak 2009). I thus also tested more homogeneous and more heterogeneous gBGC distributions. To make the different cases comparable, I based my computations on a constant total amount of gBGC (*b*_{total} = 0.0002 × 0.03 = 6 × 10^{−6}). In addition to the first case (3% hotspots with *b* = 0.0002), I considered gBGC homogeneously distributed over the whole genome, gBGC distributed over half of the genome, and 0.3% hotspots with *b* = 0.002. For simplicity, all mutations are assumed to be codominant (*h* = 1/2), and the genomic mutation rate is set at *U* = 0.01 (see appendix b). The results are presented in Table 5. They show that in rather large populations, gBGC hotspots covering as little as 3% of the genome can increase the load two- or threefold. Even in small populations (*N* = 10,000, similar to the effective size in humans), rare gBGC hotspots can increase the load by almost 10%. However, this analysis also shows that weak gBGC throughout the genome has a weak effect or can even reduce the load by compensating for the mutational bias. This occurs when *b* < ln(λ)/2*N*_{e}, as predicted above.
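The per-site gBGC coefficients implied by a constant total amount of gBGC, and their position relative to the threshold ln(λ)/2*N*_{e}, follow from simple arithmetic. In the sketch below the scenario labels and function names are hypothetical, and λ = 2 with *N*_{e} = 10,000 are illustrative choices rather than values stated for Table 5.

```python
import math

B_TOTAL = 0.0002 * 0.03  # constant total amount of gBGC, i.e. 6e-6


def gbgc_scenarios(b_total=B_TOTAL):
    """Per-site gBGC coefficient when b_total is concentrated in a genome fraction."""
    fractions = {
        "whole genome": 1.0,
        "half of the genome": 0.5,
        "3% hotspots": 0.03,
        "0.3% hotspots": 0.003,
    }
    return {name: b_total / fraction for name, fraction in fractions.items()}


def load_reduction_threshold(Ne, lam=2.0):
    """gBGC level below which the text predicts gBGC can reduce the load."""
    return math.log(lam) / (2.0 * Ne)


if __name__ == "__main__":
    limit = load_reduction_threshold(10_000)
    for name, b in gbgc_scenarios().items():
        print(f"{name}: b = {b:.2e}, below threshold: {b < limit}")
```

With these illustrative values the two diffuse scenarios fall below the threshold while both hotspot scenarios exceed it, which matches the qualitative contrast described above.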

#### Impacts of gBGC on inbreeding depression:

Inbreeding depression is defined as the reduction in fitness of selfed (and more generally inbred) individuals compared to outcrossed individuals (Equation 15), where *W̄*_{out} and *W̄*_{self} are the mean fitnesses of outcrossed and selfed progeny, respectively (Charlesworth and Charlesworth 1987; Charlesworth and Willis 2009). The approximation used in Equation 15 is very good in most conditions, because the mean fitness of outcrossed progeny remains close to 1 under both weak selection (*s* ≪ 1) and strong selection (*x* ≪ 1) (see Glémin *et al*. 2003). Similar to the load, considering both sites for which either *S* or *W* alleles are deleterious, in proportions *q* and 1 − *q*, respectively, we get Equation 16.

##### gBGC and the genetic basis of inbreeding depression in panmictic populations:

In infinite panmictic populations without gBGC, inbreeding depression depends only on mutation rates and dominance levels. Only partially recessive mutations (*h* < 1/2) contribute to inbreeding depression, and the more recessive they are, the higher the inbreeding depression (Charlesworth and Charlesworth 1987). In finite populations, deterministic results hold for strongly deleterious mutations (*s* ≫ 1/*N*_{e}), which contribute mostly to inbreeding depression. Contrary to the load, weakly deleterious mutations (*s* ≲ 1/*N*_{e}) contribute little to inbreeding depression (Figure 4, a and c, and see Bataillon and Kirkpatrick 2000).

Like the load, gBGC affects both the magnitude and the structure of inbreeding depression. In infinite populations, and more generally for strongly deleterious alleles (*N*_{e}*s* ≫ 1), replacing *x* by *x*_{eq}, given by Equations 4, in Equations 15 and 16 leads to Equations 17a–17c for the three regimes. The effect of gBGC on inbreeding depression is not monotonic. Like the load, gBGC increases inbreeding depression if *b* > *hs*(1 − 2*q*/(*q* + λ − *q*λ)). However, contrary to the load, a strong gBGC decreases inbreeding depression, which tends to 0 as *b* increases, while the load tends to *qs* (Equation 10c). An analysis of Equation 17b shows that mutations that maximize inbreeding depression are those that also maximize the load, *i.e.*, *S* deleterious mutations with *s* ≈ 2*b*.
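The non-monotonic effect of gBGC on inbreeding depression can be illustrated for a single locus. The sketch below assumes a panmictic population, selfed progeny produced by one generation of selfing (which halves heterozygosity), and approximate equilibrium frequencies for a deleterious *S* allele in each of the three regimes. These expressions are derived for the example rather than taken from the paper's inbreeding depression equations, and they break down close to the regime boundaries.

```python
def inbreeding_depression(x, s, h):
    """W_out - W_self at one locus with deleterious allele frequency x."""
    w_out = 1.0 - 2.0 * h * s * x * (1.0 - x) - s * x * x
    het_self = x * (1.0 - x)                 # heterozygosity is halved by selfing
    hom_self = x * x + 0.5 * x * (1.0 - x)   # deleterious homozygotes after selfing
    w_self = 1.0 - h * s * het_self - s * hom_self
    return w_out - w_self                    # equals s * x * (1 - x) * (1/2 - h)


def equilibrium_S(s, h, b, u=1e-8, lam=2.0):
    """Approximate equilibrium frequency of a deleterious S allele (h < 1/2, F = 0)."""
    if not 0.0 <= h < 0.5:
        raise ValueError("this sketch requires 0 <= h < 1/2")
    if b < h * s:
        return u / (h * s - b)                          # low-frequency regime
    if b > (1.0 - h) * s:
        return 1.0 - lam * u / (b - (1.0 - h) * s)      # near-fixation regime
    return (b - h * s) / (s * (1.0 - 2.0 * h))          # overdominant-like regime


if __name__ == "__main__":
    s, h = 0.01, 0.2  # illustrative values
    for b in (0.0, 0.005, 0.05):
        x = equilibrium_S(s, h, b)
        print(b, x, inbreeding_depression(x, s, h))
```

In this example inbreeding depression is several orders of magnitude larger in the overdominant-like regime than without gBGC, and it collapses again once gBGC is strong enough to drive the allele close to fixation.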

In finite populations, inbreeding depression must be integrated over the Φ distribution, which leads to Equation 18 (see also Glémin *et al*. 2003). While it is not possible to get an analytical expression for Equation 18, numerical computations (see appendix b) show that *S* deleterious mutations with *s* ≈ 2*b* also maximize inbreeding depression in finite populations (Figure 4). More broadly, inbreeding depression is maximal under the overdominant-like selection regime (gray area in Figure 2). Once again, even low to moderate gBGC markedly affects the genetic structure of inbreeding depression. First, mutations of intermediate effects contribute the most to inbreeding depression, *i.e.*, up to one order of magnitude higher than strongly deleterious mutations (compare Figure 4a with 4b). Second, even nearly additive mutations can have a substantial effect (compare Figure 4c with 4d).

Since little is known about the distribution of dominance coefficients, especially the dominance of mildly deleterious mutations (of the order of *b*), it is difficult to quantitatively predict the full impact of gBGC on inbreeding depression. We can conclude that, on average, gBGC should increase inbreeding depression. However, further insight into mutational parameters is crucial to assess the quantitative impact of gBGC.

#### Joint effect of gBGC and mating system on the load and inbreeding depression:

Selfing, or more generally inbreeding, slightly reduces the segregating load through the purging of recessive mutations (Ohta and Cockerham 1974), but can substantially increase the fixation load because of the effective population size reduction under inbreeding: *N*_{e} = α*N*/(1 + *F*) (see above and Pollak 1987; Nordborg 1997; Glémin 2007). In numerical examples, I assumed that α decreases with *F* according to the background selection model (Charlesworth *et al*. 1993; Nordborg *et al*. 1996), as in Glémin (2007). With gBGC, selfing thus has two opposite effects on the fixation load. Selfing increases the drift load *sensu stricto* but decreases the fixation load due to gBGC. A surprising consequence is that the load can be higher in outcrossing than in selfing populations (Figure 5). Quantitatively this is also expected, even with gBGC hotspots affecting just 3% of the genome (Table 5). Another consequence is that a minimum load can be reached for intermediate selfing rates (Figure 5 and Table 5).
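The two opposite effects of selfing can be made concrete by computing, for a given selfing rate, the fixation index, the effective population size, and the gBGC intensity that remains once heterozygosity is reduced. The sketch below uses *F* = σ/(2 − σ) from the model section and the standard result *N*_{e} = α*N*/(1 + *F*); scaling gBGC by (1 − *F*) is an assumption of the example that reflects the reduced heterozygote frequency, and α is left as a free parameter rather than modeled through background selection.

```python
def fixation_index(selfing_rate):
    """Wright's F at neutral loci for a given selfing rate (0 to 1)."""
    return selfing_rate / (2.0 - selfing_rate)


def effective_size(N, selfing_rate, alpha=1.0):
    """Effective size under partial selfing; alpha <= 1 adds hitchhiking effects."""
    return alpha * N / (1.0 + fixation_index(selfing_rate))


def effective_gbgc(b, selfing_rate):
    """gBGC intensity scaled by the remaining fraction of heterozygotes."""
    return b * (1.0 - fixation_index(selfing_rate))


if __name__ == "__main__":
    N, b = 10_000, 0.0002  # illustrative values
    for sigma in (0.0, 0.5, 0.9, 1.0):
        print(sigma, fixation_index(sigma), effective_size(N, sigma), effective_gbgc(b, sigma))
```

As the selfing rate rises, the effective size falls by at most a factor of two when α = 1, while the effective gBGC intensity falls all the way to zero, which is why the gBGC-driven part of the fixation load disappears in highly selfing populations.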

Generally, the effect of selfing is simpler for inbreeding depression. Purging, *N*_{e} reduction, and suppression of gBGC contribute to decreasing inbreeding depression in selfing populations (Figure 6a). However, there are special cases in which maximum inbreeding depression is reached for intermediate selfing rates (Figure 6b). In such cases, in outcrossing populations, gBGC is strong enough to sweep polymorphism out and reduce inbreeding depression (*b* > *s*, regime 1 in Figure 2). As the selfing rate increases, gBGC declines, and the selection dynamics become overdominant-like (regime 2, Figure 2), thus maximizing inbreeding depression. For high selfing rates, gBGC vanishes (regime 3 in Figure 2) and deleterious alleles are either purged or fixed if there is substantial drift. This is similar to the effect of selfing on inbreeding depression caused by asymmetrical overdominance, where inbreeding depression also peaks for intermediate selfing rates (Ziehe and Roberds 1989; Charlesworth and Charlesworth 1990). In the present case, the range of parameters leading to this peculiar behavior is narrow because the overdominant-like region depends on the selfing rates and can vanish either for low or for high selfing rates (Figure 2).

## DISCUSSION

I analyzed the effect of gBGC on the fate and consequences of deleterious mutations by extending previous theoretical studies to also include population processes, such as drift and inbreeding, and the whole range of mutations, including direction (*W* → *S* and *S* → *W*), strength, and dominance. gBGC affects *S* and *W* deleterious alleles in opposite ways. However, except in special cases, these opposite effects do not cancel out because *S* alleles are much more strongly affected than *W* alleles. Moreover, while significant gBGC levels are likely limited to recombination hotspots [only 3% of the genome in humans (Myers *et al*. 2005)], I showed that both the quantitative and the qualitative effects of gBGC on the load and inbreeding depression are far from being negligible.

#### Deleterious impacts of gBGC:

##### Segregation of deleterious alleles and genetic diseases:

Depending on selection and dominance levels, gBGC can slightly increase the frequency of deleterious *S* alleles before their elimination, drive them to fixation, or maintain them at intermediate frequencies for long periods of time in an overdominant-like way (Figure 2). In particular, the frequency of recessive alleles, even strongly deleterious ones, can be markedly increased by gBGC. Even moderate gBGC intensities can thus significantly affect lethal alleles. For instance, taking *u* = 10^{−8} and *b* = 0.0002 as estimated in human recombination hotspots (Myers *et al*. 2005), the frequency of a lethal allele can be more than doubled (see Equation 4d). On the contrary, gBGC helps in purging *W* deleterious alleles. We thus expect that (i) genes near recombination hotspots are more likely involved in genetic diseases and (ii) *S* alleles contribute disproportionately to genetic diseases, especially those found in appreciable frequencies. These predictions were recently confirmed through a genome-wide survey of disease-related mutations (A. Necsulea, A. Popa, D. N. Cooper, P. D. Stenson, D. Mouchiroud, C. Gautier, and L. Duret, unpublished results).

However, this kind of disease allele will probably be hard to identify individually through population genomic approaches. While formally equivalent in a one-locus model, recessive deleterious alleles balanced by gBGC and “true” balancing selection should leave different signatures at the molecular level. Because conversion tracts are likely only a few hundred base pairs long and mainly occur in recombination hotspots, only very short haplotypes are expected to be dragged up to intermediate frequency. It should thus be difficult to detect such alleles, even when they are maintained in the population for a long time, contrary to other balancing selection mechanisms (Charlesworth 2006a).

##### Fraction of the load due to gBGC:

Through its effect on the fate of deleterious alleles, gBGC affects both the magnitude and the architecture of the genetic load (Figure 7). Under realistic conditions of mutation bias toward AT (λ > 1), gBGC slightly decreases the segregating load due to strongly deleterious mutations, but markedly increases the fixation load due to weakly deleterious mutations. For *S* mutations, the limit between the two mutation categories is mainly independent of the effective population size but depends on the intensity of gBGC. On the contrary, *W* mutations behave mainly as without gBGC and the limit between the two mutation categories is still ∼ *s* ≈ 1/2*N*_{e}. Quantitatively, the effect of gBGC on the segregating load is rather weak. The main effect of gBGC is thus to increase the fixation load by the proportion of mutations spanning ∼1/2*N*_{e} < *s* < 2*b*. The paradoxical consequences are that (i) the fraction of the fixation load induced by gBGC becomes roughly independent of population size and (ii) the proportion of the load due to gBGC increases with effective population size.
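The proportion of mutations falling in the window 1/2*N*_{e} < *s* < 2*b* can be illustrated numerically. The sketch below is not part of the original analysis: it assumes the gamma distribution of selection coefficients used in appendix b (mean 0.0325, shape 0.23), *N*_{e} = 10,000, and *b* = 0.0002, and it evaluates the gamma cumulative distribution with a standard power series. The function name `gamma_cdf` is illustrative.

```python
import math

def gamma_cdf(x, shape, scale, terms=200):
    """Regularized lower incomplete gamma P(shape, x/scale), by power series."""
    if x <= 0.0:
        return 0.0
    z = x / scale
    term = 1.0 / shape      # k = 0 term of sum_k z^k / (a (a+1) ... (a+k))
    total = term
    for k in range(1, terms):
        term *= z / (shape + k)
        total += term
    # P(a, z) = z^a * exp(-z) / Gamma(a) * series
    return math.exp(shape * math.log(z) - z - math.lgamma(shape)) * total

# Assumed illustrative values (N_e and b as used for humans in the text).
N_e = 10_000
b = 2e-4
mean_s, shape = 0.0325, 0.23      # gamma parameters quoted in appendix b
scale = mean_s / shape

lower, upper = 1.0 / (2 * N_e), 2 * b
fraction = gamma_cdf(upper, shape, scale) - gamma_cdf(lower, shape, scale)
print(f"fraction of mutations with 1/2N_e < s < 2b: {fraction:.3f}")
```

Under these assumed values the window contains roughly one-tenth of new deleterious mutations, which gives a sense of why the induced fixation load is not negligible; the figure applies only within the fraction of the genome subject to gBGC.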

Quantitatively, gBGC can have a significant effect. Assuming that only 3% of the genome lies in gBGC hotspots of average intensity *b* = 0.0002, gBGC can increase the load by two- to threefold in large populations (Table 5). In humans (using *N* = 10,000), gBGC could contribute to 10% of the load. However, this is only a rough estimate based on selection parameters estimated from coding regions. On one hand, recombination seems to be depressed within genes (Myers *et al*. 2005; Coop *et al*. 2008), which should reduce the load due to gBGC. For instance, if gBGC is twofold lower within genes than around them (*b* = 0.0001), only 5% of the load would be due to gBGC (data not shown). On the other hand, noncoding functional regions can also contribute to the load, especially in gene control regions (Keightley *et al*. 2005; Eory *et al*. 2010), where recombination can be high (Coop *et al*. 2008). Despite these uncertainties, gBGC may have fundamental fitness implications, not merely practical consequences for the detection of selection at the molecular level (Berglund *et al*. 2009; Galtier *et al*. 2009). These results confirm the pertinence of the “genomic Achilles' heel” metaphor proposed by Galtier and Duret (2007).

##### Fixation load and heterosis:

The load structure is also crucial for predicting heterosis, the increase in fitness of crosses between populations or lines compared to parents. Heterosis is created by, and is proportional to, the among-population variance in deleterious allele frequencies (Whitlock *et al*. 2000; Glémin *et al*. 2003). Without gBGC, mutations contributing to heterosis also contribute to the drift load, and heterosis is mainly expected in small and isolated populations (Whitlock *et al*. 2000; Glémin *et al*. 2003). As discussed above, gBGC generates a fixation load in large populations, especially for *s* ∼ 2*b*. In addition, under these conditions, the variance in allele frequencies is also high: gBGC compensates for selection and deleterious alleles behave almost neutrally. The variance in allele frequency is thus roughly that expected under mutation–drift equilibrium, *V*_{Φ}[*x*] = λ/[(1 + λ)^{2}(1 + 4*Nu*(1 + λ))] (Wright 1937), which can be high, even in large populations. This is in sharp contrast with the low variance expected without gBGC, *V*_{Φ}[*x*] ≈ *u*/[*hs*(1 + 4*Nhs*)] (Bataillon and Kirkpatrick 2000). While this rationale should be confirmed by detailed analyses of the interaction of selection and gBGC in subdivided populations, it leads to the surprising conclusion that gBGC could create heterosis in large populations.
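To give an order of magnitude for this contrast, the two variance expressions can be evaluated side by side. The sketch below is illustrative only: it reads the two formulas with the whole bracketed product in the denominator, and it assumes *N* = 10,000, *u* = 10^{−8}, λ = 2, *b* = 0.0002, *h* = 1/2, and *s* = 2*b*, values borrowed from elsewhere in the text rather than from a dedicated analysis.

```python
def var_mutation_drift(N, u, lam):
    """Variance at mutation-drift equilibrium: lam / ((1+lam)^2 (1 + 4Nu(1+lam)))."""
    return lam / ((1.0 + lam) ** 2 * (1.0 + 4.0 * N * u * (1.0 + lam)))

def var_mutation_selection(N, u, h, s):
    """Variance without gBGC: u / (hs (1 + 4Nhs))."""
    hs = h * s
    return u / (hs * (1.0 + 4.0 * N * hs))

# Assumed illustrative parameters.
N, u, lam = 10_000, 1e-8, 2.0
b = 2e-4
h, s = 0.5, 2 * b     # selection roughly offset by gBGC (s ~ 2b)

v_neutral_like = var_mutation_drift(N, u, lam)
v_selected = var_mutation_selection(N, u, h, s)
print(f"near-neutral variance (gBGC offsets selection): {v_neutral_like:.4f}")
print(f"variance under selection alone:                 {v_selected:.3e}")
print(f"ratio: {v_neutral_like / v_selected:.3g}")
```

With these assumed values the two variances differ by several orders of magnitude, which is the quantitative basis for expecting heterosis even in large populations.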

#### Minimum load and the evolution of gBGC and recombination landscapes:

Although gBGC may have deleterious fitness consequences, it is surprising that it evolved in many taxa (Table 1 and reviewed in Duret and Galtier 2009a). Birdsell (2002) initially suggested that gBGC may have evolved as a response to mutational bias toward AT (λ > 1, here). Indeed, I show that a minimum load is reached for weak gBGC (*b* ≈ ln(λ)/4*N*, Equation 14). This result holds whatever the distribution of fitness effects of mutations (appendix d). However, the range of optimal gBGC is narrow, and gBGC increases the load as soon as *b* > ln(λ)/2*N* (appendix c). In humans, using *N* = 10,000 and λ = 2, gBGC levels that minimize the load are ∼1.7 × 10^{−5}, *i.e.*, one order of magnitude lower than the average bias observed in recombination hotspots (Myers *et al*. 2005). However, selection on conversion modifiers will not necessarily minimize the load because of gametic disequilibrium generated between modifiers and fitness loci (Bengtsson and Uyenoyama 1990). Selection for limitation of somatic AT-biased mutations could also have selected for GC-biased mismatch repair machinery (Brown and Jiricny 1987). If the bias level that would be selected for somatic reasons is >ln(λ)/2*N*, a side effect would be the generation of a substantial load at the population level. Finally, it is interesting to note that when synonymous codon positions are under selection for translation accuracy, optimal gBGC levels can be higher than gBGC levels that minimize the protein load, especially when most optimal codons end in G or C.
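The two thresholds quoted in this paragraph are simple to evaluate. As a small worked example (variable names are illustrative; *N* = 10,000, λ = 2, and the hotspot value *b* = 0.0002 are those used in the text):

```python
import math

N = 10_000          # population size used for humans in the text
lam = 2.0           # mutational bias toward AT
b_hotspot = 2e-4    # average gBGC intensity in human recombination hotspots

b_optimal = math.log(lam) / (4 * N)     # level minimizing the load
b_threshold = math.log(lam) / (2 * N)   # level above which gBGC increases the load

print(f"load-minimizing gBGC level: {b_optimal:.3e}")
print(f"load-increasing threshold:  {b_threshold:.3e}")
print(f"hotspot intensity / optimum: {b_hotspot / b_optimal:.1f}")
```

Both thresholds scale as 1/*N*, so the window in which gBGC is beneficial narrows as population size increases.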

Conversely, gBGC could also affect the evolution of recombination landscapes, which could evolve to reduce the gBGC load. Surprisingly, for a given recombination/conversion level, the hotspot distribution does not appear to be optimal (Table 5). Homogeneous but lower gBGC has little effect on the load and could even reduce it by offsetting the mutational bias, provided that *b* < ln(λ)/2*N*. However, in species where recombination mainly occurs at hotspots, such as in humans, mice, and yeasts (Nishant and Rao 2005), one can speculate that the hotspot localization outside genes could be a response to avoid the deleterious effects of gBGC.

Up to now, these verbal arguments have not been assessed theoretically (but see Bengtsson and Uyenoyama 1990 for a different kind of conversion bias). Population genetics models are necessary to test these hypotheses concerning the evolution of gBGC and recombination landscapes and to pinpoint the key parameters that might govern their evolution.

#### gBGC and the evolution of mating systems:

Deleterious mutations also play a crucial role in the evolution of mating systems. They are the main source of inbreeding depression, which balances the automatic advantage of selfing. The drift load is also thought to contribute to the extinction of selfing species. Since they are mainly homozygous, selfing species are mostly free from gBGC and its deleterious impacts. I discuss below how this might affect the evolution of mating systems.

##### Inbreeding depression and the shift in mating systems:

Inbreeding depression plays a key role in the evolution of mating systems (Charlesworth and Charlesworth 1987; Charlesworth 2006b). Since it balances the automatic advantage of selfing, high inbreeding depression favors outcrossing, while selfing can evolve when it is low. Moreover, selfing helps to purge strongly deleterious mutations, thus decreasing inbreeding depression. This positive feedback reinforces the disruptive selection on the selfing rate and prevents the transition from selfing to outcrossing (Lande and Schemske 1985).

Theoretical results suggest that, in most conditions, gBGC would reinforce inbreeding depression in outcrossing populations (Figure 6), which would prevent the evolution of selfing. Conversely, if selfing is initially selected for, recurrent selfing would reduce the load through both purging and avoidance of gBGC. Under this scenario, gBGC would reinforce disruptive selection on mating systems. However, under some conditions (see Figure 6), inbreeding depression peaks at intermediate selfing rates, as observed for asymmetrical overdominance (Ziehe and Roberds 1989; Charlesworth and Charlesworth 1990). In theory, this could prevent the shift toward complete selfing and maintain stable mixed mating systems (Charlesworth and Charlesworth 1990; Uyenoyama and Waller 1991). However, this pattern is observed under restrictive conditions and is very unlikely on the whole-genome scale. Dominance patterns are crucial for predicting inbreeding depression, especially with gBGC. Contrary to the load, it is thus difficult to evaluate the quantitative impact of gBGC on inbreeding depression. However, increased inbreeding depression in outcrossing species subject to gBGC seems to be the most likely scenario.

##### gBGC and the long-term evolution of mating systems:

In the long term, the gBGC-induced load also challenges the “dead-end hypothesis,” which posits that, because of the reduction of selection efficacy, self-fertilizing species would accumulate weakly deleterious mutations in the long term, eventually leading to extinction (Takebayashi and Morrell 2001). Because of gBGC, not drift, outcrossing species could also accumulate a load of weakly deleterious mutations (Figure 7), and they could suffer from a higher load than highly self-fertilizing species (Table 4). In line with this hypothesis, Haudry *et al*. (2008) found that in two outcrossing grass species, but not in two self-fertilizing ones, the *d*_{N}/*d*_{S} ratio is significantly higher for genes exhibiting GC enrichment. They speculated that substitutions in these genes might contribute to increasing the load in these two outcrossing grass species. Such results are still very sparse. In plants, evidence of strong gBGC is mainly restricted to grasses (but see Wright *et al*. 2007). It will be necessary to conduct more in-depth studies to assess the phylogenetic distribution of gBGC in plants and other hermaphrodite organisms and to further test the genomic Achilles' heel hypothesis in relation to mating systems. While theoretically possible, the quantitative effect of gBGC on the evolution of mating systems remains a new, open, and challenging question.

#### Conclusion:

I showed that the interaction between gBGC and selection might have surprising qualitative consequences on load and inbreeding depression patterns. Given the few quantitative data available on gBGC levels and selection intensities (mainly in humans), it turns out that even weak genome-wide gBGC can have significant fitness impacts. gBGC should be taken into account not only for sequence analyses (Berglund *et al*. 2009; Galtier *et al*. 2009), but also for its potential fitness consequences, for instance concerning genetic diseases. Interferences between gBGC and selection also give rise to new questions on the evolution of mating systems. However, most of the challenging conclusions given here have yet to be quantitatively evaluated. Quantification of gBGC and its interaction with selection in various organisms will be crucial in the future.

## APPENDIX A: GENERAL DERIVATIONS

#### Analysis of the deterministic model:

Equation 1 gives the change in allele frequency due to mutation, selection, and gBGC, as follows: (A1a) if *S* is deleterious and (A1b) if *W* is deleterious. If *W* is deleterious, both selection and gBGC maintain the deleterious alleles at low frequency, so that *x* ≪ 1. Linearizing Δ*x* around 0 and neglecting back mutation gives Δ*x* ≈ λ*u* − *b*(1 − *F*)*x* − *sx*(*h* + *F* − *hF*), which leads to (3). If *S* is the deleterious allele, this leads to three selection regimes. If gBGC is weak compared to selection (*b* ≪ *s*), *x* ≪ 1, and the linear approximation gives Δ*x* ≈ *u* + *b*(1 − *F*)*x* − *sx*(*h* + *F* − *hF*), which leads to (4a). If gBGC is strong (*b* ≫ *s*), *x* is very close to 1. Linearizing Δ*x* around 1 and neglecting direct mutation gives Δ*x* ≈ λ*u* + *b*(1 − *F*)(1 − *x*) − *s*(1 − *x*)(1 − *h* + *hF*), which leads to (4b). Finally, if gBGC and selection are of similar intensity, the two forces can maintain the deleterious allele at intermediate frequency and mutations can be neglected, Δ*x* ≈ *x*(1 − *x*)(*b*(1 − *F*) − *s*(*h* + *F* − *hF* + (1 − *F*)(1 − 2*h*)*x*)), which leads to (4c). In the special case of fully recessive alleles in panmixia (*F* = 0), using *x* ≪ 1, Δ*x* can be approximated by Δ*x* ≈ *u* + λ*ux* + *bx* − *sx*^{2}, which leads to (4d).
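The recessive approximation can be used to check the order of magnitude of the effect of gBGC on a lethal allele. The sketch below takes the expression Δ*x* ≈ *u* + λ*ux* + *bx* − *sx*^{2} exactly as written above, sets it to zero, and keeps the positive root of the resulting quadratic. The function name and the values *s* = 1 (lethal) and λ = 2 are assumptions made for illustration; *u* = 10^{−8} and *b* = 0.0002 are the human estimates quoted in the Discussion. The λ*ux* term is negligible at these values.

```python
import math

def recessive_equilibrium(u, b, s, lam=2.0):
    """Positive root of s*x^2 - (b + lam*u)*x - u = 0 (recessive allele, panmixia)."""
    linear = b + lam * u
    return (linear + math.sqrt(linear * linear + 4.0 * s * u)) / (2.0 * s)

u = 1e-8      # mutation rate
s = 1.0       # lethal allele (assumed)
b = 2e-4      # gBGC intensity in hotspots

x_without = recessive_equilibrium(u, 0.0, s)   # close to sqrt(u/s)
x_with = recessive_equilibrium(u, b, s)
print(f"equilibrium frequency without gBGC: {x_without:.3e}")
print(f"equilibrium frequency with gBGC:    {x_with:.3e}")
print(f"fold increase: {x_with / x_without:.2f}")
```

Under these assumptions the equilibrium frequency increases by a factor of about 2.4, in line with the statement that the frequency of a lethal allele can be more than doubled.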

#### Analysis of the stochastic model:

In finite populations, the Wright (1937) stationary distribution is given by

$$\Phi(x) = \frac{K}{V_x} \exp\left(2 \int \frac{M_x}{V_x}\, dx\right), \tag{A2}$$

where *M*_{x} = Δ*x*, given by Equation 1, and *V*_{x} = *x*(1 − *x*)/2*N*_{e} are the infinitesimal mean and variance of changes in allele frequency. *K* is an integration constant, ensuring that $\int_0^1 \Phi(x)\, dx = 1$, and *N*_{e} = α(*N*/(1 + *F*)). Integration of (A2) leads to (6a) and (6b).

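To compute the probability of fixation, let us note

$$G(x) = \exp\left(-2 \int \frac{M_x}{V_x}\, dx\right), \tag{A3}$$

where *M*_{x} = Δ*x*_{conversion} + Δ*x*_{selection} and *V*_{x} is the same as in (A2), and the probability of fixation starting from frequency *p*_{0} is

$$P_{\mathrm{fix}}(p_0) = \frac{\int_0^{p_0} G(x)\, dx}{\int_0^{1} G(x)\, dx} \tag{A4}$$

(Kimura 1962).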

For a new mutation (*p*_{0} = 1/2*N*) under genic selection in panmixia, (A4) simply reduces to (7a) and (7b). General analytical expressions for *P*_{fix} can be obtained but they are formidable and not given here.
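As a hedged numerical illustration of the diffusion formula of Kimura (1962), the sketch below integrates *G*(*x*) with a midpoint rule and compares the result with the standard closed form for genic selection. It is not a reproduction of Equations 7a and 7b: it assumes panmixia, no mutation, and a net per-allele coefficient σ = *b* − *s*/2 acting on a deleterious *S* allele, so that *M*_{x} = σ*x*(1 − *x*) and *G*(*x*) = exp(−4*N*σ*x*). The function names and this parametrization are assumptions.

```python
import math

def fixation_probability(p0, N, b, s, steps=20_000):
    """P_fix = int_0^p0 G / int_0^1 G, with G(x) = exp(-4 N sigma x) (assumed form)."""
    sigma = b - s / 2.0     # assumed net advantage of the S allele (genic selection)

    def G(x):
        return math.exp(-4.0 * N * sigma * x)

    def midpoint(lo, hi):
        width = (hi - lo) / steps
        return width * sum(G(lo + (k + 0.5) * width) for k in range(steps))

    return midpoint(0.0, p0) / midpoint(0.0, 1.0)

def fixation_closed_form(p0, N, b, s):
    """Closed form of the same ratio of integrals, for comparison."""
    a = 4.0 * N * (b - s / 2.0)
    if a == 0.0:
        return p0           # neutral limit
    return math.expm1(-a * p0) / math.expm1(-a)

N = 10_000
p0 = 1.0 / (2 * N)          # new mutation
b = 2e-4
for s in (1e-4, 4e-4, 1e-3):
    numeric = fixation_probability(p0, N, b, s)
    exact = fixation_closed_form(p0, N, b, s)
    print(f"s = {s:.0e}: numeric = {numeric:.4e}, closed form = {exact:.4e}")
```

In this parametrization the allele behaves neutrally when *s* = 2*b*, is favored overall when *s* < 2*b*, and is effectively selected against when *s* > 2*b*, which matches the 2*b* boundary discussed in the main text.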

The mean sojourn and fixation times of a new mutation can be obtained using classical diffusion results (*e.g.*, Ewens 2004), (A5) and (A6), where with *G* defined in (A3), and *p*_{0} = 1/2*N*.

The mean time until the mutant allele reaches the deterministic equilibrium frequency, *x*_{eq}, for the first time is given by the Kimura and Ohta (1973) expression, which gives the mean time to reach a given frequency for the first time, starting from frequency *p*_{0} < *x*_{eq}, (A7) where .

For fully recessive mutations in panmixia, the equilibrium frequency is well approximated by *x*_{eq} = *b*/(*b* +*s*) (Equation 4c). Under the overdominance-like regime, if the allele is not initially lost, it can persist for a long time (*e.g*., Takahata 1990). To evaluate this conditional persistence, I also computed the mean sojourn time of an allele starting from frequency *x*_{eq}, using *p*_{0} = *x*_{eq} in (A5).

In the general case, these integrals have no closed-form expression and numerical integrations were performed (see appendix b).

## APPENDIX B: NUMERICAL INTEGRATIONS

#### Stationary distribution and time properties:

Because Φ(*x*) given by Equations 5a and 5b can diverge at 0, 1, or both, I adapted Kimura *et al*.'s (1963) quadrature method to facilitate the numerical integration of *E*_{Φ}[*x*^{i}]. The principle is given for *S* deleterious alleles.

Let *q*_{i}(*x*) = *x*^{4(αN/(1 + F))u − 1 + i} and *r*(*x*) = (1 − *x*)^{4(αN/(1 + F))λu − 1}*e*^{−2(αN/(1 + F))(s(1 + F) − 2b(1 − F))}, which verifies . Let and . Noting ϕ(*x*) = Φ(*x*)/*K*, we thus have and . Due to this manipulation, the functions that are numerically integrated do not diverge. Numerical integrations were done using the NIntegrate function of the Mathematica software (Wolfram 1996), setting the maximum recursion at 50, working precision at 15, and other options at default values.

The other integrals converge so I performed simple numerical integrations using the NIntegrate function of the Mathematica software (Wolfram 1996), setting maximum recursion at 50, working precision at 15, and other options at default values.

#### Total genomic load:

To compute the load and on the whole-genome scale, I integrated Equation 11 over a distribution of mutational effects. I assumed a gamma distribution of selection coefficients, ψ(*s*), with mean *s*_{mean} = 0.0325 and shape parameter = 0.23, according to Eyre-Walker *et al.* (2006). Since the load weakly depends on the dominance levels, I assumed . For simplicity, I assumed that the fitness is multiplicative, such that

$$L = 1 - \prod_{i=1}^{n} (1 - L_i),$$

where *L*_{i} is the load contributed by locus *i*, and *n* is the number of loci contributing to the load. I used *u* = 10^{−8} and *n* = 10^{6} so that the genomic deleterious mutation rate is *U* = 10^{−2}. This is much lower than current estimates in humans (*U* > 1, Keightley and Eyre-Walker 2000; Keightley *et al.* 2005, 2006). However, using such values leads to unbearable loads under the multiplicative fitness model. I performed numerical integration using the quadrature method (see above) and the function NIntegrate of Mathematica software, setting maximum recursion at 50 and working precision at 15.
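To see why large genomic mutation rates become a problem under multiplicative fitness, the total load can be sketched as one minus the product of per-locus mean fitnesses. The example below is a simplification and not the computation performed here: it assumes every locus contributes the classical deterministic mutation–selection load 2*u* (the *b* = 0 limit of 2*uhs*/(*hs* − *b*) used in appendix c), ignores drift and gBGC, and the function name is illustrative.

```python
import math

def multiplicative_load(per_locus_load, n_loci):
    """Total load L = 1 - (1 - L_i)^n for n identical, independent loci."""
    # log1p keeps precision when the per-locus load is tiny.
    return 1.0 - math.exp(n_loci * math.log1p(-per_locus_load))

n = 10**6
for u in (1e-8, 1e-6):               # U = n*u = 0.01 and U = 1, respectively
    total = multiplicative_load(2.0 * u, n)
    print(f"U = {n * u:g}: segregating load ~ {total:.3f}")
```

Even this lower bound, which omits any fixation load, rises from about 2% at *U* = 10^{−2} to well over 80% at *U* = 1, which illustrates why the lower mutation rate was retained for the numerical work.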

## APPENDIX C: LOAD APPROXIMATIONS

#### Genic selection in panmixia:

For genic selection in panmixia, an interesting approximation can be obtained for the load as follows. Consider that the deleterious allele can be either fixed with probability *p*_{F}, causing the load *s*, or segregating at deterministic frequency with probability 1 − *p*_{F}, causing the load *L*_{det}. The probability of being fixed for the deleterious allele, *p*_{F}, is approximately given by (C1) for *S* deleterious alleles and (C2) for *W* deleterious alleles (*e.g.*, Bulmer 1991; Lande 1998).

Under genic selection and panmixia, this reduces to (C3) if *S* is the deleterious allele and (C4) if *W* is the deleterious allele (see also Kimura *et al.* 1963; Bulmer 1991; Lande 1998).

This leads to Equations 12a and 12b.

#### Maximum load:

The deleterious effect of mutations that maximizes the load can be obtained by solving in *s*: *∂L*/*∂s* = 0. Approximate solutions can be obtained through the following rationale (for the case without gBGC, see also Kimura *et al.* 1963). Mutations that cause the highest load are the most deleterious ones among those that can reach fixation. Fixation can be reached because of drift or gBGC. We can thus simply maximize the first part of Equations 12, that is, *sp*_{F}. The selection coefficients that maximize the load, *s*_{max}, are thus solutions in *s* of (C5) for *W* deleterious alleles and (C6) for *S* deleterious alleles.

This leads to (C7) if *W* is the deleterious allele and (C8) if *S* is the deleterious allele, where PL is the product log function (Abramowitz and Stegun 1970).

For the *W* deleterious allele, 1/2*N* < *s*_{max} < (1/2*N*)(1 + PL(λ*e*^{−1})), which is ∼1/2*N* < *s*_{max} < 1/*N* for realistic λ values. For the *S* deleterious allele, (1/2*N*)(1 + PL(*e*^{−1}/λ)) < *s*_{max} < (1/2*N*)(PL(*e*^{4Nb}/λ)). For realistic λ-values, the minimum value lies between 1/2*N* and 1/*N* and the maximum one between *b* and 2*b* if *Nb* ≫ 1. So we have ∼min(1/2*N*, *b*) < *s*_{max} < max(1/*N*, 2*b*).
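These bounds can be checked numerically. The sketch below implements the product log function (principal branch, non-negative argument) with Newton's method and evaluates the bounds for *N* = 10,000, λ = 2, and *b* = 0.0002. These parameter values and the function name `product_log` are illustrative assumptions; the bound expressions are those stated above.

```python
import math

def product_log(z, tol=1e-14, max_iter=100):
    """Principal branch of the product log, w*exp(w) = z, for z >= 0 (Newton's method)."""
    w = math.log1p(z)       # starting guess at or above the root for z >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol * (1.0 + abs(w)):
            break
    return w

N, lam, b = 10_000, 2.0, 2e-4       # assumed illustrative values

w_upper = (1.0 + product_log(lam / math.e)) / (2 * N)            # W deleterious
s_lower = (1.0 + product_log(1.0 / (math.e * lam))) / (2 * N)    # S deleterious, lower
s_upper = product_log(math.exp(4 * N * b) / lam) / (2 * N)       # S deleterious, upper

print(f"W allele: {1 / (2 * N):.2e} < s_max < {w_upper:.2e}")
print(f"S allele: {s_lower:.2e} < s_max < {s_upper:.2e}")
```

For these values the upper bound for *S* alleles falls between *b* and 2*b*, while the other bounds stay between 1/2*N* and 1/*N*, as stated. Note that `math.exp(4 * N * b)` overflows for very large *Nb*, so an asymptotic form would be needed in that range.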

#### Minimum load:

Similarly, one can compute the gBGC level that minimizes the total load by solving in *b*: *∂L*/*∂b* = 0. Here, I compute separately the gBGC level that minimizes the segregating load and the fixation load. gBGC weakly decreases the load due to *W* deleterious mutations. To minimize the total segregating load, the gBGC level must weakly increase the load due to *S* deleterious alleles, so *b* < *hs*. Using Equation 10a, we thus need to minimize *q*(2*uhs*/(*hs* − *b*)) + (1 − *q*)(2λ*uhs*/(*hs* + *b*)). The solution is (C9) which reduces to (C10) for .

As expected, gBGC can minimize the load only when λ > 1, that is, when the mutation is biased toward AT. The gBGC level that minimizes the drift load is given by the value that minimizes . The solution is (C11) which reduces to (C12) for . Interestingly, (C12) is independent of *s*. In most cases, ; is thus a very good approximation of the gBGC level that minimizes the total load, even when assuming a distribution of selection coefficients (appendix d).

Finally, we can compute gBGC levels above which it increases the fixation load by solving in *b*, which leads to (C14). In most cases, we also have , so gBGC levels that decrease the segregating load should increase the fixation load.
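The minimization of the segregating load can be illustrated numerically. Setting the derivative of *q*(2*uhs*/(*hs* − *b*)) + (1 − *q*)(2λ*uhs*/(*hs* + *b*)) with respect to *b* to zero gives (*hs* + *b*)/(*hs* − *b*) = √((1 − *q*)λ/*q*). This closed form is plain algebra on the expression above and is not necessarily written in the same way as (C9). The sketch compares it with a golden-section search; the parameter values (*q* = 1/2, *h* = 1/2, *s* = 0.01, λ = 2) and function names are assumptions.

```python
import math

def segregating_load(b, u, h, s, lam, q):
    """q * 2u*hs/(hs - b) + (1 - q) * 2*lam*u*hs/(hs + b), valid for 0 <= b < hs."""
    hs = h * s
    return q * 2 * u * hs / (hs - b) + (1 - q) * 2 * lam * u * hs / (hs + b)

def golden_section_min(f, lo, hi, tol=1e-12):
    """Minimize a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi = d
        else:
            lo = c
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2.0

u, h, s, lam, q = 1e-8, 0.5, 0.01, 2.0, 0.5     # assumed illustrative values
hs = h * s

ratio = math.sqrt((1 - q) * lam / q)
b_closed = hs * (ratio - 1.0) / (ratio + 1.0)   # from setting dL/db = 0
b_numeric = golden_section_min(
    lambda b: segregating_load(b, u, h, s, lam, q), 0.0, 0.99 * hs
)
print(f"closed form: {b_closed:.4e}, numerical: {b_numeric:.4e}")
```

With *q* = 1/2 the closed form is positive only when λ > 1, consistent with the statement that gBGC can reduce the load only under a mutational bias toward AT.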

## APPENDIX D: NUMERICAL VALIDATION OF EQUATIONS 12 AND 14

#### Equations 12:

Equations 12 are approximations for the exact Equation 10 that gives the load due to codominant mutations (*h* = 1/2) in panmixia. Figure D1 and Figure D2 show that, for *S* deleterious alleles, (12a) is a very accurate approximation for (11) for a wide range of parameters. This is also true for *W* deleterious alleles (not shown):

(a) N = 10,000; b = 0, b = 2 × 10^{−3}, b = 2 × 10^{−4}, b = 5 × 10^{−5} (from left to right). (b) b = 2 × 10^{−4}; s = 10^{−2}, s = 10^{−3}, s = 10^{−4}, s = 10^{−5} (from top left to bottom left).

#### Equation 14:

The total genomic load is computed as explained in appendix b. Then the value of *b* that minimizes the load was numerically computed through a simple iterative algorithm. Numerical results were compared to Equation 14. Results are given (×10^{−3}) for various mutational biases (λ), population sizes (*N*), and shape of the gamma distribution of deleterious mutations (β). (See Table 6).

## Acknowledgments

I thank N. Galtier, L. Duret, A. Popa, and C. Gautier for discussions and two anonymous reviewers for helpful insights and suggestions to improve the manuscript. I also thank L. Duret and A. Necsulea for sharing a manuscript before publication. This is manuscript Institut des Sciences de l'Evolution de Montpellier ISEM 2010-029. This work was supported by the French Centre National de la Recherche Scientifique and Agence Nationale de la Recherche (ANR-08-GENM-036-01).

## Footnotes

Communicating editor: S. G. Hamish

- Received March 5, 2010.
- Accepted April 19, 2010.

- Copyright © 2010 by the Genetics Society of America