## Abstract

Heterosis is a widespread phenomenon corresponding to the increase in fitness following crosses between individuals from different populations or lines relative to their parents. Its genetic basis has been a topic of controversy since the early 20th century. The masking of recessive deleterious mutations in hybrids likely explains a substantial part of heterosis. The dynamics and consequences of these mutations have thus been studied in depth. Recently, it was suggested that GC-biased gene conversion (gBGC) might strongly affect the fate of deleterious mutations and may have significant fitness consequences. gBGC is a recombination-associated process mimicking selection in favor of G and C alleles, which can interfere with selection, for instance by increasing the frequency of GC deleterious mutations. I investigated how gBGC could affect the amount and genetic structure of heterosis through an analysis of the interaction between gBGC and selection in subdivided populations. To do so, I analyzed the infinite island model both by numerical computations and by analytical approximations. I showed that gBGC might have little impact on the total amount of heterosis but could greatly affect its genetic basis.

Crosses between individuals from different populations or lines often result in increased fitness relative to the parental fitness. This process, referred to as heterosis or hybrid vigor, has many practical and fundamental implications. It has long been used by plant and animal breeders to create high-yield or high-performance varieties, especially in maize (Duvick 2001). More recently, the potential role of heterosis has been emphasized and debated concerning conservation issues. Proposals included using heterosis generated by migration to reinforce populations, *i.e.*, a form of genetic rescue (Richards 2000; Willi and Fischer 2005; Hogg *et al.* 2006; Willi *et al.* 2007), or to understand the success of hybrid invaders (Facon *et al.* 2005). Heterosis may also play a role in the evolution of life-history traits, such as dispersal (Guillaume and Perrin 2006; Roze and Rousset 2009) or outcrossing (Ronfort and Couvet 1995; Theodorou and Couvet 2002).

The consequences and uses of heterosis partly depend on its genetic basis, which has long been debated with respect to plant and animal breeding and in evolutionary genetics. Initially, Davenport (1908) proposed that the negative effects of inbreeding in humans could be explained by the unmasking of recessive deleterious alleles. On the contrary, East (1908) and Shull (1911) suggested overdominance could explain hybrid vigor in maize, and Shull proposed the term “heterosis” as a descriptive term for hybrid vigor irrespective of the mechanism (Shull 1914). While the masking of recessive deleterious alleles in outbred and hybrid individuals is generally the main mechanism explaining inbreeding depression and heterosis, overdominant loci have been found in several studies (*e.g.*, Hua *et al.* 2003; Semel *et al.* 2006). Epistatic interactions between loci can also generate substantial heterosis (*e.g.*, Yu *et al.* 1997).

Whatever the underlying genetic basis, heterosis is possible only if parental lines or populations exhibit different genetic compositions. For instance, during the intentional creation of inbred parental lines, different alleles become fixed in each line and are eventually pooled in subsequent hybrids. Genetic drift is another way to create variance in allele frequencies between populations. Since migration between subpopulations homogenizes the genetic composition, higher heterosis is expected in small and highly subdivided populations, and it is also expected that heterosis increases with population differentiation (Whitlock *et al.* 2000; Glémin *et al.* 2003; Roze and Rousset 2004). These theoretical predictions have been confirmed in various species, including wild flowering plants (Richards 2000; Willi and Fischer 2005), freshwater snails (Escobar *et al.* 2008), and crops (Reif *et al.* 2003).

In small and highly subdivided populations, heterosis is strongly associated with the so-called local “drift load,” *i.e.*, the load due to the local fixation of deleterious alleles (Whitlock *et al.* 2000), and it has been suggested that heterosis could be used as a proxy for evaluating the drift load in wild populations (Glémin *et al.* 2003). Forces that affect the drift load might potentially affect heterosis too. Recently, I showed that GC-biased gene conversion (gBGC), which is a molecular process associated with recombination, might substantially affect inbreeding depression and the mutation load (Glémin 2010). gBGC is a kind of meiotic drive occurring on the nucleotide scale during recombination. In heterozygotes, heteroduplex strands generated during recombination lead to DNA mismatches. In several species, mismatch repair is biased toward G and C bases over A and T bases, resulting in an excess of G and C gametes (Marais 2003). This kind of gene conversion process mimics selection in favor of G and C (Nagylaki 1983a,b). Initially, gBGC was put forward to explain peculiar nucleotide landscapes, especially in mammals and birds (Duret and Galtier 2009). More recently, it was shown to interfere with selection, potentially leading to the fixation of weakly deleterious GC alleles (Galtier and Duret 2007; Galtier *et al.* 2009). Theoretically, I showed that gBGC can induce “fixation load” without drift; moreover, under certain conditions, gBGC can make the dynamics of GC deleterious alleles similar to those noted in overdominance (Glémin 2010). It thus seems reasonable to suggest that gBGC could also affect the amount and pattern of heterosis. However, the interaction of gBGC and selection in subdivided populations has not yet been studied. The aim of this article is thus to investigate how gBGC could affect heterosis through an analysis of the interaction between gBGC and selection in subdivided populations. 
To do so, I analyzed the infinite island model, both by numerical computations and by analytical approximations. First, I added gBGC to the numerical approach of Whitlock *et al.* (2000), which is based on Wright's equation (Wright 1937). Then, to get a better understanding of the process, I used previous approaches that give approximate solutions for selection in subdivided populations (Glémin *et al.* 2003; Roze and Rousset 2003, 2004). I focused on how gBGC can affect the total amount and the genetic basis of heterosis.

## MODEL

#### Basic assumptions:

Throughout the article, I considered heterosis only as being caused by the masking of partially recessive deleterious alleles in hybrid individuals. As in Glémin (2010), I considered a single biallelic locus with weak (*W* = A or T) and strong (*S* = G or C) alleles. If the *S* (resp. *W*) allele is deleterious, the relative fitnesses of *WW* (resp. *SS*), *WS*, and *SS* (resp. *WW*) genotypes are 1, 1 − *hs*, and 1 − *s*, respectively, where *s* is the selection coefficient and *h* the dominance coefficient. The life cycle is as follows: *N* diploid adults produce gametes after conversion followed by mutation events. Heterozygous individuals produce *S* alleles with probability (1 + *b*)/2 and *W* alleles with probability (1 − *b*)/2, where *b* is the gBGC coefficient (disparity coefficient *sensu* Nagylaki 1983a). The *W* allele then mutates at rate *u* to the *S* allele, and the reverse mutation occurs at rate *v* = λ*u*, where λ is the mutation bias from *S* to *W* alleles, which ranges from 2 to 4.5 in many different organisms (Lynch 2007). Fertilization occurs at random and is followed by selection, migration, and regulation to *N* individuals [“soft selection” model (*e.g.*, Whitlock 2002; Roze and Rousset 2003)]. For clarity, hard selection is not investigated here but could be taken into account using Roze and Rousset's (2003) or Whitlock's (2002) framework (see also discussion). I assume that *u* and *v* are much smaller than the selection and conversion coefficients (*u*, *v* ≪ *s*, *hs*, and *b*). For simplicity, I also assume that the gBGC intensity is constant. However, strong gBGC events are thought to be associated with short-lived recombination hotspots, at least in humans (McVean *et al.* 2004; Myers *et al.* 2005). I thus implicitly assume that gBGC/selection dynamics are shorter than the recombination hotspot life span. The validity of this assumption is discussed in Glémin (2010).
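The life cycle described above can be illustrated with a minimal deterministic sketch for a single, very large, isolated deme (no drift, no migration). This is not the author's Mathematica script: the function name `next_generation` and its arguments are hypothetical, and the cycle is started at fertilization rather than at gamete production, which only rotates the order of events given in the text.

```python
def next_generation(x, s, h, b, u, lam, s_is_deleterious=True):
    """Return the S-allele frequency among zygotes one generation later.

    x   : frequency of the S allele among zygotes (random mating assumed)
    s,h : selection and dominance coefficients
    b   : gBGC coefficient; lam : mutation bias, so that v = lam * u
    """
    # Genotype frequencies at fertilization (random union of gametes).
    f_ss, f_ws, f_ww = x * x, 2.0 * x * (1.0 - x), (1.0 - x) ** 2
    # Fitnesses 1, 1 - hs, 1 - s, assigned according to which allele is deleterious.
    if s_is_deleterious:
        w_ss, w_ws, w_ww = 1.0 - s, 1.0 - h * s, 1.0
    else:
        w_ss, w_ws, w_ww = 1.0, 1.0 - h * s, 1.0 - s
    mean_w = f_ss * w_ss + f_ws * w_ws + f_ww * w_ww
    a_ss = f_ss * w_ss / mean_w   # SS adults after selection
    a_ws = f_ws * w_ws / mean_w   # WS adults after selection
    # Biased conversion: heterozygotes transmit S with probability (1 + b) / 2.
    x_conv = a_ss + a_ws * (1.0 + b) / 2.0
    # Mutation: W -> S at rate u, S -> W at rate v = lam * u.
    v = lam * u
    return x_conv * (1.0 - v) + (1.0 - x_conv) * u
```

With *s* = 0 and *u* = 0 the recursion reduces to *x*′ = *x* + *bx*(1 − *x*), which is the sense in which gBGC mimics genic selection in favor of *S*.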

Heterosis can be defined as the increase in fitness of individuals derived from crosses between populations relative to individuals derived from crosses within a population,

$$H = 1 - \frac{W_{\mathrm{within}}}{W_{\mathrm{between}}}, \tag{1}$$

where *W*_{within} is the average fitness of individuals produced by random mating within demes and *W*_{between} is the average fitness of individuals whose parents are sampled from different demes. Other authors have also defined *W*_{between} as the fitness of individuals whose parents are sampled from the whole metapopulation (Whitlock 2002; Roze and Rousset 2004, 2009). The first definition better matches experimental designs while the second one is more mathematically convenient. However, as the number of demes tends toward infinity, the two definitions become equivalent.

#### Wright's infinite island model with gBGC:

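In the infinite island model, heterosis can be expressed as a function of deleterious allele frequencies as

$$H = 1 - \frac{1 - 2hs\,E[x_i(1 - x_i)] - s\,E[x_i^2]}{1 - 2hs\,x(1 - x) - s\,x^2} \tag{2}$$

(*e.g.*, Roze and Rousset 2004, 2009), where *x*_{i} is the local frequency of the deleterious allele in deme *i*, *E* denotes averaging over all demes, and *x* is the average frequency over the whole metapopulation. We thus have $E[x_i^2] = V[x_i] + x^2$ and (2) reduces to

$$H = \frac{s(1 - 2h)\,V[x_i]}{1 - 2hs\,x(1 - x) - s\,x^2} \approx s(1 - 2h)\,V[x_i], \tag{3}$$

where *V* denotes the variance of allele frequencies over all demes. The right-hand term is an approximated expression for weak selection and/or for deleterious alleles maintained at low frequencies (*e.g.*, Glémin *et al.* 2003; Roze and Rousset 2004, 2009).

*x* and *V*[*x*_{i}] can be computed using Wright's distribution of allele frequencies in the infinite island model (Wright 1937), which gives the distribution of allele frequencies *within a deme*, assuming all demes are equivalent and migration is symmetrical, as

$$\Phi(x_i) = \frac{K}{V_{\delta x_i}} \exp\left(2\int \frac{M_{\delta x_i}}{V_{\delta x_i}}\,dx_i\right), \tag{4}$$

with

$$M_{\delta x_i} = -s\,x_i(1 - x_i)\,[h + (1 - 2h)x_i] + b\,x_i(1 - x_i) + m(x - x_i) + u(1 - x_i) - \lambda u\,x_i$$

if *S* is the deleterious allele,

$$M_{\delta x_i} = -s\,x_i(1 - x_i)\,[h + (1 - 2h)x_i] - b\,x_i(1 - x_i) + m(x - x_i) + \lambda u(1 - x_i) - u\,x_i$$

if *W* is the deleterious allele, and $V_{\delta x_i} = x_i(1 - x_i)/2N$, where *m* is the migration rate. This leads to

$$\Phi(x_i) = K\, e^{-2Ns\,x_i[2h + (1 - 2h)x_i] + 4Nb\,x_i}\; x_i^{4N(mx + u) - 1}\,(1 - x_i)^{4N(m(1 - x) + \lambda u) - 1} \tag{5a}$$

if *S* is the deleterious allele and

$$\Phi(x_i) = K\, e^{-2Ns\,x_i[2h + (1 - 2h)x_i] - 4Nb\,x_i}\; x_i^{4N(mx + \lambda u) - 1}\,(1 - x_i)^{4N(m(1 - x) + u) - 1} \tag{5b}$$

if *W* is the deleterious allele. $M_{\delta x_i}$ is obtained by assuming that selection, gBGC, migration, and mutation are weak enough to neglect interaction terms between these elementary processes (without migration, see Equations 5a and 5b in Glémin 2010). *K* is an integration constant, ensuring that $\int_0^1 \Phi(x_i)\,dx_i = 1$, and *x* can be computed numerically by an iteration procedure until $x = \int_0^1 x_i\,\Phi(x_i)\,dx_i$ (Barton and Rouhani 1993) and using the Kimura *et al.* (1963) quadrature method (for adaptation to gBGC and selection see Glémin 2010). This was done using a Mathematica script (Wolfram 1996) available on request. *V*[*x*_{i}] is then given by $\int_0^1 x_i^2\,\Phi(x_i)\,dx_i - x^2$. This approach is the same as in Whitlock *et al.* (2000), including gBGC in addition to selection. There is no general explicit analytical solution for (5a) and (5b); however, approximations can be obtained as developed below.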
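To make the link between heterosis and the variance of allele frequencies concrete, the sketch below computes heterosis from a list of local deleterious-allele frequencies. It assumes that heterosis is measured as 1 − *W*_{within}/*W*_{between}, that mating is random within demes, and that *W*_{between} is taken with parents sampled from the whole set of demes; the function name `heterosis_from_demes` is hypothetical and the deme frequencies in the usage note are made-up illustration values, not results from the article.

```python
from statistics import fmean, pvariance

def heterosis_from_demes(freqs, s, h):
    """Return (exact, approx) heterosis for local frequencies `freqs`.

    exact  : 1 - W_within / W_between with fitnesses 1, 1 - hs, 1 - s
    approx : s * (1 - 2h) * V[x_i], the weak-selection form of Equation 3
    """
    freqs = list(freqs)
    x_bar = fmean(freqs)
    # Mean fitness of offspring produced by random mating inside each deme.
    w_within = fmean(1.0 - 2.0 * h * s * x * (1.0 - x) - s * x * x for x in freqs)
    # Mean fitness when the two parents come from the metapopulation at large.
    w_between = 1.0 - 2.0 * h * s * x_bar * (1.0 - x_bar) - s * x_bar ** 2
    exact = 1.0 - w_within / w_between
    approx = s * (1.0 - 2.0 * h) * pvariance(freqs)
    return exact, approx
```

For instance, `heterosis_from_demes([0.0, 0.2], s=0.1, h=0.0)` returns two values that differ only through the denominator *W*_{between}, which is close to 1 when selection is weak or the allele is rare.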

#### Analytical approximations:

##### Weak selection:

When selection is weak and migration not too small, we can use the approach developed by Roze and Rousset (2003, 2004). They showed that heterosis can also be expressed as

$$H \approx s(1 - 2h)\,F\,x(1 - x) \tag{6}$$

(Roze and Rousset 2004, 2009), where *F* is the probability of coalescence within the same deme of two genes sampled with replacement from the same deme, and it is equivalent to *F*_{ST} in the infinite neutral island model. *x* can then be computed using diffusion approximations according to Roze and Rousset (2003). Using the direct fitness method (Rousset and Billiard 2000), the infinitesimal expected change in allele frequency *in the whole metapopulation* is given by (7a) (appendix a) if *S* is the deleterious allele and by (7b) if *W* is the deleterious allele, with , , and . As I consider only the special case of the infinite island model, , and *x* can be simply obtained by solving .

##### Strong selection:

When selection is strong and/or migration low, the previous method is not very accurate. When ∼*Nhs* > 5, the method developed by Glémin *et al.* (2003), adapted from Ohta and Kimura's moment method (Ohta and Kimura 1969, 1971), can be used instead. The aim is to obtain an analytical expression for *V*[*x*_{i}] to incorporate in Equation 3. Basically, the rationale is to obtain a set of linear equations as functions of moments of the distribution, Φ. Let $M_{\delta x_i}$ be the infinitesimal expected change of allele frequency in deme *i*, $V_{\delta x_i}$ be its variance, and $W_{\delta x_i \delta x_j}$ be the covariance of the change between demes *i* and *j*. For any function *f*(*x*_{1},…, *x*_{i},…), Ohta and Kimura (1969, 1971) showed that

$$\frac{\partial E[f]}{\partial t} = E\left[\frac{1}{2}\sum_i V_{\delta x_i}\frac{\partial^2 f}{\partial x_i^2} + \sum_{i>j} W_{\delta x_i \delta x_j}\frac{\partial^2 f}{\partial x_i\,\partial x_j} + \sum_i M_{\delta x_i}\frac{\partial f}{\partial x_i}\right], \tag{8a}$$

which reduces to

$$\frac{\partial E[f]}{\partial t} = E\left[\frac{1}{2}\sum_i V_{\delta x_i}\frac{\partial^2 f}{\partial x_i^2} + \sum_i M_{\delta x_i}\frac{\partial f}{\partial x_i}\right] \tag{8b}$$

because $W_{\delta x_i \delta x_j} = 0$ (Glémin *et al.* 2003).

At equilibrium, $\partial E[f]/\partial t = 0$. By choosing appropriate *f* functions, expressions can be obtained for each moment of Φ. However, this method leads to an infinite system of equations. To be solved, the system of recurrence equations must be closed, which can be done by linearizing (Glémin *et al.* 2003) or by arbitrarily assuming that moments vanish beyond a given order (Glémin 2005; Theodorou and Couvet 2006). Here, I used the linearization method (see appendix b). This assumes that selection is strong enough to maintain the average allele frequency at its deterministic equilibrium (as predicted by Glémin 2010). While this approximation for *x* is very rough, it leads to quite good approximations for the variance, *V*[*x*_{i}], *i.e.*, the main determinant of heterosis (Glémin *et al.* 2003).

#### Multilocus predictions:

To assess the quantitative effect of gBGC, multilocus heterosis can be computed under the assumption of the multiplicative contribution to heterosis of all *L* loci throughout the genome. Assuming a fraction *p* of these loci are affected by gBGC, and a given distribution of fitness effects of mutations (DFEM), ψ, we have (9) where *H*^{0} and *H*^{b} denote heterosis without and with gBGC, respectively. Assuming half of deleterious mutations are *S* and the other half are *W*, *H*_{total} = (*H*_{S} + *H*_{W})/2. I tested various more or less heterogeneous gBGC distributions. Indeed, recombination and conversion are not distributed similarly in all organisms. For instance, recombination is probably not localized in hotspots in *Caenorhabditis elegans* (Rockman and Kruglyak 2009). I considered that gBGC occurred in hotspots spanning *p* = 3% of the genome with *b* = 0.0002 on average [according to estimates in humans (Spencer *et al.* 2006)] or *p* = 5% and *b* = 0.0005. I also considered gBGC that was homogeneously distributed over the whole genome (*p* = 100%) with *b* = 6 × 10^{−6} (that is, 3% × 0.0002) or restricted to very hot hotspots with *b* = 0.002 in frequency *p* = 0.3%. I assumed the DFEM was gamma distributed with mean *s*_{0} and shape parameter α,

$$\psi(s) = \frac{(\alpha/s_0)^{\alpha}}{\Gamma(\alpha)}\, s^{\alpha - 1}\, e^{-\alpha s/s_0}, \tag{10}$$

where Γ is the gamma function (Abramowitz and Stegun 1970). I used *s*_{0} = 0.0325 and α = 0.23 [according to estimates in humans (Eyre-Walker *et al.* 2006)]. Since heterosis strongly depends on dominance levels of mutations, which are poorly known, I explored different dominance levels. Equation 9 cannot be directly integrated using routine functions such as NIntegrate in Mathematica (Wolfram 1996) because the iteration procedure is needed for each *s* (see above). The gamma distribution (Equation 10) was thus discretized into 100 categories, according to Yang (1994).
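The discretization step can be sketched as follows. The code assumes equal-probability categories, each represented by its mean, which is one of the options described by Yang (1994); whether the article used category means or medians is not stated, so this choice is an assumption. The function names are hypothetical, and only the Python standard library is used (the incomplete gamma function is evaluated by its power series).

```python
import math

def reg_lower_gamma(a, z):
    """Regularized lower incomplete gamma function P(a, z), power-series form."""
    if z <= 0.0:
        return 0.0
    term = 1.0 / a            # n = 0 term of sum_n z^n / (a (a+1) ... (a+n))
    total = term
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def gamma_quantile(a, q):
    """Quantile of a Gamma(shape=a, scale=1) variable, found by bisection."""
    lo, hi = 0.0, 1.0
    while reg_lower_gamma(a, hi) < q:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(a, mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def discretize_gamma(mean, shape, k):
    """Means of k equal-probability categories of a gamma with given mean and shape."""
    # Category boundaries in units of z = shape * s / mean.
    bounds = [0.0] + [gamma_quantile(shape, i / k) for i in range(1, k)]
    # Integral of z * density up to each boundary equals shape * P(shape + 1, z).
    cum = [reg_lower_gamma(shape + 1.0, z) for z in bounds] + [1.0]
    return [mean * k * (cum[i + 1] - cum[i]) for i in range(k)]
```

Calling `discretize_gamma(0.0325, 0.23, 100)` gives 100 selection coefficients, each carrying weight 1/100, whose average equals the mean of the distribution by construction; single-locus heterosis can then be evaluated once per category instead of inside a numerical integral.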

## RESULTS

#### Single-locus results and approximate solutions:

For weak selection and not too small migration, very good approximations are obtained by solving Equations 7a and 7b (see Figures 1–3). However, the analytical expressions are complicated and not given here. Simpler and more useful approximations can be obtained under specific conditions.

The case of *W* deleterious alleles is straightforward because selection and gBGC act in the same direction against *W* alleles. The deleterious allele is maintained at a lower frequency than under mutation/selection equilibrium, and the variance and heterosis are also reduced. For weak selection, neglecting back mutation and terms in *x*^{2} and higher in (7a), and assuming *u* ≪ *s*, *b*, leads to (11) (see appendix a). For strong selection (see appendix b), (12). As expected, gBGC slightly reduces heterosis.

When the *S* allele is deleterious, Glémin (2010) showed that the interaction between selection and gBGC leads to three selection regimes. If gBGC is weak relative to selection (*b* < *hs*), the deleterious *S* allele is maintained at low frequency, but slightly higher than at mutation/selection equilibrium, which leads to the same equations as (11) and (12), while replacing *b* by −*b* and λ*u* by *u*.

On the contrary, if gBGC is strong (*b* > (1 − *h*)*s*), the deleterious allele is close to fixation. For weak selection, linearizing (7b) in *x* around 1 and neglecting direct mutation leads to (13) (see appendix a). For strong selection (see appendix b), (14).

Finally, if gBGC and selection are of similar intensity (*hs* < *b* < (1 − *h*)*s*), the two forces can maintain the deleterious allele at intermediate frequency. For weak selection, neglecting mutations and solving (7b) gives (15) (see appendix a). For strong selection (see appendix b), (16). Note that the conditions for maintenance of the deleterious allele at intermediate frequency, and thus for (15) being positive, are a bit more restrictive than in an unstructured population, , and reduce to the single-population conditions, *hs* < *b* < (1 − *h*)*s*, when *Nm* ≫ 1. This parallels the effect of inbreeding in single populations (Glémin 2010). Analytical approximations and numerical results (Figures 1 and 2) show that gBGC can strongly increase heterosis, mainly under the overdominant-like regime.

Equations 15 and 16 also show that under the overdominant-like regime heterosis is mainly independent of mutation rates, contrary to the other regimes (Equations 11–14). At the genome scale, this means that if there are numerous such loci, heterosis can be high, even with a low mutation rate. For other loci the total genomic mutation rate, not the number of loci, matters (*i.e.*, few loci with high mutation rates are roughly equivalent to many loci with low mutation rates).
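The three regimes for *S* deleterious alleles can be summarized in a small helper. This is only a sketch of the single-population conditions quoted in the text (*b* < *hs*, *hs* < *b* < (1 − *h*)*s*, *b* > (1 − *h*)*s*); it ignores the slightly narrower conditions that apply in structured populations. The regime labels and function names are hypothetical, and the intermediate equilibrium is the weak-selection deterministic value obtained by setting *x*(1 − *x*)[*b* − *s*(*h* + (1 − 2*h*)*x*)] to zero, which is an assumption of this sketch rather than a formula copied from the article.

```python
def gbgc_regime(b, s, h):
    """Classify an S deleterious mutation by the relative strength of gBGC and selection."""
    if b < h * s:
        return "selection-dominated"   # allele kept at low frequency
    if b > (1.0 - h) * s:
        return "gBGC-dominated"        # allele driven close to fixation
    return "overdominant-like"         # allele held at intermediate frequency

def intermediate_equilibrium(b, s, h):
    """Weak-selection deterministic equilibrium frequency, valid for h < 1/2
    and hs < b < (1 - h)s, neglecting mutation and population structure."""
    return (b - h * s) / ((1.0 - 2.0 * h) * s)
```

Under these assumptions the equilibrium frequency rises from 0 at *b* = *hs* to 1 at *b* = (1 − *h*)*s* and equals 1/2 at *b* = *s*/2, where *x*(1 − *x*) is largest.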

For both weak and strong selection, an analysis of Equations 15 and 16 shows that heterosis is maximal for *b* = *s*/2, that is, the gBGC value maximizing the load and inbreeding depression as well (Glémin 2010). For this parameter range, heterosis can be increased by a factor of ∼*hs*/4*u* compared with the case without gBGC (see appendixes a and b). This can be very high if the mutation rate is low, *i.e.*, ∼10–1000. However, when gBGC is much higher than the selection, it reduces heterosis because it drives *S* deleterious alleles to fixation on the metapopulation scale (Figure 2). Conversely, for a given gBGC level, mutations maximizing heterosis are *S* mutations with effect , that is, mutations with effects between 2*b* and , or for fully recessive mutations (appendixes a and b). Mutations maximizing heterosis are thus mainly independent of the number of migrants, *Nm*, contrary to what was observed without gBGC (Whitlock *et al.* 2000).
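The order of magnitude of this factor can be recovered with a back-of-the-envelope calculation. The sketch below is not the derivation of the appendixes: it assumes the weak-selection deterministic equilibrium of the overdominant-like regime in a single population, $x^{*} = (b - hs)/[(1 - 2h)s]$, the classical mutation/selection frequency $x_0 \approx u/(hs)$ without gBGC, and heterosis proportional to $x(1 - x)$ with the same constant $c$ in both cases. The effects of population structure on these frequencies are ignored.

```latex
\begin{align*}
x^{*} &= \frac{b - hs}{(1 - 2h)s},
  \qquad x^{*} = \tfrac{1}{2} \iff b = hs + \tfrac{(1 - 2h)s}{2} = \tfrac{s}{2},\\
H^{b} &\approx c\,x^{*}(1 - x^{*}) = \frac{c}{4}
  \quad \text{at } b = \tfrac{s}{2},\\
H^{0} &\approx c\,x_{0}(1 - x_{0}) \approx c\,\frac{u}{hs},\\
\frac{H^{b}}{H^{0}} &\approx \frac{hs}{4u}.
\end{align*}
```

For illustration, with *h* = 0.2, *s* = 0.01, and *u* = 10^{−6}, this ratio is 500. Read the other way, for a fixed *b* the same equilibrium equals 1/2 when *s* = 2*b*.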

Another surprising consequence of the interaction of gBGC and selection is that heterosis can reach a local maximum for intermediate migrant numbers, especially for highly recessive *W* → *S* mutations. In the case of fully recessive alleles, this local maximum is reached for (Figure 3 and appendix a). This can be explained because on one hand subdivision increases local drift, which increases heterosis, and on the other hand, subdivision increases local homozygosity, which reduces gBGC and thus heterosis. Because of these two opposing forces, maximum heterosis can thus be reached for intermediate levels of subdivision.

#### Quantitative patterns:

Single-locus analyses clearly showed that gBGC can affect the heterosis pattern, especially the contribution of different classes of mutations to heterosis. With gBGC, mutations contributing to heterosis are mainly restricted to those between *b*/(1 − *h*) and *b*/*h*, while the distribution is more homogeneous without gBGC and depends on the population size and migration (Figure 1). Moreover, assuming the DFEM follows a gamma distribution, *W* → *S* mutations of intermediate effect also contribute more than others, although they are not the most frequent. These mutations can contribute from a few to 10% of the total amount of heterosis while they represent only <1% of the total arising new mutations (Table 1). On the contrary, without gBGC, mutations with a very weak effect contribute the most to heterosis, simply because they are the most frequent.
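The proportion of new mutations that fall in the window *b*/(1 − *h*) < *s* < *b*/*h* can be estimated directly from the gamma DFEM. The sketch below assumes, as in the multilocus section, that half of deleterious mutations are *S* and that a fraction *p* of loci experience gBGC; the function names are hypothetical and the value of *h* in the usage note is an arbitrary illustration, not a value taken from Table 1.

```python
import math

def gamma_cdf(s, mean, shape):
    """CDF of a gamma distribution with the given mean and shape (series form)."""
    if s <= 0.0:
        return 0.0
    if math.isinf(s):
        return 1.0
    z = shape * s / mean
    term = 1.0 / shape
    total = term
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= z / (shape + n)
        total += term
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))

def window_fraction(b, h, mean, shape, p=1.0, frac_s=0.5):
    """Fraction of all new mutations that are S mutations, lie in gBGC-affected
    loci, and have b/(1 - h) < s < b/h."""
    lower = b / (1.0 - h)
    upper = b / h if h > 0.0 else math.inf   # fully recessive: no upper bound
    in_window = gamma_cdf(upper, mean, shape) - gamma_cdf(lower, mean, shape)
    return frac_s * p * in_window
```

With *b* = 0.0002, *p* = 3%, *s*_{0} = 0.0325, α = 0.23, and an assumed *h* = 0.2, `window_fraction(0.0002, 0.2, 0.0325, 0.23, p=0.03)` is well below 1%, in line with the statement that these mutations are a small minority of new mutations.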

However, the total amount of heterosis is only slightly affected by gBGC, because it has opposite effects for different kinds of mutations. While it increases heterosis for rather strong *S* mutations (*s* > *b*/(1 − *h*)), it decreases it for weak ones (*s* < *b*/(1 − *h*)) and for all *W* mutations. Low gBGC, but widespread throughout the genome, has almost no effect. Only a rather high level of gBGC can significantly increase heterosis, mostly when deleterious mutations are highly recessive (Table 1).

## DISCUSSION

Biased gene conversion can greatly affect the fate of selected alleles (Gutz and Leslie 1976; Lamb and Helmi 1982; Nagylaki 1983a,b; Bengtsson 1990). In several organisms, gBGC is a genome-wide biased gene conversion mechanism that can have quantitative fitness consequences. Recently, I showed that gBGC may strongly affect the mutation load, both qualitatively and quantitatively (Glémin 2010). Even rather weak gBGC restricted to a few genomic regions can create a substantial load by driving GC deleterious alleles to high frequency and fixation. I also showed that the interaction between selection and gBGC can generate an overdominant-like regime, maintaining recessive deleterious mutations at intermediate frequency for long periods of time, thus generating inbreeding depression. Here I investigated how gBGC could affect the fate of deleterious alleles in subdivided populations and thus another aspect of fitness, *i.e*., heterosis, which measures the increase in fitness due to crosses between demes relative to crosses within demes.

Contrary to its effect on the load, gBGC has only a very weak quantitative impact on heterosis, at least for levels in the range of what has been observed in a few species (Table 1). First, while heterosis is expected to be higher when local demes are small and isolated, gBGC intensity is reduced under these conditions because homozygosity is higher. Second, gBGC has opposite effects on *W* and *S* deleterious mutations. Its effect on *S* deleterious mutations is much stronger than its effect on *W* ones, so that, on average, gBGC strongly increases the frequency of deleterious mutations and hence the load. Heterosis, however, depends on segregating mutations. Provided that *b* > *hs*, all weakly deleterious mutations affected by gBGC contribute to the load, while only those maintained at intermediate frequencies, with ∼*b*/(1 − *h*) < *s* < *b*/*h*, significantly increase heterosis. Weakly deleterious mutations fixed by gBGC increase the load but do not contribute to heterosis. The quantitative impact of gBGC is thus strongly dependent on the distribution of fitness and dominance effects of mutations. Numerical values explored in Table 1 suggest that in various contexts the quantitative effect of gBGC should be weak. However, a better characterization of these parameters is needed to evaluate the proportion of mutations subject to the overdominant-like regime.

While gBGC should have a weak quantitative effect on heterosis, it may strongly affect its genetic basis. First, for a given gBGC level, *S* deleterious mutations corresponding to the overdominant-like regime contribute disproportionally to heterosis. For instance, using selection parameters from human data and several dominance levels, such mutations can cause a few to 10% of heterosis while representing only <1% of the whole spectrum of mutations (see Table 1 for several numerical examples). Interestingly, these mutations cause both heterosis and inbreeding depression (Glémin 2010). On the contrary, without gBGC, these two phenomena are expected to have a different basis, with heterosis being mainly caused by weakly deleterious mutations while inbreeding depression is mainly caused by strongly deleterious ones (Whitlock *et al.* 2000; Glémin *et al.* 2003). This variation in the genetic basis of heterosis may also depend on the degree of population structure, as gBGC may have a maximum effect for intermediate migrant numbers, *Nm* (Figure 3), and the range of mutations belonging to the overdominant-like regime depends on *Nm* (see Equation 15).

The genetic basis of heterosis has been debated for more than a century. These theoretical results suggest that gBGC should be taken into account as a potential additional factor, especially in species for which gBGC can be strong. This is typically the case for maize, probably the plant most studied for heterosis, and it belongs to the grass family in which gBGC is supposed to be strong (Glémin *et al.* 2006; Haudry *et al.* 2008; Escobar *et al.* 2010). Massive genomic data on maize are now available to study heterosis at the molecular level, both through crossing experiments (Swanson-Wagner *et al.* 2009) and through population genetics approaches (Gore *et al.* 2009). The results presented here lead to several predictions, which should now be testable on the basis of these new data. First, genes belonging to highly recombining regions should contribute disproportionally to heterosis compared to those in other genomic regions. This prediction is robust because such regions may have a higher local effective population size (Gordo and Charlesworth 2001) and thus contribute less to heterosis if gBGC is inactive. Second, *W*/*S* polymorphism should be involved more frequently than expected and the two alleles should segregate at intermediate frequencies, especially under the “overdominant-like” selection regime. However, it is worth noting that, contrary to their overdominant population behavior, such alleles should give clear dominance/recessive signatures when heterosis is estimated through cross-designs.

Here, I investigated only the case of soft selection. However, the conclusions should be very similar under hard selection. Using the hard selection model given by Roze and Rousset (2003), *i.e.*, there is no local regulation before migration contrary to the life cycle studied here, the results are expected to be very close to those obtained under soft selection, except when deme sizes are very small, as shown by Roze and Rousset (2003, 2004). Whitlock (2002) initially proposed another model of hard selection where the contribution of each deme to the next generation is proportional to the mean relative fitness of the deme. Under this model, selection occurs between unrelated individuals (at the metapopulation scale) and it is thus stronger than under soft selection, where related individuals compete locally. Heterosis is thus expected to be lower; however, the effect of gBGC is similar. Indeed, this model is equivalent to a single inbreeding population, replacing *F*_{IS} by *F*_{ST} (Whitlock 2002), so that single-population results given in Glémin (2010) can be directly used (results not shown).

In summary, gBGC likely has a negligible effect on the overall magnitude of heterosis in natural or breeding populations. However, these results strongly suggest that it should be taken into account when dissecting the genetic basis of heterosis. To do so, quantifying the magnitude and distribution of gBGC throughout genomes in various organisms will be a critical issue for future studies.

## APPENDIX A: APPROXIMATION FOR WEAK SELECTION

#### Derivation of Equations 7:

In the limit of weak selection and weak gBGC, the gBGC and selection model is equivalent to changing parameters as follows: (A1) (Glémin 2010). We can thus directly use these expressions in Equation 23 in Roze and Rousset (2003) to get Equations 7a and 7b. We can also derive Equations 7a and 7b by considering additively the effect of selection (Equation 23 in Roze and Rousset 2003) and gBGC as a form of genic selection (Equation 23 with *h* = 1/2 and *s* = 2*b* in Roze and Rousset 2003).

Finally, a more complete derivation can also be obtained by modifying the equation of Roze and Rousset (2003) to include gBGC. Here I present the case of *S* deleterious alleles. Consider the Wright island model with *n* demes of size *N*. The infinitesimal expected change in the deleterious allele frequency in the whole metapopulation, *x*, can be expressed as (A2) where *W*_{i,j,k} (*k* = 1, 2) is the fitness function defined at the gene lineage level, that is, the expected number of gene copies left by gene lineage *k* in individual *j* in population *i*. Similarly, *x*_{i,j,k} is the frequency of the deleterious allele in gene lineage *k* in individual *j* in population *i*. *x*_{i,j,k} = 0 or 1 and *x*_{i,j} = 0, 1/2, or 1. Note that according to this definition the total fitness of the population sums up to 2. One can write *W*_{i,j,k} as the product of a selection component, *F*_{i,j}(*s*) (equivalent to *W*_{i,j} in Roze and Rousset 2003), and a gBGC component, *B*_{i,j,k}: Taylor's expansion of (A2) in *s* and *b* gives (A3) The first term, in *s*, is the same as that in Equation 13 in Roze and Rousset (2003) and leads to their Equation 23. The second term, in *b*, is expressed as (A4) The right-hand term is based on the fact that and . Roze and Rousset (2003) showed that (Equation 16) , where *r*_{0} is the probability of coalescence of two genes sampled with replacement from the same individual in a metapopulation with an infinite number of demes. Under panmixia, . Equation A4 thus becomes , which is equivalent to the expected change in allele frequency under genic selection. These computations lead to Equations 7a and 7b.

#### Analysis of Equations 7:

For *W* deleterious mutations, linearizing (7a) in *x* near 0 and neglecting back mutation gives (A5), which leads to (11).

For *S* deleterious mutations with *b* < *hs*, linearizing (7b) in *x* near 0 and neglecting back mutation gives (A6), which leads to Equation 11 with the appropriate replacement of *b* by −*b* and λ*u* by *u*.

For *b* > (1 − *h*)*s*, linearizing (7b) in *x* near 1 and neglecting direct mutations gives (A7), which leads to (13).

Finally, for the overdominant-like regime, *hs* < *b* < (1 − *h*)*s*, neglecting both direct and back mutations leads to (A8), which leads to (15).

Solving in *b* (for *x*_{S} given by Equation 15) shows that heterosis is maximal for *b* = *s*/2, for which , while for *b* = 0, heterosis is only . Under this condition, we thus have .

Solving (for *x*_{S} given by Equation 15) in *s* shows that mutations maximizing heterosis are *S* mutations with . Finally, we can also show that heterosis can reach a local maximum for an intermediate number of migrants by solving in *Nm*, using Equation 15 for *H*_{S}, that is, considering only the overdominant-like selection regime. The full result is very cumbersome and not given here. Simple approximations can be obtained for fully recessive alleles. Noting β = *b*/*s*, Taylor expansion of the solution in β simply gives

## APPENDIX B: APPROXIMATION FOR STRONG SELECTION

Using Equation 8b with appropriate *f* functions, one can get a set of equations for the moments of Φ. To solve this system, the expected change in allele frequency must be linear in *x*_{i} (see Glémin *et al.* 2003). For *W* deleterious mutations, linearizing in *x*_{i} around 0 and neglecting back mutation give (B1) Using and in (8b) leads to (B2) Recalling that *E*[*x*_{i}] = *x*, the solution of (B2) is (B3) Since *x*^{2} is in *O*(*u*^{2}), , which can be inserted in (3) and gives (12).

Similarly, for *S* deleterious alleles, for gBGC weaker than selection, *b* < *hs*, linearizing in *x*_{i} around 0, neglecting back mutations, using the same *f* functions, and solving the system with the same approximations gives (B4a) and (B4b), which leads to (12) with the appropriate replacement of *b* by −*b* and λ*u* by *u*.

For gBGC stronger than selection, *b* > (1 − *h*)*s*, linearizing in *x*_{i} around 1, neglecting direct mutations, using the same *f* functions, and solving the system leads to (B5a) and (B5b), which leads to (B5c) and to Equation 14.

Finally, for the overdominant-like regime, *hs* < *b* < (1 − *h*)*s*, one can linearize in *x*_{i} around the deterministic equilibrium, (Glémin 2010), and neglect both direct and back mutations. Using the same *f* functions and solving the system leads to (B6a) and (B6b), which leads to (B6c) and to Equation 16.

## Acknowledgments

This is manuscript ISEM 2010-111. I thank F. Rousset for mathematical help. This work was supported by the French Centre National de la Recherche Scientifique and Agence Nationale de la Recherche (ANR-08-GENM-036-01).

## Footnotes

Communicating editor: H. G. Spencer

- Received July 11, 2010.
- Accepted October 13, 2010.

- Copyright © 2011 by the Genetics Society of America