## Abstract

Both the spatial distribution of organisms and their mode of reproduction have important effects on the change in allele frequencies within populations. In this article, we study the combined effect of population structure and the rate of partial selfing of organisms on the efficiency of selection against recurrent deleterious mutations. Assuming an island model of population structure and weak selection, we express the mutation load, the within- and between-deme inbreeding depression, and heterosis as functions of the frequency of deleterious mutants in the metapopulation; we then use a diffusion model to calculate an expression for the equilibrium probability distribution of this frequency of deleterious mutants. This allows us to derive approximations for the average mutant frequency, mutation load, inbreeding depression, and heterosis, the simplest ones being Equations 35–39 in the text. We find that population structure can help to purge recessive deleterious mutations and reduce the load for some parameter values (in particular when the dominance coefficient of these mutations is <0.2–0.3), but that this effect is reversed when the selfing rate is above a given value. Conversely, within-deme inbreeding depression always decreases, while heterosis always increases, with the degree of population subdivision, for all selfing rates.

THE reproductive system of organisms greatly affects the change in genotype frequencies within populations, which may in return influence the evolution of the reproductive biology of a species. Recessive deleterious mutations, for example, are likely to play an important role in the evolution of self-fertilization. Although the average frequency of such mutations in a population decreases as the rate of selfing increases, they cause a reduction in fitness of selfed progeny relative to outcrossed progeny. This inbreeding depression may be sufficient in some cases to prevent higher rates of selfing to evolve. Direct advantages, however, may be associated with selfing: if the proportion of male gametes used by an individual for self-fertilization is negligible, a selfer should enjoy a 50% reproductive advantage in a completely outcrossing population (the “cost of outcrossing”) and rapidly spread to fixation in the absence of other factors. From such considerations, it has been argued that increased levels of selfing should be selected if inbreeding depression is lower than the cost of outcrossing, while selfing should decrease otherwise (Lande and Schemske 1985). Since inbreeding depression is a decreasing function of the selfing rate, the only possible evolutionary stable equilibria should correspond to complete selfing or complete outcrossing (Lande and Schemske 1985; Charlesworth* et al.* 1990). Intermediate rates of self-fertilization, however, are commonly observed in plants (Schemske and Lande 1985), and various hypotheses have been proposed to explain this, on the basis of different dispersal abilities of selfed and outcrossed progeny (Holsinger 1986), pollination ecology (Johnston 1998), variations in inbreeding depression across space and time (Cheptou and Mathias 2001; Cheptou and Schoen 2002), or population spatial structure (Ronfort and Couvet 1995). This last study assumed “mass-action pollination” (competition between self and outcross pollen to pollinate the ovules), which may also explain in part the evolution of intermediate selfing rates (Uyenoyama* et al.* 1993, p. 340).

By changing the relative importance of natural selection and genetic drift on allele frequency variations, population structure affects the equilibrium frequency of deleterious mutations and the magnitude of inbreeding depression. One can define different forms of inbreeding depression in a subdivided population; in this article we define “within-deme inbreeding depression” (δ_{IS}) as the reduction in fitness of selfed progeny relative to outcrossed progeny from the same deme and “between-deme inbreeding depression” (δ_{IT}) as the reduction in fitness of selfed progeny relative to progeny obtained by outcrossing randomly over the whole metapopulation. We also define heterosis (δ_{ST}) as the reduction in fitness of progeny obtained by outcrossing within demes relative to that of progeny obtained by outcrossing randomly over the whole metapopulation. To have a better understanding of the role of population structure in the evolution of mating systems, the combined effects of self-fertilization and population structure on selection against deleterious mutants need to be elucidated. For island models of population structure, Wright (1937) obtained an expression for the equilibrium distribution of allele frequencies across demes, assuming that frequencies change slowly within each deme (that is, for sufficiently weak selection, mutation and migration, and large deme size); Whitlock (2002) used this distribution to obtain approximations for the frequency of deleterious mutations, the mutation load, and inbreeding depression in a population subdivided into a very large number of demes (deterministic equilibrium). He found that population structure can substantially reduce within-deme inbreeding depression (due to increased drift within subpopulations), while it can increase or decrease the frequency of deleterious mutants and the mutation load, depending on the parameter values, dominance in particular. He also considered the effects of inbreeding, by introducing the inbreeding coefficient *F*_{IS} to measure heterozygote deficits within demes, and found that the mutation load decreases as *F*_{IS} increases. Theodorou and Couvet (2002) presented results relative to an island model of population structure with an infinite number of demes, obtained by solving numerically a system of recurrence equations. They considered the case of partially selfing individuals and separated pollen from seed migration. They found, among other results, that population structure has almost no effect on within-deme inbreeding depression when the selfing rate is high and that the mutation load is higher under pollen migration than under seed migration. In a recent article, Glémin* et al.* (2003) derived approximations for the mutation load, inbreeding depression, and heterosis in a structured population, considering both the island model and the one-dimensional stepping-stone model of population structure. They assumed that deleterious alleles remain at a low frequency within each deme and that they are present only in the heterozygous stage, which implies that selection is strong relative to drift within demes, and deleterious mutations are not too recessive. They found that population structure decreases within-deme inbreeding depression, while it increases between-deme inbreeding depression and heterosis. As in Whitlock (2002), they introduced the inbreeding coefficient *F*_{IS} to measure heterozygote deficits within demes and showed that increasing *F*_{IS} reduces within- and between-deme inbreeding depression, as well as heterosis.

In this article, we use a diffusion model to calculate the average frequency of deleterious alleles, the mutation load, inbreeding depression, and heterosis in a population of partially selfing individuals, subdivided according to an island model. The model assumes a large (but finite) number of demes and weak selection; deme size and migration rate can be arbitrarily large or small, as long as the selection coefficient is smaller than *m* + 1/*N*, where *m* is the migration rate and *N* is the number of adult individuals per deme. Although we assume that migration occurs only in the diploid phase, the model could be easily extended to include gametic migration. As in Theodorou and Couvet (2002), Whitlock (2002), and Glémin* et al.* (2003), we consider a one locus-two alleles model. Whether results from such models can be easily extended to situations with several selected loci remains questionable, since both partial selfing and population structure introduce statistical associations among loci (Weir and Cockerham 1973; Vitalis and Couvet 2001). In the case of nonsubdivided populations, however, Charlesworth* et al.* (1990) have shown that the effect of these associations on inbreeding depression is negligible as long as selection is weak at each locus. These results were obtained from a model of unlinked loci, but Charlesworth* et al.* (1992) showed that the rate of recombination between selected loci has only a small effect on the mutation load and inbreeding depression, at least under weak selection. Epistasis, however, will affect mutation load and inbreeding depression (Charlesworth* et al.* 1991).

First, we introduce some methodology developed by other authors to study the effects of self-fertilization on the mean frequency of deleterious alleles, mutation load, and inbreeding depression in finite nonsubdivided populations; we also give approximations for the case of very large populations (deterministic solutions). Second, we then extend these models to the case of structured populations. We define measures of within-deme inbreeding depression, between-deme inbreeding depression, and heterosis and derive simple relations between these quantities; we then use a diffusion model to calculate their expected values. In the general case, these values are obtained by numerical integration of a complicated function; in some cases, however, they can be approximated by deterministic solutions. Under weak migration and large deme size, these deterministic solutions are equivalent to the ones obtained by Whitlock (2002) in his soft selection model; as we do not make any assumption on migration rate and deme size, however, we can obtain precise solutions for arbitrarily large migration rates and/or small deme sizes. We show that the effects of population structure on the frequency of deleterious mutations and the mutation load depend critically on the dominance coefficient of these mutations and on the rate of partial selfing in the population. In particular, we show that population structure can have different qualitative effects on these quantities depending on the selfing rate, a result that, to our knowledge, has not been shown before. Finally, we compare the solutions of Glémin* et al.* (2003) and our solutions with simulation results and find that the solutions of Glémin *et al.* are more accurate than ours when selection is stronger than migration (*s* > *m*) and the dominance coefficient of deleterious mutations is not small, while our method is more accurate in the opposite cases (*s* < *m* and/or low dominance coefficient).

## NONSUBDIVIDED POPULATION

We assume that two alleles *A* and *a* at a given locus segregate in a population of *N* monoecious individuals. We call *p* the frequency of *A*, and *q* = 1 − *p* the frequency of *a*. At the beginning of each generation, each of the *N* adults produces a very large number of gametes and dies; we assume that allele *A* has a deleterious effect, such that *aa*, *Aa*, and *AA* individuals produce a number of gametes proportional to 1, 1 − *hs*, and 1 − *s*, respectively. Self-fertilization occurs at a rate α; more precisely, a proportion α of fertilizations involve two gametes produced by the same individual, while the other 1 − α involve two gametes taken at random among all gametes produced (therefore when α = 0 self-fertilization occurs at rate 1/*N*). Finally, *N* individuals are sampled randomly among the large number of juveniles produced to form the next adult generation. At each generation, mutation from *a* to *A* occurs at rate *u*, while mutation from *A* to *a* occurs at rate *v*.

The mutation load *L* is defined as the reduction in mean fitness of a population due to the constant input of deleterious mutations (Crow 1970). Calling *p _{aa}*,

*p*, and

_{Aa}*p*the frequencies of the three genotypes in the population, we have 1

_{AA}Genotypic frequencies are expressed as functions of *p*, the frequency of allele *A*, by the relations *p _{aa}* =

*q*

^{2}+

*Fpq*,

*p*= 2(1 −

_{Aa}*F*)

*pq*, and

*p*=

_{AA}*p*

^{2}+

*Fpq*, where

*F*is the correlation between uniting gametes due to nonrandom mating (Wright's

*F*

_{IS}). Although

*F*depends on the selection coefficient, it is sufficient to evaluate it in the neutral case to obtain an expression of the load to the first order in

*s*; under neutrality and with a rate of selfing α,

*F*= α/(2 − α) (

*e.g.*, Gillespie 1998, p. 93). This gives an expression of the load as a function of

*p*: 2

Inbreeding depression, δ, is defined as the reduction in fitness of an individual produced by selfing, relative to the fitness of an individual produced by outcrossing (Charlesworth and Charlesworth 1987), which gives 3giving 4

In a population of finite size, at equilibrium, allele *A* will be in frequency *p* with a given probability φ(*p*). To obtain the mean load and inbreeding depression at equilibrium, one has to replace *p* and *pq* in Equations 2 and 4 by ∫*p*φ(*p*)*dp* and ∫*pq*φ(*p*)*dp*. An approximation for φ(*p*) can be obtained from diffusion methods: assuming that *N* is large and that *s*, *u*, and *v* are small (of order 1/*N*), the change in frequency of *A* over generations can be approximated by a diffusion process, with drift and diffusion coefficients
5
6

(Caballero and Hill 1992) with 7

*M*_{δ}* _{p}* and

*V*

_{δ}

*measure the mean and variance of the change in frequency of*

_{p}*A*over one generation; φ(

*p*) is expressed as a function of these two quantities by the relation 8

(*e.g.*, Ewens 1979), where *K* is a constant such that probabilities sum to one. This gives
9

This distribution has to be integrated numerically to obtain its first two moments and the average values of the mutation load and inbreeding depression. This was done by Bataillon and Kirkpatrick (2000), who showed in particular that the mean load is a decreasing function of population size, while the mean inbreeding depression increases with population size, these effects being marked mostly when population size is small. Some insights can be gained by considering the values of *p*, *L*, and δ in the limit as population size tends to infinity (deterministic equilibrium); these values can be obtained simply by solving *M*_{δ}* _{p}* = 0 for

*p*and injecting this equilibrium value of

*p*in Equations 2 and 4. Simpler expressions are obtained by neglecting back mutation (

*v*= 0); one may also neglect terms in

*p*

^{2}, at least when

*h*> 0, while when

*h*= 0 these terms have to be conserved. This leads to different solutions for the cases

*h*= 0 and

*h*≫ 0; these solutions are given in Table 1 for α = 0 (no self-fertilization), and α > 0. From the expressions given in Table 1, one finds easily that self-fertilization decreases the frequency of deleterious mutants and the mutation load as long as 0 <

*h*< 1 and decreases inbreeding depression as long as 0 ≤

*h*<

^{1}/

_{2}. One can also see that while the load depends only on

*u*under random mating, it also depends on

*h*and α when self-fertilization occurs (Ohta and Cockerham 1974).

## POPULATION SUBDIVISION

We now turn to the case of a subdivided population. We assume an island model of population structure, where *n* is the number of demes and *N* the number of adult individuals per deme. Adult individuals produce a large number of gametes, still in relative proportions 1, 1 − *hs*, and 1 − *s* depending on their genotype. Fertilization occurs within each deme with a rate α of selfing, and the juveniles produced migrate at a rate *m*. We distinguish soft and hard selection as follows: under soft selection, the number of juveniles per deme is regulated to a constant value (the same for all demes) just before migration, while under hard selection there is no such regulation, and thus the number of juveniles produced by a deme at the time of migration depends on the genotypes of the parents that were in that deme (Christiansen 1975, Equation 1; Nagylaki 1992, p. 134); other definitions of soft and hard selection exist in the literature (Wallace 1975; Whitlock 2002). Finally, once migration has occurred, *N* individuals are sampled in each deme to form the next adult generation.

### Mutation load, inbreeding depression, and heterosis:

We call *p _{aai}*,

*p*, and

_{Aai}*p*the frequencies of the three genotypes in deme

_{AAi}*i*. Given all demes have the same size

*N*, the mutation load is given by the expression 10where the overbar means the average over all demes. To obtain the average load in the population at equilibrium,

*p̅*

_{A̅a̅i̅}and

*p̅*

_{A̅A̅i̅}have to be integrated over a probability distribution Φ(

**p**), where

**p**is a vector representing all genotypic frequencies in all demes, and Φ(

**p**) gives the probability that these frequencies equal

**p**at equilibrium. In the following we use a diffusion model to approximate these values; previous studies have shown that diffusion approximations can be used to describe the change in allele frequency in an island model with increasing accuracy as increasingly accurate methods are used to compute the drift and diffusion coefficients (Maruyama 1983; Cherry and Wakeley 2003; Roze and Rousset 2003; Wakeley 2003; Whitlock 2003). The approximations obtained are accurate as long as the number of demes

*n*is large and the migration rate is not too small (approximately

*m*>

*s*). These approximations are obtained by considering limit processes when

*s*,

*u*, and

*v*are of order 1/

*n*, so that when

*s*tends to zero,

*n*tends to infinity, and

*u*and

*v*tend to zero; it is assumed that the products

*ns*,

*nu*, and

*nv*tend to finite values in this limit.

To obtain an expression of the load to the first order in *s*, Equation 10 shows that it is sufficient to calculate the average over Φ of *p̅*_{A̅a̅i̅} and *p̅*_{A̅A̅i̅} in the limit when *s* tends to zero. In this limit, *p̅*_{A̅a̅i̅} and *p̅*_{A̅A̅i̅} can be expressed as functions of *p*, the frequency of *A* in the whole metapopulation (Roze and Rousset 2003). Indeed as *s* tends to zero and *n* tends to infinity, the ancestral lineages of the two homologous genes of an individual can either stay in the same deme and coalesce (with a probability that we call *r*_{0}) or migrate to different demes (with probability 1 − *r*_{0}), in which case they will take an infinite time to coalesce. As *s* tends to zero and *n* tends to infinity, the frequency of *A* in the whole population does not change over the (finite) coalescence time within demes; therefore the probability that the two homologous genes of an individual are *A*, averaged over all individuals, can be written
11and the same reasoning gives
12

*r*_{0} measures deviations of genotype frequencies relative to expectations based on random union of gametes in the metapopulation; it is therefore equivalent to Wright's *F*_{IT}. Since *p̅*_{A̅a̅i̅} and *p̅*_{A̅A̅i̅} can be expressed as functions of *p* and *r*_{0}, we just need the value of *r*_{0} and the probability distribution of *p* at equilibrium, still in the diffusion limit (*n* tends to infinity; *s*, *u*, and *v* tend to zero), instead of the whole distribution of frequencies in the different demes. As in the previous section, we call φ(*p*) this distribution; from Equations 10, 11, and 12, the mean load at equilibrium, to first order in *s*, is the average over φ of
13

Inbreeding depression in a structured population may be defined in several ways (Johnston and Schoen 1994; Theodorou and Couvet 2002; Whitlock 2002). In the following we call *f*_{s}* _{i}* the mean fecundity of individuals produced by selfing in deme

*i*,

*f*

_{o}

*the mean fecundity of individuals produced by outcrossing in deme*

_{i}*i*(meaning that both parental gametes are sampled randomly from deme

*i*), and

*f*

_{b}the mean fecundity of individuals produced by outcrossing over the whole metapopulation. Within-deme inbreeding depression δ

_{IS}is defined as (where the overbar means the average over all demes), between-deme inbreeding depression δ

_{IT}as , and heterosis δ

_{ST}as . δ

_{IT}actually combines the effects of within-deme inbreeding depression and heterosis; to the first order in

*s*, we have 14

This relation is exact in Theodorou and Couvet (2002), as they define δ_{IS} as 1 − *f̅*_{s̅i̅}/*f̅*_{o̅i̅} (*i.e.*, from the ratio of average *f* 's rather than from the average ratio); it is exact to the first order in *s* only with the definition we use. δ_{IS} and δ_{IT} correspond to δ_{1} and δ_{2} in Whitlock (2002).

One can use the same reasoning as above to express δ_{IS}, δ_{IT}, and δ_{ST} as functions of *p*. For example, δ_{IS} is the average over *i* of
15where *p _{i}* is the frequency of

*A*in deme

*i*. This gives 16

In the limit as *s*, *u*, and *v* tend to zero and *n* tends to infinity, *p̅*_{i̅}*q̅*_{i̅} and *p̅*^{2̅}_{i̅} can be expressed as functions of *p* and a new variable *r*^{R}_{1}, which measures the probability of coalescence within their deme of the ancestral lineages of two genes sampled with replacement from the same deme, still in the diffusion limit (Roze and Rousset 2003):
17
18

*r*^{R}_{1} is equivalent to Wright's original definition of *F*_{ST}. Calling *r*_{1} the probability that the ancestral lineages of two genes sampled in two *different* individuals from the same deme coalesce, in the limit as *n* tends to infinity and *s*, *u*, and *v* tend to zero, we have the relation
19

*r*_{1} is equivalent to other common definitions of *F*_{ST}. Equations 11, 12, and 16–19 give
20

Similarly, one arrives at 21 22

The expressions above imply that
23where is the “relatedness coefficient” in an island model (Michod and Hamilton 1980; Roze and Rousset 2003). Again, the average values of δ_{IS}, δ_{IT}, and δ_{ST} are obtained by integrating these expressions over φ.

### Probability distribution of the frequency of *A* at equilibrium:

To calculate φ in the diffusion limit we can use Equation 8, where *M*_{δ}* _{p}* and

*V*

_{δ}

*still represent the mean and variance in the change in frequency of*

_{p}*A*over one generation.

*M*

_{δ}

*and*

_{p}*V*

_{δ}

*can be expressed as functions of*

_{p}*p*and of several variables measuring various probabilities of coalescence, which have to be evaluated in the limit as

*n*tends to infinity and

*s*,

*u*, and

*v*tend to zero (Roze and Rousset 2003). We have already defined

*r*

_{0}as the probability of coalescence of the two homologous genes of an individual and

*r*

_{1}as the probability of coalescence of two genes sampled in two different individuals from the same deme; we now define

*a*as the probability that the ancestral lineages of the two homologous genes of an individual, plus a third gene sampled in a different individual from the same deme, all stay in the same deme and coalesce, and

*c*as the probability that these three lineages separate into different demes before any two of them coalesce.

*r*

_{0},

*r*

_{1},

*a*, and

*c*correspond to

*r*

^{D}

_{0},

*r*

^{D}

_{1},

*a*

^{D}, and

*c*

^{D}in Roze and Rousset (2003). Under soft selection, the mean change in frequency of

*A*over a generation takes the form 24

(Roze and Rousset 2003), with
25and
26while the variance in the change in frequency of *A* is given by
27with
28

Under hard selection, *S*_{1} and *S*_{2} become
29
30

When *m* = 1, *r*_{1} = 0 and the expressions of *N*_{e}, *S*_{1}, and *S*_{2} under hard selection take the same form as in Equation 7 (nonsubdivided population), with *F* being replaced by *r*_{0}. If *m* is small, assuming either hard or soft selection does not make much difference, as long as *N* is not small.

*r*_{0}, *r*_{1}, *a*, and *c* have been defined as probabilities of coalescence, which have to be computed in the limit as the number of demes tends to infinity, and when *s*, *u*, and *v* equal zero. These probabilities are obtained by writing recurrence equations and calculating equilibrium values; these recurrence equations are given in appendix A. In Roze and Rousset (2003) we also considered self-fertilization, but α was not defined exactly as in this article: it was the probability that two uniting gametes came from the same parent, so that when α was zero all uniting gametes came from different parents; here when α equals zero, two uniting gametes come from the same parent with probability 1/*N*; the present definition appears more frequently in the literature and proves more convenient to study the evolution of self-fertilization. For this reason, the recurrence equations given in appendix A are not exactly the same as those in Roze and Rousset (2003). The expressions of *a* and *c* are complicated, but simpler expressions can be obtained if we assume that *N* is large and *m* small—we assume that *m* is of order 1/*N* and neglect terms in *O*(*m*). Under these conditions we obtain (with *M* = *Nm*):
31
32

For large *m* and/or small *N*, however, the exact expressions obtained by solving the equations given in appendix A have to be used.

The expressions of *M*_{δ}* _{p}* and

*V*

_{δ}

*under our life cycle—Equations 24 and 27—take the same form as for a panmictic population—Equations 5 and 6. Therefore, at equilibrium, the probability that allele*

_{p}*A*is present in frequency

*p*in the metapopulation, φ(

*p*), is still given by Equation 9, now using the expression of

*N*

_{e}given by Equation 28 and the expressions of

*S*

_{1}and

*S*

_{2}given by Equations 25 and 26 for soft selection or those by Equations 29 and 30 for hard selection. Again, the averages of

*p*and

*pq*over this probability distribution have to be obtained by numerical integration.

## RESULTS

### Approximate solutions

Although we could not find any exact solution for the mean of *p* and *pq* over a probability distribution of the form of Equation 9, some approximations can be obtained, as explained in appendix B. A first approximation, valid when *h* > ∼0.3, or when *h* < 0.3 and *m* is small (see discussion at the end of appendix B for more details), is
33

This approximation also corresponds to the deterministic solution (very large number of demes), obtained after neglecting terms in *p*^{2}. When *h* < 0.3, a better approximation is
34

(from appendix B), where . In (33) and (34), *N*_{e} is given by Equation 28, while *S*_{1} and *S*_{2} are given by (25) and (26) if selection is soft and by (29) and (30) if selection is hard. Another aproximation for the case *h* = 0 and *m* not too small is given in appendix B, Equation B11.

Figure 1
compares these approximations with numerical integrations using the NIntegrate function of *Mathematica* 4.1 (Wolfram 1991). When *h* = 0, (33) and (34) give good results only for small values of *m*; as *h* increases, the range over which they are accurate increases, (34) giving better results than (33), while when *h* > 0.3, (33) gives good results for all values of *m*. Figure 1 corresponds to the case of random mating within demes (α = 0). We observed that the accuracy of approximations (33) and (34) increases as the selfing rate α increases; approximation (B11), however, gives good results only for small values of α (results not shown). Finally, we observed that the expression of the mutation load obtained from equations (13) and (33) gives good results for all values of *h*, even when *h* = 0 and *m* is large (not shown).

Approximations (33) and (34) are complicated functions of the parameters, as they depend on *r*_{0}, *r*_{1}, *a*, and *c*, which are the solutions of the equations given in appendix A. When *m* is small and *N* large, however, these solutions can be approximated by Equations 31 and 32, which leads to a simple expression for *S*_{1}. In that case, using (33) leads to (for both hard and soft selection)
35
36
37
38
39

Equations 35 and 36 can also be obtained from Whitlock's (2002) expressions for the mutant frequency and mutation load in an infinite-island model under soft selection (Equations 43, 44, and 46 in Whitlock 2002, with *b* = 0). Under hard selection we still obtain Equations 35–37 while Whitlock finds different results; this comes from the fact that our definition of hard selection is different from that in Whitlock (2002)(see Roze and Rousset 2003).

### Quantitative patterns

#### Effects of population structure:

Figure 2
shows the average frequency of the deleterious allele *A* as a function of the migration rate *m*, under hard selection (results under soft selection are qualitatively very similar), when α = 0. Here we integrated numerically over the distribution given by Equation 9, with *N*_{e}, *S*_{1}, and *S*_{2} given by Equations 28–30, and *r*_{0}, *r*_{1}, *a*, and *c* calculated from the recurrence equations given in appendix A. One can see from Figure 2A that the mutation-selection balance when *m* = 1 is not exactly the same as in a nonsubdivided population of the same total size; this comes from the fact that even when *m* = 1, population structure still has the effect of increasing the probability of self-fertilization, as we assume that fertilization occurs within demes before migration. This effect becomes negligible when deme size is large. For all parameter values that we tested, we found that reducing the migration rate up to a given point reduces the frequency of deleterious mutants when *h* < ∼^{1}/_{3}, and increases it when *h* > ∼^{1}/_{3}, which corresponds to what Whitlock (2002) had found in his soft-selection model. For small values of the migration rate, however, the frequency of deleterious mutants increases as *m* decreases, for all values of *h*, as can be seen on the left of Figure 2A. Indeed population structure has different effects on the selection against deleterious alleles: it increases homozygosity, which helps to purge recessive deleterious alleles, but it also decreases the efficiency of selection, by increasing the genetic similarity among competing individuals. When migration is very weak, demes are almost always fixed for *a* or *A* and selection has little effect; this explains why the curves go up on the left of Figure 2A.

Figure 3
shows that decreasing *N* increases the value of *m* that minimizes the mean frequency of recessive deleterious mutations. The deterministic approximation (35) predicts that the frequency of *A* should decrease when *Nm* decreases up to
40

If *Nm* is lower than this limit value, the frequency of *A* increases as *Nm* decreases. Although Equation 35 no longer predicts very well the mean frequency of *A* when *N* is small and/or *m* is large, we found that Equation 40 usually gives correct predictions of the position of the minimum. Equation 40 simplifies to when *h* = 0, and *Nm* = ^{1}/_{2} when *h* = 0.1; it also shows that the frequency of *A* always increases as *Nm* decreases for *h* > ^{1}/_{3}.

The effect of population structure on the mutation load is illustrated in Figure 4
; when *h* > 0 but < ∼^{1}/_{4}, decreasing *m* has a nonmonotonic effect on the load (the load first decreases and then increases), while when *h* = 0 or > ∼^{1}/_{4}, decreasing *m* always increases the load. When *h* is between 0 and ^{1}/_{4}, an approximation for the value of *Nm* that minimizes the load can be obtained from the deterministic approximation (36): the load should be minimal when *Nm* equals
41

For the parameter range that we have tested, we found that population structure can decrease the load substantially only when the number of demes is very large or when selection is moderate to strong. When the number of demes is not very large, population structure only very weakly reduces the load (and only for 0 < *h* < ^{1}/_{4}), while for small values of *m* the load still increases rapidly as *m* decreases (Figure 4, left).

When *h* < ^{1}/_{2}, population structure always decreases within-deme inbreeding depression (δ_{IS}) as illustrated by Figure 5A
and always increases heterosis (δ_{ST}; not shown). Between-deme inbreeding depression (δ_{IT}) can increase or decrease with the migration rate, depending on the value of *h*, as can be seen in Figure 5B. When *m* is small, however, δ_{IT} always increases as *m* decreases, for all values of *h*. From the deterministic approximation (38), one finds that δ_{IT} has a minimum when *Nm* equals
42indicating that when *h* > ^{1}/_{4}, δ_{IT} should always increase as *Nm* decreases; this corresponds to what we observed for most parameter values after numerical integration over the φ distribution.

#### Selfing and population structure:

We observed that, as in an undivided population, self-fertilization decreases the average frequency of the deleterious *A* allele, for 0 ≤ *h* ≤ 1 (results not shown). This purging effect of selfing is greatest for recessive mutations (*h* equal or close to zero). Moreover, the effect of population structure on the average frequency of *A* depends on the selfing rate α. We have seen in the previous section that, without selfing, population structure decreases the frequency of *A* when *h* < ∼^{1}/_{3} (as long as *Nm* is not too small); selfing decreases this purging effect of population structure and can even reverse it if α is high. This is illustrated by Figure 6
, which, like the previous figures, was obtained by numerical integration over the φ distribution, without making any assumption on *N* and *m*. From the deterministic approximation (35), one predicts that for values of α > ^{2}/_{3}, population structure should always increase the frequency of *A*, for all values of *h*; this corresponds to what we observed for most parameter values. Note that Figure 6 represents the mutant average frequency in a subdivided population, *relative* to its average frequency in a nonsubdivided population of the same total size; as both depend on α, one cannot deduce from the figure what is the absolute effect of α on the average mutant frequency (again, this effect is to decrease the mutant frequency, for all parameter values). The same is true for Figures 7 and 8
.

When *h* is significantly greater than zero, selfing reduces the mutation load in nonsubdivided populations (Table 1); we observed the same effect in structured populations (not shown). When *h* is equal or close to zero, however, selfing does not affect the load in large nonsubdivided populations (*L* = *u*), while in structured populations the load is minimal for an intermediate value of α; this can be seen from Equation 36 and is also observed in simulations (not shown). This nonmonotonic effect of α for small *h* may be due to the fact that selfing increases the efficiency of selection against recessive mutations (by increasing homozygosity), but also increases the effects of drift within demes, since selfing decreases *N*_{e} in finite populations.

We have seen that, without selfing, population subdivision decreases the mutation load when *h* > 0 but < ∼^{1}/_{4}, if the number of demes is very large or if selection is moderate to strong. This effect is attenuated by a low rate of selfing, as shown by Figure 7, and is reversed when the rate of selfing is moderate to strong. When *h* = 0 or >^{1}/_{4}, decreasing *Nm* always increases the load, for all values of α. Overall, we observed that, when α > ∼0.2, population structure always increases the mutation load, for all values of *h*. Calculations from the deterministic approximation (36) lead to a similar result.

Finally, we found that self-fertilization always decreases δ_{IS}, δ_{IT}, and δ_{ST} when 0 ≤ *h* ≤ ^{1}/_{2}. Selfing does not change the direction of the effect of population structure on δ_{IS} and δ_{ST}: decreasing *m* always decreases δ_{IS}, for all values of α between zero and one (Figure 8A), and always increases δ_{ST} (not shown), as long as 0 ≤ *h* ≤ ^{1}/_{2}. For very high selfing rates, however, the effect of population structure on δ_{IS} is very reduced and disappears completely when α = 1, as shown by Figure 8A. The effect of population structure on δ_{IT}, however, can change in direction depending on the selfing rate. We have seen that without selfing, decreasing *Nm* decreases δ_{IT} when *h* < ∼^{1}/_{4} (provided that *Nm* is not too small); Figure 8B shows that this effect is reversed when α is sufficiently high. When *h* > ^{1}/_{4}, however, δ_{IT} always increases as *Nm* decreases, for all values of α.

#### Effects of *h* and *s*:

Increasing the dominance coefficient of deleterious mutations (*h*) always increases the mutation load and decreases the average mutant frequency, the within- and between-deme inbreeding depression, and heterosis (results not shown). Indeed, selection is less effective against recessive mutations than against dominant mutations, and the fitness difference between heterozygotes and mutant homozygotes is greatest for recessive mutations, which explains the effects of *h* on *p*, δ_{IS}, δ_{IT}, and δ_{ST}. The effect of *h* on the mutation load can be explained by the fact that recessive mutants are eliminated most often in the homozygous state, while dominant mutants are eliminated most often in the heterozygous state: therefore it takes fewer genetic deaths to eliminate a given number of recessive mutations than to eliminate the same number of dominant mutations (*e.g.*, Crow and Kimura 1970, p. 301). Although in this article we restricted ourselves to the case 0 ≤ *h* ≤ 1, the effects of underdominant (*h* < 0) and overdominant (*h* > 1) mutations can be addressed by our model. We found similar qualitative results for dominant and overdominant mutations, while the effects of population structure and selfing on the average frequency, inbreeding depression, and heterosis due to underdominant and recessive mutations are qualitatively similar. When *h* < 0, the load has to be redefined to take into account the fact that heterozygotes have the highest fecundity (segregation load); in that case Whitlock (2002) found that population structure increases the load.

The effects of the coefficient of selection against deleterious mutants (*s*) are illustrated by Figure 9
. Although mutants become less frequent as *s* increases, the greater fitness difference between heterozygotes and mutant homozygotes causes inbreeding depression to increase; the effect of *s* on heterosis, however, is nonmonotonic, a result already obtained by Whitlock* et al.* (2000). Because our diffusion model does not give accurate results when *s* > *m* [*i.e.*, when log(*s*) > −2 in Figure 9], we could not observe this nonmonotonic effect from the model, but only from the simulations. The effect of *s* on the mutation load is also nonmonotonic. As *s* decreases, the average frequency of deleterious mutations increases, but their effect on fitness decreases; in the deterministic regime, these two effects exactly compensate each other, and the load does not depend on *s*. At low values of *s*, however, deterministic solutions become less and less accurate, and the average frequency of mutants becomes higher than the deterministic equilibrium, causing the load to increase. In the limit as *s* tends to zero, deleterious mutants have no effect on fitness and the load equals zero. Figure 9 shows that the value of *s* that maximizes the load (∼10^{−4} in Figure 9) is lower than the value of *s* that maximizes heterosis (∼10^{−2}).

## DISCUSSION

In this article we expressed the mutation load, inbreeding depression, and heterosis in an island model of population structure as functions of the first moments of the frequency distribution of a deleterious allele in the metapopulation. We then used a diffusion model to calculate this distribution. Diffusion methods for island models of population structure use the fact that, as the number of demes tends to infinity, the ancestral lineages of different genes from the same deme either stay in the same deme and coalesce relatively fast or move to a different deme and take a very long time to join again in a deme. This has been described by some authors as a separation of timescales (Ethier and Nagylaki 1980; Wakeley 2003). By neglecting the effect of selection on the average coalescence time within demes, one can then use neutral probabilities of coalescence to obtain expressions for the variance of allele frequencies among demes. This gives good results as long as selection is small relative to allele frequency fluctuations within demes, *i.e*., approximately when *s* < *m* + 1/*N* (Roze and Rousset 2003). Although the first moments of the probability distribution of the frequency of the deleterious allele have to be obtained by numerical integration, some approximations are possible, the simplest being Equation 33, which also corresponds to the deterministic equilibrium. We found that Equation 33 gives good results as long as *h* > ∼0.3, or as *h* < 0.3 and the migration rate *m* is small. If one then assumes that *m* is small and that deme size *N* is large, Equation 33 becomes a simple expression that leads to Equations 35–39. These are the simplest approximations for the mean mutant frequency, mutation load, inbreeding depression, and heterosis that we could obtain. The diffusion method also assumes a high number of demes and weak selection, but these hypotheses are in fact not very restrictive, as long as *s* < *m*, as illustrated in Table 3 of Roze and Rousset (2003). When *s* > *m*, we found that the solutions obtained by Glémin* et al.* (2003) are more accurate than ours, if *h* is not small (indeed the method used by Glémin *et al.* is not appropriate for small values of *h*, since they assume that deleterious alleles are present only in the heterozygous stage). This is illustrated by Figure 10
, which shows δ_{IS} calculated from Equations 3a, 8, and 10a from Glémin* et al*. (2003), from our model after numerical integration over the φ distribution, and from our approximation (37).

Whether the diffusion method can be used for other models of population structure remains unclear. Whitlock (2002) assumed that *M*_{δ}* _{p}* has the same form in a stepping-stone model as in the island model, but there is no proof of this assumption (see Rousset 2002; Rousset 2004, p. 82). Aware of this problem, Maruyama (1983) derived diffusion approximations by noting that

*M*

_{δ}

*/*

_{p}*V*

_{δ}

*can be computed despite the exact form of*

_{p}*M*

_{δ}

*being unknown (see also Rousset 2004, p. 151). Maruyama's argument holds for semidominant mutants but may not be easily extended to other cases, and we made no attempt in this direction. Although Whitlock's diffusion approximations may be reasonable for the parameter values he investigated (Whitlock 2003), a general argument is missing and further investigations would be required.*

_{p}We obtained simple results about the effects of selfing: the average mutant frequency, within- and between-deme inbreeding depression, and heterosis all decrease as the selfing rate increases; in most cases, the mutation load also decreases as selfing increases, except when deleterious mutations are fully (or almost fully) recessive and population structure is strong, in which cases the load is minimized for an intermediate value of the selfing rate. The effects of population structure are more complicated and depend on the rate of partial selfing. Without selfing, moderate population structure decreases the frequency of deleterious mutants, the mutation load, and between-deme inbreeding depression (δ_{IT}) when deleterious mutations have a dominance coefficient < ∼0.2–0.3. This purging effect of population structure is decreased by selfing and is reversed when the selfing rate is sufficiently high. When the dominance coefficient of deleterious mutations is >0.2–0.3, population structure always increases the mutant frequency, the mutation load, and between-deme inbreeding depression. Finally, when population structure is very strong, the mutant frequency, mutation load, and between-deme inbreeding depression always increase as population structure increases, for all selfing rates and dominance coefficients. The effects of population structure on within-deme inbreeding depression and heterosis are simpler: δ_{IS} always decreases, and δ_{ST} always increases, as the degree of population structure increases.

These results illustrate a double effect of population structure, already shown by Whitlock (2002) in models without selfing: (i) population structure helps to purge recessive deleterious mutations by increasing the average homozygosity, but (ii) also increases the genetic similarity among competing individuals, thereby reducing the effect of selection. The overall effect of population subdivision depends on the relative importance of these two effects. When the selfing rate is low, the first effect prevails when mutations are recessive (*h* < 0.2–0.3), which explains why the mutant frequency and the mutation load decrease as *Nm* decreases (unless *Nm* is very small); with a moderate to high selfing rate, however, selfing becomes more efficient than population structure in increasing homozygosity, and the second effect of population structure, which increases the frequency of deleterious mutants and the mutation load, prevails.

Results about the effects of the selection coefficient *s* show that weakly deleterious mutations have the strongest effect on the mutation load, while heterosis is maximum for moderately deleterious mutations, and inbreeding depression is maximum for strongly deleterious mutations (Figure 9). Whether heterosis, mutation load, and inbreeding depression are due mainly to weakly, moderately, or strongly deleterious mutations depends critically of course on the probability of occurrence of these different types of mutations, which at present remains poorly known.

Although we assumed throughout the article that migration in the gametic phase was absent, it can be incorporated into the model easily. We modified the life cycle considered in the present article to model a plant life cycle; in this modified model, self-fertilization occurs first (a proportion α of the ovules of a plant being fertilized by pollen produced by the same plant), then pollen migrates at a rate *m*_{p}, the 1 − α remaining ovules are fertilized randomly by pollen present in the same deme, and finally seeds migrate at a rate *m*_{s}. We found that when *m*_{p} and *m*_{s} are small and *N* is large, the deterministic solutions for the average mutant frequency, the mutation load, inbreeding depression, and heterosis are still given by Equations 35–39, *m* being replaced by *m*_{s} + (1 − α)*m*_{p}/2. When migration through seeds is very low or absent, and most migration occurs through pollen, the selfing rate α has a nonmonotonic effect on *p̅*, *L*, δ_{IT}, and δ_{ST}; this is due to the fact that selfing has now the additional effect of decreasing the rate of pollen migration. The effect of α on δ_{IS}, however, remains unchanged: δ_{IS} always decreases as α increases.

## APPENDIX A

To obtain *a* and *c*, we also need to calculate the probabilities that three genes sampled in three different individuals from the same deme all have a common ancestor (*d*) and that they have no common ancestor (*f*), in the infinite island model. The recurrence equations are
A1
A2
A3
A4
A5
A6which gives for *r*_{0} and *r*_{1},
A7
A8

The expressions of *a* and *c* are more complicated and not given here for space reasons, but are available by request to the authors.

## APPENDIX B

We seek convenient approximations for integrals of the form B1

There is no simple approximation for all conceivable ranges of parameter values, but a useful approximation can be obtained for |σ_{1}| large, from which other approximations can be derived. Write
B2, B3from Equation 13.2.1 in Abramowitz and Stegun (1972)(refered to hereafter as AS), where *M*(., ., .) is Kummer's confluent hypergeometric function (AS 13.1.2). From this, assuming that |σ_{1}| is large gives by AS 13.1.5,
B4
B5where *U*(., ., .) is the other Kummer function (AS 13.1.3).

The mean allele frequency is
B6with σ_{1} = 4*N*_{e}*S*_{1}, σ_{2} = 2*N*_{e}*S*_{2}, μ = 4*N*_{e}*u*, and ν = 4*N*_{e}*v*. Applying the approximation above, we obtain
B7

We assumed that |σ_{1}| was large to obtain (B7). After comparisons with numerical integrations, we found that (B7) is a good approximation as long as *ns* is of order 1 or higher. The fit is better as *m* decreases, because it increases *N*_{e}.

Using AS 19.12.4, we obtain from (B7),
B8where *U*(., .) is Whittaker's parabolic cylinder function.

If ≫ μ + , we can use AS 19.8.1, giving B9if we keep only the first term of AS 19.8.1, and B10if we add the second term.

If μ − ≫ , AS 19.9.1 gives B11 B12

In practice, assuming that *ns* is of order 1, the order of magnitude of depends critically on *h* and *m*. increases with *h*, and when *h* > 0.3, assuming that the mutation rate is not too large (<10^{−4}), is large enough for approximation (B9) to give good results. When *h* < ∼0.3, approximation (B9) can still be used when *m* is small enough, because increases faster than μ as *m* decreases; (B10) can be used for larger values of *m* than (B9). When *h* is very small (equal or close to zero), (B9) and (B10) work only for small values of *m*, while (B11) gives good results when *m* is not small. (B11) does not work well, however, when *h* is not very small.

The mean of *pq* over φ is given by
B13

Using the same approximation as in (B4), one finds that *p̅q̅* is also approximately equal to (B7); therefore the same approximations can be used for *p̅* and *p̅q̅*.

## Acknowledgments

We thank Denis Couvet, Sylvain Glémin, Yannis Michalakis, Kostas Theodorou, and two anonymous reviewers for helpful comments on previous versions of this manuscript. This is article ISEM 04-030.

## Footnotes

Communicating editor: M. K. Uyenoyama

- Received November 27, 2003.
- Accepted March 10, 2004.

- Genetics Society of America