An Inbreeding Model of Associative Overdominance During a Population Bottleneck
Nicolas Bierne, Anne Tsitrone, Patrice David


Associative overdominance, the fitness difference between heterozygotes and homozygotes at a neutral locus, is classically described using two categories of models: linkage disequilibrium in small populations or identity disequilibrium in infinite, partially selfing populations. In both cases, only equilibrium situations have been considered. In the present study, associative overdominance is related to the distribution of individual inbreeding levels (i.e., genomic autozygosity). Our model integrates the effects of physical linkage and variation in inbreeding history among individual pedigrees. Hence, linkage and identity disequilibrium, traditionally presented as alternatives, are summarized within a single framework. This allows studying nonequilibrium situations in which both occur simultaneously. The model is applied to the case of an infinite population undergoing a sustained population bottleneck. The effects of bottleneck size, mating system, marker gene diversity, deleterious genomic mutation parameters, and physical linkage are evaluated. Bottlenecks transiently generate much larger associative overdominance than observed in equilibrium finite populations and represent a plausible explanation of empirical results obtained, for instance, in marine species. Moreover, the main origin of associative overdominance is random variation in individual inbreeding whereas physical linkage has little effect.

CORRELATIONS between allozyme multilocus heterozygosity (MLH, the number of heterozygous loci per individual) and fitness-related traits have been under study for decades. Positive correlations have been reported for various organisms, especially marine bivalves (Zouros 1987), salmonid fishes (Leary et al. 1984), and pine trees (Bush et al. 1987; reviewed in Mitton and Grant 1984; Zouros and Foltz 1987; David 1998). When detected, correlations usually account for 3–6% of the observed phenotypic variance (Britten 1996; David 1998) and are often inconsistent across samples (Gaffney 1990; David and Jarne 1997). However, although the magnitude of heterozygosity-fitness correlations (HFC) is now well known, it has proven extremely difficult to distinguish between competing explanations of the phenomenon. As the neutral status of allozymes is questionable, the debate focused on two hypotheses. The “direct overdominance” hypothesis (Koehn and Shumway 1982; Mitton 1993) treats enzymes as the causative agent of the correlation, whereas in the “associative overdominance” hypothesis (Ohta 1971; Zouros et al. 1980), allozymes are neutral indicators of genetic conditions responsible for the correlation. Recently, HFC has been observed by using noncoding DNA markers (Bierne et al. 1998, 2000; Coltman et al. 1998; Coulson et al. 1998; Pogson and Fevolden 1998). Although some noncoding DNA is functional, these markers seem unlikely to be anything other than selectively neutral. Thus, these data favor the “associative” hypothesis. However, in the only direct comparison between allozymes and DNA markers to date, Pogson and Zouros (1994) reported an MLH-growth correlation with seven allozyme loci in the scallop Placopecten magellanicus, while eight anonymous restriction fragment length polymorphism (RFLP) loci failed to produce a significant correlation in the same sample.

The question of “direct” vs. “associative” hypothesis is only a first step before resolving the basis of HFC. Under the direct hypothesis, several genetic mechanisms, from single-locus true overdominance to more complex metabolic models involving epistasis (Koehn et al. 1988; Hawkins et al. 1989; Zouros and Mallet 1989; Clark 1991), could account for HFC. On the other hand, in a large, random mating population at equilibrium, heterozygosity at a few marker loci is poorly correlated with genomic heterozygosity (Chakraborty 1987). Therefore, associative effects need particular population structures. Models of associative overdominance have considered two possible sources of correlations among loci. First, the effect of linkage disequilibrium (correlation of allelic states within gametes) has been studied in finite populations at mutation-selection-drift equilibrium (Ohta and Kimura 1970, 1971; Ohta 1971, 1973; Zouros 1993; Pamilo and Pàlsson 1998). Because strong linkage disequilibria are mainly restricted to physically tightly linked loci (Hill and Robertson 1968), authors referring to this approach usually emphasized the role of physical linkage (but see Ohta 1973). For this reason, these effects were categorized as local effects (David et al. 1995). Second, the effect of identity disequilibrium (two-locus correlation of homozygosity) has been investigated in large populations with partial self-fertilization, at equilibrium (Ohta and Cockerham 1974; Strobeck 1979; Charlesworth 1991; Zouros 1993; David 1999). Physical linkage only slightly affects identity disequilibrium (Weir and Cockerham 1973). Therefore, this was referred to as a general effect (David et al. 1995). Although the inbreeding model may well fit populations of self-fertilized plants, neither recurrent inbreeding nor permanent small population size is likely to fit the population structure of marine organisms, where HFC was nevertheless often reported (David 1998). Other models of population structure, in which populations are not inevitably at equilibrium, have to be considered.

The aims of this study are (i) to fill the gap between the two classical approaches, by showing that effects of finite population size can be expressed in terms of inbreeding coefficients (f), just as in the case of systematic inbreeding, and by reevaluating the role of physical linkage in this context; and (ii) to relax the restrictive assumption of population equilibrium, investigating associative overdominance during a sustained population bottleneck.


In this section, we first show how the degree of associative overdominance can be related to inbreeding coefficients in a general way. This result is then used to analyze the degree of associative overdominance in a population experiencing a sustained bottleneck, assuming that variation in fitness is due to deleterious mutations.

The model: The rationale of the model is that homozygosity at a neutral locus correlates with genomic homozygosity through variation in individual inbreeding coefficients (f) and that inbreeding depression is the source of fitness variation. By inbreeding coefficient, we mean the average autozygosity over all loci in an individual, or, in other words, the proportion of loci that are homozygous for two alleles identical by descent (IBD) within an individual. This proportion depends on the pedigree of the individual and on the physical map (degree of linkage) of the loci (Weir et al. 1980). The definition of IBD for a pair of alleles is further detailed when considering the effects of particular population dynamics (bottleneck).

Consider that a fitness trait, W, is a linear function of the inbreeding coefficient (f), as expected if deleterious mutations have nonepistatic effects,

$$W_j = W_O - \beta f_j, \tag{1}$$

where $W_j$ is the trait value of an individual j with inbreeding coefficient $f_j$, $W_O$ is the trait value for outbred individuals, and β is the inbreeding load (Morton et al. 1956; β > 0 means inbreeding depression). In practice, there is usually a variance in fitness among equally inbred individuals (individuals with the same $f_j$). However, the derivations below remain correct if $W_j$ is replaced by $\bar{W}_j$, the conditional expectation of $W_j$ knowing $f_j$.

The expected magnitude of associative overdominance (AO, the difference between mean fitnesses of heterozygotes and homozygotes at the marker locus) is

$$\mathrm{AO} = E(W)_{\mathrm{het}} - E(W)_{\mathrm{hom}} = \beta\,[E(f)_{\mathrm{hom}} - E(f)_{\mathrm{het}}], \tag{2}$$

where $E(W)_{\mathrm{het}}$ [$E(f)_{\mathrm{het}}$] and $E(W)_{\mathrm{hom}}$ [$E(f)_{\mathrm{hom}}$] are the expected trait values (inbreeding coefficients) of heterozygotes and homozygotes at the marker, respectively.

The expected inbreeding coefficient of heterozygotes is

$$E(f)_{\mathrm{het}} = \int_0^1 f \,\Pr(f \mid \mathrm{het})\, df, \tag{3}$$

where $\Pr(f \mid \mathrm{het})$ is the conditional probability that the inbreeding coefficient is f, knowing that the marker locus is in a heterozygous state. The reciprocal probability, $\Pr(\mathrm{het} \mid f)$, is the well-known $H_0(1 - f)$, where $H_0$ is the probability for two non-IBD alleles at the marker locus to be nonidentical in state. Thus, using Bayes' theorem,

$$\Pr(f \mid \mathrm{het}) = \frac{\Pr(f)\,\Pr(\mathrm{het} \mid f)}{\Pr(\mathrm{het})} = \frac{\Pr(f)\,H_0(1 - f)}{H_0\,(1 - E(f))}, \tag{4}$$

where E(f) is the unconditional expectation of f over all possible individuals in the population.

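Noting $\sigma^2(f)$ the variance of f over all possible individuals in the population,

$$E(f)_{\mathrm{het}} = \frac{1}{1 - E(f)} \int_0^1 f(1 - f)\,\Pr(f)\, df = \frac{1}{1 - E(f)}\left[E(f) - \sigma^2(f) - E^2(f)\right], \tag{5}$$

which reduces to

$$E(f)_{\mathrm{het}} = E(f) - \frac{\sigma^2(f)}{1 - E(f)}. \tag{6}$$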

The expected inbreeding coefficient of homozygotes can be similarly derived as

$$E(f)_{\mathrm{hom}} = E(f) + \frac{E(H)}{1 - E(H)}\,\frac{\sigma^2(f)}{1 - E(f)}, \tag{7}$$

where $E(H) = H_0(1 - E(f))$.
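The intermediate steps behind Equation 7 can be sketched as follows, applying the same Bayes argument to homozygotes, for which Pr(hom | f) = 1 − H0(1 − f) and Pr(hom) = 1 − E(H). This is a reconstruction from the definitions given in the text, not necessarily the route the authors followed.

```latex
\begin{align*}
E(f)_{\mathrm{hom}}
  &= \frac{1}{1-E(H)} \int_0^1 f\,\bigl[1 - H_0(1-f)\bigr]\,\Pr(f)\,df \\
  &= \frac{E(f) - H_0\,\bigl[E(f) - E(f^2)\bigr]}{1-E(H)} \\
  &= \frac{E(f) - H_0\,E(f)\,[1-E(f)] + H_0\,\sigma^2(f)}{1-E(H)}
     \qquad \text{since } E(f^2) = \sigma^2(f) + E^2(f) \\
  &= E(f) + \frac{H_0\,\sigma^2(f)}{1-E(H)}
     \qquad \text{since } H_0\,[1-E(f)] = E(H) \\
  &= E(f) + \frac{E(H)}{1-E(H)}\,\frac{\sigma^2(f)}{1-E(f)} .
\end{align*}
```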

Using Equations 2, 6, and 7, the magnitude of associative overdominance is

$$\mathrm{AO} = \beta\left[\frac{\sigma^2(f)}{(1 - E(H))\,(1 - E(f))}\right]. \tag{8}$$
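As a numerical sanity check, the closed form of Equation 8 can be compared with AO computed directly from Equation 2 by conditioning on the marker genotype. The following is a minimal sketch; the three-class distribution of f and the parameter values are hypothetical and chosen only for illustration, and the function names are not from the original study.

```python
def ao_closed_form(beta, mean_f, var_f, h0):
    """Equation 8: AO = beta * var(f) / ((1 - E(H)) * (1 - E(f)))."""
    mean_h = h0 * (1.0 - mean_f)  # E(H) = H0 * (1 - E(f))
    return beta * var_f / ((1.0 - mean_h) * (1.0 - mean_f))


def ao_from_distribution(beta, f_values, probs, h0):
    """Equation 2, with E(f)_het and E(f)_hom obtained by Bayes' theorem."""
    # Joint weights Pr(f) * Pr(het | f) and Pr(f) * Pr(hom | f)
    w_het = [p * h0 * (1.0 - f) for f, p in zip(f_values, probs)]
    w_hom = [p * (1.0 - h0 * (1.0 - f)) for f, p in zip(f_values, probs)]
    mean_f_het = sum(f * w for f, w in zip(f_values, w_het)) / sum(w_het)
    mean_f_hom = sum(f * w for f, w in zip(f_values, w_hom)) / sum(w_hom)
    return beta * (mean_f_hom - mean_f_het)


# Hypothetical population with three inbreeding classes (illustrative only)
f_values = [0.0, 0.125, 0.25]
probs = [0.5, 0.3, 0.2]
mean_f = sum(f * p for f, p in zip(f_values, probs))
var_f = sum(p * (f - mean_f) ** 2 for f, p in zip(f_values, probs))

direct = ao_from_distribution(0.6, f_values, probs, 0.9)
closed = ao_closed_form(0.6, mean_f, var_f, 0.9)
print(direct, closed)  # the two values should agree up to rounding error
```

Both routes give the same value for any distribution of f, which illustrates that Equation 8 requires no assumption about population structure beyond the mean and variance of f.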

AO is the product of the inbreeding load and a term describing the magnitude of association between marker homozygosity and inbreeding coefficient. At this stage, no assumption has been used about population structure or equilibrium and about the genetic basis of inbreeding depression (e.g., overdominance or deleterious mutations).

We derived AO at one marker locus. However, most experimental studies describe the correlation between MLH and a fitness trait. The value of this correlation is derived in the appendix.

Sustained population bottleneck under the deleterious genomic mutation model: Let us assume an infinite random mating population experiencing a bottleneck of size N individuals at generation G0. We further assume that all individuals at G0 are unrelated [E(f) = 0 and σ2(f) = 0] and that a deleterious mutation is present in only one copy in G0. The latter assumption is justified by the very low equilibrium frequencies of deleterious alleles in the infinite population. For the neutral marker, H0 defined in the previous section is equivalent to the initial genetic diversity at generation G0. The population size remains constant over t generations, after which AO is measured. Mutation at the marker locus is neglected after the foundation event.

Foundation load: As a first approximation, let us consider that the variance in fitness is wholly due to the segregation of mutations initially present at G0 (the effect of new mutations will be considered in the next section). The operational definition of IBD for a pair of alleles is their being two copies of the same allele of generation G0. The fitness of an individual j will therefore depend on f0,j (the subscript 0 refers to the generation of reference) and on the number and effect of mutations present in G0.

Using the classical deleterious genomic mutation model, the multilocus fitness is given multiplicatively by

$$w = (1 - s)^y\,(1 - hs)^z, \tag{9}$$

where h is the dominance coefficient, s the selection coefficient against deleterious homozygotes, and y and z are the numbers of mutations in homozygous and heterozygous states, respectively (Charlesworth et al. 1990). To obtain a linear function as in Equation 1, the natural logarithm of fitness is taken as our fitness trait, W = ln(w).

Neglecting purging selection (i.e., the selective elimination of some deleterious alleles during the bottleneck), the expected W of an individual j is expressed as a function of the inbreeding coefficient,

$$W_j = W_O - \beta f_{0,j}, \tag{10}$$

where the inbreeding load is

$$\beta \approx U\left(\frac{1}{2h} - 1\right) \tag{11}$$

(B in Charlesworth and Charlesworth 1999) and $W_O \approx -U$ (Haldane 1937; Kimura et al. 1963).
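The approximations in Equations 10 and 11 can be checked against Equation 9 with a short sketch. It assumes the standard mutation-selection balance expectation that an outbred individual carries about U/(hs) heterozygous mutations and a fully inbred one about U/(2hs) homozygous mutations; these counts and the function names are assumptions made for illustration, and the parameter values are those of Figure 1.

```python
import math


def log_fitness(y, z, h, s):
    """W = ln(w), with w = (1 - s)^y * (1 - hs)^z (Equation 9)."""
    return y * math.log(1.0 - s) + z * math.log(1.0 - h * s)


def inbreeding_load(U, h):
    """beta ~ U * (1/(2h) - 1) (Equation 11)."""
    return U * (1.0 / (2.0 * h) - 1.0)


U, h, s = 1.0, 0.3, 0.05
beta = inbreeding_load(U, h)

# Assumed expected mutation counts at mutation-selection balance
w_outbred = log_fitness(0.0, U / (h * s), h, s)        # f = 0
w_inbred = log_fitness(U / (2.0 * h * s), 0.0, h, s)   # f = 1

print(beta)                  # inbreeding load from Equation 11
print(w_outbred)             # should be close to -U
print(w_outbred - w_inbred)  # should be close to beta
```

The small gap between `w_outbred - w_inbred` and `beta` comes from the first-order approximation ln(1 − x) ≈ −x, which is accurate only when s and hs are small.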

Adding subsequent mutational load: To take into account mutations occurring after foundation, we introduce the coefficient $f_{g,j}$, which denotes the proportion of loci in individual j for which the two alleles are copies of the same allele of generation g (1 ≤ g ≤ t).

The fitness function takes the form

$$W_j = W_O - \beta f_{0,j} - \beta_{\mathrm{mut}} \sum_{g=1}^{t} f_{g,j}, \tag{12}$$

where $\beta_{\mathrm{mut}} = Us(1/2 - h) = hs\beta$ is the inbreeding load due to mutations occurring after foundation.

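The magnitude of associative overdominance for the fitness trait W = ln(w) is

$$\mathrm{AO} = \frac{1}{1 - H_0(1 - E(f_0))} \times \left[\beta\,\frac{\sigma^2(f_0)}{1 - E(f_0)} + \beta_{\mathrm{mut}} \sum_{g=1}^{t} \frac{\sigma^2(f_g)}{1 - E(f_g)}\right]. \tag{13}$$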

This equation assumes no mutation at the marker locus. Indeed, even though the number of new mutations affecting fitness loci in the whole genome may be high, the frequency of mutations affecting any particular locus (in this case, the marker locus) in the first few generations following the foundation can be neglected as a first approximation.

Purging selection: In the computation of inbreeding load coefficients, β and $\beta_{\mathrm{mut}}$, purging selection is neglected. Incorporating this process into analytical expressions is a complex task and would require special attention (see Wang et al. 1999 for a simulation approach of this problem). However, when selection is not too strong and dominance is large enough, the expected number of copies left by a mutant gene is roughly (1 − hs) per generation, assuming that such alleles are mostly in heterozygous state. Then, the expected number of copies at time t is $(1 - hs)^t$, which represents the approximate rate of decrease of the inbreeding load. This can be incorporated into Equation 13:

$$\mathrm{AO} = \frac{1}{1 - H_0(1 - E(f_0))} \times \left[\beta\,(1 - hs)^t\,\frac{\sigma^2(f_0)}{1 - E(f_0)} + \beta_{\mathrm{mut}} \sum_{g=1}^{t} (1 - hs)^g\,\frac{\sigma^2(f_g)}{1 - E(f_g)}\right]. \tag{14}$$

Note that this is not a generally satisfactory approximation. However, it proved reasonably accurate over the range of parameters studied.
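Equations 13 and 14 translate directly into a small function once the means and variances of the inbreeding coefficients are available. The sketch below takes those moments as inputs, since their computation relies on the two-locus transition matrix of Weir et al. (1980), which is not reproduced here. The function name, argument layout, and the numerical moments in the example are hypothetical.

```python
def ao_bottleneck(beta, beta_mut, h0, hs, mean_f, var_f, purge=True):
    """AO at generation t from Equation 14 (purge=True) or Equation 13.

    mean_f[g] and var_f[g], for g = 0..t, are E(f_g) and sigma^2(f_g)
    evaluated at generation t; they must be supplied by the caller.
    """
    t = len(mean_f) - 1
    decay = (1.0 - hs) if purge else 1.0
    # Load already present in the founders (g = 0)
    foundation = beta * decay ** t * var_f[0] / (1.0 - mean_f[0])
    # Load from mutations that appeared after foundation (g = 1..t)
    later = sum(
        beta_mut * decay ** g * var_f[g] / (1.0 - mean_f[g])
        for g in range(1, t + 1)
    )
    return (foundation + later) / (1.0 - h0 * (1.0 - mean_f[0]))


# Placeholder moments for t = 2 (illustrative values, not model output)
mean_f = [0.05, 0.025, 0.0]
var_f = [0.004, 0.003, 0.0]
h, s, U = 0.3, 0.05, 1.0
beta = U * (1.0 / (2.0 * h) - 1.0)   # Equation 11
beta_mut = h * s * beta              # load from post-foundation mutations

ao_eq13 = ao_bottleneck(beta, beta_mut, 0.9, h * s, mean_f, var_f, purge=False)
ao_eq14 = ao_bottleneck(beta, beta_mut, 0.9, h * s, mean_f, var_f, purge=True)
print(ao_eq13, ao_eq14)
```

With `beta_mut = 0` and `purge=False` the function reduces to Equation 8, which gives a convenient consistency check.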


Mean and variance of the inbreeding coefficient during a sustained population bottleneck: The sustained bottleneck causes (i) an increase in the mean inbreeding coefficient [E(f)] and (ii) variation of the inbreeding coefficient among individuals due to random variation in pedigrees (Weir and Cockerham 1969; Weir et al. 1980).

The increase in E(f) is a well-known consequence of genetic drift (Malécot 1946). E(f) depends on the effective population size and time since foundation. It is only slightly affected by the mating system (Malécot 1946) and is unaffected by linkage. In a random mating population with constant size N, the value of $E(f_0)$ after t generations is

$$E(f_0) = 1 - \left(1 - \frac{1}{2N}\right)^{t} \approx 1 - e^{-t/(2N)}. \tag{15}$$
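A brief sketch of Equation 15 shows how close the exponential approximation stays to the exact recursion for the population size used in Figure 1. The function names are illustrative.

```python
import math


def mean_f_exact(N, t):
    """E(f_0) = 1 - (1 - 1/(2N))^t (Equation 15, exact form)."""
    return 1.0 - (1.0 - 1.0 / (2.0 * N)) ** t


def mean_f_approx(N, t):
    """E(f_0) ~ 1 - exp(-t / (2N)) (Equation 15, approximation)."""
    return 1.0 - math.exp(-t / (2.0 * N))


# N = 40 as in Figure 1, evaluated at a few generations
for t in (1, 10, 50):
    print(t, mean_f_exact(40, t), mean_f_approx(40, t))
```

For N = 40 the two forms differ by well under 0.01 over the first 50 generations, so either is adequate for the time scale considered here.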

The evolution of the variance in inbreeding coefficients, σ²(f), during a sustained population bottleneck has been studied by Weir et al. (1980). Although there is no simple analytic expression, these authors provided the transition matrix of two-locus descent measure vectors. This allows the computation of σ²(f) as a function of population size, time since foundation, mating system, and degree of linkage (Weir et al. 1980). σ²(f) and E(f) were numerically calculated in a Mathematica 3.0 program (Wolfram 1996). Equation 15 and the transition matrix method provide exact results under neutrality. In practice, however, selection may modify E(f) and σ²(f) compared to neutral expectations. The validity of the neutral approximation is tested using simulations.

Simulations: All programs were written in Turbo-Pascal. We simulated a single chromosome with a map length of L Morgans. No constraint was applied to the number of loci in the genome. The position of the marker locus was drawn from a uniform distribution at the beginning of each simulation. To obtain the desired values of H0 (the initial genetic diversity at the marker locus), we started with H = 1 (one different allele for each chromosome) and simulated neutral drift until H decreased to H0. A random number of mutations was then attributed to each chromosome, following a Poisson distribution with mean U/(2hs). The mutations were uniformly distributed along the chromosomes. Three mating systems were modeled: monogamy (N/2 males each mate with one of N/2 females), random mating (N monoecious individuals mate at random, including random selfing), and random mating with selfing excluded. Recombination rate, r, was related to the map distance between loci, d, using Haldane's mapping function or r = 0.5 for unlinked loci (Weir et al. 1980). Each generation, a Poisson-distributed (with mean U/2) random number of new mutations was uniformly distributed along each chromosome. Simulations were performed with or without selection. To account for selection, randomly drawn offspring survived until reproduction with a probability proportional to their fitness (given by Equation 9); this probability was one in the absence of selection. The output values of the model were calculated on offspring before selection and averaged over 1000 simulations.
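For readers unfamiliar with it, Haldane's mapping function converts a map distance d (in Morgans) into a recombination rate under the assumption of no crossover interference, r = (1 − e^(−2d))/2. The sketch below is in Python rather than the Turbo-Pascal of the original programs, and the function names are illustrative.

```python
import math


def haldane_r(d):
    """Recombination rate for a map distance d in Morgans (Haldane)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def haldane_d(r):
    """Inverse mapping: map distance in Morgans for a rate r < 0.5."""
    return -0.5 * math.log(1.0 - 2.0 * r)


# Two loci at opposite ends of a 1-Morgan chromosome recombine at a
# rate of about 0.43, already close to the unlinked value of 0.5.
print(haldane_r(1.0))
print(haldane_r(0.01))  # tightly linked loci: r is close to d
```

This illustrates why only loci within a fraction of a Morgan of the marker behave very differently from unlinked loci.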

Figure 1.

Associative overdominance as a function of time. Differential contribution of mutations already present in the founding population and of new mutations to AO [the difference between the average value of heterozygotes vs. homozygotes for the fitness trait W = ln(w) at the neutral marker locus] and simulation results (averaged over 1000 simulations). The population size is N = 40. The mating system is monogamy. The genomic mutation rate is U = 1. The dominance and selection coefficients are h = 0.3 and s = 0.05. The initial marker gene diversity is H0 = 0.9. Loci are unlinked.


General shape of AO as a function of time since foundation: Figure 1 presents AO calculated with Equation 14, detailing the effects of the first term (mutational load already present at the foundation of the bottleneck) and of the second term (effect of subsequent mutational load). AO is maximal just after the bottleneck, when σ2(f) and E(H) are maximal and E(f) minimal, and then decreases as σ2(f) and E(H) decrease and E(f) increases. Foundation load initially has a far more marked effect than mutations occurring after the bottleneck. There is a good agreement between theoretical values and simulated data with the deleterious parameters used (h = 0.3, s = 0.05). However, purge is underestimated when h < 0.3 and s > 0.15 (data not shown).

The diffusion approximation of Ohta (1971, 1973) gives AO ≈ 0.008 at mutation-selection-drift equilibrium with the parameters used in Figure 1. This is the same order of magnitude as our results at generation 50 (AO = 0.004). The discrepancy stems from slightly different assumptions in the two models. Ohta (1971, 1973) considered additive rather than multiplicative fitness across loci and a diallelic marker locus with fixed heterozygosity H = 0.5, whereas marker heterozygosity was allowed to evolve from an initial value of H0 = 0.9 in our model.

Figure 2.

Effect of initial marker gene diversity on AO. Curves are analytical results. Simulation results obtained for H0 = 0.9 and H0 = 0.5 are depicted by circles and triangles, respectively. N = 20, parameters are as in Figure 1.

Population size: As expected, smaller AO is observed in larger populations, in which σ2(f) is smaller. Indeed, at the second generation after foundation, when AO is maximal, AO is, respectively, 0.03, 0.02, and 0.01 for N = 20 (Figure 2), N = 40 (Figure 1), and N = 100.

Initial heterozygosity and number of markers: Figure 2 presents results for H0 = 0.5 (e.g., allozyme) and H0 = 0.9 (e.g., microsatellites). Initial heterozygosity has a large effect, especially just after the bottleneck. Markers with larger initial heterozygosity will be better indicators of the inbreeding coefficient. Figure 3 shows that HFC increases with the number of marker loci, but quickly saturates.

Mating systems: Figure 4 presents results for three mating systems. The mating system influences AO through the magnitude of σ²(f), consistent with the results of Weir et al. (1980). When selfing is allowed, σ²(f) is enhanced.

Deleterious genomic mutation parameters: There are no universal values for U, h, and s. For example, empirical orders of magnitude for U range from 0.01 to 1 (Garcia-Dorado et al. 1999; Keightley and Eyre-Walker 1999; Lynch et al. 1999). In our model, the effect of mutation parameters is twofold: (i) they determine the inbreeding load β, to which AO is proportional, all other things being equal, and (ii) they modify purging selection, which tends to decrease AO faster than the neutral expectation. Trivially, increasing the mutation rate U or decreasing the dominance coefficient h enhances the inbreeding load (Equation 11) and subsequently AO. The effect of (ii) is reasonably accounted for by our approximation when the heterozygous effects of mutations predominate (low s, large h). In this case, purge decreases AO by roughly a proportion hs per generation. When homozygous effects of mutations become important (e.g., h < 0.2, s > 0.1), AO decreases faster than predicted by Equation 14 because purging selection is underestimated (data not shown). However, the purge effect never changes the order of magnitude of AO compared to analytical predictions. The extreme situation is that of lethal mutations (say, s = 0.9–1 and h = 0.03; see Wang et al. 1998). Lethals greatly increase AO in the first few generations after the bottleneck (large β), though the subsequent decrease in AO is very fast (strong purge).

Figure 3.

(A) Maximum correlation coefficient between MLH (M = 1, 5, and 10 loci) and fitness with generation time. (B) Effect of the number of marker loci on the maximum correlation coefficient between MLH and fitness 10 generations after the foundation event. N = 20, parameters are as in Figure 1.

Finally, note that although AO scales with β, the latter does not influence the correlation coefficient between MLH and the fitness trait (see appendix), which, unlike AO, is not expressed in phenotypic units. Thus, purging selection does not affect the correlation.

Figure 4.

Effect of the mating system on AO. Curves are analytical results. Simulation results obtained for monogamy, random mating, and random mating with selfing excluded are depicted by circles, triangles, and squares, respectively. N = 20, parameters are as in Figure 1.

Linkage: Figure 5 presents results for unlinked loci and linked loci in a genome of either 1 or 10 M. Tight linkage promotes strong AO, though this effect is restricted to very small genomes, in which deleterious mutations are highly concentrated (33.3 mutations in 1 M). Another effect of linkage is to delay maximum AO, which is reached a few generations (rather than immediately) after the onset of the bottleneck.

Results take into account two sorts of linkage: linkage among fitness loci and linkage between the marker locus and fitness loci. However, the latter is far more important than the former. Indeed, if the marker locus is independent from the rest of the genome (consisting of linked fitness loci), the AO obtained by simulation is roughly the same as for a completely unlinked system of loci (data not shown).

Another effect of strong physical linkage (1 M) is to introduce a departure from the analytical expectation (Figure 5). The difference between analytical and simulation results disappears when selection is removed (Figure 5) and is therefore entirely due to selection. However, it does not rely on an inaccurate approximation of purging, as the discrepancy persists when the actual inbreeding loads estimated at each generation in simulations are plugged into Equation 14. Therefore, selection acts through a change in the distribution of inbreeding coefficients [E(f) and σ2(f)] compared to the neutral distribution. E(f) is little affected by selection with linkage with the set of parameters used in Figure 5, as the maximum difference between neutral E(f) and E(f) from simulations is only 3% at generation 50. The departure between analytical and simulation results with selection and linkage is therefore due to a decrease in σ2(f).

Figure 5.

Effect of linkage on AO. Curves are analytical results. Simulation results obtained for unlinked loci, a genome size of 10 M, and a genome size of 1 M are depicted by circles, triangles, and solid squares, respectively. Simulation results without selection, obtained for a genome of 1 M, are depicted by open squares. N = 20, parameters are as in Figure 1.


Our aim was to derive associative overdominance as a function of individual inbreeding coefficients (genomic autozygosity) and to apply it to the case of a sustained population bottleneck. In this section, we first focus on how associative overdominance can be understood under the inbreeding model, showing that the effects of all parameters may be related to inbreeding. Under this framework, we explain how bottlenecks enhance AO and reevaluate the role of physical linkage in small populations. Finally, the relationships between associative overdominance and selection are discussed.

Associative overdominance and inbreeding: The degree of associative overdominance depends on the joint effect of several mechanisms, all related to inbreeding.

Population structure and inbreeding variance: AO arises to the extent that population structure promotes correlations between marker homozygosity and genomic inbreeding (autozygosity). Our approach shows that this association depends mainly on the variance in inbreeding coefficients. Basically, if all individuals have the same inbreeding coefficient, as in a large random mating population, no association can arise. Small population size (instantaneous or at equilibrium) as well as recurrent inbreeding in large populations are two cases of population structure that enhance inbreeding variance. The first case involves random inbreeding (the random variation in relatedness among pairing mates) and the second case involves systematic inbreeding (the attribution of a fixed proportion of matings to related mates; Malécot 1969). Within a small population, the mating system (opportunity of random selfing, monogamy, etc.) affects random inbreeding variation, σ²(f) (Weir et al. 1980). It is therefore an important parameter to explain associative overdominance. Finally, because more variable markers are better indicators of inbreeding, they will exhibit stronger AO. Increasing the number of marker loci allows a better approximation of the genomic autozygosity, and the MLH-fitness correlation increases with this number, with a saturation effect.

Fitness variation and inbreeding depression: The degree of AO depends on variation in fitness within the population, which is caused here by inbreeding depression. Therefore, mutation parameters that generate a large inbreeding load, such as large U or small h, promote AO. However, inbreeding depression also depends on population structure. In small populations at equilibrium, the standing variation is low as new mutations either disappear or go to fixation quickly. The expected depression is therefore usually very small (Bataillon and Kirkpatrick 2000) and only very tight linkage can maintain polymorphism (Charlesworth et al. 1993; Pàlsson and Pamilo 1999) and inbreeding depression. On the other hand, in very large populations, recessive deleterious mutations are maintained by the mutation-selection balance and inbreeding depression is maximal (Bataillon and Kirkpatrick 2000).

Effects of a population bottleneck: High inbreeding depression, large inbreeding variance, and high marker diversity all promote AO. However, they cannot act simultaneously in populations at equilibrium, unless there is some degree of systematic inbreeding (Charlesworth 1991; David 1999). Indeed, as explained above, large random-mating populations at equilibrium display high inbreeding depression and marker diversity but lack variation in inbreeding, whereas the reverse is true for small populations at equilibrium. In contrast, the initial stages of a sustained bottleneck are transient situations that maximize AO. During the first generations after the bottleneck, the population retains high marker diversity and high inbreeding depression corresponding to the mutation-selection equilibrium in the founding large population, but displays high inbreeding variance because of the small instantaneous population size (Weir et al. 1980). This effect progressively dies out until the equilibrium AO for a small population is recovered.

Bottlenecks are likely mechanisms by which AO can arise in natural populations. Indeed, equilibrium models cannot apply to all the taxa in which AO has been detected. Pine trees may well consist of large, partially selfing populations at equilibrium (Ledig 1986; Strauss 1986). However, neither marine bivalves nor salmonid fishes seem likely to fit equilibrium models. Moreover, their population sizes are orders of magnitude too high to predict detectable AO as a result of finite population size at equilibrium. On the other hand, it has been suggested that transient and local inbreeding can arise as a result of the fragmentation of these populations into small, transient reproductive groups (Blanc and Bonhomme 1986; Avise 1994; Hedgecock 1994a; Li and Hedgecock 1998). This is consistent with the slight but significant genetic differentiation observed at small spatial scales in marine species (Johnson and Black 1984; Watts et al. 1990; Hedgecock 1994b; David et al. 1997). These patterns of differentiation are generally ephemeral and do not reflect permanent structure as in island models or isolation-by-distance systems (Johnson and Black 1984; David et al. 1997). The effect of transient population fragmentation is similar to a few generations of bottleneck at the scale of a local population. AO could therefore appear within local samples. Moreover, the predicted transient nature of AO in such systems is also consistent with the observed temporal and spatial irregularity of its occurrence (Gaffney 1990; David and Jarne 1997; Pogson and Fevolden 1998). Artificial populations also provide examples of the bottleneck effect. Indeed, Bierne et al. (2000) observed a significant correlation between heterozygosity at three microsatellite markers and growth in the shrimp Penaeus stylirostris after 10 generations of controlled bottleneck (N = 20). Finally, marine bivalves display high genetic loads that have been interpreted as a consequence of their high fecundity (Bierne et al. 1998). With such a load, significant AO arises with little variation in inbreeding.

Effect of linkage under the inbreeding model: Linkage affects AO because it increases σ²(f). The variance in f can be partitioned into two components: an “among-pedigree” component arising from the random variation in pedigrees and a “within-pedigree” component among possible individuals with identical pedigrees (e.g., full sibs). Without linkage, individual inbreeding coefficients mainly depend on the pedigree. Indeed, the pedigree gives the probability of autozygosity at a locus, which is independent from that of other loci in the same individual. The proportion of autozygous loci, f, varies little within pedigrees because it is averaged over a large number of loci. Moreover, the within-pedigree variation in f is not correlated to homozygosity at the marker locus. Therefore, variance in f and correlation with the marker locus come from the fact that different individuals have different pedigrees. With linkage, alleles at different loci do not segregate independently along the pedigree; rather, the genome is fragmented into a finite number of chunks of chromosome. This creates correlations among autozygosities at different loci within a pedigree, which greatly inflate the within-pedigree variance in f and reinforce the correlation with homozygosity at the marker locus.

The quantitative importance of physical linkage in promoting AO has been largely debated. Effects directly attributable to physical linkage in the vicinity of the marker locus have been referred to as “local” effects and contrasted to “general” effects affecting the whole genome (David et al. 1995). Previous models either consider infinite populations with partial selfing and unlinked alleles or emphasize the role of linkage disequilibrium in small populations. Although physical linkage is predicted to be relatively unimportant in the first case (Weir and Cockerham 1973; David 1999) and not strictly necessary in the second (Ohta 1973), empiricists often refer to tightly linked fitness loci, i.e., local effects, to explain the observed correlations (Pogson and Fevolden 1998). Our results confirm that linkage is not needed for AO and allow us to quantify its relative contribution when present. This contribution appears to be quite low: Only in very small genomes (e.g., 1 chromosome of 1 M) is AO substantially different from that predicted using completely independent loci. However, species usually have several chromosomes. Assuming a chromosome length of 1 M, the neutral marker is only about four times more influenced by mutations localized on its chromosome (Figure 5) than by independent mutations. This local effect may be hardly detectable in empirical studies, as the marked chromosome on average harbors a fraction 1/n of the total mutations (n being the haploid number of chromosomes). For example, Bierne et al. (2000) detected significant AO in the shrimp P. stylirostris. However, n = 46 (Nakaura et al. 1988) and most of the genome is unlinked to any particular marker.
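The dilution argument can be made concrete with a back-of-the-envelope calculation. The sketch below assumes, as a simplification not stated in this form in the text, that mutations on the marker's chromosome contribute about four times as much to AO as unlinked mutations and that each of the n chromosomes carries an equal share 1/n of the mutations. The function name and the weighting scheme are illustrative.

```python
def relative_ao(n_chromosomes, local_weight=4.0):
    """AO relative to a fully unlinked genome, under the stated assumptions.

    A fraction 1/n of mutations sits on the marker's chromosome and is
    weighted by local_weight; the remaining mutations are weighted by 1.
    """
    share_local = 1.0 / n_chromosomes
    return share_local * local_weight + (1.0 - share_local) * 1.0


print(relative_ao(1))   # single 1-M chromosome: the full fourfold effect
print(relative_ao(46))  # n = 46, as in P. stylirostris: about a 7% increase
```

Under these assumptions the local effect adds only a few percent to AO in a species with many chromosomes, which is consistent with the statement that it would be hard to detect empirically.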

Associative overdominance and natural selection: Two levels of selection may be considered: (i) indirect selection on the marker locus and (ii) direct selection on deleterious mutations.

How does AO influence the fixation rate at the marker locus? Several studies of populations artificially maintained at small numbers revealed that the decrease in heterozygosity at presumably neutral markers is slower than the neutral expectation (Connor and Bellucci 1979; Frankham et al. 1993; Rumball et al. 1994; Latter et al. 1995). AO has been invoked to explain such an observation (Latter 1998) as well as the maintenance of neutral polymorphism in small populations (Ohta 1971; Pamilo and Pàlsson 1998; Pàlsson and Pamilo 1999). However, we need to distinguish between two classes of phenomena, both referred to as AO. The first is a statistical correlation between heterozygosity and fitness (HFC) and is the one described by our model. As pointed out by Charlesworth (1991), it is important to keep in mind that it is only an apparent selection that does not necessarily influence the evolution of the neutral marker. The second is genuine indirect balancing selection on neutral marker loci. In the conditions used in our model, only physical linkage can indirectly, and weakly, select for neutral marker heterozygosity, whereas it is not necessary to generate HFC.

Selection on deleterious mutations: For fitness traits, in the case of unlinked loci, the main effect of selection is to decrease the inbreeding load (purging) compared to the neutral expectation. However, this effect is usually small in the first generations following the bottleneck. When selection and linkage act simultaneously, not only is the inbreeding load decreased, but the evolution of inbreeding variance is also greatly perturbed. This can substantially decrease AO compared to neutral expectations (Figure 5), although the order of magnitude remains the same. On the whole, selection is predicted to have relatively little impact on empirically detected AO for two reasons. First, AO is more likely to be detected in the initial phases of the bottleneck, when it is maximal and selection has had no time to modify the standing mutation load and the distribution of inbreeding. Second, although physical linkage enhances the impact of selection, this effect is diluted proportionally to the haploid chromosome number (as described above).


Mating system and linkage should not be considered as competing hypotheses to explain apparent heterozygote advantage. Indeed, both act by increasing the variance in individual inbreeding levels. Associative overdominance is expected whenever population structure and/or dynamics enhance this variance. Bottlenecks provide such situations. Moreover, associative overdominance is high during bottlenecks because the genetic variation in fitness inherited from the founding large population has not yet been eroded by purging selection or random drift. However, the association between marker loci and fitness genes is ephemeral and has little effect on marker variation. More complex population structures, such as metapopulations, still have to be investigated.


We are very much indebted to F. Bonhomme, K. Dawson, and P. Jarne for constructive discussions. P. Keightley and two anonymous referees provided insightful remarks on the manuscript. We also thank K. Belkhir for his advice on Pascal programming and R. Vitalis for drawing our attention to relevant references.


We consider $M$ unlinked marker loci with the same genetic diversity $H_0$ and calculate the correlation coefficient between MLH and the fitness trait $W$, as a function of time after the foundation event. Only the foundation load is considered; the subsequent mutational load is neglected.

MLH can be calculated as $\mathrm{MLH} = \sum_{k=1}^{M} H_k$, (A1) where $H_k$ is the indicator variable of heterozygosity at marker $k$ ($H_k$ takes the value 0 when marker $k$ is homozygous and 1 when heterozygous).

One marker locus: The correlation coefficient between $H_k$ and $W$ is $\rho(H_k, W) = \frac{\mathrm{cov}(H_k, W)}{\sigma(H_k)\,\sigma(W)}$. (A2)

$H_k$ is a binomially distributed variable with mean $E(H_k) = H_0\,(1 - E(f))$ (A3) and variance $\sigma^2(H_k) = H_0\,(1 - E(f))\,\bigl(1 - H_0\,(1 - E(f))\bigr)$. (A4)

From Equation 1, $\sigma^2(W) = \beta^2\,\sigma^2(f)$. (A5)

As mentioned previously, Equation 1 assumes that $W$ depends only on $f$ values. However, equally inbred individuals may actually have different $W$ because of random distribution of mutations and environmental variation. This introduces a within-pedigree variance in $W$, denoted $\sigma^2_{\mathrm{within}}(W)$, which has to be added to Equation A5. Within-pedigree variation is uncorrelated with heterozygosity at the marker locus and therefore does not contribute to the covariance term. Hence, $\mathrm{cov}(H_k, W)$ is simply $\mathrm{cov}(H_k, W) = E(H_k)\,(1 - E(H_k))\,\beta\,\bigl(E(f)_{\mathrm{hom}} - E(f)_{\mathrm{het}}\bigr) = \beta\,E(H_k)\,\frac{\sigma^2(f)}{1 - E(f)}$. (A6)
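The closed form in Equation A6 can be checked numerically. The sketch below assumes that Equation 1 is linear, W = α − βf with β > 0 (a sign convention inferred from Equation A6), and that the probability of heterozygosity given f is H0(1 − f), as implied by Equation A3. The discrete distribution of f and all parameter values are arbitrary illustrations, not values from the study.

```python
def covariance_check(f_dist, h0, beta, alpha=1.0):
    """Compare cov(H, W) computed directly with the closed form of Equation A6.

    f_dist : list of (f, probability) pairs, probabilities summing to 1.
    Assumptions (illustrative): W = alpha - beta * f and P(H = 1 | f) = h0 * (1 - f).
    """
    mean_f = sum(p * f for f, p in f_dist)
    var_f = sum(p * (f - mean_f) ** 2 for f, p in f_dist)

    # Direct computation: cov(H, W) = E(H W) - E(H) E(W),
    # averaging the heterozygosity indicator over its conditional probability.
    e_h = sum(p * h0 * (1 - f) for f, p in f_dist)
    e_w = sum(p * (alpha - beta * f) for f, p in f_dist)
    e_hw = sum(p * h0 * (1 - f) * (alpha - beta * f) for f, p in f_dist)
    cov_direct = e_hw - e_h * e_w

    # Closed form: beta * E(H) * var(f) / (1 - E(f)).
    cov_closed = beta * e_h * var_f / (1 - mean_f)
    return cov_direct, cov_closed


# Hypothetical three-class distribution of individual inbreeding levels.
F_DIST = [(0.0, 0.5), (0.25, 0.3), (0.5, 0.2)]

if __name__ == "__main__":
    direct, closed = covariance_check(F_DIST, h0=0.6, beta=0.5)
    print(direct, closed)
```

Under these assumptions both expressions reduce to βH0σ²(f), so the two returned values should agree up to floating-point error.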

The maximal correlation coefficient is obtained in the absence of within-pedigree variation ($\sigma^2_{\mathrm{within}}(W) = 0$) and can be computed by plugging Equations A4, A5, and A6 into Equation A2: $\rho_{\max}(H_k, W) = \sqrt{\frac{E(H_k)}{1 - E(H_k)}}\;\frac{\sigma(f)}{1 - E(f)}$. (A7)

When $\sigma^2_{\mathrm{within}}(W) > 0$, the correlation coefficient decreases to $\rho(H_k, W) = \frac{\rho_{\max}(H_k, W)}{\sqrt{1 + \frac{\sigma^2_{\mathrm{within}}(W)}{\beta^2\,\sigma^2(f)}}}$. (A8)
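As an illustration, the single-marker results (Equations A3, A7, and A8) can be evaluated with a few lines of code. The function names and the numerical values below are hypothetical; they only show how the correlation is attenuated once within-pedigree variance in W is added.

```python
import math


def marker_heterozygosity(h0, mean_f):
    """Expected heterozygosity at the marker, E(H_k) = H0 * (1 - E(f)) (Equation A3)."""
    return h0 * (1.0 - mean_f)


def rho_max_single(h0, mean_f, var_f):
    """Maximal correlation between H_k and W (Equation A7)."""
    e_h = marker_heterozygosity(h0, mean_f)
    return math.sqrt(e_h / (1.0 - e_h)) * math.sqrt(var_f) / (1.0 - mean_f)


def rho_single(h0, mean_f, var_f, beta, var_within_w):
    """Correlation attenuated by within-pedigree variance in W (Equation A8)."""
    attenuation = math.sqrt(1.0 + var_within_w / (beta ** 2 * var_f))
    return rho_max_single(h0, mean_f, var_f) / attenuation


if __name__ == "__main__":
    # Arbitrary illustrative values: H0 = 0.5, E(f) = 0.1, var(f) = 0.01.
    print(rho_max_single(0.5, 0.1, 0.01))
    # Within-pedigree variance equal to beta^2 * var(f) divides rho_max by sqrt(2).
    print(rho_single(0.5, 0.1, 0.01, beta=2.0, var_within_w=0.04))
```

The attenuation depends only on the ratio of within-pedigree variance to β²σ²(f), so a trait with a large environmental component can show a weak correlation even when the maximal correlation is appreciable.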

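Several marker loci: The covariance between MLH and the fitness trait can be calculated as $\mathrm{cov}(\mathrm{MLH}, W) = \sum_{k=1}^{M} \mathrm{cov}(H_k, W) = M\,\mathrm{cov}(H_1, W)$ (A9) and the variance in MLH is $\sigma^2(\mathrm{MLH}) = M\,\sigma^2(H) + 2 \sum_{1 \le i < j \le M} \mathrm{cov}(H_i, H_j)$. (A10)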

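Since two unlinked marker loci do not covary within a given level of inbreeding, we obtain $\mathrm{cov}_{i \neq j}(H_i, H_j) = H_0^2\,\sigma^2(f)$ (A11) and, in the absence of within-pedigree variation in $W$, $\rho_{\max}(\mathrm{MLH}, W) = \sigma(f)\,\sqrt{\frac{H_0\,M}{(1 - E(f))\,\bigl(1 - H_0\,(1 - E(f))\bigr) + (M - 1)\,H_0\,\sigma^2(f)}}$. (A12)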

Within-pedigree variation may be accounted for as in Equation A8.
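The multilocus result can be sketched in the same way. The code below assembles the correlation step by step from Equations A4, A6, A9, A10, and A11 and compares it with the closed form of Equation A12. Function names and parameter values are hypothetical, and β is assumed positive so that it cancels out of the ratio.

```python
import math


def rho_max_mlh_stepwise(n_markers, h0, mean_f, var_f):
    """Maximal MLH-W correlation built from its components (beta cancels out)."""
    e_h = h0 * (1.0 - mean_f)                       # Equation A3
    var_h = e_h * (1.0 - e_h)                       # Equation A4
    cov_hh = h0 ** 2 * var_f                        # Equation A11
    # Equation A10: M variances plus M(M - 1)/2 pairs, each counted twice.
    var_mlh = n_markers * var_h + n_markers * (n_markers - 1) * cov_hh
    # Equations A9 and A6 divided by beta: M * E(H) * var(f) / (1 - E(f)).
    cov_over_beta = n_markers * e_h * var_f / (1.0 - mean_f)
    # sigma(W) / beta = sigma(f) (Equation A5).
    return cov_over_beta / (math.sqrt(var_mlh) * math.sqrt(var_f))


def rho_max_mlh_closed(n_markers, h0, mean_f, var_f):
    """Closed form of the maximal MLH-W correlation (Equation A12)."""
    denom = (1.0 - mean_f) * (1.0 - h0 * (1.0 - mean_f)) + (n_markers - 1) * h0 * var_f
    return math.sqrt(var_f) * math.sqrt(h0 * n_markers / denom)


if __name__ == "__main__":
    # Arbitrary illustrative values: H0 = 0.5, E(f) = 0.1, var(f) = 0.01.
    for m in (1, 5, 10, 50):
        print(m, rho_max_mlh_stepwise(m, 0.5, 0.1, 0.01), rho_max_mlh_closed(m, 0.5, 0.1, 0.01))
```

With a single marker the expression reduces to Equation A7, and the maximal correlation increases with the number of markers scored.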


  • Communicating editor: P. D. Keightley

  • Received November 15, 1999.
  • Accepted April 17, 2000.

