Inferring Recent Outcrossing Rates Using Multilocus Individual Heterozygosity: Application to Evolving Wheat Populations
 ^{*} INRA, Station de Génétique Végétale, Ferme du Moulon, 91190 Gif sur Yvette, France
 ^{†} INRA, Domaine de Melgueil, 34130 Mauguio, France
 Corresponding author: Jérôme Enjalbert, INRA, Unité de Pathologie Végétale, BP01, 78850 Thiverval-Grignon, France. Email: enjalber{at}grignon.inra.fr
Abstract
Using multilocus individual heterozygosity, a method is developed to estimate the outcrossing rates of a population over a few previous generations. Considering that individuals originate either from outcrossing or from n successive selfing generations from an outbred ancestor, a maximum-likelihood (ML) estimator is described that gives estimates of past outcrossing rates in terms of proportions of individuals with different n values. Heterozygosities at several unlinked codominant loci are used to assign n values to each individual. This method also allows a test of whether populations are in inbreeding equilibrium. The estimator’s reliability was checked using simulations for different mating histories. We show that this ML estimator can provide estimates of the final generation outcrossing rate (t_{0}) and of a mean of the preceding rates (t_{p}) and can detect major temporal variation in the mating system. The method is most efficient for low to intermediate outcrossing levels. Applied to nine populations of wheat, this method gave estimates of t_{0} and t_{p}. These estimates confirmed the absence of outcrossing (t_{0} = 0) in the two populations subjected to manual selfing. For free-mating wheat populations, it detected lower final generation outcrossing rates (t_{0} = 0–0.06) than those expected from global heterozygosity (t = 0.02–0.09). This estimator appears to be a new and efficient way to describe the multilocus heterozygosity of a population, complementary to Fis and progeny analysis approaches.
THE effect of mating system on plant genetic diversity is a major theme in evolutionary genetics, as it plays a major role in structuring the genetic variation within and between populations. For example, self-fertilization correlates with lower within-population diversity and with higher between-population differentiation, and global polymorphism is generally lower in selfing species (Godt and Hamrick 1991).
Two strategies have been used to estimate the outcrossing rates of a mixed mating population from genetic markers: measuring the frequency of heterozygous genotypes and analyzing progeny arrays (Ritland 1983). Assaying the frequency of heterozygotes in a population gives an estimate of the outcrossing rate assuming that (i) the selfing rate has been constant for a sufficient number of generations, (ii) the population is in inbreeding equilibrium, and (iii) selfing is the major cause of departure from Hardy-Weinberg frequencies (see Brown and Allard 1970). In such cases, a simple relation exists between Wright’s within-population inbreeding coefficient f and the outcrossing rate t_{f}, f = (1 - t_{f})/(1 + t_{f}) (Wright 1969). This method is widely used when surveying marker diversity in natural populations, for example, in Centrosema (Penteado et al. 1996) or in Bulinus truncatus (Viard et al. 1997). However, estimates of t_{f} must be treated with caution when temporal variation exists in outcrossing rates.
The second method is based on progeny arrays. Morphological markers were used long ago to identify outcrossing vs. selfing events in progenies of known maternal genotypes (e.g., in wheat; Hayes 1918). As the detection of a single nonmaternal allele in a genotype proves its outbreeding origin, accurate estimates of the outcrossing rate in natural populations can be obtained using multilocus information on progenies (especially in conjunction with estimates of maternal genotypes). Powerful estimators of outcrossing rate based on progeny analysis have been proposed (Ritland and Jain 1981; Shaw et al. 1981) and widely used (Schoen and Brown 1991). Progeny arrays can also measure the variance in outcrossing rates between maternal individuals, as has been done in studies of Eucalyptus regnans (Moran et al. 1989) and Acacia nilotica (Mandal and Ennos 1995). They also yield estimates of both ovule and pollen allelic frequencies (Godt and Hamrick 1991). Additionally, differences between monolocus and multilocus estimates make it possible to infer the amount of inbreeding due to mating between relatives (Shaw and Allard 1982; Ritland 1984). Note that progeny arrays can also yield a Wright estimate t_{f}, using the inbreeding coefficient of mothers.
Environmental conditions can cause considerable temporal variation in mating behavior. Low temperatures or light intensity can modify outcrossing in some selfing species, as documented in wheat (Demotes-Mainard et al. 1995) and rice (Li et al. 1996). Low population density can reduce outcrossing, as demonstrated in Bombacaceous trees (Murawski et al. 1990) and in Cuphea laminuligera (Krueger and Knapp 1991). Studies over successive years are thus required to measure temporal variation in outcrossing rates (Barrett et al. 1993; Sproule and Dancik 1996).
Here we develop a method to estimate the outcrossing rates for a few previous generations using a single generation survey. It is based on the analysis of multilocus heterozygosity within individuals. In mixed mating species, individuals can either originate from outcrossing between individuals or be derived from a varying number of selfing generations from outbred individuals. Hence individuals display varying levels of heterozygosity in their genomes, as initial heterozygosity resulting from the founder outcrossing event is halved by each successive selfing generation (i.e., for two independent loci, recently produced genotypes are more likely to be double heterozygotes, whereas more and more double homozygotes accumulate with each selfing generation). The proportion of individuals exhibiting a high level of heterozygosity should thus give information about the outcrossing rate in the most recent generation, while the proportions of individuals with varying levels of homozygosity could be used to estimate outcrossing rates in the previous generations. Consequently, a population can be partitioned into classes of individuals sharing the same number of selfing generations since the last outcrossing event in their genealogy. Each class presents distinct expected levels of multilocus individual heterozygosity (MIH), and class frequencies are a result of the mating history of the population. Using both probability formulas, we developed a maximum-likelihood estimator of outcrossing rates on the basis of multilocus patterns observed in a random sample of individuals drawn from an infinite population. This approach allows a test of whether a population had a constant selfing rate in the past, i.e., if it is in inbreeding equilibrium.
Using simulated populations, the estimator properties are evaluated for various mating histories as functions of the sample size, the number of loci used to estimate MIH, and their Nei diversity. The MIH method is tested on nine populations of wheat (Triticum aestivum) derived from a pilot program of dynamic management of genetic resources. These populations have known contrasting mating histories. We show that this approach can be used when progeny analysis is not possible for practical reasons, or when temporal variance in outcrossing rates is suspected.
ANALYTICAL MODEL
Inbreeding classes distribution: Our model assumes that populations evolve with temporal variation in outcrossing rates, with no genetic drift, no selection (particularly no heterotic selection), and with random mating of outcrossing gametes. In populations of hermaphroditic species, individuals may originate from a cross between two different parents or from the selfing of a single individual, which itself could have resulted from selfing or outcrossing. Each individual can then be indexed on n (0 ≤ n < ∞), the number of selfing generations since the last outcrossing event in its genealogy. Such an individual is said to belong to class S_{n}. Then any population observed at a given generation is composed of an infinity of classes of individuals {S_{0},..., S_{∞}} in proportions {Q_{0},..., Q_{∞}}, with of course
$$\sum_{n=0}^{\infty} Q_{n} = 1.$$
Individuals from all classes are assumed to outcross at the same rate. If t_{0} is the outcrossing rate that occurred in the production of the present generation, then individuals that were in class S_{n} one generation before produced progeny in class S_{0} at a rate of t_{0} and in S_{n+1} at a rate of (1 - t_{0}). Writing Q_{n,-g} for the proportion of class S_{n} in the population g generations before the one observed (so that Q_{n,0} = Q_{n}), it is therefore clear that
$$Q_{0,0} = t_{0}, \qquad Q_{n+1,0} = (1 - t_{0})\,Q_{n,-1}. \tag{1}$$
If we use t_{-n} to denote the outcrossing rate in the population n generations before the one observed, then, given that individuals in class S_{n} in the present population derive from outbred ancestors created n generations previously, we have
$$Q_{0,-n} = t_{-n}.$$
By recurrence, the proportion of individuals descending from n generations of selfing in the generation observed is thus
$$Q_{n,0} = t_{-n} \prod_{i=0}^{n-1} (1 - t_{-i}). \tag{2}$$
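The recurrence can be illustrated numerically: the proportion of class S_{n} is the outcrossing rate n generations back multiplied by the selfing rates of all more recent generations. The Python sketch below is only an illustration and is not the authors' C program; the function name `class_proportions`, its arguments, and the truncation at `n_max` are assumptions made for this example.

```python
def class_proportions(recent_rates, t_p, n_max):
    """Return [Q_0, ..., Q_n_max] for the observed generation.

    recent_rates: [t_0, t_-1, ...], most recent generation first.
    t_p: constant outcrossing rate assumed for all earlier generations.
    """
    proportions = []
    still_selfed = 1.0  # probability of no outcrossing in the n most recent generations
    for n in range(n_max + 1):
        t_n = recent_rates[n] if n < len(recent_rates) else t_p
        proportions.append(t_n * still_selfed)
        still_selfed *= 1.0 - t_n
    return proportions


if __name__ == "__main__":
    # Final generation at t_0 = 0.1 after a long history at t_p = 0.3.
    for n, q in enumerate(class_proportions([0.1], 0.3, 5)):
        print(f"Q_{n} = {q:.4f}")
```

With a constant rate t the function reduces to the geometric form t(1 - t)^n, which is a convenient sanity check.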
Expected multilocus individual heterozygosity: In diploid species, most individuals that result from an outcrossing event (S_{0}) can be detected since they are heterozygous at many loci (Bennett and Binet 1956). Individuals in class S_{n+1} have half the mean heterozygosity of individuals in the previous class S_{n}. We note that P(ht/S_{n}) and P(hm/S_{n}) are the probabilities of individuals from class S_{n} being heterozygous and homozygous, respectively, at a given locus. In the generation observed, P(ht/S_{0}), i.e., the probability of being heterozygous after an outcrossing event, is
$$P(ht/S_{0}) = 1 - \sum_{i} p_{i}^{2} = D, \qquad P(ht/S_{n}) = \frac{D}{2^{n}}, \qquad P(hm/S_{n}) = 1 - \frac{D}{2^{n}}, \tag{3}$$
where p_{i} is the frequency of allele i at the locus and D is the gene diversity of the locus (Nei 1978).
Now consider G_{x}, the multilocus heterozygosity pattern of an individual x at all the loci genotyped. Using the variable a_{l}, where a_{l} = 1 if x is heterozygous at locus l, and a_{l} = 0 otherwise, then for L unlinked loci in linkage equilibrium,
$$P(G_{x}/S_{n}) = \prod_{l=1}^{L} \left(\frac{D_{l}}{2^{n}}\right)^{a_{l}} \left(1 - \frac{D_{l}}{2^{n}}\right)^{1 - a_{l}}, \tag{4}$$
where D_{l} is the gene diversity at locus l.
This product only holds at linkage equilibrium; otherwise disequilibrium measures have to be introduced (not developed here). Note that the correlation between heterozygous states of independent loci in a mixed mating population observed by Bennett and Binet (1956) results from the combination of the very first selfing classes, mainly composed of multiple heterozygous individuals, with highly selfed classes composed of multiple homozygotes. This correlation is absent within a given selfing class S_{n} and has no effect on S_{0} formation as the disequilibrium is zygotic rather than gametic. Any heterozygosity correlation appearing in S_{0} by departure from our hypotheses (nonrandom mating, small population size, etc.) will be rapidly lost while heterozygosity diminishes during the successive selfing generations.
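A minimal sketch of the per-class pattern probability follows. It assumes that the probability of heterozygosity at locus l in class S_{n} is D_{l}/2^n, with D_{l} the gene diversity of the locus, which is what halving of heterozygosity at each selfing generation and random mating of outcrossing gametes imply. The function name `pattern_probability` and its arguments are hypothetical.

```python
def pattern_probability(pattern, diversities, n):
    """P(G_x | S_n) for unlinked loci in linkage equilibrium.

    pattern[l] is 1 if the individual is heterozygous at locus l, else 0.
    diversities[l] is the gene diversity D_l of locus l.
    """
    prob = 1.0
    for a, d in zip(pattern, diversities):
        p_het = d / 2 ** n  # heterozygosity is halved by each selfing generation
        prob *= p_het if a == 1 else 1.0 - p_het
    return prob


if __name__ == "__main__":
    # A double heterozygote at two loci with D = 0.8, for classes S_0 to S_3.
    for n in range(4):
        print(n, pattern_probability([1, 1], [0.8, 0.8], n))
```

The printed values drop quickly with n, which is why multiply heterozygous individuals are informative about the most recent outcrossing rate.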
Maximum-likelihood estimation of recent outcrossing rates: From the previous expressions, the likelihood of G_{x}, the genotype of individual x, is
$$L(x) = \sum_{n=0}^{\infty} P(x \in S_{n})\, P(G_{x}/S_{n}). \tag{5}$$
As P(x ∊ S_{n}) = Q_{n,0}, using Equations 1, 2, and 4, the first term of L(x) depends both on gene diversities and outcrossing rates for successive generations and, hence,
$$L(x) = \sum_{n=0}^{\infty} \left[ t_{-n} \prod_{i=0}^{n-1} (1 - t_{-i}) \right] \prod_{l=1}^{L} \left(\frac{D_{l}}{2^{n}}\right)^{a_{l}} \left(1 - \frac{D_{l}}{2^{n}}\right)^{1 - a_{l}}. \tag{6}$$
The mating behavior is estimated for z previous generations, and the outcrossing rates of all generations before generation z are assumed to be constant and equal to t_{p}. As developed in appendix c, when outcrossing rates vary before generation z, t_{p} is a function of (t_{-z}, t_{-z-1},..., t_{-∞}), mainly depending on the very first terms (recent outcrossing rates).
Thus z + 1 outcrossing rates (t_{0},..., t_{-z+1}, t_{p}) must be jointly estimated by the maximum-likelihood technique. A complete expression for L(x) is presented in appendix a.
The likelihood of a sample of X individuals, L(1,..., x,..., X), of genotypes (G_{1},..., G_{x},..., G_{X}) is the product of the individual likelihoods, on the assumption of independence between sampled individuals,
$$L(1, \ldots, x, \ldots, X) = \prod_{x=1}^{X} L(x). \tag{7}$$
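The sample likelihood can be sketched for the simplest case of two rates (t_{0}, t_{p}). The code below is an illustration only and makes several assumptions that are not taken from the article: the sum over selfing classes is truncated at `n_max`, the remaining classes are treated as fully homozygous, the per-locus heterozygosity in class S_{n} is D_{l}/2^n, and the maximum is located by a coarse grid search rather than by whatever optimizer the original C program used. All function names are hypothetical.

```python
import math


def individual_likelihood(pattern, diversities, t0, tp, n_max=30):
    """L(x): sum over n of Q_n * P(G_x | S_n), with t_-n = tp for every n >= 1."""
    total = 0.0
    still_selfed = 1.0  # probability of no outcrossing in the n most recent generations
    for n in range(n_max + 1):
        t_n = t0 if n == 0 else tp
        p_pattern = 1.0
        for a, d in zip(pattern, diversities):
            p_het = d / 2 ** n
            p_pattern *= p_het if a else 1.0 - p_het
        total += t_n * still_selfed * p_pattern
        still_selfed *= 1.0 - t_n
    if not any(pattern):
        total += still_selfed  # classes beyond n_max, treated as fully homozygous
    return total


def log10_likelihood(sample, diversities, t0, tp):
    """Log10 of the product of individual likelihoods (independent individuals)."""
    total = 0.0
    for pattern in sample:
        lik = individual_likelihood(pattern, diversities, t0, tp)
        if lik <= 0.0:
            return float("-inf")
        total += math.log10(lik)
    return total


def grid_ml(sample, diversities, steps=20):
    """Return the (t0, tp) pair on a regular grid that maximizes the likelihood."""
    grid = [i / steps for i in range(steps + 1)]
    return max(((t0, tp) for t0 in grid for tp in grid),
               key=lambda rates: log10_likelihood(sample, diversities, *rates))


if __name__ == "__main__":
    sample = [[1, 1, 0, 1], [0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
    print(grid_ml(sample, [0.8, 0.8, 0.8, 0.8]))
```

Working with the logarithm avoids numerical underflow when the sample is large; a finer grid or a numerical optimizer would be needed for practical use.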
Confidence intervals: When known, the distribution of the LOD score, the logarithm of the ratio of the maximum likelihood to the likelihood for any value of {t_{0},..., t_{-z+1}, t_{p}}, LOD = Log_{10} (L(tˆ_{0},..., tˆ_{-z+1}, tˆ_{p})/L(t_{0},..., t_{-z+1}, t_{p})), allows one to build intervals for any given confidence level. The distribution of the LOD for the MIH estimator was empirically determined using large numbers of independent simulations for mating histories with known outcrossing rates. The values of these outcrossing rates are hereafter referred to as parametric values (see numerical validation). For each simulated mating scenario, we calculated the LOD score based on the maximum-likelihood estimates and the likelihood obtained with the parametric values. To build confidence intervals, LOD_{95} (or LOD_{99}) values were empirically determined as minimum LOD values for which 95% (or 99%) of runs included the parametric values.
Inbreeding equilibrium test: To test whether a population is in inbreeding equilibrium, we calculated the LOD score Log_{10} (L(tˆ_{0},..., tˆ_{-z+1}, tˆ_{p})/L(tˆ_{f},..., tˆ_{f}, tˆ_{f})), where {tˆ_{0},..., tˆ_{-z+1}, tˆ_{p}} are the maximum-likelihood (ML) estimates and tˆ_{f} is the rate calculated on the assumption of temporally constant outcrossing rates and inbreeding equilibrium. For LOD score values >LOD_{95}, the hypothesis of inbreeding equilibrium was rejected at the 5% probability level, and we then concluded that tˆ_{0} was significantly different from tˆ_{p}. Note that this likelihood ratio is related to a chi-square distribution of 1 d.f. (see Weir 1990, p. 80).
Hereafter, tˆ_{f} is calculated from a multilocus estimate of the fˆ inbreeding coefficient (Rousset and Raymond 1995) and then tˆ_{f} = (1 - fˆ)/(1 + fˆ).
SIMULATION ALGORITHM
To simulate infinite populations with constant allelic frequencies (no selection) and no genotypic variance in outcrossing rate, the proportions of the {S_{0},..., S_{n}} classes were calculated according to the parametric outcrossing values using Equations 1 and 2. Samples of individuals were randomly drawn from this distribution. The multilocus genotype of each individual was then determined by random drawing (Monte Carlo procedure) according to the probabilities of being homozygous or heterozygous given the selfing class, for a given distribution of allelic frequencies. For reasons of simplicity, all simulated populations were in inbreeding equilibrium and outcrossed at rate t_{p} until the last but one generation and then outcrossed at a rate of t_{0} for the final generation.
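The sampling scheme described above can be sketched as a two-step Monte Carlo draw: first a selfing class, then a heterozygosity pattern given that class. The sketch assumes a per-locus heterozygosity of D_{l}/2^n in class S_{n} and caps the class index at `n_max`, beyond which individuals are effectively homozygous; the function name, arguments, and fixed seed are illustrative choices, not the authors' implementation.

```python
import random


def simulate_sample(t0, tp, diversities, size, seed=0, n_max=60):
    """Draw `size` heterozygosity patterns for a history (t0 last, tp before)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(size):
        # Walk back through generations until an outcrossing event is met.
        n = 0
        while n < n_max and rng.random() >= (t0 if n == 0 else tp):
            n += 1
        # Each locus is heterozygous with probability D_l / 2^n.
        pattern = [1 if rng.random() < d / 2 ** n else 0 for d in diversities]
        sample.append(pattern)
    return sample


if __name__ == "__main__":
    # 150 individuals, four loci with D = 0.8, as in the validation design.
    sample = simulate_sample(0.25, 0.50, [0.8] * 4, 150)
    mean_het = sum(sum(p) for p in sample) / len(sample)
    print(f"mean number of heterozygous loci: {mean_het:.2f}")
```

Walking back one generation at a time yields class S_{n} with the product-form probability of the analytical model, so no explicit table of class proportions is required.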
NUMERICAL VALIDATION
To cover a breadth of situations, both t_{0} and t_{p} took values in {0.01, 0.25, 0.50, 0.75}. For each of the 16 (t_{0}, t_{p}) pairs, we simulated 100 independent populations of 150 individuals genotyped for four loci with diversity value D = 0.8. Actual Dˆ values used in MIH estimators were estimated for each locus from the simulated data sets according to Nei (1978).
Table 1 gives means and standard deviations of the estimated values of tˆ_{0} and tˆ_{p} over 100 independent simulations for the 16 parametric scenarios. The tˆ_{0} and tˆ_{p} values obtained were reliable even with a low number of loci (four), and no significant bias was observed in their means. The parametric value of t_{p} slightly affected the standard deviation of tˆ_{0}, whereas the standard deviation of tˆ_{p} increased with a high value of t_{0}.
No significant effect of (t_{0}, t_{p}) values on LOD score threshold values at 95 and 99% was apparent from the distribution of LOD scores for all parametric values of (t_{0}, t_{p}) (ANOVA, results not shown). Mean LOD score threshold values were 2.99 and 4.76 for 95th and 99th percentiles, respectively. By default, these empirical score values can be used to calculate confidence intervals. We also computed tˆ_{f} and their LOD score values. The proportion of estimates for which inbreeding equilibrium is rejected (which can be interpreted as the proportion of significant differences between tˆ_{0} and tˆ_{p}) is reported in Table 2 for each simulated (t_{0}, t_{p}) pair. When parametric t_{0} was equal to parametric t_{p}, the hypothesis of inbreeding equilibrium (H_{0}) was accepted 98 times out of 100 on average. For low to intermediate parametric outcrossing rates, H_{0} was always rejected when the parametric value of t_{0} effectively differed from that of t_{p}. But with increasing parametric outcrossing rates, and particularly t_{0}, H_{0} could be wrongly accepted. The worst such situations were those for which t_{0} was high and the difference with t_{p} low (H_{0} being accepted 81 times out of 100 when t_{0} = 0.75 and t_{p} = 0.50).
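As an illustration of how an empirical threshold can be turned into an interval, the sketch below keeps every candidate value of a rate whose LOD score relative to the maximum does not exceed the threshold. The default of 2.99 is the mean 95% threshold reported above; the function name and the dictionary input (candidate value mapped to log10-likelihood) are assumptions made for this example.

```python
def lod_interval(log10_lik_by_value, threshold=2.99):
    """Return (lower, upper) bounds of the values whose LOD is within threshold.

    log10_lik_by_value maps each candidate rate to the log10-likelihood
    obtained with that rate.
    """
    best = max(log10_lik_by_value.values())
    kept = [value for value, ll in log10_lik_by_value.items()
            if best - ll <= threshold]
    return min(kept), max(kept)


if __name__ == "__main__":
    # Hypothetical log10-likelihood profile, for illustration only.
    profile = {0.0: -10.0, 0.1: -5.0, 0.2: -4.0, 0.3: -6.5, 0.4: -12.0}
    print(lod_interval(profile))
```

The bounds are only as fine as the grid of candidate values, so a denser profile gives a sharper interval.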
This numerical validation shows that MIH estimates of t one generation (or more) before that observed are sensitive to the “resetting effect” of a high level of outcrossing in the final generation. In such cases, most inbred individuals outcross to give S_{0} individuals and the proportions of selfing classes S_{n} with n > 0 are therefore dramatically reduced. The next section analyzes the variance of these ML estimators under different experimental designs (sample size, and number and diversity of loci).
RELIABILITY OF ESTIMATES BASED ON MIH
The reliability of MIH estimators of successive outcrossing rates was studied as a function of (i) the number of sampled individuals, which determines the sampling error of the estimated proportions in selfing classes, and (ii) the numbers and diversity of genotyped loci, which determine the accuracy of individual assignment to selfing classes. The theoretical variance of the maximum-likelihood estimator can be calculated from the second partial derivatives of the likelihood expression. Because of the complexity of these derivatives, we chose to analyze the different sources of variation involved in the likelihood separately.
Sampling error in selfing class distribution: Sample size has an obvious effect on the accuracy of ML estimates through sampling errors in the representation of the different selfing classes. As the proportions of selfing classes {Q_{0},..., Q_{n}} depend on the outcrossing rates experienced, the optimal sampling size will vary with the mating history and the number of past generation outcrossing estimates desired. The higher the outcrossing rate in the most recent generation, the lower the number of individuals distributed in {S_{1},..., S_{n}}, and the more individuals will be needed.
When sampling within selfing classes is the only considered effect, the variance of the t_{-n} estimate in a population having a steady outcrossing rate t = t_{0} = t_{-1} = ... = t_{-n} is
Figure 1 plots the standard deviation of tˆ_{-n} when sample size is N = 100, and illustrates that the lower the successive outcrossing rates, the deeper the possible insight into the past mating behavior of the population. For low or intermediate outcrossing rates, the accuracy of the recent outcrossing rate estimates t_{0}, t_{-1}, and t_{-2} is thus expected to be good with a small sample size. Using Equation 8, sample sizes can be determined to obtain sufficiently accurate estimates of successive outcrossing rates.
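The qualitative behavior described here can be reproduced with a rough first-order argument, which is an assumption of this sketch and is not claimed to be the article's Equation 8 (the appendix indicates that a second-order expansion was used). Under a constant rate t, the rate n generations back can be estimated as the share of class S_{n} among the individuals in classes S_{n} and beyond; about N(1 - t)^n sampled individuals fall in those classes, giving a binomial variance of roughly t(1 - t)/(N(1 - t)^n). The function name is hypothetical.

```python
import math


def first_order_variance(t, n, sample_size):
    """Approximate Var(t_hat_{-n}) when class sampling is the only error source.

    Requires t < 1 whenever n > 0; first-order (binomial) approximation only.
    """
    remaining = (1.0 - t) ** n  # expected fraction of the sample in S_n and beyond
    return t * (1.0 - t) / (sample_size * remaining)


if __name__ == "__main__":
    for t in (0.05, 0.25, 0.75):
        sds = [math.sqrt(first_order_variance(t, n, 100)) for n in range(4)]
        print(t, [round(sd, 3) for sd in sds])
```

Under this approximation the standard deviation grows by a factor 1/sqrt(1 - t) for each generation further back, so high outcrossing rates quickly erase information on earlier generations.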
Discrimination between selfing classes: Accuracy in the determination of outcrossing rates relies greatly on the ability to correctly assign an individual to its specific selfing class on the basis of its heterozygosity. To illustrate the importance of the number and diversity of loci used to discriminate selfing classes, we plot in Figure 2 the distribution of the number of heterozygous loci expected for S_{0}, S_{1}, S_{2}, and S_{3} individuals in the case of L = 20 independent loci with equal Nei’s diversity (a, D = 0.8; b, D = 0.3). In this simple case, the distribution of the MIH is binomial: the probability that an individual of class S_{n} is heterozygous at k of the L loci is
$$P(k/S_{n}) = \binom{L}{k} \left(\frac{D}{2^{n}}\right)^{k} \left(1 - \frac{D}{2^{n}}\right)^{L - k}. \tag{9}$$
With few loci, the MIH distribution of S_{0} individuals is distinct from the other distributions, allowing us to estimate t_{0}. Conversely, the S_{1}, S_{2}, and S_{3} distributions overlap strongly, and more loci of high polymorphism must be used to separate them and thus give access to the corresponding outcrossing rates. To compare sets of loci of different levels of diversity, Equation 9 was arbitrarily used to calculate the number of loci required to have <20% of the area of a given distribution overlapping with the others (Figure 3).
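The overlap between class distributions can be explored with a short sketch. It assumes the binomial form with per-locus heterozygosity D/2^n and measures overlap as the summed minimum of two probability mass functions; this overlap measure is an assumption of the example and may differ from the criterion used for Figure 3. Function names are hypothetical.

```python
from math import comb


def mih_distribution(n, num_loci, diversity):
    """Binomial distribution of the number of heterozygous loci in class S_n."""
    h = diversity / 2 ** n
    return [comb(num_loci, k) * h ** k * (1.0 - h) ** (num_loci - k)
            for k in range(num_loci + 1)]


def overlap(n_a, n_b, num_loci, diversity):
    """Shared probability mass between the distributions of two selfing classes."""
    dist_a = mih_distribution(n_a, num_loci, diversity)
    dist_b = mih_distribution(n_b, num_loci, diversity)
    return sum(min(a, b) for a, b in zip(dist_a, dist_b))


if __name__ == "__main__":
    for diversity in (0.8, 0.3):
        for pair in ((0, 1), (1, 2), (2, 3)):
            print(diversity, pair, round(overlap(*pair, 20, diversity), 3))
```

Running the loop with other values of `num_loci` shows how many loci of a given diversity are needed before two adjacent classes become separable.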
Discrimination between S_{0} and S_{1} can be achieved with a reasonable number of loci (<10) provided their diversity is at least ∼0.5. Discriminating S_{1} from S_{2} is still possible using markers with high diversity values, but discrimination between higher inbreeding levels will require a huge number of loci (∼30 highly diverse loci to discriminate between S_{2} and S_{3}). Thus the MIH approach in practice seems limited to three successive outcrossing rate estimates (t_{0}, t_{-1}, t_{p}).
Errors in diversity estimates: According to Weir (1990), variance in Dˆ at a locus is expressed as
Designing an experimental protocol: As discussed previously, the optimal experimental design depends on the mating history of the population and also on the polymorphism of the marker used. Thus, to complete a specific experimental design, it could be of interest to first obtain a rough value for (t_{0}, t_{p}) from a subsample of individuals analyzed with few markers. Then the genotyping could be completed by increasing the number of loci or the number of individuals analyzed. This optimal experimental design could be determined by simulating populations with parameters varying around the rough estimates.
OUTCROSSING RATE ESTIMATES IN WHEAT POPULATIONS
Plant material: Using restriction fragment length polymorphism (RFLP) markers, MIH maximum-likelihood estimation of outcrossing rates was applied to nine wheat populations derived from two composite crosses (hereafter called PA and PB). The composites stem from successive crosses of two distinct sets of 16 inbred lines (David et al. 1997; Enjalbert et al. 1999a). A pyramidal crossing design was used to create these composites. It required 4 yr of manual crosses: eight one-way hybrids were created the first year, four two-way hybrids the second year, and so on. Beginning in 1984, six populations were then grown in different field locations. The individuals used in this study were derived from the 1994 harvest, after at least 12 generations of open pollination, predominantly selfing. Thus, the heterozygosity due to the manual hybridization of parental lines will have practically vanished and should not interfere with the estimation of the recent outcrossing rates in these populations.
An additional three populations were derived from the same initial populations by four (SR) and six (PA0 and PB0) generations of single seed descent. As selfing was performed by bagging the spikes, these populations are expected to have been purely selfing since the controlled crosses.
About 78 individuals per population were genotyped using 14 restriction enzyme and probe combinations, providing 25 independent codominant RFLP loci (Enjalbert et al. 1999b). To verify whether our data were able to provide proper estimates of outcrossing rates and to calculate the LOD_{95} value, we simulated populations with the same sample size, number of loci, and diversity as observed in the wheat populations.
Results: The mean number of alleles per locus was 2.6 and the mean Nei gene diversity was 0.34. Most markers were located on distinct chromosome arms. After the four manual crosses, which occurred during the building of the initial population, linkage disequilibrium was low in PA0, PB0, and in all derived populations (Enjalbert et al. 1999a). Our data should thus make it possible to discriminate satisfactorily S_{0}-S_{1} (0.8) and S_{1}-S_{2} (0.63) and to estimate two outcrossing rates t_{0} and t_{p}. Note that for the three single seed descent (SSD) populations, t_{p} was not an equilibrium value, but was a mean of the manual crosses and selfing that occurred in the previous generations. The LOD_{95} value obtained by simulation was 2.8, close to values obtained in previous simulations. Table 3 gives the estimates (tˆ_{0}, tˆ_{p}) and corresponding confidence intervals, together with tˆ_{f} and its LOD score. Globally, outcrossing is low in the nine populations. The greatest departures from inbreeding equilibrium were observed for the three SSD populations for which tˆ_{0} was estimated to be zero, as expected, while high tˆ_{p} values (for a selfing species) suggested considerable outcrossing in the past (LOD score >19). As also expected, the four-generation-old SSD population (SR) has a higher tˆ_{p} than the six-generation-old SSD populations (PA0 and PB0). More surprisingly, all populations but one (PA Moulon) grown under natural selection and open pollination were not in inbreeding equilibrium and sometimes indicated highly contrasting outcrossing values. Another point to note is that tˆ_{0} was always lower than tˆ_{p}. This could be for various reasons, discussed below.
DISCUSSION
We developed a maximum-likelihood estimator of successive outcrossing rates based on MIH. This method provides estimates and confidence intervals for the two or three most recent outcrossing rates, under the assumptions of no selection, no allelic frequency changes, unlinked loci at linkage equilibrium, and random mating for outcrossing gametes. To build confidence intervals, an empirical LOD_{95} value of ∼3.0 should be used to cover most situations, except those with few loci. We verified that this LOD_{95} value did not differ greatly for other simulated populations with varying sampling size and loci number or diversity (results not shown).
The accuracy and number of estimates that can be made depend greatly on the outcrossing level of the population: the MIH estimator is most informative for populations with intermediate or low outcrossing rates, because of the resetting effect of outcrossing on mating history. The accuracy of the estimates is related to the size of the sample analyzed and to the number and heterozygosities of the markers used. Increasing the diversity of loci rather than their number allows for better discrimination between selfing classes and thus a deeper insight into populations’ mating history, whereas increasing the sample size improves the accuracy of estimates of the sizes of distinct inbreeding classes. To improve the statistical power of a data set and to test inbreeding equilibrium, we suggest simulating data within a range of expected t values and sequentially accumulating the molecular data, adding more individuals or more loci to obtain the desired accuracy.
This is apparently the first method that leads to a test for inbreeding equilibrium using a single generation analysis. It is of great importance to the analysis of heterozygosity, as using mean heterozygosity to estimate the outcrossing rate t_{f} will lead to misleading conclusions in a population with varying outcrossing rates. As pointed out by Brown and Albrecht (1980), the estimates of inbreeding coefficient will be biased to higher values by varying outcrossing rates. Consider a population that has had a constant outcrossing rate t_{p} for a long time and then shifts from t_{p} to t_{0} in the final generation: the value of t_{f} calculated assuming inbreeding equilibrium is then t_{f} = (t_{0} + t_{p})/(2 + t_{p} - t_{0}) (appendix b). Let us imagine two populations, the first with (t_{p} = 0.3, t_{0} = 0.1) and the second with (t_{p} = 0.1, t_{0} = 0.3). The first population, which had a high level of outcrossing for more generations than the second one, will have the lower tˆ_{f} value (0.18 vs. 0.22). As experimentally demonstrated for the SSD populations analyzed (Table 3), our method thus avoids this pitfall that could lead to wrong decisions in population management.
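The two-population example can be checked numerically. The sketch below assumes the standard one-generation recursion for the inbreeding coefficient under mixed mating, f' = (1 - t)(1 + f)/2, starting from the equilibrium value under t_{p}; the function name `apparent_outcrossing` is hypothetical.

```python
def apparent_outcrossing(t0, tp):
    """t_f inferred under inbreeding equilibrium after one generation at t0."""
    f_p = (1.0 - tp) / (1.0 + tp)          # equilibrium inbreeding under tp
    f_0 = (1.0 - t0) * (1.0 + f_p) / 2.0   # one further generation at t0
    return (1.0 - f_0) / (1.0 + f_0)


if __name__ == "__main__":
    print(round(apparent_outcrossing(0.1, 0.3), 2))  # first population
    print(round(apparent_outcrossing(0.3, 0.1), 2))  # second population
```

Both populations share the same mean of t_{0} and t_{p}, yet the value inferred from mean heterozygosity differs, and it is weighted toward the most recent generation.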
In wheat bulk populations, MIH estimators detected significant temporal variation in outcrossing rates in eight populations out of nine. In the controlled populations (SR and SSD), the MIH estimators were close to the theoretical expected values. In all experimental wheat populations, the most recent outcrossing rate was estimated to be lower than the previous ones. This may indicate that a considerable year-to-year effect exists for outcrossing in these populations, as already shown for commercial cultivars sensitive to low light levels (e.g., the Moulin variety in Demotes-Mainard et al. 1996). However, analyses of climatic conditions do not clearly explain why the 1994 season involved more selfing than previous years. Heterotic selection (positive correlation between heterozygosity and fitness; see David 1998 for example) could explain this situation. Since it slows the loss of heterozygosity by favoring the most heterozygous genotypes, heterotic selection leads to overestimation of t_{p}, whereas t_{0} estimates remain unaffected (since genotypes were sampled prior to juvenile selection). This hypothesis is consistent with other experiments that will be presented elsewhere.
For the short period of time relevant here (two to four generations), low temporal variation of allelic frequencies is expected in populations of reasonable size and should thus only slightly affect the outcrossing estimates, especially since mean diversity indices seem to be only slightly sensitive to allelic frequency variation, as found in B. truncatus (Viard et al. 1997). Nevertheless, attention should be paid to populations of low effective size, rapidly evolving under strong natural selection or subject to migration. Pooling strongly spatially structured populations into a single sample would also be misleading. In this case, heterozygote deficiency could lead to the underestimation of outcrossing (as would homogamy or crosses between relatives, as shown for Secale cereale; Perez de la Vega and Allard 1984).
CONCLUSION
Multilocus individual heterozygosity has been used here to analyze the historical outcrossing rates of populations, yielding a better understanding of the dynamics of plant populations. The availability of highly polymorphic codominant markers, such as microsatellites (Tautz et al. 1986), now makes it possible to envisage powerful MIH studies within realistic experiments. As progeny array analysis is not always possible for practical reasons, our procedure offers an alternative to estimation based on inbreeding level. Further theoretical developments are still needed for formal descriptions of the estimator’s properties, particularly in the case of loci in linkage disequilibrium and of heterotic selection.
Decomposing a population into selfing classes could also be used to study heterosis in situ. For example, if individual fitness can be measured in situ and individuals can be genotyped to estimate their probabilities of belonging to different selfing classes, weighted mean fitnesses of successive selfing classes can provide information about inbreeding depression. Additionally, the study of two successive generations makes it possible to measure correlations between frequencies of S_{n} classes and S_{n-1} classes in the previous generation. Any correlation that does not fit the value expected from the final generation outcrossing rate will measure a selection effect.
APPENDIX A
Likelihood of individual x of genotype G_{x} could be written as
The mating behavior is estimated for z previous generations and outcrossing rates of all generations before generation z are assumed to be constant and equal to t_{p}. Then z + 1 outcrossing rates (t_{0},..., t_{-z+1}, t_{p}) must be estimated jointly by the maximum-likelihood technique. Hence, L(x) can be developed as
For highly selfing classes (m > N ≈ 20),
Extracting the constant terms
For computational ease, the logarithm of Equation A3 was used in our program. This C program calculating t estimates could be downloaded at ftp://moulon.inra.fr/pub/moulon/enj.
APPENDIX B
At inbreeding equilibrium,
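A sketch of how the expression for t_{f} quoted in the discussion can be obtained, assuming the standard one-generation recursion for the inbreeding coefficient under mixed mating, f' = (1 - t)(1 + f)/2. The symbols f_{p} (equilibrium inbreeding coefficient under t_{p}) and f_{0} (inbreeding coefficient after the final generation at t_{0}) are introduced here for illustration.

```latex
\begin{align*}
f_{p} &= \frac{1 - t_{p}}{1 + t_{p}}, \\
f_{0} &= \frac{(1 - t_{0})(1 + f_{p})}{2} = \frac{1 - t_{0}}{1 + t_{p}}, \\
t_{f} &= \frac{1 - f_{0}}{1 + f_{0}}
       = \frac{(1 + t_{p}) - (1 - t_{0})}{(1 + t_{p}) + (1 - t_{0})}
       = \frac{t_{0} + t_{p}}{2 + t_{p} - t_{0}}.
\end{align*}
```

Setting t_{0} = t_{p} = t returns t_{f} = t, as expected for a population at inbreeding equilibrium.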
APPENDIX C
Sampling variance over each selfing class could be derived from Equation 2:
By recurrence,
Mean and variance of t_{-n} could be approximated using Taylor’s second-order expansion (i.e., the Delta method; Weir 1990, p. 44). All moments of Q¯_{n} derive from moments of binomial or multinomial distributions,
In an infinite population in inbreeding equilibrium with a constant outcrossing rate t, it could be shown that Q_{n} = t(1 - t)^{n}. In this simple case, variance of tˆ_{-n} is
Acknowledgments
We are grateful to Eric Jenczewski, Christine Dillmann, Isabelle Goldringer, and Thomas Bataillon for helpful comments on earlier drafts of the article. We thank three anonymous reviewers for their criticisms, Tony Brown for stimulating discussions and final corrections, and Anne Marie Wall of the INRA language department for roughing out our English style. This research was supported by a grant from the French Bureau des Ressources Génétiques.
Footnotes

Communicating editor: A. H. D. Brown
 Received October 19, 1999.
 Accepted August 9, 2000.
 Copyright © 2000 by the Genetics Society of America