Inferring Recent Outcrossing Rates Using Multilocus Individual Heterozygosity: Application to Evolving Wheat Populations
- * INRA, Station de Génétique Végétale, Ferme du Moulon, 91190 Gif sur Yvette, France
- † INRA, Domaine de Melgueil, 34130 Mauguio, France
- Corresponding author: Jérôme Enjalbert, INRA, Unité de Pathologie Végétale, BP01, 78850 Thiverval-Grignon, France. E-mail: enjalber{at}grignon.inra.fr
Abstract
Using multilocus individual heterozygosity, a method is developed to estimate the outcrossing rates of a population over a few previous generations. Considering that individuals originate either from outcrossing or from n successive selfing generations from an outbred ancestor, a maximum-likelihood (ML) estimator is described that gives estimates of past outcrossing rates in terms of proportions of individuals with different n values. Heterozygosities at several unlinked codominant loci are used to assign n values to each individual. This method also allows a test of whether populations are in inbreeding equilibrium. The estimator’s reliability was checked using simulations for different mating histories. We show that this ML estimator can provide estimates of the outcrossing rate of the final generation (t0) and of a mean of the preceding rates (tp) and can detect major temporal variation in the mating system. The method is most efficient for low to intermediate outcrossing levels. Applied to nine populations of wheat, this method gave estimates of t0 and tp. These estimates confirmed the absence of outcrossing (t0 = 0) in the two populations subjected to manual selfing. For free-mating wheat populations, it detected lower final generation outcrossing rates (t0 = 0-0.06) than those expected from global heterozygosity (t = 0.02-0.09). This estimator appears to be a new and efficient way to describe the multilocus heterozygosity of a population, complementary to Fis and progeny analysis approaches.
The effect of mating system on plant genetic diversity is a major theme in evolutionary genetics, as it plays a major role in structuring the genetic variation within and between populations. For example, self-fertilization correlates with lower within-population diversity and with higher between-population differentiation, and global polymorphism is generally lower in selfing species (Godt and Hamrick 1991).
Two strategies have been used to estimate the outcrossing rates of a mixed mating population from genetic markers: measuring the frequency of heterozygous genotypes and analyzing progeny arrays (Ritland 1983). Assaying the frequency of heterozygotes in a population gives an estimate of the outcrossing rate assuming that (i) the selfing rate has been constant for a sufficient number of generations, (ii) the population is in inbreeding equilibrium, and (iii) selfing is the major cause of departure from Hardy-Weinberg frequencies (see Brown and Allard 1970). In such cases, a simple relation exists between Wright’s within-population inbreeding coefficient f and the outcrossing rate tf, f = (1 - tf)/(1 + tf) (Wright 1969). This method is widely used when surveying marker diversity in natural populations, for example, in Centrosema (Penteado et al. 1996) or in Bulinus truncatus (Viard et al. 1997). However, estimates of tf must be treated with caution when temporal variation exists in outcrossing rates.
The second method is based on progeny arrays. Morphological markers were used long ago to identify outcrossing vs. selfing events in progenies of known maternal genotypes (e.g., in wheat; Hayes 1918). As the detection of a single nonmaternal allele in a genotype proves its outbreeding origin, accurate estimates of the outcrossing rate in natural populations can be obtained using multilocus information on progenies (especially in conjunction with estimates of maternal genotypes). Powerful estimators of outcrossing rate based on progeny analysis have been proposed (Ritland and Jain 1981; Shaw et al. 1981) and widely used (Schoen and Brown 1991). Progeny arrays can also measure the variance in outcrossing rates between maternal individuals, e.g., as in studies of Eucalyptus regnans (Moran et al. 1989) and Acacia nilotica (Mandal and Ennos 1995). They also yield estimates of both ovule and pollen allelic frequencies (Godt and Hamrick 1991). Additionally, differences between monolocus and multilocus estimates make it possible to infer the amount of inbreeding due to mating between relatives (Shaw and Allard 1982; Ritland 1984). Note that progeny arrays can also yield a Wright estimate tf, using the inbreeding coefficient of mothers.
Environmental conditions can cause considerable temporal variation in mating behavior. Low temperatures or light intensity can modify outcrossing in some selfing species, as documented in wheat (Demotes-Mainard et al. 1995) and rice (Li et al. 1996). Low population density can reduce outcrossing, as demonstrated in Bombacaceous trees (Murawski et al. 1990) and in Cuphea laminuligera (Krueger and Knapp 1991). Studies over successive years are thus required to measure temporal variation in outcrossing rates (Barrett et al. 1993; Sproule and Dancik 1996).
Here we develop a method to estimate the outcrossing rates for a few previous generations using a single generation survey. It is based on the analysis of multilocus heterozygosity within individuals. In mixed mating species, individuals can either originate from outcrossing between individuals or be derived from a varying number of selfing generations from outbred individuals. Hence individuals display varying levels of heterozygosity in their genomes, as initial heterozygosity resulting from the founder outcrossing event is halved by each successive selfing generation (i.e., for two independent loci, recently produced genotypes are more likely to be double heterozygotes, whereas more and more double homozygotes accumulate with each selfing generation). The proportion of individuals exhibiting a high level of heterozygosity should thus give information about the outcrossing rate in the most recent generation, while the proportions of individuals with varying levels of homozygosity could be used to estimate outcrossing rates in the previous generations. Consequently, a population can be partitioned into classes of individuals sharing the same number of selfing generations since the last outcrossing event in their genealogy. Each class presents distinct expected levels of multilocus individual heterozygosity (MIH), and class frequencies are a result of the mating history of the population. Using both probability formulas, we developed a maximum-likelihood estimator of outcrossing rates on the basis of multilocus patterns observed in a random sample of individuals drawn from an infinite population. This approach allows a test of whether a population had a constant selfing rate in the past, i.e., if it is in inbreeding equilibrium.
Using simulated populations, the estimator properties are evaluated for various mating histories as functions of the sample size, the number of loci used to estimate MIH, and their Nei diversity. The MIH method is tested on nine populations of wheat (Triticum aestivum) derived from a pilot program of dynamic management of genetic resources. These populations have known contrasting mating histories. We show that this approach can be used when progeny analysis is not possible for practical reasons, or when temporal variance in outcrossing rates is suspected.
ANALYTICAL MODEL
Inbreeding classes distribution: Our model assumes that populations evolve with temporal variation in outcrossing rates with no genetic drift, no selection (particularly no heterotic selection), and with the random mating of outcrossing gametes. In populations of hermaphroditic species, individuals may originate from a cross between two different parents or from the selfing of a single individual, which itself could have resulted from selfing or outcrossing. Each individual can then be indexed on n (0 ≤ n < ∞), the number of selfing generations since the last outcrossing event in its genealogy. Such an individual is said to belong to class Sn. Then any population observed at a given generation is composed of an infinity of classes of individuals {S0,..., S∞} in proportions {Q0,..., Q∞}, with, of course, $\sum_{n=0}^{\infty} Q_n = 1$.
Individuals from all classes are assumed to outcross at the same rate. If t0 is the outcrossing rate that occurred in the production of the present generation, then individuals that were in class Sn one generation before produced progeny in class S0 at a rate of t0 and in Sn + 1 at a rate of (1 - t0). Writing Qn,-g for the proportion of class Sn in the population g generations before the one observed (so that Qn,0 = Qn), it is therefore clear that

$$Q_{0,0} = t_0 \quad \text{and} \quad Q_{n+1,0} = (1 - t_0)\, Q_{n,-1}. \tag{1}$$

If we use t-n to denote the outcrossing rate in the population n generations before the one observed, then, given that individuals in class Sn in the present population derive from outbred ancestors created n generations previously, we have

$$Q_{0,-n} = t_{-n}.$$

By recurrence, the proportion of individuals descending from n generations of selfing in the generation observed is thus

$$Q_{n,0} = t_{-n} \prod_{i=0}^{n-1} (1 - t_{-i}). \tag{2}$$
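As an illustration of this recurrence, the sketch below computes class proportions as the outcrossing rate n generations back multiplied by the selfing rates of the n more recent generations. The function name `selfing_class_proportions` and the convention of reusing the last supplied rate for all older generations are assumptions made for this example, not part of the original method description.

```python
def selfing_class_proportions(t_history, n_classes):
    """Return [Q_0, ..., Q_{n_classes - 1}] for the observed generation.

    t_history[i] stands for t_{-i}, the outcrossing rate i generations before
    the observed one (t_history[0] is t_0). The last value is reused for all
    older generations (an assumption of this sketch).
    """
    proportions = []
    never_outcrossed = 1.0  # probability of selfing in generations 0..n-1
    for n in range(n_classes):
        t_n = t_history[min(n, len(t_history) - 1)]
        proportions.append(t_n * never_outcrossed)
        never_outcrossed *= 1.0 - t_n
    return proportions


# Example: t_0 = 0.1 in the final generation, 0.3 in all earlier generations.
q = selfing_class_proportions([0.1, 0.3], 50)
print(q[:3])
```

With a constant rate t, the same function returns the geometric series t(1 - t)^n expected at inbreeding equilibrium.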
Expected multilocus individual heterozygosity: In diploid species, most individuals that result from an outcrossing event (S0) can be detected since they are heterozygous at many loci (Bennett and Binet 1956). Individuals in class Sn+1 have half the mean heterozygosity of individuals in the previous class Sn. We denote by P(ht/Sn) and P(hm/Sn) the probabilities of individuals from class Sn being heterozygous and homozygous, respectively, at a given locus. In the generation observed, P(ht/S0), i.e., the probability of being heterozygous after an outcrossing event, is

$$P(ht/S_0) = D = 1 - \sum_i p_i^2, \tag{3}$$

where p_i is the frequency of allele i at the locus and D is its Nei gene diversity, so that P(ht/Sn) = D/2^n and P(hm/Sn) = 1 - D/2^n.
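Now consider Gx, the multilocus heterozygosity pattern of an individual x at all the loci genotyped. Using the variable al, where al = 1 if x is heterozygous at locus l, and al = 0 otherwise, then for L unlinked loci in linkage equilibrium,

$$P(G_x/S_n) = \prod_{l=1}^{L} \left(\frac{D_l}{2^n}\right)^{a_l} \left(1 - \frac{D_l}{2^n}\right)^{1 - a_l}, \tag{4}$$

where Dl is the gene diversity of locus l.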
This product only holds at linkage equilibrium; otherwise disequilibrium measures have to be introduced (not developed here). Note that the correlation between heterozygous states of independent loci in a mixed mating population observed by Bennett and Binet (1956) results from the combination of the very first selfing classes, mainly composed of multiple heterozygous individuals, with highly selfed classes composed of multiple homozygotes. This correlation is absent within a given selfing class Sn and has no effect on S0 formation, as the disequilibrium is zygotic rather than gametic. Any heterozygosity correlation appearing in S0 through departures from our hypotheses (nonrandom mating, small population size, etc.) will be rapidly lost as heterozygosity diminishes during the successive selfings.
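The per-class pattern probability can be sketched in a few lines, assuming that the heterozygosity expected at a locus of gene diversity D is D/2^n in class Sn (halved by each selfing generation) and that loci are independent. The function name `pattern_probability` and the list-based encoding of the pattern are hypothetical choices for this example.

```python
def pattern_probability(het_pattern, diversities, n):
    """Probability of a heterozygosity pattern given selfing class S_n.

    het_pattern[l] is 1 if the individual is heterozygous at locus l, else 0.
    diversities[l] is the gene diversity D_l of locus l.
    Loci are treated as independent (linkage equilibrium assumed).
    """
    prob = 1.0
    for a_l, d_l in zip(het_pattern, diversities):
        p_het = d_l / 2 ** n  # heterozygosity is halved by each selfing
        prob *= p_het if a_l == 1 else 1.0 - p_het
    return prob


# A double heterozygote is far more likely in S_0 than in S_3.
print(pattern_probability([1, 1], [0.8, 0.8], 0))
print(pattern_probability([1, 1], [0.8, 0.8], 3))
```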
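Maximum-likelihood estimation of recent outcrossing rates: From the previous expressions, the likelihood of Gx, the genotype of individual x, is

$$L(x) = \sum_{n=0}^{\infty} P(x \in S_n)\, P(G_x/S_n). \tag{5}$$

As P(x ∊ Sn) = Qn,0, using Equations 1, 2, and 4, the first term of L(x) depends both on gene diversities and outcrossing rates for successive generations and, hence,

$$L(x) = \sum_{n=0}^{\infty} \left[ t_{-n} \prod_{i=0}^{n-1} (1 - t_{-i}) \right] \prod_{l=1}^{L} \left(\frac{D_l}{2^n}\right)^{a_l} \left(1 - \frac{D_l}{2^n}\right)^{1 - a_l}, \tag{6}$$

where Dl is the Nei gene diversity of locus l and al indicates whether x is heterozygous at that locus.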
The mating behavior is estimated for z previous generations, and the outcrossing rates of all generations before generation z are assumed to be constant and equal to tp. As developed in appendix c, when outcrossing rates vary before generation z, tp is a function of (t-z, t-z-1,..., t-∞), mainly depending on the very first terms (recent outcrossing rates).
Thus z + 1 outcrossing rates (t0,..., t-z+1, tp) must be jointly estimated by the maximum-likelihood technique. A complete expression for L(x) is presented in appendix a.
The likelihood L(1,..., x,..., X) of a sample of X individuals of genotypes (G1,..., Gx,..., GX) is the product of the individual likelihoods, on the assumption of independence between sampled individuals,

$$L(1, \ldots, X) = \prod_{x=1}^{X} L(x). \tag{7}$$
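To make the estimation step concrete, here is a minimal sketch for the two-rate case (t0 for the final generation, tp for all earlier ones). It is not the authors' C program: the function names, the plain grid search, and the lumping of all classes with n ≥ `n_max` into a single tail class are assumptions chosen for brevity. Each individual's likelihood is a mixture over selfing classes, with heterozygosity D/2^n expected in class Sn, and the sample log-likelihood is the sum of the individual log-likelihoods.

```python
import math


def class_proportions(t0, tp, n_max=20):
    """Proportions of classes S_0..S_{n_max}; the last entry lumps all n >= n_max."""
    proportions = [t0]
    never_outcrossed = 1.0 - t0  # selfing in every generation considered so far
    for _ in range(1, n_max):
        proportions.append(tp * never_outcrossed)
        never_outcrossed *= 1.0 - tp
    proportions.append(never_outcrossed)  # tail: at least n_max selfings
    return proportions


def log_likelihood(sample, diversities, t0, tp, n_max=20):
    """Natural-log likelihood of a sample of heterozygosity patterns (0/1 lists)."""
    proportions = class_proportions(t0, tp, n_max)
    total = 0.0
    for pattern in sample:
        lik = 0.0
        for n, q_n in enumerate(proportions):
            p = q_n
            for a_l, d_l in zip(pattern, diversities):
                p_het = d_l / 2 ** n
                p *= p_het if a_l else 1.0 - p_het
            lik += p
        if lik <= 0.0:
            return float("-inf")
        total += math.log(lik)
    return total


def ml_grid_search(sample, diversities, steps=50):
    """Return (t0_hat, tp_hat, max_log_likelihood) over a regular grid on [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    best = max(
        (log_likelihood(sample, diversities, t0, tp), t0, tp)
        for t0 in grid
        for tp in grid
    )
    return best[1], best[2], best[0]


# Toy data: 3 fully heterozygous and 17 fully homozygous individuals at 4 loci.
toy = [[1, 1, 1, 1]] * 3 + [[0, 0, 0, 0]] * 17
print(ml_grid_search(toy, [0.8] * 4, steps=20))
```

A grid search is used only because it is easy to check by hand; a numerical optimizer would be the more usual choice for real data.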
Confidence intervals: When known, the distribution of the LOD score, the logarithm of the ratio of the maximum likelihood to the likelihood for any value of {t0,..., t-z+1, tp}, LOD = Log10 (L(tˆ0,..., tˆ-z+1, tˆp)/L(t0,..., t-z+1, tp)), allows one to build intervals for any given confidence level. The distribution of the LOD for the MIH estimator was empirically determined using large numbers of independent simulations for mating histories with known outcrossing rates. The values of these outcrossing rates are hereafter referred to as parametric values (see numerical validation). For each simulated mating scenario, we calculated the LOD score based on the maximum-likelihood estimates and the likelihood obtained with the parametric values. To build confidence intervals, LOD95 (or LOD99) values were empirically determined as minimum LOD values for which 95% (or 99%) of runs included the parametric values.
Inbreeding equilibrium test: To test whether a population is in inbreeding equilibrium, we calculated the LOD score Log10 (L(tˆ0,..., tˆ-z+1, tˆp)/L(tˆf,..., tˆf, tˆf)), where {tˆ0,..., tˆ-z+1, tˆp} are the maximum-likelihood (ML) estimates and tˆf is the rate calculated on the assumption of temporally constant outcrossing rates and inbreeding equilibrium. For LOD score values >LOD95, the hypothesis of inbreeding equilibrium was rejected at the 5% probability level, and we then concluded that tˆ0 was significantly different from tˆp. Note that this likelihood ratio is related to a chi-square distribution of 1 d.f. (see Weir 1990, p. 80).
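The LOD comparison itself is simple arithmetic once the two log-likelihoods are available. The sketch below assumes they were computed as natural logarithms (as most numerical code does) and converts their difference to base 10. The helper names are hypothetical, and the default threshold of 2.99 is the empirical LOD95 reported in the numerical validation; it would need to be re-derived by simulation for other designs.

```python
import math


def lod_score(loglik_max, loglik_null):
    """LOD = log10 of the likelihood ratio, from natural-log likelihoods."""
    return (loglik_max - loglik_null) / math.log(10)


def rejects_equilibrium(loglik_max, loglik_null, lod_threshold=2.99):
    """True when the constant-rate (inbreeding equilibrium) hypothesis is rejected."""
    return lod_score(loglik_max, loglik_null) > lod_threshold


# Example: a likelihood ratio of 1000 corresponds to a LOD of 3.
print(lod_score(math.log(1000.0), 0.0))
```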
Table 1. Estimates of successive outcrossing rates in simulated populations
Hereafter, tˆf is calculated from a multilocus estimate of the fˆ inbreeding coefficient (Rousset and Raymond 1995) and then tˆf = (1 - fˆ)/(1 + fˆ).
SIMULATION ALGORITHM
To simulate infinite populations with constant allelic frequencies (no selection) and no genotypic variance in outcrossing rate, the proportions of the {S0,..., Sn} classes were calculated according to the parametric outcrossing values using Equations 1 and 2. Samples of individuals were randomly drawn from this distribution. The multilocus genotype of each individual was then determined by random drawing (Monte Carlo procedure) according to the probabilities of being homozygous or heterozygous given the selfing class, for a given distribution of allelic frequencies. For reasons of simplicity, all simulated populations were in inbreeding equilibrium and outcrossed at rate tp until the last but one generation and then outcrossed at a rate of t0 for the final generation.
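The sampling scheme described here can be sketched as a two-step Monte Carlo draw: first a selfing class for each individual, then a heterozygosity state at each locus given that class. The sketch below draws the class by walking back through generations until an outcrossing event is met, which is equivalent to sampling from the class proportions. The function name, the `n_max` cap (needed when rates are zero), and the use of a single diversity value per locus instead of explicit allelic frequencies are simplifying assumptions of this example.

```python
import random


def simulate_sample(t0, tp, diversities, n_individuals, rng, n_max=60):
    """Return a list of (selfing_class, heterozygosity_pattern) tuples.

    t0 is the outcrossing rate of the final generation, tp the constant rate
    of all earlier generations. Classes beyond n_max are truncated to n_max.
    """
    sample = []
    for _ in range(n_individuals):
        n = 0
        rate = t0
        # Walk back one generation at a time until an outcrossing event occurs.
        while n < n_max and rng.random() >= rate:
            n += 1
            rate = tp
        # Each locus is heterozygous with probability D_l / 2^n.
        pattern = [1 if rng.random() < d / 2 ** n else 0 for d in diversities]
        sample.append((n, pattern))
    return sample


rng = random.Random(2000)  # fixed seed so that the run is repeatable
population = simulate_sample(0.25, 0.50, [0.8] * 4, 150, rng)
print(sum(1 for n, _ in population if n == 0) / len(population))
```

In an actual validation run the class labels would be discarded and only the patterns passed to the estimator.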
NUMERICAL VALIDATION
To cover a breadth of situations, both t0 and tp took values in {0.01, 0.25, 0.50, 0.75}. For each of the 16 (t0, tp) pairs, we simulated 100 independent populations of 150 individuals genotyped for four loci with diversity value D = 0.8. Actual Dˆ values used in MIH estimators were estimated for each locus from the simulated data sets according to Nei (1978).
Table 1 gives means and standard deviations of the estimated values of tˆ0 and tˆp over 100 independent simulations for the 16 parametric scenarios. The tˆ0 and tˆp values obtained were reliable even with a low number of loci (four), and no significant bias was observed in their means. The parametric value of tp slightly affected the standard deviation of tˆ0, whereas the standard deviation of tˆp increased with a high value of t0.
No significant effect of (t0, tp) values on LOD score threshold values at 95 and 99% was apparent from the distribution of LOD scores for all parametric values of (t0, tp) (ANOVA, results not shown). Mean LOD score threshold values were 2.99 and 4.76 for 95th and 99th percentiles, respectively. By default, these empirical score values can be used to calculate confidence intervals. We also computed tˆf and their LOD score values. The proportion of estimates for which inbreeding equilibrium is rejected (which can be interpreted as the proportion of significant differences between tˆ0 and tˆp) is reported in Table 2 for each simulated (t0, tp) pair. When parametric t0 was equal to parametric tp, the hypothesis of inbreeding equilibrium (H0) was accepted 98 times out of 100 on average. For low to intermediate parametric outcrossing rates, H0 was always rejected when the parametric value of t0 effectively differed from that of tp. But with increasing parametric outcrossing rates, and particularly t0, H0 could be wrongly accepted. The worst such situations were those for which t0 was high and the difference with tp low (H0 being accepted 81 times out of 100 when t0 = 0.75 and tp = 0.50).
This numerical validation shows that MIH estimates of t one generation (or more) before that observed are sensitive to the “resetting effect” of a high level of outcrossing in the final generation. In such cases, most inbred individuals outcross to give S0 individuals and the proportions of selfing classes Sn with n > 0 are therefore dramatically reduced. The next section analyzes the variance of these ML estimators under different experimental designs (sample size, and number and diversity of loci).
RELIABILITY OF ESTIMATES BASED ON MIH
The reliability of MIH estimators of successive outcrossing rates was studied as a function of (i) the number of sampled individuals, which determines the sampling error of the estimated proportions in selfing classes, and (ii) the numbers and diversity of genotyped loci, which determine the accuracy of individual assignment to selfing classes. The theoretical variance of the maximum-likelihood estimator can be calculated by the second partial derivatives of the likelihood expression. Because of the complexity of these derivatives, we chose to analyze the different sources of variation involved in the likelihood separately.
Table 2. Test of the inbreeding equilibrium hypothesis
Sampling error in selfing class distribution: Sample size has an obvious effect on the accuracy of ML estimates through sampling errors in the representation of the different selfing classes. As the proportions of selfing classes {Q0,..., Qn} depend on the outcrossing rates experienced, the optimal sampling size will vary with the mating history and the number of past generation outcrossing estimates desired. The higher the outcrossing rate in the most recent generation, the lower the number of individuals distributed in {S1,..., Sn}, and the more individuals will be needed.
When sampling within selfing classes is the only considered effect, the variance of the t-n estimate in a population having a steady outcrossing rate t = t0 = t-1 = t-n is
Figure 1 plots the standard deviation over tˆ-n, when sample size is N = 100, and illustrates that the lower the successive outcrossing rates, the deeper the possible insight into the past mating behavior of the population. For low or intermediate outcrossing rates, the accuracy of the recent outcrossing rate estimates t0, t-1, and t-2 is thus expected to be good with a small sample size. Using Equation 8, sample sizes can be determined to obtain sufficiently accurate estimates of successive outcrossing rates.
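A rough feel for this sampling effect can be obtained from a first-order binomial approximation, which is an assumption of this sketch and may differ from the article's Equation 8 (derived with a second-order expansion). Under a steady rate t, about N(1 - t)^n of the N sampled individuals have selfed for at least n generations, and only these inform the estimate of t-n, giving a variance of roughly t(1 - t)/[N(1 - t)^n]. The function name is hypothetical.

```python
import math


def approx_sd_outcrossing(t, n, sample_size):
    """First-order standard deviation of the estimate of t_{-n}.

    Assumes a steady outcrossing rate 0 < t < 1 and treats the estimate as a
    binomial proportion among the individuals with at least n selfings.
    """
    if not 0.0 < t < 1.0:
        raise ValueError("t must lie strictly between 0 and 1")
    expected_informative = sample_size * (1.0 - t) ** n
    return math.sqrt(t * (1.0 - t) / expected_informative)


# Standard deviation of successive estimates for N = 100 and two steady rates.
for rate in (0.1, 0.75):
    print(rate, [round(approx_sd_outcrossing(rate, n, 100), 3) for n in range(5)])
```

The approximation reproduces the qualitative pattern described in the text: the higher the steady outcrossing rate, the faster precision degrades for older generations.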
Discrimination between selfing classes: Accuracy in the determination of outcrossing rates relies greatly on the ability to correctly assign an individual to its specific selfing class on the basis of its heterozygosity. To illustrate the importance of the number and diversity of loci used to discriminate selfing classes, we plot in Figure 2 the distribution of the number of heterozygous loci expected for S0, S1, S2, and S3 individuals in the case of L = 20 independent loci with equal Nei’s diversity (a, D = 0.8; b, D = 0.3). In this simple case, the distribution of the MIH is binomial,

$$P(k/S_n) = \binom{L}{k} \left(\frac{D}{2^n}\right)^{k} \left(1 - \frac{D}{2^n}\right)^{L - k}, \tag{9}$$

where k is the number of heterozygous loci among the L loci genotyped.
With few loci, the MIH distribution of S0 individuals is distinct from the other distributions, allowing us to estimate t0. In contrast, the S1, S2, and S3 distributions overlap strongly, and more loci of high polymorphism must be used to separate them and thus give access to the corresponding outcrossing rates. To compare sets of loci of different levels of diversity, Equation 9 was arbitrarily used to calculate the number of loci required to have <20% of the area of a given distribution overlapping with the others (Figure 3).
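The overlap criterion can be explored numerically. The sketch below builds the binomial distribution of the number of heterozygous loci for a class Sn (success probability D/2^n at each of L equally diverse loci) and measures the overlap between two successive classes as the summed pointwise minimum of their distributions. This overlap definition is an assumption of the example and may not match the exact criterion used for Figure 3, so the resulting locus counts are indicative only. All function names are hypothetical.

```python
from math import comb


def mih_distribution(n_loci, diversity, n):
    """Binomial distribution of the number of heterozygous loci in class S_n."""
    p = diversity / 2 ** n
    return [
        comb(n_loci, k) * p ** k * (1.0 - p) ** (n_loci - k)
        for k in range(n_loci + 1)
    ]


def overlap(dist_a, dist_b):
    """Shared area of two discrete distributions (1.0 means identical)."""
    return sum(min(a, b) for a, b in zip(dist_a, dist_b))


def loci_needed(diversity, n, max_overlap=0.2, max_loci=500):
    """Smallest number of loci separating S_n from S_{n+1}, or None if not found."""
    for n_loci in range(1, max_loci + 1):
        shared = overlap(
            mih_distribution(n_loci, diversity, n),
            mih_distribution(n_loci, diversity, n + 1),
        )
        if shared < max_overlap:
            return n_loci
    return None


# Loci needed to separate S0/S1, S1/S2 and S2/S3 for highly diverse markers.
print([loci_needed(0.8, n) for n in range(3)])
```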
Figure 1. Change in standard deviation of estimates of outcrossing rates for the five last generations according to the outcrossing rate of a population in inbreeding equilibrium in a sample of 100 individuals (see text for details).
Figure 2. MIH distributions: distribution of the number of heterozygous loci expected in individuals belonging to S0, S1, S2, and S3 selfing classes when observing 20 independent loci with equal heterozygosities of (A) D = 0.8 and (B) D = 0.3.
Discrimination between S0 and S1 can be achieved with a reasonable number of loci (<10) provided their diversity is at least ∼0.5. Discriminating S1 from S2 is still possible using markers with high diversity values, but discrimination between higher inbreeding levels will require a huge number of loci (∼30 highly diverse loci to discriminate between S2 and S3). Thus the MIH approach in practice seems limited to three successive outcrossing rate estimates (t0, t-1, tp).
Errors in diversity estimates: According to Weir (1990), variance in Dˆ at a locus is expressed as
Designing an experimental protocol: As discussed previously, the optimal experimental design depends on the mating history of the population and also on the polymorphism of the marker used. Thus, to complete a specific experimental design, it could be of interest to first obtain a rough value for (t0, tp) from a subsample of individuals analyzed with few markers. Then the genotyping could be completed by increasing the number of loci or the number of individuals analyzed. This optimal experimental design could be determined by simulating populations with parameters varying around the rough estimates.
Figure 3. Number of loci required to separate MIH distributions of two successive selfing classes with <20% of overlapping area, according to their Nei diversity.
OUTCROSSING RATE ESTIMATES IN WHEAT POPULATIONS
Plant material: Using restriction fragment length polymorphism (RFLP) markers, MIH maximum-likelihood estimation of outcrossing rates was applied to nine wheat populations derived from two composite crosses (hereafter called PA and PB). The composites stem from successive crosses of two distinct sets of 16 inbred lines (David et al. 1997; Enjalbert et al. 1999a). A pyramidal crossing design was used to create these composites. It required 4 yr of manual crosses: eight one-way hybrids were created the first year, four two-way hybrids the second year, and so on. Beginning in 1984, six populations were then grown in different field locations. The individuals used in this study were derived from the 1994 harvest, after at least 12 generations of open pollination, predominantly selfing. Thus, the heterozygosity due to the manual hybridization of parental lines will have practically vanished and should not interfere with the estimation of the recent outcrossing rates in these populations.
An additional three populations were derived from the same initial populations by four (SR) and six (PA0 and PB0) generations of single seed descent. As selfing was performed by bagging the spikes, these populations are expected to have been purely selfing since the controlled crosses.
Table 3. Successive outcrossing rates in nine wheat populations
About 78 individuals per population were genotyped using 14 restriction enzyme and probe combinations, providing 25 independent codominant RFLP loci (Enjalbert et al. 1999b). To verify whether our data were able to provide proper estimates of outcrossing rates and to calculate the LOD95 value, we simulated populations with the same sample size, number of loci, and diversity as observed in the wheat populations.
Results: The mean number of alleles per locus was 2.6 and the mean Nei gene diversity was 0.34. Most markers were located on distinct chromosome arms. After the four manual crosses, which occurred during the building of the initial population, linkage disequilibrium was low in PA0, PB0, and in all derived populations (Enjalbert et al. 1999a). Our data should thus make it possible to discriminate satisfactorily S0 - S1 (0.8) and S1 - S2 (0.63) and to estimate two outcrossing rates t0 and tp. Note that for the three single seed descent (SSD) populations, tp was not an equilibrium value, but was a mean of the manual crosses and selfing that occurred in the previous generations. The LOD95 value obtained by simulation was 2.8, close to values obtained in previous simulations. Table 3 gives the estimates (tˆ0, tˆp) and corresponding confidence intervals, together with tˆf and its LOD score. Globally, outcrossing is low in the nine populations. The greatest departures from inbreeding equilibrium were observed for the three SSD populations for which tˆ0 was estimated to be zero, as expected, while high tˆp values (for a selfing species) suggested considerable outcrossing in the past (LOD score >19). As also expected, the four-generation-old SSD population (SR) has a higher tˆp than the six-generation-old SSD populations (PA0 and PB0). More surprisingly, all populations but one (PA Moulon) grown under natural selection and open pollination were not in inbreeding equilibrium and sometimes indicated highly contrasting outcrossing values. Another point to note is that tˆ0 was always lower than tˆp. This could be for various reasons, discussed below.
DISCUSSION
We developed a maximum-likelihood estimator of successive outcrossing rates based on MIH. This method provides estimates and confidence intervals for the two or three most recent outcrossing rates, under the assumptions of no selection, no allelic frequency changes, unlinked loci at linkage equilibrium, and random mating for outcrossing gametes. To build confidence intervals, an empirical LOD95 value of ∼3.0 should be used to cover most situations, except those with few loci. We verified that this LOD95 value did not differ greatly for other simulated populations with varying sampling size and loci number or diversity (results not shown).
The accuracy and number of estimates that can be made depend greatly on the outcrossing level of the population: the MIH estimator is most informative for populations with intermediate or low outcrossing rates, because of the resetting effect of outcrossing on mating history. The accuracy of the estimates is related to the size of the sample analyzed and to the number and heterozygosities of the markers used. Increasing the diversity of loci rather than their number allows for better discrimination between selfing classes and thus a deeper insight into populations’ mating history, whereas increasing the sample size improves the accuracy of estimates of the sizes of distinct inbreeding classes. To improve the statistical power of a data set and to test inbreeding equilibrium, we suggest simulating data within a range of expected t values and sequentially accumulating the molecular data, adding more individuals or more loci to obtain the desired accuracy.
This is apparently the first method that leads to a test for inbreeding equilibrium using a single generation analysis. It is of great importance to the analysis of heterozygosity, as using mean heterozygosity to estimate the outcrossing rate tf will lead to misleading conclusions in a population with varying outcrossing rates. As pointed out by Brown and Albrecht (1980), the estimates of inbreeding coefficient will be biased to higher values by varying outcrossing rates. Considering a population that has a constant outcrossing rate tp for a long time and shifting the final generation from tp to t0, then the value of tf assuming inbreeding equilibrium is tf = (t0 + tp)/(2 + tp - t0) (appendix b). Let us imagine two populations, the first with (tp = 0.3, t0 = 0.1) and the second with (tp = 0.1, t0 = 0.3). The first population that had a high level of outcrossing for more generations than the second one will have the lower tˆf value (0.18 vs. 0.22). As experimentally demonstrated for the SSD populations analyzed (Table 3), our method thus avoids this pitfall that could lead to wrong decisions in population management.
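The two-population comparison can be checked numerically. The sketch below applies the standard inbreeding recursion under mixed mating, f' = (1 - t)(1 + f)/2, starting from the equilibrium value reached under tp, and then converts the resulting inbreeding coefficient into the apparent rate tf. The function name `apparent_outcrossing_rate` is a hypothetical label for this example.

```python
def apparent_outcrossing_rate(t0, tp):
    """Rate t_f inferred from heterozygosity when equilibrium is wrongly assumed.

    The population is at inbreeding equilibrium under tp, then experiences
    one final generation at rate t0.
    """
    f_before = (1.0 - tp) / (1.0 + tp)           # equilibrium inbreeding under tp
    f_now = (1.0 - t0) * (1.0 + f_before) / 2.0  # one more generation at rate t0
    return (1.0 - f_now) / (1.0 + f_now)


print(round(apparent_outcrossing_rate(t0=0.1, tp=0.3), 2))  # first population
print(round(apparent_outcrossing_rate(t0=0.3, tp=0.1), 2))  # second population
```

The two calls give 0.18 and 0.22, matching the values quoted in the text, and the function returns t itself when t0 = tp = t.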
In wheat bulk populations, MIH estimators detected significant temporal variation in outcrossing rates in eight populations out of nine. In the controlled populations (SR and SSD), the MIH estimators were close to the theoretical expected values. In all experimental wheat populations, the most recent outcrossing rate was estimated to be lower than the previous ones. This may indicate that a considerable year-to-year effect exists for outcrossing in these populations, as already shown for commercial cultivars sensitive, for example, to low light levels (for example, the Moulin variety in Demotes-Mainard et al. 1996). However, analyses of climatic conditions do not clearly explain why the 1994 season involved more selfing than previous years. Heterotic selection (positive correlation between heterozygosity and fitness; see David 1998 for example) could explain this situation. Since it decreases the reduction in heterozygosity by favoring the most heterozygous genotypes, heterotic selection leads to overestimation of tp, whereas t0 estimates remain unaffected (since genotypes were sampled prior to juvenile selection). This hypothesis is consistent with other experiments that will be presented elsewhere.
For the short period of time relevant here (two to four generations), low temporal variation of allelic frequencies is expected in populations of reasonable size and should thus only slightly affect the outcrossing estimates, especially since mean diversity indices seem to be only slightly sensitive to allelic frequency variation, as found in B. truncatus (Viard et al. 1997). Nevertheless, attention should be paid to populations of low effective size, rapidly evolving under strong natural selection or subject to migration. A misleading situation would also be the pooling into a single sample of strongly spatially structured populations. In this case, heterozygous deficiency could lead to the underestimation of outcrossing (as would homogamy or crosses between relatives, as shown for Secale cereale; Perez de la Vega and Allard 1984).
CONCLUSION
Multilocus individual heterozygosity has been used here to analyze the historical outcrossing rates of populations, yielding a better understanding of the dynamics of plant populations. The availability of highly polymorphic codominant markers, such as microsatellites (Tauz et al. 1986), now makes it possible to envisage powerful MIH studies within realistic experiments. As progeny array analysis is not always possible for practical reasons, our procedure offers an alternative to estimation based on inbreeding level. Further theoretical developments are still needed for formal descriptions of the estimator’s properties, particularly in the case of using loci in linkage disequilibrium and heterotic selection.
Decomposing a population into selfing classes could also be used to study heterosis in situ. For example, if individual fitness can be measured in situ and individuals can be genotyped to estimate their probabilities of belonging to different selfing classes, weighted mean fitnesses of successive selfing classes can provide information about inbreeding depression. Additionally, the study of two successive generations makes it possible to measure correlations between frequencies of Sn classes and Sn-1 classes in the previous generation. Any correlation not fitted with the value expected from the final generation outcrossing rate will measure a selection effect.
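One way to picture the weighted-fitness idea is to compute, for each genotyped individual, its posterior probability of belonging to each selfing class and to use those probabilities as weights. The sketch below is only an illustration under stated assumptions: class proportions are supplied by the user (for example from estimated outcrossing rates), heterozygosity in class Sn is taken as D/2^n, loci are independent, and every function name is hypothetical.

```python
def class_posteriors(pattern, diversities, proportions):
    """Posterior probability of each class S_n given a heterozygosity pattern."""
    weights = []
    for n, q_n in enumerate(proportions):
        p = q_n
        for a_l, d_l in zip(pattern, diversities):
            p_het = d_l / 2 ** n
            p *= p_het if a_l else 1.0 - p_het
        weights.append(p)
    total = sum(weights)
    return [w / total for w in weights]


def weighted_class_fitness(patterns, fitnesses, diversities, proportions):
    """Posterior-weighted mean fitness of each class (None if a class has no weight)."""
    sums = [0.0] * len(proportions)
    weights = [0.0] * len(proportions)
    for pattern, fitness in zip(patterns, fitnesses):
        posteriors = class_posteriors(pattern, diversities, proportions)
        for n, post in enumerate(posteriors):
            sums[n] += post * fitness
            weights[n] += post
    return [s / w if w > 0 else None for s, w in zip(sums, weights)]


# Toy example with made-up fitness values and four equally frequent classes.
patterns = [[1, 1, 1, 1], [1, 0, 1, 0], [0, 0, 0, 0]]
fitnesses = [1.0, 0.9, 0.7]
print(weighted_class_fitness(patterns, fitnesses, [0.8] * 4, [0.25] * 4))
```

A decline of these weighted means across successive classes would be the kind of signal interpreted as inbreeding depression; the numbers above are invented purely to show the mechanics.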
APPENDIX A
Likelihood of individual x of genotype Gx could be written as
The mating behavior is estimated for z previous generations and outcrossing rates of all generations before generation z are assumed to be constant and equal to tp. Then z + 1 outcrossing rates (t0,..., t-z+1, tp) must be estimated jointly by the maximum-likelihood technique. Hence, L(x) can be developed as
For highly selfing classes (m > N ≈ 20),
Extracting the constant terms
For computational ease, the logarithm of Equation A3 was used in our program. This C program calculating t estimates could be downloaded at ftp://moulon.in-ra.fr/pub/moulon/enj.
APPENDIX B
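At inbreeding equilibrium under a constant outcrossing rate tp, the inbreeding coefficient is

$$f_p = \frac{1 - t_p}{1 + t_p}.$$

After a final generation with outcrossing rate t0, it becomes

$$f_0 = \frac{(1 - t_0)(1 + f_p)}{2} = \frac{1 - t_0}{1 + t_p},$$

so that the outcrossing rate inferred under the assumption of inbreeding equilibrium is

$$t_f = \frac{1 - f_0}{1 + f_0} = \frac{t_0 + t_p}{2 + t_p - t_0}.$$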
APPENDIX C
Sampling variance over each selfing class could be derived from Equation 2:
By recurrence,
Mean and variance of t-n could be approximated using Taylor’s second-order expansion (i.e., the Delta method; Weir 1990, p. 44). All moments of Q¯n derive from moments of binomial or multinomial distributions,
In an infinite population in inbreeding equilibrium with a constant outcrossing rate t, it could be shown that Qn = t(1 - t)^n. In this simple case, variance of tˆ-n is
Acknowledgments
We are grateful to Eric Jenczewski, Christine Dillmann, Isabelle Goldringer, and Thomas Bataillon for helpful comments on earlier drafts of the article. We thank three anonymous reviewers for their criticisms, Tony Brown for stimulating discussions and final corrections, and Anne Marie Wall of the INRA language department for roughing out our English style. This research was supported by a grant from the French Bureau des Ressources Génétiques.
Footnotes
Communicating editor: A. H. D. Brown
- Received October 19, 1999.
- Accepted August 9, 2000.
- Copyright © 2000 by the Genetics Society of America