Abstract
Testing (over)dominance as the genetic cause of heterosis and estimating the (over)dominance coefficient (h) are related. Using simulations, we investigate the statistical properties of Mukai's approach, which is intended to estimate the average (
Inbreeding depression results from mating among relatives, and outbreeding enhancement results from mating among usually inbreeding lines or isolated populations. Both phenomena are widely observed (e.g., Wright 1977; Charlesworth and Charlesworth 1987; Falconer 1989; Crow 1993; Lynch and Walsh 1997). For simplicity, hereafter we will refer to both phenomena collectively as heterosis. The magnitude of heterosis has implications in many areas, such as the evolution of self-incompatibility systems in plants (Lande and Schemske 1985; Schemske and Lande 1985; Charlesworth and Charlesworth 1987), the evolution of dispersal mechanisms for inbreeding avoidance in animals (Shields 1982), the biological conservation of rare and endangered species (Soule 1986), the improvement of agricultural production (Falconer 1989), and the protection of human welfare (Cavalli-Sforza and Bodmer 1971).
There are two main rival genetic hypotheses concerning individual loci to explain heterosis. One is the dominance hypothesis (Davenport 1908; Crow 1952), which argues that heterosis is caused by an enhanced expression of deleterious genes when homozygosity is increased and the heterozygote performance is somewhere (but not exactly) in between the two corresponding homozygotes. The other is the overdominance hypothesis (East 1908; Shull 1908; Crow 1952), which argues for heterozygote superiority relative to both homozygotes. Although neither dominance nor overdominance is necessary for heterosis (Richey 1942; Minvielle 1987; Schnell and Cockerham 1992), here we concentrate on studying just these two mechanistic causes (see discussion).
Although most experimental data are consistent with the dominance hypothesis, overdominance cannot be ruled out in many situations (Simmons and Crow 1977; Charlesworth and Charlesworth 1987; Barrett and Charlesworth 1991; Stuber et al. 1992; Crow 1993; Mitton 1993). At present, the evidence for functional overdominance does not seem to be very convincing, and most cited examples are compatible with associated overdominance [an artifact of linked deleterious recessive genes (cf. Houle 1989, 1994; Crow 1993)]. The debate over the relative importance of the two hypotheses continues (Sprague 1983; Wallace 1989; Crow 1993; Mitton 1993) and is unlikely to be settled until an unambiguous test is devised (H.W. Deng, Y.X. Fu and M. Lynch, unpublished results).
Testing dominance vs. the overdominance hypothesis is important for discerning mechanisms for the maintenance of genetic variability (Crow 1993). Some fairly recent studies tested the (over)dominance hypotheses and inferred nonadditivity of within-locus mutation effects by estimating the average (
The purposes of this study are to (1) develop a new approach to estimate
TWO APPROACHES TO ESTIMATE h̄ AND σ_{h}^{2}
Hypothesis testing and parameter estimation in statistics are two highly related topics. Unbiased and efficient estimation (estimation with a small sampling error) of a parameter generally forms a basis for a powerful test concerning that parameter. For a locus with the two alleles A and a, let the three genotypic values be, respectively:
Mukai's approach: This approach was developed under the assumptions that dominance is the sole mode of within-locus genetic effects, the frequency of the deleterious allele is very small, the population is at Hardy-Weinberg equilibrium, and mutation effects across loci are additive. It approximately estimates
New approach: A wide variety of outcrossing plants and invertebrates are capable of selfing. In such populations, if we self a random sample of genotypes and obtain a number of selfed progeny from each parent to form selfed families, then
SIMULATIONS
The above derivations make a number of assumptions, for example, that the within-locus nonadditive genetic effects are dominant and that genetic effects across loci are additive. However, there is good evidence that genes for fitness or its components usually act multiplicatively (Morton et al. 1956; Crow 1986; Fu and Ritland 1996). To investigate the robustness of the above approaches under overdominance and under the more reasonable mixed modes of dominance and overdominance at different loci, and to investigate their statistical properties, simulations are performed in which fitnesses are assumed to be multiplicative. To concentrate on the robustness of the approaches and on the influence of the genetic process (selfing outcrossed individuals or outcrossing homozygous lines) on the estimation, all genotypic values are assumed to be measured accurately. In reality, this would require that each genotype be clonally replicated and assayed a very large number of times. Measurement error in genotypic values, arising from random environmental and developmental processes and ignored here, would likely inflate the sampling error of estimation but is unlikely to bias the estimation or the comparison of the two approaches under the same sample sizes of genotypes (Deng and Lynch 1996a).
This section is organized according to the presentation of different mutation effects across the genome, starting from the simplest case of constant effects, progressing to biologically more complex and plausible situations. Simulations will be described for each situation, respectively, and some necessary analytical results will be developed.
Constant mutation effects with dominance: Dominance (h_{i}) and selection (s_{i}) coefficients across loci are the same, that is, h_{i} = h, s_{i} = s.
Mukai's approach in outcrossing populations: Assume some random pairs of homozygotes are established from natural populations [such as with a special chromosome construct in Drosophila (Mukai et al. 1972)]. Their fitness is W = (1 − s)^{n}, where n is the number of mutations randomly determined from a Poisson distribution with mean U/(2hs), where U is the genomic mutation rate. This is because the mean number of mutations per genome in an outcrossing population is U/(hs), nearly all mutations are in the heterozygous state (Deng and Lynch 1996a), and the homozygous lines established are expected to carry only about one-half of the total genomic mutations of the outcrossed generation. Because genome size is usually very large and the mutation rate (μ) of each locus is very small, the mutant allele frequency (μ/hs at mutation-selection balance; Crow and Kimura 1970) is also very small. It is unlikely that the homozygotes established have mutations at the same loci (Charlesworth et al. 1990; Deng and Lynch 1996a), as corroborated by our computer simulations (H.W. Deng, unpublished data). Therefore, throughout, for dominant loci under mutation-selection equilibrium, the number of mutations in an outcrossed progeny is the sum of the numbers of mutations of its two parental homozygotes (n_{m} and n_{f}), all in the heterozygous state. Thus, the fitness of an outcrossed progeny is W = (1 − hs)^{n_{m} + n_{f}}. Because
Mukai's approach in selfing populations: Homozygous lines are readily obtainable. Simulations are performed as above, except that n in the fitness function W = (1 − s)^{n} is randomly determined from a Poisson distribution with mean U/(2s) (Charlesworth et al. 1990; Deng and Lynch 1996a).
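The simulation of homozygous line pairs and their outcrossed progeny described above can be sketched as follows. This is a minimal illustration, not the authors' program: the function names, the parameter values (U = 1, h = 0.3, s = 0.03, 200 pairs), and the use of log fitness are assumptions. In particular, the estimator shown (the regression slope of the progeny value on the sum of the two parental values) is the classical regression form associated with Mukai's approach and is assumed here, because the estimating equation itself is not reproduced in this section.

```python
import math
import random

def poisson(mean, rng):
    """Draw one Poisson variate (Knuth's method; adequate for moderate means)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def simulate_pairs(U, h, s, n_pairs, selfing, rng):
    """Return summed parental log fitness and progeny log fitness per pair."""
    # Mean number of homozygous mutations per line, as stated in the text:
    # U/(2s) in selfing populations, U/(2hs) for lines from outcrossers.
    mean_n = U / (2 * s) if selfing else U / (2 * h * s)
    xs, ys = [], []
    for _ in range(n_pairs):
        n_m, n_f = poisson(mean_n, rng), poisson(mean_n, rng)
        # Homozygous parents: W = (1 - s)^n, summed over both parents (log scale).
        xs.append((n_m + n_f) * math.log(1 - s))
        # Outcrossed progeny: all n_m + n_f mutations are heterozygous.
        ys.append((n_m + n_f) * math.log(1 - h * s))
    return xs, ys

def regression_slope(xs, ys):
    """ASSUMED estimator: slope of progeny value on summed parental values."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(1)
xs, ys = simulate_pairs(U=1.0, h=0.3, s=0.03, n_pairs=200, selfing=True, rng=rng)
h_hat = regression_slope(xs, ys)
print(h_hat)  # expected to be close to, and slightly below, h = 0.3
```

With constant effects and no measurement error, the slope equals ln(1 − hs)/ln(1 − s) regardless of the sampled mutation numbers, which is why the estimate sits so close to h in this idealized setting.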
New approach: A number of randomly outcrossed parents are sampled, with each having fitness W = (1 − hs)^{n}, where n is the number of mutations randomly determined from the Poisson distribution with mean U/(hs) (Deng and Lynch 1996a). A number of selfed progeny genotypes for each parent is obtained, with each selfed genotype being determined by allowing the n parental heterozygous loci to segregate randomly into AA, Aa, and aa classes with respective probabilities of 0.25, 0.5, and 0.25. Letting n_{1} and n_{2} (both resulting from random segregation) be the numbers of heterozygous and homozygous loci containing mutations in a selfed offspring, its fitness is W(n_{1},n_{2}) = (1 − hs)^{n_{1}} (1 − s)^{n_{2}}. Equation 3c is used to estimate h. For the new approach in outcrossing populations 20 parents are employed throughout, each having 50 selfed progeny genotypes.
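The selfing step of the new approach can be sketched as follows. The code only generates parental and selfed-progeny fitnesses; it does not implement Equation 3c, which is not reproduced in this section. The function names and the parameter values in the usage lines are illustrative assumptions, and aa is taken to be the class homozygous for the mutation.

```python
import math
import random

def poisson(mean, rng):
    """Draw one Poisson variate (Knuth's method; adequate for moderate means)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def selfed_offspring_fitness(n, h, s, rng):
    """Segregate n heterozygous parental loci and return W(n1, n2)."""
    n1 = n2 = 0
    for _ in range(n):
        u = rng.random()
        if u < 0.25:
            n2 += 1   # aa: homozygous for the mutation
        elif u < 0.75:
            n1 += 1   # Aa: still heterozygous
        # otherwise AA: the mutation is lost at this locus
    return (1 - h * s) ** n1 * (1 - s) ** n2

def simulate_families(U, h, s, n_parents=20, n_progeny=50, rng=None):
    """Return a list of (n, parental fitness, list of selfed progeny fitnesses)."""
    rng = rng or random.Random()
    families = []
    for _ in range(n_parents):
        n = poisson(U / (h * s), rng)          # heterozygous mutations in the parent
        w_parent = (1 - h * s) ** n
        w_selfed = [selfed_offspring_fitness(n, h, s, rng) for _ in range(n_progeny)]
        families.append((n, w_parent, w_selfed))
    return families

families = simulate_families(U=1.0, h=0.3, s=0.03, rng=random.Random(2))
for n, w_parent, w_selfed in families[:3]:
    print(n, round(w_parent, 4), round(sum(w_selfed) / len(w_selfed), 4))
```

Because loci segregate independently, the expected fitness of a selfed offspring from a parent with n heterozygous mutations is [1 − s(0.25 + 0.5h)]^{n}, which gives a quick check on the simulated family means.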
Constant mutation effects with over(under)dominance: Under over(under)dominance, the key assumption for estimating h, that the frequency of one homozygote at any polymorphic locus is low, is unlikely to be valid in outcrossing populations; however, in at least two situations, it may hold and Mukai's approach should be applicable regardless of h value. The first is in highly selfing populations, where overdominance for mutations is unlikely to be responsible for the maintenance of genetic variability (Kimura and Ohta 1971). The second is when lines are obtained by generations of mutation accumulation from homozygous replicate lines. If the mutation rate per locus is low, overdominant mutants are unlikely to achieve high frequencies.
Generally, the distribution of the number of loci per genome with overdominant mutations is not clear. Simulations are performed for different distributions of the number of loci having (over)dominance mutations. The simulation procedures are the same as before for the dominance case, except that the distribution of the number of loci under (over)dominance is different. The results (Table 2) indicate little influence on the estimation by Mukai's approach under very different simulated distributions of the number of loci per genome having (over)dominant alleles. This is not unexpected, because the derivation of the approach is based on the one-locus results and within-family data and extended to multiple loci under additive mutation effects across loci. No assumption was made as to the distribution of the number of loci having (over)dominant alleles per genome. Therefore, as before (i.e., Poisson distribution of the polymorphic loci is used) except that we let the parameter h < 0 (overdominance) or h > 1 (underdominance), simulations are performed for Mukai's approach for selfing populations. For Mukai's approach using mutation accumulation lines, the principle is the same and the simulation and results are similar, and hence not presented.
Mixed dominance and overdominance: Lines from highly selfing populations or derived by mutation accumulations: It is possible that both dominance and overdominance underlie heterosis and that new mutations of either nature can occur in the genome. Some interesting questions are the following: In highly selfing populations, what is the major cause of heterosis? In the genome, what is the major type of new mutations for heterosis? Can we answer these questions from ĥ? Throughout, a circumflex (^) indicates an estimated value.
With both dominant and overdominant loci present, if fitness effects of individual loci are independent, as is the case for the multiplicative fitness function, the heterosis due to dominance and overdominance is independent. Let the mutation rate to deleterious alleles and constant mutational effects under dominance be U_{1}, h_{1}, and s_{1}, respectively. In large highly selfing populations, the expected heterosis (the ratio of the mean fitness of the outcrossed offspring generation W_{o} to that of the homozygous parental generation W_{p}), which is due to dominance (δ_{d}), is then (Charlesworth et al. 1990; Deng and Lynch 1996a)
New mutational occurrence in the genome most likely follows a Poisson distribution, whether it involves dominant or overdominant mutations. Throughout, mutations fitting the (over)dominance hypothesis will be referred to as (over)dominant mutations. In highly selfing populations, mutant alleles will be maintained by mutation-selection balance, regardless of their (over)dominance. This is because within a selfing line, frequent selfing will quickly bring any polymorphic locus into homozygous state. Under obligate selfing, different selfing lines are essentially reproductively isolated from each other, and thus overdominance will not contribute to the maintenance of genetic variability. Hence, as in the dominance case, we assume that the number of loci with overdominant mutants (n) (all in homozygous state) per genome in selfing populations is Poisson distributed with mean
The total heterosis is δ = E(W_{o})/E(W_{p}) = δ_{d}δ_{o}. The contribution to heterosis from dominance relative to overdominance can then be measured by the index
Using Equation 7, we can determine the number of overdominant loci (N_{o}) when dominance and overdominance contribute equally to heterosis
In simulations, the genome contains both dominant and overdominant loci, all at mutation-selection equilibrium. In the parental generation, the number of dominant loci in each individual is sampled from the Poisson distribution of mean U/(2s_{1}) (Charlesworth et al. 1990; Deng and Lynch 1996a), and that for the overdominant loci is determined from a Poisson distribution with mean
Homozygous lines constructed from outcrossing populations: In outcrossing populations, with overdominance the key assumption that the rarer allele at any polymorphic locus is of a low frequency is often invalid. Thus, neither Mukai's approach nor our new approach should be used. However, there has been some practice and data on estimating
Let U_{1}, h_{1}, s_{1}, h_{2}, s_{2}, δ_{d}, δ_{o}, α, and N_{o} be defined as before. In large outcrossing populations, upon selfing (Deng and Lynch 1996a)
In the simulations, the homozygous lines constructed from large populations at mutation-selection equilibrium contain both dominant and overdominant loci. The number of loci homozygous for deleterious alleles (under the dominance hypothesis) is determined from a Poisson distribution of mean U/(2h_{1}s_{1}), as explained before. The alleles at the n overdominant loci are determined from random uniform variables (ξs) (from 0 to 1), with allele A being chosen if ξ ≤ (h_{2} − 1)/(2h_{2} − 1) [where (h_{2} − 1)/(2h_{2} − 1) is the equilibrium frequency of the A allele (Crow 1986)] and allele a chosen otherwise. This simulation procedure allows identity by state for mutant alleles at the overdominant loci in different inbred lines. In outcrossed progeny, the number of loci (all in heterozygous state) having dominant mutations is the sum of the numbers of mutations of its two homozygous parents; the genotypes at the n overdominant loci are determined by the overdominant alleles at these loci in its two homozygous parents. Other aspects being the same as before, simulations for Mukai's approach are then performed.
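The allele-sampling rule for the overdominant loci can be sketched as follows. The function names are illustrative assumptions; only the threshold (h_{2} − 1)/(2h_{2} − 1) and the uniform-variate rule are taken from the description above.

```python
import random

def equilibrium_freq_A(h2):
    """Equilibrium frequency of allele A at an overdominant locus (h2 < 0)."""
    return (h2 - 1) / (2 * h2 - 1)

def sample_homozygous_line(n_loci, h2, rng):
    """Fix allele A or a at each overdominant locus of one homozygous line."""
    p_A = equilibrium_freq_A(h2)
    # Allele A is chosen when the uniform variate does not exceed p_A.
    return ["A" if rng.random() <= p_A else "a" for _ in range(n_loci)]

def progeny_genotypes(line_1, line_2):
    """Genotypes of an outcrossed progeny from two homozygous parental lines."""
    return [a1 + a2 for a1, a2 in zip(line_1, line_2)]

rng = random.Random(0)
p_A = equilibrium_freq_A(-0.1)            # about 0.917 for h2 = -0.1
mother = sample_homozygous_line(200, -0.1, rng)
father = sample_homozygous_line(200, -0.1, rng)
offspring = progeny_genotypes(mother, father)
print(round(p_A, 3), offspring.count("Aa") + offspring.count("aA"))
```

Because both parents draw from the same allele frequencies, two lines can carry the same allele at a locus, which is the identity by state that the procedure is meant to allow.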
Variable mutation effects under dominance: Deleterious mutation effects across loci (s_{i} and h_{i}) are not constant. The few available data suggest that s_{i} has a roughly exponential distribution (Gregory 1965; Mackay et al. 1992; Keightley 1994). As in Deng and Lynch (1996a), we use the exponential distribution to model s_{i}
To evaluate how Equations 1b and 4b perform for estimating
Because the estimates are usually biased under the variable mutation effects, we compute their MSE (mean square error) for comparison:
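As a reminder of how bias and sampling variance combine in this criterion, the standard definition of the mean square error can be sketched as follows. This is the textbook definition and is assumed, not confirmed, to match the formula used here; the function name and the example numbers are illustrative only.

```python
def mse_components(estimates, true_value):
    """Return (bias, sampling variance, MSE) for replicate estimates."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - true_value
    # Variance around the mean estimate (divisor n, so MSE = bias^2 + variance).
    variance = sum((e - mean_est) ** 2 for e in estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return bias, variance, mse

# Hypothetical replicate estimates of a parameter whose true value is 0.3.
bias, variance, mse = mse_components([0.25, 0.28, 0.31, 0.24], 0.3)
print(bias, variance, mse)
```

An estimator with a small bias but a large sampling variance can therefore have a larger MSE than a more biased but more precise one, which is why MSE is the more useful basis for comparison when estimates are biased.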
The effects of lethals: The above study for variable mutation effects assumes that the genome contains no lethal mutations. This is a good assumption for selfing populations, where lethal mutations cannot survive for more than a few generations, due to frequent exposure to selection in homozygous state. In outcrossing populations, this assumption does not hold (Simmons and Crow 1977; Crow and Simmons 1983), as lethals are usually shielded from selection in heterozygous state by their low degree of dominance. With Mukai's approach, lethals will not appear in final homozygous lines constructed. During generations of inbreeding to construct homozygotes, lines homozygous for lethals will be immediately lost. Therefore, Mukai's approach gives the estimates only for mildly deleterious mutations. Hence, we only evaluate our new approach under variable mutation effects with lethals in outcrossing populations.
Lethals (s_{L} = 1) compose approximately 1% of the genomic mutations, and h_{L} for lethals is estimated to be about 0.02 (Simmons and Crow 1977; Crow and Simmons 1983). The simulations in outcrossing populations with lethals and variable mutation effects are identical in all respects to those in the previous section, but with an additional low genomic mutation rate (1%) to lethals (s_{L} = 1, h_{L} = 0.02). In simulations, selfed offspring homozygous for lethals will be excluded from analysis, as is most likely the practice in actual experiments. The selfed family means are for those offspring that do not contain homozygous lethals.
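The exclusion of selfed offspring that are homozygous for lethals can be sketched as follows. It is assumed here that each heterozygous lethal locus in the parent segregates independently and becomes homozygous in a selfed offspring with probability 0.25, in line with the segregation rule used for the other loci; the function name and the example numbers are illustrative.

```python
import random

def count_retained_offspring(n_lethal_het, n_progeny, rng):
    """Count selfed offspring that are not homozygous for any lethal."""
    kept = 0
    for _ in range(n_progeny):
        # An offspring is excluded if any parental lethal locus becomes homozygous.
        homozygous_lethal = any(rng.random() < 0.25 for _ in range(n_lethal_het))
        if not homozygous_lethal:
            kept += 1
    return kept

rng = random.Random(0)
# A hypothetical parent heterozygous for two lethals, with 50 selfed progeny.
kept = count_retained_offspring(n_lethal_het=2, n_progeny=50, rng=rng)
print(kept, "of 50 offspring retained; expected fraction is 0.75**2")
```

Under these assumptions a parent carrying k heterozygous lethals retains a fraction 0.75^{k} of its selfed offspring on average, so family means are computed from fewer progeny as k increases.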
RESULTS
Constant mutation effects with either dominance or over(under)dominance (Table 3): The bias and sampling variance of estimates by both approaches are very small, especially for Mukai's approach. The very small bias arises because the logarithmic transformation of the multiplicative fitness function is used to approximate the additive fitness function assumed by the derivations (Deng and Fu 1997). In selfing and outcrossing populations, Mukai's approach yields nearly identical estimates, with no detectable difference in the sampling errors. The new approach has slightly larger bias and larger sampling variance. This is partly because the estimate is subject to more sampling error, as the mean of each selfed family is estimated from a limited number of progeny. In highly selfing populations or in lines from mutation accumulations where the key assumption holds, Mukai's approach can estimate h accurately under over(under)dominance.
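The size of the approximation error introduced by the logarithmic transformation can be illustrated with a short expansion. This is a sketch for the constant-effects case only, using the fitness function W(n_{1},n_{2}) from the simulations; the interpretation of the ratio in the second line as the quantity recovered by a regression on log fitness is an assumption, not a result taken from the derivations.

```latex
\ln W(n_1, n_2) = n_1 \ln(1 - hs) + n_2 \ln(1 - s)
               \approx -\,(n_1 h s + n_2 s), \qquad s \ll 1,
\\[4pt]
\frac{\ln(1 - hs)}{\ln(1 - s)}
  = h \left[ 1 - \frac{s\,(1 - h)}{2} + O(s^{2}) \right].
```

For example, with h = 0.3 and s = 0.03 the ratio is about 0.297, roughly 1% below h, which is of the same small order as the bias discussed above.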
Mixed dominance and overdominance: Lines from highly selfing populations or derived by mutation accumulations (Table 4): When the contributions to heterosis from dominance and overdominance are about the same (α ≈ 1), the ĥs are always positive with small sampling errors (Table 4), which favors the dominance hypothesis. Only with relatively large overdominance effects (h < −0.1) and overwhelming contributions from overdominant loci (α ~ 0.05), is ĥ < 0. Simulation results not shown here indicate similar conclusions for lines from mutation accumulations.
Homozygous lines constructed from outcrossing populations (Table 5): With α ≈ 1, ĥs are always positive with small sampling errors. Unlike estimates from selfing populations or mutation accumulation where the key assumption holds, ĥ is always positive when employing Mukai's approach in outcrossing populations. Even with pure overdominant genetic effects, ĥ is almost always greater than 0. For example, if the genome contains only 200 overdominant loci with h_{2} = −0.1 and s_{2} = 0.03, the equilibrium frequency is 0.917 for allele A and 0.083 for allele a. Applying Mukai's approach, we obtain ĥ = 5.84E-4 (1 SD = 0.05738). This is because the key assumption of the approach, that rarer alleles at all loci are of low frequencies, is violated at overdominant loci.
In summary, with mixed dominance and overdominance jointly causing heterosis, Mukai's approach cannot be employed to distinguish dominance vs. overdominance. On the other hand, it is encouraging to see that the presence of overdominant loci does not greatly bias the estimation of h for the dominant alleles. Even with α ≈ 1 (i.e., equal contribution of dominance and overdominance to heterosis), ĥ is about 70% of the true h value for the dominant alleles (Table 5). Therefore, the ĥs estimated by Mukai's approach in outcrossing populations such as Drosophila (e.g., Mukai et al. 1972; Mukai and Yamaguchi 1974; Eanes et al. 1985) may represent values that are underestimated (but not to a great extent) for dominant loci. These results may be better understood if we recall that, by derivation (Mukai et al. 1972), Mukai's method estimates an average of h_{i}s at individual loci weighted by the genetic variance of the homozygotes. This can lead to very peculiar results, as reflected by the simulation results here. In an extreme case involving loci of symmetrical overdominance (s_{2} = 0), even though the loci contribute to inbreeding depression, they are assigned a zero weight in estimating the average of h_{i}s across loci. This reflects the fact that the contributions to inbreeding depression and to the estimation of the average dominance coefficient from individual loci are not exactly the same. Therefore, if dominant and overdominant loci coexist in the genome, inferring which is the major cause of heterosis from the sign of ĥ is invalid.
Variable mutation effects under dominance (Table 6): Under variable mutation effects without lethals,
A similar bias pattern is observed for
The biases may have come from at least two sources: (1) The logarithmic transformation of the multiplicative fitness function is employed to approximate the additive fitness function (Equation 5). (2) The definitions of the estimates of
The effects of lethals in outcrossing populations (Table 7):
DISCUSSION
In this study, we develop a new approach to estimate
We developed a methodology in natural outcrossing and selfing populations to estimate
We concentrate on studying the most plausible multiplicative mutation effects. Although epistatic mutation effects have been speculated and may be possible, their detection is a very difficult empirical problem, and little convincing information exists on the subject. We therefore do not study their effects here. The effects of synergistic mutation have been investigated for estimating genomic mutation rate U (Charlesworth et al. 1990) and U,
Due to the lack of knowledge of the statistical properties, it has been a practice (even until fairly recently) to discriminate dominance and overdominance hypotheses by estimated
We concentrate on studying the estimation of
Our results indicate that estimating
When implementing the two methods here, some practical issues need to be considered. The discussion of these practical issues cannot possibly be exhaustive here because different situations have their peculiar practical problems. An important common problem is the intergenerational environmental change. In selfing populations, homozygous parental genotypes can be cloned by further selfing, so that parents and outcrossed progeny can be assayed side by side with a randomized design in a single environment. In outcrossing populations where cloning of genotypes is possible, such as in cyclical parthenogens (Deng 1995), parents and selfed progeny can also be assayed in the same environment. Cloning of genotypes can essentially eliminate the problem of intergenerational environmental change (Deng 1995, 1997). In outcrossing populations where cloning of genotypes is not possible, the problem can be minimized by making the assay environments of the parent and progeny as similar as possible; additionally, large controls can be raised so that the values of the parents and offspring can be adjusted by the controls to be comparable.
Estimating
Acknowledgments
We thank Drs. D. Charlesworth, D. Houle, M. Lynch, M. Uyenoyama and two anonymous reviewers for very helpful comments on the manuscript. H.W. Deng thanks Dr. M. Lynch for years of advice and Dr. D. Hedgecock for providing support to attend the conference “The Genetic and Physiological Bases of Heterosis,” which greatly benefited the development of this work. The work was partially supported by a FIRST AWARD from the National Institutes of Health to Y.X. Fu. H.W. Deng was also supported by a Health Future Foundation grant to Dr. R. Recker when preparing this article.
Footnotes

Communicating editor: M. K. Uyenoyama
 Received June 26, 1997.
 Accepted December 18, 1997.
 Copyright © 1998 by the Genetics Society of America