Genetics, Vol. 151, 895-913, February 1999, Copyright © 1999

The Effect of Overdominance on Characterizing Deleterious Mutations in Large Natural Populations

Jin-Long Li, Jian Li, and Hong-Wen Deng
Osteoporosis Research Center and Department of Biomedical Sciences, Creighton University, Omaha, Nebraska 68131

Corresponding author: Hong-Wen Deng, Osteoporosis Research Center, Creighton University, 601 N. 30th St., Suite 6787, Omaha, NE 68131. E-mail: deng@creighton.edu

Communicating editor: M. A. ASMUSSEN


ABSTRACT

Alternatives to the mutation-accumulation approach have been developed to characterize deleterious genomic mutations. However, they all depend on the assumption that the standing genetic variation in natural populations is solely due to mutation-selection (M-S) balance and therefore that overdominance does not contribute to heterosis. Despite tremendous efforts, the extent to which this assumption is valid is unknown. With different degrees of violation of the M-S balance assumption in large equilibrium populations, we investigated the statistical properties and the robustness of these alternative methods in the presence of overdominance. We found that for dominant mutations, estimates for U (genomic mutation rate) will be biased upward and those for h̄ (mean dominance coefficient) and s̄ (mean selection coefficient) biased downward when additional overdominant mutations are present. However, the degree of bias is generally moderate and depends largely on the magnitude of the contribution of overdominant mutations to heterosis or genetic variation. This renders the estimates of U and not always biased under variable mutation effects that, when working alone, cause U and to be underestimated. The contributions to heterosis and genetic variation from overdominant mutations are monotonically related but not linearly proportional to each other. Our results not only provide a basis for the correct inference of deleterious mutation parameters from natural populations, but also alleviate the biggest concern in applying the new approaches, thus paving the way for reliably estimating properties of deleterious mutations.


THE genome of any organism is subject to continuous bombardment of mutations, the majority of which are deleterious. Numerous theories based on the assumptions of deleterious genomic mutations have been developed to explain some fundamental phenomena in biology. These phenomena include (but are not limited to) the evolution of sex and recombination (MULLER 1964; KONDRASHOV 1985, 1988; CHARLESWORTH 1990), mate choice (KIRKPATRICK and RYAN 1991), diploidy (KONDRASHOV and CROW 1991), and out-breeding mechanisms (CHARLESWORTH and CHARLESWORTH 1987). Theories also indicate that the parameters of deleterious genomic mutations determine the mutation load in populations at equilibrium (HALDANE 1937; KIMURA et al. 1963; BÜRGER and HOFBAUER 1994), the role of deleterious mutations in the extinction of small populations (LANDE 1994; LYNCH et al. 1995, 1996), the rate of input of genetic variance from deleterious mutations per generation (DENG and LYNCH 1996, 1997), and the extent to which neutral molecular variation is reduced due to background selection (B. CHARLESWORTH et al. 1993; D. CHARLESWORTH et al. 1995; HUDSON and KAPLAN 1995). The validity of all these theories critically depends on the parameters of deleterious mutations.

For the rest of the Introduction, the following definitions and distinctions between dominance and overdominance are in order. For a locus with alleles A and a, let the three genotypic values of fitness be

$$w_{AA} = 1, \qquad w_{Aa} = 1 - hs, \qquad w_{aa} = 1 - s.$$

Here, h is the dominance coefficient, where h < 0 implies overdominance, h = 0.5 implies additivity, and 0 ≤ h ≤ 1.0 (h ≠ 0.5) implies dominance. Note that we use "dominant" or "dominance" to refer to cases of both complete and partial dominance. Mutations with (over)dominant effects are referred to as (over)dominant mutations. Deleterious genomic mutations generally refer to dominant mutations. Dominance is compatible with mutation-selection (M-S) balance; overdominance essentially encompasses all kinds of balancing selection at the allelic level (DENG et al. 1998A).
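The classification by h can be made concrete with a small sketch. It assumes the common scaling in which genotypes AA, Aa, and aa have relative fitnesses 1, 1 - hs, and 1 - s; the function names are illustrative and are not taken from the article.

```python
def genotype_fitness(h, s):
    """Relative fitness of AA, Aa, aa, assuming the scaling 1, 1 - hs, 1 - s."""
    return {"AA": 1.0, "Aa": 1.0 - h * s, "aa": 1.0 - s}


def classify(h):
    """Label a dominance coefficient using the ranges given in the text."""
    if h < 0:
        return "overdominance"
    if h == 0.5:
        return "additivity"
    if 0 <= h <= 1.0:
        return "dominance"
    return "outside the ranges considered"
```

With h = -0.1 and s = 0.03, the heterozygote has fitness 1.003 under this scaling and so exceeds both homozygotes, which is the situation h < 0 is meant to capture.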

Three essential parameters of deleterious genomic mutations are (1) the genomic mutation rate (U), (2) the mean selection coefficient (s̄), and (3) the mean dominance coefficient (h̄). Three approaches are now available for estimating these parameters:

  1. The mutation-accumulation (M-A) approach (BATEMAN 1959; MUKAI 1964; MUKAI et al. 1972): This technique estimates U and s. Most estimates have come from this approach applied to Drosophila melanogaster (MUKAI 1979; CROW and SIMMONS 1983; KEIGHTLEY 1994, 1996) and have been very hard to acquire, requiring large and long-term M-A experiments and special chromosomal constructs or inbred/asexual lines. The data from M-A can also be analyzed by the maximum-likelihood method (KEIGHTLEY 1994) or the minimum-distance method (GARCIA-DORADO 1997).

  2. The inbreeding depression approach (MORTON et al. 1956; CHARLESWORTH et al. 1990): This technique requires a value of h that must be assumed or that cannot be estimated without bias (CABALLERO et al. 1997; DENG and FU 1998; DENG 1998A), so per se it estimates U only. In the highly selfing annual plants Leavenworthia (CHARLESWORTH et al. 1994) and Amsinckia (JOHNSTON and SCHOEN 1995), U estimates from this approach are in line with earlier ones from M-A in Drosophila, suggesting high deleterious genomic mutation rates.

  3. The fitness moments approach (DENG and LYNCH 1996, 1997; DENG 1998B): This approach estimates U, h, and s. For two outcrossing species of cyclical parthenogenetic Daphnia (a freshwater microcrustacean), preliminary estimates by this approach generally agree with earlier ones from other species (DENG and LYNCH 1997) and those from the direct M-A approach in Daphnia (LYNCH 1985; LYNCH et al. 1998).

The last two approaches depend on the change in mean (and genetic variance) of fitness traits upon only one generation of mating in large selfing or outcrossing populations. In comparison, the first approach is much more time-consuming and requires many generations of M-A. None of the current experimental designs and statistical methods can estimate mutation parameters without bias. Under a number of biologically plausible conditions, the statistical properties of the above three approaches were compared (DENG and FU 1998). We found that, generally speaking, the third approach has the best statistical properties as reflected by the minimum mean square error (MSE). MSE is a composite index of both bias and sampling error for biased estimates.
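The sense in which MSE combines bias and sampling error can be illustrated with the standard decomposition MSE = bias² + variance over replicate estimates. The sketch below is only an illustration of that identity; the function name and the example numbers are hypothetical.

```python
from statistics import fmean, pvariance


def bias_variance_mse(estimates, true_value):
    """Return (bias, sampling variance, MSE) of replicate estimates of one parameter."""
    estimates = list(estimates)
    mean_estimate = fmean(estimates)
    bias = mean_estimate - true_value
    variance = pvariance(estimates, mu=mean_estimate)  # spread around the mean estimate
    mse = fmean([(e - true_value) ** 2 for e in estimates])  # spread around the truth
    return bias, variance, mse
```

For any set of replicates the returned MSE equals bias squared plus the sampling variance, so an estimator with a small bias but a large sampling variance can have a larger MSE than a more biased but more stable one.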

An essential assumption common to the last two approaches is that all the genetic variation in the study population is maintained under M-S equilibrium. Accordingly, changes in the mean and genetic variance of fitness (or its components) upon inbreeding or outcrossing are solely due to deleterious dominant mutations maintained by M-S balance. Even in large populations, despite tremendous efforts (e.g., HOULE 1989, 1994; HOULE et al. 1996; CHARLESWORTH and HUGHES 1998; DENG 1998A; DENG et al. 1998A), the validity of this assumption is unknown. In large populations, alternatives to M-S balance, such as functional overdominance or overdominance induced by fluctuating selection, can in principle maintain polymorphisms, although no strong case has emerged for their generality (DENG and LYNCH 1996).

The robustness of the approaches applied to natural populations has been investigated under a range of biologically plausible conditions, such as variable and/or epistatic mutation effects (CHARLESWORTH et al. 1990; DENG and LYNCH 1996, 1997; DENG and FU 1998; DENG 1998B). Generally speaking, U and are underestimated and is overestimated. The direction and the magnitude of the bias revealed may provide a numerical basis for the close inference of deleterious genomic mutations. However, estimation under violation of the M-S balance assumption has never been investigated. It is intuitive that violation of the M-S balance assumption will result in biased estimates (DRAKE et al. 1998). However, a critical issue is, What are the statistical properties (the degree of bias and sampling variance, especially the bias) under different degrees of violation of the M-S balance assumption?

The M-S balance assumption can be violated in several scenarios, such as in small populations subject to random genetic drift or in large populations subject to balancing selection due to functional overdominance and/or fluctuating selection at the allelic level. Each scenario deserves careful consideration and thus separate treatment. The two approaches applicable to natural populations were originally devised for large populations at approximate equilibrium. Hence, we investigate estimation in large natural populations with genetic variance maintained by either M-S balance or balancing selection, and with inbreeding depression caused by either dominant or overdominant mutations. The study is conducted by computer simulations using algorithms we devised previously (DENG 1998A) and those we devise here. Other scenarios will be fully investigated in future studies by employing iterative algorithms (e.g., LYNCH et al. 1995, 1996) to construct populations (in linkage disequilibrium) of various finite sizes.

The experimental designs to characterize deleterious genomic mutations differ depending on the study population's mating type (MORTON et al. 1956; CHARLESWORTH et al. 1990; DENG and LYNCH 1996). In outcrossing populations, the outcrossed parents from natural populations are selfed to obtain selfed progeny. In selfing populations, selfed parents from natural populations are outcrossed to obtain outcrossed progeny.

In this article, we first outline the simulations and develop the associated analytical derivations in outcrossing and selfing populations. Then we present the simulation results in these two types of populations for the fitness moments approach and the inbreeding depression approach for both constant and variable mutation effects. Finally, we discuss the implications of our current results for characterizing deleterious genomic mutations from natural populations.


SIMULATIONS

The direction and the magnitude of the bias under balancing selection with overdominance are of particular interest to geneticists. To focus on this, we assume that genotypic values are measured accurately. In reality, this would require that each genotype be clonally replicated and assayed a large number of times. Ignoring measurement error for genotypic values reduces the sampling error of estimates, but is unlikely to bias either the estimation or the comparison of the techniques, assuming that the same number of genotypes would be handled experimentally. This is supported by our previous investigations (DENG and LYNCH 1996; DENG and FU 1998; DENG et al. 1998B). In outcrossing populations, inbreeding (such as sib mating) experiments can be performed for estimation, and selfing is not required (DENG 1998B). To apply the fitness moments approach (DENG and LYNCH 1996, 1997), we found (DENG 1998B) that for a given sample size, sampling one selfed progeny is generally more efficient than sampling more selfed progenies from each selfing family. Therefore, for outcrossing populations, selfing experiments in which only one selfed progeny is sampled from each parent are simulated for applying the fitness moments approach.

Large outcrossing populations at equilibrium are constructed with some dominant loci maintained under M-S balance and other overdominant loci maintained by balancing selection. In large selfing populations, overdominance does not contribute to the maintenance of genetic variability (because of constant exposure to the homozygous state under selfing), and mutations of overdominant effects are also maintained by M-S balance (CHARLESWORTH et al. 1990; DENG 1998A). For both outcrossing and selfing populations, we study first constant, then variable, mutation effects for dominant mutations. For overdominant mutations, we assume that their effects are constant across loci. This treatment may be at least partially justified by the facts that (1) no theoretical or empirical evidence bearing on the genetic effects across overdominant loci exists and (2) what concerns geneticists most is the estimation under different contributions of overdominant loci to heterosis and standing genetic variation in populations, irrespective of their constant or variable effects. Here, heterosis will refer both to inbreeding depression in outcrossing populations and to outbreeding enhancement in inbred populations. The investigation of the methods under their respective assumptions with constant fitness effects can serve as a starting point for comparison with more realistic situations investigated later in this and future studies.

Mutation effects on fitness across all loci are assumed to be multiplicative throughout, an assumption that is biologically plausible (MORTON et al. 1956; CROW 1986; CRADDOCK et al. 1995; FU and RITLAND 1996) and assumed in the original development of the approaches applied to natural populations (MORTON et al. 1956; CHARLESWORTH et al. 1990; DENG and LYNCH 1996). Simulations and algorithms are outlined or developed for outcrossing populations and for selfing populations in the following sections.

Outcrossing populations:
Loci of constant dominant mutation effects mixed with overdominant loci: At dominant loci at M-S balance, the number of mutations per individual (after selection, all in the heterozygous state) is Poisson distributed with an expectation of n̄ = U/(hs) (DENG and LYNCH 1996). The population is assumed to be random mating and at linkage equilibrium. Throughout, h and s generally refer to the dominance and selection coefficients of deleterious genomic mutations. In each situation, simulations are performed for different sets of parameters. For each parameter set, K individuals are randomly sampled from both the outcrossed parental and selfed progeny generations (DENG and LYNCH 1996; DENG 1998B). Unless otherwise specified, K = 200 for outcrossing populations. The total number of genotypes employed in an experiment for the fitness moments approach in outcrossing populations is then 400. For a genotype with n dominant mutations (randomly determined from the Poisson distribution) from the outcrossed parental generation, the fitness is

$$W = W_{\max} \prod_{i=1}^{n} (1 - h_i s_i),$$

where hi and si are the dominance and selection coefficients of the ith locus with mutations. They are assumed to be constant initially and made variable later. Wmax is the fitness of a genotype that is free of segregating deleterious genomic mutations in the experimental environment where the measurements are taken. This parameter serves as a scaling factor so that fitness can be on any scale instead of from 0 to 1.

For a genotype sampled from the selfed progeny generation, the fitness is

$$W = W_{\max} \prod_{i=1}^{n_1} (1 - h_i s_i) \prod_{j=1}^{n_2} (1 - s_j),$$

where n1 and n2 are, respectively, the numbers of loci with mutations in heterozygous and homozygous states. n1 and n2 are determined from two levels of random sampling: (1) A number (n) of loci is randomly determined from the Poisson distribution with mean n̄ = U/(hs). (2) Each of these n loci has a probability of 1/4 in the selfed progeny of being homozygous for the normal A allele, a probability of 1/2 of being a heterozygote Aa, and a probability of 1/4 of being a homozygote aa. After the genotypic status of each locus is determined as above, n1 and n2 are, respectively, the numbers of loci heterozygous and homozygous for mutations.
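The two levels of sampling described above can be sketched as follows, for constant effects h and s. This is a minimal illustration rather than the authors' program: it assumes a Poisson mean of U/(hs) heterozygous mutations per parent and multiplicative fitness, uses Knuth's multiplication method as a stand-in Poisson sampler (the standard library has none), and all function names are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Knuth's multiplication sampler; adequate for the moderate means used here."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def parent_fitness(rng, U, h, s, w_max=1.0):
    """Outcrossed parent: n heterozygous mutations, n ~ Poisson(U / (h s))."""
    n = poisson(rng, U / (h * s))
    return w_max * (1.0 - h * s) ** n


def selfed_progeny_fitness(rng, U, h, s, w_max=1.0):
    """Selfed progeny: each parental mutation segregates 1/4 AA, 1/2 Aa, 1/4 aa."""
    n = poisson(rng, U / (h * s))
    n1 = n2 = 0
    for _ in range(n):
        r = rng.random()
        if r < 0.25:
            continue        # AA: mutation lost
        elif r < 0.75:
            n1 += 1         # Aa: heterozygous
        else:
            n2 += 1         # aa: homozygous
    return w_max * (1.0 - h * s) ** n1 * (1.0 - s) ** n2
```

Under these assumptions the expected parental fitness is W_max·exp(-U) and the expected selfed-progeny fitness is W_max·exp(-U/2 - U/(4h)), which gives a quick check on simulated samples.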

Now consider the overall individual fitness with N polymorphic overdominant loci in the genome in addition to those dominant loci at M-S balance. At an overdominant locus with effect ho < 0 and so in large populations, the equilibrium frequency of the more fit allele B is p = (1 - ho)/(1 - 2ho) (CROW 1986) and that of the less fit allele b is q = -ho/(1 - 2ho). With N such additional overdominant polymorphic loci in the population, the overall fitness of a random parental individual now becomes

$$W = \left[ W_{\max} \prod_{i=1}^{n} (1 - h_i s_i) \right] (1 - h_o s_o)^{n_3} (1 - s_o)^{n_4},$$

where n3 and n4 are, respectively, the numbers of overdominant loci with genotypes Bb and bb in the genome of this individual, and n is defined earlier for the dominant loci. n3 and n4 are determined by random sampling, N times, from the trinomial distribution, with genotypes Bb and bb having the frequencies of 2pq and q^2, respectively. Recall that the population is assumed to be randomly mating. Different N values are assumed in simulations.

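Upon selfing, the overall fitness of a selfed progeny whose parent has n3 overdominant loci with the Bb genotype and n4 overdominant loci with the bb genotype is

$$W = \left[ W_{\max} \prod_{i=1}^{n_1} (1 - h_i s_i) \prod_{j=1}^{n_2} (1 - s_j) \right] (1 - h_o s_o)^{n_5} (1 - s_o)^{n_4 + n_6},$$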

where the term in the brackets has been explained earlier for the dominant loci. n5 and n6 are the number of loci in the selfed progeny heterozygous (Bb) and homozygous (bb) at the n3 heterozygous overdominant loci of the parent. They are determined by random segregation during selfing of the parent, in the same manner as n1 and n2.
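The sampling at the overdominant loci can be sketched in the same spirit. The equilibrium frequencies below are derived from the standard overdominance equilibrium under the assumed fitness scaling 1, 1 - ho·so, 1 - so for BB, Bb, bb; that scaling, the resulting formula for p, and all function names are assumptions of this sketch.

```python
import random


def equilibrium_freqs(h_o):
    """Equilibrium frequencies (p for B, q for b), assuming fitnesses 1, 1 - h_o*s_o, 1 - s_o."""
    p = (1.0 - h_o) / (1.0 - 2.0 * h_o)
    return p, 1.0 - p


def parent_overdominant_counts(rng, N, h_o):
    """Draw n3 (Bb loci) and n4 (bb loci) over N loci at Hardy-Weinberg proportions."""
    p, q = equilibrium_freqs(h_o)
    n3 = n4 = 0
    for _ in range(N):
        r = rng.random()
        if r < 2.0 * p * q:
            n3 += 1
        elif r < 2.0 * p * q + q * q:
            n4 += 1
    return n3, n4


def selfed_overdominant_counts(rng, n3):
    """Segregate the parent's n3 Bb loci: 1/2 stay Bb (n5), 1/4 become bb (n6)."""
    n5 = n6 = 0
    for _ in range(n3):
        r = rng.random()
        if r < 0.5:
            n5 += 1
        elif r < 0.75:
            n6 += 1
    return n5, n6


def overdominant_factor(n_het, n_hom, h_o, s_o):
    """Multiplicative fitness factor contributed by the overdominant loci."""
    return (1.0 - h_o * s_o) ** n_het * (1.0 - s_o) ** n_hom
```

For a parent the factor is `overdominant_factor(n3, n4, h_o, s_o)`; for its selfed progeny it is `overdominant_factor(n5, n4 + n6, h_o, s_o)`, because loci that were bb in the parent remain bb after selfing.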

Once the desired samples of K individuals from the parent and selfed progeny generations are simulated, we estimate the parameters of deleterious genomic mutations on the basis of the assumption of pure dominant mutations maintained under M-S balance (DENG and LYNCH 1997). Let W̄o and σ²o denote the mean and genetic variance of fitness in the outcrossed parental generation, respectively; and let W̄s and σ²s denote the corresponding values among the selfed progeny generation, respectively. These can be computed easily from simulated data (DENG 1998B). We define x, y, and z as follows:

(1)

Then

(2a)

(2b)

(2c)
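To show how moment-based estimators of this kind can arise, the sketch below works through one derivation under the pure-dominance model with constant effects (Poisson number of heterozygous mutations with mean U/(hs), multiplicative fitness, one generation of selfing). The definitions of x, y, and z used here are assumptions of the sketch and may differ in form or sign from Equations 1 and 2a–2c; the function names are hypothetical.

```python
import math
from statistics import fmean, pvariance


def log_moments(w_outcrossed, w_selfed):
    """x = ln(mean_o / mean_s); y, z = ln(1 + variance / mean^2) in each generation."""
    mean_o, mean_s = fmean(w_outcrossed), fmean(w_selfed)
    x = math.log(mean_o / mean_s)
    y = math.log(1.0 + pvariance(w_outcrossed, mu=mean_o) / mean_o ** 2)
    z = math.log(1.0 + pvariance(w_selfed, mu=mean_s) / mean_s ** 2)
    return x, y, z


def expected_log_moments(U, h, s):
    """Expected x, y, z implied by the assumed model with constant h and s."""
    x = U * (1.0 - 2.0 * h) / (4.0 * h)
    y = U * h * s
    z = U * (h * s / 2.0 + s / (4.0 * h))
    return x, y, z


def estimate_U_h_s(x, y, z):
    """Invert the three expectations above for U, h, and s."""
    h = math.sqrt(y / (4.0 * z - 2.0 * y))
    U = 4.0 * h * x / (1.0 - 2.0 * h)
    s = y / (U * h)
    return U, h, s
```

Feeding `expected_log_moments(1.0, 0.36, 0.03)` into `estimate_U_h_s` returns the input parameters, which is the sense in which such estimators are unbiased when only dominant mutations with constant effects are present. Overdominant loci change x, y, and z without changing the inversion, and this is where the bias studied here comes from.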

If a value of h is assumed from external knowledge or estimated by other experimental designs and estimation methods, U can then be estimated from the change in the mean fitness upon selfing by Equation 2b (MORTON et al. 1956). The method of DENG (1998A) is employed to estimate h. Unlike Mukai's method (MUKAI et al. 1972), Deng's method does not require construction of homozygous lines. When applied to outcrossing populations, it can achieve about the same quality of estimation as Mukai's method. The data needed are the genotypic value of the parent fitness (w), and the mean genotypic fitness value (z) of the multiple selfed progeny within each selfed family. Let t = 4z - 2w. Then

(3)

For investigating the inbreeding depression approach in outcrossing populations, a set of M selfing families, each having S selfed progeny, is simulated as above to estimate h (Equation 3). This set of simulated selfing families and another set of L selfing families (each with one selfing parent and one selfed offspring) are employed to estimate inbreeding depression and then U (Equation 2b). Unless otherwise specified, M = 10, S = 40, and L = 20. This choice of parameters is shown to be efficient on the basis of our previous investigation (DENG 1998A; DENG and FU 1998). The total number of genotypes employed in an experiment is then 450.
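The genotype totals quoted for the two designs can be reproduced with a short calculation. The accounting below (each family contributes its parent plus its progeny) is an interpretation that matches the stated totals of 400 and 450; the function names are hypothetical.

```python
def fitness_moments_total(K):
    """K outcrossed parents plus K selfed progeny."""
    return 2 * K


def inbreeding_depression_total(M, S, L):
    """M families of one parent and S progeny, plus L families of one parent and one progeny."""
    return M * (1 + S) + L * 2
```

With K = 200 the first design uses 400 genotypes, and with M = 10, S = 40, and L = 20 the second uses 450, so the inbreeding depression approach is given slightly more genotypes in the comparison.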

Two aspects of overdominant mutations concern geneticists most and are directly relevant to characterizing deleterious genomic mutations from natural populations. One is the contribution of dominant mutations to heterosis (the mean fitness of the outcrossed generation relative to that of the inbred generation) compared with that of overdominant mutations. The other is the magnitude of genetic variation due to dominant mutations maintained under M-S balance relative to that due to overdominant mutations maintained by balancing selection.

The contribution to the total heterosis upon selfing from the dominant mutations can be measured by the index

(4)
where wdp and wdo are, respectively, the fitness in the parent and selfed offspring generations (denoted by p and o, respectively, in the second term of the subscript) if there were only dominant (d in the first term of the subscript) mutations in the genome. wtp and wto are, respectively, the fitness in the parent and selfed offspring generations if the total mutation effects (including both dominant and overdominant mutations, denoted by t in the first term of the subscript) are considered. E denotes the mathematical expectation. The derivation of the four expectation terms in Equation 4 is technical and tedious; thus, we present them in Appendix 1 (Equations A1–A4).

The index α plays an important role. Compared with a similar index (α, constructed on the original fitness scale) of DENG (1998A), α here represents the proportion of heterosis on the log fitness scale that is attributable to dominant mutations. Therefore, α ranges from 0 to 1. If α = 1, the sole cause of heterosis is dominance; if α = 0, it is overdominance. The smaller the α, the larger the contribution to heterosis from overdominant mutations.
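A numerical reading of this index for outcrossing populations with constant effects is sketched below. It takes the index to be the ratio of expected log-fitness heterosis from dominant loci alone to that from all loci, which follows the verbal definition but is not a transcription of Equation 4 or of the Appendix 1 expressions. The equilibrium frequency p = (1 - ho)/(1 - 2ho) and the function names are likewise assumptions.

```python
import math


def log_heterosis_dominant(U, h, s):
    """E[ln w] in parents minus E[ln w] in selfed progeny, dominant loci only."""
    n_bar = U / (h * s)
    parent = n_bar * math.log(1.0 - h * s)
    selfed = n_bar * (0.5 * math.log(1.0 - h * s) + 0.25 * math.log(1.0 - s))
    return parent - selfed


def log_heterosis_overdominant(N, h_o, s_o):
    """Same difference for N overdominant loci at equilibrium allele frequencies."""
    p = (1.0 - h_o) / (1.0 - 2.0 * h_o)
    q = 1.0 - p
    het, hom = math.log(1.0 - h_o * s_o), math.log(1.0 - s_o)
    parent = N * (2.0 * p * q * het + q * q * hom)
    selfed = N * (p * q * het + (q * q + 0.5 * p * q) * hom)
    return parent - selfed


def alpha_index(U, h, s, N, h_o, s_o):
    """Proportion of log-scale heterosis attributable to dominant mutations."""
    dominant = log_heterosis_dominant(U, h, s)
    return dominant / (dominant + log_heterosis_overdominant(N, h_o, s_o))
```

Under these assumptions, U = 1, h = 0.36, s = 0.03, ho = -0.1, so = 0.03, and N = 141 give a value close to 0.5, meaning that roughly half of the heterosis is then attributable to overdominant loci.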

To measure the magnitude of genetic variation from dominant mutations maintained under M-S balance relative to that from overdominant mutations maintained by balancing selection, we define the index

(5)
β is the proportion of the standing genetic variation on the log fitness scale in the parental generation that is attributable to dominant mutations maintained under M-S balance; accordingly, 1 - β is that attributable to overdominant mutations maintained by balancing selection. β ranges from 0 to 1. The smaller the β, the larger the contribution to the standing genetic variation from overdominance. The numerator and denominator of Equation 5 can be expressed in terms of parameters for dominant and overdominant mutations (Equation A6 and Equation A8 in Appendix 1).
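The variance index can be evaluated in the same way. The sketch below assumes linkage equilibrium, so that log-fitness variances add across loci, a Poisson number of heterozygous dominant mutations with constant effects, and Hardy-Weinberg proportions at the overdominant loci with p = (1 - ho)/(1 - 2ho). It follows the verbal definition and is not a transcription of Equation 5 or of Equations A6 and A8.

```python
import math


def log_variance_dominant(U, h, s):
    """Variance of ln w from a Poisson number of heterozygous dominant mutations."""
    return (U / (h * s)) * math.log(1.0 - h * s) ** 2


def log_variance_overdominant(N, h_o, s_o):
    """Variance of ln w summed over N independent overdominant loci."""
    p = (1.0 - h_o) / (1.0 - 2.0 * h_o)
    q = 1.0 - p
    het, hom = math.log(1.0 - h_o * s_o), math.log(1.0 - s_o)
    mean = 2.0 * p * q * het + q * q * hom
    second_moment = 2.0 * p * q * het ** 2 + q * q * hom ** 2
    return N * (second_moment - mean ** 2)


def beta_index(U, h, s, N, h_o, s_o):
    """Proportion of log-scale genetic variance attributable to dominant mutations."""
    dominant = log_variance_dominant(U, h, s)
    return dominant / (dominant + log_variance_overdominant(N, h_o, s_o))
```

With U = 1, h = 0.36, s = 0.03, ho = -0.1, so = 0.03, and N = 141, this sketch gives a value near 0.91, so overdominant loci that account for about half of the heterosis under the same assumptions contribute under a tenth of the standing variance. That is one way to see why the two contributions need not be proportional.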

Dominant loci with variable mutation effects mixed with overdominant loci: Deleterious mutation effects, hi and si, are unlikely to be constant across loci. For example, si may vary anywhere from 0 (neutral mutation) to 1 (lethal mutation). The rate of mutations with different effects may also vary so that mutations of smaller effects may occur at higher rates. To evaluate the direction and the magnitude of bias introduced jointly by variable mutation effects and overdominant mutations, as in DENG and LYNCH (1996), we adopt an exponentially distributed mutation rate for mutations of variable effect si:

(6a)

Also we let

(6b)

By Equation 6b, hi and si are correlated. These are in rough accordance with the few available data (GREGORY 1965; CROW and SIMMONS 1983; MACKAY et al. 1992; KEIGHTLEY 1994) and with biochemical arguments (KACSER and BURNS 1981). In Equation 6b, h̄ = 0.36 when s̄ = 0.03, h -> 0.5 as s -> 0, and h -> 0.0 as s -> 1.0, all in rough accordance with the data (CROW and SIMMONS 1983). However, true mutational spectra may be such that the dominance of individual mutations is broadly scattered around such a function (CABALLERO and KEIGHTLEY 1994). With variable effects for deleterious genomic mutations, the indices α (Equation 4) and β (Equation 5) can be constructed using the results in APPENDIX A.

In simulations, we divide the entire range of s (0 - 1) into 100 discrete classes of width 0.01. Within each class, mutations have constant effects (hi and si). Each individual from the outcrossed parental generation in the simulation is assigned a number (ni) of heterozygous mutations from the ith of these classes by drawing from a Poisson distribution with expectation Upi/(hisi), where pi is the density of the mutational distribution in the ith class. For an individual from the selfed progeny generation, ni's are first determined as above. Then for each of the ni loci, the genotype is, as before, determined by randomly sampling from the trinomial probabilities so that probabilities for different genotypes are 1/4 for AA, 1/2 for Aa, and 1/4 for aa, respectively (due to random segregation during selfing of parents). This discrete treatment closely approximates the continuous distribution of mutation effects (H.-W. DENG, unpublished data).
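The discretization into 100 classes can be sketched as follows. Two forms are assumed here because the equations are not reproduced in this text: an exponential distribution of s with mean s̄, truncated to (0, 1), and the dominance function h = exp(-13s)/2, chosen only because it satisfies the stated properties (h̄ close to 0.36 when s̄ = 0.03, h -> 0.5 as s -> 0, h -> 0 as s -> 1). The sketch also uses class midpoints and the probability mass of each class as pi; all names are hypothetical.

```python
import math


def mutation_classes(s_bar=0.03, n_classes=100):
    """Return (s_mid, h, p) for each class; p is the class's share of new mutations."""
    def cdf(s):
        return 1.0 - math.exp(-s / s_bar)

    width = 1.0 / n_classes
    total = cdf(1.0)                              # renormalize after truncating at s = 1
    classes = []
    for i in range(n_classes):
        low, high = i * width, (i + 1) * width
        s_mid = 0.5 * (low + high)
        p_i = (cdf(high) - cdf(low)) / total
        h_i = 0.5 * math.exp(-13.0 * s_mid)       # assumed dominance function
        classes.append((s_mid, h_i, p_i))
    return classes


def poisson_means(U, classes):
    """Expected heterozygous mutations per parent in each class: U * p_i / (h_i * s_i)."""
    return [U * p / (h * s) for s, h, p in classes]
```

Averaging h over the classes with weights pi gives a mean dominance coefficient close to 0.36 for s̄ = 0.03 under the assumed function. Each individual's ni would then be drawn from a Poisson distribution with the corresponding mean.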

Selfing populations:
To estimate deleterious genomic mutations, selfed individuals from natural selfing populations are crossed randomly to obtain outcrossed progeny. In selfing populations, new mutations in the genome most likely follow a Poisson distribution, whether they involve dominant or overdominant mutations. In highly selfing populations, mutant alleles will be maintained by M-S balance, regardless of their (over)dominance (DENG 1998A). Hence, as in the dominant case, we assume that the number of loci with overdominant mutants (n7), all in the homozygous state, per genome in selfing populations is Poisson distributed with mean n̄o and constant effects ho and so. If the genomic mutation rate to the overdominant (but less fit) allele a is Uo, it can be easily shown that at M-S equilibrium, n̄o = Uo/(2so) (CHARLESWORTH et al. 1990; DENG 1998A).

In each situation, a variable number K of individuals is randomly sampled from the selfed parental and outcrossed progeny generations, respectively. For a genotype with n dominant and n7 overdominant mutations [randomly determined from the Poisson distribution with mean U/(2s) and Uo/(2so), respectively] from the selfed parental generation, the fitness is

$$W = W_{\max} \prod_{i=1}^{n} (1 - s_i) \, (1 - s_o)^{n_7}.$$

For an outcrossed progeny resulting from crossing two selfed parents (with nf, n7 and nm, n8 homozygous loci for dominant and overdominant mutations, respectively, where the subscript f indicates female parent and m the male parent), its fitness is

$$W = W_{\max} \prod_{i=1}^{n_f + n_m} (1 - h_i s_i) \, (1 - h_o s_o)^{n_7 + n_8}.$$

hi and si are the dominance and selection coefficients of the ith locus with dominant mutations. They are assumed to be constant initially and made variable later.

In selfing populations, the indices α and β defined in Equation 4 and Equation 5 can be constructed from the derivations in Appendix 1 for the constant and variable dominant mutation effects, respectively. In simulated populations, the genome contains both dominant and overdominant loci, all at M-S equilibrium. In the parental generation, the number of homozygous dominant loci in each individual is determined by random sampling from a Poisson distribution of mean U/(2s), and the number for the overdominant loci is determined from a Poisson distribution with mean n̄o [= Uo/(2so)] (CHARLESWORTH et al. 1990; DENG 1998A). In the outcrossed offspring generation, the number of dominant loci in each individual is sampled from a Poisson distribution of mean U/s, and the number of overdominant loci in each individual is determined from a Poisson distribution with mean Uo/so (CHARLESWORTH et al. 1990; DENG 1998A), all in the heterozygous state. The variable dominant mutations are modeled by Equations 6a and 6b and are simulated by discrete classes of mutations, in a manner similar to that in outcrossing populations as described earlier.
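The selfing-population sampling can be sketched as below, for constant effects. The sketch assumes that selfed parents carry all mutations in the homozygous state, that the two parents of a cross carry mutations at different loci (so every mutation is heterozygous in the progeny), and that fitness is multiplicative. The Poisson sampler is Knuth's method, used as a stand-in, and all names are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Knuth's multiplication sampler; adequate for the moderate means used here."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def selfed_parent_counts(rng, U, s, U_o, s_o):
    """Homozygous dominant and overdominant mutant loci in one selfed parent."""
    return poisson(rng, U / (2.0 * s)), poisson(rng, U_o / (2.0 * s_o))


def selfed_parent_fitness(n, n7, s, s_o, w_max=1.0):
    """All mutant loci homozygous in the selfed parent."""
    return w_max * (1.0 - s) ** n * (1.0 - s_o) ** n7


def outcrossed_progeny_fitness(n_f, n_m, n7, n8, h, s, h_o, s_o, w_max=1.0):
    """All mutant loci inherited from either parent are heterozygous in the progeny."""
    return w_max * (1.0 - h * s) ** (n_f + n_m) * (1.0 - h_o * s_o) ** (n7 + n8)
```

Summing the two parental counts gives progeny counts with Poisson means U/s and Uo/so, matching the description above. With ho < 0 the overdominant loci raise progeny fitness, which adds to the heterosis produced by the dominant loci.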

Once the desired samples of K individuals from the selfed parent and the outcrossed progeny generations are simulated, the estimation developed on the basis of the assumption of pure dominant mutants maintained under M-S balance (DENG and LYNCH 1996) is applied. Unless otherwise specified, K = 200 for selfing populations. The total sample size is then 400 for the application of the fitness moments approach. Let W̄o, σ²o and W̄s, σ²s be the mean and genetic variance of the fitness in the outcrossed progeny and selfed parental generations, respectively. Let x, y, z be defined as in Equation 1; then

(7a)

(7b)

(7c)

To apply the inbreeding depression approach to estimate U (CHARLESWORTH et al. 1990), the value for h must be assumed or estimated by other experimental designs and methods. Mukai's method (MUKAI et al. 1972) is employed to estimate h. It estimates h approximately by the slope of the regression of the outcrossed-progeny fitness (x) on the fitness sum (y) of the two corresponding parental homozygotes:

$$\hat{h} \approx \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(y)}.$$

Once h is estimated, U can be estimated by Equation 7b. A sample of 200 outcrossing families, each consisting of two selfed parents and one outcrossed offspring, is simulated to implement the inbreeding depression approach. The total number of genotypes employed in the experiment is 600.
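The regression step can be illustrated with simulated families of two selfed parents and one outcrossed offspring. This is a sketch with constant dominant effects only, under the same assumptions as the selfing model described above (Poisson mean U/(2s) homozygous mutations per parent, multiplicative fitness, parental mutations at different loci). The Poisson sampler is Knuth's method and the function names are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Knuth's multiplication sampler; adequate for the moderate means used here."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def regression_slope(xs, ys):
    """Least-squares slope of xs regressed on ys, i.e. Cov(x, y) / Var(y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((y - mean_y) ** 2 for y in ys)
    return cov / var


def simulate_families(rng, n_families, U, h, s):
    """Return progeny fitnesses and the fitness sums of their two selfed parents."""
    progeny, parent_sums = [], []
    for _ in range(n_families):
        n_f = poisson(rng, U / (2.0 * s))
        n_m = poisson(rng, U / (2.0 * s))
        parent_sums.append((1.0 - s) ** n_f + (1.0 - s) ** n_m)
        progeny.append((1.0 - h * s) ** (n_f + n_m))
    return progeny, parent_sums
```

Calling `regression_slope(*simulate_families(rng, 200, 1.0, 0.36, 0.03))` gives a slope in the neighborhood of h. Because fitness is multiplicative rather than additive, the slope need not equal h exactly even without overdominance, which is consistent with the description of the method as approximate.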

In simulations, we arbitrarily let Wmax = 1, as the values of Wmax do not influence the estimation for the mutation parameters (DENG and LYNCH 1996). For each set of parameters, we perform 500 simulations. Unless otherwise specified, in all the simulations presented (except with pure overdominant mutations in the genome, i.e., when α = β = 0), U = 1.0, h̄ = 0.36, and s̄ = 0.03, which are close to the most often cited values (MUKAI et al. 1972; LYNCH et al. 1995). The experimental designs have been laid out earlier for different estimations in different populations. Simulations with other parameters (e.g., U = 0.1–4.0 and s̄ = 0.01–0.05) and experimental designs have also been performed. The results are similar and thus not presented. Because almost all the results are biased, the MSE is presented together with one standard deviation (SD) computed over the repeated simulations.


RESULTS

Outcrossing populations
Constant dominant mutation effects: The fitness moments approach (Table 1): With only deleterious dominant loci in the genome (N = 0 and α = β = 1), the estimates for U, h, and s are unbiased. Recall that N is the number of polymorphic overdominant loci in the population and α and β are, respectively, the proportions of heterosis and genetic variation on the log fitness scale that are attributable to dominant mutations. With overdominant loci coexisting in the genome with deleterious dominant loci (N > 0 and 0 < α, β < 1), Û (^ indicates an estimated value) is an overestimate, while h and s are underestimated. The degree of bias increases with increasing contributions from overdominance to heterosis (decreasing α) and to the standing genetic variation in the population (decreasing β). Generally, the bias is not dramatic so that estimates of the upper bound of U and lower bounds of h and s can be obtained, and these estimates are close to the true parameter values. All the sampling errors are quite small. Even with only overdominant mutations in the genome (α = β = 0), estimates of U, h̄, and s̄ can still be obtained, although the parameter values do not exist for the dominant mutations. In this case, it is still valid to treat Û as an upper limit for the true U of zero. This is understandable, because, upon selfing (or outcrossing in selfing populations), overdominant mutations will also cause mean and genetic variance of fitness to change, similar to those changes caused by dominant mutations. This will be similar in every case, and thus will not be repeated. The estimation bias is relatively more sensitive to a change of ho than to a change of so. With a larger absolute value of ho, the degree of bias increases.


 

 
Table 1. Characterizing constant deleterious genomic mutations in the presence of overdominant mutations with the fitness moments approach in outcrossing populations

The inbreeding depression approach (Table 2): With N = 0 and α = β = 1, the estimates for U and h are nearly unbiased. With N > 0 and 0 < α, β < 1, U is generally overestimated, while h is underestimated. The degree of bias generally increases with decreasing α and β. Compared with the fitness moments approach, the bias is larger for h and smaller for Û. The smaller bias of Û is largely due to the greatly underestimated h. This can be understood from Equation 2b or Figure 1 in DENG and FU (1998). Although the presence of overdominant mutations will tend to bias Û upward, the bias will be greatly dampened by a greatly underestimated h. However, the estimation of U suffers from large sampling errors, even though the number of genotypes employed (450) is larger than that for the fitness moments approach (400). When both sampling error and bias are considered, the estimation of U by the inbreeding depression approach is generally worse than that by the fitness moments approach, as reflected by the larger MSE. The statistical properties (mean and sampling variance) of Û are relatively unstable with changes in α and β. This instability is largely due to the relatively small sample size employed. When overdominance contributes importantly to the heterosis and standing genetic variation in natural populations (with small α and β), Û is unacceptable even as an estimate for the upper limit because of the large sampling error. h estimated by DENG's (1998A) method can serve well as a lower bound of the true h as evidenced by its small sampling error.



 
Figure 1. Differential contribution of overdominant mutations to heterosis (as measured by {alpha}) and to the standing genetic variation (as measured by ß) in outcrossing populations (solid line) and selfing populations (dotted lines). Plots a and b are for constant and variable mutation effects, respectively. On each plot, the curve to the left of the diamond point is obtained by fixing N (number of polymorphic overdominant loci in outcrossing populations) or o (mean number of overdominant loci in selfing populations) and letting U vary. In plot a, N = 141 for outcrossing populations and o = 4 for selfing populations; in plot b, N = 233 for outcrossing populations and o = 5 for selfing populations. The curve to the right of the diamond point is obtained by fixing U (= 1) and letting N or o vary. All other parameters are the same: h̄ = 0.36, s̄ = 0.03; ho = -0.1, so = 0.03.


 

 
Table 2. Estimates of h with Deng's method and U with the inbreeding depression approach to characterize constant dominant mutations in the presence of overdominant mutations in outcrossing populations

Variable dominant mutation effects: The fitness moments approach (Table 3): With N = 0 and {alpha} = ß = 1, U and are underestimated and is overestimated. With N > 0 and 0 < {alpha}, ß < 1, is always biased downward, and the magnitude of bias and sampling variance do not change much with changing {alpha} and ß. The degree of bias is relatively small so that {approx} 0.67 . The small bias and sampling variance of render it an ideal estimate of the lower limit for the true , and it is close to the true parameter value. The bias of Û and changes so that Û and are not always biased. When {alpha} and ß are relatively large, so that overdominance does not contribute substantially to the heterosis and to the genetic variation in the population, U and are both underestimated. When {alpha} and ß gradually decrease, so that overdominance contributes more to the heterosis and the standing genetic variation in the population, Û and become unbiased and then overestimated. For the same magnitude of {alpha} or ß, with different parameters ho and so, the degree of bias for Û, , and is different. This is also true throughout this study and is not repeated.


 

 
Table 3. Characterizing variable deleterious genomic mutations in the presence of overdominant mutations with the fitness moments approach in outcrossing populations

It should be noted that with different ho and so parameters for overdominant mutations, the same {alpha} may correspond to a different ß. This can be inferred from Equations 4 and 5 and those in Appendix 1 and B. It is also evident in every case from the numerical values in Tables 1-8 for outcrossing and selfing populations and for constant and variable mutation effects. To illustrate the monotonic but nonlinear relationship between {alpha} and ß, Figure 1 plots the values of {alpha} and ß for constant and variable mutation effects in both outcrossing and selfing populations.
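As a reminder of what the two indices measure, the verbal definitions used in this section can be written compactly. The symbols below are assumed notation for illustration and are not the notation of Equations 4 and 5: δ_d and δ_o stand for the heterosis on the log fitness scale attributable to dominant and to overdominant mutations, and V_d and V_o for the corresponding components of genetic variance of log fitness, taken here to be additive across the two classes of loci.

```latex
\alpha = \frac{\delta_d}{\delta_d + \delta_o},
\qquad
\beta = \frac{V_d}{V_d + V_o}
```

With no overdominant loci, δ_o = V_o = 0 and {alpha} = ß = 1; with only overdominant loci, δ_d = V_d = 0 and {alpha} = ß = 0. Because δ_o and V_o need not change in the same way with ho and so, the same {alpha} can correspond to different values of ß.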


 

 
Table 4. Estimates of h̄ with Deng's method and U with the inbreeding depression approach to characterize variable dominant mutations in the presence of overdominant mutations in outcrossing populations


 

 
Table 5. Characterizing constant deleterious genomic mutations in the presence of overdominant mutations with the fitness moments approach in selfing populations


 

 
Table 6. Estimates of h with Mukai's method and U with the inbreeding depression approach to characterize constant dominant mutations in the presence of overdominant mutations in selfing populations


 

 
Table 7. Characterizing variable deleterious genomic mutations in the presence of overdominant mutations with the fitness moments approach in selfing populations


 

 
Table 8. Estimates of h̄ with Mukai's method and U with the inbreeding depression approach to characterize variable dominant mutations in the presence of overdominant mutations in selfing populations

The inbreeding depression approach (Table 4): With N = 0 and {alpha} = ß = 1, the estimates for U and h̄ are both biased downward. With N > 0 and 0 < {alpha}, ß < 1, U is generally underestimated when {alpha} and ß are relatively large and is overestimated only when {alpha} and ß are quite small. However, the sampling variance of Û is usually large. On the other hand, the estimate of h̄ is always biased downward and its sampling variance is minuscule. With decreasing {alpha} and ß, the degree of bias of the estimate of h̄ increases. The estimate can serve reasonably well as a lower bound of the true h̄.

Selfing populations
Constant dominant mutation effects: The fitness moments approach (Table 5): With N = 0 and {alpha} = ß = 1, the estimates for U, h, and s are unbiased. With N > 0 and 0 < {alpha}, ß < 1, U is overestimated, while h and s are underestimated. The degree of bias increases with decreasing {alpha} and ß. However, the bias is not dramatic, so an upper bound of U and lower bounds of h and s can be estimated, and these bounds are not far from the true parameter values. The estimation bias is not very sensitive to changes in ho and so, especially for h and s.

The inbreeding depression approach (Table 6): With N = 0 and {alpha} = ß = 1, the estimates for U and h are nearly unbiased. With N > 0 and 0 < {alpha}, ß < 1, U is generally overestimated, while h is underestimated. The degree of bias generally increases with decreasing {alpha} and ß. Compared with the fitness moments approach, the bias is larger for h and smaller for Û. The smaller bias of Û is largely due to the greatly underestimated h. This can be understood from Equation 2b or Figure 1 in DENG and FU (1998). Although the presence of overdominant mutations tends to bias Û upward, the bias is greatly dampened by the greatly underestimated h. Compared with outcrossing populations under constant mutation effects with a comparable sample size of genotypes, the sampling error for Û is relatively small, and hence Û can serve well as an estimate for the upper limit. The h estimated by Mukai's method (MUKAI et al. 1972) can also serve well as a lower bound of the true h, as evidenced by its small sampling error.

Variable dominant mutation effects: The fitness moments approach (Table 7): With N = 0 and {alpha} = ß = 1, the estimates for U and are biased downward and the estimates for are biased upward. With N > 0 and 0 < {alpha}, ß < 1, is always biased downward, and the magnitude of the bias increases slightly with decreasing {alpha} and ß, while its sampling variance remains relatively stable. ranges from 0.77 to 0.5 . The relatively small bias and sampling variance of render it an ideal estimate of the lower limit for . The direction and the magnitude of the bias of Û and change so that Û and are not always biased. When {alpha} and ß are relatively large, so that overdominance does not contribute substantially to the heterosis and the standing genetic variation in the population, U and are both underestimates. When {alpha} and ß gradually decrease, Û and become unbiased and then overestimated. However, for Û and to become biased upward, {alpha} and ß need to be quite small ({alpha} < ~0.56, ß < 0.84) so that overdominance contributes substantially to heterosis and the standing genetic variation in the populations.

The inbreeding depression approach (Table 8): With N = 0 and {alpha} = ß = 1, the estimates for U and h̄ are biased. With N > 0 and 0 < {alpha}, ß < 1, U is generally underestimated when {alpha} and ß are relatively large and is overestimated only when {alpha} and ß are quite small. It should be noted that, as is the case for outcrossing populations, when overdominant mutations are present but do not contribute substantially to heterosis and genetic variation, the bias of Û is smaller than under dominant mutations alone. This is because the estimation biases caused by overdominant mutations and by variable effects of dominant mutations are in opposite directions and cancel each other, resulting in smaller (or no) bias. The extent of the bias depends on the parameters under estimation and on the values of {alpha} and ß. The sampling variance of Û is small. The estimate of h̄ is always biased downward and its sampling variance is minuscule. With decreasing {alpha} and ß, the degree of bias of the estimate of h̄ increases. The estimate can serve well as a lower bound of the true h̄.


*  DISCUSSION

Using extensive simulations, we investigated the effect of overdominant mutations on characterizing deleterious dominant mutations by the two existing estimation approaches (MORTON et al. 1956; CHARLESWORTH et al. 1990; DENG and LYNCH 1996, 1997; DENG 1998b). We developed two important indices and associated analytical derivations to characterize the relative contributions of overdominant mutations to heterosis and genetic variation. The simulation algorithms and the analytical derivations developed are useful for investigating other issues in genetics concerning the mixture of dominant and overdominant mutations in the genome. Estimates for U are biased upward and those for h̄ and s̄ are biased downward by overdominant mutations. However, the degree of bias is generally moderate and depends on the magnitude of the contribution of overdominant mutations to heterosis or genetic variation. This renders the estimates of U and not invariably biased under variable mutation effects, which when working independently will almost always cause U and to be underestimated. We also note that the contributions to heterosis and genetic variation from overdominant mutations are monotonic but not linearly proportional to each other. Our results may not only provide a basis for correct inferences about deleterious mutations from natural populations, but may also alleviate the biggest concern and obstacle in applying the inbreeding depression and fitness moments approaches, thus paving the way for efficiently characterizing deleterious genomic mutations from large natural populations.

Although it is intuitive that the two approaches will yield biased estimates (DRAKE et al. 1998), the magnitude and the direction of the bias for the different estimates would not be clear without the extensive simulations conducted here. Overdominant mutations, when acting together with variable mutation effects and depending on their contributions to heterosis and the standing genetic variation, may actually render estimates of U and unbiased. It has been suggested (DENG and FU 1998; DRAKE et al. 1998) that the inbreeding depression and fitness moments approaches may be least affected by overdominant mutations in selfing populations, because overdominant mutations cannot be maintained by balancing selection there. However, as shown in Tables 1-8, with comparable contributions from overdominant mutations to heterosis and standing genetic variation, the estimation is affected to a similar degree in outcrossing and selfing populations. We also note that the influence of overdominant mutations on the estimation depends not only on their contributions to heterosis and the standing genetic variation, but also on the parameters of overdominant mutations such as ho and so, although this dependence does not seem to be large.

Our simulation results not only reveal the robustness and statistical properties of the current approaches to characterize deleterious dominant mutations in natural populations, but also shed light on the relative efficiencies of the different approaches in different populations. Although the relative efficiencies of all three available approaches (as outlined in the Introduction) were investigated earlier (DENG and FU 1998), the investigations were not conducted under conditions of mixed dominant and overdominant mutations in the genome. In the present study, the sample sizes implemented in simulations for the two approaches were deliberately set to be either comparable or larger for the inbreeding depression approach. Recall that the number of genotypes employed for the fitness moments approach is 400 in both outcrossing and selfing populations, while those for the inbreeding depression approach are 450 and 600, respectively, in outcrossing and selfing populations. Nevertheless, it can be seen from Tables 1-8 that the estimation by the fitness moments approach is often better than that by the inbreeding depression approach. This is especially true for outcrossing populations and for the estimation of h̄. The inbreeding depression approach is sometimes better for the estimation of U; however, the better estimation is achieved because of a greatly biased estimation of h̄. Therefore, it is not the original inbreeding depression approach per se that achieves the better estimation for U. It is actually the h̄ greatly underestimated by the estimation methods chosen that leads to the less biased U in the inbreeding depression approach. Therefore, the estimation of U by the inbreeding depression approach depends greatly on the methods chosen for the estimation of h̄. With less biased estimates or assumed values for h̄, simulation results not shown here indicate that the U estimation by the inbreeding depression approach is much worse statistically than that of the fitness moments approach.

The issue of dominance and overdominance has been under debate for decades in genetics (DAVENPORT 1908; EAST 1908; SHULL 1908; CROW 1952; SPRAGUE 1983; WALLACE 1989; HOULE 1989, 1994; CROW 1993; DENG et al. 1998a). The debate has far-reaching significance for agriculture, human health, evolution, and conservation biology, among other areas. While most of the data are consistent with the dominance hypothesis, overdominance cannot be ruled out in many situations (SIMMONS and CROW 1977; CHARLESWORTH and CHARLESWORTH 1987; BARRETT and CHARLESWORTH 1991; STUBER et al. 1992; CROW 1993; MITTON 1993). Given the current status of the debate, instead of favoring one hypothesis over the other, it may be more sensible to examine the issues concerned under mixed dominant and overdominant mutations in the genome, with mutations of each type having different contributions (e.g., to heterosis and/or genetic variation). The theoretical machinery for measuring the relative importance of dominance and overdominance has not been available. The development of two important indices, {alpha} and ß, provides a basis for investigating a number of other genetic issues related to the contribution of dominant and overdominant mutations to inbreeding and the standing genetic variation in natural populations.

It has long been recognized that, when dominant and overdominant mutations coexist, the heterosis and standing genetic variation will be affected by both. However, the disproportional contributions of overdominant mutations to heterosis and to standing genetic variation have not been documented before. This phenomenon may form a basis for discerning the relative importance of dominant and overdominant mutations in the genome. Studies have been initiated along this line of research. It is worth noting that, for overdominant mutations to contribute importantly to the standing genetic variation, a substantial proportion of heterosis must be caused by overdominant mutations. This is especially true when overdominant mutations contribute less than half of the heterosis ({alpha} > 0.5; Figure 1).

For any theory to be of great significance, its underlying assumptions must be examined closely and the important parameters must be estimated. There is no doubt that any genome is subject to continuous bombardment by deleterious genomic mutations. However, no amount of theoretical argument can resolve the issues concerning the importance of deleterious genomic mutations without the important parameters being estimated. Indisputably, characterizing deleterious genomic mutations is extremely important. However, even though this importance is recognized by more and more scientists and revealed in more and more biological contexts, the estimates are astonishingly few and thus are urgently needed (PECK and EYRE-WALKER 1997). Among the three approaches currently available, the statistical properties and the robustness of the fitness moments approach have been investigated most thoroughly and are best known. Investigation of the other two available approaches (the M-A approach and the inbreeding depression approach) is also extremely important and is beginning to appear in studies (DENG and FU 1998; DENG et al. 1998b). Different approaches have their own particular assumptions whose validity may be difficult to verify in a specific experimental setting (KEIGHTLEY 1994; PECK and EYRE-WALKER 1997; DENG and FU 1998; LYNCH et al. 1998). Examples of these assumptions are M-S balance in the fitness moments approach and in the inbreeding depression approach, no line losses because of selection during M-A, and no gene conversion for the M-A chromosome in Drosophila. Applying multiple approaches to the same organism and/or characterizing deleterious mutations in diverse organisms may provide a cross-check of the results (and of the underlying assumptions used to derive these results) and eventually may crystallize the deleterious mutation parameters.


*  ACKNOWLEDGMENTS

H.-W. Deng thanks Professor M. Lynch for years of advice, continuous encouragement, and support. We are very grateful to Professor Marjorie A. Asmussen and three anonymous reviewers for their extremely careful comments that helped to improve the article. We thank Drs. Robert R. Recker and Mark Johnson and Ms. Carolyn Meeks for careful editing of the manuscript. The work was partially supported by a grant from the National Institutes of Health (R01 AR45349) and a Health Future Foundation grant from Creighton University, Nebraska, and by graduate student tuition waivers to J.-L.L. and J.L. from the Department of Biomedical Sciences of Creighton University.

Manuscript received April 20, 1998; Accepted for publication October 30, 1998.


*  APPENDIX 1