Genetics, Vol. 162, 1487-1500, November 2002, Copyright © 2002

Estimation of Deleterious Genomic Mutation Parameters in Natural Populations by Accounting for Variable Mutation Effects Across Loci

Hong-Wen Deng (a, b), Guimin Gao (a), and Jin-Long Li (c)
a Osteoporosis Research Center and Department of Biological Sciences, Creighton University, Omaha, Nebraska 68131
b Laboratory of Molecular and Statistical Genetics, College of Life Sciences, Hunan Normal University, ChangSha, Hunan 410081, People's Republic of China
c Center for Medical Informatics, Yale University School of Medicine, New Haven, Connecticut 06520-8009

Corresponding author: Hong-Wen Deng, Creighton University, 601 N. 30th St., STE. 6787, Omaha, NE 68131. E-mail: deng{at}creighton.edu

Communicating editor: Z-B. ZENG


*  ABSTRACT

The genomes of all organisms are subject to continuous bombardment of deleterious genomic mutations (DGM). Our ability to accurately estimate various parameters of DGM has profound significance in population and evolutionary genetics. The Deng-Lynch method can estimate the parameters of DGM in natural selfing and outcrossing populations. This method assumes constant fitness effects of DGM and hence is biased under variable fitness effects of DGM. Here, we develop a statistical method to estimate DGM parameters by considering variable mutation effects across loci. Under variable mutation effects, the mean fitness and genetic variance for fitness of parental and progeny generations across selfing/outcrossing in outcrossing/selfing populations and the covariance between mean fitness of parents and that of their progeny are functions of DGM parameters: the genomic mutation rate U, average homozygous effect $\bar{s}$, average dominance coefficient $\bar{h}$, and covariance of selection and dominance coefficients cov(h, s). The DGM parameters can be estimated by the algorithms developed herein, which may yield improved estimation of DGM parameters over the Deng-Lynch method, as demonstrated by our simulation studies. Importantly, this method is the first to characterize cov(h, s) for DGM.


The genomes of all organisms are continuously subject to deleterious genomic mutations (DGM). In spite of our increasing knowledge of the molecular underpinnings of mutations, little is known about the overall risk exerted by DGM on human health and on the continued survival of other organisms (especially rare and endangered species) (CROW 1993A, CROW 1993B, CROW 1995). To assess this overall risk correctly, we need solid knowledge of the genomic mutation rate (U) at which DGM arise in the whole genome of an individual and of the distribution of their effects, such as the mean selection coefficient ($\bar{s}$), the mean dominance coefficient ($\bar{h}$), and the covariance of dominance and selection coefficients of DGM [cov(h, s)]. Estimation of these parameters is also important for testing the validity of a number of evolutionary theories in genetics (TURELLI and ORR 1995; and the references within DENG et al. 1998, DENG et al. 1999).

Despite the extreme importance of our knowledge of deleterious mutation parameters, few estimates are available (SIMMONS and CROW 1977; CROW and SIMMONS 1983; KONDRASHOV 1988; CROW 1993A, CROW 1993B, CROW 1995; BATAILLON 2000). In particular, no method is available to estimate U that is not biased by variable mutation effects, and no method is available to estimate cov(h, s), which is important for our understanding of Haldane's rule by the dominance hypothesis (TURELLI and ORR 1995). The current experimental approaches and the estimation methods for the parameters of DGM have been summarized and compared (DENG and FU 1998; DENG et al. 1999; DENG and LI 2001). It was concluded that, under the respective assumptions of the various approaches, estimation by the Deng-Lynch method (DENG and LYNCH 1996, DENG and LYNCH 1997) in natural populations generally results in the best statistical quality in terms of bias and sampling variance (DENG and FU 1998). In addition, it has been shown that violation of various assumptions [including the mutation-selection (M-S) balance assumption] underlying the Deng-Lynch method does not seriously undermine its estimation robustness (LI et al. 1999; LI and DENG 2000; DENG and LI 2001).

As with almost all the other estimation methods (except a maximum-likelihood estimation method for mutation-accumulation experiments; KEIGHTLEY 1994), the Deng-Lynch method that applies to natural outcrossing or selfing populations assumes constant fitness effects of DGM. This assumption is well recognized as biologically implausible. Although the estimation bias introduced by variable mutation effects in the Deng-Lynch estimation method by assuming constant mutation effects is not substantial (DENG et al. 1999), an estimation method that considers variable mutation effects may reduce estimation bias (although not necessarily always so). Most importantly, the parameters [e.g., cov(h, s)] characterizing variable effects of DGM can be estimated only in statistical methods that consider variable mutation effects.

In this article, we present a method for estimating DGM parameters accounting for variable effects across loci in natural outcrossing or selfing populations at M-S balance. We investigate the statistical properties (bias and sampling variance) of this new method, using computer simulations in comparison with the Deng-Lynch method (DENG and LYNCH 1996, DENG and LYNCH 1997) that assumed constant mutation effects across loci.


*  THEORY

The assumptions are the same as those of the Morton-Charlesworth method (MORTON et al. 1956; CHARLESWORTH et al. 1990) and the Deng-Lynch method (DENG and LYNCH 1996, DENG and LYNCH 1997; DENG 1998B). Namely, the population is assumed to be large, either randomly mating (outcrossing) or highly selfing, at linkage equilibrium, and at M-S balance. In addition, the fitness function is assumed to be multiplicative, which is biologically plausible (MORTON et al. 1956; CROW 1986; CRADDOCK et al. 1995; FU and RITLAND 1996). Mutations at each locus are assumed to have constant effects s and h.

In this study, we consider variable mutation effects in the development of an estimation method for DGM parameters in natural populations. Under variable mutation effects across loci, the homozygous effect s of a mutation is a random variable between 0 and 1. We assume that, for a mutation, the dominance coefficient h and the selection coefficient s are functionally related so that h = h(s). This assumption is supported by the limited data and theory (SIMMONS and CROW 1977; KACSER and BURNS 1981; CROW and SIMMONS 1983). We divide the domain of s, [0, 1], for new mutations into T intervals, each having a width of 1/T. Let Ik denote the kth interval, and define the probability pk that the effect s of a new mutation falls into Ik.

When T is sufficiently large, s and h are approximately constant within each interval but are variable across various intervals. Let Uk denote the mutation rate corresponding to mutations with an effect s falling into the interval Ik, and then Uk = Upk.
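The discretization described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `interval_rates`, the use of interval midpoints to represent s within each interval, and the exponential distribution in the usage lines are all assumptions made for the example.

```python
import math

def interval_rates(cdf, U, T):
    """Split [0, 1] into T equal-width intervals for the selection coefficient s.

    cdf is any cumulative distribution function for s (hypothetical input).
    Returns the interval midpoints, the probabilities p[k] that s falls in
    the k-th interval, and the per-interval mutation rates U * p[k].
    """
    edges = [k / T for k in range(T + 1)]
    p = [cdf(edges[k + 1]) - cdf(edges[k]) for k in range(T)]
    midpoints = [(edges[k] + edges[k + 1]) / 2 for k in range(T)]
    rates = [U * pk for pk in p]
    return midpoints, p, rates

# Illustrative choice only: exponentially distributed s with mean 0.05.
def exp_cdf(s, rate=20.0):
    return 1.0 - math.exp(-rate * s)

mid, p, Uk = interval_rates(exp_cdf, U=1.0, T=100)
total_rate = sum(Uk)  # close to U; the tiny mass beyond s = 1 is dropped
```

With a large T, each interval is narrow enough that a single representative value of s (and hence of h) is a reasonable stand-in for all mutations in that interval.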

With the assumptions we have, in outcrossing populations, the number of mutant alleles with mutation effects s falling into an interval Ik within an individual (all in the heterozygous state; MORTON et al. 1956; DENG and LYNCH 1996) follows a Poisson distribution with an expectation

(1a)

(DENG and LYNCH 1996, DENG and LYNCH 1997). In selfing populations, the number of loci homozygous for mutant alleles with an effect s falling into an interval Ik within an individual follows a Poisson distribution with an expectation

(1b)
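A minimal sketch of drawing the per-interval mutation counts is given here. The expectations used, Uk/(hk sk) for outcrossing populations and Uk/(2 sk) for selfing populations, are assumed forms of Equations 1a and 1b (the usual mutation-selection balance expressions) adopted only for illustration, and all numerical inputs are hypothetical.

```python
import math
import random

def poisson(lam, rng):
    """Draw one Poisson variate by Knuth's multiplication method (moderate lam only)."""
    limit = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def expected_counts(U_k, s_k, h_k, selfing):
    """Assumed expectations: U_k/(2 s_k) under selfing, U_k/(h_k s_k) under outcrossing."""
    if selfing:
        return [u / (2.0 * s) for u, s in zip(U_k, s_k)]
    return [u / (h * s) for u, s, h in zip(U_k, s_k, h_k)]

rng = random.Random(1)
U_k = [0.3, 0.15, 0.05]   # hypothetical per-interval mutation rates
s_k = [0.02, 0.06, 0.10]  # hypothetical representative selection coefficients
h_k = [0.35, 0.25, 0.15]  # hypothetical dominance coefficients

lam_out = expected_counts(U_k, s_k, h_k, selfing=False)
lam_self = expected_counts(U_k, s_k, h_k, selfing=True)
n_out = [poisson(lam, rng) for lam in lam_out]    # heterozygous loci per interval
n_self = [poisson(lam, rng) for lam in lam_self]  # homozygous mutant loci per interval
```

Knuth's method is used only to keep the sketch free of external dependencies; for large expectations a library sampler would be preferable.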

Outcrossing populations:
We illustrate our experimental design and estimation method by using populations capable of selfing. The method may be extended to outcrossing populations where selfing is not feasible, as in the Deng-Lynch method (DENG 1998B). The basic data structure is outcrossed parents and multiple selfed progeny from each parent (forming selfed families). Let $\bar{W}_o$ and $\bar{W}_s$ be the mean fitness in the parental and offspring generations, respectively, $\sigma_o^2$ the genetic variance of fitness in the parental generation, $\sigma_t^2$ the total genetic variance of fitness in the selfed progeny generation, $\sigma_s^2$ the genetic variance of the mean fitness of selfed progeny in selfing families, and cov(wp, ws) the covariance between the fitness of a parent (wp) and the mean fitness of its selfed progeny (ws). Under the above assumption that mutation effects vary across intervals at different loci, as in DENG and LYNCH 1996, it can be shown that the fitness moments are related to the DGM parameters as

(2)


(3)


(4)


(5)


(6)


(7)

where the parameters with overbars denote arithmetic mean properties of new DGM parameters, is the harmonic mean dominance coefficient of new mutations, and Wmax is the expected fitness of a mutation-free genotype in an environment where fitness measurements are taken. Wmax serves as a scaling factor so that the fitness measurement can be on any scale instead of just from 0.0 to 1.0 and also so that mean environmental effects of experiments do not influence estimation (DENG and LYNCH 1996).

Among Equations 2-7, there are only five independent equations containing six unknown parameters. By assuming that one of the six parameters is known in the estimation, estimators of the other parameters can be derived. This is the strategy employed in the likelihood characterization of DGM parameters when variable mutation effects are considered in estimation (KEIGHTLEY 1994; DENG et al. 1999; DENG and LI 2001). Here we assume that U is known in the estimation for the time being. Alternatively, an initial value of U may be estimated from other approaches (DENG et al. 1999) or may be estimated by the current experimental design and data with the Deng-Lynch method (DENG and LYNCH 1996; see below). (If one of the other parameters is instead assumed known, similar estimation procedures can be derived for U and the remaining parameters. The average dominance coefficient can be estimated by methods such as that of DENG 1998A.) Solving these equations jointly yields estimators of the remaining parameters as

(8)

where

(9)

From these estimates, other composite parameters of DGM, such as the mean number of mutations per genome, the mutational variance Vm per generation, and the mean mutation effect on fitness, can be derived (DENG and LYNCH 1996). The covariance of h and s for mutations, cov(h, s), can be approximated, or at least an upper bound can be estimated, as

(10)

This is because, for any distribution of positive values, the arithmetic mean is no smaller than the harmonic mean. We refer to the quantity estimated in this way as the upper bound of cov(h, s). This offers us the first opportunity to quantify the magnitude and the sign of cov(h, s). It would be impossible to derive analytical estimators for DGM parameters such as cov(h, s) if variable mutation effects were not considered in the analytical derivation, simply because such parameters would be zero and meaningless in an estimation developed under constant mutation effects.
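The upper-bound argument can be checked numerically with a short sketch. It assumes that the bound has the form mean(hs) minus the harmonic mean of h times the mean of s, that is, the usual covariance identity with the arithmetic mean of h replaced by the harmonic mean. That form, the function name, and the sampled distribution of s and h are assumptions for illustration only.

```python
import math
import random

def cov_and_upper_bound(s_vals, h_vals):
    """Return (cov(h, s), assumed upper bound) from paired samples of s and h."""
    n = len(s_vals)
    s_bar = sum(s_vals) / n
    h_bar = sum(h_vals) / n                       # arithmetic mean of h
    hs_bar = sum(h * s for h, s in zip(h_vals, s_vals)) / n
    h_harm = n / sum(1.0 / h for h in h_vals)     # harmonic mean of h
    cov = hs_bar - h_bar * s_bar
    upper = hs_bar - h_harm * s_bar               # larger, since h_harm <= h_bar
    return cov, upper

# Hypothetical mutation effects: exponential s, with h decreasing in s.
rng = random.Random(0)
s_vals = [rng.expovariate(50.0) for _ in range(5000)]
h_vals = [0.5 * math.exp(-13.0 * s) for s in s_vals]
cov, upper = cov_and_upper_bound(s_vals, h_vals)
```

Because the harmonic mean never exceeds the arithmetic mean and the mean of s is nonnegative, `upper` is at least `cov` for any sample, although the two can differ in sign.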

Selfing populations:
Random pairs of highly selfing and homozygous parental genotypes (denoted as the P generation) are crossed to obtain outcrossed progeny (denoted as the F1 generation). Let $\bar{W}_p$ and $\sigma_p^2$ be the mean fitness and genetic variance of fitness in the P generation, respectively, $\bar{W}_{F_1}$ and $\sigma_{F_1}^2$ be the mean fitness and genetic variance of fitness in the F1 generation, respectively, and cov($\bar{w}_p$, $w_{F_1}$) be the covariance between the mean fitness of the two parents and the fitness of their F1 progeny. Under variable mutation effects across loci, the fitness moments are related to the DGM parameters as follows:

(11)


(12)


(13)


(14)


(15)

It should be noted that the derivation of Equations 2-7 and Equations 11-15 assumes mutation effects that are variable. The strategy is to divide the range of the variable selection coefficient s (from zero to one) into infinitely small intervals so that s can be treated as constant within each of the intervals but varying across intervals in our analytical derivation. Again, there are six unknowns (including U and Wmax) in the above five equations. By assuming or estimating one of the six parameters, estimators of the other five parameters can be derived. Here, as earlier for outcrossing populations, we assume for illustration that U is known in the estimation. Alternatively, an initial value of U may be estimated from other approaches (DENG et al. 1999) or may be estimated with the Deng-Lynch method from the same data and experimental design as the current estimation method (DENG and LYNCH 1996). Solving these equations jointly yields estimators of the remaining parameters,

(16)

where

(17)

In selfing populations, we can use Equation 10 to estimate cov(h, s) from the above estimates, which are unbiased under variable mutation effects with a known correct U. Two of these estimators, when U is assumed known, are the same as those in DENG and LYNCH 1996 for selfing populations.

The estimation developed herein does not assume any specific functional relationship between s and h or any specific form for the distribution of the selection coefficient s. Therefore, the estimates are robust to different unknown forms of the distribution of s and of the functional relationship between s and h. This holds even though we assume specific distributions of s and a specific functional relationship between s and h in the following simulation studies to investigate the statistical properties of our estimation.


*  SIMULATIONS AND RESULTS

As with KEIGHTLEY 1994, we assume that s for mutations follows a gamma distribution, with a density function

$$f(s) = \frac{\alpha^{\beta} s^{\beta - 1} e^{-\alpha s}}{\Gamma(\beta)},$$

where α and β are the scale and shape parameters, respectively, so that $\bar{s} = \beta/\alpha$. As in DENG and LYNCH 1996, we let $h = \exp(-As)/2$, where A = 13, which is in rough accordance with the few available data (GREGORY 1965; MACKAY et al. 1992; DENG and LYNCH 1996; DENG and FU 1998). With these assumptions, the DGM parameters, including cov(h, s), can be derived as

These parameter values serve as the true values against which the estimates from our estimation methods are compared in simulations.
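For readers who want to reproduce such true values, the sketch below evaluates closed-form moments under stated assumptions: s is gamma distributed with shape β and rate α (so its mean is β/α), h = exp(-A s)/2 with A = 13, and the truncation of s at 1 is ignored. The function name and the returned keys are hypothetical, and the expressions are standard gamma integrals rather than formulas quoted from the text.

```python
def dgm_true_values(alpha, beta, A=13.0):
    """Closed-form moments of s and h under the assumed gamma model.

    Assumes s ~ gamma(shape=beta, rate=alpha) and h = exp(-A * s) / 2,
    with no truncation of s at 1 (a negligible effect when alpha is large).
    """
    if alpha <= A:
        raise ValueError("the harmonic mean of h requires alpha > A in this model")
    s_bar = beta / alpha
    h_bar = 0.5 * (alpha / (alpha + A)) ** beta              # E[h]
    hs_bar = 0.5 * beta * alpha ** beta / (alpha + A) ** (beta + 1)  # E[h s]
    h_harm = 0.5 * ((alpha - A) / alpha) ** beta             # 1 / E[1/h]
    return {
        "s_bar": s_bar,
        "h_bar": h_bar,
        "h_harm": h_harm,
        "hs_bar": hs_bar,
        "cov_hs": hs_bar - h_bar * s_bar,
    }

vals = dgm_true_values(alpha=100.0, beta=1.0)
```

Under this model h decreases with s, so `cov_hs` is negative for every admissible α and β.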

The simulation procedures are the same as those that have been documented extensively earlier (DENG and LYNCH 1996; DENG 1998B) and are thus not elaborated here. In simulations, we assume that the fitnesses of various genotypes can be measured with little error, which is justifiable in the investigation of estimation bias and comparison of various estimation methods (DENG et al. 1999). Under the assumptions for the analytical development of our estimation methods, the number of mutant alleles corresponding to an interval Ik per individual follows the Poisson distributions (Equation 1a and Equation 1b) with pk being determined as

It can be shown that

and when β = 0.5,

where Erf(x) is the error function, which can be approximated as

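$$\mathrm{Erf}(x) \approx 1 - \frac{1}{\left(1 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5 + a_6 x^6\right)^{16}} \qquad (18)$$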

(GAO 1995), where a1 = 0.0705230784, a2 = 0.0422820123, a3 = 0.0092705272, a4 = 0.0001520143, a5 = 0.0002765672, a6 = 0.0000430638.
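The six coefficients listed above match a classical rational approximation of the error function, erf(x) ≈ 1 - 1/(1 + a1 x + ... + a6 x^6)^16 for x ≥ 0. The sketch below implements that form and, as an assumed usage, builds interval probabilities for β = 0.5 from the fact that a gamma distribution with shape 1/2 and rate α has cumulative distribution function erf(sqrt(α s)). The function names and the choices α = 100 and T = 50 are hypothetical.

```python
import math

A_COEFFS = (0.0705230784, 0.0422820123, 0.0092705272,
            0.0001520143, 0.0002765672, 0.0000430638)

def erf_approx(x):
    """Rational approximation erf(x) ~ 1 - 1/(1 + a1*x + ... + a6*x**6)**16."""
    sign = 1.0 if x >= 0 else -1.0   # extend to x < 0 by odd symmetry
    x = abs(x)
    poly, power = 1.0, 1.0
    for a in A_COEFFS:
        power *= x
        poly += a * power
    return sign * (1.0 - 1.0 / poly ** 16)

def gamma_half_cdf(s, alpha):
    """CDF of a gamma distribution with shape 1/2 and rate alpha, via erf_approx."""
    return erf_approx(math.sqrt(alpha * s))

# Assumed usage: interval probabilities on [0, 1] for beta = 0.5.
alpha, T = 100.0, 50
p_half = [gamma_half_cdf((k + 1) / T, alpha) - gamma_half_cdf(k / T, alpha)
          for k in range(T)]
```

Comparing `erf_approx` with `math.erf` on a grid is a quick way to see how much approximation error this step could contribute to the β = 0.5 results.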

To evaluate the performance of our estimation in outcrossing populations in simulations, for each set of parameters U, α, and β, K parents were sampled from the parental generation, and from each of these, M selfed progeny were produced. The fitness of an individual from the parental generation is

where nk is the number of mutation-bearing loci with their effects falling into the interval Ik in an individual, obtained by random sampling from the Poisson distribution defined above. The fitness of each selfed offspring was obtained by allowing the nk heterozygous loci of a parent to segregate randomly into the AA, Aa, and aa genotypes with respective probabilities of 1/4, 1/2, and 1/4. Letting n1k and n2k (k = 1, ... , T) be the numbers of heterozygous and homozygous loci containing mutations with effects falling into the interval Ik in a selfed offspring, the fitness of the selfed progeny is

Unless otherwise specified, for each set of parameters (U, α, β, K, M), we performed 1000 simulations. We let Wmax = 1 throughout, as the value of Wmax does not influence DGM parameter estimation.
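The outcrossing simulation described above can be sketched as follows. The multiplicative fitness forms (a factor 1 - hk sk per heterozygous locus and 1 - sk per homozygous mutant locus), the Poisson expectation Uk/(hk sk), the exponential distribution of s, the relation h = exp(-13 s)/2, and all parameter values are assumptions chosen for illustration. This is not the authors' simulation code.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's multiplication method; adequate for the moderate means used here."""
    limit, count, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def simulate_family(U_k, s_k, h_k, M, rng, w_max=1.0):
    """Return (parent fitness, mean fitness of M selfed progeny) for one family."""
    # Heterozygous mutation counts in the parent, one Poisson draw per interval.
    n = [poisson(u / (h * s), rng) for u, s, h in zip(U_k, s_k, h_k)]
    w_parent = w_max
    for nk, s, h in zip(n, s_k, h_k):
        w_parent *= (1.0 - h * s) ** nk
    progeny = []
    for _ in range(M):
        w = w_max
        for nk, s, h in zip(n, s_k, h_k):
            for _locus in range(nk):
                r = rng.random()
                if r < 0.25:
                    w *= 1.0 - s        # aa: homozygous for the mutation
                elif r < 0.75:
                    w *= 1.0 - h * s    # Aa: remains heterozygous
                # otherwise AA: the mutation is lost, no fitness cost
        progeny.append(w)
    return w_parent, sum(progeny) / M

# Hypothetical parameter set: exponential s (beta = 1) with rate 20, U = 0.5.
U, rate, T, K, M = 0.5, 20.0, 20, 500, 5
cdf = lambda s: 1.0 - math.exp(-rate * s)
s_k = [(k + 0.5) / T for k in range(T)]
U_k = [U * (cdf((k + 1) / T) - cdf(k / T)) for k in range(T)]
h_k = [0.5 * math.exp(-13.0 * s) for s in s_k]

rng = random.Random(2002)
families = [simulate_family(U_k, s_k, h_k, M, rng) for _ in range(K)]
mean_parent = sum(wp for wp, _ in families) / K
mean_progeny = sum(ws for _, ws in families) / K
```

Under these assumptions the expected parental fitness is close to exp(-U), which gives a quick sanity check on the sampler before any estimator is applied to the simulated moments.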

For selfing populations, the fitness of an individual from the parental generation is

where nk is the number of mutation-bearing loci with mutation effects falling into the interval Ik in an individual, and it is obtained by random sampling from the Poisson distribution defined earlier. Each parent mates with another random parent (not in the original set of K) to produce a total of K progeny (one per family) with fitness

where n1k and n2k (k = 1, ... , T) are the numbers of homozygous mutant loci in interval Ik in the two parents, respectively.

In the estimation by Equation 8 or Equation 16, U must first be known, assumed, or estimated with other approaches. In simulations, we examined two methods to estimate U: (1) the Deng-Lynch method (DENG and LYNCH 1996) and (2) an empirical regression procedure introduced here. We simulated parents and their progeny according to variable effects for each set of given parameter values of U, α, and β, and obtained the estimates Û1, ŝ1, and ĥ1 by the Deng-Lynch method (DENG and LYNCH 1996). (A circumflex indicates an estimated value throughout.) We found a strong linear relationship between the parameter values of U and the estimates Û1 and ŝ1 under any fixed β. Through a series of simulations, we obtained samples under various parameter values of U and α and fixed β-values, and we obtained estimates Û1 and ŝ1 with the Deng-Lynch method under each fixed β-value. Then we fit a multiple regression model under each specific β-value,

(19)

where Û estimates U with little bias when β is correctly assumed, as shown by our simulation results not presented here. The empirical estimation is useful only when the shape parameter β can be estimated using other methods and experimental data (e.g., KEIGHTLEY 1994).
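The regression step can be illustrated with a small least-squares sketch. It assumes a linear model of the form U = b0 + b1·Û1 + b2·ŝ1 fitted separately for each β. That model form, the function name `fit_linear`, and the synthetic calibration data below are assumptions for illustration, not the fitted coefficients of Equation 19.

```python
def fit_linear(X, y):
    """Ordinary least squares for y ~ b0 + b1*x1 + b2*x2 via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    # Augmented normal-equation matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] +
         [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Synthetic, noise-free calibration data (hypothetical values only):
# each pair is (Deng-Lynch estimate of U, Deng-Lynch estimate of s).
true_b = (0.1, 1.2, 3.0)
X = [(u1, s1) for u1 in (0.4, 0.8, 1.2, 1.6) for s1 in (0.01, 0.03, 0.05)]
y = [true_b[0] + true_b[1] * u1 + true_b[2] * s1 for u1, s1 in X]
b = fit_linear(X, y)

def predict_U(u1_hat, s1_hat, coeffs=b):
    """Corrected estimate of U from the Deng-Lynch estimates, under the fitted model."""
    return coeffs[0] + coeffs[1] * u1_hat + coeffs[2] * s1_hat
```

In practice the calibration pairs would come from simulations run at known U for a given β, so the fitted coefficients are only as good as the assumed β.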

The simulation results are presented in Tables 1-4. The ranges of the values for the parameters (such as U, $\bar{s}$, and $\bar{h}$) generally cover those reported earlier from classical empirical experiments (e.g., MUKAI et al. 1972; LYNCH et al. 1999). Three general conclusions emerge from our simulation studies under variable mutation effects. First, when U is set equal to the true values or when the estimates of U are obtained via Equation 19 by assuming a correct β-value, application of Equation 8 (outcrossing populations) or Equation 16 (selfing populations) yields nearly unbiased estimates of the DGM parameters with small standard deviation. The estimates of U by Equation 8 have smaller mean square error despite larger standard deviation when U is set equal to the estimates obtained by regression Equation 19 than those obtained by the Deng-Lynch method. The larger standard deviation may be partly due to the fact that Equation 19 is established by empirical regression procedures that involve an additional level of sampling error for the final estimation. The estimates of by Equation 8 in outcrossing populations have smaller sampling variance and smaller bias than those obtained directly by the Deng-Lynch method, e.g., by comparison of the estimates in rows 1 and 3 for each parameter set in Table 1. This is true even when no prior assumption is made about the magnitude of U, i.e., when U is first estimated directly with the Deng-Lynch method and the estimate of U is then used in the current estimation method (Equation 8) for the other DGM parameters. The estimates of by Equation 8 have smaller or comparable sampling variance than those obtained directly by the Deng-Lynch method for (for each parameter set, compare the estimates of the second to fourth rows with that of the first row in Table 1).
The comparison of the estimation quality between the current estimation method and the Deng-Lynch method changes little with the parameter values (Table 1). When β = 0.5, the bias of the estimates of the parameters is larger than that when β = 1 and 2. This may be due to the approximation Equation 18 used to compute pk when β = 0.5, while the computation of pk when β = 1 and 2 is exact.


 

 
Table 1. Parameter estimates under variable mutational effects in outcrossing populations (β = 1.0)


 

 
Table 2. Parameter estimates under variable mutational effects in selfing populations (β = 1.0)


 

 
Table 3. Parameter estimates under variable mutational effects in outcrossing populations


 

 
Table 4. Parameter estimates with variable mutational effects in selfing populations

Second, when U is set equal to the estimates (Û1) that were obtained by the Deng-Lynch method (DENG and LYNCH 1996) and that are downwardly biased, the estimates of the other DGM parameters by Equation 8 and Equation 16 are biased with small sampling variance (Table 1 and Table 2). For outcrossing populations, the estimation Equation 8 yields less biased estimates with smaller standard deviation for than the Deng-Lynch method (Table 1), and the estimates of , , cov(h, s) are upwardly biased and estimates of are downwardly biased. The result can be understood from Equation 8, since Û1 is downwardly biased as estimated by the Deng-Lynch method. In selfing populations, Equation 16 yields the same estimates for and as those obtained by the Deng-Lynch method (Table 2), which is expected as pointed out earlier. The estimates of , , and cov(h, s) are upwardly biased and estimates of are downwardly biased because Û1 is downwardly biased, which can be understood from Equation 16.

Third, in outcrossing populations, the quantity given by Equation 10 is correctly estimated to be an upper bound of cov(h, s); however, its estimated sign can sometimes differ from that of cov(h, s). In selfing populations, cov(h, s) can always be estimated with correct sign and small estimation bias.


*  ROBUSTNESS ANALYSIS

In the estimation of the DGM parameters, we need a prior estimate of one of the six parameters (such as U as investigated here) based on some external knowledge obtained from other estimation approaches. The estimation bias of this parameter or the bias of an assumed value will cause estimation bias of the other parameters. Hence, we investigate the sensitivity of the estimators to departures of U from its true value, using computer simulations (Figures 1 and 2). We define a relative bias rate (RBR), (estimate - true value)/(true value), to measure the sensitivity of the estimators to an incorrectly assumed or estimated U value. In examining the robustness of the estimator for cov(h, s), the true value used is the parameter value of the upper bound of cov(h, s) as defined after Equation 10 and not cov(h, s) itself.
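The relative bias rate defined above, and its mean over replicate simulations, can be written as two small functions. The function names and the numbers in the usage lines are hypothetical.

```python
def relative_bias_rate(estimate, true_value):
    """RBR = (estimate - true value) / (true value)."""
    return (estimate - true_value) / true_value

def mean_relative_bias_rate(estimates, true_value):
    """Mean RBR (MRBR) over a set of replicate estimates."""
    return sum(relative_bias_rate(e, true_value) for e in estimates) / len(estimates)

# Hypothetical replicate estimates of a parameter whose true value is 0.010.
replicates = [0.011, 0.009, 0.012]
mrbr = mean_relative_bias_rate(replicates, 0.010)
```

Because the RBR divides by the true value, it becomes unstable when that value is close to zero, which is worth keeping in mind when a parameter such as a covariance changes sign across parameter sets.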



 
Figure 1. The changes in RBR of the estimates of the DGM parameters, including cov(h, s), obtained by Equation 8 in outcrossing populations when U was set to values ranging from 0.5U0 to 1.5U0. Each data point is the mean of 1000 simulations with β = 1.0 and the following sets of parameters: (a) U0 = 1.5, $\bar{s}$ = 0.01, and α = 100; (b) U0 = 1.5, $\bar{s}$ = 0.05, and α = 20; (c) U0 = 0.5, $\bar{s}$ = 0.01, and α = 100; and (d) U0 = 0.5, $\bar{s}$ = 0.05, and α = 20.



 
Figure 2. The changes in MRBR of the estimates of the DGM parameters, including cov(h, s), obtained by Equation 16 in selfing populations when U was set to values ranging from 0.5U0 to 1.5U0. Each data point is the mean of 1000 simulations with β = 1.0 and the following sets of parameters: (a) U0 = 1.5, $\bar{s}$ = 0.01, and α = 100; (b) U0 = 1.5, $\bar{s}$ = 0.05, and α = 20; (c) U0 = 0.5, $\bar{s}$ = 0.01, and α = 100; and (d) U0 = 0.5, $\bar{s}$ = 0.05, and α = 20.

In simulations for the investigation of the robustness of our current estimation of the other DGM parameters, U is set equal to a given value (denoted as Ugiven), which ranges from 0.5U0 to 1.5U0 (U0 is the true value of U). This range is reasonable given the magnitude of bias that is normally found with methods such as that of DENG and LYNCH 1996. The changes in the mean relative bias rates (MRBR) of the estimates of the parameter values in 1000 simulations are shown in Figures 1 and 2. When Ugiven ranged from 0.7U0 to 1.5U0 (i.e., the departure of Ugiven from U0 ranged from -0.3U0 to 0.5U0), the MRBR of the estimates of the parameter values changed smoothly and changed little in both outcrossing and selfing populations. When Ugiven ranged from 0.9U0 to 1.2U0, the absolute values of the MRBR of the estimates of parameters [except cov(h, s) for outcrossing populations when α = 20] are <0.185 in both outcrossing and selfing populations. For outcrossing populations, when α = 20, if Ugiven <= 0.9U0 or Ugiven >= 1.1U0, the absolute values of the MRBR of cov(h, s) are >1.0 (Figure 1, B and D). (Note that the scale of the y-axis in Figure 1, B and D, differs from that of the other plots in Figures 1 and 2.) Thus, even when U is estimated with some bias, if the magnitude is similar to that obtained by methods such as that of DENG and LYNCH 1996, our current estimation method can generally still yield relatively robust estimates of DGM parameters [except cov(h, s) for outcrossing populations when α is as small as 20]. In outcrossing populations, the MRBR changed the sign in the robustness investigation of cov(h, s) when , respectively. This is because the parameter value cov(h, s) changed the sign from negative to zero and then to positive values under the functions assumed when $\bar{s}$ changes from 0.047 to 0.048.
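The sign change between mean selection coefficients of 0.047 and 0.048 can be examined with a short calculation. The sketch assumes β = 1 (exponentially distributed s with rate α), h = exp(-13 s)/2, no truncation of s at 1, and an upper bound of the form mean(hs) minus the harmonic mean of h times the mean of s. These are assumptions adopted for illustration, and the function names are hypothetical.

```python
def upper_bound_cov(alpha, A=13.0):
    """Assumed upper bound of cov(h, s) for beta = 1 and h = exp(-A*s)/2 (alpha > A)."""
    hs_bar = 0.5 * alpha / (alpha + A) ** 2   # E[h s]
    h_harm = 0.5 * (alpha - A) / alpha        # harmonic mean of h
    s_bar = 1.0 / alpha                       # mean of s
    return hs_bar - h_harm * s_bar

def sign_change_alpha(lo=14.0, hi=100.0, tol=1e-10):
    """Bisection for the alpha at which upper_bound_cov changes sign on [lo, hi]."""
    f_lo = upper_bound_cov(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = upper_bound_cov(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)

alpha_star = sign_change_alpha()
s_bar_star = 1.0 / alpha_star   # mean selection coefficient at the sign change
```

Under these assumptions the bound is negative at α = 100 and positive at α = 20, and the crossover falls at a mean selection coefficient of roughly 0.0475, in line with the range quoted in the text.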


*  DISCUSSION

In this study, we have developed an estimation method that considers variable mutation effects across loci. As our simulations show, the method may yield improved estimation over that of DENG and LYNCH 1996 by employing information (such as the covariance between mean fitness of parents and that of their progeny) that is additional to and independent of that employed in DENG and LYNCH 1996, although the experimental design is the same. Importantly, cov(h, s) for DGM can be estimated (Equation 10) from an experiment for the first time. Previously, a negative correlation between h and s had been conjectured only from theory (KACSER and BURNS 1981) and from limited data (SIMMONS and CROW 1977; CROW and SIMMONS 1983). There has been no formal statistical analysis or experimental design to characterize cov(h, s).

Characterization of cov(h, s) is important, for example, for testing the validity of the dominance hypothesis (TURELLI and ORR 1995) as an explanation of Haldane's rule. Haldane's rule states that when one sex is inviable or sterile in the hybrids of two different animal races, that sex is often the heterogametic sex. The dominance hypothesis (TURELLI and ORR 1995) states that alleles decreasing hybrid fitness are partially recessive. For the dominance hypothesis to explain Haldane's rule, it is necessary that cov(h, s) is <0. Hence, our estimation method may offer the first opportunity to test the validity of the dominance hypothesis in explaining Haldane's rule by characterizing the sign of cov(h, s). Although it would be valuable to have estimators for other DGM parameters as well, such as the variance of s, the observable phenotypic moments of fitness do not relate to these other parameters (including the variance of s) in our analytical derivation that considers variable mutation effects in Equations 2-7 and Equations 11-15.

In the estimation of the DGM parameters, we need a prior estimate of one of the six parameters, based on some external knowledge, on estimates obtained from alternative approaches, or on the same experimental design by using the Deng-Lynch method as demonstrated here. We provided the estimators of the other DGM parameters by using Equation 8 and Equation 16 when assuming that U is known or estimated via other approaches. If one of the other parameters is instead known or estimated from other approaches, estimators of the remaining DGM parameters can be obtained. Some of these parameters can be estimated individually with the analysis methods already developed (MUKAI et al. 1972; DENG 1998A) or with the Deng-Lynch method. We present in the Appendix the estimators of the other DGM parameters when the mean dominance coefficient is assumed or estimated, together with some representative simulation results.

It can be seen from Equation 1a and Equation 1b that the mean of h for the Charlesworth technique (CHARLESWORTH et al. 1990) in estimating U in selfing populations should be the arithmetic mean, and the mean for the Morton technique (MORTON et al. 1956) in outcrossing populations should be the harmonic mean. This has seldom, if ever, been pointed out because the Morton-Charlesworth technique was derived under constant mutation effects. To our knowledge, there has been no method for estimating either mean. Our proposed estimation methods allow, again for the first time, estimates of both with relatively small bias under variable mutation effects.

The majority of earlier estimation methods for DGM assume constant mutation effects. The only exception is the maximum-likelihood estimation developed for analyses of mutation-accumulation experiments (KEIGHTLEY 1994, KEIGHTLEY 1996). Like our current estimation method, Keightley's maximum-likelihood estimation also needs to assume a parameter value of DGM to estimate the other DGM parameters in his model. Our results (DENG and LI 2001) suggest that a method that accounts for variable mutation effects does not necessarily always yield better estimation than a method that assumes constant mutation effects, even under variable mutation effects. In our current estimation, the covariance between mean fitness of parents and that of their progeny is independent of the other measurable experimental data (such as the means and genetic variance of fitness of the two generations across inbreeding/outcrossing) that are used in the Deng-Lynch estimation (DENG and LYNCH 1996). This additional and independent information contributes to the improved quality of estimation of our current method and to our ability to estimate additional DGM parameters that could not be estimated earlier.

For our methods that are applicable to natural outcrossing populations and self-fertilizing populations, M-S balance is assumed to be the mechanism maintaining variation for fitness. Alternatives to M-S balance, such as functional overdominance or overdominance induced by fluctuating selection, may, in principle, maintain polymorphisms. Most evidence points to dominance, rather than overdominance, in heterozygous mutation effects and is thus compatible with M-S balance (HOULE 1989, HOULE 1994; HOULE et al. 1996; DENG et al. 1998). However, mechanisms responsible for the maintenance of genetic variance are complex and may differ among populations. If any other mechanism, such as balancing selection or migration, leads to the maintenance of genetic variation (DRAKE et al. 1998; KEIGHTLEY 1998), our methods may result in biased estimation. Using the approaches (LI et al. 1999; LI and DENG 2000; H.-W. DENG and J. LI, unpublished results) that we have used to investigate the robustness of the Deng-Lynch method to violation of the M-S balance assumption, we will investigate in future studies how robust the current method is to different degrees of violation of the M-S balance assumption.


*  ACKNOWLEDGMENTS

We are grateful to anonymous reviewers for their constructive comments that greatly helped to improve the manuscript. This work is partially supported by grant R01 GM60402-01A1 from National Institutes of Health (NIH). H.W. Deng was also partially supported by grants from Health Future Foundation, NIH K01 grant AR02170-01, NIH R01 grant AR45349-01, grants from State of Nebraska Cancer and Smoking Related Disease Research Program (LB598), Nebraska Tobacco Settlement Fund (LB692), NIH grant P01 DC01813-07, U.S. Department of Energy grant DE-FG03-00ER63000/A00, grants (30025025 and 30170504) from National Science Foundation of China, and grants from HuNan Normal University and the Ministry of Education of China.

Manuscript received April 26, 2002; Accepted for publication August 26, 2002.


*  APPENDIX

ESTIMATION OF OTHER DGM PARAMETERS WHEN IS ASSUMED OR ESTIMATED AND SOME REPRESENTATIVE SIMULATION RESULTS
If (in outcrossing populations) or (in selfing populations) is known by other estimation methods or assumed at particular values on the basis of some external knowledge, based on Equations 2 through 7 and Equations 11 through 15, we have estimators for other DGM parameters as follows, the notations being the same as in the text, in outcrossing populations,

(A1)

and in selfing populations,

(A2)

Simulations are performed as described in the text, using the above estimators for the other DGM parameters when (in outcrossing populations) or (in selfing populations) is known or estimated. The simulation and experimental procedures used when (in outcrossing populations) and (in selfing populations) are estimated by the methods of DENG (1998a) or MUKAI et al. (1972) are detailed in DENG et al. (1998) and thus are not elaborated here.

Some representative results are presented in Table A1 and Table A2. It can be seen that, relative to the Deng-Lynch method, the new method developed here can estimate more parameters, such as cov(h, s) and its sign. In an outcrossing population, the sign of cov(h, s) cannot be reliably estimated. However, in selfing populations, if the is estimated first by the Deng-Lynch method and then used in the current method, the sign of cov(h, s) can be characterized correctly.
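To make the quantity cov(h, s) and its sign concrete, the following is a minimal sketch that draws variable mutation effects and computes their sample covariance. The distributional choices are assumptions made for illustration only and are not necessarily those used in the simulations reported here: s is drawn from an exponential distribution with a hypothetical mean `mean_s`, and h is drawn uniformly between 0 and exp(-k s) with a hypothetical constant `k`, so that mutations of larger effect tend to be more recessive. The function names are likewise illustrative.

```python
import math
import random


def simulate_mutation_effects(n_mutations, mean_s=0.05, k=13.0, seed=0):
    """Draw (h, s) pairs under assumed, illustrative distributions.

    s ~ exponential with mean `mean_s` (assumption of this sketch).
    h ~ uniform on [0, exp(-k * s)] (assumption of this sketch).
    """
    rng = random.Random(seed)
    effects = []
    for _ in range(n_mutations):
        s = rng.expovariate(1.0 / mean_s)        # selection coefficient
        h = rng.uniform(0.0, math.exp(-k * s))   # dominance coefficient
        effects.append((h, s))
    return effects


def sample_covariance(pairs):
    """Unbiased sample covariance of the (h, s) pairs."""
    n = len(pairs)
    mean_h = sum(h for h, _ in pairs) / n
    mean_s = sum(s for _, s in pairs) / n
    return sum((h - mean_h) * (s - mean_s) for h, s in pairs) / (n - 1)


if __name__ == "__main__":
    effects = simulate_mutation_effects(20000)
    print("sample cov(h, s):", sample_covariance(effects))
```

Under these assumed distributions the expected value of h declines as s increases, so the covariance is negative. This is the covariance of the simulated effects themselves; it is not an estimate obtained from fitness data by the estimators above.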


 

 
Table A1. Parameter estimates under variable mutational effects in outcrossing populations when is known or estimated (β = 1.0)


 

 
Table A2. Parameter estimates under variable mutational effects in selfing populations when is known or estimated (β = 1.0)


*  LITERATURE CITED

BATAILLON, T., 2000  Estimation of spontaneous genome-wide mutation rate parameters: Whither beneficial mutation? Heredity 84:497-501.

CHARLESWORTH, B., D. CHARLESWORTH, and M. T. MORGAN, 1990  Genetic loads and estimates of mutation rates in highly inbred plant populations. Nature 347:380-382.

CRADDOCK, N., V. KHODEL, P. V. EERDEWEGH, and T. REICH, 1995  Mathematical limits of multilocus models: the genetic transmission of bipolar disorder. Am. J. Hum. Genet. 57:690-702.

CROW, J. F., 1986 Basic Concepts in Population, Quantitative and Evolutionary Genetics. W. H. Freeman, New York.

CROW, J., 1993a  How much do we know about spontaneous human mutation rates? Environ. Mol. Mutagen. 21:122-129.

CROW, J. F., 1993b Mutation, mean fitness, and genetic load, pp. 3–42 in Oxford Surveys in Evolutionary Biology, Vol. 9, edited by D. J. FUTUYMA and J. ANTONOVICS. Oxford University Press, Oxford, New York.

CROW, J., 1995  Spontaneous mutations as risk factors. Exp. Clin. Immunogenet. 12:121-128.

CROW, J. F., and M. J. SIMMONS, 1983 The mutation load in Drosophila, pp. 1–35 in The Genetics and Biology of Drosophila, Vol. 3c, edited by M. ASHBURNER, H. L. CARSON and J. N. THOMPSON. Academic Press, London/New York.

DENG, H.-W., 1998a  Estimating (over)dominance coefficient and discriminating dominance vs. overdominance as the genetic cause of heterosis. Genetics 148:2003-2014.

DENG, H.-W., 1998b  Characterization of deleterious mutation rate and properties in outcrossing populations. Genetics 150:945-956.

DENG, H.-W. and Y.-X. FU, 1998  On the three different methods for estimating deleterious genomic mutation parameters. Genet. Res. 71:223-236.

DENG, H.-W. and J. LI, 2001  Comparison of two estimation methods for mutation accumulation experiments: maximum likelihood and method of moments. Life Sci. Res. 5:189-201.

DENG, H.-W. and M. LYNCH, 1996  Estimation of the genomic mutation parameters in natural populations. Genetics 144:349-360.

DENG, H.-W. and M. LYNCH, 1997  Inbreeding depression and inferred deleterious mutation parameters in Daphnia. Genetics 147:147-155.

DENG, H.-W., Y.-X. FU, and M. LYNCH, 1998  Inferring the major genomic mode of dominance and overdominance. Genetica 102/103:559-567.

DENG, H.-W., J. LI, and J.-L. LI, 1999  On the experimental designs and data analyses of mutation accumulation experiments. Genet. Res. 73:147-164.

DRAKE, J. W., B. CHARLESWORTH, D. CHARLESWORTH, and J. F. CROW, 1998  Rates of spontaneous mutation. Genetics 148:1667-1686.

FU, Y.-B. and K. RITLAND, 1996  Marker-based inference about epistasis for genes influencing inbreeding depression. Genetics 144:339-348.

GAO, H.-X., 1995 Statistical Computation. Peking University Press, Beijing.

GREGORY, W. C., 1965  Mutation frequency, magnitude of change and the probability of improvement in adaptation. Radiat. Bot. 5(Suppl.):429-441.

HOULE, D., 1989  Allozyme-associated heterosis in Drosophila melanogaster. Genetics 123:789-801.

HOULE, D., 1994  Adaptive distance and the genetic basis of heterosis. Evolution 48:1410-1417.

HOULE, D., B. MORIKAWA, and M. LYNCH, 1996  Comparing mutational variabilities. Genetics 143:1467-1483.

KACSER, H. and J. A. BURNS, 1981  The molecular basis of dominance. Genetics 97:639-666.

KEIGHTLEY, P. D., 1994  The distribution of mutation effects on viability in Drosophila melanogaster. Genetics 138:1315-1322.

KEIGHTLEY, P. D., 1996  Nature of deleterious mutation load in Drosophila. Genetics 144:1993-1999.

KEIGHTLEY, P. D., 1998  Inference of genome-wide mutation rates and distributions of mutation effects for fitness traits: a simulation study. Genetics 150:1283-1293.

KONDRASHOV, A. S., 1988  Deleterious mutations and the evolution of sexual reproduction. Nature 336:435-440.

LI, J. and H.-W. DENG, 2000  Estimation of parameters of deleterious mutations in partial selfing or partial outcrossing populations and in nonequilibrium populations. Genetics 154:1893-1906.

LI, J.-L., J. LI, and H.-W. DENG, 1999  The effects of overdominance on characterizing deleterious genomic mutations in natural populations. Genetics 151:895-913.

LYNCH, M., J. BLANCHARD, T. KIBOTA, S. SCHULTZ, and L. VASSILIEVA et al., 1999  Perspective: spontaneous deleterious mutation. Evolution 53:645-663.

MACKAY, T. F. C., R. F. LYMAN, and M. S. JACKSON, 1992  Effects of P element insertions on quantitative traits in Drosophila melanogaster. Genetics 130:315-332.

MORTON, N. E., J. F. CROW, and H. J. MULLER, 1956  An estimate of the mutational damage in man from data on consanguineous marriages. Proc. Natl. Acad. Sci. USA 42:855-863.

MUKAI, T., S. I. CHIGUSA, L. E. METTLER, and J. F. CROW, 1972  Mutation rate and dominance of genes affecting viability in Drosophila melanogaster. Genetics 72:335-355.

SIMMONS, M. J. and J. F. CROW, 1977  Mutations affecting fitness in Drosophila populations. Annu. Rev. Genet. 11:49-78.

TURELLI, M. and A. ORR, 1995  The dominance theory of Haldane's rule. Genetics 140:389-402.



