Genetics, Vol. 150, 945-956, October 1998, Copyright © 1998

Characterization of Deleterious Mutations in Outcrossing Populations

Hong-Wen Deng
Osteoporosis Research Center and Department of Biological Sciences, Creighton University, Omaha, Nebraska 68131

Corresponding author: Hong-Wen Deng, Osteoporosis Research Center, Creighton University, 601 N. 30th St., Suite 6787, Omaha, NE 68131. E-mail: deng@creighton.edu

Communicating editor: M. SLATKIN


*  ABSTRACT

Deng and Lynch recently proposed estimating the rate and effects of deleterious genomic mutations from changes in the mean and genetic variance of fitness upon selfing/outcrossing in outcrossing/highly selfing populations. The utility of our original estimation approach is limited in outcrossing populations, since selfing may not always be feasible. Here we extend the approach to any form of inbreeding in outcrossing populations. By simulations, the statistical properties of the estimation under a common form of inbreeding (sib mating) are investigated under a range of biologically plausible situations. The efficiencies of different degrees of inbreeding and of two different experimental designs of estimation are also investigated. We found that estimation using the total genetic variation in the inbred generation is generally more efficient than estimation using the genetic variation among the means of inbred families, and that a higher degree of inbreeding in experiments yields higher power for estimation. The simulation results on the magnitude and direction of estimation bias under variable or epistatic mutation effects may provide a basis for accurate inferences of deleterious mutations. Simulations accounting for environmental variance of fitness suggest that, under full-sib mating, our extension can achieve reasonably good estimation with sample sizes of only ~2000–3000.


THE genome of any organism is subject to continuous bombardment of mutations, the majority of which are deleterious. Numerous theories based on the deleterious genomic mutations have been developed to explain some fundamental phenomena in biology. The validity of these theories critically depends on the rate at which deleterious mutations occur per genome per generation (U) and/or the effects of deleterious mutations.

For example, estimates of U are crucial to testing theories for the evolution of sex and recombination (MULLER 1964; KONDRASHOV 1985, 1988; CHARLESWORTH 1990), mate choice (CHARLESWORTH and CHARLESWORTH 1987; KONDRASHOV 1988; KIRKPATRICK and RYAN 1991), outbreeding mechanisms (CHARLESWORTH and CHARLESWORTH 1987), diploidy (KONDRASHOV and CROW 1991), and the accelerated extinction rate of small populations (LYNCH and GABRIEL 1990; LYNCH et al. 1993, 1995A, 1995B). Estimates of the other parameters of spontaneous deleterious mutations are also important. Such parameters include the mean dominance coefficient ($\bar{h}$), the mean selection coefficient ($\bar{s}$), the genomic mutation variance scaled by environmental variance ($V_m/V_e$), and variation of mutation effects. Estimates of $\bar{h}$ and $\bar{s}$ are important for testing the theories of evolutionary transition from haploidy to diploidy (PERROT et al. 1991) and for testing theories of the role of deleterious mutations in extinction of small populations (LANDE 1994; LYNCH et al. 1995A). Joint estimates of U, $\bar{h}$, and $\bar{s}$ determine the rate of input of genetic variance from mutation per generation (DENG and LYNCH 1996, 1997) and the extent to which neutral molecular variation is reduced by background selection (CHARLESWORTH et al. 1993, 1995; HUDSON and KAPLAN 1995). In finite populations, variation of mutation effects plays an important role in the maintenance of polygenic variation (KEIGHTLEY and HILL 1990) and in determining the persistence time and extinction rate of small populations (LANDE 1994; LYNCH et al. 1993, 1995A, 1995B).

However, few estimates are available (CROW and SIMMONS 1983; KONDRASHOV 1988; CROW 1993), and none of the current estimation approaches can yield unbiased results under realistic situations (DENG and FU 1998A). A direct estimation approach for bounds of U and $\bar{s}$, the traditional mutation-accumulation experiment (BATEMAN 1959; MUKAI et al. 1972), takes extensive time and labor and is feasible only for asexual organisms, special sexual organisms (such as Drosophila, where special chromosomal constructs are available), and artificially constructed purely inbred lines. An indirect procedure for estimating U makes use of inbreeding depression data (MORTON et al. 1956; CHARLESWORTH et al. 1990), but it depends on an unknown $\bar{h}$ of deleterious mutations (the $\bar{h}$ needed is the harmonic/arithmetic mean in outcrossing/selfing populations; DENG and LYNCH 1996). Estimation of $\bar{h}$ requires more assumptions (COMSTOCK and ROBINSON 1948; HAYMAN 1954; MUKAI et al. 1972; CABALLERO et al. 1997; DENG 1998). Even with the additional assumptions, the estimate is biased and weighted by the selection coefficients of individual mutant alleles at different loci. In addition, this indirect estimation of U is very sensitive to an assumed (or estimated) $\bar{h}$ (DENG and FU 1998A).

DENG and LYNCH (1996, 1997) developed an estimation approach that makes better use of the data (changes of both the mean and genetic variance for fitness traits) that can be acquired from selfing/outbreeding in outcrossing/highly selfing populations, to estimate not only U, but also $\bar{h}$, $\bar{s}$, $V_m$, etc. All the estimation approaches applied to natural populations (MORTON et al. 1956; CHARLESWORTH et al. 1990; DENG and LYNCH 1996) assume that all the genetic variation is maintained by mutation-selection balance (see DISCUSSION). Under a range of biologically plausible situations investigated, DENG and LYNCH's (1996) estimation approach almost always yields the best estimation (as reflected by the mean square error, a composite index of bias and sampling variance) when compared with other estimation approaches (DENG and FU 1998A). However, DENG and LYNCH (1996, 1997) were mainly concerned with estimation in outcrossing populations such as those of Daphnia (DENG 1995), in which selfing is feasible. Although approximate equations for full-sib mating were developed (Equation 14; DENG and LYNCH 1996), they are only a rough approximation and thus inaccurate. Applying these approximate estimators to data from full-sib mating experiments results in substantial bias even under the ideal assumptions underlying the derivation (H.-W. DENG, unpublished results). Under the necessary assumptions in DENG and LYNCH 1996, development of the exact estimation for inbreeding experiments other than selfing is not trivial, and thus was not carried out there (although the effort was made, as evidenced by the approximate estimators given). Since selfing is not feasible for the majority of outcrossing populations, there is an imperative need to develop estimation procedures with other forms of inbreeding (such as full- and half-sib mating) that are feasible in almost all outcrossing populations.

In this article we develop exact estimation equations, under the assumptions in DENG and LYNCH 1996, for any form of inbreeding to characterize deleterious genomic mutations. Additionally, we investigate and compare the statistical properties and the robustness of the estimation under common forms of inbreeding (full- and half-sib mating) and those under selfing. Furthermore, estimation employing the total genetic variation in the inbred generation and that using the genetic variance among the means of inbred families are compared for their relative efficiencies (multiple inbred progeny from each family are not necessary in the former estimation, while they are mandatory in the latter). These investigations will not only provide a guideline for designing an efficient experiment employing samples from outcrossing populations, but will also provide a basis for accurate inferences of deleterious genomic mutations from the values estimated on the basis of some necessary but unrealistic assumptions.


*  THEORY

The assumptions of MORTON et al. 1956, CHARLESWORTH et al. 1990, and DENG and LYNCH (1996, 1997) are employed to derive analytical estimations. The population is assumed to have been very large for long enough that all loci are at mutation-selection equilibrium for segregating polymorphisms. For any locus, this requires that the selective disadvantage of deleterious mutations (selection coefficient s) is at least on the order of $1/N_e$, where $N_e$ is the effective population size (KIMURA et al. 1963; LYNCH et al. 1995A, 1995B). This requirement ensures that, besides mutation, selection rather than random genetic drift is the driving force for the standing genetic variation. The frequency of deleterious mutant alleles at any locus is assumed to be small. The fitness function is assumed to be multiplicative, which is biologically plausible (MORTON et al. 1956; CROW 1986; CRADDOCK et al. 1995; FU and RITLAND 1996). Mutations at each locus have constant effects s and h (dominance coefficient). The three genotypic values at each locus with mutations are, respectively, 1, 1 - hs, and 1 - s for the genotypes AA, Aa, and aa, where a denotes the deleterious allele.

The number of mutations per genome (n) is assumed to be Poisson distributed with probability function $p(n) = \bar{n}^n e^{-\bar{n}}/n!$, where $\bar{n}$ is the mean number of deleterious mutations per genome. Later in this article, we conduct computer simulations to test the effects on estimation of violating some of the above assumptions (such as constant mutation effects and a multiplicative fitness function).

Under these assumptions, the mean ($\bar{W}_O$) and the genetic variance [$\sigma^2_w(O)$] of fitness in an outcrossing population are found to be, respectively (DENG and LYNCH 1996),

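$$\bar{W}_O = W_{\max}\, e^{-U} \tag{1}$$

$$\sigma^2_w(O) = W_{\max}^2\, e^{-2U}\left(e^{Uhs} - 1\right) \tag{2}$$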

Equation 1 is a well-known result (HALDANE 1937; KIMURA et al. 1963; BURGER and HOFBAUER 1994). $W_{\max}$ is the expected fitness of a mutation-free genotype in the environmental conditions where the experimental measurements are taken. $W_{\max}$ serves as a scaling factor so that fitness can be measured on any scale instead of just from 0.0 to 1.0, and so that mean environmental effects of experiments do not influence estimation.

Suppose now that the members of an outcrossed population undergo inbreeding, with inbreeding coefficient f in the inbred progeny. For selfing f = 1/2, for full-sib mating f = 1/4, and for half-sib mating f = 1/8. For each heterozygous locus in the outcrossed parental generation, an inbred progeny is expected to be heterozygous and homozygous for the deleterious allele with probabilities (1 - f) and f/2, respectively (CROW and KIMURA 1970). Thus, under the assumption of free recombination and by the relationship $\bar{n} = U/(hs)$ (DENG and LYNCH 1996), the mean fitness ($\bar{W}_I$) in the inbred progeny is

$$\bar{W}_I = W_{\max}\exp\left\{-U\left[1 + \frac{f(1-2h)}{2h}\right]\right\} \tag{3}$$

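The genetic variance among the mean fitnesses of the inbred progeny of different inbreeding families [$\sigma^2_w(I)$] is

$$\sigma^2_w(I) = \bar{W}_I^2\left\{\exp\left[Uhs\left(1 + \frac{f(1-2h)}{2h}\right)^2\right] - 1\right\} \tag{4}$$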

The total genetic variance of fitness [$\sigma^2_T(I)$] in the inbred progeny generation is

$$\sigma^2_T(I) = \bar{W}_I^2\left\{\exp\left[U\left((1-f)hs + \frac{fs}{2h}\right)\right] - 1\right\} \tag{5}$$

$\sigma^2_T(I)$ is the sum of $\sigma^2_w(I)$ and the genetic variance among the inbred genotypes within inbred families. The derivation of Equations 3 and 4 is similar to and simpler than that of Equation 5, and thus is not elaborated here. Equation 5 is derived as follows. Let w(I) represent the fitness of a random genotype from the inbred offspring generation; then $\sigma^2_T(I) = E[w^2(I)] - \bar{W}_I^2$ and $E[w^2(I)] = \sum_{n=0}^{\infty} E(w^2 \mid n)\, p(n)$, where $E(w^2 \mid n)$ is the expectation of $w^2(I)$ conditional on parents having mutations at n loci. Let $x_i$ denote the fitness of the ith locus (which is heterozygous in the outcrossed parents) in an inbred progeny. Under the assumptions of multiplicative fitness and unlinked loci, $E(w^2 \mid n) = W_{\max}^2\, E\left(\prod_{i=1}^{n} x_i^2\right) = W_{\max}^2 \prod_{i=1}^{n} E(x_i^2)$. For each parental locus that is heterozygous for mutations, the inbred progeny is expected to be heterozygous or homozygous for the deleterious allele with probabilities (1 - f) and f/2, respectively; hence $E(x_i^2) = f/2 + (1-f)(1-hs)^2 + (f/2)(1-s)^2$, which is the same for every locus under the assumption of constant mutation effects across loci. Therefore, we have

$$E[w^2(I)] = W_{\max}^2 \sum_{n=0}^{\infty} p(n)\left[E(x_i^2)\right]^n = W_{\max}^2 \exp\left\{-\bar{n}\left[1 - E(x_i^2)\right]\right\}$$

From Equation 3 and the relationship $\bar{n} = U/(hs)$ (DENG and LYNCH 1996), Equation 5 follows.
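The algebra behind this last step can be written out explicitly. The following is a sketch under the stated assumptions (multiplicative fitness, Poisson-distributed n with mean $\bar{n} = U/(hs)$, constant s and h); the shorthand $q$ is introduced here only for illustration and is not notation from the original derivation.

```latex
\begin{align*}
q &\equiv E(x_i^2) = \tfrac{f}{2} + (1-f)(1-hs)^2 + \tfrac{f}{2}(1-s)^2, \\
E[w^2(I)] &= W_{\max}^2 \sum_{n=0}^{\infty} \frac{\bar{n}^{\,n} e^{-\bar{n}}}{n!}\, q^{n}
           = W_{\max}^2 \exp\left[-\bar{n}(1-q)\right], \\
\bar{n}(1-q) &= \frac{U}{hs}\left[(1-f)\left(2hs - h^2 s^2\right) + \tfrac{f}{2}\left(2s - s^2\right)\right] \\
             &= 2U\left[1 + \frac{f(1-2h)}{2h}\right] - U\left[(1-f)hs + \frac{fs}{2h}\right], \\
\sigma^2_T(I) &= E[w^2(I)] - \bar{W}_I^2
              = \bar{W}_I^2\left\{\exp\left[U\left((1-f)hs + \frac{fs}{2h}\right)\right] - 1\right\}.
\end{align*}
```

The last line uses $\bar{W}_I^2 = W_{\max}^2 \exp\{-2U[1 + f(1-2h)/(2h)]\}$. Setting f = 0 gives $W_{\max}^2 e^{-2U}(e^{Uhs} - 1)$, the outcrossed variance.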

To verify our derivation, we substitute f = 1/2 (the selfing case considered by DENG and LYNCH 1996, 1997) into Equations 3–5. We find that Equations 3 and 4 recover the corresponding Equations 1c–1d of DENG and LYNCH 1996, and Equation 5 reduces to the corresponding Equation A1(1) of DENG and LYNCH 1997. Additionally, if we set f = 0 (the outcrossing case), Equation 3 recovers Equation 1, and Equations 4 and 5 reduce to Equation 2; therefore, the above general equations should be correct. The verification is carried further in our computer simulations.
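These limiting cases can also be checked numerically. The sketch below encodes the closed forms implied by the multiplicative model (mean and variance in the outcrossed generation, and mean, among-family variance, and total variance in the inbred generation). The function names are hypothetical, and the expressions are this sketch's reading of the model rather than a transcription of the published equations.

```python
import math


def mean_outcrossed(U, w_max=1.0):
    """Mean fitness of the outcrossed generation: W_max * exp(-U)."""
    return w_max * math.exp(-U)


def var_outcrossed(U, h, s, w_max=1.0):
    """Genetic variance of fitness in the outcrossed generation."""
    return mean_outcrossed(U, w_max) ** 2 * math.expm1(U * h * s)


def mean_inbred(U, h, f, w_max=1.0):
    """Mean fitness of inbred progeny with inbreeding coefficient f."""
    return w_max * math.exp(-U * (1.0 + f * (1.0 - 2.0 * h) / (2.0 * h)))


def var_family_means(U, h, s, f, w_max=1.0):
    """Variance among the mean fitnesses of inbred families."""
    scale = 1.0 + f * (1.0 - 2.0 * h) / (2.0 * h)
    return mean_inbred(U, h, f, w_max) ** 2 * math.expm1(U * h * s * scale ** 2)


def var_total_inbred(U, h, s, f, w_max=1.0):
    """Total genetic variance of fitness among inbred progeny."""
    exponent = U * ((1.0 - f) * h * s + f * s / (2.0 * h))
    return mean_inbred(U, h, f, w_max) ** 2 * math.expm1(exponent)


if __name__ == "__main__":
    U, h, s = 1.0, 0.3, 0.03  # illustrative values only
    for name, f in (("outcrossing", 0.0), ("half-sib", 0.125),
                    ("full-sib", 0.25), ("selfing", 0.5)):
        print(name, mean_inbred(U, h, f), var_family_means(U, h, s, f),
              var_total_inbred(U, h, s, f))
```

With f = 0 the inbred expressions collapse to the outcrossed ones, and for f > 0 the total variance exceeds the among-family variance, as expected when within-family variance is positive.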

Estimation via $\sigma^2_w(I)$ requires estimating the mean fitness of the inbred progeny from each family, and thus measuring multiple progeny from each inbreeding family. The mean of the inbred progeny from each family is subject to sampling error because the number of inbred progeny sampled from each family is usually small. Estimation via $\sigma^2_T(I)$ does not require multiple inbred progeny from each family. Therefore, estimation of $\sigma^2_w(I)$ is subject to two sources of sampling error: the number of inbreeding families sampled from the population and the number of inbred progeny sampled from each inbreeding family. Estimation of $\sigma^2_T(I)$, in contrast, is subject to only one source of sampling error, the number of inbreeding families sampled from the population. Thus, estimation of the rate and properties of deleterious mutations via $\sigma^2_T(I)$ is likely to be more powerful; it is also likely to be more practical, since multiple inbred progeny are sometimes simply not available for each family. This will be investigated and confirmed later, by computer simulations, in the case of selfing. The investigation is necessary because estimation via $\sigma^2_w(I)$ is the design originally proposed in DENG and LYNCH 1996. In addition, estimation via $\sigma^2_w(I)$ may have an advantage in some practical situations, such as when cloning of genotypes in the progeny generation is not feasible (DENG and LYNCH 1997). However, we focus on estimation via $\sigma^2_T(I)$; i.e., estimation is developed only from Equations 1–3 and 5 in this study. Estimation from Equations 1–4 is straightforward, less powerful, and thus not pursued here.

Define x, y, and z, respectively, as

(6)

From Equations 1–3 and 5, the expected values of x, y, and z are, respectively,

(7a)

(7b)

(7c)

Note that, in Equation 7b, if h = 0.5 (the purely additive case), there is no inbreeding depression (E(y) = 0) and an estimate of U cannot be obtained. However, a purely additive case is unlikely to exist, as suggested by the near-universal phenomena of inbreeding depression and heterosis. Rearranging, and letting a circumflex (ˆ) denote an estimate throughout, we obtain potential estimators for the mutational parameters:

(8a)

(8b)

(8c)

Substituting f = 1/2 into Equations 8a–8c recovers Equations A1(4a–4c) of DENG and LYNCH 1997. Different forms of inbreeding have different values of f. By substituting the f for the specific form of inbreeding employed in the experiment into Equations 8a–8c, estimates of U, h, and s and of other derivative parameters can be obtained. These derivative parameters include (but are not limited to) the genetic variance introduced into a population by new mutations per generation, $V_m$, and the mean number of deleterious mutations per genome, $\bar{n}$ (DENG and LYNCH 1996).
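As an illustration of how such moment-based estimators can be computed, the sketch below uses one parameterization that is consistent with the multiplicative model and with the remark that E(y) = 0 when h = 0.5. The specific definitions are assumptions of this sketch, not quotations of Equations 6–8: x = ln[1 + σ²w(O)/W̄O²], y = ln(W̄O/W̄I), and z = ln[1 + σ²T(I)/W̄I²], for which the model gives E(x) ≈ Uhs, E(y) ≈ Uf(1 - 2h)/(2h), and E(z) ≈ U[(1 - f)hs + fs/(2h)]. The function names are hypothetical.

```python
import math


def transformed_statistics(mean_o, var_o, mean_i, var_total_i):
    """Return (x, y, z) from outcrossed and inbred means and variances.

    Assumed definitions (this sketch only): W_max cancels in every ratio,
    so fitness may be measured on any scale.
    """
    x = math.log(1.0 + var_o / mean_o ** 2)
    y = math.log(mean_o / mean_i)
    z = math.log(1.0 + var_total_i / mean_i ** 2)
    return x, y, z


def estimate_parameters(x, y, z, f):
    """Solve the three moment equations for (U, h, s) given f > 0."""
    if not 0.0 < f <= 1.0:
        raise ValueError("f must lie in (0, 1]")
    excess = z - (1.0 - f) * x          # expected value U*f*s/(2h)
    if x <= 0.0 or y <= 0.0 or excess <= 0.0:
        raise ValueError("statistics fall outside the range the model allows")
    h_hat = math.sqrt(f * x / (2.0 * excess))
    if h_hat >= 0.5:
        raise ValueError("h estimate >= 0.5: U is not estimable")
    u_hat = 2.0 * h_hat * y / (f * (1.0 - 2.0 * h_hat))
    s_hat = x / (u_hat * h_hat)
    return u_hat, h_hat, s_hat


if __name__ == "__main__":
    # Population values for U = 1, h = 0.3, s = 0.03 under full-sib mating.
    U, h, s, f = 1.0, 0.3, 0.03, 0.25
    mean_o = math.exp(-U)
    var_o = mean_o ** 2 * math.expm1(U * h * s)
    mean_i = math.exp(-U * (1.0 + f * (1.0 - 2.0 * h) / (2.0 * h)))
    var_i = mean_i ** 2 * math.expm1(U * ((1.0 - f) * h * s + f * s / (2.0 * h)))
    print(estimate_parameters(*transformed_statistics(mean_o, var_o, mean_i, var_i), f))
```

Fed with exact population moments, the inversion returns the parameters used to generate them; with sample moments it inherits their sampling error, which is what the simulations quantify.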


*  COMPUTER SIMULATIONS

To verify our analytical derivations under the assumptions made, and also to test the robustness of the estimation when some essential assumptions of the analytical derivation are violated, statistical properties (sampling variance and bias) of the estimation are investigated by computer simulations. These investigations will provide a basis for accurate inference of genomic mutations with the estimation developed under the necessary but implausible assumptions. Specifically, the following assumptions will be tested: (1) the fitness function is multiplicative and there are no epistatic fitness effects of mutations; (2) the mutation effects s and h are constant across loci; and (3) there are no lethal mutations. Some other practical issues are also investigated by computer simulations: (1) the relative efficiencies of estimation employing the total genetic variation [$\sigma^2_T(I)$] in the inbred generation vs. the genetic variation among the means of inbred families [$\sigma^2_w(I)$] (this will be demonstrated by investigating the selfing case); (2) the relative efficiencies of estimation employing different degrees of inbreeding (this will be investigated for selfing, full-sib mating, and half-sib mating); and (3) the power of the estimation when genotypic values cannot be measured without error.

It should be noted that some of these problems were investigated for estimation developed from Equations 1–4 (DENG and LYNCH 1996). However, none of the above problems have been investigated for the estimation developed from Equations 1–3 and 5, which employs a different experimental design as explained earlier. Although some of the simulation conclusions will be qualitatively similar to those in DENG and LYNCH 1996, they will be quantitatively different. These quantitatively different results (especially the different degrees of bias) form the basis for accurate inference of mutations under different experimental designs and estimations.

Estimation under constant mutation effects:
We assume that a mutation-selection balance has been reached in the parental generation, so that the number of mutations per individual (all in the heterozygous state) is Poisson distributed with an expectation of $\bar{n} = U/(hs)$. In each situation, simulations are performed for different sets of parameters. For each parameter set, K and H individuals are randomly sampled, respectively, from the outcrossed parental and inbred progeny generations. Initially, the genotypic values are assumed to be measured without error and are defined by the multiplicative fitness function used in the derivation. For a genotype with n mutations (randomly determined from the Poisson distribution) from the outcrossed parental generation, the fitness is

$$W = W_{\max} \prod_{i=1}^{n} (1 - h_i s_i)$$

For a genotype sampled from the inbred progeny generation, the fitness is

$$W = W_{\max} \prod_{i=1}^{n_1} (1 - h_i s_i) \prod_{j=1}^{n_2} (1 - s_j)$$

where $n_1$ and $n_2$ are, respectively, the numbers of loci with mutations in the heterozygous and homozygous states. $n_1$ and $n_2$ are determined from two levels of random sampling: (1) a number (n) of loci is randomly determined from the Poisson distribution with mean $\bar{n} = U/(hs)$; (2) with inbreeding coefficient f, each of these n loci has a probability of f/2 of being homozygous for the normal A allele, a probability of (1 - f) of being heterozygous Aa, and a probability of f/2 of being homozygous aa. After the genotypic status of each locus is determined as above, $n_1$ and $n_2$ are simply the numbers of loci heterozygous and homozygous for mutations. The genetic variances [$\sigma^2_w(O)$ and/or $\sigma^2_T(I)$] are simply the variances among the genotypes under the assumption that genotypic values are measured without error. This assumption will be relaxed later in simulating the power of the experiments. $h_i$ and $s_i$ are the dominance and selection coefficients of the ith locus with mutations. They are assumed constant initially and made variable later. We arbitrarily let $W_{\max} = 1$ throughout, as the value of $W_{\max}$ does not influence the estimation of the mutation parameters. For each set of parameters (U, h, s, K, H), we perform 500 simulations. The averages and standard deviations (SD) of the estimates over the 500 independent simulations are reported. Unless otherwise specified, K = H = 200 in simulations.
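The two-level sampling scheme described above can be sketched as follows. This is a minimal illustration under constant s and h, not the original simulation program: the function names, the seed handling, and the use of Knuth's multiplication method for Poisson draws are choices made here for a dependency-free example.

```python
import math
import random


def poisson_draw(mean, rng):
    """Draw a Poisson variate by Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def outcrossed_fitness(U, h, s, rng, w_max=1.0):
    """Fitness of a parent carrying n heterozygous mutations."""
    n = poisson_draw(U / (h * s), rng)
    return w_max * (1.0 - h * s) ** n


def inbred_fitness(U, h, s, f, rng, w_max=1.0):
    """Fitness of an inbred progeny; each parental locus is resampled."""
    n = poisson_draw(U / (h * s), rng)
    n1 = n2 = 0
    for _ in range(n):
        u = rng.random()
        if u < f / 2.0:
            n2 += 1                      # homozygous aa
        elif u < f / 2.0 + (1.0 - f):
            n1 += 1                      # heterozygous Aa
        # otherwise homozygous AA: no fitness effect
    return w_max * (1.0 - h * s) ** n1 * (1.0 - s) ** n2


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v


def simulate_once(U, h, s, f, K, H, seed):
    """Sample K parents and H inbred progeny; return their moments."""
    rng = random.Random(seed)
    parents = [outcrossed_fitness(U, h, s, rng) for _ in range(K)]
    progeny = [inbred_fitness(U, h, s, f, rng) for _ in range(H)]
    return mean_and_variance(parents) + mean_and_variance(progeny)


if __name__ == "__main__":
    print(simulate_once(U=1.0, h=0.3, s=0.03, f=0.25, K=200, H=200, seed=1))
```

Repeating `simulate_once` with different seeds and feeding the resulting moments to an estimator gives the mean and SD of the estimates over replicate simulations.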

Estimation under variable mutation effects:
Mutation effects $h_i$ and $s_i$ are unlikely to be constant across loci. For example, $s_i$ may vary anywhere from 0.0 (neutral mutation) to 1.0 (lethal mutation). The rate of occurrence of mutations with different effects may also vary, so that mutations of smaller effect may occur at higher rates. To evaluate the direction and the magnitude of bias introduced by variable mutation effects and variable mutation rates, as in DENG and LYNCH (1996, 1997), we adopt an exponentially distributed mutation rate for mutations of variable effect $s_i$:

(9a)

Also we let

(9b)

As explained in DENG and LYNCH 1996, these are in rough accordance with the few available data (GREGORY 1965; CROW and SIMMONS 1983; MACKAY et al. 1992; KEIGHTLEY 1994) or biochemical arguments (KACSER and BURNS 1981). However, true mutational spectra may be such that the dominance of individual mutations is broadly scattered around such a function (CABALLERO and KEIGHTLEY 1994).

In simulations, we divide the entire range of s (0.0–1.0) into 100 discrete classes of width 0.01. Within each class, mutations have constant effects ($h_i$ and $s_i$). Each individual from the outcrossed parental generation in the simulation is assigned a number $n_i$ of heterozygous mutations from the ith of these classes by drawing from a Poisson distribution with expectation $U p_i/(h_i s_i)$, where $p_i$ is the density of the mutational distribution in the ith class. For an individual from the inbred progeny generation, the $n_i$ are first determined as above. Then, for each of the $n_i$ loci, the genotype is determined, as before, by randomly sampling from the trinomial probabilities set by f: f/2 for AA, (1 - f) for Aa, and f/2 for aa.
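The class construction can be sketched as below. Several details are assumptions of this sketch rather than statements from the text: each class is represented by its midpoint, $p_i$ is taken as the class's share of new mutations (density times width, renormalized over the 100 classes), and the density and dominance functions are passed in as callables. The example functions in the demo are illustrative placeholders and are not the article's Equations 9a and 9b.

```python
import math


def mutation_classes(U, density, dominance, n_classes=100):
    """Return a list of (s_i, h_i, expected_count_i) for each effect class.

    expected_count_i = U * p_i / (h_i * s_i) is the Poisson expectation of
    the number of heterozygous mutations an outcrossed parent carries
    from class i.
    """
    width = 1.0 / n_classes
    midpoints = [(i + 0.5) * width for i in range(n_classes)]
    weights = [density(s_i) * width for s_i in midpoints]
    total = sum(weights)
    classes = []
    for s_i, weight in zip(midpoints, weights):
        p_i = weight / total            # share of new mutations in class i
        h_i = dominance(s_i)
        classes.append((s_i, h_i, U * p_i / (h_i * s_i)))
    return classes


if __name__ == "__main__":
    mean_s = 0.03                       # illustrative value only

    def example_density(s):
        """Exponential density with mean `mean_s` (placeholder)."""
        return math.exp(-s / mean_s) / mean_s

    def example_dominance(s):
        """Hypothetical decreasing dominance function (placeholder)."""
        return 0.5 * math.exp(-10.0 * s)

    for s_i, h_i, expected in mutation_classes(1.0, example_density,
                                               example_dominance)[:3]:
        print(round(s_i, 3), round(h_i, 4), round(expected, 3))
```

A per-class Poisson draw with these expectations, followed by the trinomial resampling for inbred progeny, then reproduces the procedure described in the paragraph above.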

Estimation with lethal mutations present in the genome:
Because of their low dominance coefficients, lethal mutations are often sheltered from selection by being kept in the heterozygous state in outcrossing populations. To investigate the effects of lethal mutations on estimation, we add an additional low genomic mutation rate (0.01U) for lethals (defined as having s = 1.0 and h = 0.02) (Table 3).


 

 
Table 1. Parameter estimates with constant mutation effects


 

 
Table 2. Parameter estimates with variable mutation effects


 

 
Table 3. The influence of lethals on estimation

Estimation with epistatic mutation effects:
The theory we developed here assumes that deleterious mutations across loci interact multiplicatively. Although there is some good evidence that genes for fitness or its components most likely act multiplicatively (MORTON et al. 1956; CROW 1986; CRADDOCK et al. 1995; FU and RITLAND 1996), synergistically epistatic mutation effects cannot be ruled out entirely. Thus, we test the robustness of the estimation method under epistatic mutation effects. To evaluate the potential consequences of epistasis for the mutation-parameter estimates derived from our model, we consider the epistatic fitness model described by CHARLESWORTH 1990 and employed by us before (DENG and LYNCH 1996, 1997):

where $n = n_1 + (n_2/h)$ is the effective number of heterozygous mutations per individual. $n_1$ and $n_2$ are, respectively, the numbers of loci heterozygous and homozygous for mutations. The parameter β measures the strength of the synergistic effects of deleterious mutations. With β = 0, the model reduces to one of multiplicative effects, and with β > (<) 0, the effects of deleterious alleles are reinforcing (diminishing) epistatic; i.e., as more deleterious alleles are added to the genome, the decline in fitness per additional deleterious allele increases (decreases). Reinforcing epistatic effects were suggested (though not convincingly) by several empirical results (e.g., MUKAI 1969) and are of most interest to geneticists, while empirical evidence for diminishing epistatic effects is essentially nonexistent. Therefore, we focus on investigating the effects of reinforcing epistasis on estimation in this study; epistatic effects refer to reinforcing epistasis hereafter. A ratio is used to measure the relative contribution of epistatic effects to mean fitness: the larger this ratio, the larger the relative contribution of epistatic effects to mean fitness.

We implement the epistatic fitness model by assuming mutations of constant effects (s and h). Under mutation-selection equilibrium, n is approximately normally distributed, with the mean and variance being functions of U, h, s, and β defined by Equation 3 in CHARLESWORTH 1990. For the outcrossed parental generation, we again assume that all deleterious mutations exist in the heterozygous state before inbreeding, so that the n drawn for a parental individual is the number of heterozygous loci in that individual. $n_1$ and $n_2$ for an individual from the inbred progeny generation are determined in a similar fashion as before, except that n is now randomly drawn from a normal distribution instead of a Poisson distribution. The means and variances of fitness for the two generations are then computed, and our estimators (Equations 8a–8c), which assume no epistasis, are applied to the data.

Comparing estimation based on $\sigma^2_T(I)$ and on $\sigma^2_w(I)$:
We compared estimation employing the total genetic variation [$\sigma^2_T(I)$] in the inbred generation with estimation employing the genetic variation among the means of inbred families [$\sigma^2_w(I)$]. This is demonstrated for the case of selfing. The comparison was made under variable mutation effects, in the presence and absence of lethals, and under two different experimental designs (Table 5).


 

 
Table 4. Mutation parameter estimates in the presence of synergistic mutation effects


 

 
Table 5. Comparison of estimation based on $\sigma^2_T(I)$ vs. that based on $\sigma^2_w(I)$

Estimation when genotypic values are measured with error:
In the simulations discussed above, genotypic values are assumed to be known without error. In this case, sampling error of estimates comes only from random sampling of outcrossed and inbred genotypes. In reality, this would require that each genotype be clonally replicated and assayed a very large number of times, since polygenic traits are usually expressed with some environmental variance (FALCONER and MACKAY 1996). In Table 6, we consider the estimation when the additional effect of a finite number of clonal replicates per genotype on the sampling error is taken into account. The results for full-sib mating are presented. We examine situations in which the broad-sense heritability ($H^2$) of fitness is 0.20, 0.40, or 0.60 in the parental generation. The environmental variance (including random measurement error and developmental instability) for fitness is defined as $\sigma^2_e = \sigma^2_G (1 - H^2)/H^2$, where $\sigma^2_G$ is the genetic variance of fitness defined by Equation 2. Individual fitnesses are then determined by their genotypic values as described earlier, plus a random environmental deviation drawn from a normal distribution with zero mean and variance $\sigma^2_e$. Estimates of genetic variances for fitness are obtained by conducting one-way ANOVA on the simulated data in the parental and offspring generations, respectively, with genotypes as the main effect and clonal replicates as the random effect.
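The variance-component step can be sketched as follows for a balanced design with R clonal replicates per genotype, using the standard one-way ANOVA estimator (among-genotype mean square minus within-genotype mean square, divided by R). The function names are hypothetical, and the environmental variance follows from the definition of broad-sense heritability, $H^2 = \sigma^2_G/(\sigma^2_G + \sigma^2_e)$.

```python
def environmental_variance(var_g, heritability):
    """Environmental variance implied by genetic variance and H^2."""
    if not 0.0 < heritability <= 1.0:
        raise ValueError("heritability must lie in (0, 1]")
    return var_g * (1.0 - heritability) / heritability


def among_genotype_variance(groups):
    """Estimate the genetic variance from a balanced one-way ANOVA.

    `groups` holds one list of replicate fitness values per genotype;
    every genotype must have the same number (>= 2) of replicates.
    """
    k = len(groups)
    r = len(groups[0])
    if k < 2 or r < 2 or any(len(g) != r for g in groups):
        raise ValueError("need >= 2 genotypes with equal replicate counts >= 2")
    means = [sum(g) / r for g in groups]
    grand_mean = sum(means) / k
    ms_among = r * sum((m - grand_mean) ** 2 for m in means) / (k - 1)
    ms_within = sum((v - m) ** 2
                    for g, m in zip(groups, means) for v in g) / (k * (r - 1))
    return (ms_among - ms_within) / r


if __name__ == "__main__":
    # Two genotypes, two replicates each (toy numbers).
    print(among_genotype_variance([[1.0, 3.0], [5.0, 7.0]]))
    print(environmental_variance(0.002, 0.4))
```

With few replicates the estimate can be negative by chance, which is one reason sample size per genotype matters in the power analysis.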


 
View this table:
In this window
In a new window

 
Table 6. The influence of finite sample size on parameter estimation


*  RESULTS

Estimation under constant mutation effects:
The parameter estimates for s, h, and U are almost always unbiased with small sampling errors (Table 1). The only exception is when h is very high (h = 0.4), higher than all previously reported h estimates, which range from 0.07 to 0.35 (DENG and LYNCH 1997). In that case $\hat{U}$ has a large sampling variance. Estimation of s, h, and U under selfing is better than under full-sib mating, which in turn is better than under half-sib mating (as reflected by the SD of the estimates). This is because the magnitude of the change in mean and genetic variance upon inbreeding is larger with a higher degree of inbreeding. The estimates obtained from the repeated simulations are consistent with normal distributions (Kolmogorov-Smirnov test, P > 0.50; SOKAL and ROHLF 1995); thus SD is employed to reflect the sampling properties of the estimation throughout. The improvement of estimation with an increased degree of inbreeding is small when h is small (h = 0.2). The improvement is very dramatic for U estimation when h (h = 0.4) and U (U = 1.5) are large. Simulations with s = 0.050 (data not shown) led to the same conclusions.

Estimation under variable mutation effects:
All the estimates are biased (Table 2). The bias is relatively small when $\bar{s}$ is small and increases with increasing $\bar{s}$. The simulated parameters roughly cover most of the previous experimental estimates. Under the simulated parameters, $\hat{s}$ ranges from ~2$\bar{s}$ to 3$\bar{s}$, $\hat{h}$ ranges from ~0.35$\bar{h}$ to ~0.90$\bar{h}$, and $\hat{U}$ ranges from ~0.5U to 0.8U. Again, as with constant effects, estimation of $\bar{s}$, $\bar{h}$, and U under selfing is better than under full-sib mating, which is better than under half-sib mating. The sampling variance decreases with an increasing degree of inbreeding, while the bias remains roughly constant. Full-sib mating can generally achieve reasonably good estimates in terms of sampling variance.

Estimation with lethal mutations present in the genome:
The presence of rare lethal mutations (in the simulations shown, an expected number of 0.50 per individual) causes the estimates of $\bar{s}$ to be inflated by a factor of ~8, and the estimates of U and $\bar{h}$ to decrease by factors of ~2 and 4, respectively. In practical applications of our proposed technique, this type of problem can perhaps be minimized by eliminating individuals that are homozygous for lethals from the final analyses. This protocol is similar to that employed in mutation-accumulation experiments (MUKAI et al. 1972). When inviable inbred progeny (homozygous for lethal mutations) are dropped from the analyses, the estimation is greatly improved and is even better than when there are no lethals (Table 3). This reflects the common practice of sampling conditional on hatching or birth.

Estimation with epistatic mutation effects:
In general, the biases in estimates of U, h, and s are quite small provided the contribution of epistatic effects to fitness is less than ~10% (Table 4). With reinforcing epistasis, h and U tend to be underestimated, whereas s tends to be overestimated. When the ratio approaches one, so that synergistic epistasis halves the average fitness of individuals relative to that expected in the absence of epistasis, the bias becomes more substantial. Even with strong epistasis, the estimate of h is altered only slightly and the estimates of U are not downwardly biased by more than ~30%, although the estimates of s can be too high by a factor as large as ~5. Overall, the results suggest that epistasis must be quite strong for our estimation to generate wildly unrealistic estimates.

Comparing estimation based on σ²_T(I) and on σ²_w(I):
When relatively many (10) selfed progeny from each family are sampled (experiment A; Table 5), estimation based on σ²_T(I) gives about the same estimates as that based on σ²_w(I), but with a smaller sampling variance. When only a few (2) selfed progeny from each family are sampled (experiment B), estimation based on σ²_T(I) is much better than that based on σ²_w(I), because of both smaller sampling variance and smaller bias. This is consistent with our previous prediction (DENG and LYNCH 1997). When based on σ²_T(I), the estimation changes little between experiments A and B, although the sampling effort is much smaller in experiment B. However, when based on σ²_w(I), the estimation is much better for experiment A than for experiment B, because the mean fitness of larger selfed families is estimated more accurately. As pointed out in the THEORY section, estimation of σ²_w(I) is subject to two sources of sampling error, one being the number of inbreeding families sampled from populations and the other being the number of inbred progeny sampled from each inbreeding family. In contrast, estimation of σ²_T(I) is subject to only one source of sampling error, i.e., the number of inbreeding families sampled from populations. Thus, estimation via σ²_T(I) is indeed more powerful most of the time, especially when many inbred progeny are simply not available for each inbred family.
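
The dependence on the number of progeny per family can be illustrated with a generic variance argument; this is a sketch, not the estimator developed in this article. Assuming independent progeny and a hypothetical within-family variance of fitness `VAR_WITHIN`, a family mean based on n progeny carries a sampling variance of `VAR_WITHIN / n`, which adds noise to any quantity computed from family means. The function name and the numerical value are assumptions chosen only for illustration.

```python
def family_mean_noise(var_within: float, n_progeny: int) -> float:
    """Sampling variance of a family mean estimated from n_progeny sibs,
    assuming the progeny are independent draws within the family."""
    if n_progeny < 1:
        raise ValueError("n_progeny must be at least 1")
    return var_within / n_progeny


# Hypothetical within-family variance of fitness (illustrative value only).
VAR_WITHIN = 0.02

noise_a = family_mean_noise(VAR_WITHIN, 10)  # experiment A: 10 progeny per family
noise_b = family_mean_noise(VAR_WITHIN, 2)   # experiment B: 2 progeny per family

print(f"experiment A: {noise_a:.4f}")
print(f"experiment B: {noise_b:.4f}")
print(f"ratio B/A:    {noise_b / noise_a:.1f}")
```

Under these assumptions the noise in each family mean is five times larger with 2 progeny than with 10, whatever value is chosen for the within-family variance.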

Estimation when genotypic values are measured with error:
The higher the H² (Table 6), the more genotypes (K) sampled, or the more replicates (R) cloned for each genotype at assay, the better the estimation, as reflected by the SDs. The bias remains roughly constant across experiments of different sample sizes. When H² is reasonably high (>0.40), experiments that employ 100 outcrossed parents and 100 inbred progeny (each from a different full-sib mating), with each genotype having at least 10 replicates, can yield reasonably good estimates. In these experiments, ~2000 individuals need to be assayed. Generally speaking, for a fixed sample size for assay, increasing K improves estimation more efficiently than increasing R. Even with relatively low H² (0.20), experiments that employ 150 outcrossed parents and 150 inbred progeny (each from a different full-sib mating), with each genotype having at least 10 replicates, can yield reasonably good estimates. In these experiments, ~3000 individuals need to be assayed.
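
The trade-off between K and R at a fixed assay size can be explored with a short calculation. This is a sketch under stated assumptions, not the simulation behind Table 6: it assumes that equal numbers of genotypes are assayed in the outcrossed and inbred generations (so the total is 2 × K × R), and it uses the standard quantitative-genetic result that the mean of R clonal replicates has variance Vg + Ve/R. The function names are hypothetical.

```python
def clone_mean_heritability(h2: float, replicates: int) -> float:
    """Fraction of the variance among clone means that is genetic.

    With broad-sense heritability h2 = Vg / (Vg + Ve) for single
    measurements, the mean of `replicates` clones has variance
    Vg + Ve / replicates.
    """
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must be in (0, 1]")
    if replicates < 1:
        raise ValueError("replicates must be at least 1")
    return h2 / (h2 + (1.0 - h2) / replicates)


def genotypes_per_generation(total_assayed: int, replicates: int) -> int:
    """Genotypes K per generation, assuming two generations with equal K."""
    return total_assayed // (2 * replicates)


# Fixed assay size of 2000 individuals with H2 = 0.40, as in the text.
for r in (5, 10, 20):
    k = genotypes_per_generation(2000, r)
    print(f"R = {r:2d}  K = {k:3d}  H2 of clone means = "
          f"{clone_mean_heritability(0.40, r):.3f}")
```

Doubling R beyond 10 raises the heritability of clone means only slightly while halving K, which is in line with the observation that increasing K is generally the more efficient use of a fixed assay effort.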

As a specific example, assume that one can measure the fitness of 2000 individuals. An actual experiment could then proceed roughly as follows: Sample 100 random outcrossed genotypes from the outcrossing population under study. For each of them, sample one full sib to produce 100 full-sib pairs. Mate these 100 full-sib pairs to generate 100 inbred progeny genotypes. Clonally replicate the 100 random outcrossed genotypes and the 100 inbred genotypes to generate 10 clones for each genotype. Analyses can then be performed (DENG and LYNCH 1996, 1997; DENG and FU 1998a) to estimate the mean and genetic variation in the parental and inbred offspring generations, and the mutation parameters can be estimated by Equations 8a, 8b, and 8c. The precision expected from such an experiment is indicated by footnote a in Table 6.


*  DISCUSSION

In this article, we extended the approach of DENG and LYNCH (1996, 1997) to any form of inbreeding to characterize deleterious mutations in outcrossing populations. This extension greatly widens the taxonomic range of outcrossing populations in which characterizing deleterious mutations is feasible. The statistical properties, robustness, and statistical power of our extension were also investigated under several biologically plausible situations. In addition, we compared the estimation under different degrees of inbreeding and under two different experimental designs and data analyses. Estimation using the total genetic variation σ²_T(I) in the inbred generation is generally more efficient than estimation employing the genetic variation among the means of inbred families, σ²_w(I). Additionally, a higher degree of inbreeding employed in experiments yields higher power for estimation. Our estimation is fairly robust in the presence of synergistic epistasis. For estimation and experimental designs that differ from those in DENG and LYNCH (1996), the simulation results here on the magnitude and direction of estimation bias may provide a basis for accurate inferences of deleterious mutations. Simulations accounting for environmental variance of fitness suggest that, under full-sib mating, our extension can yield reasonably good estimates with sample sizes of only ~2000–3000 for assay. These sample sizes should be well within the capacity of many laboratories.

In this article, we focus on estimation that employs experimentally measurable information such as the mean and genetic variance in outcrossed and inbred generations. If external knowledge exists on the variation and covariation of the mutation effects h_i and s_i, improved (less biased) estimation accounting for this variation and covariation may be obtained. This was done in DENG and LYNCH (1996) for selfing experiments. However, the utility of such estimation is limited because knowledge of the variation and covariation of h_i and s_i is essentially nonexistent and may be much harder to acquire than estimates of the means of h_i and s_i. Recently, we developed a method to approximately estimate the variation of h_i in an outcrossed population without having to construct homozygous lines (DENG 1998). However, methods to quantify the covariation of h_i and s_i do not exist at present, and the statistical performance of quantifying the variation of s_i from mutation-accumulation data (KEIGHTLEY 1994) may be poor, as suggested by our preliminary investigations (DENG et al. 1998a,b). Therefore, estimation incorporating external knowledge of the variation and covariation of h_i and s_i is not presented here because of its minimal utility, although it is straightforward to work out. By employing a more complex design and using more information from the data, estimation that is not biased by variable mutation effects in both outcrossed and selfed populations can hopefully be developed; furthermore, the covariation between h_i and s_i may be quantified (H.-W. DENG, unpublished results).

One crucial assumption of the estimation developed in this article is that the variation of fitness is maintained by mutation-selection (M-S) balance. This assumption underlies all the previous estimation approaches applied to natural populations (MORTON et al. 1956; CHARLESWORTH et al. 1990; DENG and LYNCH 1996, 1997). Despite tremendous efforts (e.g., HOULE 1989, 1994; HOULE et al. 1996; CHARLESWORTH and HUGHES 1997; DENG et al. 1998), the extent to which this essential assumption is valid is generally unknown. A critical question is how robust the estimations are to different degrees of violation of the M-S balance assumption. Extensive studies have recently been performed (J.-L. LI, J. LI and H.-W. DENG, unpublished results). They suggest that the effects of violating M-S balance may not be as substantial as envisioned. Under variable dominance mutation effects, the estimation bias is actually reduced when overdominant mutations maintained by balancing selection are present in the genome.

Currently, there are several different approaches to characterize different aspects of deleterious genomic mutations (DENG 1998; DENG and FU 1998a,b). Traditional mutation-accumulation experiments have required tremendous labor and time, much of which may have been unnecessary. If designed properly, mutation-accumulation experiments can be executed much more efficiently, with much reduced time and labor (DENG et al. 1998a,b), so that mutation accumulation may be adopted by many more empiricists. Under realistic situations, all the current estimations are biased. DENG and LYNCH's (1996) original procedure (estimation via genetic variation among the means of selfed families) generally is statistically better than other currently available estimation approaches (DENG and FU 1998a). Our extension (estimation via the total genetic variation in the inbred progeny generation; DENG and LYNCH 1997) is shown here to be even more powerful than our original procedure (DENG and LYNCH 1996). The extension in DENG and LYNCH (1997) is here further extended to any inbreeding experiments in outcrossing populations. Different estimation approaches rest on different assumptions that may be difficult to validate in particular experimental settings (DENG and FU 1998a). Beyond their statistical properties, different approaches have different advantages and drawbacks in practice and are best applied to different organisms and in different situations. The estimates obtained by different approaches in different organisms can be cross-checked and, hopefully, will eventually resolve the issues concerning genomic mutations.


*  ACKNOWLEDGMENTS

I thank Professor M. Lynch for helpful comments on the manuscript and his years of advice, encouragement, and continuous support. I also thank Professor M. Slatkin, Professor A. Kondrashov, and an anonymous reviewer for comments that helped to improve this article. Graduate students J. Li and J.-L. Li helped in running some simulations for this article. This work was partially supported by a grant from the National Institutes of Health (R01 AR45349) and a Health Future Foundation grant of Creighton University, Nebraska.

Manuscript received April 10, 1998; Accepted for publication July 13, 1998.


*  LITERATURE CITED

BATEMAN, A. J., 1959  The viability of near-normal irradiated chromosomes. Int. J. Radiat. Biol. 1:170-180.

BURGER, R. and J. HOFBAUER, 1994  Mutation load and quantitative genetic traits. J. Math. Biol. 32:193-218.

CABALLERO, A. and P. D. KEIGHTLEY, 1994  A pleiotropic nonadditive model of variation in quantitative traits. Genetics 138:883-900.

CABALLERO, A., P. D. KEIGHTLEY, and M. TURELLI, 1997  Average dominance for polygenes: drawbacks of regression estimates. Genetics 147:1487-1490.

CHARLESWORTH, B., 1990  Mutation-selection balance and the evolutionary advantage of sex and recombination. Genet. Res. 55:199-221.

CHARLESWORTH, D. and B. CHARLESWORTH, 1987  Inbreeding depression and its evolutionary consequences. Annu. Rev. Ecol. Syst. 18:237-268.

CHARLESWORTH, B., and K. A. HUGHES, 1997 The maintenance of genetic variation in life-history traits, in Evolutionary Genetics from Molecules to Morphology, edited by R. S. SINGH and C. B. KRIMBAS. Cambridge University Press, Cambridge, UK (in press).

CHARLESWORTH, B., D. CHARLESWORTH, and M. T. MORGAN, 1990  Genetic loads and estimates of mutation rates in highly inbred plant populations. Nature 347:380-382.

CHARLESWORTH, B., M. T. MORGAN, and D. CHARLESWORTH, 1993  The effects of deleterious mutations on neutral molecular variation. Genetics 134:1289-1303.

CHARLESWORTH, D., B. CHARLESWORTH, and M. T. MORGAN, 1995  The pattern of neutral molecular variation under the background selection model. Genetics 141:1619-1632.

COMSTOCK, R. E. and H. F. ROBINSON, 1948  The components of genetic variance in populations of biparental progenies and their use in estimating the average degree of dominance. Biometrics 4:254-266.

CRADDOCK, N., V. KHODEL, P. V. EERDEWEGH, and T. REICH, 1995  Mathematical limits of multilocus models: the genetic transmission of bipolar disorder. Am. J. Hum. Genet. 57:690-702.

CROW, J. F., 1986 Basic Concepts in Population, Quantitative and Evolutionary Genetics. W. H. Freeman, New York.

CROW, J. F., 1993  How much do we know about spontaneous human mutation rates? Environ. Mol. Mutagen. 21:122-129.

CROW, J., and M. KIMURA, 1970 An Introduction to Population Genetics Theory. Harper & Row, New York.

CROW, J. F., and M. J. SIMMONS, 1983 The mutation load in Drosophila, pp. 1–35 in The Genetics and Biology of Drosophila, Vol. 3c, edited by M. ASHBURNER, H. L. CARSON and J. N. THOMPSON. Academic Press, New York.

DENG, H.-W., 1995 Sexual reproduction in Daphnia: its control and genetic consequences. Ph.D. Thesis. University of Oregon, Eugene.

DENG, H.-W., 1998  Estimating (over)dominance coefficient and discriminating dominance vs. overdominance as the genetic cause of heterosis. Genetics 148:2003-2014.

DENG, H.-W. and Y.-X. FU, 1998a  On the three methods for estimating deleterious genomic mutation parameters. Genet. Res. 71:223-236.

DENG, H.-W. and Y.-X. FU, 1998b  Conditions for positive and negative correlations between fitness and heterozygosity in equilibrium populations. Genetics 148:1333-1340.

DENG, H.-W. and M. LYNCH, 1996  Estimation of the genomic mutation parameters in natural populations. Genetics 144:349-360.

DENG, H.-W. and M. LYNCH, 1997  Inbreeding depression and inferred deleterious-mutation parameters in Daphnia. Genetics 147:145-155.

DENG, H.-W., Y.-X. FU, and M. LYNCH, 1998a  Inferring the major genomic mode of dominance and overdominance. Genetica 102(103):559-567.

DENG, H.-W., J. LI, and J.-L. LI, 1998b  On the experimental design and data analysis of mutation accumulation experiments. Genet. Res. in press.

FALCONER, D. S., and T. S. MACKAY, 1996 Introduction to Quantitative Genetics. Longman, New York.

FU, Y.-B. and K. RITLAND, 1996  Marker-based inference about epistasis for gene influencing inbreeding depression. Genetics 144:339-348.

GREGORY, W. C., 1965  Mutation frequency, magnitude of change and the probability of improvement in adaptation. Radiat. Bot. 5(Suppl.):429-441.

HALDANE, J. B. S., 1937  The effect of variation on fitness. Am. Nat. 71:337-349.

HAYMAN, B. I., 1954  The theory and analysis of diallel crosses. Genetics 39:789-809.

HOULE, D., 1989  Allozyme-associated heterosis in Drosophila melanogaster. Genetics 123:789-801.

HOULE, D., 1994  Adaptive distance and the genetic basis of heterosis. Evolution 48:1410-1417.

HOULE, D., B. MORIKAWA, and M. LYNCH, 1996  Comparing mutational variabilities. Genetics 143:1467-1483.

HUDSON, R. and N. L. KAPLAN, 1995  Deleterious background selection with recombination. Genetics 141:1605-1617.

KACSER, H. and J. A. BURNS, 1981  The molecular basis of dominance. Genetics 97:639-666.

KEIGHTLEY, P. D., 1994  The distribution of mutation effects on viability in Drosophila melanogaster. Genetics 138:1315-1322.

KEIGHTLEY, P. D. and W. G. HILL, 1990  Variation maintained in quantitative traits with mutation-selection balance: pleiotropic side effects on fitness. Proc. R. Soc. Lond. Ser. B Biol. Sci. 253:291-296.

KIMURA, M., T. MARUYAMA, and J. F. CROW, 1963  The mutation load in small populations. Genetics 48:1303-1312.

KIRKPATRICK, M. and M. J. RYAN, 1991  The evolution of mating preferences and the paradox of the lek. Nature 350:33-38.

KONDRASHOV, A. S., 1985  Deleterious mutations as an evolutionary factor. II. Facultative apomixis and selfing. Genetics 111:635-653.

KONDRASHOV, A. S., 1988  Deleterious mutations and the evolution of sexual reproduction. Nature 336:435-440.

KONDRASHOV, A. S. and J. F. CROW, 1991  Haploidy or diploidy: which is better? Nature 351:314-315.

LANDE, R., 1994  Risk of population extinction from new deleterious mutations. Evolution 48:1460-1469.

LYNCH, M. and W. GABRIEL, 1990  Mutation load and the survival of small populations. Evolution 44:1725-1737.

LYNCH, M., R. BURGER, D. BUTCHER, and W. GABRIEL, 1993  The mutation meltdown in small asexual population. J. Hered. 84:339-344.

LYNCH, M., J. CONERY, and R. BURGER, 1995a  Mutation meltdowns in sexual populations. Evolution 49:1067-1080.

LYNCH, M., J. CONERY, and R. BURGER, 1995b  Mutation accumulation and the extinction of small populations. Am. Nat. 146:489-518.

MACKAY, T. F. C., R. F. LYMAN, and M. S. JACKSON, 1992  Effects of P element insertions on quantitative traits in Drosophila melanogaster. Genetics 130:315-332.

MORTON, N. E., J. F. CROW, and H. J. MULLER, 1956  An estimate of the mutational damage in man from data on consanguineous marriages. Proc. Natl. Acad. Sci. USA 42:855-863.

MUKAI, T., 1969  The genetic structure of natural populations of Drosophila melanogaster. VII. Synergistic interaction of spontaneous mutant polygenes controlling viability. Genetics 61:749-761.

MUKAI, T., S. I. CHIGUSA, L. E. METTLER, and J. F. CROW, 1972  Mutation rate a