Abstract
We develop a maximumlikelihood (ML) approach to estimate genomic mutation rates (U) and average homozygous mutation effects (s) from mutationaccumulation (MA) experiments in which phenotypic assays are carried out in several generations. We use simulations to compare the procedure's performance with the method of moments traditionally used to analyze MA data. Similar precision is obtained if mutation effects are small relative to the environmental standard deviation, but ML can give estimates of mutation parameters that have lower sampling variances than those obtained by the method of moments if mutations with large effects have accumulated. The inclusion of data from intermediate generations may improve the precision. We analyze lifehistory trait data from two Caenorhabditis elegans MA experiments. Under a model with equal mutation effects, the two experiments provide similar estimates for U of ~0.005 per haploid, averaged over traits. Estimates of s are more divergent and average at −0.51 and −0.13 in the two studies. Detailed analysis shows that changes of mean and variance of genetic values of MA lines in both C. elegans experiments are dominated by mutations with large effects, but the analysis does not rule out the presence of a large class of deleterious mutations with very small effects.
EXPERIMENTAL estimates of rates at which mutations occur in the genome and properties of distributions of mutation effects for fitness and other lifehistory traits are important for several questions in population and evolutionary biology, but have proved to be extremely difficult to obtain (GarciaDoradoet al. 1999; Keightley and EyreWalker 1999; Lynchet al. 1999 for recent reviews). One experimental approach to obtain information on mutation parameters is mutation accumulation (MA) in the laboratory. This involves the random accumulation of spontaneous mutations in replicated chromosomes or inbred sublines, usually over several tens of generations, followed by largescale lifehistory trait assays of MA and control lines. Due to the need for a high degree of replication, the experiments tend to be both tedious and timeconsuming. An important issue, therefore, is the method of parameter estimation, as an experimentalist will wish to extract the maximum amount of information from the available experimental data.
The traditional way to analyze MA experimental data is the BatemanMukai (BM) method of moments (Bateman 1959; Mukai 1964). Under the assumption that mutations have equal effects, an estimate of the genomic deleterious mutation rate (U) per haploid genome per generation is
A drawback with the ML method is that it is quite complex, and has to date been implemented only for cases with a single MA generation plus a control line. In this article we extend the ML approach to analyze data from experiments with an arbitrary number of generations, and thereby make use of all the available information including covariances between phenotypic values for the same lines at different generations. At present, our multigeneration ML method is restricted to the case of equal mutation effects, i.e., the same model as assumed by the BM method. However, the method allows the comparison of results for different experiments under the same model. We investigate the properties of the method by Monte Carlo simulation and compare its precision to the BM approach. Finally, we use the multigeneration ML procedure to analyze data on lifehistory traits from two recently published MA experiments with the wildtype N2 strain of Caenorhabditis elegans (Keightley and Caballero 1997; Vassilieva and Lynch 1999).
MATERIALS AND METHODS
Likelihood framework for several generations: In this section we derive the likelihood function, which is appropriate when data from several generations are jointly used to estimate genomewide mutation parameters, by extending a previously developed method (Keightley 1994). We also allow for the joint estimation of fixed effects such as assay effects that can occur if experimental assays are carried out sequentially or in blocks.
Let Z_{k,tj} denote the phenotypic value of line k assayed after t_{j} generations of mutation accumulation. We assume that the number of mutations fixed in the homozygous state in the line is Poisson distributed with mean λ_{j} = Ut_{j}. The mutations are assumed to have a constant additive effect denoted by s. We assume that environmental effects are normally distributed with variance V_{e}. The phenotypic value is, therefore
This likelihood equation can be generalized to incorporate an arbitrary number of points in time where the lines are assayed. Suppose that the set of MA lines are assayed at generation t_{1}, t_{1} + t_{2}, …, t_{1} + t_{2} + …t_{T}; then the likelihood associated with a line k is
Simulation protocol: To compare the precision of the BM estimator and the ML multigeneration estimators, we carried out Monte Carlo simulations of MA experiments. To a good approximation, the size of a MA experiment can be characterized by its “heritability” at the line level
In addition, we investigated the performance of the BM and ML procedures with data in which mutation effects are exponentially distributed; thus data are analyzed under the “wrong” model of equal effects. For each U, s combination simulated under the constant mutation effects model, we performed simulations with the same U and mean mutation effect
Likelihood maximization: ML maximization was carried out using the simplex algorithm (Nelder and Mead 1965). For the analysis of simulated data, starting values for likelihood maximization were obtained from the data. A linear search strategy was employed in which a set of initial maximizations was performed over a broad range of 10 fixed values of U, varying by a factor of 2–5 about its expected value. Within each of these runs, starting values for s were obtained by fitting the observed change in mean phenotypic value of the lines in the final generation, and starting values of M and V_{E} were computed from the control line data. The simplex was then restarted with the U, s, M, V_{E} combination that gave highest likelihood during the initial linear search through the parameter space. Spurious convergence was checked by restarting the routine after convergence until no further improvement in fit occurred. In the analysis of the C. elegans data, starting values for assay effects or other fixed effects and M and V_{E} were calculated from the control line data. To verify that the global ML had been reached, several sets of starting values for U, s, M, V_{E}, and fixed effects were investigated. Checks were also performed using the grid search strategy described above. Approximated standard errors were obtained from the curvature of profile likelihoods about their maxima (Weir 1996, ch. 2). A C computer program to carry out the likelihood calculations is available on request.
C. elegans data sets: Keightley and Caballero (1997) and Vassilieva and Lynch (1999) carried out MA experiments with the C. elegans strain N2 for 60 and 49 generations, respectively, and with 50 and 100 MA lines, respectively, and measured several life history traits. Reproductive output is lifetime output of viable progeny (Keightley and Caballero 1997) or output for the first 4 days of reproduction (Vassilieva and Lynch 1999). This fitness measure includes the viability of the parents. Longevity was assayed by similar methods in the two experiments, although there were slight differences in the criteria used to score a worm as alive or dead. Daily reproductive output and numbers of parental worms alive were used by Vassilieva and Lynch (1999) to calculate replicatespecific intrinsic growth rate, r, by methods described by Charlesworth (1994). Here, we performed similar calculations to obtain estimates of r for the Keightley and Caballero (1997) data. Vassilieva and Lynch (1999) include several other derived traits, related to r, that are highly correlated with it. We carried out ML analysis with untransformed data for line means, or the BoxCox procedure was used to select the power transformation that best achieved normality, and the data for individual values were transformed accordingly (Sokal and Rohlf 1995, ch. 13), and then line means were calculated. In the cases of r and longevity, for which betweencontrol line variance is nonsignificant (Vassilieva and Lynch 1999), tests for nonnormality were carried out using all available control line data, after correction for assay effects, estimated by a REML analysis (Genstat 5 Committee 1993). In the case of productivity, which shows evidence of control line variance (Vassilieva and Lynch 1999), the test was carried out on residuals after correction for line means and assay effects. The distribution of the Vassilieva and Lynch (1999) control data for r is significantly negatively skewed: under the BoxCox power transformation, the value of the exponent is 3.85 [P < 0.001; see Sokal and Rohlf (1995), p. 417], so these data were transformed. The effect of the transformation is to increase
Analysis under a variable mutationeffects model: A multigeneration ML procedure along the lines described above to estimate parameters of models with variable mutation effects was found to be computationally intractable at the present time. To test for evidence of variability among mutation effects in the C. elegans data, therefore, we analyzed the last generation of MA line data plus control data (line means) with a singlegeneration ML procedure (Keightley 1998). The fit to the data of several gamma distributions (shape parameter β, scale α) ranging from equal mutation effects (β → ∞) to more leptokurtic distributions (β < 1) was compared.
RESULTS
ML analysis of simulated MA experiments with equal mutation effects: Simulation results are summarized in Table 1 for values of U and corresponding values of s such that V_{L} remains constant and V_{L}/V_{EL} = 5 (see materials and methods). Over the range of parameter values simulated, BM and ML give mean estimates close to the simulated values. As expected, if the effect of each mutation is small compared to the environmental standard deviation of line means (σ_{eL}, 0.0566 in this case), ML and BM estimators perform similarly. However, if the effects of mutations are relatively large (s > σ_{eL}), the ML estimator makes more efficient use of the information available and can have a much smaller variance than the BM estimator (see variance of the estimators empirically determined through Monte Carlo simulations in Table 1). The difference in precision becomes smaller if the experiment is noiser (e.g., V_{L}/V_{EL} = 2; data not shown). Both ML and BM estimators can become unstable and give infinite variance among estimates if V_{L}/V_{EL} falls below ~1.
The variances of the estimates shown in Table 1 also suggest that an increase in precision can be obtained with the ML estimator (but much less so with the BM procedure) by including extra generations in the analysis. Again this suggests that ML makes more efficient use of the information available. The effect can increase dramatically as the number of generations assayed and/or the magnitude of mutation effect increases. Deng and Fu (1998) and Deng et al. (1999) in their simulation studies concluded that adding extra generations does not reduce estimation variance, but this only applies to the BM procedure (see Table 1, variance of BM estimates for U and s).
ML analysis of simulated MA experiments with variable mutation effects: Simulation results are reported in Table 2. If data simulated under an exponential distribution of mutation effects model are analyzed under the assumption of constant effects, both methods give estimates of U (s) that are biased downward (upward) by a factor B. Empirically we find that B ≈ 1 + Var[s]/s^{2}. This was already known for the BM estimator (see, e.g., Crow and Simmons 1983; Lynch and Walsh 1998, p. 341). As in the simulations with equal mutation effects (Table 1), if mutation effects are small relative to the environmental standard deviation, both estimators perform essentially identically [see variance (Var) or mean square error (MSE) of estimates, Table 2]. In cases where the mean mutation effect is very large, ML is slightly less biased than BM, and this leads to an increase in precision in terms of the MSE. The improvement in precision is more apparent for estimates of s than for U.
Multigeneration ML analysis of C. elegans MA experiments: Data for productivity, r, and longevity of the Vassilieva and Lynch (1999) and Keightley and Caballero (1997) MA experiments were analyzed by the multigeneration ML method (Tables 3 and 4). Both experiments employed control lines that had been kept frozen, and data from these were included. Keightley and Caballero (1997) carried out repeat assays of generation 32 and 60 MA plus control lines. Vassilieva and Lynch (1999) carried out MA line assays contemporaneously with the control at 4 different generations (7, 20, 30, and 49) and at generation 0 (with the exception of longevity), so we include assay effects along with the other parameters. Each generation was therefore allowed to have a different mean that was common to control and MA lines. We also investigated models where, in addition, an effect was included to allow MA and control lines to have different means. In the likelihood evaluation, the variance of a line mean was inversely proportional to the number of worms assayed.
ML and BM estimates of genomewide mutation rates and average mutation effects for three traits in Vassilieva and Lynch's (1999) experiment are shown in Table 3. Estimates of s are scaled relative to the control population mean. Standard errors (SEs) of estimates are much smaller under ML than BM, a result we also obtained in the Monte Carlo simulation experiments, although the improvement in precision is larger than we expected on the basis of the simulations. Defining precision as the squared coefficient of variation of an estimate, ML is 24 times and 69 times more precise than BM, on average, for U and s, respectively [Vassilieva and Lynch (1999) data]. The inclusion of assay effects in the model may explain a large part of the improvement in precision, because the increase in log likelihood of the model containing these effects was very large (Table 5). The standard errors for the estimates were obtained from profile likelihood curves (Figure 1). In the case of s for all traits, and U for productivity and r, these curves are of quadratic form and reasonably symmetrical, and support limits based on a drop in log likelihood of 2 from their maxima are ~ ±2 SEs from the maximumlikelihood estimates. The profile likelihood for U in the case of longevity is strongly asymmetrical, and +2 SE gives a poor estimate of the upper support limit of ~0.03.
An effect of freezing worms could influence the Vassilieva and Lynch (1999)
results, because control lines were maintained frozen and MA lines were never frozen [both control and MA lines were cryopreserved in Keightley and Caballero (1997)]. To investigate this effect, ML analysis was carried out with an effect for freezing included. Although the effect is significant for all three traits (P < 0.001), the parameter estimates are hardly affected: for example, in the case of
Models with variable mutation effects: The line mean data from the last MA generation along with the control data were analyzed by ML under the assumption that mutations effects are gamma distributed with scale and shape parameters α and β, respectively. In the analysis of the Vassilieva and Lynch (1999) data (generation 49), all the control data were included, and effects specific to each block of control assays were estimated. Because there was also a significant effect for freezing (see above), an MA/control line effect was also estimated. Estimates of U and the mean mutation effect
For productivity, somewhat surprisingly, in both experiments log likelihood decreases as the kurtosis of the assumed gamma distribution increases, i.e., the bestfitting gamma distribution is the limiting case of equal mutation effects with β → ∞. Distributions much more leptokurtic than a gamma distribution with shape parameter 1 (i.e., an exponential distribution) are inconsistent with the data on the basis of likelihoodratio tests. For longevity and r, log likelihood for different β models changes nonsignificantly, so these data contain little information that can allow different distributions of mutation effects to be distinguished.
DISCUSSION
Simulation experiments: Over the range of parameter values studied, if the data conform to the model assumed, the ML and BM procedures give mean parameter estimates close to the simulated parameter values. If mutation effects are relatively small there is little benefit from using ML over the BM method of moments. However, if an appreciable fraction of the genetic variance is contributed by mutations with relatively large effects, ML can produce estimates with substantially lower variances than BM. Presumably, the distribution of MA line values will depart from a normal distribution, and replicates within lines may consistently deviate, so there is information to be extracted from the line value distribution in addition to the first and second moments. Furthermore, individual lines will show “jumps” between generations, and again the ML procedure will use this information. Hence there is a benefit from including additional intermediate generations. The opposite conclusion was drawn by Deng and Fu (1998) and Deng et al. (1999), on the basis of analysis restricted to the BM procedure. It remains to be investigated if, for a fixed number of experimental units available, the best design would involve the assay of several MA generations or rather a single generation assay involving greater replication. This tradeoff may be particularly important for organisms where a single large common garden experiment would be the rule (such as plants).
We also explored the robustness of the constant mutation effects model by simulating data sets in which effects of mutation are exponentially distributed. As with the case of data simulated with equal mutation effects, the BM and ML procedures perform similarly if mutation effects are small relative to the environmental standard deviation. Both methods also show similar levels of bias in this situation. If average effects of mutations are large, ML tends to be less biased and shows a higher level of precision than BM, as measured either in terms of the amongestimate variance or mean square error. As with the case of equal mutation effects, the difference in precision between ML and BM increases as the number of intermediate generations included in the analysis increases, and the effect is more apparent for the mean mutation effect than for U.
Improvement of precision under ML in analysis of C. elegans data: In both C. elegans MA experiments, ML estimates have considerably smaller sampling variances than corresponding BM estimates. The simulation studies suggest that an improvement in precision is to be expected in general (Tables 1 and 2), due to a more efficient used of information, but the improvement turned out to be larger than we expected on the basis of the simulations. There are three factors that may account for this:
In the experiments, there were lines that deviated by several standard deviations from the control means and probably carried mutations with large effects. Data of this sort lead to the greatest improvement of ML over BM.
The fitting of assay effects, which are large and significant (Table 5), removes much of the noise that clouds the results from the regression analysis. This is probably the most important factor.
The model of equal mutation effects appears to give a good fit to the data, at least in explaining the major effect mutations (Table 6), and the improvement in precision is expected to be greatest in this case.
C. elegans MA experiments: The negative estimates of s for productivity are in line with expectation and are in accord with the negative estimates for the mean mutation effect on r, a highly correlated trait. For longevity, Vassilieva and Lynch (1999) obtained a significant erosion of the mean (ΔM) and estimates for U and s of 0.064 and −0.048, respectively. However, due to a discrepancy caused by a single data point (generation 49 for the MA lines), our regression estimate of ΔM is about twothirds of Vassilieva and Lynch's (1999), and our BM estimates of U and s are consequently about 2 times smaller and 1.5 times greater, respectively (Table 2). The data file provided to us contains the most meaningful measure of longevity, and furthermore the level of mutational decay for longevity observed up to generation 49 has not been seen in later generations (M. Lynch and L. Vassilieva, personal communication). ML estimates of U and s for longevity are 0.0040 and −0.26, respectively. In terms of mutational target sizes, the conclusion from the two MA experiments taken together is r > productivity > longevity.
Overall, the ML estimates for the two C. elegans MA experiments agree with one another reasonably well (Tables 3 and 4). Taking an average over traits, estimates of U per haploid genome are 0.0041 (Vassilieva and Lynch 1999) and 0.0057 (Keightley and Caballero 1997). Average estimates of s are −0.51 Vassilieva and Lynch (1999) and −0.13 (Keightley and Caballero 1997), so Vassilieva and Lynch (1999) accumulated mutations with significantly larger effects, particularly for productivity and r. Vassilieva and Lynch (1999) discuss the possibility that natural selection strongly affected the Keightley and Caballero (1997) results. However, the estimates of mean mutation effects rather than numbers of mutations detected differ more strongly. An alternative explanation for this difference is a difference in environmental conditions, because experimental conditions resulted in lower productivity than is typical for the N2 strain in the case of Vassilieva and Lynch (1999), e.g., Johnson and Hutchinson (1993). Furthermore, natural selection is ineffective in eliminating all but strongly deleterious mutations if MA lines are propagated by transferring individuals each generation (Kibota and Lynch 1996; Keightley and Caballero 1997).
The ML estimates of the mutation effect parameter are surprisingly high, particularly for productivity and r in the case of Vassilieva and Lynch's (1999) data set, and may be influenced by extreme lines with low mean fitness. Visual inspection of the data suggested this to be the case: a minority of lines had consistently low fitness across several generations. A scatter plot of the rank of the line means for productivity at the last two generations (30 and 49) gives an indication of the extent of contribution of such extreme lines (Figure 2). Over most of the plot, points seem to be distributed at random, suggesting little covariance between rank across generations, but there is a deficit of points at the lefthand and lower edges along with the suggestion of an excess of lines that rank low at both generations. We further investigated the contribution of the lowranking lines to the U and s estimates by excluding subsets of extreme lines with mean phenotype <50% of the control population mean. [cf. Mukai et al. (1972), who performed a similar procedure]. The effect of excluding extreme lines is to somewhat reduce estimates of both U and s, but the latter estimates are still surprisingly high (e.g., for productivity, s ~ −0.5, data not shown).
Nature of mutational variability for lifehistory traits in C. elegans: Taking an average over traits, the ML analysis of Vassilieva and Lynch's (1999) data provides an estimate for U more than five times smaller than the corresponding average BM estimate. By the criterion of comparing Drosophila melanogaster and C. elegans on the basis of the sizes of their genomes (measured by the number of base pairs), and taking into account the lower number of germ line cell divisions in C. elegans than D. melanogaster, Vassilieva and Lynch (1999) argue that their average estimate of U (per haploid) of 0.025 is only ~2fold smaller than an average U estimate for D. melanogaster from MA experiments with balancer chromosomes [~0.3 per haploid genome; see Simmons and Crow (1977)]. However, the average ML estimate of U (~0.005 per haploid) is five times smaller again, and leads us to conclude that the mutation rates differ ~10fold. If extreme lines are excluded from the ML analysis, this difference increases. Furthermore, comparing two genomes on the basis of numbers of base pairs may be inappropriate, because the more compact C. elegans may contain less redundant DNA than D. melanogaster, and current estimates of gene number in C. elegans and D. melanogaster are similar (Simmenet al. 1998; Ashburneret al. 1999). The estimates of mean mutation effects from the two C. elegans studies and the D. melanogaster experiments involving balancer chromosomes of Mukai (1964), Mukai et al. (1972), and Ohnishi (1977) evaluated under the same model of equal mutation effects point to a qualitative difference in mutation spectra between these organisms. The D. melanogaster balancer experiments are characterized by a mutational erosion for a major fitness component (competitive viability) caused by a high rate of mutations with effects of only a few percent. This is manifest in a rapid decline in viability of “quasinormal” chromosomes. In contrast, average effects of mutations for the primary fitness traits (productivity and r) in C. elegans are estimated to be −0.62 (Vassilieva and Lynch 1999) and −0.15 (Keightley and Caballero 1997), with small ML standard errors. Thus, we argue that the decay in lifehistory trait mean and increase in genetic variance in C. elegans seem to be mostly attributable to mutations with relatively large effects. This can be clearly seen in the frequency distributions of control and MA lines (Keightley and Caballero 1997, Figure 1; Vassilieva and Lynch 1999, Figure 2), which show little appreciable loss in fitness of quasinormal lines. A possible explanation for this difference in behavior is that active systems of transposable elements, present in D. melanogaster but absent from C. elegans strain N2 (Eide and Anderson 1985), could generate a different distribution of mutational effects than nonTE mutations, although other explanations have also been proposed (Keightley 1996; GarciaDorado 1997; Fryet al. 1999).
The estimates for U we have obtained for C. elegans are extremely small. However, most of the analysis here has assumed that mutations have equal effects and produces an estimate of an “effective” number of mutations similar to the effective number of loci influencing a quantitative trait that can be estimated from line crosses (Falconer and Mackay 1996). The estimates presented here from the equaleffects ML analysis can therefore be taken only as an index of the number of mutations that make a large change to lifehistory trait values in the conditions assayed. However, the results of the two C. elegans MA experiments suggest that mutations with small effects make only a very small contribution to a mutational decay of lifehistory traits in the laboratory, a conclusion supported by the analysis of a recent ethylmethane sulfonate mutagenesis experiment in C. elegans (Davieset al. 1999). The question of the generality of mutational theories of population extinction (Lande 1995; Lynchet al. 1995), which depend critically on the distribution of mutation effects, will therefore depend on further work in a variety of species.
Acknowledgments
We thank Mike Lynch and Larissa Vassilieva for kindly providing the data from their mutationaccumulation experiment. We are grateful to Deborah Charlesworth, Esther Davies, Bruno Goffinet, Bill Hill, Armando Caballero, Mark Kirkpatrick, Brian Charlesworth, Ruth Shaw, Dan Schoen, Mike Lynch, and an anonymous referee for helpful comments on the manuscript. T.B. acknowledges support from the European Science Foundation (Plant Adaptation Program) and the French Institut National de la Recherche Agronomique. P.K. acknowledges support from the Royal Society of London.
Footnotes

Communicating editor: A. G. Clark
 Received February 22, 1999.
 Accepted November 23, 1999.
 Copyright © 2000 by the Genetics Society of America