THE genomic deleterious mutation rate (U) and the distribution of mutational effects for fitness, f(s), are important parameters for several theoretical issues in evolution (Charlesworth and Charlesworth 1998), and there has been much recent work on the problem of their estimation. There are currently three statistical approaches to infer U and f(s) on the basis of the distribution of fitness estimates among inbred mutation accumulation (MA) lines maintained under relaxed selection: minimum distance (MD; GarcíaDorado 1997), traditional maximum likelihood (ML; Keightley 1994), and Markov chainMonte Carlo ML (Shawet al. 2002). These methods extract information from the shape of the distribution of MA line means; this information is not used by the BatemanMukai method of moments (BM; Bateman 1959; Mukai 1964). Recently, GarcíaDorado and Gallego (2003) have compared the performance of the BM, MD, and ML procedures by computing means and variances of parameter estimates in replicated simulated data sets and concluded that MD tends to produce mean estimates with the lowest bias and sampling variance. In this letter, I question the evidence that led to these claims.
GarcíaDorado and Gallego's (2003) principal claims are that MD produces unbiased estimates of U and the mean mutational effect E(s), that MD outperforms ML by producing estimates of U that have lower bias and smaller mean squared error (MSE), and that ML performs more poorly because many estimates are “large outliers.” Table 1 summarizes the data on which GarcíaDorado and Gallego (2003) base their conclusions. In 4 of 6 cases mean MD estimates for U appear to be less biased than ML, and in 5 of 6 cases MD estimates of MSE are lower. However, there is a notable difference in the number of replicates that were excluded on the basis of failure to converge (15/62 for MD vs. 6/60 for ML; χ^{2} 1 d.f. = 3.43, P = 0.064). This difference presumably arises because MD and ML use different algorithms to locate maxima (or minima) in the multidimensional parameter space. ML employs numerical integration to compute likelihood of data as a function of U and f(s) and combines grid searches with the simplex algorithm (Nelder and Mead 1965; Presset al. 1992) to attempt to locate the global maximum likelihood. Convergence is declared when the relative change in likelihood between successive iterations is less than a userdefined threshold. The algorithm is guaranteed to converge (although not necessarily to the global maximum) and to produce parameter estimates if the user sets bounds on valid parameter values. MD uses a stochastic algorithm to produce proposal distributions of line means that are functions of U and f(s) and computes “distances” between the empirical and proposal distributions. A grid search is employed to attempt to find the combination of parameter values that minimizes the distance. Failure to converge is declared if the profile of distance as a function of the parameter of interest (i.e., the marginal of distance minimized with respect to all but one parameter) changes nonsignificantly over a range of three times the parameter value. This implies that MD can fail to provide estimates if the profile is flat in the region of the minimum. GarcíaDorado and Gallego (2003) exclude all MD replicates that fail to converge and any ML replicates for which the ML U estimate exceeds 50.
There is therefore an important difference in the criteria that were used to exclude replicates. Under ML, the set of nonexcluded replicates can contain some very large U estimates below the cutoff of 50 (see Figure 1). I argue that it is highly likely that the excluded MD replicates also tended to be at the upper end of the distribution of U values and that the exclusion of a higher proportion of these extreme replicates led to lower bias and lower sampling variance (Table 1). Replicates giving high U values tend to be excluded under the MD criterion because profiles of distance or likelihood frequently reach plateaus or asymptotically approach limits as a function of increasing U. The existence of such flat profiles has been demonstrated in empirical investigations of MD (GarcíaDorado and Marin 1998) and ML (Keightley 1994; Loeweet al. 2003) and in simulations of MD (GarcíaDorado 1997) and ML (Keightley 1998). The behavior does not depend on the way in which the data are analyzed and can be explained by considering the way in which the moments of the distribution of genotypic values of line means (X) change as a function of U: for high values of U the moments of the distribution of X can be held approximately constant by making compensatory changes upward and downward in the values of U and α, respectively (Keightley 1998); as U increases, the shape of the distribution of X (i.e., the proposal distribution under MD) can remain almost unchanged.
MA line data often contain insufficient information to allow unbiased estimation of mutational parameters simultaneously. The parameters are confounded in such a way that the best estimate of the mutation rate is often near a plateau in the profile of distance or likelihood. An estimation procedure that rejects nearly onequarter of such values (Table 1) should not be claimed to show “no bias” (GarcíaDorado and Gallego 2003). Furthermore, in cases where U, α, and E(s) are estimated simultaneously, a comparison of means or variances of parameter estimates cannot substantiate a claim that one estimation procedure outperforms another if a significant proportion of replicates are excluded and different exclusion criteria are employed.
Footnotes

Communicating editor: S. P. Otto
 Received November 25, 2003.
 Accepted January 19, 2004.
 Copyright © 2004 by the Genetics Society of America