- THIS ARTICLE
-
Abstract
- Full Text (PDF)
- Alert me when this article is cited
- Alert me if a correction is posted
- SERVICES
- Similar articles in this journal
- Similar articles in PubMed
- Alert me to new issues of the journal
- Download to citation manager
- Reprints & Permissions
- CITING ARTICLES
- Citing Articles via HighWire
- Citing Articles via Google Scholar
- GOOGLE SCHOLAR
- Articles by Kelly, J. K.
- Search for Related Content
- PUBMED
- PubMed Citation
- Articles by Kelly, J. K.
Deleterious Mutations and the Genetic Variance of Male Fitness Components in Mimulus guttatus
John K. Kellyaa Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas 66045
Corresponding author: John K. Kelly, University of Kansas, 1200 Sunnyside Ave., Lawrence, KS 66045., jkk{at}ku.edu (E-mail)
Communicating editor: J. B. WALSH
| ABSTRACT |
|---|
Deleterious mutations are relevant to a broad range of questions in genetics and evolutionary biology. I present an application of the "biometric method" for estimating mutational parameters for male fitness characters of the yellow monkeyflower, Mimulus guttatus. The biometric method rests on two critical assumptions. The first is that experimental inbreeding changes genotype frequencies without changing allele frequencies; i.e., there is no genetic purging during the experiment. I satisfy this condition by employing a breeding design in which the parents are randomly extracted, fully homozygous inbred lines. The second is that all genetic variation is attributable to deleterious mutations maintained in mutation-selection balance. I explicitly test this hypothesis using likelihood ratios. Of the three deleterious mutation models tested, the first two are rejected for all characters. The failure of these models is due to an excess of additive genetic variation relative to the expectation under mutation-selection balance. The third model is not rejected for either of two log-transformed male fitness traits. However, this model imposes only "weak conditions" and is not sufficiently detailed to provide estimates for mutational parameters. The implication is that, if biometric methods are going to yield useful parameter estimates, they will need to consider mutational models more complicated than those typically employed in experimental studies.
The long-debated question of whether or not genetic variation in fitness primarily reflects contributions of low-frequency deleterious alleles maintained by the balance between selection and mutation, or has a substantial contribution from variants maintained at intermediate frequencies by selection, is still unanswered.
CHARLESWORTH and HUGHES 2000
DELETERIOUS mutations have been a focus of study in both genetics and evolutionary biology for most of the history of each field (![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
The primary mutational parameters are U, h, and s. The genomic deleterious mutation rate, U, equals 2
µi, where µi is the deleterious mutation rate at the ith locus, and the summation is taken over all loci affecting fitness. The selection coefficient, s, is the proportional fitness reduction caused by the mutation when in homozygous form. The dominance coefficient, h, characterizes mutational effects in heterozygotes (see Table 1). A mutation with s = 0.05 and h = 0.2 reduces fitness by 1% in heterozygotes and 5% in homozygotes. As mutational effects are likely to vary among loci (and among different mutations at the same locus), estimates for s and h may actually represent average values for mutational effects. However, because the values of h and s for a given allele are assumed to be constant, this is essentially a model of unconditionally deleterious mutations. Given the range of applications outlined above, accurate estimates for U, h, and s will shed light on a number of unsolved problems in evolutionary biology.
|
At least three different approaches have been developed to estimate the rate and effects of deleterious mutations: mutation-accumulation experiments, molecular evolution surveys, and biometric analyses. These methods, briefly summarized below, each have their respective strengths and weaknesses. They are applicable to different sorts of data. The procedures also differ in the specific parameters that are estimated and in their underlying assumptions regarding genetics and evolution.
Mutation-accumulation experiments estimate mutational parameters from the observed genetic divergence of initially identical lines (![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
The molecular survey method is based on the rates of synonymous and nonsynonymous gene sequence evolution within an evolutionary lineage (![]()
![]()
![]()
![]()
Biometric methods use patterns of phenotypic variation in fitness components to infer mutational parameters (![]()
![]()
![]()
![]()
![]()
Biometric methods are generally less labor intensive than mutation-accumulation experiments. More importantly, these experiments can potentially yield estimators that are relatively low in bias for a wide range of mutational parameters (![]()
![]()
This article describes a biometric analysis of male fitness measures in Mimulus guttatus, a wildflower commonly known as yellow monkeyflower. My purpose is to address both of the potential pitfalls associated with the biometric method. Mutational parameters are estimated from a breeding experiment that uses randomly extracted, highly inbred lines as parents. This effectively eliminates the problem of purging within the experiment. Allele frequencies should not differ (consistently) between outbred and inbred plants. Second, the experiment is designed so that the number of "empirical comparisons" is greater than the number of unknown parameters. This provides the degrees of freedom necessary to test the adequacy of a model prior to parameter estimation. Following the suggestion of ![]()
Biometric methods depend on specific features of the experimental design such as the type and levels of inbreeding, the sorts of experimental crosses, etc. In the following sections, I first describe the experimental design and define the biometric quantities that can be estimated from this design. I then derive the expected values for these quantities under various forms of the deleterious mutation model (DMM). Next, I describe the likelihood-based methods for parameter estimation. A parametric bootstrapping procedure is developed, both to assess the adequacy of the experimental design and to perform hypothesis tests (obtain an appropriate P value for the various likelihood ratios). Finally, the statistical model is applied to phenotypic data from M. guttatus. The characters are flower size and two male fitness measures, pollen production and the pollen size index (PSI). The PSI is strongly correlated with pollen viability in M. guttatus (![]()
| BIOMETRIC QUANTITIES AND THEIR EXPECTED VALUES |
|---|
Consider a breeding design in which the parents are randomly extracted, fully homozygous inbred lines. Such lines can be formed in several different ways. For select species, completely homozygous lines can be formed in a single generation via chromosomal manipulations, gametophytic self-fertilization, or double-haploid formation. Repeated rounds of inbreeding provide a more generally applicable, albeit slower, method of line formation. Successive generations of self-fertilization can yield lines that are almost fully inbred in only five to six generations. The parental lines used in this study were developed by John Willis through six successive generations of self-fertilization (![]()
![]()
Regardless of the method, genetic purging is likely to accompany the formation of the inbred lines. Lines that become homozygous for mutations that are lethal or cause sterility in homozygous form will go extinct and eliminate these alleles from the experimental population. In contrast, deleterious mutations with smaller effects should fix randomly within lines (![]()
![]()
![]()
A major advantage of using fully inbred lines as parents in a biometric breeding design is that they are already fully purged. Subsequent inbreeding or outcrossing should result in predictable changes in homozygosity without changes in allele frequency (apart from the random changes due to finite sample sizes). Consider the breeding design described by Fig 1. Each "extended family" is constituted from the progeny of three randomly selected inbred lines (labeled L1, L2, and L3). One line (L1) is assigned as the "sire" and is mated to the other two lines, the "dams." The progeny of these crosses comprise two distinct but related outbred families (the families in Fig 1, bottom). Each line is also self-fertilized to produce three inbred families. The inbred families are unrelated to each other, but each is related to one or both of the outbred families. This same mating scheme is applied to other sets of lines generating a large number of extended families.
|
The measurement of progeny from this design provides data sufficient to estimate the mean phenotype of both inbred and outbred plants as well as a number of (co)variance components. Let ß denote the difference in mean phenotype between inbred and outbred plants. Four different comparisons among relatives can be quantified as covariances (Fig 1). CSS is the covariance among plants within selfed families (selfed siblings). CFS is the covariance among plants within outbred families (full siblings). COS is the covariance between outbred plants and their inbred siblings (related through a single parent). Finally, CHS is the covariance of plants from different outbred families (half-siblings related through the sire line). These C terms are "observational variance components" in the terminology of ![]()
![]()
![]()
![]()
Estimates for ß and the observational components are used to assess the sufficiency of a particular deleterious mutation model to explain genetic variability. If demonstrated as sufficient, the same data can then be used to estimate the mutational parameters of that model. For both purposes, it is necessary to express ß, CSS, CFS, COS, and CHS as functions of the mutational parameters. This requires a model for genotypic effects on both phenotype and fitness. Table 1 summarizes the standard fitness model for an individual locus: the deleterious allele (a) exists at population frequency q. Following ![]()
"translates" between phenotype and fitness.
Predicted values for ß, CSS, CFS, COS, and CHS can be obtained by applying quantitative genetic models developed for inbreeding populations (![]()
![]()
![]()
![]() |
(1a) |
![]() |
(1b) |
![]() |
(1c) |
![]() |
(1d) |
![]() |
(1e) |
where qi,
i, hi, and si are the allele frequency and effects associated with the ith locus (see also ![]()
Equation 1aEquation 1bEquation 1cEquation 1dEquation 1e depend on two important assumptions. The first is that the loci contribute additively to the phenotype. Oftentimes, it is assumed that the effects of deleterious mutations at different loci combine multiplicatively. In this situation, trait values should be log-transformed prior to analysis. As the appropriate scale is not obvious a priori, here I apply the methods to both log-transformed and untransformed data (see RESULTS and DISCUSSION). Second, each component is linear in qi because I have neglected terms of order qi2 (see Appendix A). This approximation should be quite accurate because deleterious alleles will not usually obtain high frequencies in a large population.
The frequency of deleterious alleles at a given locus (q) depends on the mutational parameters. I now assume that the natural population is large, randomly mating, and in mutation-selection balance equilibrium. I further assume that deleterious mutations have some phenotypic effect in heterozygotes (h > 0), an assumption supported by genetic data (![]()
![]()
The final step is to characterize the mutational spectrum, the joint density function of
i, hi, and si across loci. I consider three alternative models. In model I, mutational effects are fixed and do not vary across loci. In model II, mutational effects vary according to an exponential distribution. Model III considers variable mutational effects without imposing distributional assumptions.
Model I:
All new mutations have fixed effects
, h, and s. With constant mutational effects,
![]() |
(2a) |
![]() |
(2b) |
![]() |
(2c) |
![]() |
(2d) |
![]() |
(2e) |
where U = 2
µi as defined previously.
The five estimable biometric quantities are functions of only three unknown quantities: U
, h, and s
. Because both U and s are always included in a product with
, neither parameter can be estimated individually (unless a particular value for
is assumed). Only the composite parameters, U
and s
, can be estimated. The fact that the five measurable statistics are a function of only three unknowns allows us to test the sufficiency of model I. The model imposes the constraints that CFS = 2COS2/CSS and CHS = 1/2CFS.
Model II:
New mutations vary in their effects according to a simple probability model. Following ![]()
![]() |
(3) |
where
is the average homozygous effect. Since mutations with larger effects tend to be more recessive (![]()
![]()
![]() |
(4) |
where (-1/
) < K < 0. ![]()
The parameters of model II are U, K,
, and
. I assume that
i is the same across loci. The average dominance of new mutations,
, is
![]() |
(5) |
Integrating over the distributions of mutational effects (see Appendix B), we obtain predicted values for the biometric quantities:
![]() |
(6a) |
![]() |
(6b) |
![]() |
(6c) |
![]() |
(6d) |
![]() |
(6e) |
Again, our five observable quantities are functions of three unknowns. These three estimable quantities are composite functions of the parameters U
, K
, and 
. Model II constrains the value of CFS,
![]() |
(7) |
where h* = COS/CSS.
Model III (weak conditions):
Model II represents only one way in which mutational effects may vary across loci (or between different mutations at the same locus). A range of alternative models could be considered. For example, ![]()
![]()
| STUDY SPECIES AND METHODS |
|---|
M. guttatus (2n = 28; Scrophulariaceae) is a small wildflower, increasingly used as a model organism for genetic studies of floral variation and the evolution of plant mating systems (![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
John H. Willis initiated
1200 independent lines of M. guttatus in August of 1995. Each line was founded from the seed set of a separate field-collected plant from Iron Mountain. Each line was subsequently maintained in the greenhouse by single-seed descent (self-fertilization) for six generations. Six successive generations of self-fertilization yield an inbreeding coefficient of >0.98 and I treat the parents as fully inbred (f = 1) for the purpose of calculating covariances among relatives. As expected, these lines are almost completely homozygous at highly polymorphic microsatellite loci with different lines fixed for different alleles (![]()
Three hundred of these sixth-generation lines were used to found 100 extended families following the experimental design of Fig 1. A single plant from the sire line was grown to maturity, self-fertilized, and used as a pollen source for a single plant from each of the dam lines. Each of the dams was also self-fertilized. A maximum of eight pots were seeded per family (40 per extended family) on March 9, 2001 in the University of Kansas greenhouses. At 2 weeks after seeding, all pots were thinned to a single plant (randomly selected) and were subsequently fertilized once per week. A series of morphological measurements were taken on the first flower produced by each plant (see ![]()
![]()
Pollen number and the PSI were estimated for each sample using a Coulter counter model Z1 dual (Coulter, Miami). The microcentrifuge tubes were left open for 1 week after pollen collection to allow the anthers to dry and release their pollen. Each sample was then diluted into 20 ml of electrolyte solution and run through the Coulter counter. The machine counts the number of particles between two size thresholds (10 and 25 µM) and the number of particles that are larger than the upper threshold (>25 µM). The PSI is the proportion of grains in the upper size category. The size thresholds were set on the basis of results from a previous study showing a strong positive correlation between PSI and pollen viability as measured through standard staining techniques (![]()
The distributions of pollen number and PSI within both inbred and outbred plants are given in Fig 2. A genetic analysis of the floral morphology is presented elsewhere (![]()
![]()
![]()
|
| HYPOTHESIS TESTING AND PARAMETER ESTIMATION |
|---|
Maximum likelihood is used for both testing the various models and estimating parameters. By methods described below, I find the parameter values for each of the mutational models (I, II, and III) that produce the highest likelihood for the data. These values are the maximum-likelihood estimates (MLEs) for the parameters of that model. The sufficiency of that model is tested by comparing the maximum-likelihood value for the model with the maximum likelihood of the "unconstrained model." The latter is based on direct estimates for the biometric quantities without the particular restrictions imposed by the various mutational models (see section on constraints below).
The resemblance of relatives may depend on environmental factors as well as genetic factors. These factors must be accounted for in order to infer genetic parameters from phenotypic covariances. Maternal effects are a potentially important source of resemblance among relatives. The variance of maternal effects, VM, can be estimated from this design because the comparison between inbred and outbred sibs (estimating COS) involves both male and female parents (Fig 1). Also, the magnitude of environmental deviations may be greater for inbred individuals than for outbred individuals (![]()
![]()
With regard to hypothesis testing, it is important to distinguish universal constraints from model-specific constraints. The universal constraints apply to all of the models including the unconstrained model. These follow from basic features of inheritance, independent of allele frequencies or the nature of genetic effects. The universal constraints are that VE(o) and VE(i) must be positive; VM, CSS, CFS, COS, and CHS must be greater than or equal to zero; CHS
1/2CFS; and the magnitude of COS is limited by CSS and CFS (![]()
Subject to these universal constraints, MLEs were obtained for the unconstrained model by finding the set of values for MO, ß, CSS, CFS, COS, CHS, VM, VE(o), and VE(i) that maximize l, the log-likelihood function. Here, MO is the mean phenotype of outbred plants. The predicted mean phenotype of inbred plants is MO + ß. I use the standard multivariate normal model for the log-likelihood function,
![]() |
(8) |
where V is the variance-covariance matrix of individual measurements, z is the vector of trait values, X is an "incidence matrix" for fixed effects,
is the vector of fixed effects, and C is a constant determined by the total sample size (![]()
![]()
= [MO, ß]T.Equation 8 assumes a multivariate normal distribution of trait values, but is quite robust to deviations from this distribution (![]()
The maximum of l was determined iteratively by using the first derivatives of l with regard to each parameter and the asymptotic dispersion matrix. Following ![]()
![]() |
(9) |
The derivative of l with regard to a particular variance component
is
![]() |
(10) |
where (
/
)(V) is the derivative of the variance-covariance matrix with regard to
and tr[*] is the trace of a matrix.
Optimization was performed using two methods simultaneously, steepest ascent and scoring (![]()
![]()
![]()
The maximum log-likelihoods for each mutation model were also obtained usingEquation 8, although with different values in the variance-covariance matrix, V. The V matrix is a function of four genetic parameters in the unconstrained model (CSS, CFS, COS, and CHS), in addition to the environmental components. For model I, V contains only two genetic parameters, CSS and h, in addition to the same set of environmental components. The various comparisons between relatives within V are functions of these two parameters: COS = hCSS, CFS = 2h2CSS, and CHS = h2CSS. For model II, V also depends on only two genetic parameters: COS = h*CSS, CFS is given byEquation 7, and CHS = CFS/2. For model III, the elements of V are functions of three genetic parameters (CSS, CFS, and COS) with the genetic covariance of half-sibs equated to CFS/2. However, two of these parameters, CFS and COS, are constrained in value: COS < CSS/2 and CFS < COS. These model-specific constraints affect not only the numerical values within V, but also the derivatives of V with regard to each parameter [the (
/
)(V) matrices ofEquation 10].
Models I, II, and III were tested by comparing the maximum likelihood obtained under that hypothesis (l0) to the maximum likelihood obtained from the unconstrained model (l1). The model was rejected if the likelihood-ratio statistic 2(l1 - l0) was greater than the appropriate critical value. It is generally assumed that, under the null hypothesis, 2(l1 - l0) follows a chi-square distribution with the number of degrees of freedom equal to the number of parameters distinguishing the two models. This would suggest a critical value 5.99 (for P < 0.05) for models I and II because the unconstrained model has two more parameters than either deleterious mutation model. There is no obvious choice for model III because only one parameter is eliminated (CHS = 1/2CFS) while other parameters are constrained in value (COS < 1/2CSS, CFS < COS). These inequalities do not translate in a simple way into degrees of freedom. Even the critical values for models I and II may be suspect because the universal constraints apply to the unconstrained model (see ![]()
The parametric bootstrap involves repeated simulation of the data using the parameter estimates from the relevant mutational model (I, II, or III). Genotypic values for the parental lines are first sampled (from a normal distribution with variance CSS) and the progeny groups are then formed conditional on these parental values (and the values of the various parameter estimates). The numbers of individuals in each family are equivalent to the actual sample sizes from the experiment. Once the simulated data set is created, l1 and l0 are determined by applying the maximum-likelihood programs for the unconstrained and deleterious mutation models, respectively. I simulated 1000 data sets per test and the various models were fitted to each data set. The distribution of values for 2(l1 - l0), across simulated data sets, estimates the sampling distribution of the likelihood-ratio statistic. The P value associated with a particular test is (M + 1)/(R + 1), where M is the number of simulated values that exceed the actual likelihood-ratio value (calculated from the original data) and R is the number of simulations (![]()
In addition to their use in hypothesis testing, the parametric bootstrapping programs can be used to estimate the joint sampling distribution of our ML estimators for any mutational model in this experimental design. These simulations thus provide a means to evaluate properties such as bias and sampling variance. The distribution of simulated data sets demonstrates the range of possible experimental outcomes, including values for the likelihood-ratio statistic, when the null hypothesis is correct (when genetic variation conforms to the mutational model under consideration). I conducted simulation studies of both models I and II for a range of parameter sets to investigate (1) the null distribution of the likelihood-ratio statistic and (2) the statistical properties of the estimators. Each simulated data set was equivalent in terms of sample sizes to the experimental study.
| RESULTS |
|---|
Simulation study:
Table 2 summarizes a representative set of results from the simulation study. Cases 15 are simulations from models I, while cases 610 are from model II. Each simulated data set was first fit to the appropriate mutational model and then to the unconstrained model. The second column of Table 2 gives an empirically determined "critical value" for that set of parameters. Ninety-five percent (950 of 1000) of the simulations yielded likelihood-ratio statistics that were less than this value. For model I, the empirical critical values differed among parameter sets but were always less than the chi-square value of 5.99. For model II, there was also variation among parameter sets, but empirical critical values were somewhat higher than those for model I.
|
The maximum-likelihood parameter estimates for both models are approximately median unbiased (Table 2). They exhibit relatively low sampling variances. The procedures do occasionally fail to yield estimates when certain variance components approach boundary values. However, this occurred in only a small fraction of simulations (not at all for most parameter combinations) and these cases are excluded from Table 2.
Mimulus study:
Parameter estimates with standard errors (SEs) are given in Table 3 for each trait under the unconstrained model. The SEs are derived from the asymptotic dispersion matrix. These are only approximate and significantly nonzero estimates are routinely within 2 SE of zero (![]()
![]()
|
Maximum-likelihood values for each trait under model I are far lower than their associated values under the unconstrained model (Table 4). Each likelihood-ratio test is highly significant by both the parametric bootstrap and
2. A P value <0.001 for the parametric bootstrap implies that the observed likelihood-ratio (LR) statistic was greater than any of the 1000 simulated values. The MLEs for each trait are also given. Confidence intervals could be established for these estimates by resampling (bootstrapping) over extended families. I do not bother to justify the parameter estimates here because the model on which they are based is rejected.
|
There is generally better fit of the data to model II, although the model is still rejected for all traits (Table 5). As expected, the likelihood ratios are much lower for model III (Table 6). In fact, genetic variation in both Ln(pollen number) and Ln(PSI) are statistically consistent with the weak conditions. For all of the models, the likelihood ratios are lower for Ln-transformed male fitness values than for untransformed trait values.
|
|
| DISCUSSION |
|---|
This study clearly illustrates both the strengths and the faults of biometric estimates for mutational parameters. On the positive side, I was able to calibrate several different mutational models with an experiment of only moderate size (2345 plants). These models are surprisingly general, allowing variable mutational effects, maternal effects, and distinct environmental deviations for inbred and outbred plants. The parametric bootstrap simulations demonstrate that, if genetic variation is maintained according to one of the mutational models, the MLEs are surprisingly accurate (Table 2). Taken together, these observations suggest that biometric methods have the potential to yield accurate estimates for mutational parameters with much lower effort than is typically necessary for mutation-accumulation experiments.
The primary weakness associated with our parameter estimates is that they are based on models insufficient to explain the data. Likelihood-ratio tests reject models I and II for each of the traits (Table 4 and Table 5). This is not surprising for flower size, as a previous study demonstrated that this trait is inconsistent with a deleterious mutation model of genetic variation (![]()
![]()
![]()
![]()
The rejection of mutation-selection balance for male fitness traits is clearly tentative. Model III was not rejected for either Ln(pollen number) or Ln(PSI). This may reflect the fact that these "weak conditions" are consistent with a range of genetic architectures, at least within the scope of estimation error. However, it may also mean that variation is governed by a deleterious mutation model that is simply more complicated than models I or II (e.g., ![]()
![]()
Models I and II fail because there is too much additive genetic variation in both flower size and male fitness traits. The genetic covariance of full siblings (CFS) is equal to the additive variance when the parents are fully inbred. The genetic covariance of half siblings (CHS) is equal to one-half the additive variance (Appendix A). Under the deleterious mutation models, the amount of additive variation relative to other components depends on the average dominance of deleterious mutations. For example, with h = 0.2 in model I, the variance among fully homozygous genotypes (CSS) should be 12.5 times greater than the variance among outbred genotypes (CFS) and COS should be 2.5 times greater than CFS.
The CSS estimates are substantially greater than CFS estimates, particularly for log-transformed male fitness traits. However, COS is invariably less than CFS when the unconstrained model is fit to the data (Table 3). The low estimates of COS relative to CSS indicate that, if variation is attributable to rare alleles, these alleles must be fairly recessive on average. However, the more recessive they are, the less additive variation there should be. Estimating h just from COS and CSS in Table 3, we would expect the additive variances of the male fitness traits (as measured by CFS and CHS) to be only
20% of their actual estimated values. An excess of additive genetic variation has also been documented for several life-history traits of Drosophila melanogaster and used to reject mutation-selection balance as a sufficient explanation of variation (![]()
![]()
![]()
The higher-than-expected estimates for CFS and CHS also explain why the estimated dominance coefficients for male fitness traits under models I and II (Table 4 and Table 5) are greater than those obtained previously. ![]()
![]()
![]()
![]()
![]()
0.1 for both pollen number and pollen viability (both measurements log-transformed). In contrast, our estimates of h (from model I) are about twice are large (Table 4) and the estimates of
(from model II) are 3.5 times greater (Table 5).
The comparison of F1 families with parental lines is actually contained within this design (Fig 1) and it is straightforward to show that the regression method is actually estimating COS/CSS. If we calculate COS/CSS from the estimates in Table 3, the values are quite close to those obtained by ![]()
(Table 4 and Table 5) depend not only on COS and CSS but also on CFS and CHS. Maximum likelihood finds the best "compromise" given the relative magnitudes of all four variance components. This requires dominance coefficients to be adjusted upward to account for the higher-than-expected values for CFS and CHS. If intermediate frequency alleles make a substantial contribution to variation in fitness, the regression method (COS/CSS) may actually provide a more accurate estimate for the dominance of rare alleles than the procedure used here (J. K. KELLY, unpublished results).
Assumptions:
Models I and II employ an equilibrium model to predict the frequencies of deleterious alleles as a function of mutational parameters. These frequencies are determined by the action of selection in nature (if the model is accurate), but the effects are estimated from variation measured in the greenhouse. This is noteworthy given that both the relationship between trait values and evolutionary fitness and the relationship between genotype and phenotype are likely to be different in the greenhouse than in the wild. Character heritabilities and genetic correlations may change when organisms are assayed in a novel environment (![]()
![]()
![]()
in the model (Table 1). This parameter effectively translates between the measured phenotype and evolutionary fitness.
The important question for this study is whether the change in growth environment is responsible for the poor fit of models I and II to the data. Could the transfer cause rejection of the mutational models even if the natural population is in mutation-selection equilibrium? It is difficult to provide a definitive answer to this question, but it seems unlikely given that the hypothesis tests do not impose stringent constraints on the values of mutational parameters. Instead, parameters such as h and s of model I are estimated from the data. Thus, we should not falsely reject the null model, even if mutational effects are different in the lab.
The allele frequencies at QTL are the key to the constraints. A false rejection of the mutational models might occur if alleles that are deleterious in nature become advantageous in the lab environment. Such alleles might increase in frequency and the resulting lab-adapted population would exhibit an excess of additive genetic variation relative to the field population. However, the opportunity for this kind of selection was greatly limited in this study. The inbred lines that constitute the parents of the breeding design were generated rapidly by self-fertilization, with random selection of progeny within families (![]()
![]()
![]()
![]()
![]()
![]()
An alternative approach would be to let the natural population adapt to the lab environment prior to biometric analyses. This has the advantage that population allele frequencies will settle to values dictated by the selective conditions of the lab environment (the relevant values of U, h, and s if variation in fitness is maintained in mutation-selection balance). Of course, any parameter estimates would be specific to the lab environment. Moreover, a satisfactory fit of the data to a mutational model would not conclusively indicate that natural variation is maintained by mutation-selection balance. Variation maintained by variable selection, in either space or time, or genotype-by-environment interaction, would likely be lost in the process of lab adaptation.
A second potential concern with this study is that the deleterious mutation models assume random mating in the ancestral population. While the Iron Mountain population of M. guttatus is primarily outcrossing (![]()
![]()
![]()
![]()
![]()
Scale of measurement is another important concern for analyses based on patterns of variation. Scale transformations, such as the logarithm or square root of measurements, are routinely used in quantitative genetics to "stabilize" the variance (![]()
![]()
The most appropriate scale for the application of the present methods is the one in which loci contribute in the most nearly additive way to trait variation. On this scale, the mean phenotype should decline linearly with the inbreeding coefficient (![]()
![]()
![]()
![]()
Concluding comments:
More than anything, this study illustrates the need to combine model evaluation (hypothesis testing) with parameter estimation in biometric studies. Three distinct deleterious mutation models were tested. The first two models were rejected by the data for all characters, primarily because of an excess of additive genetic variation relative to other variance components. Thus, while most of the estimates in Table 4 and Table 5 are reasonable, they cannot be treated as unbiased. However, model III was not rejected for either of the log-transformed male fitness components (Table 6). This suggests that a more elaborate deleterious mutation model may prove sufficient.
How generally will mutation-selection balance prove sufficient as an explanation for variation in fitness components? This is clearly the issue that will confront future applications of the biometric method for estimating mutational parameters. The ubiquity of genetic variation is one of our most general observations of natural populations. The question of how evolutionary processes maintain this variation is one of the oldest in evolutionary biology. It remains one of the most important.
| ACKNOWLEDGMENTS |
|---|
This article benefited from comments by Norm Slade, Scott Williamson, Miles Stearns, Liza Holeski, Tara Marriage, Bruce Walsh, and two anonymous reviewers. Tyler Ternes, Scott Jablonski, and H. S. Arathi helped to collect many of the measurements. This research was supported by funds from National Science Foundation grant DEB-9903758 and the Murphy Scholarship fund.
Manuscript received November 11, 2002; Accepted for publication March 10, 2003.
| APPENDIX A |
|---|


























