## Abstract

How new mutations contribute to genetic variation is a key question in biology. Although the evolutionary fate of an allele is largely determined by its heterozygous effect, most estimates of mutational variance and mutational effects derive from highly inbred lines, where new mutations are present in homozygous form. In an attempt to overcome this limitation, middle-class neighborhood (MCN) experiments have been used to assess the fitness effect of new mutations in heterozygous form. However, because MCN populations harbor substantial standing genetic variance, estimates of mutational variance have not typically been available from such experiments. Here we employ a modification of the animal model to analyze data from 22 generations of *Drosophila serrata* bred in an MCN design. Mutational heritability, measured for eight cuticular hydrocarbons, 10 wing-shape traits, and wing size in this outbred genetic background, ranged from 0.0006 to 0.006 (with one exception), a similar range to that reported from studies employing inbred lines. Simultaneously partitioning the additive and mutational variance in the same outbred population allowed us to quantitatively test the ability of mutation-selection balance models to explain the observed levels of additive and mutational genetic variance. The Gaussian allelic approximation and house-of-cards models, which assume real stabilizing selection on single traits, both overestimated the genetic variance maintained at equilibrium, but the house-of-cards model was a closer fit to the data. This analytical approach has the potential to be broadly applied, expanding our understanding of the dynamics of genetic variance in natural populations.

MUTATION is the ultimate source of genetic variation, and determining the properties of new mutations is a necessary step in understanding the maintenance of genetic variation, adaptation, the evolution and maintenance of sex, and extinction risks in small populations (Lande 1975, 1995; Kondrashov and Turelli 1992; Schultz and Lynch 1997; Whitlock 2000; Zhang and Hill 2002; Johnson and Barton 2005). The mutational heritability of morphological and life-history traits, as well as fitness itself, has been estimated for several species of insects and crop plants (Houle *et al.* 1996; Lynch *et al.* 1999; Halligan and Keightley 2009). These studies have indicated that mutational heritability typically falls within the range 0.0001–0.005 (Houle *et al.* 1996; Lynch *et al.* 1999), with a typical persistence time of new mutations of 50 generations for life-history traits and 100 generations for morphological traits (Houle *et al.* 1996). However, if we are to understand the contribution of mutation to processes operating in natural populations evolving in the vicinity of their adaptive optimum, we need to evaluate the consequences of mutations in unmanipulated high-fitness populations under natural levels of recombination and outcrossing, a goal that has yet to be fully realized (Lynch *et al.* 1999).

Two general approaches have been used to characterize the input of new mutational variance each generation. These experimental approaches, which both reduce the opportunity for selection to operate, thereby allowing drift to strongly influence the fate of new mutations, differ in the extent to which they manipulate the standing genetic variance in the population. In mutation-accumulation (MA) divergence experiments, a highly inbred ancestral population is subdivided into many replicate lines, each maintained through inbreeding, and the divergence (variation) among lines is a measure of mutational variance (Lynch and Walsh 1998). The high level of inbreeding imposed in MA experiments results in the assessment of homozygous effects of new mutations against a homozygous genetic background. In some taxa (*e.g.*, *Drosophila melanogaster*), manipulative tools such as balancer chromosomes can be used to accumulate mutations in only part of the genome and in heterozygotes (Lynch and Walsh 1998). Although MA experiments have been employed effectively in a variety of taxa to estimate mutational variance (Houle *et al.* 1996; Lynch *et al.* 1999; Halligan and Keightley 2009), they may not be well suited to investigating the persistence of new mutations (Barton 1990) because it is selection against the mutation in heterozygote form that largely determines the fate of new mutations in natural populations. Furthermore, MA experiments, initiated from a single highly inbred line, do not provide estimates of the genetic variance segregating in the base population, which is necessary for a direct comparison between the mutational and genetic variance from a single population (Houle *et al.* 1996) to understand the selection acting against new mutations.

The second experimental approach, the middle-class neighborhood (MCN) breeding design, overcomes the drawback inherent in MA experiments of accumulating and measuring the effects of new mutations in a highly inbred genetic background. MCN experimental designs begin from an outbred, randomly mating population and reduce the opportunity for natural selection to operate by enforcing equal contributions to the next generation from all families (Shabalina *et al.* 1997; Lynch *et al.* 1999; Mack *et al.* 2001; Yampolsky *et al.* 2001). Although MCN experiments provide a more relevant estimate of the effects and fate of new mutations than do MA designs, they can suffer from other problems (Keightley *et al.* 1998; Lynch *et al.* 1999). In particular, direct estimation of the mutational variance from MCN experiments in the presence of substantial genetic variance in the base population has been highlighted as a key limitation of this design (Lynch *et al.* 1999). One approach to solving this problem has been to impose a quantitative genetic breeding design simultaneously in both the MCN population and the ancestral population from which the MCN was initially derived and using these data to demonstrate higher genetic variance (owing to new mutations) in the MCN population (Roles and Conner 2008). The logistical challenge of applying a large quantitative genetic breeding design in two populations with a single generation (common environment) is likely to limit the application of MCN designs used in this way despite their benefit in terms of allowing the estimation of mutational variance in seminatural populations undergoing random mating and recombination.

Although not yet widely applied, an analytical solution to the problem of estimating the mutational variance in a population with substantial standing genetic variance is available. Wray (1990) adapted the numerator relationship matrix used in Henderson’s mixed-model methodology (the animal model) (Henderson 1976), which uses pedigree information to estimate the additive genetic variance in measured traits, to account for the input of new mutations each generation. The animal model has facilitated recent advances in understanding of microevolution in natural populations where individual phenotypes and relatedness among individuals are known (Kruuk 2004; Wilson *et al.* 2010). Estimation of mutational variance through an animal model effectively addresses the major limitation of MCN designs identified by Lynch *et al.* (1999), allowing mutational variance to be estimated within the more natural context of an outbred population. Specifically, by fitting both the mutational and additive genetic numerator relationship matrices in these models, it is possible to explicitly partition the mutational variance from the standing genetic variance and to account for changes in the standing genetic variance over the generations of mutation accumulation, particularly the loss of variance owing to inbreeding or to the response to selection under laboratory culture. Pragmatically, by collecting phenotypic data from a modest population size each generation is logistically more tractable than attaining large sample sizes within a single generation, such as might be necessary when applying single-generation breeding designs to simultaneously estimate the genetic variance in the ancestral and derived MCN populations. Analysis of phenotypes across multiple pedigreed generations maintains the statistical power to detect effects. The combination of the MCN design with Wray’s mutational numerator relationship matrix within an animal model analysis has the potential to provide relevant estimates of mutational variance across a wide array of taxa not previously considered amenable to MA studies.

The estimation of mutational effects in heterozygous form in an outbred genetic background should more closely reflect the strength of selection acting against mutations when they first appear in a natural population. In particular, the estimation of both mutational and additive genetic variance from the same data, within the same model, will provide the most direct and biologically relevant determination of the average selection against mutations (*s* = *V*_{M}/*V*_{A}). Although prominent in the theoretical literature (Kondrashov and Turelli 1992), these parameters have been determined relatively few times, and current estimates typically depend on genetic and mutational parameters measured in different populations (and in some cases at different times using different experimental protocols in different laboratories), with mutational parameter estimates derived from inbred lines (Houle *et al.* 1996; Denver *et al.* 2005; McGuigan *et al.* 2014). The simultaneous estimation of standing genetic and mutational variance from the same outbred population also allows a direct quantitative assessment of the theoretical models of mutation-selection balance that attempt to explain equilibrium levels of additive genetic variance (Turelli 1984; Lynch and Hill 1986; Bürger 2000; Johnson and Barton 2005).

Despite its promising features, the animal model approach to estimating mutational variance has been applied on few occasions and only in highly manipulated populations (Keightley and Hill 1992; Casellas and Medrano 2008; Casellas *et al.* 2010). Casellas and colleagues adapted the approach of Wray (1990) to estimate the mutational variance in reproductive traits in a pedigreed inbred laboratory population of mice (Casellas and Medrano 2008) and an experimental flock of sheep (Casellas *et al.* 2010) within a Bayesian framework. Here we report animal model estimates of mutational parameters for 19 traits in an outbred population of *Drosophila serrata* measured over 22 pedigreed generations of mutation accumulation under a MCN breeding design. We use these estimates to make a direct quantitative test among competing models of the maintenance of genetic variance under mutation-selection balance.

## Materials and Methods

### Study system and experimental design

Our experiment was conducted on a population of *D. serrata* collected from The University of Queensland’s St. Lucia campus in Brisbane, Australia. The population was initiated from 33 inseminated wild-caught females (polyandry is typical in the wild: see Frentiu and Chenoweth 2008) and underwent eight generations of random mating as a mass-bred population (initiated from >1000 individuals in generations 2–8). This population was maintained on a 14-day-per-generation cycle in a constant-temperature facility set at 25° with a 12:12 light-dark regime and reared on a yeast-agar-sugar medium (Rundle *et al.* 2005). After eight generations of random mating, we initiated a paternal half-sibling breeding design with 120 sires each mated to two dams (Hine *et al.* 2014); to establish the MCN population, one son and one daughter per full-sibling family were randomly paired to found the first generation of the mutation accumulation experiment, generating 240 pairs in the MCN population.

The MCN population was maintained through 240 mating pairs in each generation for a further 21 generations under the same conditions to which it had been adapted when brought into the laboratory from the wild. A total of 10,440 animals were present in the pedigree. A single male and female offspring per family were randomly paired with a nonsibling of the opposite sex as 2-day-old virgins to initiate the next generation. Full pedigree information was recorded for both sexes in every generation.

We determined that the average inbreeding coefficient in generation 22 was 1.198% (corresponding to a 0.0006 increase per generation), slightly lower than in previous MCN studies in *Drosophila* (0.001–0.002 per generation) (Shabalina *et al.* 1997; Yampolsky *et al.* 2001). On average, 5% of pairs failed to contribute offspring to the next generation; when pairs failed to produce any offspring, three offspring from some (randomly chosen) families were used to ensure that the census population size remained constant (240 pairs). Failure to reproduce might reflect viability selection; similarly low levels of reproductive failure were reported from a MCN experiment in radishes, where 3% of plants failed to produce seeds (Roles and Conner 2008), and in *D. melanogaster*, where 1–2% of pairs failed to reproduce each generation (Shabalina *et al.* 1997).

After being left to mate for 3 days, females were discarded, and larvae were left to develop, while males were removed and assayed for cuticular hydrocarbons (CHCs) and wing characteristics. *D. serrata* CHCs, waxy lipids present on the cuticle, are under directional sexual selection through mate choice but also have consequences for nonsexual fitness (Higgie and Blows 2007; Chenoweth *et al.* 2010; Hine *et al.* 2011; McGuigan *et al.* 2011; Delcourt *et al.* 2012). CHC profiles of individual males were determined using standard gas chromatography (GC) protocols (Sztepanacz and Rundle 2012). The abundance of nine compounds was characterized by the area under the respective peak (Figure 1A); to account for nonbiological variation in abundances, we determined the proportion of each of the nine CHCs and transformed these proportions to log contrasts [where (Z,Z)-5,9-C_{24:2} was the peak used as the divisor] for analysis as appropriate for compositional data (Aitchison 1986; Blows and Allan 1998). Wing size and shape were characterized via nine wing landmarks (Figure 1B) (McGuigan and Blows 2013). Wing size was characterized as centroid size (CS), the square root of the sum of squared distances of each of the nine landmarks to their centroid, a metric commonly used to describe size (Rohlf 1999). Wing size is closely correlated with body size in *Drosophila* (Robertson and Reeve 1952; Partridge *et al.* 1987), and although there is no direct evidence of sexual selection on size, it is genetically associated with sexual fitness in *D. serrata* (McGuigan and Blows 2013). Wing shape was characterized via 10 interlandmark distances measured on aligned landmarks (after CS was taken into account). Like size, male wing shape does not appear to be a direct target of female mate choice but is pleiotropically associated with male sexual fitness and viability (McGuigan *et al.* 2011; McGuigan and Blows 2013).

Throughout this paper, we assume that our experimental population has reached equilibrium within the 30 generations after being founded from the field. Although this is unlikely to be strictly true, the approach to equilibrium is more rapid when the initial starting population has a high genetic variance and will be attained within a few dozen generations in populations of size *N*_{e} < 1/4μ, where μ is the genic mutation rate (Lynch and Hill 1986). In our case, the founding population was sampled from a large field population, and in our experimental MCN design, *N*_{e} was 2*N* − 2 = 958 (Crow and Kimura 1970), enforced for the 21 generations of MA. Assuming that μ is in the range 10^{−5} (Lynch and Walsh 1998) to 10^{−10} (Haag-Liautard *et al.* 2007; Keightley *et al.* 2009, 2014, 2015), the size of our experimental population sits comfortably within this limit.

### Data analysis and parameter estimation

The base population (*i.e.*, the sires of the half siblings) were not assayed for CHCs or wings. In subsequent generations, some males died before they could be phenotyped (but after siring offspring); further, delicate wings of some males were badly damaged and could not be measured. Of the 5040 males present in the pedigree in generations 1–21, CHC data were collected for a total of 4945 individuals, and wing data were collected for 4236 individuals. The pedigree and phenotype data are available at http://dx.doi.org/10.6084/m9.figshare.1542535.

The eight CHC log contrasts, wing CS, and the 10 wing-shape traits were each analyzed individually. The CHCs were measured by GC in three separate blocks, within which samples were randomized among generations: block 1 = generations 1–6; block 2 = generations 7–12; and block 3 = generations 13–21. Because CHC means differed among GC blocks and block was confounded with generation, we first mean centered the CHC data within each GC block prior to analysis. No such effects occurred for wings, and wing data were not centered before analysis.

For each of the 19 traits, we partitioned the phenotypic variation into additive, mutational, and residual components in a univariate animal model (1)where *μ* is the overall mean, *t* indicates generation, *a* is the additive genetic breeding value, *m* is the mutation breeding value, and *e* is the residual error. The variance components for additive and mutation effects were calculated as the variance in breeding values scaled by the corresponding relationship matrix and (Kruuk 2004; Casellas and Medrano 2008). For the error variance, we assumed that errors were uncorrelated among individuals, and therefore, , where **I** is an identity matrix. Although, owing to breeding failures, a small number of males each generation (5% on average in any generation; see earlier) shared a common rearing environment (and were full siblings), for reasons of computational simplicity, we do not fit rearing vial in the models. Common environment therefore might have inflated our estimate of additive genetic variance. However, we note that only a small fraction of animals shared a common environment within a generation, and this was not perpetuated among relatives across generations.

Although large pedigrees with incomplete phenotypic information (*e.g.*, for sex-limited traits in economically important agricultural taxa) are common, the pedigree structure from the MCN design differs markedly from those types of populations. Specifically, wild and agricultural populations typically consist of both full and half siblings, with overlapping generations. In the MCN design, generations are nonoverlapping, and total population size remains constant because family size is experimentally controlled in each generation. This results in a strongly linear pedigree structure in which most of the genetic information comes from parent-offspring relationships (coefficient of coancestry = 0.5). We recorded maternal relationships each generation, but preliminary analyses of model (1) showed that including maternal pedigree links resulted in a dense numerator relationship matrix that required prohibitively long computational times to solve.

Another consideration when including female pedigree links is that our analysis would solve for the breeding values of the 5280 females for which we had no phenotype information (Henderson 1976), an important consideration given our prior knowledge that the genetic bases of CHCs (Gosden *et al.* 2012) and wings (McGuigan and Blows 2007) differ between the sexes. Nevertheless, analyzing model (1) with a male-only pedigree has the effect of ignoring many of the relationships resulting in coefficients of coancestry ≤ 0.125 (except grandfather-grandson, great grandfather–great grandson, etc.). We consider further in the discussion how our pedigree affected the accuracy and precision of our parameter estimates. Additionally, relationship matrices for the male-only pedigree do not account for inbreeding, but as noted earlier, inbreeding was very low in the current experiment and unlikely to have strongly influenced parameter estimates.

To estimate additive genetic variance using the animal model, we used the method of Henderson (1976) to calculate the inverse of the additive genetic numerator relationship matrix (2)where is a lower triangular matrix, and is a diagonal matrix. For a male-only pedigree, each row of has one nonzero value (equal to −0.5) to the left of the diagonal linking sires and sons. If the additive relationship matrix **A** is calculated, the elements on the diagonal of are equal to the inverse of the elements on the diagonal of the Cholesky of **A**.

In an animal model context, new mutations will generate variances and covariances among animals that cannot be traced to the base population because they are the result of mutations accruing in generations subsequent to that ancestral base. To estimate the mutational variance, we used the method of Wray (1990) with the modifications outlined in Casellas and Medrano (2008) to calculate the numerator relationship matrix for mutational effects (3)where t is the number of generations, and is the relationship matrix for additive effects attributable to mutations arising in generation k. The elements of describe the additive genetic relationships among individuals in generations after generation k if ancestors in generations before k are ignored. Our pedigree considered 22 nonoverlapping generations; hence **M** was calculated for generations 0–21 [for an illustration of this procedure, see Wray (1990)].

Because and have the same pattern of nonzero values, it is possible to adapt Henderson’s method for calculating [Equation (2)] to calculate (Henderson 1976). Following Wray (1990), (4)where is a diagonal matrix with elements equal to the inverse of the elements on the diagonal of the Cholesky of **M**, and is the same lower triangular matrix as used to calculate earlier.

We fit model (1) using the MCMCglmm package (Hadfield 2010) in R (v. 3.0.1). The MCMCglmm analysis was implemented for all traits individually with 10,050,000 iterations, 50,000 burn-in, and a thinning interval of 10,000, resulting in 1000 posterior estimates of the parameters for each trait. Priors for the location parameters were normally distributed around a mean of zero and a variance of 10^{8}. The scale parameters for additive and mutational variance were chosen based on vague prior knowledge of additive genetic and mutational heritability for morphological traits. For the residual and additive genetic variance components, priors were inverse-Wishart distributed with degrees of freedom equal to 1 and scale parameters equal to two-thirds and one-third of the phenotypic variance, respectively, corresponding to an additive genetic variance of *h*^{2} = 0.3 (Mousseau and Roff 1987; Roff and Mousseau 1987). For the mutational variance, we used a prior conforming to the noncentral *F*-distribution with the scale hyperparameter equal to one one-hundredth of the residual variation. It is recommended that the scale hyperparameter be a value larger than what would be typically expected (Gelman 2006; Hadfield 2014); hence we specified based on the largest mutational heritability observed for morphologic traits in *D. melanogaster* (Houle *et al.* 1996; Lynch *et al.* 1999). In preliminary analyses, we investigated the robustness of our posterior estimates to our prior specifications; varying the values for the scale parameter and degrees of freedom, we observed little change in the parameter estimates. Even a nonsensical prior, where we switched the scale parameter for the prior variance of additive effects and mutational effects (*i.e.*, a prior where *h*^{2} = 0.01 and ), gave posterior distributions that overlapped the posterior distribution from the prior based on vague prior knowledge (as described earlier).

MCMCglmm estimates of variance components are bounded at zero, and thus a hypothesis test for the presence of mutational or genetic variance based on examining whether posterior density intervals of variance components overlap zero is uninformative. One approach to determine whether estimates are greater than zero is to compare them to the variation expected to be observed through random sampling (Hadfield *et al.* 2010; Aguirre *et al.* 2014). To do this, we randomized the pedigree and reestimated *V*_{A} and *V*_{M} from the observed phenotypic data. Estimates of *V*_{A} and *V*_{M} from the randomized pedigree data give an indication of the amount of variation we can expect by chance, given our experimental design and our statistical model. Because our estimates of observed *V*_{A} and *V*_{M} were derived using an animal model based on a multigenerational pedigree with nonoverlapping generations, to remove the biological structure captured by the pedigree, we randomized individuals among families within generations to randomly break identity-by-descent relationships within the data and then recalculated and . We repeated the pedigree randomization 100 times to account for any uncertainty introduced by the randomization procedure. Then, for every randomization, we replaced the observed and matrices in model (1) with the randomized (null) and matrices. The models based on the randomized pedigrees had the same priors, burn-in, and thinning intervals as those for the observed data, except that because we had 100 randomized data sets, we only stored 10 MCMC samples (after burn-in and thinning intervals) from each null data set per trait to give 1000 posterior samples for null *V*_{A} and null *V*_{M}. Overall, the null *V*_{A} and *V*_{M} estimates for each trait indicated low autocorrelation (*r* < 0.1) among MCMC samples within and among randomizations and reasonable convergence (*R* < 1.23) (Gelman and Rubin 1992).

Finally, since the **A** and **M** relationship matrices in Wray’s model (Wray 1990) share many relationships, we investigated the joint behavior of the estimates of *V*_{A} and *V*_{M} arising from model (1). We estimated the correlation between the observed estimates of *V*_{A} and *V*_{M} for each trait, as well as the correlation between the null *V*_{A} and null *V*_{M} estimates (Supporting Information, Table S1). Observed estimates of *V*_{A} and *V*_{M} were uniformly negatively correlated, while the null estimates were essentially uncorrelated, although there was some bias toward very small negative correlations. Correlations among *V*_{A} and *V*_{M} estimates therefore are generated by the biological signal in the data and not the statistical model.

### Data availability

Data are available through figshare: http://dx.doi.org/10.6084/m9.figshare.1542535.

## Results

### Estimates of *V*_{A} and *V*_{M} in an outbred population

Median heritability ranged from 0.073 to 0.499 for the CHC traits and from 0.316 to 0.514 for wing traits (Table 1), suggesting low to moderate heritability for all traits. To determine the statistical support for the presence of additive genetic variance, we compared the posterior distribution from the observed data to the null expectation for each trait generated by the randomization of the pedigree structure (Table S2). Figure 2A illustrates the outcome of this approach for one trait, (Z,Z)-5,9-C_{29:2}. For this trait, the posterior distribution of *V*_{A} estimated from the observed data is distinct from the null, with a median substantially above the null median (Figure 2A). The posterior distribution of the null data has a nonzero median, as expected from the constraint, inherent in the estimation method, to return a positive variance component. However, the density of posterior values of the null close to zero is greater than that for the observed data (Figure 2A). The 95% posterior confidence intervals of the null and observed distributions do not overlap (Figure 2A), indicating significance of the *V*_{A} estimate at the conventional α = 0.05 level. All but three traits [(Z)-9-C_{25:1}, (Z)-9-C_{26:1}, and ILD 5-6] displayed significant *V*_{A} values using this approach (Table S2).

Mutational heritability is calculated by scaling the mutational variance by the environmental variance (Lynch and Walsh 1998), representing the instantaneous rate of increase in heritability when there is no genetic variation. When estimating *V*_{M} from MA lines derived from a common inbred ancestor, only environmental variation and mutational variance contribute to phenotypic variation, and mutational variance will be a small component of the total variance such that the environmental variance essentially corresponds to the total phenotypic variance. Under a MCN breeding design, standing genetic variance can be substantial, here comprising up to 50% of the phenotypic variance (Table 1). We therefore departed from the standard approach and scaled the mutational variance by the total phenotypic variance to calculate mutational heritability ().

Our estimates of ranged from 0.0010 to 0.0068 for 18 of the 19 traits, with one trait (ILD 5-6) having an of over twice the upper value of this range, 0.018 (Table 1). Both the null and observed posterior distributions of *V*_{M} are truncated at zero, an example of which is shown for the trait (Z,Z)-5,9-C_{29:2} in Figure 2B. There are two consequences of this posterior distribution. First, although the median *V*_{M} of the observed data for (Z,Z)-5,9-C_{29:2} is higher than the median *V*_{M} of the null data (Figure 2B), clearly, a standard test of significance of *V*_{M} from the null would fail to reject the hypothesis of no mutational variance for this trait (and other traits) (Table S2). The null distribution tails off more abruptly than the distribution from the observed data, which has more posterior samples with larger estimates of *V*_{M} and a greater density of values sitting further away from zero (Figure 2B). Across all 19 traits, the median of the observed data was always greater than the median of the null data (Figure 3 and Table S2), a result that is highly unlikely to have occurred by chance. The consistency of greater mutational variance in the observed data over the null therefore strongly suggests that we were able to detect mutational variance in this experiment, although our power to demonstrate statistical significance for each individual trait was very limited. We further consider this question of power in the *Discussion*.

The second consequence of the truncated posterior distribution of observed and null analyses is that the same process that results in a push away from the zero boundary in the null distribution will also inflate the estimate the observed median *V*_{M} to some extent. Here we chose to explicitly account for this known inflation bias in our estimates using the null model estimates to correct the observed data estimates (McGuigan *et al.* 2014). Although the same phenotypic data were used in both observed and null model analyses (and *V*_{P} is the same for both) (Table S2), the posterior samples are not paired. Therefore, we cannot apply the bias correction to the 1000 posterior samples but rather apply it to the summary statistic of that distribution, the median. Thus, we determined the median from each distribution and from these medians calculated (Table 1). We similarly corrected estimates of *V*_{M} scaled by *V*_{A}(*s*) or trait mean *CV*_{M} by first correcting the median estimate of *V*_{M} () and then dividing by median *V*_{A}(*s*) or taking the square root and then dividing by the median trait mean *CV*_{M} (Table 1). In the next section, we compare these adjusted estimates of mutational variance scaled by the total phenotypic variance, additive genetic variance, or the trait mean with published estimates.

### Comparison of magnitudes of mutational variance

We compared our adjusted estimates of mutation with those reported for a range of morphological and life-history traits in other plant and animal taxa (Figure 4). Wing traits closely matched the observed median mutational heritability for morphological traits in other taxa, while CHCs were more similar in median heritability to life-history traits (Figure 4A). To our knowledge, we present the first estimate of mutational variation in CHCs or similar pheromone traits. Mutational variation in wing shape and size has been estimated previously from MA experiments in *D. melanogaster* (Santiago *et al.* 1992; Houle *et al.* 1996; Houle and Fierst 2013). The average for *D. melanogaster* wing dimensions [in Table 1 of Houle *et al.* (1996)] was 0.00202 (*CV*_{M} = 0.00136) compared with the slightly higher estimates in this study for *D. serrata* wing centroid size of 0.00459 (0.00159) and median estimate of the 10 wing-shape traits of 0.00447 (0.00239) (Table 1). Houle and Fierst (2013) reported mutational heritability of wing shape from estimates in both heterozygote (0.001) and homozygote (0.0024) form, both again slightly lower than our estimates.

When mutational variance is scaled by the standing genetic variance, one can calculate the average selection against a mutant heterozygote (*s* = *V*_{M}/*V*_{A}) (Barton 1990). Following the pattern for mutational heritability, median *s* was typically higher for wing traits (0.0132) than for CHCs (0.0043) (Table 1 and Figure 4B). For both types of traits, the median *s* was comparable to that reported previously for morphological traits and slightly below that reported for life-history traits (Figure 4B). Notably, there was a single wing trait (distance between landmarks 5 and 6) (Figure 1B) with *s* markedly higher than for other traits (Table 1). This was the only trait with a median selection coefficient above 0.021, which represents the median level of selection against viability mutations (Houle *et al.* 1996).

### Tests of mutation-selection balance theory

With estimates of standing genetic and mutational variance from the same outbred population, we can begin to make direct quantitative assessments of the theoretical models of mutation-selection balance that attempt to explain the levels of *V*_{A} maintained in the presence of stabilizing selection and mutation. For our traits, a positive relationship exists between the magnitudes of *V*_{A} and of *V*_{M} (Figure 5), and we now determine whether this relationship is consistent with drift alone or direct stabilizing selection on individual traits. The trait ILD 5-6, with the outlying estimate of , has not been included in the following analyses because it would have had a highly disproportionate influence on the model-fitting approach employed; we note, however, that the pattern for this trait was consistent with the predicted pattern under the house-of-cards model generated from the other 18 traits.

As a first step, the equilibrium genetic variance under a neutral model is given by *V*_{A} = 2*N*_{e}*V*_{M} (Lynch and Hill 1986). For our experiment, where *N*_{e} = 958 and the median adjusted *V*_{M} = 4.52 × 10^{−6} (Table S2), this equation predicts an additive genetic variance of 8.65 × 10^{−3}, twenty times greater than the observed median *V*_{A} of these traits (0.397 × 10^{−3}) (Table S2). This neutral expectation is plotted in Figure 5 on the heritability scale and clearly demonstrates that assuming no selection acting against mutations affecting these traits results in substantial overestimation of the standing genetic variance given the observed mutational variance. The failure of this model is not surprising given the evidence for selection acting on both wing and CHC traits in this and related *Drosophila* species (Hine *et al.* 2011; McGuigan *et al.* 2011; McGuigan and Blows 2013).

Next, we considered two competing models of drift-mutation-selection balance for a single trait under direct stabilizing selection. First, assuming that the variance in heterozygous effects of mutations at each locus follows a Gaussian distribution, the Gaussian allelic approximation for mutation-selection equilibrium is given by Lynch and Lande (1993) as (5)where ℓ is the number of loci that contribute to the trait, *V*_{s} is the strength of stabilizing selection on the trait (scaled by *V*_{P}), and and are substituted for *V*_{A} and *V*_{M} to place the prediction on the heritability scale. Although most theoretical work assumes *V*_{s} ≥ 20, empirical estimates of this parameter are lower (median = 5) (Kingsolver *et al.* 2001, 2012; Johnson and Barton 2005), so we fixed *V*_{s} = 5 to fit Equation (5) to the data. Maximum-likelihood optimization was conducted using the NLMIXED procedure in SAS (v. 9.3) (SAS Institute 2011). Figure 5 shows the fit to the data of this model (−2LL = −23.9, AIC = −19.9) when the maximum-likelihood estimate of ℓ = 4.2 is used. It is important to note that this model-fitting approach is not employed to give biologically interpretable estimates of the parameter ℓ because it is completely confounded with *V*_{s}, which we fix at *V*_{s} = 5 to fit model (5).

Using an alternative set of assumptions known as the *house-of-cards mutational scheme*, the distribution of mutational effects at each locus is assumed to be highly leptokurtic, with most *V*_{M} contributed by rare alleles of larger effect (Johnson and Barton 2005). The stochastic house-of-cards approximation for a single trait under direct stabilizing selection is given by (Bürger 2000, equation 2.15) (6)where *U* is genomic per-trait mutation rate, which is thought to be on the order of *U* = 0.1 (Lynch and Walsh 1998). Again fixing *V*_{s} = 5 and substituting and for *V*_{A} and *V*_{M}, we fit this model to the data (−2LL = −28.6, AIC = −24.6), and Figure 5 shows the result when the maximum-likelihood estimate of *U* = 0.061 is used. As with ℓ, this analysis cannot provide a biologically interpretable estimate of *U* because it is confounded with *V*_{s}, which we fix [see model (6)]. Comparison of the Akaike information criterion (AIC) suggests that the house-of-cards model is a better fit to the observed data than the Gaussian allelic approximation. However, as we discuss later, both models tend to overestimate the predicted genetic variance when more biologically plausible parameter values are used.

## Discussion

The generation of new mutational variance in natural populations has widespread evolutionary implications, and a key goal is to understand the consequences of mutation in populations evolving in the vicinity of their adaptive optimum under natural levels of recombination and outcrossing (Lynch *et al.* 1999). The mutational numerator relationship matrix developed by Wray (1990), when used within an animal model, has the potential to change our approach to this difficult task. This approach allows the estimation of mutational parameters from a wide array of taxa and traits in the same way that the widespread adoption of the animal model itself expanded the study of genetic variance beyond the laboratory or captive bred populations.

### Mutational variance in an outbred population

The estimated mutational heritability in wing shape and CHCs for *D. serrata* males was very similar in magnitude to that reported from inbred-line experimental designs applied to *D. melanogaster*. Lynch *et al.* (1999) reported the range of published estimates of to be 0.0005–0.005 for *D. melanogaster*, matching the range found here for *D. serrata* of 0.0006–0.006 with one exception. The similarity in the magnitude of between estimates derived from inbred and outbred experimental populations is surprising because there are two reasons that could cause measured in an MCN experimental design to be downwardly biased in comparison to an inbred population. First, in MCN experimental designs, the high (relative to inbred lines) within-family variance provides the opportunity for selection to act (Lynch *et al.* 1999; Moorad and Hall 2009), under our experimental design, via competition among full siblings within each family vial. Selection against new deleterious mutations then could result in an underestimation of the mutational variance. It is worth noting that selection likely operates on inbred MA divergence lines to an underappreciated extent (Keightley *et al.* 1993; Lynch *et al.* 1999; McGuigan and Blows 2013), suggesting that the comparison of estimates from the different methods is not simply one between the presence and absence of selection.

Second, in the MCN design, mutational effects are characterized in a heterozygous genetic background in contrast to studies of inbred MA lines, where effects are typically determined in homozygous form. If many mutations are recessive, heterozygous effects of mutations on the phenotype will be smaller than homozygous effects, consequently reducing the level of mutational heritability detected under the MCN design. This question of dominance has been explicitly addressed in inbred-line MA studies by crossing among MA lines to estimate mutational effects in heterozygotes. For instance, Houle and Fierst (2013) reported two to three times higher mutational heritability of wing shape when estimated from crosses within *vs.* among MA lines. Although crosses among MA lines assess dominance effects of mutations, the genetic background within which these mutations are assessed is still relatively genetically depauperate. There are few estimates of heterozygous mutational heritability in outbred populations (see Shabalina *et al.* 1997; Roles and Conner 2008). The only other estimate that we are aware of to come from application of Wray’s method (Wray 1990) in an outbred population is = 0.006 for litter size in a domesticated sheep population (Casellas *et al.* 2010). That estimate is at the upper end of reported estimates of mutational heritability in life-history traits (Houle *et al.* 1996) and at the upper range of our estimates here for morphological traits. Overall, it appears that neither source of potential downward bias (within family selection or heterozygous effects) has had a discernible impact on estimated in our outbred population. The magnitude of the mutational variance estimated from inbred populations therefore may be broadly representative of the variance generated by the effects of heterozygous mutations, although further studies in other taxa and traits are needed to confirm this.

It follows that estimates of selection coefficients derived from inbred lines also may reflect the magnitude of selection against heterozygous mutations, of particular interest in models of mutation-selection balance. To the best of our knowledge, our estimates of selection coefficients (Table 1) represent the first in which both the numerator and the denominator have been estimated in the same outbred population; previous estimates of selection coefficients of new mutations were the product of combining estimates of *V*_{M} and *V*_{A} (or *V*_{G}) from different experiments on distinct populations (*e.g.*, outbred for *V*_{A} and inbred for *V*_{M}) (Houle *et al.* 1996). The ratio *V*_{M}/*V*_{A} estimated from the same data in an outbred population reflects the heterozygous effects of new mutations in an outbred genetic background, which closely resembles how mutations would affect the phenotype when they first arise in a natural population. Our estimates have a median of 0.008, comparable to the median estimate for morphologic traits reported for a range of taxa (Houle *et al.* 1996), suggesting an average persistence time of mutations affecting such traits of about 125 generations.

### Limitations of the experimental design

Despite several lines of evidence that suggested we successfully captured the effects of mutation in an outbred population, we were unable to provide evidence for statistically significant mutational variance in any single trait. Similar problems have been encountered when applying the animal model to estimate additive genetic variation in wild populations; when estimating quantitative genetic parameters under natural conditions, data will be more variable and larger sample sizes are required to provide sufficient statistical power (Kruuk 2004). In our experiment, greater variability was introduced not by the environment (as in wild populations) but because we were focused on mutational variance while allowing standing genetic variation to persist in our experimental population, a key feature that allowed us to estimate mutational effects in the heterozygous state.

In general, and ignoring biases introduced by maternal and environmental effects (Clément *et al.* 2001; Kruuk and Hadfield 2007), in an animal model analysis of genetic variation, statistical power is determined by the size of the base population, accuracy is determined by the number of pedigree links between levels of the fixed effects (*i.e.*, generations in our case) (Kennedy and Trus 1993; Hanocq *et al.* 1996; Clément *et al.* 2001; Quinn *et al.* 2006), and precision is determined by the number of phenotypes per family (Quinn *et al.* 2006). All these properties of the experimental design are captured by the T-matrix and hence should apply to both additive genetic [Equation (2)] and mutational [Equation (4)] effects. For our particular experimental design, consisting of 120 base-population males with a pedigree maintained for a further 21 generations and one individual measured per family, we favored accuracy, likely at the expense of power and precision. We expect that increasing the number of individuals sampled in the base population and measuring the phenotypes of multiple individuals from each family would result in greater power to detect effects and more precise estimates of mutational variance.

### Mutation-selection balance in an outbred population

The processes maintaining genetic variation in natural populations remain one of the unresolved questions in evolution (Johnson and Barton 2005). Genetic variation might be maintained by balancing selection, favoring alternate alleles or phenotypes, or through the balance between the input of new mutations and their elimination via selection. Mutation is not an explicit parameter of balancing selection models, although it is the ultimate source of the genetic variance maintained under this mode of selection. Here we explored the implications of the observed magnitudes of standing and mutational variance for models of mutation-selection balance. Mutation-selection balance models of the maintenance of genetic variation have struggled to reconcile the general observations of strong stabilizing selection, high heritability, and high mutation in natural populations (Johnson and Barton 2005). Consistent with this view, both the Gaussian and house-of-cards models tended to overestimate the standing level of *V*_{A} in this population when compared to the data. Notably, we assumed strong selection (*V*_{s} = 5) compared to *V*_{s} ≥ 20 that is used in most theoretical treatments (Johnson and Barton 2005) of mutation-selection balance. Weakening the strength of selection increases the equilibrium *V*_{A} predicted by these models, further worsening their fit to the observed data. Similarly, forcing the other two parameters that were free to vary, ℓ and *U*, to possibly more realistic values, say, ℓ > 10 or *U* = 0.1 (Lynch and Walsh 1998), also increases the equilibrium level of *V*_{A}.

Johnson and Barton (2005) identified three different classes of mutation-selection balance models: direct stabilizing selection on individual traits, direct selection on multiple traits, and pure pleiotropy models (see also Zhang and Hill 2005). Having shown that direct selection on single traits is unlikely to adequately explain the estimated levels of additive and mutational genetic variance in this outbred population of *D. serrata*, we highlight two points concerning the other two classes of mutation-selection balance models. First, we can exclude the pure pleiotropy model as a good fit to these data based on our current estimates. Under the pure pleiotropy class of model, observed mutational heritability on the order of 0.001 implies that standardized stabilizing selection gradients on traits must be −0.001 (*V*_{s} = 500) or weaker to maintain *any* level of heritability (Johnson and Barton 2005, equation A25). While this strength of stabilizing selection has been observed on the genetic variance of some individual CHC traits, it varies considerably among traits and has been reported to be up to 100 times stronger on one CHC expressed in males of our study species (Delcourt *et al.* 2012). A pure pleiotropy model of mutation-selection balance is therefore unable to explain the levels of heritability in this outbred population given the strength of stabilizing selection that has been observed on some of these traits.

Second, it is well known that the morphological traits analyzed here are not genetically independent of one another (Blows *et al.* 2004; Mezey and Houle 2005; McGuigan and Blows 2007; Houle and Fierst 2013; Hine *et al.* 2014), and such pleiotropy will tend to reduce the level of heritability that can be maintained under mutation-selection balance (Bürger 2000, p. 298). This suggests that a multivariate assessment of mutation-selection balance might be informative given that single-trait models generally overestimated the predicted heritability. An assessment of the multivariate extension of the direct selection model requires estimation of the additive and mutational variance-covariance matrices (Lande 1980), which we are currently pursuing for these traits.

How genetic variation is maintained in the presence of selection remains one of the fundamental unresolved issues in evolutionary biology and a key question given that mutation and stabilizing selection are ubiquitous evolutionary forces (Turelli 1984; Johnson and Barton 2005; Walsh and Blows 2009). Mutation-selection balance is one of the two leading explanations (along with balancing selection) for the levels of standing variance in natural populations (Barton 1990; Johnson and Barton 2005; Zhang and Hill 2005), but the data available on key mutation parameters are sparse and of low quality (Houle *et al.* 1996). Wray’s insight (Wray 1990) has provided us with a way of estimating key mutational parameters in an outbred population. By combining these estimates of mutation genetic variance with estimates of standing genetic variance derived from the same population and statistical model, more detailed and internally consistent investigations of mutation-selection balance in outbred populations become feasible.

## Acknowledgments

We thank D. Petfield, A. Denton, and M. E. Vidgen for maintaining this experiment over 18 months; N. Wray, H. S. Lee, and J. Hadfield for discussions on the implementation of the M-matrix animal model; and D. Houle and an anonymous reviewer for comments on a previous version of the manuscript. This research was funded by the Australian Research Council.

## Footnotes

*Communicating editor: C. D. Jones*Supporting information is available online at www.genetics.org/lookup/suppl/doi:10.1534/genetics.115.178632/-/DC1

- Received May 26, 2015.
- Accepted September 13, 2015.

- Copyright © 2015 by the Genetics Society of America