# The Anomalous Effects of Biased Mutation Revisited: Mean–Optimum Deviation and Apparent Directional Selection Under Stabilizing Selection

Xu-Sheng Zhang^{1} and William G. Hill

^{1} *Corresponding author:* Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, W. Mains Rd., Edinburgh EH9 3JT, United Kingdom. E-mail: xu-sheng.zhang{at}ed.ac.uk

## Abstract

Empirical evidence indicates that the distribution of the effects of mutations on quantitative traits is not symmetric about zero. Under stabilizing selection in infinite populations with normally distributed mutant effects having a nonzero mean, Waxman and Peck showed that the deviation of the population mean from the optimum is expected to be small. We show by simulation that genetic drift, leptokurtosis of mutational effects, and pleiotropy can increase the mean–optimum deviation greatly, however, and that the apparent directional selection thereby caused can be substantial.

In most models of the maintenance of genetic variance in quantitative traits by mutation–selection balance, it is assumed, not least for mathematical simplicity, that the distribution of the effects of mutations on these traits is symmetric about zero (*e.g.*, Bulmer 1980; Turelli 1984; Barton 1990; Keightley and Hill 1990; Zhang and Hill 2002). Mutation-accumulation experiments indicate, however, that mutations significantly affect average values of quantitative traits (Santiago *et al*. 1992; Lyman *et al*. 1996; Mackay 1996; Keightley and Ohnishi 1998; Lynch *et al*. 1998; Garcia-Dorado *et al*. 1999; Vassilieva and Lynch 1999; Ostrow *et al*. 2007; P. D. Keightley and D. L. Halligan, personal communication). For example, Garcia-Dorado *et al*. (1999) found that the mean effect of mutations on abdominal bristle number in *Drosophila melanogaster* is −0.24 environmental standard deviations.

Recently, Waxman and Peck (2003) investigated a model in which this symmetry assumption was relaxed, *i.e.*, the mean of the distribution of mutational effects was allowed to differ from zero. They found that the deviation between the mean phenotypic value of the trait and the optimum (the mean–optimum deviation) is small. Similar estimates were previously obtained by Bürger (2000; see p. 247, Equation 7.13). Both analyses rested on a number of other shared assumptions, however, which turn out to have an important influence on the mean–optimum deviation. In this note we consider some of these assumptions and focus on their impact on Waxman and Peck's (2003) conclusions.

Waxman and Peck (2003) assumed a model in which the allelic effects of mutations at individual loci were normally distributed, but with a mean that departed from zero. Although they allowed differences among the parameters of the distribution of mutant effects at four different loci, in accordance with Welch and Waxman (2002), this generated overall distributions that did not deviate far from the normal (*e.g.*, kurtosis ∼4 in the model for their Figure 3; *cf*. 3 for the normal). Furthermore, in Waxman and Peck's method of generating mutations, the mutants occurring most commonly have an effect equal to the bias. Empirical evidence shows, however, that the distribution of mutational effects on quantitative traits is leptokurtic, with most mutations having very small effects and a few having very large effects (Simmons and Crow 1977; Mackay *et al*. 1992; Caballero and Keightley 1994; Garcia-Dorado *et al*. 1999; Lynch *et al*. 1999; P. D. Keightley and D. L. Halligan, personal communication). Hence a normal distribution is an inappropriate model of mutational effects.

The mutational variance produced per generation is of rather similar magnitude for different traits and species, *V*_{M} = ½λ*E*(*a*^{2}) ≈ 10^{−3}*V*_{E} (Falconer and Mackay 1996; Houle *et al*. 1996; Lynch and Walsh 1998; Keightley 2004), where *V*_{E} is the environmental variance, λ is the average number of mutations per generation per haploid genome, and *a* is the difference in value between mutant and wild-type homozygotes. As *V*_{M} from published experiments depends on the mean square rather than the variance of mutational effects and the mutational bias on the trait implies that *E*(*a*) = Δ ≠ 0, this indicates that, even if all mutants had the same effect, the magnitude of the bias Δ could not exceed √(2*V*_{M}/λ) (see Vassilieva and Lynch 1999, p. 122). Thus the bias has to be small if the mutation rate is high, whereas Waxman and Peck's (2003) method of generating biased mutational effects with unlimited nonzero mean also implies that the mutational variance *V*_{M} can take any value.
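The bound on the bias can be checked numerically. The sketch below assumes the standard relation *V*_{M} = ½λ*E*(*a*^{2}) and a typical value *V*_{M} ≈ 10^{−3}*V*_{E}; the function name `max_bias` and the λ values looped over are illustrative choices, not taken from the paper.

```python
import math


def max_bias(v_m: float, lam: float) -> float:
    """Upper bound on |bias| in units of sigma_E.

    Assumes V_M = 0.5 * lam * E(a^2), so that bias^2 <= E(a^2) = 2 * V_M / lam.
    v_m is the mutational variance in units of V_E (assumed input convention).
    """
    if v_m <= 0 or lam <= 0:
        raise ValueError("v_m and lam must be positive")
    return math.sqrt(2.0 * v_m / lam)


if __name__ == "__main__":
    # Illustrative genomic mutation rates; only 0.1 is used in the simulations.
    for lam in (0.01, 0.1, 1.0):
        print(f"lambda = {lam:<5} -> |bias| <= {max_bias(1e-3, lam):.3f} sigma_E")
```

Under these assumptions, the bound tightens as λ increases, which is the sense in which the bias has to be small when the mutation rate is high.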

A further important and unrealistic assumption of Waxman and Peck's model is that mutants have an effect on fitness solely through their effect on the trait by stabilizing selection and have no pleiotropic effect on fitness acting through other traits, contrary to the known widespread pleiotropic effect of mutations (Barton and Keightley 2002; Mackay 2004). Bürger's (2000) model was also based on the assumptions of no pleiotropic selection and distributions of mutant effects that are close to the normal. Therefore, to take into account empirical knowledge of mutation parameters, we use a more general joint-effect model as in our previous studies (Zhang and Hill 2002). This includes both pleiotropic and stabilizing selection and a distribution of effects that is much more leptokurtic than the normal and has a mode at zero. We have previously assumed a symmetric effect of mutations on the trait, but we now remove this assumption.

## MODEL AND METHOD

A population of *N* diploid monoecious individuals, with discrete generations, with random mating, and at Hardy–Weinberg equilibrium, is assumed. Mutations are assumed to have additive effects on a quantitative trait *z*, with *a* being the difference in value between homozygotes, and pleiotropic effects on fitness, with *s* (*s* ≥ 0) being the difference in the fitness between homozygotes. Further, it is assumed for simplicity that there is no linkage, epistasis, or overdominance. The quantitative trait is assumed to be under real stabilizing selection with the optimum phenotype at zero and strength characterized by the variance *V*_{s} of its fitness profile. As mutational effects tend to reduce the magnitude of quantitative traits (see references listed above), we consider only negative bias, but the same conclusions would hold for positive bias.

Mutants can have positive or negative effects on the trait. The magnitudes of both were assumed to follow a gamma (α_{a}, τ_{a}) distribution with scale parameter α_{a} and shape parameter τ_{a}, but with a higher chance *P* (> ½) of having a negative value, so that mutational effects have mean (*i.e*., the bias) Δ = (1 − 2*P*)τ_{a}/α_{a} and mean square *E*(*a*^{2}) = τ_{a}(τ_{a} + 1)/α_{a}^{2}. This was termed the “proportional” method by Keightley and Hill (1987) and is illustrated in Figure 1. The maximum bias in this sampling method, attained when *P* = 1, is therefore Δ = −τ_{a}/α_{a}. The mutational bias and the mean–optimum deviation are expressed in terms of the environmental standard deviation σ_{E}. The pleiotropic effect −*s* (*s* > 0) on fitness of mutations was assumed to follow a gamma (α_{s}, τ_{s}) distribution.
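A minimal sketch of the proportional sampling scheme follows. It is not the authors' program. It assumes that α_{a} enters as an inverse scale (so the gamma mean is τ_{a}/α_{a}, as the expression for Δ implies), that *V*_{M} = ½λ*E*(*a*^{2}), and illustrative values *V*_{M} = 10^{−3}*V*_{E} and λ = 0.1. All function names are hypothetical.

```python
import numpy as np


def alpha_from_vm(tau_a, v_m=1e-3, lam=0.1):
    """alpha_a such that E(a^2) = tau(tau+1)/alpha^2 equals 2*V_M/lam (assumed relation)."""
    return np.sqrt(tau_a * (tau_a + 1.0) * lam / (2.0 * v_m))


def prob_negative(delta, tau_a, alpha_a):
    """Solve delta = (1 - 2P) * tau_a / alpha_a for P, the chance of a negative effect."""
    p = 0.5 * (1.0 - delta * alpha_a / tau_a)
    if not 0.0 <= p <= 1.0:
        raise ValueError("bias must lie within [-tau_a/alpha_a, tau_a/alpha_a]")
    return p


def sample_effects(n, tau_a, alpha_a, p_neg, rng):
    """Draw n trait effects: gamma-distributed magnitude, negative sign with probability p_neg."""
    magnitude = rng.gamma(shape=tau_a, scale=1.0 / alpha_a, size=n)
    sign = np.where(rng.random(n) < p_neg, -1.0, 1.0)
    return sign * magnitude


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tau_a, delta = 0.5, -0.04          # illustrative shape and bias (in sigma_E)
    alpha_a = alpha_from_vm(tau_a)
    p_neg = prob_negative(delta, tau_a, alpha_a)
    a = sample_effects(100_000, tau_a, alpha_a, p_neg, rng)
    print("P =", p_neg, "sample mean =", a.mean(), "sample mean square =", (a**2).mean())
```

Because the sign is drawn independently of the magnitude, changing *P* shifts the mean without altering the mean square, so the bias can be varied while *V*_{M} stays fixed.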

The pleiotropic effect of mutations can cause apparent stabilizing selection because individuals that carry more mutations are more likely to have extreme trait values and lower fitness, inducing a quadratic relationship between them (Keightley and Hill 1990). With biased mutations, the population mean will be drawn in the direction of bias, and apparent directional selection on the trait due to a linear association between phenotypic value and fitness will arise from the real stabilizing selection. If the trait value under selection is *z* and its fitness is *w*, the strength of selection can be decomposed into linear and quadratic terms using regression methods (Lande and Arnold 1983): *w* = α + β*z* + γ*z*^{2}. Suppose that the observed fitness, which includes the pleiotropic effect, is *w*_{i} for an individual having simulated trait value *z*_{i}. Employing least squares, we can estimate both the linear and the quadratic selection gradients, β and γ (Equation 1), where σ^{2}, *m*_{3}, and *m*_{4} are the observed variance and the third and fourth moments of trait values, respectively. If *m*_{3} is very small these formulas reduce to Equation 2 (Lande and Arnold 1983).
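The least-squares estimation of β and γ can be sketched as follows. This is a generic ordinary-least-squares fit of *w* = α + β*z* + γ*z*^{2}, not a transcription of the paper's Equation 1; the function names are hypothetical. The second function solves the same two-predictor normal equations from sample covariances. When *z* is measured from its mean, Cov(*z*, *z*^{2}) equals *m*_{3} and Var(*z*^{2}) equals *m*_{4} − σ^{4}, so for *m*_{3} ≈ 0 the estimates reduce to β ≈ Cov(*w*, *z*)/σ^{2} and γ ≈ Cov(*w*, *z*^{2})/(*m*_{4} − σ^{4}).

```python
import numpy as np


def gradients_lstsq(z, w):
    """Fit w = alpha + beta*z + gamma*z**2 by ordinary least squares; return (beta, gamma)."""
    z = np.asarray(z, dtype=float)
    w = np.asarray(w, dtype=float)
    design = np.column_stack([np.ones_like(z), z, z**2])
    coef, *_ = np.linalg.lstsq(design, w, rcond=None)
    return coef[1], coef[2]


def gradients_moments(z, w):
    """Same estimates from sample covariances (two-predictor normal equations)."""
    z = np.asarray(z, dtype=float)
    w = np.asarray(w, dtype=float)
    z2 = z**2

    def cov(x, y):
        return np.mean((x - x.mean()) * (y - y.mean()))

    v1, v2, c12 = cov(z, z), cov(z2, z2), cov(z, z2)
    c1w, c2w = cov(z, w), cov(z2, w)
    det = v1 * v2 - c12**2              # zero only if z and z**2 are collinear
    beta = (v2 * c1w - c12 * c2w) / det
    gamma = (v1 * c2w - c12 * c1w) / det
    return beta, gamma
```

The covariance form makes explicit why skew matters: a nonzero Cov(*z*, *z*^{2}) couples the linear and quadratic estimates, so ignoring it biases both gradients.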

Analysis was undertaken by including biased mutation in our individual-based Monte Carlo simulation program (Zhang *et al*. 2004). Each generation the sequence of operations was mutation, selection, mating, and reproduction. The fitness of individual *i* was assigned as *w*_{i} = 1 − [Σ_{j}*s*_{ij} + *z*_{i}^{2}/2*V*_{s}], where *z*_{i} = Σ_{j}*a*_{ij} is the value of the trait and the optimum is assumed to be zero. If 0 < *w*_{i} ≤ 1, then the chance that individual *i* was chosen as a parent of the next generation was proportional to *w*_{i}; otherwise, it had no offspring.

The population was started from an isogenic state. Under the approximation of the house-of-cards model (Turelli 1984), Monte Carlo simulations show that after 3*N* generations the population reaches a dynamic equilibrium with its mean phenotypic value and the genetic variance distributed around constant values. Hence, for the following 1000 generations, the mean, the variance, and the third and fourth moments of trait values were computed and averaged to calculate the mean phenotypic value, the variance, and linear and quadratic selection gradients [using formulas (1)] at equilibrium. Results given in the figures were calculated from 16 replicates. In addition, it was assumed that the genomic mutation rate λ = 0.1, that there are 500 mutable loci, and that the mutational variance *V*_{M} and the strength of stabilizing selection *V*_{s} are held at fixed values (except Figure 6, where *V*_{s} ranges from 5 to 30).
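The fitness assignment and parent sampling described above can be sketched as below. This is an illustration only, not the simulation program of Zhang *et al*. (2004). The function names are hypothetical, and sampling parents with replacement (which permits selfing) is an assumption of this sketch.

```python
import numpy as np


def fitness(s_sum, z, v_s):
    """w_i = 1 - [sum_j s_ij + z_i^2 / (2 V_s)], with the optimum at zero.

    s_sum: per-individual sum of pleiotropic effects s_ij (each >= 0).
    z:     per-individual trait value, i.e. the sum of trait effects a_ij.
    """
    s_sum = np.asarray(s_sum, dtype=float)
    z = np.asarray(z, dtype=float)
    return 1.0 - (s_sum + z**2 / (2.0 * v_s))


def choose_parents(w, n_draws, rng):
    """Draw parent indices with probability proportional to w; w <= 0 leaves no offspring."""
    weights = np.where(w > 0.0, w, 0.0)
    total = weights.sum()
    if total <= 0.0:
        raise ValueError("no individual has positive fitness")
    return rng.choice(len(weights), size=n_draws, replace=True, p=weights / total)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 100                                   # illustrative population size
    z = rng.normal(-0.05, 0.3, size=n)        # toy trait values, not simulated genotypes
    s_sum = rng.gamma(0.5, 0.02, size=n)      # toy pleiotropic loads
    w = fitness(s_sum, z, v_s=5.0)            # v_s = 5 is an arbitrary illustrative value
    parents = choose_parents(w, 2 * n, rng).reshape(n, 2)   # two parents per offspring
    print(parents[:5])
```

Because *s* ≥ 0 and *z*^{2} ≥ 0, fitness never exceeds 1, so only the lower bound needs an explicit check.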

## RESULTS

Waxman and Peck (2003) assumed an infinite population and found that the predicted mean–optimum difference (M–OD) is proportional to the per-locus rate of mutation. For plausible choices of parameter values, their estimate of the maximum M–OD could not be >0.01σ_{E}. Using their model of normally distributed mutant effects, but in a finite population, numerical simulations show that genetic drift can enhance the M–OD as pointed out by Waxman and Peck, such that this deviation can be substantial if the population size (*N*) is small (Figure 2). For example, it can be up to 0.1σ_{E} when the effective population size is ∼100 and the mutational bias is Δ = −0.04. Hence the apparent directional selection becomes much stronger with reduced population size. With our joint-effect model, similar but smaller increases are found in both the M–OD and the strength of the apparent directional selection (Figure 2).

With our proportional model of gamma-distributed effects with more than half the mutants decreasing the trait, simulations show that the genetic variance *V*_{G} appears insensitive to mutational bias, as predicted by Waxman and Peck (2003). With increasing mutational bias within its possible range, the size of the M–OD and the apparent directional selection increase, but the apparent strength of stabilizing selection decreases (Figure 3). If the mutational effects on the trait are sampled from a reflected gamma distribution (*i.e.*, symmetric about the mean) but with mean equal to the bias, as assumed by Waxman and Peck but using a normal distribution, we also found that the M–OD increased nonmonotonically. However, for given parameter values, the M–OD is slightly larger than that obtained by the proportional method, but the linear selection gradients of the two models are roughly the same (data not shown).

Pleiotropic effects of mutants on fitness obviously add to the effects of stabilizing selection and therefore reduce genetic variance (Figure 4, a and d), just as when mutation effects are unbiased (Zhang and Hill 2002). For a given bias Δ, pleiotropic selection increases the M–OD up to some point, but then stronger pleiotropic selection decreases it (Figure 4b). When pleiotropic selection is much stronger than stabilizing selection, clearly all mutants will be deleterious and will be lost rapidly, leading to small genetic variance and small M–OD. This is analogous to the situation when the mutation rate λ → 0 (Bürger 2000, p. 236, Equation 6.10). When pleiotropic selection is relatively weak, however, such that stabilizing selection predominates, the argument is more subtle (see the appendix for analysis). In the absence of pleiotropic effects, mutants of positive effect are at a relative selective advantage over the majority of the mutants that reduce the trait, thereby maintaining the mean near the optimum (see Equation A1). With small pleiotropic effects, the duration of segregation and fixation probability of the increasing mutants are also reduced, allowing the mean to drift away from the optimum. Increasing strength of the pleiotropic selection increases the apparent directional selection monotonically, however, because individuals that carry more mutants are expected to have a lower trait value and thus smaller fitness. The increase in apparent directional selection can be substantial: for example, with pleiotropic selection of strength *E*(*s*) > 0.1, the gradient β of apparent directional selection increases from near zero to 0.06 for the parameters used in Figure 4.

As in the nonbiased mutation model, genetic variance decreases as trait effects of mutants become more leptokurtic (Zhang and Hill 2002); with extreme leptokurtosis (τ_{a} < 0.2), the M–OD depends greatly on the value of the shape parameter τ_{a}, but is affected little at higher values of τ_{a} (Figure 5b). The leptokurtosis can greatly strengthen the apparent directional selection (Figure 5c) but appears to have a weak effect on the apparent strength of quadratic selection (data not shown). If the real stabilizing selection is weaker (*i.e.*, increasing *V*_{s}), the apparent stabilizing selection also weakens and the genetic variance increases, and also, as predicted by Waxman and Peck (2003), the M–OD increases but the apparent directional selection weakens (Figure 6). With an increase in the leptokurtosis of the pleiotropic effect (*i.e.*, decreasing its shape parameter τ_{s}), both apparent directional and quadratic selection weaken and genetic variance increases (Zhang and Hill 2002), but the M–OD remains roughly the same (results not shown).
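The link between the shape parameter τ_{a} and leptokurtosis can be made concrete for the unbiased case. The sketch below assumes a gamma-distributed magnitude with a random sign (*P* = ½), for which the standard gamma moments give a kurtosis *E*(*a*^{4})/*E*(*a*^{2})^{2} = (τ + 2)(τ + 3)/[τ(τ + 1)], independent of the scale. The function name is hypothetical, and the biased case (*P* ≠ ½) would need moments taken about the nonzero mean.

```python
def reflected_gamma_kurtosis(tau: float) -> float:
    """Kurtosis E(a^4)/E(a^2)^2 of a sign-symmetric (P = 1/2) gamma-magnitude effect.

    Uses E(|a|^k) proportional to tau*(tau+1)*...*(tau+k-1) for a gamma variable with
    shape tau; the scale cancels in the ratio.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    return (tau + 2.0) * (tau + 3.0) / (tau * (tau + 1.0))


if __name__ == "__main__":
    # The normal distribution has kurtosis 3; tau = 1 is the double exponential.
    for tau in (0.05, 0.2, 0.5, 1.0, 2.0):
        print(f"tau = {tau:<4} -> kurtosis = {reflected_gamma_kurtosis(tau):.2f}")
```

Under this assumption the kurtosis rises steeply as τ falls below about 0.2, which is the range in which the text reports the M–OD to be most sensitive to τ_{a}.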

## DISCUSSION

In this simulation-based study we find that the mean–optimum deviation and the magnitude of apparent directional selection can be substantial. Within the range of biologically plausible values of mutation and selection parameters, the mean–optimum deviation can be large (up to 10% of an environmental standard deviation) and the apparent directional selection caused by such a deviation can also be large, with a linear selection gradient up to ∼0.06 for the parameter values investigated in this study. This is in contrast to the results of Waxman and Peck (2003) and Bürger (2000), whose models were based on real stabilizing selection, infinite populations, and mutation distributions that do not deviate far from the normal. In both studies, predicted values of the M–OD were small under plausible choices of parameter values, <1 and <2% of the environmental standard deviation, respectively. Our simulations show that genetic drift (see Figure 2), pleiotropic selection (Figure 4), and leptokurtosis of the mutational effects on the trait (Figure 5) can all increase the M–OD and apparent directional selection. Clearly the considerable discrepancy between our results and those of Waxman and Peck (2003) and Bürger (2000) comes from the combination of all three differences between the models.

These findings provide useful information for understanding some problems in evolutionary biology. Perhaps the clearest example of a quantitative trait under stabilizing selection is birth weight in humans before modern medicine, yet the optimum birth weight was greater than the observed mean (Cavalli-Sforza and Bodmer 1971; Zhivotovsky and Feldman 1992). As the bias of mutation effects on most traits in different model species (*e.g.*, *Drosophila*, *Daphnia*) appears to be downward (*e.g.*, Garcia-Dorado *et al*. 1999 and other references above), mutational bias could be one explanation for the mean–optimum deviation.

Directional selection has been detected for many traits in natural populations, with a median gradient of 0.15 (Hoekstra *et al*. 2001; Hereford *et al*. 2004). This estimate may be biased upward by publication bias and also by skew in the distribution of trait phenotypes, when Lande and Arnold's (1983) formula can overestimate the strength of selection. Skew in the phenotypic distribution due to asymmetric mutations can double the linear selection gradient (Figure 4c). In some field studies no genetic response in quantitative traits under functional (real) directional selection has been observed, even though there is adequate heritable variation (Kruuk *et al*. 2002). Our results provide a possible or partial explanation: under joint pleiotropic and stabilizing selection the apparent directional selection caused by mutational bias can be substantial, especially in small populations and where mutational effects on the trait are highly leptokurtic.

## APPENDIX: INFLUENCE OF PLEIOTROPIC SELECTION ON THE MEAN–OPTIMUM DEVIATION

Assume strong selection such that the frequency of any mutant allele is low and its overall selective value can be approximated by Equation A1 (Zhang *et al*. 2004). Its mean frequency can then be approximated following Kimura (1969), and the population mean can be evaluated as in Equation A2, where *h*(*a*, *s*) is the joint distribution of trait effect *a* and pleiotropic effect *s* of mutations. If pleiotropic selection is much stronger than stabilizing selection, the mean value of the trait can be approximated by an expression involving the harmonic mean of the pleiotropic effect on fitness, indicating that the M–OD decreases with pleiotropic selection.

For the situation where *s* is much smaller than stabilizing selection, we consider a simplified situation where there are only two different mutants with the same pleiotropic effect −*s* but opposite effects on the trait: *a* and −*a* (*s*, *a* > 0). If the negative mutant occurs with the higher probability *P* (> ½), Equation A2 gives Equation A3, from which it is easy to see that the mean value of the trait is negative. We want to prove that the M–OD decreases with the pleiotropic effect of mutation when it is weak (*i.e*., pleiotropic selection increases the size of the M–OD); that is, that *y*, the derivative of the mean trait value with respect to *s*, is negative. Differentiating both sides of Equation A3 with respect to *s*, after some algebra we obtain Equation A4 at *s* = 0, which depends on the solution to (A3) at *s* = 0, given by Equation A5. Under the condition of 32λ*V*_{s} ≫ *a*^{2}, which should be satisfied for most plausible mutants, (A5) can be reduced to an approximately linear equation, from whose solution it follows that *y* < 0.

If a trait is controlled by many pairs of mutants, (−*s*, −*a*_{i}) and (−*s*, *a*_{i}), *i* = 1, …, *n*, the same inequality is expected to hold. Therefore, weak pleiotropic selection can increase the size of the M–OD. However, as shown following Equation A2, strong pleiotropic selection leads to rapid loss of mutants and small M–OD.

## Acknowledgments

We are grateful to R. Bürger, D. Houle, and an anonymous reviewer for their comments and constructive suggestions on a previous version of this manuscript. This work was supported by a grant from the Biotechnology and Biological Sciences Research Council (15/G13242).

## Footnotes

Communicating editor: D. Houle

- Received October 16, 2007.
- Accepted April 14, 2008.

- Copyright © 2008 by the Genetics Society of America