## Abstract

Although the frequency and effects of neutral and nearly neutral mutations are critical to evolutionary patterns and processes governed by genetic drift, the small effects of such mutations make them difficult to study empirically. Here we present the results of a mutation-accumulation experiment designed to assess the frequencies of deleterious mutations with undetectable effects. We promoted the accumulation of spontaneous mutations by subjecting independent lineages of the RNA virus ϕ6 to repeated population bottlenecks of a single individual. We measured fitness following every bottleneck to obtain a complete picture of the timing and effects of the accumulated mutations with detectable effects and sequenced complete genomes to determine the number of mutations that were undetected by the fitness assays. To estimate the effects of the undetected mutations, we implemented a likelihood model developed for quantitative trait locus (QTL) data (Otto and Jones 2000) to estimate the number and effects of the undetected mutations from the measured number and effects of the detected mutations. Using this method we estimated a deleterious mutation rate of *U* = 0.03 and a gamma effects distribution with mean and coefficient of variation = 0.204. Although our estimates of *U* and fall within the range of recent mutation rate and effect estimates in eukaryotes, the fraction of mutations with detectable effects on laboratory fitness (39%) appears to be far higher in ϕ6 than in eukaryotes.

THE frequency with which neutral and nearly neutral mutations arise is critical to evolutionary patterns and processes that are governed by genetic drift, including the genetic basis of quantitative trait variation (Keightley and Hill 1988; Eyre-Walker *et al*. 2006), the molecular clock (Ohta 1977), and Muller's ratchet (Butcher 1995). However, because mutations with small effects are impossible to detect using phenotypic assays, most experiments designed to measure directly the frequency and effects of spontaneous mutations have neglected this class of mutations (but see Davies *et al*. 1999; Estes *et al*. 2004). Estimates of nearly neutral mutation rates have, therefore, been limited mostly to inferences from phylogenetic analyses of sequence data, inferences that are only as good as the assumptions they make about population demography and the neutrality of synonymous mutations.

Empirical attempts to measure the shape of the mutation effect distribution come from two sources—collections of artificially induced mutations (Elena *et al*. 1998; Sanjuan *et al*. 2004) and mutation-accumulation (MA) experiments (*e.g.*, Mukai 1964; Wloch *et al*. 2001). Two recent studies—one of spontaneous and induced mutations in yeast (Wloch *et al*. 2001) and one of induced mutations in the RNA virus vesicular stomatitis virus (VSV) (Sanjuan *et al*. 2004)—offer the most direct measures of the mutation effect distribution. These studies represented substantial advances over previous MA experiments because they directly measured the effects of individual mutations. Nonetheless, they serve to illustrate the two major difficulties inherent in any attempt to measure the rate of mutations with small effects. First, the yeast study used a phenotypic assay to detect mutations and was limited to estimating the rates and effects of mutations of moderate to large effect. And second, studies that use induced mutations to overcome the first limitation may not accurately capture the properties of spontaneous mutations. The relative abundance of mutations of small and large effects was found to differ among spontaneous and induced mutations in yeast (Wloch *et al*. 2001). Thus, although the use of site-directed mutagenesis to induce mutations in the VSV genome allowed the detection of a substantial fraction of mutations with neutral or nearly neutral effects, it is not known whether this fraction is representative of spontaneous mutations.

We attempted to overcome both of these difficulties by conducting a traditional MA experiment with the RNA bacteriophage ϕ6, a model system in which it is possible to obtain complete knowledge of the mutational and fitness changes that occurred over the course of the experiment. Following the usual design of MA experiments, we promoted the accumulation of spontaneous mutations by subjecting 10 independent lineages to repeated population bottlenecks of a single individual. The extreme bottlenecks effectively eliminated selection, allowing nonlethal mutations, regardless of whether they were advantageous, deleterious, or neutral, to increase to fixation (a frequency of 100%) with approximately the same probability. We measured fitness following every bottleneck and sequenced whole genomes before and after mutation accumulation. These data provided a complete picture of the timing and effects of the accumulated mutations and allowed us to determine the relative frequencies of mutations with large, small, and nearly neutral effects on fitness.

## MATERIALS AND METHODS

#### Strains and culture conditions:

The RNA bacteriophage ϕ6 used in this study is a laboratory genotype descended from the original isolate (Vidaver *et al*. 1973). *Pseudomonas syringae* pv. *phaseolicola*, the standard host of ϕ6, was obtained from the American Type Culture Collection (ATCC no. 21781). Details of diluting, filtering, culture, and storage of phage and bacteria are published (Mindich *et al*. 1976; Chao and Tran 1997). All phage and bacteria were grown in LC medium (5 g/liter yeast extract, 10 g/liter bactotryptone, 5 g/liter NaCl) at 25°.

#### Serial propagation through bottlenecks of a single phage:

The standard culture method for viruses easily allows for creating bottlenecks. Following the protocol described in (Burch and Chao 1999), phage were inoculated onto plates containing a lawn of the standard host *P. phaseolicola* at a sufficiently low density to ensure the formation of isolated plaques after a 24-hr incubation. Because plaques are generally initiated by a single phage, bottlenecks of a single phage were achieved by harvesting phage from a single randomly chosen plaque. Phage from this plaque were plated on a fresh lawn to obtain a new set of plaques, and the cycle was repeated 40 times in 10 independent lineages. Each bottleneck corresponds to ∼5 generations, bringing the total to 200 generations of mutation accumulation.

#### Fitness measures:

The fitness of individual phage genotypes was determined from measures of plaque size, as described in (Burch and Chao 2004). Phage were plated on a lawn of *P. phaseolicola* and incubated for 24 hr, and digital pictures were taken and used to measure the area of isolated plaques on each plate. In ϕ6, we have shown that under the current culture conditions log(fitness) is related to plaque area by the following linear relationship: log *W* = 0.044 × (plaque area) − 0.340 (*R*^{2} = 0.968, *F*_{1,7} = 213.4, *P* < 0.0001) (Burch and Chao 2004). Day-to-day variation in plaque size affects all genotypes equally, shifting the line up and down without affecting the slope. To ensure that day-to-day variation did not affect estimates of mutational effects (*s*), all of the data for an individual lineage (*i.e*., for a single plot in Figure 1) were collected on the same day from frozen stocks. Data for different lineages were collected on different days. Therefore, day-to-day variation explains the difference among lineages in the plaque area of the ancestral phage.

#### Genome isolation, RT–PCR, and sequencing:

Concentrated phage preparations were treated with DNase and RNase, then treated with 50:50 phenol:chloroform, and ethanol precipitated. The resulting RNA pellet was suspended in 50 μl TE and used as template for a reverse transcriptase reaction performed using N_{6} primers (first-strand cDNA synthesis kit; Amersham Biosciences, Piscataway, NJ). The product of the reverse transcriptase reaction was then amplified with ϕ6-specific primers by PCR, and PCR products were sequenced directly at the University of North Carolina Core Sequencing Facility using both forward and reverse primers to achieve double coverage of most genome regions. In total, we sequenced 12,381 bases (92.5%) of each genome, including all of the coding regions but missing the first and last 70–200 noncoding bases of each genome segment. Although these noncoding regions are known to contain regions of RNA 2° structure necessary for genome replication and packaging, we did not sequence them because the methods required to sequence the ends of linear genome segments are technically difficult and prone to sequencing errors.

#### Stepwise regression analysis of fitness trajectories:

The timing of mutational events and the effects of mutations on fitness were estimated using a forward stepwise least-squares linear regression. Our algorithm started with a model that assumed no mutations (*i.e.*, steps that decrease or increase fitness) and iteratively added mutations in the following manner. In each iteration one additional mutation is added to the set of existing mutations. The location of the mutation is chosen in such a way that addition of a step at that location produces the largest reduction in the residual sum of squares (RSS). Mutations were added to the model until each bottleneck was associated with a mutation. At the end of this process a nested sequence of fitted models was obtained. We then chose the “best” model as the one that gave the smallest value for the Bayesian information criterion (BIC). This criterion is a popular method for model selection proposed by Schwarz (1978). The BIC balances the RSS of a model and the number of parameters involved in fitting that model. Note that with the addition of each mutation to the model there is a reduction in the RSS, but two parameters are added to the model, one for the step location and the other for the height of the added step. To reflect the underlying biological process, we also implemented a constrained version of the algorithm that allowed only deleterious mutations (*i.e.*, steps that reduced fitness). In this case, when mutations were added to the model, we considered only locations that would result in decreasing steps.

We investigated the nature of the detection bias in the stepwise analysis by analyzing simulated MA lineages. In the simulated lineages, mutations accumulated according to a Poisson process with the observed rate *U* = 0.028 (Table 2; estimated from sequence data), and mutation effects were drawn randomly from a gamma distribution with the observed mean effect and coefficient of variation (or shape parameter) β = 0.162 (Table 2; maximum-likelihood estimate, MLE, for *n*_{DEL} = 56). The simulated measure of fitness at a particular time during MA was determined by summing the effects of accumulated mutations and the experimental error, which was drawn randomly from a normal distribution with mean zero and the standard deviation observed in the real data. Our simulated data contained only deleterious mutations; therefore, we analyzed the simulated data using the constrained algorithm that allowed only decreasing steps.

We modeled the detection bias using a logistic function *g*(*s*) that describes the ratio of detected mutations to actual mutations as a function of mutation effect *s*. We use a logistic equation of the form(1)where *h* represents the ratio of detected to actual mutations for mutations of large effects and can exceed 1 if a measurement bias causes the effects of large effect mutations to be overestimated, *w* describes the width of the detection threshold or the extent to which an increase in effect size leads to an increase in detectability, and *s*_{min} describes the location of the detection threshold.

We simulated and analyzed 100,000 MA lineages to yield collections of both simulated and detected mutation effect sizes. We divided both collections into bins of width 0.01 and determined the ratio of detected mutations to simulated mutations for each bin. We then plotted this ratio *vs.* the average effect size for each bin and fitted Equation 1 to the data by determining the values of *h*, *w*, and *s*_{min} that minimized the residual sum of squares over the range 0 < *s* < 0.3 (see Figure 2B).

#### Estimating the distribution of mutational effects:

We modified the maximum-likelihood approach of Otto and Jones (2000) to estimate the effects of the undetected mutations from the measured effects of the detected mutations. The approach was developed for an analogous problem in quantitative trait analysis in which only the largest quantitative trait loci (QTL) can be detected, but one hopes to estimate the number and effects of QTL that were not detected. In our scenario, the method uses maximum likelihood to estimate the distribution of effects of the accumulated mutations from the observed mutational effects *s _{i}* and the location and nature of the detection threshold

*g*(

*s*).

To calculate a likelihood of the observed mutational effects, we first determine the probability density function for detectable mutations,(2)where *s* is the effect of the mutation of interest, is the average mutational effect, β is the coefficient of variation of the effects distribution, and is the probability density function of the gamma distribution specified by and β. If we now let the random variable **S** = (*s*_{1}, *s*_{2}, … , *s*_{n}) represent the *n* observed mutational effects, the likelihood of obtaining these observations is(3)

Using this equation it is possible to obtain MLEs for both and β by searching the joint parameter space or to obtain an MLE for β after specifying a fixed value for Note that the mean mutational effect is exactly determined by the total drop in fitness due to mutation accumulation () and the number of accumulated deleterious mutations (). That is, Therefore, by assuming that all of the accumulated mutations had nonzero deleterious fitness effects, we can specify (on the basis of genome sequence data) and

#### Estimating the effect of selection within the plaque:

Modifying the approach of Kibota and Lynch (1996) to fit plaque growth dynamics (which differ from bacterial colony growth dynamics), we obtained loss probabilities (*P*_{loss}) from a deterministic model that assumes that an individual phage is transferred after 24 hr of plaque growth, that phage achieve five discrete generations in 24 hr, and that wild-type and newly mutated phages have relative fecundities of 1 and 1 − *s* per generation, respectively. Letting *q*[*s*, *t*] be the frequency of mutations with a particular effect *s* after *t* generations, the initial frequency is *q*[*s*, 0] = 0. Each generation, mutation acts to increase and selection acts to decrease the frequency of deleterious mutations according to(4)

We used this iterative equation to calculate *q*[*s*, 5] for each *s*. We specified for all *s*, but the resulting estimates of *P*_{loss}(*s*) were not sensitive to this choice. The probability of loss was then calculated as(5)

#### Model-based analyses of mutation rate and effects:

To investigate the extent to which our analysis represented an improvement over more traditional analysis methods, we implemented two model-based methods for estimating *U* and *s*—the Bateman–Mukai (BM) method and a maximum-likelihood (ML) method. Both methods assume a model of the mutational process to estimate *U* and *s* from fitness data collected only at the start and end of mutation accumulation.

The BM method assumes that the number of mutations per lineage is Poisson distributed with parameter *U* and that mutations have identical effects, *s. U* and *s* are then determined from the equations(6)(7)where *W*_{0} and *W*_{1} are, respectively, fitness before and after mutation accumulation, *E*(log *W _{i}*) is the expectation, or mean, of log

*W*over the 10 lineages, and is the among-line variance. Note that our use of log fitness instead of fitness differs from the original approach of Bateman and Mukai (Bateman 1959; Mukai 1964). We discuss the advantage of analyzing means and variances of log fitness in (Burch and Chao 2004).

_{i}The ML approach also assumes that the number of mutations per lineage is Poisson distributed with parameter *U*, but assumes that mutational effects are drawn from a gamma distribution with scale and shape parameters α and β. The ML estimates for *U* and were calculated using a C program provided by Peter Keightley (Keightley 1994). The program evaluates the log likelihood of the data as a function of particular parameter combinations using numerical integration (Keightley 1994, 1998). We searched over a wide range of parameter combinations to ensure that we found the global maximum likelihood. Ninety-five percent confidence limits were defined by a -fold drop in the log likelihood relative to its maximum value.

#### Phylogenetic estimation of the genomewide deleterious mutation rate:

Sequences of 24 ϕ6 relatives were downloaded from GenBank (accession nos. DQ273591 to DQ273614, L segment; DQ273639 to DQ273662, S segment). The sequences were truncated to coding regions with entire codons, with a final length of 411 nt (L) and 249 nt (S). These truncated sequences aligned perfectly with the ϕ6 sequences (NC003715, L; NC00371, S) with no gaps. The L and S sequence alignments contained 96 and 54 variable sites, respectively. Because reassortment (*i.e.*, recombination) among segments is frequent (Silander *et al*. 2005), the L and S sequence sets were analyzed separately.

MrBayes (http://mrbayes.csit.fsu.edu/index.php) was used to reconstruct phylogenetic trees for each set of 25 sequences, using the general time-reversible model of nucleotide substitution and the gamma model of among-site rate variation (discretized gamma distribution with four categories). The algorithm was run for 2,000,000 generations, with trees sampled every 1000 generations, and reached convergence as assessed by the standard error of split frequencies. The sampled trees were used to make a consensus tree after a burn-in period of 1000, using the “allcompat” option to provide a fully resolved tree. Although some nodes were not well supported, the *d*_{N}/*d*_{S} analysis described below is robust to noise in the tree topology.

We used the CODEML program in PAML (version 3.15; http://abacus.gene.ucl.ac.uk/software/paml.html) to estimate the *d*_{N}/*d*_{S} ratios for each tree. We used the F3x4 codon model and allowed PAML to estimate the transition/transversion ratio. We tested models that allowed among-site variation in *d*_{N}/*d*_{S}, but these models either failed to improve the fit to the data (L segment) or did not substantially change the overall *d*_{N}/*d*_{S} (S segment).

## RESULTS

#### Identification of mutations by genome sequencing:

To obtain a direct measure of the distribution of mutational effects, we used the bacteriophage ϕ6 to found 10 lineages that were independently propagated through 40 bottlenecks of a single phage. To obtain an exact count of the number of mutations that occurred during mutation accumulation, we sequenced the phage genotype from each lineage following the 40th population bottleneck. Sequencing 92.5% of the complete genomes (including all coding regions) revealed a total of 52 mutations. We estimate that 4 additional mutations were missed by our sequencing efforts (0.075 portion unsequenced × 52 observed mutations), bringing the total estimated mutation count to 56. The identity and molecular consequences of the 52 identified mutations are listed in Table 1. Most mutations resulted from base substitutions, although one insertion mutation was also observed. Transitions outnumbered transversions by 46 to 5, and we observed 32 nonsynonymous coding, 10 synonymous coding, and 10 noncoding mutations.

We used a chi-square goodness-of-fit statistic (Sokal and Rohlf 1995) to compare the observed distribution of mutations among lineages to a Poisson distribution with parameter λ = 5.2 (52 total mutations/10 lineages) and found no significant deviation (χ^{2} = 3.827; d.f. = 5; *P* = 0.5746). This result indicates that mutations were not significantly clustered among lineages and suggests that the per generation mutation rate remained constant across the 10 lineages throughout the experiment. In addition, a comparison of the observed location of mutations on the three genome segments (small segment, 13; medium, 18; large, 26) to the random expectation given the proportion of sequenced bases in each segment (small, 0.213 proportion of sequenced bases × 52 identified mutations = 11.1; medium, 0.299 × 52 = 15.6; large, 0.487 × 52 = 25.3) demonstrated that the location of mutations did not deviate significantly from a random distribution among genome segments (χ^{2} = 0.1183; d.f. = 1; *P* = 0.7309).

#### Identification of mutations by their effects on fitness:

We measured the fitness of each lineage following each of the 40 single-phage bottlenecks (Figure 1). Because mutations are acquired in discrete events, the appearance of deleterious mutations should be detectable in these data as stepwise changes in fitness at discrete time points. Therefore, we attempted to identify mutational events by assessing the fit to the data of regression models containing different numbers of steps. We assessed the fit of the data to two types of regression models—models in which both beneficial and deleterious mutations were allowed and models that were constrained to contain only deleterious mutations. The best-fit models of both types are shown in Figure 1.

For the most part, models that included beneficial mutations (red lines in Figure 1) were deemed biologically unrealistic. In particular, three of four of the detected beneficial mutations occurred within one or two bottlenecks of a deleterious mutation (lineages A, B, and J), suggesting that these steps resulted from noise in the data rather than actual beneficial mutations. Therefore, it seemed biologically more plausible to assume that our data set did not contain any beneficial mutations. Models that were constrained to contain only deleterious mutations gave a good fit to the data (blue lines in Figure 1). The constrained stepwise regression analysis detected between one and four mutations in each lineage, for a combined total of 22 mutations across all 10 lineages. The distribution of effects of these mutations is shown in Figure 2A. The effects of the detected mutations ranged from *s*_{min} = 0.054 to *s*_{max} = 0.418, and mutations of small effect were more common than mutations of large effect.

The absence of detectable mutations with effects < ∼5% confirmed the expectation that phenotypic assays often miss mutations with the smallest effects. In our assay, it was feasible to collect only one fitness measure per bottleneck; thus, statistical power relied on obtaining fitness measures of the same genotype over many consecutive bottlenecks. In this case, limitations on our ability to detect mutations with small phenotypic effects depended on the relative magnitudes of experimental error and mutational effects and on the number of bottlenecks that occurred between mutational events.

#### Investigation of the detection threshold:

We investigated the nature of the detection threshold in the stepwise regression analysis by analyzing simulated data sets. We simulated the accumulation of mutations in 100,000 MA lineages using the mutation rate and effect parameters estimated from the real data. We assumed that all of the mutations that accumulated were deleterious, allowing us to fix the mutation rate parameter *U* = 0.028 (56 mutations/10 lineages/200 generations per lineage) and the average mutation effect Note that the detection threshold depends on *U* and the shape of the effect distribution (*i.e.*, on β) so we had to estimate the detection threshold and β simultaneously. We began with (exponential distribution) and iterated through a process of (1) estimating the detection threshold using data simulated with the current estimate of β and (2) updating the estimate of β using the current detection threshold to obtain a more accurate MLE of β (below). After a small number of iterations, the detection threshold and the MLE of β remained unchanged.

By analyzing the simulated MA lineages using the stepwise regression method, we can calculate the ratio of detected to simulated mutations for any effect size and determine the nature of the detection threshold from these data. The resulting data and detection threshold are shown in Figure 2B. Thirty-nine percent of the simulated mutations had effects that were too small to be detected by the stepwise regression analysis, but the probability of detecting a mutation increased rapidly as the effect size increased above *s*_{min} = 0.059. We also noted a tendency to overestimate the effect size of mutations with large effects *s* > 0.08, a phenomenon that is analogous to the “Beavis effect” in QTL analyses (Beavis 1994, 1998). The best-fitting logistic equation (*s*_{min} = 0.059, *w* = 0.0105, *h* = 1.17) accurately describes the nature of the detection threshold (Figure 2B), especially over the range of effect sizes in which mutations tended to fall (89% of detected mutations had effects *s* < 0.3).

#### Estimating the distribution of mutation effects:

From the genome sequencing we knew that the number of accumulated deleterious mutations was *n*_{DEL} ≤ 56 and that the remaining mutations have essentially neutral effects. To investigate the shape of the mutation effect distribution we started with the assumption that all of the accumulated mutations were deleterious; *i.e*., *n*_{DEL} = 56. In this case, the average mutation effect was fixed at and we compared the likelihood of the observed mutation effects for gamma effects distributions with various coefficients of variation, β. The resulting MLE of β was 0.162 (95% C.I.: 0.082 ≤ β ≤ 0.457), which corresponds to the L-shaped gamma distribution shown in Figure 2, A and C (dashed lines). We also allowed *n*_{DEL} < 56 and compared the likelihood of the observed mutation effects for gamma effects distribution with various and β (likelihood profile shown in Figure 2D). The resulting MLEs were (*n*_{DEL} = 33.4) and β = 0.116, but the 95% confidence limits on these estimates include the ML gamma distribution that corresponds to *n*_{DEL} = 56. Figure 2A illustrates that the gamma distributions corresponding to *n*_{DEL} = 56 and *n*_{DEL} = 33.4 both give a good fit to the data.

#### Estimating the effect of selection within the plaque:

Although we tried to minimize its effects, selection operating during the growth of each plaque will have caused the distribution of mutations that *fixed* in our experiment to deviate somewhat from the distribution we hoped to measure—the distribution of *new* mutations. To estimate the effects distribution of new mutations we modeled the probability that newly arisen deleterious mutations were lost due to selection during plaque growth. The resulting loss probabilities are plotted in Figure 3A. Although the probability that new mutations with small to moderate effects (*i.e.*, *s* < 0.1) were lost due to selection is negligible, we observed a number of mutations with effects large enough (*s* > 0.2) that their probability of accumulating in our experiment was reduced by >30% relative to a neutral mutation.

To obtain estimates of the probability density functions of *new* mutations, we combined the loss probability function *P*_{loss}[*s*] (Figure 3A) with the probability density functions of *fixed* mutations (estimated above; Figure 2C) as follows:(8)

In addition, we used the loss probability function to correct the genomewide deleterious mutation rate estimate,(9)where *U*_{new} and *U*_{fixed} are the rates at which mutations appear and fix in our experiments, respectively. *U*_{fixed} was estimated by the ML procedure described above.

We compare the resulting probability density functions of new *vs.* fixed mutations in Figure 3B (*n*_{DEL} = 56) and 3C (*n*_{DEL} = 33.4). Although the distributions are visually similar, in each case the distribution of new mutations contains substantially more weight in the tail of large *s*-values and substantially less weight at *s*-values near zero. The resulting distributions are no longer gamma shaped, but they are nonetheless characterized by a mean effect and coefficient of variation β. We report the adjusted *U*, and β in Table 2. Note that the estimates of *U* are not much affected by selection in the plaque, but the distribution parameters and β change substantially. For example, in the case of the distribution estimated by assuming that all of the accumulated mutations are deleterious (*i.e.*, *n*_{DEL} = 56), the mean mutational effect changes from (fixed mutations) to (new mutations), a difference of 41%.

#### Traditional MA analyses:

To compare the conclusions from the current data-driven analysis to the conclusions of the traditional model-based MA analyses, we also analyzed the data using the BM approach (Bateman 1959; Mukai 1964) and the ML approach of Keightley (1994). These analyses used fitness data only from the start and finish of MA (before mutation accumulation, and ; after mutation accumulation, and ). To estimate the rate and effects of mutations in the absence of fitness data at intermediate time points, these methods assume that mutations are acquired through a Poisson process with rate *U* and that mutation effects are drawn from either an equal-effects distribution (BM) or a gamma distribution with shape and scale parameters α and β (ML). Estimates of *U*, and the coefficient of variation in mutation effects (β) from the data-driven analysis and from these model-based analyses are compared in Table 2. The model-based analyses both estimated mutation rates that were higher than the number of mutations that occurred (as determined from genome sequencing) and, therefore, mutational effects that were smaller than in reality. Note that the ML approach estimated an equal-effects distribution, despite the considerable variation in effect sizes exhibited by the underlying data (Figure 2A).

#### Phylogenetic estimation of selective constraint at nonsynonymous sites:

An alternative way to estimate the fraction of mutations with deleterious effects is to estimate from phylogenetic data the extent of selective constraint operating on coding sequences (*e.g.*, Eyre-Walker and Keightley 1999). By assuming that synonymous mutations are neutral, it is possible to estimate the fraction of nonsynonymous mutations that are also neutral as the *d*_{N}/*d*_{S} ratio. We estimated the *d*_{N}/*d*_{S} ratio from a collection of ϕ6 relatives isolated from nature (Silander *et al*. 2005). Phylogenetic trees were constructed from the two alignments of coding sequences in that study: 411 bp of the polymerase gene P2 and 249 bp of the nucleocapsid shell gene P8. The estimated *d*_{N}/*d*_{S} ratios were 0.0076 and 0.1009, respectively, combining for an average *d*_{N}/*d*_{S} ratio of 0.0542. In other words, very few nonsynonymous mutations evolve neutrally in ϕ6.

## DISCUSSION

In this article we present the results of an MA experiment designed to provide a direct measure of the distribution of spontaneous mutational effects on fitness in an RNA virus. By measuring fitness following every bottleneck during mutation accumulation and sequencing whole genomes at the conclusion of the experiment we could make substantial refinements in analysis and interpretation compared to previous MA experiments. Although we were similarly constrained to assume a particular type of effects distribution (gamma), the ability to measure directly the effects of accumulated mutations with the largest effects provided both greater power to estimate the underlying distribution parameters (*i.e.*, the mean and variance in mutational effects) and greater confidence that a gamma distribution was a reasonable assumption.

In addition, we note that our data differ somewhat from the data produced via site-directed mutagenesis of the RNA virus VSV (Sanjuan *et al*. 2004). Both experimental designs produced measures of the effects of mutations with the largest effects and an estimate of the relative abundances of mutations with detectable and undetectable effects on laboratory fitness. Although the shapes of the effect distributions estimated for spontaneous deleterious mutations in ϕ6 and for induced deleterious mutations in VSV are difficult to distinguish, the mean mutational effect and the proportion of mutations with detectable effects on laboratory fitness both differed. The mean effect of spontaneous mutations in ϕ6 () was lower than the mean effect of random induced nonlethal mutations in VSV (; estimated from Table 1 in Sanjuan *et al*. 2004), and the proportion of spontaneous mutations in ϕ6 with detectable effects on laboratory fitness (39%) was lower than that in VSV (55%; estimated from Table 1 in Sanjuan *et al*. 2004).

It would be a stretch to conclude that the underlying causes of these differences are biological rather than experimental, but in both cases the direction of the difference is consistent with our *a priori* expectation of the difference between spontaneous and random induced mutations. Whereas the probability of sampling a particular random induced mutation does not depend on the identity of the mutation, the probability of sampling a particular spontaneous mutation can depend radically on mutation identity. In particular, transition mutations are expected to be far more likely than transversion mutations, especially in RNA viruses (in our data set transitions outnumbered transversions by 46 to 5). Furthermore, redundancy in the genetic code is expected to make the average effect of a transition smaller than the average effect of a transversion. As a result, the distribution of spontaneous mutations is expected to contain a greater proportion of small-effect mutations than the distribution of random induced mutations. In sum, the fundamental difference between the identities of spontaneous and induced mutations likely explains our lower estimates of both and the proportion of mutations with detectable effects on laboratory fitness.

#### Shape of the mutation effect distribution:

We assessed the fit of two qualitatively different effect distributions: a continuous gamma distribution in which all mutations are deleterious but many have small effects (dashed line in Figure 2A) and a discontinuous distribution in which most mutations are deleterious with gamma-distributed effects, but other mutations are essentially neutral, *i.e.*, have essentially no effect on fitness (solid line in Figure 2A). From a statistical perspective, the simpler model in which all mutations are deleterious is preferable. The more complex model that includes two categories of mutations—deleterious and neutral—does not significantly improve the fit to the data. However, from a biological perspective, our intuition tells us that mutation effects are likely to be divided into two categories with qualitatively different effects. In general, we expect nonsynonymous mutations to have deleterious effects and synonymous and noncoding mutations to be essentially neutral. The close match between the number of deleterious mutations predicted by the maximum-likelihood analysis (*n*_{DEL} = 33.4) and the number of accumulated nonsynonymous mutations (32) supports this view.

In addition, patterns of molecular evolution in ϕ6 support our intuition that the distribution of mutational effects is discontinuous. *d*_{N}/*d*_{S} ratios in ϕ6 gene phylogenies are near zero (mean *d*_{N}/*d*_{S} = 0.05), suggesting that nonsynonymous mutations are almost universally deleterious (*i.e.*, *s* > 1/*N*_{e}) and that synonymous mutations are effectively neutral (*i.e.*, *s* < 1/*N*_{e}). Noncoding mutations also appear to evolve neutrally, at rates that are similar to the synonymous substitution rate (Silander *et al*. 2005).

We note that the deleterious mutations in the estimated discontinuous distribution are described by a gamma distribution that is not L-shaped, but very nearly exponential—the exact distribution shape that has been assumed for decades.

#### Comparisons to other methods and organisms:

One of the most puzzling outcomes of our analysis arises from the comparison with the traditional MA analysis methods. First, despite the extensive variation in effect sizes among the mutations in our collection, the traditional ML analysis (Keightley 1994) indicated that the fitness declines were best modeled using an equal-effects distribution. Second, the assumption made by the traditional analysis methods that mutations have identical effects is expected to cause *U* to be underestimated and *s* to be overestimated. Contrary to this expectation, the BM and ML estimates of *U* (and *s*) are higher (and lower) than our more accurate estimates. Although the BM and ML estimates differ from our estimates only by 2- to 3-fold, it is difficult to gain an intuition for why the estimate of *U* (that is supposed to be an underestimate) exceeds the value obtained by counting the total number of mutations by genome sequencing even though this count includes many effectively neutral mutations. We note that our data were subject to day effects that had similar effects on the initial and final fitness measures for each lineage and that the ML method uses a statistical model that includes environmental (*i.e*., plate) effects, but not day effects (Keightley 1994). Even so, while the presence of day effects might explain the surprising ML estimates, it cannot explain the surprising BM estimates.

A comparison of the *U* and *s* estimates obtained here in an RNA virus to previous estimates from eukaryotic organisms also yields a surprising outcome. Contrary to the *a priori* expectation that RNA viruses are radically different from other organisms, our estimates of *U* = 0.030 and *s* = 0.093 are both within the range of recent estimates in eukaryotes: 10^{−3} < *U* < 10^{−1} and 0.06 < *s* < 0.2 (Keightley and Caballero 1997; Fry *et al*. 1999; Wloch *et al*. 2001; Shaw *et al*. 2002; Joseph and Hall 2004; Baer *et al*. 2005). Clearly some RNA viruses have higher genomic deleterious mutation rates than ϕ6, but our data suggest that it is inappropriate to lump RNA viruses together into a single group that is subject to different evolutionary rules than organisms with DNA genomes.

Although estimates of the average mutation effect in ϕ6 and other organisms are similar, the distribution of mutation effects appears to differ. In our experiment, as in the previous study of the RNA virus VSV (Sanjuan *et al*. 2004), more than one-third of mutations (22/56) had detectable effects on laboratory fitness. In contrast, MA studies of the eukaryote *Caenorhabditis elegans* showed that only 1–4% of mutations have detectable effects on laboratory fitness (Davies *et al*. 1999; Estes *et al*. 2004). It is tempting to explain this difference by claiming that viral genomes are more compact than eukaryotic genomes or that viral genomes contain more functional RNA secondary structure (Brinton and Dispoto 1988; Brown *et al*. 1992; Berkhout and Schoneveld 1993). However, it is also possible that the laboratory environment more closely mimics the natural environment of viruses than it does the natural environment of *C. elegans*. We favor the latter explanation because most gene knockouts in ϕ6 either are lethal or have substantial effects on laboratory fitness (Mindich 2005), whereas most gene knockouts in eukaryotes show no detectable effect on laboratory fitness (Shoemaker *et al*. 1996; Winzeler *et al*. 1999). We suspect that viral genomes are not qualitatively different from eukaryotic genomes (except in size), but rather that measuring fitness in the laboratory has led to an underestimation of both the rate and the effects of deleterious mutations in eukaryotes.

## Acknowledgments

We thank Adam Eyre-Walker, Sally Otto, Jennifer Knies, Patrick Phillips, two anonymous reviewers, and members of the Burch lab for comments that improved the experimental design and the final manuscript. In addition, we thank Steve Marron and his statistical consulting course at the University of North Carolina for providing the means to initiate this collaboration. This work was supported by a grant from the National Institutes of Health to C.L.B.

## Footnotes

↵2

*Present**address:*INRA, UR979, Production Végétales, Petit-Bourg, F-97170, France.Communicating editor: P. Phillips

- Received October 23, 2006.
- Accepted February 23, 2007.

- Copyright © 2007 by the Genetics Society of America