Epistasis and Its Relationship to Canalization in the RNA Virus φ6
Christina L. Burch, Lin Chao


Although deleterious mutations are believed to play a critical role in evolution, assessing their realized effect has been difficult. A key parameter governing the effect of deleterious mutations is the nature of epistasis, the interaction between the mutations. RNA viruses should provide one of the best systems for investigating the nature of epistasis because the high mutation rate allows a thorough investigation of mutational effects and interactions. Nonetheless, previous investigations of RNA viruses by S. Crotty and co-workers and by S. F. Elena have been unable to detect a significant effect of epistasis. Here we provide evidence that positive epistasis is characteristic of deleterious mutations in the RNA bacteriophage φ6. We estimated the effects of deleterious mutations by performing mutation-accumulation experiments on five viral genotypes of decreasing fitness. We inferred positive epistasis because viral genotypes with low fitness were found to be less sensitive to deleterious mutations. We further examined environmental sensitivity in these genotypes and found that low-fitness genotypes were also less sensitive to environmental perturbations. Our results suggest that even random mutations impact the degree of canalization, the buffering of a phenotype against genetic and environmental perturbations. In addition, our results suggest that genetic and environmental canalization have the same developmental basis and finally that an understanding of the nature of epistasis may first require an understanding of the nature of canalization.

When mutations occur together, their realized effects depend on the epistasis, or interaction, between them. In the absence of epistasis, mutations act independently and are expected to have a multiplicative effect on fitness (additive on a log scale). When epistasis is present, it can be of two forms: negative if the fitness of organisms bearing two mutations is lower than expected or positive if the fitness of such organisms is higher than expected. Epistasis is believed to be biologically important because the form of epistasis determines the outcome of a variety of evolutionary processes, among them the evolution of sex (Mukai 1969; Kondrashov 1982; Otto and Feldman 1997; Barton and Charlesworth 1998), ploidy (Kondrashov and Crow 1991; Jenkins and Kirkpatrick 1995), speciation (Wagner et al. 1994), inbreeding depression (Charlesworth 1998), genetic drift (Kondrashov 1994; Lande 1994; Schultz and Lynch 1997), genetic load (Rice 1998), and the origin of life (Wagner and Krall 1993). In most cases, the impact of epistasis stems from the fact that natural selection is more or less effective in changing the frequency of advantageous or deleterious mutations depending on the form of epistasis. Thus, negative epistasis decreases the load of deleterious mutations because selection eliminates a mutation that occurs in a genome with other mutations more rapidly than a mutation in a genome without additional mutations. However, despite the importance of these acknowledged roles for epistasis, quantitative measurements of epistasis are still not numerous and the few existing measurements have often been challenged (Chao 1988; de Visser et al. 1997a,b; Elena and Lenski 1997; Elena 1999; Lenski et al. 1999; Peters and Keightley 2000; Whitlock and Bourguet 2000; Crotty et al. 2001; Wloch et al. 2001). As a result, there is still no general understanding as to when or how epistasis arises in real biological systems.
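The definition of the sign of epistasis given above can be illustrated with a minimal sketch. The function names and the fitness values are hypothetical and chosen only for illustration; fitness is assumed to be expressed relative to an unmutated genotype with fitness 1.

```python
import math

def epistasis(w1: float, w2: float, w12: float) -> float:
    """Deviation of the double mutant's log fitness from the multiplicative
    (log-additive) expectation log W1 + log W2."""
    return math.log10(w12) - (math.log10(w1) + math.log10(w2))

def classify(eps: float, tol: float = 1e-9) -> str:
    """Name the form of epistasis; tol absorbs floating-point rounding."""
    if eps > tol:
        return "positive"   # double mutant fitter than expected
    if eps < -tol:
        return "negative"   # double mutant less fit than expected
    return "none"

# Hypothetical single-mutant fitnesses 0.9 and 0.8: expectation is 0.72.
for w12 in (0.72, 0.80, 0.60):
    print(w12, classify(epistasis(0.9, 0.8, w12)))
```

Working on the log scale makes the no-epistasis expectation additive, which is the same convention used when fitness is regressed on mutation number.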

RNA viruses have recently emerged as a possible system for quantifying epistatic interactions. Elena (1999) measured the correlation between fitness and number of mutations in foot-and-mouth disease virus (FMDV) to determine the average sign of epistasis and found a log-linear relationship, i.e., no epistasis. More recently, Crotty et al. (2001) measured the correlation between infectivity and number of mutations in polioviruses and found evidence for positive epistasis (their Figure 4). However, in both studies the statistical power of the data was limited. Although Elena's result did not find a significant deviation from log-linearity, an epistatic effect up to a magnitude of 70.5% of the effect of individual mutations could have been missed by the power of his analysis. Our own reanalysis of Crotty and colleagues' data (their Tables 2 and 3) also did not find any significant deviation from log-linearity (F1,1 = 0.94, P = 0.5104), although a trend toward positive epistasis was evident.

Because of the importance of epistasis to evolutionary processes, we conducted a study targeted at measuring the sign of epistasis in the RNA bacteriophage φ6. To gain statistical power, we increased sample sizes by using a fitness assay based on plaque size. The experimental design consisted of allowing mutations to accumulate in independent viral lineages subjected to population bottlenecks. The use of plaque size as a fitness measure allowed us to assay fitness of hundreds of viral lineages. Five φ6 genotypes with decreasing fitness values were subjected to such mutation-accumulation (MA) experiments, providing an estimate of the average effect of mutations in each genotype. We report that mutational effects on fitness were significantly smaller in lower-fitness genotypes, supporting a model of positive epistasis in φ6. We also took advantage of this experimental design to estimate environmental effects on each genotype and report that environmental effects on fitness were correlated with mutational effects on fitness. We discuss the relevance of these findings to studies of canalization and epistasis in other organisms.


MA experiments (Bateman 1959; Mukai 1964) promote the accumulation of mutations by subjecting genetic lineages to repeated population bottlenecks of a single individual. The sampling of mutations in these experiments is nearly unbiased because genetic drift prevails over natural selection during the extreme bottlenecks. With selection removed, all nonlethal mutations, regardless of whether they are advantageous, deleterious, or neutral, can increase to fixation (a frequency of 100%) with approximately the same probability (but see Kibota and Lynch 1996). However, because most mutations are deleterious, MA experiments in general offer the best sampling of the rate and effects of deleterious mutations. From our previous work (Chao 1990; Burch and Chao 1999), we knew that sequential bottlenecks of one virus rapidly lead to the accumulation of deleterious mutations in φ6.

As deleterious mutations accumulate in independent lineages, the expectation is that mean fitness across all lineages declines at rate Us and variance in fitness among lineages increases at rate Us², where U is the genomic deleterious mutation rate and s is the average effect of a deleterious mutation (Bateman 1959; Mukai 1969). Consequently, U and s can be estimated from changes in the mean and variance in fitness among a collection of MA lineages.
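These two expectations can be recovered from a short compound-Poisson sketch. It assumes that every mutation has the same effect s, that effects add on the scale being measured, and that the number of mutations K carried by a lineage after t generations is Poisson distributed with mean Ut. The symbols K and t are introduced here only for illustration.

```latex
\begin{align}
\Delta W &= K s, \qquad K \sim \mathrm{Poisson}(Ut) \\
E[\Delta W] &= s\,E[K] = U s\,t \\
\mathrm{Var}[\Delta W] &= s^{2}\,\mathrm{Var}[K] = U s^{2}\,t
\end{align}
```

Under these assumptions, dividing the variance by the mean recovers s, and dividing the squared mean by the variance recovers Ut.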

To assess the average sign of epistasis using MA experiments, independent values of s must be estimated for a series of genotypes that vary either in fitness or in mutation number. We chose to create these genotypes by first subjecting a single high-fitness genotype to an MA experiment in which several independent phage lineages were each propagated through 40 successive population bottlenecks. Because mutations are randomly acquired in independent MA lineages, clonal isolates from different lineages are expected to differ in both the number and identity of mutations. Several of the resulting genotypes were chosen as a starting point for the current study on the basis of their ability to evenly span a range of fitness down to 52% of the fitness of the ancestral phage.

Because φ6 has the high mutation rate characteristic of RNA viruses, short-term MA experiments can be used to estimate s independently in genotypes that differ in fitness. Therefore, we conducted MA experiments in which each of the chosen genotypes was propagated in many independent lineages through a single population bottleneck (see Figure 2). s was estimated for each genotype, and the sign of epistasis was determined from a regression of the magnitude of s on the fitness of each genotype. With no epistasis, deleterious mutations should have the same effects in all genotypes, and the regression should have a slope of zero. With positive epistasis the regression coefficient should be positive in sign, with lower-fitness (more heavily mutated) genotypes experiencing smaller mutational effects and vice versa for negative epistasis.
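The regression logic described above can be sketched as follows. The data points are invented for illustration and are not the estimates reported in this study; the function names are likewise hypothetical. The sketch reads off only the sign of the slope and does not perform the significance test that a real analysis requires.

```python
def ols_fit(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def epistasis_sign(log_w0, s_magnitude):
    """Sign of epistasis from the regression of |s| on genotype log fitness."""
    slope, _ = ols_fit(log_w0, s_magnitude)
    if slope > 0:
        return "positive"   # smaller effects in lower-fitness genotypes
    if slope < 0:
        return "negative"   # larger effects in lower-fitness genotypes
    return "none"

# Invented example: five genotypes, lowest fitness first.
log_w0 = [-0.28, -0.20, -0.12, -0.06, 0.00]
s_magnitude = [0.01, 0.04, 0.07, 0.09, 0.11]
print(epistasis_sign(log_w0, s_magnitude))
```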

A side benefit of using MA experiments to estimate s in various phage genotypes is that variation due to the environment, Ve, is also estimated for each genotype. By simultaneously measuring the effects of mutation and of the environment in multiple genotypes we were able to determine whether mutational and environmental sensitivity are correlated in φ6. Waddington (1940, 1942) predicted that the developmental mechanisms that produce canalization (phenotypic buffering) would act by simultaneously buffering against the effects of mutation and the environment, and there is some evidence to support this prediction (e.g., Waddington 1940; Mather 1953; Tebb and Thoday 1954; Scharloo 1991; Stearns et al. 1995; Ancel and Fontana 2000; Bergman and Siegal 2003). However, it is not known whether most spontaneous mutations will impact the degree of both mutational and environmental sensitivity.


Strains and culture conditions:

The RNA bacteriophage φ6 used in this study is a laboratory genotype descended from the original isolate (Vidaver et al. 1973). Pseudomonas syringae pv. phaseolicola, the standard host of φ6, was obtained from the American Type Culture Collection (ATCC; no. 21781). Details of diluting, filtering, culture, and storage of phage and bacteria are published (Mindich et al. 1976; Chao and Tran 1997). All phage and bacteria were grown in LC medium at 25°.

Phage propagation:

Details of the protocol have been described (Burch and Chao 1999). Phage were plated onto a lawn of the standard host P. phaseolicola and incubated to allow the phage to reproduce and form plaques on the lawn. After 24 hr, isolated plaques were randomly chosen from the plate, and phage were harvested from the plaques and then diluted and plated on a fresh lawn to start a new growth cycle. One growth cycle on the lawn corresponds to about five generations.

Plaque-size determination:

Plaque sizes were measured as mean plaque area and were determined by plating phage on a lawn of P. phaseolicola at a low density (<50 phage per plate) to ensure nonoverlapping plaques. Following a 24-hr incubation, digital pictures were taken and used to measure the area of isolated plaques on each plate. Image analysis was performed using Scion Image Version 3b (Scion Corporation, Frederick, MD). The mean area per plaque on an individual plate corresponds to one measure in the distributions of lineage fitness.

The relationship between plaque size and log fitness was calibrated (Figure 1) using a set of nine phage genotypes whose fitness had been determined using the standard growth rate measure of fitness (Chao 1990). The standard measure estimates phage reproduction relative to that of a marked ancestral φ6 and is measured over the course of 24 hr. Phage achieve approximately five generations of growth over 24 hr (our unpublished observation); thus our standard fitness measure (W5) captures differences in reproduction that result after five generations. For the current study, this measure was adjusted to reflect differences in reproduction that result from only a single generation (W1), using the equation W1 = (W5)^(1/5) or log10 W1 = (1/5) log10 W5.

Figure 1.—

Correlation between plaque size and relative growth rate measures of phage fitness. Data shown are from a reference set of phage that carry different deleterious mutations. Plaque size is the mean area of an individual plaque and was measured by taking the average size of plaques grown on three plates. Log W is a one-generation measure of log(fitness) and was measured using the standard relative growth rate assay, also performed in triplicate (materials and methods; Chao 1990).

The resulting one-generation estimates of log(fitness) were regressed on measures of plaque size (area in square millimeters), yielding the equation: log W1 = 0.044 × (plaque size) − 0.340 (R² = 0.968, F1,7 = 213.4, P < 0.0001). This equation was used to convert measures of plaque size into estimates of log(fitness) throughout this study. Here and henceforth, log W refers to the base 10 logarithm of this one-generation estimate of fitness.
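The two conversions used in this calibration, from a five-generation to a one-generation fitness measure and from plaque area to log W, can be written as a minimal sketch. The coefficients are those of the regression reported above; the function names and the example plaque area are hypothetical.

```python
import math

def one_generation_log_fitness(log10_w5: float, generations: int = 5) -> float:
    """log10 W1 = (1/5) log10 W5, i.e. W1 = W5 ** (1/5)."""
    return log10_w5 / generations

def log_w_from_plaque_area(area_mm2: float,
                           slope: float = 0.044,
                           intercept: float = -0.340) -> float:
    """Calibration line: log W1 = 0.044 * (plaque area in mm^2) - 0.340."""
    return slope * area_mm2 + intercept

# Hypothetical genotype with a five-generation relative fitness of 0.5.
log_w1 = one_generation_log_fitness(math.log10(0.5))
print(round(10 ** log_w1, 4))                     # one-generation fitness W1

# Hypothetical plaque with a mean area of 10 mm^2.
print(round(log_w_from_plaque_area(10.0), 3))
```

Because the calibration is linear in log W, differences in plaque area translate directly into differences in log(fitness), which is why the later analyses work on the log scale.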

Generation of phage genotypes used to initiate the experiment:

The “wild-type” laboratory genotype is denoted φ6ANC throughout this study. Mutant phage were generated by propagating φ6ANC through 40 single-phage bottlenecks (i.e., plaque-to-plaque transfers) in many independent lineages (as in Chao 1990; Burch and Chao 1999). Intensified genetic drift during the bottlenecks caused the accumulation of random deleterious mutations, and four of the resulting mutants were selected for use in our study because they spanned a range of fitness down to 52% of the fitness of φ6ANC. Genome sequence of the mutant phage (C. Burch, unpublished data) confirms that each mutant contains between 1 and 3 mutations. For convenience, these mutants are denoted φ6M1, φ6M2, φ6M3, and φ6M4, in order from highest to lowest fitness.

Estimation of mutation numbers and effects:

We estimated the mutation parameters U, the number of deleterious mutations acquired per generation, and s, the average effect of mutations, using two methods: the Bateman-Mukai (BM) method (Bateman 1959; Mukai 1964) and the maximum-likelihood (ML) method developed by Keightley and Ohnishi (1998). The BM method is described in detail in Figure 2. Briefly, mutations are assumed to have equal effects, so that estimates of U and s can be obtained from the per-generation changes in the mean and variance of fitness among lineages. ML offers an advantage over the BM method because mutations are not required to have equal effects, but are instead described by a family of gamma distributions. The ML method uses the likelihood to compare the fit to the data of different mutation-effect distributions and yields estimates of the distribution scale and shape parameters, α and β. The mean mutational effect is then calculated as s = β/α. We assumed unidirectional gamma distributions (i.e., all mutations are deleterious) unless otherwise specified. To compare the two approaches, U and s can be estimated by ML after specifying β = ∞, which yields a gamma distribution that approximates the assumption of equal mutation effects (for a fixed mean β/α, the variance β/α² shrinks to zero as β grows).

Figure 2.—

Experimental design and analysis. In the manner of Bateman (1959) and Mukai (1964), we conducted a mutation-accumulation experiment in which viruses were propagated in replicate lineages through a single bottleneck. The experiment described here was performed on five φ6 genotypes that differed in fitness. For each genotype, phage were harvested from a single plaque, diluted, and plated on fresh bacterial lawns to give isolated plaques at t0. A single plaque from each t0 plate was randomly chosen and phage were harvested from these plaques, diluted, and plated as for t0. Although only three lineages are diagrammed, in reality an average of >100 lineages were propagated for each genotype. The average log(fitness) of phage on each plate was determined using plaque size. At t0 and t1, the distribution of log(fitness) is characterized by its mean, E(log Wi), and variance, Var(log Wi), which is broken into Ve, the component of variance due to environment, and Vm, the component due to mutation. Mutations are assumed to appear randomly by a Poisson process at rate U and to carry effect x = log(1 + s), in which case mean log(fitness) decreases by an increment Ux with each bottleneck. Because all of the phage on the t0 plates are descended from a single plaque, the variance in log(fitness) at t0 resulted only from environmental effects (Ve). However, variance in log(fitness) at t1 resulted from both environmental effects (Ve) and mutations that were acquired between t0 and t1 (Vm = Ux²).

Measuring epistasis:

To compare the results of our study with other studies of epistasis, we estimated from our data an equation of the form

$$\log W_k = ak + bk^2, \tag{1}$$

where Wk is the fitness of a virus carrying k mutations, and a and b describe the linear and quadratic effects of mutations, respectively. First, we estimate log Wk for several k, using the definitions

$$W_{k+1} = W_k(1 + s_k), \tag{2}$$

$$\log W_{k+1} = \log W_k + \log(1 + s_k), \tag{3}$$

where sk is the average effect of a mutation on fitness when it appears in an individual carrying k mutations. We then substitute into Equation 3 the relationship between sk and log Wk measured in Figure 3a (sk = −0.394 × log Wk − 0.128), yielding

$$\log W_{k+1} = \log W_k + \log(1 - 0.394 \log W_k - 0.128). \tag{4}$$

Figure 3.—

Effects and rates of mutations. (a) Relationship between BM estimates of s and the log W0 of five phage genotypes. As in Figure 2, W0 is the fitness of a genotype before the accumulation of mutations. Because s is measured for each genotype relative to the value of W0 for that genotype, s describes the average effect of a mutation appearing in the genetic background of the genotype. We show −s instead of s to indicate that the magnitude of deleterious mutations increases with log W0. (b) Relationship between BM estimates of U, the genomic deleterious mutation rate, and log W0.

By iterating Equation 4, log Wk was estimated for k = 1–5, generating a similar number of data points to that used in previous studies. The log Wk were regressed on k, and the quadratic model that best fit these estimates was log Wk = −0.0562k + 0.0038k².
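The iteration and quadratic fit can be sketched as follows. The sketch assumes that Equation 4 has the form log W(k+1) = log Wk + log10(1 + sk) with sk = −0.394 × log Wk − 0.128, that the unmutated genotype has log W0 = 0, and that the quadratic is fitted without an intercept; all three are assumptions made for illustration, and the function names are hypothetical. Because of these assumptions and of rounding in the regression coefficients, the fitted a and b need not reproduce the published values exactly.

```python
import math

SLOPE, INTERCEPT = -0.394, -0.128   # s_k = SLOPE * log W_k + INTERCEPT (Figure 3a)

def iterate_log_w(n_mutations: int = 5, log_w0: float = 0.0):
    """Return [log W_1, ..., log W_n] by adding one mutation at a time."""
    values, log_w = [], log_w0
    for _ in range(n_mutations):
        s_k = SLOPE * log_w + INTERCEPT          # effect in the current background
        log_w = log_w + math.log10(1.0 + s_k)    # assumed form of Equation 4
        values.append(log_w)
    return values

def fit_quadratic_through_origin(ks, ys):
    """Least-squares a, b for y = a*k + b*k^2 (no intercept), via normal equations."""
    s2 = sum(k ** 2 for k in ks)
    s3 = sum(k ** 3 for k in ks)
    s4 = sum(k ** 4 for k in ks)
    sy1 = sum(k * y for k, y in zip(ks, ys))
    sy2 = sum(k * k * y for k, y in zip(ks, ys))
    det = s2 * s4 - s3 * s3
    a = (s4 * sy1 - s3 * sy2) / det
    b = (s2 * sy2 - s3 * sy1) / det
    return a, b

log_w = iterate_log_w()
a, b = fit_quadratic_through_origin([1, 2, 3, 4, 5], log_w)
print([round(v, 4) for v in log_w])
print(round(a, 4), round(b, 4))   # b > 0 corresponds to positive epistasis
```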


All analyses were performed as described (Sokal and Rohlf 1995).


Previous attempts to quantify epistasis in viruses (see below) have detected an effect but lacked the power to demonstrate significance. To maximize our chances of detecting significant epistasis in φ6, we increased the power of our study by using a fitness measure that would allow sampling of a large number of mutated individuals. We measured the fitness of our viral isolates by plaque size instead of the standard fitness assays based on growth rate. In Figure 1 we show the linear relationship between plaque size and the standard fitness measure (log W) and demonstrate that plaque size is an excellent predictor of log W (R2 = 0.968). Because the measurement of plaque size is automated and reliable, we were able to measure simultaneously the fitness of viral isolates with a sample size of thousands instead of dozens (Chao 1990).

Estimating mutational effects (s) in genotypes of varying fitness:

MA experiments were conducted on five phage genotypes of decreasing fitness to determine the average effect of deleterious mutations on each genotype. Each genotype was propagated in ∼100 replicate lineages through one bottleneck of a single phage (see Figure 2), and plaque size was measured both before and after the bottleneck. For each of the five phage genotypes, phage exhibited significantly smaller plaque sizes after the bottleneck than before it (P < 0.05 by two-tailed t-tests), attesting to the high genomic mutation rate and the prevalence of deleterious mutations.

Our analysis of these data (see Figure 2) was similar to the traditional analysis of MA experiments. Because plaque size correlates linearly with log(fitness) in φ6, rather than with fitness, we adapted the traditional analysis to estimate U and x, instead of U and s, where x is the magnitude of the average change in log(fitness) resulting from a single mutation, and x = log(1 + s). Rearranging the equations from Figure 2 that describe the change in the mean and variance in log(fitness), U and x were then calculated as

$$U = \frac{[E(\log W_1) - E(\log W_0)]^2}{\mathrm{Var}(\log W_1) - \mathrm{Var}(\log W_0)}, \tag{5}$$

$$x = \frac{\mathrm{Var}(\log W_1) - \mathrm{Var}(\log W_0)}{E(\log W_1) - E(\log W_0)}, \tag{6}$$

where W0 and W1 are the fitnesses of phage lineages before and after the bottleneck, respectively. s was estimated directly from x and measures the average effect of mutations on fitness relative to W0, the fitness of the genetic background in which the mutations appear. Our analysis differs from that of previous MA experiments primarily because we analyzed distributions of log(fitness) directly. We chose not to convert our data from log(fitness) to fitness because the traditional analysis is easily adapted to log(fitness) distributions and, moreover, because the conversion of outlying points would have inflated the variance in fitness to the extent that signal resulting from differences in mutational effects and number would have been overwhelmed by noise.
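A minimal sketch of this calculation, following the description in Figure 2 (the mean of log(fitness) changes by Ux and the mutational variance is Ux²), is given below. The function name and the input values are hypothetical. Base-10 logarithms are assumed when converting x to s, and any rescaling of U from a per-bottleneck to a per-generation rate is deliberately left out.

```python
from statistics import mean, variance

def bateman_mukai(log_w_before, log_w_after):
    """BM-style estimates from log(fitness) measured at t0 and t1."""
    delta_mean = mean(log_w_after) - mean(log_w_before)   # equals U * x
    v_e = variance(log_w_before)                          # environmental variance at t0
    v_m = variance(log_w_after) - v_e                     # mutational variance, U * x^2
    if delta_mean == 0 or v_m <= 0:
        raise ValueError("no detectable mutational signal in these data")
    u = delta_mean ** 2 / v_m
    x = v_m / delta_mean
    s = 10 ** x - 1                                       # from x = log10(1 + s)
    return {"U": u, "x": x, "s": s, "Ve": v_e}

# Invented log(fitness) values for five lineages before and after one bottleneck.
before = [0.00, 0.02, -0.02, 0.01, -0.01]
after = [0.00, -0.10, 0.02, -0.02, -0.05]
print(bateman_mukai(before, after))
```

The sample variance at t0 serves as the estimate of Ve, so the same data yield both the mutational and the environmental parameters for a genotype.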

We used Equations 5 and 6 to estimate U and s and plotted the relationships between U and s and the log(fitness) of the initial genotype in Figure 3. Least-squares linear regressions demonstrated a significant increase in the magnitude of s with genotype fitness (F1,3 = 63.08; P = 0.0042), but no significant relationship between U and genotype fitness (F1,3 = 2.49; P = 0.2128). Estimates of s ranged from −0.010 to −0.107 in the lowest- and highest-fitness phage, respectively, and U was estimated at 0.067 deleterious mutations per generation. Although the values change slightly, estimates of U and s obtained using ML and assuming equal mutation effects (Table 1) also demonstrate a significant relationship between s and genotype fitness (F1,3 = 18.26; P = 0.0235) and no significant relationship between U and genotype fitness (F1,3 = 3.43; P = 0.1612).

Table 1. BM and ML estimates of U and s under a model of equal mutation effects

The BM estimates are compared to estimates obtained using ML in Table 1. Using the assumption of equal mutation effects, ML analyses yield estimates of U that are somewhat higher and estimates of s that are somewhat lower than the BM estimates. In general, the difference between the estimates is not more than twofold. ML tends to yield higher estimates of U and lower estimates of s if there is variability among mutation effects (Keightley and Ohnishi 1998). Thus, further ML analyses were carried out in which mutational effects were assumed to be gamma distributed, rather than uniform. Unfortunately, these analyses did not allow us to obtain improved estimates of U and s. Only in two cases (φ6ANC and φ6M2) did gamma-distributed mutational effects yield an improved fit to the data, and in both of these cases the models yielding the best fit to the data were biologically unrealistic (β → 0, U → ∞, s → 0).

In a final attempt to probe the distribution of mutation effects, we used ML to test for the presence of beneficial mutations. Setting β = ∞, we assumed an equal mutation effects distribution, but allowed a proportion p of mutations to be beneficial. We tested whether a model allowing p > 0 yielded an improved fit to the data over a model in which p was fixed at zero. For no genotype did a model with p > 0 yield a significantly improved fit to the data. Thus, an estimate of p = 0 was obtained in each of the five genotypes, suggesting that beneficial mutations were not present in our data set.

Estimating environmental effects on genotypes of varying fitness:

We examined the effects of environmental perturbations on fitness in two ways. First, we used the data from the MA experiments to assess Ve, the contribution of the environment to variation in log(fitness) for each of the five phage genotypes. Ve was calculated as the variance in plaque size among lineages before the bottleneck, i.e., before mutations have the opportunity to accumulate. This estimate of Ve is equivalent to the within-lineage variance that is often calculated in MA experiments (e.g., Zeyl and DeVisser 2001). A least-squares linear regression of the estimates of Ve on log(fitness) of the five φ6 genotypes demonstrated a significant positive relationship (Figure 4a; F1,3 = 77.99, P = 0.0031). Because Ve measures variance in log(fitness), this relationship demonstrates a greater than proportional increase in environmental sensitivity with increasing fitness.

Figure 4.—

Sensitivity to the environment. (a) The relationship between estimates of Ve, the variance in log(fitness) due to environmental effects, and log W0. (b) Mean plaque size of (▪) high-, (○) mid-, and (•) low-fitness genotypes measured over a range of bacterial host densities.

We wanted to confirm the relationship between fitness and environmental sensitivity in an independent experiment; therefore we assayed plaque size over a range of bacterial densities (Figure 4b). We investigated the effect of host density instead of other environmental factors, such as temperature or humidity, because we believe host density has the greatest influence over viral growth rate under our experimental conditions. Although genotypes of high, medium, and low fitness all exhibited a maximal plaque size at an intermediate host density of 10⁸ bacteria/plate, the rate at which plaque size declined with a change in host density increased with the fitness of the genotype. Least-squares regression was used to fit separate lines to the data on the left and right sides of the maximum. The slopes of the best-fit lines differed significantly between genotypes on both sides (left, F1,5 = 18.36, P = 0.0078; right, F1,22 = 57.2, P < 0.0001). Because plaque size is a linear function of log(fitness), the increase in the slope on each side of the maximum indicates that phage fitness shows a greater than proportional increase in environmental sensitivity as overall fitness increases.


Estimating mutation parameters:

Using the BM method to analyze a series of short-term mutation-accumulation experiments, we estimated the mutation parameters U and s in five φ6 genotypes. We estimated U at 0.067 and found no evidence that U differs between genotypes. In contrast, estimates of s ranged between −0.010 and −0.107 and increased in magnitude with the fitness of the phage genotype. Our attempts to use the ML approach to obtain more accurate estimates of the mutational parameters met with only limited success. We obtained parameter estimates using ML with an equal mutation effect distribution (β = ∞), and these estimates were similar to the BM estimates of U and s. However, adding variance in mutational effects (β < ∞) either did not improve the fit of the model to the data or yielded biologically implausible estimates of U and s.

Because we were able to obtain parameter estimates only by assuming equal mutation effects, our estimates of U (and s) can be considered only as lower (and upper) bounds. Nonetheless, we have some confidence that these bounds are close to the true mutation parameters in φ6. Chao et al. (2002) measured the mutation rate in φ6 by determining the reversion rate of an amber mutation. Their measure of 2.7 × 10⁻⁶ reversions per generation, multiplied by the φ6 genome size of 13,391 bp, yields a genome-wide mutation rate measure of U = 0.036. Our estimate of U = 0.067 differs from this measure by only twofold. In addition, C. Burch (unpublished data) has directly measured the effects of 11 random spontaneous mutations in φ6ANC and found an average effect s = −0.103, nearly identical to the estimate obtained here for φ6ANC (Table 1).
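The arithmetic behind this comparison can be checked in a few lines. The variable names are illustrative; the numbers are those quoted above.

```python
reversion_rate = 2.7e-6      # reversions per generation (Chao et al. 2002)
genome_size_bp = 13_391      # phi6 genome size in bp
u_bm = 0.067                 # BM estimate of U from this study

u_reversion = reversion_rate * genome_size_bp   # genome-wide rate
fold_difference = u_bm / u_reversion

print(round(u_reversion, 3))       # genome-wide mutation rate estimate
print(round(fold_difference, 2))   # ratio of the two estimates
```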

In addition to the assumption of equal mutation effects made in the BM analysis, we have considered several other possible sources of error. In our view, the most likely source of error could result from a sampling bias in the lowest-fitness genotypes. Such a bias would result if mutations with large effects appearing in low-fitness genotypes produced phage that were unable to form visible plaques. Three lines of evidence argue against this possibility. First, phage of substantially lower fitness (20%) than the lowest-fitness phage used here still form visible plaques. The effects of mutations would have to be more than an order of magnitude larger than the effects estimated here to generate phage of such a low fitness. Second, when φ6M4, the lowest-fitness genotype, is removed from our data set, the regression of s on log(fitness) remains significant (F1,2 = 37.98; P = 0.0253). Third, estimates of U were not smaller in low-fitness genotypes, suggesting that we did not miss a whole class of mutations in the low-fitness genotypes.

An additional concern is that beneficial (or compensatory) mutations could be more common in the lower-fitness genotypes. On first consideration it would seem that estimates of s, the average effect of mutations, would decrease as an increasing number of beneficial mutations shifted the average effect toward zero. In reality, however, the manner in which beneficial mutations break the assumption of equal mutation effects has a nonintuitive effect on the BM estimates of U and s. Because beneficial mutations would cause between-line variance to increase without depressing mean fitness, beneficial mutations would actually inflate estimates of s, rather than causing estimates of s to approach zero. Therefore, unless beneficial mutations were more common in higher-fitness genotypes, they could not have driven our findings. In any case, we did not detect beneficial mutations in our data set. ML estimates of p, the proportion of beneficial mutations, were zero in all genotypes.
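The nonintuitive effect described above can be made explicit with a small worked example. It assumes a single bottleneck, a fraction p of mutations that are beneficial, and, purely for illustration, that beneficial and deleterious mutations have the same magnitude |x| on the log(fitness) scale. Here ΔM and ΔV denote the change in mean and the mutational variance, and hats mark the BM estimates.

```latex
\begin{align}
\Delta M &= U\,\bigl[(1-p)(-|x|) + p\,|x|\bigr] = -U\,|x|\,(1-2p) \\
\Delta V &= U\,\bigl[(1-p)\,x^{2} + p\,x^{2}\bigr] = U x^{2} \\
\hat{x} &= \frac{\Delta V}{\Delta M} = -\frac{|x|}{1-2p}, \qquad
\hat{U} = \frac{(\Delta M)^{2}}{\Delta V} = U\,(1-2p)^{2}
\end{align}
```

For 0 < p < 1/2 the estimated effect is larger in magnitude than |x| and the estimated rate is smaller than U, which is the direction of bias stated in the text.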

Finally, because U and s are not estimated independently, it is of concern that our lowest-fitness genotype yields an estimate of U that is fourfold higher than any other estimate. If this estimate of U were erroneously high, then the estimate of s in this genotype would be erroneously low. This possibility is especially worrisome because the lowest-fitness genotype plays a large role in determining the relationship between s and log W (Figure 3a). However, when this data point is removed from our data set, the regression of s on log W remains significant (discussed above). Since there is less variance, and no trend, in the estimates of U in the remaining four genotypes, it is unlikely that the lack of independence between estimates of U and s drove our findings.

Measures of epistasis:

Our data demonstrate directly that mutational effects are smaller in low-fitness viral genotypes than in high-fitness genotypes. We can infer from these data that the effect of deleterious mutations decreases with increasing mutation number (the conventional definition of positive epistasis) because genotypes bearing a greater number of deleterious mutations have, by definition, a lower fitness. Thus, our finding of smaller mutational effects in lower-fitness viral genotypes demonstrates that epistasis is positive in φ6. Findings of positive epistasis are not unknown in the literature (Seager and Ayala 1982; Seager et al. 1982; Lenski et al. 1999), but such findings are not common and are balanced by findings of negative epistasis (Mukai 1969; Chao 1988; Whitlock and Bourguet 2000) or no epistasis (Chao 1988; de Visser et al. 1997a,b; Elena and Lenski 1997; Elena 1999; Lenski et al. 1999; Peters and Keightley 2000; Whitlock and Bourguet 2000; Crotty et al. 2001; Wloch et al. 2001).

Narrowing the focus to studies of RNA viruses, it is apparent that our finding of positive epistasis contrasts with the findings of two earlier studies of RNA viruses (Elena 1999; Crotty et al. 2001). Neither earlier study was able to demonstrate a significant effect of epistasis, either positive or negative. To compare the results of all three studies directly, we estimated from our data an equation of the form log Wk = ak + bk², where Wk is the fitness of a virus carrying k mutations, and a and b describe the linear and quadratic effects of mutations, respectively (see materials and methods: Measuring epistasis). In this equation, b is a measure of the strength of epistasis, where b < 0 indicates negative epistasis and b > 0 indicates positive epistasis. The a and b that give the best fit to our data are shown in Table 2, together with estimates of a and b from the Elena (1999) and Crotty et al. (2001) studies.

Table 2. Cross-study comparison of mutational effects

From these data it is clear that our estimate of b is similar in magnitude to estimates from the other studies. Furthermore, our estimate is the smallest of the three, so it is unlikely that we were able to demonstrate epistasis because epistasis in φ6 is greater in magnitude than that in other RNA viruses. Instead, we are left with the explanation that greater statistical power underlies our ability to find epistasis.

Differences in statistical power could have arisen from several sources, and we concentrate on two: sample size and experimental design. First, statistical power is maximized when sample sizes are large. The use of plaque size as a measure of fitness in our experiment allowed an experimental design in which ∼33 mutations were sampled from each of 5 genotypes (∼100 lineages × 0.067 mutations/generation × 5 generations/bottleneck ≅ 33 mutant phage). In the FMDV study (Elena 1999), the relationship between log(fitness) and mutation number was examined in 20 genotypes carrying between 1 and 6 mutations. In the poliovirus study (Crotty et al. 2001), log(fitness) was assayed once each on five pools of thousands of mutant viruses at various levels of mutagenesis (correlated with k). Although our sample size can be argued to be larger than that of either previous study, in neither case is it obvious that small sample size was the primary factor limiting the ability to detect epistasis.
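The sample-size arithmetic quoted above can be checked with a short sketch. The three inputs are the approximate values given in the text. The variable names are illustrative, and the total across genotypes is a simple extrapolation that is not stated in the text.

```python
# Approximate values quoted in the text.
lineages_per_genotype = 100          # ~100 lineages
mutations_per_generation = 0.067     # mutations per generation
generations_per_bottleneck = 5       # generations per bottleneck
genotypes = 5

# Expected number of mutant phage sampled per genotype.
mutants_per_genotype = (
    lineages_per_genotype * mutations_per_generation * generations_per_bottleneck
)

# Extrapolated total across all genotypes (an assumption, not a reported figure).
mutants_total = mutants_per_genotype * genotypes

print(f"per genotype: ~{mutants_per_genotype:.1f}")
print(f"across {genotypes} genotypes: ~{mutants_total:.1f}")
```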

The more important difference between our study and previous studies may be that we examined mutational effects in genotypes that differed in fitness rather than in mutation number. The power to detect epistasis will depend on the variance in mutational effects experienced by a particular class of viruses. It is possible that viruses of a particular fitness exhibit a lower variance in mutational effects than viruses carrying a particular number of mutations. Although no empirical data are available to address this possibility, it is clear that the range of possible mutational effects could be quite different for two genotypes carrying equal numbers of mutations. A genotype carrying one severely deleterious mutation cannot be damaged to the same extent as a genotype carrying one mildly deleterious mutation. In contrast, the range of possible mutational effects would be the same for two genotypes of equal fitness. The classification of genotypes by fitness instead of by mutation number is the most obvious difference between our experimental design and that of the previous studies. It is our opinion that this difference is the major factor underlying our ability to detect epistasis when the previous studies could not.

Epistasis and canalization:

Our results demonstrate that mutant φ6 are less sensitive to mutations and to environmental perturbations than wild-type φ6. This observation contrasts directly with classical observations (mostly in Drosophila) that wild-type organisms generally become more sensitive to environmental and genetic perturbations after acquiring mutations (Waddington 1942; Mather 1953; Tebb and Thoday 1954; Dunn and Fraser 1958; Scharloo 1991). The classical observations were explained by Waddington (1940, 1942), who proposed that developmental mechanisms that buffered phenotypes against environmental and mutational perturbations existed in the wild type and that these mechanisms broke down in mutant individuals. Such developmental buffering is called canalization, and organisms possessing buffering mechanisms are considered to be canalized. From our data, it is clear that wild-type φ6 is not canalized in this sense.

The dissimilarity between our results and the classical results prompted us to consider our results in the context of the canalization literature and to consider more closely the relationship between canalization and epistasis. This relationship is definitional. The existence of canalization implies that some genotypes will be better buffered than others; thus, canalization implies epistasis: mutations will have smaller effects in genotypes that are more canalized than in genotypes that are less canalized. In our case, we found that wild-type φ6 is not canalized, because mutations had larger effects in the wild type than in lower-fitness, mutated genotypes. This finding is equivalent to a finding of positive epistasis in φ6.

Although it is important to note the difference between our data set and others, it may be more important to note the similarities. If we compare our data to the abundant data from Drosophila, we find that in both cases a genotype's sensitivity to mutation and its sensitivity to the environment are correlated (e.g., Waddington 1940; Mather 1953; Tebb and Thoday 1954; Scharloo 1991). This correlation suggests that in both Drosophila and φ6 the developmental bases of genetic and environmental sensitivity are shared. When we combine the definitional relationship between canalization and epistasis with the shared developmental basis of genetic and environmental canalization, it follows that the nature of epistasis should correspond exactly to the consequences of mutations for environmental sensitivity. In fact, data exist to support this suggestion. When environmental sensitivity increases with mutations as in Drosophila (e.g., Waddington 1940; Mather 1953; Tebb and Thoday 1954; Scharloo 1991), epistasis is negative (Mukai 1969; Whitlock and Bourguet 2000); when environmental sensitivity decreases with mutations as in φ6, epistasis is positive.

There is no reason to expect that all viruses will behave like φ6, exhibiting positive epistasis and a lack of canalization. In fact, studies of simulated RNA folding (Ancel and Fontana 2000) and of RNA secondary structure in actual viral genomes (Wagner and Stadler 1999) both suggest that viral genomes can be canalized. Theory suggests two factors that may determine whether a population will evolve to be canalized or not: population size and environmental heterogeneity. Small population size (Krakauer and Plotkin 2002) and environmental heterogeneity (Ancel and Fontana 2000; Gibson and Wagner 2000) are both predicted to select for canalization. In contrast, large population size is predicted to select against canalization (Krakauer and Plotkin 2002), and selection for robustness in homogeneous environments is probably too weak to produce canalization (Wagner et al. 1997). We submit the possibility that different viruses, because of differences in effective population size or in environment, could be either canalized or uncanalized. Our data suggest that the nature of epistasis would differ in such genotypes.


We thank D. Weinreich, O. Tenaillon, A. Poon, C. Dahlberg, and R. Lande for discussions and P. Keightley for providing software for the ML analyses. This work was supported by funding from the Howard Hughes Medical Institute and the National Science Foundation to C.L.B. and by grant R01-GM60916 from the National Institutes of Health to L.C.


  • Communicating editor: D. Begun

  • Received August 8, 2003.
  • Accepted March 4, 2004.

