| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |
Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138
1 Corresponding author: Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Ave., Room 3000, Cambridge, MA 02138.
E-mail: amurray{at}mcb.harvard.edu
| ABSTRACT |
|---|
|
|
|---|
Three methods are commonly employed to measure phenotypic mutation rates: mutation-accumulation assays, mutant accumulation assays, and fluctuation assays. The mutation-accumulation assay involves passing a culture through recurrent bottlenecks, ideally of a single cell/individual, such that all mutations are nearly neutral. This is useful for determining the rate of mutations affecting fitness since repeated bottlenecks will reduce the effect of selection (KIBOTA and LYNCH 1996; ZEYL and DEVISSER 2001). This method works well in multicellular organisms, where the population size can be maintained at the bottleneck; however, in microorganisms, where a visible colony must be allowed to form, selection will still occur between the bottlenecks. Several methods are available for estimating phenotypic mutation rates from mutation-accumulation assays (GARCIA-DORADO and GALLEGO 2003); alternatively, direct sequencing can be used since all mutations occur in the same genome (DENVER et al. 2004; HAAG-LIAUTARD et al. 2007).
In the mutant-accumulation assay, the frequency of a phenotype that is neutral in the environment of the experiment but can be selected for in an alternative environment is monitored in an exponentially growing culture by periodically plating an aliquot of the culture onto selective medium. Once the population reaches a size such that the probability of a new mutation occurring in the next generation is approximately one, the frequency of mutants will increase linearly with time. Thus an accurate estimate of phenotypic mutation rate requires a long intervals between frequency measurements and these experiments typically last for hundreds of generations. Because most of the population has not accumulated the neutral mutation, beneficial mutations will predominantly occur in cells lacking the neutral mutation, thus slowing the accumulation of mutants.
In the fluctuation assay, many parallel cultures are inoculated with a small number of cells, grown under nonselective conditions, and plated to select for mutants (LURIA and DELBRÜCK 1943). The number of mutations that arise in each culture will follow the Poisson distribution; however, the number of mutant cells per culture will vary greatly since early mutations will lead to "jackpots," cultures that contain a great many mutant individuals.
The simplest way to estimate the expected number of mutations that occur in each culture (m) is from the fraction of cultures with zero mutants, which should be e–m. This method (P0) was used by LURIA and DELBRÜCK (1943) in their original article describing the fluctuation assay. The full distribution of mutants per culture (the Luria–Delbrück distribution) can be described by a set of recursive equations (MA et al. 1992). The most accurate method for estimating m (Ma–Sandri–Sarkar maximum likelihood) finds the m that gives the best fit of the Luria–Delbrück distribution to the data (SARKAR et al. 1992; ROSCHE and FOSTER 2000). By simulation, STEWART (1994) calculates 95% confidence intervals for m obtained using this method; however, for the confidence intervals to be meaningful, the data must follow the Luria–Delbrück distribution.
One way to estimate the quality of data is to plot the cumulative distribution of mutant frequencies on a log–log plot; Luria–Delbrück-distributed data presented in this way will produce a straight line with slope –1 (ROSCHE and FOSTER 2000). Deviations from linearity show that the data do not approximate a Luria–Delbrück distribution. This graphical approach ignores jackpots (since they lie far off the line) and cultures with zero mutants (due to the log transformation). In this article we introduce a simulation-based approach to determining the quality of data from fluctuation assays and show how this method can detect deviations from the Luria–Delbrück distribution.
To convert phenotypic mutation rates to a per-base-pair mutation rate we need to estimate the effective target size for phenotypic mutation. Although the concept of effective target size is important in evolutionary theory—it links the mutation rate to a particular phenotype to the mutation rate per genome per generation—it has not been explicitly defined. Previous work has used "target size" to refer to the size of the gene in which mutations are selected (DRAKE 1991) but could also describe the number of mutations within a reporter that are detectable. We propose a more general and probabilistic definition of effective target size that encompasses the relationship between phenotypic and genomic mutation rates. Our definition illustrates where uncertainties in estimates of genomic mutation rate arise and shows how this parameter can be calculated from experimental data. In particular, we distinguish between the effective target size based on the average per-genome mutation rate and the locus-specific effective target size conditioned on a mutation event occurring in a specific region of the genome.
We measured mutation rates to resistance to 5-fluoroorotic acid (5-FOA), canavanine, and
-factor. 5-FOA is nontoxic, but can be converted into toxic 5-fluoro-uracil by the uracil biosynthesis pathway. The product of the URA3 gene catalyzes a key step in this process; therefore, 5-FOA predominantly selects for ura3 loss-of-function mutants. Canavanine is a toxic arginine analog, whose uptake requires the arginine transporter. Canavanine selects for loss-of-function mutants of this transporter, which is encoded by the CAN1 gene.
-Factor is a peptide pheromone secreted by mating-type
(MAT
) cells. Binding of the pheromone to the Ste2 receptor on a MATa cell signals through a MAP-kinase cascade to initiate the mating-response genes and a G1 arrest (DOHLMAN 2002). Wild-type MATa cells secrete a protease, Bar1, which degrades
-factor; deleting BAR1 prevents growth on medium containing
-factor and allows us to measure the rate of resistance to
-factor using the fluctuation assay. There are at least 10 genes whose loss-of-function results in
-factor resistance; therefore, we expect the mutation rate to
-factor resistance to be an order of magnitude higher than the mutation rate to 5-FOA and canavanine resistance. We find the phenotypic mutation rates to 5-FOA, canavanine, and
-factor resistance to be 5.43 x 10–8, 1.52 x 10–7, and 3.07 x 10–6/genome/generation, respectively. Combining our estimates of phenotypic mutation rates and locus-specific effective target sizes, we conclude that the per-base-pair mutation rates at URA3 and CAN1 are 3.80 x 10–10 and 6.44 x 10–10/bp/generation, respectively, suggesting that the mutation rate varies across the yeast genome.
| MATERIALS AND METHODS |
|---|
|
|
|---|
::ADE2, MATa. Yeast were grown in synthetic complete (SC) medium (SHERMAN et al. 1974), SC medium without uracil (SC –Ura), or SC medium with only 1% glucose (SCLG). The cultures that composed the fluctuation assays were plated onto four types of selective media: 1x canavanine [SC medium without arginine (SC –Arg), 60 mg/liter L-canavanine (Sigma-Aldrich, St. Louis)]; 10x canavanine (SC –Arg, 0.6 g/liter L-canavanine); 5-FOA [SC –Ura, 1 g/liter 5-FOA (Sigma-Aldrich)]; and
-factor [yeast extract, peptone, dextrose (YPD), 10 µg/ml
-factor] (Bio-Synthesis, Lewisville, TX). In preparation for plating several spots of mutant cultures on each plate, the plates were overdried by pressing a Whatman filter paper (grade 3, 90 mm) onto the plates, using a replica-plating block and allowing the filter to remain in place for at least 30 min. The filters remove
1 ml of liquid and plates can be used for several days after filters have been removed.
Fluctuation assays:
Fluctuation assays were performed on 10 clones of GIL104 to determine the rate at which cells mutated to become resistant to 5-FOA, 10x canavanine, or
-factor. Media and culture volumes were chosen such that a similar number of mutants would be counted for each phenotype: 200 µl of SC, 100 µl of SC, and 10 µl of SCLG for resistance to 5-FOA, 10x canavanine, and
-factor, respectively.
To begin each fluctuation assay, a single clone was grown overnight to saturation in SC –Ura, diluted 1:10,000 into the appropriate medium, dispensed into 96-well plates, and sealed with an aluminum plate seal to prevent evaporation. This represents initial inocula of
2000, 1000, and 200 cells for the cultures assayed for mutations to 5-FOA, 10x canavanine, and
-factor resistance. Cultures were grown for 2 days at 30° without shaking (only 1 day for the low-glucose cultures, which saturated after 1 day's growth) and resuspended using a Titramax 1000 orbital shaker (Rose Scientific, Cincinnati) prior to plating. Twenty-four cultures were pooled, diluted, and counted in triplicate using a Beckman Coulter particle counter (Beckman Coulter, Fullerton, CA) to determine the average number of cells per culture. The remaining 72 cultures were spot plated onto overdried plates to select for mutants: 200-µl cultures were spotted onto 12 5-FOA plates (six spots per plate); 100-µl cultures were spotted onto eight 10x canavanine plates (nine spots per plate); and 10-µl cultures were brought up to 100 µl with sterile water and spotted onto eight
-factor plates (nine spots per plate). A Tecan Genesis liquid handler (Tecan U.S., Durham, NC) was used to semi-automate spot plating. We assume that the loss of mutant cells due to evaporation, liquid handling, and incomplete plating efficiency is negligible. Under the conditions used, the average numbers of cells per culture were 2.1 x 107, 1.3 x 107, and 3.8 x 105, for 5-FOA, 10x canavanine, and
-factor, respectively.
Plates were allowed to dry overnight at room temperature and then incubated at 30° for 1, 2, or 5 days for
-factor, 10x canavanine, and 5-FOA, respectively, after which time the number of mutants per spot was counted using a dissection microscope. For 10x canavanine and
-factor plates we used a size threshold: colonies <1 mm at 10x magnification for canavanine or 3 mm at 6x magnification for
-factor were presumed to result from mutations that had occurred after the cells were plated and were not counted. The choice of the size cutoff was based on looking for a natural break in the colony-size distribution. However, the size distribution was not bimodal; therefore, it is reasonable to assume that some leaky mutants were excluded. This is clear when we observe jackpots of mutants smaller than our size threshold, which were excluded from the analysis. For this reason, it is important that the strains we sequenced to determine target size were chosen from the plates from the fluctuation assays so that any leaky mutants, which were excluded from the determination of mutation rates, were also excluded from the calculation of target size. Fluctuation assays for resistance to 1x canavanine were performed similarly to those for 10x canavanine except 1x canavanine plates were counted 3 days postplating.
Analysis of fluctuation data:
Fluctuation data were analyzed by the Ma–Sandri–Sarkar maximum-likelihood method in which the data are fitted to a model of the Luria–Delbrück distribution on the basis of a single parameter m, the expected number of mutation events per culture (SARKAR et al. 1992). Mutation rate is calculated from the equation µ = m/N, where N is the average number of cells per culture (approximately equal to the number of cell divisions per culture since the initial inoculum is much smaller than N). Ninety-five percent confidence intervals on m and µ were assigned using Equations 29 and 30 from FOSTER (2006).
The data were also fitted to a two-parameter model that accounts for postplating growth. This model is a Luria–Delbrück distribution overlaid with a Poisson distribution with a rate Nµd = md, where d is the mean number of cell divisions (in which mutants could occur and be detected) in the lineage of cells that were plated on the selective plates; d can be related to the number of generations of growth postplating (g) by d = 2g – 1. The probability distribution for the number of mutants per culture in the two-parameter model is thus the joint distribution of the Luria–Delbrück (parameter m) and the Poisson (parameter n = md); the m's are the same assuming that the mutation rate is the same for the postplating cell divisions.
We used two tests to assess whether the one- or the two-parameter models best fitted the data. Both relied on calculating the probability of recovering the observed data given the model. This probability, Pj, where j is the number of parameters in the model, was calculated as
, where pi is the probability of observing i mutants and N is the number of independent cultures that produce i mutant colonies. The log-likelihood-ratio test calculates
= 2(log(P2) – log(P1)), which is distributed as a
2-distribution with 1 d.f. Akaike's information criterion (AIC) calculates the parameter AIC = 2j – 2(log(Pj)) and asserts that the best fit is the one with the lowest AIC value. The F test is not appropriate for these comparisons since it assumes a linear model with errors that are drawn from the same normal distribution at each data point, both assumptions that are violated by the data generated from fluctuation assays.
Sequencing of ura3 and can1 mutants:
Table 1 lists the primers that were used to amplify and sequence the ura3 and can1 alleles from 5-FOA and 10x canavanine-resistant colonies, respectively. Prior to the isolation of genomic DNA, 5-FOA and 10x canavanine-resistant colonies were restreaked on selective medium.
|
| RESULTS |
|---|
|
|
|---|
Performing fluctuation assays:
One way to increase the accuracy of mutation rate estimates from fluctuation assays is to increase the number of cultures (STEWART 1994). Typically, fluctuation assays are performed in test tubes; however, to increase the throughput, we perform the assays in 96-well plates. We plate 72 of the cultures to selective medium to determine the number of mutants per culture; the remaining 24 are used to determine the average number of cells per culture (see MATERIALS AND METHODS). Using the 96-well format we can vary the culture volume from 10 to 200 µl and can measure mutation rates over two orders of magnitude (Table 2).
|
The combination of spot plating and the 96-well format allow for automation of the fluctuation assay. To demonstrate this, we semi-automated the process using a liquid handler; this enabled us to perform all fluctuation assays described here—the equivalent of three 720-tube fluctuation assays—in parallel.
Analyzing fluctuation data and postplating growth on 1x canavanine:
There are many methods for calculating mutation rates from fluctuation data (FOSTER 2006), of which the Ma–Sandri–Sarkar maximum-likelihood method is preferred because it is the most accurate, it is valid for any range of the expected number of mutation events per culture (m), and 95% confidence intervals can be calculated by an empirically determined set of equations (STEWART 1994; ROSCHE and FOSTER 2000). For estimates of mutation rates and 95% confidence intervals generated from this method to be accurate the data must approximate the Luria–Delbrück distribution.
We tested this approximation by using the Ma–Sandri–Sarkar maximum-likelihood method to estimate m and then plotting the predicted cumulative frequency distribution of mutants against the experimental data. Fluctuation assays on 5-FOA produced close agreement between predicted and observed distributions (Figure 1). In contrast, assays on 1x canavanine and
-factor produced data that deviate significantly from the Luria–Delbrück distribution. Compared to the expected distribution, cultures with a small number of mutants are underrepresented and cultures with many mutants are overrepresented in the 1x canavanine experiment (Figure 1, one-parameter model). This deviation can be explained as the combination of a Luria–Delbrück distribution and a Poisson distribution.
|
We quantified the improvement of the fit by two methods. The first is to compare the best log likelihood calculated under one- and two-parameter models. As long as a large number of cultures are used, twice the difference between the best one- and two-parameter scores [
= 2(log(P2 Param) – log(P1 Param)] is distributed as
2 with 1 d.f. This means that if the best two-parameter score is >1.92 log units better than the best one-parameters score the probability is >0.95 that the two-parameter model is a better fit to the data. The second method was to use the AIC, which uses both the log likelihood and the number of parameters in calculating a score to find the model that best fits the data. Both are described in more detail in MATERIALS AND METHODS.
For fluctuation assays on 1x canavanine, both the log-likelihood-ratio test and the AIC indicate that the data are best fitted by the two-parameter model. For fluctuation assays on 5-FOA, there is no improvement of fit using the two-parameter model. To minimize postplating mutation we increased the canavanine concentration 10-fold and counted the plates 1 day earlier. The data from 10x canavanine more closely approximate the Luria–Delbrück distribution (Figure 1). Although the two-parameter model still gives a slightly better fit, according to both the log-likelihood-ratio test and the AIC, the data are best fitted by the one-parameter model. For the fluctuation assay on
-factor, the data are best fitted by the two-parameter model; however, both models fail to capture all features of this distribution (Figure 1).
Phenotypic mutation rates:
Fluctuation assays were performed to determine mutation rates to
-factor, 10x canavanine, and 5-FOA resistance for 10 isogenic clones of a strain from the W303 background (GIL104); the data were analyzed using the one-parameter and two-parameter models (Tables 2 and 3, respectively). For each assay, the log-likelihood-ratio test and the AIC were applied to determine which model best fitted the data (Table 3). All fluctuation assays on
-factor are best described by the two-parameter model; whereas, all fluctuation assays on 5-FOA are best described by the Luria–Delbrück distribution (the one-parameter model). For 10x canavanine, five fluctuation assays according to the log-likelihood-ratio test, or six according to the AIC, are best fitted by the two-parameter model.
|
-factor, 10x canavanine, and 5-FOA resistance to be 3.07 x 10–6, 1.52 x 10–7, and 5.43 x 10–8, respectively. For 5-FOA resistance, the data are best described by the one-parameter model (d = 0 for the two-parameter model, meaning that postplating growth and mutation does not occur); therefore, we can use Equations 29 and 30 of FOSTER (2006) to assign a 95% confidence interval to our estimate of mutation rate. This yields a confidence interval of 4.97–5.91 x 10–8 per generation (Table 2). For the two-parameter model we determined confidence intervals by simulation. For each combined 720-culture fluctuation assay we determined the most-likely values for m and d, given the data. To gauge the expected variation in these parameters, we simulated 1000 fluctuation assays by sampling the combined Luria–Delbrück/Poisson distribution using parameters determined from the data. We take the 95% confidence intervals for our estimate of m to be the values of m that encompass 95% of the simulated experiments. From this we calculate the 95% confidence intervals on the two-parameter model to be 2.65–3.62 x 10–6, 1.34–1.71 x 10–7, and 4.78–5.87 x 10–8 for
-factor, 10x canavanine, and 5-FOA resistance, respectively.
Mutational spectra and target size:
Mutational spectra:
We wanted to convert our phenotypic mutation rates to per-base-pair mutation rates. The material for this conversion is the mutational spectra; from the fluctuation assays we sequenced 237 ura3 alleles and 227 can1 alleles from 5-FOA- and 10x canavanine-resistant strains, respectively (the identity of all 464 mutant alleles is available in the supplemental information at http://www.genetics.org/supplemental/). Thirty 5-FOA-resistant mutants contain wild-type URA3 alleles; 29 of these mutants are uracil prototrophs. It has been reported that mutations in FUR1 can confer this phenotype; however, we failed to find any mutations within the coding sequence of this gene for any of the 29 5-FOA-resistant uracil prototrophs (data not shown). None of the 207 ura3 mutants are prototrophic, and each contains a single mutation or two mutations within a few nucleotides. There are 167 bp substitutions (64 nonsense and 103 missense), 22 single-bp deletions, 3 2-bp deletions, 3 single-bp insertions, one 3-bp insertion, two large tandem duplications, and nine double mutations (Figure 2). All 227 10x canavanine-resistant mutants contain a single mutation or closely adjacent mutations at the CAN1 locus. We find 150 bp substitutions (70 nonsense and 80 missense), 55 single-bp deletions, 8 single-bp insertions, one 2-bp insertion, 10 double mutations, and 3 complex mutations including a can1 allele containing a 27-bp deletion and a 30-bp tandem duplication (Figure 3).
|
|
Definition of effective target size:
We define a target size to perform two calculations: determining the genomic mutation rate given experiments that measure the rate of mutation at a particular locus and predicting the rate at which a particular phenotype arises given a genomic mutation rate. This requires that we deal with two problems: variation in the base pair substitution rate across the genome and differences between base pairs in the probability that a substitution at that base produces a mutant phenotype. Allowing the target size to depend on the location of the gene seems counterintuitive, but is essential if we want to consider the effects of rearrangements (man made or evolutionary) that alter the location of the gene. We deal with the different substitutions at the same base pair by defining the target size to be the number of base pairs that would account for the observed mutation rate if every substitution at each of these bases produced the mutant phenotype. As an example, consider the target size for nonsense mutations in a gene with 150 bp at which a single substitution could produce a nonsense codon and 24 bp (such as the C:G base pair in a UCA codon) at which two different substitutions both yield nonsense codons. The target size for nonsense mutations would be 150/3 + 24 · (2/3) = 66 bp, meaning the gene behaves as if it had 66 bp at which any substitution would produce a nonsense codon and that we could derive the genomic mutation rate by dividing the observed rate of nonsense mutations in this gene by 66.
There is no neat formula for the fraction of substitutions that give missense mutations, and each gene will have a different distribution of base pairs where zero, one, two, or all three possible substitutions produce mutant phenotypes. We deal with this complexity by estimating the number of possible base pair substitutions that give rise to mutant phenotypes and dividing by three, to produce a target size that assumes that for any base pair in the target, all three substitutions produce the mutant phenotype.
We define effective target size as the size of the genome, G, multiplied by the probability that introducing a single genomic mutation (this could be a base pair substitution, insertion/deletion, transposition, etc.) will result in the phenotype of interest:
![]() |
![]() |
![]() |
![]() |
): the mutation rate to a given phenotype (
for the mutation rate to canavanine resistance) divided by the target size for such mutations (
) and the overall, genomewide mutation rate (
) divided by the length of the genome (G),
![]() |
is the mutation rate to canavanine resistance,
is the genomewide average mutation rate per base pair per generation, and
is the mutation rate per genome per generation. Similarly, the average mutation rate per base pair per generation at the CAN1 locus (
) is the mutation rate to canvanine resistance (
) divided by the locus-specific, effective target size for mutations to canavanine resistance at CAN1,
. The locus-specific target size and mutation rate,
and
, are related to the genomewide target size and mutation rate,
and
, through the parameter
CAN1, which is the ratio of the mutation rate at the CAN1 locus compared to the genomewide average;
= 1 identifies loci where the mutation rate equals the genomic average, loci where
< 1 are coldspots, and those where
> 1 are hotspots:
![]() |
![]() |
Calculation of effective target size and the per-base-pair mutation rate:
The effective target size to canavanine resistance,
, is difficult to determine experimentally; however, from mutational spectra we can determine the locus-specific effective target size to canavanine resistance conditioned on a mutation at the CAN1 locus,
. Table 4 provides a summary of the calculations of this target size and the per-base-pair mutation rate at CAN1 and URA3. To calculate
, we first rewrite it as
![]() |
and
are the frequency with which base pair substitutions and insertion, deletion, or other DNA rearrangements (which we collectively refer to as indels) occur. Assuming that all indels in CAN1 result in loss of function,
is 1773 bp. To determine
we separate the observed base pair substitutions into nonsense and missense (70 and 80, respectively). CAN1 contains 226 possible nonsense substitutions, 54 of which we found (as expected, some mutations were identified multiple times). The 80 missense mutations represent 63 unique substitutions. Assuming that we identified the same proportion of detectable missense and nonsense mutations, we can calculate the number of missense mutations conferring canavanine resistance as 63(226/54) = 264. Since we want our effective target size to be the number of base pairs at which any substitution would produce a mutation and there are three possible substitutions at each base, the locus-specific effective target size for canavanine resistance at the CAN1 locus by way of missense and nonsense mutations is 264/3 = 88 bp and 226/3 = 75 bp, respectively. From the CAN1 sequence, we know the location of every possible nonsense mutation. For the missense mutations we know that there are 264 possible mutations; however, our method is blind to the locations of the mutations other than those identified in our mutational spectra. A locus-specific effective target size for missense mutations of 88 bp could represent 88 positions where any of the three possible substitutions causes a phenotype, 264 positions where only 1 of 3 substitutions causes a phenotype, or something in between.
|
![]() |
![]() |
66%) and 77/226 (
34%), respectively, but only 9% of base pair substitutions result in canavanine resistance. Thus the fraction of mutations that are substitutions,
, is actually 0.95
and those that are indels,
, is only 0.05. Using these values, we estimate the locus-specific effective target size to canavanine resistance at the CAN1 locus to be
![]() |
![]() |
Similar calculations for URA3 show that
![]() |
![]() |
Taking into account that only 207 of the 237 5-FOA-resistant mutants sequenced were ura3 mutants, the rate of mutation to 5-FOA resistance at URA3 is 4.75 x 10–8/cell/generation. Thus we calculate
![]() |
![]() |
| DISCUSSION |
|---|
|
|
|---|
Analyzing fluctuation data:
The Ma–Sandri–Sarkar maximum-likelihood method is the most accurate method for estimating the expected number of mutants per culture (m) from fluctuation data; however, this method assumes that the data follow the Luria–Delbrück distribution. We have shown that postplating proliferation and mutation of canavanine-sensitive cells on 1x canavanine plates can be detected since it produces a deviation from the expected Luria–Delbrück distribution. If the data are not corrected, this leads to an overestimation of the mutation rate. One can correct for this by fitting the data to a two-parameter model that accounts for postplating growth or largely eliminate it by increasing the concentration of canavanine. Other processes that introduce error into mutation rate estimates such as differential growth rates between mutants and nonmutants (ZHENG 2005) and poor plating efficiency (STEWART et al. 1990; STEWART 1991) will also produce deviations from the expected Luria–Delbrück distribution. Therefore, we suggest that fitting fluctuation data to the cumulative distribution and comparing the sum-of-squares differences with simulated data should be used as a general method for assaying the quality of data resulting from fluctuation assays.
We can assign significance to deviations from the Luria–Delbrück distribution by simulation. Data from the 1x canavanine fluctuation assay (Figure 1) give a maximum-likelihood value of m = 4.80. We calculated the sum-of-squares differences between the cumulative distribution of the data and the Luria–Delbrück distribution with m = 4.80. To determine the expected sum-of-squares differences, we simulated 10,000 72-tube fluctuation assays by sampling from the Luria–Delbrück distribution with m = 4.80 and calculated the sum-of-squares differences for each simulated experiment. We find that only 3.5% of the simulated experiments have a poorer fit to the Luria–Delbrück distribution than the observed 1x canavanine data compared to 30 and 41% for 10x canavanine and 5-FOA, respectively.
Mutational spectra:
We sequenced 237 5-FOA-resistant ura3 alleles and 227 10x canavanine-resistant can1 alleles to determine the locus-specific effective target size for phenotypic mutations. From these data sets we can garner additional information regarding the mutagenic processes leading to loss of function at URA3 and CAN1. Nonsense mutations represent a larger fraction of base pair substitutions in the can1 data set (47% vs. 38%). This indicates that a larger fraction of missense mutations cause loss of function for URA3 (10.9% vs. 6.8% as calculated by dividing the number of possible loss-of-function missense mutations by the number of possible missense mutations). This difference is reflected in our calculation of locus-specific effective target size where, although the coding sequence of CAN1 is 2.2 times larger, the effective target size for loss of function by way of base pair substitutions is only 1.6 times larger. Loss-of-function mutations in our mutational spectra are overrepresented at conserved residues (P = 1.5 x 10–5, Wilcoxon rank sum, A. SINGHAL and A. SEGRE, data not shown). In our compiled URA3 and CAN1 mutational spectra we identified 88 single-bp insertions/deletions in which deletions were overrepresented by 7:1 (P < 0.001, chi square). Of the base pair substitutions, 206 were transversions and 111 were transitions (Table 5). This is consistent (P > 0.25, chi square) with the 2:1 ratio of transversions to transitions we expect if all substitutions are equally probable.
|
In addition, we can test for mutational hotspots/coldspots by asking if the number of times we found a given base pair substitution deviates from what we would expect from binomial sampling. For URA3 the numbers of mutations we identified zero, one, two, three, or four times are 206, 71, 18, 12, and 6. These deviate significantly from the expectation of binomial sampling (184, 98, 26, 5, and 1; P < 0.01, chi square). Similarly, for CAN1 the numbers of mutations we identified zero, one, two, three, or four times are 373, 91, 20, 5, and 1, which deviate significantly from binomial sampling (360, 111, 17, 2, and 0; P < 0.05, chi square). Therefore, although we do not see regional biases in our mutational spectra we do find particular substitutions to be over/underrepresented, possibly reflecting biases due to local sequence context effects. The variation we find in the yeast URA3 and CAN1 genes is significantly less than the degree of variation seen across the LacI gene in Escherichia coli (MILLER et al. 1977).
We found 20 instances of multiple mutation events occurring in the same strain. One can1 allele contains a 27-bp deletion and a 30-bp imperfect duplication separated by 312 wild-type bp. The remaining 20 were multiple mutation events occurring within a few nucleotides of each other, 9 in ura3 and 11 in can1 (Table 6). In one case the same complex mutation, a double deletion and base pair substitution, was found in two can1 strains that were adjacent during much of the processing (restreaking, genomic DNA preparation, PCR, and sequencing); therefore, this may represent a single event that was inadvertently sampled twice. Half of the multiple mutation events are interspersed with one or more bases of wild-type sequence; therefore, multiple mutation events must have occurred. These events may represent instances where lesion bypass has occurred and the multiple mutations result from decreased fidelity of translesion polymerases. The translesion polymerase Pol
can efficiently extend unpaired primer termini resulting from incorporation opposite a lesion and it is thought that up to half of all spontaneous mutations occur in a Pol
-dependent manner. (RATTRAY and STRATHERN 2003; PRAKASH et al. 2005).
|
) to phenotypic mutation as a way to link the mutation rate per genome per generation to the measurable phenotypic mutation rate. We have defined effective target size as
![]() |
). Intuitively this means that if one considers only base pair substitutions, the CAN1 gene is effectively 163 bp where any base pair substitution will result in canavanine resistance. This value is then used to calculate the locus-specific effective target size to canavanine resistance by way of any mutation (
), meaning that CAN1 is effectively 236 bp where any mutation will result in canavanine resistance. To calculate
one needs to weight the effective target sizes for canavanine resistance by way of each particular class of mutation by the frequency with which that mutation occurs. The effective target sizes that are calculated are valid only as long as the frequencies of particular classes of mutation are conserved and, therefore, are likely to vary between strain backgrounds and growth conditions. Varying the selective medium may alter the fraction of missense mutations; for example, some ura3 mutants will form colonies at low concentrations of 5-fluoroorotic acid but not at high ones. Therefore, we determined the effective target sizes by sequencing mutant ura3 and can1 alleles from the same plates that were used for the fluctuation assays.
Genes of similar lengths may have very different effective target sizes. Mutational hotspots such as microsatellite sequences and polynucleotide runs will increase the effective target size by increasing the local rate of frameshift mutations. Mutator alleles not only increase the mutation rate, but also influence the effective target size by altering the mutational spectra. In addition, since mutation rates are believed to vary across the genome (ITO-HARASHIMA et al. 2002; HAWK et al. 2005), the effective target size may change if a gene is moved to a different location. In the context of this experiment, if CAN1 were moved to a location in the genome where the mutation rate is twofold higher than at its endogenous locus, the target size will be twice as large, since, given the definition of effective target size, moving the gene doubles the probability of a mutation resulting in canavanine resistance given a single mutation occurring anywhere in the genome. In our notation, we do not explicitly state that the CAN1 gene is at an endogenous location. We do, however, indicate that our estimate of locus-specific effective target size is conditioned upon a mutation within the 1773-bp region of the CAN1 coding sequence. We specify this with the superscript CAN1 (
) to distinguish this locus-specific effective target size from the effective target size for a mutation occurring anywhere within the genome (
).
and
are related by the hotspot parameter
,
![]() |
Target size for mutations conferring resistance to
-factor:
If we take
to be the average of the per-base-pair mutation rates at CAN1 and URA3, 5.12 x 10–10, we can estimate the effective target size for mutation to
-factor resistance as
![]() |
-factor resistance is 41.5 kbp. Summing over the lengths of known targets (STE2, STE4, STE5, STE7, STE11, STE12, STE20, STE50, FAR1, and FUS1) accounts for only 18.6 kbp. There are four possible explanations for this inconsistency. There could be unidentified genes whose inactivation results in
-factor resistance; however, given the degree to which the mating pathway has been studied, it is unlikely that enough components remain unidentified to account for this difference. It could be that many more loss-of-function missense mutations are possible for signaling proteins than for enzymes or transporters. The third possibility is that a change of the mating-type locus from MATa to MAT
provides an additional class of mutation to
-factor resistance. Although the strain used in this study is heterothallic, a spontaneous double-strand break at the MAT locus can be repaired off of the silent HML
cassette, resulting in mating-type switching and
-factor resistance. The rate of mating-type switching in heterothallic yeast is estimated to be between 10–7 and 10–6 (KLEIN and WINTERSBERGER 1988). An additional possibility is that CAN1 and URA3 are located in regions that are coldspots compared to the genomewide average mutation rate (or one or more of the genes involved in
-factor resistance could be located in a mutational hotspot; however, since the target for
-factor resistance is spread over the genome, it likely averages over local hotspots and coldspots). Asserting that the known genes that can mutate to confer
-factor resistance had an average
of 1 predicts
CAN1 = 0.55 and
URA3 = 0.32.
Mutation rate per base pair per generation:
We measured phenotypic mutation rates and, from the same experiments, the locus-specific effective target sizes to phenotypic mutation. Our results indicate that the per-base-pair mutation rate at URA3 and CAN1 is
and
, respectively. DRAKE (1991) obtains similar values, but his method differs slightly. He also utilizes fluctuation assays and mutational spectra; however, rather than calculate the effective target size to phenotypic mutation, Drake estimates the number of base pair substitutions that occurred in the coding sequence as 64/3 times the number of nonsense mutations detected, ignoring missense mutations detected in the mutational spectra. He then calculates a correction factor (the inverse of the detection frequency) to scale the mutation rate and then divides the corrected mutation rate by the size of the open reading frame.
In principle Drake's method and ours should yield similar values for the per-base-pair mutation rate. Analyzing our data using the Drake method yields estimates of
and
, respectively. Drake converts the per-base-pair mutation rate to a per-genome mutation rate by scaling to the size of the genome. Since these estimates of the per-base-pair mutation rate are specific for particular loci, scaling up is accurate only if mutation rate is uniform across the genome. Several experiments suggest that mutation rate varies across the genome by at least an order of magnitude (ITO-HARASHIMA et al. 2002; HAWK et al. 2005). On a genomic scale, URA3 and CAN1 are relatively close (83 kb apart on the left arm of chromosome V) yet our two-point estimates of the per-base-pair mutation rate differ by a factor of 1.7.
To determine if this difference in mutation rate is significant, we need to place confidence limits on our per-base-pair mutation rate estimates. We showed that the 95% confidence intervals for our estimates of phenotypic mutation rate are 1.34–1.71 x 10–7, and 4.78–5.87 x 10–8 for resistance to 10x canavanine and 5-FOA, respectively. Since only 207 of 237 5-FOA-resistant mutations are URA3 mutants, the 95% confidence interval for the rate of loss of function of URA3 is 4.17–5.13 x 10–8.
We used a bootstrapping method to generate 95% confidence intervals around our estimates of the effective target size by discarding 25% of our mutational spectra data and recalculating the effective target sizes. This process was iterated 10,000 times for both
and
. The ranges of these distributions are 99.90–162.82 and 190.64–285.60 for
and
. Excluding the extreme 2.5% at either end of the distribution, 95% of the values lie between 109.17 and 140.81 for
and between 207.16 and 257.73 for
. To place conservative confidence limits on the per-base-pair mutation rate estimates we took the lower bound for phenotypic mutation rate divided by the upper bound for the effective target size and vice versa. For example, the lower bound on
is 4.17 x 10–8/140.81 = 2.96 x 10–10. Performing this for both URA3 and CAN1 yields nonoverlapping confidence intervals of 2.96–4.70 x 10–10 and 5.21–8.24 x 10–10/bp/generation for
and
, respectively. It is possible this observed difference reflects a difference in the ability to detect mutations at these two loci and not an underlying difference in the per-base-pair mutation rate. However, differences in the plating efficiency or phenotypic lag, which would alter the ability to detect mutants, are expected to produce deviations from the Luria–Delbrück distribution, which is not observed in our data (Table 3).
In this article we have shown that the per-base-pair mutation rate varies on two length scales: between different positions within the CAN1 and URA3 genes and between the genes themselves. It is possible that these two observations are related; however, since we measure forward mutation rates over large targets, we likely average over the local sequence effects. Therefore, the difference in the per-base-pair mutation rate we observe between CAN1 and URA3 is most likely due to a mutation rate variation on a larger scale. ITO-HARASHIMA et al. (2002) find that the frequencies of ochre suppressor mutations, detected at eight identical tRNA-Tyr alleles, vary by a factor of
20. HAWK et al. (2005) show that the rate of microsatellite frameshift mutations varies 16-fold across the genome, due in part to variation in the efficiency of mismatch repair. Consistent with this, we show that the per-base-pair per-generation spontaneous mutation rate is nonuniform across the genome and varies
2-fold between two reporters, 83 kb apart, on the left arm of chromosome V.
| ACKNOWLEDGEMENTS |
|---|
|
|
|---|
| LITERATURE CITED |
|---|
|
|
|---|
AKAIKE, H., 1974 A new look at the statistical model identification. IEEE Trans. Automatic Control 16: 716–723.
DENVER, D. R., K. MORRIS, M. LYNCH and W. K. THOMAS, 2004 High mutation rate and predominance of insertions in the Caenorhabditis elegans nuclear genome. Nature 430: 679–682.[CrossRef][Medline]
DOHLMAN, H. G., 2002 G proteins and pheromone signaling. Annu. Rev. Physiol. 64: 129–152.[CrossRef][Medline]
DRAKE, J. W., 1991 A constant rate of spontaneous mutation in DNA-based microbes. Proc. Natl. Acad. Sci. USA 88: 7160–7164.
FOSTER, P. L., 2006 Methods for determining spontaneous mutation rates. Methods Enzymol. 409: 195–213.[Medline]
GARCIA-DORADO, A., and A. GALLEGO, 2003 Comparing analysis methods for mutation-accumulation data: a simulation study. Genetics 164: 807–819.
HAAG-LIAUTARD, C., M. DORRIS, X. MASIDE, S. MACASKILL, D. L. HALLIGAN et al., 2007 Direct estimation of per nucleotide and genomic deleterious mutation rates in Drosophila. Nature 445: 82–85.[CrossRef][Medline]
HAWK, J. D., L. STEFANOVIC, J. C. BOYER, T. D. PETES and R. A. FARBER, 2005 Variation in efficiency of DNA mismatch repair at different sites in the yeast genome. Proc. Natl. Acad. Sci. USA 102: 8639–8643.
ITO-HARASHIMA, S., P. E. HARTZOG, H. SINHA and J. H. MCCUSKER, 2002 The tRNA-Tyr gene family of Saccharomyces cerevisiae: agents of phenotypic variation and position effects on mutation frequency. Genetics 161: 1395–1410.
KIBOTA, T. T., and M. LYNCH, 1996 Estimate of the genomic mutation rate deleterious to overall fitness in E. coli. Nature 381: 694–696.[CrossRef][Medline]
KLEIN, F., and U. WINTERSBERGER, 1988 Determination of mating-type conversion rates of heterothallic Saccharomyces cerevisiae with the fluctuation test. Curr. Genet. 14: 355–362.[CrossRef]
LURIA, S., and M. DELBRÜCK, 1943 Mutations of bacteria from virus sensitivity to virus resistance. Genetics 28: 491–511.
MA, W. T., G. H. SANDRI and S. SARKAR, 1992 Analysis of the Luria-Delbrück distribution using discrete convolution powers. J. Appl. Probab. 29: 255–267.[CrossRef]
MILLER, J. H., D. GANEM, P. LU and A. SCHMITZ, 1977 Genetic studies of the lac repressor. I. Correlation of mutational sites with specific amino acid residues: construction of a colinear gene-protein map. J. Mol. Biol. 109: 275–298.[CrossRef][Medline]
PRAKASH, S., R. E. JOHNSON and L. PRAKASH, 2005 Eukaryotic translesion synthesis DNA polymerases: specificity of structure and function. Annu. Rev. Biochem. 74: 317–353.[CrossRef][Medline]
RATTRAY, A. J., and J. N. STRATHERN, 2003 Error-prone DNA polymerases: when making a mistake is the only way to get ahead. Annu. Rev. Genet. 37: 31–66.[CrossRef][Medline]
ROSCHE, W. A., and P. L. FOSTER, 2000 Determining mutation rates in bacterial populations. Methods 20: 4–17.[CrossRef][Medline]
SARKAR, S., W. T. MA and G. H. SANDRI, 1992 On fluctuation analysis: a new, simple and efficient method for computing the expected number of mutants. Genetica 85: 173–179.[CrossRef][Medline]
SHERMAN, F., G. FINK and C. LAWRENCE, 1974 Methods in Yeast Genetics. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
STEWART, F. M., 1991 Fluctuation analysis: the effect of plating efficiency. Genetica 84: 51–55.[CrossRef][Medline]
STEWART, F. M., 1994 Fluctuation tests: How reliable are the estimates of mutation rates? Genetics 137: 1139–1146.[Abstract]
STEWART, F. M., D. M. GORDON and B. R. LEVIN, 1990 Fluctuation analysis: the probability distribution of the number of mutants under different conditions. Genetics 124: 175–185.[Abstract]
ZEYL, C., and J. A. DEVISSER, 2001 Estimates of the rate and distribution of fitness effects of spontaneous mutation in Saccharomyces cerevisiae. Genetics 157: 53–61.
ZHENG, Q., 2005 New algorithms for Luria-Delbrück fluctuation analysis. Math. Biosci. 196: 198–214.[CrossRef][Medline]
Communicating editor: J. LAWRENCERelated articles in Genetics:
| ||||||||||||||||||||||||||