Abstract
When the DNA polymerase that replicates the Escherichia coli chromosome, DNA polymerase III, makes an error, there are two primary defenses against mutation: proofreading by the ϵ subunit of the holoenzyme and mismatch repair. In proofreading-deficient strains, mismatch repair is partially saturated and the cell’s response to DNA damage, the SOS response, may be partially induced. To investigate the nature of replication errors, we used mutation accumulation experiments and whole-genome sequencing to determine mutation rates and mutational spectra across the entire chromosome of strains deficient in proofreading, mismatch repair, and the SOS response. We report that a proofreading-deficient strain has a mutation rate 4000-fold greater than wild-type strains. While the SOS response may be induced in these cells, it does not contribute to the mutational load. Inactivating mismatch repair in a proofreading-deficient strain increases the mutation rate another 1.5-fold. DNA polymerase has a bias for converting G:C to A:T base pairs, but proofreading reduces the impact of these mutations, helping to maintain the genomic G:C content. These findings give an unprecedented view of how polymerase and error-correction pathways work together to maintain E. coli’s low mutation rate of 1 per 1000 generations.
Accurate mutation rates have recently been determined for a variety of wild-type and mutant strains of Escherichia coli using mutation accumulation (MA) experiments coupled with whole-genome sequencing (WGS). Such experiments revealed that, at least in a laboratory setting, few DNA repair pathways are essential for maintaining E. coli’s low mutation rate of 1 mutation per 10³ generations (Lee et al. 2012; Foster et al. 2015). Of 11 E. coli strains each defective in a major DNA repair pathway, only those unable to repair oxidative damage showed a substantial increase in spontaneous mutation rates (Foster et al. 2015). Thus, the major determinants of replication accuracy are the intrinsic fidelity of DNA replication, replication proofreading, and postreplication mismatch repair (MMR).
E. coli’s replicative DNA polymerase, polymerase III (Pol III), is a multiprotein machine. As measured in vitro, the polymerase subunit, α (encoded by the dnaE gene), has an intrinsic error rate of one per 10⁴–10⁵ nucleotides incorporated (Bloom et al. 1997). The major determinant of this accuracy is a restrictive active site that sterically prevents most mismatches (Johnson 2010). The 3′ to 5′ exonuclease of the proofreading subunit of Pol III, ϵ (encoded by the dnaQ gene), improves accuracy by removing mismatched nucleotides, allowing polymerase to resynthesize the DNA. In vitro, proofreading improves the accuracy of DNA synthesis 10- to 100-fold (Bloom et al. 1997). Based on the mutation rates of reporter genes, estimates of proofreader’s contribution to replication accuracy in vivo have ranged from 10²- to 10⁵-fold (Fowler et al. 1974; Cox and Horner 1982; Schaaper 1988, 1993; Nowosielska et al. 2004). Using an MA protocol, Tsuru et al. (2015) reported that proofreading improved accuracy only 25-fold. However, the E. coli strain used in that study carried a deletion of the dnaQ gene, and such strains rapidly accumulate suppressor mutations in the dnaE gene, some of which lower the mutation rate (Lancy et al. 1989; Fijalkowska and Schaaper 1995).
To obtain an accurate estimate of the intrinsic error rate of DNA Pol III in vivo, proofreading must be eliminated. However, in addition to its proofreading functions, ϵ is an important structural component of the core polymerase and its loss causes severe growth defects. Partial function alleles of dnaQ can overcome this problem and allow the contribution of proofreading to the overall mutation rate to be evaluated (Cox and Horner 1982; Taft-Benz and Schaaper 1998). For the study reported here, we used the mutD5 allele of dnaQ, which reduces the exonuclease activity by 98% while maintaining the core polymerase structure (Fijalkowska and Schaaper 1996; Taft-Benz and Schaaper 1998; Perrino et al. 1999). The mutational phenotypes of the mutD5 allele have been extensively investigated using reporter gene assays (Fowler et al. 1974; Cox and Horner 1982; Schaaper 1988, 1993). Here, we extend this work to the entire chromosome by using an MA protocol followed by WGS.
Several factors complicate the mutational analysis of mutD5 mutant strains. Strains carrying certain mutant dnaQ alleles are induced to various degrees for the SOS response (Slater et al. 1994; Gautam et al. 2012; Whatley and Kreuzer 2015), which could alter the mutational profile. In addition, the mutator phenotype of mutD5 mutant strains is medium-dependent; mutation rates are 10- to 1000-fold higher when mutD5 mutant strains are grown on rich medium rather than on minimal medium (Cox and Horner 1982; Schaaper 1988). Finally, as mentioned above, suppressor mutations may arise that could alter the mutational profile.
To obtain a better estimate of the intrinsic error rate of DNA Pol III and a more complete understanding of the role of ϵ in replication fidelity, we used the MA/WGS approach to analyze the mutation rates and mutational spectra of a mutD5 mutant strain and a mutD5 mutant strain also defective in MMR. We evaluated the impact of growth on rich vs. minimal media. In addition, we show that the SOS-induced error-prone polymerases do not contribute to the mutation rate or spectra of E. coli strains carrying the mutD5 allele.
Materials and Methods
Bacterial strains and media
All strains used in this study, the methods of their construction, and the media used are given in Supplemental Material, Table S1. Genetic constructions were confirmed by PCR analyses using the oligonucleotides listed in Table S2. Further details are in the supplemental materials and methods.
Estimation of mutation rate from fluctuation assays
Mutation rates were determined as described (Foster 2006; Hall et al. 2009), using mutation to nalidixic acid resistance (NalR) as the reporter.
MA experiments
The MA procedure has been described previously (Lee et al. 2012; Foster et al. 2015). Generations were estimated from the colony diameters as previously described (Lee et al. 2012). More details are given in the supplemental materials and methods.
With these highly mutating strains, several precautions were taken to minimize the occurrence of mutations before or during the MA procedure that might modify the mutation rates or spectra. MA lines were initiated from at least two founders so that lines derived from founders that, after sequencing, proved to carry known mutator or antimutator mutations could be eliminated. The MA procedure was restricted to three to six passages to minimize selection. After sequencing, any MA lines that had known mutators or antimutators, or had mutation rates > 2 SD above or below the mean, were eliminated.
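The final screening step described above, removing lines whose mutation rates fall more than 2 SD from the mean, can be sketched as follows. This is only an illustration: the function name, the use of the sample SD, and a single (non-iterative) pass are assumptions, since the text does not specify these details.

```python
from statistics import mean, stdev

def filter_lines_by_rate(rates, n_sd=2.0):
    """Keep MA lines whose mutation rate lies within n_sd SDs of the mean.

    rates: dict mapping a line name to its mutation rate.
    Assumption: the sample SD is used and the filter is applied once.
    """
    values = list(rates.values())
    m = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    return {line: r for line, r in rates.items() if abs(r - m) <= n_sd * sd}
```

Lines carrying known mutator or antimutator mutations would be removed separately, on the basis of the sequence data rather than the rate.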
Genomic DNA preparation, library construction, sequencing, and SNP and insertion/deletion calling
Genomic DNA (gDNA) was isolated from an aliquot of an overnight culture (in rich or minimal medium as appropriate) inoculated from freezer stocks made after the last passage of each MA line. That the constructed deletions were present in each MA line was confirmed using diagnostic PCR of the gDNA before library construction; the oligonucleotides used are listed in Table S2. Library construction, sequencing, SNP and insertion/deletion (indel) calling, and mutation annotation are described in the supplemental materials and methods.
Some MA lines were eliminated because of poor sequence coverage. Identical mutations in two or more lines arose if mutations occurred in the founder colony or if cross contamination occurred during streaking. If lines shared > 50% of their mutations, then only one line was retained for analysis. If lines shared < 50%, each mutation was randomly assigned to only one of the lines.
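A minimal sketch of the rule for shared mutations, written for a pair of lines, is given below. Several details are assumptions made for illustration: the shared fraction is computed relative to the line with fewer mutations, a fraction of exactly 50% is treated like the < 50% case, and the function and variable names are hypothetical.

```python
import random

def resolve_shared(muts_a, muts_b, rng=None):
    """Apply the shared-mutation rule to two MA lines.

    muts_a, muts_b: iterables of hashable mutation identifiers.
    Returns (kept_a, kept_b); kept_b is None when only one line is retained.
    """
    rng = rng if rng is not None else random.Random(0)
    a, b = set(muts_a), set(muts_b)
    shared = a & b
    if not shared:
        return a, b
    # Assumption: fraction shared is measured against the smaller line.
    fraction = len(shared) / min(len(a), len(b))
    if fraction > 0.5:
        return a, None  # retain only one of the two lines
    # Otherwise assign each shared mutation to exactly one line at random.
    for mutation in sorted(shared):
        if rng.random() < 0.5:
            b.discard(mutation)
        else:
            a.discard(mutation)
    return a, b
```

After this step no mutation is counted twice, so the total number of mutations used in the rate calculation is not inflated by events that occurred in the founder colony.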
Estimation of mutation rates from MA experiments
For each experiment, the mutation rate was estimated by dividing the total number of mutations accumulated by all the MA lines by the total number of generations that were undergone. This value for mutations per generation was then divided by the appropriate number of sites (A:T sites, G:C sites, etc.) to give the conditional mutation rate. The individual mutation rates for each line were used to compute confidence limits (CLs) (see the supplemental materials and methods for further details).
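The calculation described above can be sketched in a few lines. The function and argument names are illustrative assumptions; the computation of confidence limits from the individual line rates is described in the supplemental materials and is not reproduced here.

```python
def ma_mutation_rate(mutations_per_line, generations_per_line, n_sites):
    """Pooled mutation rate for one MA experiment.

    mutations_per_line: mutations accumulated by each MA line.
    generations_per_line: generations undergone by each MA line.
    n_sites: number of relevant sites (e.g., A:T or G:C base pairs).
    Returns (mutations per generation, conditional rate per generation per site).
    """
    total_mutations = sum(mutations_per_line)
    total_generations = sum(generations_per_line)
    per_generation = total_mutations / total_generations
    return per_generation, per_generation / n_sites
```

Because mutations and generations are pooled before dividing, lines that underwent more generations contribute proportionally more to the estimate than they would in a simple average of the per-line rates.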
Statistical analysis
Standard statistical analysis was used (Zar 1984). Means and CLs were calculated from the MA lines for each experiment as described (Foster et al. 2018). Values and 95% CLs for ratios between variables were calculated as in Rice (1995). The expected values for χ² tests were calculated from the numbers of the relevant feature in the genome or from the results of 1000 Monte Carlo simulations for each strain, as described (Lee et al. 2012).
Data availability
Strains are available upon request. File S1 contains the supplemental materials and methods. File S2 contains supplemental tables, which include strain genotypes, methods of strain construction, oligonucleotide sequences, and detailed data from each experiment. File S3 contains supplemental figures that are referenced in the text. The sequences, SNPs, and indels reported in this paper have been deposited in the National Center for Biotechnology Information Sequence Read Archive [https://trace.ncbi.nlm.nih.gov/Traces/sra/ (accession no. SRP013707)] and in the IUScholarWorks Repository (hdl.handle.net/2022/20340). Supplemental material available at Figshare: https://doi.org/10.25386/genetics.6513035.
Results
Mutational profile of mutD5 and mutD5 mutL mutant strains growing on rich medium
Base pair substitution rates:
The base pair substitution (BPS) rate of the mutD5 mutant strain growing on rich (LB) medium was 84 × 10⁻⁸ BPS/generation/nt, 4000-fold greater than that of the wild-type strain and 35-fold greater than that of the MMR-defective strains (Table 1). This increase relative to wild-type is in the middle of the 10²–10⁴-fold range reported in previous studies (Fowler et al. 1974; Schaaper 1988, 1993; Fijalkowska and Schaaper 1996; Nowosielska et al. 2004). To estimate the intrinsic error rate of DNA Pol III, we deleted the mutL gene in the mutD5 mutant strain, creating a strain deficient in the two most important pathways for correcting replication errors: MMR and proofreading. The BPS rate of the mutD5 mutL mutant strain was 125 × 10⁻⁸ BPS/generation/nt, 1.5-fold greater than that of the mutD5 mutant strain (Table 1), an increase slightly smaller than the 1.6- to 3.4-fold range previously reported (Schaaper 1993).
Selective pressure during the MA experiment:
Selective pressure is usually evaluated by the ratio of nonsynonymous to synonymous (NS/S) BPSs. Based on the codon usage in E. coli MG1655, the expected NS/S ratio is 3.25 (Lee et al. 2012), and this was significantly greater than the ratios for the mutD5 and mutD5 mutL mutant strains (Table 2). One thousand Monte Carlo simulations using the BPS spectra of the mutant strains yielded NS/S ratios of ∼2, slightly (6%), but statistically significantly, greater than the observed ratios (Table 2). Thus, the mutD5 and the mutD5 mutL mutant strains appear to be under mild selective pressure, likely because they have poor viability, as previously observed (Fijalkowska and Schaaper 1996).
If mutations accumulate in a neutral manner, the number of BPSs in coding and in noncoding (C/NC) DNA should reflect the numbers of base pairs in each (Lee et al. 2012). We previously observed that the C/NC ratio was significantly less than expected in wild-type strains, but slightly greater than expected in MMR-defective strains, suggesting that MMR preferentially repairs coding DNA (Lee et al. 2012). The C/NC ratio of the mutD5 strain was 5.47, not significantly different from the 5.74 ratio based on the genome or the 5.51 ratio obtained from Monte Carlo simulations using the BPS spectrum of the mutD5 mutant strain (Table 3). However, the C/NC ratio in the mutD5 mutL mutant strain, 6.96, was a significant 20% greater than the expected ratios both from the genome and from simulations, and 30% greater than the ratio of the mutD5 strain (χ² = 74.5, P < 0.001) (Table 3). This ratio was nevertheless close to the 6.63 reported for a mutL mutant strain (Lee et al. 2012) (χ² = 0.4, P = 0.5), suggesting that the apparent preference to repair coding DNA in wild-type strains is solely due to MMR, and that proofreader does not have this preference.
The BPS spectra:
The spectrum of BPS accumulated by the mutD5 mutant strain growing on rich medium is shown in Figure 1A and detailed in Table 4 (the numbers of BPSs are given in Table S3). As has been previously reported (Schaaper 1988, 1993; Fijalkowska and Schaaper 1996; Nowosielska et al. 2004), transitions occurred sixfold more frequently than transversions. The A:T transition rate was only 1.4-fold greater than the G:C transition rate (χ² = 138, P < 0.001), less than the threefold observed with MMR-defective strains (Lee et al. 2012; Foster et al. 2018). The rates of the various transversions also varied significantly (χ² = 501, P < 0.001) in the order A:T to T:A > G:C to T:A > A:T to C:G >> G:C to C:G, a pattern similar to that observed for MMR-mutant strains (Lee et al. 2012; Foster et al. 2018).
Figure 1. The conditional BPS rates and spectra accumulated by the mutD5 and mutD5 mutL mutant strains. The bars represent the BPSs per generation per number of A:T or G:C base pairs in the genome; the error bars are 95% CLs. (A) Cells were grown on LB medium; mutD5, PFM163; mutD5 mutL, PFM165/397/399. (B) Cells were grown on minimal glucose medium; mutD5, PFM163; mutD5 mutL, PFM165. The data in (B) for the mutD5 strain on minimal medium are presented with an expanded axis in Figure S4A. BPS, base pair substitution; CL, confidence limit.
Inactivating MMR in the mutD5 mutL mutant strain resulted in a 2.4-fold increase in the G:C transition rate, which in this strain exceeded the A:T transition rate by 1.6-fold (χ² = 1229, P ≪ 0.001) (Figure 1A, Table 4, and Table S3). This increase in G:C transitions entirely accounted for the difference in mutation rates between the mutD5 and mutD5 mutL mutant strains, and resulted in a spectrum of BPSs closely resembling that of the wild-type strain (Table 4). In contrast, other studies have found that the rate of A:T transitions exceeds that of G:C transitions in MMR-defective mutD5 mutant strains (Schaaper 1993) (see Discussion). The rates of the various transversions occurred in the same pattern in the mutD5 mutL mutant strain as in the mutD5 mutant strain.
The DNA strand bias of BPSs:
In MMR-defective strains, A:T transitions are 2.4-fold more frequent when A is on the lagging strand template (LGST) and T is on the leading strand template (LDST) than in the opposite orientation. Likewise, G:C transitions are 2.3-fold more frequent when C is on the LGST and G is on the LDST than in the opposite orientation (Lee et al. 2012; Bhagwat et al. 2016; Foster et al. 2018). Neither the mutD5 nor the mutD5 mutL mutant strain exhibited these strong strand biases (Table 5). In the mutD5 mutant strain, A:T transitions were only 1.17-fold more frequent when A was on the LGST than on the LDST, and there was no strand bias for G:C transitions. In the mutD5 mutL mutant strain, A:T and G:C transitions occurred 1.19- and 1.03-fold more frequently with A and C on the LGST. While statistically significant, these 10–20% strand biases are much less prominent than the twofold biases exhibited by MMR-defective proofreading-proficient strains, suggesting that nucleotide misincorporation during DNA replication is not strand biased but proofreading is (see Discussion).
The local sequence context of BPS:
The sequence context in which a base pair appears affects its mutability (Lee et al. 2012; Sung et al. 2015). In both the mutD5 and the mutD5 mutL mutant strains, the adjacent bases are the most important determinants (Figures S1 and S2). Therefore, we analyzed the influence of only the bases immediately 5′ and 3′ to the mutated base. While there are 64 possible triplets, in double-stranded DNA only 32 are nonredundant. A triplet and its reverse complement (each read 5′ to 3′) are equivalent since each pairs with the other on the opposite DNA strand.
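The reduction from 64 triplets to 32 nonredundant classes can be checked with a short sketch. Choosing the alphabetically smaller member of each pair as the class label is an arbitrary convention adopted here for illustration and is not necessarily the orientation used in the figures.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def canonical_triplet(triplet):
    """Label a triplet and its reverse complement with one shared name."""
    return min(triplet, reverse_complement(triplet))

all_triplets = ["".join(p) for p in product("ACGT", repeat=3)]
classes = {canonical_triplet(t) for t in all_triplets}
print(len(all_triplets), len(classes))  # prints "64 32"
```

No triplet is its own reverse complement, because the middle base would have to pair with itself, so the 64 triplets fall into exactly 32 pairs.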
As shown in Figure 2, the mutation rate of A:T base pairs in the triplets 5′NAC3′+5′GTN3′ was ≈twofold greater than the average rate of A:T base pairs in the other triplets (throughout this report, a triplet and its complement are both presented 5′ to 3′ with the mutated base in the middle). This pattern is similar to that observed for both the wild-type and MMR-mutant strains, except that the dominance of 5′NAC3′+5′GTN3′ sites (10- to 16-fold) was more dramatic in the MMR-defective strains (Lee et al. 2012; Foster et al. 2018). In the mutD5 mutant strain, the mutation rate of G:C base pairs in the triplets 5′NGC3′+5′GCN3′ was also about twofold greater than the average mutation rate of G:C base pairs in the other triplets, but in the mutD5 mutL mutant strain, this ratio dropped to 1.2- to 1.6-fold. Thus, these sites are not as prominent in the mutD5 mutL spectrum as 5′NAC3′+5′GTN3′ sites. Based on the results from all the strains examined to date, mutations are potentiated by a C 3′ to the purine or a G 5′ to the pyrimidine at A:T base pairs and, to a lesser extent, at G:C base pairs. Interestingly, the context bias of BPSs in the mutD5 mutL mutant strain is similar in pattern and relative magnitude to that in the wild-type strain (Figure S3), which will be further discussed below (see Discussion).
Figure 2. The context bias of the base pair substitutions accumulated by the mutD5 and mutD5 mutL strains. The x-axis labels are the 32 nonredundant triplets oriented 5′NMN3′ with the mutated base in the center. The bars represent the BPS per generation per triplet in the genome; the error bars are 95% CLs. (A) Cells were grown on LB medium; mutD5, PFM163; mutD5 mutL, PFM165/397/399. (B) Cells were grown on minimal glucose medium; mutD5, PFM163; mutD5 mutL, PFM165. The data in (B) for the mutD5 strain on minimal medium are presented with an expanded axis in Figure S4B. BPS, base pair substitution; CL, confidence limit.
Conclusions based on the phenotype of mutD5 mutant strains are complicated by the possibility that MMR becomes saturated when proofreader is deficient (Schaaper 1988). Comparison of the mutation rates of the mutD5 and mutD5 mutL mutant strains (Table 4) indicates that the ability of MMR to prevent A:T mutations was saturated, but not its ability to prevent G:C mutations; the G:C mutation rate of the mutD5 mutant strain increased an additional 2.17 ± 0.05-fold with the loss of MMR (mean ± 95% CL) (see Discussion).
Spontaneous indel rates and spectra:
As previously observed for wild-type and MMR-defective strains (Lee et al. 2012), in both the mutD5 and mutD5 mutL mutant strains, the rates of small (≤ 4 bp) indels were one-tenth the BPS rates (Table 6). Also as expected from previous studies (Streisinger et al. 1966; Lee et al. 2012), in both the mutD5 and the mutD5 mutL mutant strains, homopolymeric runs were hotspots for indels and the indel rate increased exponentially with the length of a run (Figure 3A). In the mutD5 mutant strain, all types of indels occurred at nearly the same rates. However, in the mutD5 mutL mutant strain, A:T insertions dominated, occurring 1.7-fold more often than A:T deletions (χ² = 60, P < 0.001) and 2.1-fold more often than G:C insertions (χ² = 108, P < 0.001) (Figure 4A and Table 6). G:C deletions were also prominent, occurring 1.8-fold more frequently than G:C insertions (χ² = 56, P < 0.001) and 1.4-fold more frequently than A:T deletions (χ² = 23, P < 0.001).
Figure 3. The rates of the indels in homopolymeric runs accumulated by the mutD5 and mutD5 mutL mutant strains. The bars represent the indels per generation per number of base pairs in each run of nt length in the genome. The error bars are 95% CLs, some of which are smaller than the symbols. (A) Cells were grown on LB medium; mutD5, PFM163; mutD5 mutL, PFM165/397/399. (B) Cells were grown on minimal glucose medium; mutD5, PFM163; mutD5 mutL, PFM165. CL, confidence limit; indel, insertion/deletion.
Figure 4. The conditional rates and spectra of the indels accumulated by the mutD5 and mutD5 mutL mutant strains. The bars represent the indels per generation per number of relevant base pairs in the genome; the error bars are 95% CLs. (A) Cells were grown on LB medium; mutD5, PFM163; mutD5 mutL, PFM165/397/399. (B) Cells were grown on minimal glucose medium; mutD5, PFM163; mutD5 mutL, PFM165. The data in (B) for the mutD5 strain on minimal medium are presented with an expanded axis in Figure S5. CL, confidence limit; indel, insertion/deletion.
Mutational profile of mutD5 and mutD5 mutL mutant strains growing on minimal medium
The BPS profile:
Growing strains carrying the mutD5 allele on minimal rather than on rich medium lowers the mutation rate (Fowler et al. 1974; Cox and Horner 1982; Schaaper 1988). To evaluate the resulting mutational profile, we performed MA/WGS experiments with the mutD5 and mutD5 mutL mutant strains growing on glucose minimal medium. Relative to growth on rich medium, the BPS rate of the mutD5 mutant strain declined sixfold, to 15 × 10⁻⁸ BPS/generation/nt, whereas the BPS rate of the mutD5 mutL mutant strain declined only 1.5-fold, to 84 × 10⁻⁸ BPS/generation/nt (Table 4).
The ratios of NS/S BPSs of the mutD5 and the mutD5 mutL mutant strains growing on minimal medium, 1.97 and 1.75, were significantly less than expected from the genome or from simulations (Table 2), but not significantly different from those observed when the cells were grown on LB medium (χ² = 0.4, P = 0.53 and χ² = 1.5, P = 0.23, respectively). Thus, the mutD5 and mutD5 mutL mutant strains appear to be under some selective pressure whether they are growing on rich or on minimal medium.
The ratio of BPSs in C/NC DNA for the mutD5 mutant strain grown on minimal medium was significantly less than expected based on the genome or simulations (Table 3), and also significantly less than the ratio obtained when it was grown on rich medium (χ² = 31, P < 0.001). Thus, the lower mutation rate of the mutD5 strain on minimal medium results in a slight bias for BPSs to occur in noncoding DNA. In contrast, the mutD5 mutL mutant strain grown on minimal medium showed the same bias for BPS to occur in coding DNA as it did when it was grown on rich medium (χ² = 0.1, P = 0.7) (Table 3). Thus, on both types of media, MMR appears to preferentially prevent mutations in coding DNA, as previously observed (Lee et al. 2012).
The BPS spectra for the mutD5 and mutD5 mutL strains grown on minimal medium are shown in Figure 1B and given in Table 4 (also see Figure S4A and Table S3). Overall, the differences in the BPS spectra between the two growth media were modest. When the mutD5 mutant strain was grown on minimal medium, A:T transitions declined disproportionally relative to G:C transitions (7.6-fold vs. 4.7-fold; χ² = 160, P < 0.001). When the mutD5 mutL mutant strain was grown on minimal medium, the ratio of transitions to transversions was double the ratio seen when the strain was grown on rich medium, largely due to a threefold decline in the relative rate of transversions. G:C transitions also declined slightly; their rate was 1.3-fold higher than that of A:T transitions (χ² = 87, P < 0.001), compared to 1.6-fold higher when the mutD5 mutL mutant strain was grown on LB.
DNA strand bias:
Overall, growth on minimal medium did not change the strand biases from those observed when the strains were grown on LB. The one exception was a 1.2-fold increase in the frequency at which G:C transitions occurred with C on the LGST in the mutD5 mutant strain, which was significantly greater than expected (Table 5).
The local sequence context of BPSs:
Growing the mutD5 mutL strain on minimal medium resulted in nearly the same pattern of local sequence biases as growth on rich medium (Figure 2B). In particular, mutations at A:T base pairs were two- to threefold more frequent in the context 5′NAC3′+5′GTN3′, just as they were when the cells were grown on LB, indicating that DNA polymerase makes these errors frequently when cells are growing on either medium. However, in the mutD5 mutant strain, the influence of the 3′C was nearly gone, suggesting that MMR is better able to correct these errors when the cells are growing on minimal medium, probably because of the lower error rate (Figure 2B and Figure S4B).
Spontaneous indel rates and spectra:
As observed when strains were grown on rich medium, when the mutant strains were grown on minimal medium, indel rates were 10-fold lower than BPS rates (Table 6), homopolymeric runs were hotspots for indels, and the indel rate increased exponentially with the length of the run (Figure 3B). The spectra of indels in the two media were also similar (Figure 4B, Figure S5, Table 6, and Table S4). The only striking difference was the dominance of A:T insertions in the mutD5 mutant strain, which occurred at a twofold higher rate than A:T deletions (χ² = 11, P = 0.001) and a fivefold higher rate than G:C insertions (χ² = 34, P < 0.001).
The SOS response does not contribute to the mutational load of the mutD5 mutant strain
Previous studies have reported that the SOS response is induced to various degrees in cells carrying mutant alleles of dnaQ (Slater et al. 1994; Gautam et al. 2012; Whatley and Kreuzer 2015). However, we have not investigated the extent to which SOS may be induced in our strains under the conditions of the MA experiments. The SOS response controls the expression of the two error-prone DNA polymerases, DNA Pol IV (encoded by the dinB gene) and Pol V (encoded by the umuDC genes) (Kenyon and Walker 1980; Fernández de Henestrosa et al. 2000), which could contribute to the mutational load of the mutD5 mutant strain. To test this hypothesis, we performed MA/WGS experiments with mutD5 mutant strains in which dinB, or both dinB and umuDC, were deleted, or which carried an allele of the SOS repressor, lexA3, that constitutively represses the SOS response (Mount et al. 1972).
As shown in Tables 1 to 5, deletion of the dinB gene, or both the dinB and the umuDC genes, in the mutD5 mutant strain made no significant difference in the BPS rates, spectra, or the other mutational parameters tested. Likewise, the rates and spectra of indels were unaffected by the deletions (Table 6). Surprisingly, the BPS rate of the mutD5 lexA3 strain was 1.4-fold higher than that of the mutD5 strain (t = 13, d.f. = 2, P = 0.005) (Table 4), suggesting that some other LexA-repressed gene may act to prevent some BPSs. Otherwise, the lexA3 allele did not affect the mutational profile of the mutD5 mutant strain. All of these results indicate that neither the error-prone polymerases nor the SOS response overall contributes to the mutational load of the mutD5 mutant strain in our MA experiments.
Discussion
The results of our studies of E. coli with a deficiency in proofreading can be summarized as follows.
The mutation rate of strains carrying the mutD5 mutant allele is ≈4000-fold higher than the mutation rate of the wild-type strain. This factor falls in the middle of previous estimates of 10²- to 10⁵-fold. Loss of MMR increases this factor 1.5-fold.
As revealed in a strain defective for both proofreading and MMR, the replicative polymerase, Pol III, has a bias for making the errors that produce transitions, especially A:T transitions at 5′NAC3′+5′GTN3′ sites and, to a lesser degree, G:C transitions at 5′NGC3′+5′GCN3′ sites. However, overall, the spectrum of replication errors is dominated by G:C transitions.
Pol III has little strand bias for making errors. However, proofreading is strand-biased, resulting in the 2× bias observed for G:C transitions in wild-type strains, and for both G:C and A:T transitions in MMR-deficient strains.
Both proofreading and MMR have a bias for correcting the errors that produce A:T transitions; thus, these transitions become prominent when either one is defective. Proofreader is also efficient at correcting the mismatches leading to G:C transitions, but, since these are the more prominent replication errors, G:C transitions dominate the wild-type spectrum.
When the mutD5 mutant strain is grown on minimal medium, its mutation rate is sixfold lower than when it is grown on LB, but this factor is only 1.5-fold if MMR is defective.
Neither the activities of the error-prone polymerases nor the SOS response overall contributes to the mutation load of the mutD5 mutant strain.
Our results differ in certain respects from those of previous studies of mutD5 and mutD5 mutL mutant strains. Using mutation to LacI−d as the reporter, Schaaper (1988, 1993) found that the BPS spectrum of the mutD5 mutant strain was dominated by G:C transitions, whereas that of the mutD5 mutL mutant strain was dominated by A:T transitions. In contrast, our results showed that the BPS spectrum of the mutD5 mutant strain was slightly biased toward A:T transitions, whereas the BPS spectrum of the mutD5 mutL mutant strain was biased toward G:C transitions. There are a number of possible reasons for these differences. First, while the LacI−d phenotype can result from a number of mutational events (Schaaper and Dunn 1991), the target is only 210 bp and does not include every possible sequence context in the genome. Second, the mutD5 alleles may differ. The mutD5 allele used in early studies by Schaaper and others had a long history of passages and genetic manipulations. Indeed, we sequenced the dnaQ gene of a strain derived from the original mutD5 mutant isolate (Degnen and Cox 1974) and discovered that it was actually the gene from E. coli B, not E. coli K12. Finally, as mentioned above, these highly mutating strains accumulate mutational enhancers and suppressors that can change the mutational profile.
Because of these considerations, we took precautions to ensure that our results were due only to loss of proofreading. First, we used recombineering to transfer only the E. coli K12 dnaQ gene carrying the mutD5 mutation, a C to T mutation at position 44 of the coding sequence (Fijalkowska and Schaaper 1996), to our parental strain. Before each derived strain was used in an MA experiment, we sequenced its dnaQ and dnaE genes to verify that the dnaQ gene carried only the mutD5 mutation and that the dnaE gene was wild-type. Also, before use, we performed fluctuation tests to ensure that strains had the expected mutation rates. After sequencing the MA lines, we eliminated any that had known mutators or antimutators, or that had mutation rates > 2 SD above or below the mean, which would indicate that unknown mutation rate modifiers had appeared during the experiment.
Previous studies have found that MMR is saturated, at least partially, in strains that carry the mutD5 allele (Schaaper 1988). Here, we show that when growing on LB, the mutD5 mutL mutant strain has a BPS rate 1.5-fold higher and an indel rate 1.6-fold higher than the mutD5 mutant strain, indicating that MMR is able to correct errors in the mutD5 strain. While this difference is much less than the ≈120-fold increase in the BPS rate observed when MMR is inactive in a proofreading-proficient strain (Table 1), the number of BPSs that MMR prevents in the mutD5 mutant strain, ≈2 per generation, is greater than the number that MMR prevents in the wild-type strain, ≈0.1 per generation. A similar conclusion can be made for the effect of MMR on indel formation; ≈0.2 indels per generation are prevented by MMR in the mutD5 mutant background, but only 0.02 in the wild-type background (Table 6 and Lee et al. 2012). However, although MMR may be working at high efficiency in the mutD5 mutant strain, it clearly cannot drive the mutation rate down to wild-type levels. In addition, when proofreader is defective, MMR appears to be nearly saturated for BPSs at A:T sites but not at G:C sites. Although in the absence of proofreading 5′NAC3′+5′GTN3′ are hotspots, A:T BPSs also arise at high rates at the other A:T sites, and these are relatively poor substrates for MMR [see the accompanying article in this issue by Foster et al. (2018)].
Most, if not all, of the increase in mutation rate of mutD5 mutant strains when growing in LB rather than minimal medium is due to the thymidine in LB (Degnen and Cox 1974; Erlich and Cox 1980). The most likely mechanism is a direct interaction between dTTP and ɛ that partially inactivates proofreading (Biswas and Kornberg 1984). Previous work has shown that, because of the lower error rate, MMR is not saturated when mutD5 strains are grown on minimal medium (Schaaper 1988). In our data, MMR was able to prevent ≈3 BPSs per generation when the mutD5 mutant strain was growing on minimal medium, less than a twofold increase in efficiency over that seen when the strain was growing on LB. Thus, in confirmation of previous results, the sixfold increase in mutation rate when the mutD5 strain is growing on LB medium must be due to some factor or factors in addition to further saturation of MMR.
Our data show that neither the SOS response overall, nor the error-prone polymerases specifically, contribute to the mutation rate of strains carrying the mutD5 allele. The error-prone polymerases also did not add to the mutation rate during MA experiments with wild-type E. coli (Foster et al. 2015). We hypothesize that, in our strains and under our conditions, the SOS response may not be induced to sufficient levels to produce mutations. In support of this hypothesis, overproduction of the mutD5 allele, which is dominant, did not induce the SOS response as measured by prophage induction (Gautam et al. 2012). Whatley and Kreuzer (2015) found that, even in highly mutating dnaQ mutant strains, the level of SOS induction, as measured by a lacZ fusion to the SOS-induced gene dinD, was only twofold higher than in wild-type strains; these authors concluded that mutation rate and SOS induction were not coupled in dnaQ mutant strains.
In the absence of MMR and proofreading, mutations were biased toward conversion of G:C to A:T base pairs and toward creation of +1 A:T indels (Table 4 and Table 6). We assume that these biases are intrinsic to DNA Pol III. A long-standing hypothesis, called the “A-rule,” postulates that some (but not all) DNA polymerases are biased for binding and inserting As when replicating past abasic sites and certain other DNA lesions [reviewed in Strauss (2002)]. However, this process creates mainly transversions, whereas the spectrum in our MA experiments is dominated by transitions, and also would be unlikely to produce +1 A:T indels. In addition, the estimated rate of spontaneous depurination during replication fails by two orders of magnitude to account for the mutation rates observed in the mutD5 mutant strains (Lee et al. 2012). Thus, our results suggest that DNA Pol III has a preference for inserting As even when replicating undamaged DNA.
The spectrum and context bias of BPSs in the mutD5 mutL mutant strain are similar in pattern and relative magnitude to those in the wild-type strain (Figure S3 and Table 4), suggesting that the effects of MMR and proofreading are synergistic, but nonetheless leave the signature of replication errors to appear in wild-type cells, albeit at a 6000-fold lower rate. Both MMR and proofreading are more efficient at preventing BPSs at A:Ts than at G:Cs, but this factor for MMR is about fourfold whereas for proofreader it is only twofold. Given that replication errors are biased toward G:C transitions, and that proofreader is 40-fold more powerful than MMR but only slightly biased against preventing G:C mutations, the result is that G:C transitions dominate the wild-type spectrum. But within that context, A:T BPSs at 5′NAC3′+5′GTN3′ sites and, to a lesser degree, G:C BPSs at 5′NGC3′+5′GCN3′ sites, are hotspots in every genetic background. These mutations, particularly the A:T mutations, are well corrected by MMR but not preferentially corrected by proofreader, and so also appear in the wild-type spectrum.
The G:C content of the E. coli genome is ∼50%. In the absence of error correction, the mutational bias of replication would tend to increase the A:T content unless selection reversed the trend. The results presented here indicate that proofreading is the major error-correcting activity maintaining the G:C content of the genome, reducing the nearly twofold bias for replacing G:C with A:T base pairs to the 1.4-fold bias seen in wild-type cells.
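The effect of such a bias on base composition can be sketched with a simple two-state model. This is an illustration under stated assumptions, not an analysis performed in this study: it treats the "bias" as the ratio of the per-site G:C to A:T rate to the per-site A:T to G:C rate, ignores selection and all other mutation classes, and uses an arbitrary per-generation rate in the simulation. All function and parameter names are hypothetical.

```python
def equilibrium_gc(bias):
    """Equilibrium G:C fraction with no selection.

    bias = (per-site G:C->A:T rate) / (per-site A:T->G:C rate).
    At equilibrium the two fluxes balance: u * gc = v * (1 - gc),
    which gives gc = v / (u + v) = 1 / (1 + bias).
    """
    if bias <= 0:
        raise ValueError("bias must be positive")
    return 1.0 / (1.0 + bias)


def simulate_gc(gc0, bias, at_to_gc_rate, generations):
    """Iterate the two-state model from a starting G:C fraction gc0."""
    u = bias * at_to_gc_rate  # G:C -> A:T rate per G:C site per generation
    v = at_to_gc_rate         # A:T -> G:C rate per A:T site per generation
    gc = gc0
    for _ in range(generations):
        gc = gc - u * gc + v * (1.0 - gc)
    return gc


for label, bias in [("no error correction (~2-fold)", 2.0),
                    ("wild type (1.4-fold)", 1.4)]:
    print(f"{label}: equilibrium G:C fraction = {equilibrium_gc(bias):.3f}")

# Arbitrary, unrealistically high rate so that the toy model converges quickly.
gc_end = simulate_gc(gc0=0.5, bias=2.0, at_to_gc_rate=1e-3, generations=20000)
```

Under this simplified model, a twofold bias would drive the genome toward about one-third G:C, whereas a 1.4-fold bias would give an equilibrium near 42%. Both values lie below the ∼50% observed, which is at least consistent with the possibility, noted above, that selection counteracts the residual mutational bias.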
Acknowledgments
We thank the following past members of the P.L.F. laboratory for technical assistance: H. Bedwell-Ivers, C. P. Coplen, J. Eagan, N. Gruenhagen, N. Ivers, E. Popodi, I. Rameses, D. Simon, K. Smith, K. Storvik, J. P. Townes, and L. Whitson; Roel Schaaper for the strain provided; and the anonymous reviewers of this paper for helpful suggestions. The National BioResource Project at the (Japanese) National Institute of Genetics provided bacterial strains and plasmids. This work was supported by the National Institutes of Health (T32 GM-007757 to B.A.N.) and the US Army Research Office Multidisciplinary University Research Initiative Award (W911NF-09-1-0444 to P.L.F. and H.T.).
Note added in proof: See Foster et al. 2018 (pp. 1029–1042) in this issue for a related work.
Footnotes
Supplemental material available at Figshare: https://doi.org/10.25386/genetics.6513035.
Communicating editor: J. Nickoloff
- Received November 14, 2017.
- Accepted June 14, 2018.
- Copyright © 2018 by the Genetics Society of America