Abstract
To characterize the hisD3052 −1 frameshift allele of Salmonella typhimurium, we analyzed ~6000 spontaneous revertants (rev) for a 2-base deletion hotspot within the sequence (CG)4, and we sequenced ~500 nonhotspot rev. The reversion target is a minimum of 76 bases (nucleotides 843–918) that code for amino acids within a nonconserved region of the histidinol dehydrogenase protein. Only 0.4–3.9% were true rev. Of the following classes, 182 unique second-site mutations were identified: hotspot, complex frameshifts requiring ΔuvrB + pKM101 (TA98-specific) or not (concerted), 1-base insertions, duplications, and nonhotspot deletions. The percentages of hotspot mutations were 13.8% in TA1978 (wild type), 24.5% in UTH8413 (pKM101), 31.6% in TA1538 (ΔuvrB), and 41.0% in TA98 (ΔuvrB, pKM101). The ΔuvrB allele decreased by three times the mutant frequency (MF, rev/10⁸ survivors) of duplications and increased by about two times the MF of deletions. Separately, the ΔuvrB allele or pKM101 plasmid increased by two to three times the MF of hotspot mutations; combined, they increased this MF by five times. The percentage of 1-base insertions was not influenced by either ΔuvrB or pKM101. Hotspot deletions and TA98-specific complex frameshifts are inducible by some mutagens; concerted complex frameshifts and 1-base insertions are not; and there is little evidence for mutagen-induced duplications and nonhotspot deletions. Except for the base substitutions in TA98-specific complex frameshifts, all spontaneous mutations of the hisD3052 allele are likely templated. The mechanisms may involve (1) the potential of direct and inverted repeats to undergo slippage and misalignment and to form quasi-palindromes and (2) the interaction of these sequences with DNA replication and repair proteins.
“The universe is full of magical things patiently waiting for our wits to grow sharper.”
—Eden Phillpots
HOW cells make mutations in the absence of any known exposure to exogenous mutagens has been a subject of continual interest since the beginning of this century (de Vries 1901). This process, called spontaneous or background mutation, is important because it provides the substrate for evolution, generates disease, and provides insight into basic biological processes. Although spontaneous mutations are rare events, they are a feature of the life history of all organisms and appear to occur at a constant rate per genome per replication (Drake 1991). The stability of the genome is maintained by cellular processes that assure accurate DNA replication and faithful DNA repair (Echols and Goodman 1991; Lindahl 1993); an increased spontaneous mutation rate may pose a health risk (Crow 1997).
The causes of spontaneous mutation are varied and include errors in base selection by the DNA polymerase as well as lapses in mismatch repair (Echols and Goodman 1991); slippage during replication or repair (Ripley 1990); unrepaired or incorrectly repaired DNA damage resulting from deamination, depurination, depyrimidination, alkylation, oxidation, or strand breakage (Lindahl 1993); and a variety of other factors (Loeb and Cheng 1990; Drake 1991; Smith 1992; Bridges 1994). All types of mutations, from base substitutions to chromosomal aberrations, can arise spontaneously, and many factors affect the frequencies and types recovered (Smith 1992).
DNA sequence analysis has been used to generate spontaneous mutation spectra for a number of genetic targets, in organisms ranging from bacteriophages to humans, in order to infer mechanisms of spontaneous mutation (Ripley et al. 1986; Schaaper et al. 1986; Halliday and Glickman 1991; Kohler et al. 1991; Schaaper and Dunn 1991; Gordon et al. 1993; Cole and Skopek 1994; Glickman et al. 1994; Roche et al. 1994; Kalinowski et al. 1995; Xu et al. 1995). Among the mutations identified in these studies, deletions, duplications, and frameshifts constitute a significant portion, and insights into mutational mechanisms have been obtained from spontaneous mutation spectra generated in reverse-mutation frameshift systems (Ripley 1990).
The hisD3052 allele of Salmonella typhimurium is a −1 frameshift that was induced by the acridine nitrogen mustard ICR-364-OH (Oeschger and Hartman 1970; Hartman et al. 1986). This allele has been used more than any other frameshift allele for the identification of mutagenic agents (Kier et al. 1986). Revertants of this allele have been analyzed by deduction of the DNA sequence from the amino acid sequence of the histidinol dehydrogenase polypeptide coded by revertants of the allele (Isono and Yourno 1974) and by cloning and DNA sequence analysis (Fuscoe et al. 1988; O'Hara and Marnett 1991). The development of a colony probe hybridization procedure to detect a 2-base hotspot deletion of a GC or CG within the sequence (CG)4 (Kupchella and Cebula 1991) and the application of polymerase chain reaction (PCR)/DNA sequence analysis to identify the remaining frameshifts (Bell et al. 1991; Kupchella and Cebula 1991) have made it practical to construct informative mutation spectra.
Tens of thousands of agents have been evaluated for mutagenic activity at the hisD3052 allele, and the mutation spectra of ~20 of these have been determined (DeMarini et al. 1993). However, only a few spontaneous revertants of hisD3052 have been sequenced: 11 (Kupchella and Cebula 1991) and 37 (O'Hara and Marnett 1991) in wild-type repair strains; 7 in a ΔuvrB strain (Fuscoe et al. 1988); and 16 in a ΔuvrB, pKM101 strain (Bell et al. 1991). A recent study (Wallace and Josephy 1994) used the colony probe hybridization procedure (Kupchella and Cebula 1991) to screen 1076 spontaneous revertants of strain TA98 (ΔuvrB, pKM101) for the presence of the hotspot 2-base deletion, and 25 nonhotspot spontaneous revertants were sequenced.
We have extended these studies by analyzing by oligonucleotide probing a total of ~6,000 spontaneous revertants of the hisD3052 allele in four DNA repair backgrounds to detect the hotspot mutation, and we have sequenced ~500 of the nonhotspot revertants. These data have then been used to (1) characterize the target for reversion of the hisD3052 allele; (2) provide insight into the constraints for the selection and recovery of hisD3052 revertants; (3) identify the influences of the uvrB allele and the pKM101 plasmid on spontaneous mutations at this allele; (4) characterize the size, location, and sequence context of spontaneous frameshifts and compare those to mutagen-induced mutations at this allele; and (5) propose mutational mechanisms.
MATERIALS AND METHODS
Mutagenicity assay: S. typhimurium strains TA1978 (hisD3052, rfa), TA1538 (hisD3052, rfa, ΔuvrB), and TA98 (hisD3052, rfa, ΔuvrB, pKM101) were kindly provided by Dr. B. N. Ames (Biochemistry Department, University of California, Berkeley, CA). The isogenic strain UTH8413 (hisD3052, rfa, pKM101) was provided by Drs. T. H. Connor and T. S. Matney (Graduate School of Biomedical Sciences, University of Texas, Houston, TX) (Inman et al. 1983). The standard plate-incorporation assay (Maron and Ames 1983) was performed, and revertants (rev) were counted and picked for molecular analysis after 3 days of incubation. Revertants were recovered from plates containing 100 μl of dimethyl sulfoxide (DMSO; Burdick & Jackson, Muskegon, MI) in either the presence or absence of Sprague-Dawley Aroclor 1254-induced male rat liver S9 (1.8 mg of S9 protein/plate) that was prepared as described (Maron and Ames 1983).
Mutants were collected from 2–8 independent cultures per strain, and mutants arising on the plate are considered independent in origin because each arises from a single cell that was immobilized and physically isolated from other cells within the top agar. The same mutation was isolated from the same culture on several occasions, although no more than 2–3 identical mutants were isolated from any one set of mutants sequenced (~40) per culture. The repeated isolation of such mutants from subsequent cultures of the same strain suggested to us that they were due only slightly, if at all, to jackpot mutations that had occurred in a culture prior to plating and were, in fact, frequently occurring, independent mutations.
Colony purification, DNA isolation, PCR, and DNA sequence analyses: A total of 5981 independent spontaneous revertants of the hisD3052 allele were streaked onto minimal medium supplemented with biotin (Maron and Ames 1983) and incubated for 2 days at 37°, to purify each revertant clone and to assure that no nonrevertant cells from the background lawn were present. The purified revertants were then screened for the presence of a 2-base hotspot deletion of a GC or CG within the sequence CGCGCGCG, located at nucleotides 878–885, by means of a colony hybridization procedure (Kupchella and Cebula 1991). Each revertant was subjected to two independent hybridizations to confirm the results. The numbers of revertants probed were 1842 for TA1978, 300 for UTH8413, 1858 for TA1538, and 1981 for TA98.
A total of 496 revertants that did not contain the hotspot deletion were subjected to PCR and DNA sequence analysis (Bell et al. 1991). The numbers of revertants sequenced were 156 for TA1978, 40 for UTH8413, 143 for TA1538, and 157 for TA98. Briefly, revertant colonies were boiled for 10 min in 200 μl of water, centrifuged for 10 min, and 5–10 μl of the supernatant were used in an asymmetric PCR in which the primers were present at a ratio of 1:100. After 40 cycles of heating and cooling, the reaction was subjected to ultrafiltration, and the amplified ssDNA was sequenced using ddITP/dITP termination mixes.
Statistical analyses: Statistical comparisons of the hisD3052 mutation spectra were performed using the program of Adams and Skopek (1987), which produces a Monte Carlo estimate of the P value of the hypergeometric test (a generalization of Fisher's exact test).
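The kind of test described above can be illustrated with a minimal sketch. It is not the Adams and Skopek (1987) program itself; it is a generic Monte Carlo estimate of the hypergeometric (generalized Fisher) test, written under the assumption that a spectrum comparison is laid out as a table with one row per mutation type or site and one column per strain. The function names, the iteration count, and the example counts are hypothetical.

```python
import math
import random
from collections import Counter


def log_table_prob(table):
    """Log hypergeometric probability of a table, conditional on its margins."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    lg = math.lgamma  # lgamma(n + 1) == log(n!)
    logp = sum(lg(r + 1) for r in row_sums) + sum(lg(c + 1) for c in col_sums)
    logp -= lg(total + 1)
    logp -= sum(lg(x + 1) for row in table for x in row)
    return logp


def monte_carlo_p(table, n_iter=2000, seed=0):
    """Estimate P as the fraction of random tables (same margins) that are
    no more probable than the observed table."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    # Expand the table into one (row label, column label) pair per mutant.
    rows, cols = [], []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            rows.extend([i] * count)
            cols.extend([j] * count)
    observed = log_table_prob(table)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(cols)  # permuting labels preserves both margins
        counts = Counter(zip(rows, cols))
        sim = [[counts[(i, j)] for j in range(n_cols)] for i in range(n_rows)]
        if log_table_prob(sim) <= observed + 1e-9:
            hits += 1
    return hits / n_iter


# Hypothetical counts (NOT data from this study): rows are mutation
# classes, columns are two strains being compared.
example = [[12, 30], [25, 8], [3, 4]]
print(monte_carlo_p(example))
```

Shuffling the strain labels keeps both the row and column totals fixed, which is what makes each simulated table a draw from the hypergeometric null distribution.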
RESULTS AND DISCUSSION
Mutation spectra and classification of revertants: general observations: The spontaneous mutation spectra of the hisD3052 allele in four DNA repair backgrounds are shown in Figure 1. Statistical comparisons of the four spectra in pair-wise combinations, all four together, or with or without inclusion of the hotspot values, showed that the four mutation spectra were significantly different from each other (P < 0.001). All of the mutations resided within a 76-base region, which identifies the hisD3052 target for reversion, and the mutations could be categorized into five general groups: hotspot, deletions, duplications, insertions, and complex. The mutant frequencies and percentages at which each mutation class was recovered in each strain are shown in Table 1 and illustrated in Figure 2. Although the mutation classes and the role of DNA repair are discussed in detail below, some general observations can be made from examination of the mutation spectra and the classification of the mutants (Figures 1 and 2, Table 1).
The percentage of the hotspot mutation (−CG or −GC at CGCGCGCG at position 878–885) was lowest (13.8%) in strain TA1978 (wild type) and increased to a high of 41.0% in strain TA98 (ΔuvrB, pKM101) (Table 1, Figure 2). Approximately 26–31% of the mutations in the uvr+ strains (TA1978 and UTH8413) were other deletions, whereas the ΔuvrB allele increased this percentage to ~39–47% (Table 1). Combining the hotspot deletion with other deletions, deletions accounted for ~45–51% of the mutations in the uvr+ strains but for ~80% of the mutations in the ΔuvrB strains (TA1538 and TA98) (Table 1, Figure 2). In contrast, duplications were the predominant mutation (53.6%) in the uvr+ strain TA1978 (wild type) and accounted for ~45% of the mutations in strain UTH8413 (pKM101) (Table 1, Figure 2). Although the pKM101 plasmid clearly increased the percentage of the hotspot mutation in either a uvr+ or uvr− strain, it decreased the percentage of other deletions or duplications (Table 1). Insertions were rare and occurred at ~1% among all strains. Complex mutations (those involving more than one change) were equally rare (~1%) among all of the strains except TA98 (ΔuvrB, pKM101), where 10.1% of the mutations were complex (Table 1, Figure 2).
The overall reversion frequency was rather similar (19–22 rev/10⁸ survivors) for all of the strains except TA98, which exhibited a reversion frequency of 37/10⁸ survivors (Footnote a of Table 1). This increase was due to the increased production of the hotspot, other deletions, and complex mutations (Table 1, Figure 2). The reversion frequencies indicate that duplications occurred ~2 times as frequently relative to deletions among uvr+ cells, whereas deletions occurred 2–4 times more frequently relative to duplications in ΔuvrB cells (Table 1, Figure 2). The pKM101 plasmid had little effect on reversion frequencies in an otherwise wild-type cell, but it clearly increased the frequency of the hotspot, other deletions, and complex mutations (by 9–18-fold) when in the presence of the ΔuvrB allele.
Saturation of the spectrum: Prior to characterizing the reversion target and each mutation class in detail, we estimated how saturated the mutation spectra were by determining the proportion of sites at which the same mutation was recovered at least twice relative to the total number of sites at which all revertants were recovered. Combining all mutations, approximately one-third of the mutations in strains TA1978, TA1538, and TA98 were recovered multiple times, whereas only ~16% of the mutations in UTH8413 were so recovered. Combining all strains, more than 50% of the deletions and duplications were recovered multiple times, whereas only ~17% of the insertions and 33% of the complex were so recovered. Combining all strains and mutations, ~53% of the mutations were recovered more than once. Although there are underrepresented mutation classes among certain strains, the combined spectra in Figure 1 likely show >50% of the possible spontaneous mutations that can revert the hisD3052 allele. The combined spectra show 182 unique mutations, which indicates that at least this number of mutational pathways can revert the hisD3052 allele.
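The saturation estimate described above reduces to a simple count. The sketch below assumes each sequenced revertant is represented by a label identifying its mutation; the function name and the labels are hypothetical. The example uses the 1-base insertion class, for which six unique insertions were recovered and one of them was recovered twice.

```python
from collections import Counter


def saturation(mutations):
    """Fraction of distinct mutations that were recovered at least twice."""
    counts = Counter(mutations)
    repeated = sum(1 for n in counts.values() if n >= 2)
    return repeated / len(counts)


# Six unique insertions, one of them recovered twice (labels are placeholders).
insertions = ["ins1", "ins1", "ins2", "ins3", "ins4", "ins5", "ins6"]
print(round(saturation(insertions), 2))  # 1/6, i.e. the ~17% quoted for insertions
```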
Table 1. Mutant frequency and percentage of classes of spontaneous mutations at the hisD3052 allele in four DNA repair backgrounds
Figure 1. Mutation spectra of spontaneous revertants of strains TA1978, UTH8413, TA1538, and TA98. The sequence numbering is based on the number of nucleotides in the hisD3052 allele. The dash after position 893 represents the −1 deletion of a C that constitutes the hisD3052 allele. Open bars, deletions; filled bars, duplications; open bars with attached triangles, complex mutations involving deletion, addition, and/or base substitution; mutations connected by a line, complex mutations involving deletion or duplication plus a base substitution at a nearby site. Each symbol represents the mutation present in a single revertant. Approximately 50% of the revertants analyzed from the TA strains arose in the presence of S9 and 50% in the absence of S9. Because the two collections for each TA strain were not significantly different (P > 0.07), they were combined; the UTH8413 revertants arose in the absence of S9.
Target for reversion of the hisD3052 allele: Perhaps the most striking feature of all four mutation spectra is the length of the reversion target: 76 nucleotides (Figure 3). Although previous studies have shown that the reversion target (either spontaneous or induced) may be more than 60 bases in length (DeMarini et al. 1994, 1995a,b; Levine et al. 1994b), our present study extends the target for spontaneous reversion to the 76-base length shown in Figure 3. A precedent for a similarly large reversion target has been demonstrated for certain frameshift alleles within the rIIB gene of bacteriophage T4 (Ripley et al. 1986; Ripley 1990). This target size is comparable to that of the forward-mutation target supF (Seidman et al. 1985). Our study also identifies the regions in which deletions or duplications are recovered, with the largest deletion being 35 bases, and the largest duplication being 46 bases in length (Figure 3). Note that the 3′ boundary for duplications is defined by the third position of the TGA codon (site 904–906) at the 3′ end of the target; duplications at that site would produce a stop codon.
The hisD gene codes for L-histidinol dehydrogenase (EC 1.1.1.23, CAS No. 9028-27-7), which catalyzes the last step in the biosynthesis of L-histidine (L-histidinol + 2NAD+ → L-histidine + 2NADH). Its sequence and that of the entire histidine operon have been determined (Carlomagno et al. 1988), and the structural features of the enzyme have been compared to homologous enzymes in other organisms (Bruni et al. 1986). This two-step oxidation from substrate alcohol via intermediate aldehyde to product acid is catalyzed by a single active site, and the substrate binds first, followed by the coenzyme (Grubmeyer et al. 1989; Teng et al. 1993).
The absence of a crystal structure for either the hisD protein or the closely related 4-electron dehydrogenases, such as UDP-glucose dehydrogenase, prevented us from linking the hisD3052 target DNA sequence with a proven structural feature. However, because most nucleotide-binding proteins bind their coenzymes in similar ways, some inferences were possible. The dinucleotide-binding domains of dehydrogenases such as the NAD-binding domain are usually a duplicated Rossmann fold and consist of a 6-stranded parallel β-sheet with four linking α-helices. The hisD3052 target region between aa 281–306 could, therefore, be a hydrophilic loop that connects a β-sheet to an α-helix. Small alterations within this sequence might be tolerated, allowing the formation of an undistorted Rossmann fold, which is necessary for coenzyme (NAD) binding.
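The correspondence between the nucleotide coordinates of the target (843–918, as given in the abstract) and the amino acid range quoted above can be checked with a short calculation. This sketch assumes that nucleotide 1 is the first base of codon 1, which is an assumption made here for illustration; the function name is hypothetical.

```python
def codon_number(nt_position):
    """1-based codon containing a 1-based nucleotide position.

    Assumes nucleotide 1 is the first base of codon 1 (an assumption,
    not something stated in the text).
    """
    return (nt_position - 1) // 3 + 1


target_start, target_end = 843, 918
print(target_end - target_start + 1)   # length of the target in bases
print(codon_number(target_start), codon_number(target_end))
```

Under that assumption the 76-base target spans codons 281 to 306, which agrees with the amino acid range given for the target region.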
A BEAUTY-BLAST search using the search engines at the Baylor College of Medicine (Houston, TX) yielded the protein domains shown in Figure 3, which consist of a histidinol dehydrogenase signature and 9-element fingerprint. The signature sequence runs from aa 230–262 (PROSITE pattern PS00611), and the fingerprint spans virtually the entire length of the protein sequence. The hisD3052 target resides within a 49-aa segment that separates motifs 7 and 8 (aa 271–320). The motifs are conserved amino acid sequences and, presumably, are critical for function. Alterations within the motifs would likely cause loss of function, whereas alterations between them might be neutral if such alterations do not greatly alter the tertiary structure. Our recovery of duplications as large as 46 bases and deletions as large as 35 bases within the target region attests to the dispensable function of the amino acids coded by the target sequence. Duplications have been recovered from the 5′ end only to the potential stop codon (TGA) at the 3′ end (site 904–906). The recovery of deletions from the 3′ end to a region just short of the 5′ end suggests that deletions at the 5′ end would disrupt the flanking motif and inactivate the enzyme.
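The sizes of the recovered mutations follow a simple reading-frame constraint, which can be sketched as follows. The check below is standard frameshift arithmetic rather than a step from the methods of this study: a second-site mutation can compensate for the −1 frameshift only if the combined length change is a multiple of three. The function name is hypothetical, and the sketch ignores the additional requirements that no stop codon intervenes and that the altered protein remains functional.

```python
def restores_frame(net_change, original_shift=-1):
    """True if a second-site length change restores the reading frame.

    net_change: bases added (positive) or removed (negative) by the revertant.
    original_shift: the hisD3052 allele is a -1 frameshift.
    """
    return (original_shift + net_change) % 3 == 0


# Length changes of mutation types described in the text.
events = {
    "true reversion / 1-base insertion or duplication": +1,
    "hotspot 2-base deletion": -2,
    "5-base deletion": -5,
    "8-base deletion": -8,
    "largest deletion (35 bases)": -35,
    "largest duplication (46 bases)": +46,
    "complex: delete 8 bases, insert 3": -8 + 3,
}
for name, change in events.items():
    print(f"{name}: {restores_frame(change)}")
```

Every length change listed satisfies the constraint, whereas a 46-base deletion or a 35-base duplication would not.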
Figure 2. Frequency of spontaneous revertants of the hisD3052 allele distributed by (A) mutation class and (B) strain. Data are from the totals in Table 1.
Another critical feature of all four mutation spectra is the low frequency with which precise (true) revertants arise spontaneously at the hisD3052 allele. A true revertant is one that has reverted by the addition of a C at the site of the original −C mutation in this allele (the missing C is noted by a dash after nucleotide 893 in Figure 1). The percentage of spontaneous true revertants was 3.9% (7/181) for TA1978, 1.9% (1/53) for UTH8413, 1.0% (2/209) for TA1538, and 0.4% (1/266) for TA98; the reversion frequencies (rev/10⁸ survivors) for true revertants for these strains were 0.9, 0.4, 0.2, and 0.1, respectively. As discussed later, these results parallel the frequencies of 1-base duplications among the various DNA repair backgrounds.
Hotspot mutations: The percentage of hotspot mutations for each strain was 13.8% in TA1978 (wild type), 24.5% in UTH8413 (pKM101), 31.6% in TA1538 (ΔuvrB), and 41.0% in TA98 (ΔuvrB, pKM101) (Table 1). These increases in the percentage of hotspot mutations across the strains paralleled the increases in the mutant frequency of hotspot mutations (Table 1, Figure 2). This mutation can be explained most simply by a model in which misaligned replication intermediates, derived from slippage of one strand relative to the other, might be stabilized within iterated sequences (Streisinger et al. 1966; Streisinger and Owen 1985). Such a model for the hisD3052 hotspot region would involve the correct incorporation of a C opposite a G (or a G opposite a C), possibly followed by extension, and proceeding eventually to a 2-base slippage promoted and directed by the relative instability of the primer terminus and the specific sequence context. This model is essentially the correct-incorporation/slippage model we have described previously for mutagen-induced hotspot mutations at the hisD3052 allele (Levine et al. 1994a,b).
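Why the hotspot event is described as the loss of either a CG or a GC can be shown with a small enumeration. The sketch below is an illustration only, not a model of the replication intermediates: it removes every possible adjacent pair of bases from the hotspot sequence and groups the events by the sequence they produce. The function name is hypothetical.

```python
def two_base_deletions(seq):
    """Map each distinct product to the (0-based start, deleted pair) events
    that generate it when two adjacent bases are removed from seq."""
    products = {}
    for i in range(len(seq) - 1):
        product = seq[:i] + seq[i + 2:]
        products.setdefault(product, []).append((i, seq[i:i + 2]))
    return products


hotspot = "CGCGCGCG"  # nucleotides 878-885
for product, events in two_base_deletions(hotspot).items():
    print(product, events)
```

All seven possible 2-base deletions within the repeat, whether of a CG or a GC, yield the same product, CGCGCG, so sequencing cannot tell which repeat unit was lost. This is the degeneracy that slippage within an iterated sequence is expected to produce.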
Figure 3. Target for reversion of the hisD3052 allele. The target is a minimum of 76 base pairs in length. Regions are noted within which deletions and duplications have been recovered. The largest deletion (35 bp) and duplication (46 bp) that we have recovered are shown; these two mutations were determined to be spontaneous in origin and were obtained in a previous study (Levine et al. 1994b). The motifs (regions of conserved amino acid sequences) within the hisD protein are shown as open bars at the top of the figure; the filled bars represent the regions of nonconserved amino acid sequences, which, presumably, are somewhat dispensable in function. The hisD3052 target DNA sequence codes for a 25-amino-acid sequence within the 49-amino-acid, nonconserved region of the protein beginning at amino acid 271 and ending at amino acid 320.
Consideration of the mechanism proposed above must be reconciled with the observation that the spontaneous percentage and mutant frequency of the hotspot mutation increase from three to five times, respectively, from strain TA1978 (wild type) to TA98 (ΔuvrB, pKM101) (Table 1, Figure 2). The pKM101 plasmid results in an error-prone DNA polymerase that may be more processive through slipped-intermediate structures of the type proposed here, leading to an increased frequency of hotspot mutations relative to that in wild-type cells (Levine et al. 1994a,b). Most interestingly, the ΔuvrB allele greatly enhances the formation of the hotspot mutation, suggesting that the wild-type gene may play an important role in repairing such slipped-intermediate structures.
The recent observation that UvrAB proteins bind to bubble and loop regions in duplex DNA (Ahn and Grossman 1996), and bind at these regions in undamaged DNA with an affinity similar to that for damaged DNA, suggests that such proteins may bind to the types of slipped intermediates postulated here and may play a role in the repair of such structures. The disabled UvrAB system in TA1538 and TA98 and the potential inability to repair looped-out regions may account for the increased frequency of hotspot mutations in these strains relative to wild-type strains. This effect of the ΔuvrB allele, combined with the pKM101 plasmid, may explain the high frequency (41.0%) of spontaneous hotspot mutations in strain TA98 (ΔuvrB, pKM101).
Frameshift hotspots in the rIIB gene of bacteriophage T4 (Ripley et al. 1986; Ripley 1990) and in the lacI gene of Escherichia coli (Farabaugh et al. 1978; Halliday and Glickman 1991) are also consistent with the correct-incorporation/slippage model, as are experimental studies with plasmids containing CpG repeats (Bichara et al. 1995) or studies on the disruption of dyad repeats (Petes et al. 1997). In addition, there are some parallels between the effects of the ΔuvrB allele and the pKM101 plasmid on the spontaneous occurrence of the lacI and hisD3052 hotspots (Halliday and Glickman 1991; Table 1). For example, the ΔuvrB allele increased the mutant frequency of both hotspots approximately two times relative to wild type. However, it decreased the percentage of the lacI hotspot but increased by approximately two times the percentage of the hisD3052 hotspot. Likewise, the pKM101 plasmid increased modestly (approximately 1.1 to 1.6 times) the mutant frequencies of both hotspots relative to wild type. However, it decreased the percentage of the lacI hotspot but increased modestly (1.8 times) the percentage of the hisD3052 hotspot.
Figure 4. A concerted (templated) mutation involving the deletion of 8 bases and their replacement by 3 bases that are templated from sequence that is 27–30 bases away by the formation of a quasi-palindrome. Recovery of this mutation twice in our study and by Cebula (1995) supports the view that this mutation is templated.
In addition to the correct-incorporation/slippage model, another mutational mechanism that can be considered involves the potential of the repeated GpC motif to form Z-DNA. DNAs that can assume Z-DNA conformation are often subject to high frequencies of deletion (Freund et al. 1989; Jaworski et al. 1989; Bichara et al. 1995). This mechanism awaits further evaluation as a basis for the hisD3052 hotspot mutation.
Regardless of whether correct-incorporation/slippage or Z-DNA is responsible for the occurrence of the spontaneous hotspot mutation, the data suggest that the mechanism responsible for this 2-base deletion is different from that which causes 5- or 8-base deletions within the hotspot region. Although the correct-incorporation/slippage model and possibly Z-DNA could account for 5- and 8-base deletions within this reiterated region, the recovery of such mutations within this region was low and similar (1–2%) among all four strains. In contrast, the percentage of the 2-base deletion within the hotspot region was high and increased across strains, from 13.8% in strain TA1978 (wild type) to 41.0% in strain TA98 (ΔuvrB, pKM101). As we argue below, the correct-incorporation/slippage model described above for the hotspot mutation may account for fewer than one-fourth of the other 2-base deletions or larger deletions and duplications in the spontaneous mutation spectra of the hisD3052 allele.
A GenBank search showed that the hotspot sequence CGCGCGCG is present in a variety of genes in a variety of organisms. In humans, it is present in the following genes: oncogenes (ABL, C-JUN, L-MYC, R-RAS, RB); cell-cycle genes (CDC25, Cyclin D1); homeodomain genes (EN1, HOX2.2, HOXD3, HOXD7); and DNA maintenance genes (hTOP1, hTR, POLα, XRCC1, XRCC2). It is present among ~6% of the promoter or upstream regions, ~1% of coding regions, but only 0.1% of non-coding (intronic) regions of human genes. Thus, this sequence appears to be conserved especially in regulatory regions of the human genome. Whether this sequence is also a hotspot for mutation in these genes is not yet known. However, recent studies have found that dinucleotide repeats of various types are frequent hotspots for both base-substitution and frameshift mutations associated with a variety of diseases in humans (Casimir et al. 1991; Antequera and Bird 1993; Sutherland and Richards 1995; Rodenhiser et al. 1997).
Table 2. Spontaneous insertion mutations at the hisD3052 allele and their associated repeat (template)
Complex mutations: Nearly all of the spontaneous complex frameshifts can be explained by one of two models: misincorporation/slippage or concerted (templated) mutagenesis. As we have described previously, modification of the correct-incorporation/slippage model to involve misincorporation followed by slippage can explain most mutagen-induced complex frameshifts identified at the hisD3052 allele (DeMarini et al. 1994, 1995a,b, 1996; Levine et al. 1994b). This model can also explain spontaneous complex frameshifts based on the fact that DNA polymerases incorporate the incorrect nucleotide at some defined frequency.
The complex frameshifts involving a 2-base deletion at the hotspot or a 1-base duplication at the TGA stop codon were recovered only in strain TA98 (ΔuvrB, pKM101), whereas the cryptic complex frameshifts (apparent 2-base deletions of the CC or GG flanking the hotspot) were recovered in all strains except UTH8413 (pKM101), although most occurred in TA98. Thus, neither the ΔuvrB allele nor the pKM101 plasmid appeared to be required for the production of the cryptic complex frameshifts; however, both factors combined enhanced the production of cryptic complex frameshifts. Perhaps most notable is the finding that both factors appeared to be required for the spontaneous production of the other complex frameshifts (i.e., the 2-base deletion complex frameshifts at the hotspot and the 1-base duplication complex frameshifts at the stop codon). Thus, we view these complex frameshifts to be specific to strain TA98.
These last two types of complex frameshifts occurred within a repeated sequence, either a 2-base repeat at the hotspot or a monotonic run of three Cs at the stop codon. These are the two sites within the hisD3052 target with the most repetitive sequence, and, thus, they are the two sites within the target that would provide the maximum opportunity for slippage, which is required for the misincorporation/slippage model. Considering that the UvrAB proteins can bind to loops and bubbles (Ahn and Grossman 1996) and may be involved in the repair of such slipped-intermediate structures, the absence of this function in ΔuvrB cells may permit such structures to persist during replication. This, coupled with (1) the translesion synthesis ability provided by the pKM101 plasmid that may permit replication across such structures and (2) a high frequency of misincorporation, may account for the almost exclusive production of these complex frameshifts in strain TA98 (ΔuvrB, pKM101).
In the absence of the plasmid in strain TA1978 and TA1538, the production of the cryptic complex frameshifts may be mediated by SOS functions that are present at a low level in pKM101-minus strains of Salmonella (Eisenstadt 1987; Smith and Eisenstadt 1989; Smithet al. 1990; Nohmiet al. 1991; Woodgateet al. 1991). The pKM101 plasmid contains the mucAB genes, which appear to be at least partial functional analogues of the E. coli umuDC genes, which participate in the SOS response in E. coli (Walker 1984; Blancoet al. 1986). Consistent with its role in the production of complex frameshifts at the hotspot and stop codon, the pKM101 plasmid has been shown to enhance primarily the frequency of base substitutions (Fowleret al. 1979; Gordonet al. 1993). Also, recent studies have demonstrated the relevance of the plasmid and its translesion function to eukaryotic DNA polymerases (Littleet al. 1989; Odaet al. 1996).
Although the misincorporation/slippage model explains these complex mutations, is does not explain the few remaining complex mutations in the spontaneous mutation spectra. Instead, these mutations appear to represent a category of complex frameshifts called concerted or templated mutations (Ripley 1990, 1991). These complex mutations appear to be independent of the DNA repair backgrounds studied here and are characterized by their repeated recovery, despite the fact that they contain complex changes. Their formation can generally be explained by the presence of complementary sequence that templates the mutation.
Proposed mechanism for the production a 1-base insertion involving slippage between two quasi-repeated regions, followed by extension, realignment, extension, and replication. The insertion results in the conversion of one of the quasi-repeated sequences such that there are now two direct repeats.
Figure 4 illustrates the complex frameshift recovered twice (once in TA1978 and once in TA1538) that involves the deletion of 8 nucleotides at positions 900–907 and the replacement of the deleted sequence with the 3-base sequence TTC. A misalignment permitted by the self-complementarity of the palindrome postulates an intrastrand DNA misalignment to initiate the mutation, which can then be explained precisely by an intermolecular strand switch during DNA replication (Ripley 1990; Roscheet al. 1997). Thus, the DNA polymerase switches strands just prior to the deleted 8 bases and uses the AAG on the other side of the hairpin to template the insertion of the TTC.
The multiple recovery of this complex mutation by us and others (Cebula 1995) and the potential of this quasi-palindrome to template the mutation suggest strongly that this mutation is a concerted or templated complex frameshift. The recovery of this complex frameshift in wild-type and ΔuvrB strains suggests that the nucleotide excision repair system did not influence the production of this class of mutation. The similar concerted mutation in the same region in TA98 likely results from different enzymatic processing of this quasi-palindromic region. Other complex mutations have been described at the hisD3052 allele that also appear to be templated and may involve fold-back mechanisms (Kupchella and Cebula 1991; Cebula 1995).
One-base insertions: In this study, 1-base insertions were defined as mutations in which the inserted nucleotide was different from either of its flanking neighbors. In contrast, 1-base duplications were those mutations in which the inserted nucleotide was identical to at least one of its flanking neighbors. One-base insertions in the hisD3052 allele were rare in all four strains studied here, occurring at only 0.6–1.9% or 0.1–0.3 rev/10^8 survivors. Although the numbers were limited, it appeared that the addition of either the ΔuvrB allele or the pKM101 plasmid had little, if any, influence on the frequency of this class of mutation. As shown in Figure 1 and illustrated in Table 2, six unique insertions were recovered among all the strains, and one of these was recovered twice.
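The distinction drawn above between 1-base insertions and 1-base duplications amounts to a simple rule on the flanking bases. As a minimal sketch of that rule (the function name, the 0-based insertion-point convention, and the example sequence are illustrative assumptions, not part of the original analysis), it could be expressed as:

```python
def classify_one_base_addition(seq, pos, base):
    """Classify a single base added between seq[pos - 1] and seq[pos].

    Follows the definition in the text: an added base identical to at least
    one flanking neighbor is a 'duplication'; otherwise it is an 'insertion'.
    `pos` is a 0-based insertion point (an assumed convention).
    """
    if not 0 <= pos <= len(seq):
        raise ValueError("insertion point lies outside the sequence")
    neighbors = []
    if pos > 0:
        neighbors.append(seq[pos - 1])  # base on the 5' side
    if pos < len(seq):
        neighbors.append(seq[pos])      # base on the 3' side
    return "duplication" if base in neighbors else "insertion"


# Hypothetical sequence, for illustration only (not the hisD3052 target).
example = "ACGT"
print(classify_one_base_addition(example, 2, "C"))  # added C lies next to a C
print(classify_one_base_addition(example, 2, "T"))  # added T lies between C and G
```

Under this rule, an added base is scored only against its immediate neighbors, so a templated insertion that copies a more distant sequence is still classified as an insertion.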
Similar to the concerted (templated) mutations described above, all six of the 1-base insertions could be modeled as sequence-directed mutations, i.e., mutations templated by a nearby sequence. The lack of influence of either the ΔuvrB allele or the pKM101 plasmid on these mutations would be consistent with the lack of influence of these genetic factors on the concerted mutations described earlier. In all six mutants, the insertion produced a sequence motif that was a direct repeat of a sequence 1–52 nucleotides away (Table 2). A possible mechanism for the insertion of a T in strain TA1538 is shown in Figure 5 in which slippage and realignment occur between the repeated regions. Concerted (templated) 1-base insertions have been found in other organisms and were explicable by similar mechanisms (Schaaper et al. 1986; Ripley 1990; Fieldhouse and Golding 1991).
The last mutation listed in Table 2 was categorized as a complex mutation because it contained more than one change (an insertion of a T and a C → A base substitution). However, the production of this mutation may also have been templated by a mechanism similar to that for the 1-base insertions because the mutation created a repeat of an adjacent sequence. We note that 1-base insertions were the rarest category of mutation class recovered among the spontaneous revertants of the hisD3052 allele. Consideration of the complexity of the possible mechanism required to produce such mutations (Figure 5) may account partially for the rare occurrence of this class of mutation. In this regard, it is interesting to note that the 1-base insertion that was recovered twice had the shortest distance (1 nucleotide) between the inserted base and the base that served as its probable template. This suggests that the closer such quasi-repeated sequences, the higher the probability that such templated mutations may occur.
One-base duplications: In contrast to 1-base insertions, 1-base duplications are those mutations in which a nucleotide has, apparently, been duplicated. The 1-base duplications comprised an important portion of the spontaneous mutation spectrum at the hisD3052 allele, being the most prevalent class of mutation in the uvr+ strains (even more prevalent than the hotspot mutation), and being the third most prevalent mutation in the ΔuvrB strains (following the hotspot and 2-base deletions). In the uvr+ strains, 1-base duplications accounted for ~30% of all spontaneous mutations at the hisD3052 allele; however, they accounted for only 14.4% in TA1538 (ΔuvrB) and only 6.4% in TA98 (ΔuvrB, pKM101). The ΔuvrB allele reduced by one-half the frequency of this mutation class relative to wild-type; in a ΔuvrB background, the pKM101 plasmid reduced the frequency by one-half again, implying an important influence of the plasmid-coded proteins on the mechanism underlying the production of this mutation class in a ΔuvrB background.
Classical misalignment/slippage models would predict that simple 1-base duplications might occur preferentially at monotonic runs of nucleotides (Streisinger et al. 1966; Streisinger and Owen 1985). An examination of the association between 1-base duplications and monotonic runs indicated that, indeed, ~80–88% of these mutations were at such sites in uvr+ strains, but only ~60% were at such sites in the ΔuvrB strains. Thus, the UvrABC system not only permitted the production of a high frequency of 1-base duplications, but also permitted their production preferentially at sites of monotonic runs. In contrast, the absence of this system reduced not only the frequency of 1-base duplications but also the proportion that were at monotonic runs. With the exception of strain UTH8413 (pKM101), the percentage of sites of 1-base duplications that were at monotonic runs was 53–58%; this value was 80% for UTH8413. Thus, in general, almost half the sites at which 1-base duplications occurred were not monotonic runs.
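Scoring duplications against monotonic runs first requires locating those runs in the target sequence. The following is a minimal sketch of one way to do so; the function name, the 0-based coordinates, and the example sequence are assumptions made for illustration and do not reproduce the hisD3052 target or the authors' procedure.

```python
from itertools import groupby


def monotonic_runs(seq, min_len=2):
    """Return (start, base, length) for each run of identical bases.

    `start` is a 0-based position (an assumed convention); runs shorter
    than `min_len` are skipped.
    """
    runs = []
    pos = 0
    for base, group in groupby(seq):
        length = len(list(group))
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


# Hypothetical sequence, for illustration only.
print(monotonic_runs("ACCCGGT"))
print(monotonic_runs("ACCCGGT", min_len=3))
```

A duplication can then be counted as occurring at a monotonic run when the duplicated base falls within one of the returned intervals.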
Using a method of analysis developed previously for 1-base frameshifts in bacteriophage T4 (Ripley et al. 1986), we determined that the frequencies of 1-base duplications at the hisD3052 allele ranged from a low of 0.08 rev/site in UTH8413 (pKM101) for nonrepeated bases (sites of size 1) to a high of 4 rev/site in TA98 (ΔuvrB, pKM101) for a monotonic run of 3 bases, which was a 50-fold difference. Because the mutant frequencies in monotonic runs represented the total number of mutants produced at each of the bases within the site, mutant frequencies were compared by normalizing them to the number of bases within the sites (rev/base), rather than comparing the frequencies of rev/site. This normalization showed that in uvr+ strains, the frequencies of 1-base duplications were low (0.42 and 0.08 rev/base for TA1978 and UTH8413, respectively) at sites of size 1 but increased four to seven times (1.91 and 0.59 rev/base) at sites of size 2; the frequencies dropped slightly at the site of size 3. In contrast, the frequencies remained the same at all sites regardless of size in strain TA1538 (ΔuvrB) and also at sites of size 1 and 2 in strain TA98 (ΔuvrB, pKM101); however, the frequency increased substantially (4.9 times) at the site of size 3 in TA98.
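The normalization described above is a division of the frequency at a site by the number of bases in that site, and the quoted fold differences follow from ratios of the normalized values. The sketch below reworks that arithmetic using only the values quoted in this paragraph; the function names are illustrative assumptions, and the calculation is not the authors' statistical program.

```python
def rev_per_base(rev_per_site, run_length):
    """Normalize a per-site mutant frequency to the number of bases in the site."""
    if run_length < 1:
        raise ValueError("run_length must be at least 1")
    return rev_per_site / run_length


def fold_change(baseline, value):
    """Ratio of a frequency to a baseline frequency."""
    return value / baseline


# rev/base values quoted in the text for uvr+ strains: (sites of size 1, size 2).
quoted = {"TA1978": (0.42, 1.91), "UTH8413": (0.08, 0.59)}
for strain, (size1, size2) in quoted.items():
    print(strain, round(fold_change(size1, size2), 1))

# Extremes quoted as rev/site: 0.08 (UTH8413, size 1) and 4 (TA98, run of 3).
print(round(fold_change(0.08, 4.0)))
print(round(rev_per_base(4.0, 3), 2))
```

The two ratios for the uvr+ strains fall within the "four to seven times" range stated above, and the ratio of the extremes reproduces the 50-fold difference.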
Based on the above analysis, the following percentages of spontaneous 1-base duplications at the hisD3052 allele can be ascribed to slippage: 80.4% (45/56) for TA1978, 87.5% (14/16) for UTH8413, 0% (0/30) for TA1538, and 23.5% (4/17) for TA98. Thus, the vast majority of 1-base duplications in uvr+ strains may involve slippage at monotonic runs; however, almost none of the 1-base duplications in a ΔuvrB strain may involve slippage at such runs. The addition of the pKM101 plasmid in TA98 permitted the production of approximately one-fourth of the 1-base duplications to be ascribed to slippage in that strain.
For 1-base duplications, 48% of the sites and 82.8% of the mutations were shared between strain TA1978 (wild type) and TA1538 (ΔuvrB), suggesting that the elimination of the nucleotide excision repair system caused a shift in the sites at which 1-base duplications arose spontaneously. Likewise, in a ΔuvrB background, 34.8% of the sites and 58.3% of the 1-base duplications were shared between strain TA1538 (ΔuvrB) and TA98 (ΔuvrB, pKM101), suggesting that the addition of the pKM101 plasmid also caused a shift in the site at which these mutations occurred as well as in the number of mutations at these sites.
Studies in other organisms, such as bacteriophage T4, have shown that 1-base frameshifts at monotonic runs of 2 bases are not more frequent than at sites containing two different bases that cannot misalign (Ripley et al. 1986). The present study indicates that 1-base duplications occur more frequently in uvr+ cells than in ΔuvrB cells and that slippage at monotonic repeats may play an important role for these mutations in uvr+ cells but little role in ΔuvrB cells. Although the reasons for this are unclear, one possibility is that in uvr+ cells, slippage at monotonic repeats results in 1-base loops that may be repaired by proofreading or mismatch repair enzymes (Tran et al. 1996, 1997; Greene and Jinks-Robertson 1997). However, there is some suggestion that the UvrABC system in uvr+ cells may interfere with this process by binding at such sites and excising nucleotides from either strand (Huang et al. 1994) or not at all (Ahn and Grossman 1996). This may account for the increased frequency of 1-base duplications in uvr+ cells and at primarily monotonic runs. In contrast, the absence of the UvrABC system in ΔuvrB cells may permit other repair systems to operate efficiently on 1-base loops or mismatches, resulting in a lower frequency of 1-base duplications and with most not occurring at repeats in ΔuvrB cells. For example, mismatch repair of quasi-palindromes has been proposed previously for mutations of this type in the lacI gene of E. coli (Schaaper et al. 1986).
Nonhotspot, 2-base deletions: We have characterized nonhotspot, 2-base deletions as a distinct group among other, larger deletions because the hotspot mutation is also a 2-base deletion. Thus, we have asked if there are any general features or mechanisms that could be ascribed to this subset of deletions, such as the correct-incorporation/slippage model used to explain the hotspot deletions. Deletions in general were more frequent in the ΔuvrB strains than in the uvr+ strains (Table 1, Figure 2), and the frequency of nonhotspot, 2-base deletions also was influenced by the UvrAB system. This mutation arose at a percentage of ~8–10% in all the strains except for TA1538 (ΔuvrB), where it was ~16%. This increased percentage of 2-base deletions may indicate that the UvrAB system suppresses the formation of this class of mutation, perhaps by repairing a looped-out region; and in the absence of this system, the percentage of 2-base deletions increases. The addition of the pKM101 plasmid in TA98 reduced the percentage to wild-type levels.
In considering a role for the correct-incorporation/slippage model, we determined the frequency of nonhotspot, 2-base deletions at repeats, which would be the type of DNA sequence necessary to permit slippage and allow application of the model. There are two sites of tandem repeats (ACAC and GCGC) and 1 site of a monotonic repeat (CCC) within the hisD3052 target; spontaneous, 2-base, nonhotspot mutations occurred at all three sites (Table 3). All four strains gave rise to 2-base deletions at the ACAC site; however, most were recovered in strain TA1978 (wild type). In contrast, only strain TA1538 (ΔuvrB) gave rise to two, 2-base deletions at the other tandem repeat (GCGC). Although misalignments within monotonic runs of at least 3 bases could, theoretically, give rise to 2-base deletions (or duplications in other systems), such mutations are generally not found within monotonic runs (Ripley 1990). Nonetheless, a total of five 2-base deletions were recovered at the CCC site in all the strains except TA98 (Table 3).
Table 3. Spontaneous nonhotspot, 2-base deletions with associated repeats
This analysis indicated that the frequency of nonhotspot, 2-base deletions at either tandem or monotonic repeats was 27.8% (5/18) in TA1978 (wild type), 50% (2/4) in UTH8413 (pKM101), 26.5% (9/34) in TA1538 (ΔuvrB), and 8.3% (2/24) in TA98 (ΔuvrB, pKM101). Although deletions in general and 2-base deletions as a class increased in a ΔuvrB background compared to a wild-type background (see above), the frequency of nonhotspot, 2-base deletions at repeats was not influenced by ΔuvrB. The pKM101 plasmid reduced the frequency of this class of mutation in strain TA98; the data in strain UTH8413 may be too limited to be meaningful. For all strains combined, the percentage of nonhotspot, 2-base deletions at repeated sequences relative to the total number of nonhotspot, 2-base deletions was only 22.5% (18/80). Thus, if slippage is invoked to explain this specialized class of 2-base deletions (and it cannot explain any of the remaining nonhotspot, 2-base deletions), then slippage accounts for less than one-fourth of all of the nonhotspot, 2-base deletions in the spontaneous mutation spectra of the hisD3052 allele.
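The per-strain and pooled percentages quoted above can be rechecked directly from the counts given in the text. The following sketch does only that arithmetic; the variable and function names are illustrative assumptions.

```python
# (2-base deletions at repeats, all nonhotspot 2-base deletions), as quoted above.
counts = {
    "TA1978": (5, 18),
    "UTH8413": (2, 4),
    "TA1538": (9, 34),
    "TA98": (2, 24),
}


def percent(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


per_strain = {s: round(percent(a, n), 1) for s, (a, n) in counts.items()}
at_repeats = sum(a for a, _ in counts.values())
total = sum(n for _, n in counts.values())
pooled = round(percent(at_repeats, total), 1)

print(per_strain)
print(at_repeats, total, pooled)
```

Pooling the counts before dividing, rather than averaging the four percentages, weights each strain by the number of mutants sequenced, which is why the small UTH8413 sample has little effect on the combined value.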
Of the two sites at which 2-base deletions were recovered in all four strains, one is the ACAC tandem repeat at site 887–890. The other tandem repeat, GCGC at site 907–910, was recovered multiple times but only in strain TA1538. It is interesting that the monotonic repeat CCC (site 901–903) was recovered in all strains except TA98; instead, the DNA repair background of TA98 processed this site preferentially into 1-base duplications and complex frameshifts involving 1-base duplications (see above). An analysis of the percentage of shared sites and the number of shared mutations at a site revealed that the ΔuvrB allele shifted the site specificity relative to wild type, such that only 44.4% of the nonhotspot, 2-base deletions occurred at the same sites in strains TA1978 and TA1538. The addition of the pKM101 plasmid to a ΔuvrB background resulted in an even lower percentage (29.2%) of shared sites. These results implicate a role for both genetic factors in the frequency and site specificity of spontaneous, nonhotspot, 2-base deletions at the hisD3052 allele.
There are at least two general mechanisms by which these mutations could be modeled. Those at repeats could be modeled by simple slippage mechanisms of the type described previously. In addition, some of these, as well as other nonhotspot, 2-base deletions could be modeled as looped-out regions in quasi-palindromic hairpins that could be processed by mismatch repair.
Duplications ≥4 bases in length: As discussed previously, duplications were the predominant class of mutation in the uvr+ strains, whereas deletions were the predominant class of mutation in the ΔuvrB strains. We are unaware of a precedent for this observation, and further studies will be required to examine the role of the nucleotide excision repair system in this process. Considering the frequency of duplications ≥4 bases in length, such mutations accounted for 22.7% (41/181) and 15.1% (8/53) of the mutations in TA1978 and UTH8413, respectively. In contrast, they accounted for only 3.8% (8/209) and 2.6% (7/266) of the mutations in TA1538 and TA98, respectively. Thus, this class of mutation occurred six times more frequently in uvr+ strains than in ΔuvrB strains. The pKM101 plasmid reduced the frequency of this class of mutation in both nucleotide excision-repair backgrounds. Therefore, both DNA repair systems influence the frequency with which large duplications arise spontaneously at the hisD3052 allele.
Large duplications were recovered among all four strains at only two sites: 852–855 and 895–898. These two sites are at the two ends of the target, and the mutations recovered were primarily 4-base duplications. With the exception of duplications at these two sites, none of the other duplication sites were shared between the uvr+ and ΔuvrB strains. In other words, the remaining duplications in the uvr+ strains do not occur at the same sites as those in the ΔuvrB strains.
This extreme site specificity indicates that, for these duplications, there is a shift in the site at which they are produced that is associated with the presence or absence of the nucleotide excision repair system. Only 17.6% of duplications ≥4 bases in length occurred at the same sites between strain TA1978 (wild type) and TA1538 (ΔuvrB); likewise, only 22.2% of these mutations occurred at the same sites when the pKM101 plasmid was added to the ΔuvrB background.
Figure 6. Frequency of size classes of spontaneous (A) duplications and (B) deletions at the hisD3052 allele. The hotspot deletion is not included in the deletion category; data are from Table 1.
An analysis of the frequency with which duplications of various sizes were recovered showed that 1-, 4-, and 7-base duplications occurred approximately twice as frequently in uvr+ strains compared to ΔuvrB strains (Figure 6A). A similar pattern likely occurred for larger duplications, but the low number of such duplications in the spectra prevents the demonstration of this observation for larger duplications. The frequency of duplications decreased in parallel with the size of the duplications.
Previous studies have demonstrated a critical role for repeats, quasi-palindromes, and misalignments in the production of duplications (Ripley 1990). Consideration of the percentage of duplications that had precisely aligned direct repeats just within one end and just outside the other end of the duplication showed that ~25–30% of the duplications in TA1978, UTH8413, and TA98 had no such repeats; however, for TA1538 (ΔuvrB), this percentage rose to 45% (Figure 7A). (Percentages were calculated by using the total mutant frequencies for duplications in Table 1 as the denominator and the values in Figure 7A as the numerator.) In contrast, ~45–50% of the duplications in all strains except TA98 had 2-base flanking repeats, whereas this percentage was only ~20% in TA98 (Figure 7A). Finally, ~15–20% of the duplications in all strains except TA98 had 3-base flanking repeats, but this percentage was ~45% in TA98 (Figure 7A). Only a few duplications were recovered with 4-base flanking repeats, and those were in TA1978 (wild type).
Figure 7. Frequency of spontaneous (A) duplications and (B) deletions at the hisD3052 allele that are flanked by precisely aligned direct repeats of various length. The hotspot deletion is not included in the deletion category.
These observations have implications for the mechanism by which duplications ≥4 bases in length arise spontaneously at the hisD3052 allele. Perhaps the binding of the UvrAB proteins to a looped-out region on the replicating strand, which is the type of loop-out required to produce duplications by a simple slippage mechanism, promotes duplications by binding at such sites and maintaining their secondary structure for a transient period while replication proceeds. Then, the proteins dissociate from the DNA, the repeat slips back to its original (correct) position, and the repeated region is replicated again, resulting in the duplication. The ability of UvrAB proteins to bind to loops and bubbles and not to cut undamaged DNA (Ahn and Grossman 1996) would be consistent with this proposal.
The most common duplication ≥4 bases in length (a 4-base duplication at site 895–898 that was recovered in all four strains a total of 24 times) may be explainable by simple slippage. The duplication (GGCA) is flanked by a 3-base direct repeat of GGC (CGCC GGCA GGCC). Slippage of the replicating strand after replication of the duplicated region, followed by extension, would produce the duplication. This model is illustrated in Figure 8 for another duplication involving this type of slippage, but one in which such slippage apparently occurred twice on two sets of repeats to produce the 13-base duplication at site 895–898/890–898. Various factors may influence palindrome formation and stability (Davison and Leach 1994), including the presence of inverted repeats between the direct repeats, which may promote deletions by stabilizing the palindrome but may inhibit duplications by making the denaturation of the palindrome less likely (Trinh and Sinden 1993).
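The flanking-repeat criterion used in this section (a direct repeat lying just within one end of the duplicated or deleted segment and just outside the other end) can be checked mechanically. The sketch below applies it to the 12-base excerpt quoted above (CGCC GGCA GGCC); the function names and the 0-based coordinates within that excerpt are assumptions made for illustration, and the code is not the procedure used to construct Figure 7.

```python
def flanking_repeat_length(seq, start, end):
    """Longest direct repeat just inside one end of seq[start:end]
    and just outside the other end (0-based, end-exclusive)."""
    best = 0
    for k in range(1, end - start + 1):
        # First k bases of the segment repeated immediately after it.
        right = end + k <= len(seq) and seq[start:start + k] == seq[end:end + k]
        # Last k bases of the segment repeated immediately before it.
        left = start - k >= 0 and seq[end - k:end] == seq[start - k:start]
        if right or left:
            best = k
    return best


def duplicate(seq, start, end):
    """Return seq with the segment seq[start:end] tandemly duplicated."""
    return seq[:end] + seq[start:end] + seq[end:]


excerpt = "CGCCGGCAGGCC"  # excerpt quoted in the text; GGCA occupies 4..8
print(flanking_repeat_length(excerpt, 4, 8))
print(duplicate(excerpt, 4, 8))
```

For this excerpt the function reports the 3-base GGC repeat described in the text, and the duplicated product contains GGCA twice in tandem.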
Figure 8. Role of two slippage events on the replicating strand to produce a complex duplication.
Deletions ≥5 bases in length: Deletions were the most frequent class of mutation in the ΔuvrB strains and occurred less frequently in uvr+ strains, where duplications were predominant (Table 1, Figure 2). The mutant frequency for deletions was also highest in TA98, which indicated that the pKM101 plasmid also influenced this class of mutation, although not as profoundly as the nucleotide excision repair system. Considering the frequency of deletions ≥5 bases in length, such mutations accounted for 21.1% (38/181) and 18.9% (10/53) of the mutations in TA1978 and UTH8413, respectively. In contrast, they accounted for 30.6% (64/209) and 30.1% (80/266) of the mutations in TA1538 and TA98, respectively. Thus, the frequency of this class of deletion was also enhanced in a ΔuvrB background. In contrast to its effect on duplication frequencies, the pKM101 plasmid appeared to have little effect on the frequency of deletions. Approximately 40% of deletions ≥5 bases in length occurred at the same sites in strains TA1978 vs. TA1538 and in TA1538 vs. TA98. Although this is a rather low percentage of shared sites, it is twice as high as that for duplications between these strains.
An analysis of the frequency with which deletions of various sizes were recovered showed that 2-base deletions, as discussed previously, were especially enhanced in TA1538 by the ΔuvrB allele. With few exceptions, however, the other sizes of deletions appeared to decrease in frequency in parallel with increasing length of deletion regardless of DNA repair background. One notable exception was that of 11-base deletions. This class of mutation occurred from three to four times more frequently in the ΔuvrB strains compared to the uvr+ strains (Figure 6B). In fact, this class of deletion accounted for ~10% of all mutations in the ΔuvrB strains, compared to 0–3% in the uvr+ strains. Furthermore, 11-base deletions accounted for approximately one-fourth of all nonhotspot deletions in the ΔuvrB strains, compared to 0–10% in the uvr+ strains. The addition of the pKM101 plasmid had little influence on the frequency of this class of mutation in the ΔuvrB cells, with 11-base deletions accounting for 24.5% (24/98) in TA1538 and 23.0% (24/104) in TA98 of the nonhotspot deletions. The increased frequency of 11-base deletions in ΔuvrB cells is interesting because the uvr repair system excises an 11–13-base segment of DNA, which would encompass enough nucleotides to account for this class of mutation (Huang et al. 1994). Thus, perhaps this repair system keeps such mutations at a minimal level in uvr+ strains by repairing 11-base loop-outs, for example.
As with duplications, previous studies have identified a critical role for repeats, quasi-palindromes, and misalignments in the production of deletions (Ripley 1990). In addition, a variety of other factors, such as sequence context, repair proteins, sister-strand exchange, and the integrity of the palindrome, have been shown to influence deletion formation (Weston-Hafer and Berg 1989; Kazic and Berg 1990; Sinden et al. 1991; Lovett et al. 1993; Schaaper 1993; Santos-Rosa et al. 1996; Saveson and Lovett 1997). Consideration of the percentage of deletions that had precisely aligned direct repeats just within one end and just outside the other end of the deletion showed that ~70–80% of the deletions in the hisD3052 allele had no such repeats (Figure 7B). Most of the remainder had 2-base, flanking repeats, and only a few percent had flanking repeats containing 3–4 repeated bases (Figure 7B). Thus, direct repeats appear to be associated far less frequently with deletion termini than they are with duplication termini, where the majority of the mutations are associated with direct repeats (Figure 7). This implies something fundamentally different about the mechanisms underlying these two classes of frameshift mutations. Although a detailed examination of potential mechanisms for the many deletions in the spectra is not presented here, an example of how slippage between two direct repeats might lead to a 14-base deletion is shown in Figure 9.
Conclusions: This study has identified the target for reversion of the hisD3052 allele to be a minimum of 76 bases in length, spanning nucleotides 843–918. This region codes for 25 amino acids that reside within a 49-amino-acid sequence that is a nonconserved region within histidinol dehydrogenases. This region is likely a linking region between two motifs and, to some extent, is dispensable. Nearly all (96.1–99.6%) of the spontaneous revertants (as well as induced revertants) of this allele were double mutants and, thus, likely have less than wild-type activity.
There were five general classes of mutations that arose spontaneously at the hisD3052 allele: a 2-base deletion hotspot, complex frameshifts, insertions, duplications, and deletions. The effect of DNA repair on the mutant frequency of these classes of mutations is summarized in Table 4, which shows that the ΔuvrB allele had a similar effect on either a wild-type or a pKM101 background, i.e., it caused an increase in the frequency of hotspot deletions as well as nonhotspot deletions, but it caused a decrease in the frequency of duplications. The effect of the pKM101 plasmid was also similar on either a wild-type or a ΔuvrB background, i.e., it increased the percentage of hotspot mutations. Both the ΔuvrB allele and the pKM101 plasmid were required for the production of complex mutations involving misincorporation/slippage at the hotspot and at the TGA stop codon.
The inducibility, site specificity, and possible mechanisms associated with the five general classes of spontaneous mutations are summarized in Table 5. Both the hotspot mutation and complex mutations involving misincorporation/slippage at the hotspot or the TGA stop codon have been shown repeatedly to be inducible by various mutagens (Cebula and Koch 1990; Kupchella and Cebula 1991; DeMarini et al. 1992, 1993, 1994, 1995a,b, 1996; Levine et al. 1994a,b; Wallace and Josephy 1994; Cebula 1995; Shelton and DeMarini 1995; DeMarini 1998). With the exceptions noted in Table 5, these are the only two classes of spontaneous mutations out of the five that are generally inducible by mutagens. The concerted complex mutations and insertions have never been demonstrated to be inducible by any mutagenic treatment and, thus, arise by mechanisms specific to spontaneous mutagenesis.
Figure 9. Possible mechanisms for a 14-base deletion at sites 888–911 or 989–912 or 990–913 involving precisely aligned, direct repeats.
A summary of some of the possible mechanisms by which the various mutation classes arise spontaneously (Table 5) indicates that with only one exception, these mutations are likely templated. The one exception was the base substitution associated with each complex mutation at the hotspot and the TGA stop codon in TA98, which appears to be an error of incorporation by DNA polymerase. Although the role of oxidative damage has been investigated as a causative factor in spontaneous mutagenesis at the hisD3052 allele (Mortelmans and Cox 1992), oxidative mutagens, such as hydrogen peroxide, do not revert the hisD3052 allele (Abu-Shakra and Zeiger 1990). Additional evidence also suggests that oxidative damage may not be involved in the spontaneous production of large deletions (Oller and Thilly 1992). A recent review (Josephy et al. 1997) suggests that unidentified aromatic amines or nitroarenes present in nutrient broth or produced endogenously might contribute to the spontaneous reversion of the hisD3052 allele. This interpretation is based on altered spontaneous revertant yields in strains that either overproduce acetyltransferase or are deficient in nitroreductase.
The mutational mechanisms described herein may have considerable relevance to spontaneous mutagenesis in other organisms, including humans. Slippage-misalignment mechanisms contribute considerably to mammalian cell mutagenesis (Kimura et al. 1994) and account for many of the deletions and duplications found in the human tumor suppressor gene p53 in various types of cancers (Greenblatt et al. 1996) as well as in mitochondrial DNA (mtDNA) in various mtDNA-associated diseases (Madsen et al. 1993). Repeats and palindromes also may underlie the trinucleotide repeat-associated neurological diseases (Ashley and Warren 1995; Darlow and Leach 1995) as well as play an important role in aging (Strehler 1995). Combined with a previous study (Greenblatt et al. 1996), a recent assessment of the spectrum of somatic mutations in the p53 gene in various tumors indicated that most resulted from endogenous, spontaneous mechanisms and by selection (Krawczak et al. 1995).
Table 4. Effect of DNA repair on mutant frequency of classes of spontaneous mutations at the hisD3052 allele
Table 5. Inducibility, site specificity, and possible mechanisms of spontaneous mutations at the hisD3052 allele
The spontaneous mutations recovered at the hisD3052 allele illustrate the important role for direct and inverted repeats, which are distributed in abundance throughout the target sequence. Their potential to undergo slippage and misalignment and to form quasi-palindromes likely accounts for the diversity of 182 unique, spontaneous frameshifts identified in the present study. However, these structural features of the DNA sequence alone do not account for spontaneous mutagenesis as evidenced by the effect of DNA repair on the frequency and site specificity of some of the mutation classes. Thus, the challenge for future studies is to discern the nature of the interactions between DNA replication and repair proteins and these DNA sequences. As indicated in the quotation at the beginning of this report, our wits will indeed need to grow sharper to appreciate more fully the apparent magic underlying spontaneous mutagenesis.
Acknowledgments
We thank Thomas A. Cebula and colleagues, U.S. Food and Drug Administration, Washington, DC, for generously sharing with us their protocol for identifying the CG/GC hotspot deletion by colony probe hybridization prior to publication. We express our appreciation to W. Thomas Adams and Thomas R. Skopek for providing us with their statistical program. We thank Sara Page, Linda J. Hendee, B. Kay Lawrence, Carolyn F. Felton and Douglas A. Bell who contributed some of the DNA sequence data presented here. We thank Douglas A. Whitehouse for assembling the mutation spectra on the computer. We especially thank Dr. Lynn S. Ripley for her advice and insight during the preparation of this manuscript and for her contagious conviction that spontaneous mutagenesis can be understood. We also thank two anonymous reviewers for their thoughtful suggestions for improving the manuscript. This manuscript was prepared in part from a dissertation submitted by J.G.L. in partial fulfillment of the requirements for the Ph.D. degree at the University of North Carolina at Chapel Hill. This manuscript has been reviewed by the National Health and Environmental Effects Research Laboratory, U.S. Environmental Protection Agency, and approved for publication. Approval does not signify that the contents necessarily reflect the views and policies of the Agency, nor does mention of trade names or commercial products constitute endorsement or recommendation for use.
Footnotes
- An expanded version of this manuscript is also available.
- Communicating editor: P. L. Foster
- Received September 1, 1997.
- Accepted February 9, 1998.
- Copyright © 1998 by the Genetics Society of America