Maize Mu Transposons Are Targeted to the 5' Untranslated Region of the gl8 Gene and Sequences Flanking Mu Target-Site Duplications Exhibit Nonrandom Nucleotide Composition Throughout the Genome
Charles R. Dietrich (a,b), Feng Cui (b,c), Mark L. Packila (g), Jin Li (b,d), Daniel A. Ashlock (e), Basil J. Nikolau (f,h), and Patrick S. Schnable (a,b,c,g,d,h)
a Interdepartmental Plant Physiology Program, Iowa State University, Ames, Iowa 50011,
b Department of Zoology and Genetics, Iowa State University, Ames, Iowa 50011,
c Bioinformatics and Computational Biology Program, Iowa State University, Ames, Iowa 50011,
d Interdepartmental Genetics Program, Iowa State University, Ames, Iowa 50011,
e Department of Mathematics, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa 50011,
f Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa 50011,
g Department of Agronomy, Iowa State University, Ames, Iowa 50011,
h Center for Plant Genomics, Iowa State University, Ames, Iowa 50011
Corresponding author: Patrick S. Schnable, Iowa State University, Ames, IA 50011. E-mail: schnable@iastate.edu
Communicating editor: J. A. BIRCHLER
| ABSTRACT |
|---|
The widespread use of the maize Mutator (Mu) system to generate mutants exploits the preference of Mu transposons to insert into genic regions. However, little is known about the specificity of Mu insertions within genes. Analysis of 79 independently isolated Mu-induced alleles at the gl8 locus established that at least 75 contain Mu insertions. Analysis of the terminal inverted repeats (TIRs) of the inserted transposons defined three new Mu transposons: Mu10, Mu11, and Mu12. A large percentage (>80%) of the insertions are located in the 5' untranslated region (UTR) of the gl8 gene. Ten positions within the 5' UTR experienced multiple independent Mu insertions. Analyses of the nucleotide composition of the 9-bp target-site duplication (TSD) and the sequences directly flanking the TSD reveal that the nucleotide composition of Mu insertion sites differs dramatically from that of random DNA. In particular, the frequencies at which C's and G's are observed at positions -2 and +2 (relative to the TSD) are substantially higher than expected. Insertion sites of 315 RescueMu insertions displayed the same nonrandom nucleotide composition observed for the gl8-Mu alleles. Hence, this study provides strong evidence for the involvement of sequences flanking the TSD in Mu insertion-site selection.
ABOUT a dozen families of maize transposons have been identified (![]()
215-bp terminal inverted repeats (TIRs) that are highly conserved and are thought to be recognized by the MuDR-encoded transposase (![]()
Some transposons exhibit nonrandom patterns of insertion. For example, miniature inverted repeat transposable elements such as the Tourist and Stowaway described by ![]()
As expected, Mu insertions that are responsible for mutations are located in genes, usually in exons, but in some cases in noncoding regions (reviewed by ![]()
Despite the widespread use of Mu transposons for gene cloning, relatively little is known of Mu transposon insertion preference within genes. There is evidence to suggest Mu transposons insert nonrandomly within at least some genes (reviewed by ![]()
600-bp region around intron 1 (![]()
Although some evidence suggests that Mu transposons may insert at preferred sites within genes, to date this hypothesis has been tested only via the analysis of relatively few mutant alleles that were generated from multiple, and often unrelated, Mu stocks. It is therefore difficult to draw firm conclusions regarding the specificity of Mu insertions from the extant data. In the current study, each member of a large collection of Mu-induced glossy8 (gl8) alleles generated from genetically related Mu stocks was characterized by PCR amplification to determine whether the locus was disrupted by a Mu insertion. Subsequently, sequence analyses of the resulting PCR products established the exact Mu insertion sites in 75 of the 79 gl8-Mu alleles. These data demonstrate that Mu transposons have a strong preference for inserting into the 5' UTR of the gl8 gene. Analysis of sequences flanking the 9-bp TSD has revealed a highly significant conservation of nucleotide composition in the positions directly flanking the 9-bp TSD. Analysis of insertion sites from 315 RescueMu transposons demonstrated that the nonrandom nucleotide composition flanking the gl8-Mu insertion sites is not unique to insertions in the gl8 gene.
| MATERIALS AND METHODS |
|---|
Genetic stocks:
The Mu transposon stocks used to generate 75 of the Mu-tagged gl8 alleles have been described previously (![]()
The gl8 locus was originally defined by a spontaneous mutation (![]()
All inbred lines were maintained by selfing and/or sib mating. The inbred lines Q66 and Q67 (Schnable accession nos. 111 and 113, respectively) were originally obtained from A. Hallauer (Iowa State University). The inbred lines B77 and B79 (Schnable accession nos. 403 and 404, respectively) were originally obtained from D. Robertson (Iowa State University). The inbred line W64A (Schnable accession no. 142) was provided by D. Pring.
Isolation and sequencing of the gl8 genomic clones:
A B73 genomic library constructed in λ Dash II (Stratagene, La Jolla, CA) by J. Tossberg (Pioneer Hi-Bred International) was screened by DNA hybridization (![]()
purification steps, a 140-bp fragment isolated from the 5' end of the 1.4-kb apparent full-length gl8 cDNA clone pgl8 (![]()
λ1512-38, was isolated and determined to contain a 6.8-kb HindIII fragment containing the entire gl8 gene. This 6.8-kb HindIII fragment was sequenced at the Iowa State University Nucleic Acid Facility on an ABI 373A automated DNA sequencer (Applied Biosystems, Foster City, CA). Sequence analyses were performed using the Sequencher Version 3.0 software package (Gene Codes, Ann Arbor, MI). The 3.6-kb HindIII/SacI fragment from the 5' half of the gl8 genomic clone λ1512-38 was subcloned into pBSK to make the clone pgl83.6. Probe B (Fig 1) was obtained by PCR amplification of pgl83.6 with primers gl8a58 and gl8a51.
Isolation of genomic DNA:
F2 families segregating for gl8-Mu alleles were grown in greenhouse sand benches for 7 days, at which time individual glossy plants (gl8-Mu/gl8-Mu) were identified by the "water-beading" phenotype (![]()
For 11 of the gl8-Mu alleles derived from a directed-tagging experiment, F2 seed was not available. For these alleles, 10 individual seedlings from crosses such as cross 2 or cross 3 (see RESULTS) were pooled and DNA was extracted via the method of ![]()
Mu transposons:
TIR sequences for Mu1, Mu3, Mu4, Mu5, Mu7/rcy, Mu8, and MuDR were obtained from GenBank accession nos. X13019, U19613, X14224, X14225, X15872, X53604, and M76978, respectively. The left and right TIRs were defined as the TIRs that were listed first and second, respectively, in the appropriate GenBank entry. The TIRs of Mu2 were obtained from ![]()
PCR amplification of Mu-flanking regions:
PCR was performed using a primer in the conserved region of the Mu TIR, primer Mu-TIR (5' AGA GAA GCC AAC GCC A(AT)C GCC TC(CT) ATT TCG TC 3'), in combination with individual gl8-specific primers to amplify the gl8 sequences flanking each Mu insertion. PCR amplification reactions were performed with a PTC-200 (MJ Research, Waltham, MA) thermal cycler with the following conditions: denature at 94° for 1 min, anneal at 62° for 1.5 min, and extend at 72° for 2 min for 40 cycles, followed by a final extension at 72° for 5 min. PCR products obtained by amplification with Mu-TIR and a gl8-specific primer upstream or downstream of the Mu transposon within the gl8 gene were called 5' or 3' products, respectively. To amplify the 3' product of Mu10 and Mu12 insertions, it was necessary to lower the annealing temperature to 55°. To compensate for the high GC content in the 5' half of exon 1 of the gl8 gene, DMSO was added to a final concentration of 10% for PCR reactions involving primers xx022, 8a2840, or mcd696. PCR products were purified using a QIAGEN (Valencia, CA) PCR purification kit (catalog no. 28104) and sequenced. PCR reactions were performed using the following gl8-specific primers. The approximate location of each primer is shown in Fig 1.
- gl8a54: 5' GCC ACC CGG ACT AAA ACC TG 3'
- gl8a59: 5' TAA TGG CCT CGC TGT CAC 3'
- gl8a61: 5' AGC AGC AGC GAT CAC CTC AG 3'
- gl8a51: 5' TGT GCC TGC CCC TGT GTC 3'
- gl8a58: 5' AAG AGT GTG GCG CGT GCT ATG 3'
- gl8a62: 5' AAG TGA GAA AGA AAG GTT GTC C 3'
- gl8a64: 5' TTT CGA ATA TTT GTC CTA CTG TTA G 3'
- 8a2840: 5' CCA CCC ACC ACC GGA TAT AGG TCA TG 3'
- mcd696: 5' CGC ACC TCG GGG ACC TTG G 3'
- xx022: 5' CGG ATC AGA AGG CAC GAC GGA G 3'
- gab457: 5' GGT GGA CGA GGA GCT GAT G 3'
- gab830: 5' CAT TGC ACA TCA ATA CCC TTG CTC TTG TAC TC 3'
- gab812: 5' TCA AGA TGC CTC TAT GTT GAG TAC AAG AGC AAG 3'
- gl8ain: 5' CTC AGG AGG TAA TGG TAG 3'
- gab869: 5' GCC AGC CCC TTC TTG CGG ATC TTA ATG 3'
- g24he.p4: 5' CCT ATG CTC GTG CTG CCG TTC GTC 3'
- 8a2637: 5' GTG GCG ACA AAG CTT GCA TCT ATC AGG AAG TCT 3'
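The DMSO adjustment described above applied to reactions involving three primers located in the GC-rich 5' half of exon 1. As a rough illustration only, the GC fraction of those three primers can be computed from the sequences in the list. The helper name `gc_fraction` is hypothetical, and primer GC content is only a proxy for the GC content of the amplified template.

```python
def gc_fraction(primer: str) -> float:
    """Return the fraction of G and C bases in a primer, ignoring spaces."""
    bases = primer.replace(" ", "").upper()
    return sum(base in "GC" for base in bases) / len(bases)

# Sequences copied from the primer list; these are the three primers
# for which DMSO was added to the PCR reactions.
primers = {
    "xx022": "CGG ATC AGA AGG CAC GAC GGA G",
    "8a2840": "CCA CCC ACC ACC GGA TAT AGG TCA TG",
    "mcd696": "CGC ACC TCG GGG ACC TTG G",
}

for name, sequence in primers.items():
    print(f"{name}: GC fraction {gc_fraction(sequence):.2f}")
```

The same function can be applied to any other primer in the list for comparison.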
Isolation of the 5' UTR region from Gl8 progenitor alleles:
A portion of the gl8 gene containing the 5' UTR region was sequenced from each of the Mu stock progenitors. This region of the Gl8-B77, Gl8-B79, and Gl8-Q67 alleles was PCR amplified with primers gl8a58 and mcd696 (GenBank accession nos. AF348367, AF348368, and AF348369, respectively). Because the Gl8-Q66 allele could not be amplified with gl8a58, it was amplified using the primer pair gl8a62 and mcd696 (GenBank accession no. AF348370). The resulting PCR products were purified and cloned into the TOPO TA cloning vector (Invitrogen, Carlsbad, CA, catalog no. K4500-40) and a bulk of 10 individual clones from each gl8 allele was sequenced.
Genetic algorithm for alignment of sequences:
The genetic algorithm (![]()
| RESULTS |
|---|
The gl8 gene structure:
The gl8 gene was previously cloned using a Mu-tagged allele (gl8-Mu 88-3142) that has a Mu8 insertion near the gene's start codon (![]()
genomic clone, λ1512-38, was isolated (see MATERIALS AND METHODS) and restriction analysis identified a 6.8-kb HindIII fragment that contained the entire gl8-coding region. This 6.8-kb fragment was completely sequenced (GenBank accession no. AF302098). Comparisons between this B73 genomic sequence and the apparent full-length gl8 cDNA clone (pgl8) described by ![]()
Isolation of gl8-Mu alleles:
The gl8 gene product is the β-keto acyl reductase component (![]()
Cross 1: Mu Gl8 Pr/Gl8 Pr x gl8-ref pr/gl8-ref pr:
Ears from cross 1 were individually shelled and kernels from each ear were planted in greenhouse sand benches such that family structures were maintained. Rare glossy seedlings from this cross carried newly generated gl8-Mu alleles. Because each gl8-Mu allele was isolated from a different ear, each allele must necessarily represent an independent mutational event. The exceptional glossy seedlings, which had the genotype gl8-Mu Pr/gl8-ref pr, were transplanted to pots and crossed to a Gl8 pr stock (cross 2) to facilitate the genetic separation of the gl8-Mu and gl8-ref alleles.
Cross 2: gl8-Mu Pr/gl8-ref pr x Gl8 pr/Gl8 pr:
Kernels from cross 2 segregated one purple (Pr/pr):one red (pr/pr). Because pr is genetically tightly linked to gl8 (<1 cM; ![]()
In some instances, glossy progeny from cross 1 were crossed onto inbred lines such as W64A (Gl8 Pr/Gl8 Pr) instead of the Gl8 pr stock (cross 3).
Cross 3: gl8-Mu Pr/gl8-ref pr x Gl8 Pr/Gl8 Pr (W64A or other inbreds):
Because all of the colored kernels from cross 3 were purple, it was not possible to use phenotypic selection to identify kernels that carried gl8-Mu alleles. Instead, in these families F2 analysis was used to distinguish between gl8-Mu Pr/Gl8 Pr (F2 will not segregate red kernels) and gl8-ref pr/Gl8 Pr (F2 will segregate red kernels) progeny of cross 3.
Identification of Mu insertion sites:
To identify the Mu insertion site for each of the 79 gl8-Mu alleles, PCR was performed on each allele using a gl8-specific primer in combination with a primer located in the highly conserved TIRs of Mu transposons. The resulting PCR products that hybridized to the gl8 genomic sequence were purified and sequenced. For each gl8-Mu allele, as many as 16 gl8-specific primers (Fig 1) spanning a 6.0-kb interval containing the gl8 gene were used to identify a primer that in combination with the Mu-TIR primer would amplify the gl8/Mu-flanking DNA. Amplification products were obtained from 75 of the 79 gl8-Mu alleles analyzed. Although Mu-flanking PCR products were obtained from both sides of the Mu transposon for most of these alleles, sequence analysis of only one side was sufficient to determine the transposon insertion site. These analyses revealed that 62 of these 75 Mu insertions (>80%) had occurred in the 5' UTR of the gl8 gene and 52 (69%) of those occurred within an ~60-nucleotide interval of the gl8 5' UTR (Fig 2). Ten positions within the 5' UTR experienced multiple Mu insertions. One position in particular was host to 15 independent insertion events. Each of the alleles associated with a multiple insertion site was reamplified and sequenced from DNA extracted from an independent batch of seedlings. In all cases the results from the second analysis were in agreement with those from the first.
Because the Mu stocks used to generate most of these mutant alleles were maintained via crosses to the F1 hybrids B77 x B79 and Q67 x Q66 (![]()
To ensure that the gl8-ref allele used in the directed-tagging experiments did not interfere with the PCR-based mapping of Mu transposons, gl8-ref was characterized by PCR amplification and sequence analysis. No gl8-hybridizing PCR products were obtained with DNA from gl8-ref/gl8-ref individuals using the Mu-TIR primer in combination with any of 16 gl8-specific primers (data not shown). In addition, no DNA sequence polymorphisms were found between the gl8-ref allele and the wild-type Gl8-B73 allele in the region defined by PCR primers gl8a58 to gl8ain, which includes the entire coding region.
Analysis of Mu TIR sequences:
The gl8/Mu-flanking PCR products contained 39 nucleotides of Mu TIR sequence terminal to the Mu-TIR primer annealing site. Comparing the sequence of these 39 nucleotides from each gl8-Mu allele to the left and right TIRs of the previously defined Mu transposons (Fig 3) identified most of the Mu transposons. Insertions corresponding to six Mu transposons (MuDR, Mu1, Mu2, Mu4, Mu8, and Mu10) were identified in this fashion. Several novel TIR sequences were also recovered. The novel TIRs identified from the 5' and 3' PCR products of gl8-Mu 91g211 define the left and right TIRs of the Mu11 transposon (GenBank accession nos. AF247740 and AF247741, respectively). An additional novel Mu transposon was identified from gl8-Mu 91g241 and was designated Mu12. Sequences from the 5' and 3' PCR products from gl8-Mu 91g241 revealed that the terminal 39 bp from the TIRs of Mu12 are perfect inverted repeats and were designated as the left and right TIRs of Mu12 (GenBank accession nos. AF247742 and AF302101, respectively). Analysis of the sequence of the 5' PCR product from gl8-Mu 91g209 identified the inserted transposon as Mu10. Because only the left TIR of Mu10 had previously been reported, the 3' PCR product from gl8-Mu 91g209 was sequenced to obtain the right TIR of Mu10 (GenBank accession no. AF302099).
Characterization of the four gl8-Mu alleles in which Mu insertions were not detected:
Mu insertions were not identified in 4 of the 79 gl8 alleles analyzed in this study. These 4 alleles were subjected to further analyses. On the basis of PCR amplification and sequence analysis it was determined that the progenitor of the directly tagged allele gl8-Mu 91g215 was Gl8-Q67 (data not shown). DNA gel blot analysis was then performed using probe B (Fig 1). Although a restriction fragment length polymorphism (RFLP) exists between gl8-Mu 91g215 and Gl8-Q67 (Fig 4, lanes 5 and 1, respectively), gl8-Mu 91g215 is indistinguishable from gl8-ref (Fig 4, lane 3) in this hybridization experiment. Therefore, it is likely that during cross 2 the gl8-Mu 91g215 allele was replaced by the gl8-ref allele as the result of a crossover between gl8 and pr.
DNA isolated from a plant homozygous for gl8-Mu 91g159 failed to hybridize to probe B (Fig 4, lane 4). Hybridization with an unrelated single-copy probe established that lane 4 in Fig 4 contains DNA of sufficient quality and quantity to yield hybridization signals to single-copy genes (data not shown). A subsequent experiment using the same blot and the 1.4-kb gl8 cDNA also failed to detect the gl8 gene in lane 4 (data not shown). This result suggests that gl8-Mu 91g159 consists of a deletion of the entire coding region of the gl8 gene.
The gl8-Mu 94-1641-25 allele was not generated from a Mu stock that contains Q66-, Q67-, B77-, and B79-derived alleles (see MATERIALS AND METHODS). Hence, it is not possible to determine with certainty its progenitor allele. However, the 5' UTR of gl8-Mu 94-1641-25 was PCR amplified and sequenced from plants homozygous for this allele and was found to be identical to Gl8-B73. In addition, DNA gel blot analysis using probe B revealed that gl8-Mu 94-1641-25 was indistinguishable from Gl8-B73 (Fig 4, lanes 6 and 2, respectively). These results suggest that gl8-Mu 94-1641-25 either contains a minor rearrangement not detectable via RFLP analysis or has an insertion (or other mutation) outside of the 6.8-kb HindIII fragment detected by the RFLP analysis.
Although it was not possible to identify a Mu insertion in gl8-Mu 93B227, RFLP analysis of this allele shed some light on the nature of its molecular lesion. Lane 9 of Fig 4 contains DNA from a pool of 10 progeny resulting from a cross between the inbred line W64 and a plant with the genotype gl8-Mu 93B227/gl8-ref (cross 3). Analysis with probe B revealed that this DNA sample contains two RFLP fragments. A comparison between lanes 3 and 10 revealed that the gl8-ref and Gl8-W64 alleles are indistinguishable in this hybridization experiment and account for the smaller RFLP signal in lane 9. Hence, the larger RFLP signal in lane 9 must be derived from gl8-Mu 93B227. This RFLP differs from each of the four possible progenitor alleles of gl8-Mu 93B227: Gl8-Q67, Gl8-Q66, Gl8-B77, and Gl8-B79 (lane 1 and data not shown). Hence, gl8-Mu 93B227 may contain a novel Mu transposon that could not be amplified, either because it contains divergent TIR sequences or because sequence divergence near the insertion site of gl8-Mu 93B227 relative to the Gl8-B73-derived primers impeded amplification. Alternatively, this allele may have resulted from another type of molecular rearrangement.
Noncanonical target-site duplications:
For most alleles, sequence analysis of the gl8/Mu PCR product derived from one side of the Mu transposon was sufficient to determine the Mu insertion site and Mu identity. Nonetheless, to better define a novel transposon or to confirm apparent sequence anomalies, the gl8/Mu-flanking PCR products were sequenced from both sides of the Mu transposons associated with 14 alleles. Analysis of these 14 sequences revealed that 4 alleles (gl8-Mu 91g168, gl8-Mu 91g169, gl8-Mu 91g213, and gl8-Mu 91g239) did not have the characteristic 9-bp TSD (Fig 5). Each of these alleles was reamplified and sequenced from an independent batch of seedlings. In all instances results from the second analysis were in agreement with the first set of analyses. The sizes of the 5' and 3' Mu-flanking PCR products from gl8-Mu 91g168 were not in agreement with the predicted product sizes from any of the 4 progenitor alleles, suggesting that this allele contained a deletion. Sequence analysis of both of these PCR products established the progenitor of gl8-Mu 91g168 as Gl8-Q66 and revealed an apparent deletion of 232 bp from the Mu insertion site (Fig 5A). However, the PCR-based characterization performed here could not distinguish between a deletion and two closely linked (232 bp) Mu1 insertions. Such a structure was observed in the hcf106-mum2 allele in which a second Mu1 transposon inserted 244 bp downstream of the original Mu1 insertion of the hcf106-mum1 allele (![]()
Sequence data from one side of the Mu insertion of alleles gl8-Mu 91g169, gl8-Mu 91g213, and gl8-Mu 91g239 established polymorphisms relative to each of the four progenitor alleles. For these alleles, sequence data were also obtained from the other side of the inserted Mu transposon. The progenitor of gl8-Mu 91g169 could be identified as Gl8-B77 but contained a deletion of seven nucleotides from the 3' TSD followed by the insertion of four C's directly flanking the Mu insertion (Fig 5B). Analysis of the Mu-flanking sequence from gl8-Mu 91g213 identified the progenitor as Gl8-Q67, but the 3' flanking sequence contains an A-to-C transversion in the second nucleotide position of the TSD (Fig 5C). Sequence analysis of the 3' PCR product from gl8-Mu 91g239 identified the progenitor as Gl8-B79 and the Mu transposon as either Mu3 or Mu4. Sequence obtained from the 5' PCR product established that the inserted Mu transposon was Mu4. However, the TSDs on the two sides of this transposon were not identical due to an apparent deletion of a G from the 3' TSD (Fig 5D).
Statistical analysis of target-site sequences:
Mu insertions occurred most frequently in an ~60-bp region between nucleotide positions 20 and 80 in the 5' UTR of the gl8 gene. Depending on the progenitor allele, and after removal of gaps that were inserted to allow alignment of the sequences, the actual length of this region varies from 47 to 51 nucleotides. A motif search of the B73-gl8 genomic sequence identified a five-base motif, CACNG, which appears frequently in this region of the 5' UTR that experienced the highest Mu insertion frequency. Also depending on the progenitor allele, the CACNG motif appears between four and six times in this ~60-bp region with as many as five motifs arranged in tandem. Excluding the targeted Mu insertion region of the 5' UTR, the CACNG motif appears at the expected frequency (~1/256 bp) in the 3045 nucleotides of the B73-gl8 sequence between primers 3 and 10 (Fig 1). Therefore the nonrandom distribution of CACNG motifs in the gl8 gene mirrors the nonrandom insertion pattern of Mu transposons in the gl8 gene.
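The expected frequency of roughly one CACNG motif per 256 bp follows from the four fixed bases in the motif, each with probability 1/4 if the four nucleotides are assumed to be equally frequent and only one strand is scanned. The sketch below counts the motif and computes that expectation. The function names are hypothetical, and this is not the motif-search software used in the study.

```python
import re

# A lookahead pattern so every start position is tested independently.
CACNG = re.compile(r"(?=CAC[ACGT]G)")

def count_cacng(sequence: str) -> int:
    """Count CACNG occurrences on the given strand."""
    return len(CACNG.findall(sequence.upper()))

def expected_cacng(length: int) -> float:
    """Expected count under equal base frequencies.

    There are (length - 4) possible start positions for a 5-base motif,
    and each matches with probability (1/4) ** 4 = 1/256.
    """
    return max(length - 4, 0) / 4 ** 4

# Toy sequence (not gl8) with two tandem copies of the motif.
toy = "CACAGCACTG"
print(count_cacng(toy), expected_cacng(len(toy)))
```

Comparing `count_cacng` with `expected_cacng` for a region gives a quick sense of whether the motif is enriched there. A sequence with skewed base composition would need a different expectation.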
A genetic algorithm was used to identify the sequence alignment that would provide the best consensus sequence for the Mu insertion sites identified in this study. Genetic algorithms create a population of solutions to a problem and use an iterative selection and variation process to identify good solutions. In this case, the goal is to determine, for each insertion site, which of the two possible orientations provides the best alignment with all other insertion sites. The genetic algorithm does this by determining for each insertion site the orientation that maximizes the divergence from the background distribution of nucleotides. This maximization is performed by selecting alignments with high divergence scores in pairs and then applying variation operations (crossover and mutation) to the alignment to generate similar and possibly superior alignments (see MATERIALS AND METHODS).
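The orientation search described above can be sketched as a small genetic algorithm in which each candidate solution is a list of flags, one per insertion site, saying whether that site is reverse-complemented. Everything specific here is an assumption made for illustration: the chi-square-style divergence score, the tournament selection, the uniform crossover, the elitism, the default parameter values, and all function names. It is not the authors' implementation.

```python
import random
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    return seq.translate(COMPLEMENT)[::-1]

def background(sites):
    """Base frequencies pooled over both strands, so they do not depend on orientation."""
    pooled = Counter("".join(sites) + "".join(revcomp(s) for s in sites))
    total = sum(pooled.values())
    return {base: pooled[base] / total for base in "ACGT"}

def divergence(sites, flips, freqs):
    """Sum over alignment columns of the chi-square deviation from the background."""
    oriented = [revcomp(s) if flip else s for s, flip in zip(sites, flips)]
    score = 0.0
    for column in zip(*oriented):
        counts = Counter(column)
        for base, freq in freqs.items():
            expected = len(oriented) * freq
            if expected > 0:
                score += (counts[base] - expected) ** 2 / expected
    return score

def evolve(sites, generations=100, pop_size=30, mutation_rate=0.05, seed=0):
    """Return one orientation flag per site (True = use the reverse complement)."""
    rng = random.Random(seed)
    freqs = background(sites)
    fitness = lambda flips: divergence(sites, flips, freqs)
    # Start from the as-given orientation plus random orientation assignments.
    population = [[False] * len(sites)]
    population += [[rng.random() < 0.5 for _ in sites] for _ in range(pop_size - 1)]
    for _ in range(generations):
        best = max(population, key=fitness)
        offspring = [best[:]]  # elitism: the best alignment always survives
        while len(offspring) < pop_size:
            # Pick two parents by small tournaments, then cross over and mutate.
            mother = max(rng.sample(population, 3), key=fitness)
            father = max(rng.sample(population, 3), key=fitness)
            child = [rng.choice(pair) for pair in zip(mother, father)]
            child = [not g if rng.random() < mutation_rate else g for g in child]
            offspring.append(child)
        population = offspring
    return max(population, key=fitness)
```

All sites must have the same length (for example, the 29-nt windows used for gl8). Because flipping every site at once yields the reverse complement of the same alignment, the returned orientations are only meaningful relative to one another.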
The 35 unique insertion sites as defined by the 10 nucleotides 5' of the TSD, the 9-bp TSD, and the 10 nucleotides 3' of the TSD were analyzed in the manner described above. Although each of the 71 Mu insertion sites listed in Table 1 is from an independent insertion event, to avoid biasing the data from sites with multiple insertion events, only a single instance of any given 29-bp insertion site was used in this analysis. Fig 6 shows the orientations of the sequence alignments as selected by the genetic algorithm.
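The 29-nt window and the removal of repeated sites can be illustrated as follows. This is a minimal sketch with hypothetical helper names. It assumes that the position of the first TSD base in the progenitor sequence is already known and that two sites count as the same only when their 29-nt strings match exactly.

```python
FLANK = 10       # nucleotides taken on each side of the TSD
TSD_LENGTH = 9   # canonical Mu target-site duplication length

def insertion_site(sequence, tsd_start):
    """Return 10 nt upstream + the 9-bp TSD + 10 nt downstream (29 nt in total).

    tsd_start is the 0-based index of the first TSD base in `sequence`.
    """
    start = tsd_start - FLANK
    end = tsd_start + TSD_LENGTH + FLANK
    if start < 0 or end > len(sequence):
        raise ValueError("not enough flanking sequence around the TSD")
    return sequence[start:end].upper()

def unique_sites(sites):
    """Keep the first occurrence of each site, preserving the input order."""
    return list(dict.fromkeys(sites))

# Toy progenitor sequence (not gl8) with a TSD starting at index 10.
toy = "A" * 10 + "CTCGCAGAC" + "T" * 10
print(insertion_site(toy, 10))
```

Passing the windows from all independent insertions through `unique_sites` leaves one entry per distinct site, which is the kind of input the alignment step expects.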
As shown in Table 2, the 35 insertion sites are GC rich and particularly low in T. This may be due, at least in part, to the GC richness of the gl8 gene. To normalize for this high GC content an expected nucleotide frequency was calculated based on the GC content of the combined insertion sites. Chi-square analyses were performed to identify positions that had significant deviations from the expected nucleotide composition. Within the TSD, T's, G's, and C's appeared at positions 2, 8, and 9, respectively, at significantly higher-than-expected frequencies. Position 4 has a lower-than-expected frequency of A. The conserved nucleotides at positions 2, 8, and 9 and the weak consensus nucleotides at the other TSD positions are consistent with the reverse complement of consensus sequences reported by ![]()
Chi-square analyses of the 20 positions flanking the gl8 TSDs reveal a more significant nucleotide conservation than is observed within the 9-bp TSD itself. Five of the six positions directly flanking the 9-bp TSD have a nucleotide composition that differs from the expected nucleotide composition at the 99% confidence interval (Table 2 and Fig 6). These include conserved C's, A's, and G's at positions -2, +1, and +2, respectively, and significantly lower frequencies of C's at positions -1 and +3. Two of these conserved nucleotides, the -2 C and +2 G, are particularly interesting given that they are complementary bases equidistant from the TSD. When this analysis was extended to include 50 nucleotides flanking either side of the Mu insertion, additional positions with significant deviations from the expected were not identified at rates higher than would be expected by chance (data not shown).
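A per-position chi-square test of this kind can be sketched as follows. The expected frequencies are derived from the GC content of the combined sites, with G = C and A = T; that split is one reading of the normalization described above and is an assumption. The function names are hypothetical. The critical value 11.345 is the standard chi-square threshold for 3 degrees of freedom at p = 0.01.

```python
from collections import Counter

CRITICAL_99 = 11.345  # chi-square critical value, 3 degrees of freedom, p = 0.01

def expected_frequencies(sites):
    """Expected base frequencies from overall GC content (assumes G = C and A = T).

    Assumes the pooled sites contain both G/C and A/T bases.
    """
    pooled = "".join(sites)
    gc = sum(base in "GC" for base in pooled) / len(pooled)
    return {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}

def position_chi_squares(sites):
    """Return one chi-square value per column of the aligned sites."""
    freqs = expected_frequencies(sites)
    n = len(sites)
    values = []
    for column in zip(*sites):
        counts = Counter(column)
        values.append(
            sum((counts[base] - n * freq) ** 2 / (n * freq) for base, freq in freqs.items())
        )
    return values

# Toy alignment (not gl8 data): column 1 is always C, column 2 alternates A/T.
toy = ["CA", "CT", "CA", "CT"]
for position, value in enumerate(position_chi_squares(toy), start=1):
    print(position, value, value > CRITICAL_99)
```

With real data the number of sites is much larger than in this toy alignment, which matters because the chi-square approximation is unreliable when expected counts are small.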
Analysis of RescueMu insertion sites:
To determine if the nonrandom nucleotide composition observed among gl8 insertion sites is a common feature of Mu insertion sites across the genome, a large collection of RescueMu insertion sites was analyzed as described above for the gl8-Mu insertion sites. A total of 369 independent RescueMu insertion-site sequences generated by the Walbot laboratory were recovered from GenBank by identifying forward and reverse sequences from individual RescueMu events. Eighteen of these RescueMu insertion sites have TSD lengths other than 9 bp (Table 4). An additional nine RescueMu insertion sites were recovered that contained mismatched TSDs (data not shown). Therefore, 315 RescueMu sequences remained after removal of these noncanonical RescueMu insertion sites and insertion sites that did not have at least 60 bp of sequence flanking each side of the TSD. These insertion sites (consisting of 60 bp flanking each side of the TSD plus the 9-bp TSD) were aligned using the genetic algorithm described previously. Table 5 contains the nucleotide composition and total chi-square values for the 10 positions flanking the TSD and the 9-bp TSD for the optimal RescueMu alignment. Within the TSD, six of the nine positions have nucleotides that differ from the expected nucleotide frequency at the 99% confidence interval and two of the remaining three positions have nucleotides that differ from the expected nucleotide frequency at the 95% confidence interval. The one remaining position (position 4) has a significantly lower-than-expected frequency of A's. Position 4 was therefore designated as B, in accordance with the International Union of Biochemistry (IUB) ambiguity code, in the RescueMu consensus TSD and in the combined consensus TSD. The 9-bp consensus TSD for the RescueMu insertion sequences is 5' CTCBCAGAC 3', which is strikingly similar to the consensus 9-bp TSD from the gl8-Mu insertion sites and the reverse complements of the ![]()
As was the case for the gl8-Mu insertion sites, the nucleotides with the highest deviations from the expected nucleotide composition were observed at the positions directly flanking the TSD. The three positions flanking either side of the TSD had a conserved nucleotide at the 99% confidence interval. These nucleotides are CCT at positions -1, -2, and -3, respectively, and AGG at positions +1, +2, and +3, respectively. Hence, the three positions immediately flanking either side of the TSD are conserved for the paired 3-bp inverted repeats CCT and AGG. Therefore, each of the four conserved nucleotides identified from the gl8-Mu insertions is also conserved at its respective position in the RescueMu insertion sites. The larger size of the RescueMu data set, however, apparently allowed the additional conserved nucleotides to be identified. A plot of the total chi-square at each of the 129 positions reveals that nearly all the positions with significant deviations from the expected occur within a 15-bp region centered on the TSD and that the positions with the highest chi-squares directly flank the 9-bp TSD, particularly the -2 and +2 positions (Fig 7).
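The triplets CCT and AGG form an inverted repeat because each is the reverse complement of the other. The short sketch below checks this and also shows how a consensus containing an IUB ambiguity code, such as the B (C, G, or T) in the RescueMu consensus TSD, is reverse-complemented. The function name is hypothetical; the complement table follows the standard IUB/IUPAC nucleotide codes.

```python
# Each code maps to the code for its complementary set of bases,
# e.g. B (C, G, or T) pairs with V (G, C, or A).
IUPAC_COMPLEMENT = str.maketrans("ACGTBVDHKMRYSWN", "TGCAVBHDMKYRSWN")

def reverse_complement(seq: str) -> str:
    """Reverse-complement a sequence that may contain IUB ambiguity codes."""
    return seq.upper().translate(IUPAC_COMPLEMENT)[::-1]

print(reverse_complement("CCT"))        # flanking triplet
print(reverse_complement("CTCBCAGAC"))  # RescueMu consensus TSD
```

Comparing a consensus with the reverse complement of another consensus, as is done in the text, amounts to applying this function to one of the two sequences first.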
Analysis of the RescueMu sequence alignment also revealed a significant difference between the nucleotide composition of the sequences to the left and right of the TSD (Fig 8A). GC profiles of the gl8 and bz1 genes reveal that the preferred target site for Mu insertions also occurs at the interface between regions with low and high GC content (Fig 9). These data suggest that Mu transposons may have a preference for inserting into regions where DNA composition is transitioning from low to high GC content.
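A GC profile of the kind referred to above can be approximated with a sliding window. This is a minimal sketch. The 50-nt default window and the function name are assumptions, since the window size used for the published profiles is not stated here.

```python
def gc_profile(sequence, window=50):
    """Return the GC fraction of every full window along the sequence."""
    seq = sequence.upper()
    return [
        sum(base in "GC" for base in seq[start:start + window]) / window
        for start in range(len(seq) - window + 1)
    ]

# Toy sequence (not gl8): an AT-only stretch followed by a GC-only stretch.
toy = "AAAAGGGG"
print(gc_profile(toy, window=4))
```

In such a profile, a transition from low to high GC content appears as a steady rise across consecutive windows, as in the toy sequence.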
| DISCUSSION |
|---|
Analysis of a collection of 79 gl8 mutant alleles derived from Mu stocks revealed that the vast majority (at least 75/79) contained Mu insertions. Of these 75 alleles, 62 had Mu insertions in the ~140-bp 5' UTR of the gene. These 62 alleles include numerous independent insertions at exactly the same nucleotide positions of the 5' UTR. This provides strong evidence for the targeting of Mu transposons not only into the 5' UTR of the gl8 gene but also into specific positions within the 5' UTR. Although the genetic screen used to isolate the gl8-Mu alleles ensured that each allele arose via an independent transposition event, the independence of alleles that contain Mu insertions at the same nucleotide positions was further confirmed by their (1) distinctive transposon identities, (2) transposon orientations, and (3) the wild-type progenitor alleles. For example, of the 15 independent Mu insertions at position 67, 4 are MuDR and 11 are Mu1. Of the 11 Mu1 insertions, 6 are oriented left to right and 5 are oriented right to left. In addition, these 11 Mu1-induced alleles were derived from both Q67 and B79 progenitors. Hence, this collection of gl8 alleles provides convincing evidence for the preferential insertion of Mu transposons into specific regions of a maize gene.
Because each of the gl8-Mu alleles in this study was selected on the basis of the presence of a mutant phenotype, it is likely that the observed distribution of Mu transposons in this collection does not represent a random sampling of insertion events in the gl8 gene because insertions into introns or the 3' UTR may not have resulted in a mutant phenotype. However, this phenotypic selection cannot explain the clustering of multiple independent Mu insertions in the 5' UTR because Mu insertions elsewhere in the gl8-coding region also yield mutant phenotypes but were recovered at much lower frequencies. Analysis of Mu insertion alleles from a reverse genetics screen that did not depend upon phenotypic selection would provide a useful comparison. However, Mu-based reverse genetic screens typically do not yield a sufficient number of alleles per gene to provide evidence of targeting.
Another advantage of working with a defined set of independent insertion alleles generated from a related Mu stock is that alleles are not limited to ones that can be amplified by primers specific to known TIR sequences. If alleles that cannot be amplified with the Mu-TIR primer are present in the collection, insertions in such alleles can still be identified by DNA gel blot hybridization. Of the 79 gl8-Mu alleles analyzed in this study, only 1 allele failed to amplify with the Mu-TIR primer and was shown by gel blot analysis to have an insertion. This suggests that most, if not all, of the active Mu transposons present in at least the Schnable lab Mu stocks have now been identified. Of course, different Mu stocks may contain additional active Mu transposons.
In addition, because 72 of the 75 gl8-Mu alleles in which a Mu transposon was identified were derived from related Mu stocks maintained by the Schnable lab, the insertion frequency of each Mu transposon can be compared. Of the 72 insertions from the Schnable lab Mu stock, 41 were generated by insertions of Mu1, 18 by MuDR, 3 by Mu8, 1 by Mu2, 1 by Mu4, 1 by Mu10, 5 by Mu11, and 2 by Mu12. The overwhelming majority (59/72 or 82%) of these insertions were of the two transposons that are typically most common in Mu stocks, Mu1 and MuDR. The typically less common Mu2, Mu4, and Mu8 transposons were responsible for relatively few of the gl8-Mu alleles. Hence, insertions in the gl8 gene seem to occur at frequencies proportional to their abundance in typical Mu stocks. This is in contrast to observations involving other maize genes that appear to be preferentially targeted by specific Mu transposons. For example, all but 2 of the 24 Mu-induced bz1 alleles that have been characterized (![]()
The novel Mu transposons Mu10, Mu11, and Mu12 were found as new insertions from an active Mu line and have characteristics typical of Mu transposons such as conserved TIRs and the ability to generate 9-bp TSDs upon insertion. The partial TIR sequences from Mu10, Mu11, and Mu12 exhibit a high degree of identity to TIRs from known Mu transposons but are clearly distinct from each other and all previously identified Mu TIR sequences. Of the 39 bp of sequence obtained from these TIRs, 18 nucleotides are conserved among all Mu TIRs. Eight additional positions that are identical in the TIRs of Mu1, Mu2, Mu3, Mu4, Mu5, Mu7/rcy, Mu8, and MuDR are not conserved among the TIRs of Mu10, Mu11, and Mu12. Four of these eight divergent positions are part of the predicted MURA transposase-binding site that extends from nucleotides 25-56 (![]()
The identification of only three previously uncharacterized Mu transposons in this study suggests that the number of active Mu transposon family members is relatively small. Previous studies using DNA gel blotting to compare hybridization patterns obtained with Mu TIR-specific probes to patterns obtained from probes specific to the internal regions of the cloned Mu transposons suggested that only 50-60% of the Mu TIRs in the genome are associated with known Mu transposons (![]()
Small insertions, deletions, and single-nucleotide substitutions in TSDs, such as the ones associated with these gl8-Mu alleles, have been identified previously in revertant alleles resulting from transposon excision (reviewed by ![]()
The tendency for Mu transposons to preferentially insert into genic regions has now become fairly well established (![]()
The composition of the sequences flanking TSDs ha








