Abstract
The maize Ac/Ds transposon family was the first transposable element system identified and characterized by Barbara McClintock. Ac/Ds transposons belong to the hAT family of class II DNA transposons. We and others have shown that Ac/Ds elements can undergo a process of alternative transposition in which the Ac/Ds transposase acts on the termini of two separate, nearby transposons. Because these termini are present in different elements, alternative transposition can generate a variety of genome alterations such as inversions, duplications, deletions, and translocations. Moreover, Ac/Ds elements transpose preferentially into genic regions, suggesting that structural changes arising from alternative transposition may potentially generate chimeric genes at the rearrangement breakpoints. Here we identified and characterized 11 independent cases of gene fusion induced by Ac alternative transposition. In each case, a functional chimeric gene was created by fusion of two linked, paralogous genes; moreover, each event was associated with duplication of the ∼70-kb segment located between the two paralogs. An extant gene in the maize B73 genome that contains an internal duplication apparently generated by an alternative transposition event was also identified. Our study demonstrates that alternative transposition-induced duplications may be a source for spontaneous creation of diverse genome structures and novel genes in maize.
THE complex architecture of the modern maize genome was shaped by ancient whole-genome duplication and subsequent diploidization events (Messing et al. 2004; Haberer et al. 2005). Transposable elements proliferated during the diploidization process (Gaut et al. 2000) and today comprise ∼85% of the maize genome (Schnable et al. 2009). For example, multiple insertions of high-copy-number families of class I retrotransposons greatly expanded the intergenic regions of the maize genome. Comparisons of maize and its close relative sorghum indicate that gene number and colinearity are largely conserved (Hulbert et al. 1990), whereas the intergenic regions differ greatly due to frequent retrotransposon insertions in maize. A good example is the adh1 region of maize inbred B73. Nineteen nested LTR retroelements and two solo LTRs together make up 74% of the intergenic sequences between adh1 and the nearest adjacent gene (Tikhonov et al. 1999). Transposition and proliferation of retroelements thus can largely explain genome size variations, the C-value paradox, and discordant evolutionary history among different grass species (SanMiguel and Vitte 2009). Class II DNA transposons generally do not amplify to such high copy numbers as the class I retroelements, but they can induce structural changes in various ways. For example, elements of the maize Ac/Ds family of class II transposons can undergo alternative transposition, which can generate large-scale structural changes such as duplications, translocations, inversions, and deletions (Zhang et al. 2006; Yu et al. 2011). Moreover, a recent study reported evidence that alternative transposition of transposon dhAT-Zm1 generated three tandem duplications during the evolution of the maize B73 genome (Zhang et al. 2013).
In addition to generating structural variations, class II elements and some families of class I retroelements often insert near gene regulatory and coding sequences where they may potentially induce ectopic gene expression or generate chimeric genes in grass genomes (Bennetzen 2005). In rice, Wang et al. (2006) found that >30% of retrogenes had recruited flanking exons at insertion sites and formed chimeric genes. The class II transposons Pack-MULE in rice (Jiang et al. 2004) and helitrons in maize (Morgante et al. 2005) were also reported to generate chimeric transcripts by producing dispersed duplications that engage genic sequences from different genes. In maize, Zhang et al. (2006) reported that functional chimeric genes could be formed by deletions induced by Reversed-Ends Transposition (RET) of the class II Ac/Ds transposon system.
The autonomous Ac element is 4565 bp in length, contains a single gene encoding the transposase (TPase), and carries TPase-binding sites in its Terminal Inverted Repeats (TIRs) and subterminal regions (Kunze et al. 1987; Kunze and Starlinger 1989; Starlinger et al. 1989; Kunze and Weil 2002). Fractured Ac (fAc) and nonautonomous Ds elements retain one or both functional transposon termini, respectively, but lack TPase-coding capability; they can be nonautonomously transposed and participate in alternative transposition reactions (Fedoroff et al. 1983; Kunze and Starlinger 1989; Starlinger et al. 1989; Varagona and Wessler 1990; Zhang and Peterson 1999; Du et al. 2011).
The study described here follows a recent report that demonstrated that RET can generate de novo segmental duplications at the p1 locus in maize (Zhang et al. 2013). We hypothesized that the same mechanism of Ac alternative transposition can shuffle exons and generate chimeric genes and thus can serve as a novel mechanism to generate new genes. Here we show that (1) RET of reversed Ac/fAc ends in the maize p1 gene can induce an interchromosomal reciprocal translocation (p1-vvD103) with disrupted p1 function; (2) among the progeny of p1-vvD103, subsequent RET events produced a series of 11 new alleles (named P1P2-1 to P1P2-14) with ∼70-kb segmental duplications and restored p1 function; and (3) an extant gene in the maize B73 genome (AC234515.1) contains an internal duplication that is consistent with the structure produced by an alternative transposition event. The P1P2 duplications shuffle exons from different genes to generate fusion transcripts that confer phenotypes resembling those of the p1 gene. Our study demonstrates that RET-induced segmental duplication is a mechanism to produce both genome structural variation and novel maize genes.
Materials and Methods
Maize stocks and screen
The p1-ovov454 allele was heterozygous with p1-ww from inbred line 4Co63. The p1-ovov454 plants were crossed with the Ac tester line p1-ww; r-m3::Ds. Ears were screened for multikernel sectors with colorless pericarp and a 1:1 ratio of spotted to nonspotted kernels. The colorless spotted kernels were planted and analyzed, yielding the allele p1-vvD103. The p1-vvD103 plants were also crossed to the p1-ww; r-m3::Ds tester line. Red sectors with purple spots were identified by phenotypic screening and selected as candidate P1P2 alleles.
The Ac tester line carries an r1 loss-of-function allele caused by a Ds insertion. Introduction of an active Ac into the genome can excise Ds somatically and restore r1 function, producing purple sectors on the aleurone (Kermicle 1980). For p1-vvD103/p1-ww r-m3::Ds ears, we observed a normal frequency of purple sectors, indicating that the Ac in the p1-vvD103 allele is active.
Molecular biological methods
Leaf tissue from young plants was collected and ground in liquid nitrogen. Total DNA was prepared using a modified cetyltrimethylammonium bromide extraction protocol (Porebski et al. 1997). HotMaster Taq polymerase from Eppendorf (Hamburg, Germany) was used in PCR reactions. The PCR samples were heated at 94° for 2 min, followed by 35 cycles of 94° for 20 sec, 60° for 30 sec, and 65° for 2 min, and then a final extension at 65° for 8 min. PCR primer sequences are listed in Supporting Information, Table S1.
Genomic Southern blotting was performed according to the protocol of Sambrook et al. (1989); the washing stringency was 0.5% SDS, 0.5× SSC at 60°.
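The cycling program can also be written as a small data structure, which makes the programmed hold times easy to check. The following is an illustrative sketch only: the names `PCR_PROGRAM` and `total_hold_seconds` are hypothetical, the 8-min step is read as a single final extension, and ramp times between temperatures are ignored.

```python
# Thermocycler program as described in the text (temperatures in degrees C,
# durations in seconds). All names here are illustrative, not from the study.
PCR_PROGRAM = {
    "initial_denaturation": (94, 120),
    "cycles": 35,
    "cycle_steps": [
        ("denature", 94, 20),
        ("anneal", 60, 30),
        ("extend", 65, 120),
    ],
    "final_extension": (65, 480),
}


def total_hold_seconds(program):
    """Sum the programmed hold times; ramping between steps is ignored."""
    per_cycle = sum(seconds for _, _, seconds in program["cycle_steps"])
    return (
        program["initial_denaturation"][1]
        + program["cycles"] * per_cycle
        + program["final_extension"][1]
    )


if __name__ == "__main__":
    seconds = total_hold_seconds(PCR_PROGRAM)
    print(f"{seconds} s of programmed hold time ({seconds / 60:.1f} min)")
```

Actual run time on an instrument would be longer, because temperature ramping is not modeled here.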
Pericarp samples for RT-PCR analysis were collected from kernels at 15–20 days after pollination and frozen in liquid nitrogen. Total RNA was extracted with an RNeasy Plant mini kit (Qiagen, Valencia, CA) and treated with DNase I (Qiagen) to remove residual genomic DNA. The PCR products were purified with a Gel/PCR DNA fragment extraction kit (IBI Scientific, Peosta, IA). Sanger sequencing was performed by the DNA Facility at Iowa State University.
Data availability
Maize stocks are available upon request. Reference genome sequence data for maize inbred lines B73 (Schnable et al. 2009) and Mo17 (Xin et al. 2013) have been reported. Sequences of maize genomic MITEs are from Chen et al. (2014). Table S1 contains sequences of oligonucleotide primers used in PCR experiments. Table S2 contains cytogenetic data for the presence of reciprocal translocations associated with the P1P2 alleles. File S1 contains sequences at the p1-vvD103 chromosome 1-10 translocation junction. File S2 contains sequences of P1P2 fusion junctions. File S3 contains the sequence of the p2 transcript detected in allele P1P2-3. File S4 shows the alignment of MITE SQ225117735 contained in the duplications in maize gene AC234515.1 with representative genomic MITE SE220500133. File S5 shows an alignment of the 3′ TIR of MITE SQ225117735 contained in the duplications in maize gene AC234515.1 with the 5′ TIR of a complementary genomic MITE SQ225005122. File S6 shows examples of maize B73 genomic MITEs containing 5′ TIRs with perfect complementarity to the 3′ TIR of the duplication-associated MITE SQ225117735. File S7 shows an alignment of p1 and p2 sequences including the sites of Ac insertion derived from standard transposition in p1 (Athma et al. 1992) and alternative transposition in p2 (this study).
Results
P1-vvD103, an interchromosomal translocation allele induced by reversed-Ac-ends transposition
In this study we use a pair of paralogous genes, p1 and p2, as markers to phenotypically track the structural changes at the maize p1 and p2 loci. Both genes encode R2R3 Myb-like transcription factors that regulate the biosynthesis of red phlobaphene pigments (Lechelt et al. 1989; Grotewold et al. 1991, 1994). The p1 and p2 genes share >98% identity in the ORF sequences, but exhibit distinct expression patterns due to different promoter regions. P1 is expressed in pericarp, cob, tassel glumes, and silk, and p2 is expressed in anther and silk (Zhang et al. 2000). Alleles of p1 are commonly identified by a suffix that indicates their expression in pericarp and cob: P1-rr specifies red pigmentation in both pericarp and cob, p1-ww has no pigmentation in either pericarp or cob, and p1-vv exhibits variegated pigmentation in both pericarp and cob (Grotewold et al. 1991; Athma et al. 1992).
Our studies were initiated with the allele p1-ovov454, which conditions orange-variegated pericarp and cob (Figure 1). The p1-ovov454 allele contains a p1 gene in which a full-length Ac and a 2039-bp fractured Ac (fAc) are inserted in the p1 second intron (Figure 1A) (Yu et al. 2011). The 5′ end of Ac and the 3′ end of fAc are oriented toward each other and are separated by 822 bp; in this configuration the Ac 5′ end and the fAc 3′ end can undergo RET to generate rearrangements such as deletions and inversions (Yu et al. 2011, 2012). To identify new RET-induced rearrangements, we screened a population of p1-ovov454 ears for those with multi-kernel sectors exhibiting loss of p1 function. One such sector gave rise to an allele (p1-vvD103) that specified colorless pericarp with infrequent red sectors (Figure 1C). In crosses with the Ac tester stock p1-ww r-m3::Ds, the p1-vvD103 allele exhibits normal Ac activity (see Materials and Methods for details of the Ac activity test). Interestingly, the p1-vvD103/p1-ww plants exhibit irregular seed spacing on the ear and 50% pollen abortion, indicative of male and female semisterility. These results are consistent with the occurrence of a large inversion or an interchromosomal translocation.
Figure 1. RET of Ac/fAc produces reciprocal chromosome translocation and segmental duplications. Black and blue lines indicate maize chromosome 1 and 10 sequences, respectively. Red lines with arrowheads indicate Ac and fAc elements; open and solid red arrowheads indicate 3′ and 5′ Ac termini, respectively. Black and gray boxes indicate exons of maize p1 and p2 genes, respectively. This legend applies to all figures. Photographs of kernels of plants heterozygous for each allele are presented alongside their schematic structures. (A) Upper (black line): chromosome 1 containing p1 and p2 loci with Ac/fAc elements in progenitor allele p1-ovov454. Lower (blue line): chromosome 10 with transposition target site indicated by vertical orange line. (B) Following chromosome replication, Ac and fAc termini are excised from p1 locus and inserted into target site on chromosome 10. (C) Translocation structure of the p1-vvD103 allele. Asterisk marks the 5.9-kb duplication at the chromosome 1-10 translocation junction in p1-vvD103. (C′) Following DNA replication, the reciprocal translocation chromosomes contain two identical sister chromatids. The ∼70-kb region of the ensuing duplication is indicated by gray-shaded arrows. (D) The reverse-oriented Ac and fAc termini excise from the p1 locus of the upper chromatid and insert into the target site (green vertical line) of p2 gene on the sister chromatid. (E) Schematic structure of the P1P2 allele with associated chromosomes. Upper line indicates translocation chromosome 1-10 containing ∼70-kb direct duplications indicated as gray-shaded arrows. Lower line indicates reciprocal translocation chromosome 10-1.
To identify the rearrangement(s) involved, we cloned the sequences flanking the Ac and fAc junctions by inverse PCR. The sequences flanking the Ac/fAc junctions were used in BLASTN searches against B73 RefGen_v2 (Lawrence et al. 2004) and were mapped to a locus on the short arm of chromosome 10. Once the target locus was identified, new primers were designed and used in PCR with p1-vvD103 as template to confirm the validity of both Ac and fAc junctions (Figure 2, A–C). In addition, primers from the chromosome 10 target locus were used in PCR with DNA from oat–maize chromosome addition lines as templates (Ananiev et al. 1997; Kynast et al. 2002); the results (Figure 2D) confirmed that the junction sequences were derived from chromosome 10. Finally, we confirmed the presence of a chromosome 1-10 translocation by cytogenetic analyses of meiotic pachytene-stage nuclei from sporocytes of plants of genotype p1-vvD103/p1-ww. An example is shown in Figure 2E, which clearly shows a quadrivalent association between the short arms of chromosome 1 and 10. The chromosome 1-10 translocation breakpoint in p1-vvD103 occurs in p1 intron 2, and hence places p1 exon 3 on a different chromosome from exons 1 and 2; this structure is consistent with the disruption in p1 function and semisterility observed with the p1-vvD103 allele.
Figure 2. The p1-vvD103 allele is associated with a chromosome 1-10 reciprocal translocation. (A) Schematic structures of p1-vvD103 reciprocal translocation chromosomes 1-10 (upper) and 10-1 (lower). All symbols as in Figure 1. Positions of primers D5, Ac5, P3, and D3 are indicated by the short horizontal arrows. (B) Gel analysis of PCR amplification by primers Ac5 + D5. Lanes 1–4: P1P2-2, p1-vvD103, p1-ovov454, and p1-ww[4Co63]. Both the P1P2 and p1-vvD103 alleles show a band at ∼5 kb. (C) Gel analysis of PCR amplification by primers D3 + P3. Lanes 2–4: p1-vvD103, p1-ovov454, and p1-ww[4Co63]. The p1-vvD103 allele shows a band at 2.5 kb. (D) PCR analysis of genomic DNA from oat–maize addition lines using primers D3 + D5 to identify reciprocal translocation chromosome. Lane 1, oat; lane 2, maize p1-ww[4Co63]; lanes 3–12, oat–maize addition lines with maize chromosomes 1–10, respectively; lane 13, maize p1-ovov454. Lane 12, oat–maize addition line containing maize chromosome 10, shows a band of the same size (800 bp) as produced by maize p1-ww[4Co63] and allele p1-ovov454. (E) Maize sporocyte nucleus (pachytene stage) of p1-vvD103/p1-ww[4Co63]. The short and long arms of chromosome 1 and chromosome 10 are designated 1S, 10S, 1L, and 10L. “10C” indicates the centromere of chromosome 10. The breakpoint of desynapsis in chromosome 10 is in the short arm near the end, and the breakpoint in chromosome 1 is in the short arm near the middle.
We propose that the translocation in p1-vvD103 was generated by RET as shown in Figure 1, A–C. In this model, Ac TPase utilizes the reverse-oriented termini of fAc and Ac in progenitor allele p1-ovov454 as transposition substrates, followed by insertion of these transposon ends into the chromosome 10S target site. As shown in Figure 1B, the Ac 5′ terminus is joined to the distal end of 10S, while the 3′ terminus of fAc is ligated to the proximal side, thereby maintaining the monocentric condition of each chromosome. In addition, we detected a 5.9-kb segmental duplication (labeled by the asterisk in Figure 1C; sequences are listed in File S1) at the translocation junction of the Ac 5′ terminus in p1-vvD103. The 5.9-kb segment includes 1601 bp from the p1 locus (822 bp of p1 intron 2 plus 779 bp of fAc) and 4331 bp of sequence duplicated from the translocation junction on chromosome 10. We propose that this duplication was generated by repair of a transposition-induced double-strand break (DSB) through template switching followed by nonhomologous end joining (NHEJ), as explained further in the Discussion.
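As a quick consistency check, the components quoted for the duplicated segment can be summed. This minimal sketch uses only the numbers given in the text; the variable names are illustrative.

```python
# Components of the duplicated segment at the p1-vvD103 junction, in bp,
# taken from the text. Variable names are illustrative.
p1_intron2_bp = 822
fac_bp = 779
chr10_bp = 4331

p1_locus_bp = p1_intron2_bp + fac_bp   # portion derived from the p1 locus
segment_bp = p1_locus_bp + chr10_bp    # whole duplicated segment

print(p1_locus_bp)                     # 1601
print(round(segment_bp / 1000, 1))     # 5.9 (kb)
```

The sum agrees with both the 1601-bp p1-derived portion and the ∼5.9-kb total stated for the segment.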
P1P2, the p1 and p2 fusion allele induced by reversed-Ac-ends transposition
The progeny ears of p1-vvD103/p1-ww exhibited occasional kernels with small red pericarp stripes and sectors. This observation was surprising because the p1 gene exons 1 and 2 were separated from exon 3 by the chromosome 1-10 translocation in p1-vvD103. In a larger population of p1-vvD103 ears, we obtained a number of multi-kernel sectors exhibiting distinct red pigmentation. The new phenotype closely resembles that of p1-ovov454 (Figure 1), suggesting that the structural changes likely involve the p1 and/or p2 loci; we therefore named these alleles “P1P2.”
Like the p1-vvD103/p1-ww plants, the P1P2/p1-ww plants show irregular seed spacing on the ear and 50% pollen abortion, resulting from male and female semisterility. We also observed a ring of four in diakinesis cells indicative of quadrivalent associations in reciprocal translocation heterozygotes (Table S2). The translocation junction could also be amplified by PCR with primers Ac5 and D5 in P1P2 alleles in the same way as for p1-vvD103 (Figure 2B). Collectively, these results indicate that the newly obtained red pericarp phenotype was produced in spite of the chromosome 1-10 translocation. To characterize Ac activity, plants carrying P1P2 were crossed with the Ac tester stock p1-ww r-m3::Ds (see Materials and Methods for details of the Ac activity test). Interestingly, the P1P2 alleles exhibited developmentally delayed Ds excision, distinct from p1-vvD103. This phenotype is associated with increased Ac copy number resulting in the characteristic Ac negative-dosage effect (McClintock 1950, 1952; Heinlein et al. 1994; Kunze et al. 1995; Kunze and Weil 2002).
To explain the formation of the P1P2 alleles from p1-vvD103, we propose a model of RET during DNA replication followed by insertion of the excised transposon ends into the p2 locus of the sister chromatid (Figure 1, C–E). This mechanism doubles Ac copy number, maintains the intact translocation junctions, and generates a duplication of sequences located between the insertion and excision sites (∼70 kb in size). The duplication junction links the p1 sequences from the excision site to the p2 sequences at the insertion site, creating a transcriptional fusion of exons 1 and 2 of the p1 gene with exon 3 of the p2 gene. The P1P2 fusion alleles would retain the 5′ promoter region of the p1 gene and hence should be transcribed and expressed in kernel pericarp.
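The structural outcome of this model can be illustrated with a toy representation in which a chromatid is an ordered list of labeled segments. This is a sketch under strong assumptions that are not taken from the study: p2 is placed upstream of p1 in the same orientation, the insertion site is placed in p2 intron 2 (the actual sites differ among alleles), the Ac copy at the junction stays with the donor arm, and all function and segment names are hypothetical.

```python
# Toy model of sister-chromatid RET. ASSUMPTIONS: segment order, orientation,
# and labels are simplifications chosen for illustration only.
CHROMATID = [
    "p2_exon1", "p2_exon2", "p2_exon3",
    "intergenic",
    "p1_exon1", "p1_exon2",
    "Ac", "chr10_junction",
]


def ret_join(chromatid, excision_end, insertion_start):
    """Join the donor chromatid up to the excision site with the sister
    chromatid from the insertion site onward."""
    return chromatid[:excision_end] + chromatid[insertion_start:]


def duplicated_segments(chromatid, excision_end, insertion_start):
    """Segments lying between the insertion and excision sites occur twice."""
    return chromatid[insertion_start:excision_end]


def fusion_exons(product):
    """Exons read from the first p1 exon 1 through the next p2 exon 3."""
    start = product.index("p1_exon1")
    stop = product.index("p2_exon3", start) + 1
    return [seg for seg in product[start:stop] if "exon" in seg]


excision_end = CHROMATID.index("Ac") + 1        # just past the donor Ac
insertion_start = CHROMATID.index("p2_exon3")   # within p2 intron 2

product = ret_join(CHROMATID, excision_end, insertion_start)
print(product)
print(duplicated_segments(CHROMATID, excision_end, insertion_start))
print(fusion_exons(product))
```

In this simplified picture the product carries two Ac copies, retains the chromosome 10 junction, duplicates everything between the two sites, and places p1 exons 1 and 2 upstream of p2 exon 3, in line with the predictions stated in the text.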
To test this hypothesis, we performed PCR on genomic DNA of independently derived P1P2 alleles using one primer (Ac5) on Ac and three primers spaced along intron 2 and exon 3 of the p2 gene; these primers should amplify the predicted p1::p2 duplication junction (Figure 3A). The results confirm the prediction: strong product bands of various sizes are observed in 12 independent P1P2 alleles (Figure 3B, lanes 1–14), while little or no product is formed in the negative control p1-ww[4Co63] (“J”), the grandparent allele p1-ovov454 (“454”), or the immediate progenitor p1-vvD103 (“D”). Further evidence is provided by genomic Southern blot using the probe fragment pp1, which detects both p1 and p2 genes. The results from a subset of P1P2 alleles are shown in Figure 3C. In addition to bands derived from the intact p1 gene (10 kb), the p1 fragment linked to the chromosome 10-1 translocation (6.5 kb), and the allele from inbred 4Co63 (∼6 kb), the P1P2 alleles exhibit novel bands ranging from 7.9 to 3.9 kb. Importantly, the sizes of PCR products (Figure 3B) and Southern blot bands (Figure 3C) are consistent with their origin from the P1P2 duplication junction depicted in Figure 3A. Additional tests by semiquantitative PCR and genomic DNA gel blot indicate the presence of duplications in these alleles (Figure S1).
Figure 3. The P1P2 alleles contain tandem direct duplications with breakpoints in the p2 gene. (A) Schematic structures of P1P2 fusion allele on chromosome 1-10 and reciprocal translocation region on chromosome 10-1. Bracketed segments indicate HindIII restriction fragments, and the open boxes represent the positions of probe fragment pp1. Black arrowheads indicate positions of PCR primers. (B) Gel profiles of PCR products from duplication junctions. Primers Ac5 + P1 amplified template DNA from J, 454, D, and P1P2-1–P1P2-5. Alleles P1P2-6 to P1P2-10 were amplified using primers Ac5 + P2, and alleles P1P2-12 and P1P2-14 were amplified using primers Ac5 + P3 (from a separate gel). (C) DNA gel blot analysis of P1P2 alleles hybridized with probe fragment pp1. Arrows indicate bands common to multiple templates. Bands of varying size (ranging from 7.9 to 3.9 kb) are derived from duplication junctions specific for each allele. Lanes in B and C: J, p1-ww[4Co63]; 454, p1-ovov454; D, p1-vvD103. Lane 7f contains DNA from a derivative of P1P2-7 in which Ac has transposed and a newly formed fAc is present; hence its PCR patterns are somewhat different from those of lane 7.
Finally, the PCR products spanning the duplication junctions in each of 11 P1P2 alleles were cloned and sequenced. The sequences flanking the 5′ end of Ac are all derived from breakpoints in the p2 gene (summarized in File S2). These results confirm that, in each case, the 5′ end of Ac is inserted into the p2 gene at the positions shown in Figure 4.
Figure 4. Structures of chimeric P1P2 fusion genes. (A) Exon/intron structure of p2 gene with Ac insertion sites specific to each P1P2 allele indicated by numbered red triangles. Arrowhead marked “ATG” indicates translation start codon; stop sign indicates stop codon. (B) Fine structures of P1P2 fusion genes. Each chimeric allele contains p1 exons 1 and 2 and partial intron 2, fused via Ac with the p2 gene. Sequences shown are from the Ac/p2 fusion junctions; sequences from p2 exons 1, 2, and 3 are shown in gray boxes.
Chimeric P1P2 alleles are expressed in maize pericarp
To explain the pericarp pigmentation phenotype observed in the P1P2 alleles, we tested the expression of chimeric fusion alleles in the developing pericarp, where p1 is expressed. We dissected developing kernel pericarp (15–20 days after pollination) from progenitor allele p1-vvD103 and 10 of 11 P1P2 fusion alleles, prepared total RNA, and performed RT-PCR (Materials and Methods). Using primers located on exon 1 of p1 and exon 3 of p2, we detected a 620-bp PCR band in all P1P2 fusion alleles (Figure 5). The size of the 620-bp band is consistent with its origin from a chimeric transcript containing exons 1 and 2 from the p1 gene and exon 3 from p2. These results confirm that the fusion alleles are transcribed in the pericarp.
Figure 5. The chimeric P1P2 alleles are expressed in maize pericarp. (A) Diagram of predicted structure of P1P2 transcripts, indicating positions of primers P4 and P5. (B) Gel analysis of RT-PCR products produced by primer P4 from p1 exon 1 and P5 from p2 exon 3 (620 bp), with the internal control of a tubulin gene fragment (322 bp). Total RNA extracted from developing kernel pericarp was used as template for RT-PCR. Depending on whether plants were homozygous or heterozygous at the fusion locus, different intensities of 620-bp bands were obtained for different samples of P1P2 alleles.
To assess the potential functionality of the resulting protein, we aligned the deduced P1P2 protein sequence with that of the Myb-like transcription factor P1 (Zea mays) AAL24047.1 (Figure S2). The chimeric protein shares 95% identity with maize P1 protein and contains both the R2R3 Myb DNA-binding region and the putative transcription activation domain (Essers et al. 2000; Sidorenko et al. 2000; Zhang et al. 2000; Kunze and Weil 2002). Considering the high similarity of the P1 and predicted P1P2 proteins, we consider it likely that P1P2 is capable of activating the phlobaphene biosynthetic pathway and hence is responsible for the kernel pericarp pigmentation observed in P1P2 fusion alleles.
In P1P2-1, P1P2-2, and P1P2-3, the Ac insertion site (and hence the p1/p2 junction) is located upstream of p2 intron 2, leading to a precursor transcript with a tandem duplication of intron 2 and exon 2 (the first from p1 and the second from p2). We tested for potential alternative splicing of the p1-p2 transcript in these three fusion alleles by performing RT-PCR using flanking primers P4 and P5. This test yields a single 620-bp band in these alleles, as for the other fusion alleles, indicating that only one copy of exon 2 remains in the final transcript; this result was confirmed by sequencing the 620-bp RT-PCR product (data not shown). Because p1 and p2 exon 2 sequences differ at two SNPs, we examined the signal peaks in the Chromas sequencing profiles to determine which exon 2 is retained. As shown in Figure S3, only p1 signals were detected in the P1P2-1 and P1P2-2 alleles, indicating that p2 exon 2 is always removed; i.e., alternative splicing does not occur. In contrast, allele P1P2-3 exhibits mixed p1/p2 sequence peaks. However, further investigation showed that the proximal duplicated copy of p2 contains an additional composite insertion (Zhang et al. 2014) that drives expression of p2 in pericarp (File S3). Therefore, the p2 signals seen in P1P2-3 may be derived from the ectopically expressed p2 gene. In summary, two alleles tested (P1P2-1 and P1P2-2) show no evidence of alternative splicing, while for a third allele (P1P2-3) the evidence is inconclusive.
Extant maize gene contains duplicated exons derived by alternative transposition
A previous report provided evidence that the RET mechanism has contributed to maize B73 genome evolution by inducing tandem duplications (Zhang et al. 2013). To test whether RET events have also been involved in formation of chimeric genes, we developed and applied bioinformatics methods (T. Zuo, unpublished results) to search the maize B73 reference genome sequence (Schnable et al. 2009) for genes containing duplications possibly derived by alternative transposition events. One such case (gene AC234515.1; Figure 6, A and B) contains 12 exons; exons 10 and 11 are nearly identical duplications of exons 8 and 9. The duplicated exons are contained within tandem 1054- to 1056-bp duplications that are 99% identical. The same duplicated structure of gene AC234515.1 is also found in the reference genome sequence of another maize inbred line, Mo17 (Xin et al. 2013); the duplication in Mo17 is 99% identical (2106/2110) with that of B73.
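The identity figure quoted for the B73 and Mo17 comparison follows directly from the match count. A minimal sketch, in which the helper name `percent_identity` is hypothetical and identity is simply matches divided by aligned length:

```python
def percent_identity(matches, aligned_length):
    """Percent identity = matching positions / aligned positions * 100."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * matches / aligned_length


# B73 versus Mo17 duplication, using the counts given in the text.
mismatches = 2110 - 2106
identity = percent_identity(2106, 2110)
print(f"{mismatches} mismatched positions, {identity:.2f}% identity")
```

The aligned length of 2110 bp equals the combined length of the two duplicated segments (1054 + 1056 bp).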
Figure 6. RET model for generation of duplications in maize gene AC234515.1. (A) Structure of maize gene AC234515.1_FGT002 downloaded from the Maize Genome Database (http://www.maizegdb.org). (B) AC234515.1 contains two directly duplicated segments of 1054 and 1056 bp (gray-shaded arrows), including exons 8, 9, 10, 11, intron sequences, and duplicated PIF-Harbinger MITE elements. The MITEs (blue lines) have degraded 5′-end sequences (truncated solid arrowheads) and intact 3′ ends (complete open arrowheads). Both MITEs are flanked by TSDs (“TAA”) shown as black vertical lines. (C) Prior to RET, the progenitor gene may consist of 10 exons (labeled here as 1–9, plus 12). An intact 5′ end (blue triangle) from a PIF-Harbinger MITE element is predicted to be located downstream of exon 12. (D) The reverse-oriented 3′ and 5′ MITE termini (circled in cyan) flanking exon 12 undergo RET beginning with excision of the termini. The intertransposon segment containing exon 12 is circularized and presumably lost. The excised termini reinsert into intron 7 on the sister chromatid, generating TSDs indicated by vertical orange lines. (E) The expected gene structure generated by RET. The gene contains identical duplicated segments (gray-shaded arrows) containing exons 8 and 9 and MITE elements located precisely at the duplication endpoints. The 5′ MITE end located downstream of exon 12 may be subsequently excised or segregated from the AC234515.1 gene.
Interestingly, two PIF-Harbinger MITE transposons are found precisely at the duplication endpoints; these MITEs belong to the DTH_Zem5 family (Chen et al. 2014) and are indexed with the accession nos. SQ225117734 and SQ225117735 (alignment with the representative MITE SE220500133 is shown in File S4). The duplicated MITEs are 100% identical in sequence, but are <95% similar to other MITEs in the genome. This suggests that these MITEs were generated in a relatively recent transposition or duplication event. However, both MITEs contain two nucleotide substitutions at the 5′ TIR and one substitution at the 3′ TIR. These changes severely reduce the complementarity of the 14-bp TIRs and would most likely render these MITEs unable to undergo standard transposition. Thus, it is unlikely that these MITEs were generated by two independent standard transposition and insertion events. However, these MITEs and their associated duplications could result from RET, as this mechanism does not require intact termini from the same element. For example, a maize fAc element, which contains only the 3′ terminus of Ac, is capable of undergoing RET in cooperation with a 5′ Ac terminus provided by a nearby Ac element (Zhang et al. 2009).
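The effect of such substitutions on TIR complementarity can be illustrated by comparing a 5′ TIR with the reverse complement of a 3′ TIR. The sketch below is hypothetical throughout: the 14-bp sequences are made-up placeholders, not the real DTH_Zem5 TIRs, and the substituted positions are chosen arbitrarily.

```python
# Count mismatches between a 5' TIR and the reverse complement of a 3' TIR.
# The sequences used here are MADE-UP placeholders for illustration.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def substitute(seq, position, base):
    """Return seq with the base at a 0-based position replaced."""
    return seq[:position] + base + seq[position + 1:]


def tir_mismatches(tir5, tir3):
    """Positions at which tir5 differs from the reverse complement of tir3."""
    if len(tir5) != len(tir3):
        raise ValueError("TIRs must have equal length")
    target = reverse_complement(tir3)
    return sum(a != b for a, b in zip(tir5.upper(), target))


intact_5 = "GGCCTTGTTCGGTT"               # hypothetical 14-bp 5' TIR
intact_3 = reverse_complement(intact_5)   # perfectly complementary 3' TIR

# Two substitutions in the 5' TIR and one in the 3' TIR, as in the text.
mutated_5 = substitute(substitute(intact_5, 1, "A"), 5, "C")
mutated_3 = substitute(intact_3, 0, "C")

print(tir_mismatches(intact_5, intact_3))    # intact element
print(tir_mismatches(mutated_5, mutated_3))  # both termini of the mutated MITE
print(tir_mismatches(intact_5, mutated_3))   # mutated 3' TIR with an intact 5' TIR
```

With these placeholder sequences, the mutated element's own termini mismatch at three positions, whereas its 3′ TIR paired with an intact 5′ TIR mismatches at only one. This is the kind of pairing between two different elements that RET permits.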
We propose that the duplicated structure of gene AC234515.1 was formed by RET during DNA replication (Figure 6, C–E) as follows: The progenitor gene may consist of 10 exons and contain a PIF-Harbinger MITE element inserted into intron 9, as well as a second MITE located downstream of the last exon. Subsequent to its initial insertion, the 5′ and 3′ termini of the intron 9 MITE underwent the mutations noted above. Because the 3′ TIR has only a single mismatch, it could participate in RET with the 5′ terminus of a downstream MITE with a complementary TIR. These two MITE termini were excised from the chromatid in an RET event (Figure 6D). The excised MITE termini reinserted into intron 7 of the progenitor gene on the sister chromatid, generating the 1054- to 1056-bp direct segmental duplications of exons 8 and 9 and intron 9 including the MITE. Reinsertion after RET generated the same TSD sequences as those flanking the initial MITE insertions, probably due to the strong preferential insertion at TA sequences during the transposition of PIF/Harbinger (Bureau and Wessler 1992; Jiang and Wessler 2001).
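The exon structure predicted by this model can be checked with a simple ordered-list representation of the progenitor gene. This is a sketch under stated assumptions: the labels follow the progenitor numbering used in the text (exons 1–9 plus the final exon, called 12), intron 7 is taken as the boundary between exons 7 and 8, and all names are hypothetical.

```python
# Toy check of the exon order predicted by the RET model for AC234515.1.
# ASSUMPTION: the gene is reduced to an ordered list of labels.
PROGENITOR = (
    [f"exon{i}" for i in range(1, 10)]      # exons 1-9
    + ["MITE", "exon12", "MITE_5prime_end"]  # intron 9 MITE, last exon, downstream MITE end
)

# Donor chromatid is kept through the intron 9 MITE; the sister chromatid
# is joined from intron 7 (just before exon 8) onward.
excision_end = PROGENITOR.index("MITE") + 1
insertion_start = PROGENITOR.index("exon8")

predicted = PROGENITOR[:excision_end] + PROGENITOR[insertion_start:]
exons = [seg for seg in predicted if seg.startswith("exon")]

print(predicted)
print(len(exons))
```

In the predicted structure, the tenth and eleventh exons are copies of exons 8 and 9, and a MITE follows each copy of exon 9, which matches the 12-exon arrangement described for the extant gene.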
Our model presupposes the existence of an intact 5′ end from a PIF-Harbinger element of the DTH_Zem5 family located downstream of gene AC234515.1. We could not identify this type of MITE terminus within the 1-Mbp distance 3′ of AC234515.1 in the maize B73 sequence; possibly it was lost sometime after the RET event due to excision or segregation. However, we did identify other maize genomic MITEs that contain 5′ TIRs with perfect complementarity to the 3′ TIR of the duplication-associated MITEs (File S5 and File S6). These MITEs with complementary 5′ TIRs were also flanked by 3-bp TSDs (File S6), indicating their competence for transposition. Together, these findings are entirely consistent with the proposed mechanism of duplication via RET involving complementary 5′ and 3′ TIRs derived from two different MITEs.
In addition, other potential mechanisms of duplication (reviewed in Hastings et al. 2009) are inconsistent with the structure and sequence of the duplication in AC234515.1. One well-known mechanism for generating genomic duplications is unequal crossing over, also known as nonallelic homologous recombination (NAHR). In NAHR, displaced crossing over between two independent MITE insertions in introns 7 and 9 could produce a duplication; however, this crossover mechanism would also result in a third MITE located exactly at the beginning of the duplications (Figure S4). No MITE is present at this position in gene AC234515.1 in either B73 or Mo17. Excision of the predicted third MITE after the duplication is also unlikely because of its noncomplementary 5′ and 3′ termini, which would have existed prior to the duplication. Moreover, the fact that the two MITEs in the duplication are 100% identical is entirely consistent with a recent RET event, but is highly unlikely to result from NAHR. If the duplication were generated by NAHR, the two MITEs would most likely be polymorphic since they would have originated from two independent insertions into the progenitor gene, and a crossover between them would have produced a hybrid element. We also discount other known mechanisms such as NHEJ, fork stalling and template switching, and microhomology-mediated break-induced replication. Each of these mechanisms would be expected to occur at random genomic positions, whereas in this case the MITEs are located precisely at the duplication endpoints, strongly suggesting that the MITEs played a causative role in the duplication mechanism. In addition, there is no evidence of microhomologies, ectopic sequences, or filler DNA as would be expected from these repair mechanisms. Taken together, the evidence supports RET as the mechanism that generated the duplications in gene AC234515.1 and is inconsistent with other duplication models.
Discussion
Previous reports have demonstrated the ability of Ac/Ds alternative transposition events to generate genome rearrangements in maize, including deletions, duplications, inversions, and translocations (Zhang and Peterson 2004; Zhang et al. 2006, 2009, 2013; Huang and Dooner 2008). At each rearrangement breakpoint there is the potential for coding sequences of two different genes to be fused to form a novel chimeric gene. Here we show that such events can occur to generate functional fusion genes. We obtained genome rearrangements using visual screens for changes in maize kernel pericarp pigmentation, a trait controlled by the maize p1 gene. Starting with a pigmented maize stock containing a p1 allele (p1-ovov454) with Ac/fAc insertions in p1, we isolated a loss-of-function allele (p1-vvD103) with a chromosome 1-10 reciprocal translocation. This translocation breaks the p1 gene in intron 2, placing the p1 promoter and exons 1 and 2 on a different chromosome from exon 3. From this stock, we then isolated and characterized a series of 11 gain-of-function alleles in which p1-coding sequences were joined to a paralog (p2) located ∼70 kb proximal to p1. The fusion junctions occur at various sites in the p2 gene, but all appear to be capable of producing a chimeric gene product encoded by p1 exons 1 and 2 and p2 exon 3. Due to the high sequence similarity of p1 and p2, we propose that their fusion products retain the capacity to regulate flavonoid biosynthesis. Transcript analysis shows that the chimeric genes are expressed in developing kernel pericarp, as would be expected given the genes’ location downstream of the p1 promoter.
We propose that these 11 cases were generated by a type of alternative transposition known as RET (Zhang and Peterson 2004) in which an Ac end excised from the p1 gene on one chromatid is inserted into the p2 gene on the sister chromatid. This event produces a duplication of ∼70 kb representing the DNA between p1 and p2. This model is consistent with previous reports of RET-induced duplications (Zhang et al. 2013) and is supported by PCR and Southern blot data.
In an independent study, Zhang et al. (2006) documented a series of chimeric p-oo alleles (orange pericarp, orange cob) also generated by RET events, except that the transposon excised from p1 reinserted into the proximal p2 gene of the same chromatid. This results in a deletion of the ∼70-kb intergenic segment and formation of a chimeric gene consisting of the promoter, exon 1, and exon 2 of p2 fused to exon 3 and the downstream enhancer of p1 (Sidorenko et al. 1999). Although the p2 gene is normally not expressed in pericarp (Zhang et al. 2000), p-oo transcripts were detected in kernel pericarp, suggesting that the downstream enhancer of p1 activates the p2 promoter. Interestingly, the intensity of pigmentation of different p-oo alleles was inversely correlated with both the size of the chimeric intron and the distance to the downstream enhancer. In contrast, the pigmentation intensities of the fusion alleles described here do not correlate with intron size (Figure S5). This suggests that splicing efficiency does not appreciably differ for these introns ranging in size from 7.9 to 11.9 kb. Thus, the observed variation in p-oo expression is likely due to varying distances between the p1 3′ enhancer and the p2 promoter.
The distribution of duplication junctions in p2 (and hence of the Ac insertion sites during RET) is distinctly nonrandom: 4 of 11 insertions are located in the 5′ region of the p2 gene, including the first two exons, and 6 insertions are located in the 3′ region of p2 intron 2. Only one P1P2 allele has a breakpoint near the center of intron 2, and the 5′ half of intron 2 is devoid of insertions. This pattern suggests that the 5′ region of p2 and the 3′ region of p2 intron 2 are Ac insertion hot spots and that the 5′ half of intron 2 is a cold zone. A similar pattern of nonrandom Ac insertion was observed in 21 p1-vv alleles derived by intragenic transposition of Ac (Athma et al. 1992): 15 were clustered in the 5′ region of the p1 gene, including exon 1 and exon 2, and 6 were inserted in the 3′ region of p1 intron 2. To compare the Ac insertion preferences between p1 and p2, we aligned the p1 and p2 genomic sequences and plotted the insertion sites of Ac in p1-vv by standard transposition and in p2 by RET (Figure S6 and File S7). The results indicate that the target site preferences of Ac insertion during standard and alternative transposition are similar. The areas of frequent Ac insertions in both p1 and p2 exhibit high sequence similarity, while the regions devoid of Ac insertions are more diverse. The conservation of hot spots and dissimilarity of cold zones suggest that insertion site preference is largely a result of preferential targeting to hot spots, rather than avoidance of cold zones.
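The insertion-site counts quoted above can be tabulated with a short Python sketch. The region labels are shorthand introduced here for illustration; the counts are taken directly from the text, and no statistical test is attempted because the lengths of the regions are not given in this section.

```python
# Counts are taken from the text; region labels are informal shorthand.
p2_ret_insertions = {
    "5' region (incl. exons 1 and 2)": 4,
    "intron 2, 5' half": 0,
    "intron 2, near center": 1,
    "intron 2, 3' region": 6,
}
p1_vv_insertions = {
    "5' region (incl. exons 1 and 2)": 15,
    "intron 2, 3' region": 6,
}


def fractions(counts: dict) -> dict:
    """Convert raw insertion counts into fractions of the total."""
    total = sum(counts.values())
    return {region: n / total for region, n in counts.items()}


for name, counts in (("p2 (RET)", p2_ret_insertions),
                     ("p1-vv (standard)", p1_vv_insertions)):
    print(f"{name}: {sum(counts.values())} insertions")
    for region, frac in fractions(counts).items():
        print(f"  {region}: {frac:.2f}")
```

The tallies confirm that the reported categories account for all 11 p2 junctions and all 21 p1-vv insertions.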
The chimeric genes isolated here were derived from a loss-of-function allele (p1-vvD103) borne on a translocation chromosome, which was itself derived from progenitor P1-ovov454 by alternative transposition. Previous research has demonstrated that RET followed by insertion of the excised transposon ends into a different chromosome can generate translocations (Zhang et al. 2009). However, in the p1-vvD103 allele described here, the translocation junctions contain an additional 5.9-kb duplication. Sequences of the duplication suggest that it was formed by DSB repair involving template switching followed by NHEJ as described for the origin of an Ac-immobilized allele in maize (Conrad and Brutnell 2005). As shown in the model in Figure 7, two pairs of DSBs were initiated by Ac transposase during alternative transposition: one pair at the p1 locus where the alternative transposon excised and the other pair at chromosome 10 where the excised transposon ends inserted. Following the excision event, the 3′ terminus of the alternative transposon was ligated to the proximal side of the chromosome 10 break precisely at the translocation site. At the 5′ DSB, in contrast, the broken end invaded and copied the sister chromatid for 1601 bp, resulting in an extension of p1 and partial fAc sequences. Similarly, the break at the distal side of chromosome 10 used the sister chromatid as template for replication of 4331 bp from chromosome 10. The two extended fragments annealed at the 3-bp microhomology site “GTA” and were covalently joined by the NHEJ repair mechanism (Lehman 1997).
Figure 7. Template switching followed by NHEJ explains the 5.9-kb duplication at the p1-vvD103 translocation junction. (A) Replicated chromosomes 1 (black) and 10 (blue) in progenitor p1-ovov454 allele are shown; chromosome 1 contains p1 and p2 loci with Ac/fAc insertions indicated in red. The segments to be duplicated are shaded in pink (1601 nt from p1 locus) and blue (4331 nt from chromosome 10). (B) Excision of fAc and Ac termini by Ac transposase generates DSBs and circularizes the small intertransposon segment. Ac transposase also cuts at the insertion target site (orange vertical line) on chromosome 10, which will join with the fAc end to generate the chromosome 10-1 translocation. (C) Repair of DSBs. The DSB at the Ac 5′ end primes repair replication using the sister chromatid as template, copying p1 and partial fAc sequences. Similarly, the DSB at the distal side of chromosome 10 uses sister chromatid as replication template. (D) The two extended chromatids are joined by NHEJ using the 3-bp microhomology site “GTA,” producing a chromosome 1-10 translocation junction containing a 5.9-kb duplication composed of 1601 nt from chromosome 1 and 4331 nt from chromosome 10. (E) Final structure of translocation chromosomes in p1-vvD103 allele.
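As a quick check of the numbers in this model, the following Python sketch sums the two copied segments and shows, on toy sequences, how two extended ends sharing a 3-bp microhomology could be joined. The toy sequences and the function name are hypothetical; only the segment lengths and the “GTA” microhomology come from the text, and whether the shared 3 bp is counted once or twice in the reported lengths is not stated here.

```python
# Segment lengths and the microhomology are from the text; the toy
# sequences and function name below are hypothetical illustrations.
P1_SEGMENT_BP = 1601      # copied from the sister chromatid at the p1 locus
CHR10_SEGMENT_BP = 4331   # copied from the sister chromatid on chromosome 10
MICROHOMOLOGY = "GTA"

total_bp = P1_SEGMENT_BP + CHR10_SEGMENT_BP
print(f"duplication: {total_bp} bp, about {round(total_bp / 1000, 1)} kb")


def join_at_microhomology(left: str, right: str, microhomology: str) -> str:
    """Join two extended ends that overlap by a shared microhomology,
    keeping a single copy of the shared bases at the junction."""
    if not left.endswith(microhomology) or not right.startswith(microhomology):
        raise ValueError("both ends must carry the microhomology at the junction")
    return left + right[len(microhomology):]


# Toy example: the junction retains one copy of "GTA".
left_end = "ACC" + MICROHOMOLOGY
right_end = MICROHOMOLOGY + "TTG"
print(join_at_microhomology(left_end, right_end, MICROHOMOLOGY))
```

Either way of counting the shared bases gives a total that rounds to the 5.9 kb reported for the duplication.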
In summary, we characterized 11 de novo segmental duplications and associated functional fusion genes induced by alternative transposition. We also identified an extant gene AC234515.1 in the maize B73 genome that contains an internal duplication apparently generated by an alternative transposition event. These results indicate that Ac (and possibly other TEs that transpose via a cut-and-paste mechanism) can generate duplications and functional chimeric genes in the maize genome. The pattern of genetic diversity in maize is attributable to mechanisms such as mutation, recombination, artificial selection, gene conversion, and homology-based gap repair (Gaut et al. 2000). Our results demonstrate another possible mechanism for the formation of chimeric genes and further illustrate the potential role of TEs in generating genetic diversity.
Acknowledgments
We thank Meixia Zhao for her great help in the bioinformatics searches of MITE sequences in the maize genome; Terry Olson for technical assistance in the molecular experiments; and Douglas Baker for field assistance. This research is supported by National Science Foundation award 0923826 (to T.P., D.F.W., and J.Z.).
Authors’ contributions: D.W. conceived of the study, designed the experiments, carried out the molecular genetic studies, and drafted the manuscript. C.Y. conceived of the study, designed the experiments, and carried out the molecular genetic studies. T.Z. performed a genome search for duplication structures and drafted the manuscript. J.Z. conceived of the study and participated in the design of the experiments. D.F.W. participated in the design of the study and performed the cytogenetics analysis. T.P. conceived of the study, participated in its design and coordination, and drafted the manuscript. All authors read and approved the final manuscript.
Footnotes
Communicating editor: J. Sekelsky
Supporting information is available online at www.genetics.org/lookup/suppl/doi:10.1534/genetics.115.178210/-/DC1.
- Received May 15, 2015.
- Accepted September 6, 2015.
- Copyright © 2015 by the Genetics Society of America