Digestion–ligation–amplification (DLA), a novel adaptor-mediated PCR-based method that uses a single-stranded oligo as the adaptor, was developed to overcome difficulties of amplifying unknown sequences flanking known DNA sequences in large genomes. DLA specifically overcomes the problems associated with existing methods for amplifying genomic sequences flanking Mu transposons, including high levels of nonspecific amplification. Two DLA-based strategies, MuClone and DLA-454, were developed to isolate Mu-tagged alleles. MuClone allows for the amplification of subsets of the numerous Mu transposons in the genome, using unique three-nucleotide tags at the 3′ ends of primers, simplifying the identification of flanking sequences that cosegregate with mutant phenotypes caused by Mu insertions. DLA-454, which combines DLA with 454 pyrosequencing, permits the efficient cloning of genes for which multiple independent insertion alleles are available without the need to develop segregating populations. The utility of each approach was validated by independently cloning the gl4 (glossy4) gene. Mutants of gl4 lack the normal accumulation of epicuticular waxes. The gl4 gene is a homolog of the Arabidopsis CUT1 gene, which encodes a condensing enzyme involved in the synthesis of very-long-chain fatty acids, which are precursors of epicuticular waxes.
INSERTIONAL mutagenesis is widely used in functional genomics. For example, insertion mutants obtained via T-DNA in Arabidopsis (Alonso et al. 2003) and rice (Sallaud et al. 2004) and via transposons in maize (Brutnell 2002; Brutnell and Conrad 2003; May et al. 2003; McCarty et al. 2005; Settles et al. 2007), rice (Kolesnik et al. 2004; Miyao et al. 2003; Kumar et al. 2005), and Arabidopsis (Speulman et al. 1999) have been used for both forward and reverse genetics. In both situations it is necessary to identify sequences flanking the insertional mutagen. For example, the availability of sequence-indexed collections of T-DNA insertion mutants (Alonso et al. 2003) has greatly facilitated the functional analysis of Arabidopsis. Such reverse genetic resources are generated by creating large numbers of independent insertion events and then identifying and sequencing the DNA flanking the insertional mutagen. To be cost effective such flanking sequences are typically amplified using one of several available “genome-walking” strategies (Shyamala and Ames 1989; Alonso et al. 2003; O'Malley et al. 2007; Vandenbussche et al. 2008; Uren et al. 2009).
Similarly, once mutant phenotypes have been identified following forward genetic screens, the challenge in cloning the affected gene is to identify the specific genic sequences that flank causative insertions. Insertional mutagenesis is typically more productive if multiple copies of the insertional mutagen are present. The Mutator (Mu) transposon of maize has been widely used for forward genetics because of its high copy number and transposition activity (Benito and Walbot 1997). This high copy number can, however, complicate the identification of the specific insertion responsible for a mutant phenotype. Traditionally, identifying a gene sequence that had been tagged by an insertion involved genomic DNA blotting using multiple wild-type and mutant siblings to identify a DNA fragment that contained the insertion and that cosegregated with the mutant phenotype (James et al. 1995). However, both DNA blotting and subsequent postblotting gene isolation steps were laborious, time-consuming, and often unpredictable.
Here, we report two strategies, MuClone and DLA-454, for cloning mutant alleles derived from insertional mutagenesis. Both strategies are based on an adaptation of a novel, highly specific, and efficient genome-walking method, digestion–ligation–amplification (DLA), which uses a single-stranded oligo as the adaptor instead of the partially double-stranded adaptors used in other methods. MuClone, a cost-efficient strategy, adds unique three-nucleotide tags to the 3′ ends of the common adaptor primer so that subsets of high-copy Mu transposons can be separately amplified in a manner analogous to AFLP technology (Yunis et al. 1991). It is then possible to identify which copy of the transposon cosegregates with the mutant allele in a segregating population. DLA-454 combines DLA with 454 pyrosequencing to amplify and sequence multiple independent alleles of a gene to be cloned. Analysis of the resulting Mu flanking sequences (MFSs) identifies the target gene. To illustrate the applicability of the MuClone and DLA-454 strategies, each was used to independently clone the glossy4 (gl4) gene. The maize gl4 gene is a homolog of the Arabidopsis CUT1 gene involved in epicuticular wax accumulation.
MATERIALS AND METHODS
A total of 16 Mu-induced alleles of gl4a (gl4, to distinguish from its paralog, gl4b) were available for analysis. One (gl4a-92-1178-64) was previously isolated and the remaining 15 alleles were generated via direct tagging experiments as illustrated in supporting information, Figure S1. The su1 gene was used as a visual marker to select for gl4a-Mu or -ref alleles. However, because the su1 gene is ∼25 cM from the gl4a locus (our unpublished data), one-quarter of the plants selected after cross 3 (Figure S1) would be expected to not carry a gl4a-Mu allele but to instead carry gl4a-ref. This is consistent with the fact that Mu insertions could be identified only for 13/15 of the gl4a alleles isolated via direct tagging. Five mutant seedlings from each selfed family of gl4a/gl4a mutants and five nonmutant seedlings from each selfed family that did not segregate for gl4a mutants (cross 4, Figure S1) were pooled separately for DNA isolation and used for cosegregation analysis.
The DLA method (Figure 1):
Approximately 1 μg of genomic DNA was digested with 10 units NspI [New England Biolabs (Beverly, MA), no. R0602L] at 37° for 1.5 hr. The single-stranded oligo, NspI-5, was added as the adaptor (5 μM) for ligation with T4 DNA ligase (New England Biolabs, no. M0202L). Ligation was performed at 16° for 3 hr in a 50-μl volume. The digestion–ligation product was then purified using the Qiaquick PCR purification kit [QIAGEN (Valencia, CA), no. 28106]. Unblocked and blocked (next section) ligation products were used as templates for the primary PCR reaction with the target-specific primer (TSP1, e.g., Mu-TIR, see Table S1) and the adaptor primer (NspI-5), using AmpliTaq Gold DNA Polymerase [Applied Biosystems (Foster City, CA), no. N808-0241]. The blocking step is described in the following section. This PCR program consisted of 94° for 10 min; 15 cycles of 94° for 30 sec, 60° for 45 sec, 72° for 2.5 min; and a final extension at 72° for 10 min in a 20-μl volume. The primary PCR product was diluted 10 times and 2 μl of diluted product was used for secondary PCR with nested primers TSP2 (e.g., aMu25, see Table S1) and NspI-P, using AmpliTaq Gold DNA Polymerase. This PCR program consisted of 94° for 10 min; suitable numbers of cycles (35 and 20 cycles were employed for MuClone and DLA-454, respectively) of 94° for 30 sec, 60° for 45 sec, and 72° for 2.5 min; and a final extension at 72° for 10 min.
Oligo sequences are as follows: NspI-5 20 mer, 5′-GAACGTCACAGCATGTCATG-3′; NspI-P 19 mer, 5′-AACGTCACAGCATGTCATG-3′.
ddNTP single-base extension (blocked DLA) and validation:
To prevent the single-stranded ends of ligation products from being filled during primary PCR and thus serving as undesired priming sites in the secondary PCR reaction, single-base extension with a ddNTP was performed prior to primary PCR to block further strand synthesis (Figure 1). Blocked and nonblocked DLA are defined as DLA with and without the ddNTP extension, respectively. In a 50-μl volume, 50–200 ng purified ligated DNAs were used for single-base ddNTP extension with 80 μM ddNTP and 4 units Polymerase I Large Fragment (Klenow) [Promega (Madison, WI), no. 9PIM220]. The blocking step was performed at 30° for 30 min, followed by 75° for 10 min to inactivate the enzyme. Prior to PCR, the Qiaquick PCR purification kit was used to remove excess ddNTP. To validate the application of DLA to amplifying MFSs and compare the specificity of blocked and nonblocked DLA, a pool of DNA from 12 sibling seedlings obtained via the self-pollination of a single Mu-active plant was amplified via both blocked and nonblocked DLA. The primary and secondary PCR steps employed primer pairs Mu-TIR/NspI-5 and aMu25/NspI-P (Table S1), respectively. The nested PCR products from blocked and nonblocked DLA were TOPO cloned per the protocol of the TOPO TA Cloning kit (Invitrogen, Carlsbad, CA) and plasmids from >90 clones from each set were isolated and sequenced with the M13f primer (5′-GTAAAACGACGGCCAG-3′).
Direct comparison between blocked DLA and splinkerette PCR:
Sau3AI digestion and adaptor ligation procedures of splinkerette PCR were performed according to Uren et al. (2009). The adaptor sequences and PCR primers (Splink 1 and 2) of Uren et al. (2009) were used. NspI digestion and ligation of blocked DLA were performed independently, using the same B73 genomic DNA as used for splinkerette PCR. Both ligated DNAs were subjected to cleanup, using the Qiaquick PCR purification kit prior to PCR amplification. The same sets of target-specific primers in combination with method-appropriate adaptor primers (Splink 1 and 2 for splinkerette PCR and NspI-5 and NspI-P for blocked DLA) and the same PCR program (94° for 10 min; 35 cycles of 94° for 30 sec, 60° for 45 sec, and 72° for 2.5 min; and 72° for 10 min) were used for both methods. The same volumes of the resulting PCR products were loaded during agarose gel electrophoresis.
DLA was adapted to facilitate cosegregation analysis that enables the cloning of Mu-tagged mutants. Instead of employing the NspI-P primer (with CATG 3′ ends) used in DLA as the adaptor primer for the PCR, a set of nested primers with the addition of unique three-nucleotide tags at the 3′ ends was used in separate reactions to selectively amplify sequences flanking distinct subsets of Mu insertions. Because the 3′-end sequence generated by NspI must be (A/G) with a CATG overhang, a total of 32 tagged primers are sufficient to anneal to all NspI sites during DLA (see Table S2). The presence of these tags provides additional specificity and eliminates the need to use blocked DLA in the MuClone method. Resulting PCR reactions were analyzed via agarose gel (2%) electrophoresis. Mutant-specific bands were cut from the gel, purified with the QIAGEN gel extraction kit (QIAGEN no. 28704), and directly sequenced.
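The count of 32 tagged primers can be reproduced with a short enumeration. The sketch below is illustrative only: it assumes that the tag position adjacent to CATG is restricted to two bases by the NspI recognition sequence (RCATGY; here written as C/T on the primer strand) and that the other two positions are unconstrained, giving 2 × 4 × 4 = 32. The actual primer sequences are those listed in Table S2; the function name is hypothetical.

```python
from itertools import product

# Nested adaptor primer from the text (ends in CATG).
NSPI_P = "AACGTCACAGCATGTCATG"

def tagged_adaptor_primers(base=NSPI_P, first="CT", rest="ACGT"):
    """Append every three-nucleotide tag to the 3' end of the adaptor primer.

    Assumption (not taken from Table S2): the first tag base is limited to
    two choices by the degenerate NspI site; the other two are free.
    """
    return [base + a + b + c for a, b, c in product(first, rest, rest)]

primers = tagged_adaptor_primers()
print(len(primers))                       # number of selective primers
print(primers[0][-3:], primers[-1][-3:])  # first and last tags
```

Each primer would be used in its own secondary PCR reaction, so that only the subset of Mu flanking fragments whose first three genomic bases match the tag is amplified.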
Adapting DLA primers for 454 sequencing:
To enable subsequent 454 sequencing the nested Mu-specific and adaptor primers were concatenated with the 454 sequencing primer to generate various composite primers (Table S1). A given bar-coded Mu composite primer contains sequences from 454 primer A (GCCTCCCTCGCGCCATCAG), the barcode, and the Mu TIR nested primer. The resulting composite adaptor primer (BnspI-P) contains 454 primer B (GCCTTGCCAGCCCGCTCAG) and adaptor sequence. The primary PCR product of DLA was diluted 10 times and 2 μl of diluted product was used for secondary PCR, using one of several bar-coded Mu composite primers (AeMu, AfMu, AgMu, and AhMu, Table S1) and BnspI-P primer (5′-GCCTTGCCAGCCCGCTCAGAACGTCACAGCATGTCATG-3′).
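The composite primers are simple concatenations, which can be checked directly. The sketch below rebuilds BnspI-P from 454 primer B and NspI-P and compares it with the sequence reported above. The barcode and the Mu TIR nested primer passed to `composite_mu_primer` are placeholders, because the real sequences are given only in Table S1.

```python
PRIMER_A = "GCCTCCCTCGCGCCATCAG"  # 454 primer A (from the text)
PRIMER_B = "GCCTTGCCAGCCCGCTCAG"  # 454 primer B (from the text)
NSPI_P = "AACGTCACAGCATGTCATG"    # nested adaptor primer (from the text)
BNSPI_P_REPORTED = "GCCTTGCCAGCCCGCTCAGAACGTCACAGCATGTCATG"

def composite_adaptor_primer(primer_b=PRIMER_B, adaptor=NSPI_P):
    """454 primer B followed by the adaptor primer sequence."""
    return primer_b + adaptor

def composite_mu_primer(barcode, mu_primer, primer_a=PRIMER_A):
    """454 primer A, then the sample barcode, then the Mu TIR nested primer."""
    return primer_a + barcode + mu_primer

# Placeholder inputs; the real barcodes and Mu primer are in Table S1.
example = composite_mu_primer(barcode="ACGTAC", mu_primer="N" * 20)
print(composite_adaptor_primer() == BNSPI_P_REPORTED)
print(example)
```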
Processing DLA-454 reads:
1. Categorizing DLA-454 reads on the basis of their barcodes and sequencing trimming:
Raw 454 reads were categorized by their barcode sequences. SeqClean (http://compbio.dfci.harvard.edu/tgi/software/) was used to trim barcodes, primers, and partial Mu TIR sequences. A two-step trimming strategy was used. First, the Mu primer and the adaptor primer (default overlapping requirement, ≥80% identity with primers) were removed. Trimmed sequences with lengths ≥60 bp were subjected to a second round of trimming to remove Mu TIRs (≥30 bp overlapping, ≥80% identity with 34 bp known TIRs and a set of novel TIRs collected from 454 reads, our unpublished data).
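The barcode categorization and the length filter between the two trimming rounds can be sketched as follows. This is not SeqClean and does not reproduce its primer or TIR trimming; it assumes exact barcode matches at the start of each read, 6-base barcodes, and a hypothetical barcode-to-sample table.

```python
from collections import defaultdict

def bin_reads_by_barcode(reads, barcodes, barcode_len=6):
    """Group reads by their leading barcode and strip the barcode.

    `barcodes` maps barcode sequence -> sample name (hypothetical values).
    Reads whose prefix matches no barcode are kept, untrimmed, under
    the key "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        sample = barcodes.get(read[:barcode_len].upper())
        if sample is None:
            bins["unassigned"].append(read)
        else:
            bins[sample].append(read[barcode_len:])
    return dict(bins)

def keep_long(seqs, min_len=60):
    """Keep sequences of at least `min_len` bases for the next trimming round."""
    return [s for s in seqs if len(s) >= min_len]

barcodes = {"ACGTAC": "sample1", "TGCATG": "sample2"}  # made-up barcodes
reads = ["ACGTAC" + "A" * 70, "TGCATG" + "C" * 10, "GGGGGG" + "T" * 5]
bins = bin_reads_by_barcode(reads, barcodes)
```

Exact matching is the simplest choice; a production pipeline would more likely tolerate a sequencing error in the barcode.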
2. Mapping DLA-454 MFS reads to the B73 reference genome:
Trimmed MFSs were aligned to the maize B73 reference genome (http://www.maizesequence.org/, released on 3/20/09), using BLASTN. Only MFSs that had five or fewer best BLAST alignments (lowest E-value hits) of MFS queries with ≥95% identity and ≥90% coverage were retained for further analysis.
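One way to express this retention rule in code is sketched below. It reflects our reading of the criteria, not the original scripts: for each MFS, the alignments tied for the lowest E-value are the "best" alignments, those meeting the identity and coverage thresholds are counted, and the MFS is retained if between one and five remain. Hits are given as plain tuples instead of parsed BLAST output, and the field order is an assumption.

```python
from collections import defaultdict

def filter_mfs_hits(hits, max_best=5, min_identity=95.0, min_coverage=90.0):
    """Retain MFS queries with few, high-quality best alignments.

    `hits` is an iterable of tuples:
    (query_id, subject_id, pct_identity, pct_query_coverage, evalue).
    Returns {query_id: [best hits passing the thresholds]}.
    """
    by_query = defaultdict(list)
    for hit in hits:
        by_query[hit[0]].append(hit)

    retained = {}
    for query, qhits in by_query.items():
        best_evalue = min(h[4] for h in qhits)
        best = [h for h in qhits
                if h[4] == best_evalue
                and h[2] >= min_identity
                and h[3] >= min_coverage]
        if 1 <= len(best) <= max_best:
            retained[query] = best
    return retained

hits = [("q1", "chr4", 99.0, 100.0, 1e-80), ("q1", "chr5", 96.0, 95.0, 1e-20)]
hits += [("q2", f"chr{i}", 99.0, 100.0, 1e-50) for i in range(1, 7)]  # 6 ties
hits += [("q3", "chr2", 90.0, 100.0, 1e-30)]                          # low identity
retained = filter_mfs_hits(hits)
```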
3. Removing redundant Mu insertion sites:
To allow for trimming errors, MFSs with the same orientation that had apparent insertion sites within 3 bp of each other were regarded as having been derived from the same Mu insertion site. This is conservative because Mu insertions are known to cluster (Dietrich et al. 2002). Because the MFSs on both sides of a Mu insertion can be amplified and sequenced, two sequences having opposite alignment orientations but with apparent insertion sites exactly 8 bp apart (due to the 9-bp target site duplication) were counted as a single Mu insertion.
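The two merging rules can be sketched as a small function. Several details are assumptions made for illustration: sites are (chromosome, position, strand) tuples, same-orientation sites are chained whenever consecutive positions are within 3 bp, each cluster is represented by its smallest position, and opposite-orientation clusters are paired when their representatives differ by exactly 8 bp in either direction.

```python
from collections import defaultdict

def collapse_sites(sites, tol=3, tsd_offset=8):
    """Collapse apparent Mu insertion sites into nonredundant insertions.

    `sites`: iterable of (chrom, pos, strand) with strand "+" or "-".
    Returns a list of (chrom, pos, support), support in {"+", "-", "both"}.
    """
    positions = defaultdict(list)
    for chrom, pos, strand in sites:
        positions[(chrom, strand)].append(pos)

    # Rule 1: same orientation, within `tol` bp -> same insertion site.
    reps = defaultdict(list)
    for key, plist in positions.items():
        prev = None
        for p in sorted(plist):
            if prev is None or p - prev > tol:
                reps[key].append(p)
            prev = p

    # Rule 2: opposite orientation, exactly `tsd_offset` bp apart -> one insertion.
    insertions = []
    for chrom in sorted({c for c, _ in reps}):
        plus = reps.get((chrom, "+"), [])
        minus = set(reps.get((chrom, "-"), []))
        for p in plus:
            partner = next((m for m in (p - tsd_offset, p + tsd_offset)
                            if m in minus), None)
            if partner is None:
                insertions.append((chrom, p, "+"))
            else:
                minus.discard(partner)
                insertions.append((chrom, min(p, partner), "both"))
        insertions.extend((chrom, m, "-") for m in sorted(minus))
    return insertions

sites = [("chr4", 100, "+"), ("chr4", 102, "+"), ("chr4", 108, "-"),
         ("chr4", 500, "-"), ("chr5", 100, "+")]
insertions = collapse_sites(sites)
```

In this toy input the two "+" sites at 100 and 102 collapse into one site, which is then paired with the "-" site at 108.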
RNA isolation and quantitative RT–PCR:
RNA from 8-day-old leaves was isolated using the RNeasy plant mini kit (QIAGEN). RNase-free DNase (QIAGEN) was used during RNA isolation to remove genomic DNA from RNA. The RNA samples were confirmed to be gDNA free when only cDNA-corresponding amplicons but no gDNA amplicons were detected after PCR, using actin (GenBank accession no. AY273142) primers that can amplify the gDNA and cDNA templates simultaneously, yielding amplicons of different sizes. qRT–PCR was conducted using an Mx4000 multiplex quantitative PCR system (Stratagene, La Jolla, CA). Quantitative (q)RT–PCR data were initially analyzed by using Mx4000 analysis software. Ct values were calculated using baseline-corrected, ROX-normalized parameters. Two technical replicates were included in each plate, and the average Ct value of these two replicates was used for further data analysis. The housekeeping gene, cytosolic GAPDH (GapC, GenBank accession no. X07156) was used as the internal control. The relative expression (RE) of gl4 genes, including gl4a and its paralog (gl4b), was calculated using the formula $\mathrm{RE} = 100 \times 2^{C_{t,\mathrm{gapC}} - C_{t,\mathrm{gl4}}}$. The primers used are as follows: actin-408f 20mer, 5′-CCAGGCTGTTCTTTCGTTGT-3′; actin-520r 20mer, 5′-GCAGTCTCCAGCTCCTGTTC-3′; gap1106f 20mer, 5′-GCTTCTCATGGATGGTTGCT-3′; gap1224r 20mer, 5′-CAGGAAGGGAAGCAAAAGTG-3′; pgl4-17L 18mer, 5′-GACCGGGTGAAGCCCTAC-3′; gl4para17a 18mer, 5′-AACCGGATCAGGCCCTAC-3′; and pgl4-16R 20mer, 5′-TAGGCGAGCTCGTACCAGAG-3′.
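The relative-expression calculation, read as RE = 100 × 2 raised to the power (Ct of GapC minus Ct of gl4), can be sketched with the technical replicates averaged first as described above. The Ct values in the example are invented for illustration.

```python
from statistics import mean

def relative_expression(ct_gapc_reps, ct_target_reps):
    """RE = 100 * 2 ** (mean Ct of GapC - mean Ct of target gene).

    Each argument is the list of technical-replicate Ct values from one plate.
    """
    ct_gapc = mean(ct_gapc_reps)
    ct_target = mean(ct_target_reps)
    return 100 * 2 ** (ct_gapc - ct_target)

# Made-up Ct values: the target crosses threshold two cycles after GapC.
re_value = relative_expression([20.0, 20.0], [21.0, 23.0])
print(re_value)
```

A target that reaches threshold two cycles later than the internal control is therefore scored at one-quarter of the control level, i.e., RE = 25.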
DLA, a highly specific method for amplifying unknown flanking sequences:
We developed DLA to overcome technical limitations associated with existing genome-walking technologies. To test the utility of DLA for isolating unknown sequences adjacent to known sequences (i.e., genome walking), it was used as shown in Figure 1 to amplify sequences from the maize a1 (GenBank accession no. X05068) and rf2a genes (GenBank accession no. AF215823). First, genomic DNA was digested with NspI, which generates 3′ 4-bp overhangs. A single-stranded oligo was then used as the adaptor for ligation to the digested fragments. Note that each end of digested genomic DNA can ligate to only a single adaptor. To prevent the single-stranded ends of ligation products from being filled during primary PCR (5′-overhang fill-in), which would later serve as undesired primer annealing sites that are not target specific, single-base extension with a ddNTP can be performed prior to primary PCR to block further strand synthesis. Blocked and nonblocked DLA are defined as DLA with and without this single ddNTP extension step, respectively. Primary PCR was performed using TSP1 (target-specific primer 1, Table S1) and the first adaptor primer. Finally, secondary PCR was performed to amplify the flanking genomic sequences with TSP2 (target-specific primer 2, Table S1) and the second adaptor primer.
Two primers designed for each of the a1 and rf2a genes (Table S1, Figure S2) were used to perform DLA in combination with the adaptor primers, NspI-5 and NspI-P, respectively. Blocked and nonblocked DLAs were performed separately. As shown after electrophoresis, the nonblocked DLAs yielded more nonspecific product (Figure S2). The amplified fragments of expected sizes were excised from the gel and subjected to Sanger sequencing. Although both blocked and nonblocked DLA yielded the expected sequences, blocked DLA was, as expected, more specific. The results of these tests demonstrate that DLA can be applied to amplify unknown flanking sequences (i.e., genome walking).
To test the ability of DLA to amplify MFSs, PCR products from both blocked and nonblocked DLA were cloned and sequenced (materials and methods). The desired amplicons (termed Mu-Nsp amplicons) are expected to be flanked by the Mu-specific primer and the adaptor primer. Undesirable nonspecific amplicons having the adaptor primer at both ends were termed Nsp-Nsp amplicons. The sequencing data showed that 41/93 (44%) of the sequences from nonblocked DLA reactions were found to be the desired Mu-Nsp amplicons, with the remainder being undesired Nsp-Nsp amplicons. This suggests that about half of the nonblocked molecules resulted from the 5′-overhang fill-in (Figure 1). In contrast, all (94/94, 100%) sequences from the blocked DLA were from the desired Mu-Nsp amplicons. Therefore, blocking with ddNTP during DLA dramatically increases the specificity of amplification.
Comparison of DLA to splinkerette PCR:
Most ligation-based approaches for genome walking employ adaptors and most adaptors are partially double-stranded DNAs that have been subjected to specific modifications. In the absence of such modifications, partially double-stranded adaptors can participate in interadaptor ligation, reducing their effective concentration. In addition, unmodified adaptors can generate DNA molecules with adaptors at both ends (adaptor–adaptor molecules) that can serve as templates for nonspecific amplification. DLA was designed to solve these problems by using the single-stranded oligo instead of partially double-stranded DNAs used in most adaptor-mediated PCR methods as the adaptor. This slight modification makes DLA simple and efficient. Direct experimental comparisons were performed between blocked DLA and splinkerette PCR, a widely used method of genome walking that employs a partially double-stranded DNA as an adaptor (Uren et al. 2009). These comparisons were performed on three known genic targets (rf2a, a1, and rad51a). Results obtained using rf2a are shown in Figure S3. Both blocked DLA and splinkerette PCR exhibited high specificity. But the bands obtained via blocked DLA were much stronger, demonstrating that when using these templates and conditions, blocked DLA is more sensitive than splinkerette PCR. Similar results were obtained for a1 and rad51a (data not shown). It is, however, difficult to make general conclusions about sensitivity given that the performance of genome-walking methods is dependent upon choices of substrates, restriction enzymes, adaptor primers, and PCR conditions.
Cloning a Mu-tagged allele of gl4 via MuClone:
DLA was adapted to clone a gene using a Mu-tagged allele. In MuClone a set of nested adaptor primers with unique three-nucleotide tags at their 3′ ends is used in separate reactions (N = 32) of the final PCR step to selectively amplify sequences flanking distinct subsets of Mu insertions, as is done in AFLP reactions. Because these selective adaptor primers dramatically decrease nonspecific amplification even in the presence of Nsp-Nsp templates, the ddNTP blocking step can be skipped during the MuClone procedure. MuClone reactions are performed on DNA samples isolated from a family that is segregating for a Mu-insertion mutant. PCR bands that cosegregate with the Mu-tagged mutant allele in this family (the candidate gene) are excised, purified, and sequenced.
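The cosegregation criterion (a band present in every mutant pool and absent from every wild-type pool) reduces to a set operation. The sketch below assumes that bands have already been scored from the gel as sizes in base pairs and that identical sizes denote the same band; real gel scoring would need a size tolerance. All values are invented.

```python
def cosegregating_bands(mutant_pools, wildtype_pools):
    """Return band sizes found in all mutant pools and in no wild-type pool.

    Each pool is an iterable of scored band sizes (bp) for one reaction.
    """
    if not mutant_pools:
        raise ValueError("at least one mutant pool is required")
    in_all_mutants = set.intersection(*(set(p) for p in mutant_pools))
    in_any_wildtype = set().union(*(set(p) for p in wildtype_pools))
    return sorted(in_all_mutants - in_any_wildtype)

# Invented band calls for one selective primer across 5 + 5 pools.
mutants = [[300, 450, 800], [300, 450], [450, 800, 300], [450, 300], [450, 300, 1200]]
wildtypes = [[300], [300, 800], [300], [300], [300, 1200]]
candidates = cosegregating_bands(mutants, wildtypes)
print(candidates)
```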
To test the MuClone protocol we made use of Mu-tagged alleles of gl4. Seedlings homozygous for gl4 mutant alleles express a “glossy” phenotype (Figure 2A; Emerson 1935) associated with alterations in the composition and/or amount of epicuticular waxes, which are derived from very-long-chain fatty acids. Mu-tagged alleles of gl4 were isolated as described (materials and methods). A set of sibling plants, each of which had the genotype gl4a-mu-94B560/gl4a-mu-94B560 or Gl4/Gl4, were self-pollinated. From each of five families derived from the self-pollination of gl4a-mu/gl4a-mu plants, DNA was extracted from pooled tissues of five glossy seedlings, resulting in five pooled DNA samples. The same process was applied to five nonglossy plants from each of five families derived from the self-pollination of Gl4/Gl4 plants. These 10 pooled DNA samples (5 mutants and 5 wild types) were used for cosegregation analysis. MuClone analyses of these 10 samples were performed using a set of primers containing the unique three-nucleotide tags (Table S2). Among the 32 three-nucleotide tags tested, only one (Nsp15ctc) could amplify a specific PCR band from all 5 mutant pools and not from any of the wild-type pools (Figure 2B). This band was purified and sequenced. The sequence of this PCR product aligned to a region of a maize BAC that contains a gene (the gl4 candidate) that has, on the basis of FGENESH prediction and EST alignments, a single exon. A collection of primers was designed on the basis of the sequence of the gl4 candidate gene. These primers, in combination with the Mu-TIR primer, were able to amplify Mu-flanking sequences from 13 of the 15 additional independently isolated Mu insertion alleles of gl4 (Table 1). All of the resulting amplified sequences aligned to the gl4 candidate gene (Table 1 and Figure 2C). The candidate gene was PCR amplified from a stock homozygous for the spontaneous reference allele of gl4 (gl4a-ref). 
This amplification product was shown to contain a G/C-to-A/T nonsense mutation at base position 1361 (codon 454, Table 1) of the coding region of the candidate gl4 gene (Figure 2C). In addition, four previously identified EMS-induced alleles of gl4 were shown to contain G/C-to-A/T transitions (Table 1), which are typical of EMS-induced point mutations (Greene et al. 2003). Among these four mutants, three carry nonsynonymous missense point mutations and one carries a mutation that generates a premature stop codon (Table 1). In combination, these data establish that the DNA sequence amplified from the gl4a-mu allele via MuClone is a portion of the gl4 gene and thereby establish the utility of MuClone for cloning insertion alleles.
Combining DLA with 454 sequencing technology:
DLA was adapted for sequencing MFSs using 454 technology as illustrated in Figure 3. DNA barcodes (Qiu et al. 2003) were inserted between the 454 primer A and the Mu-specific primer to allow different input samples that were pooled in the same 454 run to be distinguished after sequencing. The resulting library amplified with primer 1 and primer 2 (Figure 3A) was subjected to 454 sequencing, using 454 primer A. Consequently, reads start from the barcode followed by portions of the Mu TIR and the MFS (Figure 3, B and C). The numbers of reads that carry the different barcodes were consistent with the relative amounts of DNA in each sample prior to pooling (data not shown). A total of 99.7% of reads carry the Mu-specific primer and 94% carry TIR sequences, demonstrating that “DLA-454” specifically and efficiently amplifies MFSs.
Isolating Mu-tagged alleles of gl4 via DLA-454:
The 16 independent Mu-tagged gl4 alleles used in the MuClone experiment were subjected to DLA-454 to test the utility of this strategy for isolating multiple alleles (or mutants) derived from insertional mutagenesis screens (Table 2). Two approaches were used to prepare pooled DNAs for 454 sequencing. In the first approach, 16 DNA samples, each of which carried a single different gl4-Mu allele, were separately subjected to the modified DLA protocol illustrated in Figure 3 prior to pooling amplification products across alleles. In the second approach, DNA samples from plants carrying the 16 alleles were pooled prior to conducting the modified DLA. Subsequently, DLA products from both approaches were sequenced using 454 technology. After trimming the barcodes, the Mu-specific primer, the amplified portion of the Mu-TIR, and the NspI-P adaptor primer, remaining MFSs were mapped to the maize genome (materials and methods). Only reads that mapped to fewer than five genomic locations were retained for further analysis. The number of Mu insertion sites in each 5-kb window across the genome (using 2.5-kb steps) was determined (Table 2). Previous genetic mapping data indicated that gl4 is located on chromosome 4 (Schnable et al. 1994). Only a single annotated gene on chromosome 4 that contains multiple independent Mu insertions was identified (Figure 4, Table 2), and this gene is the same one identified via MuClone. Similar results were obtained from both pooling strategies. The 5′ end of each alignment, which indicates the position of the Mu insertion site, is located within the gl4 transcriptional unit for each allele. The numbers of reads obtained from the different Mu insertions vary dramatically, possibly due to GC content and the sizes of the allele-specific DLA products (Figure 4). Even so, of the 14 alleles that can be amplified using gene-specific primers, 11 were recovered from pooling strategy 1 and 12 from pooling strategy 2 (Table 2).
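Counting insertion sites in 5-kb windows with 2.5-kb steps can be sketched as below. The code assumes 0-based positions, windows starting at multiples of the step size, and an input list of nonredundant (chromosome, position) insertion sites; these conventions and the example coordinates are illustrative and not taken from the original analysis.

```python
from collections import Counter

def window_counts(sites, window=5000, step=2500):
    """Count insertion sites in overlapping windows along each chromosome.

    A window starting at `s` covers positions s .. s + window - 1, so with
    a 2.5-kb step each site falls in up to two 5-kb windows.
    Returns a Counter keyed by (chrom, window_start).
    """
    counts = Counter()
    for chrom, pos in sites:
        start = (pos // step) * step
        while start >= 0 and start + window > pos:
            counts[(chrom, start)] += 1
            start -= step
    return counts

sites = [("chr4", 100), ("chr4", 2600), ("chr4", 4900), ("chr4", 7600)]
counts = window_counts(sites)
# Windows holding two or more independent insertions are candidate loci.
candidates = sorted(k for k, n in counts.items() if n >= 2)
```

Overlapping windows ensure that a cluster of insertions straddling a window boundary is still counted together in the neighboring window.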
gl4 is a homolog of CUT1 that encodes a very-long-chain fatty acid condensing enzyme required for epicuticular wax biosynthesis:
The amino acid sequence of GL4 shares 61% identity with the CUT1 (CER6) protein of Arabidopsis. CUT1/CER6 is a very-long-chain fatty acid (VLCFA) condensing enzyme required for epicuticular wax biosynthesis (Millar et al. 1999; Fiebig et al. 2000). Using BLASTN, the gl4 genomic sequence aligns to two loci in the maize genome (≥90% identity; ≥90% coverage). The copy (gl4a) cloned from the Mu-tagged allele is located on chromosome 4 and is predicted to encode a 485-amino-acid protein. The other copy identified by sequence homology (gl4b) is located on chromosome 5 and is predicted to encode a 480-amino-acid protein. The coding region of gl4b that has full-length cDNA (GenBank accession no. BT063439) support exhibits 96% identity with gl4a. Therefore, the maize genome contains two presumably functional copies of CUT1. In contrast, the Arabidopsis and rice genomes each contain only a single copy of CUT1. Consistent with the known subcellular localization of the VLCFA elongase (Bessoule et al. 1989; Millar et al. 1999; Xu et al. 2002), the Arabidopsis CUT1 and maize GL4A proteins are predicted to be integral membrane proteins containing two N-terminal transmembrane helices (Cserzo et al. 1997) (http://www.sbc.su.se/∼miklos/DAS/). Subcellular localization predictions indicate the GL4A protein is targeted to the plasma membrane or the endoplasmic reticulum (Horton et al. 2007) (http://wolfpsort.org/). These predictions are consistent with the fact that the membrane-bound elongase is localized in the microsomal fraction that includes the plasma membrane or the endoplasmic reticulum (Bessoule et al. 1989; Millar et al. 1999; Xu et al. 2002).
qRT–PCR experiments indicate that gl4a transcripts accumulate to only low levels in gl4a-mu-94B545 homozygous seedlings, but to higher levels in gl4a-ref homozygous seedlings and to the highest levels in wild-type (B73) seedlings (Figure 5). These results are consistent with the fact that the gl4a-mu-94B545 mutant contains an exonic Mu insertion, while the gl4a-ref mutant results from a nonsense mutation at the 3′ end of the gl4a gene (Figure 2C). These transcript accumulation levels are also consistent with the phenotypes of these mutants; gl4a-mu-94B545 mutant seedlings exhibit a strong glossy phenotype, whereas gl4a-ref mutant seedlings exhibit a weaker glossy phenotype (data not shown). Interestingly, the gl4a paralog, gl4b, is upregulated in gl4a-mu-94B545 mutant seedlings, indicating that the expression of gl4b is regulated at some level by the accumulation of gl4a transcripts. However, no significant change in the expression of gl4b was detected in the gl4a-ref mutants. We hypothesize that this is because gl4a-ref is a leaky mutant.
Comparisons to other methods:
Several ligation-based genome-walking strategies have been developed (Table S3). Some (Table S3, E) rely on self-annealing, which can result in low efficiency in large genomes. In comparison, adaptor ligation-mediated PCR exhibits improved genome-walking efficiency in large genomes. Multiple versions of adaptor ligation-mediated PCR have been developed (e.g., Riley et al. 1990; Jones and Winistorfer 1992; Devon et al. 1995; Hengen 1995; Frey et al. 1998; Kilstrup and Kristiansen 2000; Edwards et al. 2002; O'Malley et al. 2007; Uren et al. 2009). Typically, partially double-stranded DNA is used as the adaptor to ligate with digested genomic DNA. Two ligation-related problems are recognized. First, the double-stranded adaptors can ligate to each other (interadaptor ligation), reducing the frequency of desired products (Table S3, C and D). Second, adaptors can ligate to both strands of both ends of genomic DNA fragments (adaptor–adaptor molecules). Such adaptor–adaptor molecules can serve as templates for nonspecific amplification (Table S3, B and D). Several modifications (Riley et al. 1990; Jones and Winistorfer 1992; Devon et al. 1995; Hengen 1995; Frey et al. 1998; O'Malley et al. 2007; Uren et al. 2009) have been developed to overcome these problems, but none solves both problems.
In the DLA ligation system, an oligo is used as the adaptor. The adaptor oligo anneals to the 3′-digested overhang. Nicks between the adaptor oligo and the digested 5′ end can be repaired by T4 DNA ligase. The new overhang, containing a portion of the adaptor sequence, is formed after ligation. Unfortunately, subsequent PCR steps could fill this 5′ overhang to form adaptor priming sites at both ends of a DNA molecule (Figure 1). If so, such a DNA molecule could be amplified by the adaptor primer even in the absence of target-specific primers, resulting in nonspecific amplification. To avoid 5′-overhang fill-in that introduces such adaptor–adaptor DNAs, ddNTPs were used to block the 3′ ends. The ddNTP blocking was shown to improve amplification specificity, which can be a problem with several other adaptor-mediated PCR methods (Table S3, B and D). The single-stranded oligo used in DLA is not expected to participate in interadaptor ligation. This results in higher concentrations of oligos, which is expected to increase ligation efficiency as compared to methods listed in Table S3, C and D. In addition, during DLA amplification, an adaptor primer works only on the newly synthesized strand primed by the target-specific primer, further increasing specificity. Taken together, these features make DLA efficient and highly specific.
DLA is a general method for amplifying DNA flanking known sequences:
It should be possible to apply the DLA method generally for obtaining unknown sequences flanking known target sequences, such as transposons, T-DNA, and transgenes. For unsequenced genomes, this method will be useful for determining sequences upstream and/or downstream of a known target locus (e.g., the isolation of promoter sequences associated with a full-length cDNA). It should be possible to broaden the application of DLA by utilizing different restriction enzymes for genomic digestion. At least three criteria should be considered when doing so: (1) restriction enzymes that produce 3′ overhangs are preferred (this is because the oligo adaptor can be ligated to the resulting restriction fragments without phosphorylation), (2) the average size of the restriction fragments should be suitable for PCR amplification, and (3) no recognition sites for the selected restriction enzyme should exist in the portion of the known sequence to be amplified.
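Criteria (2) and (3) lend themselves to a quick in silico check before choosing an enzyme. The sketch below uses NspI (recognition sequence RCATGY, where R = A/G and Y = C/T) as the example. The expected spacing assumes equal base frequencies and independent positions, which real genomes violate, and the test sequences are made up.

```python
import re

# Lookahead so that overlapping sites would each be reported.
NSPI_SITE = re.compile(r"(?=[AG]CATG[CT])")

def nspi_site_positions(seq):
    """Return 0-based start positions of NspI sites (RCATGY) in `seq`.

    The degenerate site is its own reverse complement, so scanning one
    strand is sufficient.
    """
    return [m.start() for m in NSPI_SITE.finditer(seq.upper())]

def expected_site_spacing():
    """Average distance between sites under a uniform random-sequence model."""
    p_site = (2 / 4) * (1 / 4) ** 4 * (2 / 4)  # R, then C-A-T-G, then Y
    return 1 / p_site

print(nspi_site_positions("TTACATGCAA"))  # known region with one site
print(nspi_site_positions("TTTCATGCAA"))  # T before CATG: not a site
print(expected_site_spacing())
```

A known sequence that returns an empty list satisfies criterion (3) for this enzyme, and the expected spacing gives a rough sense of whether fragments will be short enough to amplify.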
MuClone is a simple and efficient approach for cloning Mu-tagged genes:
Mu transposons are widely used for forward genetics in maize. Active Mu transposon lines contain ∼50–200 copies of Mu and even commercial inbred lines have several dozen Mu-related sequences (Walbot and Warren 1988; Settles et al. 2004). These high copy numbers greatly complicate the visualization of individual Mu transposons following PCR amplification. The DLA method addresses this problem by amplifying subsets of Mu transposons in separate reactions, an approach similar to that used in AFLP. Selective amplification of Mu flanking sequences is achieved by appending a series of unique three-nucleotide tags to the adaptor primer during the nested PCR (NspI-P). This achieves sufficient complexity reduction to permit Mu-derived amplification products to be individually visualized via agarose gel electrophoresis. Taking advantage of this feature, DLA can be used to identify sequences that flank a Mu transposon and that cosegregate with mutant phenotypes. The amplification of Nsp-Nsp templates requires that the selective adaptor primer match both ends of Nsp-Nsp DNAs, which dramatically decreases nonspecific amplification even in the presence of Nsp-Nsp templates. Therefore, the blocking step need not be performed during the MuClone procedure.
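The complexity reduction provided by three-nucleotide tags can be illustrated with a short calculation. The Python sketch below is a simplified model rather than the authors' procedure. It assumes that all 4^3 = 64 possible tags are available (this section does not state how many were used), that flanking bases are uniformly distributed, and that a template is amplified only when the bases adjacent to the adaptor match the tag exactly. Primer orientation and strand details are ignored, and all function names are hypothetical.

```python
from itertools import product


def selective_tags(length=3):
    """All possible tags of the given length over A, C, G, T."""
    return ["".join(bases) for bases in product("ACGT", repeat=length)]


def expected_templates_per_tag(n_templates, tag_length=3):
    """Average number of templates amplified per selective primer,
    ASSUMING every tag is equally likely to match."""
    return n_templates / 4 ** tag_length


def templates_for_tag(adjacent_bases_by_template, tag):
    """Names of templates whose bases next to the adaptor start with the tag.
    adjacent_bases_by_template is a hypothetical mapping: name -> bases."""
    return sorted(
        name
        for name, bases in adjacent_bases_by_template.items()
        if bases.upper().startswith(tag)
    )
```

Under these assumptions, a line carrying 50 to 200 Mu copies would yield on average roughly 0.8 to 3.1 templates per selective primer, which is consistent with the text's point that individual products become resolvable on an agarose gel.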
DLA-454, a specific and practicable strategy for high-throughput sequencing of Mu flanking sequences:
The DLA-454 strategy combines DLA with 454 pyrosequencing to analyze MFSs. Approximately 94% of DLA-454 sequencing reads contain the expected Mu-TIR sequences, demonstrating the specificity of DLA. We were among the first to use barcodes to deconvolute pooled cDNA libraries after sequencing ESTs (Qiu et al. 2003). We have now applied this technology to pool samples prior to DLA-454 sequencing. To avoid confusion among samples due to sequencing errors, an edit distance of at least two is required between distinct barcodes. Given the number of input DNA samples, 6-base barcodes were sufficient for this project, but because 454 reads are long relative to the lengths of barcodes, large numbers of samples could potentially be pooled in a single 454 run.
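The barcode requirement can be checked with a few lines of code. The Python sketch below interprets "edit distance" as Levenshtein distance (substitutions, insertions, and deletions), which is an assumption because the section does not define the metric. The function names and the exact-match demultiplexing rule are likewise hypothetical and are not taken from the authors' pipeline.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def min_pairwise_distance(barcodes):
    """Smallest edit distance between any two barcodes in the set."""
    return min(edit_distance(a, b) for a, b in combinations(barcodes, 2))


def barcodes_are_safe(barcodes, min_distance=2):
    """True if every pair of barcodes differs by at least min_distance edits."""
    return min_pairwise_distance(barcodes) >= min_distance


def assign_sample(read, barcode_to_sample):
    """Assign a read by exact match of its leading barcode, else None."""
    for barcode, sample in barcode_to_sample.items():
        if read.startswith(barcode):
            return sample
    return None
```

With a minimum distance of two, a single sequencing error cannot convert one barcode into another, so under exact matching an erroneous read is discarded rather than assigned to the wrong sample. Six-base barcodes allow at most 4^6 = 4096 sequences before the distance constraint is applied.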
Because we were able to clone the gl4 gene using both DLA-454 and MuClone, we conclude that DLA-454 is an efficient method for cloning genes for which multiple Mu-tagged alleles are available. A significant advantage of DLA-454 over MuClone is that it is not necessary to develop a segregating population, which can save calendar time. On the other hand, if multiple independent alleles are not available for a given gene, DLA-454 could still be used by comparing Mu insertion sites in a pool of mutants vs. a pool of nonmutant siblings from a segregating population. The use of DNA barcodes allows multiple mutants to be analyzed in a single 454 run, reducing the per-gene cost of cloning via DLA-454. Given its efficiency, it would be possible to use DLA-454 in combination with barcodes to create a sequence-indexed collection of Mu-insertion alleles of maize for reverse genetics (Fernandes et al. 2004; Settles et al. 2007; Vandenbussche et al. 2008). With minor modifications, DLA-454 could also be used to construct a sequence-indexed collection of other maize insertions or of insertion alleles in other species.
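The logic of cloning a gene from multiple independent Mu-tagged alleles can be sketched as a simple ranking problem: the target gene is expected to carry an insertion in each allele, typically at different positions. The Python example below is a hypothetical illustration of that reasoning and is not the analysis pipeline used in this study. The input format (allele name mapped to gene and position pairs) and the function name are assumptions.

```python
from collections import defaultdict


def rank_candidate_genes(insertions_by_allele):
    """Rank genes by the number of independent alleles with an insertion.

    insertions_by_allele maps an allele name to an iterable of
    (gene, position) tuples derived from mapped Mu flanking sequences.
    Returns (gene, n_alleles, n_distinct_positions) tuples, best first.
    """
    alleles_per_gene = defaultdict(set)
    positions_per_gene = defaultdict(set)
    for allele, hits in insertions_by_allele.items():
        for gene, position in hits:
            alleles_per_gene[gene].add(allele)
            positions_per_gene[gene].add(position)
    ranked = [
        (gene, len(alleles_per_gene[gene]), len(positions_per_gene[gene]))
        for gene in alleles_per_gene
    ]
    # Most alleles first, then most distinct positions, then gene name.
    ranked.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return ranked
```

Counting distinct insertion positions helps separate independent insertion events from a background insertion that several alleles may have inherited from a shared Mu stock, since the latter would appear at one position only. Any candidate identified this way would still require confirmation.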
gl4a is a maize homolog of CUT1:
Epicuticular waxes cover all aerial surfaces of land plants, where they prevent nonstomatal water loss and protect plants from frost-induced injury, UV radiation, and pathogens (Post-Beittenmiller 1996). In addition, they play important roles in pollen–stigma and plant–insect interactions (Post-Beittenmiller 1996). Several maize genes related to the accumulation of epicuticular waxes have been cloned, including gl1 (Hansen et al. 1997; Sturaro et al. 2005), gl2 (Tacke et al. 1995), gl8 (Xu et al. 1997), and gl15 (Moose and Sisco 1996). Similarly, epicuticular wax genes have been cloned from Arabidopsis, including CER1 (Aarts et al. 1995), CER2 (Negruk et al. 1996; Xia et al. 1996), CER3 (Hannoufa et al. 1996), CER4 (Rowland et al. 2006), CER5 (Pighin et al. 2004), CER6/CUT1 (Millar et al. 1999; Fiebig et al. 2000), CER7 (Hooker et al. 2007), CER10 (Zheng et al. 2005), and PAS2 (Bach et al. 2008).
Epicuticular waxes are composed of derivatives of VLCFAs that are synthesized via four enzymatic reactions: condensation, reduction, dehydration, and a second reduction. In maize, an enzyme that catalyzes a reduction step is encoded by gl8 (Xu et al. 1997) and in Arabidopsis, the enzyme that catalyzes the dehydration step is encoded by PAS2 (Bach et al. 2008). The CUT1 protein functions in the condensation reaction, which is thought to be rate limiting (Millar and Kunst 1997; Millar et al. 1999). The phenotype of gl4 mutants demonstrates that this gene is required for epicuticular wax accumulation in maize. Our finding that the gl4a gene encodes a CUT1 homolog is consistent with the function of this protein in the biosynthesis of epicuticular waxes in Arabidopsis, where defects in CUT1 result in waxless stems (Fiebig et al. 2000; Hooker et al. 2002). Like several Arabidopsis mutants that interrupt the accumulation of epicuticular waxes, CUT1 mutants confer conditional male sterility (Fiebig et al. 2000; Hooker et al. 2002). This is thought to be a consequence of an interruption, in the tapetal cells, of the biosynthesis of lipids destined for inclusion in the exine (Ariizumi et al. 2004). In contrast, only two of the maize glossy mutants (not including gl4) affect male fertility (C. R. Dietrich and P. S. Schnable, unpublished data). This could be a consequence of differing roles of epicuticular wax genes in tapetal cells and pollen development between Arabidopsis and maize, or of genetic redundancy in maize. Although the observed expression interactions between gl4a and its paralog, gl4b, are at least consistent with the latter argument, we favor the former hypothesis because no gl4a or gl4b transcripts were observed among 333,174 454-ESTs isolated from tapetal cells of maize anthers (J. Cao and P. S. Schnable, unpublished data).
We thank Philip Stinard for technical assistance with the gl4 tagging experiment; Li Fan for generating families segregating for the gl4 mutants; Jun Cao for sharing unpublished data; Michael Miller, An-Ping Hsia, and Yan Fu for useful discussions; Lisa Coffey, Hailing Jin, and undergraduate students Tint-fen Low and Jia-Ling Pik for technical assistance; and the maize genome sequencing project (DBI-0527192) for sharing the sequences of BACs prior to publication. This research was supported in part by grants from the National Science Foundation (award no. DBI-6344852) to P. S. Schnable and B. J. Nikolau and from the National Research Initiative of the U.S. Department of Agriculture Cooperative State Research, Education and Extension Service, grant no. 2005-35301-15715 to P. S. Schnable.
Supporting information is available online at http://www.genetics.org/cgi/content/full/genetics.109.108936/DC1.
Sequence read archive accession no: SRX007377.
1 Present address: Monsanto, 700 Chesterfield Pkwy. W., Chesterfield, MO 63017.
Communicating editor: J. Boeke
- Received August 24, 2009.
- Accepted September 25, 2009.
- Copyright © 2009 by the Genetics Society of America