Abstract
We have undertaken a large-scale genetic screen to identify genes with a seedling-lethal mutant phenotype. From screening ~38,000 insertional mutant lines, we identified >500 seedling-lethal mutants, completed cosegregation analysis of the insertion and the lethal phenotype for >200 mutants, molecularly characterized 54 mutants, and provided a detailed description for 22 of them. Most of the seedling-lethal mutants seem to affect chloroplast function because they display altered pigmentation and affect genes encoding proteins predicted to have chloroplast localization. Although a high level of functional redundancy in Arabidopsis might be expected because 65% of genes are members of gene families, we found that 41% of the essential genes found in this study are members of Arabidopsis gene families. In addition, we isolated several interesting classes of mutants and genes. We found three mutants in the recently discovered nonmevalonate isoprenoid biosynthetic pathway and mutants disrupting genes similar to Tic40 and tatC, which are likely to be involved in chloroplast protein translocation. Finally, we directly compared T-DNA and Ac/Ds transposon mutagenesis methods in Arabidopsis on a genome scale. In each population, we found only about one-third of the insertion mutations cosegregated with a mutant phenotype.
What genes are essential for the viability of a plant? Because of the complexity of the multitude of biological processes required for a plant to grow and develop, a large and diverse set of genes is likely to be involved. A forward genetics approach to this question is a powerful method to identify the relevant genes. This approach involves the isolation of embryo-defective mutants and seedling-lethal mutants, which are likely to comprise the largest classes of visible mutants in Arabidopsis. There is often an overlap in the mutants identified in embryo and seedling screens because embryo-defective mutants that form seeds capable of germination may also be identified as seedling-lethal mutants. Genes with a seedling-lethal mutant phenotype are likely to include genes specifically required during early seedling development as well as more generally functioning genes whose absence becomes critical during seedling development. Although a saturation ethyl methanesulfonate (EMS) mutagenesis has identified several Arabidopsis genes with a seedling-lethal mutant phenotype (Jurgens et al. 1991; Mayer et al. 1991), subsequent analysis has been limited to a small subset of genes with unusual body patterns. Similarly, >300 embryo-defective Arabidopsis mutants have been isolated (Errampalli et al. 1991; Castle et al. 1993; Meinke 1994) and ~150 were mapped (Franzmann et al. 1995). Molecular cloning of genes in these two classes on a genome-wide scale has not been reported.
Previous studies provide conflicting estimates of how many genes are essential for embryogenesis in Arabidopsis. On the basis of the frequency of multiple alleles in genes with an embryo-defective phenotype, there are estimated to be only 500 genes essential for embryogenesis (Franzmann et al. 1995). By contrast, an estimated 3500–4000 genes are predicted to be essential for embryogenesis on the basis of the frequency of fusca mutants in large-scale seed color and seedling-lethal screens (Jurgens et al. 1991; Misera et al. 1994). The number of genes essential for the seedling stage of growth has not been reported. About 1100 genes are estimated to be essential for gametophytic function on the basis of an analysis of transmissible deficiencies having average DNA losses of <90 kb (Vizir et al. 1994). Estimates from genetic studies of other eukaryotes are relatively similar to each other with respect to the proportion of essential genes in a genome. In Saccharomyces cerevisiae 17% of the genes were shown to be essential for viability in rich medium by creating a collection of mutant lines, each with the precise deletion of one open reading frame (ORF; Winzeler et al. 1999). For Caenorhabditis elegans, estimates range from 15% (Brenner 1974; Herman 1988) to 25% (Stewart et al. 1998). Recent studies with RNA interference suggest that ~10% of C. elegans genes are essential (Fraser et al. 2000). For Drosophila melanogaster, 28% of the genes have been estimated to be essential (Bossy et al. 1984).
In this study, we have initiated a comprehensive functional genomics effort to identify the genes required for viability at the seedling stage of plant development. Parallel efforts to identify genes required at the embryo stage of development are described in McElver et al. (2001). Because these genes encode essential proteins in Arabidopsis, this research may also have applications to the identification of new herbicidal compounds (Ward and Bernasconi 1999). From a screen of Arabidopsis T-DNA and Ds insertion lines, we isolated >500 mutants with a seedling-lethal phenotype as the first step in this process. Many of the genes disrupted in these mutants are likely to be required for chloroplast function on the basis of phenotype and sequence data. In addition, mutant phenotypes, such as elongated or reduced hypocotyls, may be due to the disruption of essential genes in signal transduction pathways. In light of the large-scale nature of the project, we took alternative high-throughput approaches, enabling us to focus on the identification of DNA sequences for a large number of these genes. From cosegregation analysis of >200 mutants, we were able to directly compare the frequency of “tagged” mutants in T-DNA and Ac/Ds mutageneses. Finally, we present sequence information for an initial set of genes that are essential for seedling growth and development.
MATERIALS AND METHODS
Arabidopsis insertional mutant collections: All Ds lines were generated according to Sundaresan et al. (1995) and generously provided by R. Martienssen (Cold Spring Harbor Laboratories), H. Ma (Pennsylvania State University), and U. Grossniklaus (Zurich University). T-DNA lines were generated as described in McElver et al. (2001). The T-DNA vectors included pPCVICEn4HPT, pSKI015, pCSA104, and pDAP101.
Arabidopsis seedling screening and growth conditions: Between 50 and 75 seeds for each line were placed on MS media [4.3 g/liter Murashige and Skoog salts (Life Technologies, Rockville, MD), 8 g/liter Phytagar (Life Technologies)] containing the fungicides Benomyl (5 mg/liter; Sigma, St. Louis) and Maxim (1 mg/liter; Syngenta, Greensboro, NC). Top agar with fungicides, identical to MS media except with 6 g/liter Phytagar, was added to spread out the seeds on each plate. Plates were placed at 4° for 1–7 days to synchronize germination. Seedlings were germinated and grown at 19°–23° under lights (80–100 μE/sec/m²) with a 16-hr light and 8-hr dark photoperiod. Seven and 14 days after being moved from 4°, the plates were screened with a dissecting microscope for abnormal seedlings. For a line in which it was uncertain whether the mutant phenotype was lethal, three mutant seedlings were transplanted to soil and grown at 18°–23° with 14.5 hr of light per day. For a given line, the mutant phenotype was considered to be lethal only if all three mutant seedlings died. For a small portion of the lines, it was necessary to rescreen a line by replating sterilized seeds on MS + 2% sucrose media. This medium was used because it enabled us to distinguish more easily between seedling-lethal mutants and sick nonmutant seedlings. It is possible that a few of these lines had mutants that would have been inviable on MS media lacking sucrose, but were viable on media containing sucrose. Because such replated lines were screened only on media with sucrose, they would not have been identified as lethal in our screen and the total number of lethal mutants may be slightly underestimated.
Cosegregation analysis: For T-DNA lines, ~75–150 T2 seeds were sterilized and grown on germination medium (GM; Guyer et al. 1998) containing 30 mg/liter hygromycin B for pPCVICEn4HPT lines and 15 mg/liter Basta for pSKI015, pCSA104, and pDAP101 lines. For a small number of lines, only ~40–75 plants were analyzed because there were not enough seeds or there was poor germination. The ratio of resistant to sensitive seedlings (R:S ratio) was used to determine the most likely number of insertion loci. If the R:S ratio was <6.0, the line most likely had a single insertion locus and 32 resistant seedlings were transplanted to soil. This cutoff was derived empirically and was based on chi-square analysis and a strategy to prevent lines with a single insertion locus being assigned to the two insertion loci category; this analysis may have led to a slight overestimate of the number of single insertion lines. In most cases, siliques were screened for a potential embryo-defective phenotype. If all the resistant plants segregated progeny with an embryo phenotype, the line was considered “tagged.” If some, but not all, the resistant plants segregated progeny with an embryo-defective phenotype, the line was considered “not tagged.” If no embryo phenotype was detected, seeds were collected from each resistant plant and plated on MS media with fungicides. If a seedling-lethal phenotype was detected among the progeny of each of the resistant plants, the line was considered tagged. For tagged mutants, the number of resistant plants checked for cosegregation of the selectable marker and the lethal phenotype was usually ~30 and ranged from 24 to 59. If the R:S ratio was >6.0 and <20, then 17 or more resistant seedlings were usually transplanted to allow the identification in the next generation of a “subfamily” that segregated a single insertion and the seedling lethal phenotype.
If an appropriate subfamily was identified, cosegregation analysis proceeded as described for lines with a single insertion locus. If an appropriate subfamily was not identified, the line was not analyzed further (Table 3). If the R:S ratio was >20, the line was not usually analyzed because there were likely to be more than two insertion loci. Because of the possibility of having two insertion loci linked to each other, R:S ratio data could not definitively determine the number of insertion loci for every line analyzed. For a small number of lines, cosegregation analysis was done according to McElver et al. (2001).
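The R:S decision rule described above can be summarized in a short sketch. The function names, the example seedling counts, and the handling of ratios that fall exactly on a cutoff are assumptions made for illustration only. The 3:1, 15:1, and 63:1 values are the standard Mendelian expectations for one, two, and three unlinked dominant marker loci in the selfed progeny of a hemizygous plant; they are not values reported in this study.

```python
def expected_rs_ratio(n_loci: int) -> int:
    """Expected resistant:sensitive ratio (as R per 1 S) for n unlinked
    dominant marker loci in the selfed progeny of a hemizygote: 4**n - 1."""
    return 4 ** n_loci - 1


def classify_rs_ratio(resistant: int, sensitive: int) -> str:
    """Assign a line to an insertion-locus category using the empirical
    cutoffs given in the text (<6.0, between 6.0 and 20, >20).

    Assumption: a ratio exactly equal to 6.0 or 20 is placed in the
    higher category; the text does not specify this case.
    """
    if sensitive == 0:
        # No ratio can be formed; the text notes that some Ds lines
        # segregated only resistant progeny.
        return "all progeny resistant"
    ratio = resistant / sensitive
    if ratio < 6.0:
        return "single insertion locus"
    if ratio < 20.0:
        return "two insertion loci"
    return "more than two insertion loci"


if __name__ == "__main__":
    # Hypothetical seedling counts, chosen near the Mendelian expectations.
    for r, s in [(90, 30), (140, 10), (126, 2)]:
        print(f"R:S = {r}:{s} -> {classify_rs_ratio(r, s)}")
```

Because the cutoffs were derived empirically, and because two linked insertion loci can mimic a single locus, a classification of this kind is only a first approximation of the true number of insertion loci.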
Experiments were performed similarly for Ds lines, except 50 mg/liter kanamycin monosulfate was used. In addition, 20 Ds lines segregated only resistant progeny (Table 2), which indicated that each line was homozygous for a Ds element and heterozygous for the lethal mutation (Tables 4 and 5). This situation arose because F4 seeds derived from homozygous F3 plants were used for some Ds lines, while T2 seeds derived from hemizygous T1 plants were used for T-DNA lines (see results). Among these 20 lines, it is most likely that each has a single Ds element that did not cosegregate with the lethal phenotype. It remains possible that a very small number of these lines contained two Ds elements and were tagged.
The following calculations provide an estimate of the accuracy of the cosegregation analysis. Although most lines designated as tagged contained an insertion that caused the lethal phenotype, at a low frequency, it is possible that an apparently tagged line had an insertion that was tightly linked to a second mutation with a lethal phenotype but that no recombination was detected between the two mutations. In this case, the line is not truly tagged because the insertion did not disrupt the gene responsible for the lethal phenotype. If no recombinant plants were detected among 30 resistant plants analyzed, the recombination frequency (p) between the insertion and a hypothetical, linked second mutation was ≤0.034 (or 3.4 cM). Thus, a hypothetical, linked mutation would have been within a 6.8-cM interval spanning the insertion. Each line used for cosegregation is estimated to harbor about three mutations on the basis of the cosegregation frequency of 29–33% (Table 3); this estimate implies that on average there are one insertion and two other mutations per line. With ~550 cM in the Arabidopsis genome (Schmidt 1998), the frequency of a hypothetical, linked mutation being in that 6.8-cM region is 2.5% [2 × (6.8/550)]. Therefore, the designation of a line as tagged based on cosegregation of the lethal phenotype and the selectable marker is likely to be correct for 97.5% of the lines when 30 resistant plants were analyzed.
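The arithmetic in the preceding paragraph can be retraced with a minimal sketch. The upper bound on the recombination frequency (0.034) is taken from the text as given and is not rederived here; the function and parameter names are hypothetical. The helper `mutations_per_line` reflects one reading of the text, namely that if a single insertion is one of N equally likely causal mutations, the tagged fraction is about 1/N.

```python
def tagging_accuracy(p_max: float, genome_cm: float, background_mutations: int) -> dict:
    """Retrace the accuracy estimate for a line designated as tagged.

    p_max: upper bound on the recombination frequency between the insertion
           and a hypothetical linked mutation (0.034 in the text).
    genome_cm: genetic length of the genome in cM (~550 in the text).
    background_mutations: noninsertional mutations per line (2 in the text).
    """
    # The text equates a recombination frequency of 0.034 with 3.4 cM;
    # the interval spans that distance on both sides of the insertion.
    interval_cm = 2 * p_max * 100.0
    # Chance that any one of the background mutations lies in the interval.
    p_linked = background_mutations * interval_cm / genome_cm
    return {
        "interval_cm": interval_cm,
        "p_linked": p_linked,
        "accuracy": 1.0 - p_linked,
    }


def mutations_per_line(cosegregation_frequency: float) -> float:
    """Approximate mutations per line, assuming tagged fraction = 1/N."""
    return 1.0 / cosegregation_frequency


if __name__ == "__main__":
    result = tagging_accuracy(p_max=0.034, genome_cm=550.0, background_mutations=2)
    print(f"interval: {result['interval_cm']:.1f} cM")
    print(f"linked-mutation frequency: {result['p_linked']:.1%}")
    print(f"expected accuracy: {result['accuracy']:.1%}")
    for freq in (0.29, 0.33):
        print(f"cosegregation {freq:.0%} -> ~{mutations_per_line(freq):.1f} mutations per line")
```

With the values from the text, the sketch gives a 6.8-cM interval, a linked-mutation frequency of about 2.5%, and an expected accuracy of about 97.5%.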
The cosegregation process, which consists of a single self-cross of a heterozygote in most cases, served as an opportunity for other mutations to segregate away from the seedling-lethal mutations in these lines. Because of the large scale of the experiments, these mutants were not subjected to backcrossing. Based on the cosegregation data (Table 3), there are likely to be only about three to five mutations in most lines, which is much less than in standard Arabidopsis EMS seed mutageneses based on the frequency of embryo-lethal mutants observed (Redei and Koncz 1992; McElver et al. 2001). It seems likely that most background mutations would not prevent the detection of a seedling-lethal phenotype because they would be weaker or affect processes later in development.
Molecular biology: Arabidopsis genomic DNA was prepared according to Reiter et al. (1992) or using the Nucleon PhytoPure Plant DNA isolation kit (Amersham International, Buckinghamshire, England) or the Puregene DNA isolation kit (Gentra Systems, Minneapolis) as modified by McElver et al. (2001). Other procedures were carried out according to standard methods (Ausubel et al. 1998).
Plasmid rescue: For each pPCVICEn4HPT and pSKI015 T-DNA line with a tagged seedling-lethal mutation, genomic DNA was isolated from tissue collected from either heterozygotes or a mixture of homozygotes, heterozygotes, and wild-type plants. Following Southern blot analysis to determine appropriate restriction enzymes to use for plasmid rescue, genomic DNA was cut with an appropriate restriction enzyme to rescue the right or left border of the T-DNA. The digested genomic DNA was ligated and transformed into Escherichia coli cells, and ampicillin-resistant colonies were isolated. Plasmid clones from these colonies were analyzed by restriction enzyme digestion and sequenced to determine the location of the insertion in the Arabidopsis genome.
Thermal asymmetric interlaced PCR: Thermal asymmetric interlaced (TAIL)-PCR (Liu et al. 1995) was performed as modified by McElver et al. (2001). The arbitrary degenerate primers and T-DNA primers used are described in McElver et al. (2001). The Ds nested primers used were 5a (5′ ACTAGCTCTACCGTTTCCGTTTCCGTTTAC 3′), 5b (5′ TTACCTCGGGTTCGAAATCGATCGGGATAA 3′), 5c (5′ AAATCGGTTATACGATAACGGTCGGTACGGGA 3′), 3a (5′ GGGTCTTGCGGATCTGAATATATGTTTTCATGTGTG 3′), 3b (5′ TACCGAACAAAAATACCGGTTCCCGTCCGATTTCGAC 3′), and 3c (5′ GGATCGTATCGGTTTTCGATTACCGTATTTATCC 3′). The DNA sequence of PCR products was determined as described by McElver et al. (2001).
Confirmation of sequences flanking insertions: Results from plasmid rescue experiments were confirmed either by Southern blot with a probe derived from the flanking genomic DNA or by PCR with one primer in the insertion and the other in the flanking genomic DNA. Results from TAIL-PCR were confirmed by a second PCR reaction with a gene-specific primer and an insertion-specific primer. For four T-DNA lines, results were considered confirmed when the same border sequence was obtained from TAIL-PCR reactions with two or more different arbitrary degenerate primers. Both borders of each insertion were identified and confirmed for all but three of the lines.
Photography and image processing: Plants were photographed with a DEI-750 video camera (Optronics Engineering, Goleta, CA) and images were captured with Scion Image (Scion Corporation, Frederick, MD) software. Images were adjusted for brightness, contrast, and color and assembled for figures with Adobe (San Jose, CA) Photoshop (version 5.5).
RESULTS
Isolation of seedling-lethal mutants: To identify mutants with a seedling-lethal phenotype, we screened both T-DNA and Ds transposon Arabidopsis mutant collections. The generation of the T-DNA collection has been described (McElver et al. 2001). For the T-DNA lines, we screened T2 self progeny from a single T1 parent (McElver et al. 2001), so that we expected to find one-quarter homozygous mutant progeny for a line segregating a recessive mutation. We generally performed our screening by examining the growth of seedlings on MS media without sugar to allow us to identify the greatest number of lethal mutants possible. Addition of sugar to the media might rescue the phenotype of some mutants that show lethality on media without sugar.
From 26,187 independent T-DNA lines, we isolated 407 lines segregating seedling-lethal mutants. Although we used some T-DNA lines generated with activation tagging vectors (pPCVICEn4HPT and pSKI015, GenBank accession no. AF187951; Walden et al. 1994; Weigel et al. 2000), we did not observe any dominant or semidominant lethal mutants in this screen. Plants carrying dominant lethal mutations would presumably die in the T1 generation so that these mutations would not be present in the T2 progeny. The frequency of seedling-lethal mutants identified from the activation tagging lines (1.52%; 55/3609) and the frequency from the other T-DNA lines (1.56%; 352/22,578) are essentially identical. This result suggests that most of the mutants identified in the activation tagging lines were the result of loss of gene function rather than altered gene function. In addition, molecular analysis of the tagged seedling-lethal mutants indicated that most of the lines have a T-DNA disrupting an ORF (see below). When an ORF is disrupted by a T-DNA, it is most likely that the protein function is disrupted rather than being misexpressed.
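The comparison of mutant frequencies between the two vector classes can be reproduced from the counts given above. The sketch below recomputes both percentages and, as an added illustration that is not part of the original analysis, applies a pooled two-proportion z-test; the function and constant names are hypothetical.

```python
import math

# Counts taken from the text: (lines with a seedling-lethal mutant, lines screened).
ACTIVATION_TAGGING = (55, 3609)
OTHER_T_DNA = (352, 22578)


def frequency(mutant_lines: int, screened_lines: int) -> float:
    """Fraction of screened lines that segregate a seedling-lethal mutant."""
    return mutant_lines / screened_lines


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Pooled two-proportion z statistic (illustrative addition, not the
    authors' method)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / se


if __name__ == "__main__":
    print(f"activation tagging lines: {frequency(*ACTIVATION_TAGGING):.2%}")
    print(f"other T-DNA lines:        {frequency(*OTHER_T_DNA):.2%}")
    z = two_proportion_z(*ACTIVATION_TAGGING, *OTHER_T_DNA)
    print(f"z = {z:.2f}")
```

The two counts sum to the 407 mutant lines and 26,187 screened lines reported for the whole T-DNA collection, and the z statistic is well inside the conventional ±1.96 bounds, in keeping with the description of the two frequencies as essentially identical.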
To make a direct comparison between T-DNA lines and Ds transposon lines, we also screened a large collection of Ds lines (Sundaresan et al. 1995). This collection was generated by crossing parental Ac and Ds lines and selecting for F2 progeny with transposition events unlinked to the original Ac and Ds elements. This approach provided the opportunity to identify insertions into most of the genome (Parinov et al. 1999). F3 seeds from each F2 plant were collected individually and a stock of F4 seeds was sometimes created by collecting seeds from kanamycin-resistant F3 progeny. Using the same seedling screening protocol as for the T-DNA lines, we screened F3 or F4 progeny from 12,196 lines and identified 98 lines with a seedling-lethal mutant phenotype. Unlike the T-DNA lines, most of the Ds lines in the collection were not screened for an embryo-defective phenotype.
Phenotypic classes: The seedling-lethal mutants displayed a wide range of phenotypes, which were classified as affecting pigmentation and/or morphology (Figures 1 and 2). Among the T-DNA and Ds insertion mutants, the frequencies of pigmentation (81% vs. 79%), pigmentation and morphology (11% vs. 12%), and morphology (8% vs. 9%) mutants were nearly identical (Table 1). This distribution of mutants seems to differ from that obtained in another large-scale seedling mutant screen that found only 50% of seedling mutants had defects in pigmentation but not morphology (Jurgens et al. 1991). This difference might stem from the other screen having used a different classification scheme or not having been limited to lethal mutants. The frequency of pigmentation subclasses in our study was also comparable between the two mutant populations (Table 1). The albino, yellow, and pale green mutants (Table 1 and Figure 1, A, B, and E–H), which were the majority of the mutants, included a range of pigmentation phenotypes and the assignment of mutants to these subclasses was based on visual inspection. We isolated 12 mutants exhibiting an albino phenotype on media without sucrose and a striking purple-tinted (“fusca”) phenotype superimposed on the albino phenotype on media containing sucrose (Figure 1, C and D). Because only a subset of mutants were grown on both types of media, there are likely to be more mutants in this subclass that have been classified in this study as albino. About 1 week after germination on media containing sucrose, the purple coloration begins to fade and the seedlings gradually appear more like typical albinos. Wild-type Arabidopsis seedlings grown on media with sucrose exhibit a much milder version of this purple phenotype, which is due to anthocyanin accumulation, particularly in the hypocotyl at its junction with the cotyledons and along the edges of the cotyledons (Kubasek et al. 1992).
Two mutants had a distinctive phenotype of green cotyledons and small white leaves (Figure 1I). This phenotype has been observed previously in Arabidopsis for mutants in the TZ, TH-1, and PY genes, which are likely to encode thiamin biosynthetic enzymes (Li and Redei 1969; Koornneef and Hanhart 1981). Both of these mutants seem to be thiamin auxotrophs because they appeared normal when grown on media supplemented with 0.1 mM thiamin (data not shown). Eleven of 13 dark green lethal mutants also displayed morphological defects (for example, Figure 1, K and L), which may imply that the dark green phenotype is indicative of a type of defect different from other pigmentation defects. A dark green leaf phenotype has been reported for dwarf mutants with a deficiency in either brassinosteroid or gibberellic acid hormonal pathways and it has been suggested that this defect may be due to smaller cell size (Clouse et al. 1996; Bennett et al. 1998).
Figure 1. Seedling-lethal pigmentation phenotypes. (A) GT0946, albino, no leaves (day 15). (B) 4036, albino, with leaves (day 14). (C) 4788, albino (day 7). (D) 4788, fusca (day 7). (E) 245, pale green (day 14). (F) GT6839, yellow leaves, albino cotyledons (day 12). (G) GT1209, yellow, small cotyledons, reduced root growth (day 14). (H) GT0992, yellow, small cotyledons, reduced root growth (day 14). (I) 5007, white leaves, green cotyledons (day 14). (J) 2973, fusca cotyledons with purple dots (day 18). (K) 22084, dark green, cotyledons not separated (day 14). (L) 22433, dark green, variable cotyledon number and size, short thick hypocotyl, little root growth (day 13). All seedlings were grown on MS media, except B and E were supplemented with 5% sucrose and D, F, and K were supplemented with 2% sucrose.
Figure 2. Seedling-lethal morphological phenotypes. (A) 3963, (right) small leaves with irregular margins (arrows) and a wild-type sibling (left). (B) 44446, no leaves, three small cotyledons, thick hypocotyl, reduced root growth. (C) 59928, no leaves, short thick hypocotyl. (D) 55582, single concave cotyledon, reduced hypocotyl, very little root growth (not visible). (E) 59438, single cotyledon, two small leaves (arrow), reduced hypocotyl, no root. (F) 59930, single cotyledon. (G) ET4386, two seedlings with variable cotyledon number, disrupted phyllotaxy, little root growth. (H) 58972, stubby cotyledons, variable cotyledon number, very little root growth. (I) 47091, variably shaped cotyledons, possibly four cotyledons, reduced hypocotyl, very little root growth. (J) 59270, two seedlings with three or four cotyledons, short and thick hypocotyl, little root growth. (K) 59095, no leaves, small closed cotyledons, short and thick hypocotyl, very little root growth. (L) ET5262, no leaves, small closed cotyledons, little root growth. (M) 58424, no leaves. (N) 57348, elongated hypocotyl (arrowheads), small leaves, pale green. (O) 46153, two seedlings with an elongated hypocotyl (arrowheads), elongated leaf petioles (arrows). (P) GT5602, small cotyledons, reduced hypocotyl, very little root growth. (Q) 54196, no root (arrowhead), short hypocotyl. (R) 5283, seedlings with very little root growth compared with wild-type (Col-0) seedling, variable cotyledon defect (arrowheads). (S) Col-0, root with small root hairs. (T) 5283, root with decreased overall length and longer root hairs compared to S. All seedlings were grown on MS media, except C, D, M, N, and R were supplemented with 2% sucrose and S and T were grown on GM. Seedlings in R, S, and T were grown on plates placed at a slant so that the roots would grow on the agar surface.
Table 1. Seedling-lethal phenotypic classes
In addition to the pigmentation-defective lethal mutants, mutants were isolated with a wide array of morphological defects (Figure 2). We observed phenotypes ranging from those that seemed to affect only a single structure to others that seemed to affect all seedling structures (leaves, cotyledons, hypocotyl, and roots). Mutant 3963 appeared normal, except for small leaves with irregular margins (Figure 2A). For lethals with defects in cotyledon number, either a single cotyledon (Figure 2, D–F) or multiple cotyledons (Figure 2, B and G–J), the number of cotyledons often varied among the seedlings of a given mutant line. More than 40 mutants exhibited a short, thick, or reduced hypocotyl (Figure 2, B–D and H). Four mutants displayed an elongated hypocotyl (Figure 2, N and O). Previously isolated elongated hypocotyl mutants, which affected gibberellic acid or light signal transduction, were viable (Jacobsen and Olszewski 1993; Briggs and Huala 1999; Reed et al. 2000), so that these essential genes might define novel components in these pathways. Ethylene and auxin are also known to control hypocotyl elongation (Collett et al. 2000). More than 30 mutants had very little or no root growth (Figure 2, P–R). For mutant 5283, although we saw increased root growth on media with sucrose (Figure 2T) compared to media without sucrose (Figure 2R), there was still considerably less root elongation than in wild type (Figure 2S). Several genetic screens have identified mutants that lack roots or have reduced root systems, but the number of Arabidopsis genes with this mutant phenotype remains unclear (Mayer et al. 1991; Cheng et al. 1995; Berleth et al. 1996; Scheres et al. 1996).
Table 2. Number of insertion loci in seedling-lethal mutants
Because many of the seedling-lethal mutants exhibited defects soon after germination, these mutants were also examined for phenotypes during embryogenesis. For the T-DNA lines, 191 of 407 seedling-lethal mutants had a detectable embryo-defective phenotype (data not shown; McElver et al. 2001). For the Ds lines, 35 of 72 examined seedling-lethal mutants had a detectable embryo-defective phenotype (data not shown). Generally, these embryo phenotypes appeared late in embryogenesis, for example, as pale mature or albino embryos.
Genetic analysis of mutants: Because Arabidopsis insertional mutants also contain noninsertional mutations, it was necessary to determine whether an insertion cosegregated with the seedling-lethal phenotype. If the insertion cosegregated with the lethal phenotype, the mutant was considered tagged, but if the insertion did not cosegregate with the lethal phenotype, the line was considered not tagged and a noninsertional mutation was likely to be the cause of the lethal phenotype. From the initial R:S ratio in cosegregation analysis, the number of insertion loci in a line was determined (see materials and methods; Table 2). Based on the segregation of the selectable markers, 54% of the T-DNA lines had a single insertion locus, while 94% of the Ds lines had a single insertion locus. The higher number of insertion loci per line in the T-DNA lines probably explains the higher frequency of seedling-lethal mutants isolated from the T-DNA population compared to the Ds population. From the cosegregation analysis, we identified 32 T-DNA and 32 Ds lines as tagged (Table 3). The frequency of tagged lines in both populations was about one-third.
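The difference in mutant frequency between the two populations referred to above can be made concrete with the screen totals reported earlier (407 of 26,187 T-DNA lines and 98 of 12,196 Ds lines). The sketch below only recomputes those frequencies and their ratio; the names are hypothetical, and the calculation does not by itself test whether the number of insertion loci per line accounts for the difference.

```python
# Screen totals taken from the text: (seedling-lethal lines, lines screened).
T_DNA_SCREEN = (407, 26187)
DS_SCREEN = (98, 12196)


def lethal_frequency(mutant_lines: int, screened_lines: int) -> float:
    """Fraction of screened lines with a seedling-lethal mutant phenotype."""
    return mutant_lines / screened_lines


def fold_difference(a: tuple, b: tuple) -> float:
    """Ratio of the mutant frequency in population a to that in population b."""
    return lethal_frequency(*a) / lethal_frequency(*b)


if __name__ == "__main__":
    print(f"T-DNA lines: {lethal_frequency(*T_DNA_SCREEN):.2%}")
    print(f"Ds lines:    {lethal_frequency(*DS_SCREEN):.2%}")
    print(f"fold difference: {fold_difference(T_DNA_SCREEN, DS_SCREEN):.2f}")
```

On these counts the T-DNA population yields seedling-lethal mutants at roughly twice the frequency of the Ds population, which is the direction expected if T-DNA lines carry more insertion loci per line.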
Table 3. Cosegregation analysis
We refer to the mutants by line number and have not named the corresponding genes because the large number of mutants isolated in this study made the use of complementation tests or genetic mapping inefficient for this purpose. Instead, we used the molecular position of an insertion in the Arabidopsis genome to identify mutants with disruptions of the same gene.
Molecular analysis of mutants: For each tagged seedling-lethal mutant line identified, we attempted to identify the DNA sequence of the gene or genes disrupted by the insertion. Table 4 shows a summary of this molecular analysis. Initially, to isolate Arabidopsis genomic DNA sequences adjacent to T-DNA insertions, we used a plasmid rescue approach (Holsters et al. 1982) and obtained flanking sequences for 12 of 13 lines. Although plasmid rescue was an effective tool to identify genomic sequence flanking insertions, we subsequently switched to TAIL-PCR (Liu et al. 1995) because it allowed a higher throughput. We used TAIL-PCR experiments to identify flanking sequences for 15 of 19 T-DNA lines. We used only TAIL-PCR for Ds insertions, which lack the sequence elements necessary for plasmid rescue, and obtained flanking sequences for 31 of 32 lines. BLASTn (Altschul et al. 1997) was used to identify the insertion position in Arabidopsis genomic sequence entries in GenBank. Additional BLAST searches identified genes with sequence similarity from Arabidopsis and other species. Results from plasmid rescue and TAIL-PCR experiments were confirmed either by Southern blot or by PCR (see materials and methods). For Ds lines ET2614 and GT2929, we were unable to clearly identify the essential gene due to a Ds insertion into the indole acetic acid hydrolase (iaaH) negative selectable marker, which might have been linked to the lethal mutation (Sundaresan et al. 1995; Parinov et al. 1999). For 10 of 64 mutants, no confirmed Arabidopsis genomic sequence flanking an insertion was recovered (Table 4). For 15 of the remaining 54 mutants, it was not possible to easily assign the gene that was responsible for the lethal phenotype because an insertion was between two genes or it was accompanied by a deletion or rearrangement affecting more than one gene (Table 4). Chromosomal rearrangements have been reported previously for T-DNA lines (Castle et al. 1993; Nacry et al. 1996).
Additional analysis of 30 of 37 genes disrupted in these seedling-lethal mutants included the identification, either experimentally or from GenBank, of a cDNA clone that contained the entire protein-coding sequence (G. J. Budziszewski and J. Z. Levin, unpublished results).
Molecular analysis of the tagged seedling-lethal mutants revealed that a diverse set of genes is essential for seedling viability. Table 5 shows the locations of insertions within Arabidopsis genomic clones and the identities of genes disrupted in 20 of the 39 mutants for which the gene responsible could be deduced. A detailed molecular characterization of the remaining 19 mutants will be presented in a future article. For two other lines, 868 and ET4401, the gene responsible for the seedling-lethal phenotype could not be identified (Table 5). Line 868 appears to contain a rearrangement or deletion that spans a region including the gene disrupted in line 4144, so that the lethal phenotypes of these two lines might be due to the inactivation of the same gene. Among those genes identified in Table 5 are four that were previously shown to have seedling-lethal phenotypes: DET1 (Pepper et al. 1994), CLA1 (Mandel et al. 1996), KEULE (Assaad et al. 2001), and PALE CRESS (PAC; Reiter et al. 1994), which may have a role in chloroplast mRNA maturation (Meurer et al. 1998). Disruption of the translational apparatus resulted in a pigmentation-defective lethal phenotype in line 245. Line 245 had a defect in a putative peptide chain release factor, which was predicted by TargetP (Emanuelsson et al. 2000) to be localized in the mitochondria. Disruption of the photosynthetic apparatus also resulted in pigmentation-defective lethal phenotypes in lines 4144, with a defect in a putative chloroplast ATP synthase δ-subunit, and GT1802, with a defect in a putative cytochrome b6-f complex iron-sulfur subunit. Antisense experiments with two similar genes in tobacco resulted in plants with extremely slow growth due to decreases in photosynthesis (Price et al. 1995).
In contrast to these four lines, the identification of a putative RNA splicing gene as defective in line 5283, which exhibited reduced root growth and variable defects in the number and shape of cotyledons and leaves (Figure 1, R and T), did not provide a clear explanation of the defect in this mutant; however, these data can be used as a starting point in future studies of this mutant. Consistent with the mutant phenotype, the mRNA for the gene disrupted in line 5283 is detected by Northern blot analysis in both roots and aboveground seedling tissues (data not shown). In summary, the predicted roles for the identified essential genes (Table 5) indicate that a diverse set of pathways and processes within the plant contain an essential component.
Molecular analysis of tagged lines
Sequence data for lines with a seedling-lethal phenotype
Nonmevalonate isoprenoid pathway mutants: Three of the genes identified in this study act in the recently discovered nonmevalonate isoprenoid pathway. In plants, two independent pathways are responsible for the synthesis of isoprenoids: a cytosolic acetate/mevalonate pathway and a plastidic nonmevalonate 1-deoxy-D-xylulose-5-phosphate pathway (reviewed in Lichtenthaler 1999 and Lichtenthaler et al. 2000). Initial studies in Scenedesmus obliquus suggested that such a pathway existed, but it was unclear whether there was redundancy between the two pathways (Schwender et al. 1996). Subsequently, the genes involved in this pathway have been identified and characterized in E. coli. We identified albino seedling-lethal mutants disrupting genes encoding the first three enzymes in this pathway. This phenotype was likely due to a block in the formation of carotenoids, phytol side chains of chlorophylls, and plastoquinone-9, which are products of this pathway. The first enzyme, 1-deoxy-D-xylulose-5-phosphate synthase (DXS), converts pyruvate and glyceraldehyde 3-phosphate to 1-deoxy-D-xylulose-5-phosphate (Lois et al. 1998). Line 1055 disrupts the Arabidopsis gene encoding DXS (Table 5), which has been previously identified as CLA1 on the basis of its albino mutant phenotype (Mandel et al. 1996). The second enzyme, 1-deoxy-D-xylulose-5-phosphate reductoisomerase (DXR), converts 1-deoxy-D-xylulose-5-phosphate to 2-C-methyl-D-erythritol 4-phosphate (Takahashi et al. 1998) and the Arabidopsis homolog has been cloned and characterized (Schwender et al. 1999). Line 4036 (Figure 1B) shows that the Arabidopsis homolog is an essential gene. The third enzyme, 4-diphosphocytidyl-2C-methylerythritol synthase, converts 2-C-methyl-D-erythritol 4-phosphate and CTP to 4-diphosphocytidyl-2C-methyl-D-erythritol (Rohdich et al. 1999) and the Arabidopsis homolog has been cloned and characterized (Rohdich et al. 2000).
Line GT0946 (Figure 1A) shows that the Arabidopsis homolog is also an essential gene. Interestingly, the enzymes in this pathway are not found in animals and have been proposed to be novel targets for herbicides and antibacterial drugs that are based on the compound fosmidomycin (Rohmer 1998).
Chloroplast protein translocation: Two of the genes identified in this study may act in the translocation of nuclear-encoded proteins into the chloroplast. Most of these proteins are imported by protein complexes composed of Toc (translocons of the outer envelope of chloroplasts) and Tic (translocons of the inner envelope of chloroplasts) proteins (Schleiff and Soll 2000). After import into the chloroplast, four pathways are proposed to be involved in the translocation of proteins into or across the thylakoid membrane (Robinson et al. 2001). Membrane proteins are translocated by an SRP-dependent or spontaneous pathway. Lumen proteins are translocated by a Sec-dependent or ΔpH-dependent pathway. Line 2490, which had a pale green phenotype, contained a disruption in a gene similar to the pea Tic40 (Stahl et al. 1999) and Brassica napus Toc36 genes (Ko et al. 1995; Table 5). Tic40 has been shown to be an inner chloroplast envelope-localized protein acting in protein translocation and displays some sequence similarity to Hsp70 interacting proteins, but its exact role remains unclear. Although some Arabidopsis chloroplast protein translocation mutants, e.g., ppi2 (Bauer et al. 2000) and 2490, display lethal phenotypes, others such as ppi1 (Jarvis et al. 1998), ffc, and chaos (Amin et al. 1999) have a pale nonlethal phenotype that might be a result of partial functional redundancy among import pathways. Lines GT6839, GT8096, and ET7536, which had a yellow phenotype (Figure 1F), each contained a disruption in the Arabidopsis homolog of the E. coli tatC gene (Table 5; Bogsch et al. 1998; Mori et al. 1999). The pea tatC protein has been shown to be required for the thylakoid ΔpH-dependent pathway in vitro (Mori et al. 2001). Maize mutants in the tha4 and hcf106 genes, which are similar to E. coli tatA and tatB, disrupt this pathway and have a seedling-lethal phenotype (Settles et al. 1997; Walker et al. 1999).
Predicted localization of essential seedling proteins: In light of the large fraction of seedling-lethal mutants with pigmentation defects (Table 1), we attempted to determine whether these mutants had molecular defects in chloroplast function. We used the TargetP program (Emanuelsson et al. 2000) to predict whether the genes identified in this study encode proteins containing chloroplast transit peptides (CTPs) at their N termini. TargetP is reported to be the most accurate method for predicting the presence of CTPs and is estimated to be correct for 85% of plant proteins. In most cases, the coding region for these proteins was derived from full-length cDNA sequences (data not shown). Nine of 13 genes from mutants with only a pigmentation phenotype in Table 5 were predicted to contain a CTP. Analysis of a larger set of 30 essential genes identified from mutants with only a pigmentation phenotype in this study indicated that 21 genes were predicted to contain a CTP (J. Z. Levin, unpublished results). None of the four genes from mutants with morphological phenotypes in Table 5 was predicted to contain a CTP (data not shown). Analysis of three additional genes from mutants with morphological phenotypes in this study predicted these genes also would not contain a CTP (J. Z. Levin, unpublished results). Among all Arabidopsis proteins, ~14% are predicted by TargetP to have CTPs (Arabidopsis Genome Initiative 2000). These results show a significant enrichment of chloroplast proteins in the pigmentation-defective mutant class.
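The size of this enrichment can be illustrated with a rough calculation that is not part of the original analysis. The sketch below assumes that each gene independently has a 14% chance of encoding a protein with a TargetP-predicted CTP (the genome-wide rate cited above) and computes the one-sided binomial probability of observing at least 9 of 13, or at least 21 of 30, such genes by chance. The function name `binomial_upper_tail` is hypothetical, and the calculation does not model the estimated 85% accuracy of TargetP.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p: float) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1.0 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Assumption: genome-wide fraction of proteins with a predicted CTP (~14%).
GENOME_CTP_RATE = 0.14

# Counts from the text: pigmentation-only mutants whose gene has a predicted CTP.
p_table5 = binomial_upper_tail(9, 13, GENOME_CTP_RATE)
p_larger_set = binomial_upper_tail(21, 30, GENOME_CTP_RATE)

print(f"P(>= 9 of 13 by chance)  = {p_table5:.2e}")
print(f"P(>= 21 of 30 by chance) = {p_larger_set:.2e}")
```

Under this assumed null model both probabilities are very small, which is in line with the enrichment described in the text.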
Gene family membership: In light of the high frequency of genes belonging to gene families in Arabidopsis (Arabidopsis Genome Initiative 2000), we analyzed the genes identified in this study as essential for seedling viability for their membership in gene families. In an analysis of the entire genome, 65% of Arabidopsis genes were considered to be members of gene families on the basis of BLAST (Altschul et al. 1997) and FASTA (Pearson and Lipman 1988) analyses (Arabidopsis Genome Initiative 2000). Using the same criteria, we determined that 44% of the 18 essential genes identified in Table 5 were members of gene families. Three of these essential genes are in gene families with two members, one is in a gene family with three members, one is in a gene family with four members, and three are in gene families with more than five members. This distribution is similar to that found for the entire genome (Arabidopsis Genome Initiative 2000). Analysis of a larger set of 37 essential genes identified from mutants in this study indicated that 41% are members of gene families (J. Z. Levin, unpublished results).
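As a quick consistency check on these figures, the family-size breakdown can be summed and compared with the stated percentage. The snippet below is only an illustrative sketch; the variable names are hypothetical and the counts are copied from the paragraph above.

```python
# Family-size breakdown of the essential genes in Table 5, as given in the text.
genes_by_family_size = {
    "two members": 3,
    "three members": 1,
    "four members": 1,
    "more than five members": 3,
}
TOTAL_ESSENTIAL_GENES = 18  # essential genes identified in Table 5

genes_in_families = sum(genes_by_family_size.values())
percent_in_families = round(100 * genes_in_families / TOTAL_ESSENTIAL_GENES)

print(
    f"{genes_in_families} of {TOTAL_ESSENTIAL_GENES} genes "
    f"({percent_in_families}%) belong to gene families"
)
```

The breakdown accounts for 8 of the 18 genes, which rounds to the 44% reported.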
DISCUSSION
With the determination of the genome sequence of Arabidopsis complete, the challenge for plant biologists is to understand the function of every gene. It is estimated that there are ~25,500 genes in Arabidopsis (Arabidopsis Genome Initiative 2000), but only a small fraction of them have been characterized. Determination of the function of all Arabidopsis genes is a goal of the Arabidopsis research community before the year 2010 (Chory et al. 2000). One approach to this problem is to generate large insertional mutant collections and to use them to identify mutants in each of the predicted genes (reviewed in Parinov and Sundaresan 2000). With the aim of identifying the genes necessary for seedling viability, we isolated and molecularly characterized Arabidopsis seedling-lethal mutants on a genome-wide scale. Using both T-DNA and Ds insertion mutant populations, we isolated >500 seedling-lethal mutants and molecularly characterized >50 genes disrupted in these mutants. Among these essential genes are some that might be considered “housekeeping genes” such as a peptide release factor in line 245 and an ATP synthase subunit in line 4144 and others that might be classified as “developmental regulatory genes” such as DET1 (Pepper et al. 1994) in line ET5745 (Table 5). A high proportion of seedling-lethal mutants with pigmentation defects are likely to affect nuclear-encoded chloroplast proteins. Seedling development seems to depend primarily on chloroplast function because of the need for energy. This hypothesis is supported by the ability of sucrose to bypass the need for light in dark-grown wild-type Arabidopsis seedlings that were able to flower when grown on vertical petri dishes with sucrose-containing media (Roldan et al. 1999). Although many complex processes involve the chloroplast during seedling development, such as coordination of plastid and nuclear gene expression (Susek et al. 1993), it is unclear whether these functions have an essential component beyond the need for energy.
Beyond the usefulness of the identification of a loss-of-function phenotype for a gene, each mutant can be a starting point for future detailed characterization of specific cellular and developmental processes. This collection of seedling-lethal mutants could be a resource for such experiments. The application of other functional genomic methods, such as mRNA or metabolite profiling, to these seedling-lethal mutants may yield a greater understanding of the roles these genes play in plant growth and development. In particular, mutants disrupting the nonmevalonate isoprenoid pathway (Table 5) could shed light on the regulation of these genes and metabolites within this newly discovered biosynthetic pathway.
Seedling-lethal mutants have been identified previously in other types of genetic screens, including those for pigmentation defects. High chlorophyll fluorescence (hcf) occurs in plants when there is a reduction in photosynthetic activity beyond photosystem II, and it can be detected visually because the plants appear red under UV irradiation. In a screen for hcf mutants, 23 of the 34 Arabidopsis mutants identified were also seedling lethals (Meurer et al. 1996). Although it is unresolved how many genes can mutate to an hcf phenotype, there is clearly an overlap between mutants that could be found in an hcf screen and those identified in this study as seedling lethal. Only 5 of the 130 maize hcf mutants identified were allelic with each other, indicating that this is a large class of genes (Miles 1994). Molecular analysis of a limited number of maize hcf genes has revealed chloroplast proteins acting in protein translocation (Settles et al. 1997), mRNA processing and translation (Fisk et al. 1999), and translation (Schultes et al. 2000). Genes identified in our study play roles in these processes as well (Table 5). A screen for chlorophyll-deficient xantha Arabidopsis mutants identified many mutants but focused on seven genes specifically affecting chlorophyll synthesis or integration into the photosynthetic membrane (Runge et al. 1995).
Multiple large-scale duplication events during the last 200 million years have been proposed to explain the extensive duplication within the Arabidopsis genome (Vision et al. 2000). As a result of these duplications, Arabidopsis is reported to have 65% of its genes within gene families compared to only 29% for S. cerevisiae, 28% for D. melanogaster, and 45% for C. elegans (Arabidopsis Genome Initiative 2000). Members of gene families can exhibit functional redundancy, depending on the extent of divergence of function by changes in their coding sequences and/or expression patterns. It will be valuable to determine the extent of functional redundancy within the entire Arabidopsis genome. An interesting example of partial redundancy within gene families has been reported for the Arabidopsis CAULIFLOWER and APETALA1 genes in which double mutants have a dramatic cauliflower-like floral meristem defect, while cauliflower single mutants have a wild-type phenotype and apetala1 single mutants have a milder floral-defective phenotype (Kempin et al. 1995). For the large R2R3 MYB transcription factor gene family, it appears that there may be considerable functional redundancy as a significant number of genes have no visible single mutant phenotype, although further study may reveal subtle phenotypes (Meissner et al. 1999). Somewhat surprisingly, 41% of the genes found in this study to be essential for seedling viability are members of gene families. For these genes, no other gene family member can replace the function of the mutated gene. While it is possible that all the members of a gene family have the same function and the lethal phenotype is the result of a decrease in the aggregate level of gene function below a threshold, it seems more likely that there has been a divergence in function for these genes. One other possibility is that members of a gene family encode proteins that function nonredundantly in different cellular compartments, e.g., cytoplasm and chloroplast.
These results suggest that many of the members of gene families in Arabidopsis may have nonredundant functions.
Although we isolated >500 seedling lethals, our research program is ongoing and additional effort will be required to establish exactly how many genes can mutate to this phenotype. An estimate of how close we are to saturation in this screen can be made in several ways. First, for a phenotypic class with a known number of genes based on previous saturation mutageneses, we can compare the number of mutants isolated in this study. We detected five mutants with a fusca phenotype (Table 1) and 10 genes could have been detected in a seedling screen (Misera et al. 1994). We detected two seedling-lethal mutants with white leaves and thiamin auxotrophy (Table 1) and 3 genes with this phenotype are known (Koornneef and Hanhart 1981). Because the number of mutants recovered in each class is smaller than the number of known genes, both of these results suggest that the screen is not yet saturated and that there are >500 genes with a seedling-lethal phenotype. Second, an estimate can be made relative to the emb lethals isolated from the same T-DNA population (McElver et al. 2001). McElver and co-workers found that 2.5% of T-DNA lines segregate an emb mutant and they estimate that there are 500–750 emb genes (McElver et al. 2001). In comparison, 1.6% of T-DNA lines segregated a seedling-lethal mutant, suggesting that 320–480 genes are in this class. Third, only 1 gene was mutated more than once among the 54 tagged lines characterized molecularly. Thus, the molecular analysis will require additional information on multiple alleles before we can use this criterion to determine the extent of saturation.
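The second estimate follows from simple proportional scaling, which can be sketched as below. This is an illustration only: it assumes that the number of genes in a class is proportional to the fraction of T-DNA lines segregating a mutant of that class, and the function name `scale_gene_estimate` is hypothetical.

```python
def scale_gene_estimate(emb_gene_range, emb_line_pct, seedling_line_pct):
    """Scale an emb gene-number range by the ratio of mutant-line frequencies."""
    ratio = seedling_line_pct / emb_line_pct
    return tuple(round(n * ratio) for n in emb_gene_range)


# Values from the text: 500-750 emb genes, 2.5% emb lines, 1.6% seedling-lethal lines.
low, high = scale_gene_estimate((500, 750), emb_line_pct=2.5, seedling_line_pct=1.6)

print(f"Estimated seedling-lethal genes: {low}-{high}")
```

With the values quoted above, the ratio is 1.6/2.5 = 0.64, which gives the range of 320 to 480 genes stated in the text.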
As part of our seedling-lethal screen, we performed a large-scale direct comparison of T-DNA and Ds transposon insertional mutagenesis methods in Arabidopsis. The spectrum of phenotypes obtained in each screen appears to be similar (Table 1). Many of the differences between the methods result in T-DNA lines being more complex to analyze than Ds lines. On average, T-DNA lines had more insertion loci per line than Ds lines (Table 2). At a given insertion locus, a T-DNA line often had more than one copy and the insertion frequently was partially rearranged, while the Ds elements showed no evidence of rearrangements (data not shown). T-DNA lines were more likely to affect multiple genes or to have an insertion between two predicted genes. We attempted to identify the gene disrupted by an insertion in 32 T-DNA lines and 32 Ds lines. Among the 15 tagged lines analyzed in which the essential gene could not be identified for these reasons, only 4 were Ds lines (Table 4). We obtained a similar frequency of tagged mutants in both populations (Table 3). This frequency reflects the number of mutations in a line other than insertions carrying the selectable marker. Most of these mutations are likely to be point mutations or partial insertion copies that might be caused by DNA-modifying enzymes involved in T-DNA insertion or transposition. The frequency of 29% for T-DNA mutants is similar to the 36% found by Castle et al. (1993) and the 34% reported by McElver et al. (2001). Finding that only 33% of the Ds mutants were tagged was a surprise to us. Previous reports of Ds transposon tagging frequency in Arabidopsis suggested that a higher rate might be found. However, these estimates were based on analysis of fewer mutants. Because each Ac/Ds system was slightly different, it is not possible to definitively determine the causes of the reported tagged-mutant frequency. For the two largest studies reported, 15 of 29 (Long et al. 1997) and 4 of 28 (Altmann et al. 1995) mutants were tagged.
In three smaller studies, 3 of 5 (Bancroft et al. 1993), 3 of 4 (Long et al. 1993), and 2 of 6 (Bhatt et al. 1996) mutants were tagged. What might be the reason for these differences? It is possible that the low frequency obtained by Altmann et al. (1995) was due to the absence of selection against the continued presence of the Ac element, resulting in additional excision and insertion events that lowered the frequency of tagged mutants. Because the lines in this study (Sundaresan et al. 1995) and 26 of the lines in Long et al. (1997) were both generated from Ac lines expressing transposase under the control of the 35S promoter, the strength of the promoter used to express transposase does not seem to account for these differences. It is possible that previous studies do not have sufficiently large sample sizes to statistically distinguish them from the results in this study.
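To make the spread among the earlier Ds studies concrete, the reported counts can be tabulated as tagged fractions. The sketch below is illustrative only; the dictionary name `previous_studies` is hypothetical, and pooling studies that used different Ac/Ds systems is an assumption made purely for the arithmetic, not a claim that the studies are comparable.

```python
# (tagged mutants, mutants analyzed) for each earlier Ds study cited in the text.
previous_studies = {
    "Long et al. 1997": (15, 29),
    "Altmann et al. 1995": (4, 28),
    "Bancroft et al. 1993": (3, 5),
    "Long et al. 1993": (3, 4),
    "Bhatt et al. 1996": (2, 6),
}

for study, (tagged, total) in previous_studies.items():
    print(f"{study}: {tagged}/{total} = {100 * tagged / total:.0f}% tagged")

# Assumption: simple pooling across studies, ignoring differences in design.
pooled_tagged = sum(tagged for tagged, _ in previous_studies.values())
pooled_total = sum(total for _, total in previous_studies.values())
pooled_fraction = pooled_tagged / pooled_total

print(f"Pooled: {pooled_tagged}/{pooled_total} = {100 * pooled_fraction:.1f}% tagged")
```

Pooled in this simple way, 27 of 72 mutants (37.5%) were tagged, although the individual studies vary widely and each rests on a small number of mutants.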
To increase the confidence in the assignments of particular genes as responsible for the seedling-lethal phenotype in a given line, additional experiments will be necessary. These assignments are estimated to be correct in almost every case on the basis of the cosegregation results (see materials and methods). Four of the genes identified here were previously known to be essential (Table 5). Additional efforts could include the isolation of additional alleles, complementation with a wild-type transgene, reversion of Ds mutants, and creation of transgenic plants with antisense or dsRNA constructs (Waterhouse et al. 1998; Chuang and Meyerowitz 2000; Levin et al. 2000). The most efficient strategy for a large-scale effort would probably be the first alternative.
Acknowledgments
We acknowledge Joanna Barton, Paul Burt, Parna Chattaraj, Hodan Guled, Karen Maguylo, and Sarah Williams for technical assistance, and Bob Dietrich, Mark Johnson, David Meinke, and Cathy Frye for critical reading of the manuscript. We thank David Meinke, Mary Ann Cushman, Amy Schetter, and Kelsey Smith (all from Oklahoma State University) for assistance with cosegregation analysis. We are grateful to Joseph Simorowski for sending us additional seeds for the Ds lines on several occasions. We also thank the Syngenta Biotechnology, Inc., sequencing facility, greenhouse facility, and media kitchen for their excellent assistance.
Footnotes
- Note added in proof: After the submission of the revised version of this article, a report describing the phenotype of Arabidopsis tatC mutants was published (R. Motohashi, N. Nagata, T. Ito, S. Takahashi, T. Hobo et al., 2001, An essential role of a TatC homologue of a Delta pH-dependent protein transporter in thylakoid membrane formation during chloroplast development in Arabidopsis thaliana. Proc. Natl. Acad. Sci. USA 98: 10499–10504).
- Communicating editor: C. S. Gasser
- Received April 24, 2001.
- Accepted September 17, 2001.
- Copyright © 2001 by the Genetics Society of America