Abstract
The objectives of this study were to isolate and physically localize expressed resistance (R) genes on wheat chromosomes. Irrespective of the host or pest type, most of the 46 cloned R genes from 12 plant species share a strong sequence similarity, especially for protein domains and motifs. By utilizing this structural similarity to perform modified RNA fingerprinting and data mining, we identified 184 putative expressed R genes of wheat. These include 87 NB/LRR types, 16 receptor-like kinases, and 13 Pto-like kinases. The remaining were seven Hm1 and two Hs1pro-1 homologs, 17 pathogenicity related, and 42 unique NB/kinases. About 76% of the expressed R-gene candidates were rare transcripts, including 42 novel sequences. Physical mapping of 121 candidate R-gene sequences using 339 deletion lines localized 310 loci to 26 chromosomal regions encompassing ∼16% of the wheat genome. Five major R-gene clusters that spanned only ∼3% of the wheat genome but contained ∼47% of the candidate R genes were observed. Comparative mapping localized 91% (82 of 90) of the phenotypically characterized R genes to 18 regions where 118 of the R-gene sequences mapped.
The wheat crop is attacked by a variety of pests, including insects, pathogens, viruses, and nematodes. On average, these pests cause 20–37% yield loss worldwide, translating to ∼$70 billion/year (http://pseru.ars.usda.gov; Pimental et al. 1997). Genetic manipulation of resistance (R) genes is an efficient, economical, and well-tested method of controlling wheat pests. About 269 R genes conferring resistance to various pests have been identified in wheat or its wild relatives (McIntosh et al. 2000). Of the 110 that are well studied, 80 R genes show single-gene inheritance (3:1 ratio in F2) and seem to have the gene-for-gene interaction with the pests. Inheritance of the remaining 30 genes is complex and the resistance seems to be manifested in a nonspecific manner. The wheat genome is ∼16 million kb/haploid cell (Arumuganathan and Earle 1991). About 95–99% of the genome is nontranscribing (Sandhu and Gill 2003). Most of the wheat genes are present in clusters spanning physically small regions (gene-rich regions) of varying gene densities and numbers (Gill et al. 1996a,b; Sandhu and Gill 2002a; M. Dilbirligi, M. Erayman, D. Sandhu and K. S. Gill, unpublished results). The gene-rich regions are spaced by variously sized blocks of nested retrotransposons and duplicated genes (SanMiguel et al. 1996; Panstruga et al. 1998; Wendel 2000; Wicker et al. 2001). Although it is feasible to target the gene-rich regions, nontranscribing repeated DNA interspersing these regions poses a serious hindrance to genomic manipulation of genes (Feuillet and Keller 1999, 2002; Stein et al. 2000; Wicker et al. 2001; SanMiguel et al. 2002; Yan et al. 2003). Furthermore, in addition to the repeated DNA, most wheat genes have multiple orthologs and in some cases paralogs. Nonfunctional or diverged-function gene copies may further complicate genome manipulation. As a result, very few genes with known function have been cloned in wheat.
Therefore, approaches that specifically target an expressed part of the genome are particularly desirable for cloning wheat genes.
Thus far, 46 R genes conferring resistance against pathogens, insects, nematodes, and viruses attacking 12 plant species have been cloned (Dilbirligi and Gill 2004). Most of the cloned R genes are structurally conserved and can be grouped into four distinct classes on the basis of the presence of one or more of the nucleotide-binding (NB) sites, receptor-like transmembrane kinase (RLK), cytoplasmic protein kinase (PK), and leucine-rich repeat (LRR) domains (Hammond-Kosack and Jones 1997; Meyers et al. 1999; Mondragon-Palomino et al. 2002). Class I, containing NB and LRR (NB/LRR), is the largest with 31 members. This group of genes can have either an N-terminal coiled coil or a Toll-interleukin receptor-like (TIR) domain (Meyers et al. 1999; Mondragon-Palomino et al. 2002). The TIR domain is found mainly in the dicots. Class II R genes encode proteins that carry an extracellular N-terminal LRR region and a C-terminal transmembrane domain. Class III genes carry an extracellular LRR and a cytosolic kinase domain that are connected by a single-pass transmembrane domain. Class IV genes contain only the cytoplasmic PK domain. Significant structural and functional differences, however, may exist among R genes within each class. For instance, the LRR domain present in the Hs1pro-1 gene is intercellular compared to Cf2, Cf4, Cf5, Cf9, Xa21, Ve1, and Ve2, where it is extracellular (Dixon et al. 1996; Cai et al. 1997; Kawchuk et al. 2001).
The motifs present in the plant R-gene domains are highly conserved, and the sequence and order of the conserved amino acids are often unique to the R genes. The LRR domain of plant R genes contains 9–41 imperfect repeats, each ∼25 amino acids long with a consensus amino acid sequence of xx(L)x(L)xxxx (Cooley et al. 2000). The PK domain of both Pto and Xa21 genes contains up to 25-amino-acid-long motifs where the first three (DFG) and the last two (PE) residues are highly conserved (Liu et al. 2002). Internal threonine (T) and serine (S) residues are essential for autophosphorylation and thus are conserved (Liu et al. 2002). The kinase-1a (p-loop), kinase-2, and putative kinase-3a motifs of the NB domain of plant R genes have consensus sequences of GxxGxGK(T/S)T, LxxxDDVW, and GxxxxTxR, respectively. These sequences are considerably different from those present in other NB-encoding proteins (Hammond-Kosack and Jones 1997; Meyers et al. 1999).
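As an illustration only, the NB-domain consensus sequences quoted above can be turned into a simple pattern search. The sketch below is not the procedure used in this study; it assumes that "x" stands for any amino acid, that the space in the printed kinase-3a consensus is not significant, and the function names and the toy protein are hypothetical.

```python
import re

# Consensus motifs as quoted in the text; "x" is taken to mean any residue.
CONSENSUS = {
    "kinase-1a (p-loop)": "GxxGxGK(T/S)T",
    "kinase-2": "LxxxDDVW",
    "kinase-3a": "GxxxxTxR",  # assumption: the printed space is not meaningful
}

def consensus_to_regex(consensus):
    """Translate the paper-style consensus notation into a compiled regex."""
    pattern = consensus.replace(" ", "")
    # Alternatives written as (T/S) become character classes such as [TS].
    pattern = re.sub(
        r"\(([A-Z/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        pattern,
    )
    # Each lowercase x matches exactly one residue.
    return re.compile(pattern.replace("x", "."))

def find_nb_motifs(protein):
    """Return the 0-based start of the first match of each motif, or None."""
    hits = {}
    for name, consensus in CONSENSUS.items():
        match = consensus_to_regex(consensus).search(protein)
        hits[name] = match.start() if match else None
    return hits

def motifs_in_order(hits):
    """True when every motif is present and they occur in the listed order."""
    positions = list(hits.values())
    return None not in positions and positions == sorted(positions)

# Hypothetical toy sequence built to contain all three motifs.
toy_protein = "MAGMGGVGKTTAAAAALIVLDDVWAAAAGSRIITTRKK"
toy_hits = find_nb_motifs(toy_protein)
```

Checking the order of the motifs as well as their presence reflects the point made above that the sequence and order of conserved residues distinguish R genes from other NB-encoding proteins.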
Genomic DNA sequence analyses revealed that there are 166 putative NB/LRR-containing genes in Arabidopsis thaliana and ∼600 in rice (Oryza sativa; Arabidopsis Genome Initiative 2000; Goff et al. 2002; Richly et al. 2002). The number of functional or expressed R genes, however, is not known. Analyses of NB/LRR-containing genomic sequences from various crop plants suggested that only a small fraction of the R genes may be functional (Chin et al. 2001; Sun et al. 2001; Shen et al. 2002). With the objective of cloning resistance genes en masse, primers complementary to the conserved sequences have been used to amplify genomic DNA of various crop plants (Leister et al. 1996; Yu et al. 1996; Collins et al. 1998). Expressed sequence tag (EST) transcripts have been observed for only 9 of the 173 resistance gene analogs (RGAs) isolated from six major crop species (M. Dilbirligi and K. S. Gill, unpublished results; http://www.ncbi.nlm.nih.gov/entrez). The proportion of expressed R genes is expected to be even smaller in wheat because of the larger genome size.
It is well documented that recombination is highly uneven along wheat and other eukaryotic chromosomes. Detailed analysis of wheat homeologous group 1 chromosomes showed that recombination occurs mainly in gene-rich regions encompassing ∼13% of the genome (Sandhu and Gill 2002b). Distribution of recombination is very similar in other wheat chromosomes (M. Dilbirligi, M. Erayman, D. Sandhu and K. S. Gill, unpublished results). Most of the recombination occurs in the distal 50% of the chromosomes. There may be as much as a 20-fold difference in the rate of recombination even among various gene-rich regions (Sandhu and Gill 2002a; M. Dilbirligi, M. Erayman, D. Sandhu and K. S. Gill, unpublished results). In many plants, R genes have been shown to cluster on the chromosomes, usually in the highly recombinogenic subtelomeric ends (Botella et al. 1997; Meyers et al. 1999; Sakamoto et al. 1999; Halterman et al. 2001; Brueggeman et al. 2002; Ernst et al. 2002; Goff et al. 2002; Richly et al. 2002; Akhunov et al. 2003). Recombination in regions immediately around the R genes, however, is usually suppressed (Noel et al. 1999; Chin et al. 2001; Sun et al. 2001; Shen et al. 2002). Because of the massive differences in the extent of recombination, particularly around the R genes, linkage-based analysis may be misleading. It is, therefore, imperative to reveal the physical location of the R genes. Furthermore, it should be possible to use this unique structural and functional organization of R-gene-containing regions to verify the R-gene candidacy of any sequence.
The objectives of this study were: (i) to isolate and characterize the expressed fraction of the wheat R genes belonging to all known classes and (ii) to physically map both phenotypically characterized R genes and the cloned R-gene candidates via comparative analysis and deletion mapping to understand their distribution in the wheat genome.
MATERIALS AND METHODS
Plant material: Fifty-four wheat lines possessing one or more R genes were used for the modified RNA fingerprinting method. Nineteen nullisomic-tetrasomic lines (missing a pair of chromosomes, the deficiency of which is compensated for by a pair of homeologous chromosomes) and 13 ditelosomic lines (missing a pair of chromosome arms; Sears 1954) were used for arm location of the gene fragments. High-resolution physical mapping of the R-gene sequences was accomplished using 339 wheat deletion lines (Endo and Gill 1996).
RNA methods: Total RNA was isolated from the pooled leaf tissue from 3- to 4-week-old seedlings of the 54 R-gene-containing wheat lines using the guanidinium thiocyanate-cesium chloride method (Sambrook et al. 1989) with a few modifications. The tissue was ground to a fine powder in liquid nitrogen and suspended in guanidinium thiocyanate buffer containing 1% mercaptoethanol and 0.5% sodium lauryl sarcosinate. After removing the cell debris by centrifugation, supernatant was layered on a 5.7 M CsCl/0.01 M EDTA solution in an ultracentrifuge tube and centrifuged in a swinging bucket rotor (SW55Ti) at 40,000 rpm for ∼16 hr. The RNA pellet was washed with 70% ethanol and resuspended in TE containing 0.1% SDS. The poly(A)+ RNA was isolated following the standard protocol (Sambrook et al. 1989).
Modified RNA fingerprinting and cloning: First-strand cDNA synthesis and the subsequent PCR amplification for the RNA fingerprinting reactions were carried out using the Delta RNA fingerprinting kit (CLONTECH, Palo Alto, CA) following the recommended protocols, except that 35S isotope was used instead of 33P. In addition to degenerate primers for the p-loop (GVGKTT) and the hydrophobic (GLPLAL) domains of the cloned R genes, nine T primers were also used. The T primers had a common 19-bp sequence at the 5′ end followed by nine thymidines and two variable bases at the 3′ end in all possible pairwise combinations of A, C, and G (Table 1). The PCR reactions were performed in a total volume of 20 μl containing 5 μM each of dNTPs, 0.4 μl of Advantage cDNA polymerase mix (CLONTECH), 20 μM of each primer, 50 ng of the first-strand cDNA, 2 μl of 10× cDNA PCR reaction buffer, and 0.2 μl of [35S]dATP. The PCR conditions were the following: one cycle of 5 min at 94°, 5 min at 45°, 5 min at 68°; 2 cycles of 2 min at 94°, 5 min at 45°, 5 min at 68°; 25 cycles of 1 min at 94°, 1 min at 60°, and 2 min at 68°, followed by 7 min at 68°. The amplification products were size separated on a 0.4-mm denaturing 5% polyacrylamide/8 M urea gel, following standard sequencing gel protocol (Sambrook et al. 1989). The gel containing 10 μl of amplification product was run at 70 W for 3–4 hr. The gel was then rinsed in TE buffer for ∼5 min, blotted onto a Whatman 3MM paper, and dried using a gel dryer (Bio-Rad, Richmond, CA) at 80° for 2 hr. An X-ray film was placed on the dried gel and exposed for 3 to 4 days. The fragment bands were cut out of the gel and eluted by boiling in 50 μl of TE for 5 min. The eluted DNA was reamplified using the corresponding primer pair, size separated on a 1% agarose gel, and purified using GENE-CLEAN III KIT (Q-BIOgene). The fragments were cloned using the pGEM-T easy cloning system I (Promega, Madison, WI).
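The design of the nine T primers described above (a common 5′ sequence, nine thymidines, and two variable 3′ bases drawn from A, C, and G) can be enumerated in a few lines. This is a minimal sketch: the common 19-base sequence is given in Table 1 and is not reproduced here, so a run of N is used as a labeled placeholder, and the function name is hypothetical.

```python
from itertools import product

# Placeholder for the common 19-base 5' sequence (the real sequence is in Table 1).
COMMON_5PRIME = "N" * 19

def t_primers(common=COMMON_5PRIME, n_thymidines=9, variable_bases="ACG"):
    """List every primer: common 5' end + poly(T) run + two variable 3' bases."""
    return [
        common + "T" * n_thymidines + first + second
        for first, second in product(variable_bases, repeat=2)
    ]

primers = t_primers()
for primer in primers:
    print(primer)
```

Three choices at each of the two variable positions give 3 × 3 = 9 primers, which agrees with the number of T primers stated in the text.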
The DNA sequencing was done commercially (http://geneseek.com). The Sequencer 6.0 program and VecScreen (http://www.ncbi.nlmn.nih.gov/) function were used to trim vector contamination. The trimmed sequences were grouped at 90% nucleotide identity level and the longest sequence from each group was used for further analysis. The TBLASTX (nucleotide query-translated database), BLASTX (nucleotide query-protein database), and BLASTN (nucleotide-nucleotide comparison) version 2.1.2 (http://www.ncbi.nlm.nih.gov) were used to assign putative functions to the sequences (Altschul et al. 1997). Numerical options were used at default values during the BLAST searches.
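The grouping step described above (cluster at 90% nucleotide identity, keep the longest sequence of each group) can be sketched as a greedy procedure. The sketch below is an assumption-laden illustration, not the software used in the study: it substitutes the similarity ratio of Python's difflib for a true alignment-based identity, and the function names and toy sequences are hypothetical.

```python
from difflib import SequenceMatcher

def identity(a, b):
    """Crude similarity in [0, 1]; a stand-in for alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()

def pick_representatives(sequences, threshold=0.90):
    """Greedy grouping: visit sequences from longest to shortest and start a
    new group only when no existing representative reaches the threshold.
    Each representative is therefore the longest member of its group."""
    representatives = []
    for seq in sorted(sequences, key=len, reverse=True):
        if not any(identity(seq, rep) >= threshold for rep in representatives):
            representatives.append(seq)
    return representatives

# Hypothetical toy reads: the second is a one-base-shorter copy of the first.
read_a = "ACGTTGCAAGCTTAGGCTAA"
read_b = read_a[:19]
read_c = "TTTTTTTTTTCCCCCCCCCC"
reps = pick_representatives([read_b, read_a, read_c])
```

For real data, an assembly or clustering program with a proper alignment score would replace the `identity` function; the greedy loop is only meant to show why taking the longest member per group removes redundancy.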
Oligonucleotides used for RNA fingerprinting
R-gene identification by data mining: For data mining, 22 putative protein sequences corresponding to four major R-gene classes were used to perform TBLASTN (protein query-translated database) searches at the National Center for Biotechnology Information (NCBI) GenBank EST database (nonmouse and nonhuman EST entries; est_others; http://www.ncbi.nlm.nih.gov; Altschul et al. 1997). By comparing domain search, individual and multiple motif search, consensus sequence search, and individual full-length search, we observed that the individual full-length search is the most successful method for mining plant R genes (M. Dilbirligi and K. S. Gill, unpublished results). Therefore, the individual full-length BLAST search was performed for the putative protein sequences of 22 R genes (Table 2) belonging to all R-gene classes. Numerical options were left at the default values for all BLAST searches.
Sequence analysis: The BestFit [Genetics Computer Group (GCG); Wisconsin Package Version 10.1, Madison, WI] sequence comparison was performed to calculate the homology between the identified sequences and the known R genes (gap opening penalty of 3.0 and gap extension penalty of 1.0). The identified wheat sequences were compared with the known R genes to show their structural resemblance. The criteria for structural resemblance comparisons were overall sequence similarity; presence and order of domains and motifs; sequence, location, and order of the conserved amino acids within the motifs; and size of the regions interspersing various motifs and domains. The MEME function of GCG and the CLUSTALW function of Vector NTI (default values) were used to identify the motifs. Additional details of the analysis of the identified sequences are given by M. Dilbirligi and K. S. Gill (unpublished results).
Physical mapping of phenotypically characterized wheat R genes: Physical localization of phenotypically characterized wheat R genes was accomplished by comparative mapping. There are 269 R genes, including quantitative trait loci (QTL), known in wheat or its wild relatives, of which 229 have been mapped on chromosomes, 147 have been mapped on chromosome arms, and 90 have been mapped relative to molecular markers (McIntosh et al. 2000; http://wheat.pw.usda.gov/ggpages/maps.shtml). A comprehensive wheat consensus genetic linkage map containing 1380 commonly used DNA markers has been constructed by combining information from 137 published maps (M. Dilbirligi, M. Erayman, D. Sandhu and K. S. Gill, unpublished results). To reveal the physical location of wheat R genes, 90 phenotypically characterized R genes were first placed on the consensus genetic linkage map by using the following criteria: First, the genes and markers present on multiple wheat genetic linkage maps were used as anchor markers for the construction of the consensus genetic maps. Second, the markers and R genes present between two anchor markers on multiple maps were integrated relative to the anchors. Genetic distances among the integrated markers were standardized relative to the flanking anchors. Third, markers and R genes with a consistent location on various maps were incorporated. Finally, the R genes present on only one or two genetic linkage maps were placed on the consensus genetic maps via common markers. If only one linked marker was available, the R gene was mapped in a window around the marker. Genetic distances used for the construction of the consensus genetic linkage map are relative.
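One plausible reading of "standardized relative to the flanking anchors" is a linear rescaling of each source-map interval onto the corresponding consensus interval. The sketch below shows that interpretation only; the exact standardization used for the consensus map is not described here, and the function name and all coordinates are hypothetical.

```python
def rescale_between_anchors(position, src_left, src_right, cons_left, cons_right):
    """Place a marker on the consensus map by linear interpolation.

    position             marker coordinate on the source map
    src_left, src_right  coordinates of the two anchors on the source map
    cons_left, cons_right coordinates of the same anchors on the consensus map
    """
    if src_right == src_left:
        raise ValueError("anchors must be at distinct positions on the source map")
    # Fraction of the anchor-to-anchor interval at which the marker sits.
    fraction = (position - src_left) / (src_right - src_left)
    return cons_left + fraction * (cons_right - cons_left)

# Hypothetical example: anchors at 10 and 30 on a source map correspond to
# 20 and 60 on the consensus map; an R gene sits at 15 on the source map.
consensus_position = rescale_between_anchors(15.0, 10.0, 30.0, 20.0, 60.0)
```

Because only the relative position between anchors is preserved, distances on the resulting consensus map are relative rather than absolute, as noted in the text.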
Plant R genes used for data mining
The consensus genetic linkage map was then compared with the consensus physical maps to reveal the physical location of the phenotypically characterized R genes. First, flanking markers for each physical region were identified on the consensus genetic linkage map to localize encompassed R genes to the corresponding physical region. The R genes that were present very close to the flanking markers were also placed but were marked with an asterisk (*). Additionally, the physical location of 17 group 5 R genes (marked “†”) was extrapolated on the basis of the location of other R genes present on the arm.
Physical mapping of the R-gene candidates: A two-step mapping strategy was followed to physically map the putative R-gene sequences. First, the sequence clones were used as probes for gel-blot DNA hybridization analysis of 19 nullisomic-tetrasomic and 13 ditelosomic lines to reveal chromosome and arm location of the sequences. Either EcoRI or HindIII restriction enzymes were used for the analysis, except for BF201229 where DraI was used. The sequences were then analyzed on deletion lines for the corresponding chromosome. The names of the deletion lines along with the fraction length (FL) of the retained arm are given on the left of the chromosomes (Figure 2). Genomic DNA isolation and gel-blot analysis methods were as previously described (Gill et al. 1993). The aneuploid stocks and the EST clones were kindly provided by Bikram S. Gill and Olin Anderson, respectively. Each fragment band was mapped to a chromosomal region flanked by the breakpoints of the smallest deletion lacking the fragment band and the largest deletion possessing it. Multiple fragment bands mapping on a chromosome arm were scored to be nonallelic (shown by a, b, and c at the end of the probe name) if the bands mapped to different chromosomal regions. The physical mapping information from the three homeologues (A, B, and D) was combined to localize the known R genes as well as the R-gene candidates to the smallest possible physical regions. The consensus physical map for each of the seven wheat homeologous groups was constructed as previously described (Gill et al. 1996a).
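The interval rule stated above (a band maps between the breakpoint of the smallest deletion lacking it and that of the largest deletion possessing it) can be written out explicitly. The sketch below assumes positions are expressed as fraction length (FL) along one arm, with 0 at the centromere and 1 at the arm end, so that a line with FL f retains only the proximal fraction f of the arm. The line names, FL values, and function name are hypothetical.

```python
def map_band_to_interval(scores):
    """Return (proximal_FL, distal_FL) bounding the locus on one arm.

    scores maps a deletion line name to (FL of the retained arm, band_present).
    A missing band means the locus lies distal to that breakpoint; a retained
    band means the locus lies proximal to it.
    """
    proximal, distal = 0.0, 1.0
    for fl, band_present in scores.values():
        if band_present:
            # Largest deletion still possessing the band gives the distal bound.
            distal = min(distal, fl)
        else:
            # Smallest deletion lacking the band gives the proximal bound.
            proximal = max(proximal, fl)
    if proximal >= distal:
        raise ValueError("inconsistent scores: no interval satisfies all lines")
    return proximal, distal

# Hypothetical scores for one fragment band on one chromosome arm.
example_scores = {
    "del-1": (0.30, False),
    "del-2": (0.55, False),
    "del-3": (0.70, True),
    "del-4": (0.90, True),
}
interval = map_band_to_interval(example_scores)
```

Raising an error on contradictory scores mirrors the handling of discrepant deletion lines, which would need to be inspected or excluded rather than silently mapped.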
RESULTS
RNA fingerprinting: Size separation of the RNA fingerprinting PCR product showed 76 bands with the p-loop/GLPL primer combination. The p-loop/T primer combinations generated 80–100 bands each (Figure 1a). The size of the fragment bands ranged between ∼150 and 1300 bp except for the p-loop/GLPL primer combination in which the size of the smallest band was ∼300 bp. Many fragment bands amplified by different primer combinations appeared to be the same because of the size. About 900 fragment bands were observed on the fingerprinting gels. About 220 bands that appeared to be unique on the basis of size and intensity were excised from the gels, reamplified, and cloned (Figure 1b). Two clones corresponding to each band were sequenced. Additional clones were sequenced if two clones from a sample were different. A total of 385 clones were sequenced and analyzed. ContigExpress analysis of the sequences resulted in 121 unique contigs. The longest sequence of each contig was used for further analysis after sequences were confirmed to contain a p-loop (kinase-1a) motif. Individual Gap (GCG) analysis showed that 121 sequences shared 21% (UNL115–UNL175) to 90% (UNL184–UNL185, UNL201–UNL202 and UNL208, and UNL139–UNL160) sequence similarity. Only ∼27% of the sequences were closely related and sequence similarity for the remaining 73% was <50%. Putative protein sequences of the 121 clones were analyzed by BLASTX (Altschul et al. 1997; http://www.ncbi.nlm.nih.gov/BLAST). Only 48 of these sequences showed homology to the known or annotated genes and the remaining 73 were unique. Sequence similarity for 19 of the 48 was >80% (E ≤ 10^-40) and for the remaining was between 35 and 80% (E ≤ 10^-8). Twenty-six sequences were R-gene candidates because of their homology to NB/LRR, Pto, Xa21, or other types of R genes (Table 3 and Table 4).
Twenty-two sequences were homologous (BLASTX E values were ≤10^-29 to ≤10^-57) to genes controlling cell structural and metabolic activities and thus were not analyzed further.
Figure 1. Cloning of expressed candidate R-gene fragments using modified RNA fingerprinting. (a) 35S-labeled PCR product size separated on a polyacrylamide-urea gel. The product of each primer combination was loaded in two adjacent lanes. Approximate size of the fragments is shown on the left. (b) The excised bands were reamplified and size separated on a 1% agarose gel. The order of the PCR products on the agarose gel from left to right follows that of the polyacrylamide gel from top to bottom.
Motif analysis of the unique sequences: The MotifSearch (GCG) was performed for the 73 unique sequences to identify putative functional motifs and domains. In addition to the conserved p-loop (kinase-1a), 38 also had the predicted kinase-2 and kinase-3a motifs (Tables 3 and 4). Of these, 17 contained an NB domain similar to that of the NB/LRR type of R genes and the remaining 21 were putative kinases. The consensus among the 17 NB/LRR types of sequences was GVGKTT for p-loop along with (L/V) (L/V) (L/V) (L/I/D) D (D/I/V/L) for kinase-2 and (E/G/F/V) (T/S/G/Q) x (T/Y) (T/S) R for kinase-3a. The 21 putative kinases had an invariant aspartate (D) in kinase-2 and an arginine (R) in kinase-3a. Of the remaining 35 sequences, 21 had either a kinase-2 or a kinase-3a along with a p-loop motif as found in NB or kinase-encoding proteins. Fourteen sequences were not analyzed further as no motif other than a p-loop was observed.
Data mining: The full-length sequence comparisons identified 344 sequences homologous to 22 known R genes with E values ranging from ≤10^-1 to ≤10^-117. Detailed structural analysis of these sequences was reported by M. Dilbirligi and K. S. Gill (unpublished results). Briefly, comparison of the cloned R genes showed that the sequence, order, and gap length among various motifs and domains were unique to R genes even though the R-gene domains and motifs are also present in other proteins. Therefore, these structural features of the selected 344 sequences were compared with those of the cloned R genes. These analyses selected 176 sequences that were then used as queries for BLASTX searches in the nonredundant protein database. Only the sequences in which the best hit was a cloned R gene were selected. The selected 163 sequences were analyzed with the VecScreen, ORF Finder, and BLASTP programs (http://www.ncbi.nlm.nih.gov). These sequences assembled into 99 contigs representing 67 NB/LRR types, 12 Xa21-like (RLK), 11 Pto-like (PK), 7 Hm1-like, and 2 Hs1pro1-like sequences (Table 3). For physical mapping, the largest clone from each contig was used as a probe.
Physical mapping of R-gene candidates: A total of 166 best R-gene candidates, including 85 from RNA fingerprinting and 81 from data mining experiments, were physically mapped using wheat aneuploid stocks (Tables 3 and 4). These R-gene candidates belonged to all known classes. Forty-five of these sequences were unmappable because of either smear pattern or unresolved fragment bands on nullisomic-tetrasomic lines (Table 4). Physical location of 121 sequences was revealed using 335 deletion lines for all 21 wheat chromosomes (Figure 2). Four (4BL-6, 4BL-9, 5BL-11, and 6BS-9) of the 339 deletion lines were not used for the final analysis because of the discrepant results.
The 121 probes detected 664 fragment bands of which 371 were mapped on chromosomes using either EcoRI or HindIII enzymes (Table 4). These 371 fragment bands represented 310 loci. Of these, 94 mapped on the A-genome, 120 on the B-genome, and 96 on the D-genome chromosomes (Figure 2; Table 4). Forty-seven loci were detected for wheat homeologous group 1, 34 for group 2, 45 for group 3, 27 for group 4, 67 for group 5, 42 for group 6, and 48 for group 7 chromosomes. The highest number of loci was detected on homeologous group 5 (22%) and the lowest was detected on homeologous group 4 (9%). The difference was not significant among other homeologous groups. Of the 121 probes, 17 detected loci for only one, 21 for two, and the remaining 83 detected loci for all three homeologous chromosomes (Table 4).
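The counts reported in this paragraph can be cross-checked with a short tally. The sketch below only re-adds the numbers given in the text; the variable names are illustrative.

```python
# Loci per homeologous group and per genome, as reported in the text.
loci_by_group = {1: 47, 2: 34, 3: 45, 4: 27, 5: 67, 6: 42, 7: 48}
loci_by_genome = {"A": 94, "B": 120, "D": 96}
# Probes detecting loci on one, two, or all three homeologues.
probes_by_homeologue_count = {1: 17, 2: 21, 3: 83}

total_loci = sum(loci_by_group.values())
total_probes = sum(probes_by_homeologue_count.values())

# Percentage of all loci found on each homeologous group, rounded to integers.
percent_by_group = {
    group: round(100 * count / total_loci)
    for group, count in loci_by_group.items()
}

print(total_loci, sum(loci_by_genome.values()), total_probes)
print(percent_by_group)
```

Both breakdowns add up to the 310 loci stated, the probe classes add up to 121, and the shares for groups 5 and 4 round to the 22% and 9% quoted.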
Summary of the type of R-gene sequences used for deletion mapping
Twenty-three probes detected paralogous loci. The probes UNL214, UNL216, BE426789, BF200008, and BE427790 detected loci on three different homeologous groups (Table 4). The number of bands detected by these probes ranged between 7 and 15. Eighteen probes detected two paralogous loci each. The paralogous loci detected by 6 of these probes were on the same chromosome (Table 4). The loci corresponding to these 23 probes were 25 on the A-, 34 on the B-, and 25 on the D-genome chromosomes (Figure 2; Table 4). These 23 probes included 11 NB/LRR, six unique NB/kinases, and three of the PK class of R genes. Some differences were observed among the various types of sequences for the number of bands detected. The average number of bands for the NB/LRR, RLK, and PK classes of R genes was 6.1, 6.6, and 7.3, respectively. The unique NB/kinases, Hm1, and the other types of R genes detected an average of ∼4 bands.
The physical maps of the wheat chromosomes are shown in Figure 2. The 121 probes detected loci on all 21 wheat chromosomes. The highest number (67) of loci was detected on group 5 chromosomes and the lowest (27) on group 4. The number of loci for the remaining homeologous groups was between 34 and 48 (Figure 2). Most of the R-gene candidate loci mapped to the distal regions of the chromosomes. About 75% of the loci (231 of 310) mapped to the distal 20% of the chromosomes. The only exception was Xunl200, which mapped in the proximal regions of group 7 chromosomes (Figure 2). The majority of the NB/LRR and NB/kinase types of R-gene candidates mapped to the distal regions of the chromosomes. The Hm1 and PK-like gene sequences mapped in both distal and proximal regions (Figures 2 and 3). Also, the fragments belonging to the NB/LRR and NB/kinase classes of R genes mapped to all homeologous groups of wheat. The PR and RLK types of fragments mapped to five of the seven homeologous groups. The PR type did not map on groups 2 and 3, and the RLK type was absent on groups 1 and 3. The PK type was absent on homeologous groups 4, 6, and 7.
Various orthologous loci on the homeologues were in the syntenic regions; therefore, it was possible to combine the physical mapping information from the three homeologous chromosomes to generate a consensus physical map (Figure 3). About 43% of the loci mapped to the short arms and 57% to the long arms (Figure 3; Table 5). Most of the candidate R-gene loci localized to 26 chromosomal regions. These regions encompassed ∼16% of the wheat genome. Major differences were observed among various regions for both the size and the number of the candidate R-gene loci. The size of the regions ranged from 1 to 11% of the respective chromosome and the number of loci ranged from 2 to 29. Some of these regions were very small in size but contained a large number of the candidate R-gene loci. Five of these regions (short arm of groups 1, 2, and 3 and long arm of groups 5 and 6) contained ∼47% of the loci but encompassed only ∼3.5% of the genome (Figure 3). These regions of higher loci density were present mainly at the distal ends of the chromosomes.
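The clustering described above can be expressed as a fold enrichment, the ratio of the share of loci in a region to the share of the genome it occupies. This figure is not reported in the text; the sketch below simply derives it from the approximate percentages given for the five dense regions, and the function name is illustrative.

```python
def fold_enrichment(fraction_of_loci, fraction_of_genome):
    """Locus density in a region relative to a uniform distribution."""
    if fraction_of_genome <= 0:
        raise ValueError("fraction_of_genome must be positive")
    return fraction_of_loci / fraction_of_genome

# Five dense regions: about 47% of the loci in about 3.5% of the genome.
five_region_fold = fold_enrichment(0.47, 0.035)
print(round(five_region_fold, 1))
```

Because both inputs are approximate values, the result should be read as an order-of-magnitude indication (roughly 13-fold) rather than a precise estimate.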
Physical mapping of phenotypically characterized R genes: Physical mapping results of phenotypically characterized wheat R genes are summarized in Figure 3 and Tables 5 and 6. Of the total 269 wheat R genes, 90 have been mapped relative to molecular markers and 147 have been localized to chromosomes (Table 5; McIntosh et al. 2000). These wheat R genes confer resistance against 18 different wheat pests (see Table 5). The number of genes corresponding to the five major wheat pests Lr, Sr, Yr, Pm, and H are 45, 46, 47, 34, and 24, respectively. The number of resistance genes corresponding to the remaining 13 pests is 27 (McIntosh et al. 2000).
Gel-blot DNA analysis of the putative R genes using aneuploid stocks
An example of the comparison of the consensus physical maps with the genetic linkage maps is shown in online supplemental Figure 4 available at http://www.genetics.org/supplemental/. The 90 phenotypically characterized R genes were reliably mapped to small chromosomal regions by the comparative mapping approach. Seventeen additional R genes were tentatively localized to physical regions on the basis of the location of candidate R genes (marked ‘†’, Figure 3). These group 5 R genes (mainly H) were localized to a region where several candidate R genes mapped (Figure 3). These genes, however, were not included in any further analysis. Physically mapped wheat R genes included 31 Lr, 30 Sr, 8 Yr, and 16 Pm resistance gene loci and one H locus. The remaining 4 wheat R-gene loci were for Crr, Kb, and fusarium pathogens (Figure 3; Table 5). The short arms of the wheat chromosomes contained 50 of the 90 R genes and the long arms contained 40 (Figure 3; Tables 5 and 6). While wheat group 1 and 2 chromosomes contained 41% of the R genes, homeologous group 3 had the fewest (6%). The five major types of wheat R genes were present in all seven groups (Figure 3, Table 5). No specific pattern was observed for the type of wheat R genes.
Figure 2. Physical maps of R-gene candidates on the wheat chromosomes. Chromosome length, arm ratio, and the C-banding patterns were drawn to scale following the nomenclature for standard karyotypes (Gill et al. 1991). The deletion line breakpoints and FL of the retained arms are marked by arrows on the left of the chromosomes. Location of the candidate R-gene loci is given on the right.
Both the location and the distribution pattern between the phenotypically characterized wheat R genes and the candidate R-gene sequences were very similar. Of the 26 regions, 18 contained ∼78% of the R-gene candidate loci and ∼90% of the phenotypically characterized wheat R genes (Figure 3; Table 5). The proportional number in these regions was also similar. The exception was the long arm of group 2 chromosomes that contained eight of the known wheat R genes but lacked any of the candidate loci (Figure 3; Table 5). The size of these regions totaled <16% of the wheat genome. About 50% of both the phenotypically characterized wheat R genes and the candidate R-gene loci mapped to five distal regions of chromosome groups 1, 2, and 3 short arms and 5 and 6 long arms encompassing <3% of the genome (Figure 3).
DISCUSSION
The resistance mechanism seems to be conserved among plants as most of the R genes share a strong structural similarity irrespective of the host or the pest type. With the objective of utilizing this structural similarity to isolate R genes en masse, RGAs were amplified from genomic DNA using conserved motif primers (Kanazin et al. 1996; Leister et al. 1996; SanMiguel et al. 1996; Yu et al. 1996; Aarts et al. 1998; Collins et al. 1998; Seah et al. 1998; Shen et al. 1998). This approach was not very successful, probably because of the presence of pseudogenes that have structural similarities to the functional R genes. Furthermore, most of the functional R genes have been shown to be single or few copies that will be out-competed by multiple-copy nonfunctional RGAs during random cloning and sequencing of the amplified genomic fragments. Analyses of ∼800 RGAs from 20 plant species have identified only one functional R gene (DM3; http://www.ncbi.nlm.nih.gov/entrez; Shen et al. 2002). Our comparison for six major plants, each with an average of 200,000 ESTs, revealed that only ∼5% of the RGAs are expressed (data not shown). The remaining are either nonfunctional or rare transcripts.
Physical localization of phenotypically characterized wheat R loci based on mapping results of the putative R-gene sequences
Figure 3. Consensus physical maps showing distribution of candidate R-gene probes on wheat chromosome groups. Each arm of the consensus chromosome and the location and size of the R-gene-containing regions were drawn to scale on the basis of the average size of the three homeologous chromosomes. Names of the deletion lines flanking the regions are given on the left side of the consensus chromosomes. A few loci, mapping in closer proximity to the flanking deletion lines of the R-gene clusters, were included in the regions and labeled by an asterisk (*). Wheat R genes labeled as “†” on the long arm of group 5 were tentatively located since the majority of the candidate R-gene fragments mapped only to that region.
The pattern of gene expression differs markedly among the cloned R genes. About 81% of the R genes appear to be rare transcripts, as no EST is present for 32% and only one or two are present for the remaining 49% (data not shown). Expression of the other 19% of the cloned R genes is very high, making it very difficult to clone the rare transcripts by any random cloning and sequencing approach. We targeted the expressed R genes by using cDNA instead of genomic DNA. We made our approach independent of expression level by size-separating the amplified products on denaturing polyacrylamide gels. As a result, bands corresponding to abundant as well as rare transcripts were visible with similar intensity. The success of this approach is evident from the fact that sequencing only 385 clones identified 121 unique sequences. Of these, 92 were rare transcripts, including 42 novel sequences for which no wheat homologs or ESTs were observed in the database. Only a subset of the identified sequences contained the GLPL motif, as observed among the cloned R genes. These results suggest that using both GLPL and the poly(A)+ tail primers was a worthwhile approach.
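The rare-transcript bookkeeping described above can be illustrated with a minimal sketch. It assumes, as a simplification of the text, that a sequence with no EST or with only one or two ESTs counts as rare; the function names and the toy EST counts are hypothetical and are not data from this study.

```python
from collections import Counter

# Assumption: "only one or two" ESTs (from the text) is treated as a hard cutoff.
RARE_MAX_ESTS = 2


def classify_transcript(est_count: int) -> str:
    """Label a sequence by the number of matching ESTs in a database."""
    if est_count < 0:
        raise ValueError("EST count cannot be negative")
    if est_count == 0:
        return "no_est"
    if est_count <= RARE_MAX_ESTS:
        return "rare"
    return "abundant"


def rare_fraction(est_counts) -> float:
    """Fraction of sequences with zero, one, or two matching ESTs."""
    labels = Counter(classify_transcript(c) for c in est_counts)
    total = sum(labels.values())
    if total == 0:
        return 0.0
    return (labels["no_est"] + labels["rare"]) / total


# Hypothetical EST counts for eight sequences (illustration only).
toy_counts = [0, 0, 1, 2, 15, 0, 1, 40]
toy_share = rare_fraction(toy_counts)

# Figures reported in the text: 92 rare transcripts among 121 unique sequences.
reported_share = 92 / 121

print(f"toy rare share: {toy_share:.2f}")
print(f"reported rare share: {reported_share:.0%}")
```

The reported share, 92 of 121, corresponds to the ∼76% of rare transcripts quoted for the expressed R-gene candidates.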
Comparison of phenotypically characterized wheat R genes and mapped candidate R-gene sequences
Various approaches for comparing structural resemblance to the known R genes identified 99 R-gene candidates among the EST sequences. Including the 121 identified by RNA fingerprinting, all 220 expressed sequences showed varying levels of structural similarity to the known R genes. Detailed sequence analysis of these R-gene candidates is given by M. Dilbirligi and K. S. Gill (unpublished results). Of these R-gene candidates, 129 belong to the NB/LRR class, 38 to other types of R genes, and 17 are PR genes. The sequence similarity, the presence and location of the domains and motifs, and other structural features of the 38 sequences matched those of the Pto, Xa21, Hm1, or Hs1pro-1 R genes (Table 3). The remaining 36 of the 220 sequences had only a p-loop. No other R-gene motif or domain was observed, probably because of the shorter size of the sequences. Although many of these may structurally resemble known R genes, these sequences were not used for physical mapping.
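The counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the variable names are illustrative.

```python
# Candidate sequences by source, as stated in the text.
from_est_mining = 99
from_rna_fingerprinting = 121
total_expressed = from_est_mining + from_rna_fingerprinting

# Candidates assigned to a structural class, as stated in the text.
classified = {
    "NB/LRR": 129,
    "other R-gene types (Pto, Xa21, Hm1, Hs1pro-1)": 38,
    "PR genes": 17,
}
total_classified = sum(classified.values())

# Sequences showing only a p-loop are the unclassified remainder.
p_loop_only = total_expressed - total_classified

print(total_expressed, total_classified, p_loop_only)
```

The totals agree with the 220 expressed sequences, the 184 putative expressed R genes, and the 36 p-loop-only sequences reported in the study.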
A total of 229 genes conferring resistance to various wheat pests have been characterized and mapped on the chromosomes. Of these, 110 have been mapped relative to DNA markers, whereas only the arm location is known for 57. We were able to precisely localize 90 of the 110 marker-mapped genes relative to deletion breakpoints (Figure 2; online supplemental Figure 4 at http://www.genetics.org/supplemental/; Table 5). It was not possible to physically map the other 20 R genes because the markers linked to them were mainly of the random amplified polymorphic DNA or amplified fragment length polymorphism types and thus were not present on the consensus genetic linkage map.
Although the phenotypically characterized R genes were present on all chromosomes, wheat homeologous groups 1 and 2 had the greatest number: 92 of the 229. Wheat groups 3 and 4 contained the lowest number: 44 of the 229. Most of the wheat R genes are present in the telomeric or subtelomeric regions. About 75% of the wheat R genes mapped in the distal 20% of the chromosomes. All 90 R genes were clustered in ∼12% of the wheat genome, present as 20 small chromosomal regions. Five of these regions contained major R-gene clusters, accounting for ∼50% of the R genes (Figure 3). Clustering of R genes has also been observed in other plant species such as Arabidopsis, soybean, maize, and lettuce (Botella et al. 1997; Meyers et al. 1999; Chin et al. 2001; Richly et al. 2002). This clustering of R genes may have resulted from duplication events caused by unusual types of recombination that have been observed around R genes of various plants (Chin et al. 2001; Sun et al. 2001; Shen et al. 2002).
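A statement such as "about 75% of the R genes mapped in the distal 20% of the chromosomes" can be computed along the following lines. This is a minimal sketch that assumes each locus position is expressed as a fraction of arm length measured from the centromere (0) to the telomere (1); that representation, the function name, and the toy positions are hypothetical and are not taken from the mapping data of this study.

```python
def distal_share(positions, distal_fraction=0.2):
    """Share of loci lying in the distal part of a chromosome arm.

    positions: arm-length fractions, 0.0 at the centromere and 1.0 at the telomere.
    distal_fraction: size of the distal window (0.2 means the outer 20%).
    """
    if not 0.0 < distal_fraction <= 1.0:
        raise ValueError("distal_fraction must be in (0, 1]")
    positions = list(positions)
    if not positions:
        return 0.0
    for p in positions:
        if not 0.0 <= p <= 1.0:
            raise ValueError("positions must lie between 0 and 1")
    cutoff = 1.0 - distal_fraction
    in_distal = sum(1 for p in positions if p >= cutoff)
    return in_distal / len(positions)


# Hypothetical positions for four loci (illustration only).
toy_positions = [0.95, 0.90, 0.85, 0.40]
toy_distal = distal_share(toy_positions)
print(f"share in distal 20%: {toy_distal:.2f}")
```

The same function with a narrower window could be used to explore how strongly loci concentrate toward the telomeres, provided positions are available on a common scale.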
The distribution of the R-gene candidates on wheat chromosomes was very similar to that of the phenotypically characterized R genes. All 121 R-gene candidates mapped in 26 chromosomal regions, including the 20 in which phenotypically characterized wheat R genes mapped. The relative distribution of R-gene candidates among various regions roughly matched that of the known R genes (Figure 3; Table 5). The five major R-gene cluster regions that account for ∼50% of the known R genes contained 43 of the R-gene candidates (Figure 3; Table 5). Further characterization is necessary to match a particular R-gene candidate to resistance against a particular disease.
Acknowledgments
We thank Olin Anderson for providing the EST clones and Bikram Gill for seed of the aneuploid stocks. We also thank Michael G. Hanly for critically reading the manuscript. This work was supported by a joint contribution of the Agricultural Research Center, Washington State University (journal series no. 207-03), and the University of Nebraska (journal series no. 14286). Financial support for the project also came from the Turkish government, the Nebraska Agricultural Experiment Station Hatch Funds, the Department of Agronomy and Horticulture, the University of Nebraska, the Vogel Endowment funds, and the National Science Foundation.
Footnotes
- Communicating editor: J. A. Birchler
- Received August 18, 2003.
- Accepted September 24, 2003.
- Copyright © 2004 by the Genetics Society of America