Organization, Expression and Evolution of a Disease Resistance Gene Cluster in Soybean
Michelle A. Graham1,2,a, Laura Fredrick Marek1,a, and Randy C. Shoemakera,b
a Department of Agronomy, Iowa State University, Ames, Iowa 50010
b USDA-ARS, Corn Insect and Crop Genetics Research Unit, Iowa State University, Ames, Iowa 50010
Corresponding author: Randy C. Shoemaker, Department of Agronomy, Iowa State University, Ames, IA 50011. E-mail: rcsshoe{at}iastate.edu
Communicating editor: J. A. BIRCHLER
| ABSTRACT |
|---|
PCR amplification was previously used to identify a cluster of resistance gene analogues (RGAs) on soybean linkage group J. Resistance to powdery mildew (Rmd-c), Phytophthora stem and root rot (Rps2), and an ineffective nodulation gene (Rj2) map within this cluster. BAC fingerprinting and RGA-specific primers were used to develop a contig of BAC clones spanning this region in cultivar "Williams 82" [rps2, Rmd (adult onset), rj2]. Two cDNAs with homology to the TIR/NBD/LRR family of R-genes have also been mapped to opposite ends of a BAC in the contig Gm_Isb001_091F11 (BAC 91F11). Sequence analyses of BAC 91F11 identified 16 different resistance-like gene (RLG) sequences with homology to the TIR/NBD/LRR family of disease resistance genes. Four of these RLGs represent two potentially novel classes of disease resistance genes: TIR/NBD domains fused inframe to a putative defense-related protein (NtPRp27-like) and TIR domains fused inframe to soybean calmodulin Ca2+-binding domains. RT-PCR analyses using gene-specific primers allowed us to monitor the expression of individual genes in different tissues and developmental stages. Three genes appeared to be constitutively expressed, while three were differentially expressed. Analyses of the R-genes within this BAC suggest that R-gene evolution in soybean is a complex and dynamic process.
In the last several years, many different disease resistance genes (R-genes) have been cloned from a variety of plant species. To date, five different structural classes of R-genes have been identified.
Complex clusters of R-genes are common in plant genomes. The Xa21 gene family in rice contains seven homologs within a 230-kb region (![]()
40 gene clusters of two or more (![]()
Determining how sequence differences between paralogs result in altered specificities has been essential in examining the evolution of R-genes. By examining three haplotypes of the Cf-2/Cf-5 family in tomato, ![]()
Domain-swapping experiments have also been used to identify regions in R-genes required for specificity.
One area of disease resistance research that has remained relatively unexplored is the expression of disease resistance genes. Using the BLASTX algorithm, ![]()
1 in 5000 ESTs correspond to TIR containing R-genes. The complete coding sequence of soybean cDNA clone LM6, a TIR/NBD/LRR R-gene homolog (![]()
1 in 2000.
Given these low transcript levels, the expression of only a few genes has been examined in detail.
In soybean, a cluster of RGAs had been mapped to a region on soybean linkage group J.
We report here the complete sequence of soybean BAC 91F11 from cultivar Williams 82. We have identified 16 different R-gene sequences within this BAC, including four genes from two potentially novel classes of disease resistance genes. The first class is composed of two genes with homology to the TIR and NBD domains fused inframe to an NtPRp27-like gene that is a putative defense-related protein believed to be involved in downstream defense responses. The second class is composed of two genes with complete TIR signatures fused inframe to soybean calmodulin Ca2+-binding domains. RT-PCR revealed that three of the genes within this BAC were constitutively expressed, while three appeared to be differentially expressed. Sequence analysis of regions outside of the R-genes has also revealed important information about the origins of these genes and the mechanisms governing their evolution.
| MATERIALS AND METHODS |
|---|
BAC 91F11 sequencing:
BAC 91F11 was identified from the Iowa State University Williams 82 soybean BAC library. BAC 91F11 subclones were generated using three different methodologies from BAC DNA prepared using QIAGEN (Valencia, CA) tip-100s. Subclones were prepared from low-melt agarose-purified BAC insert DNA digested with Sau3AI and ligated into the BamHI site of HK-phosphatase (Epicentre Technologies, Madison, WI) treated pGEM3 Z+ (Promega, Madison, WI). Because many of the initial subclones were chimeric, subclones were also generated from Tsp509I partially digested BAC DNA treated with HK-phosphatase, size-selected on a low-melt agarose gel, and ligated into the EcoRI site of pBSKII+ (Stratagene, La Jolla, CA). While this technique did not produce detectable chimeric subclones, many gaps remained in the BAC sequence after analysis of a number of clones statistically determined to provide full sequence coverage. All but two of these remaining sequence gaps were filled using subclones generated from nebulized, phosphatase-treated and end-polished BAC DNA (Invitrogen, La Jolla, CA) cloned into the EcoRV site of pBSKII+. Subclones >1500 bases were sequenced by the Iowa State University DNA Synthesis and Sequencing Facility. Subclone sequences were assembled using Sequencher software (Gene Codes Corporation, Ann Arbor, MI). Once a core of contig sequences was assembled, only additional informative subclones were completely sequenced. The two final gaps were closed by PCR using primers designed from adjacent subclone sequences. The sequence of BAC 91F11 has been given GenBank accession no.
AF541963. Sequence similarities were determined using the BLASTX algorithm.
Sequence analysis of BAC 91F11 resistance genes:
The location of introns was predicted on the basis of sequences of cDNAs LM6 and MG13.
Development of resistance gene-specific primers:
As genomic sequences were obtained from BAC 91F11, they were arranged into contigs made up of overlapping sequences. These sequences were examined for disease resistance motifs using the BLASTX algorithm. Sixteen different genes or gene fragments were identified, ranging in nucleotide identity from 71 to 99% (see RESULTS). Alignment of the genes using Lasergene software (DNASTAR) was used to identify regions from which gene-specific primers could be designed for use in RT-PCR. Oligo 6.0 (Molecular Biology Insights, Cascade, CO) was used for designing primers for 12 of the genes (Table 1). Gene-specific primers could not be designed for R-genes 6 and 9 and R-gene fragments A and B due to the small size and high nucleotide identity of the R-gene portions.
Controls for testing resistance gene primer specificity:
Each primer pair was tested by PCR against subclones representing all other predicted resistance genes in the BAC. PCR was performed using a PTC 200 DNA Engine thermocycler from MJ Research (Watertown, MA). PCR reactions were 20 µl in volume and contained 1x BRL PCR buffer, 2.0 mM MgCl2, 200 µM each dNTP, 0.2 µM each primer, 1 µl template DNA, and 0.5 units Taq Enzyme (Invitrogen, Carlsbad, CA). PCR cycling conditions were 94° for 2 min, 35 cycles of 94° for 1 min, anneal for 30 sec, 72° for 1 min, followed by 72° for 2 min. Annealing temperatures were altered until a specific primer pair would amplify only the subclone corresponding to the specific gene. To confirm that no other products were amplified, a Southern blot of PCR products derived from each primer pair against all subclones was made and probed with the PCR product of the specific primer pair. If the primers were specific, only the subclone corresponding to the gene-specific primer pair would show a hybridization signal. A second Southern blot was used to demonstrate that the gene-specific PCR product could hybridize to all the subclones. Together, the two Southern blots demonstrated that the gene-specific primers were specific to an individual R-gene within the BAC. To test whether the primers were specific relative to the rest of the R-genes in the genome, the gene-specific primers were tested against the Williams 82 BAC library. The primers were considered specific if they amplified only BACs that overlapped with BAC 91F11. Once the gene-specific primers were selected, they were retested against the subclones representing all of the genes using the reagents for RT-PCR.
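The specificity check described above can be mirrored in silico before any bench work. The following is a minimal sketch, assuming exact-match annealing (real primers tolerate some mismatches) and using toy template sequences that are purely hypothetical and not taken from BAC 91F11. It reports which templates contain both the forward primer site and, downstream of it, the site for the reverse primer.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def amplifies(template: str, forward: str, reverse: str) -> bool:
    """True if both primers find exact sites in the expected orientation.

    The reverse primer anneals to the bottom strand, so its reverse
    complement must appear on the top strand downstream of the forward site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return False
    downstream = template.find(reverse_complement(reverse), start + len(forward))
    return downstream != -1


def specific_targets(templates: dict, forward: str, reverse: str) -> list:
    """Names of all templates the primer pair would amplify (exact match only)."""
    return sorted(name for name, seq in templates.items()
                  if amplifies(seq, forward, reverse))


# Hypothetical toy subclones: gene_b differs from gene_a by one base
# inside the reverse primer site.
toy_subclones = {
    "gene_a": "AAGGCTTACG" + "TTTTTTTTTT" + "CGATGCCATA",
    "gene_b": "AAGGCTTACG" + "TTTTTTTTTT" + "CGATGACATA",
}
toy_forward = "AAGGCTTACG"
toy_reverse = "TATGGCATCG"  # reverse complement of CGATGCCATA

amplified = specific_targets(toy_subclones, toy_forward, toy_reverse)
```

A primer pair would pass this simplified screen only if the returned list holds the intended gene and nothing else; the Southern blot controls in the text address cross-amplification that an exact-match screen cannot see.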
mRNA isolation and reverse transcriptase polymerase chain reaction:
Six mRNA samples were isolated from a range of organs and stages of development in the soybean cultivar Williams 82. Plants from which samples were collected at 9 days after planting (DAP) and 14 DAP were grown in a growth chamber. The samples of fully expanded leaves were taken from greenhouse-grown plants 150 DAP. For other leaf samples, all the leaves were harvested from the plants. Flower and pod samples were taken from field-grown plants at 62 and 80 DAP, respectively. For each sample, tissue was taken from at least six plants, combined, and ground in liquid nitrogen. Two independent mRNA isolations were performed on the ground tissue. mRNA was isolated from 2 g of selected tissues using the Micro-FastTrack 2.0 kit (Invitrogen). Approximately 0.1 µg of mRNA was used for first-strand cDNA synthesis using the Advantage RT-for-PCR kit (CLONTECH, Palo Alto, CA) following the manufacturer's recommended conditions. An oligo(dT)18 primer was used for the first-strand synthesis. cDNAs were amplified using the Advantage cDNA polymerase mix (CLONTECH). Amplification reactions had a final concentration of 1x cDNA PCR reaction buffer, 0.2 mM dNTPs, and 0.5x cDNA polymerase mix in a total volume of 20 µl. Amplification conditions were those determined for the gene-specific primers above. RT-PCR products were run on a 1% agarose, 1x TAE, ethidium bromide gel.
Controls for the RT-PCR reactions included a "minus" reverse transcriptase RT-PCR reaction to test each mRNA sample for genomic DNA contamination. cDNA synthesis and RT-PCR conditions were as described for tissue samples except no reverse transcriptase enzyme was added to the cDNA synthesis reaction. In addition, an RT-PCR reaction using water as the template for cDNA synthesis was included to check for reagent contamination. Each set of synthesized cDNAs, including the controls, was amplified using primers designed from a soybean tubulin EST taken from the Public Soybean EST Project as a positive control. The sequences of the primers were Tub56 U, 5' CAA TTG GAG CGC ATC AAT G 3' and Tub56 L, 5' ATA CAC TCA TCA GCA TTC TC 3'. All RT-PCR reactions were repeated to verify results.
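As a quick sanity check on the tubulin control primers listed above, their length, GC content, and a rough melting temperature can be computed directly from the sequences. The sketch below uses the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a coarse estimate for short oligonucleotides; the text does not state how melting temperatures were calculated, so the choice of rule is an assumption.

```python
def clean(primer: str) -> str:
    """Strip the triplet spacing used in the text and upper-case the sequence."""
    return primer.replace(" ", "").upper()


def gc_content(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = clean(primer)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Wallace-rule melting temperature estimate in degrees Celsius."""
    seq = clean(primer)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Primer sequences as given in the text.
TUB56_U = "CAA TTG GAG CGC ATC AAT G"
TUB56_L = "ATA CAC TCA TCA GCA TTC TC"

for name, primer in (("Tub56 U", TUB56_U), ("Tub56 L", TUB56_L)):
    print(name, len(clean(primer)), round(gc_content(primer), 2), wallace_tm(primer))
```

Under this rule both primers come out at the same estimate, which is the property one usually wants from a primer pair used in a single annealing step.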
It was impossible to design gene-specific primers at the same location in each of the genes due to the high nucleotide identity shared between genes. Differences in the primers, amplification lengths, and primer annealing sites can significantly affect first-strand cDNA synthesis and PCR efficiency. Therefore, we made no attempt to quantify expression of the genes. Our assay is a yes/no assay to determine if the gene products could be detected in a particular tissue.
Analysis of R-genes with NtPRp27-like sequences:
To determine if genes homologous to genes 13 and 14 existed elsewhere in the genome, primers were designed to span the resistance gene and NtPRp27-like protein fusion point. Primer NBD13/14 (5' GGC CTT CCA CGG GCT TT 3') was designed from within the GLPLA domain of the NBD. Primer NtPRp27-R13/14 (5' TGC AAT ACC TCC ART TAA TC 3') was designed from within a conserved domain in the NtPRp27-like sequence. The primers were used to screen the Williams 82 BAC library using PCR conditions described previously for the R-gene-specific primers and an annealing temperature of 52°.
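Primer NtPRp27-R13/14 contains the IUPAC ambiguity code R (A or G), so it represents a small pool of oligonucleotides rather than a single sequence. The sketch below expands a degenerate primer into its explicit sequences using the standard IUPAC nucleotide codes; the function name is illustrative only.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list:
    """List every unambiguous sequence encoded by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]


# Reverse primer sequence as given in the text.
variants = expand_degenerate("TGC AAT ACC TCC ART TAA TC")
for seq in variants:
    print(seq)
```

With a single R position the pool contains two 20-base sequences that differ only at that site.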
| RESULTS |
|---|
Sequence analysis of BAC 91F11:
The subclones from BAC 91F11 were assembled into three large contigs of 12,608, 42,051, and 61,392 nucleotides. Average sequence redundancy was 4.8-fold, excluding subclones for which only end sequence was obtained. BAC-end sequences from other BACs in the Williams 82 linkage group J contig (![]()
Using the BLASTX and TBLASTX algorithms (![]()
Analysis of BAC 91F11 retroelements:
The retroelements from BAC 91F11 fall into three groups (Fig 1). Two of the retroelements are Gypsy/Ty3-related retroelements while the other two sequences show similarity to a Ta11-like non-LTR retroelement and an L1-like non-LTR retroelement. Further analyses were made of the two Gypsy/Ty3 retroelements. The element located between R-genes 13 and 14 has long terminal repeats of 390 bases. Three base differences were observed between the LTRs. The 4401-bp open reading frame is disrupted by three stop codons. Insertion of the retroelement resulted in duplication of the target sequence GAAAG. The second Gypsy/Ty3 element, next to R-gene 4, has identical LTRs, 370 bp in length. The open reading frame is 4524 bp in length and appears to be intact. Insertion of this retroelement resulted in a target site duplication of 5 bases, TGGGG. The LTRs of the two retroelements show no significant nucleotide identity with each other. Within the open reading frame the retroelements share 50% nucleotide identity.
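The LTR figures given above translate directly into percentage identity. The short sketch below does that arithmetic, assuming the three differences are single-base substitutions across the full 390-base LTR (the text does not say whether any are indels).

```python
def percent_identity(aligned_length: int, differences: int) -> float:
    """Percentage of identical positions over an ungapped alignment."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * (aligned_length - differences) / aligned_length


# Element between R-genes 13 and 14: 390-base LTRs with three differences.
ltr_13_14 = percent_identity(390, 3)
# Element next to R-gene 4: identical 370-bp LTRs.
ltr_gene4 = percent_identity(370, 0)

print(f"{ltr_13_14:.2f}%  {ltr_gene4:.2f}%")
```

The first value is consistent with the >99.2% LTR identity quoted in the Discussion for the element separating genes 13 and 14.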
Structure of BAC 91F11 resistance genes:
Using the GCG software package, we analyzed the structures of the R-genes located within BAC 91F11 (Fig 1 and Fig 2). All 16 R-genes are oriented in the same direction and have an average nucleotide identity of 86% within the predicted exons. The structures of the genes are shown in Fig 2. Genes 1, 3, 4, 5, 7, 8, 10, 11, and 12 appear to be full-length genes with similarity to the TIR/NBD/LRR family of disease resistance genes. Each of these genes contains 10 LRRs except for gene 12, which is missing the 3 terminal LRRs and 544 bases (relative to the consensus) of the 3' untranslated region. All of the genes, except genes 7 and 11, contain complete open reading frames. Gene 7 contains three frameshifts resulting in stop codons. Gene 11 has a single frameshift resulting in a stop codon. The locations of these frameshifts are shown in Fig 2. The intron positions within the genes are conserved although the sizes of the introns vary.
In addition to the full-length genes, we identified seven truncated genes, four of which may belong to two classes of novel plant defense genes. R-gene fragments A and B, which extend 187 and 243 bases past the start site, respectively, encode the amino terminus of the TIR. R-gene 2 encodes a complete TIR domain, a truncated NBD domain, and a single LRR (cDNA MG13, ![]()
81 bases long and overlaps with the third Ca2+-binding domain of ScaM-4. Ca2 is 133 bases long and shows homology with the fourth Ca2+-binding domain of ScaM-4. Using the BLASTN algorithm we were able to identify seven ESTs in GenBank dbEST corresponding to R-genes 6 and 9 (BG154262, BG652535, BI425148, BI787128, AW164239, BI971986, and BI892930). Four of the ESTs revealed that the calmodulin fragments were included in the transcripts of R-genes 6 and 9. The DIOGENES open reading frame prediction program supported this conformation. In both genes, a 111-bp intron separates the TIR from the Ca1 domain. The Ca1 domain is separated from Ca2 by a 945-bp intron in R-gene 6 and a 965-bp intron in R-gene 9 (Fig 2). Along their entire length, R-genes 6 and 9 share 99% nucleotide identity. Together, the TIR, Ca1, and Ca2 domains encode a protein 264 amino acids in length.
Using the BLASTX, BLASTN, and TBLASTX algorithms, we were unable to identify genes from other plant species homologous to R-genes 6 and 9 in the GenBank nonredundant, EST, or GSS databases. Analysis of the ESTs corresponding to genes 6 and 9 revealed that they were expressed in a variety of soybean cultivars (Williams, Williams 82, Bragg, Harosoy, progeny from a recombinant inbred line from Minsoy x Noir, and Corolla, a meristematic mutant) and tissues (germinating shoots, floral meristems, etiolated hypocotyls, hypocotyls infected with P. sojae, and roots following mock infection or flooding treatments). It is also interesting to note that ScaM-4 is one of two calmodulin isoforms induced by fungal elicitors or pathogen infection.
The last two truncated R-genes, genes 13 and 14, may also represent a second novel class of resistance genes. These two genes contain TIR and NBD domains (Fig 2); however, they are missing motif 5 of the NBD-ARC domain.
Genes 13 and 14 share 91.4% nucleotide identity from the start of the R-gene homology through the end of the NtPRp27-like sequence homology. Detailed analyses of the two genes show that the fusion occurred at the same breakpoint, suggesting that one of the genes is a duplicate of the other. In addition, sequences 1050 bp past the 3' end of NtPRp27-like protein have a nucleotide identity of 91.0%. At the end of this region, gene 13 has a 243-bp repeat of the 5' portion of the NtPRp27-like protein (Fig 1).
We searched the GenBank nonredundant database, dbEST, and dbGSS for other R-gene sequences fused to NtPRp27-like sequences. In addition, the databases were searched to identify all NtPRp27 homologs. These homologs were then screened for any R-gene motifs. In both cases, we were unable to identify R-genes fused to NtPRp27-like genes in any other species. Comparison of the amino acid sequences of NtPRp27 homologs from tobacco, wheat, barley, and Arabidopsis with the homologous region of genes 13 and 14 shows strong amino acid conservation (Fig 3A). Phylogenetic analyses of the NtPRp27-like portions of R-genes 13 and 14 showed greatest similarity to NtPRp27 and two putative genes from Arabidopsis (GenBank accession nos. GB AAD25570.1 and GB AAD25577.1; Fig 3B).
To determine if genes homologous to genes 13 and 14 existed elsewhere in the soybean genome, primers spanning the R-gene/defense-related protein fusion were used to screen the Williams 82 BAC library (Fig 3A). The primers identified only those BACs overlapping with BAC 91F11. The primers also detected similar bands in the following soybean lines: BSR 101, PI 1487-654, Wm79, Wm1, L91-8765, Altoona, Clark, A81-356033 (Glycine max), and PI 468916 (G. soja; data not shown).
Evolutionary analysis of BAC 91F11 R-genes:
The GCG program Diverge was used to examine sequence differences among the BAC 91F11 R-genes. The R-genes were divided into structural domains including the TIR, NBD, LRR, and also the IR between the NBD and LRR. By examining the ratio of nonsynonymous substitutions (Ka) to synonymous substitutions (Ks), we examined the types of evolutionary forces acting upon the genes. Within the TIR, NBD, IR, and LRR regions, average Ka/Ks values for all pairwise comparisons were 0.830, 0.548, 0.785, and 1.104, respectively. The use of two-by-two contingency tables identified gene comparisons in which the Ka/Ks ratio was significantly different from neutral selection (Ka/Ks = 1; Table 2). Within the IR and the NBD region many of the pairwise comparisons were significantly less than one, suggesting conservative selective pressure. Pairwise comparisons including R-genes 7 and 11 often resulted in Ka/Ks ratios significantly greater than one. These results, along with the frameshifts and stop codons, support their roles as pseudogenes.
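To make the Ka/Ks comparison concrete, the sketch below computes the ratio and tests a two-by-two table of changed versus unchanged nonsynonymous and synonymous sites. The text does not say which statistic was applied to the contingency tables; Fisher's exact test is used here as one common choice, and all counts are hypothetical placeholders rather than values from BAC 91F11.

```python
from math import comb


def ka_ks(ka: float, ks: float) -> float:
    """Ratio of nonsynonymous to synonymous substitution rates."""
    if ks == 0:
        raise ValueError("Ks is zero; the ratio is undefined")
    return ka / ks


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows: nonsynonymous, synonymous. Columns: changed, unchanged sites.
    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    return sum(p for p in (prob(x) for x in range(low, high + 1))
               if p <= observed * (1 + 1e-9))


# Hypothetical counts for one pairwise comparison of a single domain.
nonsyn_changed, nonsyn_unchanged = 12, 88
syn_changed, syn_unchanged = 10, 20

p_value = fisher_exact_two_sided(nonsyn_changed, nonsyn_unchanged,
                                 syn_changed, syn_unchanged)
ratio = ka_ks(nonsyn_changed / (nonsyn_changed + nonsyn_unchanged),
              syn_changed / (syn_changed + syn_unchanged))
print(f"Ka/Ks = {ratio:.2f}, p = {p_value:.4f}")
```

The raw proportions stand in for Ka and Ks only for illustration; programs such as Diverge apply corrections for multiple substitutions that this sketch omits.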
Sequence analyses of other R-gene clusters have suggested that the solvent-exposed amino acids of the β-strand/β-turn motif of the LRR are involved in determining pathogen specificity. The 10 LRRs of genes 1, 3, 4, 5, 7, 8, 10, and 11 and the 7 LRRs of gene 12 were analyzed by separating the β-strand/β-turn motif from the remainder of the LRR. The Ka/Ks average of all pairwise comparisons in the β-strand/β-turn was 1.93 while the remainder of the LRR had a Ka/Ks value of 0.724. Within the β-strand/β-turn domain, many of the pairwise comparisons had Ka/Ks ratios significantly greater than one, suggesting divergent evolution (Table 2). Amino acid alignment of the LRRs demonstrates the hypervariability of the β-strand/β-turn motif specifically in the fifth, sixth, and seventh LRRs (Fig 4).
The alignment of the 16 R-genes was used to construct phylogenetic trees of the IR and the TIR, NBD, and LRR regions within R-genes (Fig 5). For genes 6, 9, 13, and 14, only the R-gene portions of the genes were included. Phylogenetic analyses were performed using the heuristic tree search and parsimony options in PAUPsearch and PAUPdisplay. Genes in close proximity on the BAC do not appear to cluster within or between phylogenetic trees. The only exceptions are genes 13 and 14, which are grouped together. Additionally, bootstrap analysis rarely supports branches within phylogenetic trees. This suggests that domain shuffling may have occurred between the genes.
Assessing significant phylogenetic relationships among R-genes in this cluster has been difficult due to the high nucleotide homology shared among genes, the number of R-genes present, and the duplication of the genes themselves. To further understand the relationships among genes, we examined the sequences 2000 bp upstream and downstream from the start and stop codons of all 16 R-genes (Fig 6 and Fig 7). This revealed a significant amount of information on the origin of the genes within the cluster. Subsets of restriction sites were used to demonstrate regions of sequence similarity. Immediately evident in our analyses were two large duplications involving R-genes on BAC 91F11 (Fig 6). The first duplication is 5419 bases in length and results in the duplication of R-genes 6 and 9 (Fig 6A). The second duplication involves the regions surrounding R-genes 13 and 14 (Fig 6B). This 5531-base duplication includes 1942 bases upstream of the start site, the R-genes themselves, and 1001 bases downstream of the stop codon.
At their 5' end, all of the R-genes share 75 to 82 bases of nucleotide similarity immediately preceding the start site. Upstream of this point, however, the genes fall into three distinguishable classes (Fig 7A). Fragment A and R-genes 1, 3, 4, 7, 12, and 13 share no additional DNA similarity with the other R-genes. R-genes 8, 10, 11, and 14 make up a second group. R-genes 5, 6, and 9 are related to this group for the first 1000 bases prior to the start site, but then fall into a separate group. Comparing the groups with their physical position on the BAC reveals no clear pattern.
Analysis of the 2000 bp 3' to the stop codon also revealed three distinct groups (Fig 7B). R-genes 6 and 9 are similar to each other, but share no similarity to the rest of the R-genes. R-genes 13 and 14 are also similar to each other but show no similarity with the rest of the genes. This is not surprising given their unique fused structure. The remaining R-genes fall into one major group that can also be subdivided. Again, these sequence-based groupings do not correspond to physical positions within the BAC.
In many cases, the alignments reveal evidence of recombination. Looking at the 5' upstream sequence, R-gene 7 most resembles R-gene 12. However, looking at the 3' downstream sequence, R-genes 7 and 12 fall within different groups. The 5' sequences of R-genes 10 and 11 are most similar; yet the 3' sequence reveals they belong in different groups as well. Perhaps the most striking example is the organization of R-genes 13 and 14. Gene 14 appears to be composed of the 5' sequences from both main groups and includes R-gene fragment B.
Gene-specific RT-PCR analyses of BAC 91F11 R-genes:
We designed gene-specific primer pairs for 12 genes, each of which differentiated its target from all other R-genes on BAC 91F11 (Table 1). We were unable to design primers that would differentiate the R-gene portions of genes 6 and 9 and R-gene fragments A and B because of their small size and high nucleotide identity. Controls verified that the gene-specific primers differentiated between subclones representing all 16 genes and that the primers were specific to this R-gene cluster on linkage group J.
On the basis of these results, we were able to perform RT-PCR to monitor the expression of 12 of the genes on BAC 91F11. Libraries were made from six different soybean tissues. The tissues chosen represented different organs of the soybean as well as different developmental stages. Negative controls confirmed that the mRNA samples were free of genomic DNA contamination.
RT-PCR results demonstrate that six of the BAC 91F11 R-genes are expressed (Fig 8B and Fig 8C). Genes 2, 10, and 14 appeared to be constitutively expressed in all of the tissue samples. Gene 2 corresponds to cDNA MG13.
The BLASTN algorithm was used to search for sequences with similarity to expressed soybean genes from dbEST (GenBank; March, 2002). We considered only sequences with fewer than two base differences relative to the BAC sequence. Twenty different ESTs were identified from a variety of plant tissues and environmental conditions (Fig 1). ESTs were identified for R-genes 2, 4, 6, 7, 8, 9, and 12. The corresponding EST libraries included a variety of tissues and developmental stages as well as different pathogen-infected tissues.
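The EST filtering rule stated above (at most one base difference from the BAC sequence) is easy to express as a post-processing step on search hits. The sketch below assumes the hits have already been parsed into simple records; the record fields and the EST identifiers are hypothetical and do not correspond to real database entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EstHit:
    est_id: str       # hypothetical identifier, not a real accession
    gene: str         # BAC 91F11 R-gene the EST aligned to
    differences: int  # base differences relative to the BAC sequence


def near_identical(hits: list, max_differences: int = 1) -> list:
    """Keep hits with no more than `max_differences` base differences."""
    return [hit for hit in hits if hit.differences <= max_differences]


# Hypothetical parsed hits.
toy_hits = [
    EstHit("EST_A", "gene 2", 0),
    EstHit("EST_B", "gene 4", 1),
    EstHit("EST_C", "gene 4", 2),
    EstHit("EST_D", "gene 12", 5),
]

kept = near_identical(toy_hits)
genes_with_est_support = sorted({hit.gene for hit in kept})
```

A strict threshold like this trades sensitivity for confidence: with paralogs sharing up to 99% identity, a looser cutoff could assign an EST to the wrong member of the cluster.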
| DISCUSSION |
|---|
R-gene structure and evolution:
BAC 91F11 contains 16 different R-gene sequences with homology to the TIR/NBD/LRR family of disease resistance genes. Analysis of the synonymous and nonsynonymous amino acid substitution rates between the genes has revealed several interesting features. First, the Ka/Ks ratios in the IR, TIR, and NBD regions are <1.0. This suggests that these regions are under purifying selection. Second, the solvent-exposed β-strand/β-turn domain of the LRR has an average Ka/Ks value of 1.93, while the other regions of the LRR have an average Ka/Ks value of 0.724. These values would suggest that the β-strand/β-turn motif of the LRR is undergoing divergent selection.
Previously, we used PCR amplification of R-gene-like sequences in the BAC 91F11 cluster to estimate amino acid substitution rates in this region. The results indicated that the TIRs of R-genes in this cluster were under divergent selection.
As cloned R-gene sequences have been analyzed, intragenic unequal crossing over has been shown to play an important role in changing pathogen specificity. Decreases or increases in LRR number are associated with altered specificity in the L and M loci in flax.
Analyses of the sequences 2000 bases upstream and downstream of the R-genes suggest that recombination is important in generating R-gene diversity. Upstream of the translational start site, the 16 R-genes share only 80 bases of sequence similarity. Beyond this point, the genes break into two distinct groups. Three main groups could also be identified in the sequences downstream of the stop codon. While distinct groups of genes are present, they are not physically separated on the BAC; instead they appear in a completely random assortment. In addition, there is clear evidence of recombination between distinct groups.
By analyzing the fusion junctions and the adjacent sequences of genes 13 and 14, it is apparent that one of the sequences arose through direct duplication of the other. To determine if the duplication was due to the insertion of the intervening retroelement, we examined the sequences of both genes 13 and 14 and of the retroelement separating them. Genes 13 and 14 share >90% nucleotide identity and contain complete open reading frames. The LTRs of the retroelement share >99.2% nucleotide identity. The high conservation found within the LTRs suggests that the insertion of the retroelement was relatively recent and occurred after the duplication of the two genes. Therefore, the duplicated genes are probably the result of intragenic recombination. In addition, a recombination or deletion event was probably involved in fusing the TIR/NBD domains to the defense-related protein.
Recombination or deletion was also apparently involved in the formation of R-genes 6 and 9. Again, unequal recombination or deletion would be necessary to fuse the R-gene TIR to the calmodulin-like Ca2+-binding domains and then to duplicate the novel gene. Surprisingly, while the genes share >96% nucleotide identity, they are physically separated by >30 kb. This suggests that a mechanism must be present for separating newly duplicated genes. In the Mla locus of barley, R-genes have been shuffled by several rounds of duplication and inversion and by the insertion of nested transposons.
Analysis of BAC 91F11 has allowed us to build a model to describe the evolution of this cluster of genes (Fig 9). Initially, the R-genes on BAC 91F11 were most likely arranged as two separate clusters (Fig 9A). The head-to-tail orientation of all the R-genes within the BAC suggests that within the clusters, tandem duplications resulted in new R-genes.
The degree to which genes are shuffled could be dictated by the insertion and excision of retroelements. Genes 13 and 14, which are 90% identical, are obviously the result of a tandem duplication. However, unlike many of the genes in this cluster, they have not been physically separated. The only thing separating them is a retroelement (Fig 9G). In contrast, genes 6 and 9 are still 99% identical but have been separated by >30,000 bases. Analyses of the LTRs of the retroelements within this BAC suggest that retroelement insertion occurred after the duplication of the genes (Fig 9, E-I). Insertion of retroelements near R-genes may prevent the mispairing that would eventually lead to recombination and gene shuffling. In the Mla locus of barley, successive rounds of duplication have been followed by inversions and nested retrotransposon insertions.
Novel disease resistance signatures:
We have identified two R-genes (13 and 14) with high similarity to the TIR and NBD domains of disease resistance genes. In addition, these genes have a third domain with similarity to the defense-related proteins NtPRp27 (![]()
Unlike NBD/LRR genes, homologs with similarity to NtPRp27 and WCI-5 are rare.
Another interesting point to consider is the expression of genes 13 and 14. In the case of gene 14, the defense-related protein is now fused to a gene whose expression is constitutive in all of the tissues examined. NtPRp27 and WCI-5, which have high similarity to the defense-related portion of genes 13 and 14, are the only two other defense-related proteins for which expression data have been examined in detail thus far. For each of these genes pathogen-induced expression occurs 2 or 9 days, respectively, after infection.
Given that NtPRp27 is a secreted protein constitutively expressed in roots and induced by pathogen infection, ![]()
In addition to the TIR/NBD/defense-related genes, we identified a second novel class of disease resistance genes: TIR domains fused inframe to two Ca2+-binding domains with homology to the Ca2+-binding domains of soybean calmodulin ScaM-4. Ca2+ domains or EF-Hands have been identified in a large variety of proteins and are generally found as pairs.
Analyses of the alternative splice products of the tobacco mosaic virus resistance gene N suggest that the TIR/NBD domains of genes 13 and 14 and the TIR domain of genes 6 and 9 could still act in signal transduction pathways. Alternative splicing of resistance genes has been demonstrated only in the NBD/LRR family of disease resistance genes. In addition to N, the flax rust resistance genes M ...
Initially, the presence of truncated R-gene domains within the cluster suggested that these genes were no longer functional. However, the genes with truncated domains show functional similarity to the MyD88 family of genes.
In Arabidopsis, almost 20% of the identified NBD/LRR genes are missing most, if not all, of the LRR (33 truncated R-genes of 166 total NBD/LRR genes, or 19.9%).
R-gene expression:
RT-PCR was used to monitor the expression of 12 of the 16 R-genes in roots, shoots, flowers, pods, mature leaves, and young leaves. Expression was detectable for 6 of the 12 genes examined: genes 2, 10, and 14 were constitutively expressed in all tissues, whereas genes 8, 11, and 12 were differentially expressed. By searching dbEST at GenBank, we identified over 20 ESTs corresponding to BAC 91F11 R-genes. The ESTs confirmed the expression of 2 genes (4 and 7) for which RT-PCR detected no product and 2 genes (6 and 9) for which we could not design RT-PCR primers. Many of the ESTs originated from plant tissues harvested under different environmental conditions or after pathogen inoculation, which suggests that at least some of the BAC 91F11 R-genes could be induced by stress. Differences between the RT-PCR and EST results are most likely due to differences in the tissues examined.
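The expression evidence summarized above comes from two sources, RT-PCR and dbEST, and it can help to tally them together. The short sketch below does only that bookkeeping: the gene numbers and counts are those stated in the text, while the variable names are illustrative and not part of the original analysis.

```python
# Gene numbers and counts are taken from the text; names are illustrative.
TOTAL_RLGS = 16        # R-genes identified on BAC 91F11
RTPCR_EXAMINED = 12    # genes assayed by RT-PCR

constitutive = {2, 10, 14}              # expressed in all tissues examined
differential = {8, 11, 12}              # expressed in a subset of tissues
est_confirmed_rtpcr_negative = {4, 7}   # ESTs found, no RT-PCR product
est_confirmed_no_primers = {6, 9}       # ESTs found, no RT-PCR primers available

rtpcr_detected = constitutive | differential
est_only = est_confirmed_rtpcr_negative | est_confirmed_no_primers
expressed = rtpcr_detected | est_only

rtpcr_negative = RTPCR_EXAMINED - len(rtpcr_detected)
no_evidence = TOTAL_RLGS - len(expressed)

print(f"RT-PCR detected: {sorted(rtpcr_detected)}")
print(f"EST evidence only: {sorted(est_only)}")
print(f"Genes with expression evidence: {len(expressed)} of {TOTAL_RLGS}")
print(f"Genes without expression evidence in this summary: {no_evidence}")
```

Combining both sources gives expression evidence for 10 of the 16 R-genes; the remaining 6 are simply not covered by the evidence summarized here, which is not the same as a demonstration that they are silent.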
By monitoring the expression of the R-genes we hoped to identify possible candidates for adult-onset powdery mildew resistance (Rmd). In the greenhouse, the primary leaves and the first and second trifoliates are susceptible to powdery mildew infection, while subsequent trifoliates are resistant.
In this article we report the sequencing of a 118.8-kb core BAC in a soybean linkage group J contig known to span a cluster of disease resistance genes. Sequence analysis revealed several different types of sequences: 16 genes with similarity to disease resistance genes, retroelements, and domains from calmodulin genes, leucine zipper proteins, and defense-related proteins. More than half of the R-genes appear to be full length; the remainder are truncated. To date, truncated TIR/NBD genes have been described only in Arabidopsis. In contrast to those in Arabidopsis, the truncated soybean genes may represent two novel classes of disease resistance genes because the TIR domains are fused inframe to additional protein domains: TIR/NBD domains fused to a putative defense-related protein (NtPRp27-like) and TIR domains fused to calmodulin EF (Ca2+-binding) domains. Both the NtPRp27-like protein and the calmodulin EF domains are putatively involved in pathogenesis and defense responses in other species. By analyzing the R-gene sequences within this BAC we have begun to understand the mechanisms that generate novel disease resistance specificities in soybean.
| FOOTNOTES |
|---|
Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under accession no. AF541963.
1 These authors contributed equally to this work.
2 Present address: Department of Plant Biology, University of Minnesota, St. Paul, MN 55108.
| ACKNOWLEDGMENTS |
|---|
Thanks go to Dan Voytas and David Wright at Iowa State University for their help in the evaluation of the retrotransposon sequences. Thanks go to Roger Wise at Iowa State University for critical evaluation of the manuscript. Names are necessary to report factually on the available data; however, the USDA neither guarantees nor warrants the standard of the product, and the use of the name by the USDA implies no approval of the product to the exclusion of others that may also be suitable. This article is a contribution of the Corn Insect and Crop Genetics Research Unit (USDA-ARS, Midwest Area) and project no. 3236 of the Iowa Agriculture and Home Economics Experiment Station (Ames, IA).
Manuscript received June 12, 2002; Accepted for publication September 24, 2002.
| LITERATURE CITED |
|---|
AARTS, M. G., B. TE LINTEL HEKKERT, E. B. HOLUB, J. L. BEYNON, and W. J. STIEKEMA et al., 1998 Identification of R-gene homologous DNA fragments genetically linked to disease resistance loci in Arabidopsis thaliana. Mol. Plant Microbe Interact. 11:251-258.
AKIRA, S., K. TAKEDA, and T. KAISHO, 2001 Toll-like receptors: critical proteins linking innate and acquired immunity. Nat. Immunol. 2:675-680.
ALTSCHUL, S. F., T. L. MADDEN, A. A. SCHAFFER, J. ZHANG, and Z. ZHANG et al., 1997 Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.
ANDERSON, P. A., G. J. LAWRENCE, B. C. MORRISH, M. A. AYLIFFE, and E. J. FINNEGAN et al., 1997 Inactivation of the flax rust resistance gene M associated with the loss of a repeated unit within the leucine-rich repeat coding region. Plant Cell 9:641-651.
BOTELLA, M. A., J. E. PARKER, L. N. FROST, P. D. BITTNER-EDDY, and J. L. BEYNON et al., 1998 Three genes of the Arabidopsis RPP1 complex resistance locus recognize distinct Peronospora parasitica avirulence determinants. Plant Cell 10:1847-1860.
BRYAN, G. T., K. S. WU, L. FARRALL, Y. JIA, and H. P. HERSHEY et al., 2000 A single amino acid difference distinguished resistant and susceptible alleles of the rice blast resistance gene Pi-ta. Plant Cell 12:2033-2045.
CENTURY, K. S., R. A. LAGMAN, M. ADKISSON, J. MORLAN, and R. TOBIAS et al., 1999 Developmental control of Xa21-mediated disease resistance in rice. Plant J. 20:231-236.
COOLEY, M., S. PATHIRANA, H. J. WU, P. KACHROO, and D. KLESSIG, 2000 Members of the Arabidopsis HRT/RPP8 family of resistance genes confer resistance to both viral and oomycete pathogens. Plant Cell 12:663-676.

99% nucleotide similarity to BAC 91F11 sequences (March, 2002). GenBank accession numbers of the ESTs and their positions are given under the ruler in italics.






