The determinants of recognition specificity of self-incompatibility in Brassica are SRK in the stigma and SP11/SCR in the pollen, both of which are encoded in the S locus. The nucleotide sequence analyses of many SRK and SP11/SCR alleles have identified several interspecific pairs of S haplotypes having highly similar sequences between B. oleracea and B. rapa. These interspecific pairs of S haplotypes are considered to be derived from common ancestors and to have maintained the same recognition specificity after speciation. In this study, the genome structures of three interspecific pairs of S haplotypes were compared by sequencing SRK, SP11/SCR, and their flanking regions. Regions between SRK and SP11/SCR in B. oleracea were demonstrated to be much longer than those of B. rapa, and several retrotransposon-like sequences were identified in the S locus in B. oleracea. Among the seven retrotransposon-like sequences, six were found to belong to the Ty3 gypsy group. The gag sequences of the retrotransposon-like sequences were phylogenetically different from each other. In Southern blot analysis using retrotransposon-like sequences as probes, the B. oleracea genome showed more signals than the B. rapa genome did. These findings suggest a role for the S locus and genome evolution in self-incompatible plant species.
The molecular mechanism of self-incompatibility, which is a unique system of self-recognition in plants, has been intensively studied. Since self-incompatible Brassica species include important vegetables and oil crops and are suitable for genetic analysis, Brassica species are the most frequently used for the study of self-incompatibility. The self-recognition specificity of the pollen and the stigma in Brassica is controlled by a single multiallelic locus called the S locus. The determinants of the self-recognition specificity in Brassica are S receptor kinase (SRK) (Stein et al. 1991) in the stigma and S-locus protein 11/S-locus cysteine-rich protein (SP11/SCR) (Schopfer et al. 1999; Suzuki et al. 1999) in the pollen, both of which are encoded in the S locus. S-locus glycoprotein (SLG), a secreted glycoprotein similar to the extracellular domain (S domain) of SRK, was the first molecule identified as a protein encoded by a gene in the S locus (Nasrallah and Wallace 1967), but the participation of SLG in self-incompatibility is still controversial (Dixit et al. 2000; T. Suzuki et al. 2000; Takasaki et al. 2000; Silva et al. 2001; Sato et al. 2002). These three genes are located in the core region of the S locus, and SLL2 (S-locus-linked gene 2), SP2 (S-locus protein 2), SP6, and SP7 are found in the flanking regions of the S core region (Cui et al. 1999; Suzuki et al. 1999; Kimura et al. 2002; Fukai et al. 2003). The alleles of these genes are transmitted to the progeny as one set, and therefore a set of alleles of these genes is called an S haplotype. The order and orientations of the S-locus genes and the distances between these genes are highly variable within a species (Fukai et al. 2003). This structural heteromorphism of the S locus within a species would suppress recombination in this region (Cui et al. 1999).
The number of S haplotypes in Brassica oleracea is considered to be 50 (Ockendon 2000) and estimated to be >100 in Brassica rapa (Nou et al. 1993). S haplotypes have been classified into two groups, class I and class II, on the basis of the nucleotide sequence similarity of SLG alleles and SRK alleles (Nasrallah et al. 1991). The deduced amino acid sequences of SLG and the S domain of SRK have >72% identity among different S haplotypes within the classes and <70% identity between the two classes (Nishio and Kusaba 2000). In pollen, the class II S haplotypes are generally recessive to the class I S haplotypes (Nasrallah et al. 1991). In pollen-dominant/pollen-recessive heterozygous plants, the steady-state level of the pollen-recessive SP11 mRNA is drastically reduced, and therefore the dominant/recessive relationship of pollen recognition specificity is considered to be controlled at the mRNA level (Kusaba et al. 2002; Shiba et al. 2002).
The nucleotide sequence analyses of SRK and SP11 alleles of many S haplotypes in B. oleracea and B. rapa have identified several interspecific pairs of S haplotypes having highly similar SRK and SP11 sequences between these two species (Sato et al. 2002). Using interspecific hybrids between the two species, transgenic plants, and a bioassay with recombinant SP11 proteins, B. oleracea S haplotypes have been found to possess the same recognition specificity as B. rapa S haplotypes in six interspecific pairs (Kimura et al. 2002; Sato et al. 2003). These interspecific pairs of S haplotypes between B. oleracea and B. rapa are considered to be derived from common ancestors and to have maintained the same recognition specificity after speciation. On the other hand, pulsed-field gel electrophoretic analysis has suggested that the S-locus region (between SP4 and SP7) in B. oleracea is longer than that in B. rapa (G. Suzuki et al. 2000). These observations suggest that, after speciation, either every B. oleracea S haplotype expanded or every B. rapa S haplotype shrank. Since the S locus shows large structural polymorphism within a species, it is important to compare S haplotypes within the interspecific pairs to reveal the interspecific differences. Through analysis of the partial sequences of genomic clones, Kimura et al. (2002) have suggested that the S locus of B. oleracea S-7 is longer than that of B. rapa S-46. The basis of this structural difference between these S haplotypes has not been elucidated. Nor is it known whether this difference is generally seen in other S haplotypes.
The genome size of barley (5000 Mb) is much larger than that of rice (430 Mb), and the genome size of maize (2500 Mb) is also much larger than that of sorghum (750 Mb). Insertions of transposons and retrotransposons are considered to be responsible for the expansion of the genome size (Laurie and Bennett 1985; Bennett and Leitch 1995; Dubcovsky et al. 2001). Comparison of the orthologous adh region between sorghum and maize has indicated that the adh region of maize is longer than that of sorghum and that 60% of the intergenic regions in maize are occupied by retrotransposons, which are not present in the adh region of sorghum (Tikhonov et al. 1999; Ilic et al. 2003). The differences of the S-locus lengths between B. oleracea and B. rapa may be due to insertion of retrotransposons.
We previously determined the complete nucleotide sequence of the S locus of a class II S haplotype, S-60, in B. rapa (Fukai et al. 2003). In this study, we determined the S-locus sequences of two B. rapa S haplotypes and three B. oleracea S haplotypes to elucidate the nature of interspecific differences of S haplotypes in three interspecific pairs, two being class I and one being class II. All S-locus sequences of B. oleracea were found to be longer than those of B. rapa in the interspecific pairs, and more retrotransposons were detected in the B. oleracea S locus. The B. oleracea genome showed more copies of the retrotransposons than the B. rapa genome did. Evolution of the S locus and the Brassica genome is discussed here.
MATERIALS AND METHODS
Four S homozygous lines in B. oleracea L. (S-7, S-12, and S-15) and B. rapa L. (S-47) were used as plant materials for isolating the S locus. SRK and SP11 sequences of B. oleracea S-12 are similar to those of B. rapa S-47 (Sato et al. 2002). B. oleracea S-7 and S-15 were similar to B. rapa S-46 and S-60, respectively, whose S-locus sequence has been determined previously (Kimura et al. 2002; Fukai et al. 2003). Doubled haploid (DH) lines developed from cabbage Matsunami and B. rapa ssp. chinensis cv. Osome, an inbred line of Chinese kale, an F1 hybrid cultivar of broccoli Green Comet, Yellow Sarson C634, and an F1 hybrid cultivar of Chinese cabbage CR-Seiga 65 were used for Southern blot analysis with S-locus probes.
Cloning and sequencing of DNA fragments of the S-locus regions:
Genomic DNAs of the S-7, S-12, and S-15 homozygotes in B. oleracea and of the S-47 homozygote in B. rapa were partially digested by Sau3AI and ligated to the arms of a λ-phage, λFIXII (Stratagene, La Jolla, CA). Clones of the S-locus regions were isolated by plaque hybridization using specific probes. SacI, SalI, and XbaI fragments of these genomic clones were subcloned into pBluescript SK(−).
For construction of cosmid libraries, genome DNAs of B. oleracea S-12 and S-15 were partially digested by MboI and ∼40-kb DNA fragments were size fractionated using CHEF-DR II (Bio-Rad, Hercules, CA). Purified DNA was ligated to the arms of a cosmid vector, pWE15 (Stratagene), and packed in vitro using GIGAPACK II Gold (Stratagene). HindIII and EcoRI fragments of these genomic clones were subcloned into pBluescript SK(−).
The nucleotide sequences of the subclones were determined with a CEQ 2000XL DNA analyzer (Beckman Coulter), and the data were analyzed using Sequencher (Gene Codes, Ann Arbor, MI).
DNA sequence analysis was performed using the BLAST search tool (Altschul et al. 1990), GENSCAN (Burge and Karlin 1997), and GeneMark (http://opal.biology.gatech.edu/genemark/). Deduced amino acid sequences were analyzed by Pfam (http://pfam.wustl.edu/) to determine the motifs of retrotransposons. Sixty-three gag sequences were selected on the basis of the completeness of the gag region sequences from 126 sequences classified as retrotransposon gag protein in Arabidopsis thaliana registered in Pfam. Phylogenetic analysis was performed using CLUSTAL W (http://hypernig.nig.ac.jp/homology/clustalw_help.html).
Southern blot analysis:
Genomic DNA was extracted from young leaf tissue by the CTAB method (Murray and Thompson 1980). The genomic DNA (2 μg) digested with EcoRI was subjected to electrophoresis on a 1.0% agarose gel and transferred to a nylon membrane, Nytran (Schleicher & Schuell, Dassel, Germany). The membrane was hybridized with a digoxigenin-labeled probe at 65°. After hybridization, the membrane was washed twice in a solution containing 0.1× SSC and 0.1% SDS at 65° for 20 min.
RESULTS
Comparison of the genome structure of the S-locus region between B. oleracea S-7 and B. rapa S-46:
To compare the genome structure of the S-locus region between B. oleracea S-7 and B. rapa S-46, we determined the nucleotide sequences of the genomic clones of SRK, SP11, and their flanking regions. In B. rapa S-46, Kimura et al. (2002) isolated a P1-derived artificial chromosome clone, G28, which harbors SRK, SP11, and SLG. They determined the nucleotide sequence of an 11.7-kb region covering the S domain of SRK and SP11 in G28. In this study, the 9.3-kb region neighboring the 11.7-kb region was newly sequenced (accession no. AB180897) (Figure 1a). In B. oleracea S-7, three genomic phage clones, each harboring SP11, the kinase domain of SRK, and the S domain of SRK, were isolated. The lengths of the inserts were 16, 15, and 15 kb, respectively. These phage clones were found to overlap each other and the total length of a contig was 32 kb. We determined the nucleotide sequence of the 32-kb region covering SRK and SP11 (accession no. AB180898) (Figure 1a).
The orientations of the SRK and SP11 alleles in the S locus of B. oleracea S-7 and B. rapa S-46 were the same, while the distance between SRK and SP11 in B. oleracea S-7 was 16 kb longer than that in B. rapa S-46. The region between SRK and SP11 in B. oleracea S-7 can be divided into five regions, three of which are highly similar, i.e., >90% homologous, to the regions between SP11 and SRK in B. rapa S-46 (Figure 1a). Two regions, each located between two of these three regions, were named region A and region B. The sequence of region A was analyzed with BlastX, but no sequence homologous to region A was found. Southern blot analysis of the genomic DNA of B. oleracea S-7 and B. rapa S-46 using three sequences (A-1, -2, -3) in region A as probes showed signals only in B. oleracea S-7 (Figure 1b). The sequence of region B was analyzed with BlastX, and many sequences homologous to region B, such as retrotransposon-like sequences in A. thaliana, were found. Open reading frame prediction with GENEMARK and GENSCAN revealed one gene in region B that was similar to retroelement pol polyprotein-like protein (AB028613) and T32E20.9 protein (AC020646) (Table 1). We named this retrotransposon-like sequence BoSTF07a (B. oleracea S locus retrotransposon family). Southern blot analysis of the genomic DNA of B. oleracea S-7 and B. rapa S-46 using a sequence (B-1) in region B as a probe showed many signals in B. oleracea S-7 and B. rapa S-46 (Figure 1b).
Comparison of the genome structure of the S-locus region between B. oleracea S-12 and B. rapa S-47:
For the comparison of the genome structure of the S-locus region between B. oleracea S-12 and B. rapa S-47, the nucleotide sequences of genomic clones of SRK, SP11, SLG, and their flanking regions were determined. In B. rapa S-47, five genomic phage clones, each harboring SP6/SRK, SRK/SP11, SP11/SLG, SLG/SLL2, and SLL2/SP2, were isolated, and the lengths of the inserts were 18, 16, 16, 15, and 18 kb, respectively. These five phage clones overlapped each other and the total length of a contig was 54 kb. We determined the nucleotide sequence of the full length of the contig (accession no. AB180899). In B. oleracea S-12, we isolated two cosmid clones of a contig including SRK, two phage clones of a contig including SP11, and two cosmid clones of a contig including SLG. A contig covering both SRK and SP11 or both SP11 and SLG was not obtained, and therefore the 37-, 17-, and 29-kb regions, including SRK, SP11, and SLG, respectively, were sequenced (accession nos. AB180900, AB180901, AB180902) (Figure 2a).
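The contig lengths reported for the phage clones imply a certain amount of redundancy among the overlapping inserts. The following is a minimal sketch of that arithmetic, not part of the original analysis; the function name is hypothetical, and the insert lengths are the rounded kilobase values given in the text for B. oleracea S-7 and B. rapa S-47, so the results are approximate.

```python
def total_overlap_kb(insert_lengths_kb, contig_length_kb):
    """Summed insert length minus contig length.

    This is the total sequence covered more than once by
    overlapping clones (approximate, since inputs are rounded to kb).
    """
    return sum(insert_lengths_kb) - contig_length_kb


# B. oleracea S-7: three phage clones (16, 15, 15 kb) forming a 32-kb contig
s7_overlap = total_overlap_kb([16, 15, 15], 32)

# B. rapa S-47: five phage clones (18, 16, 16, 15, 18 kb) forming a 54-kb contig
s47_overlap = total_overlap_kb([18, 16, 16, 15, 18], 54)

print(f"S-7 redundancy:  ~{s7_overlap} kb")
print(f"S-47 redundancy: ~{s47_overlap} kb")
```

Under these rounded values, roughly 14 kb of the S-7 inserts and 29 kb of the S-47 inserts lie in overlaps between adjacent clones.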
The sequences of the exons and the introns of the SRK alleles except for the fifth intron were highly similar between B. oleracea S-12 and B. rapa S-47. The lengths of the fifth intron of these two were different, i.e., 115 and 68 bp, respectively, and nucleotide sequence similarity was not high; i.e., it was 75.8% identity. The lengths of the first intron of the SP11 alleles of B. oleracea S-12 and B. rapa S-47 were 2145 and 3452 bp, respectively. The first intron of the SP11 allele of B. rapa S-47 was divided into four regions: SP11inI-1, SP11inI-2, SP11inI-3, and SP11inI-4. SP11inI-1, SP11inI-2, and SP11inI-4 were highly similar to the corresponding regions of the first intron of the SP11 allele of B. oleracea S-12, each nucleotide identity being 93.4, 93.0, and 95.7%, and SP11inI-3 was not present in the first intron of SP11 of B. oleracea S-12 (Figure 2b). A sequence homologous to SP11inI-3 was not found in the DNA Data Bank.
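Percentage identities such as those quoted for the intron regions can be computed from a pairwise alignment. The sketch below illustrates one common convention and is not necessarily the procedure used in this study: it assumes that the two sequences are already aligned, and that columns containing a gap in either sequence are excluded from the comparison. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not sequences from this study)
identity = percent_identity("ACGT-ACGTA", "ACGTTACGAA")
print(f"{identity:.1f}% identity")
```

Treating gaps as mismatches instead would lower the value, so the convention should be stated whenever identities from different analyses are compared.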
High sequence similarity was also found in the promoter regions of SRK and SP11 between B. oleracea S-12 (BoS-12) and B. rapa S-47 (BrS-47). The similarity extended over a 250-bp sequence (96.4%) upstream of SP11 and over a 349-bp sequence (85.0%) upstream of SRK. No similar sequences were found farther upstream. This observation suggests that the promoter regions of SP11 and SRK are no longer than 250 and 349 bp, respectively.
In the intergenic regions, there were 13 regions that show high similarity between B. oleracea S-12 and B. rapa S-47 (Figure 2a). Because these high-similarity regions are arranged in the same order and have the same orientations between B. oleracea S-12 and B. rapa S-47, it is conceivable that the gene order and the orientations of the SRK, SP11, and SLG alleles in the S locus of B. oleracea S-12 are the same as those of B. rapa S-47. The length of the intergenic region between SP6 and SRK in B. rapa S-47 was 5 kb, but SP6 was not found in the 18-kb upstream region of SRK in B. oleracea S-12, suggesting that the SP6-SRK intergenic region of B. oleracea S-12 is >18 kb. The length of the intergenic region between SRK and SP11 in B. rapa S-47 was 11 kb, while the sum of the length of the region downstream of SRK and the length of the region upstream of SP11 in B. oleracea S-12 was 26 kb, suggesting that the length of the intergenic region between SRK and SP11 in B. oleracea S-12 is >26 kb. The length of an intergenic region between SP11 and SLG in B. rapa S-47 was 4 kb, while that in B. oleracea S-12 was inferred to be >17 (4 + 13) kb. These results indicate that the intergenic region between SRK and SP11 and the region between SP11 and SLG in B. oleracea S-12 are longer than those in B. rapa S-47. Even if the gene order and the orientations of SRK, SP11, and SLG in the S locus of B. oleracea S-12 are not the same as those of B. rapa S-47, the regions in B. oleracea S-12 are considered to be longer than those in B. rapa S-47: 19 kb for the region between SRK and SP11 and 17 kb for the region between SP11 and SLG. The intergenic region between SLG and SLL2 in B. oleracea S-12 was also 7 kb longer. However, the region between SLL2 and SP2 in B. oleracea S-12 was 11 kb shorter than that in B. rapa S-47 (Figure 2a).
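The lower bounds on the intergenic expansions in B. oleracea S-12 follow from simple subtraction, assuming the same gene order in both haplotypes. The sketch below restates that arithmetic using the rounded kilobase values from the text; the dictionary layout and variable names are illustrative only.

```python
# (B. rapa S-47 length in kb, minimum B. oleracea S-12 length in kb)
# The S-12 values are lower bounds because no single contig spans the genes.
intergenic_kb = {
    "SP6-SRK": (5, 18),
    "SRK-SP11": (11, 26),
    "SP11-SLG": (4, 17),
}

# Minimum excess length of each region in B. oleracea S-12
min_excess_kb = {
    region: bo_min - br
    for region, (br, bo_min) in intergenic_kb.items()
}

for region, excess in min_excess_kb.items():
    print(f"{region}: B. oleracea S-12 is longer by more than {excess} kb")
```

Because the B. oleracea values are minima, each difference is itself a minimum; the true expansions may be larger.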
Analysis by BlastX identified four retrotransposon-like sequences—BoSTF12a, BoSTF12b, BoSTF12c, and BoSTF12d—and one transposon-like sequence in B. oleracea S-12 (Figure 2a). The deduced amino acid sequence of the transposon-like sequence was similar to MuDR-like transposon protein (AC007288). Putative amino acid sequences of BoSTF12a, BoSTF12b, BoSTF12c, and BoSTF12d were similar to retroelement pol polyprotein-like protein (AB028613), F12K21.6 protein (AC023279), T15F17.1 protein (AF262042), and F5M15.26 protein (AC027665), respectively (Table 1).
Comparison of the genome structure of the S-locus region between B. oleracea S-15 and B. rapa S-60:
Fukai et al. (2003) have determined the nucleotide sequence of an 86.4-kb region (AB097116) that covers the SP11, SRK, and SLG genes of B. rapa S-60. It has been reported that there are two SLG genes (BoSLG-15A and BoSLG-15B) in B. oleracea S-15. The 10-kb sequence including BoSLG-15A (Y18262) (Cabrillac et al. 1999) and the 7-kb sequence including BoSLG-15B (Y18261) have been deposited in EMBL. We isolated two cosmid clones of a contig harboring SRK, one phage clone harboring SP11, and three cosmid clones of a contig harboring BoSLG-15A and BoSLG-15B in B. oleracea S-15. The genome structure of the class II S-locus region was compared in B. oleracea S-15 and B. rapa S-60. A contig covering both SP11 and SRK or both SRK and SLGA/B of B. oleracea S-15 was not obtained, and therefore the 39-kb and 17-kb regions, including SRK and SP11, respectively, were sequenced (accession nos. AB180903 and AB180904). A physical map of ∼91 kb harboring BoSLG-15A and BoSLG-15B was constructed (Figure 3a).
The sequences of the exons and of the second, fourth, and fifth introns of the SRK alleles were highly similar, >93.0%, between B. oleracea S-15 and B. rapa S-60. The lengths of their first, third, and sixth introns differed between the two haplotypes, i.e., 628 and 632 bp, 3001 and 4398 bp, and 86 and 73 bp, respectively, and nucleotide sequence similarities were low: 78.8, 70.3, and 77.9% identity. Two SP11 sequences (BoSP11-15A and BoSP11-15B), which are arranged in the same orientation, were found in B. oleracea S-15 (Figure 3b). BoSP11-15A had the same sequence as the cDNA clone that has been identified as BoSP11-15 (Sato et al. 2003). The sequences of the exon and the intron of BoSP11-15A were highly similar to those of BrSP11-60. Amino acid identity between the second exons of BoSP11-15A and BoSP11-15B was 95.5%. A part of the first exon of BoSP11-15B was found to be deleted. The amino acid identity between B. rapa SLG-60 and B. oleracea SLG-15A was 89.6% and that between B. rapa SLG-60 and B. oleracea SLG-15B was 91.7%. The duplication of SP11 in B. oleracea S-15 has recently been reported (Shiba et al. 2004). It can be inferred that BoSP11-15A and BoSLG-15B are the orthologs of BrSP11-60 and BrSLG-60, respectively.
The high sequence similarity of the promoter regions between BoS-15 and BrS-60 was disrupted upstream of −245 bp for SP11 and upstream of −213 bp for SRK (the translation initiation site is numbered 1). This observation suggests that the promoter regions of SP11 and SRK are no longer than 245 and 213 bp, respectively.
There were 17 regions that showed high similarity between B. oleracea S-15 and B. rapa S-60 (Figure 3). Because the high similarity regions are arranged in the same order and orientations between B. oleracea S-15 and B. rapa S-60, it is conceivable that the gene order and the orientations of the SRK, SP11A, and SLGB alleles in the S locus of B. oleracea S-15 are the same as those of B. rapa S-60. The length of the intergenic region between SP11 and SRK in B. rapa S-60 was 7 kb, while the sum of the lengths of the region upstream of SP11A and the region upstream of SRK in B. oleracea S-15 was 19 kb. The length of the intergenic region between SRK and SLG in B. rapa S-60 was 26 kb, while the sum of the lengths of the region downstream of BoSRK-15 and the region upstream of BoSLG-15A was ∼62 (22 + ∼40) kb. These results indicate that the intergenic region between SP11 and SRK and the region between SRK and SLG in B. oleracea S-15 are longer than those in B. rapa S-60. Analysis by BlastX identified two retrotransposon-like sequences, i.e., BoSTF15a and BrSTF60a, in B. oleracea S-15 and B. rapa S-60. Putative amino acid sequences of BoSTF15a and BrSTF60a were similar to those of F5M15.26 protein (AC027655) and retroelement pol polyprotein-like protein (AB028613), respectively (Table 1). It has been shown that there is a Melmoth sequence, which has been found in the S-locus region in B. napus (Cui et al. 1999), in SLA of the BoS-15 S locus (Y18262). A retrotransposon-like sequence, which was not included in STFs because of low similarity to the reported retrotransposon sequences, was also found in a downstream region of SP11A of BoS-15 and was named retro-1.
Analysis of the STFs:
Seven STFs and one MuDR transposon-like sequence were identified by analysis with GENSCAN, GENEMARK, and BlastX. BoSTF12c was classified as a non-long terminal repeat (LTR) retrotransposon and the other six STFs were classified as a Ty3 gypsy of the LTR type (Kumar and Bennetzen 1999). Pfam analysis was performed using deduced amino acid sequences of the seven STFs. BrSTF60a had the regions of gag, rve (integrase), rvt (reverse transcriptase), and rnaseH. BoSTF12a and BoSTF15a had gag, rve, and rvt, while BoSTF12d had gag, rve, and rnaseH. On the other hand, BoSTF07a and BoSTF12b had only gag, and BoSTF12c had only rve (Figure 4). To compare the six STFs (except for BoSTF12c), the gag sequences that were found in the six STFs were used. The gag sequences of two retrotransposons—one being Melmoth, which has been classified as LTR-type Ty1 copia (Cui et al. 1999), and the other being Athila, which has been classified as an LTR-type Ty3 gypsy and found in a centromere region in A. thaliana (Arabidopsis Genome Initiative 2000)—and 63 gag sequences were used. These gag sequences were selected on the basis of their completeness of gag region sequences from 126 gag sequences of retrotransposon-like sequences in A. thaliana registered in Pfam software. Phylogenetic analysis was performed by ClustalW. The sequences were clustered into six distinct clades designated as clusters 1, 2, 3, 4, 5, and 6. BoSTF07a, BoSTF12a, BoSTF15a, and Athila belonged to cluster 5, and BoSTF12b, BoSTF12d, and BrSTF60a belonged to cluster 6 (Figure 5). In the comparison of the gag sequences between STFs within the cluster, the highest amino acid identity was found between BoSTF12a and BoSTF15a (98.9%), and the second highest was found between BoSTF12b and BoSTF12d (88.4%). In other combinations of STFs, the amino acid identities were <57%.
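The Pfam domain content described above determines which STFs could be included in the gag-based comparison. As an illustrative sketch only, the domain assignments can be tabulated and filtered as follows; the dictionary is transcribed from the text, and the variable names are hypothetical.

```python
# Pfam domains detected in each STF, as described in the text
stf_domains = {
    "BrSTF60a": {"gag", "rve", "rvt", "rnaseH"},
    "BoSTF12a": {"gag", "rve", "rvt"},
    "BoSTF15a": {"gag", "rve", "rvt"},
    "BoSTF12d": {"gag", "rve", "rnaseH"},
    "BoSTF07a": {"gag"},
    "BoSTF12b": {"gag"},
    "BoSTF12c": {"rve"},
}

# STFs carrying a gag region, and therefore usable in the gag phylogeny
with_gag = sorted(name for name, doms in stf_domains.items() if "gag" in doms)

# STFs excluded from the gag comparison
without_gag = sorted(set(stf_domains) - set(with_gag))

print("gag present:", ", ".join(with_gag))
print("gag absent: ", ", ".join(without_gag))
```

The filter returns six STFs with a gag region and leaves out only BoSTF12c, in agreement with the set used for the phylogenetic analysis.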
STF-like sequences in the Brassica genome:
The existence of STFs and their homologs in the Brassica genome was investigated by Southern blot analysis using genomic DNAs of a doubled haploid line (S-6 homozygote) from B. oleracea ssp. capitata cv. Matsunami (Matsunami–DH) and a doubled haploid line (S-60 homozygote) from B. rapa ssp. chinensis cv. Osome (Osome–DH). The regions of gag, rve, and rvt of STFs and the MuDR region of the MuDR-like transposon were used as probes. Many signals were detected in the Matsunami–DH and Osome–DH lines by the probes of the gag regions of BoSTF07a and BoSTF12b and of the MuDR region of the MuDR-like transposon. One dense band and three thin bands were detected in Osome–DH by the gag probe and the rvt probe of BoSTF12a, respectively, while Matsunami–DH showed multiple bands with these probes. No signal was detected in Osome–DH by the rve and gag probes of BoSTF12c and BoSTF12d, while Matsunami–DH showed smeared signals with these probes. A few dense bands and many thin bands were detected in Osome–DH and many thin bands were detected in Matsunami–DH by the probes of the gag and rvt regions of BrSTF60a (Figure 6). Taken together, these findings show that Matsunami–DH in B. oleracea has more transposon-like sequences than Osome–DH in B. rapa. The Southern blot analysis of other inbred lines and F1 hybrid cultivars in B. oleracea (an inbred line of Chinese kale and Green Comet) and B. rapa (C634 and CR-Seiga 65) also showed more bands of STFs in B. oleracea plants than in B. rapa plants (data not shown). These analyses revealed that B. oleracea has more copies of retrotransposon-like sequences in the entire genome, as well as in the S locus, than B. rapa.
DISCUSSION
Structural differences of S locus in B. oleracea and B. rapa:
Comparison of the genomic sequences of SRK and SP11 in the three pairs revealed high similarity of the promoter regions and the introns, as well as the exons, of SP11 and SRK among the S haplotypes in the three interspecific pairs. Lack of similarity at the upstream regions of the promoters between BoS-15 and BrS-60 suggested that the promoter regions of SP11 and SRK are no longer than 245 and 213 bp, respectively. Corresponding to this, the 192-bp upstream region of BrSP11-9 has been reported to be sufficient for gene expression in the tapetum and pollen (Shiba et al. 2002), and boxes I–V assigned by Dzelzkalns et al. (1993) as the conserved sequences in the promoter of the genes specifically expressed in the stigma are included in the 213-bp sequence.
Although most of the intron sequences of SP11 and SRK were also conserved among the S haplotypes in the three pairs, some introns showed different length and low sequence similarity. For example, the first intron of BrSP11-47 is 1.3 kb longer than that of BoSP11-12. The first intron of BrSP11-47 can be divided into four regions, three of which have high similarity to the first intron of BoSP11-12. However, the other region, SP11inI-3, is not present in BoSP11-12. Insertions or deletions of short DNA fragments have occurred and such sequence variations have been maintained in introns but not in exons.
Comparison of the genome structures of S haplotypes of the three interspecific pairs revealed that the B. oleracea S locus was larger than the B. rapa S locus in all of them. The distance between SP11 and SRK in BoS-7 was 16 kb longer than that in BrS-46. This 16-kb region includes region A, which has no sequence homology to any sequences deposited in the DNA database, and region B, which has a retrotransposon-like sequence, BoSTF07a. In the pair of BoS-12 and BrS-47, the distances between SP6 and SRK, between SRK and SP11, and between SP11 and SLG were all shown to be longer in BoS-12 than in BrS-47. Four STFs and one MuDR-like transposon were found in the S locus of BoS-12, but not in the S locus of BrS-47. Analysis of class II S haplotypes, i.e., BoS-15 and BrS-60, also revealed that the B. oleracea S locus is longer than the B. rapa S locus. Although the length of the BoS-15 sequence determined so far, including the sequence determined in this study, was shorter (73 kb) than that of the BrS-60 sequence (86.4 kb), one STF sequence, retro-1, and Melmoth were found. These findings suggest that insertions or deletions of retrotransposon-like sequences have resulted in the differences of the lengths of the S locus between B. oleracea and B. rapa.
Evolution of the S locus:
There are two possible explanations of the cause of the differences of the S-locus lengths between B. oleracea and B. rapa. One is the elongation of the S-locus region in B. oleracea due to a high rate of retrotransposon insertion. The other is that the S-locus region in B. rapa has shrunk due to more efficient elimination of retrotransposons.
Time of retrotransposon insertion has been estimated using the nucleotide substitution rate between the sequences of the pair of LTRs of maize retrotransposons (SanMiguel et al. 1998). Since the two LTRs of a single retrotransposon are considered to be identical at the time of retrotransposon insertion into the host genome, divergences of LTRs of STFs and the introns of the SRK and SP11 alleles in the interspecific pairs were estimated as p-distances using MEGA ver.2.1 (Kumar et al. 2001). Among seven STFs identified in this study, BoSTF12c lacked LTRs, and only one side of the LTRs was sequenced in BoSTF12b and BoSTF12d, which were located at the ends of the genomic clones. Therefore, the four STFs were used for the comparison (Table 2). The LTR sequences of BoSTF15a had no nucleotide substitution. Although the LTRs of BoSTF07a had the highest p-distance (0.0192) in the four STFs, this was significantly lower than the lowest value between the three pairs of the intron sequences of SP11 and SRK, i.e., 0.0945. Under the assumption that the mutation rate in LTRs is not different from those in the introns of SP11 and SRK, it is considered that STFs have been inserted into the S locus after the speciation of B. rapa and B. oleracea.
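The p-distance used in this comparison is the proportion of compared sites at which two aligned sequences differ. The sketch below shows that calculation on toy data and then repeats the comparison of the two values reported above. It is an illustration under stated assumptions rather than a reimplementation of MEGA: gapped columns are skipped, and the function name and example sequences are hypothetical.

```python
def p_distance(aligned_a, aligned_b, gap="-"):
    """Proportion of differing sites among ungapped aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are excluded
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


# Toy LTR pair (hypothetical): one substitution in ten sites
toy = p_distance("ACGTACGTAC", "ACGTACGTAT")

# Values reported in the text
highest_ltr_p = 0.0192    # LTRs of BoSTF07a, the most diverged LTR pair
lowest_intron_p = 0.0945  # least diverged SP11/SRK intron pair

# Insertion is inferred to postdate speciation if the LTRs are less
# diverged than the introns, assuming equal mutation rates in both.
inserted_after_speciation = highest_ltr_p < lowest_intron_p
print(toy, inserted_after_speciation)
```

An LTR pair with no substitutions, as observed for BoSTF15a, gives a p-distance of 0 under this definition.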
It has been proposed that insertions of LTR retrotransposons in A. thaliana have occurred during the last four million years (Devos et al. 2002). On the other hand, elimination of inserted retrotransposons by illegitimate recombination has also been indicated (Devos et al. 2002). The difference of the size of the S locus between B. oleracea and B. rapa might be attributed to the elimination of retrotransposons. However, the relatively constant sizes of the B. rapa S locus, as indicated by PFGE analyses (G. Suzuki et al. 2000; Kimura et al. 2002), would require the independent, parallel elimination of most retrotransposons from several different S haplotypes, an unlikely occurrence. Therefore, as an explanation of the observed size differences of the S locus in these species, frequent insertion of STFs into the S locus of B. oleracea is more probable than is frequent elimination of inserted retrotransposons from the B. rapa S locus.
The fact that the B. oleracea S locus harbors more retrotransposon-like sequences than the B. rapa S locus may simply reflect a difference in retrotransposon activity between the species. Another possible explanation is a difference between the species in the genomic location of the S locus. The accumulation of larger numbers of retrotransposon-like sequences in the Petunia S locus than in the Rosaceae S locus may reflect the low rates of recombination in centromeric regions (Wang et al. 2004). However, there is no evidence so far to support a centromeric location of the S locus in the Brassica species (Iwano et al. 1998).
The differences in S-locus structure between B. oleracea and B. rapa revealed by this study are similar to the differences observed between barley and rice and between maize and sorghum (Tikhonov et al. 1999; Ilic et al. 2003). However, the genome size of B. oleracea (∼600 Mb) is not very different from that of B. rapa (∼500 Mb) (Arumuganathan and Earle 1991). That the distance between SP11 and SRK in B. oleracea is more than twice that in B. rapa may suggest an elevated rate of retrotransposon insertion into the S locus compared to the rest of the genome.
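The contrast between genome-wide and S-locus expansion can be made explicit by comparing fold differences. The following sketch uses the approximate genome sizes cited above and the rounded intergenic lengths reported for two of the interspecific pairs; it is a back-of-the-envelope illustration, and the variable names are hypothetical.

```python
# Approximate genome sizes in Mb (Arumuganathan and Earle 1991, as cited)
genome_mb = {"B. oleracea": 600, "B. rapa": 500}
genome_ratio = genome_mb["B. oleracea"] / genome_mb["B. rapa"]

# SP11-SRK intergenic lengths in kb: (B. oleracea, B. rapa)
# The S-12 value is a lower bound, so its ratio is a minimum.
sp11_srk_kb = {
    "S-15 / S-60": (19, 7),
    "S-12 / S-47": (26, 11),
}
locus_ratios = {pair: bo / br for pair, (bo, br) in sp11_srk_kb.items()}

print(f"whole genome: {genome_ratio:.2f}-fold")
for pair, ratio in locus_ratios.items():
    print(f"{pair}: {ratio:.2f}-fold")
```

With these rounded inputs the genome differs by about 1.2-fold, whereas the SP11-SRK interval differs by more than 2-fold in both pairs.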
The A. thaliana genome has a small number of retrotransposon-like sequences, which are distributed predominantly in centromeric regions. Heterochromatic regions are also rich in retrotransposons (CSHL/WUGSC/PEB Arabidopsis Sequencing Consortium 2000), and Athila and Gypsy-like have been reported to have preferentially targeted heterochromatin (Pereira 2004). These regions are known to be recombination-suppressed regions. An SRK allele and its cognate SP11 allele are thought to have been transmitted as a unit for a long time, at least since the speciation of B. rapa and B. oleracea (Sato et al. 2002). This implies that no recombination in the region between SRK and SP11 has occurred or that rare recombinants, if generated, have been selected against, although evidence of recombination within the SLG sequence in the S-locus evolution has been reported (Kusaba et al. 1997; Awadalla and Charlesworth 1999). The Brassica S locus was also found to be rich in retrotransposon-like sequences. These examples suggest that a chromosomal region with low recombination frequency can be an insertion hot spot for some types of retrotransposons. The relationship between low recombination frequency and richness of transposons has been suggested by population genetic study (Charlesworth and Langley 1986). Consistent with this, the S-locus regions of other species also often contain many retrotransposon-like sequences (Cui et al. 1999; Kusaba et al. 2001; Lai et al. 2002; Ushijima et al. 2003; Wheeler et al. 2003; Tomita et al. 2004; Wang et al. 2004). For example, the S-locus region of Arabidopsis lyrata is richer than the region outside of the S locus in retrotransposon-like sequences (Kusaba et al. 2001).
Insertion of various retrotransposons may contribute to the suppression of recombination in the S locus, particularly in B. oleracea, but not all of the heterogeneity is due to transposon insertions. If anything, the lack of recombination itself might be a crucial factor in the accumulation of heterogeneity in the Brassica S locus. Lack of recombination enhances the accumulation of heterogeneity between S haplotypes because it prevents the exchange of sequence information between them, an exchange that would otherwise reduce sequence heterogeneity. Conversely, the accumulation of heterogeneity could contribute to the suppression of recombination. Furthermore, retrotransposons may facilitate the frequent loss or gain of gene duplicates. It is possible that the high sequence diversity in the intergenic regions has been generated in such a manner in the S locus of Brassica species and other self-incompatible species.
Recombination within a complex of genes results in the breakdown of a number of reproductive traits, in addition to self-incompatibility. In heterostylous systems, for example, recombination between the genes (as yet unidentified) that determine recognition specificity of the pollen and anther height and those that determine recognition specificity and length of the pistil results in homostyly with self-compatibility (Dowrick 1956; Matsui et al. 2004). Sex in dioecious systems is also determined by a gene complex or sex chromosomes, with suppressed recombination between unidentified sex-determining genes (Negrutiu et al. 2001; Ma et al. 2004). The complex that determines mating type in Chlamydomonas carries intrachromosomal translocations, inversions, and large indels, which may account for the observed suppression of recombination (Ferris and Goodenough 1994). The Y chromosome of Cannabis sativa has been reported to carry many copies of a long interspersed nuclear element-like retrotransposon (Sakamoto et al. 2000). A genomic region implicated in the expression of apomixis in Pennisetum squamulatum has been reported to be a low-recombination region of ∼50 Mb that contains highly repetitive sequences of a retrotransposon (Akiyama et al. 2004). Among the aspects shared by the Brassica S locus and each of these systems is the control of an important aspect of the reproductive system by a genetic complex that is subject to both recombination suppression and retrotransposon insertion.
In the Southern blot analysis using the STF sequences as probes, some probes detected strong signals in both species, whereas the others detected strong signals in only one species. The sequences detected in both species are inferred to be retrotransposons that were present in the ancestral species of B. oleracea and B. rapa before speciation, and those detected in only one of the species are inferred to have been incorporated into or deleted from the genomes after speciation. For the retrotransposons detected in both species, more signals were observed in B. oleracea than in B. rapa. The presence of more retrotransposons in the B. oleracea genome than in the B. rapa genome may suggest that retrotransposons have been more active in B. oleracea than in B. rapa.
It has been reported that transposition of retrotransposons in plants is activated under stress conditions (Grandbastien 1998). It can be speculated that B. oleracea has grown under more stressful conditions than B. rapa. Consistent with this, the leaves of B. oleracea plants are generally thicker than those of B. rapa plants and are coated with a thick wax layer, suggesting that B. oleracea has a morphology adapted to drought. Many wild B. oleracea plants grow on arid rocky cliffs in the Mediterranean area.
Hypomethylation caused by the ddm1 mutation has been revealed to activate exogenous and endogenous retrotransposons in Arabidopsis (Hirochika et al. 2000), indicating the influence of epigenetics on retrotransposon activation. Participation of DNA methylation, histone H3 lysine-9 methylation, and RNA interference in retrotransposon silencing has been suggested (Lippman et al. 2004). Our preliminary investigation using Southern blot analysis with methylation-sensitive and methylation-insensitive restriction enzymes showed high methylation of STFs.
Among the STFs, BrSTF60a was not detected in B. oleracea and was found in low copy number in B. rapa. A few retrotransposons with transposing activity have been identified in plants, e.g., Tos17, Tto1, and Tnt1 (Grandbastien et al. 1989; Hirochika 1993; Hirochika et al. 1996). These retrotransposons are generally present in low copy number in plant genomes. BrSTF60a has gag, rve, rvt, and rnaseH, and its structure is well conserved as a retrotransposon. On the basis of mutation rates in LTRs, BrSTF60a and BoSTF15a were considered to have been very recently integrated into the Brassica S locus. If these STFs have transposing activity, the influence of external and internal factors such as environmental stresses and DNA methylation on their transposing activity is of great interest.
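The reasoning behind dating an insertion from its LTRs can be sketched as follows. The two LTRs of an element are identical when it integrates and then diverge independently, so the insertion age is approximately T = K / (2r), where K is the divergence between the two LTRs and r is the substitution rate per site per year. The code below is a minimal sketch of that general approach, not the procedure used in this study. The Jukes-Cantor correction, the function names, the toy sequences, and the rate of 1.5e-8 substitutions per site per year are all assumptions made for illustration.

```python
import math


def p_distance(ltr5: str, ltr3: str) -> float:
    """Proportion of differing sites between two aligned LTRs.

    Columns containing a gap ('-') in either sequence are skipped.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("LTRs must be aligned to equal length")
    pairs = [
        (a, b)
        for a, b in zip(ltr5.upper(), ltr3.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance K for an observed proportion p."""
    if p == 0:
        return 0.0
    if p >= 0.75:
        raise ValueError("p too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_age(ltr5: str, ltr3: str, rate_per_site_per_year: float) -> float:
    """Estimate insertion age in years as T = K / (2r)."""
    k = jukes_cantor(p_distance(ltr5, ltr3))
    return k / (2.0 * rate_per_site_per_year)


# Hypothetical example: a 100-bp aligned LTR pair with one mismatch.
# Both the sequences and the rate are illustrative assumptions.
ASSUMED_RATE = 1.5e-8
left_ltr = "ACGT" * 25
right_ltr = "T" + left_ltr[1:]
age_years = insertion_age(left_ltr, right_ltr, ASSUMED_RATE)
print(f"Estimated insertion age: {age_years:,.0f} years")
```

Under this model, nearly identical LTRs give a small K and therefore a recent estimated insertion, which is the sense in which BrSTF60a and BoSTF15a are described as recently integrated. The estimate scales inversely with the assumed rate, so the absolute age is only as reliable as that rate.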
We are grateful to Dave Astley, Horticulture Research International, England, for providing the S tester lines of B. oleracea; to Kokichi Hinata of Tohoku University for the B. rapa S tester lines; and to Yasuhisa Kuginuki, the National Institute of Vegetables and Tea Science, Tsu, Japan, for the DH lines of B. oleracea and B. rapa. We also thank Ryo Kimura and Shohei Takuno for technical advice. This work was supported in part by a grant-in-aid for Special Research on Priority Areas B (no. 11238202) from the Ministry of Education, Science, Sports and Culture, Tokyo.
- Received October 7, 2004.
- Accepted April 1, 2006.
- Copyright © 2006 by the Genetics Society of America