Abstract
Comparative genome studies are important contributors to our understanding of genome evolution. Most comparative genome studies in plants have been based on genetic mapping of homologous DNA loci in different genomes. Large-scale comparative physical mapping has been hindered by the lack of efficient and affordable techniques. We report here the adaptation of fluorescence in situ hybridization (FISH) techniques for comparative physical mapping between Arabidopsis thaliana and Brassica rapa. A set of six bacterial artificial chromosomes (BACs) representing a 431-kb contiguous region of chromosome 2 of A. thaliana was mapped on both chromosomes and DNA fibers of B. rapa. This DNA fragment has a single location in the A. thaliana genome, but hybridized to four to six B. rapa chromosomes, indicating multiple duplications in the B. rapa genome. The sizes of the fiber-FISH signals from the same BACs were not longer in B. rapa than those in A. thaliana, suggesting that this genomic region is duplicated but not expanded in the B. rapa genome. The comparative fiber-FISH mapping results support the view that chromosomal duplications, rather than regional expansion due to accumulation of repetitive sequences in the intergenic regions, played the major role in the evolution of the B. rapa genome.
Comparative genetic mapping by cross-hybridization of restriction fragment length polymorphism (RFLP) probes in different species has proven to be a valuable tool to examine genome architecture and trans-specific genome relationships. In many plant families, it has been demonstrated that there is considerable collinearity of genetic loci between related species (Bonierbale et al. 1988; Tanksley et al. 1988; Hulbert et al. 1990; Ahn and Tanksley 1993; Kowalski et al. 1994). The genetic location of a gene or a DNA marker in one species can be predicted on the basis of its trans-specific synteny in another species. It should be possible to map a gene in a large-genome species and clone it from a related small-genome model organism on the basis of collinearity, although such attempts have not been successful so far (Gale and Devos 1998). Genomic conservation has also given biologists new tools and perspectives on genome and crop evolution (White and Doebley 1998). Despite all of the advances in comparative genetic mapping, relatively little has been done to compare the physical properties of plant genomes.
There are three basic strategies for comparative physical mapping. One is to sequence orthologous regions of DNA from related species and to compare the sequence data (Chen et al. 1997, 1998). Although this is the most definitive analysis, it is expensive and not affordable to most laboratories. Pulsed field gel electrophoresis (PFGE) followed by gel blotting and probe hybridization is another method to physically compare the same genomic regions between different species. This method can be used to find the smallest DNA fragments that hybridize to the same set of probes in related species (Sadowski et al. 1996). A weakness of this method is that its resolution can be limited significantly by the unknown physical distances between the probes and the proximity of recognition sites for rare-cutting restriction enzymes. The third method is fluorescence in situ hybridization (FISH) mapping in related species using the same DNA probes. This method has proven to be a very powerful technique to study chromosomal evolution in mammalian species. DNA derived from a single chromosome from one mammalian species can be labeled to paint chromosomes from another species. The rearrangement and modification of a particular chromosome among related mammalian species can be determined efficiently (Wienberg et al. 1990; Scherthan et al. 1994). In plants, large genomic DNA clones from one species were mapped by FISH on chromosomes from related species (Fuchs et al. 1996; Zwick et al. 1998). This comparative FISH mapping strategy, however, generates mapping data of limited resolution, because DNA probes separated by several megabases (Mb) may not be resolved on condensed metaphase chromosomes.
Two of the model plant species, Arabidopsis thaliana and rice (Oryza sativa), are ideal candidates for comparative physical mapping because both genomes will be completely sequenced. For A. thaliana, there is already a plethora of information about its genome in the form of sequence data and sequencing-ready bacterial artificial chromosome (BAC) contigs (Lin et al. 1999; Marra et al. 1999; Mayer et al. 1999; Mozo et al. 1999). Brassica is the genus most closely related to A. thaliana that has economic value. Three diploid Brassica species (Brassica nigra, 2n = 16; B. oleracea, 2n = 18; and B. rapa, 2n = 20) and three polyploids, which are derived from the three diploids, are important vegetable crops. All of the genetic mapping studies indicate that the three diploid Brassica species are actually ancient polyploids, with nearly one-half of the RFLP loci being present in more than one copy (Quiros et al. 1994). Because of the genomic complexity, polyploidy, and both partial and complete duplications, the Brassica species are difficult targets for genome studies. Most of the previous comparative studies between Arabidopsis and the Brassica genomes were genetic comparisons of segregating RFLP markers (Kowalski et al. 1994; Lagercrantz et al. 1996; Osborn et al. 1997; Lagercrantz 1998), although limited comparative physical mapping results using PFGE analysis are also available (Sadowski et al. 1996; Sadowski and Quiros 1998).
We report here the physical comparison of a 431-kb BAC contig from A. thaliana chromosome 2 on both chromosomes and DNA fibers of B. rapa. We demonstrate that the comparative fiber-FISH mapping strategy is an efficient technique for large-scale comparative physical mapping studies. We also show that duplications played a major role in the expansion of the B. rapa genome relative to the Arabidopsis genome.
MATERIALS AND METHODS
We used a 431-kb Arabidopsis BAC contig consisting of six BAC clones (T07M07, T04M15, T02P04, T07D17, T20B05, and T03K09) for comparative physical mapping. This contig is derived from the 79-cM region of chromosome 2 and is located on the long arm ∼2 Mb from the telomere and 14 Mb from the centromere (Wang et al. 1997; http://www.tigr.org/tdb/at/atgenome/chr.II.status.html). All of these BACs, except T04M15, have been sequenced (Lin et al. 1999). BAC T03G21 was sequenced instead of BAC T04M15 as part of this contig. Insert sizes and sequence information on these BAC clones are summarized in Table 1.
Ecotype “Columbia” of A. thaliana and line R500 (TO980) of B. rapa (provided by Dr. T. C. Osborn, University of Wisconsin-Madison) were used for chromosome and DNA fiber preparations. Seeds were germinated on moist filter paper and allowed to grow at room temperature until ∼2 cm in length. Root tips were excised and fixed in 3:1 ethanol to glacial acetic acid for a few days before being stained in acetocarmine and squashed. Preparations with good chromosome morphology were immediately placed in a –80° freezer until used for FISH.
Genomic DNA fibers were prepared following Fransz et al. (1996). BAC DNA was isolated from 200-ml cultures grown overnight using an alkaline lysis method (Sambrook et al. 1989). The DNA was labeled with either digoxigenin-11-UTP (Boehringer Mannheim, Indianapolis) or biotin-16-UTP (Boehringer Mannheim) using standard nick translation protocols. Preparation of probe for FISH followed Jiang et al. (1996) with only minor modifications. We applied 30% formamide, rather than 50%, in the hybridization mixture when probing Arabidopsis BACs on B. rapa DNA fibers. Each FISH experiment was internally replicated on two different slides and each experiment was replicated twice to ensure the reliability of the data. The experiment was replicated again if further data collection was necessary.
Images were captured digitally using a Photometrics (Tucson, AZ) SenSys charge coupled device (CCD) camera attached to an Olympus (Lake Success, NY) BX60 epifluorescence microscope using a 60× PlanApo objective. The CCD camera was controlled using IPLab spectrum v3.1 software on a Macintosh computer. Gray scale images were captured for each color channel and then merged. Measurements were made on the digital images within IPLab spectrum software and final image adjustments were done with Adobe Photoshop v5.1. Numerical data were analyzed in Microsoft Excel (98) using the data analysis package.
RESULTS
The six BAC clones were combined and labeled as a single probe to hybridize to the metaphase chromosomes of A. thaliana and B. rapa. FISH signals were detected only on the distal region on the long arm of A. thaliana chromosome 2 (Figure 1A, a), confirming a single location of this DNA fragment in the A. thaliana genome. This probe, however, generated FISH signals on four to six B. rapa chromosomes in different metaphase cells (Figure 1A, b). Most of the signal foci were distally located on metaphase chromosomes, which is consistent with the chromosomal locations of these clones in Arabidopsis. The FISH analysis revealed multiple copies of this DNA fragment in the B. rapa genome, although the exact number of duplications cannot be determined because of the lack of consistency of the number of unambiguous FISH signals in different cells. Relatively faint and small FISH signals were often observed on many other chromosomes. Such signals may represent partial duplications or be derived from repetitive DNA elements.
We previously demonstrated that the insert sizes of BAC clones can be determined by converting the length (micrometers) of fiber-FISH signals into kilobases (Jackson et al. 1998, 1999). The six BACs were measured on A. thaliana DNA fibers and the standard deviations of the measurements were collected (Table 1). When the six BACs were labeled alternately with red or green colors and hybridized to the A. thaliana DNA fibers, the sizes and overlapping of the red/green alternate signals closely matched the sequencing data (Figure 1B).
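The length-to-size conversion described above can be sketched as a simple scaling. This is a minimal illustration, not the authors' procedure: the stretching factor of 3.0 kb per micrometer is an assumed placeholder taken from within the 2.8 to 3.3 kb per micrometer range quoted in the Discussion, and the signal lengths are hypothetical values, not measurements from this study.

```python
# Assumed stretching factor (kb of DNA per micrometer of extended fiber).
# 3.0 is a placeholder within the 2.8-3.3 kb/um range cited in the paper;
# it is NOT a calibration value reported by the authors.
KB_PER_MICROMETER = 3.0


def signal_length_to_kb(length_um, kb_per_um=KB_PER_MICROMETER):
    """Convert a fiber-FISH signal length in micrometers to kilobases."""
    if length_um < 0:
        raise ValueError("signal length must be non-negative")
    return length_um * kb_per_um


def mean_and_sd(values):
    """Return the mean and the sample standard deviation (n - 1 denominator)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance ** 0.5


# Hypothetical signal lengths in micrometers (illustrative only).
lengths_um = [28.0, 30.0, 32.0]
sizes_kb = [signal_length_to_kb(x) for x in lengths_um]
mean_kb, sd_kb = mean_and_sd(sizes_kb)
print(f"estimated size: {mean_kb:.1f} +/- {sd_kb:.1f} kb")
```

In practice the factor would be calibrated against clones of known insert size, so the constant above should be read as a stand-in for that calibration.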
To determine the corresponding sizes of the Arabidopsis BACs in the B. rapa genome, we hybridized the BAC clones on DNA fibers prepared from B. rapa. The following strategy was used to avoid collecting signals derived from broken fibers: three adjacent BACs labeled red/green/red or green/red/green were hybridized to B. rapa fibers. Only the signals derived from the middle BAC, which was flanked on both sides with differently colored signals, were collected and analyzed. We did not collect data from the flanking BACs as it was impossible to determine if they were fragmented. Data were collected for the four internal BACs (T04M15, T02P04, T07D17, and T20B05) from this contig using this method.
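The selection rule above (keep a signal only when it is flanked on both sides by differently colored signals) can be expressed as a small filter. The sketch below is an assumption-laden illustration: the `Segment` record, the representation of a fiber as an ordered list of colored segments, and the example lengths are all hypothetical, and overlapping (yellow) regions are ignored for simplicity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One contiguous colored signal on a fiber (hypothetical record type)."""
    color: str        # "red" or "green"
    length_um: float  # measured signal length in micrometers


def middle_bac_lengths(fiber: List[Segment], middle_color: str) -> List[float]:
    """Return lengths of middle-colored segments flanked on BOTH sides
    by segments of a different color; end segments are never kept because
    they may come from a broken fiber."""
    kept = []
    for i in range(1, len(fiber) - 1):
        seg, left, right = fiber[i], fiber[i - 1], fiber[i + 1]
        if (seg.color == middle_color
                and left.color != middle_color
                and right.color != middle_color):
            kept.append(seg.length_um)
    return kept


# Hypothetical fibers probed green/red/green (illustrative values only).
intact = [Segment("green", 25.0), Segment("red", 30.0), Segment("green", 27.0)]
broken = [Segment("red", 12.0), Segment("green", 26.0)]

print(middle_bac_lengths(intact, "red"))  # the flanked red signal is kept
print(middle_bac_lengths(broken, "red"))  # the red signal at a fiber end is dropped
```

Discarding end segments trades sample size for confidence that each retained measurement spans the whole hybridizing region.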
Figure 1. Comparative FISH mapping of a 431-kb A. thaliana BAC contig in B. rapa. (A) FISH analysis of the entire contig on the metaphase chromosomes of (a) A. thaliana (ecotype “Columbia”) and (b) B. rapa (line TO980). FISH signals were detected on a single pair of A. thaliana chromosomes. Major FISH signals were observed from four to six B. rapa chromosomes. Bars, 5 μm. (B) Fiber-FISH of the six-BAC contig in A. thaliana and a diagram illustrating the sequencing and BAC fingerprinting data of the same BAC clones. The six BACs were detected with alternating green-red colors in fiber-FISH. Yellow fluorescence spots resulted from the overlapping of green and red signals. The yellow bars in the diagram represent the overlapping regions of adjacent BACs. (C) Fiber-FISH mapping of four BAC clones in B. rapa. (a) BAC T04M15 (red) flanked by T07M07 and T02P04; (b) BAC T02P04 (green) flanked by T04M15 and T07D17; (c) BAC T07D17 (red) flanked by T02P04 and T20B05; (d) BAC T20B05 (green) flanked by T07D17 and T03K09. Bars, 20 μm.
The measurements of fiber-FISH signals from B. rapa showed that only BAC T02P04 was longer in size (104.2 ± 36.2 kb) than the actual BAC insert size (85 kb) (Table 1 and Figure 1C, b). The average standard deviation (SD) of a 100-kb BAC on A. thaliana DNA fibers is ∼±10 kb (Table 1). The SD of this BAC in B. rapa was substantially larger than that in A. thaliana. Thus, the collected signals may represent different loci in the B. rapa genome having different physical sizes. The averages of the measurements from the other three BACs (T04M15, T07D17, and T20B05) were shorter than the actual sizes of the BAC inserts (Table 1). The measurements from T04M15 also had a large SD (91.8 ± 39.3 kb). The SD of T20B05 was similar to that in A. thaliana, indicating that the duplicated loci corresponding to this BAC may have relatively similar physical sizes in B. rapa. The t-tests comparing the mean fiber-FISH sizes measured in B. rapa and A. thaliana showed that all of the BAC sizes in B. rapa were significantly shorter than they were in A. thaliana, except for T02P04, which was significantly longer (Table 1).
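A two-sample comparison of this kind can be sketched from summary statistics alone. The paper does not state which form of t-test was used, so the unequal-variance (Welch) form below is an assumption. In the example, 104.2 ± 36.2 kb is the B. rapa value for T02P04 given in the text; the A. thaliana mean (85 kb, the insert size used as a stand-in), its SD (10 kb, the approximate figure quoted for a 100-kb BAC), and both sample sizes (30) are hypothetical placeholders, so the printed statistic says nothing about the actual result in Table 1.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from the means, sample SDs, and sizes of two groups."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# B. rapa T02P04: mean and SD from the text; n = 30 is HYPOTHETICAL.
# A. thaliana: 85 kb and SD 10 kb are stand-ins; n = 30 is HYPOTHETICAL.
t_stat, dof = welch_t(104.2, 36.2, 30, 85.0, 10.0, 30)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

The Welch form is chosen here only because the two species show very different SDs for this BAC; a pooled-variance test would be the alternative if the variances were assumed equal.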
The overlapping of fiber-FISH signals between adjacent BACs on DNA fibers of B. rapa was in accordance with sequencing data from A. thaliana. BACs T07D17 and T20B05 have little overlap with adjacent BACs based on sequencing data, and fiber-FISH of these two BACs in B. rapa showed similar results (Figure 1C, c and C, d). BAC T04M15 has 20–30-kb overlap with both T07M07 and T02P04 based on fiber-FISH in A. thaliana (Figure 1B) and BAC fingerprinting data (M. L. Wang and H. M. Goodman, unpublished data). In B. rapa, however, T04M15 overlapped completely with the flanking BACs (Figure 1C, a), indicating that deletions may have occurred in this region after the divergence between B. rapa and A. thaliana. The amount of overlap of T02P04 with its flanking BACs was inconsistent in B. rapa, ranging from little in some signals to complete overlap in others (Figure 1C, b). This result suggests that the duplicated loci corresponding to T02P04 are heterogeneous in the B. rapa genome, probably also resulting in the large SD of the fiber-FISH measurements.
Table 1. Sequencing data of the 431-kb Arabidopsis BAC contig and fiber-FISH data from A. thaliana and B. rapa
DISCUSSION
Most of the previous comparative mapping studies between A. thaliana and the Brassica species have been based on genetic analysis of polymorphic DNA markers. The genetic mapping comparisons between Arabidopsis and various Brassica species have shown collinearity in defined regions with duplications of these regions in several diploid Brassica species (B. nigra, Lagercrantz et al. 1996; Lagercrantz 1998; Sadowski and Quiros 1998; B. oleracea, Kowalski et al. 1994; B. rapa, Osborn et al. 1997) and the amphidiploids derived from these diploid species (B. napus, Osborn et al. 1997; Scheffler et al. 1997; Cavell et al. 1998). The first comparison, with B. oleracea, showed 11 regions of conserved organization (Kowalski et al. 1994). These regions, ranging from 3.7 to 49.6 cM in A. thaliana, span 25% of the A. thaliana genome and 30% of the B. oleracea genome. This study also suggested triplication of at least part of the B. oleracea genome. All other similar comparative genetic mapping revealed both extensive collinearity between A. thaliana and Brassica species and duplication/triplication of collinear regions within the diploid Brassica species.
There are two major hypotheses regarding the evolution of the Brassica genomes based on previously reported comparative genetic/physical mapping results. The first hypothesis is that the diploid Brassica species were derived from hexaploids (“triplication hypothesis”; Lagercrantz and Lydiate 1996; Lagercrantz 1998). Since the sizes of the diploid Brassica genomes are approximately three times the size of the 145-Mb A. thaliana genome (Arumuganathan and Earle 1991), this hypothesis suggests that the gene spacing between Arabidopsis and the Brassicas should be relatively similar. The triplication hypothesis is supported by several fine-mapping experiments. Genetic mapping using a highly polymorphic B. nigra population revealed that the B. nigra genome contains eight distinct sets of chromosomal segments, each present in three copies, covering virtually the entire genome (Lagercrantz and Lydiate 1996). Comparative mapping of a 1.5-Mb contig from A. thaliana chromosome 5 detected three segments in the B. nigra genome: two intact chromosomal segments, each equivalent to the entire Arabidopsis contig, and the other disrupted by a single inversion (Lagercrantz et al. 1996). Cavell et al. (1998) demonstrated that the B. napus genome contains six duplicated regions, each equivalent to a 30-cM (7.5 Mb) region of A. thaliana chromosome 4, and each ∼20–30 cM in B. napus. In a similar study, Scheffler et al. (1997) reported that a 30-cM segment of A. thaliana chromosome 3 showed collinearity with six regions of the B. napus genome.
The second hypothesis is that the diploid Brassica species were derived from amphidiploids with part of the genome duplicated again after the amphidiploidization (“amphidiploidy hypothesis”; Sadowski and Quiros 1998). The partial duplication after amphidiploidization resulted in the regional “triplication” revealed by the comparative Arabidopsis/Brassica RFLP mapping studies. Since the sizes of the diploid Brassica genomes are larger than what can be explained by the amphidiploidy hypothesis, one would assume that at least part of the genome size increase is due to the accumulation of intergenic DNA, similar to what has been observed in the sh2-a1 region of grass species (Chen et al. 1997, 1998). This hypothesis is supported by the fact that only duplication, not triplication, was detected for certain loci in the Brassica genomes, although this could also be explained by the failure of detection of triplication because of the lack of polymorphism of the DNA probes analyzed. Physical mapping of a 15-kb region on chromosome 3 of A. thaliana in diploid Brassica species by PFGE suggested that at least some of the corresponding chromosomal segments in the Brassica genomes seem to be substantially larger than 15 kb (Sadowski et al. 1996). A similar PFGE study of a 30-kb region of A. thaliana chromosome 4 also suggested a significant expansion of the corresponding region in the B. nigra genome (Sadowski and Quiros 1998).
We used an Arabidopsis BAC contig covering 431 kb of DNA for comparative FISH mapping. Based on gene prediction software, there are 8 genes of known function and identity and 112 predicted or hypothesized genes in this contig (TIGR, http://www.tigr.org/tdb/at/atgenome/chr.II.status.html). Thus this region is rich in genes with an average density of 1 every 3.6 kb of DNA. The sequencing data suggested that this region is unique and not duplicated in the Arabidopsis genome (Lin et al. 1999). FISH analysis confirmed that the six BAC clones hybridized to a single location in A. thaliana. However, we observed FISH signals of the same BAC clones on four to six B. rapa chromosomes, indicating that this DNA fragment has multiple duplications in the B. rapa genome. One or some of the multiplied loci may have undergone significant rearrangements after the original duplication, resulting in the inconsistent number of signal foci in different cells. Fiber-FISH analysis showed that three out of the four A. thaliana BAC clones detected shorter DNA fragments in B. rapa as compared to A. thaliana. The fiber-FISH measurements suggest that this particular chromosomal region is not significantly expanded in B. rapa as compared to A. thaliana. The FISH signals on B. rapa DNA fibers resembled the typical “beads-on-a-string” fiber-FISH pattern, similar to those generated on A. thaliana DNA fibers. If a large amount of intergenic DNA sequences had accumulated in this region of the B. rapa genome, we would expect large gaps within the fiber-FISH signals, which would also result in significant expansions of the signals as compared to those of A. thaliana.
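The gene-density figure quoted above follows directly from the contig size and the gene counts. The short check below uses only numbers stated in the text; the variable names are illustrative.

```python
# Values taken from the text: contig size and annotated gene counts.
CONTIG_KB = 431
KNOWN_GENES = 8
PREDICTED_GENES = 112

total_genes = KNOWN_GENES + PREDICTED_GENES
kb_per_gene = CONTIG_KB / total_genes  # average spacing between genes

print(f"{total_genes} genes, about {kb_per_gene:.1f} kb per gene")
# prints "120 genes, about 3.6 kb per gene"
```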
Duplications and accumulation of nontranscribed intergenic repetitive DNA sequences are two important mechanisms for the expansion of higher eukaryotic genomes. In maize it is well documented that accumulation of retroelements played a significant role in genome expansion (SanMiguel et al. 1996; Chen et al. 1997, 1998; SanMiguel and Bennetzen 1998). The sh2 and a1 genes are separated by ∼20 kb in rice and sorghum. However, the physical distance between these two genes in maize is ∼140 kb, largely due to accumulation of retroelements in this region. In the present study, we demonstrated that an A. thaliana BAC contig, including four BAC clones and ∼300 kb DNA, showed similar sizes on DNA fibers from A. thaliana and B. rapa. The possibility of different condensation of the DNA fibers from the two species can be eliminated because DNA fibers prepared from nuclei of various eukaryotic species, including humans, have a similar extension degree that ranged from 2.8 to 3.3 kb per micrometer (reviewed by Jackson et al. 1998). The comparative fiber-FISH results show that this region is not expanded in the B. rapa genome as compared to the A. thaliana genome. Substantial expansion of the diploid Brassica genomes as compared to the A. thaliana genome was suggested on the basis of PFGE analyses (Sadowski et al. 1996; Sadowski and Quiros 1998). The different conclusions based on the PFGE data and the fiber-FISH results may be a reflection of the evolution of different parts of the Brassica genomes. Additional comparative physical mapping using clones from different parts of the A. thaliana genome will solve this puzzle. It would be particularly interesting to know if certain regions of the Brassica genomes, such as the centromeric regions or other heterochromatic regions, have been significantly expanded in size due to accumulation of repetitive DNA sequences.
Comparative physical mapping in phylogenetically related species is an important approach for genome analysis. We have demonstrated that fiber-FISH analysis of the same sets of large genomic DNA clones provides a fast and affordable method to compare the physical sizes of specific genomic regions in related species. Measurements of FISH signals from B. rapa DNA fibers using four contiguous A. thaliana BAC clones, covering 300 kb DNA, showed that the corresponding genomic regions in B. rapa are not significantly expanded as compared to the same region in A. thaliana. The data reported here support the hypothesis that the increase in genome size in B. rapa relative to A. thaliana is a result of chromosomal duplications and not an accumulation of intergenic DNA sequences.
Acknowledgments
This research is supported by funds 135-0528 and 135-0505 from the Graduate School of the University of Wisconsin-Madison to J.J.
Footnotes
- Communicating editor: J. A. Birchler
- Received February 7, 2000.
- Accepted June 13, 2000.
- Copyright © 2000 by the Genetics Society of America