A core genetic map of the legume Medicago truncatula has been established by analyzing the segregation of 288 sequence-characterized genetic markers in an F2 population composed of 93 individuals. These molecular markers correspond to 141 ESTs, 80 BAC end sequence tags, and 67 resistance gene analogs, covering 513 cM. In the case of EST-based markers we used an intron-targeted marker strategy with primers designed to anneal in conserved exon regions and to amplify across intron regions. Polymorphisms were significantly more frequent in intron vs. exon regions, thus providing an efficient mechanism to map transcribed genes. Genetic and cytogenetic analysis produced eight well-resolved linkage groups, which have been previously correlated with eight chromosomes by means of FISH with mapped BAC clones. We anticipated that mapping of conserved coding regions would have utility for comparative mapping among legumes; thus 60 of the EST-based primer pairs were designed to amplify orthologous sequences across a range of legume species. As an initial test of this strategy, we used primers designed against M. truncatula exon sequences to rapidly map genes in M. sativa. The resulting comparative map, which includes 68 bridging markers, indicates that the two Medicago genomes are highly similar and establishes the basis for a Medicago composite map.
The genus Medicago contains in excess of 54 characterized species (Lesins and Lesins 1979; Small and Jomphe 1988), with the majority of species being either diploid annuals or tetraploid perennials. The most important economic species of Medicago is the tetraploid perennial Medicago sativa, or alfalfa, although several annual Medicago species are of regional agricultural importance either as forage crops or for intercropping as a means to enhance soil nitrogen. M. truncatula is native to the Mediterranean basin, where the existence of numerous native populations has provided an important resource for population biology and surveys of natural phenotypic variation (Bonnin et al. 1996a,b). In addition to its native distribution, M. truncatula has been cultivated for close to 1 century in Australia, where it was developed on a limited scale as a winter forage and for use in ley rotation with wheat (Davidson and Davidson 1993). Also known by the common name “barrel medic,” M. truncatula is well suited as a crop in areas of nonacidic soils and low winter rainfall.
As a consequence of its native distribution in the Mediterranean basin and agronomic use particularly in Australia, M. truncatula has great potential for the study of both basic and applied aspects of plant biology. The natural attributes of M. truncatula that make it desirable as an experimental system include its annual habit, diploid and self-fertile nature, abundant natural variation, relatively small 500-Mbp genome, and close phylogenetic relationship to the majority of crop legume species (Barker et al. 1990; Cook 1999). Moreover, over the past decade several research groups have developed the tools and infrastructure for basic research, including efficient transformation systems (Trieu and Harrison 1996; Trinh et al. 1998; Kamaté et al. 2000), collections of induced variation (Penmetsa and Cook 2000), well-characterized cytogenetics (Cerbah et al. 1999; Kulikova et al. 2001), and a collaborative research network (http://www.medicago.org). Research efforts on M. truncatula encompass a broad range of issues in plant biology, ranging from studies of population biology (Bonnin et al. 1996a,b) and resistance gene evolution (Cannon et al. 2002; Zhu et al. 2002) to the molecular basis of symbiotic interactions (e.g., Penmetsa and Cook 1997; Catoira et al. 2000, 2001; Harrison et al. 2002; Ben Amor et al. 2003; Limpens et al. 2003; Liu et al. 2003; Mathesius et al. 2003) and micronutrient homeostasis (Nakata and McConn 2000; McConn and Nakata 2002; Ellis et al. 2003). Of importance to these hypothesis-driven investigations is the parallel development of tools for genome analysis, including a large collection of expressed sequence tags (ESTs) and an ongoing physical map and whole-genome sequencing effort, as well as corresponding activities on metabolic profiling and proteomics.
A key resource for both classical genetic and genomics efforts in M. truncatula is a genetic map composed of well-characterized molecular markers. Thoquet et al. (2002) have produced a genetic map based primarily on the analysis of anonymous sequence polymorphisms [i.e., amplified fragment length polymorphism (AFLP) and randomly amplified polymorphic DNA (RAPD) markers]. However, the increasing sequence information for M. truncatula provides an opportunity to map sequence-characterized loci. We have used such a strategy in previous studies to describe the organization and distribution of resistance gene analog sequences (Zhu et al. 2002) and as the basis for examining genome conservation between M. truncatula and Arabidopsis thaliana (Zhu et al. 2003).
The goal of the current study was to develop codominant genetic markers for the transcribed region of the M. truncatula genome. Ancillary goals included providing a community resource for genetic mapping in M. truncatula and developing a set of conserved genetic elements for comparative map analysis within the Fabaceae. We have emphasized the development of sequence-based genetic markers, as these are anticipated to have wider application among populations within a species and between related species. Toward this end, we used the extensive collection of ESTs for M. truncatula (e.g., Fedorova et al. 2002) to develop genetic markers for genes that exhibit high sequence conservation with other legumes or with Arabidopsis. In parallel to the EST approach, we used DNA hybridization and sequence information to identify and genetically map bacterial artificial chromosome (BAC) clones containing genes of special interest [e.g., M. truncatula resistance gene analogs, genes expressed during symbiosis, or homologs of mapped soybean restriction fragment length polymorphism (RFLP) clones]. The majority of the genetic markers (BAC and EST) are anchored to BAC clone contigs, providing an important opportunity to use fluorescence in situ hybridization (FISH) to resolve ambiguities in the genetic map, as well as to increase the integration of genetic, cytogenetic, and physical map data. The resulting genetic map defines eight linkage groups (LGs), corresponding to the eight cytogenetically defined M. truncatula chromosomes (Kulikova et al. 2001). To test the utility of these genetic markers for cross-species comparison, we analyzed 68 sequence-based markers in a diploid M. sativa (alfalfa) population. The results demonstrate that the two species are essentially colinear, with the exception of the notoriously variable 5S rDNA loci and two ESTs that appear to have been the focus of lineage-specific expansion or contraction.
MATERIALS AND METHODS
Identification of expressed sequence tags for genetic marker development: M. truncatula EST sequences were obtained from the National Center for Biotechnology Information (NCBI) dbEST and used to query the NCBI databases using blastx, blastn, or tblastx. M. truncatula ESTs with high similarity to genes discovered in other organisms (principally Arabidopsis and/or other legumes) were selected for further analysis. Analyses were conducted against public domain sequences available at NCBI in February 2000. In the initial attempt, we screened ∼2700 M. truncatula ESTs using blast and selected 274 ESTs as marker candidates. Oligonucleotide primers were designed from predicted exon sequences using the Lasergene PrimerSelect software package (DNAStar, Madison, WI) with the following general guidelines. In cases in which introns could be predicted by aligning an M. truncatula EST with a corresponding genomic sequence of Arabidopsis, primer pairs were designed to anneal in exon sequences and to amplify across intron regions. In cases in which an M. truncatula EST possessed similarity to sequences identified in other legumes (on the basis of blastn), sequence alignments were used to design oligonucleotide primers that would amplify DNA fragments from each of the corresponding legume genomes. The soybean database contributed most of the legume sequences for sequence comparison due to the relative abundance of soybean ESTs, and thus a majority of the EST primer pairs amplify sequences from the soybean genome (H.-K. Choi and D. Cook, unpublished results).
Identification of BAC clones for genetic marker development: RFLP probes previously mapped in crop legumes were used to identify homologous M. truncatula BAC clones on the basis of DNA hybridization. Soybean RFLP clones with high homology to genes in the NCBI database (May 1999) based on blastx were selected as probes for Southern blot analysis. High-density filters containing five times the coverage of the M. truncatula genome were obtained from the Clemson University Genome Center and hybridized with [32P]dCTP-labeled probes essentially as described by Nam et al. (1999). Putative positive clones were retrieved from the BAC library, purified, and used for DNA isolation by means of the QIAGEN (La Jolla, CA) plasmid kit according to the manufacturer's instructions. Purified BAC DNA was digested with HindIII, resolved in a 0.6% agarose gel, and used for a second round of Southern blot analysis. Hybridization patterns were used to confirm the original hybridization result and to distinguish paralogous loci on the basis of the size of the hybridizing band and the correspondence between BAC fingerprints. The resulting BAC clones were end sequenced using oligonucleotide primers that are complementary to the BAC clone polylinker: SQ-BAC-L (5′-AACGCCAGGGTTTTCCCAGTCACGACG-3′) and SQ-BAC-R (5′-ACACAGGAAACAGCTATGACCATGATTACG-3′). Twenty-microliter sequencing reactions contained 500 ng of BAC DNA, 8 μl of ABI BigDye (Perkin-Elmer, Norwalk, CT), and 5 pmol of primer. Sequencing reactions were performed with a 2-min initial denaturation step at 97°, followed by 40 cycles at 97° for 6 sec and 60° for 5 min. On the basis of BAC end sequence information, oligonucleotide primer pairs were designed to PCR amplify the corresponding genomic DNA fragment from M. truncatula mapping parents, genotypes A17 and A20.
Identification of polymorphic sequences and marker development: Parental genomic DNAs (Mt A17 and Mt A20) were amplified by the polymerase chain reaction using oligonucleotide primers designed from ESTs or BAC end sequences, as described above. Ten-microliter PCR reactions contained the following reagents: 20 ng of genomic DNA template, 1× PCR reaction buffer, 2.5 mM MgCl2, 0.25 mM of each dNTP, 5 pmol of each primer, and 0.5 unit of HotStarTaq DNA polymerase (QIAGEN). PCR thermocycling reactions were performed with a 15-min initial denaturation/activation step, followed by 35 cycles at 94° for 20 sec, 55° for 20 sec, and 72° for 2 min, with a final extension step of 5 min at 72°. PCR products were assessed by gel electrophoresis in 1% agarose and visualized by means of ethidium bromide staining. PCR reactions producing single bands were selected for sequencing using an ABI377 or ABI3730XL automated sequencer and the ABI PRISM BigDye terminator sequencing ready reaction kit (Perkin-Elmer). Sequencing reactions of 10-μl volume contained 10–50 ng of PCR amplicon, 4 μl of ABI BigDye reagent, and 5 pmol of primer. Sequencing thermocycling was performed with a 1-min initial denaturation step at 96°, followed by 35 cycles at 96° for 10 sec, 55° for 5 sec, and 60° for 4 min. DNA sequence alignments, produced with the Sequencher 3.1.1 program (Gene Codes, Ann Arbor, MI), were used to survey the parental alleles for polymorphic sites. Length and dominant polymorphisms could be assayed directly by means of agarose gel electrophoresis. Single-nucleotide polymorphisms (SNPs) were converted to cleaved amplified polymorphic sequences (CAPS) by identifying SNPs that confer differential restriction enzyme sites between the two parental alleles (Konieczny and Ausubel 1993; Hauser et al. 1998; Michaels and Amasino 1998).
In cases in which a suitable restriction enzyme site was not identified, oligonucleotide primers with a single nucleotide mismatch were designed adjacent to the polymorphic position, such that a restriction site was created in the PCR product of one parent, but not the other (so-called derived CAPS markers, or dCAPS; e.g., Neff et al. 1998).
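The CAPS conversion step described above can be illustrated with a minimal sketch. It assumes a small, hypothetical enzyme catalogue (the recognition sequences are the standard ones for these enzymes, but the study's actual enzyme panel is the one listed in Table 1) and simply reports enzymes whose number of recognition sites differs between the two parental alleles. All four sites are palindromic, so scanning one strand is sufficient; the function names are illustrative and not taken from the study.

```python
# Hypothetical mini-catalogue; the enzymes actually used are listed in Table 1.
ENZYMES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "TaqI": "TCGA",
}

def count_sites(seq, site):
    """Count occurrences (overlapping allowed) of a recognition site in seq."""
    seq = seq.upper()
    n = len(site)
    return sum(1 for i in range(len(seq) - n + 1) if seq[i:i + n] == site)

def caps_candidates(allele_a, allele_b, enzymes=ENZYMES):
    """Return {enzyme: (sites in allele_a, sites in allele_b)} for every
    enzyme that would cut the two alleles a different number of times."""
    hits = {}
    for name, site in enzymes.items():
        n_a = count_sites(allele_a, site)
        n_b = count_sites(allele_b, site)
        if n_a != n_b:
            hits[name] = (n_a, n_b)
    return hits

# Toy alleles differing by one SNP (A/G) that destroys an EcoRI site.
a17 = "ATGCGAATTCTTAGC"
a20 = "ATGCGAGTTCTTAGC"
print(caps_candidates(a17, a20))
```

When no enzyme distinguishes the alleles, the function returns an empty dictionary, which corresponds to the situation in which a dCAPS primer or direct sequencing would be needed.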
Genotyping and data analysis: Plant genomic DNA was isolated using the DNeasy plant mini kit (QIAGEN) according to protocols provided by the manufacturer. Two parental lines of M. truncatula, Jemalong A17 (the primary experimental genotype used in most investigations to date) and A20, were chosen previously (Penmetsa and Cook 2000) to facilitate genetic mapping and subsequent map-based cloning of genes defined by their mutant phenotype. The basic mapping population consisted of 93 F2 progeny derived from a cross of A17 and A20. In regions of specific interest, or where additional recombinants were desired to establish marker order, up to 120 individuals were genotyped.
For purposes of marker genotype analysis, the F2 DNAs were analyzed in parallel with three control DNAs (A17 maternal homozygous line, A20 paternal homozygous line, and heterozygous DNA) in a structured 96-well microtiter plate format. Briefly, following PCR ∼50–100 ng of product (1–2 μl) was transferred to a new 96-well plate containing 1–5 units of a predetermined restriction enzyme (Table 1) in a total volume of 8 μl. Digestion was carried out at the manufacturer-specified temperature for 2–4 hr. Cleaved DNA fragments were analyzed by agarose gel electrophoresis and genotypes were recorded as follows: homozygous maternal (A17) as “A,” homozygous paternal (A20) as “B,” heterozygous as “H,” not A as “C,” not B as “D,” and missing data as “—.”
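The scoring scheme above can be sketched as a small lookup. This is an illustrative encoding only: the function names and the boolean band calls are assumptions, and an ASCII hyphen stands in for the missing-data symbol. For a dominant marker only one parental allele is visible, so a band means "not A" (C) or "not B" (D), depending on which parent carries the visible allele.

```python
def score_codominant(a17_band, a20_band):
    """Score a codominant marker from the presence of each parental pattern."""
    if a17_band and a20_band:
        return "H"   # heterozygous: both parental patterns present
    if a17_band:
        return "A"   # homozygous maternal (A17)
    if a20_band:
        return "B"   # homozygous paternal (A20)
    return "-"       # no product: missing data

def score_dominant(band_present, visible_allele):
    """Score a dominant (presence/absence) marker.

    band_present: True, False, or None when the reaction failed.
    visible_allele: "A17" or "A20", the parent whose allele gives a band.
    """
    if band_present is None:
        return "-"
    if visible_allele == "A20":
        return "C" if band_present else "A"   # band => not A
    if visible_allele == "A17":
        return "D" if band_present else "B"   # band => not B
    raise ValueError("visible_allele must be 'A17' or 'A20'")

print(score_codominant(True, True), score_dominant(True, "A20"))
```

In practice an absent band at a dominant marker cannot be distinguished from a failed reaction without a control, which is one reason codominant markers are preferred for framework maps.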
For M. sativa, genetic marker candidates were first scored for polymorphisms in the parental plants (Mscw2 and Msq93) and their F1 progeny (F1/1). Markers that displayed easily scored polymorphisms (e.g., length variation, dominant inheritance, or heteroduplex formation) were genotyped directly by means of agarose gel electrophoresis. In cases in which alleles could not be scored directly on agarose gels, the amplification products were sequenced to identify polymorphisms and to develop CAPS markers (as described above for M. truncatula). In cases in which CAPS markers could not be developed, alleles were scored in F2 populations by direct sequencing of the PCR products. In such cases, a limited number of F2 individuals were selected to provide fine discrimination within the desired genetic interval, aided by a color-coded genotype map of the diploid alfalfa population (Kiss et al. 1998). In a typical mapping experiment, 138 M. sativa F2 individuals were analyzed. The F2 mapping population was derived from a single F1 plant (F1/1), based on a cross between the diploid yellow-flowered M. sativa ssp. quasifalcata and the diploid purple-flowered M. sativa ssp. coerulea (described by Kiss et al. 1993).
Genetic distances were calculated by the “classical” maximum-likelihood method using MAPMAKER/EXP 3.0 (Lander et al. 1987; Lincoln et al. 1992). Linkage was determined by the “Group” command set at LOD 3.5 and a distance of 40 cM based on the Kosambi mapping function. The order of the markers was determined by the “Order” command (LOD 3.0, θ = 0.40). Raw genotype data were checked using the color mapping method as described by Kiss et al. (1998). Color mapping provides a convenient means to visually inspect and curate genotypes for each individual of the population, thereby identifying potential genotyping errors and rare recombination events, and to propose linkage or nonlinkage.
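For readers unfamiliar with the Kosambi mapping function named above, the following sketch shows the standard conversion between a recombination fraction r and map distance in centimorgans, d = 25 ln[(1 + 2r)/(1 − 2r)], together with its inverse. It is a generic illustration of the textbook formula, not a reimplementation of MAPMAKER, and the function names are hypothetical.

```python
import math

def kosambi_cm(r):
    """Map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

def kosambi_r(d_cm):
    """Inverse: recombination fraction expected for a distance in cM."""
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)

# A 10% recombination fraction corresponds to slightly more than 10 cM,
# because the function corrects for undetected double crossovers.
print(round(kosambi_cm(0.10), 2))
print(round(kosambi_r(40.0), 3))
```

At small r the function is nearly linear (1% recombination is close to 1 cM), whereas at larger r the map distance grows faster than the observed recombination fraction.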
Identification of BAC clones for FISH analysis: In cases in which BAC clones were not previously identified by means of DNA hybridization, we used the polymerase chain reaction to identify candidate BAC clones. BAC DNA pools were constructed either from the 5× coverage BAC library, as described by Nam et al. (1999), or from a more recently developed 20× coverage BAC library of M. truncatula (D. Kim and D. R. Cook, personal communication). Candidate BAC clones were purified and cultured overnight on Luria broth agar medium supplemented with 30 μg/ml of chloramphenicol. The identity of BAC clones was confirmed by PCR, with amplified products assessed for size and intensity by means of gel electrophoresis in 1% agarose.
FISH with BAC clones on prometaphase and pachytene chromosomes: Anthers of M. truncatula A17 flower buds were used for producing mitotic prometaphase (tapetum) and meiotic pachytene chromosome spreads. A detailed description of the chromosome preparation procedure and FISH is provided by Kulikova et al. (2001). BAC DNA used as probes was isolated according to the alkaline lysis method and labeled with either biotin-16-dUTP or digoxigenin-11-dUTP using a nick-translation mix (Roche). In some cases, BACs were labeled with a mixture of both dUTPs (in ratio 1:1) to produce yellow FISH signals after detection. Two to five probes were used simultaneously in each hybridization, including BACs that were mapped previously (Kulikova et al. 2001) and served as landmarks for individual chromosomes.
Biotin-labeled probes were detected with avidin-Texas red and amplified with biotin-conjugated goat-antiavidin and avidin-Texas red (Vector Laboratories, Burlingame, CA). Digoxigenin-labeled probes were detected with sheep-antidigoxigenin fluorescein-5-isothiocyanate (FITC; Roche) and amplified with rabbit-anti-sheep FITC (Jackson ImmunoResearch Laboratories, West Grove, PA). Chromosomes were counterstained with 4′,6-diamidino-2-phenylindole (DAPI; 5 μg/ml) in Vectashield antifade solution (Vector Laboratories). Some chromosome preparations were reused for FISH with a new set of probes according to the method of Heslop-Harrison et al. (1992). Images were captured for each fluorescent dye separately with a cooled CCD camera system (Photometrics, Tucson, AZ) on a Zeiss Axioplan 2 fluorescence microscope, pseudocolored, and merged by means of a CytoVision workstation (Applied Imaging). To separate individual chromosomes, each chromosome was digitally excised and copied into a new image using Adobe Photoshop 6.0 (Adobe).
Development of genetic markers: With the goal of constructing a core genetic map of M. truncatula enriched with gene-based genetic markers, we focused on three distinct classes of sequences: (1) ESTs with high homology to genes known in Arabidopsis and/or other legume species, (2) M. truncatula BAC clones with high homology to mapped soybean RFLP probes, and (3) genes of predicted function. Table 1 provides a complete list of all marker information used in this study.
ESTs with similarity to Arabidopsis and legumes: To identify M. truncatula ESTs with high similarity to genes in other legumes or Arabidopsis, we used BLAST (Altschul et al. 1990) to search the NCBI nonredundant (nr) and EST (dbEST) databases for related sequences, using the following minimum criteria: tblastx against nr, <e-50; blastn against dbEST to identify ESTs from other legume species (principally the soybean EST data set), <e-45; and blastn against the Arabidopsis genome sequence, <e-30. Where possible, sequences were chosen to represent apparently low- or single-copy-number genes, using the Arabidopsis genome as a reference for gene copy number. In total, 141 EST-based genetic markers were developed on the basis of this approach.
BAC clones with homology to RFLP probes mapped in soybean, alfalfa, or pea: In addition to providing a genetic context for the analysis of M. truncatula genes, we desired to produce a framework for comparison of the genetic maps of related crop legume species. To test the feasibility of this strategy for soybean (Glycine max), a set of 256 publicly available soybean RFLP clones was purchased from BioGenetic Services and each clone was sequenced from both ends. The resulting soybean sequence information was deposited at NCBI as accession nos. AQ841751–AQ842207 and AQ842113–AQ842119. A total of 121 of the soybean RFLP clones, ∼47% of the sequenced clones, contained a putative open reading frame based on BLASTX and TBLASTX searches of the NCBI database (as of May 1999). These putative protein-coding clones were used to screen a 5× coverage version of the M. truncatula BAC library (Nam et al. 1999) on the basis of DNA hybridization. DNA was isolated from the candidate M. truncatula BAC clones, digested with restriction enzymes, and analyzed following agarose gel electrophoresis by Southern hybridization against the corresponding soybean RFLP clones. This analysis allowed us to verify the original hybridization result and to identify putatively paralogous loci on the basis of a hybridization fingerprint. In total, 79 of the 121 soybean RFLP probes analyzed in this manner hybridized strongly to the M. truncatula BAC library. Seventy-three percent of these 79 soybean RFLP probes identified only one BAC contig, which we interpret as a single locus in M. truncatula. On the basis of similar reasoning, the remaining soybean RFLP clones identified either two (13%) or three loci (14%). These results are likely to represent an underestimate of gene copy number in M. truncatula, as not all BAC clones identified by a given probe were subjected to Southern blot analysis.
The corresponding BAC clones were end sequenced and the information was used to develop 60 genetic markers. On a more limited scale, RFLP clones previously mapped in alfalfa or pea were also used to screen the M. truncatula BAC library for homologous loci and to develop genetic markers on the basis of a similar strategy. In cases in which RFLP clones from other species were used to identify M. truncatula BAC clones and derived genetic markers, their species affiliation is listed in Table 1.
Markers developed from sequences of predicted function: As a counterpart to selecting genes on the basis of BLAST analysis or DNA hybridization, genetic markers were also developed from sequences selected on the basis of their presumed function. The largest class of this marker type represents the nucleotide binding site-leucine-rich repeat superfamily of resistance gene analogs (see Zhu et al. 2002 for a comprehensive analysis). An additional 12 genes were selected for mapping on the basis of their possible role in plant-microbe interactions, including symbiotic nitrogen fixation (e.g., leghemoglobin, ENOD40, ENOD16, and rip1) and pathogenic associations (e.g., homologs of plant chitinase proteins).
Identification of polymorphisms and genotyping: Polymorphic loci were identified following PCR amplification and sequencing of alleles from M. truncatula genotypes A17 and A20, which served as parents of the mapping population used in this study (as selected by Penmetsa and Cook 2000). Seventeen length and 14 dominant polymorphisms were characterized and could be mapped by virtue of their inherent fragment size differences, or presence/absence criteria, between parental alleles. The remaining 257 polymorphisms were single-nucleotide differences between parental alleles. For the majority of SNPs, alleles were converted to CAPS markers. SNPs that could not be converted to CAPS markers were scored by direct sequencing of PCR products amplified from DNA of the segregating progeny.
In 60 EST markers, PCR primers were designed to anneal in conserved exon regions and to amplify across the more highly diverged intron regions. The closest Arabidopsis homolog was used to infer intron position and thereby aid primer design. This “intron-targeted” marker strategy assumes that polymorphisms will be more frequent in intron vs. exon regions. To test this assumption for the M. truncatula genotypes under analysis, we compiled intron and exon sequences for 47 of the intron-targeted markers. Pairwise alignments between the marker genomic sequences and the M. truncatula EST data at NCBI allowed us to distinguish exon from intron sequences and to calculate the relative divergence of each (Table 2). On the basis of this limited survey, the average intron size in M. truncatula was 161 bp, with a range of 78–747 bp, and the GT-AG rule for intron junctions was strictly conserved. As expected, polymorphisms were more frequent in intron sequences (on average, 1 SNP every 142 bp) than in the adjacent coding regions (on average, 1 SNP every 509 bp), with 80% of exon SNPs predicted to represent synonymous changes. In the case of 40 marker genes, we analyzed the correspondence between 64 empirically determined M. truncatula introns and the number and position of introns in the Arabidopsis homologs. We identified only a single discrepancy, namely a first intron in marker gene ASN2, present in Medicago but absent from the Arabidopsis homolog (At3g47340). The same first intron was present in six additional legume species (i.e., M. sativa, Pisum sativum, Phaseolus vulgaris, Vigna radiata, Lotus japonicus, and G. max) from which the ASN2 PCR product was sequenced (data not shown), indicating that the intron is ancestral to this group of Papilionoid legumes.
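The intron vs. exon comparison above reduces to counting mismatches per compared base. The sketch below shows one way this could be computed from a pairwise alignment of the two parental alleles; it assumes pre-aligned, equal-length strings with "-" for gaps, and the function names are illustrative rather than those of any tool used in the study.

```python
def snp_density(aligned_a, aligned_b):
    """Return (number of SNPs, number of compared bases) for two aligned
    sequences of equal length; columns containing a gap are skipped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    snps = compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue          # indel column: not counted as a SNP
        compared += 1
        if x != y:
            snps += 1
    return snps, compared

def bp_per_snp(snps, compared):
    """Average spacing between SNPs; None when no SNP was observed."""
    return compared / snps if snps else None

# Reported averages: 1 SNP per 142 bp (introns) vs. 1 per 509 bp (exons),
# i.e. roughly a 3.6-fold higher SNP frequency in introns.
fold = 509 / 142
print(snp_density("ACGTACGT", "ACGAACGT"), round(fold, 1))
```

Pooling counts across markers before dividing, rather than averaging per-marker ratios, avoids giving short amplicons undue weight; which of the two the reported averages reflect is not stated in the text.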
Genetic map construction: The genetic map shown in Figure 1 was derived from the analysis of 274 codominant and 14 dominant PCR-based genetic markers. In total, 93 F2 individuals from a cross between M. truncatula ecotypes A17 and A20 were genotyped. A skeleton version of this map was used previously to develop an integrated cytogenetic and genetic map of M. truncatula genotype A17 (Kulikova et al. 2001), and thus the eight genetic linkage groups correspond to the individual chromosomes, with chromosome numbering derived from the corresponding linkage groups in M. sativa (Kiss et al. 1993; Kalo et al. 2000), as determined below. By convention, the cytogenetically determined short chromosome arms define the top of each linkage group. The 288 genetic markers span 513 cM with an average distance between markers of 1.8 cM (Table 3). Although the estimated correlation between the physical and genetic distance is 970 kbp/cM, in practice this value varies according to the specific regions under analysis, with previous analyses of five distinct euchromatic chromosomal regions yielding values ranging from 200 to 1100 kbp/cM (Ané et al. 2002; Gualtieri et al. 2002; Schnabel et al. 2003).
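The summary figures in this paragraph follow from simple arithmetic, sketched below. The genome size of 500 Mbp is the approximate value quoted in the Introduction; whether the reported 970 kbp/cM was derived from exactly this number is an assumption, and the variable names are illustrative.

```python
MAP_LENGTH_CM = 513        # total genetic map length reported in the text
N_MARKERS = 288            # mapped markers
N_LINKAGE_GROUPS = 8
GENOME_KBP = 500_000       # ~500 Mbp genome (approximate, from the Introduction)

# Average spacing: map length per marker, or per marker interval
# (markers minus one per linkage group); both round to 1.8 cM.
per_marker = MAP_LENGTH_CM / N_MARKERS
per_interval = MAP_LENGTH_CM / (N_MARKERS - N_LINKAGE_GROUPS)

# Genome-wide average ratio of physical to genetic distance.
kbp_per_cm = GENOME_KBP / MAP_LENGTH_CM

print(round(per_marker, 2), round(per_interval, 2), round(kbp_per_cm))
```

The genome-wide average is only a rough guide: as the paragraph notes, local values in euchromatin range from about 200 to 1100 kbp/cM.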
A total of 177 codominant markers with complete genotype information were designated as “framework” markers (Figure 1). The majority of framework markers segregated as expected for codominant (1:2:1) alleles; however, 32% (56/177) of the markers exhibited distorted segregation, with the expected frequency of heterozygous individuals but overrepresentation of one homozygous state and underrepresentation of the other. In all cases, distorted marker segregation identified regions of multiple markers with abnormal ratios of alleles. In addition to linkage groups 4 and 8, which are discussed in greater detail below, three markers (i.e., ppPF, NCAS, and TUP) on the short arm of chromosome 1 exhibited an excess of A17 homozygotes; 11 contiguous markers on the long arm of chromosome 3 (i.e., GSb through DK273L) exhibited an excess of A20 homozygotes; and two markers on the long arm of chromosome 7 (i.e., VBP1 and ENOL) exhibited an excess of A17 homozygotes.
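Segregation distortion of the kind described above is commonly assessed with a chi-square goodness-of-fit test against the 1:2:1 expectation. The text does not state which test was applied, so the following is a generic sketch with hypothetical genotype counts; for 2 degrees of freedom the chi-square P value has the closed form exp(−χ²/2), so no statistics library is needed.

```python
import math

def segregation_test(n_a, n_h, n_b):
    """Chi-square test of observed F2 genotype counts (A, H, B)
    against the codominant 1:2:1 expectation.

    Returns (chi2, p_value); the P value uses the exact survival
    function of the chi-square distribution with 2 degrees of freedom.
    """
    total = n_a + n_h + n_b
    expected = (total / 4.0, total / 2.0, total / 4.0)
    observed = (n_a, n_h, n_b)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, math.exp(-chi2 / 2.0)

# Hypothetical counts for 93 F2 plants: a well-behaved marker, and one with
# normal heterozygote frequency but an excess of A20 (B) homozygotes.
print(segregation_test(23, 47, 23))
print(segregation_test(12, 46, 35))
```

The second example mirrors the pattern reported here: the heterozygote class is close to expectation while the two homozygous classes deviate in opposite directions.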
In the initial analysis, six well-defined linkage groups could be identified. These linkage groups were characterized by normal Mendelian segregation of marker loci (with the exception of the regions noted above), as shown by example for linkage group 2 (Figure 2a). The integrity of each of these six linkage groups (i.e., linkage groups 1, 2, 3, 5, 6, and 7) was confirmed previously (Kulikova et al. 2001) by FISH studies in which multiple BAC clone probes from each linkage group could be assigned to a single pachytene chromosome.
In contrast to the situation for the six linkage groups mentioned above, the 55 additional marker loci resolved unexpectedly into four linkage blocks. A majority of these loci exhibited distorted segregation ratios, with an excess of A20 homozygotes and an underrepresentation of A17 homozygotes, as shown in Figure 2, b and c. Two lines of genetic evidence suggest that these 55 genetic markers belong to two linkage groups. First, we mapped 26 of these loci on the genetic linkage map of the closely related M. sativa, where they resolved into two well-defined linkage groups (Ms LG4 and Ms LG8, respectively), as described below. Second, selected marker loci from within the distorted regions were genotyped in the M. truncatula segregating population derived from genotypes A17 and DZA315 used by Thoquet et al. (2002) for construction of an AFLP- and RAPD-rich genetic map. In each case, the markers mapped to M. truncatula linkage groups that had been previously determined to correspond to the counterparts of M. sativa linkage groups 4 and 8 (G. Kiss, personal communication).
To test the assumption that these markers correspond to loci on chromosomes 4 and 8, respectively, of M. truncatula genotype A17, we used FISH to determine the physical location of 16 of these markers in pachytene chromosome spreads (Figure 3, a–e). As a prelude to this analysis, each genetic marker was converted to a corresponding BAC clone contig by hybridizing PCR fragments to high-density filters of the M. truncatula BAC library (Nam et al. 1999) or by PCR analysis of a BAC library DNA multiplex. A total of 16 BAC clones were used as probes for FISH analysis, as highlighted in Figure 3, b and c. Initially, we observed that BAC clones 34J06 and 43B05 gave signals on different chromosomes. One of these chromosomes, containing 34J06, could be identified as chromosome 4 according to our knowledge of centromere position and location of a diagnostic repeat, MtR1, in the pericentromeric heterochromatin of the short arm (data not shown). A first series of hybridizations was performed with five BACs, four of which, namely 10F20, 1P05, 5K15, and 41H08, were mapped in a previous study (Kulikova et al. 2001). In a second series of hybridizations, a new set of probes, including 15B23, 06B09, 66M02, 34J06, 47M03, 70L14, and 11C13, was used. All of these BAC clones mapped to chromosome 4. The individual hybridization patterns are shown in Figure 3, a–c, while a composite diagram integrating genetic and cytogenetic data for linkage group 4 is shown in Figure 3f. On the basis of a similar set of analyses, five BAC clones, 43B05, 22O13, 69K21, 50M17, and 10M16, were positioned on chromosome 8. The individual hybridizations are shown in Figure 3, c–e, with a composite summary of the genetic and cytogenetic data for linkage group 8 shown in Figure 3g.
Comparative linkage analysis between M. truncatula and M. sativa: Constructing a comparative map between M. truncatula and M. sativa was facilitated by the high level of nucleotide conservation between these two species, which allowed the direct application of genetic markers in either direction. Of 81 markers analyzed, 68 were successfully mapped. For the remaining 13 markers, 4 primer pairs failed to amplify M. sativa DNA, 2 markers lacked polymorphism, and 7 markers generated uninterpretable sequence (probably mixtures of multiple loci). As shown in Figure 4, the marker alignment between the two Medicago maps reveals an extremely high level of synteny between M. truncatula and M. sativa, including the distorted regions of M. truncatula linkage groups 4 and 8, described above.
Despite the overall high level of similarity, several differences were noted. One apparent difference was the position of a 5S rDNA locus. In M. truncatula, a 5S rDNA locus mapped to LG5, while in M. sativa a 5S rDNA locus was mapped to LG4. However, cytogenetic analysis indicates the presence of three 5S rDNA loci in M. truncatula genotype A17 on LG2, LG5, and LG6 (Kulikova et al. 2001), while the number of 5S rDNA loci has been observed to vary between genotypes of M. truncatula (T. Bisseling and O. Kulikova, personal communication). The position and number of 5S rDNA loci have also been observed to vary between ecotypes of A. thaliana (Fransz et al. 1998), so it should not be surprising to find such a difference between species of the same genus.
We noted two additional differences that are likely to be more substantive than those of the rDNA loci, described above. The PCT primers listed in Table 1 identified a single locus on M. truncatula linkage group 4. However, Southern blot analysis of M. truncatula genomic DNA using the PCT PCR fragment as probe identified four putative paralogous sequences that hybridized to the PCT marker. One of these loci was polymorphic and mapped to linkage group 2 (Figure 4), while the other three fragments were not polymorphic for the enzymes used. In M. sativa, only one hybridizing locus was evident, corresponding to a polymorphic, single locus at the syntenic position on linkage group 2 (Figure 4). In a second case, the NUM1 gene was mapped to LG4 in M. truncatula by means of NUM1-specific primer pairs. Using the same primers in M. sativa, an ∼2-kbp nonpolymorphic fragment was amplified. The gel-purified fragment was used as a probe to map NUM loci in both M. truncatula and M. sativa by means of RFLP. The hybridization pattern of M. truncatula identified two loci, Mt-NUM1 on LG4 and Mt-NUM2 on LG8. The location of the Mt-NUM1 locus on LG4 corresponded to the locus mapped by CAPS. By contrast, the hybridization pattern of the NUM1 probe in M. sativa was complex, generating >30 bands. The deduced genotypes generated at least two polymorphic loci, of which one (Ms-NUM1) mapped to LG4 and the other (Ms-NUM2) mapped to LG8. The middle repetitive-like hybridization patterns of PCT in M. truncatula and of NUM1 in alfalfa suggest that PCT and NUM sequences may have evolved differently in these two closely related plant species.
In this study, we positioned 288 sequence-based markers on the genetic map of M. truncatula, covering 513 cM. Each linkage group contained an average of 36 markers, with a range of 27–47 (Table 3). Thoquet et al. (2002) recently published a genetic map of M. truncatula that spans 1125 cM and is composed of 289, predominantly RAPD and AFLP, genetic markers. The difference in total genetic distance covered by the two mapping efforts may derive from inherent differences in mapping parents [A17 and DZA315 in the case of Thoquet et al. (2002) and A17 and A20 in the present study] and also from the marker types used. Thus, in contrast to mapping expressed genes as codominant markers, which was the focus of the current study, the AFLP and RAPD strategy used by Thoquet et al. (2002) maps anonymous loci that typically exhibit dominant inheritance. Although it is likely that the two strategies surveyed different regions of the genome, both efforts produced eight well-resolved linkage groups that could be readily aligned with the eight genetically defined linkage groups of diploid M. sativa (Kalo et al. 2000; Thoquet et al. 2002). Efforts to link the two genetic maps of M. truncatula based on simple sequence repeat markers derived from ESTs and sequenced BAC clones are currently underway.
Because the genetic markers used in this study are primarily expressed sequences or BAC clones that contain predicted genes, their position in the genome can be considered to provide a rough definition of the “gene space” of M. truncatula. On the basis of cytogenetic analysis (Kulikova et al. 2001), the structure of M. truncatula chromosomes is apparently relatively simple, with condensed heterochromatic DNA in centromeric and pericentromeric islands, flanked by mostly euchromatic arms. In the process of constructing the cytogenetic map for this species, >60 of the EST-containing BAC clones genetically mapped in this study also have been mapped to euchromatic regions of the M. truncatula genome by means of FISH (Kulikova et al. 2001; O. Kulikova and T. Bisseling, personal communication). These results indicate a high level of correspondence between euchromatin and transcribed genes, reminiscent of the relationship observed in A. thaliana where >96% of the transcribed genes are contained within euchromatic regions of the genome (Arabidopsis Genome Initiative 2000). Consistent with this hypothesis, the average predicted gene density for the 92 genetically mapped and sequenced, EST-containing BAC clones is ∼1 gene/6 kbp (B. A. Roe and D. Kim, personal communication). Thus, the correspondence of sequenced BAC clones with genetically mapped loci expands the total number of ESTs and predicted genes on the genetic map to ∼1800. The accession numbers for these sequenced BAC clones are given in Table 1.
In addition to the mapping of ESTs or BAC clones selected strictly on the basis of homology criteria, the genetic positions of five phenotypic markers associated with nodulation, dmi1, dmi2, dmi3, sunn, and skl, are shown in Figure 1. Map positions were determined by using the genetic markers developed in this study to map the respective loci in F2 populations of mutant A17 × wild-type A20. With the exception of the skl locus (Penmetsa and Cook 1997), located on the long arm of chromosome 7, the map locations of the other loci have been previously reported (Ané et al. 2002; Endre et al. 2002; Schnabel et al. 2003) and the information is included here for purposes of integration. Interestingly, recent evidence from physical map data and complete sequencing of a 500-kb BAC contig indicates that dmi1 is immediately adjacent to the telomere (Ané et al. 2004), and thus this locus defines a genetic and physical terminus of this linkage group. In addition to genes implicated in nodulation based on phenotypic criteria, we also mapped several genes whose expression patterns are correlated with nodule development or function. Several of these genes, including ENOD40 (Yang et al. 1993; Crespi et al. 1994), the Rhizobium-induced peroxidase (rip1; Cook et al. 1995), and the leghemoglobin gene LB1 (Gallusci et al. 1991), map to LG5, which also contains dmi2 (Endre et al. 2002) and the syntenic counterpart of the Sym2 region of P. sativum (Gualtieri et al. 2002; Limpens et al. 2003). Despite the apparent abundance of nodulation-associated genes on linkage group 5, several nodulation genes (nodule expressed transcripts and phenotypically mutant loci) are distributed elsewhere in the genome, including a cluster of ENOD8-like genes on linkage group 1 (Dickstein et al. 2002), ENOD16 on linkage group 8, a cluster of apyrase genes on linkage group 7 (Cohn et al. 2001), and sunn, skl, dmi1, and dmi3 on linkage groups 4, 7, 2, and 8, respectively.
In addition to genes implicated in symbiosis, >100 resistance gene analogs have been previously mapped to 67 separate loci (Zhu et al. 2002). The majority of markers for toll/interleukin receptor (TIR) and non-TIR NBS-LRR resistance gene analogs are clustered, with major clusters identified on the short arm of linkage group 3 and throughout linkage group 6. Interestingly, in the absence of the resistance gene analog markers, linkage group 6 contains only 10 genetic markers and fails to coalesce as a distinct linkage group. Thus, linkage group 6 is threefold underrepresented in the number of non-RGA genes compared to the seven other linkage groups, while containing 33% of all mapped RGA loci. Chromosome 6 is also unusual in that it is the shortest and most heterochromatic of all M. truncatula chromosomes (Kulikova et al. 2001).
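The threefold underrepresentation quoted above can be checked with a short calculation. The sketch below uses only figures stated in this article (288 mapped markers, 67 RGA loci, 10 non-RGA markers on linkage group 6, eight linkage groups); the variable names are illustrative, and the RGA count for linkage group 6 is merely an estimate derived from the stated 33%.

```python
# Figures quoted in the text; variable names are illustrative only.
total_markers = 288      # sequence-based markers on the core map
rga_markers = 67         # resistance gene analog (RGA) loci
lg6_non_rga = 10         # non-RGA markers on linkage group 6
n_linkage_groups = 8

# Non-RGA markers across the whole map.
non_rga_total = total_markers - rga_markers

# Mean number of non-RGA markers on the seven other linkage groups.
other_groups_mean = (non_rga_total - lg6_non_rga) / (n_linkage_groups - 1)

# Fold underrepresentation of non-RGA markers on linkage group 6.
fold_under = other_groups_mean / lg6_non_rga

# Approximate RGA loci on linkage group 6, assuming 33% of all RGA loci.
lg6_rga_estimate = round(0.33 * rga_markers)

print(f"non-RGA markers in total: {non_rga_total}")
print(f"mean non-RGA markers on other groups: {other_groups_mean:.1f}")
print(f"fold underrepresentation on LG6: {fold_under:.1f}")
print(f"estimated RGA loci on LG6: {lg6_rga_estimate}")
```

Under these assumptions the other seven linkage groups carry roughly 30 non-RGA markers each, which is consistent with the threefold figure given in the text.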
The genetic map of M. truncatula was difficult to interpret for linkage groups 4 and 8. In each of these cases significant deviation from Mendelian segregation was observed, with A20 homozygotes significantly overrepresented in the populations (Figure 2, b and c). On the basis of a combination of comparative genetic mapping in M. sativa and analysis of an alternate mapping population of M. truncatula (Thoquet et al. 2002), we were able to resolve the genetic relationships between these two linkage groups. FISH analysis with genetically mapped BAC clones was used to verify the predicted marker order and linkage group assignments, while color mapping was used to determine that the recombination map was consistent with the interpretations from these analyses. The value of 32% distorted marker segregation observed in this study is similar to the 25% distorted segregation reported by Thoquet et al. (2002). Moreover, in both studies a cluster of markers with distorted segregation was observed on chromosome 3, including the common marker locus “GSb,” suggesting a possible contribution from the common parental background of genotype A17. By contrast, the distorted marker segregation for linkage groups 4 and 8 observed in this study was not evident in the Thoquet et al. (2002) analysis, suggesting a possible incompatibility between A17 and A20 alleles in these genome regions. In the case of M. sativa, which is an outcrossing species, segregation distortion is typified by an overabundance of the heterozygous genotype (Kalo et al. 2000). This contrasts with the overabundance of parental (homozygous A20 or A17) genotypes, described in this study.
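Deviation from Mendelian segregation at a codominant F2 marker is commonly assessed with a chi-square goodness-of-fit test against the expected 1:2:1 genotype ratio. The following is a minimal sketch of such a test, not necessarily the procedure used in this study; the function name and the genotype counts are hypothetical, chosen only to total 93 individuals with an excess of A20 homozygotes.

```python
import math


def f2_segregation_chi2(n_a17, n_het, n_a20):
    """Chi-square goodness-of-fit test against 1:2:1 for a codominant F2 marker.

    Returns the chi-square statistic and its p-value (2 degrees of freedom).
    """
    total = n_a17 + n_het + n_a20
    observed = (n_a17, n_het, n_a20)
    expected = (total / 4, total / 2, total / 4)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value


# Hypothetical genotype counts for 93 F2 individuals (not data from this study).
chi2, p = f2_segregation_chi2(n_a17=12, n_het=43, n_a20=38)
print(f"chi2 = {chi2:.2f}, p = {p:.4g}")
if p < 0.05:
    print("segregation deviates from 1:2:1 at the 5% level")
```

A marker would be counted as distorted when its p-value falls below the chosen threshold. The 5% level used here is an assumption, and testing many markers would normally call for a multiple-testing correction.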
The ultimate goal of constructing this genetic map was to describe structural/genetic features of the genome of M. truncatula. We anticipate that an EST-based genetic map will also have utility for the many map-based cloning projects currently underway in M. truncatula. Finally, a sequence-based genetic map of M. truncatula should have utility for comparison of genome structure between legume species and thus for the characterization of traits with potential application to agriculture in legumes. We have documented a high degree of conservation in gene content and order between the genomes of diploid M. sativa (alfalfa) and M. truncatula, suggesting that the current genetic map and ongoing genome sequencing of M. truncatula will have significant utility for defining genome organization in cultivated alfalfa (Brouwer and Osborn 1999). Moreover, we anticipate that many of the gene-based genetic markers developed in this study will have applications for comparative mapping to other related legume species.
This research was funded by National Science Foundation Plant Genome Award DBI-0196179 to D.R.C. and D.K. and by grants to G.B.K. from the European Union (QLG2-CT-2000-30676) and the Hungarian National Research and Development Program (OM 4/023/2001, T038211, and OMFD-00229/2002).
Communicating editor: A. H. Paterson
- Received August 18, 2003.
- Accepted December 12, 2003.
- Copyright © 2004 by the Genetics Society of America