Genetics, Vol. 172, 2541-2555, April 2006, Copyright © 2006
doi:10.1534/genetics.105.054791


* Department of Plant Pathology, University of California, Davis, California 95616,
Laboratoire des Interactions Plantes-Microorganismes, INRA-CNRS, 31326 Castanet-Tolosan Cedex, France,
Department of Plant Pathology, University of Minnesota, St. Paul, Minnesota 55108,
Biological Research Center, Institute of Genetics, H-6701 Szeged, Hungary, ** Institute of Genetics, Agricultural Biotechnology Center, 2100 Godollo, Hungary and 
Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019
1 Corresponding author: Department of Plant Pathology, University of California, 1 Shields Ave., Davis, CA 95616.
E-mail: drcook{at}ucdavis.edu
| ABSTRACT |
20 nt) microsatellite every 12 kbp, while the frequency of individual motifs varied according to the genome fraction under analysis. A total of 1,236 microsatellites were analyzed for polymorphism between parents of our reference intraspecific mapping population, revealing that motifs (AT)n, (AG)n, (AC)n, and (AAT)n exhibit the highest allelic diversity. A total of 378 genetic markers could be integrated with sequenced BAC clones, anchoring 274 physical contigs that represent 174 Mbp of the genome and composing an estimated 70% of the euchromatic gene space.
The utility of M. truncatula as a genetic system (e.g., PENMETSA and COOK 2000), combined with its relatively small (466 Mb; BENNETT and LEITCH 1995) and efficiently organized genome (KULIKOVA et al. 2001, 2004), has motivated an international effort to develop and apply the tools of genomics in M. truncatula to key questions in legume biology. One aspect of this effort has been the development of enabling methodologies, such as efficient transformation methods (TRINH et al. 1998; KAMATÉ et al. 2000; ZHOU et al. 2004), high-throughput systems for forward and reverse genetics, including insertional mutagenesis (D'ERFURTH et al. 2003), RNAi (LIMPENS et al. 2003, 2004), and TILLING (VANDENBOSCH and STACEY 2003), and an effective network among research groups (http://www.medicago.org). In parallel to these activities, national and international programs are collaborating to characterize the genome of M. truncatula at the transcript (FEDOROVA et al. 2002; JOURNET et al. 2002; LAMBLIN et al. 2003), protein (GALLARDO et al. 2003; WATSON et al. 2003; IMIN et al. 2004), and whole genome sequence levels (YOUNG et al. 2005).
Cytogenetic and genetic data predict that the genome of M. truncatula is organized into separate gene-rich euchromatic arms and gene-poor heterochromatic pericentromeric regions (KULIKOVA et al. 2001, 2004; CHOI et al. 2004a). These results underlie a strategy for sequencing the M. truncatula genome wherein the euchromatic chromosome arms are first delimited within a physical map and then subjected to a BAC-by-BAC sequencing approach. As of March 2004, 44,292 BACs (approximately 11x coverage) had been fingerprinted by HindIII digestion and agarose gel electrophoresis. An initial stringent build of the map yielded 1370 contigs with an average length of 340 kbp, covering an estimated 466 Mbp or 93% of the genome. In parallel to the development of a physical map, >800 EST-containing BAC clones were sequenced to provide seed points from which to continue the whole genome sequencing effort. Sites of potential sequence polymorphism within the initial BAC sequence data are being used to facilitate merger of the genetic and physical maps, while the resulting chromosome assignments are being used to guide the distribution of BACs to sequencing centers.
A major focus of the genetic mapping effort is short tandem repeats, also known as simple sequence repeats (SSRs) or microsatellites. These repetitive sequences consist of direct tandem repeats of short (1–10 bp) nucleotide motifs. Unequal recombination between SSRs and slip-mispairing during DNA replication (SIA et al. 1997) result in polymorphism rates that tend to be much greater than those observed for nonrepetitive DNA sequences. The high rate of mutation combined with low selection coefficients on variant alleles results in extreme allelic diversity at microsatellite loci (ROSS et al. 2003).
Identification of SSRs in DNA sequence databases can be automated by use of public software programs, such as SSRIT (TEMNYKH et al. 2001). Moreover, because SSR alleles are typically codominant and their polymorphisms can be scored either in a simple agarose gel format or in high-throughput capillary arrays, they are frequently the molecular marker of choice for construction of genetic maps. Estimates suggest that 1–5% of plant ESTs contain SSRs longer than 18 nucleotides (KANTETY et al. 2002). Thus, development of EST-SSR markers has become commonplace in a wide variety of plant species (CORDEIRO et al. 2001; KANTETY et al. 2002; SHAROPOVA et al. 2002; DECROOCQ et al. 2003; THIEL et al. 2003), including Medicago spp. (JULIER et al. 2003; EUJAYL et al. 2004; GUTIERREZ et al. 2005; SLEDGE et al. 2005). SSRs are even more abundant in the noncoding regions of genomic sequences, providing a rich source of genetic markers to map sequenced genome regions (CARDLE et al. 2000). In rice, for example, genomic-SSR markers identified from BAC sequences provided immediate links between genetic, physical, and sequence-based maps (TEMNYKH et al. 2001).
In this article we report the characteristics of perfect microsatellites within the genome of M. truncatula. Genetic markers developed from SSRs in BAC sequences were incorporated into the M. truncatula genetic map, simultaneously anchoring a predicted majority of the euchromatic portion of the physical map to chromosomal loci. In total, we analyzed 77 Mbp of genomic sequence (16.5% of the genome) obtained from gene-rich BAC clones, 27 Mbp of nonredundant transcript sequence, 20 Mbp of low pass random whole genome shotgun data, and 49 Mbp of BAC-end sequences for the presence of perfect SSRs. The resulting data set allowed comparison of SSR frequency, length, motif structure, and distribution between genic and nongenic fractions of the genome. We also compared the distribution of SSRs in the M. truncatula genome to that of other legumes (soybean and L. japonicus) and model plants (Arabidopsis and rice).
| MATERIALS AND METHODS |
400 bases to either side) were extracted for primer design. The Primer3 software was configured to design five sets of oligonucleotide primers flanking each SSR with a target amplicon size range of 100–300 bp. Primer specifications were melting temperature (Tm) 57–63° (target 60°) with ΔTm <1° for each primer pair and a primer length of 18–27 nucleotides (target 20 nucleotides). Three oligonucleotide sets were generally tested to discover polymorphisms for each BAC clone. PCR was performed in a total volume of 10 µl [10 ng of genomic template DNA, 1x PCR buffer, 2.5 mM MgCl2, 0.25 mM of each dNTP, 5 µM of each primer, and 0.5 unit of Taq DNA polymerase (Invitrogen)] with a temperature profile of 3 min at 95°, 35 cycles of 20–30 sec at 94–95°, 20–30 sec at 55°, 1 min at 72°, and a final 5 min extension step at 72°. PCR products were resolved on a 2–4% agarose gel and bands were visualized by staining with ethidium bromide. Primers that produced easily scored polymorphisms (length variation and codominant inheritance) were selected as genetic markers for mapping. In some cases, BAC clones were mapped on the basis of simple length polymorphisms, single strand conformational polymorphisms (SSCP), or differential restriction sites (i.e., cleavable amplified polymorphic sequences or CAPS) identified between the two parental alleles. SSCP analysis was performed according to VINCENT et al. (2000), with silver staining of polyacrylamide gels according to BASSAM and CAETANO-ANOLLES (1993).
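The primer specifications can be expressed as a simple acceptance filter. The Python sketch below is illustrative only and is not part of Primer3 or of the authors' pipeline: the function name and its arguments are hypothetical, and it assumes thresholds of 57–63° for Tm, a Tm difference under 1° within a pair, primer lengths of 18–27 nucleotides, and an amplicon of 100–300 bp. Tm values are taken as inputs rather than computed.

```python
def primer_pair_ok(tm_fwd, tm_rev, len_fwd, len_rev, amplicon_bp):
    """Return True if a primer pair meets the assumed design thresholds.

    Hypothetical helper: Tm values (degrees C) are supplied by the caller,
    for example from a primer design program's output.
    """
    tm_ok = all(57.0 <= t <= 63.0 for t in (tm_fwd, tm_rev))   # Tm window
    delta_ok = abs(tm_fwd - tm_rev) < 1.0                      # pair Tm difference
    len_ok = all(18 <= n <= 27 for n in (len_fwd, len_rev))    # primer length
    amplicon_ok = 100 <= amplicon_bp <= 300                    # product size
    return tm_ok and delta_ok and len_ok and amplicon_ok


if __name__ == "__main__":
    print(primer_pair_ok(60.1, 59.6, 20, 21, 180))  # within all thresholds
    print(primer_pair_ok(60.0, 61.5, 20, 20, 180))  # Tm difference too large
```

A filter of this kind would typically be applied to each of the candidate primer sets before choosing which to test at the bench.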
Mapping of SSRs and integration of sequenced BAC clones into the genetic map:
To facilitate genotyping and map integration, a subset of 69 individuals from an earlier mapping population (CHOI et al. 2004a) was used. The genetic map reported by CHOI et al. (2004a,b) included 288 sequence-characterized genetic markers on the same base-mapping population. Using this strategy we integrated 317 new SSR markers and 29 non-SSR markers into the existing genetic map. Plant genomic DNA was extracted using the DNeasy Plant 96 Kit (QIAGEN) according to the manufacturer's directions. For purposes of marker genotype analysis, the F2 DNAs were analyzed in parallel with three control DNAs (A17 maternal homozygous line, A20 paternal homozygous line, and F1 heterozygote DNA). The PCR products were resolved as described above and genotypes were recorded as follows: homozygous maternal (A17) "A", homozygous paternal (A20) "B", heterozygous "H", not A "C", not B "D", and missing data "-". Genotypes for all markers were integrated into a color-coded genotype matrix using Excel (KISS et al. 1998). Markers were assigned to chromosomes using the "Make Linkage Groups" command of Map Manager QTX (MANLY et al. 2001). Genetic distances were calculated on the basis of the Kosambi function. Markers with an LOD > 3.0 were integrated into a framework map, while those with LOD < 3.0 or ambiguous genotypes were tentatively assigned to intervals by visual inspection of the color-coded genotype matrix. In addition to mapping BAC clones by means of SSRs, we also used BLASTN to compare the sequences of previously mapped genetic markers (CHOI et al. 2004a,b) with sequenced BAC clones of M. truncatula. In cases where BLASTN results revealed perfect matches, genetic markers and BAC clones were assumed to represent the same locus.
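For reference, the Kosambi function converts a recombination fraction r into a map distance d = 25 ln[(1 + 2r)/(1 − 2r)] cM. The Python sketch below shows that standard formula and its inverse; it is not the Map Manager QTX implementation, and the function names are hypothetical.

```python
import math


def kosambi_cm(r):
    """Map distance in centimorgans for recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm):
    """Inverse mapping: recombination fraction for a distance in centimorgans."""
    return 0.5 * math.tanh(d_cm / 50.0)


if __name__ == "__main__":
    for r in (0.01, 0.10, 0.25):
        print(r, round(kosambi_cm(r), 2))
```

At small r the distance is close to 100r cM; the correction for interference grows as r approaches 0.5.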
| RESULTS |
20 bp; class II, 12 to 19 bp). SSRs with lengths of 20 nucleotides and greater tend to be highly mutable (TEMNYKH et al. 2001), while SSRs with lengths between 12 and 19 nucleotides tend to be moderately mutable (PUPKO and GRAUR 1999).
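A minimal Python sketch of how perfect SSRs might be detected and sorted into the two length classes. It is not SSRIT or the authors' pipeline; the function names, the 1 to 8 bp motif range (taken from the motif lengths discussed in this article), and the reading of class I as 20 nt or longer are assumptions. Overlaps between different motif sizes are handled only by discarding motifs that are themselves repeats of a shorter unit.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a tandem repeat of a shorter unit."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_perfect_ssrs(seq, max_motif=8, min_len=12):
    """Return (start, motif, repeats, length, ssr_class) tuples, sorted by start.

    Only whole repeat units are counted; class "I" is >= 20 nt and
    class "II" is 12-19 nt (assumed class boundaries).
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # Lookahead lets the scan restart at every position.
        pattern = re.compile(r"(?=(([ACGT]{%d})\2+))" % k)
        last_end = 0
        for m in pattern.finditer(seq):
            start, run, motif = m.start(1), m.group(1), m.group(2)
            if start < last_end or not is_primitive(motif) or len(run) < min_len:
                continue
            ssr_class = "I" if len(run) >= 20 else "II"
            hits.append((start, motif, len(run) // k, len(run), ssr_class))
            last_end = start + len(run)
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGC" + "AT" * 12 + "CCT" + "AAG" * 5 + "T"
    for hit in find_perfect_ssrs(demo):
        print(hit)
```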
Frequency of perfect microsatellites in genomic DNA sequence:
The frequency of perfect microsatellites in Medicago genomic DNA is shown in Table 2, along with similar calculations for soybean, L. japonicus, Arabidopsis, and rice. Despite differences in the nature and quantity of genomic sequence analyzed, the major trends were similar across species. Thus, class II SSRs (12–19 nt) were the most abundant microsatellites and occurred at similar frequencies in all five species, with an average density of one SSR every 0.6–0.7 kbp. In Medicago, hexa- and heptanucleotide repeats accounted for 65% of these short genomic microsatellites, with di- and pentanucleotide repeats being the least frequent. These same patterns characterize the other four genomes. The major evident differences between the monocot (rice) and dicot (Medicago, Lotus, soybean, and Arabidopsis) species were a twofold increase in the frequency of trinucleotide repeats and an underrepresentation of mononucleotide repeats in rice compared with dicots.
Frequency of perfect microsatellites in transcript sequence:
For analysis of transcript data, we compared SSR frequencies in two data sets: bulk nonclustered ESTs and the NCBI unigene set. As shown in Table 2, despite the redundant and asymmetric nature of bulk EST data, the relative and absolute frequencies of microsatellites showed good correspondence between the bulk EST and NCBI unigene data sets. Moreover, as in the case of genomic DNA, trends were similar between species.
Class II SSRs were significantly more abundant (i.e., one SSR every 0.6–1.0 kbp) in transcript data compared to their class I counterparts (i.e., one SSR every 13–39 kbp), similar to the situation observed in genomic DNA. Thus, 54–91% of bulk EST sequences contained class II SSRs, depending on the species under analysis, while only 1–3% of ESTs contained class I SSRs. The most abundant class II SSRs were tri-, hexa- and heptanucleotide motifs, consistent with observations made in a wide range of species (ELLEGREN 2004), while class I SSRs were most frequently repeats of di- and trinucleotide motifs. On the basis of analysis of the NCBI unigene set, the frequency of class I and class II SSRs is similar in the transcript data of all four dicot species, and substantially lower than that observed in rice.
Frequency of individual motifs in class I SSRs:
To compare the frequency of specific long-repeat motifs within and between genomes, we examined each of the 16 possible mononucleotide, dinucleotide, and trinucleotide motifs of class I SSRs in each of the five species (Table 3). In all species, the abundance of dinucleotide repeats in genomic DNA (Table 2) could be attributed to an overrepresentation of AT motifs; soybean in particular exhibits a two- to threefold increase in AT-motif frequency relative to the other four species analyzed. By contrast, the high frequency of dinucleotide repeats in EST sequences could be attributed to an abundance of AG repeats (Table 3). The frequency of AG-balanced repeats in bulk EST data was especially high in legumes, with values two- to threefold higher than their frequency in rice and Arabidopsis.
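The count of 16 motifs follows if motifs that are rotations or reverse complements of one another are treated as a single class; that grouping is an assumption here, since the text does not spell it out. The Python sketch below (function names are hypothetical) enumerates the classes and gives 2 mononucleotide, 4 dinucleotide, and 10 trinucleotide motifs.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_primitive(motif):
    """True if the motif is not a repeat of a shorter unit (e.g. not 'AA')."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def canonical(motif):
    """Alphabetically first form among all rotations on both strands."""
    revcomp = motif.translate(COMPLEMENT)[::-1]
    forms = [m[i:] + m[:i] for m in (motif, revcomp) for i in range(len(motif))]
    return min(forms)


def motif_classes(k):
    """Sorted list of distinct canonical motifs of unit length k."""
    motifs = ("".join(p) for p in product("ACGT", repeat=k))
    return sorted({canonical(m) for m in motifs if is_primitive(m)})


if __name__ == "__main__":
    for k in (1, 2, 3):
        classes = motif_classes(k)
        print(k, len(classes), classes)
```

Under this convention the (GGC)n motif discussed for rice falls in the same class as CCG, and (AAT)n covers ATT, TTA, and the other rotations on either strand.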
In the case of trinucleotide repeats, the dicot species contained higher frequencies of AT-rich repeats in both genomic DNA and EST sequence relative to rice. Soybean in particular possessed an approximately 10-fold increase in the genomic AAT trinucleotide motif relative to Medicago and Lotus and a 20- to 40-fold increase relative to rice and Arabidopsis. The opposite was true for GC-rich trinucleotide repeats, which were the predominant trinucleotide motif in rice (KANTETY et al. 2002) and either rare or absent from the dicot genomes. Perfect repeats with motifs longer than trinucleotides (i.e., tetranucleotide to octanucleotide repeats) were predominantly AT-rich motifs in all of the genomes analyzed (data not shown).
Distribution of class I microsatellites in the genome of M. truncatula:
To characterize the spatial distribution of class I repeats with respect to genic and nongenic features of the M. truncatula genome, we examined the distribution of perfect microsatellites ≥20 nt in (1) 51 completely sequenced and annotated gene-rich BAC clones (6.3 Mbp), (2) a random low-pass whole genome shotgun data set (20 Mbp), and (3) a random BAC-end sequence data set (49 Mbp; Table 4). The complete BAC clone sequences used for analysis were part of a larger data set of 778 sequenced BAC clones. These 778 BAC clones were selected to represent euchromatic (presumably gene-rich) regions of the genome on the basis of a combination of genetic and cytogenetic mapping (CHOI et al. 2004a; KULIKOVA et al. 2001) or on the basis of homology to transcript sequences. We first determined that the frequency of SSRs in the 51 annotated BACs (Table 4, row 4) was not significantly different from that of the larger data set of 778 sequenced BAC clones (Table 2, class I SSRs, row 1) (Pearson χ² = 1.23, d.f. = 7, at α = 0.05).
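A Pearson chi-square comparison of this kind can be sketched in a few lines of Python. The counts below are invented for illustration and are not the Table 2 or Table 4 values; the 2 x 8 layout (two data sets by eight motif-length classes, mono- to octanucleotide) is an assumed reading that happens to give the reported 7 degrees of freedom. The constant 14.067 is the standard chi-square critical value for 7 d.f. at the 0.05 level.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees_of_freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


CRITICAL_0_05_DF7 = 14.067  # chi-square critical value, 7 d.f., alpha = 0.05

if __name__ == "__main__":
    # Hypothetical SSR counts per motif length (1-8 bp) in two data sets.
    annotated_bacs = [60, 110, 90, 30, 25, 40, 35, 10]
    all_bacs = [700, 1300, 1100, 380, 300, 470, 420, 130]
    stat, dof = pearson_chi_square([annotated_bacs, all_bacs])
    print(round(stat, 2), dof, stat < CRITICAL_0_05_DF7)
```

A statistic below the critical value, as with the reported 1.23, means the two motif-length distributions are not distinguishable at that level.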
60% of the genome can be attributed to repeat-rich and gene-poor heterochromatin located within pericentromeric regions of the genome (KULIKOVA et al. 2004). As described above, the completely sequenced BAC clones were intentionally enriched for gene-rich euchromatic DNA, while the whole genome-shotgun and BAC-end sequence data sets were derived from randomly selected clones that are presumably more representative of the genome as a whole. Comparison of these three genomic data sets revealed that, with the exception of mononucleotide repeats, SSR frequency was 2.3- to 1.4-fold higher in gene-rich BAC clones (63.2 SSR/Mbp) compared to that of random whole genome shotgun sequences (27.3 SSR/Mbp) or random BAC-end sequences (44.2 SSR/Mbp). The finding that SSRs have intermediate frequency in the BAC-end sequence data suggests that the BAC library used for end sequencing might be enriched for gene-rich regions of the genome. This conclusion is supported by the observation that the major classes of centromere-like tandem repeats (i.e., MtR1, MtR2, and MtR3), which together compose 7% of the genome (KULIKOVA et al. 2004), are underrepresented in BAC-end sequence data (data not shown). As a further test of this conclusion, we analyzed SSR frequency in the portion of the shotgun sequence data set with homology to the tandemly arrayed centromere-like repeats, MtR1, MtR2, and MtR3. SSR frequency in this repetitive genome fraction was 7.0 SSR/Mbp, or ninefold less frequent than values obtained with completely sequenced BAC clones. The association of class I SSRs with gene-rich fractions of the genome was also evident in the comparison of BAC-end sequences having homology to ESTs vs. those without homology to ESTs. In particular, BAC-end sequences with BLASTN similarity to ESTs of M. truncatula had approximately 10% higher average SSR frequencies (46.0 SSR/Mbp) than that of BAC-end sequences without BLASTN similarity (42.4 SSR/Mbp). These data are in agreement with the previous report of MORGANTE et al. (2002), in which SSRs were observed to be preferentially associated with the nonrepetitive fractions of plant genomes.
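The fold differences quoted above can be reproduced directly from the stated densities. The short Python sketch below uses only the SSR/Mbp values given in the text; the dictionary keys and function names are illustrative.

```python
# Class I SSR densities (SSR per Mbp) as stated in the text.
DENSITY = {
    "gene_rich_bacs": 63.2,
    "whole_genome_shotgun": 27.3,
    "bac_ends": 44.2,
    "centromere_like_repeats": 7.0,
    "bac_ends_est_hit": 46.0,
    "bac_ends_no_est_hit": 42.4,
}


def fold(numerator, denominator):
    """Ratio of two densities, by key."""
    return DENSITY[numerator] / DENSITY[denominator]


def kbp_per_ssr(key):
    """Average spacing between SSRs in kbp for a given data set."""
    return 1000.0 / DENSITY[key]


if __name__ == "__main__":
    print(round(fold("gene_rich_bacs", "whole_genome_shotgun"), 1))
    print(round(fold("gene_rich_bacs", "bac_ends"), 1))
    print(round(fold("gene_rich_bacs", "centromere_like_repeats")))
    print(round(kbp_per_ssr("gene_rich_bacs"), 1))
```

The ratio for BAC ends with and without EST similarity (46.0 vs. 42.4 SSR/Mbp) works out to roughly 1.08, which the text rounds to about 10%.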
To correlate SSRs with specific genic and nongenic fractions, we annotated the 51 completely sequenced BAC clones by means of the dicot version of FGENESH and assigned five categories of sequence, namely, (1) nontranscribed, (2) 5'-untranslated exon (5'-UTR), (3) coding exon, (4) intron, and (5) 3'-untranslated exon (3'-UTR). The 51 BAC clones contained an average of 20.3 predicted genes per clone, with 1 gene per 6.0 kbp. As shown in Table 4, class I SSRs were slightly more frequent in predicted nontranscribed compared to predicted transcribed regions of gene-rich BAC clones, due primarily to a higher frequency of mononucleotide and dinucleotide repeats. However, SSR frequency varied considerably between the different predicted transcribed fractions (χ² = 57.35, d.f. = 21, P < 0.001). Most SSRs in transcribed regions were detected in 5'- and 3'-untranslated fractions and within introns, with the highest SSR frequency in 5'-UTRs, which were characterized by elevated levels of di-, penta-, hexa-, and heptanucleotide motifs. Predicted exons were substantially underrepresented in all SSR motif lengths, with the exception of trinucleotide and hexanucleotide repeats. Figure 1 presents the distribution of the eight most abundant SSR motifs relative to the five genome fractions. Consistent with the results shown in Table 3, AT-rich di- and trinucleotide motifs were more abundant in nontranscribed than in transcribed regions. This bias was also evident within transcribed regions, where AT-rich repeats were relatively abundant in transcribed nontranslated regions and essentially absent in exon sequences.
1/500 bp for exon sequences and 1/140 bp for intron sequences (CHOI et al. 2004a). Class I SSRs (559 or 81.7%) were significantly more polymorphic than class II SSRs (58 or 49.6%). The highest rates of polymorphism were observed for (AT)n, (AG)n, (AC)n, and (AAT)n motifs, the most abundant motifs in the M. truncatula genome. Polymorphism rates increased with the number of repeat units: 5-fold, <60%; 5- to 10-fold, 66%; 11- to 15-fold, 71%; 16- to 20-fold, 77%; 20-fold, 82%.
Anchoring of sequenced BACs to the genetic map:
For purposes of integrating the BAC-based physical map of M. truncatula with the genetic and cytogenetic maps, the genotypes of 317 of 617 polymorphic SSRs were scored in a reference mapping population. The remaining 300 SSRs were considered redundant, as they were derived from BACs that were already mapped by means of other SSRs. A total of 71% of the mapped SSR polymorphisms were derived from dinucleotide repeats, and 29 additional markers were developed on the basis of CAPS, SSCP, or length polymorphisms associated with BAC clone sequences. As shown in Figure 2, these 346 new genetic markers were integrated into an existing genetic map of M. truncatula (CHOI et al. 2004a,b), bringing the total number of markers mapped in this population to 634, including 378 genetically mapped BAC clones. In total, these BAC-based markers integrate 274 BAC contigs from the M. truncatula physical map (Table 6). A detailed list of marker attributes and clone GenBank accession numbers is given in supplemental Table S1 (http://www.genetics.org/supplemental/).
150 Mbp of nonredundant genome sequences obtained as of August 2005, 130 Mbp of sequenced genome, representing an estimated 21,000 predicted genes, has been associated to chromosomal loci by means of genetic mapping of physical contigs. The extent of the physical map (including not-yet-sequenced BAC clones) associated to genetic loci is 242 Mbp, or 48% of the total genome and an estimated 88% of the predicted gene space.
| DISCUSSION |
The frequency of class II SSRs in genomic DNA was similar across all plant genomes analyzed in this study (0.6–0.7 SSR/kbp, Table 2). By contrast, the frequency of class I SSRs was both lower and more variable across genomes. In particular, class I SSRs were 1.5- to 2.5-fold more frequent for soybean compared to the other genomes analyzed. This increase is correlated with the larger size of the soybean genome and also with the fact that the public genome sequence data for soybean is enriched in RFLP-derived genomic clones relative to the other species we analyzed. Although it is uncertain whether either of these factors is causal to the increased frequency of class I SSRs in soybean data, it is noteworthy that ROSS et al. (2003) have recently described the rapid divergence of microsatellite abundance among closely related species. WIERDL et al. (1997; and more recently KRUGLYAK et al. 1998; KATTI et al. 2001) proposed that the lower frequency of long vs. short SSRs may result from selection against mutagenic sites in the genome. It is possible that the polyploid nature of the soybean genome might reduce selection against long microsatellites due to the redundancy of homeologous regions, but if so then the relaxed selection must be specific to noncoding regions, as the frequency of class I SSRs within coding regions (i.e., the NCBI unigene set) was similar between soybean and the other dicot genomes (Table 2).
Analysis of individual class I SSR motifs revealed additional taxon-specific patterns, especially with respect to the types and distribution of dinucleotide and trinucleotide repeats. Thus, (AAT)n and (AG)n were overrepresented in the genomic and EST fractions, respectively, of legume species, but relatively underrepresented in Arabidopsis and rice (Table 3). By contrast, (GGC)n repeats were predominant in the rice genome but not in the dicot genomes. In general the rice genome exhibited a higher rate of class I SSRs (threefold) and class II SSRs (1.5-fold) within the unigene data set, indicating that rice is likely to be a relatively rich source of transcript-associated polymorphisms. Taxon-specific accumulation of repeats in eukaryotic genomes has been reported for several species (TÓTH et al. 2000; KATTI et al. 2001). The current results, and in particular the similarity between the related legume genomes, suggest that taxon-specific motifs originated after divergence of legumes from Arabidopsis and rice. Strand-slippage theories alone are insufficient to explain the differential abundance of specific motif types in different genomes. A positive selection pressure, such as a preference of codon usage in exons or a regulatory effect of specific repeats in noncoding regions, may underlie the taxon-specific accumulation of certain repeat motifs.
In contrast to the classical definition of SSRs as motifs of 1–6 bp in length (TAUTZ 1989), the current analysis also considered motifs with lengths of 7 and 8 bp. The frequencies and distribution of hepta- and octanucleotide repeats were consistent with those observed for motifs of 1–6 bp, including correspondence across taxa (Table 2), a significantly higher frequency in class II compared to class I SSRs (Table 2), and a low frequency in exon regions (Table 4, except tri- and hexanucleotide repeats). Interestingly, motifs of 7 bp were the second most abundant class II SSR motif length, and they were as abundant as tetra-, penta-, and hexanucleotide motifs in class I SSRs.
The current analysis of SSR distribution in M. truncatula agrees with previous reports for dicot genomes in which the majority of SSRs were found to reside in the nontranscribed fraction of gene-rich regions or within the untranslated portions of transcripts (i.e., UTRs and introns). The rare Medicago SSRs in exons were typically AT-rich trinucleotide repeats (Figure 1). This contrasts with rice, in which GC-rich trinucleotide repeats were observed preferentially in exons (CHO et al. 2000).
The primary objective of this study was to integrate the physical and genetic maps of M. truncatula using microsatellites identified within sequenced BAC clones. By means of semiautomated SSR identification and primer design, 346 BAC clones have been incorporated into the existing genetic map, anchoring 174 Mbp of the physical map to genetic loci. During map integration, eight conflicts were identified between SSR-mapped BAC clones and previously inferred marker-BAC relationships (CHOI et al. 2004a), as indicated in Figure 2. The possible origins of such conflicts include highly conserved duplicated genome segments, recently evolved gene paralogs, clone chimerism, and experimental error. Four of the conflicting relationships correspond to resistance gene clusters. Plant resistance genes are members of large gene families, often composed of recently derived paralogs, suggesting that these conflicts may arise from the misassignment of closely related genome regions that have distinct locations in the genetic map. The additional four conflicting BAC clone assignments may also derive from the misassignment of closely related paralogous genes, as in each case the similarity between sequenced BAC clones and sequenced genetic markers was more consistent with paralogy than identity (89–98% identity). Such conflicts will resolve with additional genetic mapping and the progress of the whole genome sequencing effort in M. truncatula.
We note that more detailed analyses of the M. truncatula genome, as well as the genomes of G. max (soybean) and L. japonicus, will become possible as their genomes are better characterized. For example, here we have used FGENESH to predict transcribed vs. nontranscribed regions of the genomes. Recently, the International Medicago Genome Annotation Group has established standards for automated gene prediction, which is likely to increase the accuracy of gene calls relative to the FGENESH tool we have used here. Similarly, even more robust annotations will ultimately derive from experimental approaches, such as those based on the sequencing of full-length cDNA clones for a majority of transcripts. The current work has contributed to an increased characterization of microsatellites in legumes and their comparison to that of other model plant species. Moreover, these data increase the genetic and genomic resources available in M. truncatula by adding a new category of BAC-associated genetic markers and by facilitating integration of genetic and physical maps. Of practical importance, the positioning of physical map contigs to specific locations on linkage groups, and to cytogenetically defined chromosomes (e.g., KULIKOVA et al., 2001, 2004; CHOI et al., 2004a), greatly aids the current genome-sequencing effort in which BACs are distributed according to chromosome assignments (YOUNG et al., 2005). These microsatellite markers also provide tools to validate contig structure and orientation as a prelude to selection of BAC clones for sequencing. Although the ultimate goal of genome sequencing in M. truncatula is to produce pseudo-chromosome arms that cover the entire euchromatic space of M. truncatula (outlined in YOUNG et al., 2005), a more immediate deliverable will be an assembly of ordered and oriented sequenced BAC contigs. Genetic mapping of sequenced BAC clones, largely based on the SSR strategy described here, is crucial to achieving these goals.
| ACKNOWLEDGEMENTS |
| LITERATURE CITED |
BASSAM, B. J., and G. CAETANO-ANOLLES, 1993 Silver staining of DNA in polyacryamide gels. Appl. Biochem. Biotechnol. 42: 181188.
BENNETT, M. D., and I. J. LEITCH, 1995 Nuclear DNA amounts in angiosperms. Ann. Bot. 76: 113176.
CARDLE, L., L. RAMSAY, D. MILBOURNE, M. MACAULAY, D. MARSHALL et al., 2000 Computational and experimental characterization of physically clustered simple sequence repeats in plants. Genetics 156: 847854.
CHO, Y. G., T. ISHII, S. TEMNYKH, X. CHEN, L. LIPOVICH et al., 2000 Diversity of microsatellites derived from genomic libraries and GenBank sequences in rice (Oryza sativa L.). Theor. Appl. Genet. 100: 713722.[CrossRef]
CHOI, H., D. KIM, T. UHM, E. LIMPENS, H. LIM et al., 2004a A sequence-based genetic map of Medicago truncatula and comparison of marker co-linearity with Medicago sativa. Genetics 166: 14631502.
CHOI, H.-K., J.-H. MUN, D.-J. KIM, H. ZHU, J.-M. BAEK et al., 2004b Estimating genome conservation between crop and model legume species. Proc. Natl. Acad. Sci. USA 101: 1528915294.
CORDEIRO, G. M., R. CASU, C. L. MCINTYRE, J. M. MANNERS and R. J. HENRY, 2001 Microsatellite markers from sugarcane (Saccharum spp.) ESTs cross transferable to erianthus and sorghum. Plant Sci. 160: 11151123.[Medline]
DECROOCQ, V., M. G. FAVE, L. HAGEN, L. BORDENAVE and S. DECROOCQ, 2003 Development and transferability of apricot and grape EST microsatellite markers across taxa. Theor. Appl. Genet. 106: 912922.[Medline]
D'ERFURTH, I., V. COSSON, A. ESCHSTRUTH, H. LUCAS, A. KONDOROSI, et al., 2003 Efficient transposition of the Tnt1 tobacco retrotransposon in the model legume Medicago truncatula. Plant J. 34: 95106.[CrossRef][Medline]
ELLEGREN, H., 2004 Microsatellites: simple sequences with complex evolution. Nat. Rev. Genet. 5: 435-445.
EUJAYL, I., M. K. SLEDGE, L. WANG, G. D. MAY, K. CHEKHOVSKIY et al., 2004 Medicago truncatula EST-SSRs reveal cross-species genetic markers for Medicago spp. Theor. Appl. Genet. 108: 414-422.
FEDOROVA, M., J. VAN DE MORTEL, P. A. MATSUMOTO, J. CHO, C. D. TOWN et al., 2002 Genome-wide identification of nodule-specific transcripts in the model legume Medicago truncatula. Plant Physiol. 130: 519-537.
GALLARDO, K., C. LE SIGNOR, J. VANDEKERCKHOVE, R. D. THOMPSON and J. BURSTIN, 2003 Proteomics of Medicago truncatula seed development establishes the time frame of diverse metabolic processes related to reserve accumulation. Plant Physiol. 133: 664-682.
GUTIERREZ, M. V., M. C. VAZ PATTO, T. HUGUET, J. I. CUBERO, M. T. MORENO et al., 2005 Cross-species amplification of Medicago truncatula microsatellites across three major pulse crops. Theor. Appl. Genet. 110: 1210-1217.
IMIN, N., F. DE JONG, U. MATHESIUS, G. VAN NOORDEN, N. A. SAEED et al., 2004 Proteome reference maps of Medicago truncatula embryogenic cell cultures generated from single protoplasts. Proteomics 4: 1883-1896.
JOURNET, E. P., D. VAN TUINE, J. GOUZY, H. CRESPEAU, V. CARREAU et al., 2002 Exploring root symbiotic programs in the model legume Medicago truncatula. Nucleic Acids Res. 30: 5579-5592.
JULIER, B., S. FLAJOULOT, P. BARRE, G. CARDINET, S. SANTONI et al., 2003 Construction of two genetic linkage maps in cultivated tetraploid alfalfa (Medicago sativa) using microsatellite and AFLP markers. BMC Plant Biol. 3: 9.
KAMATÉ, K., I. D. RODRIGUEZ-LLORENTE, M. SCHOLTE, P. DURAND, P. RATET et al., 2000 Transformation of floral organs with GFP in Medicago truncatula. Plant Cell Rep. 19: 647-653.
KANTETY, R. V., M. LA ROTA, D. E. MATTHEWS and M. E. SORRELLS, 2002 Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat. Plant Mol. Biol. 48: 501-510.
KATTI, M. V., P. K. RANJEKAR and V. S. GUPTA, 2001 Differential distribution of simple sequence repeats in eukaryotic genome sequences. Mol. Biol. Evol. 18: 1161-1167.
KISS, G. B., A. KERESZT, P. KISS and G. ENDRE, 1998 Colormapping: a non-mathematical procedure for genetic mapping. Acta Biol. Hung. 49: 125-142.
KRUGLYAK, S., R. T. DURRETT, M. D. SCHUG and C. F. AQUADRO, 1998 Equilibrium distributions of microsatellite repeat length resulting from a balance between slippage events and point mutations. Proc. Natl. Acad. Sci. USA 95: 10774-10778.
KULIKOVA, O., G. GUALTIERI, R. GEURTS, D. KIM, D. R. COOK et al., 2001 Integration of the FISH pachytene and genetic maps of Medicago truncatula. Plant J. 27: 49-58.
KULIKOVA, O., R. GEURTS, M. LAMINE, D. KIM, D. R. COOK et al., 2004 Satellite repeats in the functional centromere and pericentromeric heterochromatin of Medicago truncatula. Chromosoma 113: 276-283.
LAMBLIN, A. F., J. A. CROW, J. E. JOHNSON, K. A. SILVERSTEIN, T. M. KUNAU et al., 2003 MtDB: a database for personalized data mining of the model legume Medicago truncatula transcriptome. Nucleic Acids Res. 31: 196-201.
LIMPENS, E., C. FRANKEN, P. SMIT, J. WILLEMSE, T. BISSELING et al., 2003 LysM domain receptor kinases regulating rhizobial Nod factor-induced infection. Science 302: 630-633.
LIMPENS, E., R. JAVIER, C. FRANKEN, V. RAZ, B. COMPAAN et al., 2004 RNA interference in Agrobacterium rhizogenes-transformed roots of Arabidopsis and Medicago truncatula. J. Exp. Bot. 55: 983-992.
MANLY, K. H., R. H. CUDMORE and J. M. MEER, 2001 Map Manager QTX, cross-platform software for genetic mapping. Mamm. Genome 12: 930-932.
MCCOUCH, S. R., L. TEYTELMAN, Y. XU, D. B. LOBOS, K. CLARE et al., 2002 Development and mapping of 2,240 new SSR markers for rice (Oryza sativa L.). DNA Res. 9: 199-207.
MORGANTE, M., M. HANAFEY and W. POWELL, 2002 Microsatellites are preferentially associated with nonrepetitive DNA in plant genomes. Nat. Genet. 30: 194-200.
PENMETSA, R. V., and D. R. COOK, 2000 Production and characterization of diverse development mutants in Medicago truncatula. Plant Physiol. 123: 1387-1398.
PUPKO, T., and D. GRAUR, 1999 Evolution of microsatellites in the yeast Saccharomyces cerevisiae: role of length and number of repeated units. J. Mol. Evol. 48: 313-316.
ROSS, C. L., K. A. DYER, T. EREZ, S. J. MILLER, J. JAENIKE et al., 2003 Rapid divergence of microsatellite abundance among species of Drosophila. Mol. Biol. Evol. 20: 1143-1157.
ROZEN, S., and H. SKALETSKY, 2000 Primer3 on the WWW for general users and for biologist programmers, pp. 365-386 in Bioinformatics Methods and Protocols: Methods in Molecular Biology, edited by S. KRAWETZ and S. MISENER. Humana Press, Totowa, NJ.
SHAROPOVA, N., M. D. MCMULLEN, L. SCHULTZ, S. SCHROEDER, H. SANCHEZ-VILLEDA et al., 2002 Development and mapping of SSR markers for maize. Plant Mol. Biol. 48: 463-481.
SIA, E. A., S. JINKS-ROBERTSON and T. D. PETES, 1997 Genetic control of microsatellite stability. Mutat. Res. 383: 61-70.
SLEDGE, M. K., I. M. RAY and G. JIANG, 2005 An expressed sequence tag SSR map of tetraploid alfalfa (Medicago sativa L.). Theor. Appl. Genet. Aug 2: 1-13.
TAUTZ, D., 1989 Hypervariability of simple sequences as a general source for polymorphic DNA markers. Nucleic Acids Res. 17: 6463-6471.
TEMNYKH, S., G. DECLERCK, A. LUKASHOVA, L. LIPOVICH, S. CARTINHOUR et al., 2001 Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): frequency, length variation, transposon associations, and genetic marker potential. Genome Res. 11: 1441-1452.
THIEL, T., W. MICHALEK, R. K. VARSHNEY and A. GRANER, 2003 Exploiting EST databases for the development and characterization of gene-derived SSR-markers in barley (Hordeum vulgare L.). Theor. Appl. Genet. 106: 411-422.
TÓTH, G., Z. GÁSPÁRI and J. JURKA, 2000 Microsatellites in different eukaryotic genomes: survey and analysis. Genome Res. 10: 967-981.
TRINH, T. H., P. RATET, E. KONDOROSI, P. DURAND, K. KAMATÉ et al., 1998 Rapid and efficient transformation of diploid Medicago truncatula and Medicago sativa ssp. falcata lines improved in somatic embryogenesis. Plant Cell Rep. 17: 345-355.
TROTTA, E., N. E. GROSSO, M. ERBA and M. PACI, 2000 The ATT strand of AAT·ATT trinucleotide repeats adopts stable hairpin structures induced by minor groove binding ligands. Biochemistry 39: 6799-6808.
VANDENBOSCH, K. A., and G. STACEY, 2003 Summaries of legume genomics projects from around the globe. Community resources for crops and models. Plant Physiol. 131: 840-865.
VINCENT, J. L., M. R. KNOX, T. H. N. ELLIS, P. KALÓ, G. B. KISS et al., 2000 Nodule-expressed Cyp15a cysteine protease genes map to syntenic genome regions in Pisum and Medicago spp. Mol. Plant Microbe Interact. 13: 715-723.
WATSON, B. S., V. S. ASIRVATHAM, L. WANG and L. W. SUMNER, 2003 Mapping the proteome of barrel medic (Medicago truncatula). Plant Physiol. 131: 1104-1123.
WIERDL, M., M. DOMINSKA and T. D. PETES, 1997 Microsatellite instability in yeast: dependence on the length of the microsatellite. Genetics 146: 769-779.
YOUNG, N. D., S. B. CANNON, S. SATO, D. KIM, D. R. COOK et al., 2005 Sequencing the genespaces of Medicago truncatula and Lotus japonicus. Plant Physiol. 137: 1174-1181.
ZHOU, Z., M. B. CHANDRASEKHARAN and T. C. HALL, 2004 High rooting frequency and functional analysis of GUS and GFP expression in transgenic Medicago truncatula A17. New Phytol. 162: 813-822.
Communicating editor: H. A. PATERSON