Because of the huge size of the common wheat (Triticum aestivum L., 2n = 6x = 42, AABBDD) genome of 17,300 Mb, sequencing and mapping of the expressed portion is a logical first step for gene discovery. Here we report mapping of 7104 expressed sequence tag (EST) unigenes by Southern hybridization into a chromosome bin map using a set of wheat aneuploids and deletion stocks. Each EST detected a mean of 4.8 restriction fragments and 2.8 loci. More loci were mapped in the B genome (5774) than in the A (5146) or D (5179) genomes. The EST density was significantly higher for the D genome than for the A or B. In general, EST density increased with relative physical distance from the centromere. The majority of EST-dense regions are in the distal parts of chromosomes. Most of the agronomically important genes are located in EST-dense regions. The chromosome bin map of ESTs is a unique resource for SNP analysis, comparative mapping, structural and functional analysis, and polyploid evolution, as well as providing a framework for constructing a sequence-ready, BAC-contig map of the wheat genome.
THE three major cereal crops, common or bread wheat (Triticum aestivum L., 2n = 6x = 42), rice (Oryza sativa L.), and maize (Zea mays L.), not only nourish humankind but also are plant genetic models with varying degrees of genome complexity. Rice has a small genome (450 Mb; Bennett and Leitch 1995) that has been sequenced (Goff et al. 2002; Yu et al. 2002; http://rgp.dna.affrc.go.jp). Maize is a paleopolyploid model with a genome of 2500 Mb (Bennett and Leitch 1995) and sequencing of gene-rich regions is under way (Peterson et al. 2002; Yuan et al. 2002; Lunde et al. 2003; http://www.maizegenome.org). Diploid and tetraploid wheats along with common wheat form a polyploid series with genome sizes of ∼5000 (diploid), 13,000 (tetraploid), and 17,300 (hexaploid) Mb (Bennett and Leitch 1995). Wheat has emerged as a classic polyploid model. Polyploidy is a widespread evolutionary strategy in angiosperms, and research on wheat has greatly contributed to the understanding of this important phenomenon.
Because of triplication of genetic material, wheat can tolerate the loss of whole chromosomes, arms, and segments (Sears 1954, 1966; Sears and Sears 1978; Endo and Gill 1996). These cytogenetic stocks have been exploited for large-scale mapping of molecular markers into chromosomal “bins,” regions delineated by neighboring deletion breakpoints (Werner et al. 1992; Gill et al. 1993, 1996a,b; Kota et al. 1993; Hohmann et al. 1994; Delaney et al. 1995a,b; Mickelson-Young et al. 1995; Faris et al. 2000; Weng et al. 2000; Qi and Gill 2001). Concurrently, RFLP-based genetic maps were developed for the 2x, 4x, and 6x wheats and a diploid ancestor (Chao et al. 1989; K. S. Gill et al. 1991; Devos and Gale 1993; Nelson et al. 1995a,b,c; Dubcovsky et al. 1996; Jia et al. 1996; Marino et al. 1996; Cadalen et al. 1997; Blanco et al. 1998; Boyko et al. 1999; Nachit et al. 2001). In spite of these genetic resources, map-based cloning of genes has been arduous in wheat because of polyploidy and a large genome mostly consisting of repeated DNA sequences; as a result only a few successes have been reported (Faris et al. 2003; Huang et al. 2003; Yan et al. 2003). At the same time, little DNA sequence information is available for wheat to take advantage of gene discoveries in model organisms.
It is in this context that expressed sequence tag (EST) analysis has opened exciting prospects for gene discovery in all organisms, irrespective of their genome size (Adams et al. 1991; Hillier et al. 1996; Covitz et al. 1998; Ewing et al. 1999; Fernandes et al. 2002; Shoemaker et al. 2002; Van der Hoeven et al. 2002). Large-scale mapping of EST unigenes can provide valuable insights into the organization of genomes and chromosomes. Tomato (Lycopersicon esculentum Mill.) chromosomes have proximal heterochromatic and distal euchromatic regions, and preliminary data have indicated that most ESTs map to the distal euchromatic regions (Van der Hoeven et al. 2002). Using EST data, gene density can be calculated for individual chromosomes and chromosome regions by dividing the number of observed loci by the number of expected loci (on the basis of size). In humans, gene density varied from 0.56 for chromosome X to 1.74 for chromosome 19 (Deloukas et al. 1998; http://www.ncbi.nlm.nih.gov/genome/guide/human). In cereals, inter- and/or intrachromosomal variation in gene density has been documented in wheat (Gill et al. 1996a,b), rice (Wu et al. 2002), and maize (Davis et al. 1999). Obviously, EST distribution in relation to chromosome landmarks (short and long arms, euchromatin, heterochromatin, centromeres, and telomeres) and recombination is important in comparative analysis of chromosome structure and evolution, gene isolation, and targeted genome sequencing for a large genome species such as wheat. Such analysis of EST loci distribution for individual wheat chromosomes is presented in the accompanying articles in this issue (Conley et al. 2004; Hossain et al. 2004; Linkiewicz et al. 2004; Miftahudin et al. 2004; Munkvold et al. 2004; Peng et al. 2004; Randhawa et al. 2004). 
In this article, we report the genome-level comparative mapping analysis of 16,099 EST loci among the 21 chromosomes and variation in gene density among individual chromosomes, chromosomal regions, homoeologous groups, and A, B, and D genomes of hexaploid wheat. The implications of these results for polyploidy-driven evolution and especially for targeted sequencing of gene-rich regions of a polyploid genome are discussed.
MATERIALS AND METHODS
For the chromosome bin mapping of EST-specific fragments, cytogenetic stocks in the hexaploid cultivar Chinese Spring (T. aestivum) were used. The selected set includes 21 nullisomic-tetrasomic (NT) lines for mapping ESTs to individual chromosomes (Sears 1954, 1966), 24 ditelosomic (DT) lines for mapping ESTs to arms and centromeres (Sears and Sears 1978), and 101 deletion (del) lines with 119 chromosome-segment deletions for subarm mapping (Endo and Gill 1996). The fraction length (FL) value of each deletion identifies the position of the breakpoint from the centromere relative to the length of the complete arm. These stocks provide a complete coverage of the wheat genome, subdividing it into 159 chromosome bins. All the genetic stocks selected for EST mapping were cytologically and/or molecularly verified by C-banding and Southern hybridization with >500 EST clones (Qi et al. 2003). Seeds of the aneuploid and deletion stocks from the Wheat Genetics Resource Center (WGRC) were made available to 10 collaborating mapping labs (http://wheat.pw.usda.gov/NSF).
EST singleton clones and Southern hybridization:
Under the auspices of the NSF wheat EST genomics project, 117,466 cDNA clones were sequenced from 43 libraries representing a wide range of wheat tissue and organs, developmental stages, and environmental conditions. The software program phrap was used to assemble the EST contigs with the parameters penalty, −5; minmatch, 50; minscore, 100 (Lazo et al. 2004). The ESTs from each contig and unassembled ESTs were considered as singleton clones and formed the wheat unigene set. A total of 7104 EST singleton clones (unigenes) were selected for chromosome bin mapping in the present study. The amplified PCR products of these clones were prepared at the U.S. Department of Agriculture (USDA)/Agricultural Research Service (ARS) Western Regional Research Center (Albany, CA) and delivered to 10 mapping labs (http://wheat.pw.usda.gov/NSF) for Southern hybridization.
Protocols for Southern hybridization, scoring, and data analysis were standardized as much as possible during a workshop of students and/or postdoctorals from participating labs at the WGRC, Kansas State University, Manhattan, Kansas, on February 11–16, 2001. The detailed protocols are available online at http://wheat.pw.usda.gov/NSF/project/mapping_data.html and were as described in Qi et al. (2003). Five Southern blots (each with 30 lanes for a total of 150) with EcoRI-digested DNA of the cytogenetic stocks were hybridized with a single EST clone in one hybridization reaction. λDNA digested with HindIII and BstEII was used as a size marker.
EST mapping, scoring, and data verification:
An EST locus was assigned to a chromosome, an arm, or a bin according to whether its restriction fragment was present or absent in a given set of DNA lanes of a Southern blot (Figure 1). Some stocks have more than one deletion chromosome. In these cases, NT and DT lines were extremely helpful to assign a locus to a correct location. In other cases where a fragment was present in all deletion lines but absent in NT and DT lines, the fragment was assigned to a centromeric bin of the specific arm. Southern images were scored by at least two workers at one location, and all mapping data and images of autoradiographs can be accessed at the project website at http://wheat.pw.usda.gov/cgi-bin/westsql/map_locus.cgi. The coordinators of the seven homoeologous groups for the project (group 1, N. L. V. Lapitan; group 2, J. A. Anderson; group 3, M. E. Sorrells; group 4, J. P. Gustafson; group 5, J. Dubcovsky; group 6, K. S. Gill; and group 7, S. F. Kianian) further verified each set of data from all mapping labs. Only mapping data with identical scores by both the mapping labs and the coordinators were considered confirmed loci and were used in this analysis.
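The presence/absence logic described above can be illustrated with a minimal sketch. It assumes that a deletion line with breakpoint FL retains the arm from the centromere out to FL and lacks everything distal to it; the function name `assign_bin`, its arguments, and the breakpoint values are hypothetical and are not taken from the project's scoring protocol.

```python
from typing import Dict, Tuple


def assign_bin(presence: Dict[float, bool], present_in_dt: bool) -> Tuple[float, float]:
    """Return (proximal FL, distal FL) bounds of the bin holding a locus on one arm.

    presence maps each deletion breakpoint (fraction length, FL) on the arm to
    whether the restriction fragment was seen in that deletion line.
    present_in_dt is the score in the ditelosomic line that lacks this arm.
    """
    if present_in_dt:
        # The fragment survives loss of the whole arm, so it is not on this arm.
        raise ValueError("fragment does not map to this arm")
    # Lines that lost the fragment have breakpoints proximal to the locus.
    absent = [fl for fl, seen in presence.items() if not seen]
    # Lines that kept the fragment have breakpoints distal to the locus.
    present = [fl for fl, seen in presence.items() if seen]
    lower = max(absent) if absent else 0.0    # 0.0 stands for the centromere
    upper = min(present) if present else 1.0  # 1.0 stands for the telomere
    if lower >= upper:
        raise ValueError("inconsistent scores; the blot should be rechecked")
    return lower, upper


# Hypothetical arm with three deletion breakpoints.
print(assign_bin({0.35: False, 0.60: False, 0.85: True}, present_in_dt=False))
```

A fragment present in every deletion line but absent in the ditelosomic line falls between the centromere and the most proximal breakpoint, which corresponds to the centromeric-bin rule stated in the text.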
Chromosome size data on mitotic metaphase chromosomes of Chinese Spring wheat were taken from B. S. Gill et al. (1991). The distribution of EST loci was assumed to be uniform along the physical length of a chromosome. The mean number of loci per micrometer was calculated by dividing the total EST loci mapped by the total physical length (235.4 μm) of the wheat chromosome complement (B. S. Gill et al. 1991). The numbers of expected EST loci per chromosome, chromosome arm, and chromosome bin were calculated by multiplying the mean number by their physical length. The ratio of observed vs. expected EST loci was used to estimate EST (or gene) density. The chi-square goodness-of-fit test was used to detect significant differences between observed and expected numbers of loci. The total number of mapped EST loci for a whole genome was directly taken from the project website and does not always correspond to the totals of the number of mapped ESTs reported for each of the individual homoeologous groups (Conley et al. 2004; Hossain et al. 2004; Linkiewicz et al. 2004; Miftahudin et al. 2004; Munkvold et al. 2004; Peng et al. 2004; Randhawa et al. 2004). This is because analysis by genome differs in several respects from analysis by single bins within individual chromosomes in the various homoeologous groups. For example, at the genome level, if two ESTs have duplicate mapping patterns, one of them should be eliminated from the analysis; at the homoeologous-group level, both of them would be eliminated. Analysis at the genome level can include ESTs that map only to a chromosome, a chromosome centromere, or a chromosome arm. When the bins within an individual chromosome within a homoeologous group were analyzed, only those EST loci that mapped to individual bins were included. Each analysis yields a different count of total ESTs mapped.
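The density calculation described above can be sketched as follows. The totals (16,099 loci and 235.4 μm) are taken from the text; the function names and the example chromosome length are hypothetical. The two-category form of the goodness-of-fit test (loci inside vs. outside the unit, 1 d.f.) is an assumption made for illustration and may differ from the exact test setup used in the study.

```python
import math

TOTAL_LOCI = 16099       # mapped EST loci, from the text
TOTAL_LENGTH_UM = 235.4  # length of the chromosome complement in micrometers, from the text


def expected_loci(length_um: float) -> float:
    """Expected loci for a unit of the given length under a uniform distribution."""
    return TOTAL_LOCI / TOTAL_LENGTH_UM * length_um


def density_and_chi2(observed: int, length_um: float):
    """Return (observed/expected ratio, chi-square statistic, P value with 1 d.f.)."""
    exp_in = expected_loci(length_um)
    exp_out = TOTAL_LOCI - exp_in   # expected loci in the rest of the complement
    obs_out = TOTAL_LOCI - observed  # observed loci in the rest of the complement
    chi2 = (observed - exp_in) ** 2 / exp_in + (obs_out - exp_out) ** 2 / exp_out
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return observed / exp_in, chi2, p_value


# Hypothetical unit: 10.0 um long with 342 observed loci (values chosen for illustration).
ratio, chi2, p = density_and_chi2(observed=342, length_um=10.0)
print(f"mean loci per um: {expected_loci(1.0):.1f}")
print(f"density ratio: {ratio:.2f}, chi-square: {chi2:.1f}, P: {p:.2e}")
```

A ratio above 1 indicates more loci than expected for the unit's length and a ratio below 1 indicates fewer.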
RESULTS
In hexaploid wheat, mapping of EST loci to each chromosome in a homoeologous group was based on interchromosomal polymorphism. An EST locus was assigned to a chromosome bin according to presence or absence of a restriction fragment in a series of deletion lines. An example of localization of EST loci to individual bins of three chromosomes in a homoeologous group is shown in Figure 1. A total of 7104 EST unigenes were hybridized with the selected set of wheat aneuploid and deletion stocks. Of these, 741 EST probes did not provide useful information due to either high copy number or a lack of signal. For another 601 EST probes, Southern fragments could not be assigned to a chromosome or a chromosome bin for various reasons including comigrating fragments, missing DNA lanes, or poor blots. The remaining 5762 informative EST probes detected 27,658 restriction fragments. Of these, 17,761 fragments were assigned to either a chromosome or a chromosome bin by mapping laboratories. Each mapped fragment was considered a locus. The whole genome map contained 16,099 mapped EST loci that were verified by the coordinators of seven homoeologous groups and were analyzed for gene redundancy, gene density, and other structural features.
Assuming that each restriction fragment represented a single gene, the null hypothesis was that each EST unigene detected three restriction fragments, one in each genome of hexaploid wheat. However, this was not the case. The total number of restriction fragments in the three wheat genomes hybridizing with an EST unigene ranged from 1 to 24. Each EST detected a mean of 4.8 restriction fragments and 2.8 loci. About one-half of the EST probes (46%) detected 3 or 4 fragments. Twelve percent detected only 1 or 2 restriction fragments and 42% detected 5 or more restriction fragments. Some EST unigenes failed to display syntenic relationships and were chromosome or genome specific (Table 1). Eight of 14 EST unigenes that were found to be chromosome or genome specific involved B-genome chromosomes. As an example, view BE442924, a B-genome-specific EST unigene, at the web site (http://wheat.pw.usda.gov/cgi-bin/westsql/map_locus.cgi).
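As a quick check of the arithmetic behind the counts reported above, the following sketch recomputes the number of informative probes and the per-probe means. All input numbers come from the text; the variable names are ours.

```python
# Counts reported in the text.
probes_hybridized = 7104   # EST unigenes hybridized
uninformative = 741        # high copy number or no signal
unassignable = 601         # fragments could not be placed
fragments_detected = 27658
verified_loci = 16099

# Probes that yielded usable mapping information.
informative = probes_hybridized - uninformative - unassignable

# Means per informative probe.
mean_fragments = fragments_detected / informative
mean_loci = verified_loci / informative

# Shares of probes by fragment-number class (percent), as given in the text.
fragment_classes = {"1 or 2": 12, "3 or 4": 46, "5 or more": 42}

print(informative, round(mean_fragments, 1), round(mean_loci, 1))
print(sum(fragment_classes.values()))
```

The mean of 2.8 loci per probe is based on the 16,099 coordinator-verified loci rather than on the 17,761 fragments initially assigned by the mapping laboratories.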
Other ESTs detected intra- and interchromosomal duplications within a chromosome or among chromosomes within a genome (Table 2). Of the 5762 EST unigenes, 1086 (19%) hybridized to restriction fragments on chromosomes in different homoeologous groups. Within this group of 1086 EST unigenes, a majority (75%) detected loci in two homoeologous groups, 17% detected loci in three groups, and <10% detected loci in more than three groups. Some ESTs detected complicated patterns of duplications. For example, BE443755 hybridized to 17 fragments that could be assigned to 1A, 5A, all B-genome chromosomes, and 7D. Of those, nine loci were mapped on opposite arms within homoeologous groups on chromosomes 1AL-1BS and 5AS-5BL and on both arms of chromosomes 4B and 7B (Table 2; http://wheat.pw.usda.gov/cgi-bin/westsql/map_locus.cgi).
Genome and chromosome distribution of EST loci:
Of the 16,099 EST loci, 14,051 were mapped to chromosome bins and 2048 to only a chromosome or a chromosome arm. Eleven percent more loci were detected in the B genome (5774) than in the D (5179) and A (5146) genomes (Figure 2). The A and D genomes had almost the same number of loci. Analysis of the ratio of observed to expected loci in each genome (based on the physical length of the chromosomes in each genome) indicated a significantly higher gene density for the D genome and a significantly lower gene density for the A genome (P < 0.001). Among seven homoeologous groups, the numbers of EST loci observed were significantly higher than expected only for homoeologous group 2 and significantly lower than expected only for homoeologous group 6 (P < 0.005).
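The genome-level counts above can be cross-checked with a few lines of arithmetic. The numbers are taken from the text; the variable names are ours.

```python
# EST loci per genome, from the text.
loci = {"A": 5146, "B": 5774, "D": 5179}
total = sum(loci.values())

# Loci placed in bins vs. placed only on a chromosome or arm, from the text.
in_bins, chromosome_or_arm_only = 14051, 2048

# Relative excess of B-genome loci over each of the other two genomes.
b_excess = {g: loci["B"] / loci[g] - 1.0 for g in ("A", "D")}

print(total, in_bins + chromosome_or_arm_only)
for genome, excess in b_excess.items():
    print(f"B vs. {genome}: {excess:.1%} more loci")
```

Both sums should equal the 16,099 loci of the whole-genome map, and the B-genome excess comes out at roughly 11 to 12% depending on which genome is used for comparison.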
The 21 wheat chromosomes differ in absolute size and arm ratio (B. S. Gill et al. 1991). The expectation was that larger chromosomes would have a greater number of EST loci than the smaller chromosomes. Similarly, the long arms would have greater numbers of EST loci than the short arms within a chromosome. Both expectations were realized with some exceptions. Among the 21 chromosomes, 3B and 2B ranked first and second in size and also in number of EST loci, with 972 and 948 EST loci, respectively. Chromosome 1D is the smallest in size, yet chromosomes 6D (584 EST loci, ranked eighteenth on the basis of size) and 4B (612 EST loci, ranked eleventh on the basis of size) had the fewest EST loci. As a rule, the long arms had greater numbers of EST loci than the short arms (data not shown) and, among the long arms, 5BL is the longest and had the highest number of mapped EST loci (636).
Because individual chromosomes within a homoeologous group were assumed to have similar gene content, we hypothesized that the number of EST loci should not differ significantly among individual chromosomes within a homoeologous group, irrespective of their size. However, this expectation was not realized because within six of the seven homoeologous groups, B-genome chromosomes had the highest number of mapped EST loci, followed by A- or D-genome chromosomes. The only exception was in group 4, where 4B had the lowest number of loci (4A is an exceptional case that is discussed later). In comparing A- and D-genome chromosomes, D-genome chromosomes had a greater number of EST loci except in groups 5 and 6, where chromosomes 5A and 6A had a greater number of loci than 5D and 6D, respectively.
Gene density was analyzed among genomes, chromosomes, arms, and subarm regions. Assuming a uniform distribution of genes in the wheat genome, we expected 68 EST loci per micrometer for wheat chromosomes in this study. Gene density values (indicated by a ratio of numbers of EST loci observed/expected for a specified length) can be above or below 1. There were differences in gene density among A-, B-, and D-genome chromosomes. The gene density was less than one for A-genome chromosomes (except 4A and 6A) and more than one for D-genome chromosomes (except 5D and 6D). In the B-genome chromosomes, 1B, 2B, 3B, and 5B had average gene densities. Three chromosomes, 1A, 4B, and 6D, had significantly lower gene density (P < 0.001). Chromosomes 1D, 2D, 3D, and 7D had significantly higher gene densities (P < 0.001). The gene density in chromosome 2D was 1.6-fold higher than that in chromosome 4B, which had the lowest gene density.
In general, the short arms had lower gene density than the long arms. Among all chromosome arms, eight short arms (2AS, 3AS, 4BS, 5AS, 5BS, 5DS, 6BS, and 6DS) and the long arm 1AL had significantly lower gene densities (P < 0.001). Significantly higher gene densities were observed on long arms 1DL, 2BL, 2DL, 3DL, 5AL, 5BL, and 7DL (P < 0.001, Figure 3). Chromosome arms 2DL and 1DL had the highest gene density. The 1DL arm accounts for the high gene density for chromosome 1D. Among the short arms, 4BS had the lowest gene density.
Distribution of loci among chromosome bins:
Gene density along the centromere-telomere axis of individual chromosome arms was revealed by analyzing the chromosome bin map (Figure 4). In general, gene density in each chromosome increased with increasing relative physical distance from the centromere. The Pearson correlation coefficient was r = 0.57 (P < 0.0001). Most gene-poor regions were located around the centromeres. The proximal region of the 1AS arm had the lowest gene density (0.05) and the telomeric bin of 6AL had the highest gene density (4.70, P < 0.0001), a 94-fold difference. Among 159 chromosome bins delineated by deletion breakpoints, 48 bins had significantly lower than expected numbers of EST loci observed and 50 had significantly greater than expected numbers of EST loci observed (P < 0.001). The overall breakdown of bins with varying gene densities is given in Table 3. Thirty-three chromosome bins (21%) had a very low gene density with a range of 0.05–0.49. Of these, 22 bins (67%) were located around the centromeric regions. None of these bins were located in the distal 20% of the arm. Another 27 of 41 chromosome bins (66%) with ratios between 0.50 and 0.99 were located in the proximal 60% of the arm length. All but one (2BL2-0.36-0.50) of the 23 chromosome bins with a twofold or greater gene density were located in the distal 40% of the arm; 16 of 23 bins (70%) were found in the distal 20% of the arm. The extremely high EST densities observed in chromosome bins 5BL9-0.75-0.79 (6.11), 5BL14-0.75-0.76 (11.0), and 6DS6-0.99-1.00 (17.00) may be due to errors in fraction length calculations of deletions delineating those bins.
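The bin names cited above can be unpacked with a short sketch. It assumes that a name such as 5BL9-0.75-0.79 encodes the chromosome arm (5BL), the deletion number (9), and the proximal and distal fraction-length (FL) bounds of the bin; the class and function names are hypothetical, and centromeric bins named in other formats are not handled.

```python
import re
from typing import NamedTuple


class Bin(NamedTuple):
    arm: str         # e.g. "5BL"
    deletion: int    # deletion number in the bin name
    fl_start: float  # proximal FL bound
    fl_end: float    # distal FL bound

    @property
    def width(self) -> float:
        """Fraction of the arm length covered by the bin."""
        return self.fl_end - self.fl_start


_BIN_RE = re.compile(r"^([1-7][ABD][SL])(\d+)-(\d\.\d+)-(\d\.\d+)$")


def parse_bin(name: str) -> Bin:
    """Parse names of the assumed form <arm><deletion>-<FL start>-<FL end>."""
    match = _BIN_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized bin name: {name!r}")
    arm, deletion, start, end = match.groups()
    return Bin(arm, int(deletion), float(start), float(end))


def in_distal(b: Bin, fraction: float) -> bool:
    """True if the whole bin lies within the distal `fraction` of the arm."""
    return b.fl_start >= 1.0 - fraction - 1e-9


for name in ("2BL2-0.36-0.50", "5BL9-0.75-0.79", "5BL14-0.75-0.76", "6DS6-0.99-1.00"):
    b = parse_bin(name)
    print(name, f"width={b.width:.2f}", "distal 40%:", in_distal(b, 0.40))
```

Under this reading, the three bins with extreme densities each span only 1 to 4% of their arm, so a small error in a breakpoint's FL value would change the expected number of loci substantially. This is consistent with the caveat in the text about fraction-length errors.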
A total of 67 previously mapped agronomic genes were assigned to chromosome bins on the basis of bin allocation of linked markers (Figure 4; Gupta et al. 1999; Peng et al. 2000; Liu and Anderson 2003; McIntosh et al. 2003). Forty-one of the 67 genes were located in the distal 20% of individual chromosome arms. Most genes in distal bins relate to pest resistance. Only 2 of the 67 genes (3%) were mapped in the proximal 40% of the arm. The distribution of agronomic genes was consistent with the distribution pattern of EST loci; most colocalized in the EST-rich, distal chromosomal regions.
Chromosomal structural changes:
In hexaploid wheat, chromosome and arm homoeologies within sets of triplicated chromosomes are highly conserved with the exception of a translocation involving chromosomes 4A, 5A, and 7B, which is fixed in polyploid wheat (Naranjo et al. 1987; Liu et al. 1992; Devos et al. 1995; Mickelson-Young et al. 1995; Nelson et al. 1995a). In addition, a pericentric inversion occurred in chromosome 4A with arm homoeologies as 4AS = 4BL = 4DL and 4AL = 4BS = 4DS (Mickelson-Young et al. 1995). A pericentric inversion has been documented in 4B (Endo and Gill 1984; B. S. Gill et al. 1991; Friebe and Gill 1994; Mickelson-Young et al. 1995). With large-scale EST mapping, we have uncovered further structural changes in 4A and confirmed the 4B inversion and additional chromosomal structural changes in most of the B-genome chromosomes. These are described in detail in individual chromosome group articles (Conley et al. 2004; Linkiewicz et al. 2004; Miftahudin et al. 2004) and they account for some of the anomalies in gene distribution (see discussion).
DISCUSSION
ESTs have become an essential resource for gene discovery. Fifteen plant species had >50,000 ESTs each in public databases by January 30, 2004 (http://www.ncbi.nlm.nih.gov/dbEST), including wheat with the most entries (549,926). The second essential resource is the chromosome maps of ESTs where EST data can be related to genes controlling phenotypic traits (Davis et al. 1999; Wu et al. 2002). Apart from Arabidopsis and rice, where a nearly complete genome sequence is available, common wheat now is one of the most densely mapped genomes among plants and highest among all polyploid organisms. Previously, 5537 RFLP loci, several hundred SSR loci, and 2049 protein loci and genes controlling phenotypic traits were mapped in wheat (McIntosh et al. 2003). The current chromosome bin map by the NSF EST project contains 16,099 EST loci. This vast EST resource and the chromosome bin maps (Conley et al. 2004; Hossain et al. 2004; Linkiewicz et al. 2004; Miftahudin et al. 2004; Munkvold et al. 2004; Peng et al. 2004; Randhawa et al. 2004) will hasten the pace of gene discovery in wheat. These data are also useful for analyzing general trends and peculiarities of gene distribution among chromosomes and genomes as influenced by features of karyotype, polyploidy, and evolutionary forces unique to polyploid organisms.
Several features of the wheat EST map distinguished it from other plant EST maps. First, Southern hybridizations were used, thus making it possible to locate diverged duplicated genes sharing 80% or more homology. This would not have been possible using a PCR-based approach. Second, aneuploid and deletion stocks were used for allocating restriction fragments to individual chromosomes, arms, and bins. The genome coverage was 100% because a set of nullisomic-tetrasomic and several telocentric stocks were included in the analysis. For many EST clones, all restriction fragments were mapped to specific chromosome bins. The choice of EcoRI as the restriction enzyme did not appear to have any effect on the number or distribution pattern of EST loci among chromosomes and genomes as judged from the analysis of previous deletion-mapping results using other restriction enzymes (Delaney et al. 1995a,b; Mickelson-Young et al. 1995; Gill et al. 1996b; Faris et al. 2000; Qi and Gill 2001; Sandhu et al. 2001). One disadvantage of the chromosome bin map is that loci within chromosome bins cannot be ordered. However, this problem can potentially be overcome by in silico ordering of sequenced EST loci using the rice genome sequence as reported by Sorrells et al. (2003) and as is reported in detail in the individual chromosome publications (Conley et al. 2004; Hossain et al. 2004; Linkiewicz et al. 2004; Miftahudin et al. 2004; Munkvold et al. 2004; Peng et al. 2004; Randhawa et al. 2004).
Why do wheat genomes have a variable number of EST loci?
The A, B, and D genomes of the diploid progenitors of common wheat diverged from a common ancestor 1–5 million years ago (MYA; Huang et al. 2002). Because of the recent evolutionary differentiation of the A, B, and D genomes from a common ancestor, one might expect a similar gene content for these genomes. However, we have found this is not the case as the B genome had a significantly greater number of mapped EST loci than did the A and D genomes, which had similar numbers of loci (P < 0.001). In addition, the greater number of EST loci was not related to genome size as the A genome is larger than the B genome and the D genome is smaller than the A genome. Although a number of mechanisms may lead to this outcome, the most plausible explanation relates to the evolutionary history of each of the hexaploid wheat genomes. The two key parameters may be the breeding system (Akhunov et al. 2003) and age of the diploid donors. Both A and D genomes trace their origins to self-pollinating diploids T. urartu Thum. ex Gand. and Aegilops tauschii Coss., respectively. On the other hand, the B genome, having a greater number of EST loci, was derived from a species closely related to Ae. speltoides Tausch, a cross-pollinating diploid species. As a cross-pollinating species, Ae. speltoides is more diverse for RFLP variation than the self-pollinating Ae. tauschii (Dvořák et al. 1998). In situations of high polymorphism in a cross-pollinating species, it is usually difficult to distinguish between heterozygosity and a duplication event. However, on the basis of DNA sequence analyses, duplicate copies of genes Acc1, Pgk1, and X2 were postulated in Ae. speltoides as compared to single copies in the self-pollinating A- and D-genome diploids (Huang et al. 2002; Li and Gill 2002). Therefore, part of the high polymorphism of the B genome may be due to gene duplications preexisting in the diploid donor.
The age of the lineage of the diploid donors at the time of the polyploidization events may be another contributing factor. Based on a molecular clock (Huang et al. 2002), the Ae. speltoides lineage is older (ca. 4 MYA) than Ae. tauschii (ca. 2.5 MYA), and the T. urartu lineage is the youngest (ca. 1 MYA). Older lineages would be expected to have accumulated more polymorphisms including gene duplications. A faster rate of B-genome evolution may also be indicated by the fact that most of the observed chromosomal inversions and translocations involved B genome chromosomes. Akhunov et al. (2003) analyzed a subset of the EST data and documented twice as many unique loci in the B genome as compared to the A and D genomes.
Why do wheat chromosomes within homoeologous sets vary in gene density?
One factor leading to variation in gene density or number of loci is the evolutionary history of the genomes to which a specific chromosome belongs as discussed above. A second factor is individual chromosome size. The third contributing factor is structural aberrations that include unequal exchanges of genetic material within and among chromosomes. For short arms, the lowest gene density in the 4BS arm (Figure 3) is caused by an asymmetric pericentric inversion as documented by studies on chromosome morphology and C-banding patterns (Endo and Gill 1984) and confirmed by molecular mapping studies (Mickelson-Young et al. 1995). In the case of arms 4AL and 7BS, a reciprocal translocation event occurred, where unequal-size fragments were exchanged. Deletion mapping (Mickelson-Young et al. 1995) and FISH show that the 7BS segment on 4AL constitutes 25% of its arm length whereas the 5AL-specific segment on 7BS is not visible at the resolution power of a light microscope (Mukai et al. 1993; Jiang and Gill 1994).
There are additional significant differences in overall gene density in the absence of obvious structural changes. Chromosome 4B has the lowest gene density among the 21 chromosomes of wheat, which is all the more puzzling considering that B-genome chromosomes have ∼11% more loci overall than the A or D genome chromosomes do. One explanation is that 4B is a highly heterochromatic chromosome with proximal blocks of heterochromatin (B. S. Gill et al. 1991). Future work may shed light on the apparent slow pace of evolution of 4B, a member of the fast-evolving B genome.
Is polyploidy a trigger for fast genome evolution?
It is now generally accepted that the genomes of most eukaryotic organisms have experienced repeated cycles of polyploidization and diploidization (Adams et al. 2003). Polyploidy has at least two evolutionary consequences. It provides a mechanism for fixing hybrid vigor, so there may be selection pressure for maintaining active/diverse alleles at homoeologous loci. On the other hand, gene redundancy triggers diploidization where duplicated genes may be eliminated, silenced, mutated to acquire new functions, or epigenetically regulated for tissue-specific expression (Adams et al. 2003). The data presented here on the mapping of gene motifs of a random set of ESTs provided an excellent opportunity for analyzing the role of gene elimination or amplification during polyploidy-driven evolution.
Gene duplications beyond those created by polyploidy were observed for 19% of the EST unigenes, which detected loci on nonhomoeologous sets of chromosomes. Another 12% of EST clones detected only one or two fragments, indicating gene elimination. The actual deletion frequency may be lower due to comigrating fragments, as Akhunov et al. (2003) estimated it to be 6%. We do not know if these duplication/deletion events occurred before or following polyploidization. Akhunov et al. (2003) reported that of the duplication events analyzed, 83% occurred at the diploid level and 17% at the polyploid level. Of deletion events, 60% occurred at the diploid level and 40% at the polyploid level. Previously, Dubcovsky et al. (1996) showed that ∼31% of the RFLP loci in the diploid T. monococcum L. map were present more than once in the genome. A similar level of duplications (30%) was observed in the barley (Hordeum vulgare L.) RFLP maps, suggesting that a large proportion of the duplications originated before the divergence of the Triticeae species.
Several patterns of gene duplication were observed. Many ESTs detecting more than seven restriction fragments showed both intra- and interchromosomal duplications. In several cases of intrachromosomal duplications, the duplicated fragments were detected in both arms. These events may represent peculiarities of chromosome behavior during which these duplications occurred. For example, opposite arms may be juxtaposed during prophase of meiosis, providing opportunities for gene-conversion type events. The other possibility is that these kinds of ESTs may be as yet unrecognized transposon-like elements that can move within and among different chromosomes.
Prospects for wheat gene discovery and genome analysis:
This public resource offers information on sequence, map position, putative function, and often copy number of a given EST. Many EST-SSRs can be identified and mapped using this information. Similarly, single-copy EST loci will be good candidates for developing SNPs. Many agronomically useful genes have been localized to EST-rich regions in the distal ends of chromosomes and are targets for gene discovery (Faris and Gill 2002; Faris et al. 2003; Huang et al. 2003; Yan et al. 2003). Once an EST tightly linked to a target gene has been identified, it can be the start of an in silico walk using the rice genome sequence contigs. The candidate clones gleaned from comparative wheat-rice maps are in turn mapped in the target gene population and the procedure is repeated until a cosegregating clone has been identified. Each mapped EST also has a unique size fragment that will be useful in allocating BAC contigs to individual wheat chromosomes. The chromosome bin map of EST loci, chromosome-specific BAC libraries (Šafár et al. 2004), and high throughput techniques for BAC fingerprinting (Luo et al. 2003) can now be used to make a sequence-ready global BAC-contig physical map of the wheat genome. The sequencing of BAC-contig maps spanning the gene-rich regions will be especially useful for accessing those genes that may be unique to wheat. This includes the Ph1 gene controlling chromosome pairing in polyploids (Riley and Chapman 1958), genes affecting unique quality attributes such as bread-making properties (Anderson et al. 2002), durable resistance genes that underpin the green revolution such as Lr34 (Singh and Huerta-Espino 2003), and many genes controlling domestication-driven traits (Faris et al. 2003) that have made the wheat plant the most important food source for humanity.
We thank W. John Raupp for editorial assistance, along with Duane L. Wilson for the distribution of the genetic stocks and greenhouse help. This research was supported by National Science Foundation cooperative agreement no. DBI-9975989. This article is contribution no. 04-032-J from the Kansas Agricultural Experiment Station, Kansas State University, Manhattan, KS 66506-5502.
1 Present address: USDA-ARS Biosciences Research Laboratory, Fargo, ND 58105-5674.
2 Present address: Plant Breeding and Acclimatization Institute, Radzikow 05-870 Blonie, Poland.
3 Present address: Department of Agronomy, Iowa State University, Ames, IA 50014-8122.
4 Present address: Department of Plant Sciences, North Dakota State University, Fargo, ND 58105-5051.
5 Present address: Department of Crop Science, North Carolina State University, Raleigh, NC 27695.
6 Present address: Eugentech, 52 Oun-Dong, Yusong, Taeson, 305-333, Republic of Korea.
Communicating editor: J. P. Gustafson
- Received January 5, 2004.
- Accepted June 1, 2004.
- Genetics Society of America