Comparative Population Genetics of the Panicoid Grasses: Sequence Polymorphism, Linkage Disequilibrium and Selection in a Diverse Sample of Sorghum bicolor
Martha T. Hamblin (a), Sharon E. Mitchell (a), Gemma M. White (1,a), Javier Gallego (2,a), Rakesh Kukatla (a), Rod A. Wing (3,b), Andrew H. Paterson (c), and Stephen Kresovich (a)
a Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853,
b Department of Agronomy, Clemson University, Clemson, South Carolina 29634-0359
c Plant Genome Mapping Laboratory, University of Georgia, Athens, Georgia 30602
Corresponding author: Stephen Kresovich, 157 Biotechnology Bldg., Cornell University, Ithaca, NY 14853. E-mail: sk20@cornell.edu
Communicating editor: M. AGUADÉ
| ABSTRACT |
|---|
Levels of genetic variation and linkage disequilibrium (LD) are critical factors in association mapping methods as well as in identification of loci that have been targets of selection. Maize, an outcrosser, has a high level of sequence variation and a limited extent of LD. Sorghum, a closely related but largely self-pollinating panicoid grass, is expected to have higher levels of LD. As a first step in estimation of population genetic parameters in sorghum, we surveyed 27 diverse S. bicolor accessions for sequence variation at a total of 29,186 bp in 95 short regions derived from genetically mapped RFLPs located throughout the genome. Consistent with its higher level of inbreeding, the extent of LD is at least severalfold greater in sorghum than in maize. Total sequence variation in sorghum is about fourfold lower than that in maize, while synonymous variation is fivefold lower, suggesting a smaller effective population size in sorghum. Because we surveyed a species-wide sample, the mating system, which primarily affects population-level diversity, may not be primarily responsible for this difference. Comparisons of polymorphism and divergence suggest that both directional and diversifying selection have played important roles in shaping variation in the sorghum genome.
IDENTIFICATION of the genetic variation underlying traits important in domestication and improvement of crops is an area of great interest to both evolutionary and applied biologists. Classical genetic approaches to this problem, such as quantitative trait loci (QTL) mapping, test for an association between a trait and a gene in experimental populations in which the numbers of segregating alleles and meioses are both small. In recent years, methods have been developed that test for such an association in population samples (i.e., groups of unrelated individuals) in which the numbers of alleles and meioses are much larger. Together, these methods provide a strategy for moving from low- to high-resolution mapping of traits, with the ultimate identification of quantitative trait nucleotides (QTNs).
Characterization of basic population genetic parameters is an essential prerequisite to any approach that analyzes variation in population samples: the power and resolution of haplotype mapping and association studies depend critically on levels of genetic variation, linkage disequilibrium (LD), and population structure. Thus, knowledge of population genetic parameters is a prerequisite to moving beyond mapping in experimental populations. Population genetic analysis can also provide a complementary approach to mapping studies by the identification of loci that have been targets of selection during the process of domestication or crop improvement. These methods can be applied to candidate genes identified through mapping or "reverse" genetics.
Mating system is an important variable in population genetics: it influences effective population size and effective rate of recombination, which in turn influence levels of genetic variation and linkage disequilibrium.
Maize (Zea mays L. ssp. mays) and sorghum (Sorghum bicolor [L.] Moench) are closely related species that differ dramatically in mating system. Together with pearl millet (Pennisetum glaucum), they show considerable synteny in their genomes.
As a first step in exploring the merits of sorghum for LD mapping and population genetic analyses, we have assessed sequence variation and LD in 95 short regions (123-444 bp) located throughout the genome, including coding and noncoding sequences. These regions, which correspond to mapped restriction fragment length polymorphism (RFLP) loci, were sequenced in a panel of 27 S. bicolor accessions representing elite inbred lines, the five races of S. bicolor ssp. bicolor (caudatum, durra, bicolor, guinea, and kafir), and three races of S. bicolor ssp. verticilliflorum (arundinaceum, aethiopicum, and verticilliflorum). Members of this panel display a wide range of geographic and phenotypic diversity. In addition, one accession of S. propinquum was sequenced at all loci to serve as an outgroup. Divergence data from S. propinquum allow inferences about differences in neutral mutation rates across the genome, and the relationship between polymorphism and divergence allows inferences about the possible role of selection in the evolution of particular loci. Identification of targets of selection may prove valuable in the search for candidate genes underlying important phenotypes.
| MATERIALS AND METHODS |
|---|
Plant material:
Accessions and their attributes are listed in Table 1. The three subspecies verticilliflorum accessions are wild sorghum; all other S. bicolor are cultivated. Five of the S. bicolor bicolor accessions were exotic lines that had been converted to day-length insensitivity and short stature by crossing to United States inbred line BTx406 followed by repeated backcrossing to the exotic parent. Leaves from one individual from each accession were harvested for extraction of DNA according to the method of Doyle and Doyle (1987).
RFLP probe sequences and primer development:
Sequence information was available for clones of PstI-digested BTx623 genomic DNA ("pSB" clones) that had been developed as RFLP probes (![]()
100 loci. In anticipation of some failures, 129 mapped RFLP loci were chosen to cover as much of the genome as possible. PCR primers were developed for these loci and tested on a panel of DNAs from four accessions: BTx3197, BTx623, RTx430, and S. propinquum. Loci that did not amplify from all four accessions were dropped from the set. Of the 102 successful loci, 96 were chosen for amplification in the larger set of 28 accessions. One locus was found to be duplicated and was discarded.
Sequencing and analysis:
PCR products were prepared for sequence analysis by treatment with exonuclease I and shrimp alkaline phosphatase. Cycle sequencing with ABI (Columbia, MD) Big Dye, followed by analysis on an ABI 3700, was performed in the Bioresource Center at Cornell University and at Clemson University. PCR primers were used as sequencing primers. Most PCR products were sequenced with both forward and reverse primers, but in the event that one reaction failed, a single-pass sequence was used.
Chromatograms were assembled into contigs for each locus using both Seqscape (ABI) and Sequencher (Gene Codes, Ann Arbor, MI) software. Our method relied on initial semiautomated identification of variation by Seqscape software (Applied Biosystems) followed by visual inspection and confirmation using Sequencher. Every single-nucleotide polymorphism (SNP) was confirmed by inspection of the chromatograms by at least two different experienced individuals. For purposes of estimating levels of polymorphism on the basis of nucleotide substitution, we removed blocks of three or more contiguous SNPs that were completely associated with each other, since these are likely to arise through insertion/deletion events rather than through nucleotide substitution.
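The block-removal rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used: the function names, the input layout (alignment position mapped to a string of alleles, one character per chromosome), and the reading of "contiguous" as adjacent alignment positions are all assumptions made here.

```python
from typing import Dict, List, Optional, Tuple


def partition(column: str) -> Tuple[int, ...]:
    """Encode which chromosomes share an allele, ignoring the allele labels.

    Two SNPs are treated as completely associated when they split the
    sample into the same groups, e.g. "AAGG" and "CCTT".
    """
    first_seen: Dict[str, int] = {}
    return tuple(first_seen.setdefault(base, len(first_seen)) for base in column)


def drop_indel_like_blocks(snps: Dict[int, str], min_block: int = 3) -> Dict[int, str]:
    """Remove runs of >= min_block adjacent, completely associated SNPs.

    snps maps alignment position -> alleles (hypothetical layout).
    """
    keep = dict(snps)
    run: List[int] = []
    positions: List[Optional[int]] = sorted(snps) + [None]  # None flushes the last run
    for pos in positions:
        extends_run = (
            bool(run)
            and pos is not None
            and pos == run[-1] + 1
            and partition(snps[pos]) == partition(snps[run[-1]])
        )
        if extends_run:
            run.append(pos)
            continue
        if len(run) >= min_block:
            for p in run:
                del keep[p]
        run = [pos] if pos is not None else []
    return keep


# Hypothetical data: positions 10-12 form a block, 20 and 21 do not.
example = {10: "AAGG", 11: "CCTT", 12: "GGAA", 20: "ACAC", 21: "TTGG"}
filtered = drop_indel_like_blocks(example)
```

In this toy input the three adjacent sites at positions 10 to 12 share one partition of the sample and are dropped, while the two sites at 20 and 21 are retained as independent substitutions.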
Although sorghum is a predominantly self-pollinating species and therefore usually homozygous at most loci, some heterozygous individuals were observed at eight loci. In these cases, the heterozygous individual was considered to have two chromosomes at that region only. With the exception of LD analysis (see below), the phase of SNPs was unimportant in our analyses. DnaSP version 3 (![]()
Each locus was tested for departure from neutrality by the method of ![]()
Linkage disequilibrium:
The program dipdat (kindly provided by R. R. Hudson) was used to estimate D' and r², measures of linkage disequilibrium, as functions of distance. This program uses the maximum-likelihood method of Hill (1974).
Fisher's exact tests of the interlocus comparisons were implemented in DnaSP. Individuals that were heterozygous at more than one site within a linkage group were eliminated from this analysis, as phase in those cases could not be inferred.
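For readers unfamiliar with the two LD measures, the following sketch computes D, D', and r² for a pair of biallelic sites. It assumes fully phased haplotypes (as is effectively the case for homozygous inbred lines) and therefore does not reproduce the maximum-likelihood estimation performed by dipdat for unphased data; the function name and input layout are hypothetical.

```python
from typing import Tuple


def ld_stats(site_a: str, site_b: str) -> Tuple[float, float, float]:
    """Return (D, D', r^2) for two biallelic sites.

    site_a[i] and site_b[i] are the alleles carried by chromosome i.
    """
    if len(site_a) != len(site_b) or not site_a:
        raise ValueError("sites must be non-empty and equally long")
    if len(set(site_a)) != 2 or len(set(site_b)) != 2:
        raise ValueError("both sites must be biallelic and polymorphic")

    n = len(site_a)
    ref_a, ref_b = site_a[0], site_b[0]          # arbitrary reference alleles
    p_a = site_a.count(ref_a) / n                # frequency of ref_a
    p_b = site_b.count(ref_b) / n                # frequency of ref_b
    p_ab = sum(x == ref_a and y == ref_b for x, y in zip(site_a, site_b)) / n

    d = p_ab - p_a * p_b                         # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max                          # D scaled by its maximum
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical site pair: only three haplotypes present, unequal frequencies.
d, d_prime, r2 = ld_stats("AAAAAATT", "CCCGGGGG")
```

The example pair illustrates why the two measures can disagree: with only three of the four possible haplotypes present, |D'| equals 1 even though r² is well below 1.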
Assignment of coding regions:
Most of the loci sequenced were anonymous genomic regions. To classify as many sites as possible by functional category, we performed database searches (blastn and blastx) to identify those regions for which there was good evidence of a transcribed open reading frame. The sequence of the surveyed region was submitted to a blastx search against the nonredundant protein database of GenBank using default parameters. Criteria were as follows:
- If the region showed a 98-100% sequence match to a S. bicolor expressed sequence tag (EST) from the CGGC database or a >95% sequence match to a Z. mays EST from the Institute for Genomic Research database or GenBank, a score of >50 in a blastx query of the protein database was sufficient to consider it a coding region. Scores only slightly >50 usually represented short stretches of high similarity.
- In the absence of a strong match in either the sorghum or the maize EST databases, it is still possible that a region encodes a rare transcript. In such cases, a region with a blastx score of >80 was required for the region to be considered coding. In most of these cases, the region also had a strong match with genomic or EST sequence from rice. An exception to this requirement was locus 640, at which polymorphisms were observed more frequently at synonymous sites than if they were occurring at random. In this case, the pattern of polymorphisms provided convincing evidence that the region codes for protein, even though the blastx score was only 75 and there was no good EST match in either maize or sorghum.
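The two criteria above amount to a simple decision rule, sketched here in Python. The record type and field names are hypothetical, and the thresholds are taken directly from the text; the manual exception made for locus 640 (blastx score of 75, supported by the pattern of polymorphism) is deliberately not encoded.

```python
from dataclasses import dataclass


@dataclass
class BlastEvidence:
    """Best-hit summary for one surveyed region (hypothetical layout)."""
    sorghum_est_identity: float  # % identity to best S. bicolor EST, 0 if none
    maize_est_identity: float    # % identity to best Z. mays EST, 0 if none
    blastx_score: float          # best blastx score against the protein database


def is_coding(ev: BlastEvidence) -> bool:
    """Apply the EST-supported (>50) or unsupported (>80) blastx threshold."""
    has_est_support = ev.sorghum_est_identity >= 98.0 or ev.maize_est_identity > 95.0
    threshold = 50.0 if has_est_support else 80.0
    return ev.blastx_score > threshold


# Hypothetical regions illustrating each branch of the rule.
with_est = BlastEvidence(sorghum_est_identity=99.0, maize_est_identity=0.0, blastx_score=62.0)
without_est = BlastEvidence(sorghum_est_identity=0.0, maize_est_identity=90.0, blastx_score=62.0)
rare_transcript = BlastEvidence(sorghum_est_identity=0.0, maize_est_identity=0.0, blastx_score=95.0)
```

Under this rule the same blastx score of 62 is accepted when an EST match exists and rejected when it does not, which is the asymmetry the criteria are meant to capture.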
| RESULTS |
|---|
Our goal was to characterize levels and patterns of sequence variation across the sorghum genome in a diverse panel of germplasm (Table 1) and to identify regions that appear to depart from average patterns. The final data set represents loci that could be amplified and successfully sequenced in our panel of 27 S. bicolor and one S. propinquum (see MATERIALS AND METHODS). Not all individuals were successfully amplified or sequenced for all loci, so the sample size varies from locus to locus, averaging 24.7 chromosomes/locus (range is 14-30). The sample size is greater than the number of accessions in a few cases because of the presence of some heterozygous individuals (see MATERIALS AND METHODS). At most loci (87), all individuals were homozygous at all sites. At 8 loci, a few individuals were heterozygous at one or more sites. Accessions BTx406, BTx3197, 152702, 267380, SC0033, and SC0155 were heterozygous at two loci, and accessions 195684, 56174, and LWA4 were heterozygous at three loci.
Total sequence diversity in S. bicolor:
Standard summary statistics of sequence variation for each locus are presented in Table 2, arranged by linkage group; LG designations follow ![]()
Only base-substitution polymorphisms are included in the statistics reported in Table 2. Although 46 loci had at least one indel, only 26 loci had indel variation polymorphic in S. bicolor. Most length variation was found between S. bicolor and S. propinquum, where it sometimes appeared to be complex and difficult to align. A total of 238 SNPs were observed in 29,186 bases surveyed, yielding an average of one SNP every 123 nucleotides in this sample. This is about one-fourth the frequency observed in a comparable sample in maize.
If the three wild sorghum accessions are removed from the sample, the number of bases surveyed increases to 29,306 while the number of segregating sites decreases to 198. Nucleotide diversity is reduced only slightly, to 0.21%, because the SNPs unique to the wild accessions are usually singletons. Removal of the wild accessions increases the average D to 0.299, indicating that alleles in cultivated S. bicolor tend to be skewed toward intermediate frequency.
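The summary statistics discussed in this section can be illustrated with a short Python sketch. It computes the number of segregating sites, nucleotide diversity (average pairwise differences per site), and Watterson's estimator from an alignment, using the standard textbook definitions; it is not the DnaSP implementation. The toy alignment and function names are hypothetical, and gaps or missing data are assumed to be absent.

```python
from itertools import combinations
from typing import List


def segregating_sites(seqs: List[str]) -> int:
    """Count alignment columns with more than one allele."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs: List[str]) -> float:
    """Average number of pairwise differences per site."""
    n, length = len(seqs), len(seqs[0])
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / n_pairs / length


def watterson_theta(seqs: List[str]) -> float:
    """S / (a_n * L), with a_n = sum of 1/i for i = 1 .. n-1."""
    n, length = len(seqs), len(seqs[0])
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a_n / length


# Hypothetical alignment of four chromosomes, ten sites.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGTTCGTAA"]
pi = nucleotide_diversity(toy)
theta_w = watterson_theta(toy)

# SNP spacing reported in the text: 29,186 bases surveyed, 238 SNPs.
bases_per_snp = 29186 / 238
```

Because singletons contribute to the count of segregating sites but little to pairwise differences, removing accessions that carry mostly singleton SNPs lowers the number of segregating sites more than it lowers nucleotide diversity, which is the pattern reported above for the wild accessions.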
Evidence for directional and diversifying selection:
Estimates of sequence diversity (π) at individual loci ranged from 0 to 1.5%. Variation in levels of diversity is expected as a consequence of evolutionary variance, sampling variance due to the small number of nucleotides surveyed per locus, and differences in neutral mutation rate among loci. The neutral mutation rate can be estimated by the amount of divergence between species, in this case S. propinquum, which varies from 0 to 9.8% and averages ~1.2% (Table 2). Polymorphism and divergence are expected to increase and decrease together across the genome when a changing neutral mutation rate underlies both phenomena, while a dramatic change in the relationship between polymorphism and divergence suggests the local effects of selection. We plotted π and divergence as a function of genetic map position across each linkage group (see Fig 1). These plots illustrate how dramatically the relationship between polymorphism and divergence can change, even at fairly closely linked loci.
To test whether differences in mutation rate alone could account for the observed differences in polymorphism, we employed the method of Hudson et al. (1987). The χ² statistic for the data set was 145.11, which has a P-value of 0.00061, and none of the 10,000 simulations had a χ² statistic that high, indicating that selection has altered patterns of polymorphism and divergence in these data. On the other hand, none of the individual cell values had a P-value <0.10, so there was not strong evidence that any particular locus had been under selection. In Table 3, we show the 10 loci that had the greatest deviation from expected values (indicated by asterisks in Fig 1). Of these 10 loci, 4 show a deficiency and 6 show an excess of polymorphism relative to divergence, suggesting that both directional and diversifying selection have played a role in sorghum evolution. When the three wild accessions are removed from the analysis, the results change very little (data not shown). We have no information about regional rates of recombination in sorghum, so the contribution of background selection (![]()
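The statement that none of the 10,000 simulations reached the observed statistic can be turned into a bound on a simulation-based P-value. The sketch below shows one common, conservative way to do this; the add-one estimator and the function name are choices made here for illustration, and the simulated values are placeholders rather than output of the multilocus test program.

```python
from typing import Sequence


def empirical_p(observed: float, simulated: Sequence[float]) -> float:
    """Conservative simulation-based P-value: (k + 1) / (n + 1).

    k is the number of simulated statistics at least as large as the
    observed one; the +1 terms count the observed data set itself.
    """
    exceed = sum(1 for s in simulated if s >= observed)
    return (exceed + 1) / (len(simulated) + 1)


# Placeholder null distribution: 10,000 simulated values, none reaching 145.11.
placeholder_null = [10.0] * 10000
p_bound = empirical_p(145.11, placeholder_null)
```

With zero exceedances in 10,000 replicates the estimator returns 1/10,001, that is, a P-value of roughly 0.0001 or smaller, which is consistent in direction with the small analytical P-value quoted in the text.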
Short-range and long-range linkage disequilibrium:
Sorghum is a predominantly self-pollinating species (estimates of outcrossing range from 2 to 35% depending on panicle type).
10% (![]()
0.5 by 400 bp. For this same set of comparisons, only 29 of 329 |D'| values were <1.0. Since none of the comparisons involve SNPs >400 bp apart, we are unable to estimate the decay of LD over longer intragenic distances. However, even in this limited data set, there is a clear contrast with maize, for which ![]()
Variation in protein-coding regions:
All loci were analyzed to determine whether there was good evidence that the sequence encodes protein and, if so, to establish the reading frame for codon-based analyses (see MATERIALS AND METHODS). Of the 29,186 nucleotides surveyed, 11,025 (38%) from 52 loci were classified as coding sequence. Since the remaining sequence could not be assumed to be noncoding, no analysis was done of noncoding sequence as a functional class. Average nucleotide diversity (π) at synonymous sites and nonsynonymous sites is 0.39 and 0.09%, respectively (estimates for each locus are provided at http://www.genetics.org/supplemental). We also estimated the average levels of θW. The average level of θW at synonymous sites is 0.34%, compared to 1.73% in maize, while the average level at nonsynonymous sites is 0.09%, compared to 0.39% in maize. The ratio of synonymous to nonsynonymous variation, 3.8, is between that of maize (4.43) and humans (2.65), both of which are smaller than that of Drosophila (8.67).
Both positive and negative selection can alter the ratio of nonsynonymous to synonymous changes. When most variation is neutral, the ratio of synonymous to nonsynonymous mutations is the same within and between species. A departure from this expectation can be detected with a 2 x 2 test of independence.
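A 2 x 2 test of this kind can be sketched as follows. Fisher's exact test is used here as one common choice of test of independence; the text does not specify which statistic was applied, and the counts shown are hypothetical, not values from this study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P-value for the table [[a, b], [c, d]].

    Rows: polymorphic within species, fixed between species.
    Columns: nonsynonymous, synonymous.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x nonsynonymous polymorphisms
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: 12 nonsynonymous and 10 synonymous polymorphisms,
# 5 nonsynonymous and 20 synonymous fixed differences.
p_value = fisher_exact_two_sided(12, 10, 5, 20)
```

A small P-value indicates that the nonsynonymous-to-synonymous ratio differs between polymorphism and divergence; the direction of the difference (excess or deficiency of amino acid polymorphism) must be read from the table itself.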
One locus, 640, is a clear outlier in this study. This locus putatively encodes a homolog of Mla1, a mildew-resistance gene characterized in barley. Disease resistance genes are known to have very rapid rates of evolution and to accumulate amino acid differences at a much higher rate than the average.
Relationships among races:
S. bicolor, having originated in eastern Africa, has been classified into five racial groups on the basis of morphology, and previous studies based on allozyme, RFLP, and simple sequence repeat variation have concluded that both geography and racial structure contribute to the genetic relationships among accessions.
| DISCUSSION |
|---|
The panicoid grass crops provide an opportunity for efficient identification of genetic variation underlying common phenotypes of agronomic interest. Correspondence of QTL locations (![]()
Sequence diversity:
This study shows that sorghum has about one-fourth the total variation of maize, from which sorghum is thought to have diverged ~16.5 million years ago.
The lower level of variation in sorghum may be due to a number of factors. First, there was a bias away from sequences with higher mutation rates, since 27 loci (21% of the 129 loci tested) that could not be amplified in S. propinquum were dropped from the study. Another possibility is that genome-wide mutation rates in sorghum are lower than those in maize. Considering replication errors alone, the fairly recent common ancestry of maize and sorghum makes this hypothesis implausible. However, the presence of duplicated genes in maize may allow for relaxed constraint and divergent evolution in paralogues, which may increase the neutral mutation rate.
Since variation is a function of both neutral mutation rate and effective population size, it is likely that sorghum has an effective population size (Ne) considerably smaller than that of maize. To what extent this simply reflects differences in census population size is difficult to say. However, because there is seldom a very good correspondence between census size and effective size, other factors must be considered. A domestication "bottleneck" may have been more severe in sorghum than in maize, which has retained ~70% of the variation present in ancestral teosinte.
The effects of self-pollination on population genetics:
Another factor affecting the difference in sequence variation between sorghum and maize may be their respective mating systems, specifically, that maize is primarily an outcrosser and sorghum is primarily self-pollinated. There is considerable theoretical work on the effects of self-pollination on population genetics. In a completely self-pollinating species, effective population size, and hence polymorphism, is reduced by half.
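The halving of effective population size under complete selfing is the limiting case of a standard result for partial selfing, sketched below. The formulas used (equilibrium inbreeding coefficient F = s/(2 - s), effective size scaled by 1/(1 + F), and effective recombination scaled by 1 - F) are the usual textbook expressions for a population with selfing rate s; the function names are hypothetical, and applying the outcrossing range quoted in this article (2 to 35%) is purely illustrative.

```python
def inbreeding_coefficient(selfing_rate: float) -> float:
    """Equilibrium inbreeding coefficient F = s / (2 - s)."""
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing rate must lie between 0 and 1")
    return selfing_rate / (2.0 - selfing_rate)


def ne_ratio(selfing_rate: float) -> float:
    """Effective size relative to an outcrossing population: 1 / (1 + F)."""
    return 1.0 / (1.0 + inbreeding_coefficient(selfing_rate))


def effective_recombination_factor(selfing_rate: float) -> float:
    """Factor by which selfing scales the effective recombination rate: 1 - F."""
    return 1.0 - inbreeding_coefficient(selfing_rate)


# Illustrative selfing rates implied by 35% and 2% outcrossing.
for s in (0.65, 0.98, 1.0):
    print(
        f"s={s:.2f}  F={inbreeding_coefficient(s):.3f}  "
        f"Ne ratio={ne_ratio(s):.3f}  "
        f"recombination factor={effective_recombination_factor(s):.3f}"
    )
```

The sketch makes the asymmetry explicit: even near-complete selfing reduces effective population size by at most one-half, whereas the effective recombination rate can fall by a much larger factor, so selfing alone is expected to affect LD far more strongly than it affects species-wide diversity.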
Several empirical studies have compared patterns of sequence variation in selfing species to those in closely related outcrossing species. In the genus Lycopersicon, ![]()
The analyses of ![]()
While one might expect comparisons to other self-pollinating species to provide some insight on the effect of mating system on levels of sequence variation, the comparisons to wild barley and Arabidopsis, which show severalfold higher levels of species-wide variation, are likely to be confounded by other factors. There are deeply diverged lineages at many loci in both these species, as well as strong geographic structure in barley, suggesting that the population histories of these species are quite different from that of cultivated sorghum. The comparison to maize is more easily interpreted, in that the two species are closely related and both have been domesticated and dispersed by humans within the last 10,000 years.
Linkage disequilibrium:
The extent to which linked sites will have a correlated evolutionary history is a function of both effective population size and recombination rate, both of which are affected by mating system.
Excess amino acid polymorphism:
Effective population size not only determines levels of neutral variation, but also affects patterns of nearly neutral variation, although this process is still not well understood.
While π for synonymous sites is >θW, there is essentially no difference between π and θW for nonsynonymous sites.
An alternative explanation is that, in this diverse group of accessions, human selection and/or local adaptation have favored different protein alleles in different environments (see below). Africa, where sorghum diversification occurred, has a particularly wide range of habitats, from humid tropics to desert.
The effects of selection on sequence variation:
Candidate genes for association studies are typically identified through integration of QTL mapping, molecular genetics, and bioinformatics approaches. Population genetic analyses can complement this strategy by identifying regions that have been subject to selection.
Selection by humans to improve the agronomic properties of crops is expected to produce characteristic signatures of selection at loci underlying those traits.
In contrast to targets of directional selection, loci that have responded to selection from local conditions may show an elevated level of diversity in a species-wide sample such as ours, although they might show reduced variation within a local population. Six of the most unusual loci in our HKA tests (Table 3) departed in the direction of excess polymorphism. Of these six, loci 1056, 1218, and 1249 have five, seven, and one nonsynonymous polymorphism(s), respectively, while coding sites were not identified in the other three loci. Interestingly, theoretical work (![]()
Our power to detect strong evidence of selection at particular loci in this study is impaired because detection of selection was not the major motivation of the study and the amount of data at any one locus is quite small. None of the departures that we identify in Table 3 is significant; they simply identify candidate regions for further investigation. Conversely, there are regions not highlighted in Table 3 for which independent evidence suggests that they may be associated with phenotypes under selection. On LG D, for example, π at loci 747 (57 cM) and 161 (59 cM) is eightfold less than average, while divergence is more than three times the average (see arrow in Fig 1). These loci are within the likelihood intervals for QTL affecting tillering, regrowth (![]()
Conclusions:
On the basis of a survey of almost 30,000 sites throughout the genome of S. bicolor, we find a frequency of SNPs about one-fourth of that observed in a comparable sample of maize accessions. There is no evidence of a skew to rare alleles; thus many of these SNPs are found in the frequency range useful for LD mapping and association studies. While the high level of intralocus LD in sorghum may prevent phenotypic differences from being attributed to individual sequence variants, interlocus LD does not appear to be so high as to reduce the utility of genome scans. Comparisons of polymorphism and divergence suggest that both directional and diversifying selection have played important roles in the evolutionary history of sorghum and that identification of the targets of that selection may provide important insights into the genetic basis of agronomically important phenotypes in the grasses and grains.
| FOOTNOTES |
|---|
Sequence data from this article have been deposited in the GenBank Popset library under accession nos. AY234336-AY234362, AY502964, AY504423, AY514060-AY514119, AY517934, and AY518080 and in the GSS library (S. propinquum data) under nos. CG993079-CG993165 and CL147585-CL147591.
1 Present address: PIE Department, IACR-Rothamsted, Harpenden, Hertfordshire AL5 2JQ, United Kingdom.
2 Present address: Facultad de Ciencias Experimentales y de la Salud, Sección de Biología Celular y Genética, Universidad San Pablo-CEU, Urb. Montepríncipe, 28668 Madrid, Spain.
3 Present address: Department of Plant Sciences, University of Arizona, Tucson, AZ 85721-0036.
| ACKNOWLEDGMENTS |
|---|
M. Tuinstra, W. Rooney, and G. Peterson provided seed; C. T. Hash provided information about accessions; Maria José Aranzana provided technical assistance; J. Hey provided a program to perform the multilocus HKA test; E. Buckler, P. Morrell, M. Aguadé, and two anonymous reviewers provided comments on the manuscript. Support for this project came from grants DBI-9872649 and 01-15903 from the National Science Foundation to A.H.P. and S.K.
Manuscript received September 25, 2003; Accepted for publication January 28, 2004.
| LITERATURE CITED |
|---|
AGUADÉ, M., 2001 Nucleotide sequence variation at two genes of the phenylpropanoid pathway, the FAH1 and F3H genes, in Arabidopsis thaliana. Mol. Biol. Evol. 18:1-9.
ALDRICH, P. R., J. DOEBLEY, K. F. SCHERTZ, and A. STEC, 1992 Patterns of allozyme variation in cultivated and wild Sorghum bicolor. Theor. Appl. Genet. 85:451-460.
BAUDRY, E., C. KERDELHUE, H. INNAN, and W. STEPHAN, 2001 Species and recombination effects on DNA variability in the tomato genus. Genetics 158:1725-1735.
BERGELSON, J., M. KREITMAN, E. A. STAHL, and D. TIAN, 2001 Evolutionary dynamics of plant R-genes. Science 292:2281-2285.
BISHOP, J. G., A. M. DEAN, and T. MITCHELL-OLDS, 2000 Rapid evolution in plant chitinases: molecular targets of selection in plant-pathogen coevolution. Proc. Natl. Acad. Sci. USA 97:5322-5327.
BOWERS, J. E., C. ABBEY, S. ANDERSON, C. CHANG, and X. DRAYE et al., 2003 A high-density genetic recombination map of sequence-tagged sites for Sorghum, as a framework for comparative structural and evolutionary genomics of tropical grains and grasses. Genetics 165:367-386.
BUSTAMANTE, C. D., R. NIELSEN, S. A. SAWYER, K. M. OLSEN, and M. D. PURUGGANAN et al., 2002 The cost of inbreeding in Arabidopsis. Nature 416:531-534.
CHARLESWORTH, B., 1998 Measures of divergence between populations and the effect of forces that reduce variability. Mol. Biol. Evol. 15:538-543.
CHARLESWORTH, B., M. T. MORGAN, and D. CHARLESWORTH, 1993 The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289-1303.
CHITTENDEN, L. M., K. F. SCHERTZ, Y. R. LIN, R. A. WING, and A. H. PATERSON, 1994 A detailed RFLP map of Sorghum bicolor x S. propinquum, suitable for high-density mapping, suggests ancestral duplication of Sorghum chromosomes or chromosomal segments. Theor. Appl. Genet. 87:925-933.
CLEGG, M. T., M. P. CUMMINGS, and M. L. DURBIN, 1997 The evolution of plant nuclear genes. Proc. Natl. Acad. Sci. USA 94:7791-7798.
CUI, Y. X., G. W. XU, C. W. MAGILL, K. F. SCHERTZ, and G. E. HART, 1995 RFLP-based assay of Sorghum bicolor (L.) Moench genetic diversity. Theor. Appl. Genet. 90:787-796.
DEU, M., D. L. D. GONZALEZ, J. C. GLASZMANN, I. DEGREMONT, and J. CHANTEREAU et al., 1994 RFLP diversity in cultivated sorghum in relation to racial differentiation. Theor. Appl. Genet. 88:838-844.
DJE, Y., M. HEUERTZ, C. LEFEBVRE, and X. VEKEMANS, 2000 Assessment of genetic diversity within and among germplasm accessions in cultivated sorghum using microsatellite markers. Theor. Appl. Genet. 100:918-925.
DOYLE, J. J. and J. L. DOYLE, 1987 A rapid DNA isolation procedure for small amounts of leaf tissue. Phytochem. Bull. 19:11-15.
EYRE-WALKER, A., R. L. GAUT, H. HILTON, D. L. FELDMAN, and B. S. GAUT, 1998 Investigation of the bottleneck leading to the domestication of maize. Proc. Natl. Acad. Sci. USA 95:4441-4446.
FAY, J. C., G. J. WYCKOFF, and C.-I WU, 2001 Positive and negative selection on the human genome. Genetics 158:1227-1234.
GALE, M. D. and K. M. DEVOS, 1998 Plant comparative genetics after 10 years. Science 282:656-659.
GAUT, B. S. and J. F. DOEBLEY, 1997 DNA sequence evidence for the segmental allotetraploid origin of maize. Proc. Natl. Acad. Sci. USA 94:6809-6814.
HILL, W. G., 1974 Estimation of linkage disequilibrium in randomly mating populations. Heredity 33:229-239.
HUDSON, R. R., M. KREITMAN, and M. AGUADÉ, 1987 A test of neutral molecular evolution based on nucleotide data. Genetics 116:153-159.
HUDSON, R. R., M. SLATKIN, and W. P. MADDISON, 1992 Estimation of levels of gene flow from DNA sequence data. Genetics 132:583-589.
KAHLER, A. L., C. O. GARDNER, and R. W. ALLARD, 1984 Nonrandom mating in experimental populations of maize Zea-Mays. Crop Sci. 24:350-354.
KIMBER, C., 2000 Origins of domesticated sorghum and its early diffusion to India and China, pp. 398 in Sorghum, edited by C. W. SMITH and R. A. FREDERIKSEN. John Wiley & Sons, New York.
KONDRASHOV, F. A., I. B. ROGOZIN, Y. I. WOLF, and E. V. KOONIN, 2002 Selection in the evolution of gene duplications. Genome Biol. 3: RESEARCH0008.
LIU, F., D. CHARLESWORTH, and M. KREITMAN, 1999 The effect of mating system differences on nucleotide diversity at the phosphoglucose isomerase locus in the plant genus Leavenworthia. Genetics 151:343-357.
LONG, A. D. and C. H. LANGLEY, 1999 The power of association studies to detect the contribution of candidate genetic loci to variation in complex traits. Genome Res. 9:720-731.
MCDONALD, J. H. and M. KREITMAN, 1991 Adaptive protein evolution at the Adh locus in Drosophila. Nature 351:652-654.
MORRELL, P. L., K. E. LUNDY, and M. T. CLEGG, 2003 Distinct geographic patterns of genetic diversity are maintained in wild barley (Hordeum vulgare ssp. spontaneum) despite migration. Proc. Natl. Acad. Sci. USA 100:10812-10817.
NEI, M., 1987 Molecular Evolutionary Genetics. Columbia University Press, New York.
NORDBORG, M., 2000 Linkage disequilibrium, gene trees and selfing: an ancestral recombination graph with partial self-fertilization. Genetics 154:923-929.
NORDBORG, M. and P. DONNELLY, 1997 The coalescent process with selfing. Genetics 146:1185-1195.
NORDBORG, M., B. B. CHARLESWORTH, and D. CHARLESWORTH, 1996 Increased levels of polymorphism surrounding selectively maintained sites in highly selfing species. Proc. R. Soc. Lond. Ser. B Biol. Sci. 263:1033-1039.
NORDBORG, M., J. O. BOREVITZ, J. BERGELSON, C. C. BERRY, and J. CHORY et al., 2002 The extent of linkage disequilibrium in Arabidopsis thaliana. Nat. Genet. 30:190-193.
OHTA, T., 1993 Pattern of nucleotide substitutions in growth hormone-prolactin gene family: a paradigm for evolution by gene duplication. Genetics 134:1271-1276.
OHTA, T., 2002 Near-neutrality in evolution of genes and gene regulation. Proc. Natl. Acad. Sci. USA 99:16134-16137.
PATERSON, A. H., Y. R. LIN, Z. LI, K. F. SCHERTZ, and J. F. DOEBLEY et al., 1995 Convergent domestication of cereal crops by independent mutations at corresponding genetic loci. Science 269:1714-1718.
POLLAK, E., 1987 On the theory of partially inbreeding finite populations. I. Partial selfing. Genetics 117:353-360.
RAFALSKI, A., 2002 Applications of single nucleotide polymorphisms in crop genetics. Curr. Opin. Plant Biol. 5:94-100.
REMINGTON, D. L., J. M. THORNSBERRY, Y. MATSUOKA, L. M. WILSON, and S. R. WHITT et al., 2001 Structure of linkage disequilibrium and phenotypic associations in the maize genome. Proc. Natl. Acad. Sci. USA 98:11479-11484.
ROONEY, W. L. and C. W. SMITH, 2000 Techniques for developing new cultivars, pp. 329-347 in Sorghum, edited by C. W. SMITH and R. A. FREDERIKSEN. John Wiley & Sons, New York.
ROZAS, J. and R. ROZAS, 1999 DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174-175.
SAVOLAINEN, O., C. H. LANGLEY, B. P. LAZZARO, and H. FREVILLE, 2000 Contrasting patterns of nucleotide polymorphism at the alcohol dehydrogenase locus in the outcrossing Arabidopsis lyrata and the selfing Arabidopsis thaliana. Mol. Biol. Evol. 17:645-655.
SCHLOSS, S. J., S. E. MITCHELL, G. M. WHITE, R. KUKATLA, and J. E. BOWERS et al., 2002 Characterization of RFLP probe sequences for gene discovery and SSR development in Sorghum bicolor (L.) Moench. Theor. Appl. Genet. 105:912-920.
SHEPARD, K. A. and M. D. PURUGGANAN, 2003 Molecular population genetics of the Arabidopsis CLAVATA2 region: the genomic scale of variation and selection in a selfing species. Genetics 163:1083-1095.
SUNYAEV, S. R., W. C. LATHE, III, V. E. RAMENSKY, and P. BORK, 2000 SNP frequencies in human genes: an excess of rare alleles and differing modes of selection. Trends Genet. 16:335-337.
TAJIMA, F., 1989 Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.
TENAILLON, M. I., M. C. SAWKINS, A. D. LONG, R. L. GAUT, and J. F. DOEBLEY et al., 2001 Patterns of DNA sequence polymorphism along chromosome 1 of maize (Zea mays ssp. mays L.). Proc. Natl. Acad. Sci. USA 98:9161-9166.
VIGOUROUX, Y., M. MCMULLEN, C. T. HITTINGER, K. HOUCHINS, and L. SCHULZ et al., 2002 Identifying genes of agronomic importance in maize by screening microsatellites for evidence of selection during domestication. Proc. Natl. Acad. Sci. USA 99:9650-9655.
WAKELEY, J., 1996 The variance of pairwise nucleotide differences in two populations with migration. Theor. Popul. Biol. 49:39-57.
WAKELEY, J. and N. ALIACAR, 2001 Gene genealogies in a metapopulation. Genetics 159:893-905.
WANG, R. L., A. STEC, J. HEY, L. LUKENS, and J. DOEBLEY, 1999 The limits of selection during maize domestication. Nature 398:236-239.
WATTERSON, G. A., 1975 On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7:256-276.
WHITE, S. E. and J. F. DOEBLEY, 1999 The molecular evolution of terminal ear1, a regulatory gene in the genus Zea. Genetics 153:1455-1462.
WHITLOCK, M. C. and N. H. BARTON, 1997 The effective size of a subdivided population. Genetics 146:427-441.
WRIGHT, S. I., B. LAUGA, and D. CHARLESWORTH, 2003 Subdivision and haplotype structure in natural populations of Arabidopsis lyrata. Mol. Ecol. 12:1247-1263.

