Genetics, Vol. 162, 2007-2015, December 2002, Copyright © 2002

The Influence of Linkage and Inbreeding on Patterns of Nucleotide Sequence Diversity at Duplicate Alcohol Dehydrogenase Loci in Wild Barley (Hordeum vulgare ssp. spontaneum)

Jing-Zhong Lin(a,1), Peter L. Morrell(a), and Michael T. Clegg(a)
(a) Department of Botany and Plant Sciences, University of California, Riverside, California 92521

Corresponding author: Michael T. Clegg, University of California, Riverside, CA 92521. E-mail: michael.clegg@ucr.edu

Communicating editor: A. H. D. BROWN


*  ABSTRACT

Patterns of nucleotide sequence diversity are analyzed for three duplicate alcohol dehydrogenase loci (adh1–adh3) within a species-wide sample of 25 accessions of wild barley (Hordeum vulgare ssp. spontaneum). The adh1 and adh2 loci are tightly linked (recombination fraction <0.01) while the adh3 locus is inherited independently. Wild barley is predominantly self-fertilizing (~98%), and as a consequence, effective recombination is restricted by the extreme reduction in heterozygosity. Large reductions in effective recombination, in turn, widen the conditions for linkage to influence nucleotide sequence diversity through the action of selective sweeps or background selection. These considerations would appear to predict (1) homogeneity in patterns of nucleotide sequence diversity, especially between closely linked loci, and (2) extensive linkage disequilibrium relative to random-mating species. In contrast to these expectations, the wild barley data reveal heterogeneity in patterns of nucleotide sequence diversity and levels of linkage disequilibrium that are indistinguishable from those observed at adh1 in maize, an outbreeding grass species.


SELF-FERTILIZATION is a common form of reproduction among grass species. This mode of reproduction has profound genetic consequences, owing to its effect on genotypic frequency distributions. MENDEL (1865) first noted that recurrent self-fertilization is expected to reduce the frequency of heterozygotes by one-half per generation. A reduction in the frequency of heterozygotes reduces the effective rate of recombination, which in turn leads to wider conditions for the existence of transient or stable linkage disequilibrium (LD). To quantify this effect, WEIR and COCKERHAM (1973) presented an expression for the rate of decay of LD for linked pairs of genes in mixed-mating systems where a fraction s of zygotes arise from self-fertilizations and t (= 1 - s) arise from random outcrossing. According to calculations based on this expression, self-fertilization retards the decay of LD for unlinked genes by a factor of ~1/t. Thus, for example, a species with 98% self-fertilization would exhibit a 50-fold reduction in the rate of decay of LD between unlinked genes. On the basis of this fact and extensive empirical research, ALLARD (1975) argued that the interaction between self-fertilization and selection should favor the development of coadapted gene complexes. Furthermore, conditions for the maintenance of genetic diversity by natural selection are much more restrictive in self-fertilizing species (e.g., STEBBINS 1950; HAYMAN 1953; KIMURA and OHTA 1973). Even selectively neutral diversity is expected to be restricted owing to background selection and the associated reduction in effective population size in inbreeding species (CHARLESWORTH et al. 1993; NORDBORG 2000).
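The two quantitative statements in the paragraph above (heterozygosity halving under selfing, and the ~1/t retardation of LD decay) can be illustrated with a minimal sketch. The function names are hypothetical, and the 1/t factor is taken directly from the approximation quoted in the text rather than from the full WEIR and COCKERHAM expression.

```python
def heterozygosity_after_selfing(h0, generations):
    """Heterozygote frequency after recurrent complete selfing.

    Each generation of selfing halves the frequency of heterozygotes.
    """
    return h0 * 0.5 ** generations


def ld_decay_retardation(selfing_rate):
    """Approximate factor (~1/t) by which selfing slows LD decay
    between unlinked genes, with t = 1 - s the outcrossing rate.

    This is only the rough approximation quoted in the text.
    """
    t = 1.0 - selfing_rate
    if t <= 0.0:
        raise ValueError("outcrossing rate t must be positive")
    return 1.0 / t


# Starting from 50% heterozygotes, three generations of selfing:
print(heterozygosity_after_selfing(0.5, 3))   # 0.0625

# A species with 98% self-fertilization:
print(round(ld_decay_retardation(0.98)))      # 50
```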

Consistent with these theoretical expectations, isozyme surveys of predominantly self-fertilizing plant populations have tended to show (1) reduced levels of polymorphism compared to outcrossing species (HAMRICK and GODT 1990) and (2) higher levels of linkage disequilibrium among both linked and unlinked loci (HASTINGS 1990). It is now possible to examine patterns of LD in species-wide samples of gene sequences where physical linkage is measured in terms of a few hundred or a few thousand nucleotides. To date, most of these studies have concentrated on a few random-mating species (Drosophila, humans, and Zea mays). The most extensive data derive from studies of human genetic diversity (REICH et al. 2001; STEPHENS et al. 2001). Determining the physical distance encompassed by an LD domain in humans is an area of active research, and recent results are leading to a revision of earlier suggestions that LD is restricted to at most a few thousand base pairs. For example, REICH et al. (2001) report that the half-length for decay of LD is ~60 kb and in many cases exceeds 80 kb in major human populations. Numerous sampling issues are associated with this and similar studies (WEISS and CLARK 2002). Moreover, there is substantial heterogeneity in estimated LD domains in human populations owing to population structure and historical demographic events (ARDLIE et al. 2002).

Nevertheless, the recent measurement of LD domains is important because it establishes that substantial LD domains can exist in random-mating species. In contrast, a recent species-wide survey of 21 loci on chromosome 1 of maize estimated that LD decays to the neighborhood of zero for sites separated by 500 bp (TENAILLON et al. 2001). This suggests that LD may be extremely limited in species-wide samples from random-mating plant species.

Recent studies of the predominantly self-fertilizing plant Arabidopsis thaliana suggest that LD may extend over physical distances >150 kb (NORDBORG et al. 2002). Estimates of intragenic recombination from sequence data also provide an important indirect measure of the relative importance of recombination in self-fertilizing species. These estimates provide occasional evidence for intragenic recombination in species-wide samples from A. thaliana (e.g., KAWABE et al. 1997; KAWABE and MIYASHITA 1999; KUITTINEN and AGUADE 2000) and wild barley (Hordeum vulgare ssp. spontaneum; LIN et al. 2001).

In this article, we use a coalescent framework to analyze diversity patterns for three duplicate alcohol dehydrogenase loci (adh1, adh2, and adh3) in the genome of wild barley. Alcohol dehydrogenase (adh; EC 1.1.1.1) is an important enzyme in anaerobic metabolism and is usually encoded by a small multigene family in higher plants. All grass genomes so far investigated possess adh1 and adh2 loci. Phylogenetic analyses indicate that these two loci duplicated shortly before the origin of the grass family, ~70 million years ago (GAUT et al. 1999). A third locus, adh3, duplicated from adh2 ~16 million years ago within the lineage that includes wild barley (TRICK et al. 1988). The more ancient adh1 and adh2 genes are closely linked on chromosome 4 of barley. No recombinants have been observed between adh1 and adh2 in experimental crosses, leading to an estimate of r = 0.0 with a 95% confidence interval of 0.0 < r < 0.01 (BROWN 1980; HART et al. 1980). The more recently derived adh3 locus is freely recombining with adh1 and adh2 (HARBERD and EDWARDS 1983; TRICK et al. 1988; BROWN et al. 1989).

Wild barley is a diploid (n = 7) grass that is distributed throughout the Near East and Far East (ranging from Afghanistan and Turkmenistan through Iran and Turkey into Iraq, Israel, Syria, and Jordan; VON BOTHMER et al. 1995). Wild barley is the progenitor of domesticated barley and as such has been of long-standing interest to students of crop evolution. Isozyme studies of wild barley conducted in the 1970s indicate moderate levels of genetic diversity (e.g., BROWN et al. 1978a). Marker loci have been used to estimate outcrossing rates from a number of local populations; these average ~1.6% outcrossing or >98% self-fertilization (BROWN et al. 1978b).

In this article we analyze patterns of DNA sequence diversity and LD between the three functionally similar adh loci on the basis of a species-wide sample of 25 accessions from the natural range of wild barley. The data show heterogeneous patterns of diversity between loci. Patterns of LD for adh1 and adh2 within loci are essentially identical to those calculated from maize, an outcrossing species. Some significant LD is detected between the tightly linked adh1 and adh2 loci. LD is larger in magnitude, but still modest within loci in samples that span the geographic range of this predominantly self-fertilizing species. The modest levels of LD suggest that the randomizing forces of mutation and recombination dominate inbreeding in the determination of LD patterns at the timescales represented by this sample.


*  MATERIALS AND METHODS

Plant materials:
Twenty-five wild barley accessions were used in this study. These accessions, which were also studied for the adh1 and adh3 genes (CUMMINGS and CLEGG 1998; LIN et al. 2001), represent a species-wide sample covering the natural geographic range of wild barley with no population subsampling. Detailed sources and culture conditions for the plant samples and the genomic DNA isolation procedure were described in LIN et al. (2001).

Gene nomenclature:
Some confusion exists regarding the identification of adh2 and adh3. TRICK et al. (1988) originally cloned all three adh genes from a genomic library of barley and established on the basis of sequence similarity that one of these genes is homologous to maize adh1, thus confirming this gene as barley adh1. The adh3 duplication arose subsequent to the separation of the maize and barley lineages, so a sequence similarity argument could not be used to discriminate between adh2 and adh3. On the basis of a comparison of null isozyme mutations to the adh2 and adh3 sequences, LIN et al. (2001) suggested that the identity of adh2 and adh3 had been reversed by TRICK et al. (1988). In this article we adhere to the nomenclature established in LIN et al. (2001).

PCR and sequencing:
PCR primers were designed on the basis of the published barley adh3 sequence of TRICK et al. (1988), which we now conclude to actually correspond to the adh2 isozyme locus. The gene was amplified as two segments, 1 and 2, which overlap in exon 4. The total amplified region begins in exon 1 and extends into exon 9 (Fig 1). Templates for DNA sequencing were generated using a two-step nested primer amplification procedure. Segment 1 was first amplified with primers f1b (5'-CAAAATCTACGTAGCAACGAAC-3') and r1a (5'-GAGAGACCACAGCTGAGGACA-3'). The resulting PCR products were then used as templates to reamplify this region using two nested primers, f1d (5'-ATCCGTCTTTCTCTTCCAGC-3') and r1b (5'-GGGGTTGATCTTTGCGAG-3'). Segment 2 was initially amplified with primers f2a (5'-ACCGGCGAGTGCAAGGAC-3') and r2b (5'-CATGTACATCTCGACCACTTC-3') and then reamplified with primers f2b (5'-ATCGTGGCGTGATGATC-3') and r2a (5'-GAAGAAGGTGCCTTTCAGG-3'). PCR amplification was performed as described in LIN et al. (2001). The PCR products were cleaned using a PCR purification kit (QIAGEN, Chatsworth, CA). Nucleotide sequences were determined on both strands by direct sequencing of PCR products on an ABI377 automated sequencer (Applied Biosystems, Foster City, CA) by an outside provider (University of Maine).



Figure 1. Gene structure of adh2 in barley. The nine exons are numbered. Segments 1 and 2 were amplified with the primers indicated by arrows.

Sequence analysis:
Owing to the high level of self-fertilization in wild barley, all accessions studied were homozygous for the adh2 locus. As a consequence the sampling of accessions is tantamount to sampling haplotypes directly. The sequences of both strands of each accession were aligned to establish a consensus sequence, and an alignment for the 25 accessions was built using SEQUENCHER (Gene Codes, Ann Arbor, MI). Estimates of linkage disequilibria and their associated levels of significance were obtained using the program DnaSP (ROZAS and ROZAS 1999), and polymorphism analyses were carried out by using the program SITES (HEY and WAKELEY 1997). Tests of neutrality and determination of their associated significance levels used the programs of FU (1997). Haplotype trees were constructed using statistical parsimony (TEMPLETON et al. 1992) as implemented by TCS version 1.13 (CLEMENT et al. 2000). Sequence insertions and deletions were treated as a fifth character state. Sequences of adh1 for the same 25 accessions (CUMMINGS and CLEGG 1998) were retrieved from GenBank and reanalyzed using the methods applied to adh2 and adh3.


*  RESULTS

Nucleotide sequence polymorphism:
The sequenced adh2 region was 1980 bp, including 946 bp of coding and 1034 bp of noncoding sequence. Nineteen haplotypes are observed in the adh2 sample (Fig 2). Of the 46 polymorphic sites in the sample, 13 are synonymous, 11 are replacements, 12 are in introns, and 10 represent insertion/deletion events (Table 1). There are substantially larger numbers of singleton polymorphisms in all regions of adh2 than expected under a neutral model (Table 1).



Figure 2. Polymorphic site distribution in the adh2 regions among 25 wild barley accessions. Sites in exons are highlighted. Dots denote consensus sequences. Polymorphism types are as follows: S, synonymous; R, replacement; N, noncoding; —, deletion.


 
Table 1. Number of polymorphic sites observed and expected under a neutral model (in parentheses) at various frequencies in regions of adh2 in wild barley

Fig 2 shows the distribution and types of polymorphic sites in adh2 for the 25 accessions. There are several major haplotypes, allowing for one or two nucleotide deviations. Haplotype groups I, III, IV, and V account for more than one-half the accessions. Haplotype II and group IV appear to be recombinants. Genealogical analysis showed no evidence for geographic structuring in the adh2 sample, where the common haplotypes are found across the entire distribution of wild barley (Fig 3).



Figure 3. Statistical parsimony gene genealogy of adh1 (A) and adh2 (B) for 25 wild barley accessions as estimated by TCS version 1.13. Accession numbers are followed by the country of origin for each sample. Major adh2 haplotype groups are denoted by Roman numerals. All branches represent a single mutational step. The haplotypes with the highest outgroup probabilities are displayed as rectangles. Small, unlabeled ovals represent haplotypes inferred from mutational changes, but not sampled in this data set.

A 1-bp insertion at position 1240 for accession 24 resulted in early termination in the amino acid translation. We did not observe a null allele at the isozyme level. But with limited isozyme data at hand (not shown), the effect of this insertion is yet to be determined. There is a 365-bp insertion in intron 3 for accession 39 (not shown). The origin of this DNA fragment is unclear. It has ~42% similarity with a rice genomic DNA clone, but no nucleotide sequence similarity with any of the alcohol dehydrogenase genes in grasses reported thus far. Thus it could be an interlocus recombinant or a transposable element lacking terminal repeat sequences.

Estimates of nucleotide diversity:
Table 2 gives WATTERSON's (1975) θ estimate and TAJIMA's (1989) π estimate of nucleotide diversity and test statistics for deviation from a neutral genealogy for all three genes partitioned by intron, synonymous, and replacement sites. The diversity statistics are clearly heterogeneous with a 10-fold range from adh1 to adh3. Adh2 is intermediate with about twice the diversity observed at adh1. Moreover, the three loci show different test statistic patterns. Adh1 has significantly negative values for replacement sites and largely nonsignificant negative values for other sites. Adh2 has negative test statistics and a few are marginally significant. Adh3 has positive or nonsignificant negative values for most tests except for Tajima's D for introns. The positive values at adh3 are a result of the regional differentiation between two major sequence types observed at this locus (see DISCUSSION).
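For readers unfamiliar with the two diversity estimators named above, the following is a minimal sketch of how Watterson's θ and the pairwise diversity π are computed per base pair. It is not the SITES or DnaSP implementation. The adh2 example assumes 36 segregating nucleotide sites (the 46 polymorphic sites minus the 10 insertion/deletion events) over the full 1980 bp; both choices are assumptions made here for illustration, and the toy alignment is invented.

```python
from itertools import combinations


def watterson_theta(num_segregating, sample_size, length_bp):
    """Watterson's estimator per bp: S / a_n / L,
    where a_n = sum_{i=1}^{n-1} 1/i."""
    a_n = sum(1.0 / i for i in range(1, sample_size))
    return num_segregating / a_n / length_bp


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site (pi)
    for equal-length aligned sequences without gaps."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / len(pairs) / length


# Assumed inputs for adh2: 36 substitutions, 25 accessions, 1980 bp.
print(round(watterson_theta(36, 25, 1980), 4))   # 0.0048

# Invented toy alignment, purely to show the pi calculation.
toy = ["ACGT", "ACGA", "ACGT", "TCGT"]
print(nucleotide_diversity(toy))                 # 0.25
```

Under these assumptions the θ value agrees with the adh2 estimate quoted later in the DISCUSSION, but that agreement should be read as a consistency check only, since the published estimates may have used a different site count or alignment length.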


 
Table 2. Estimates of nucleotide diversity and test statistics for selection at adh1, adh2, and adh3 for 25 accessions of wild barley

Estimates of linkage disequilibrium and recombination:
Because adh1 and adh2 are tightly linked it is of interest to examine the data for intra- and interlocus LD. A standard χ² test (with 1 d.f.) for significant pairwise association can be calculated as NR² = ND²/[p_a(1 - p_a)q_b(1 - q_b)], where N is the sample size, D is the standard measure of linkage disequilibrium for a pair of diallelic loci, and p_a and q_b are the marginal site frequencies. The test statistic was calculated for all sites where the minority type was represented two or more times in the sample (that is, singletons were excluded from the calculations of D). A Bonferroni correction for multiple tests was applied to judge significance of the resulting test statistics. Six and 13 polymorphic sites in adh1 and adh2, respectively, satisfy this criterion, yielding 15 and 78 within-locus tests and 78 between-locus tests. Of these, 3 out of 15 tests or 20% were significant beyond the adjusted 5% level within adh1 and 26 out of 78 tests (33%) were significant within adh2. Only 4 between-locus tests were significant (5%). These analyses indicate moderate but far from complete association within loci and relatively little association between the two closely linked loci. Consistent with the statistical analysis of LD, at least two recombination events within the adh2 region are detected using HUDSON and KAPLAN's (1985) method, one between nucleotide positions 173 and 713 and another between positions 838 and 1710. The population recombination parameters were estimated to be γ = 0.000463/bp (HEY and WAKELEY 1997) and 4Nc = 0.014799/bp (HUDSON 1987). As noted above, inspection of the haplotype data also reveals evidence for several recombinant genotypes (see Fig 2). For example, accession 2 appears to be a recombinant between haplotype group I and accession 9 of haplotype group III. Haplotype group IV also appears to be a recombinant between haplotype group III and other haplotypes not detected in this sample. However, it is difficult to determine the boundary of linkage blocks and their associated level of significance due to the large number of haplotypes involved.
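The pairwise test described above can be sketched as follows. This is an illustrative reimplementation, not the DnaSP code used in the study: sites are assumed to be coded 0/1 across haplotypes, the function names are hypothetical, and the toy data are invented. The P value uses the identity that the upper tail of a χ² distribution with 1 d.f. at x equals erfc(sqrt(x/2)).

```python
import math
from itertools import combinations


def pairwise_ld(site_a, site_b):
    """Return (D, R2, chi2, p_value) for two biallelic sites coded 0/1."""
    n = len(site_a)
    p_a = sum(site_a) / n
    q_b = sum(site_b) / n
    p_ab = sum(a & b for a, b in zip(site_a, site_b)) / n
    d = p_ab - p_a * q_b
    r2 = d * d / (p_a * (1 - p_a) * q_b * (1 - q_b))
    chi2 = n * r2                                   # N * R^2, 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))      # chi-square upper tail
    return d, r2, chi2, p_value


def significant_pairs(sites, alpha=0.05):
    """Drop singleton sites, test all remaining pairs, and apply a
    Bonferroni correction. Indices refer to the retained sites."""
    usable = [s for s in sites if min(sum(s), len(s) - sum(s)) >= 2]
    pairs = list(combinations(range(len(usable)), 2))
    if not pairs:
        return []
    cutoff = alpha / len(pairs)
    return [(i, j) for i, j in pairs
            if pairwise_ld(usable[i], usable[j])[3] < cutoff]


# Number of tests implied by 6 (adh1) and 13 (adh2) usable sites.
print(math.comb(6, 2), math.comb(13, 2), 6 * 13)    # 15 78 78

# Invented toy data: 10 haplotypes, 4 sites (the last is a singleton).
toy_sites = [
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
print(significant_pairs(toy_sites))                 # [(0, 1)]
```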

The power to detect LD within and between loci is limited owing to the very low levels of polymorphism at adh1 and the weak statistical power of these tests (BROWN 1975). Thus the detection of any statistically significant LD between loci following Bonferroni correction establishes that significant levels of LD persist within loci and to a much lesser extent between the adh1 and adh2 loci. On the other hand, LD between adh2 and adh3 (tests not shown) and between adh1 and adh3 were not significant (LIN et al. 2001), lending further support to our previous finding that the identity of adh2 and adh3 had been reversed by TRICK et al. (1988).

Fig 4 plots LD as a function of distance (base pairs) for the three barley genes and for maize adh1 (measured as the squared correlation in allelic state, R², defined above). This plot contrasts patterns of R² within the barley genes to that observed for an outcrossing grass species. The patterns of LD for barley adh1 and adh2 are essentially indistinguishable from maize adh1. Barley adh3 shows very high R² across the nearly 2000-bp range of the plot. When the adh3 data are partitioned into the distinct clusters (see LIN et al. 2001), the patterns of LD become indistinguishable from the other loci (data not shown), establishing that the high LD is accounted for by the extreme population subdivision observed for adh3.



Figure 4. The relationship between linkage disequilibrium (R²) and nucleotide distance at adh1, adh2, and adh3 in wild barley and adh1 in maize. The R² values are an average of all data points within the distance intervals in 100-bp increments. Intervals with no polymorphic sites are ignored.
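The averaging described in the caption can be sketched as below. The function name and the exact bin boundaries (0 to 99 bp, 100 to 199 bp, and so on) are assumptions, since the caption does not specify them, and the input values are invented.

```python
from collections import defaultdict


def mean_r2_by_distance(pairs, bin_width=100):
    """Average R^2 within distance bins.

    pairs: iterable of (distance_bp, r2) for each pair of sites.
    Returns {bin_start_bp: mean_r2}; bins with no data are omitted,
    matching the caption's "intervals ... are ignored".
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for distance, r2 in pairs:
        start = (distance // bin_width) * bin_width
        sums[start] += r2
        counts[start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


# Invented example: two pairs fall in the first bin, none in 100-199 bp.
example = [(40, 0.8), (90, 0.4), (250, 0.1)]
binned = mean_r2_by_distance(example)
print(sorted(binned))   # [0, 200]
```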


*  DISCUSSION

Patterns of nucleotide sequence diversity are heterogeneous among loci:
To contrast the adh2 data to those of adh1 and adh3, we begin by reviewing several salient features of the adh1 data set. CUMMINGS and CLEGG (1998) determined adh1 sequence diversity for a 1362-bp region of the gene in a sample of 45 accessions of wild barley from throughout the range of the species. The total data included 786 bp of exon sequence and 576 bp of intron sequence. A partition of the polymorphic nucleotide sites into synonymous and replacement sites revealed seven replacement polymorphisms and four synonymous polymorphisms in exons (θ = 0.0032/bp for the entire 45-accession sample). All seven replacement polymorphisms were unique in the sample (each occurred in only a single accession). TAJIMA's (1989) test and analogous tests by FU and LI (1993) and FU (1997) indicated a significant excess of low-frequency amino acid polymorphisms in the sample. Silent polymorphisms in introns and synonymous polymorphisms in exons were not significant as judged by these tests. The most plausible interpretation of these data is that adh1 has been subject to a history of strong purifying selection. Low-frequency replacement polymorphisms are likely to be the result of a selection/mutation balance in small local populations where the selection intensity S < 1/Ne (and where Ne is the local population effective size).

The pattern at adh2 is similar to adh1 in demonstrating an overall excess of low-frequency polymorphisms as judged by the test statistics (Table 2). However, the pattern at adh2 also differs in several important respects. First, the level of nucleotide sequence diversity is 1.6-fold higher at adh2 (θ = 0.0048 ± 0.0008) than at adh1 (θ = 0.0029 ± 0.0008, for the subset of 25 accessions sampled for adh2 and adh3); second, the excess of low-frequency polymorphisms appears to be associated not only with replacement sites but also with intron and synonymous sites; and third, several replacement polymorphisms occur more than once in the sample (e.g., sites 173, 215, 1404, and 1710). Of particular note is the highly polymorphic site 173 associated with haplotype groups I and V. This isoleucine-to-valine substitution is not associated with any known isozyme variant.

The pattern of replacement polymorphism at adh2 stands in marked contrast to that observed at adh1 in that it is not consistent with the simple mutation/selection balance hypothesis invoked to explain the adh1 pattern. It may be that selective constraints at adh2 are relaxed relative to adh1 and these sites are more nearly neutral. This explanation is consistent with the observation that adh2 shows accelerated rates of protein evolution over the 60- to 70-million-year history of the grass family (GAUT et al. 1999). Assuming weaker selective constraints, it is noteworthy that strong purifying selection at adh1 (background selection) has not reduced θ (the effective population size) at the nearby adh2 locus to a level comparable to that observed at adh1. Put differently, despite a high level of self-fertilization and tight linkage between these two loci combined with evidence for strong purifying selection at adh1, the genealogical histories of the two loci are not strongly correlated. An alternative explanation is that the site 173 polymorphism is maintained by selection, but no compelling evidence supports this hypothesis.

LIN et al. (2001) analyzed the identical sample of 25 accessions for adh3, and these data reveal a very different pattern of polymorphism from that observed at adh1 or adh2. For adh3, the sample is composed of two dimorphic sequence types in roughly equal frequency that differed from one another at the ~2% level. The estimate of θ is more than fivefold lower at adh1 than that observed for adh3, and the test statistics are strongly positive. The estimate of the most recent common ancestor for the adh3 genealogy is ~3 million years. Several recombinants between the common sequence types were observed in the adh3 sample, and the recombination events were estimated to have occurred at 770,000 and 320,000 years ago. Most importantly, a strong geographic correlation exists with one sequence type occurring almost exclusively in the Far East and the other exclusively in the Near East. The recombinant types are found in samples from the Zagros mountain area, at the point of contact between these two regions (LIN et al. 2001). BADR et al. (2000) reported a similar geographic pattern for a Bkn-3 locus in wild barley, and limited sequence samples from at least one other locus in wild barley also suggest the existence of highly diverged sequence types (P. MORRELL and M. CLEGG, unpublished data). In contrast, no geographic correlation is evident in either the adh1 or the adh2 data. The common haplotypes observed at these loci appear to be geographically widespread (see Fig 3). Thus the genealogical history for adh3 is completely distinct from that observed at the two other adh loci, to which it is unlinked.

Patterns of linkage disequilibrium within and among adh loci:
We may conclude that each of the three duplicate adh loci is subject to the influence of different evolutionary forces, despite the effect of linkage and inbreeding. The detection of intralocus recombination within adh2 and adh3 is noteworthy in view of the >50-fold reduction in effective recombination expected in this species. This finding makes it clear that the randomizing effect of recombination operates even when inbreeding is the predominant mode of reproduction. Still LD within loci is moderate; very high LD is observed within the adh3 sequence sample owing to the two divergent sequence types that characterize the sample (LIN et al. 2001). Moreover, moderate LD is detected within adh1 and adh2, corresponding to major haplotypes. However, the levels of LD are essentially indistinguishable from those observed at the homologous adh1 locus in maize (Fig 4).

What are the possible explanations for the similarity between the maize and barley data? One potential explanation is that wild barley transitioned to self-fertilization from an outcrossing mating system relatively recently in its evolutionary history. There is no compelling evidence one way or the other with which to evaluate this hypothesis, although the closely related species H. bulbosum is self-incompatible. A second possible explanation posits biases in the measures of association that obscure real differences. Two commonly used measures of association are R² and D', where D' is defined as D/Dmax and Dmax is the maximum value of D for a given set of gene frequencies. D' has two serious drawbacks. First, it is extremely sensitive to variation in allele frequency, especially when one allele is in low frequency (as is commonly the case in this data set). This means that the variance is very large in small samples. Second, the sampling distribution of D' is unknown (see discussions in WEISS and CLARK 2002 and NORDBORG and TAVARE 2002 for a further critique of D' measures). The other common measure, R², is also a function of nucleotide frequencies and is therefore sensitive to frequency variation. However, R² does not tend to inflate the strength of association for rare polymorphic sites as extremely as D'. How might this affect the barley-maize comparison? The estimated values of θ/bp for adh1 in maize are nearly 5-fold and 10-fold greater than those for barley adh2 and adh1, respectively, reflecting both more polymorphic sites and higher frequencies at the average polymorphic site in maize (GAUT and CLEGG 1993; LIN et al. 2001). To the extent that polymorphic site frequencies are intermediate in maize, we would expect R² to be smaller for the extreme case of complete association. Such a bias would tend to inflate the magnitude of the barley LD relative to maize. Thus variation in nucleotide frequencies between the maize and barley samples does not appear to be an explanation for the virtually equivalent magnitudes of within-locus LD in maize and barley.
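The frequency sensitivity of the two measures discussed above can be made concrete with a small sketch. The function name and the example frequencies are invented for illustration. In the example, a rare variant present in 2 of 25 haplotypes always occurs on the same background allele, so D' reports complete association while R² stays low because the marginal frequencies differ.

```python
def d_prime_and_r2(p_a, q_b, p_ab):
    """Return (D', R2) from the marginal frequencies p_a and q_b and
    the frequency p_ab of haplotypes carrying both alleles."""
    d = p_ab - p_a * q_b
    # Dmax depends on the sign of D and on the marginal frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - q_b), (1 - p_a) * q_b)
    else:
        d_max = min(p_a * q_b, (1 - p_a) * (1 - q_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * q_b * (1 - q_b))
    return d_prime, r2


# Rare allele (2/25) fully nested within a common allele (12/25).
dp_rare, r2_rare = d_prime_and_r2(p_a=0.08, q_b=0.48, p_ab=0.08)
print(round(dp_rare, 3), round(r2_rare, 3))   # 1.0 0.094

# Same complete association, but with matched intermediate frequencies.
dp_mid, r2_mid = d_prime_and_r2(p_a=0.48, q_b=0.48, p_ab=0.48)
print(round(dp_mid, 3), round(r2_mid, 3))     # 1.0 1.0
```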

A third possible explanation concerns the evolutionary timescale spanned by these data. The estimated time to the most recent common ancestor for adh2 is 460,000 years based on an estimate of the nucleotide substitution rate of 3.5 × 10⁻⁹ substitutions per site per year (LIN et al. 2001). This time interval is clearly sufficient for interlocus recombinants to appear in the sample and, consequently, for a substantial uncoupling of adh2 evolution relative to adh1. This timescale is also sufficient for intralocus recombination to have had a marked randomizing effect, based on the observation of recombination between adh2 haplotypes in the sample. In considering timescale it is important to recall that a species-wide sample was analyzed in this study. The temporal history of a species-wide sample is likely to be much deeper than that for a local population. As a consequence, the expected number of recombination events may be substantial, even over distances of a few hundred nucleotides. (Of course, a past history of higher outcrossing would also enhance the number of recombination events within genes.) In contrast, local populations are ephemeral and typically have much smaller effective population sizes, so the expected number of recombination events between closely linked genes drawn from within a local population sample should be greatly reduced. Moreover, because selection is generally dependent on local environmental conditions, it is expected to reduce haplotype diversity, leading to increases in LD that may persist for considerable periods of time in self-fertilizing species. Accordingly, we expect hitchhiking effects within genomes of predominantly self-fertilizing species to be substantial at the local population level and to perturb large chromosomal regions. The global averaging associated with species-wide samples largely obscures these local effects.

A recent study of A. thaliana, which is believed to have a rate of self-fertilization of ~99%, reported LD domains that appear to extend beyond 150 kb (NORDBORG et al. 2002). This study was based on a global sample of 20 accessions and was designed to investigate larger chromosomal regions. Several sampling strategies were used, but most pertinent to this discussion was an effort to study a 250-kb region by sequencing 13 short segments scattered throughout the region. There is considerable scatter in the reported data for values of R² at very short distances, with many data points appearing to fall at or below R² = 0.2 at the shortest distances. Indeed, the distribution of R² appears to be almost uniform over the range from 0 to 100 kb and to have an average <0.2 over this range (see NORDBORG et al. 2002, Figure 1). Setting aside the question of multiple tests, significance at the 0.05 level requires NR² > 3.84, which for N = 20 implies R² > 0.192. Thus it appears that one-half or fewer of the tests would be significant at the shortest distances in the A. thaliana data set.
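The threshold arithmetic in the paragraph above follows from the NR² statistic: a single uncorrected test at the 0.05 level needs NR² to exceed the χ² critical value of 3.84 (1 d.f.), so R² must exceed 3.84/N. A minimal sketch, with a hypothetical function name:

```python
def r2_significance_threshold(sample_size, critical_value=3.84):
    """Smallest R^2 that an uncorrected chi-square test (1 d.f.,
    alpha = 0.05) would call significant for a given sample size."""
    return critical_value / sample_size


# A. thaliana sample of 20 accessions.
print(round(r2_significance_threshold(20), 3))   # 0.192

# The 25 wild barley accessions, before any Bonferroni correction.
print(round(r2_significance_threshold(25), 4))   # 0.1536
```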

The question emerges whether there is a discrepancy between the results reported here and those of NORDBORG et al. (2002). Both A. thaliana and wild barley have similar nucleotide site diversity levels, so this is unlikely to be a major factor in any differences in levels of LD between these two species. A. thaliana is believed to have experienced recent population expansions, so demographic influences might account for some differences, although little is known about the demographic history of wild barley. What is more striking in both data sets is their agreement in reporting modest LD at very short distances. We conclude that recombination is a powerful influence even in species where the rate of recombination is substantially reduced through the mating system. In contrast to species-wide patterns, strongly correlated patterns of LD may be expected within local populations where time intervals are short and opportunities for recombination are severely restricted.


*  FOOTNOTES

Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under accession nos. AY184931, AY184932, AY184933, AY184934, AY184935, AY184936, AY184937, AY184938, AY184939, AY184940, AY184941, AY184942, AY184943, AY184944, AY184945, AY184946, AY184947, AY184948, AY184949, AY184950, AY184951, AY184952, AY184953, AY184954, AY184955.
1 Present address: Shentai Genomics, Taitai Industrial Bldg., Third Floor, Shenzhen Hightech Park, Shenzhen 518057, People's Republic of China. E-mail: jzlin88{at}yahoo.com


*  ACKNOWLEDGMENTS

We thank Dr. Mary Durbin for technical assistance. We also thank the Alfred P. Sloan Foundation and National Science Foundation grant DEB-0129247 for partial support of this work.

Manuscript received June 25, 2002; Accepted for publication September 30, 2002.


*  LITERATURE CITED

ALLARD, R. W., 1975  The mating system and microevolution. Genetics 79:115-126.

ARDLIE, K. G., L. KRUGLYAK, and M. SEIELSTAD, 2002  Patterns of linkage disequilibrium in the human genome. Nat. Rev. Genet. 3:299-309.

BADR, A., K. MÜLLER, R. SCHÄFER-PREGL, H. EL RABEY, and S. EFFGEN et al., 2000  On the origin and domestication history of barley (Hordeum vulgare). Mol. Biol. Evol. 17:499-510.

BROWN, A. H. D., 1975  Sample sizes required to detect linkage disequilibrium between two or three loci. Theor. Popul. Biol. 8:184-201.

BROWN, A. H. D., 1980  Genetic basis of alcohol dehydrogenase polymorphism in Hordeum spontaneum. J. Hered. 70:127-128.

BROWN, A. H. D., E. NEVO, D. ZOHARY, and O. DAGAN, 1978a  Genetic variation in natural populations of wild barley (Hordeum spontaneum). Genetica 49:97-108.

BROWN, A. H. D., D. ZOHARY, and E. NEVO, 1978b  Outcrossing rates and heterozygosity in natural populations of Hordeum spontaneum Koch in Israel. Heredity 41:49-62.

BROWN, A. H. D., G. J. LAWRENCE, M. JENKIN, J. DOUGLASS, and E. GREGORY, 1989  Linkage drag in backcross breeding in barley. J. Hered. 80:234-239.

CHARLESWORTH, B., M. T. MORGAN, and D. CHARLESWORTH, 1993  The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289-1303.

CLEMENT, M., D. POSADA, and K. A. CRANDALL, 2000  TCS: a computer program to estimate gene genealogies. Mol. Ecol. 9:1657-1659.

CUMMINGS, M. P. and M. T. CLEGG, 1998  Nucleotide sequence diversity at the alcohol dehydrogenase 1 locus in wild barley (Hordeum vulgare ssp. spontaneum): an evaluation of the background selection hypothesis. Proc. Natl. Acad. Sci. USA 95:5637-5642.

FU, Y. X., 1997  Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915-925.

FU, Y. X. and W. H. LI, 1993  Statistical tests of neutrality of mutations. Genetics 133:693-709.

GAUT, B. S. and M. T. CLEGG, 1993  Molecular evolution of the Adh1 locus in the genus Zea. Proc. Natl. Acad. Sci. USA 90:5095-5099.

GAUT, B. S., A. S. PEEK, B. R. MORTON, and M. T. CLEGG, 1999  Patterns of genetic diversification within the Adh gene family in the grasses (Poaceae). Mol. Biol. Evol. 16:1086-1097.

HAMRICK, J. L., and M. J. W. GODT, 1990 Allozyme diversity in plant species, pp. 43–63 in Plant Population Genetics, Breeding, and Genetic Resources, edited by A. H. D. BROWN, M. T. CLEGG, A. T. KAHLER and B. S. WEIR. Sinauer, Sunderland, MA.

HARBERD, N. P. and K. J. R. EDWARDS, 1983  Further studies on the alcohol dehydrogenases in barley—evidence for a 3rd alcohol-dehydrogenase locus and data on the effect of an alcohol-dehydrogenase-1 null mutation in homozygous and in heterozygous condition. Genet. Res. 41:109-116.

HART, G. E., A. K. M. R. ISLAM, and K. W. SHEPHERD, 1980  Use of isozymes as chromosome markers in the isolation and characterization of wheat-barley chromosome addition lines. Genet. Res. 36:311-325.

HASTINGS, A., 1990 The interaction between selection and linkage in plant populations, pp. 163–180 in Plant Population Genetics, Breeding, and Genetic Resources, edited by A. H. D. BROWN, M. T. CLEGG, A. T. KAHLER and B. S. WEIR. Sinauer, Sunderland, MA.

HAYMAN, B. I., 1953  Mixed selfing and random mating when homozygotes are at a disadvantage. Heredity 7:185-192.

HEY, J. and J. WAKELEY, 1997  A coalescent estimator of the population recombination rate. Genetics 145:833-846.

HUDSON, R. R., 1987  Estimating the recombination parameter of a finite population model without selection. Genet. Res. 50:245-250.

HUDSON, R. R. and N. L. KAPLAN, 1985  Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147-164.

KAWABE, A. and N. T. MIYASHITA, 1999  DNA variation in the basic chitinase locus (ChiB) region of the wild plant Arabidopsis thaliana. Genetics 153:1445-1453.

KAWABE, A., H. INNAN, R. TERAUCHI, and N. T. MIYASHITA, 1997  Nucleotide polymorphism in the acidic chitinase locus (ChiA) region of the wild plant Arabidopsis thaliana. Mol. Biol. Evol. 14:1303-1315.

KIMURA, M., and T. OHTA, 1973 Theoretical Aspects of Population Genetics. Princeton University Press, Princeton, NJ.

KUITTINEN, H. and M. AGUADÉ, 2000  Nucleotide variation at the CHALCONE ISOMERASE locus in Arabidopsis thaliana. Genetics 155:863-872.

LIN, J.-Z., A. H. D. BROWN, and M. T. CLEGG, 2001  Heterogeneous geographic patterns of nucleotide sequence diversity between two alcohol dehydrogenase genes in wild barley (Hordeum vulgare subspecies spontaneum). Proc. Natl. Acad. Sci. USA 98:531-536.

MENDEL, G., 1865 Experiments in Plant Hybridisation, edited by J. H. BENNETT. Oliver & Boyd, London.

NORDBORG, M., 2000  Linkage disequilibrium, gene trees and selfing: an ancestral recombination graph with partial self-fertilization. Genetics 154:923-929.

NORDBORG, M. and S. TAVARE, 2002  Linkage disequilibrium: what history has to tell us. Trends Genet. 18:83-90.

NORDBORG, M., J. O. BOREVITZ, J. BERGELSON, C. C. BERRY, and J. CHORY et al., 2002  The extent of linkage disequilibrium in Arabidopsis thaliana. Nat. Genet. 30:190-193.

REICH, D. E., M. CARGILL, S. BOLK, J. IRELAND, and P. C. SABETI et al., 2001  Linkage disequilibrium in the human genome. Nature 411:199-204.

ROZAS, J. and R. ROZAS, 1999  DnaSP v. 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174-175.

STEBBINS, G. L., 1950 Variation and Evolution in Plants. Columbia University Press, New York.

STEPHENS, J. C., J. A. SCHNEIDER, D. A. TANGUAY, J. CHOI, and T. ACHARYA et al., 2001  Haplotype variation and linkage disequilibrium in 313 human genes. Science 293:489-493.

TAJIMA, F., 1989  Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.

TEMPLETON, A. R., K. A. CRANDALL, and C. F. SING, 1992  A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram estimation. Genetics 132:619-633.

TENAILLON, M. I., M. C. SAWKINS, A. D. LONG, R. L. GAUT, and J. F. DOEBLEY et al., 2001  Patterns of DNA sequence polymorphism along chromosome 1 of maize (Zea mays ssp. mays L.). Proc. Natl. Acad. Sci. USA 98:9161-9166.

TRICK, M., E. S. DENNIS, K. J. R. EDWARDS, and W. J. PEACOCK, 1988  Molecular analysis of the alcohol-dehydrogenase gene family of barley. Plant Mol. Biol. 11:147-160.

VON BOTHMER, R., N. JACOBSEN, C. BADEN, R. B. JORGENSEN and I. LINDE-LAURSEN, 1995 An Ecogeographical Study of the Genus Hordeum. International Plant Genetic Resources Institute, Rome.

WATTERSON, G. A., 1975  Number of segregating sites in genetic models without recombination. Theor. Popul. Biol. 7:256-276.

WEIR, B. S. and C. C. COCKERHAM, 1973  Mixed self and random mating at two loci. Genet. Res. 21:247-262.

WEISS, K. M. and A. G. CLARK, 2002  Linkage disequilibrium and the mapping of complex human traits. Trends Genet. 18:19-24.



