Genetics, Vol. 152, 1711-1722, August 1999, Copyright © 1999

A Scan for Linkage Disequilibrium Across the Human Genome

Gavin A. Huttley (a), Michael W. Smith (b), Mary Carrington (b), and Stephen J. O'Brien (a)
(a) Laboratory of Genomic Diversity, National Cancer Institute, Frederick, Maryland 21702
(b) Intramural Research Support Program, SAIC Frederick, Frederick, Maryland 21702

Corresponding author: Gavin A. Huttley, Human Genetics Group, John Curtin School of Medical Research, The Australian National University, Canberra ACT, 0200, Australia. E-mail: gavin.huttley@anu.edu.au

Communicating editor: A. G. CLARK


*  ABSTRACT

Linkage disequilibrium (LD), the tendency for alleles of linked loci to co-occur nonrandomly on chromosomal haplotypes, is an increasingly useful phenomenon for (1) revealing historic perturbation of populations including founder effects, admixture, or incomplete selective sweeps; (2) estimating elapsed time since such events based on time-dependent decay of LD; and (3) disease and phenotype mapping, particularly for traits not amenable to traditional pedigree analysis. Because few descriptions of LD for most regions of the human genome exist, we searched the human genome for the amount and extent of LD among 5048 autosomal short tandem repeat polymorphism (STRP) loci ascertained as specific haplotypes in the European CEPH mapping families. Evidence is presented indicating that ~4% of STRP loci separated by <4.0 cM are in LD. The fraction of locus pairs within these intervals that display small Fisher's exact test (FET) probabilities is directly proportional to the inverse of recombination distance between them (1/cM). The distribution of LD is nonuniform on a chromosomal scale and in a marker density-independent fashion, with chromosomes 2, 15, and 18 being significantly different from the genome average. Furthermore, a stepwise (locus-by-locus) 5-cM sliding-window analysis across 22 autosomes revealed nine genomic regions (2.2–6.4 cM), where the frequency of small FET probabilities among loci was greater than or equal to that presented by the HLA on chromosome 6, a region known to have extensive LD. Although the spatial heterogeneity of LD we detect in Europeans is consistent with the operation of natural selection, absence of a formal test for such genomic scale data prevents eliminating neutral processes as the evolutionary origin of the LD.


LINKAGE disequilibrium (LD) occurs in populations as a consequence of mutation, random genetic drift, selection of single or linked alleles, and population admixture (see HARTL and CLARK 1990). Although traditional interest in LD was in recapitulation of historic demographic and selective events, more recently the signals of LD association have been employed in identifying hereditary disease genes in populations as an adjunct to traditional pedigree mapping analysis (HASTBACKA et al. 1992; BRISCOE et al. 1994; STEPHENS et al. 1994; EWENS and SPIELMAN 1995; JORDE 1995; KAPLAN et al. 1995).

Mapping association studies explicitly depend upon the persistence of LD, which decays at a rate proportional to the recombination fraction between the two loci in LD and the number of generations, G, since the establishment of LD (EWENS 1979; HARTL and CLARK 1990). The dependence of decay in LD on the recombination fraction and G has also been exploited to estimate the time elapsed since the initial event that established LD in the ancestral population (KAPLAN et al. 1994; TISHKOFF et al. 1996; STEPHENS et al. 1998).
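The decay described above is commonly written as D_G = D_0 (1 - theta)^G, where theta is the recombination fraction. The following Python sketch illustrates that textbook relationship and its inversion to estimate elapsed generations. The function names and the example values are illustrative assumptions, not quantities estimated in this study.

```python
import math


def ld_after_generations(d0: float, theta: float, generations: float) -> float:
    """Expected LD coefficient after `generations` of random mating.

    d0    : initial LD coefficient D_0 (hypothetical starting value)
    theta : recombination fraction between the two loci (0 <= theta <= 0.5);
            for short intervals 1 cM is roughly theta = 0.01 (approximation)
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    return d0 * (1.0 - theta) ** generations


def generations_since(d0: float, d_now: float, theta: float) -> float:
    """Invert the decay formula: G = ln(D_now / D_0) / ln(1 - theta)."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    return math.log(d_now / d0) / math.log(1.0 - theta)


if __name__ == "__main__":
    # Hypothetical example: loci about 1 cM apart, 100 generations of decay.
    remaining = ld_after_generations(0.25, 0.01, 100)
    print(f"D after 100 generations: {remaining:.4f}")
    print(f"Recovered G: {generations_since(0.25, remaining, 0.01):.1f}")
```

The inversion assumes that D_0 is known or can be bounded, which in practice is itself an estimate.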

Different evolutionary origins of LD may produce different genomic patterns among selectively neutral loci. For instance, genetic drift will cause regions of LD randomly distributed across the entire genome. The number of genes in LD within a region, and thus the physical extent of LD, will depend on effective population size and the local recombination rate. Genetic drift may contribute to admixture LD, which arises when genetically differentiated populations interbreed. Admixed LD will exist between those loci that genetically distinguish, by virtue of allele frequency differences, the ancestral populations. Where the genetic differentiation arose from the operation of genetic drift in each ancestral population, the resulting LD also occurs randomly across the genome and potentially over substantial physical distances for a small number of generations (CHAKRABORTY and WEISS 1988; BRISCOE et al. 1994; STEPHENS et al. 1994).

In contrast to the unbiased distribution of LD from drift, the operation of mutation or natural selection may affect the genomic pattern of LD in a nonuniform way. While genomic regions with high mutation rates at neutral loci are expected to exhibit less LD (SLATKIN 1994), natural selection can produce very localized concentrations of LD. For example, if a rare genetic variant at a locus becomes the subject of directional positive selection, then alleles at linked neutral markers will also increase in frequency, resulting in LD among the hitchhiking loci. Such directional positive selection is frequently referred to as a selective sweep. Completion of a selective sweep is fixation of both the favored variant and its flanking genetic background, thus eliminating LD in the region. Epistatic selection among linked genes may also lead to linkage disequilibrium between flanking neutral loci, depending on the age of the interaction (LEWONTIN and KOJIMA 1960; LEWONTIN 1964; WIEHE and SLATKIN 1998). In regions of epistatic interactions, LD among neutral markers may persist if episodic fluctuations in selection are common. Thus, differential genomic patterns of LD among neutral loci are expected under various evolutionary scenarios.

For human population analysis, studies of LD have been limited by a paucity of available human markers and knowledge of their genotypic phase. Recent efforts to assess the background pattern of LD in humans have employed a small number of markers localized to specific genomic regions (PETERSON et al. 1995; LAAN and PAABO 1997). Here we analyze the extent of LD that occurs among 5048 short tandem repeat polymorphism (STRP) loci distributed over all autosomes resolved by the GÉNÉTHON gene mapping project using the European Utah and Amish Centre d'Etude du Polymorphisme Humain (CEPH) families (WEISSENBACH et al. 1992; GYAPAY et al. 1994; DIB et al. 1996). The study had two objectives: (1) assess the relationship between LD and recombination fraction (centimorgans) in the human genome; and (2) inspect the entire human genome to identify, and characterize in strength and centimorgan length, regions of remarkable LD. The first analysis involved Fisher's exact tests (FETs) for independence of all locus pairs separated by <=30 cM; the second involved a statistical procedure for quantifying clustered LD that corrects for marker density (see MATERIALS AND METHODS). The results identify considerable LD, a striking inverse proportionality between LD and recombination distance (centimorgans), and 10 chromosomal regions that display substantially elevated LD in the human genome.


*  MATERIALS AND METHODS

Data and haplotype determination:
Genotype data for 5048 STRP loci resolved by the GÉNÉTHON gene mapping project using the European Utah and Amish CEPH families 1331, 1332, 1347, 1362, 1413, 1416, and 884 (WEISSENBACH et al. 1992; GYAPAY et al. 1994; DIB et al. 1996) were obtained from http://www.genethon.fr/genethon_en.html/. Each family consists of three generations: four grandparents, two parents, and from 9 to 15 grandchildren. Although genotype data are available for an additional Venezuelan family, these data were excluded due to evidence of population admixture (BEGOVICH et al. 1992; MOONSAMY et al. 1997). It was suggested that the Amish are differentiated from the Utah families (MCLELLAN et al. 1984). However, an exact test for population substructure (RAYMOND and ROUSSET 1995), performed using 50 STR loci from chromosomes 1 and 2 separated by >=5 cM, failed to reject the null hypothesis that these families belong to the same population (P = 0.76).

Pedigree information was used to determine phase of the grandparental chromosomes for all markers in the families, producing a total of 54 independent chromosomal haplotypes. Grandparental origin of alleles in parents was readily discernible from grandparent(s)-parent combinations in ~94% of cases. For the small proportion of loci (<0.25%) that could not be resolved anywhere in a pedigree, the allele was classified as missing data. This occurred when all parent-offspring combinations were identical heterozygotes. In the remaining unresolved cases (<6%), although the grandparent(s)-parent combinations were identical heterozygotes, some parent-grandchild combinations were resolvable. To determine the grandparental phase for a single such unresolved locus within a family, one informative closely linked locus on each side of the locus of interest was identified. We define informative in this context as a locus that is segregating at least two alleles in the pedigree and for which phase was unambiguous. The subset of haplotype combinations in the pedigree that contained the grandparental alleles at the informative loci was determined. In <0.25% of cases, more than one allele was present at the phase unresolved locus in the haplotype subset. In this rare instance, the allele present on >60% of haplotypes was selected; otherwise a missing data allele was assigned (<0.13% of cases).
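The final tie-breaking rule above (accept the allele carried by more than 60% of the matching haplotypes, otherwise record missing data) can be sketched in Python as follows. The function name, the use of `None` as the missing-data code, and the example alleles are illustrative assumptions; the flanking-marker matching that selects the haplotype subset is not reproduced here.

```python
from collections import Counter

MISSING = None  # hypothetical missing-data code


def resolve_allele(candidate_alleles, threshold=0.60):
    """Pick the allele present on more than `threshold` of the haplotypes.

    candidate_alleles: alleles observed at the phase-unresolved locus on the
    haplotypes that already match the flanking informative loci.
    Returns the majority allele, or MISSING when no allele exceeds the
    threshold (or when no haplotype is available).
    """
    if not candidate_alleles:
        return MISSING
    allele, count = Counter(candidate_alleles).most_common(1)[0]
    if count / len(candidate_alleles) > threshold:
        return allele
    return MISSING


if __name__ == "__main__":
    print(resolve_allele([5, 5, 5, 7]))   # 3 of 4 haplotypes carry allele 5
    print(resolve_allele([5, 5, 7, 7]))   # no allele exceeds 60%
```

The strict inequality mirrors the ">60%" wording, so an allele on exactly 60% of haplotypes is treated as unresolved.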

Previous work has shown that the major histocompatibility complex (MHC) exhibits evidence for extensive LD among STR loci and can thus serve as a reference for the rest of the genome (CARRINGTON et al. 1998). Because the GÉNÉTHON map contains only three markers in this region (D6S291, D6S273, and D6S265), we genotyped the same CEPH families for six non-GÉNÉTHON markers (MogCA, MIB, DQCAR, G51152, TAP1CA, and RING3CA) located in the MHC (Figure 1; FOISSAC et al. 1997; MARTIN et al. 1998). GÉNÉTHON sex-averaged genetic map distances were utilized as our marker order reference for most analyses (DIB et al. 1996). The genetic map for chromosome 6 was modified for marker order around the MHC using DNA sequence data and YAC contig and radiation hybrid (RH) data from the Whitehead Institute (release 11, www-genome.wi.mit.edu/).



Figure 1. Map of the HLA complex. STRP loci are shown above, and coding genes below, the map with GÉNÉTHON markers in boxes.

Statistics:
Statistical significance determined by FETs, rather than association statistics, was used to measure LD. Historically, measuring LD has been performed for biallelic loci using the coefficient of LD D, or derivatives such as D' or r² (LEWONTIN 1964; HILL and ROBERTSON 1968), with large values of the statistics interpreted as representing significant LD. Multiallelic formulations of these statistics are also available (see, for example, HEDRICK 1987; KLITZ et al. 1995). Yet, as discussed by SLATKIN (1994), small values of D may also be associated with significant LD. Furthermore, interpreting measures of association is often problematic (PRESS et al. 1992, p. 631), and under some circumstances the distributions of the statistics can differ substantially from that assumed (HUDSON 1985; HEDRICK 1987). In contrast, the principal limitation of probabilities from significance tests for measuring LD is their sensitivity to the marginals (row and column sums) of the pairwise tables, and hence sample size (BENNETT and HSU 1960). Two issues arise from a consideration of the influence of sample size. First, while a low probability may be taken as evidence of LD, a high probability cannot be interpreted as evidence of linkage equilibrium. Second, the sensitivity to sample size suggests that loci with large numbers of alleles, and thus high heterozygosity, will have a reduced power to detect LD because of the increased likelihood of unique alleles at such loci. Contradicting this are the simulation results of SLATKIN (1994), who showed that, for sample sizes of 100 chromosomes, loci with a large number (eight) of alleles had substantially greater power to detect significant LD than biallelic loci. Interestingly, the relationship between power to detect LD and number of alleles approaches a plateau after triallelic loci (SLATKIN 1994). Therefore, because all GÉNÉTHON autosomal STR loci have >=3 alleles, the power to detect significant LD between STR loci should be approximately equivalent.
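For reference, the biallelic association statistics mentioned above can be computed as in the following Python sketch, which uses the standard definitions D = p_AB - p_A p_B, D' = D / D_max, and r² = D² / [p_A(1 - p_A) p_B(1 - p_B)]. These statistics were not used in this study; the function name and the example frequencies are illustrative assumptions.

```python
def pairwise_ld(p_ab: float, p_a: float, p_b: float):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A and allele B
    p_a  : frequency of allele A at the first locus
    p_b  : frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b
    # D_max depends on the sign of D (Lewontin's normalization).
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    # Hypothetical frequencies: two rare alleles that always occur together.
    # D is small in absolute terms even though the association is complete.
    print(pairwise_ld(p_ab=0.05, p_a=0.05, p_b=0.05))
```

The rare-allele example illustrates why a small D need not indicate weak association: D is bounded by the allele frequencies, whereas D' and r² are scaled by them.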

To assess the distribution of pairwise LD as distinct from multilocus LD, we performed FETs for independence between linked alleles of locus pairs <=30 cM apart. FETs were implemented by a Monte Carlo procedure, where the hypergeometric probability of the observed table was determined and then compared to hypergeometric probabilities calculated from 17,000 randomly shuffled tables that had the same marginals (MEHTA and PATEL 1983; GUO and THOMPSON 1992). The proportion of shuffled tables with a hypergeometric probability less than or equal to that of the observed table is taken as the probability that alleles at the loci are independent. The resulting probabilities from these tests for LD are referred to as LDp (linkage disequilibrium probability). The pseudorandom number generator ran1 (PRESS et al. 1992) was used for all permutation-based procedures. Map distance, expressed in centimorgans (cM), was determined from GÉNÉTHON sex-averaged recombination distances (DIB et al. 1996).
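The Monte Carlo procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: shuffling the alleles at one locus across haplotypes is used here as one way to generate tables with fixed marginals, Python's built-in generator stands in for ran1, and the function names and example haplotypes are hypothetical.

```python
import math
import random
from collections import Counter


def log_table_probability(pairs):
    """Log hypergeometric probability of the r x c table of allele pairs,
    conditional on its row and column sums."""
    n = len(pairs)
    rows = Counter(a for a, _ in pairs)
    cols = Counter(b for _, b in pairs)
    cells = Counter(pairs)
    lfact = lambda v: math.lgamma(v + 1)  # log(v!)
    return (sum(lfact(v) for v in rows.values())
            + sum(lfact(v) for v in cols.values())
            - lfact(n)
            - sum(lfact(v) for v in cells.values()))


def monte_carlo_fet(alleles_a, alleles_b, n_shuffles=17000, seed=0):
    """Estimate LDp: the proportion of shuffled tables that are no more
    probable than the observed table."""
    rng = random.Random(seed)
    observed = log_table_probability(list(zip(alleles_a, alleles_b)))
    shuffled_b = list(alleles_b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled_b)  # marginals are preserved by construction
        logp = log_table_probability(list(zip(alleles_a, shuffled_b)))
        if logp <= observed + 1e-9:  # small tolerance for rounding
            hits += 1
    return hits / n_shuffles


if __name__ == "__main__":
    # Hypothetical data: 54 haplotypes, three alleles, complete association.
    locus_a = [i % 3 for i in range(54)]
    locus_b = list(locus_a)
    print(monte_carlo_fet(locus_a, locus_b, n_shuffles=2000))
```

Working on the log scale avoids underflow for multiallelic tables, whose exact probabilities can be extremely small.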

There are currently no methods available to describe the spatial pattern of LD in an entire genome. Variable marker density and the proportional relationship between the likelihood of LD and interlocus distance present significant challenges to providing an accurate description of the genomic distribution of LD. Such a description must avoid identifying regions with abundant tightly linked markers as exhibiting remarkable concentrations of LD.

In an effort to provide a detailed description of LD within the human genome, a model was developed that corrects for marker density and uses measurements from the data to correct for the relationship between LD and recombination distance. For this model, locus pairs are defined as being "in LD" according to whether their LDp <= a cutoff c, where c is analogous to a multiple test correction. Although this process will misclassify some locus pairs, it simplifies the spatial analysis, and the resulting list of locus pairs provides hypotheses for subsequent empirical evaluation.

Within each 5-cM genomic region a frequency histogram of all pairwise comparisons is produced, based on interlocus distances and with a 0.5-cM bin size (Figure 2A). Within each bin the frequency of locus pairs with an LDp <= c is determined. The probability of a locus pair having an LDp <= c for a particular bin was taken as the genome-wide frequency of such pairs, e.g., a 0.1 genome frequency of LDp <= c for locus pairs within 0.5 cM is taken as the probability. The probability was estimated of observing the same, or more, locus pairs with an LDp <= c for each 5-cM region of the genome, conditioned on each region's distribution of pairwise distances. Specifically, for each distance bin i within a window w, the binomial probability p_wi of the observed or more locus pairs with LDp <= c is calculated as

p_{wi} = \sum_{j=k_i}^{n_i} \binom{n_i}{j} p_i^{j} (1 - p_i)^{n_i - j}    (1)

where k_i is the observed number of LDp <= c locus pairs within distance bin i, n_i is the total number of locus pairs for bin i within the window, and p_i is the probability of a pair having an LDp <= c for bin i. A novel window statistic ω is computed as

(2)

where N (which equals 10 for a 5-cM window) is the number of bins. Small values of ω will correspond to high densities of LD. To establish the probability of the observed ω from window w, it was compared to a distribution of ω calculated from 10⁴ random windows (referred to as ωr values) with an identical distribution of interlocus distances to window w. Random windows were produced by generating the k_i with a pseudorandom number generator (PRESS et al. 1992) using p_i and the n_i from window w over all n_i > 0. Values of ωr were determined from these windows by applying Equation 1 and Equation 2. Observed ω values smaller than all ωr values were reassessed by increasing the number of random windows to 10⁶. The frequency that ωr <= ω is taken as the probability that the window has the same, or less, abundance of LD than the genome average. To facilitate a graphical inspection of the results, the probabilities from each window in the genomic scan are transformed into χ² statistics with 1 d.f. using the standard χ² density function and an iterative procedure (PRESS et al. 1992). We subsequently refer to this test as the LD cluster test.
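Three building blocks of the LD cluster test can be sketched in Python: binning the locus pairs of a window by interlocus distance, the binomial upper-tail probability of Equation 1, and the conversion of a window probability into a χ² value with 1 d.f. The window statistic of Equation 2 is deliberately not reproduced. The function names and example inputs are illustrative assumptions, and the closed-form normal-quantile conversion is a substitute for the iterative procedure cited in the text.

```python
import math
from statistics import NormalDist


def bin_pair_counts(pairs, c=0.01, bin_width=0.5, window=5.0):
    """Count locus pairs per distance bin.

    pairs: iterable of (distance_cM, LDp) for pairs inside one window.
    Returns (n, k): total pairs and pairs with LDp <= c for each bin.
    """
    n_bins = int(round(window / bin_width))
    n = [0] * n_bins
    k = [0] * n_bins
    for distance, ldp in pairs:
        i = min(int(distance / bin_width), n_bins - 1)
        n[i] += 1
        if ldp <= c:
            k[i] += 1
    return n, k


def binomial_upper_tail(k, n, p):
    """Equation 1: probability of k or more successes in n trials."""
    return sum(math.comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(k, n + 1))


def chi2_1df_from_probability(prob):
    """Chi-square value (1 d.f.) whose upper-tail probability is `prob`.

    Uses the identity chi2_1 = z**2 for a standard normal z; requires
    0 < prob <= 1.
    """
    z = NormalDist().inv_cdf(1.0 - prob / 2.0)
    return z * z


if __name__ == "__main__":
    # Hypothetical window with three locus pairs.
    n, k = bin_pair_counts([(0.2, 0.005), (0.3, 0.5), (4.9, 0.01)])
    print(n, k)
    print(binomial_upper_tail(k[0], n[0], 0.1))
    print(chi2_1df_from_probability(0.05))
```

Bins with n_i = 0 return a tail probability of 1 here; the text states that random windows are generated over bins with n_i > 0, so a full implementation would skip empty bins.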



Figure 2. Testing for clustered LD. (a) Results of spatial distribution assessment of LD in a 5.0-cM window, defined by the STRP locus MogCA, spanning HLA. The probability of the observed or more pairs with an LDp <= 0.01 for the window is calculated using Equation 1 and Equation 2. Midpoint of 0.5-cM intervals is indicated on x-axis. (b) A total of 500 independent windows with all loci having the same heterozygosity and allele frequency distribution. (c) Same as for b, except allele frequency distributions and heterozygosity were randomly drawn from chromosome 1 STR loci.

Because the calculation of ω incorporates nonindependent observations, we verified that probabilities from the LD cluster statistic (ω) distribution were approximately uniformly distributed using randomly permuted data. The cluster test was applied to LDp values calculated from 54 randomly shuffled haplotypes under two different scenarios. First, a single locus was selected with no missing data that had, roughly, the median heterozygosity (0.72) and number of alleles (7). The allele frequency distribution at this locus was used for 10⁴ loci, evenly spaced 0.25 cM apart, to produce 500 independent 5-cM windows with 20 loci per window. After an initial shuffling of the haplotypes, LDp values were calculated for all pairwise locus comparisons within each window. ω values and their probabilities were estimated for each window using a value of c = 0.05. Using the nonparametric Kolmogorov-Smirnov (henceforth KS) test, as implemented in the SAS procedure NPAR1WAY, the distribution is not significantly different from a uniform distribution of the same size (P = 0.96; see Figure 2B for a frequency histogram of the probabilities). The second scenario differs from the first only in that the heterozygosity and allele frequency distribution per locus were allowed to vary. Allele frequency distributions were randomly selected from chromosome 1 loci to create a total of 500 independent windows as before. The distribution arising from this second analysis also did not differ significantly from uniform (P = 0.55; see Figure 2C for a frequency histogram of the probabilities). Thus, the extent of correlation between comparisons involving the same locus is not significant, and the LD cluster test probability values approximate a uniform distribution.

Population genetics theory predicts that closely linked markers will on average exhibit higher LD (and thus lower LDp values) than loosely linked markers. We test for a relationship between distance (centimorgans) and LDp using the Mantel test for matrix correspondence (MANTEL 1967; SOKAL and ROHLF 1995) on pairs separated by <=10 cM. The test compares the two paired matrices of numbers (pairwise recombination distance and pairwise LDp) to assess whether, in this case, small LDp's tend to be associated with small centimorgan distances by multiplying corresponding matrix elements and summing these products across all matrix positions. The observed statistic is then compared to those obtained by randomly shuffling the distance matrix where new positions in the matrix are randomly assigned. The frequency that the shuffled statistic was less than or equal to the observed statistic in 20,000 shufflings is taken as the probability that centimorgan distance and LDp values are independent. To avoid the bias of unresolved map order (0-cM distances), such marker pairs were assigned a distance of 0.1 cM.
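A generic Mantel permutation test can be sketched in Python as below. This is the textbook form, which permutes the locus labels of one matrix and recomputes the sum of cross products; it is an assumption that this matches the shuffling scheme used here. The tail shown (shuffled statistic >= observed, appropriate for a positive association under a plain sum of products) is likewise a choice made for this sketch, and the restriction to pairs within 10 cM is not modeled. Function names and the toy matrices are hypothetical.

```python
import random


def mantel_statistic(x, y, order):
    """Sum of products of corresponding off-diagonal elements, with the
    rows and columns of x relabeled according to `order`."""
    n = len(order)
    return sum(x[order[i]][order[j]] * y[i][j]
               for i in range(n) for j in range(i + 1, n))


def mantel_test(x, y, n_shuffles=20000, seed=0):
    """Return (observed statistic, fraction of shuffles >= observed)."""
    rng = random.Random(seed)
    identity = list(range(len(x)))
    observed = mantel_statistic(x, y, identity)
    order = identity[:]
    at_least = 0
    for _ in range(n_shuffles):
        rng.shuffle(order)  # relabel loci in one matrix only
        if mantel_statistic(x, y, order) >= observed:
            at_least += 1
    return observed, at_least / n_shuffles


if __name__ == "__main__":
    # Toy example: eight loci spaced 1 unit apart; the second matrix is
    # identical to the first, so the association is as strong as possible.
    positions = list(range(8))
    dist = [[abs(a - b) for b in positions] for a in positions]
    print(mantel_test(dist, dist, n_shuffles=2000))
```

Permuting labels rather than individual cells keeps the dependence among pairs that share a locus, which is the reason a Mantel test is preferred over an ordinary correlation test for pairwise data.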

Alternative genetic map construction:
Concordance between physical and genetic maps has been used to construct highly accurate genomic maps (e.g., see BROMAN et al. 1998). To assess the effect of mapping errors in the recombination linkage map on our results, physical RH map data (STEWART et al. 1997) were utilized to obtain alternative estimates of genetic map location. Using version 2 of the Stanford G3 panel (STEWART et al. 1997), relative RH map locations for GÉNÉTHON markers within contiguous regions (a group of >=6 unambiguously linked markers) were determined. RH map locations were plotted against each marker's corresponding GÉNÉTHON map locations. A best-fit line was determined with a parsimonious choice of at most three parameters (X, X², and log X) chosen by eye, and then used to predict the alternative, regression-based, genetic map locations for each marker. Linear regression was performed using the GLM procedure of SAS. The sample regressions presented in Figure 3 illustrate the variable relationship between recombination rates and physical distance and show a high degree of concordance in map order. Markers whose map positions were outside the 95% confidence interval of the best-fit line were not considered further. The alternative estimates, obtained for 1438 of 5048 loci, take into account recombination estimates over larger regions and permit estimation of centimorgan distance between markers unresolved on the recombination linkage map.



Figure 3. An alternative map using concordance between physical and genetic maps. Presented are four representative sample regressions from portions of chromosomes 3 (a), 3 (b), 11 (c), and 18 (d).


*  RESULTS

Evidence for linkage disequilibrium in Europeans:
The results of FETs for locus pair independence that were used as an index of LD for 228,955 locus pairs are presented in Figure 4 as a function of recombination distance (centimorgans). In Figure 4, a and b, we present a frequency histogram of pairs with a test outcome of LDp <= 0.05. In accordance with population genetics theory, the percentage of pairs with LDp (P value for departure from allele independence) values in this range is highest in the shortest intervals, 0–0.5, 0.5–1.0, and 1.0–1.5 cM. Moreover, in the interval 0–0.5 cM, the majority of these LDp <= 0.05 pairs exhibit LDp values <= 0.01 (Figure 4B).



Figure 4. Relationship between LD and recombination distance. (a) Histogram showing frequency (percentage) of locus pairs with small LDp values. The midpoint of each 0.5-cM bin is listed on the x-axis. (b) Same as for a, but restricted to loci within 5 cM of each other. (c) Plot, equation, and statistics for the percentage of locus pairs with LDp <= 0.01 vs. 1/cM, where cM values are the interval midpoints from a. Data are from the recombination linkage map (R.M.) and the alternative map (A.M.). (d) Centimorgan distance between STRP markers and average LDp values between markers within each centimorgan interval. Mantel P, probability for the relationship between LDp and distance from the Mantel test for pairs within 10 cM.

The pattern of LDp vs. centimorgan distance between STRP loci within short (<=3.5-cM) intervals prescribes a linear relationship between the percentage of pairs with small LDp values and the inverse of centimorgan distance (1/cM) between test loci (Figure 4C). For 0.5-cM intervals from 0 to 3.5 cM, the relationship is highly significant (r² > 0.99; P < 10⁻⁶ for LDp <= 0.01; Figure 4C), suggesting a strong proportionality of centimorgans and LDp for loci <=3.5 cM apart. This result is not dependent upon the influential point at 4 cM, because its removal has little impact on the regression relationship (r² > 0.97; P < 10⁻³). To assess the potential confounding effect of map errors in the recombination linkage map on this relationship, we analyzed alternative values predicted from concordance between GÉNÉTHON and Stanford RH maps as described in MATERIALS AND METHODS and illustrated in Figure 3. The alternative estimates are based on the physically mapped markers and provide order and non-0-cM estimates between markers unresolved on the GÉNÉTHON map. The reanalysis (Figure 4C) affirms the proportionality of inverse centimorgans to the likelihood of LD.

A plot of mean LDp for locus pairs separated by discrete (1-cM) recombination distances is presented (Figure 4D). A relationship between the centimorgan distance separating a locus pair and their corresponding LDp value was tested for using the Mantel test (MANTEL 1967; SOKAL and ROHLF 1995). Performing this analysis on loci separated by <=10 cM (82,846 locus pairs) revealed a significant correlation between LDp value and centimorgan distance (P < 0.001). To assess the limit of this correlation and thus the limit of LD in Europeans, the GÉNÉTHON dataset was titrated to successively exclude locus pairs in the 0–1.0, 0–2.0, 0–3.0, etc., ranges until the Mantel test yielded probabilities that were >0.05. This analysis provides the conservative estimate that, for samples with N = 54, the upper threshold for correlation between centimorgans and LDp is 4.0 cM.

Because low P values for multiple statistical tests can result from chance alone (as opposed to LD), a multiple test correction is necessary. However, due to the large number of FETs performed, a Bonferroni-based estimate of c (the LDp cutoff) is much less than the resolution of the Monte Carlo FET. Accordingly, an alternative method was employed to obtain an estimate of c below which the majority of locus pairs are likely to be in LD. Specifically, the linear relationship of LDp vs. 1/cM (Figure 4C) was used to identify the range of LDp values for which the majority of pairs were in authentic LD. By titrating probabilities in 0.01 P-value intervals, it was determined that while the percentage of locus pairs with LDp <= 0.01 is highly correlated with 1/cM, the percentages in higher P-value intervals (e.g., 0.01 < LDp <= 0.02) are not (results not shown), suggesting that only locus pairs with an LDp <= 0.01 are predominantly in authentic LD. Interestingly, the LDp <= 0.01 relationship would predict that the empirical limit of LD approximates 5.5 cM from the GÉNÉTHON-based data set. This number, obtained by solving the regression equation (Figure 4C) for the distance at which the proportion of pairs in LD is equal to the expectation of 1%, is remarkably consistent with the distance in Figure 4D, where the mean LDp vs. centimorgan curve asymptotes at ~6.5 cM with the background expectation of 0.5. These results offer strong statistical support for implicating LD for the majority of locus pairs separated by <=4 cM whose LDp <= 0.01. Thus, for subsequent analyses, the cutoff c = 0.01 was used. Out of 36,382 locus pairs within 4 cM of each other, 1452 (4%) have an LDp <= 0.01. A list of these locus pairs and their LDp values is available at either jcsmr.anu.edu.au/~glenys/humgen/data.htm or rex.nci.nih.gov/RESEARCH/basis/lgd/front_page.htm/.

Consistent with the suggested dependence of probabilities from exact tests on heterozygosity (SLATKIN 1994), a significant, but very small, correlation between LDp and heterozygosity was detected. If heterozygosity was influential in the power of loci to detect significant LD, then for loci within 0.5 cM of each other, a difference in heterozygosity should be apparent between those loci with an LDp <= 0.01 and the remaining loci. Using a KS test, we reject the null hypothesis that the two groups of loci have the same heterozygosity distributions (P < 10⁻³). The median heterozygosity value for loci with an LDp <= 0.01 (~0.73; 1381 loci) is greater than the median value for the remaining loci (~0.72; 2986 loci). To assess the proportion of variation in LDp values accounted for by heterozygosity, a multiple regression was performed on ~500 independent locus pairs within 0.5 cM of each other, but separated from all other locus pairs by at least 5 cM. Taking heterozygosity at loci A and B as independent variables and LDp as the dependent variable, the analysis indicated that heterozygosity at the two loci accounts for <0.04% of the variance in LDp.

Linkage disequilibrium is heterogeneously distributed throughout the genome in Europeans:
Different evolutionary forces may produce different spatial patterns of LD in the genome. The null hypothesis of spatial homogeneity of LD was initially tested by comparing the LDp distribution of individual chromosomes to the rest of the genome. For example, the LDp distribution (from locus pairs within 4 cM of each other) of chromosome 1 was compared to the LDp distribution from the rest of the genome (produced by pooling the LDp values from chromosomes 2–22). Because differences in LDp values could arise from differences in marker density, the datasets representing a chromosome and the genome were matched for the distribution of interlocus distances. To illustrate this, if 5% of all comparisons within 4 cM on chromosome 1 were between loci separated by 0.3 cM, while for the genome set this value was 8%, locus pairs were randomly sampled without replacement from the genome set to achieve a proportion of 5%. Performing the nonparametric KS test on the 22 comparisons indicates seven chromosomes (2, 5, 6, 12, 13, 15, and 18) had probabilities <=0.05, with chromosomes 2, 5, and 18 having more LD than the genome average and the other chromosomes having less LD. Of these, chromosomes 2 (P = 0.0007), 15 (P = 0.0001), and 18 (P = 0.0013) are significant after correcting for multiple tests using the Bonferroni procedure (SOKAL and ROHLF 1995). We therefore reject the null hypothesis that LDp values, and thus LD, are uniformly distributed in the genome. A comparison of the chromosome heterozygosity distributions for the loci in each of the above sets, again using the KS test, failed to detect any chromosomes significantly different from the genome average. These results suggest that the nonuniform distribution of LD at the chromosomal scale does not result from variation in locus power (as represented by heterozygosity) to detect LD.

Given apparent heterogeneity of LD in the genome, and prior to analyzing the entire genome, we evaluated the effectiveness of the LD cluster test on the human leukocyte antigen (HLA). The HLA region on chromosome 6 includes several loci previously reported to display LD as a consequence of selective pressure on epistatic loci in the region (BODMER 1986; KLEIN 1986; HEDRICK 1994; TROWSDALE 1995; FOISSAC et al. 1997) and associated LD among linked STR loci (CARRINGTON et al. 1998). HLA loci are highly polymorphic (like STRP loci) and play an important role in foreign antigen processing and presentation to T-cell receptor molecules on immune lymphocytes. Because of insufficient GÉNÉTHON marker density, the same CEPH families used in this study were genotyped with an additional six STRP markers. The subsequent analysis of this region proved significant even after adjusting for multiple tests from seven windows (P = 0.002 < adjusted significance P = 0.007). We interpret this positive result to affirm the approach in detecting clusters of LD throughout the genome.

The results from the cluster detection analysis using the GÉNÉTHON dataset are shown in Figure 5. To appraise the contribution of variation in marker density to these results, 500 independent (nonoverlapping) windows were analyzed. The results indicate that marker density accounts for <2% of the variance in window probabilities. Two approaches were employed to judge the significance of the results in Figure 5. First, a standard Bonferroni multiple test correction was determined using the total number of windows over the entire genome (N = 4575). Using this correction, region 6 on chromosome 16 is significant (P = 3 × 10⁻⁶ < corrected 5% significance level P = 1.1 × 10⁻⁵). However, this approach is overly conservative, partly because of the Bonferroni correction itself (RICE 1989; ROTHMAN 1990) and partly because of correlations between overlapping windows. Second, a less stringent (but still conservative) approach was employed by using the HLA region as a benchmark. This region has previously been shown to exhibit extensive LD and it may therefore serve as a lower bound for identifying other such significant regions (BODMER 1986; KLEIN 1986; HEDRICK 1994; TROWSDALE 1995; FOISSAC et al. 1997; CARRINGTON et al. 1998). This approach yields nine additional regions distributed across seven chromosomes (Table 1), and the centimorgan length of regions between STRPs in LD ranges from 2.2 to 6.4 cM (excluding HLA). Genes identified in these regions, which may be influenced by or have a role in developing the initial LD, are listed in Table 1.
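The two Bonferroni thresholds quoted in this section follow from simple arithmetic, sketched below. The function name `bonferroni_threshold` is introduced here for illustration only; the numbers of tests (4575 genome-wide windows, 7 HLA windows) and the observed probabilities are those given in the text.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise level at alpha."""
    return alpha / n_tests


# Genome-wide sliding-window analysis: 4575 windows in total.
genome_threshold = bonferroni_threshold(0.05, 4575)  # about 1.09e-5

# HLA validation analysis: seven windows.
hla_threshold = bonferroni_threshold(0.05, 7)  # about 0.0071

# Smallest genome-wide window probability reported in the text.
print(3e-6 < genome_threshold)  # True

# HLA window probability reported in the text.
print(0.002 < hla_threshold)  # True
```

Because overlapping windows share locus pairs, the tests are positively correlated, so the effective number of independent tests is smaller than 4575 and the threshold above is stricter than necessary.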



Figure 5. Distribution of linkage disequilibrium (LD) throughout the human genome in a 5-cM sliding window. The x-axis is the chromosomal location of the anchor locus (the first "upstream" locus in the window) based on the recombination linkage map; the y-axis shows the χ² with 1 d.f. for the probability of the observed or greater LD within each 5.0-cM window. Tick marks above plots indicate window positions. Information on the numbered regions is summarized in Table 1. The horizontal dotted line at the height of the HLA cluster is used as a reference to identify remarkable regions.
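The two quantities plotted in Figure 5 can be sketched as follows. This is one reading of the caption rather than the authors' code: `anchored_windows` enumerates locus-by-locus windows that start at each anchor locus and extend 5.0 cM downstream, and `p_to_chi2` converts a window probability into the χ² value with 1 d.f. that has that upper-tail probability. Both function names are hypothetical, and the use of an upper-tail conversion is an assumption.

```python
from statistics import NormalDist


def anchored_windows(positions_cm, window_cm=5.0):
    """Return (anchor_index, member_indices) for each locus-anchored window.

    positions_cm must be sorted in ascending map order. Each window contains
    the anchor locus and every downstream locus within window_cm of it.
    """
    windows = []
    n = len(positions_cm)
    end = 0  # exclusive end index; never moves backward because input is sorted
    for i in range(n):
        while end < n and positions_cm[end] - positions_cm[i] <= window_cm:
            end += 1
        windows.append((i, list(range(i, end))))
    return windows


def p_to_chi2(p):
    """Chi-square value (1 d.f.) whose upper-tail probability equals p.

    Uses the identity: a chi-square variable with 1 d.f. is a squared
    standard normal variable, so the two-sided normal quantile is squared.
    """
    z = NormalDist().inv_cdf(p / 2.0)
    return z * z
```

For instance, `p_to_chi2(0.05)` is approximately 3.84, the familiar 5% critical value for 1 d.f., so smaller window probabilities plot higher on the y-axis.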


 
Table 1. Summary of significant linkage disequilibrium regions

Clustering of LD may also stem from the accumulation of loci with high power to detect LD, genotyping errors, or mapping errors. Although clusters may arise from rare concentrations of loci with high power to detect LD, no difference was detected between levels of heterozygosity at loci with an LDp <= 0.01 within the clusters and the general distribution of heterozygosity. Such a relationship was tested for in two ways. First, a KS test was used to compare the distribution of heterozygosity from all loci in the clusters with an LDp <= 0.01 to the distribution of heterozygosity for all other loci (P = 0.17). Second, a two-tailed sign test was conducted to evaluate whether the frequency of loci with either heterozygosities less than, or greater than, the median value from all loci (~0.72) was unusually high over all clusters or for each cluster separately. None of the sign tests were statistically significant. Only region 8 exhibited a small probability (P = 0.03) for detecting five loci with heterozygosities greater than the median. These results suggest that the clusters do not stem from rare accumulations of informative markers.
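The sign test used above reduces to an exact binomial calculation, sketched here. The function name `sign_test_p` is introduced for illustration, and the counts in the usage lines (five loci above the median, none below) are an assumption: the text reports five loci above the median in region 8 but not the number below it. Loci exactly at the median are taken to be excluded.

```python
from math import comb


def sign_test_p(n_above, n_below, two_tailed=True):
    """Exact sign-test probability under a 50:50 null hypothesis.

    Returns the probability of a split at least as extreme as the one
    observed, in the observed direction (one-tailed) or in either
    direction (two-tailed).
    """
    n = n_above + n_below
    k = max(n_above, n_below)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail) if two_tailed else tail


# Assumed counts for illustration: 5 loci above the median, 0 below.
print(sign_test_p(5, 0, two_tailed=False))  # 0.03125
print(sign_test_p(5, 0, two_tailed=True))   # 0.0625
```

Under this assumption the one-tailed value is close to the P = 0.03 quoted for region 8, while the two-tailed value exceeds 0.05, which would be consistent with the statement that none of the sign tests was significant.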

Mapping errors could contribute to the detection of clusters through genotyping errors, underestimation of recombination fractions resulting from the number of meioses sampled, or incorrect ordering of loci. Genotyping errors can exert a significant impact on interlocus distance estimates, causing an overestimation of total map length (BRZUSTOWICZ et al. 1993). This tends to decrease the significance of a region in the LD cluster test. High concordance between the genetic and physical maps (>99.5%; HUDSON et al. 1995) suggests that the genotyping error rates in the GÉNÉTHON dataset are probably very small. However, if genotyping errors increase with increasing STR fragment size, then concentrations of loci with large mean fragment size might lead to clustered LD. A comparison of mean fragment sizes of loci in LD within the clusters to the rest of the genome failed to support this hypothesis (P = 0.27 from the KS test).

To assess the effect of underestimating recombination rates, distance estimates were divided by a factor of 2 or 4, with the same division applied to window size. Using these modified data, an LD cluster test was performed, using the genome-wide averages from the unaltered analysis, for each window within the regions presented in Table 1. To test whether the regions would still exhibit a density of LD comparable to HLA from our original analysis, all windows within the regions were compared to the P value 0.002. Region 6 had a single window fulfilling this criterion in the fourfold compressed analysis (P = 0.002) and a comparable result in the twofold analysis (P = 0.007). Region 9 had probabilities <0.05 in both the two- and fourfold analyses, while region 5 had P < 0.05 for the twofold analysis only. The HLA and other regions (1, 2, 3, 4, 7, 8, and 10) were not significant in either analysis (P > 0.5). While these results show that the LD cluster test is sensitive to mapping errors, one region (6) is somewhat robust in the face of such errors.

The alternative map was also used to evaluate the contribution of individual errors in recombination distance to the results. The relationship between the physical and genetic maps was used to identify markers discordant between the two maps. These markers were eliminated, and the physical map location of the remaining markers was used to reestimate genetic map locations. Consequently, the alternative map incorporates regional estimates of recombination fraction. Physical map estimates of intermarker distances and marker order may have higher error rates than the genetic map (DELOUKAS et al. 1998), potentially jumbling the correct order. Insufficient alternative map data prevent us from assessing all but regions 2 and 5. The results provide significant support for regions 2 and 5 after adjusting for multiple comparisons (P = 0.0032 for both regions), even though the data for these regions were still incomplete. As such, the nine regions that were detected with the limited haplotype sample (N = 54) indicate LD signals of potential import in the ancestry of this European population.


*  DISCUSSION

The exploration of the distributional properties of LD in Europeans was conducted at three levels: level 1, a genome-wide average description of the relationship between locus pairs in LD and the recombination distance separating them (Figure 4, a–d); level 2, a chromosome scale analysis to determine whether LD is uniformly distributed across the genome; and level 3, a detailed regional analysis for locus clusters that depart from the genomic background LD (as in level 1; Figure 5). The level 1 analysis indicated considerable sporadic LD among loci linked by <=4.0 cM, the proportion of which was inversely related to centimorgan distance (Figure 4C). The level 2 analysis showed that LD is heterogeneous in its genomic distribution in a marker density-independent fashion. The level 3 5.0-cM sliding-window analysis revealed nine genomic regions with clustered LD greater than or equal to that observed for HLA, which include the known genes listed in Table 1.

We detected a striking proportionality between LD and inverse recombination fraction (Figure 4). This relationship indicates that while LD occurs between loci within 4 cM of each other, the majority of these pairs cluster within the shortest distance interval. However, the linearity between the proportion of locus pairs with small LDp values and 1/cM has limitations. Over extremely short distances the relative contribution of mutation to the decay of LD is larger, reducing the role of recombination and thus affecting the 1/cM result, while at extended distances the regression relationship will predict negative estimates for the proportion of loci in LD, which is biologically implausible.
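The limitation at extended distances can be made concrete with a small sketch of a linear model in 1/cM. The coefficients used in the usage lines are hypothetical and chosen only for illustration; they are not the regression estimates from Figure 4, and the function names are introduced here.

```python
def predicted_fraction(distance_cm, intercept, slope):
    """Predicted proportion of locus pairs in LD under a model linear in 1/cM."""
    return intercept + slope / distance_cm


def zero_crossing_cm(intercept, slope):
    """Distance beyond which the linear model predicts a negative proportion.

    Returns None when the model never goes negative (non-negative intercept)
    or when the slope is not positive.
    """
    if intercept >= 0 or slope <= 0:
        return None
    return -slope / intercept


# Hypothetical coefficients, for illustration only.
a, b = -0.01, 0.05
print(zero_crossing_cm(a, b))           # distance at which the prediction reaches zero
print(predicted_fraction(10.0, a, b))   # negative beyond that distance
```

Any fitted line with a negative intercept behaves this way, which is why the 1/cM relationship should be read as a description of the sampled distance range rather than extrapolated beyond it.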

There are two broad potential evolutionary origins for the observed LD: genetic drift or natural selection (OHTA and KIMURA 1969; HARTL and CLARK 1990). Because European populations have had relatively large effective population sizes (Ne >= 10,000), are known to have expanded rapidly in recent millennia during agricultural development, and have not experienced appreciable recent founder effects (TAKAHATA et al. 1992; TAKAHATA 1993; AYALA 1995; AYALA and ESCALANTE 1996; VON HAESELER et al. 1996), mutation and genetic drift appear unlikely explanations (SLATKIN 1994). Moreover, a plausible consequence of drift-derived LD is a similar LDp distribution on all chromosomes. Yet, both our chromosome scale and sliding-window analyses indicate that the spatial pattern of LD in the genome is significantly nonuniform.

Our ascertainment, in the level 3 analysis, of 10 genomic regions exhibiting remarkable concentrations of loci in LD is plausibly an underestimate because the two multiple test corrections used are conservative. The most stringent of these, the Bonferroni correction, identifies only region 6 as being significant. However, we reject the strict Bonferroni correction as a general guideline for interpreting the results of this analysis because it tends to produce type II error, particularly when multiple tests are not independent of each other, as is the case in this analysis (RICE 1989; ROTHMAN 1990). Furthermore, it may be argued that our alternative strategy of using the HLA region to define a lower benchmark is also overly restrictive because this region is known for its high level of LD (BODMER 1986; KLEIN 1986; HEDRICK 1994; TROWSDALE 1995; FOISSAC et al. 1997).

Although several classes of errors might have contributed to the spatial pattern of LD, our analyses did not support their involvement. The impact of errors in regional distance estimates and locus order for the chromosome scale analysis should be small. In contrast, the impact of regional underestimation of recombination on the sliding-window analyses is potentially severe. Despite this, region 6 on chromosome 16 still exhibited low probabilities. Furthermore, the effect of ordering errors was assessed using the alternative map for two regions (2 and 5). That both these regions were significant supports the authenticity of the remaining regions as representing clustered LD in the European genome.

A further potential confounding factor is variation in the power of loci to detect LD. Although heterozygosity was significantly higher for loci with an LDp <=0.01 relative to the remaining loci, it accounts for <0.04% of the variance in LDp values. Further, we did not detect differences in heterozygosity concordant with the LDp distributions of chromosomes or between loci in the regions defined in Table 1 and the remainder of the genome. The potentially confounding influence of variable informativeness may be substantially reduced in these data by the consistently high heterozygosity prevalent among the GÉNÉTHON STR loci.

The nonuniform pattern of LD in the genome is consistent with the operation of natural selection. However, selective explanations for linkage disequilibrium have been proposed previously only for the HLA region (BODMER 1986; KLEIN 1986; HEDRICK 1994; TROWSDALE 1995). Thus, the absence of an explicit test to discriminate between neutral and selective origins of LD at a genomic scale prevents a formal conclusion regarding the evolutionary origin of the detected LD.

The LD genome screen described here offers a new perspective on the organization of endemic genetic variation in the human genome. Although the haplotype sample size is limited (N = 54), and thus a sizable portion of LD is potentially undetected in Europeans, the analysis of 5048 loci was nonetheless informative in describing a background level for LD in Europeans as well as identifying specific genomic locales where LD appears elevated. The background LD might also be useful in LD association studies that are increasingly being applied to locate genes contributing to heritable disease and phenotypes.


*  ACKNOWLEDGMENTS

We thank Cecile Fizames from CEPH for assistance in obtaining the genotype data, several anonymous reviewers, Andy Clark, Simon Easteal, George Nelson, Clay Stephens, and Sue Wilson for helpful comments on this article. We thank the Frederick Biomedical Supercomputing Center for their assistance. The content of this article does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organization imply endorsement by the U.S. Government. This project was funded in part with federal funds from the National Cancer Institute, National Institutes of Health, under contract no. NO1-CO-56000.

Manuscript received September 5, 1998; Accepted for publication April 27, 1999.


*  LITERATURE CITED

AYALA, F. J., 1995 The myth of Eve: molecular biology and human origins. Science 270:1930-1936.

AYALA, F. J. and A. A. ESCALANTE, 1996 The evolution of human populations: a molecular perspective. Mol. Phylogenet. Evol. 5:188-201.

BEGOVICH, A. B., G. R. MCCLURE, V. C. SURAJ, R. C. HELMUTH, and N. FILDES et al., 1992 Polymorphism, recombination, and linkage disequilibrium within the HLA class II region. J. Immunol. 148:249-258.

BENNETT, B. M. and P. HSU, 1960 On the power function of the exact test for the 2 x 2 contingency table. Biometrika 47:393-398.

BODMER, W. F., 1986  Human genetics: the molecular challenge. Cold Spring Harbor Symp. Quant. Biol. 51:1-13.

BRISCOE, D., J. C. STEPHENS, and S. J. O'BRIEN, 1994 Linkage disequilibrium in admixed populations: applications in gene mapping. J. Hered. 85:59-63.

BROMAN, K. W., J. C. MURRAY, V. C. SHEFFIELD, R. L. WHITE, and J. L. WEBER, 1998 Comprehensive human genetic maps: individual and sex-specific variation in recombination. Am. J. Hum. Genet. 63:861-869.

BRZUSTOWICZ, L. M., C. MERETTE, X. XIE, L. TOWNSEND, and T. C. GILLIAM et al., 1993 Molecular and statistical approaches to the detection and correction of errors in genotype databases. Am. J. Hum. Genet. 53:1137-1145.

CARRINGTON, M., D. MARTI, J. WADE, W. KLITZ, L. BARCELLOS et al., 1999 Microsatellite markers in complex disease: mapping disease-associated regions within the human Major Histocompatibility Complex, pp. in Microsatellites: Evolution and Applications, edited by D. B. GOLDSTEIN and C. SCHLÖTTERER. Oxford University Press, Oxford.

CHAKRABORTY, R. and K. M. WEISS, 1988 Admixture as a tool for finding linked genes and detecting that difference from allelic association between loci. Proc. Natl. Acad. Sci. USA 85:9119-9123.

DELOUKAS, P., G. D. SCHULER, G. GYAPAY, E. M. BEASLEY, and C. SODERLUND et al., 1998 A physical map of 30,000 human genes. Science 282:744-746.

DIB, C., S. FAURE, C. FIZAMES, D. SAMSON, and N. DROUOT et al., 1996 A comprehensive genetic map of the human genome based on 5,264 microsatellites. Nature 380:152-154.

EWENS, W. J., 1979 Mathematical Population Genetics. Springer-Verlag, New York.

EWENS, W. J. and R. S. SPIELMAN, 1995 The transmission/disequilibrium test: history, subdivision, and admixture. Am. J. Hum. Genet. 57:455-464.

FOISSAC, A., B. CROUAU-ROY, S. FAURÉ, M. THOMSEN, and A. CAMBON-THOMSEN, 1997 Microsatellites in the HLA region: an overview. Tissue Antigens 49:197-214.

GUO, S. W. and E. A. THOMPSON, 1992 Performing the exact test of Hardy-Weinberg proportion for multiple alleles. Biometrics 48:361-372.

GYAPAY, G., J. MORISSETTE, A. VIGNAL, C. DIB, and C. FIZAMES et al., 1994 The 1993–94 Généthon human genetic linkage map. Nat. Genet. 7:246-339.

HARTL, D. L., and A. G. CLARK, 1990 Principles of Population Genetics. Sinauer Associates, Sunderland, MA.

HASTBACKA, J., A. DE LA CHAPELLE, I. KAITILA, P. SISTONEN, and A. WEAVER et al., 1992 Linkage disequilibrium mapping in isolated founder populations: diastrophic dysplasia in Finland. Nat. Genet. 2:204-211.

HEDRICK, P. W., 1987 Gametic disequilibrium measures: proceed with caution. Genetics 117:331-341.

HEDRICK, P., 1994  Evolutionary genetics of the major histocompatibility complex. Am. Nat. 143:945-964.

HILL, W. G. and A. ROBERTSON, 1968  Linkage disequilibrium in finite populations. Theor. Appl. Genet. 38:226-231.

HUDSON, R. R., 1985 The sampling distribution of linkage disequilibrium under an infinite allele model without selection. Genetics 109:611-631.

HUDSON, T. J., L. D. STEIN, S. S. GERETY, J. MA, and A. B. CASTLE et al., 1995 An STS-based map of the human genome. Science 270:1945-1954.

JORDE, L. B., 1995 Linkage disequilibrium as a gene-mapping tool. Am. J. Hum. Genet. 56:11-14.

KAPLAN, N. L., P. O. LEWIS, and B. S. WEIR, 1994 Age of the delta F508 cystic fibrosis mutation. Nat. Genet. 8:216-218.

KAPLAN, N. L., W. G. HILL, and B. S. WEIR, 1995 Likelihood methods for locating disease genes in nonequilibrium populations. Am. J. Hum. Genet. 56:18-32.

KLEIN, J. (Editor), 1986 Natural History of the Major Histocompatibility Complex. John Wiley and Sons, New York.

KLITZ, W., J. C. STEPHENS, M. GROTE, and M. CARRINGTON, 1995 Discordant patterns of linkage disequilibrium of the peptide-transporter loci within the HLA class II region. Am. J. Hum. Genet. 57:1436-1444.

LAAN, M. and S. PAABO, 1997 Demographic history and linkage disequilibrium in human populations. Nat. Genet. 17:435-438.

LEWONTIN, R. C., 1964 The interaction of selection and linkage. I. General considerations; heterotic models. Genetics 49:49-67.

LEWONTIN, R. C. and K. KOJIMA, 1960  The evolutionary dynamics of complex polymorphisms. Evolution 14:458-472.

MANTEL, N., 1967 The detection of disease clustering and a generalized regression approach. Cancer Res. 27:209-220.

MARTIN, M. P., A. HARDING, R. CHADWICK, M. KRONICK, and M. CULLEN et al., 1998 Characterization of 12 microsatellite loci of the human MHC in a panel of reference cell lines. Immunogenetics 47:503.

MCLELLAN, T., L. B. JORDE, and M. H. SKOLNICK, 1984 Genetic distances between the Utah Mormons and related populations. Am. J. Hum. Genet. 36:836-857.

MEHTA, C. R. and N. R. PATEL, 1983  A network algorithm for performing Fisher's exact test in r x c contingency tables. J. Am. Stat. Assoc. 78:427-434.

MOONSAMY, P. V., W. KLITZ, M. G. TILANUS, and A. B. BEGOVICH, 1997 Genetic variability and linkage disequilibrium within the DP region in the CEPH families. Hum. Immunol. 58:112-121.

OHTA, T. and M. KIMURA, 1969 Linkage disequilibrium at steady state determined by random genetic drift and recurrent mutation. Genetics 63:229-238.

PETERSON, A. C., A. DI RIENZO, A. E. LEHESJOKI, A. DE LA CHAPELLE, and M. SLATKIN et al., 1995 The distribution of linkage disequilibrium over anonymous genome regions. Hum. Mol. Genet. 4:887-894.

PRESS, W. H., S. A. TEUKOLSKY, W. T. VETTERLING, and B. P. FLANNERY, 1992 Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press, Cambridge, United Kingdom.

RAYMOND, M. and F. ROUSSET, 1995  An exact test for population differentiation. Evolution 49:1280-1283.

RICE, W. R., 1989  Analyzing tables of statistical tests. Evolution 43:223-225.

ROTHMAN, K. J., 1990 No adjustments are needed for multiple comparisons. Epidemiology 1:43-46.

SLATKIN, M., 1994 Linkage disequilibrium in growing and stable populations. Genetics 137:331-336.

SOKAL, R. R., and F. J. ROHLF, 1995 Biometry. W. H. Freeman and Company, New York.

STEPHENS, J. C., D. BRISCOE, and S. J. O'BRIEN, 1994 Mapping by admixture linkage disequilibrium in human populations: limits and guidelines. Am. J. Hum. Genet. 55:809-824.

STEPHENS, J. C., D. E. REICH, D. B. GOLDSTEIN, H. D. SHIN, and M. W. SMITH et al., 1998 Dating the origin of the CCR5-Δ32 AIDS resistance allele by the coalescence of haplotypes. Am. J. Hum. Genet. 62:1507-1515.

STEWART, E. A., K. B. MCKUSICK, A. AGGARWAL, E. BAJOREK, and S. BRADY et al., 1997 An STS-based radiation hybrid map of the human genome. Genome Res. 7:422-433.

TAKAHATA, N., 1993 Allelic genealogy and human evolution. Mol. Biol. Evol. 10:2-22.

TAKAHATA, N., Y. SATTA, and J. KLEIN, 1992 Polymorphism and balancing selection at the major histocompatibility complex loci. Genetics 130:925-938.

TISHKOFF, S. A., E. DIETZSCH, W. SPEED, A. J. PAKSTIS, and J. R. KIDD et al., 1996 Global patterns of linkage disequilibrium at the CD4 locus and modern human origins. Science 271:1380-1387.

TROWSDALE, J., 1995 "Both man and bird and beast": comparative organization of MHC genes. Immunogenetics 41:1-17.

VON HAESELER, A., A. SAJANTILA, and S. PAABO, 1996 The genetical archaeology of the human genome. Nat. Genet. 14:135-140.

WEISSENBACH, J., G. GYAPAY, C. DIB, A. VIGNAL, and J. MORISSETTE et al., 1992 A second-generation linkage map of the human genome. Nature 359:794-801.

WIEHE, T. and M. SLATKIN, 1998 Epistatic selection in a multi-locus Levene model and implications for linkage disequilibrium. Theor. Popul. Biol. 53:75-84.



