Genetics, Vol. 150, 1295-1307, November 1998, Copyright © 1998

Constrained Disequilibrium Values and Hitchhiking in a Three-Locus System

Mark N. Grote (a), William Klitz (b), and Glenys Thomson (c)
(a) Section of Evolution and Ecology, University of California, Davis, California 95616
(b) School of Public Health, University of California, Berkeley, California 94720
(c) Department of Integrative Biology, University of California, Berkeley, California 94720

Corresponding author: Mark N. Grote, Section of Evolution and Ecology, University of California, Davis, CA 95616. E-mail: mngrote@ucdavis.edu

Communicating editor: G. B. GOLDING


*  ABSTRACT

Positive selection on a new mutant allele can increase the frequencies of closely linked alleles (through hitchhiking), as well as create linkage disequilibrium between them. Because this disequilibrium is induced by the selected allele, one may be able to identify loci under selection by measuring the influence of a candidate locus on pairwise disequilibrium values at nearby loci. The constrained disequilibrium values (CDV) method approaches this problem by examining differences in pairwise disequilibrium values, which have been normalized for two- and three-locus systems, respectively. We have investigated in detail the reliability of inferences based on CDV, using simulation and analytical methods. Our main results are (i) in some circumstances, CDV may not distinguish well between a selected locus and a neighboring neutral locus, but (ii) CDV seldom indicates "selection" in neutral haplotypes with moderate to large 4Nc. We conclude that, although the CDV method does not appear to precisely locate selected alleles, it can be used to screen for regions in which hitchhiking is a plausible hypothesis. We present a microsatellite data set from human chromosome 6, in which constrained disequilibrium values suggest the action of selection in a region containing the human leukocyte antigen (HLA)-A and myelin oligodendrocyte glycoprotein (MOG) loci. The connection between hitchhiking and disequilibrium has received relatively little attention, so our investigation presents opportunities to address more general issues.


In the genetic hitchhiking model, positive selection on a new mutant allele increases the frequencies of other alleles physically linked to the mutant, skewing the frequency distributions at the linked loci. Theoretical and empirical studies of hitchhiking generally focus on the reduction in variation at linked neutral loci that can result if the recombination rate is low and the selected mutant is quickly fixed in the population (MAYNARD-SMITH and HAIGH 1974; OHTA and KIMURA 1975; AGUADE et al. 1989; KAPLAN et al. 1989; STEPHAN and LANGLEY 1989; and many more recent studies). Relatively few studies have focused on linkage disequilibrium (gametic phase disequilibrium) in haplotypes subject to hitchhiking (THOMSON 1977; ROBINSON et al. 1991A; BEGUN and AQUADRO 1994, 1995). Our concerns in the present study are the nature of linkage disequilibrium created by hitchhiking, and the extent to which certain patterns of disequilibrium can be used to make inferences about hitchhiking. In a wider sense, our aim is to investigate a particular method that uses linkage disequilibrium to physically locate genes of interest.

Some linkage disequilibrium, initially small in magnitude, is always created by the appearance of a new mutant, because initially the mutant is found only on an "ancestral" haplotype of closely linked alleles. THOMSON (1977) showed that hitchhiking can noticeably increase this existing disequilibrium, if selection in favor of the new mutant is strong enough relative to the recombination rate with linked loci. Hitchhiking can also create significant disequilibrium between nonselected alleles if they are closely linked to the selected allele (THOMSON 1977); here, disequilibrium between neutral alleles is induced by mutual association with the selected allele. These associations are expected to decline in strength as recombination breaks up haplotypes bearing the selected mutant.

ROBINSON et al. (1991A, 1991B) introduced the constrained disequilibrium values (CDV) method as a means of identifying loci that may have been subject to recent hitchhiking. Inference with the CDV method depends on comparisons between pairwise disequilibrium measures, which have been normalized according to different constraints imposed by two- and three-locus systems. A familiar two-locus linkage disequilibrium measure is

\[ D_{ab} = f_{ab} - p_a p_b, \qquad (1) \]

where fab is the frequency of the ab haplotype and pa, pb are the corresponding one-locus allele frequencies. The value of Dab depends strongly on the magnitudes of pa and pb, so Dab is commonly normalized using upper and lower bounds imposed by pa and pb (LEWONTIN 1964, 1988; HEDRICK 1987). ROBINSON et al. (1991B) showed that a third locus imposes further bounds on Dab, and showed how to normalize Dab using these additional constraints (described below). Differences in pairwise disequilibria, normalized in the two different ways, highlight the influence that a third locus may exert on the pairwise measure. On the basis of deterministic simulations of a three-locus hitchhiking model, ROBINSON et al. (1991A) proposed that differences in the normalized measures could indicate which of the three loci has the selected mutant. The method for making such inferences was termed the CDV method.

Our purpose is to present some recent results that bear upon the use and interpretation of CDV. First, we summarize further simulations of the deterministic model, describing some circumstances under which CDV does, or does not, lead to reliable inferences about the position of the selected locus. In connection with this, we analyze the normalized disequilibrium measures under a selection model with simplifying assumptions and show that inferences with CDV are especially sensitive to allele frequencies at neutral loci closely linked to the selected locus. Second, we apply the CDV method to data sets generated under a stochastic model of neutral haplotypes, using a simulation program of HUDSON (1983, 1985), to assess the performance of CDV under a finite-population "null" model. In our discussion, we reexamine the types of inferences that can be made with CDV and address some conceptual and practical issues. Finally, we apply the CDV method to marker haplotypes from human chromosome 6, to illustrate one use of CDV.


*  METHODS

Measures of disequilibrium and hitchhiking:
Our attention centers on two normalized measures of pairwise linkage disequilibrium, D' and D'', and in particular on the difference in their magnitudes,

\[ \delta = |D'| - |D''| . \]

D' is the familiar normalized pairwise linkage disequilibrium measure (LEWONTIN 1964, 1988; HEDRICK 1987; ROBINSON et al. 1991B), where for alleles a and b at distinct loci,

\[ D'_{ab} = \begin{cases} D_{ab} / \min(p_a q_b,\; q_a p_b) & \text{if } D_{ab} \ge 0, \\ D_{ab} / \min(p_a p_b,\; q_a q_b) & \text{if } D_{ab} < 0, \end{cases} \qquad (2) \]

with Dab, pa, and pb as in (1), qa = 1 - pa and qb = 1 - pb. D'ab is calculated from Dab via division by an appropriate upper or lower bound. The extreme values of D'ab reflect the complete association of the a or b allele with the ab haplotype (D'ab = +1), or the absence of any ab haplotypes when in fact the constituent alleles are present (D'ab = -1).
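The two-locus measures in (1) and (2) can be sketched in a few lines of Python. This is only an illustration: the function names `pairwise_d` and `d_prime` are hypothetical, and returning 0 when a locus is monomorphic is a convention assumed here, not something stated in the text.

```python
def pairwise_d(f_ab, p_a, p_b):
    """Raw disequilibrium D_ab = f_ab - p_a * p_b, as in (1)."""
    return f_ab - p_a * p_b


def d_prime(f_ab, p_a, p_b):
    """Normalized D'_ab: D_ab divided by its two-locus bound, as in (2)."""
    d = pairwise_d(f_ab, p_a, p_b)
    q_a, q_b = 1.0 - p_a, 1.0 - p_b
    if d >= 0:
        bound = min(p_a * q_b, q_a * p_b)   # largest possible positive D_ab
    else:
        bound = min(p_a * p_b, q_a * q_b)   # magnitude of the most negative D_ab
    # Assumed convention: D' = 0 when a locus is monomorphic (bound = 0).
    return d / bound if bound > 0 else 0.0
```

For example, when every a allele sits on an ab haplotype (f_ab = p_a), the function returns a value of +1 up to rounding, and when no ab haplotypes exist (f_ab = 0) it returns -1.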

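ROBINSON et al. (1991B) derived a new normalized pairwise measure, D''ab, which incorporates further constraints imposed on Dab by a third locus. For a three-locus diallelic haplotype, where the C locus plays the role of the "constraining" locus,

\[ D''_{ab(c)} = \begin{cases} D_{ab} / {\max}^{*}D_{ab} & \text{if } D_{ab} \ge 0,\ {\min}^{*}D_{ab} \le 0, \\ (D_{ab} - {\min}^{*}D_{ab}) / ({\max}^{*}D_{ab} - {\min}^{*}D_{ab}) & \text{if } D_{ab} \ge 0,\ {\min}^{*}D_{ab} > 0, \\ D_{ab} / |{\min}^{*}D_{ab}| & \text{if } D_{ab} < 0,\ {\max}^{*}D_{ab} \ge 0, \\ (D_{ab} - {\max}^{*}D_{ab}) / ({\max}^{*}D_{ab} - {\min}^{*}D_{ab}) & \text{if } D_{ab} < 0,\ {\max}^{*}D_{ab} < 0, \end{cases} \qquad (3) \]

where

\[ {\max}^{*}D_{ab} = \min(p_a q_b,\; q_a p_b,\; M_1,\; M_2), \qquad {\min}^{*}D_{ab} = -\min(p_a p_b,\; q_a q_b,\; m_1,\; m_2), \]

and

\[ \begin{aligned} m_1 &= p_a p_b p_c + q_a q_b q_c + D_{ac} + D_{bc}, & m_2 &= p_a p_b q_c + q_a q_b p_c - D_{ac} - D_{bc}, \\ M_1 &= p_a q_b p_c + q_a p_b q_c + D_{ac} - D_{bc}, & M_2 &= p_a q_b q_c + q_a p_b p_c - D_{ac} + D_{bc}. \end{aligned} \]

The associations between a and c, and between b and c, enter the calculation of D''ab through m1, m2, M1, and M2.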

Like D'ab, D''ab lies between +1 and -1, where the extreme values indicate the strongest possible positive or negative association between alleles a and b, within the constraints imposed by the allele frequencies and pairwise disequilibria of the three-locus system. We write D''ab(c) because D''ab is calculated with reference to a particular allele at the third locus, but in a diallelic system, one can show D''ab(c) = D''ab(C). Moreover, as with D'ab in a diallelic system, D''ab = D''AB = -D''aB = -D''Ab.
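A minimal Python sketch of the three-locus normalization follows. It is not taken from ROBINSON et al.: the bounds below are derived here from the requirement that all eight haplotype frequencies be non-negative, and the labels `m1`, `m2`, `M1`, `M2`, the handling of the Dab < 0 branch by symmetry, and the function names are assumptions made for illustration.

```python
def d_prime(d, p1, p2):
    """Two-locus normalized disequilibrium D' for a raw value d."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    bound = min(p1 * q2, q1 * p2) if d >= 0 else min(p1 * p2, q1 * q2)
    return d / bound if bound > 0 else 0.0


def d_double_prime(d_ab, d_ac, d_bc, p_a, p_b, p_c):
    """Three-locus normalized D''_ab(c), with C as the constraining locus."""
    q_a, q_b, q_c = 1.0 - p_a, 1.0 - p_b, 1.0 - p_c
    # Extra bounds contributed by the third locus (assumed labeling).
    m1 = p_a * p_b * p_c + q_a * q_b * q_c + d_ac + d_bc
    m2 = p_a * p_b * q_c + q_a * q_b * p_c - d_ac - d_bc
    M1 = p_a * q_b * p_c + q_a * p_b * q_c + d_ac - d_bc
    M2 = p_a * q_b * q_c + q_a * p_b * p_c - d_ac + d_bc
    hi = min(p_a * q_b, q_a * p_b, M1, M2)      # max* D_ab
    lo = -min(p_a * p_b, q_a * q_b, m1, m2)     # min* D_ab
    if d_ab >= 0:
        floor = max(lo, 0.0)                    # start of the positive range
        span = hi - floor
        return (d_ab - floor) / span if span > 0 else 0.0
    ceiling = min(hi, 0.0)                      # end of the negative range
    span = ceiling - lo
    return (d_ab - ceiling) / span if span > 0 else 0.0


def delta(d_ab, d_ac, d_bc, p_a, p_b, p_c):
    """delta_ab(c) = |D'_ab| - |D''_ab(c)|."""
    return abs(d_prime(d_ab, p_a, p_b)) - abs(
        d_double_prime(d_ab, d_ac, d_bc, p_a, p_b, p_c))
```

As a hypothetical example, with all allele frequencies equal to 0.5 and Dab = Dac = Dbc = 0.2, the two-locus maximum of Dab is 0.25 but the third locus restricts Dab to the range 0.15 to 0.25, so D' is about 0.8, D'' is about 0.5, and delta is about 0.3.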

Assuming Dab > 0 for the moment, {delta} = |D'| - |D''| is greater than zero when the pairwise measure Dab is more extreme relative to its two-locus maximum than to its positive range in the three-locus system; in this case the pairwise association between a and b appears to be relatively weaker when all of the pairwise associations of the three-locus system are taken into account. Loosely speaking, when {delta} > 0, the association between a and b is said to be partly accounted for by their mutual association with c. Assuming further that c is a selected mutant, this property of {delta} is the primary reason for treating {delta} > 0 as the "footprint" of a hitchhiking event, in which the neutral a and b alleles have hitchhiked with c (ROBINSON et al. 1991A).

Although the normalized measure D' may change during a hitchhiking event (THOMSON 1977), D' alone does not distinguish between loci that may be under positive selection and linked neutral loci that are only hitchhiking with the selected locus. Similarly, there is a single measure of third-order disequilibrium, Dabc, which can also be normalized appropriately (GEIRINGER 1944; THOMSON and BAUR 1984), but Dabc also makes no distinction between selected and hitchhiking loci. The main claim of ROBINSON et al. (1991A) is that {delta} values, when interpreted appropriately, can make this distinction.

For a given three-locus haplotype, each locus may play the role of the constraining locus, and there are three {delta} values: {delta}ab(c), {delta}a(b)c, and {delta}(a)bc. Using deterministic simulations, ROBINSON et al. (1991A) found that {delta} was often large and positive when the "constraining" allele was increasing in frequency due to positive selection, but the linked alleles were selectively neutral. When a nonselected allele played the "constraining" role, ROBINSON et al. (1991A) found that {delta} tended to be zero or negative. Based on their observations, ROBINSON et al. (1991A) proposed the following criteria for inferring selection based on {delta} values:

  1. If one of the three {delta} values is positive and the remaining two are zero or negative, the constraining allele that gives the positive {delta} is the one that may have experienced recent selection.

  2. If more than one of the {delta} values is positive, but one is much larger than the rest (for this study, more than double the next largest), the constraining allele that gives the large {delta} is the one that may have experienced recent selection.

  3. If all three {delta} values are <=0, or two are positive but close in value, no conclusion about selection can be drawn.
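The three criteria can be expressed as a small decision function. The sketch below is one reading of the criteria, not the authors' code: the name `cdv_signal`, the dictionary input, and the numerical tolerance `tol` used to treat tiny values as zero are all assumptions; the factor of 2 comes from criterion 2.

```python
def cdv_signal(deltas, ratio=2.0, tol=1e-9):
    """Return the constraining locus flagged by the CDV criteria, or None.

    deltas maps a constraining-locus label (e.g. "a", "b", "c") to its
    delta value. tol is an assumed threshold below which delta counts as 0.
    """
    positive = {locus: v for locus, v in deltas.items() if v > tol}
    if not positive:
        return None                      # criterion 3: nothing positive
    ranked = sorted(positive.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) == 1:
        return ranked[0][0]              # criterion 1: a single positive delta
    if ranked[0][1] > ratio * ranked[1][1]:
        return ranked[0][0]              # criterion 2: one clearly dominant delta
    return None                          # criterion 3: positives too close
```

With hypothetical inputs, `cdv_signal({"a": -0.1, "b": 0.0, "c": 0.3})` flags `"c"` under criterion 1, while `cdv_signal({"a": 0.0, "b": 0.2, "c": 0.3})` returns `None` because the two positive values are within a factor of two of each other.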

ROBINSON et al. (1991A) paid considerable attention to the magnitudes of {delta} values under various scenarios, but we first focus simply on which loci the CDV method identifies as candidates for selection, in a large series of deterministic simulations.

A deterministic hitchhiking model:
The deterministic simulations are based on a three-locus, diallelic model that evolves via a standard system of algebraic recursions (FELDMAN et al. 1974; THOMSON 1977; HARTL and CLARK 1989). For purposes of the CDV method, we are interested in a single new mutant allele and the closely linked alleles of the ancestral haplotype on which the mutant first appeared. The alleles of interest, in their order on the chromosome, are a, b, and c, one of which will be the new mutant and the others linked alleles. By convention, A, B, and C may be taken to represent all other alleles at their respective loci.

The recursion equations describing changes in the haplotype frequencies can be specified by selection and mutation parameters described immediately below, the recombination rates r1 and r2 between the A and B loci and the B and C loci, respectively (where r1 + r2 - 2r1r2 gives the recombination rate between A and C for the "no-interference" model), and a set of initial haplotype frequencies. The latter are determined by specifying initial allele frequencies pa(0), pb(0), pc(0), and a single initial disequilibrium value [e.g., D'ab(0) when c is the new mutant]. In addition, we assume that the haplotype bearing the new mutant has not experienced mutation or recombination before the simulation begins at generation zero [for example, if c is the new mutant, this implies fabc(0) = pc(0)]. The frequency dynamics of a strongly selected allele, once it has left the zero-frequency boundary, are commonly modeled as a deterministic process (e.g., KAPLAN et al. 1989). For convenience, we have assumed that the time spent close to the boundary is small relative to recombination and mutation rates near the selected locus.
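As an illustration of how the initial conditions follow from these inputs when c is the new mutant, the sketch below builds the eight haplotype frequencies from pa(0), pb(0), pc(0), and D'ab(0). The function names are hypothetical, and the small negative tolerance used for the feasibility check is an assumption; the check itself reflects the fact that some parameter combinations cannot be realized.

```python
def recombination_ac(r1, r2):
    """Recombination rate between A and C with no interference."""
    return r1 + r2 - 2.0 * r1 * r2


def initial_haplotypes(p_a, p_b, p_c, d_prime_ab):
    """Initial frequencies when the new mutant c exists only on abc."""
    q_a, q_b = 1.0 - p_a, 1.0 - p_b
    if d_prime_ab >= 0:
        bound = min(p_a * q_b, q_a * p_b)
    else:
        bound = min(p_a * p_b, q_a * q_b)
    f_ab = p_a * p_b + d_prime_ab * bound       # total frequency of ab
    freqs = {
        "abc": p_c,                             # f_abc(0) = p_c(0)
        "abC": f_ab - p_c,
        "aBC": p_a - f_ab,
        "AbC": p_b - f_ab,
        "ABC": 1.0 - p_a - p_b + f_ab,
        "aBc": 0.0, "Abc": 0.0, "ABc": 0.0,     # c not yet on other backgrounds
    }
    if min(freqs.values()) < -1e-12:            # assumed tolerance
        raise ValueError("parameter combination is not feasible")
    return freqs
```

For the allele frequencies pa(0) = 0.05, pb(0) = 0.1, pc(0) = 0.001 with D'ab(0) = 0, this gives an abC frequency of about 0.004 and an ABC frequency of about 0.855.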

Fitnesses at the selected locus (using genotypes at the C locus for illustration) are given by wcc = 1 - sc, wCc = 1, and wCC = 1 - sC. We have adopted a general framework for hitchhiking studies, as our selection model encompasses both directional selection leading to fixation of the new mutant (e.g., sc <= 0 and 0 < sC <= 1) and balancing selection (0 < {sC,sc} < 1). Mutation is unidirectional at rates µa = µb = µc = 10^-5 per generation from alleles a, b, and c to A, B, and C, respectively, so that the alleles of interest are transient. We use terms like "equilibrium frequency" loosely, referring to the relatively fast adjustment of allele frequencies that results from the appearance of a new selected mutant. For completeness, we have included the recursion equations in Appendix 1.
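A single generation of such a model might be sketched as follows. This is not the recursion system of Appendix 1: the order of events (selection, then recombination, then mutation), random mating, and the names `HAPS`, `next_generation`, and `allele_freq` are assumptions made for illustration.

```python
from itertools import product

# The eight haplotypes, e.g. "abc", "abC", ..., "ABC".
HAPS = ["".join(h) for h in product("aA", "bB", "cC")]


def allele_freq(x, locus, allele):
    """Frequency of an allele (e.g. locus=2, allele='c') given haplotypes x."""
    return sum(f for hap, f in x.items() if hap[locus] == allele)


def next_generation(x, s_c, s_C, r1, r2, mu=1e-5):
    """One generation with selection at the C locus (assumed event order)."""
    def fitness(g, h):
        n_c = (g[2] == "c") + (h[2] == "c")      # copies of c in the genotype
        return {2: 1.0 - s_c, 1: 1.0, 0: 1.0 - s_C}[n_c]

    w_bar = sum(x[g] * x[h] * fitness(g, h) for g in HAPS for h in HAPS)
    new = dict.fromkeys(HAPS, 0.0)
    for g in HAPS:
        for h in HAPS:
            w = x[g] * x[h] * fitness(g, h) / w_bar
            # Gamete read along g, switching strand at each crossover.
            new[g] += w * (1 - r1) * (1 - r2)                # no crossover
            new[g[0] + h[1] + h[2]] += w * r1 * (1 - r2)     # A-B only
            new[g[0] + g[1] + h[2]] += w * (1 - r1) * r2     # B-C only
            new[g[0] + h[1] + g[2]] += w * r1 * r2           # both intervals
    # One-way mutation a->A, b->B, c->C, applied locus by locus.
    for locus in range(3):
        mutated = dict.fromkeys(HAPS, 0.0)
        for hap, f in new.items():
            if hap[locus].islower():
                target = hap[:locus] + hap[locus].upper() + hap[locus + 1:]
                mutated[hap] += f * (1 - mu)
                mutated[target] += f * mu
            else:
                mutated[hap] += f
        new = mutated
    return new
```

Iterating `next_generation` from a starting dictionary of haplotype frequencies and recording `allele_freq` and the pairwise disequilibria at each step gives trajectories of the kind examined in the deterministic simulations.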

Scope of the deterministic simulations:
The parameter space for the deterministic model is large and multidimensional, so we limit our investigation to a relatively narrow subset of parameter values under which measurable linkage disequilibrium is likely to be present. Using simple frequency arguments, one can conclude that most new mutants arise on relatively common haplotypes; but more unusual events, in which mutants appear on rare haplotypes, are actually of greater interest in hitchhiking studies. THOMSON (1977) showed that hitchhiking will only noticeably perturb allele frequencies and disequilibria when at least one of the neutral alleles initially linked to the mutant is rare. The pairwise disequilibrium value Dab is only large when alleles a and b are at intermediate frequencies and strongly associated in the "coupling" (ab) phase. An initially rare ab haplotype, on which a strongly selected mutant happens to occur, is in a prime position to pass through this range of intermediate frequencies in strong coupling.

In the following simulations, we have (somewhat arbitrarily) set the initial frequency of at least one of the neutral alleles at p(0) = 0.05, to ensure that the ancestral haplotype is sufficiently rare. Table 1 shows parameter values that are typical of the simulations. Here, c is the selected mutant and pa(0) and pb(0) are treated in a symmetric fashion, each assuming the value p(0) = 0.05 while the other takes values between 0.05 and 0.9 in successive runs. Some values of the initial pairwise disequilibrium D'ab (0) rule out certain combinations of pa(0), pb(0), and pc(0) in Table 1, but the same treatments are always applied to the a and b alleles.


 

 
Table 1. c is the new mutant under selection

Values of the remaining parameters were guided by a few basic rules. Hitchhiking is thought to be a weak force unless selective values are roughly an order of magnitude greater than recombination rates (MAYNARD-SMITH and HAIGH 1974; THOMSON 1977; KAPLAN et al. 1989), so we have chosen selection and recombination parameters accordingly. Generally, for each setting of pa(0), pb(0), pc(0), D'ab(0), r1, and r2 that was investigated, we examined a basic series of runs formed by 7 x 10 = 70 pairs of selection coefficients (for example, sC and sc as shown in Table 1). Combinations of parameter values that involved interactions beyond those of primary interest [for example, D'ab(0) = -0.25 and r1 ≠ r2 in Table 1] were left unexamined to keep the number of runs reasonable. More detailed tables are in GROTE (1996) and are available upon request.

Within these guidelines, our first objectives are to substantially enlarge upon the number of deterministic cases examined in ROBINSON et al. (1991A), and to investigate some cases where the relationship between the {delta} values is inconsistent with correct inference of the selected locus.

CDV in a stochastic neutral model:
Our second aim is to study the performance of CDV in a neutral, finite-population model, to determine whether or not genetic drift and sampling effects can produce patterns of linkage disequilibrium conforming to criteria 1 or 2 above. ROBINSON et al. (1991A) used an ad hoc method to study {delta} values under genetic drift.

We have modified a computer program of HUDSON (1983, 1985) to study {delta} values in the neutral model. The program simulates random samples of three-locus haplotypes, generated under the neutral "infinite alleles" model with recombination at equilibrium. The program requires the following input parameters: n, the number of haplotypes per sample; 4Nc, the scaled recombination rate between the A and C loci (the B locus is assumed to be halfway between A and C); {theta}a, {theta}b, and {theta}c (with, e.g., {theta}a = 4N µa, where N is the effective population size and µa is the mutation rate to new A-locus alleles). We used the value {theta} = 0.2 at each locus, corresponding to the approximate numerical solution of

\[ E[K_n] = \sum_{i=0}^{n-1} \frac{\theta}{\theta + i} = 2 \]

(EWENS 1979), with E[Kn] the expected number of alleles (per locus) in the sample and n = 100. As one would expect, not all samples generated at {theta} = 0.2 were segregating exactly two alleles at each locus, so we screened each sample and retained only those with diallelic loci. We further required in each sample a standard minimum level of heterozygosity, H >= 0.095 per locus. We then calculated the three values, {delta}(a)bc, {delta}a(b)c, and {delta}ab(c) in each accepted sample. For each of three levels of recombination 4Nc, we generated independent samples until 1000 samples had met the screening criteria; our stochastic simulation results are based on these groups of 1000 samples.
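The choice of {theta} can be checked numerically. The sketch below assumes the standard infinite-alleles expectation E[Kn] = sum over i from 0 to n - 1 of {theta}/({theta} + i) and a target of two alleles per locus; the bisection solver and the function names are illustrative, and computing heterozygosity as 2p(1 - p) without a sample-size correction is an assumption.

```python
def expected_num_alleles(theta, n):
    """Expected number of alleles in a sample of n under infinite alleles."""
    return sum(theta / (theta + i) for i in range(n))


def solve_theta(target_k, n, lo=1e-6, hi=10.0, iters=100):
    """Bisection for theta such that E[K_n] = target_k (E[K_n] rises with theta)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expected_num_alleles(mid, n) < target_k:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def heterozygosity(p):
    """Diallelic heterozygosity 2p(1 - p) (assumed: no sample-size correction)."""
    return 1.0 - p ** 2 - (1.0 - p) ** 2
```

Under these assumptions, `solve_theta(2.0, 100)` lands close to 0.2, and a minor-allele frequency of 0.05 gives a heterozygosity of 2(0.05)(0.95) = 0.095, which matches the screening threshold.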


*  RESULTS

Deterministic simulations:
Figures 1, 2, and 3 show sample runs of a deterministic model in which c is the selected mutant and the A and B loci are neutral. In Figures 1, 2, and 3, recombination rates, initial allele frequencies at the A and C loci, and mutation and selection parameters are the same; only the initial frequency of the b allele varies between the figures.



 
Figure 1. Deterministic simulation for selection at c, with D'ab(0) = 0.0 , r1 = r2 = 0.001, pa(0) = 0.05, pb(0) = 0.1, pc(0) = 0.001, sC = 0.025, sc = 0.075.



 
Figure 2. Deterministic simulation for selection at c, the same as Figure 1 except pb(0) = 0.5.



 
Figure 3. Deterministic simulation for selection at c, the same as Figure 1 except pb(0) = 0.05.

In the allele frequency plots, pc(t) approaches its equilibrium value of 0.25, then slowly declines due to mutation (not evident in these plots). Frequencies of the a and b alleles both increase due to hitchhiking with the selected mutant c. The frequencies ultimately attained by a and b depend on their initial frequencies, Dab(0), the strength of selection on c, and the recombination rates between these loci (MAYNARD-SMITH and HAIGH 1974; THOMSON 1977). The initial value of Dab is zero in each of the runs, and initial values of Dac and Dbc are small positive numbers reflecting the associations of a and b with the new mutant c. All three values of D increase with the hitchhiking effect, then slowly decrease. Because there is only a single selected locus in these runs, the equilibrium value of all disequilibria D and each {delta} is zero. In the deterministic model without selection on c, Dab would remain at zero, whereas Dac and Dbc would decline from their initial values to zero without a transient increase. In these runs, it is only after the disequilibrium measures D have attained relatively large values that deviations from {delta} = 0 are observed.
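The value 0.25 agrees with the standard one-locus overdominance equilibrium for the fitness scheme wcc = 1 - sc, wCc = 1, wCC = 1 - sC, evaluated at the selection coefficients sC = 0.025 and sc = 0.075 given in the caption of Figure 1. The worked step below assumes that mutation is negligible on this time scale:

```latex
\hat{p}_c \;=\; \frac{s_C}{s_C + s_c}
        \;=\; \frac{0.025}{0.025 + 0.075}
        \;=\; 0.25
```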

In Figure 1, {delta} values between roughly generations 100 and 300 satisfy criteria 1 or 2 to correctly indicate selection at the C locus. Later in the run {delta} values conform to criterion 3, where no conclusions about selection would be made. Figure 2 conforms entirely to criterion 3, having no signal for selection during the run. In Figure 3, both the b and c alleles meet criteria for selection at different times in the run, although only c is under selection. In particular, applying the CDV criteria at any time between generations 320 and 550, we could conclude that the neutral b allele is in fact under positive selection (Figure 3 is similar to pattern II' in Figure 3 of ROBINSON et al. 1991A). Inferences about the selected allele based on disequilibrium values at a single time-point could indeed be misleading in Figure 3, where knowledge of the whole history of the selective event might seem necessary for correct inference.

The performance of the CDV criteria in a large series of deterministic runs is summarized in Table 1 and Table 2. Following the discussion above, we have classified each run by determining which alleles, if any, the CDV criteria would indicate as "selected." The run of Figure 1 shows a correct signal at the C locus for 100 <= t <= 300 but gives no signal for selection otherwise, and is counted under the column "signal at c alone" in Table 1. Sampling such a run at an arbitrary time, we might draw no conclusions, but would not incorrectly identify a neutral allele as selected. The run of Figure 2 gives no signal for selection at all and is counted under "no signal" in Table 1. The run of Figure 3 gives, for 320 <= t <= 550, a misleading signal for selection at the neutral B locus and is counted under "signal at b" in Table 1 (there is a similar column for "signal at a"). Because there are no runs in this series with signals at both neutral loci, each run falls into only one of these categories. We have chosen a conservative classification that emphasizes times during which CDV leads to incorrect inferences. In the text below, we describe broad trends and give some breakdowns of the runs that would not be evident by examining Table 1 and Table 2 alone. We use percentages in the tables and text as convenient summaries, but do not view these as probabilities.


 

 
Table 2. b is the new mutant under selection: pc(0) = 0.05, r2 = 0.001

In 26.1% (676/2590) of the runs in Table 1, the only signal identifying an allele under selection correctly points to c as the selected mutant. The CDV criteria identify the selected locus most reliably when the b allele is initially of moderate frequency: 50.5% (283/560) of the runs in Table 1 with pb(0) = 0.3, 0.4, or 0.5 and pa(0) = 0.05 resulted in the c allele being correctly identified, 35.5% (199/560) led to a possible misidentification of the b allele, and the remaining 13.9% (78/560) gave no signal for selection. CDV also performs well when the initial disequilibrium between the neutral alleles is negative: 46.7% (294/630) of the runs with D'ab(0) = -0.25 correctly identified the c allele and only 20.8% (131/630) gave false signals at a or b. CDV performs poorly when the initial frequencies of the neutral loci differ widely, resulting in a false signal for selection at the rarer of the two neutral alleles: 52.1% (219/420) of the runs with pa(0) = 0.75 or 0.9 and pb(0) = 0.05 gave a false signal at b, and 33.3% (140/420) of the runs with pa(0) = 0.05 and pb(0) = 0.75 or 0.9 gave a false signal at a. In general, CDV does a poor job identifying the selected allele when the b allele is rare: 62.1% (956/1540) of the runs with pb(0) = 0.05 gave a false signal for selection at the b allele. When the b allele is initially rare, the CDV criteria do not distinguish well between the new selected mutant c and its closest neutral neighbor.

In these simulations, when sc <= 0.0 and sC > 0.0, the new mutant c will be transiently fixed in the population (often called a "selective sweep"). In the selective-sweep runs, 24.8% (257/1036) gave a correct signal from the c allele, 50.5% (523/1036) led to a possible misidentification of the b allele, and 4.2% (43/1036) to a possible misidentification of the a allele. In the next section, we will examine why CDV may not perform especially well in a selective sweep.

As one might imagine, there is a trend in the reliability of inferences associated with the ratio sc/sC: for fixed values of the remaining parameters, with both sc, sC > 0, runs with larger values of sc/sC tend to have no signal, those with smaller values of sc/sC tend to have incorrect signals from the b allele, and those with intermediate values of sc/sC allow the CDV method to perform best. The critical values of the ratio sc/sC depend in a complex way on the remaining parameters and appear to be different in each series of runs.

Table 2 is similar in structure to Table 1, except now b is the new selected mutant and the A and C loci are neutral. Here, by symmetry, there is no need to switch the roles of the neutral loci, and we use pc(0) = 0.05, r2 = 0.001 throughout. When b is the new mutant, a relatively small number of runs have a potentially misleading signal at a neutral locus, and nearly all are cases in which the neutral allele frequencies pa(0) and pc(0) differ widely [i.e., pa(0) = 0.75 or 0.9 and pc(0) = 0.05].

The role of allele frequencies at a closely linked neutral locus:
The most problematic observation in the simulations above was a strong tendency for the CDV method to indicate selection at the B locus when c was the new selected mutant. Using some mathematics and general aspects of the hitchhiking model, it is possible to show that a rare neutral allele on the ancestral haplotype can easily be mistaken for the selected allele, when using the CDV method. The analysis requires some simplifying assumptions, but gives some generality to the results of the deterministic simulations, showing that our observations do not depend strongly on particular choices of parameter values.

An overdominance model: We examine the behavior of {delta}a(b)c, the {delta} value that indicates selection at the B locus, during the rapid increase of a new, strongly overdominant c allele. To avoid dealing with the time component explicitly, we focus on {delta}a(b)c at t = 0, t "small" (a few generations) and t "moderate" (on the order of 100 to a few hundred generations). We assume that r1 and r2 are small enough so that recombination in the ancestral haplotype abc can be practically ignored when t is near zero, and further assume that pb(0) is small enough so that b and c are in strong coupling for small-to-moderate t. Low recombination and strong coupling of b and c imply that fab(t), fac(t), and fbc(t) are all approximately equal to pc(t) for small-to-moderate t. We finally assume Dab(0) = 0, but due to hitchhiking, all of Dab, Dac, and Dbc are positive after a few generations of selection.

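To characterize {delta}a(b)c, we must study the relationship between D'ac and D''a(b)c for t = 0, t small and t moderate. For convenience, the required definitions when Dac >= 0 are

\[ D'_{ac} = \frac{D_{ac}}{\max D_{ac}}, \qquad \max D_{ac} = \min(p_a q_c,\; q_a p_c), \]

and

\[ D''_{a(b)c} = \begin{cases} D_{ac} / {\max}^{*}D_{ac} & \text{if } {\min}^{*}D_{ac} \le 0, \\ (D_{ac} - {\min}^{*}D_{ac}) / ({\max}^{*}D_{ac} - {\min}^{*}D_{ac}) & \text{if } {\min}^{*}D_{ac} > 0, \end{cases} \]

where

\[ {\max}^{*}D_{ac} = \min(p_a q_c,\; q_a p_c,\; M_1,\; M_2), \qquad {\min}^{*}D_{ac} = -\min(p_a p_c,\; q_a q_c,\; m_1,\; m_2), \]

and

\[ \begin{aligned} m_1 &= p_a p_b p_c + q_a q_b q_c + D_{ab} + D_{bc}, & m_2 &= p_a q_b p_c + q_a p_b q_c - D_{ab} - D_{bc}, \\ M_1 &= p_a p_b q_c + q_a q_b p_c + D_{ab} - D_{bc}, & M_2 &= p_a q_b q_c + q_a p_b p_c - D_{ab} + D_{bc}. \end{aligned} \]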

Because all of Dab, Dac, and Dbc are >=0 by assumption, the sign of min*Dac is determined entirely by the relative sizes of the positive and negative terms in m2. When the disequilibria Dab and Dbc in m2 are small relative to the third-order products of allele frequencies (as they will tend to be for t near zero), m2 >= 0 and min*Dac <= 0. When the disequilibria are large relative to the third-order products (as they tend to be for moderate t), m2 < 0 and min*Dac > 0.

At t = 0, fac = pc, and because the new mutant c is found only with a, Dac = max Dac. Further, the inequality

\[ {\max}^{*}D_{ac} \le \max D_{ac} \]

must hold, because the set (paqc, qapc, M1, M2) that determines max*Dac contains the set that determines max Dac. Dac = max Dac then implies max*Dac = max Dac, and therefore

\[ \delta_{a(b)c} = |D'_{ac}| - |D''_{a(b)c}| = 0 \]

for t = 0.

For small t, with the disequilibria of m2 still small relative to the third-order products, the reasoning is very similar. Because the new mutant c is still found almost exclusively on the ancestral haplotype, Dac {approx} max Dac to a good approximation, so it also must be true that max*Dac {approx} max Dac. We then have {delta}a(b)c {approx} 0 for small t.

The situation changes when the disequilibria of m2 are large relative to the third-order products, so that m2 < 0 and min*Dac > 0; here, we must use the second case in the definition of D''ac above. We further observe that when the loci are evenly spaced, recombination begins relatively soon to reduce Dac below its two-locus maximum (compared to Dab and Dbc), although all of the disequilibria may have dropped below earlier large values due to allele frequency constraints. Now consider t moderate, with min*Dac > 0 and Dac < max Dac. To determine the sign of {delta}a(b)c, we must examine as before the relative magnitudes of max Dac and max*Dac. It is convenient to use algebraically equivalent expressions for the terms M1 and M2 in max*Dac:

\[ M_1 = q_a p_c + f_{ab} - f_{bc}, \qquad M_2 = p_a q_c - f_{ab} + f_{bc}. \]

Under our assumptions, fab {approx} fbc {approx} pc for small-to-moderate t to a reasonable approximation, and therefore M1 {approx} qapc, M2 {approx} paqc. We then may write

\[ {\max}^{*}D_{ac} \approx \min(p_a q_c,\; q_a p_c) = \max D_{ac}. \]

Along with Dac < max Dac, this implies

\[ D_{ac} \cdot {\min}^{*}D_{ac} < \max D_{ac} \cdot {\min}^{*}D_{ac}, \]

which is algebraically equivalent to

\[ \frac{D_{ac} - {\min}^{*}D_{ac}}{\max D_{ac} - {\min}^{*}D_{ac}} < \frac{D_{ac}}{\max D_{ac}}, \]

or {delta}a(b)c > 0. Putting the above together, we have shown that {delta}a(b)c {approx} 0 for t = 0 and t small, but {delta}a(b)c > 0 for t moderate.
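The final step of this argument can be checked with a short piece of algebra. In the sketch below, D, m, and M are shorthand introduced here for Dac, min*Dac, and max Dac (with max*Dac taken as approximately equal to max Dac), and the ordering 0 < m < D < M is the situation assumed for moderate t:

```latex
\frac{D}{M} - \frac{D - m}{M - m}
  = \frac{D(M - m) - M(D - m)}{M(M - m)}
  = \frac{m\,(M - D)}{M(M - m)} > 0
  \qquad \text{whenever } 0 < m < D < M .
```

The difference is the approximate value of D'ac - D''a(b)c, so it is positive exactly when Dac lies strictly below its two-locus maximum while the three-locus lower bound is positive.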

Using very similar arguments, it is possible to show that {delta}ab(c) >= 0 during the same time interval, so that the same general mechanisms give the "correct" signal at the C locus. The contrasting result {delta}(a)bc {approx} 0 can be obtained using the same detailed arguments, or more easily can be obtained by noting that Dbc remains very close to max Dbc during the time interval of interest. Taken together, these arguments suggest that under our assumptions the CDV criteria could indicate selection at either the B or C loci, but not at the A locus.

A selective sweep model: A second basic model may be handled without doing any further analysis. For the selective sweep case, we assume sc <= 0 and sC > 0, so that the selected mutant c will be fixed, but the remaining assumptions are the same. The transient dynamics of allele and haplotype frequencies are the same as in the overdominance model, with perhaps minor differences in time scale; the main difference is in the endpoint of the selection process. MAYNARD-SMITH and HAIGH (1974) showed that an allele at a polymorphic locus closely linked to a new favored mutant may readily fix with the selected mutant. In our model, if the c allele fixes in a small number of generations, the time during which Dac and Dbc are positive will be very short, since these equal zero once the C locus has become monomorphic. If there has been very little recombination with A- or B-bearing haplotypes by the time c fixes, Dab will also depart only transiently from zero, because a and b will then nearly fix with c. There appears to be only a small time frame in which we could observe any disequilibrium, hence any change in {delta} values, in the sweep model. The basic reasoning of the previous section again suggests that during this time, a signal for apparent selection is possible from either the selected locus or a nearby neutral locus carrying a rare allele.

Stochastic simulations:
We have calculated {delta} values in simulated random samples from a stochastic, neutral diallelic model, to informally investigate the "type I" error in the CDV method. The three-locus neutral model is perhaps the simplest null model that would be considered for data of the type used for CDV. HUDSON (1985) investigated the sampling distribution of the pairwise disequilibrium D using a similar approach. Pairwise D values have been treated analytically under the neutral model by HILL (1975), GOLDING (1984), and HILL and WEIR (1988).

Distributions of the three {delta} values in samples of size n = 100 are shown in Figure 4 for 4Nc = 10, 25, and 100. The histograms of Figure 4 show only the univariate (marginal) distributions of {delta} values and contain no information about the associations within samples of the three {delta} values. At each value of 4Nc, the {delta} = 0 class is by far the most common for each of {delta}(a)bc, {delta}a(b)c, and {delta}ab(c), with {delta} > 0 relatively uncommon. Negative values of {delta} are more common than positive values when {delta} departs from zero. Relative to {delta}(a)bc and {delta}ab(c), {delta}a(b)c is more often different from zero.



Figure 4. Distributions of {delta} values under a stochastic neutral model for three values of 4Nc (where c is the recombination rate per generation between the A and C loci and N is the effective population size). Each individual data set consists of n = 100 three-locus haplotypes, having exactly two alleles segregating at each locus, with per-locus heterozygosities of at least 0.095. The distributions are based on 1000 independent data sets, generated using a modified program of HUDSON (1983, 1985). Bars above {delta} = 0 are not drawn to scale with the remaining bars; instead the frequencies at {delta} = 0 are indicated in the figure.

The frequencies of apparent "hitchhiking" events, obtained by applying the CDV criteria to the samples in Figure 4, are shown in Table 3. For 4Nc = 10, each locus satisfies the criteria for selection in a small percentage of cases: here one can expect to find a signal for selection at some locus perhaps 8 to 9% of the time, using the CDV criteria in a neutral sample. With 4Nc >= 25, however, any apparent signal for "selection" based on the CDV criteria would be unusual. In concordance with the deterministic simulations, although here there is no selection, we obtain false signals for selection at the B locus more often than at A or C (as expected, the A and C loci give similar results). We take this as further evidence of a "position" effect that favors the middle locus.


 

 
Table 3. Frequency of signal for apparent selection: neutral samples (n = 100)

Marker haplotypes from human chromosome 6:
To illustrate one use of the CDV method, we have calculated {delta} values in a series of three-locus microsatellite haplotypes in the 6p21.3-22.1 region of human chromosome 6 (see Figure 5). We do not presume any of these marker loci are selected, but suppose instead that perhaps one or more markers could be closely linked to a selected gene.



Figure 5. Physical map of seven dinucleotide repeat markers in the 6p21.3-22.1 region of human chromosome 6. Approximate intermarker distances are based on the YAC contig and STS maps of MOSSER et al. (1997). Allele frequency distributions are for the sample of 70 ethnic Germans provided by L. Calandro and G. F. Sensabaugh (SENSABAUGH et al. 1996). The x and y axes of the histograms are labeled according to repeat number and frequency in the sample, respectively.

We used a "sliding window" approach, examining in turn each of the five groups of three adjacent markers among the seven markers shown in Figure 5. Human leukocyte antigen (HLA)-F3' and myelin oligodendrocyte glycoprotein (MOG)c are dinucleotide repeats closely linked to the HLA F locus and the MOG locus, respectively. HLA-A, a major histocompatibility complex class I locus, is located between the D6S265 and HLA-F3' markers shown in Figure 5 (LAUER et al. 1997; MOSSER et al. 1997). This region contains other loci of biological and evolutionary interest and has been the focus of recent intensive efforts to map the hereditary hemochromatosis locus, now known to be ~2.2 Mb telomeric to D6S464 (FEDER et al. 1996; LAUER et al. 1997; MOSSER et al. 1997). The data we used are from a sample of 70 randomly ascertained ethnic Germans and were generously provided by L. Calandro and G. F. Sensabaugh (see SENSABAUGH et al. 1996). We used an expectation-maximization (EM) algorithm to estimate haplotype frequencies from multilocus genotypes (BAUR and DANILOVS 1980), working separately with each group of three adjacent markers. The EM algorithm provides haplotype frequency estimates for all possible combinations of alleles at the three loci, some of which have very low estimates and are unlikely to actually be in the sample. For further calculations, we retained only those three-locus haplotypes in which the constituent two-locus estimates were at least 0.05. Seventeen three-locus haplotypes in all met this minimum frequency threshold: 3 haplotypes of the D6S265/HLA-F3'/MOGc loci, 5 haplotypes of the HLA-F3'/MOGc/D6S258 loci, 3 haplotypes of the MOGc/D6S258/D6S306 loci, 4 haplotypes of the D6S258/D6S306/D6S105 loci, and 2 haplotypes of the D6S306/D6S105/D6S464 loci. All are combinations of the few most common alleles at each locus.
We calculated {delta} values in each of these 17 haplotypes, converting to dialleles by combining the alleles not under consideration into a single class. All 3 haplotypes of the D6S265/HLA-F3'/MOGc loci had disequilibrium patterns conforming to criteria 1 or 2 (Table 4), but none of the remaining 14 haplotypes met these criteria.
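The window construction and the diallelic recoding described above can be sketched as follows. This is a minimal illustration, not the authors' program: the function names are hypothetical, the marker order is read off the five windows listed in the text, and the example haplotype frequencies are invented purely to show the pooling step.

```python
from collections import defaultdict

# Marker order inferred from the five three-marker windows listed in the text.
MARKERS = ["D6S265", "HLA-F3'", "MOGc", "D6S258", "D6S306", "D6S105", "D6S464"]


def sliding_windows(markers, width=3):
    """Return every run of `width` adjacent markers, in map order."""
    return [tuple(markers[i:i + width]) for i in range(len(markers) - width + 1)]


def collapse_to_diallelic(hap_freqs, focal):
    """Recode multi-allelic haplotypes relative to one focal haplotype.

    hap_freqs maps (allele_1, allele_2, allele_3) -> estimated frequency.
    At each locus an allele is coded 1 if it equals the focal allele and 0
    otherwise, so all alleles not under consideration fall into one class.
    """
    collapsed = defaultdict(float)
    for hap, freq in hap_freqs.items():
        key = tuple(int(a == f) for a, f in zip(hap, focal))
        collapsed[key] += freq
    return dict(collapsed)


windows = sliding_windows(MARKERS)

# Hypothetical repeat-number haplotypes and frequencies (illustration only).
example = {(12, 5, 9): 0.30, (12, 5, 7): 0.20, (10, 5, 9): 0.25, (11, 6, 8): 0.25}
diallelic = collapse_to_diallelic(example, focal=(12, 5, 9))
```

Recoding against each retained haplotype in turn gives one diallelic three-locus system per haplotype, which is the form in which the {delta} values are defined.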


 

 
Table 4. Ethnic German sample haplotypes with signal for selection


*  DISCUSSION

Inferences with CDV:
In the deterministic runs, where the new selected mutant appeared at a terminal locus (the C locus), the CDV method did not distinguish well between the selected locus and a neutral neighbor, especially when a relatively rare allele of the neutral locus was initially linked with the selected mutant. In this case a signal for apparent selection could be detected from either the selected locus or the neutral locus. This situation is unfortunate and somewhat paradoxical, because we have argued that selected mutants that form on rare haplotypes create the most significant linkage disequilibrium in a hitchhiking scenario. To some extent, the CDV method is sensitive to each of the parameters of the model, but we discovered in particular a sensitivity to allele frequencies at the middle locus (the B locus). We showed, using an analytical approach under the assumption of strong selection and tight linkage, that a rare neutral allele at the B locus may easily be mistaken by the CDV criteria for the selected mutant c.

LEWONTIN (1988) showed that the normalized pairwise measure D'ab and other related measures are not in any general sense independent of the underlying allele frequencies pa and pb, although they are routinely treated as such. The CDV method uses both D' and D'' at each pair of the three-locus system, where D'' incorporates further one- and two-locus frequency constraints. In light of LEWONTIN's (1988) results, it is not unexpected that CDV shows a sensitive dependence on allele and haplotype frequencies, as well as on other parameters of the model.
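As a reminder of how the allele frequencies enter the normalization, the sketch below computes the pairwise disequilibrium D and the standard two-locus normalization D' = D / max D. It covers only the conventional D'; the three-locus measure D'' is not reproduced here, and the function name and example frequencies are hypothetical.

```python
def pairwise_disequilibrium(p_ab, p_a, p_b):
    """Return (D, D') for alleles a and b, given the haplotype frequency p_ab
    and the allele frequencies p_a and p_b."""
    d = p_ab - p_a * p_b
    if d >= 0:
        # Largest positive D compatible with the allele frequencies.
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        # Largest magnitude of negative D compatible with the allele frequencies.
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime


# Two complete associations (D' = 1) whose raw D values differ roughly fivefold
# because the bound max D depends on the allele frequencies.
d_common, dp_common = pairwise_disequilibrium(p_ab=0.50, p_a=0.50, p_b=0.50)
d_rare, dp_rare = pairwise_disequilibrium(p_ab=0.05, p_a=0.05, p_b=0.05)
```

Here both pairs are in complete association, yet D is 0.25 for the common alleles and 0.0475 for the rare ones, so the range available to D, and hence to any normalized measure, is set by pa and pb.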

In our deterministic simulations, when the middle locus (the B locus) had the new selected mutant, the CDV method gave correct inferences in a large majority of runs. It is difficult to put this attractive result into practice in the inference setting, because a signal for apparent selection at the B locus could indeed reflect selection at the locus or could be a false signal of the type that was commonly observed when c was the selected allele. One remedy might be to confine inferences to terminal loci, perhaps obtaining additional markers that could place any locus of interest at the "A" or "C" positions of our model. This assumes we could be virtually certain about inferences at terminal loci, an assertion that is contradicted by the fraction of deterministic runs of Table 1 and Table 2 in which a signal appears at an unselected terminal locus. It further seems possible that a generalization of our analytical approach, which relaxes assumptions about position, could show that inferences about terminal loci may not be reliable in the presence of rare neutral alleles. We think at present that the CDV method may not allow for high-precision inferences about the location of selected mutants; on this point we depart from ROBINSON et al. (1991a).

The stochastic simulations showed that patterns of linkage disequilibrium conforming to criteria 1 or 2 are uncommon for 4Nc >= 10 and highly unusual for 4Nc as large as 100. Here, we think there is potential inference value in the CDV method, because a simple neutral model can apparently be ruled out if either criterion 1 or 2 is met in a moderate-sized sample, with 4Nc on the order of 100. At this point, other nonselective alternative hypotheses (such as the neutral model with population structure or migration) cannot immediately be ruled out; this requires work beyond our current scope.

Although we do not think that CDV can very accurately distinguish the particular locus that has the selected allele, we do think that CDV can be used to screen for fairly localized regions that may have a recent history of hitchhiking (in general agreement with ROBINSON et al. 1991a). The basic requirements appear to be that the terminal loci span at least a distance of 4Nc = 10 (with the third locus roughly intermediate), that there is a standard minimum level of heterozygosity H >= 0.095 at each locus, and that there is moderately strong, but not complete, linkage disequilibrium in the region.

Selected mutations and linked markers at equilibrium:
We now describe a simple model of recurrent selected mutations and address some implications for CDV and similar methods. The simplest model assumes that selected alleles arise at random points in the genome. If such events are rare, the influence of new selected alleles on linked loci is transient: eventually the new mutant reaches equilibrium, and recombination, mutation, and genetic drift again dominate the dynamics of linked loci. Under this simple model, neutral alleles linked to a new overdominant mutant will increase in frequency and may reach high levels of disequilibrium, but do not generally fix (because the overdominance mode tends to preserve extant variation). Two such loci will return to neutral frequency and phase equilibria, respectively, at rates 1 - 1/(2N), the rate of loss of heterozygosity at either locus (with N the effective population size; see, e.g., CROW and KIMURA 1970), and 1 - c - 1/(2N), the rate of decay of linkage disequilibrium between the loci (for the random union of gametes model, where c is the recombination rate; HILL 1974). In the selective sweep case, if a neutral allele fixes with the new mutant, the time until polymorphism could be reestablished at the neutral locus is on the order of 1/µ, where µ is the neutral mutation rate (CROW and KIMURA 1970). If either overdominant or favored mutants reoccur in a particular region over relatively short time scales, and the recovery of linkage equilibrium or polymorphism is inadequate, reperturbation by successive hitchhiking events may not be detectable. Even if selected mutants appear only rarely, the availability of adequate polymorphism at closely linked sites, on which linkage disequilibrium could be recorded, may be in question; for if {theta} = 4Nµ is small, a majority of linked neutral sites will be monomorphic.
We have further claimed that disequilibrium created by hitchhiking is primarily connected to rare events in which selected mutants appear on low-frequency haplotypes. In particular, these impediments suggest that in chromosomal regions thought to be subject to recurrent selective sweeps (AGUADÉ et al. 1989; BEGUN and AQUADRO 1994, 1995), the linkage disequilibrium that is indeed observed is primarily the result of mutational or other events that occurred since the most recent sweep.
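To give a feel for the two per-generation rates quoted above, the sketch below converts each into the number of generations needed for the quantity to halve. It treats both as simple geometric decay factors, which is an assumption of this illustration; N = 10^5 matches the worked example later in this section, while c = 0.001 is a hypothetical recombination rate.

```python
import math


def generations_to_fraction(rate, fraction=0.5):
    """Generations t such that rate**t == fraction, for 0 < rate < 1."""
    return math.log(fraction) / math.log(rate)


N = 10**5   # effective population size, as in the later worked example
c = 0.001   # hypothetical recombination rate between the two loci

het_rate = 1 - 1 / (2 * N)      # per-generation factor for heterozygosity
ld_rate = 1 - c - 1 / (2 * N)   # per-generation factor for linkage disequilibrium

het_half_life = generations_to_fraction(het_rate)
ld_half_life = generations_to_fraction(ld_rate)
```

With these values heterozygosity halves in roughly 139,000 generations, whereas linkage disequilibrium halves in under 700, so for loci that recombine at all appreciably the phase equilibrium is recovered far faster than drift erodes variation.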

For tightly linked loci, patterns of linkage disequilibrium conforming to criteria 1 or 2 persist approximately as long as the time required for the new mutant to reach equilibrium (THOMSON 1977, and our deterministic simulations). If we assume strong selection and large N, and confine our attention only to mutants that invade the population, the expected time until the new mutant fixes in the selective sweep model is approximately

generations (EWENS 1973; EWENS 1979, p. 149). Here, (s/2)p(1 - p) is the approximate deterministic change per generation in the frequency p of a favored allele a, where fitnesses are wAA = 1 - s/2, wAa = 1, waa = 1 + s/2. Using the same reasoning in the symmetric overdominance model, with fitnesses wAA = 1 - s, wAa = 1, waa = 1 - s (so that the fitness differential between the most extreme genotypes is s in both cases), the expected time for the new mutant to reach the interior polymorphism is approximately

generations. These persistence times can be small relative to the times required for the recovery of linkage equilibrium or neutral levels of polymorphism. For example, if N is 10^5, s = 0.01, and µ = 10^-5, the persistence time for CDV-type patterns of linkage disequilibrium is <5000 generations in the selective sweep model, whereas if most extant variation is lost during the sweep, 10^5 generations on average are required to reestablish polymorphism at monomorphic sites, and during this period no new CDV-type patterns could be observed.
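The numerical comparison can be checked with a short calculation. The displayed expressions for the persistence times are not available in this text, so the sketch below rests on an assumption: that the sweep time is the integral of 1/[(s/2)p(1 - p)] over p from 1/(2N) to 1 - 1/(2N), which evaluates to (4/s) ln(2N - 1). That form is consistent with the "<5000 generations" figure quoted above, but it should be read as a plausible reconstruction rather than the authors' stated formula; the function name is hypothetical.

```python
import math


def sweep_persistence_time(N, s):
    """Deterministic time for a favored allele to go from 1/(2N) to 1 - 1/(2N).

    Assumes the per-generation change is (s/2) * p * (1 - p), so the time is
    the integral of dp / ((s/2) p (1 - p)), i.e. (4 / s) * ln(2N - 1).
    """
    return (4.0 / s) * math.log(2 * N - 1)


N = 10**5
s = 0.01
mu = 1e-5

persistence = sweep_persistence_time(N, s)   # roughly 4.9e3 generations
recovery = 1.0 / mu                          # order of the time to regain polymorphism
ratio = recovery / persistence               # roughly 20-fold
```

Under these assumptions the window in which CDV-type patterns could be seen is about one-twentieth of the time needed to restore polymorphism after a sweep.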

Human chromosome 6 haplotypes:
In Table 4, we showed three haplotypes of the D6S265/HLA-F3'/MOGc loci that met the CDV criteria for hitchhiking. HLA-F3' and MOGc are physically close, so we must make a rough assessment of 4Nc between these loci if we wish to compare the data with the neutral simulations of Figure 4 and Table 3. Although there is apparently no family data that give precise estimates of the recombination fraction between HLA-F3' and MOGc, the physical distance between these loci is known to be approximately 100–150 kb, based on YAC contig and STS maps (MOSSER et al. 1997; HUMAN GENOME DATA BASE 1997). For estimation purposes, we will assume the distance is 100 kb and use N = 2000, perhaps a conservatively low value for a modern European population. If we use the crude conversion 1 Mb {approx} 1.16 cM [obtained by observing that the genome size is equivalently 3200 Mb or 3702 cM in human females (THE HUMAN TRANSCRIPT MAP 1996)], we conclude that 4Nc between HLA-F3' and MOGc is ~9. Thus, the D6S265/HLA-F3'/MOGc haplotype appears to span a distance over which criteria 1 or 2 are not commonly met in the simple neutral model. The setting here is not directly analogous to the null-model calculations of Figure 4 and Table 3 for two main reasons: (i) different three-locus marker haplotypes may share an allele at one or more loci, introducing dependencies not present in the simulated neutral haplotypes; (ii) it is well known that the "infinite alleles" mutation model used for the neutral simulations does not apply to microsatellite loci (see, e.g., VALDES et al. 1993). However, the role of the mutation model, especially given the time scale for mutational events relative to the duration of hitchhiking events, should be minor. Finally, if the scaling of 4Nc is approximately correct, the chances under the neutral model that even one of the D6S265/HLA-F3'/MOGc haplotypes would meet the CDV criteria appear to be small.
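The scaling of 4Nc used above is simple arithmetic and can be reproduced directly. The sketch below uses only the figures given in the text (3702 cM per 3200 Mb, N = 2000, a distance of 100 kb) and additionally assumes that 1 cM corresponds to a recombination fraction of 0.01, which is reasonable over such short distances; the function name is hypothetical.

```python
def scaled_recombination(distance_kb, N, genome_cm=3702.0, genome_mb=3200.0):
    """Return 4Nc for a physical distance, using a genome-wide cM/Mb average."""
    cm_per_mb = genome_cm / genome_mb          # about 1.16 cM per Mb
    distance_cm = (distance_kb / 1000.0) * cm_per_mb
    c = distance_cm / 100.0                    # assumes 1 cM ~ 1% recombination
    return 4 * N * c


four_nc_100kb = scaled_recombination(distance_kb=100, N=2000)   # about 9.3
four_nc_150kb = scaled_recombination(distance_kb=150, N=2000)   # about 13.9
```

Taking the upper end of the 100–150 kb range instead would raise 4Nc to about 14, which does not change the comparison with the neutral simulations at 4Nc = 10 and 25.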

We conclude that hitchhiking with one or more selected alleles, closely linked to the D6S265/HLA-F3'/MOGc loci, is a plausible explanation for the patterns of linkage disequilibrium observed in these haplotypes. Three apparently distinct haplotypes meet criteria 1 or 2, suggesting that hitchhiking with overdominant alleles is the more likely scenario: the data would seem to require otherwise that several favored alleles in the region are simultaneously being selected for, or that an ancestral haplotype bearing a favored allele has experienced several mutation events. We have also argued that the loss of variation under the selective sweep model poses a serious problem for observing disequilibrium, making it unlikely that disequilibrium created specifically by selectively favored alleles would ever be observed. While we have scaled back previous efforts to infer the precise location at which selection has acted, our results are consistent with other work on selection in this region of the human genome (KLITZ and THOMSON 1987; SATTA et al. 1994; PARHAM and OHTA 1996). Our main intention in this example is to demonstrate that evidence for historical selection processes may indeed be found in the patterns of linkage disequilibrium that we have focused on in our investigation of the CDV method.


*  ACKNOWLEDGMENTS

We thank C. H. Langley, who read an earlier draft of the manuscript and made suggestions that led to substantial revisions. The human chromosome 6 haplotypes were collected by L. Calandro and G. F. Sensabaugh, who generously allowed us to use them here. We thank D. Cutler and A. D. Long for discussion and suggestions. An anonymous reviewer made suggestions that improved the presentation. This work was supported by National Institutes of Health grants HD-12731, GM-56688, and 5 T32 GM-07127.

Manuscript received February 6, 1998; Accepted for publication August 7, 1998.


*  APPENDIX 1

THREE-LOCUS DETERMINISTIC RECURSIONS
In the three-locus diallelic model, the eight haplotypes (gametes) are ABC, ABc, AbC, Abc, aBC, aBc, abC, abc, and their respective frequencies in a given generation are $x_1, \ldots, x_8$, with $\sum_{i=1}^{8} x_i = 1$. Let

$$\bar{w}_i = \sum_{j=1}^{8} w_{ij} x_j,$$

where $w_{ij}$ is the fitness of the genotype formed from haplotypes i and j, and let $\bar{w} = \sum_{i=1}^{8} \bar{w}_i x_i$. After selection and recombination, the haplotype frequencies are given by

where

To complete one generation, we need only introduce mutation, which is unidirectional from a, b, and c to A, B, and C, respectively, all at rate µ per generation. After mutation, the haplotype frequencies are

This completes one generation of the recursion.
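The order of operations described in this appendix (selection, then recombination, then unidirectional mutation) can be sketched generically as follows. This is not a transcription of the authors' recursion equations, which are not available in this text. It is a standard random-mating formulation that additionally assumes no crossover interference between the A-B and B-C intervals; the names `one_generation`, `r_ab`, and `r_bc` are hypothetical.

```python
from itertools import product

# Haplotypes coded as (A, B, C) with 1 = upper-case allele, 0 = lower-case allele.
# Order: ABC, ABc, AbC, Abc, aBC, aBc, abC, abc.
HAPLOTYPES = list(product((1, 0), repeat=3))


def one_generation(x, w, r_ab, r_bc, mu):
    """Advance haplotype frequencies x (list of 8) by one generation.

    w[i][j] is the fitness of the genotype formed from haplotypes i and j;
    r_ab and r_bc are recombination rates in the two intervals (assumed
    independent); mu is the per-locus mutation rate from a, b, c to A, B, C.
    """
    n = len(HAPLOTYPES)
    mean_w = sum(x[i] * x[j] * w[i][j] for i in range(n) for j in range(n))

    # (probability, parental strand supplying locus A, B, C) for each gamete type.
    patterns = [
        ((1 - r_ab) * (1 - r_bc), (0, 0, 0)),  # no crossover
        (r_ab * (1 - r_bc), (0, 1, 1)),        # crossover between A and B
        ((1 - r_ab) * r_bc, (0, 0, 1)),        # crossover between B and C
        (r_ab * r_bc, (0, 1, 0)),              # crossovers in both intervals
    ]

    # Selection and recombination over ordered pairs of parental haplotypes.
    after_sel_rec = dict.fromkeys(HAPLOTYPES, 0.0)
    for i, j in product(range(n), repeat=2):
        weight = x[i] * x[j] * w[i][j] / mean_w
        parents = (HAPLOTYPES[i], HAPLOTYPES[j])
        for prob, source in patterns:
            gamete = tuple(parents[s][locus] for locus, s in enumerate(source))
            after_sel_rec[gamete] += weight * prob

    # Unidirectional mutation: each lower-case allele mutates with probability mu.
    after_mut = dict.fromkeys(HAPLOTYPES, 0.0)
    for hap, freq in after_sel_rec.items():
        for target in HAPLOTYPES:
            prob = 1.0
            for old, new in zip(hap, target):
                if old == 1:
                    prob *= 1.0 if new == 1 else 0.0
                else:
                    prob *= mu if new == 1 else 1.0 - mu
            after_mut[target] += freq * prob

    return [after_mut[h] for h in HAPLOTYPES]
```

Iterating over ordered pairs (i, j) and always reading the gamete from the first strand accounts for both reciprocal products of each heterozygote, so no separate factor of 1/2 is needed.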


*  LITERATURE CITED

AGUADÉ, M., N. MIYASHITA, and C. H. LANGLEY, 1989 Reduced variation in the yellow-achaete-scute region in natural populations of Drosophila melanogaster. Genetics 122:607-615.

BAUR, M. P., and J. A. DANILOVS, 1980 Population analysis of HLA-A, B, C and DR and other genetics markers, pp. 955–993 in Histocompatibility Testing 1980, edited by P. TERASAKI. University of California, Tissue Typing Laboratory, Los Angeles.

BEGUN, D. J., and C. F. AQUADRO, 1994 Evolutionary inferences from DNA variation at the 6-phosphogluconate dehydrogenase locus in natural populations of Drosophila: selection and geographic differentiation. Genetics 136:155-171.

BEGUN, D. J., and C. F. AQUADRO, 1995 Evolution at the tip and base of the X chromosome in an African population of Drosophila melanogaster. Mol. Biol. Evol. 12:382-390.

CROW, J. F., and M. KIMURA, 1970 An Introduction to Population Genetics Theory. Burgess Publishing Co., Minneapolis.

EWENS, W. J., 1973 Conditional diffusion processes in population genetics. Theor. Pop. Biol. 4:21-30.

EWENS, W. J., 1979 Mathematical Population Genetics. Springer-Verlag, Berlin.

FEDER, J. N., A. GNIRKE, W. THOMAS, Z. TSUCHIHASHI, and D. A. RUDDY et al., 1996 A novel MHC class I-like gene is mutated in patients with hereditary haemochromatosis. Nat. Genet. 13:399-408.

FELDMAN, M. W., I. FRANKLIN, and G. J. THOMSON, 1974 Selection in complex genetic systems I. The symmetric equilibria of the three-locus symmetric viability model. Genetics 76:135-162.

GEIRINGER, H., 1944 On the probability theory of linkage in Mendelian heredity. Ann. Math. Stat. 15:25-57.

GOLDING, G. B., 1984 The sampling distribution of linkage disequilibrium. Genetics 108:257-274.

GROTE, M. N., 1996 Models of genetic selection and the Human Leukocyte Antigen loci. Ph.D. Thesis, University of California, Berkeley.

HARTL, D. L., and A. G. CLARK, 1989 Principles of Population Genetics. Sinauer Associates, Inc., Sunderland, MA.

HEDRICK, P. W., 1987 Gametic disequilibrium measures: proceed with caution. Genetics 117:331-341.

HILL, W. G., 1974 Disequilibrium among several linked neutral genes in finite population I. Mean changes in disequilibrium. Theor. Pop. Biol. 5:366-392.

HILL, W. G., 1975 Linkage disequilibrium among multiple neutral alleles produced by mutation in finite populations. Theor. Pop. Biol. 8:117-126.

HILL, W. G., and B. S. WEIR, 1988 Variances and covariances of squared linkage disequilibria in finite populations. Theor. Pop. Biol. 33:54-78.

HUDSON, R. R., 1983 Properties of a neutral allele model with intragenic recombination. Theor. Pop. Biol. 23:183-201.

HUDSON, R. R., 1985 The sampling distribution of linkage disequilibrium under an infinite allele model without selection. Genetics 109:611-631.

HUMAN GENOME DATA BASE, 1997 http://www.gdb.org

THE HUMAN TRANSCRIPT MAP, 1996 http://www.ncbi.nlm.nih.gov/SCIENCE96

KAPLAN, N. L., R. R. HUDSON, and C. H. LANGLEY, 1989 The "hitchhiking effect" revisited. Genetics 123:887-899.

KLITZ, W., and G. THOMSON, 1987 Disequilibrium pattern analysis. II. Application to Danish HLA-A and B locus data. Genetics 116:633-643.

LAUER, P., N. C. MEYER, C. E. PRASS, S. M. STARNES, and R. K. WOLFF et al., 1997 Clone-contig and STS maps of the hereditary hemochromatosis region on human chromosome 6p21.3-p22. Genome Res. 7:457-470.

LEWONTIN, R. C., 1964 The interaction of selection and linkage. I. General considerations; heterotic models. Genetics 49:49-67.

LEWONTIN, R. C., 1988 On measures of gametic disequilibrium. Genetics 120:849-852.

MAYNARD-SMITH, J., and J. HAIGH, 1974 The hitch-hiking effect of a favourable gene. Genet. Res. 23:23-35.

MOSSER, J., A. M. JOUANOLLE, G. GANDON, N. ANDRIEUX, and A. HAMPE et al., 1997 A YAC contig and an STS map spanning at least 3.9 megabasepairs telomeric to HLA-A. Immunogenet. 45:447-451.

OHTA, T., and M. KIMURA, 1975 The effect of a selected linked locus on heterozygosity of neutral alleles (the hitchhiking effect). Genet. Res. 25:313-325.

PARHAM, P., and T. OHTA, 1996 Population biology of antigen presentation by MHC class-I molecules. Science 272:67-74.

ROBINSON, W. P., A. CAMBON-THOMSEN, N. BOROT, W. KLITZ, and G. THOMSON, 1991a Selection, hitchhiking and disequilibrium analysis at three linked loci with application to HLA data. Genetics 129:931-948.

ROBINSON, W. P., M. A. ASMUSSEN, and G. THOMSON, 1991b Three-locus systems impose additional constraints on pairwise disequilibria. Genetics 129:925-930.

SATTA, Y., C. O'HUIGEN, N. TAKAHATA, and J. KLEIN, 1994 Intensity of natural selection at the major histocompatibility complex loci. Proc. Natl. Acad. Sci. USA 91:7184-7188.

SENSABAUGH, G. F., L. CALANDRO, T. THORSEN, L. BARCELLOS, and J. GRIGGS et al., 1996 Commentary. Blood Cells Mol. Dis. 22:194a-194b.

STEPHAN, W., and C. H. LANGLEY, 1989 Molecular genetic variation in the centromeric region of the X chromosome in three Drosophila ananassae populations. I. Contrasts between the vermillion and forked loci. Genetics 121:89-99.

THOMSON, G., 1977 The effect of a selected locus on linked neutral loci. Genetics 85:753-788.

THOMSON, G., and M. P. BAUR, 1984 Third order linkage disequilibrium. Tissue Antigens 24:250-255.

VALDES, A. M., M. SLATKIN, and N. B. FREIMER, 1993 Allele frequencies at microsatellite loci: the stepwise mutation model revisited. Genetics 133:737-749.



