Genetics, Vol. 175, 1395-1406, March 2007, Copyright © 2007
doi:10.1534/genetics.106.062828
Department of Statistics, University of Oxford, Oxford OX1 3TG, United Kingdom
1 Address for correspondence: Department of Statistics, 1 S. Parks Rd., Oxford OX1 3TG, United Kingdom.
E-mail: mcvean{at}stats.ox.ac.uk
| ABSTRACT |
|---|
The aim of this article is to provide an intuitive interpretation of the effects of selective sweeps on patterns of LD, through considering the relationship between LD and the structure of the underlying genealogical history. Previous work has shown that there is a direct quantitative relationship between the magnitude of LD observed between a pair of neutral mutations and the correlation structure of the underlying genealogy (MCVEAN 2002). By using the conventional approximation that strong selective sweeps lead to short, star-like genealogies at the selected site, this theory is extended to examine the correlation structure between the genealogies of neutral loci either separated by or adjacent to the selected site. Comparison with the results of stochastic simulation demonstrates that this theory predicts the qualitative and, to some extent, quantitative, behavior of LD around a selective sweep. In addition, the theory identifies the importance of the age of neutral mutations (relative to the selected one) in determining patterns of LD and predicts large differences in the nature of the breakdown of LD around a selective sweep and a recombination hotspot.
| TWO-LOCUS IDENTITIES AND A GENEALOGICAL INTERPRETATION OF LD |
|---|
(HILL and ROBERTSON 1968). For a pair of biallelic loci, with alleles 0 and 1 at locus x and also 0 and 1 at locus y, the statistic is defined as
$$r^2 = \frac{D_{11}^2}{f_x(1-f_x)\,f_y(1-f_y)}, \qquad D_{11} = f_{11} - f_x f_y, \qquad (1)$$
where $f_{11}$ is the sample frequency of the 11 haplotype and $f_x$ is the marginal sample frequency of the "1" allele at locus x (and $f_y$ likewise at locus y). Note that for biallelic loci the value of $D_{11}^2$ does not depend on which allele is assigned the value 1. Consequently, in what follows the subscript for D is omitted.
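As a concrete illustration of the statistic, the following minimal sketch computes r^2 from a list of two-locus haplotypes. It assumes the standard Hill and Robertson form, r^2 = D^2 / [f_x(1 - f_x) f_y(1 - f_y)] with D = f_11 - f_x f_y; the function name `r_squared` and the tuple encoding of haplotypes are hypothetical choices made for this example.

```python
from typing import Sequence, Tuple


def r_squared(haplotypes: Sequence[Tuple[int, int]]) -> float:
    """Return r^2 for two-locus haplotypes coded as (allele_x, allele_y), alleles 0/1.

    Assumes the standard definition r^2 = D^2 / (fx(1-fx) fy(1-fy)),
    with D = f11 - fx*fy (an assumption of this sketch).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("empty sample")
    f11 = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    fx = sum(a for a, _ in haplotypes) / n  # frequency of allele 1 at locus x
    fy = sum(b for _, b in haplotypes) / n  # frequency of allele 1 at locus y
    denom = fx * (1 - fx) * fy * (1 - fy)
    if denom == 0:
        raise ValueError("both loci must be polymorphic in the sample")
    d = f11 - fx * fy
    return d * d / denom
```

Relabeling the alleles at either locus changes the sign of D but not its square, so the function returns the same value whichever allele is coded as 1.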
Ideally, we wish to calculate the expected value of
between alleles at the two loci, conditioning on observing at least one of each allele at each of the two loci in a sample of size n sequences:
![]() | (2) |
:
![]() | (3) |
(i.e., Equation 2) for large sample sizes and when rare variants are excluded.
Previous work (STROBECK and MORGAN 1978; HUDSON 1985) showed that the statistic
can be rewritten in terms of two-locus identity coefficients:
![]() | (4) |
can be written in terms of the expectation of these two-locus identity coefficients. Under the infinite-sites model, in which each polymorphism observed is the result of a single mutation event within the sample's history, it is possible to relate the two-locus identities to the expectations of genealogical properties at the two loci (MCVEAN 2002). For example,
![]() | (5) |
is the coalescence time for sequences i and j at locus x. By obtaining similar expressions for the other two-locus identities and also the denominator of Equation 3, it was shown that
![]() | (6) |
is the Pearson correlation coefficient between the coalescence time for sequences i and j at locus x and the coalescence time for sequences k and l at locus y and CVx is the coefficient of variation in the time to the most recent common ancestor (MRCA) for a pair of randomly sampled chromosomes at locus x,
. Note that there are three correlations in Equation 6, relating to the three sample configurations (see Equation 4 and Figure 1A). The most important implication of Equation 6 is that it provides a quantitative approach for relating patterns of LD to features of the underlying genealogical history. For example, demographic histories in which the population has increased, decreased, or remained constant in size influence LD both through their effects on the correlation structure of genealogies and through their effects on the coefficient of variation in time to the MRCA. In particular, population growth reduces the coefficient of variation, thus reducing LD, while population bottlenecks increase the coefficient of variation, thus increasing LD. The theory can also be extended to more complex situations, such as a series of island populations connected by migration (WAKELEY and LESSARD 2003). In the next section, the theory is extended to the case of a pair of neutral loci linked to a site that has undergone a complete selective sweep in which the beneficial mutation has just reached fixation in the population.
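The dependence of LD on the genealogical correlations and on the coefficient of variation can be explored numerically. The sketch below is illustrative only and rests on two assumptions not stated in the text above: (i) that the approximation takes the ratio form (ρ_ij,ij - 2ρ_ij,ik + ρ_ij,kl) / (1/(CV_x CV_y) + ρ_ij,kl), and (ii) that the standard neutral correlations are (C + 18)/(C^2 + 13C + 18), 6/(C^2 + 13C + 18), and 4/(C^2 + 13C + 18), where C is the population-scaled recombination rate between the loci. The function names are hypothetical.

```python
def neutral_correlations(C: float):
    """Correlations in pairwise coalescence time under the standard neutral model.

    C is the population-scaled recombination rate between the two loci
    (assumed here to be 4*Ne*r). Returns (rho_ij_ij, rho_ij_ik, rho_ij_kl).
    """
    denom = C * C + 13.0 * C + 18.0
    return (C + 18.0) / denom, 6.0 / denom, 4.0 / denom


def sigma_d2(rho_ij_ij: float, rho_ij_ik: float, rho_ij_kl: float,
             cv_x: float = 1.0, cv_y: float = 1.0) -> float:
    """Approximate expected r^2 from correlations and coefficients of variation.

    The ratio form used here is an assumption of this sketch.
    """
    numerator = rho_ij_ij - 2.0 * rho_ij_ik + rho_ij_kl
    denominator = 1.0 / (cv_x * cv_y) + rho_ij_kl
    return numerator / denominator
```

With CV = 1 at both loci (constant population size), the sketch reduces to (10 + C)/(22 + 13C + C^2), which equals 5/11 for completely linked loci. Lowering the CV, as under population growth, lowers the returned value, in line with the qualitative statement above.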
| MODELING GENEALOGIES UNDER A SELECTIVE SWEEP |
|---|
from a selected site (where r is the genetic map distance in Morgans and Ne is the effective population size, assumed to be diploid) can either recombine away from the selected mutation before its removal from the population, with probability p, or not, with probability q = 1 – p. The probability of "escape" is a function of the recombination rate and the frequency trajectory of the selected mutation, itself a random variable determined by the scaled selection coefficient
. By approximating the trajectory of the selected mutation by that of the deterministic expectation, it has been previously shown that
![]() | (7) |
![]() | (8) |
, taken from Equation 8 (Figure 1B). Although this approximation can be criticized (BARTON 1998; DURRETT and SCHWEINSBERG 2004; ETHERIDGE et al. 2006), it nevertheless has proved very useful in analytical treatments of hitchhiking, because of the resulting independence between lineages in whether they recombine away from the selected mutation.
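To give a feel for how the escape probability scales with distance from the selected site, the sketch below uses a widely quoted deterministic-trajectory approximation, q ≈ exp(-(r/s) ln(2N)) with p = 1 - q. This is a stand-in chosen for illustration and may differ in detail from Equations 7 and 8; the function names and the parameters `r` (map distance in Morgans), `s` (selection coefficient), and `N` (population size) are assumptions of the example.

```python
import math


def escape_probability(r: float, s: float, N: float) -> float:
    """Probability p that a lineage recombines off the sweeping background.

    Uses the common approximation p = 1 - exp(-(r / s) * ln(2N)),
    which is an assumption of this sketch.
    """
    if r < 0 or s <= 0 or N <= 0.5:
        raise ValueError("require r >= 0, s > 0 and 2N > 1")
    return 1.0 - math.exp(-(r / s) * math.log(2.0 * N))


def distance_for_escape(p: float, s: float, N: float) -> float:
    """Map distance (Morgans) at which the escape probability equals p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("require 0 <= p < 1")
    return -s * math.log(1.0 - p) / math.log(2.0 * N)
```

For example, `distance_for_escape(0.5, s, N)` gives the distance at which a lineage has an even chance of escaping the sweep under this approximation. Because each lineage is treated independently, the same `p` applies to every sampled chromosome at that distance.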
A further simplifying assumption,
, is also made, where
is the time until the MRCA for a sample of n chromosomes. Under the standard neutral model,
. Looking back in time, the history of the sample can therefore be divided into two phases (Figure 1B). During the first "selection phase" the only events that can occur are recombination events that move neutral loci from the background of the selected allele to that of the ancestral, wild-type allele. The end of the selection phase is marked by the origin of the selected mutation at which point all chromosomes carrying the selected allele coalesce immediately, and the selected allele is removed. Subsequently, in the "neutral phase," the history of the remaining lineages follows that of the standard neutral model. In the extreme, the selection phase can be considered instantaneous with respect to the timescale of the neutral coalescent process (i.e.,
) and therefore any mutations segregating must have occurred on the portion of the genealogy that predates the origin of the selected mutation. Under this assumption, if no lineages have recombined to the ancestral background at a given distance from the selected site, there will be no polymorphism in the sample.
By dividing the history of the sample into these two phases it can be seen that the effect of the selective sweep on patterns of LD is determined by how it influences the configuration of chromosomes found at the start of the neutral phase (just further back in time than the origin of the selected mutation). In particular, we need to calculate the transition probabilities that describe how each of the initial configurations, A, B, and C, is distributed at the start of the neutral phase. For example, consider configuration A where the selected site separates the two neutral loci (Figure 2). Depending on the distribution of recombination events that move a neutral locus from the selected to the ancestral background, this initial configuration can be transformed into any of 10 possible states at the end of the selection phase. The removal of the selected mutation subsequently transforms these 10 configurations, through coalescence of those still carrying the selected mutation, to any of configurations A, B, and C or to ones where one or both of the neutral loci coalesce (indicated by O in Figure 2). Details of the probabilities of each transition are given in APPENDIXES A and B.
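The two-phase picture can be made concrete for a single neutral locus. Under the star-like approximation each of the n sampled lineages escapes independently with probability p, and all lineages that do not escape coalesce into one at the origin of the selected mutation. The sketch below tabulates the resulting number of lineages at the start of the neutral phase; the function name and the dictionary return type are hypothetical.

```python
from math import comb
from typing import Dict


def neutral_phase_lineage_distribution(n: int, p: float) -> Dict[int, float]:
    """Distribution of the number of lineages entering the neutral phase.

    n: sample size; p: per-lineage probability of escaping the sweep.
    If k lineages escape, the remaining n - k coalesce instantly into one,
    so k + 1 lineages enter the neutral phase (or n if every lineage escapes).
    """
    if n < 1 or not 0.0 <= p <= 1.0:
        raise ValueError("require n >= 1 and 0 <= p <= 1")
    q = 1.0 - p
    dist = {m: 0.0 for m in range(1, n + 1)}
    for k in range(n + 1):
        prob = comb(n, k) * p ** k * q ** (n - k)  # binomial escape count
        m = k if k == n else k + 1
        dist[m] += prob
    return dist
```

The entry for a single lineage equals q^n, the probability that no lineage escapes, which is the case in which no polymorphism can be observed at that distance when the selection phase is treated as instantaneous.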
![]() | (9) |
is the probability that configuration A in the sampled chromosomes (all of which carry the selected mutation) results in configuration B at the start of the neutral phase. The subscript S on the left-hand side indicates that the expectation refers to the selected allele, while the subscript W on the right-hand side indicates that these expectations refer to the wild-type allele (i.e., the standard neutral expectations). Under the standard neutral model these quantities are known for different configurations of chromosomes. In particular,
![]() | (10) |
.
Finally, because the configurations can be thought of as relating to subsamples (with replacement) from a sample of n sequences, there is a possibility that sequences i, j, k, and l may not be distinct (the same sequence could be picked twice). A simple correction has to be made to the expectations,
![]() | (11) |
| NEUTRAL LOCI SEPARATED BY THE SELECTED SITE |
|---|
and
, respectively, such that the probabilities of a lineage escaping the selective sweep are
and
, respectively. By considering the probability of recombination in each interval it can be shown that
![]() | (12) |
![]() | (13) |
and
![]() | (14) |
) is zero or at least no greater than background levels caused by finite sample size. This result agrees with previous findings (KIM and NIELSEN 2004; STEPHAN et al. 2006) obtained by simulation and analysis of deterministic models of selection. It is worth noting that a deterministic model (in which drift during the selection phase is ignored) is equivalent to assuming that no coalescent events occur during this period, the same assumption as is made here.
However, it is also worth noting that while LD may be zero, there is actually nonzero correlation in coalescence time. For example, if
and
, it can be shown that
![]() | (15) |
| NEUTRAL LOCI ON THE SAME SIDE OF THE SELECTED LOCUS |
|---|
and the more distant (or distal), y, being at a recombination distance
from x. In this situation the different initial configurations have different probabilities of resulting in each configuration at the start of the neutral phase. For example, configuration A can escape the sweep through a single recombination, while configuration C requires a minimum of two recombination events to escape the sweep. By considering the effect of recombination events occurring in each part of each chromosome during the selection phase (see APPENDIX B) it follows that for configuration A
![]() | (16) |
![]() | (17) |
![]() | (18) |
![]() | (19) |
it follows that
![]() | (20) |
, such that
, it is also critical to account for the finite sample size, such that i, j, k, and l are not necessarily distinct. Under these conditions a good approximation for the expected LD is
![]() | (21) |
) between the alleles if there is no recombination between them (Figure 3). This result can be understood by noting that the most probable way in which polymorphism will be observed if
is if a single lineage escapes the selective sweep. Any neutral mutations must occur during the neutral phase, in which only two lineages will be present (the lineage leading to the MRCA of the selected mutation and the escaped lineage), leading to perfect association (in effect the mutations will occur on the same branch of the unrooted genealogy, as in Figure 1B). Another prediction of Equation 21 is that the magnitude of LD decreases rapidly as the recombination rate between the neutral loci increases. Indeed for moderate to large sample sizes it should decrease below that expected for an identical pair of neutral sites unaffected by a sweep (Figure 3). From a genealogical perspective, any recombination events occurring between the two neutral loci will rapidly lead to a breakdown in the correlation of the genealogies at the two positions. Informally, the effect can also be understood in terms of allele frequency. When
, polymorphism at the proximal locus is most likely to be in the form of a singleton (i.e., one chromosome differs from all the others). Recombination between the proximal and the distal loci will allow nonsingleton polymorphism at the distal locus and this is likely to show weak LD with the singleton allele at the proximal locus.
| INCORPORATING NEUTRAL MUTATIONS YOUNGER THAN THE SELECTED MUTATION |
|---|
it is relatively likely that polymorphism observed in a sample that has experienced a selective sweep may be more recent than the selected mutation. From the genealogical perspective, considering such recent mutations is equivalent to setting
. Because no coalescent events occur during the selection phase, the only influence of a nonzero value of
is to increase the expected coalescence time (it has no effect on the correlations in coalescence time or variance) and consequently decrease the coefficient of variation in coalescence time, thus reducing LD. When the neutral loci are on either side of the selected site LD is low anyway, so inclusion of recent mutations has little or no impact on LD. However, when the two neutral loci are on the same side of the selected mutation, recent mutations can have a considerable impact on LD, because neutral mutations older than the selected one will typically show strong LD if they are themselves tightly linked (as described above). To get an idea of the importance of including recent mutations, note that when
, typically at most one lineage will escape the sweep and the contribution of the neutral phase to the expected time in the genealogy of the sample is 
. Under these same conditions the total length of the genealogy within the selection phase is
. Consequently, the probability that an observed neutral mutation at the proximal locus is older than the selected mutation is 
. In humans the average recombination rate is
in European populations (MYERS et al. 2005), so that a polymorphism 5 kb from the selected site will have only a 50% probability of being older than the selected mutation.
Figure 4 shows that inclusion of recent mutations has a marked effect on
. When the recombination rate between the neutral loci is zero, mutations older than the selected one are predicted to show (and do show) monotonically decreasing LD as a function of increasing
. However, when recent mutations are considered, LD very close to the selected site is near zero when
is small. LD increases as
increases, exceeding the neutral expectation at intermediate values of
. Finally, as
approaches one, the expected LD decreases toward neutral expectation. The nonmonotonic relationship between the distance of the neutral loci from the selected site and the strength of LD is actually more marked in the simulations (see below) than in the theoretical predictions. Qualitatively similar patterns are predicted when the neutral loci are only partially linked (data not shown).
| STOCHASTIC SIMULATION |
|---|
during which the only events that can occur are recombination events that move lineages from the selected to the wild-type background, a point of instant coalescence between all lineages still carrying the selected allele, and a neutral phase. In series B, fully stochastic models of selective sweeps were simulated using the program SelSim (SPENCER and COOP 2004). Briefly, the method first simulates a stochastic trajectory for the selected mutation backward in time using a diffusion approximation (COOP and GRIFFITHS 2004) and then subsequently performs a structured coalescent simulation conditional on the trajectory. By performing the two series of simulations it is possible to examine both the accuracy of Equation 6 as an approximation to the expectation of
and the accuracy of the approximate model for selective sweeps. For efficiency, simulations were carried out by placing mutations uniformly on the simulated genealogies at loci x and y, and the ith simulation was assigned a weight given by the product of the total branch lengths at each site,
. Expected values of
are estimated from the weighted average over
10^5 simulations for each parameter combination.
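The weighting scheme described above can be sketched as follows. Each simulated replicate contributes its r^2 value with a weight equal to the product of the total branch lengths of the genealogies at the two loci. The function name and argument names are hypothetical, and the inputs are assumed to be equal-length sequences with one entry per replicate.

```python
from typing import Sequence


def weighted_mean_r2(r2_values: Sequence[float],
                     branch_len_x: Sequence[float],
                     branch_len_y: Sequence[float]) -> float:
    """Weighted average of r^2 across simulated replicates.

    The weight for replicate i is the product of the total branch lengths
    of the genealogies at locus x and locus y in that replicate.
    """
    if not (len(r2_values) == len(branch_len_x) == len(branch_len_y)):
        raise ValueError("inputs must have the same length")
    weights = [lx * ly for lx, ly in zip(branch_len_x, branch_len_y)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    return sum(w * r2 for w, r2 in zip(weights, r2_values)) / total
```

Replicates with longer genealogies at both loci therefore count for more, while a replicate with zero branch length at either locus contributes nothing.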
Where the selected site separates the two neutral loci the extent of association between the neutral loci in the series A simulations was, as predicted, no higher than background (data not shown). When the selected site does not separate the neutral loci the results are highly sensitive to assumptions about the duration of the selection phase (Figure 4, A and B; note that there is no recombination between the neutral loci). In Figure 4A it was assumed that the age of the selected mutation was negligible compared to the age of the neutral genealogy,
. In Figure 4B, the age of the selected mutation was fixed at
, the average obtained by fully stochastic simulation with S = 400 and a sample size of 20. There are two key features of these results. First, in both cases Equation 6 typically overestimates the expected value of
, although the expression is accurate when the probability of escape is low. The second key point is the difference the inclusion of recent neutral mutations makes. As predicted, mutations older than the selected one do typically show very strong LD. However, when the probability of escaping the selective sweep is very low, recent neutral mutations make the majority contribution to LD, such that the average value of
is very low.
Figure 5 shows the comparison between the analytical results and the average value of
calculated from the fully stochastic simulations. These give qualitatively the same results as those obtained under the approximate model of a selective sweep. When the selected mutation separates the neutral mutations there is no LD between them (Figure 5A), irrespective of the level of diversity observed. When the selected site does not separate the neutral loci the LD between linked neutral loci is zero when the proximal locus is very close to the selected site, increases beyond its neutral expectation as the probability of escape increases, and then decreases back to the neutral expectation. This feature is seen both when the neutral loci are completely linked (Figure 5B) and when they are only partially linked (Figure 5C). The most notable difference between the two series of simulations is that in series A the approximation was a considerable overestimate of the true LD, whereas in series B it is typically a slight underestimate. In the absence of a selective sweep Equation 6 is typically an overestimate of
, as it also is when the approximate model is used as the basis of stochastic simulation (Figure 4). The most likely explanation for the underestimate in Figure 5 is that the genealogy under the selected mutation is not star shaped, and hence there can be significant LD between neutral mutations that occur during the selection phase. Indeed, as the sample size increases, the approximation of a star-like genealogy in the selection phase becomes progressively worse (DURRETT and SCHWEINSBERG 2004).
| DISCUSSION |
|---|
Selective sweeps can eliminate LD:
If a selective sweep is sufficiently strong and recent, such that the genealogy of the sample at the selected site can be approximated as a star (i.e., all lineages coalesce at the same time), all LD between neutral loci separated by the selected site is eliminated. As previously noted (KIM and NIELSEN 2004), there is a simple genealogical explanation for this observation. In effect, the genealogical interpretation of LD implies that significant LD will occur when the coalescent time for a pair of chromosomes at one position on a chromosome is informative about the coalescent time for the same pair of chromosomes at another position (relative to the coalescent time of all other pairs of chromosomes). Within a star-like genealogy all pairs of chromosomes coalesce at the same time. Consequently, the coalescent time for a given pair at one point is uninformative about the coalescent time at any other point for the same pair (i.e., there is no variance in coalescence time within the star), and there is no LD. Moving away from the selected site, recombination events will allow linked neutral sites to revert to the neutral distribution of genealogies. However, such "recovery" from the star-like genealogy happens independently on the two sides of the selected site. Consequently, the coalescent time for a pair of chromosomes on one side of the selected site will always be uninformative about the coalescent time for the same pair of chromosomes on the other side.
What is the implication of this result for understanding patterns of variation? The most obvious issue is that selective sweeps, through abolishing LD, may create patterns that look like recombination hotspots. Indeed, it has been shown that one statistical test for hotspots does have an elevated false positive rate at selective sweeps (REED and TISHKOFF 2005). However, it should be noted that the patterns of genetic variation (and underlying genealogies) associated with a hotspot and those associated with a selective sweep are strikingly different. In humans, hotspots are typically short (1–2 kb) regions where there is a very rapid breakdown in LD, and there are many "detectable" recombination events and no distortion to the distribution of marginal genealogies (i.e., no distortion to the frequency distribution of neutral variation) (JEFFREYS et al. 2001). In contrast, a selective sweep of considerable strength will affect the density and frequency distribution of polymorphism over considerable distances. For example, a scaled selection coefficient of
(a selection coefficient of
1% in humans) will affect the frequency distribution of polymorphism up to a genetic distance of at least
on either side (this is the distance at which there is a 50% chance of a lineage escaping the sweep). In humans, the average recombination rate is 
in European populations (MYERS et al. 2005), such that a region some 140 kb in size should be strongly affected. In short, even if a sweep does influence LD in such a way as to resemble a hotspot, the sweep is also likely to lead to unusual patterns of variation that are indicative of a selective sweep.
One way to ask whether selective sweeps can create false hotspots is to ask whether, conditioning on seeing polymorphism at given genetic distances on either side of the selected mutation, the evidence for historical recombination is greater or less than under the neutral model. Table 1 shows how selective sweeps influence the probability of seeing all four possible haplotypes relative to the neutral case. Under the infinite-sites model such data sets are direct evidence for recombination (HUDSON and KAPLAN 1985). The patterns are quite striking: sweeps lead to a dramatic decrease in the probability of observing all four haplotypes relative to the neutral model. This is true whether all mutations are considered or just those >10% in frequency. In short, selective sweeps do not lead to any increase in the evidence for recombination. The reported bias in one method for detecting hotspots (REED and TISHKOFF 2005) therefore is likely to result from the fact that this method uses a nongenealogical model for patterns of variation. Analysis of data sets simulated with selective sweeps indicates that coalescent-based estimators of the recombination rate show no such local increase in estimated rate. Rather, the depression in the opportunity for recombination at such sites also leads to a slight decrease in average estimated rate (Figure 6).
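The criterion used here, observing all four possible haplotypes at a pair of biallelic sites, is straightforward to check in data. The following minimal sketch assumes haplotypes are coded as pairs of 0/1 alleles; the function name is hypothetical.

```python
from typing import Iterable, Tuple


def has_all_four_haplotypes(haplotypes: Iterable[Tuple[int, int]]) -> bool:
    """Return True if all four two-locus haplotypes (00, 01, 10, 11) are present.

    Under the infinite-sites model, this pattern cannot arise without at
    least one recombination event between the two sites.
    """
    observed = {(int(a), int(b)) for a, b in haplotypes}
    return {(0, 0), (0, 1), (1, 0), (1, 1)} <= observed
```

Applying the check to each simulated pair of sites, with and without a sweep, would give the kind of relative probability summarized in Table 1. Restricting to common variants is then a matter of filtering pairs by allele frequency before calling the function.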
What are the implications of these results for the interpretation of empirical patterns of genetic variation? Previous work has suggested that incorporating information on LD does not greatly improve the power of statistical approaches to identifying selective sweeps (KIM and NIELSEN 2004). This result is understandable given the complexity of the patterns described. One possibility is that incorporating information about the age of linked neutral polymorphism (for example, by comparison with related populations in which no sweep is thought to have occurred) may increase the power to detect selection. In particular, sweeps will lead to series of old SNPs at low frequency and in strong LD interleaved with series of young SNPs at low frequency and in very low LD. Of course, inferences about the age of a mutation within the population that has experienced selection will be confounded by the effect of the sweep.
One argument against using patterns of LD directly to make inferences about selective sweeps is that their effects on LD can all be understood in terms of the generation of a star-like genealogy at the selected site. Consequently, the most powerful methods for detecting selective sweeps will be those that are most powerful at detecting local star-like genealogies with short times to the MRCA (KIM and STEPHAN 2002; KIM and NIELSEN 2004; NIELSEN et al. 2005). For example, of existing methods to detect recent, complete selective sweeps, perhaps the most powerful is one that compares models with and without a local star-like genealogy at a putatively selected site using only the allele-frequency distribution (NIELSEN et al. 2005). However, what the results presented here show is that selective sweeps can induce unusual patterns of association between neutral mutations near selected sites, a feature that is currently not considered in this method. In effect, the results suggest that there may be additional information about selective sweeps in the way genetic variation recovers around a selected locus; however, it remains to be seen whether such recovery differs systematically from cases where star-like genealogies have occurred by chance or through population bottlenecks.
| ACKNOWLEDGEMENTS |
|---|
| LITERATURE CITED |
|---|
BARTON, N. H., 1998 The effect of hitch-hiking on neutral genealogies. Genet. Res. 72: 123–133.
BRAVERMAN, J. M., R. R. HUDSON, N. L. KAPLAN, C. H. LANGLEY and W. STEPHAN, 1995 The hitchhiking effect on the site frequency spectrum of DNA polymorphisms. Genetics 140: 783–796.
COOP, G., and R. C. GRIFFITHS, 2004 Ancestral inference on gene trees under selection. Theor. Popul. Biol. 66: 219–232.
DURRETT, R., and J. SCHWEINSBERG, 2004 Approximating selective sweeps. Theor. Popul. Biol. 66: 129–138.
ETHERIDGE, A. M., P. PFAFFELHUBER and A. WAKOLBINGER, 2006 An approximate sampling formula under genetic hitchhiking. Ann. Appl. Probab. 16: 685–729.
FAY, J. C., and C. I. WU, 2000 Hitchhiking under positive Darwinian selection. Genetics 155: 1405–1413.
FU, Y. X., and W. H. LI, 1993 Statistical tests of neutrality of mutations. Genetics 133: 693–709.
HILL, W. G., and A. ROBERTSON, 1968 Linkage disequilibrium in finite populations. Theor. Appl. Genet. 38: 226–231.
HUDSON, R. R., 1985 The sampling distribution of linkage disequilibrium under an infinite allele model without selection. Genetics 109: 611–631.
HUDSON, R. R., and N. L. KAPLAN, 1985 Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111: 147–164.
HUDSON, R. R., K. BAILEY, D. SKARECKY, J. KWIATOWSKI and F. J. AYALA, 1994 Evidence for positive selection in the superoxide dismutase (Sod) region of Drosophila melanogaster. Genetics 136: 1329–1340.
JEFFREYS, A. J., L. KAUPPI and R. NEUMANN, 2001 Intensely punctate meiotic recombination in the class II region of the major histocompatibility complex. Nat. Genet. 29: 217–222.
KAPLAN, N. L., R. R. HUDSON and C. H. LANGLEY, 1989 The "hitchhiking effect" revisited. Genetics 123: 887–899.
KIM, Y., and R. NIELSEN, 2004 Linkage disequilibrium as a signature of selective sweeps. Genetics 167: 1513–1524.
KIM, Y., and W. STEPHAN, 2002 Detecting a local signature of genetic hitchhiking along a recombining chromosome. Genetics 160: 765–777.
MAYNARD SMITH, J., and J. HAIGH, 1974 The hitch-hiking effect of a favourable gene. Genet. Res. 23: 23–35.
MCVEAN, G. A., 2002 A genealogical interpretation of linkage disequilibrium. Genetics 162: 987–991.
MCVEAN, G. A., S. R. MYERS, S. HUNT, P. DELOUKAS, D. R. BENTLEY et al., 2004 The fine-scale structure of recombination rate variation in the human genome. Science 304: 581–584.
MYERS, S., L. BOTTOLO, C. FREEMAN, G. MCVEAN and P. DONNELLY, 2005 A fine-scale map of recombination rates and hotspots across the human genome. Science 310: 321–324.
NIELSEN, R., S. WILLIAMSON, Y. KIM, M. J. HUBISZ, A. G. CLARK et al., 2005 Genomic scans for selective sweeps using SNP data. Genome Res. 15: 1566–1575.
OHTA, T., and M. KIMURA, 1971 Linkage disequilibrium between two segregating nucleotide sites under the steady flux of mutations in a finite population. Genetics 68: 571–580.
PLUZHNIKOV, A., and P. DONNELLY, 1996 Optimal sequencing strategies for surveying molecular genetic diversity. Genetics 144: 1247–1262.
PRZEWORSKI, M., 2002 The signature of positive selection at randomly chosen loci. Genetics 160: 1179–1189.
REED, F. A., and S. A. TISHKOFF, 2005 Positive selection can create false hotspots of recombination. Genetics 172: 2011–2014.
SABETI, P. C., D. E. REICH, J. M. HIGGINS, H. Z. LEVINE, D. J. RICHTER et al., 2002 Detecting recent positive selection in the human genome from haplotype structure. Nature 419: 832–837.
SONG, Y. S., and J. S. SONG, 2007 Analytic computation of the expectation of the linkage disequilibrium coefficient r^2. Theor. Popul. Biol. 71: 49–60.
SPENCER, C. C., and G. COOP, 2004 SelSim: a program to simulate population genetic data with natural selection and recombination. Bioinformatics 20: 3673–3675.
STEPHAN, W., T. WIEHE and M. W. LENZ, 1992 The effect of strongly selected substitutions on neutral polymorphism: analytical results based on diffusion theory. Theor. Popul. Biol. 41: 237–254.
STEPHAN, W., Y. S. SONG and C. H. LANGLEY, 2006 The hitchhiking effect on linkage disequilibrium between linked neutral loci. Genetics 172: 2647–2663.
STROBECK, C., and K. MORGAN, 1978 The effect of intragenic recombination on the number of alleles in a finite population. Genetics 88: 829–844.
WAKELEY, J., and S. LESSARD, 2003 Theory of the effects of population structure and sampling on patterns of linkage disequilibrium applied to genomic data from humans. Genetics 164: 1043–1053.