- Correction to Fig. 6, text
- Correction to program
- A corrigendum has been published
The Signature of Positive Selection at Randomly Chosen Loci
Molly Przeworski
Department of Statistics, University of Oxford, Oxford OX1 3TG, United Kingdom
Corresponding author: Molly Przeworski, Inselstrasse 22, D-04103 Leipzig, Germany. E-mail: przewors{at}eva.mpg.de
Communicating editor: D. CHARLESWORTH
| ABSTRACT |
|---|
In Drosophila and humans, there are accumulating examples of loci with a significant excess of high-frequency-derived alleles or high levels of linkage disequilibrium, relative to a neutral model of a random-mating population of constant size. These are features expected after a recent selective sweep. Their prevalence suggests that positive directional selection may be widespread in both species. However, as I show here, these features do not persist long after the sweep ends: The high-frequency alleles drift to fixation and no longer contribute to polymorphism, while linkage disequilibrium is broken down by recombination. As a result, loci chosen without independent evidence of recent selection are not expected to exhibit either of these features, even if they have been affected by numerous sweeps in their genealogical history. How then can we explain the patterns in the data? One possibility is population structure, with unequal sampling from different subpopulations. Alternatively, positive selection may not operate as is commonly modeled. In particular, the rate of fixation of advantageous mutations may have increased in the recent past.
CONSIDERABLE debate has focused on what proportion of genetic changes is favored by natural selection, as well as what types of substitutions are most likely to have been selected.
To infer that positive selection has acted on a particular genomic region, population geneticists usually sequence a number of individuals at a locus and test whether the pattern of polymorphism seen in the sample is unexpected under the standard neutral model of a random-mating population of constant size. Unfortunately, a departure from null model expectations can be due to one of many causes, so it is hard to establish that adaptation is responsible. In particular, an excess of rare variants may reflect a selected substitution at a closely linked site, but it may also be caused by population expansion or purifying selection, to name just two alternatives. For this reason, an ideal "test of neutrality" would not only have high power to detect positive selection, but would also focus on an aspect of the data unlikely to be affected by demography or other factors. Such a test statistic (H) was recently proposed by FAY and WU (2000).
Since its introduction, significant H values have been reported for samples from Acp26Aa and from a number of other loci in Drosophila and humans.
In addition, patterns of linkage disequilibrium (LD) depart from the expectations of the standard neutral model in these species. There appears to be a genome-wide excess of intralocus linkage disequilibrium in D. melanogaster and non-African populations of D. simulans (ANDOLFATTO and PRZEWORSKI 2000). The prevalence of these features suggests that positive directional selection may be widespread in both species.
If so, patterns of polymorphism in many regions will have been shaped by repeated episodes of positive selection. However, as I show here, the H test has very low power to detect the effects of positive selection on a randomly chosen locus. Similarly, the effect of selection on LD is short-lived, so even neutral loci affected by multiple adaptive substitutions at linked sites are unlikely to show unusually high levels of allelic association.
| METHODS |
|---|
Frequency spectrum-based "tests of neutrality":
The H statistic presented in FAY and WU (2000) is the difference between two estimates of $\theta = 4N\mu$, where N is the diploid effective population size of the species and µ the mutation rate per generation. The two estimates are the average number of pairwise differences in the sample, $\pi$ (TAJIMA 1983), and

$$\theta_H = \sum_{i=1}^{S} \frac{2\,p_i^2\,n}{n-1},$$

where n is the sample size, S the number of segregating sites, and $p_i$ the frequency of the derived (i.e., nonancestral) allele at segregating site i (FAY and WU 2000), so that $H = \pi - \theta_H$.
This statistic is similar to one introduced by TAJIMA (1989a), D, which is based on the difference between $\pi$ and $\theta_w$, an estimate of $\theta$ based on the number of segregating sites in the sample. In contrast to H, D does not use information about ancestral and derived states. Negative D values reflect a relative excess of rare alleles in a folded frequency spectrum. Here, both H and D are used as one-tailed tests of neutrality.
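As an illustration of how these summaries relate to one another, the sketch below computes π, θ_w, θ_H, H, and D from the derived-allele counts at each segregating site. It assumes the standard textbook forms (π = Σ 2p_i(1 - p_i)n/(n - 1), θ_H = Σ 2p_i²n/(n - 1), θ_w = S/Σ(1/k), and Tajima's usual variance constants); the function name `neutrality_stats` and its input layout are hypothetical and are not taken from the program used in this article.

```python
from math import sqrt


def neutrality_stats(derived_counts, n):
    """Return (pi, theta_w, theta_h, H, D) for a sample of n chromosomes.

    derived_counts[i] is the number of chromosomes (between 1 and n - 1)
    that carry the derived allele at segregating site i.
    """
    S = len(derived_counts)
    if S == 0:
        raise ValueError("no segregating sites: the tests are undefined")
    scale = n / (n - 1)
    freqs = [k / n for k in derived_counts]

    # Two frequency-based estimates of theta.
    pi = sum(2 * p * (1 - p) * scale for p in freqs)
    theta_h = sum(2 * p * p * scale for p in freqs)

    # Estimate based on the number of segregating sites.
    a1 = sum(1 / k for k in range(1, n))
    a2 = sum(1 / k**2 for k in range(1, n))
    theta_w = S / a1

    # Normalizing constants for Tajima's D (standard form, assumed here).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    D = (pi - theta_w) / sqrt(e1 * S + e2 * S * (S - 1))

    return pi, theta_w, theta_h, pi - theta_h, D
```

Because 2p(1 - p) + 2p² = 2p, the two frequency-based estimates always sum to 2n/(n - 1) times the total derived-allele frequency, so high-frequency derived variants push H toward negative values while leaving the folded spectrum, and hence D, unchanged.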
Simulations of positive selection:
I estimate the power of H to detect a model of recurrent "selective sweeps".
In the model, a neutral locus is affected by selective sweeps that occur at some random genetic distance c, where c is uniform on (0, M) and M is the maximum distance at which a single sweep has an effect on diversity levels. (Genetic distance here means the population recombination rate between the neutral and selected locus.) M is on the order of 4Ns
(s is the selective coefficient of the favored allele). In simulations of a single selective sweep, the value of c is specified, as is the time since the fixation of the beneficial allele. In the model of repeated sweeps, the rate of sweeps is constant and chosen so that there is a small probability that two or more would occur simultaneously [using 1 - Equation 6 in ![]()
, where
(r is the crossover rate per generation). There is no gene conversion, and I assume a constant rate of crossing over per base pair. The neutral locus evolves according to the infinite-sites model.
This selective sweep model is implemented as a succession of neutral and selective phases (when there are two alleles at a selected site). The algorithm for the neutral phase is the standard coalescent with recombination.
During the sweep, time changes in small increments, Δt. Within each increment Δt, the probabilities of the events of interest are given by

where x(t) is the frequency of the favored allele at time t, i is the number of lineages carrying the favored allele, and j is the number of lineages carrying the unfavored allele (![]()

and

The change in frequency of the favored allele is modeled deterministically, from frequency ε to 1 - ε, using Equation 3a in STEPHAN et al. (1992).
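The referenced equation is not reproduced in this text. As a rough sketch, assuming the usual deterministic (logistic) trajectory for an allele with selective advantage s, the frequency path and the duration of the selective phase can be written as follows. The function names, and the idea of starting at a small frequency `eps`, are illustrative assumptions rather than the article's implementation.

```python
from math import exp, log


def favored_allele_frequency(t, s, eps):
    """Logistic frequency of the favored allele t generations after it was eps.

    Assumes dx/dt = s * x * (1 - x), a standard deterministic approximation.
    """
    return eps / (eps + (1 - eps) * exp(-s * t))


def sweep_duration(s, eps):
    """Generations needed to rise from frequency eps to 1 - eps."""
    return 2 * log((1 - eps) / eps) / s
```

With `eps = 1/(2N)`, `sweep_duration` is close to 2 ln(2N)/s, the same order as the sojourn time quoted in the Discussion.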
Call the sum of the probabilities of all possible events within a time interval $S_t$; $(1 - S_t)$ is approximately the probability that no event occurs, when the probabilities of all events are small. To calculate the time to the next event, I solve
$\prod_{t \le y} (1 - S_t) < U$ for y, where U is a uniform random variable on (0, 1) and the product is taken over successive time intervals. Which event occurs at time y is chosen randomly, each event with probability proportional to its probability within that interval.
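The waiting-time step can be sketched as below: draw U once, multiply the no-event probabilities (1 - S_t) over successive increments until the product falls below U, then pick the event type in proportion to its probability in that increment. The function `next_event` and the dictionary-per-increment input are hypothetical conveniences for illustration; the actual per-event probabilities depend on the equations of the model and are supplied by the caller here.

```python
import random


def next_event(step_probs, rng=random):
    """Find the increment and type of the next event during the sweep.

    step_probs yields, for each successive time increment, a dict mapping
    an event name to its (small) probability within that increment.
    Returns (step_index, event_name), or (None, None) if the increments
    run out before any event occurs.
    """
    u = rng.random()
    no_event = 1.0  # running product of (1 - S_t)
    for step, probs in enumerate(step_probs):
        s_t = sum(probs.values())
        no_event *= 1.0 - s_t
        if no_event < u:
            names = list(probs)
            weights = [probs[name] for name in names]
            # Choose the event in proportion to its probability.
            return step, rng.choices(names, weights=weights)[0]
    return None, None
```

Passing a generator for `step_probs` lets the probabilities change from one increment to the next as the frequency of the favored allele changes.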
If the event is of type 3, then with probability ρ/(ρ + c) the crossover event occurs within the neutral locus and with probability c/(ρ + c) between the selected and neutral locus, where ρ denotes the population recombination rate for the neutral locus. When a crossing-over event occurs within the neutral locus, a breakpoint b is chosen uniformly on [0, L], where L is the length of the neutral locus. Assume, as an illustration, that the selected locus is to the left of the neutral locus and that the lineage carries the favored allele. Segments in the neutral locus to the right of b would then "migrate" to the subpopulation of the unfavored background. The number of lineages in both subpopulations has to be updated accordingly for those segments. Other cases are treated analogously.
The computer code for these simulations is written in C and based on coalescent programs kindly provided by R. Hudson (available at http://home.uchicago.edu/~rhudson1/). The program was error checked by comparing the output to the results in Fig 3 of ![]()
).
Power tests:
The H and D tests are implemented as in ![]()
![]()
value (with or without recombination). If the value of H for a data set is more extreme than the significance level established for that number of segregating sites under the null model, the null model is rejected.
This procedure is meant to mimic what researchers would do in practice, when they come across a region with low diversity. Since the population mutation rate is unknown, one might ask to what extent the locus is consistent with the neutral model and a low mutation rate by testing if H is more extreme than expected for the observed number of segregating sites. If no segregating sites were found, no test would be performed. When estimating power, I exclude all runs in which there are no segregating sites. [For sake of comparison, note that ![]()
values (results not shown). The same is true for D, as well as other tests of neutrality (![]()
The H test relies on identification of the ancestral allele. In practice, this is done with one or more outgroups, and the inference may be incorrect if there are mutations at the same site on the outgroup lineage(s). How likely this is depends on the mutation rate and on the extent of mutation rate variability across sites.
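The power calculation described in this section can be sketched as follows, assuming that each simulated data set has already been reduced to a pair (number of segregating sites, test statistic). Critical values are taken from null-model runs with the same number of segregating sites, and runs without segregating sites are skipped. The function names, the lower-tail convention, and the choice to skip values of S never seen under the null model are assumptions made for this illustration.

```python
from collections import defaultdict


def critical_value(null_values, alpha=0.05):
    """Empirical lower-tail cutoff: at most a fraction alpha of null values lie below it."""
    ordered = sorted(null_values)
    return ordered[int(alpha * len(ordered))]


def power_given_s(null_runs, alt_runs, alpha=0.05):
    """Fraction of alternative-model runs rejected by a one-tailed test.

    null_runs and alt_runs are iterables of (S, statistic) pairs, where S is
    the number of segregating sites in the simulated sample.
    """
    null_by_s = defaultdict(list)
    for s_sites, value in null_runs:
        null_by_s[s_sites].append(value)

    tested = rejected = 0
    for s_sites, value in alt_runs:
        # No test is performed without segregating sites (or, as an
        # assumption of this sketch, without a matching null distribution).
        if s_sites == 0 or s_sites not in null_by_s:
            continue
        tested += 1
        if value < critical_value(null_by_s[s_sites], alpha):
            rejected += 1
    return rejected / tested if tested else float("nan")
```

Conditioning on S in this way mirrors the situation in which the population mutation rate is unknown and only the observed number of segregating sites is available.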
Linkage disequilibrium:
There are many possible summaries of LD and none is an obvious choice. Here, I consider two measures of linkage disequilibrium. The first is r2, calculated for pairs of sites with a minor allele frequency ≥ 0.1. A relative excess of LD is sometimes characterized as a deficiency in the number of distinct haplotypes for the observed number of segregating sites; the second measure is therefore the number of haplotypes normalized by the number of segregating sites, nHaps/(S + 1).
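Both summaries are straightforward to compute from phased haplotypes. The sketch below assumes haplotypes coded as equal-length sequences of 0 (ancestral) and 1 (derived); the 0.1 minor-allele-frequency cutoff for r2 follows the Results, while the function names and the input layout are hypothetical.

```python
from itertools import combinations


def r_squared(site_a, site_b):
    """Squared allelic correlation between two biallelic sites coded 0/1."""
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(a * b for a, b in zip(site_a, site_b)) / n
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def ld_summaries(haplotypes, maf_cutoff=0.1):
    """Return (list of pairwise r2 values, nHaps / (S + 1))."""
    n = len(haplotypes)
    sites = list(zip(*haplotypes))  # one tuple of alleles per site
    segregating = [s for s in sites if 0 < sum(s) < n]
    # Only sites above the minor-allele-frequency cutoff enter r2.
    common = [s for s in segregating
              if min(sum(s), n - sum(s)) / n >= maf_cutoff]
    r2_values = [r_squared(a, b) for a, b in combinations(common, 2)]
    n_haps = len(set(map(tuple, haplotypes)))
    return r2_values, n_haps / (len(segregating) + 1)
```

Note that the haplotype summary uses all segregating sites, so it is far more sensitive to rare alleles than r2 restricted to common variants.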
| RESULTS |
|---|
Selective sweeps with recombination:
Most of the theoretical attention paid to models of positive selection has focused on the "selective sweep" or "hitchhiking" model (MAYNARD SMITH and HAIGH 1974).
With recombination, selective sweeps can no longer be treated as population size reductions.
H has low power to detect old sweeps:
On the basis of these insights, ![]()
can be high. Thus, if we consider a "candidate locus" where there is independent evidence for the action of recent positive selection (e.g., ![]()
Instead, sweeps might be thought of as occurring at random locations and times. In this case, the power of H is much reduced. First, the power of H, P(H), decreases rapidly with the time since the fixation of the favored allele, as the high-frequency variants fix in the population and no longer contribute to polymorphism. For the parameters of Fig 2, the power is roughly equal to the nominal rejection probability after 5 × 10^5 generations, or one-eighth of the mean time to coalescence under neutrality, 4N (t = 0.125 in Fig 2). For D. melanogaster, assuming 10 generations a year (and if N = 10^6), this corresponds to 5 × 10^4 years. For some time after the sweep, the power is actually <0.05.
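The time conversion used here is simple arithmetic, shown below as a small helper. The value N = 10^6 is the one implied by the quoted figures (5 × 10^5 generations being one-eighth of 4N); the function name is hypothetical.

```python
def scaled_time_to_years(t, n_diploid, generations_per_year):
    """Convert time in units of 4N generations to (generations, years)."""
    generations = t * 4 * n_diploid
    return generations, generations / generations_per_year


# One-eighth of 4N generations for N = 10**6 and 10 generations per year.
generations, years = scaled_time_to_years(0.125, 10**6, 10)
```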
The D test retains substantial power for a much longer period of time since the sweep than does H. These results suggest that D might be a better test for detecting selective sweeps. When selection is recent, however, the use of D and H is not redundant. For example, if the parameters are as in Fig 2 and
, the proportion of runs where H is significant but D is not is 19% (for D but not H, it is 13%).
The effect of other parameters on P(H):
With a larger value of the population mutation rate, θ, there is a higher probability of having a mutation on the dotted branch in Fig 1 and therefore more power to detect the effects of a sweep. For example, immediately after a sweep,
is 79% (with
, with other parameters as in Fig 2) while
is 69%. The power of H also increases with larger sample size (results not shown).
Of fundamental importance in determining P(H) is the number of lineages that recombine onto the unfavored background during the sweep. As can be seen in Fig 1, for the ancestral genealogy to have long internal branches requires at least one recombination event between selected classes. How likely this is depends on the strength of selection and on the recombination rate between the selected and neutral loci (c). If c is too small, there will be no recombination events, and all lineages will coalesce during the sweep. If c is very large, there will be many recombination events, and the neutral locus will not reflect the effects of selection. Thus, if the neutral locus is very close to the sweep, or too far away, P(H) is substantially reduced.
The power of H depends on s and c, not just on their ratio. Keeping c/s constant does not produce the same number of recombinants for different sets of (c, s) values, because the total length of the tree (and hence the probability of a recombination event) does not depend linearly on s. In fact, for the same c/s value, stronger selection (and therefore larger c values) will result in higher P(H). As an illustration, if
, as might be the case for humans (![]()
, and
, then immediately after a sweep, P(H) is only 10% while P(D) is 58%. For the same c/s value, if
, P(H) is 51% and P(D) is 62% (Fig 2).
The power of H in practice:
Researchers have assessed the significance of the H test with critical values established under the assumptions of a constant population size and no recombination. In reality, however, there is recombination within the neutral locus. In the presence of recombination, the use of critical values for the case of no recombination is conservative; i.e., the null model is rejected <5% of the time at the 5% level. This can be seen by comparing the P(H|no sweep) in Table 1 for different values of ρ, the population recombination rate for the neutral locus. Even though the H test is conservative in the presence of intralocus recombination, some recombination increases the power to detect a sweep at a linked site. (Obviously this is true only up to a point: If there is a very high level of recombination, the neutral locus will no longer reflect selection at linked sites.) As can be seen in Table 1, the increase in power is slight, and P(H) still decreases extremely quickly with t.
In humans, the violation of a second assumption will lead one to overestimate the power of H to detect a sweep. The human population size has increased dramatically in the recent past. The effect of population growth is to increase the rate of coalescences going backward in time. For the same average diversity levels, the tree in Fig 1 would therefore have shorter internal branches than it does under a constant-size model. This will reduce the number of high-frequency-derived alleles found at neutral sites linked to a selective sweep. Thus, the finding of numerous loci with extreme H values is even more surprising when this aspect of human demography is taken into account.
The power to detect sweeps at a randomly chosen locus:
Results for the recurrent selective sweep model are shown in Fig 3. There is essentially no power to detect the effects of selection using H and the power does not increase with the strength of selection or the frequency of selective sweeps. This is to be expected: The power of H is high for very recent sweeps at a suitable distance from the neutral site. Simulations suggest that, if
, and the sample size is 50, the maximum distance at which sweeps have an effect on diversity levels is c/s
0.25 (results not shown). For these parameters, P(H) > 20% for a distance between 0.00035 < c/s < 0.02 (Fig 3 in ![]()
The effect of a single sweep on LD:
As shown above, a significant H value is a short-lived signature of a selective sweep. This is also true of another feature of the data, levels of linkage disequilibrium. In both Drosophila and humans, numerous loci appear to exhibit unexpectedly high levels of LD. In Drosophila, this is usually quantified as a paucity of haplotypes or as a low value of Chud (HUDSON 1987), an estimator of the population recombination rate. In humans, Chud has also been shown to be lower than expected for European samples.
As is illustrated in Fig 4 and Fig 5, a recent sweep can substantially increase levels of LD. In Fig 4, I plot the expected decay of a summary of pairwise LD, r2, for alleles with a minor allele frequency ≥ 0.1. Parameters are chosen to be plausible for D. melanogaster. If the beneficial allele fixed at time t = 0, there is a much slower rate of decay with distance than under the standard neutral model. Note, however, that fewer alleles satisfy the frequency cutoff after a sweep, so long sequences may be required for this pattern to be apparent in actual data. Fig 5 presents scatterplots of r2 vs. distance for parameters germane to humans; as can be seen, a selected substitution at a linked site increases the number of distant pairs in significant LD.
The effect of a sweep on levels of LD dissipates quickly, depending on the summary of LD used and particularly on the sensitivity of the measure to changes in allele frequencies. Consider first the effect of a single sweep on the mean number of haplotypes normalized by the number of segregating sites, E(nHaps/(S + 1)). As can be seen in Table 2, a neutral locus affected by a very recent sweep can exhibit a paucity of haplotypes relative to a standard neutral model (depending on the values of s and c). This suggests an increase in LD. However, the summary E(nHaps/(S + 1)) becomes greater than expected under neutrality shortly after the sweep (see Table 2). This is easily understood: As the high-frequency variants fix and new mutations arise, most alleles are now rare and many form new haplotypes.
When only intermediate-frequency variants are considered, the effect of selective sweeps on allelic associations is clearer. In the last two rows of Table 2, I report E(nHaps/(S + 1)) excluding singletons. This statistic loosely corresponds to what is sometimes referred to as "haplotype structure" in the literature. Pairwise linkage disequilibrium behaves similarly to the number of haplotypes: For example, in Fig 4, a sweep that ended at t = 0.2 has an undetectable effect on r2. For these parameters, there is still a relative excess of LD by t = 0.1; however, this would be hard to discern in any one data set, because r2 varies greatly from one locus to another under neutrality.
One implication of these results is that selection would have to be strong and recent for selective sweeps to account for the unexpectedly large distances over which LD sometimes extends in humans. This said, recent evidence suggests that most crossing-over events in humans may occur within narrow recombination hotspots, with most of the genome experiencing very low rates of crossing over (e.g., JEFFREYS et al. 2001).
The effect of repeated sweeps on LD:
Because the increase in LD is short-lived, anonymous loci subject to repeated selective sweeps do not show a marked excess of LD. In fact, summaries of LD that are highly sensitive to the frequency spectrum, such as Chud or E(nHaps/(S + 1)), suggest less LD under this model of recurrent sweeps than under neutrality. Chud, in particular, is smaller when the sample variance in the number of pairwise differences is larger. Selective sweeps skew the frequency spectrum toward rare alleles, leading to a smaller variance in pairwise differences and larger values of Chud (results not shown). Thus, repeated sweeps cannot account for the low values of Chud found at most loci in both species of Drosophila (ANDOLFATTO and PRZEWORSKI 2000).
Repeated sweeps do produce a relative excess of LD when attention is restricted to intermediate-frequency variants. For example, in 10^4 simulations, E(nHaps/(S + 1)) excluding singletons is 1.24 in the absence of sweeps, 1.05 for
, and 0.90 for
(
is the rate of sweep per base pair per 4N generations). Fig 6 plots the expected decay of r2 with distance for these two rates of sweeps, with the other parameter values chosen to be plausible for D. melanogaster. The increase relative to a neutral model is slight. Note further that the rate
is probably unrealistically high. For
, and assuming a fixation probability of 2s (cf. ![]()
![]()
| DISCUSSION |
|---|
The possible effect of population structure:
If old or recurrent sweeps lead neither to high levels of LD nor to significant H tests, how do we interpret these features of the data? One possibility is that they were produced by a demographic departure from model assumptions. To examine this, I estimated the power of H (implemented as described for the sweep models) to detect a symmetric island model. The population mutation rate θ for the whole population is 5, so for k demes, it is θ/k per deme. First, I consider a two-island model, each of size N/2, with 0.5 to 2 migrants per deme per generation; under this particular model, this migration rate corresponds to an FST value of approximately 0.11 to 0.33. For the lowest migration rate considered (corresponding to 0.5 migrant per deme per generation in a two-island model), P(H) can be as high as 19%. If there are more than two islands, then, for approximately the same FST value, the power is similar (results not shown). In general, the power of H to detect population structure increases with higher θ or lower migration rates (results not shown). In summary, the null model can be rejected by the H test at substantially higher than the nominal rejection probability when samples are drawn unequally from different islands in an island model. In addition, population structure can produce high levels of LD (LI and NEI 1974).
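The correspondence between migration rate and FST quoted above is consistent with the classical island-model approximation FST ≈ 1/(1 + 4Nm), where Nm is the number of migrants per deme per generation. Whether the article used exactly this expression is an assumption; the sketch below simply checks that it reproduces the quoted range.

```python
def island_fst(migrants_per_generation):
    """Classical island-model approximation: FST = 1 / (1 + 4 * N * m)."""
    return 1 / (1 + 4 * migrants_per_generation)


fst_low_migration = island_fst(0.5)   # about 0.33
fst_high_migration = island_fst(2.0)  # about 0.11
```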
This particular model is likely to be unrealistic for both Drosophila and humans. However, the purpose of these simulations is simply to illustrate that a demographic model that produces trees such as Fig 1 more often than the standard neutral model will have the same effect on H as a selective sweep. In fact, recent bottlenecks (results not shown) and a metapopulation model (WAKELEY and ALIACAR 2001) can also produce such trees.
Does selection operate as modeled?
An alternative to demographic explanations is that positive selection does not operate as is commonly modeled. One assumption made by this model of recurrent positive selection is that a neutral locus is affected by at most one selected substitution at a time. The validity of this assumption depends crucially on the rate at which advantageous mutations arise and sweep to fixation.
When two or more alleles are simultaneously favored, interference between them might alter the patterns of polymorphism relative to the predictions of a single-site model of positive selection.
More problematic is the assumption that the rate of selective sweeps is constant. If, instead, there has been an increase in the rate of genetic adaptations toward the present, many loci may reflect recent sweeps. In the case of cosmopolitan species of Drosophila, this time frame could reflect recent colonization of temperate habitats. Similarly, anatomically modern humans are thought to have left Africa and spread across the globe starting approximately 50 thousand years ago, and there have been major changes in population density over the past 10 thousand years.
Note further that the sojourn time of a selected allele in a random-mating population of constant size is approximately 2 ln(2N)/s (assuming that the allele was selected when first introduced), where N is the diploid effective population size and s the selection coefficient of the favored allele. For plausible parameter values, this is approximately 2 × 10^3 generations for humans and 2.9 × 10^3 generations for Drosophila (respectively, 4 × 10^4 years assuming 20 years per generation and 300 years assuming 10 generations a year). The demographic assumptions behind this calculation are likely to be invalid for the recent past of many cosmopolitan species. However, they suggest that if there has been an increase in the rate of sweeps in the recent past, a subset of loci may reflect incomplete sweeps: ones that are still ongoing or where the selected variant is no longer favored.
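As a worked check of these figures, the sketch below evaluates 2 ln(2N)/s. The parameter values (s = 0.01 with N = 10^4 for humans and N = 10^6 for Drosophila) are assumptions chosen because they reproduce the quoted numbers; they are not stated in this portion of the text.

```python
from math import log


def sojourn_generations(n_diploid, s):
    """Approximate sojourn time of a favored allele: 2 * ln(2N) / s generations."""
    return 2 * log(2 * n_diploid) / s


# Assumed illustrative parameters: s = 0.01 in both species.
human_generations = sojourn_generations(10**4, 0.01)        # roughly 2 x 10^3
drosophila_generations = sojourn_generations(10**6, 0.01)   # roughly 2.9 x 10^3

human_years = human_generations * 20            # 20 years per generation
drosophila_years = drosophila_generations / 10  # 10 generations per year
```

Because the sojourn time grows only logarithmically with N, a hundredfold difference in population size changes it by less than a factor of 1.5.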
An additional assumption of this sweep model that is likely to be untrue in both D. melanogaster and humans is that of random mating. Indeed, there is evidence for population structure in both D. melanogaster and humans.
In summary, the H test is a useful tool to confirm with polymorphism data that a candidate locus has undergone a recent sweep.
| ACKNOWLEDGMENTS |
|---|
I thank P. Andolfatto, A. Di Rienzo, P. Donnelly, J. Fay, I. Gordo, R. Griffiths, J. Pritchard, and J. Wall for helpful discussions and P. Andolfatto, Y. Gilad, R. Hudson, G. McVean, and J. Wall as well as D. Charlesworth and two anonymous reviewers for comments on the manuscript. M.P. is supported by a National Science Foundation Bioinformatics postdoctoral fellowship.
Manuscript received June 4, 2001; Accepted for publication November 26, 2001.
| LITERATURE CITED |
|---|
ANDOLFATTO, P., 2001 Adaptive hitchhiking effects on genome variability. Curr. Opin. Genet. Dev. 11:635-641.
ANDOLFATTO, P. and M. PRZEWORSKI, 2000 A genome-wide departure from the standard neutral model in natural populations of Drosophila. Genetics 155:257-268.
ANDOLFATTO, P. and M. PRZEWORSKI, 2001 Regions of lower recombination harbor more rare variants in African populations of Drosophila melanogaster. Genetics 158:657-665.
AQUADRO, C. F., D. J. BEGUN and E. C. KINDAHL, 1994 Selection, recombination and DNA polymorphism in Drosophila, pp. 46-56 in Non-Neutral Evolution, edited by B. GOLDING. Chapman & Hall, New York.
BARTON, N. H., 1998 The effect of hitch-hiking on neutral genealogies. Genet. Res. 72:123-133.
BEGUN, D. J. and C. F. AQUADRO, 1993 African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature 365:548-550.
BRAVERMAN, J. M., R. R. HUDSON, N. L. KAPLAN, C. H. LANGLEY, and W. STEPHAN, 1995 The hitchhiking effect on the site frequency spectrum of DNA polymorphisms. Genetics 140:783-796.
CAVALLI-SFORZA, L. L., P. MENOZZI and A. PIAZZA, 1994 The History and Geography of Human Genes. Princeton University Press, Princeton, NJ.
CROW, J. F., and M. KIMURA, 1970 An Introduction to Population Genetics Theory. Alpha Editions, Edina, MN.
FAY, J. C. and C.-I WU, 2000 Hitchhiking under positive Darwinian selection. Genetics 155:1405-1413.
FAY, J. C. and C.-I WU, 2001 The neutral theory in the genomic era. Curr. Opin. Genet. Dev. 11:642-646.
FRISSE, L., R. R. HUDSON, A. BARTOSZEWICZ, J. D. WALL, and J. DONFACK et al., 2001 Gene conversion and different population histories may explain the contrast between polymorphism and linkage disequilibrium levels. Am. J. Hum. Genet. 69:831-843.
FU, Y.-X., 1995 Statistical properties of segregating sites. Theor. Popul. Biol. 48:172-197.
GILAD, Y., D. SEGRE, K. SKORECKI, M. NACHMAN, and D. LANCET et al., 2000 Dichotomy of single-nucleotide polymorphism haplotypes in olfactory receptor genes and pseudogenes. Nat. Genet. 26:221-224.
GILAD, Y., S. ROSENBERG, M. PRZEWORSKI, D. LANCET, and K. SKORECKI, 2001 Evidence for positive selection and population structure at the human MAO-A gene. Proc. Natl. Acad. Sci. USA 99:862-867.
HALE, L. R. and R. S. SINGH, 1991 A comprehensive study of genic variation in natural populations of Drosophila melanogaster. IV. Mitochondrial DNA variation and the role of history vs. selection in the genetic structure of geographic populations. Genetics 129:103-117.
HAMBLIN, M. T., E. E. THOMPSON, and A. DI RIENZO, 2002 Complex signatures of natural selection at the Duffy blood group locus. Am. J. Hum. Genet. 70:369-383.
HUDSON, R. R., 1987 Estimating the recombination parameter of a finite population model without selection. Genet. Res. 50:245-250.
HUDSON, R. R., 1993 The how and why of generating gene genealogies, pp. 23-36 in Mechanisms of Molecular Evolution, edited by N. TAKAHATA and A. G. CLARK. Japan Scientific Society, Tokyo.
HUDSON, R. R., M. SLATKIN, and W. P. MADDISON, 1992 Estimation of levels of gene flow from DNA sequence data. Genetics 132:583-589.
JEFFREYS, A. J., L. KAUPPI, and R. NEUMANN, 2001 Intensely punctate meiotic recombination in the class II region of the major histocompatibility complex. Nat. Genet. 29:217-222.
JONES, S., R. MARTIN and D. PILBEAM (Editors), 1994 The Cambridge Encyclopedia of Human Evolution. Cambridge University Press, Cambridge.
KAPLAN, N. L., R. R. HUDSON, and C. H. LANGLEY, 1989 The "hitchhiking effect" revisited. Genetics 123:887-899.
KIM, Y. and W. STEPHAN, 2000 Joint effects of genetic hitchhiking and background selection on neutral variation. Genetics 155:1415-1427.
KIM, Y. and W. STEPHAN, 2001 Detecting a local signature of genetic hitchhiking along a recombining chromosome. Genetics 160:765-777.
KIRBY, D. A. and W. STEPHAN, 1996 Multi-locus selection and the structure of variation in a segment of the white gene of Drosophila melanogaster. Genetics 144:635-645.
LAZZARO, B. P. and A. G. CLARK, 2001 Evidence for recurrent paralogous gene conversion and exceptional allelic divergence in the Attacin genes of Drosophila melanogaster. Genetics 159:659-671.
LI, W. H. and M. NEI, 1974 Stable linkage disequilibrium without epistasis in subdivided populations. Theor. Popul. Biol. 6:173-183.
LI, W. H. and L. SADLER, 1991 Low nucleotide diversity in man. Genetics 129:513-523.
MARTINEZ-ARIAS, R., F. CALAFELL, E. MATEU, D. COMAS, and A. ANDRES et al., 2001 Sequence variability of a human pseudogene. Genome Res. 11:1071-1085.
MAYNARD SMITH, J. and J. HAIGH, 1974 The hitch-hiking effect of a favourable gene. Genet. Res. 23:23-35.
MCVEAN, G. A. and J. VIEIRA, 2001 Inferring parameters of mutation, selection and demography from patterns of synonymous site evolution in Drosophila. Genetics 157:245-257.
NACHMAN, M. W., 2001 Single nucleotide polymorphisms and recombination rate in humans. Trends Genet. 17:481-485.
NACHMAN, M. W. and S. L. CROWELL, 2000 Contrasting evolutionary histories of two introns of the Duchenne muscular dystrophy gene, Dmd, in humans. Genetics 155:1855-1864.
OTTO, S. P., 2000 Detecting the form of selection from DNA sequence data. Trends Genet. 16:526-529.
PARSCH, J., C. D. MEIKLEJOHN, and D. HARTL, 2001 Patterns of DNA sequence variation suggest the recent action of positive selection in the janus-ocnus region of Drosophila simulans. Genetics 159:647-657.
PRITCHARD, J. K. and M. PRZEWORSKI, 2001 Linkage disequilibrium in humans: models and data. Am. J. Hum. Genet. 69:1-14.
RIEDER, M. J., S. L. TAYLOR, A. G. CLARK, and D. A. NICKERSON, 1999 Sequence variation in the human angiotensin converting enzyme. Nat. Genet. 22:59-62.
STEPHAN, W., T. H. E. WIEHE, and M. LENZ, 1992 The effect of strongly selected substitutions on neutral polymorphism: analytic results based on diffusion theory. Theor. Popul. Biol. 41:237-254.
TAILLON-MILLER, P., I. BAUER-SARDINA, N. L. SACCONE, J. PUTZEL, and T. LAITINEN et al., 2000 Juxtaposed regions of extensive and minimal linkage disequilibrium in human Xq25 and Xq28. Nat. Genet. 25:324-328.
TAJIMA, F., 1983 Evolutionary relationships of DNA sequences in finite populations. Genetics 105:437-460.
TAJIMA, F., 1989a Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.
TAJIMA, F., 1989b The effect of change in population size on DNA polymorphism. Genetics 123:597-601.
TAKAHASHI, A., S. C. TSAUR, J. A. COYNE, and C.-I WU, 2001 The nucleotide changes governing cuticular hydrocarbon variation and their evolution in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 98:3920-3925.
WAKELEY, J. and N. ALIACAR, 2001 Gene genealogies in a metapopulation. Genetics 159:893-905.
WALL, J. D., 1999 Recombination and the power of statistical tests of neutrality. Genet. Res. 73:65-79.
WALL, J. D., 2001 Insights from linked single nucleotide polymorphisms: what we can learn from linkage disequilibrium. Curr. Opin. Genet. Dev. 11:647-651.
WALL, J. D. and R. R. HUDSON, 2001 Coalescent simulations and statistical tests of neutrality. Mol. Biol. Evol. 18:1134-1135.
WEIR, B. S., 1996 Genetic Data Analysis II. Sinauer Associates, Sunderland, MA.
WRIGHT, S., 1951 The genetical structure of populations. Ann. Eugen. 15:323-354.
[Figure legends, recovered fragments: The two lines for P(H) are essentially superimposed. The sample size is 50. The time since the fixation of the favored allele, t, is scaled in units of 4N generations. A total of 10^4 simulations were run for each value of t. Only segregating sites above a minor allele frequency cutoff are considered. In the recurrent-sweep figure, the neutral locus is affected by repeated sweeps occurring at a constant rate.]