| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |
Corresponding author: Magnus Nordborg, University of Southern California, SHS 172, 835 W. 37th St., Los Angeles, CA 90089-1340., magnus{at}usc.edu (E-mail)
Communicating editor: W. STEPHAN
| ABSTRACT |
|---|
A stochastic model for the genealogy of a sample of recombining sequences containing one or more sites subject to selection in a subdivided population is described. Selection is incorporated by dividing the population into allelic classes and then conditioning on the past sizes of these classes. The past allele frequencies at the selected sites are thus treated as parameters rather than as random variables. The purpose of the model is not to investigate the dynamics of selection, but to investigate effects of linkage to the selected sites on the genealogy of the surrounding chromosomal region. This approach is useful for modeling strong selection, when it is natural to parameterize the past allele frequencies at the selected sites. Several models of strong balancing selection are used as examples, and the effects on the pattern of neutral polymorphism in the chromosomal region are discussed. We focus in particular on the statistical power to detect balancing selection when it is present.
COALESCENT theory is based on the realization that selective neutrality allows the separation of descent from state. This makes it possible to model samples (or populations) as random genealogies with superimposed neutral mutations (see ![]()
One way is to construct a genealogical process that "leaves room" for selection by creating genealogies that contain "virtual" branches representing possible lines of descent. After mutations have been superimposed and the state transmitted through each branch is known, the genealogy is "pruned" by preferentially removing selectively inferior branches so that only "actual" lines of descent remain. The process known as "ancestral selection graph" (![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
The second way, first described in the context of the coalescent by ![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
The two approaches are in a sense complementary: Whereas the ancestral selection graph works only for weak selection, the conditional approach works only for strong selection. How to connect the two is not clear.
In this article we use the second, conditional approach, to show how the original formulations of ![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
| A BASIC MODEL |
|---|
Consider a diploid, hermaphroditic population consisting of P patches, each of which harbors a constant, large number of adult individuals, Nk, k = 1, 2, ... , P. Let N =
Nk be the total population size, and define ck = Nk/N. The population reproduces in discrete, nonoverlapping generations according to a generalized Wright-Fisher model in the following manner. Each individual produces an (effectively) infinite number of gametes. Male gametes (e.g., pollen) flows between the patches; let fkl be the probability that a male gamete produced in patch k ends up in patch l. After migration, gametes unite randomly to form zygotes. The number of immature individuals in each patch is thus still effectively infinite, but only a finite number (Nk in the kth patch) reach adulthood. The probability that a given individual survives is determined by its genotype and the genotypic frequencies in the patch. Generalizations of this model are discussed below.
Forward dynamics at the selected loci:
Let H be the number of different haplotypes with respect to the selected locus or loci (which are assumed to be linked). Label the haplotypes 1, 2, ... , H, and the G = H(H + 1)/2 different genotypes by pairs of indices {i, j}, i
j = 1, 2, ... , H, according to their haplotypic composition. Let Nij,k(t) be the number of adults with genotype {i, j} in patch k in generation t. Note that whereas Nij,k(t) is a random variable, Nk is not. The frequency of genotype {i, j} in patch k is

and the frequency of haplotype i in patch k is

Let y'i,k(t) be the frequency of haplotype i among the gametes produced in patch k, generation t. In general, these gamete frequencies will be functions of the adult genotype frequencies and the appropriate segregation, mutation, and recombination parameters. Let

be the frequency of haplotype i among the male gametes in patch k after migration (female frequencies are of course unaffected by migration), where

Zygotes are then formed by random union of gametes within patches, so the frequency of genotype {i, j} among the zygotes in patch k is

Let the relative viability of a zygote with genotype {i, j} in patch k be 1 - wij,k(t), and define

where

The next generation of adults in patch k is formed by drawing Nk individuals according to these "postselection" frequencies. Thus, conditional on the genotype frequencies among adults in generation t, the length-G vector

is multinomially distributed with parameters Nk and

Genealogy of the surrounding segment:
Consider a chromosomal segment that contains the selected locus or loci. Take a particular copy of this segment, sampled from the adult individuals in generation t + 1. With respect to geographic location, it belongs to one of the P patches, and with respect to the selected locus or loci, it belongs to one of the H haplotypes.
Trace the genealogy of this segment one generation back in time. Each nucleotide in the segment in the current generation is a copy of the homologous nucleotide in some parental segment in the previous generation. In the absence of recombination, all nucleotides must have the same parental segment; otherwise, they may have different ones. However, rather than modeling nucleotides, it is convenient to think of the segment abstractly as a continuous unit interval where each point may have a different genealogy. This makes mathematical and computational sense and does not entail any loss of biological generality (see, e.g., ![]()
When tracing the genealogy of the segment back through the life cycle, the first thing to note is that selection cannot affect its state with respect to either patch or haplotype. The chosen segment must have been present in one of the gametes of the same type in the same patch before zygotes were formed and selection took place. However, migration can change the state of the segment with respect to patch, because the gamete may have been an immigrant, if it was male. The probability that a randomly chosen gamete of type i in patch k was male is y''i,k(t)/[y''i,k(t) + y'i,k(t)],
and the probability that a male gamete of type i currently in patch k was produced in patch l is
![]() |
(1) |
Note that migration always changes the state of the entire segment, because a gamete either does or does not migrate.
Having traced the segment through migration to the premigration gamete pool, the next step is to trace it through gamete production to the previous adult generation. Gamete production, i.e., meiosis, can change the state of the segment with respect to haplotype both through mutation at one of the selected loci and through recombination inside the segment. If the gamete was a mutant, then the segment changes state to take on the haplotype of the parental segment. If the gamete was a recombinant, things are more complicated, because the segment then has two parental segments. At each breakpoint, the parentage of the offspring segment switches from one parental segment to the other. Going backward in time, the segment splits into two pieces (or sets of pieces if there was more than one breakpoint), one of which takes on the haplotype of the first parental segment while the other takes on the haplotype of the second parental segment (see Fig 1). Both parental segments consequently have to be followed if the genealogy of the original segment is to be traced farther back in time. As is demonstrated later, the backward transition probabilities through meiosis can readily be derived, but the expressions depend on the details of the model (e.g., on the number of loci). In general, the probability that a gamete of type i in patch k resulted from a particular type of meiotic event in a particular genotype is a function of the length-G vector [xij,k(t)]i
j and the recombination and mutation parameters.
|
Once the genotype and patch of the parental individual has been determined, all eligible individuals are equally likely to have been the parent. Segments can thus be seen as "picking" their parent randomly.
To trace the single-generation genealogy of n copies of the segment, note that, since infinitely many gametes are produced, and gametes unite randomly within patches, the fate of each segment is independent of the fates of all the other segments. In other words, each segment "picks" its parental segment (or segments, in case of recombination) independently of the other segments. Whenever two or more segments pick parental segments belonging to the same patch, haplotype, and genotype, two or more of them may pick the same one. Segments that pick the same parental segment are said to coalesce; if their genealogy is to be traced farther back in time, only the single segment needs to be followed.
Segments can of course pick only the same parental segment if they first pick the same parental individual. Since all individuals of the same genotype are equally likely to be picked, the probability that n segments all pick different parents, given that they all pick parents with genotype {i, j} in patch k is
![]() |
(2) |
When two segments pick the same parental haplotype in the same parental individual, they coalesce with probability one-half if that individual is homozygous (so that there are two possible parental segments) and with probability one if it is heterozygous (so that only one segment is possible as parent).
| COALESCENT APPROXIMATION |
|---|
The purpose of the preceding section was to demonstrate that, conditional on the P length-G vectors [Nij,k(t)]i
j, k = 1, 2, ... , P, for t = 0, -1, ... , i.e., conditional on the genotype frequencies in all past generations, it is possible to model the genealogy of n segments sampled in generation t = 0 as a discrete-time Markov process running backward in time. However, the interesting process is the unconditional one, in which the genotype frequencies are governed by another discrete-time Markov process, running forward in time (as described above). Typically, we would like to know how the genealogy is affected by the parameters of that forward process (e.g., the selection coefficients and population sizes), not how it is affected by a particular realization of it.
Clearly, the unconditional process could be studied through discrete-time simulation: One would simply simulate the genotype frequencies forward in time and then simulate the genealogy backward in time, conditional on those frequencies.
An alternative approach, which is taken here, is to use a coalescent/diffusion approximation and assume that the genotype frequencies can be treated as having evolved deterministically on the continuous timescale (![]()
![]()
We focus on strong balancing selection in the following because the approach is easiest to explain and justify for such models (with balancing selection, the model is very close to that of ![]()
ij,k. In a finite population, the actual frequencies in any given generation will of course differ from these values. Similarly, in the diffusion approximation, the actual frequencies at a given point in time will differ from their expectations. The differences will tend to be smaller the stronger the selection. Note that nothing in the basic model described above limits how strong selection may be. Indeed, it is possible to assume infinitely strong selection (i.e., the selection coefficients need not be scaled in the diffusion approximation), in which case
![]() |
(3) |
and treating the genotype frequencies as constant is evidently justified. However, as discussed in ![]()
and similarly for the haplotype frequencies.
Scaling:
Under the assumption that the genotype frequencies can be treated deterministically, it is possible to use standard arguments to find a continuous-time coalescent approximation for the discrete-time genealogical process described above. Note that the probability (2) can be rewritten
![]() |
(4) |
Thus, a coalescence event occurs with probability O(1/N) per discrete generation, and it is natural to turn the process into a continuous-time process by rescaling time in units of O(N) generations and letting N go to infinity (while keeping ck constant to ensure that all the patches become large). The standard scaling of 2N is used throughout.
The per-generation probabilities of migration, mutation, and recombination are also assumed to be O(1/N), and the corresponding scaled parameters are introduced. Thus, it is assumed that the migration probability fkl can be written

where
kl is the migration rate (recombination and mutation are introduced below). Similarly, it is assumed that all the selection coefficients are O(1/N) (but still large enough for Equation 3 to hold approximately; see above). Taken together, these assumptions ensure that, to O(1/N),

and that, to the same order of approximation,
![]() |
(5) |
To proceed farther we must consider specific genetic models.
Single-locus example:
Consider a segment that contains a locus (or site) that is maintained polymorphic for two alleles by strong balancing selection in a subdivided environment. There are H = 2 "haplotypes," A1 and A2, and three genotypes, A1A1, A1A2, and A2A2. Let aij be the probability that allele Ai mutates to Aj during meiosis, and let r be the probability of a recombination event. It is assumed that these can be written

and

where
ij and
are the mutation and recombination rates. This model is identical to previously published models (![]()
![]()
![]()
![]()
![]()
Consider the backward transitions for a single segment with haplotype i sampled from an individual in patch k, as before. Because the probabilities of migration, mutation, and recombination are all assumed to be small, it is clear that the segment is most likely to be a copy of a segment with the same haplotype in the same patch in the previous generation. Indeed, it is easy to show that the probability is 1 - O(1/N). Furthermore, given that its state with respect to haplotype and patch did not change, the probability that it was produced by an {i, j} individual is
j,k + O(1/N). It can also be shown that the segment is an immigrant of the same type from patch l with probability

a mutant haplotype j
i from the same patch with probability

and a recombinant from an {i, j} individual in the same patch with probability

All other transitions (e.g., the segment is an immigrant mutant) have probability O(1/N2) or less.
As discussed above, it is simple to extend to a sample of n segments, because the single-generation transitions are mutually independent. Only the coalescence probabilities need to be determined. From Equation 4 a single coalescence event has probability O(1/N) or less, which means that the probability of more than two segments coalescing in a single generation has probability O(1/N2) or less. It also means that the probability that a segment involved in a coalescence is a migrant, mutant, or recombinant is O(1/N2). The only coalescence event that has probability O(1/N) is thus between two segments that do not change state with respect to haplotype or patch. For simplicity, consider the probability that n = two segments with haplotype i in patch k coalesce. To do so, they must have been produced by individuals of the same genotype, by the same individual of that genotype, and by the same segment within that individual. The probability of this is

where j
i, and Equation 5 has been used. Generally, it can be shown that the probability that a pair out of n segments of type i in patch k coalesces is

Given these transition probabilities, the limiting coalescent process can be derived using standard arguments. Segments belong to states with respect to haplotype and patch as before. Measure time in units of 2N generations, and let N go to infinity. Then, independently of all other segments, each segment with haplotype i in patch k migrates to patch l at rate

mutates to haplotype j
i at rate

and recombines with a j haplotype at rate
j,k
/2. If there are currently n segments with haplotype i in patch k, the total rate of migration to l is thus

etc. Similarly, each pair of segments with haplotype i in patch k independently coalesces at rate 1/(ck
i,k), so that the total rate for n such segments is

Two-locus example:
Consider a model with two loci each with two alleles, A1/A2 and B1/B2. There are H = 4 haplotypes: A1B1, A1B2, A2B1, and A2B2, which are numbered 1, ... , 4 in the order listed, and 10 genotypes. Define the mutation probability at the B-locus, bij, analogously to aij. The recombination probability, r, is also defined as before, but let dr be the length of the part that lies between the two loci (which are loci in the strict sense of the wordi.e., there is no recombination within them; they can be thought of as single-nucleotide polymorphisms, for example).
The single-generation backward transition probabilities can be found in the same manner as in the single-locus model, but are in some cases more complicated. The probability that a segment of type i in patch k is an immigrant of the same type from patch l is
![]() |
(6) |
as before. The mutation probabilities depend on the number and type of mutations involved. Thus the probability that a segment of type i = 1 (A1B1) in patch k is a mutant type j
i from the same patch is
![]() |
(7) |
for example. The recombination probabilities depend on the genotype in which the recombination event would have taken place. Recombination in homozygotes or single heterozygotes can be treated as in the single-locus model, but recombination in double heterozygotes cannot. Consider again a segment of type i = 1 (A1B1) in patch k. The probability that it is a recombinant from an {1, j} individual in the same patch is
![]() |
(8) |
for j = 1, 2, 3. However, it can also be a recombinant from a {1, 4} individual (the cis-heterozygote, A1B1/A2B2), as long as the recombination did not take place between the two loci. The probability of this event is thus
![]() |
(9) |
Given such an event, the breakpoint can be anywhere in the flanking pieces (it could, for example, be uniformly distributed). The opposite is true if the segment is a recombinant from a {2, 3} individual (the trans-heterozygote, A1B2/A2B1), because in this case the recombination must have taken place between the loci. The probability of this is
![]() |
(10) |
All other events have probability O(1/N2) or less.
The extension to n segments and the conversion to continuous time can be done precisely as for the single-locus model. The complete transition rates are given in Table 1.
|
| DETECTING SELECTION |
|---|
In this section we use simulations to investigate the distribution of the pattern of polymorphism in regions that contain sites subject to balancing polymorphism. We are in particular interested in the power to detect selection under various models and assumptions about parameter values. The simulation software, which is written in C++ with a Mathematica front end, is available on request.
Symmetric single-locus model:
Consider first a "classical" balancing selection model, in which selection maintains two different alleles at high frequencies in a random-mating population. The precise values of the equilibrium frequencies do not matter greatly (we assume that selection is completely symmetric so that the sizes of the two allelic classes, A1 and A2, are even), but the assumption that there is no subdivision does, as we see later. We also assume mutation between the two allelic classes is rare and symmetric (i.e., A1 mutates to A2 at the same rate as A2 mutates to A1). This assumption is discussed further below as well.
Fig 2 shows a typical realization of this model. The time to the most recent common ancestor (MRCA) of the sample is extremely high at the selected site and decreases with distance. The reason for this is clear: The selected site itself cannot coalesce unless a mutation from one allelic class to the other occurs. Since mutations are rare, this means that, almost always, all members of a particular class will coalesce to the MRCA of that class long before a mutation occurs. Sites that are more distantly linked to the selected site can move between allelic classes by recombination and coalesce much faster.
|
The genealogical pattern shown in Fig 2 may result in a characteristic pattern of polymorphism, which may be detected in the data. Fig 3 shows the distribution of the amount of variation within and between allelic classes and of Tajima's D statistic along the chromosome in six realizations. It is evident that the presence of balancing selection usually causes a "peak" due to divergence between the two allelic classes. This well-known phenomenon is also reflected in Tajima's D, which is expected to be greater than zero in the presence of balancing selection or (many forms of) population subdivision (![]()
|
We considered the power to detect selection using Tajima's D statistic. Table 2 gives the probability of observing a significantly positive value of this statistic under various assumptions. Several conclusions are clear. First, the power depends sensitively on the width of the window used to calculate D. A very small window will not have enough segregating sites to achieve statistical significance, whereas a large window will "drown" the peak in neutral noise. The optimal window width will depend on the ratio between
and
, i.e., the number of neutral mutations per recombination. If
/
> 1, balancing selection should usually be detected, but if
/
<< 1, it usually will not be.
|
To make the numbers in Table 2 applicable to sequence data, it is necessary to make assumptions about
and
per base. Assume that
= 0.01 per site, as is reasonable for Drosophila melanogaster (![]()
Loss-of-function mutations:
Many cases of balancing selection, especially those that are due to a trade-off between resistance and cost of resistance to some parasite or pathogen, are likely to involve loss-of-function mutations (![]()
![]()
![]()
![]()
Two loci:
Fig 4 shows a typical realization of a classical symmetric two-locus model with all four haplotypes present at equal frequencies. The two selected sites are positioned at 0.4 and 0.6, respectively. Note that the topologies of the trees behave in the intuitively obvious way as we walk along the chromosome: Close to each selected site, the sample must coalesce in a manner determined by the allelic classes at that site. In a sample that contains the "complementary" haplotypes (A1B1 and A2B2 or A1B2 and A1B2), sites located between the selected sites can coalesce only if there are at least two recombination events.
|
Fig 5 illustrates the effect of the distance between the selected sites on the distribution of
w,
b, and Tajima's D along the chromosome. If the distance between the sites is sufficiently great, there may be two distinct peaks (Fig 5, a and b); otherwise, only a single peak may be visible (Fig 5C). Power studies analogous to those in Table 2 indicate that the probability of rejecting neutrality using Tajima's D depends sensitively on the positioning, numbers, and sizes of the windows used (not shown). Obviously, the best strategy is to use a window size that captures each peak, but since the number of selected sites is not known a priori, this may be difficult to implement in practice. However, regions containing multiple sites subject to balancing selection are considerably less likely to be missed (see also ![]()
|
Subdivision:
The behavior of balancing selection models with population subdivision is very complicated, but the example shown in Fig 6 illustrates the main points. In addition to the structure imposed by the allelic and haplotypic classes, there is now population structure as well. Which structure turns out to be predominant will depend on the relative magnitudes of the parameters. In the example shown in Fig 6, the pattern of coalescence differs between the two selected sites. At the B-locus (located at 0.6), the first stage of coalescence is within allelic class within patches, followed by coalescence within allelic class between the two patches. Finally coalescence occurs between two allelic classes. On the other hand, at the A-locus (located at 0.4), there is a cluster of the four sequences in patch 2, and the longest branch is between sequence 1 (A1 in patch 1) and the others. The genealogical pattern is highly variable between realizations.
|
Local adaptation:
An important motivation for the model described above is to consider local adaptation. A balance between migration and selection seems much more likely to maintain polymorphism than does "pure" balancing selection. Because local adaptation leads to a deficit of heterozygotes at the selected locus or loci, the effect on linked neutral variation may be much greater. Local adaptation should thus be much easier to detect using polymorphism data.
Fig 7 shows an example of a single-locus model of local adaptation. The frequency of A1 is assumed to be 0.9 in patch 1 and 0.1 in patch 2 (and conversely for A2). The effects of local adaptation are even more dramatic when multiple sites are involved. Fig 8 shows an example of a two-locus model where A1B1 is favored in patch 1 and A2B2 is favored in patch 2. Note that the effects of selection extend throughout the simulated region, completely obscuring the peak around the B-locus. Multiple linked sites involved in local adaptation can, in principle, "lock up" entire chromosomal regions in complementary haplotypes.
|
|
| DISCUSSION |
|---|
We have shown how the "structured coalescent" described by ![]()
![]()
Robustness of the SARG:
We described the SARG as a limiting approximation to a specific model of pollen flow in an outcrossing hermaphroditic plant species, but the SARG is much more general. As is the case for the standard coalescent, phenomena such as selfing, separate sexes, or sex linkage, etc., can readily be incorporated (although formally proving convergence is likely to be both difficult and tedious; see ![]()
The hardest problem from a mathematical point of view is also the most interesting from a biological point of view; namely, when is it reasonable to treat selection as population structure? For strong balancing selection, it is plausible to argue that the approximation is a good one (![]()
![]()
Detecting balancing selection:
Our motivation for deriving the SARG was that we wanted to simulate sequence data from regions containing sites subject to balancing selection. It has long been known that balancing selection may create a peak of increased polymorphism centered around the selected site (![]()
![]()
![]()
![]()
![]()
Using simulations, we have focused on the question of whether we should expect balancing selection to be detectable or not. Our original motivation for asking this question is that many years of population genetics research in Drosophila have failed to uncover strong evidence for balancing selection. Is this because balancing selection is indeed rare, perhaps limited to highly unusual cases, like the MHC and plant self-incompatibility loci (![]()
![]()
![]()
![]()
![]()
![]()
Our simulations also tend to support the view that balancing selection might be difficult to detect in outcrossing organisms. For
/
= 1, as may be typical for Drosophila, taking finite sites into account, power to detect balancing selection seems to be
50% (Table 2). Note that this is power in the evolutionary sense, not in the usual "sampling" sense: It is not the case that a different sample would detect balancing selection; rather it is the case that a large fraction of existing balancing polymorphisms are expected to be undetectable (because of the particular evolutionary history of the polymorphism). Power will of course also depend on sampling and on the statistical method used, but perhaps less than we would think. It should be noted that our simulations do not take gene conversion into account: This could decrease power further (![]()
Finally, it seems clear that we should not normally expect to be able to see a typical signal of balancing selection when selection maintains a polymorphism between loss-of-function and functional alleles. The reason is simply that loss-of-function alleles are created far too rapidly to be ancient. When loss-of-function alleles nonetheless appear to be ancient, as is the case for some disease-resistance loci in A. thaliana (![]()
![]()
| FOOTNOTES |
|---|
2 Present address: Human Genetics Center, School of Public Health, University of Texas Health Science Center, 1200 Hermann Pressler, Houston, TX 77030. ![]()
| ACKNOWLEDGMENTS |
|---|
We thank A. Navarro and an anonymous reviewer for comments on the manuscript. M.N. thanks Peter Donnelly, Bob Griffiths, Dick Hudson, Steve Krone, Tom Kurtz, Paul Marjoram, Claudia Neuhauser, Gesine Reinert, Simon Tavaré, and Carsten Wiuf for many, many conversations about selection.
Manuscript received August 7, 2002; Accepted for publication December 17, 2002.
| LITERATURE CITED |
|---|
ANDOLFATTO, P. and M. NORDBORG, 1998 The effect of gene conversion on intralocus associations. Genetics 148:1397-1399.
BARTON, N. H. and A. NAVARRO, 2002 Extending the coalescent to multilocus systems: the case of balancing selection. Genet. Res. 79:129-139.[Medline]
BRAVERMAN, J. M., R. R. HUDSON, N. L. KAPLAN, C. H. LANGLEY, and W. STEPHAN, 1995 The hitchhiking effect on the site frequency spectrum of DNA polymorphism. Genetics 140:783-796.[Abstract]
CAMPBELL, R. B., 1999 The coalescent time in the presence of background fertility selection. Theor. Popul. Biol. 55:260-269.[Medline]
DONNELLY, P. and T. G. KURTZ, 1999 Genealogical processes for Fleming-Viot models with selection and recombination. Ann. Appl. Probab. 9:1091-1148.
FEARNHEAD, P., 2001 Perfect simulation from population genetic models with selection. Theor. Popul. Biol. 59:263-279.[Medline]
GRIFFITHS, R. C., and P. MARJORAM, 1997 An ancestral recombination graph, pp. 257270 in Progress in Population Genetics and Human Evolution, edited by P. DONNELLY and S. TAVARÉ. Springer-Verlag, New York.
HEY, J., 1991 A multi-dimensional coalescent process applied to multi-allelic selection models and migration models. Theor. Popul. Biol. 39:30-48.[Medline]
HUDSON, R. R., 1996 Molecular population genetics of adaptation, pp. 291309 in Adaptation, edited by M. R. ROSE and G. V. LAUDER. Academic Press, San Diego.
HUDSON, R. R. and N. L. KAPLAN, 1988 The coalescent process in models with selection and recombination. Genetics 120:831-840.
HUDSON, R. R. and N. L. KAPLAN, 1995 Deleterious background selection with recombination. Genetics 141:1605-1617.[Abstract]
HUDSON, R. R., K. BAILEY, D. SKARECKY, J. KWIATOWSKI, and F. J. AYALA, 1994 Evidence for positive selection in the superoxide dismutase (Sod) region of Drosophila melanogaster.. Genetics 136:1329-1340.[Abstract]
INNAN, H. and F. TAJIMA, 1999 The effect of selection on the amounts of nucleotide variation within and between allelic classes. Genet. Res. 73:15-28.
JOHANSON, U., J. WEST, C. LISTER, S. MICHAELS, and R. AMASINO et al., 2000 Molecular analysis of FRIGIDA, a major determinant of natural variation in Arabidopsis flowering time. Science 290:344-347.
KAPLAN, N. L., T. DARDEN, and R. R. HUDSON, 1988 The coalescent process in models with selection. Genetics 120:819-829.
KAPLAN, N. L., R. R. HUDSON, and C. H. LANGLEY, 1989 The "hitch-hiking" effect revisited. Genetics 123:887-899.
KAPLAN, N. L., R. R. HUDSON, and M. IIZUKA, 1991 The coalescent process in models with selection, recombination and geographic subdivision. Genet. Res. 57:83-91.[Medline]
KELLY, J. K. and M. J. WADE, 2000 Molecular evolution near a two-locus balanced polymorphism. J. Theor. Biol. 204:83-101.[Medline]
KIM, Y. and W. STEPHAN, 2002 Detecting a local signature of genetic hitchhiking along a recombining chromosome. Genetics 160:765-777.
KREITMAN, M. and H. AKASHI, 1995 Molecular evidence for natural selection. Annu. Rev. Ecol. Syst. 26:403-422.
KRONE, S. M. and C. NEUHAUSER, 1997 Ancestral processes with selection. Theor. Popul. Biol. 51:210-237.[Medline]
NAVARRO, A. and N. H. BARTON, 2002 The effects of multilocus balancing selection on neutral variability. Genetics 161:849-863.
NEUHAUSER, C., 1999 The ancestral graph and gene genealogy under frequency-dependent selection. Theor. Popul. Biol. 56:203-214.[Medline]
NEUHAUSER, C., 2001 Mathematical models in population genetics, pp. 153178 in Handbook of Statistical Genetics, edited by D. J. BALDING, M. J. BISHOP and C. CANNINGS. John Wiley & Sons, Chichester, UK.
NEUHAUSER, C. and S. M. KRONE, 1997 The genealogy of samples in models with selection. Genetics 145:519-534.[Abstract]
NORDBORG, M., 1997 Structured coalescent processes on different time scales. Genetics 146:1501-1514.[Abstract]
NORDBORG, M., 1999 The coalescent with partial selfing and balancing selection: an application of structured coalescent processes, pp. 5676 in Statistics in Molecular Biology and Genetics (IMS Lecture Notes-Monograph Series, Vol. 33), edited by F. SEILLIER-MOISEIWITSCH. Institute of Mathematical Statistics, Hayward, CA.
NORDBORG, M., 2001 Coalescent theory, pp. 179212 in Handbook of Statistical Genetics, edited by D. J. BALDING, M. J. BISHOP and C. CANNINGS. John Wiley & Sons, Chichester, UK.
NORDBORG, M., and S. M. KRONE, 2002 Separation of time scales and convergence to the coalescent in structured populations, pp. 194232 in Modern Developments in Theoretical Population Genetics: The Legacy of Gustave Malécot, edited by M. SLATKIN and M. VEUILLE. Oxford University Press, Oxford.
NORDBORG, M., B. CHARLESWORTH, and D. CHARLESWORTH, 1996 Increased levels of polymorphism surrounding selectively maintained sites in highly selfing species. Proc. R. Soc. Lond. Ser. B 263:1033-1039.
OLSON, M. V., 1999 When less is more: gene loss as an engine of evolutionary change. Am. J. Hum. Genet. 64:18-23.[Medline]
PRZEWORSKI, M., 2002 The signature of positive selection at randomly chosen loci. Genetics 160:1179-1189.
PRZEWORSKI, M., J. D. WALL, and P. ANDOLFATTO, 2001 Recombination and the frequency spectrum in Drosophila melanogaster and Drosophila simulans.. Mol. Biol. Evol. 18:291-298.
SCHIERUP, M. H., D. CHARLESWORTH, and X. VEKEMANS, 2000 The effect of hitch-hiking on genes linked to a balanced polymorphism in a subdivided population. Genet. Res. 76:63-73.[Medline]
SCHIERUP, M. H., A. M. MIKKELSEN, and J. HEIN, 2001 Recombination, balancing selection and phylogenies in MHC and self-incompatibility genes. Genes 159:1833-1844.
SIMONSEN, K. L., G. A. CHURCHILL, and C. F. AQUADRO, 1995 Properties of statistical tests of neutrality for DNA polymorphism data. Genetics 141:413-429.[Abstract]
SLADE, P. F., 2000a Most recent common ancestor probability distributions in gene genealogies under selection. Theor. Popul. Biol. 58:291-305.[Medline]
SLADE, P. F., 2000b Simulation of selected genealogies. Theor. Popul. Biol. 57:35-49.[Medline]
SLADE, P. F., 2001 Simulation of "hitch-hiking" genealogies. J. Math. Biol. 42:41-70.[Medline]
STAHL, E. A., G. DWYER, R. MAURICIO, M. KREITMAN, and J. BERGELSON, 1999 Dynamics of disease resistance at the Rpm1 locus of Arabidopsis.. Nature 400:667-671.[Medline]
TAJIMA, F., 1989 Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.
TAKAHATA, N., 1990 A simple genealogical structure of strongly balanced allelic lines and trans-species polymorphism. Proc. Natl. Acad. Sci. USA 87:2419-2423.
TAKAHATA, N. and Y. SATTA, 1998 Footprints of intragenic recombination at HLA loci. Immunogenetics 47:430-441.[Medline]
TIAN, D., H. ARAKI, E. STAHL, J. BERGELSON, and M. KREITMAN, 2002 Signature of balancing selection in Arabidopsis.. Proc. Natl. Acad. Sci. USA 99:11525-11530.
This article has been cited by other articles:
![]() |
A. L. Caicedo Geographic diversity cline of R gene homologs in wild populations of Solanum pimpinellifolium (Solanaceae) Am. J. Botany, March 1, 2008; 95(3): 393 - 398. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. M. Teshima and H. Innan Neofunctionalization of Duplicated Genes Under the Pressure of Gene Conversion Genetics, March 1, 2008; 178(3): 1385 - 1398. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. F. Storz, M. Baze, J. L. Waite, F. G. Hoffmann, J. C. Opazo, and J. P. Hayes Complex Signatures of Selection and Gene Conversion in the Duplicated Globin Genes of House Mice Genetics, September 1, 2007; 177(1): 481 - 500. [Abstract] [Full Text] [PDF] |
||||
![]() |