Originally published as Genetics Published Articles Ahead of Print on December 4, 2009.
Genetics, Vol. 184, 571-585, February 2010, Copyright © 2010
doi:10.1534/genetics.109.109447
Signatures of Recent Directional Selection Under Different Models of Population Expansion During Colonization of New Selective Environments
Yuseob Kim*,1 and
Davorka Gulisija
* Center for Evolutionary Functional Genomics, The Biodesign Institute, and School of Life Sciences, Arizona State University, Tempe, Arizona 85287-5301 and
Center of Rapid Evolution (CORE), Department of Zoology, University of Wisconsin, Madison, Wisconsin 53705
1 Corresponding author: Center for Evolutionary Functional Genomics, The Biodesign Institute, Arizona State University, P.O. Box 875301, Tempe, AZ 85287-5301.
E-mail: yuseob.kim{at}asu.edu
>ABSTRACT
POPULATION SPLIT MODEL
GENOTYPE-DEPENDENT COLONIZATION...
COMPARISON OF THE TWO...
DISCUSSION
APPENDIX A: DERIVATION OF...
APPENDIX B: SIMULATION METHODS
ACKNOWLEDGEMENTS
LITERATURE CITED
A major problem in population genetics is understanding how the genomic pattern of polymorphism is shaped by natural selection and the demographic history of populations. Complex population dynamics confounds patterns of variation and poses serious challenges for identifying genomic imprints of selection. We examine patterns of polymorphism using computer simulations and provide analytical predictions for hitchhiking effects under two models of adaptive niche expansion. The population split (PS) model assumes the separation of a founding population followed by directional selection in the new environment. Here, the new population undergoes a bottleneck and later expands in size. This model has been used in previous studies to account for demographic effects when testing for signatures of selection under colonization or domestication. The genotype-dependent colonization and introgression (GDCI) model is proposed in this study and assumes that a small number of migrants carrying an adaptive genotype found a new population, which then grows logistically. The GDCI model also allows for constant migration between the parental and the new population. Both models predict a reduction in variation and an excess of high-frequency derived alleles relative to neutral expectations, with and without hitchhiking. Under comparable conditions, the GDCI model results in a greater reduction in expected heterozygosity and more skew of the site frequency spectrum than the PS model. We also find that soft selective sweeps (fixation of multiple copies of a beneficial mutation) occur less often in the GDCI model than in the PS model. This result demonstrates the importance of correctly modeling the ecological process in inferring adaptive evolution using DNA sequence polymorphism.
THE pattern of genetic variation within a population is determined by its evolutionary history. The density of polymorphic sites along the chromosomes, the distribution of allele frequencies at those sites, and the statistical association of polymorphism at different sites are influenced by events of natural selection and population (demographic) dynamics (ROSENBERG and NORDBORG 2002; NIELSEN 2005). Population genetic theory allows us to predict the pattern of genetic variation under specific models of selection and demography and, inversely, to infer the evolutionary history from a sample of DNA sequences within a population. A recent event of directional selection is often detected when a sudden removal of polymorphism is observed at a genomic location, due to the hitchhiking effect of a rapidly spreading beneficial mutation that wipes out preexisting variation (MAYNARD SMITH and HAIGH 1974; KAPLAN et al. 1989; STEPHAN et al. 1992; BARTON 2000). Numerous surveys of DNA sequence polymorphism revealed local reductions of variation clearly due to hitchhiking or selective sweeps (WOOTTON et al. 1999; NAIR et al. 2003; SCHLENKE and BEGUN 2004; SABETI et al. 2006; MACPHERSON et al. 2007; THORNTON et al. 2007; WILLIAMSON et al. 2007; AKEY 2009). From such findings, it has become evident that directional selection plays a major role in shaping the genomic pattern of sequence variation in natural populations (GILLESPIE 2000; BEGUN et al. 2007; HAHN 2008). A recent selective sweep also provides basic information regarding directional selection, such as the strength and fixation time of beneficial mutations (WANG et al. 1999; KIM and STEPHAN 2002; PRZEWORSKI 2003). However, such inference is not robust to deviation from the standard model of hitchhiking—the fixation of a new codominant beneficial mutation in a constant-sized random-mating population. 
Fixations of beneficial mutations in real populations are not likely to occur under simple demography or simple models of directional selection (INNAN and KIM 2004; JENSEN et al. 2005; TESHIMA and PRZEWORSKI 2006; CHEVIN and HOSPITAL 2008).
The sensitivity of the pattern of selective sweeps to biological details poses serious problems for studying adaptive evolution using genetic data. However, at the same time, it opens the possibility of capturing information that allows the inference of biological context in which adaptive evolution occurs, beyond merely confirming that a certain allele in a genomic region spread quickly in the recent past. Among numerous biological complications, recent studies have focused on the effects of complex demography on the pattern of selective sweeps. Methods have been proposed to extract the signal of genetic hitchhiking from the background pattern of polymorphism shaped by demography (JENSEN et al. 2005; NIELSEN et al. 2005) or to estimate the joint parameters of demography and directional selection (WRIGHT et al. 2005; LI and STEPHAN 2006). Approaches of the latter studies would generate information regarding the biological context of adaptive evolution. Such studies, however, require either novel theory of genetic hitchhiking or efficient methods of computer simulation that could predict and generate detailed patterns of polymorphism under models of directional selection in the biological setting of interest.
Many well-known and important examples of adaptive evolution occur during or after the establishment of a new population in a new environment. It is believed that the migration of humans out of Africa was followed by repeated episodes of directional selection. For example, strong selective sweeps at pigmentation genes in some non-African human populations demonstrate the history of those populations' adaptation after migration into new environments (LAMASON et al. 2005; MYLES et al. 2007). Likewise, evolution of agronomic traits in domesticated plants and animals involves the establishment of small (cultivated) populations derived from wild ancestors followed by strong directional (artificial) selection (DOEBLEY et al. 2006). Other examples include the invasion of a nonnative species into new habitats following human-caused disturbance (LEE 2002; LEE and GELEMBIUK 2008) and host switching of pathogens (PARRISH et al. 2008). In all of these examples, the genetic footprint of directional selection should overlap with that created by the demographic process of founding and expanding a new population. This ubiquitous mode of evolution, encompassing all the examples above, might be called "adaptive niche expansion."
In this study, we investigate the pattern of genetic variation under two models of adaptive niche expansion. The first model assumes a simple split of an ancestral population into parental and derived populations (Figure 1A). The population split is followed by directional selection on adaptive alleles in the derived population. This model, referred to here as population split (PS), has been used in previous studies that account for demographic effects when testing for signatures of selection under colonization or domestication (WRIGHT et al. 2005; LI and STEPHAN 2006; THORNTON and JENSEN 2007). The PS model is simple enough to allow the application of standard approximations in population genetics: we may use the Wright–Fisher model of reproduction and the coalescent (diffusion) approximation. Using coalescent simulation, the patterns of genetic variation in this model have been extensively studied (INNAN and KIM 2004; WRIGHT et al. 2005; THORNTON and JENSEN 2007; INNAN and KIM 2008).
While the PS model assumes an instantaneous establishment of a new derived population, the natural processes of colonization may be more gradual and complicated. We therefore consider a different model of adaptive niche expansion in which a small number of migrants carrying certain alleles successfully initiate logistic growth in a new habitat (Figure 1B; see below for further description). These two models may look similar, the first model being an approximation of the second. However, the ecological processes explicitly modeled are clearly different. Our major interest is whether such subtle ecological/demographic differences leave distinct signatures on the patterns of genetic variation. In this study, we present the two models of adaptive niche expansion and expound on their analytic predictions on the pattern of genetic variation. Distinct impacts on the patterns of genetic variation between the two models would emphasize the importance of accounting for details in the demographic scenario when testing for signatures of selection.
POPULATION SPLIT MODEL
For simplicity, we consider populations of haploid individuals that undergo recombination upon random union. Time is counted backward from the present (time 0). The ancestral population maintains a constant effective population size (number of haploids) of N0. At time Td, the ancestral population splits into two daughter populations, which then remain diverged for Td generations, up to the present, without exchanging migrants. The effective size of the first population, pop1, remains constant at N1 between time 0 and Td. The second population (pop2) starts as a small founding population of size Nb (<< N0). This population bottleneck of size Nb lasts for Lb generations. Then, at time Td – Lb, the size of pop2 increases to N2.
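The demographic schedule just described can be written as a small piecewise function. This is only a sketch: the function name `pop2_size` and the treatment of the boundary generations (which phase owns t = Td and t = Td – Lb) are assumptions made for illustration, not something the text specifies.

```python
def pop2_size(t, Td, Lb, Nb, N2, N0):
    """Size of the population ancestral to pop2 at backward time t (0 = present).

    Assumed boundary convention: the split generation Td belongs to the
    ancestral phase, and the bottleneck covers Td - Lb <= t < Td.
    """
    if t < 0:
        raise ValueError("t is counted backward from the present and must be >= 0")
    if t >= Td:
        return N0   # before the split: ancestral population
    if t >= Td - Lb:
        return Nb   # founding bottleneck, lasting Lb generations
    return N2       # expanded pop2, up to the present


# Illustrative parameter values (hypothetical, not taken from the article).
sizes = [pop2_size(t, Td=400, Lb=100, Nb=50, N2=10_000, N0=10_000)
         for t in (0, 299, 300, 399, 400)]
```

With these hypothetical values the bottleneck occupies backward times 300 to 399, and pop2 has size N2 from time 299 to the present.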
It is assumed that pop2 occupies an environment different from that of the ancestral population (pop1). Therefore, directional selection on beneficial alleles, advantageous in the new environment with selection coefficient s, begins in pop2 immediately following the divergence at Td, as modeled in INNAN and KIM (2008). We examine the pattern of genetic variation at a neutral site, which recombines with the site of beneficial mutation with probability r per generation, conditional on the fixation of the beneficial allele. The beneficial allele, denoted B, may originate from the standing genetic variation in the ancestral population or through a single mutation at time Td in the derived population. In the former scenario, it is assumed that the relative frequency of B is f0 both in the ancestral population immediately before the population split and in pop2 immediately after the split. A key assumption of the PS model is that the Nbf0 copies of the B allele simultaneously turn beneficial, enjoying the same selective advantage, at the founding of the derived population.
A soft selective sweep (INNAN and KIM 2004; HERMISSON and PENNINGS 2005; PENNINGS and HERMISSON 2006) can occur with Nbf0 >> 1; if two or more copies of B, which may be linked to distinct sequences, increase to high frequency and contribute to the fixation, the reduction of genetic variation is expected to be less severe than it is when a single copy sweeps through the population (hard selective sweep). Note that a hard selective sweep can occur even if Nbf0 >> 1, because only one copy may survive the stochastic loss in the early phase of increase (ORR and BETANCOURT 2001). In this model, we quantify the expected prevalence of a soft sweep by HB, i.e., the probability that two randomly chosen copies of B at present are not identical by descent when the lineages of the two alleles are traced back to time Td. Namely, we treat all Nbf0 copies of the beneficial allele at time Td as distinct. (Note that the actual outcome of a soft sweep will depend on whether the two "distinct" copies of B at time Td have single or multiple mutational origins. However, we make no assumption about this. Therefore, our definition of a soft selective sweep remains inclusive of both cases.) The expected frequency of B in pop2 at time t is given by [expression], and it takes Tf generations to reach fixation, where Tf < Lb under strong selection. Then, HB is simply the probability that two randomly chosen lineages of B, starting at present, do not coalesce until time Td in the past. Therefore,
[Equation 1]
[...] and by decreasing frequency of B. A soft selective sweep may occur when both factors are moderate. Therefore, a soft selective sweep can prevail only if f0 is greater than 1/(Nbs).
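The threshold f0 > 1/(Nbs) can be explored numerically. The sketch below is not the article's Equation 1. It assumes a deterministic logistic rise of B from frequency f0 under haploid selection and multiplies, generation by generation, the probability that two B lineages do not share a parent. The function names and all parameter values are hypothetical.

```python
import math


def freq_B(g, f0, s):
    """Assumed deterministic frequency of B, g generations after founding."""
    return f0 / (f0 + (1.0 - f0) * math.exp(-s * g))


def prob_not_ibd(f0, s, Nb, Lb, N2, Td):
    """Probability that two B lineages sampled today stay distinct back to Td."""
    h = 1.0
    for g in range(Td):                   # g = generations since founding
        size = Nb if g < Lb else N2       # bottleneck first, then expansion
        n_B = size * freq_B(g, f0, s)     # expected number of B copies
        h *= 1.0 - min(1.0, 1.0 / n_B)    # no coalescence in this generation
    return h


# Hypothetical values: 1/(Nb*s) = 0.01, so f0 = 0.05 lies above the
# threshold and f0 = 0.002 lies below it.
common = dict(s=0.1, Nb=1000, Lb=200, N2=100_000, Td=400)
h_above = prob_not_ibd(f0=0.05, **common)
h_below = prob_not_ibd(f0=0.002, **common)
```

Under these assumptions `h_above` is well above one half while `h_below` is close to zero, in line with the statement that soft sweeps prevail only when f0 exceeds 1/(Nbs).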
In the case of a hard selective sweep, the expected level of genetic variation can be obtained for a neutral locus linked to the target of selection. The derivation uses the diffusion approximation developed in KIM (2006). From Equations A10, A12, and A13 in APPENDIX A, the expected heterozygosity at a neutral locus, which recombines with the selected locus at rate r per generation, in pop2 is given by
[Equation 2]
where θ0 = 2N0µ, θb = 2Nbµ, θ2 = 2N2µ, and µ is the mutation rate at a neutral locus. y is the expected final frequency, after hitchhiking, of descendant copies that trace back to a single copy of the neutral allele on the chromosome where the beneficial mutation occurred (BIRKY and WALSH 1988; GILLESPIE 2000). Previous analyses suggest that [expression], if Nbs is sufficiently large (>100) (STEPHAN et al. 1992; KIM and STEPHAN 2002; KIM and NIELSEN 2004). The three terms in the last line of the equation above correspond to the contributions of mutations that originate in the ancestral population, in pop2 during the bottleneck, and in pop2 after the bottleneck, respectively. If Td is short relative to N2 or N0, allowing us to ignore genetic drift during the post-bottleneck period and the contribution of new mutations after the population split, the above equation is simplified to
[Equation 3]
This is the ancestral variation (θ0) reduced by both the population bottleneck and the hitchhiking effect (1 – y²).
GENOTYPE-DEPENDENT COLONIZATION...
The Wright–Fisher model, or any other model of reproduction that requires specifying the population size at a given time, is not convenient here because the growth of the derived population must be modeled separately. Furthermore, if the derived population is a mixture of migrants carrying different genotypes with different fitness, the growth rate of the population depends on its exact genetic composition (each nonadaptive allele produces less than one descendant on average and is thus eliminated eventually, although not immediately). We thus use the following simple model of reproduction to allow the feedback between demography and selection. The evolutionary dynamics of different alleles is specified by the absolute, rather than relative, fitness, which is a function of ecological parameters. Consider a population of N haploid individuals that reproduce in discrete generations and, during reproduction, may randomly pair and recombine. Let WX be the absolute fitness (the expected number of descendants in the next generation) of an individual carrying genotype X in the given environment. The number of offspring in the next generation from each individual is Poisson distributed with parameter WX. The total number of individuals may therefore increase or decrease stochastically between generations. In the GDCI model, all individuals in the parental population (pop1) are assumed to have the same fitness. We model the absolute fitness in pop1, for all genotypes, as
[Equation 4]
Equation 4 depends on the intrinsic growth rate of the population, on N1, the current size of pop1, and on K1, the carrying capacity of pop1. At equilibrium, N1 will fluctuate around K1. With a large value of N1, the reproduction in this population should approach that of the Wright–Fisher model, since the binomial distribution of offspring number in the latter model converges to a Poisson distribution.
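A minimal sketch of this reproduction scheme follows. The exact form of Equation 4 is not available here, so the code assumes a discrete logistic fitness, W = 1 + rho(1 – N/K), floored at zero. The symbol `rho`, the function names, and the parameter values are all hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Poisson sample by Knuth's multiplication method (adequate for small means)."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def absolute_fitness(n, rho, K):
    """ASSUMED logistic form, not the article's Equation 4."""
    return max(0.0, 1.0 + rho * (1.0 - n / K))


def simulate(n0, rho, K, generations, seed=1):
    """Each individual leaves a Poisson(W) number of offspring every generation."""
    rng = random.Random(seed)
    sizes = [n0]
    for _ in range(generations):
        n = sizes[-1]
        w = absolute_fitness(n, rho, K)
        sizes.append(sum(poisson(w, rng) for _ in range(n)))
    return sizes


# Start at carrying capacity; the size should then fluctuate around K.
trajectory = simulate(n0=500, rho=0.5, K=500, generations=100)
```

Because fitness is absolute, the population size is an outcome of the simulation rather than an input, which is the feedback between demography and selection that the text asks for.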
In pop2, the absolute fitness depends on whether a haploid carries the adaptive (B) or nonadaptive (b) allele for the environment, if one locus is responsible for the adaptation. Then, we may specify the absolute fitness as
[Equation 5a]
[Equation 5b]
One could instead specify a separate intrinsic growth rate and carrying capacity for allele b rather than using Equation 5b. However, multiplying by a single factor 1 – sb in Equation 5b effectively reduces both ecological parameters simultaneously. As indicated above, sb is given such that Wb < 1 for all values of N2. We assume that migration occurs in both directions and the number of migrants is proportional to the size of the source population; at each generation, the expected number of migrants from pop1 to pop2 is M1 = N1m and that from pop2 to pop1 is M2 = N2m. As in the case of the PS model, the adaptive allele in pop2, B, is assumed to segregate neutrally in pop1 with frequency f0. Therefore, before pop2 is established, on average M1f0 haploids with the adaptive allele arrive in the new environment each generation. Once individuals with the adaptive allele establish the initial population that survives stochastic loss, N2 grows logistically until it reaches K2.
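The colonization process described above can be sketched as a forward simulation of pop2 alone. Several simplifications are assumptions of this sketch rather than features of the article's model: pop1 is held at constant size so that M1 is fixed, migration back to pop1 and recombination are ignored, and the fitness of B takes the form W_B = 1 + rho(1 – N2/K2) with W_b = (1 – sb) W_B. All names and values are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Poisson sample by Knuth's multiplication method (adequate for small means)."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def colonize(M1, f0, rho, K2, sb, generations, seed=7):
    """Return (nB, nb), the counts of B and b haploids in pop2 at the end."""
    rng = random.Random(seed)
    nB = nb = 0
    for _ in range(generations):
        wB = max(0.0, 1.0 + rho * (1.0 - (nB + nb) / K2))  # assumed logistic form
        wb = (1.0 - sb) * wB                               # b is penalized by 1 - sb
        # Poisson reproduction, then arrival of new migrants from pop1.
        nB = sum(poisson(wB, rng) for _ in range(nB)) + poisson(M1 * f0, rng)
        nb = sum(poisson(wb, rng) for _ in range(nb)) + poisson(M1 * (1.0 - f0), rng)
    return nB, nb


# sb = 0.5 keeps W_b below 1 even in an empty habitat: (1 - 0.5) * (1 + 0.5) = 0.75.
nB_final, nb_final = colonize(M1=1.0, f0=0.05, rho=0.5, K2=200, sb=0.5,
                              generations=1000)
```

In this sketch, b haploids never sustain themselves, so pop2 stays nearly empty until a B migrant survives stochastic loss, after which nB grows toward K2 while nb remains a small pool of recent migrants.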
We are interested in the pattern of genetic variation at neutral loci in pop2 observed shortly after the growth of pop2 is completed. The amount of variation depends on the linkage to the adaptive locus. If a neutral locus is closely linked to the adaptive locus, its expected heterozygosity in pop2 should be low because most neutral lineages originate from one or a few that migrated into pop2 along with the adaptive allele, B, on the same chromosome. This mechanism is fundamentally identical to the hitchhiking effect of a beneficial mutation, as first described in MAYNARD SMITH and HAIGH (1974), but in a different mode of directional selection. We thus aim to derive an approximation to H(r), the expected heterozygosity, as a function of recombination rate r.
As our model of reproduction is similar to the Wright–Fisher model with respect to the offspring distribution, we may apply the methods of coalescent approximation that were used in other studies under the Wright–Fisher model. At time t, the derived population is composed of nB(t) haploids carrying B and nb(t) haploids carrying b at the selected locus (nB(t) + nb(t) = N2(t)). Counting time backward from the present (t = 0), let Td be the last generation when nB(t) > 0. With limited migration, Td corresponds to the point when the first successful B haploid (carrying the "founding" copy of B) starts growing in pop2 (Figure 1B). Then, we randomly pick two alleles at a neutral locus at t = 0 in the derived population (pop2) and follow their lineages backward in time. With limited migration and strong selection against b, nb(0) is far smaller than nB(0). Therefore, we consider only the case in which both neutral lineages are linked to the B allele at t = 0. The two lineages in pop2 may coalesce before Td. This event occurs with probability 1/nB(t) at time t, if both lineages remain in pop2 and are linked to B. However, if one of the lineages migrates to pop1 (when time is counted backward), the coalescent event cannot occur. There are two routes by which a lineage at the neutral locus can migrate from pop2 to pop1 before Td. The first route is through recombination onto a chromosome carrying the b allele, which shortly moves to pop1 because haploids in pop2 carrying b must be recent migrants from pop1 due to selection against b. The second route is through direct migration to pop1 along with the linked B allele before Td, which implies a soft selective sweep (this migrating copy of the B allele is different from the "founding" copy of B that entered pop2 at Td).
To obtain the approximate probability of first-route migration, we consider the scenario in which, forward in time, most haploid migrants from pop1 to pop2 carry allele b at the selected locus (f0 << 1). Neutral alleles carried by these migrant chromosomes may stay in pop2 only if they recombine with the B allele. Otherwise, they will be eliminated with rate 1 – Wb. Let M*(t1, t2) be the expected number of neutral lineages that entered pop2 at time t1 and still remain linked with b in pop2 at time t2 (Td > t1 ≥ t2 ≥ 0). Considering selection against b,
[equation]
Ignoring short-term change in Wb(t), we may use Wb(t) ≈ Wb(t2) for t2 ≤ t ≤ t1. Therefore, [expression]. Then, backward in time, a neutral lineage that is in pop2 and linked to B can migrate to pop1 if it recombines with b. This happens with probability [expression], which depends on the number of neutral lineages that are linked with b and shortly migrate back to pop1. This number is different from nb(t) because some of the lineages that are currently linked with b may recombine back with B before migrating to pop1. We find that
[Equation 6]
The probability of second-route migration is simply the proportion of B alleles that just migrated from pop1 (forward in time) among all copies of B in pop2. Therefore, the probability for a given lineage is M1f0/nB(t), because the expected number of B alleles migrating (forward in time) into pop2 each generation is M1f0.
The probability that two lineages coalesce in pop2 is then approximately
[Equation 7]
The expected value of nB(t) is given by the logistic growth of B haploids. Namely,
[Equation 8]
Again, ignoring the short-term change in Wb, [expression]. Using these approximations for nb(t) and nB(t), Equation 7 can be calculated. Further simplification of Equation 7 is possible if we note that most coalescent events would occur when t is reasonably close to Td. Then, we may substitute Wb(t) in the equations above with [expression] and also assume that nb(t) << nB(t) for all t. Therefore, using [expression],
[Equation 9]
[Equation 10]
where θ1 = 2K1µ is the expected heterozygosity in pop1 and Pcoal is given by either Equation 7 or 9. Figure 2 shows that these analytic approximations are reasonably close to the results of individual-based forward-in-time simulations, which are described in APPENDIX B. Both the analytic solution and the simulations assume that the allelic difference between two neutral lineages linked to different copies of B at Td is equal to that between two randomly chosen neutral lineages in pop1. This is not realistic unless the recombination rate between the two loci is very large or recurrent mutations between b and B are very frequent. Therefore, the above equation overestimates the level of sequence variation in real data in the case of soft selective sweeps. However, it is currently not feasible to obtain the expected heterozygosity between two neutral alleles that are linked to an identical (by state) neutral allele at another locus (as in the case of B in pop1) that has drifted to frequency f0. In addition, we note that, if the allele B is deleterious, rather than neutral, in the ancestral population, H(r) given above would further overestimate the actual level of variation, since a deleterious allele has a recent origin.
We can isolate the probability of soft selective sweeps, equivalent to Equation 1 for the PS model, by choosing r = 0 in Equation 7. This leads to a solution identical to the probability of a soft selective sweep due to recurrent migration obtained first by PENNINGS and HERMISSON (2006). Namely,
[Equation 11]
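The r = 0 case can be checked numerically from the two per-generation rates described in the text: coalescence of two B lineages, with probability 1/nB(t), and second-route migration, taken here as M1f0/nB(t) per lineage (the proportion of B copies that are fresh migrants). The sketch below follows two lineages backward and records which event happens first. The logistic trajectory for nB and every name and value are assumptions of this sketch. The closed form 2M1f0/(1 + 2M1f0) in the comparison is what these two rates imply here, and it is not asserted to be the article's Equation 11.

```python
import math


def logistic_count(g, rho, K):
    """Assumed expected number of B copies g generations after one founder arrives."""
    return K / (1.0 + (K - 1.0) * math.exp(-rho * g))


def prob_soft(M_B, rho, K, generations):
    """P(two B lineages are separated by migration before they coalesce), r = 0.

    M_B stands for M1 * f0, the expected number of B migrants per generation.
    """
    surviving, soft = 1.0, 0.0
    for g in range(generations, -1, -1):   # walk backward toward the founding
        n_B = logistic_count(g, rho, K)
        p_coal = 1.0 / n_B                 # the two lineages share a parent
        p_mig = 2.0 * M_B / n_B            # either lineage is a fresh migrant
        total = p_coal + p_mig
        if total > 1.0:                    # an event is certain this generation
            p_coal, p_mig = p_coal / total, p_mig / total
        soft += surviving * p_mig
        surviving *= 1.0 - (p_coal + p_mig)
    return soft


M_B = 1.0 * 0.05                           # hypothetical M1 = 1, f0 = 0.05
fast = prob_soft(M_B, rho=0.5, K=1000, generations=200)
slow = prob_soft(M_B, rho=0.1, K=1000, generations=200)
closed_form = 2.0 * M_B / (1.0 + 2.0 * M_B)
```

Both growth rates give the same value because the ratio of the two rates does not depend on nB(t). This matches the remark later in the article that Equation 11, unlike Equation 1, is not a function of the strength of selection.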
[...] 0. For example, using Equation 9,
[Equation 12]
Using Pcoal from Equation 7 yields much better agreement with simulation results than using Equation 9, which presumably reflects an error introduced by the assumption that nb(t) << nB(t) for all t. Looking backward in time, the recombination events by which a neutral lineage linked to the adaptive allele escapes coalescence ("first-route migration") occur when the number of b-linked lineages is not too much smaller than N2(t). This happens in a much shorter window of time compared to the standard model of selective sweeps (KAPLAN et al. 1989), as the probability of the equivalent recombination event in the latter model is proportional to 1 – x, where x is the relative frequency of the beneficial mutation: while 1 – x changes gradually according to the logistic function, the number of b-linked lineages changes drastically when t is close to Td. It is thus important to describe this number correctly for t close to Td in Equation 7. Therefore, assuming N2(t) = nB(t) when nb(t) is not much smaller than nB(t) may cause a significant error. This explains why the approximation using Equation 9 gets worse for smaller sb (Figure 3A). The approximations and simulation results also indicate that the hitchhiking effect depends very little on the growth rate of B haploids in pop2 (Figure 3B).
COMPARISON OF THE TWO...
0.6 and s [PS] = sb [GDCI]). Therefore, the weakened signature of genetic hitchhiking due to soft selective sweeps, considered as a potential difficulty in detecting selection in plant domestication and other models of adaptive niche expansion (INNAN and KIM 2004; HERMISSON and PENNINGS 2005; PRZEWORSKI et al. 2005), might be a smaller problem in the GDCI model than in the PS model. With weak selection (Nbs << 1/f0), the probability of a soft sweep in the PS model may become as low as that in the GDCI model. [Note that Equation 1 is a function of the strength of selection but Equation 11 is not (HERMISSON and PENNINGS 2005, PENNINGS and HERMISSON 2006).] However, such weak selection may produce a very narrow region of the selective sweep. With a large value of f0 (> 0.01), there is a limit to which the strength of selection can be reduced and, at the same time, a distinct local reduction of variation (which requires Nbs >> 1) can be produced.
When we consider hard selective sweeps only, the GDCI model predicts more severe and wider reduction of variation around the adaptive locus (Figure 5). To compare the extent of local selective sweeps, let us define rc to be a recombination rate that satisfies
[equation]
[Equation 13]
[Equation 14]
[...] This condition may be met for a wide range of reasonable scenarios, for example, if 2Nbs > 100 and M1 [...] 1 (see DISCUSSION).
Both the probability of soft sweeps and the extent of reduced variation by hard selective sweeps suggest that the GDCI model can produce a greater reduction in polymorphism than the PS model with comparable strength of selection. However, since both models predict V-shaped patterns of local reduction in polymorphism around the locus under selection, it may not be possible to determine which model is more compatible with a given observation of reduced heterozygosity, unless the fitness effect of the nonadaptive allele in pop2 and the effective migration rates can be correctly measured experimentally. We therefore explored whether the site frequency spectrum (SFS; GRIFFITHS 2003) might allow us to distinguish between the two models. The frequency spectrum in the GDCI model can be obtained using a frequency-based forward-in-time simulation (APPENDIX B). Figure 6 shows that, for similar reduction in the heterozygosity of pop2 relative to pop1 after a hard selective sweep, the SFS in the two models are quite different. We find that, relative to the PS model, the hitchhiking effect in the GDCI model creates a greater excess of high-frequency derived alleles. Even with moderate reduction of expected heterozygosity (H(r)/θ1 = 0.1 to 0.3), the GDCI model produces an almost U-shaped SFS. It should be noted that a U-shaped distribution can also be produced under the PS or standard hitchhiking model (e.g., KIM 2006, Figure 1), but with a much smaller ratio r/s and thus accompanying a greater reduction in the expected heterozygosity than that shown in Figure 6. A coalescent-based explanation for this difference in the SFS between the PS and the GDCI model is offered below.
DISCUSSION
On the other hand, the process of founding a derived population is gradual in the GDCI model. Since the change in size of the derived population depends on its genetic composition, the Wright–Fisher model is not adequate. We therefore constructed a model in which each allele leaves descendants according to its absolute fitness, which is determined by ecological parameters. In this model, establishment of the derived population starts with the growth of one or a few migrants carrying the adaptive allele whose absolute fitness in the new environment is greater than one. Even though the constant rate of migration allows late-coming migrants to leave descendants in the new population, their contribution to the final population size is small relative to the early founding migrants. Therefore, genetic variation is less likely to show the pattern of soft selective sweeps at linked neutral loci than in the PS model.
The establishment of a derived population due to continuous migration in the GDCI model might be more realistic than the PS model in many cases of habitat expansion (e.g., freshwater invasion of marine copepods) or certain cases of domestication (e.g., domestication of dogs from gray wolves). However, it is likely that a real biological process of adaptive niche expansion is more complicated and the PS and GDCI models simply offer two different approximations to the same evolutionary event. It should be noted that the two models studied here are mainly concerned with the adaptive evolution that is critical for the initial foundation of the derived population, most likely due to positive selection on the preexisting mutations in the parental population. Therefore, the hitchhiking effect in the GDCI model is necessarily restricted to only one or a few loci in the genome that played the most important role in the initial growth of the derived population. On the other hand, the PS model (with a hard selective sweep) can be used to analyze the fixation of new adaptive mutations that arose after the initial establishment of the new population, even though this population was founded through continuous migration. In this sense, the PS model might be more general. However, if a new mutation that occurs after the foundation of a small population acts to greatly increase the size of this population (i.e., a mutation conferring large absolute fitness), the signature of hitchhiking around this allele might be closer to the GDCI than to the PS model.
Although these two models may approximate the same process of adaptive niche expansion, the pattern of genetic variation at both linked and unlinked neutral loci are strikingly different. With the condition that
, the expected heterozygosity is much lower in the GDCI model than in the PS model. From
and the fact that most domesticated populations harbor about 30–80% of ancestral variation, a reasonable range of M1 (the expected number of migrants from pop1 to pop2) for the GDCI model, at least in the case of domestication, might be from 0.2 to 2. This might also be true for most other cases of adaptive niche expansion. Then, considering 2Nbs cannot be lower than
50 to produce a distinct local reduction of genetic variation in the PS model (KIM and STEPHAN 2002), the above inequality will generally hold. Therefore, for a comparable strength of selection, the reduction of polymorphism caused by the hitchhiking effect will be much more severe in the GDCI model. This severity of the hitchhiking effect is mainly explained by the fact that the opportunity for the decay of the beneficial-neutral allele association exists only briefly in the GDCI model: a given lineage of an adaptive allele that enters pop2 can recombine onto a different neutral allele with probability
, which decreases much more rapidly than the equivalent probability [i.e., r(1 – x), where x is the relative frequency of beneficial mutation] in the PS model, when time is counted forward.
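To make the PS-model quantity above concrete, the following sketch accumulates the per-generation recombination probability r(1 – x) along a deterministic sweep. The haploid selection recursion, the function names, and the parameter values are illustrative assumptions rather than part of the derivation, and back-recombination is ignored. The corresponding GDCI probability is not reproduced here because it depends on quantities defined with Equation 9.

```python
def sweep_trajectory(x0, s, generations):
    """Deterministic haploid trajectory of the beneficial allele (assumed model)."""
    xs = []
    x = x0
    for _ in range(generations):
        xs.append(x)
        x = x * (1.0 + s) / (1.0 + s * x)  # standard haploid selection recursion
    return xs


def escape_probability(xs, r):
    """Probability that a lineage recombines away at least once, forward in time.

    Each generation the lineage moves onto a non-beneficial background with
    probability r * (1 - x); recombination back onto B is ignored.
    """
    stay = 1.0
    for x in xs:
        stay *= 1.0 - r * (1.0 - x)
    return 1.0 - stay


# Hypothetical parameters: starting frequency 1e-4, s = 0.05, 400 generations.
trajectory = sweep_trajectory(1e-4, 0.05, 400)
for r in (1e-4, 1e-3, 1e-2):
    print(r, round(escape_probability(trajectory, r), 4))
```

Because 1 – x shrinks only as the beneficial allele approaches fixation, the opportunity for recombination in this sketch lasts for most of the sweep, which is the contrast with the GDCI model drawn in the text.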
Furthermore, the two models predict different allele frequency distributions in genomic regions that are either close to or distant from the locus of selection (Figure 6). In particular, the excess of high-frequency derived alleles is much greater in the GDCI model than in the PS model. Interestingly, this excess is also predicted in genomic regions that are unlinked to the locus under selection (r = 0.5; Figure 6). This may increase false-positive detection of selective sweeps in an SFS-based analysis that assumes the PS model when the actual evolutionary process is closer to the GDCI model.
Here we consider possible explanations for the origin of this unique pattern of frequency spectrum in the GDCI model. First, the excess of high-frequency derived alleles might be due to recurrent migration that continues after the growth of pop2 is completed: neutral variants that hitchhike along with the adaptive allele reach high frequencies in pop2 but probably never become fixed in the population due to recurrent migration of ancestral variants from pop1. On the other hand, once neutral variants reach fixation in the PS model by hitchhiking they cannot become polymorphic again. If this explanation is correct, the excess of high-frequency derived alleles in the GDCI model should diminish as we limit the migration between the two populations after the initial establishment of pop2. Figure 7 shows the SFS obtained by frequency-based forward simulation in which recurrent migration between populations lasts only for 10, 20, 50, and 200 generations after the first founding copy of B enters pop2 (Td = 400). Contrary to this expectation, the U-shaped SFS persists for all lengths of the period in which migration is allowed. Therefore, the excess of high-frequency derived alleles in the GDCI model is not explained by continuous migration after the founding of pop2.
A more plausible explanation for the proportion of high-frequency derived alleles might be offered by considering the major difference in the expected shapes of neutral genealogies subject to hitchhiking in the PS vs. the GDCI model. With small rates of recombination, the genealogy at the linked neutral locus is expected to be one of the three types illustrated in Figure 8 if it could leave nonzero polymorphism in the sample of DNA sequences (it is assumed that the contribution of mutations occurring during or after the process of a selective sweep can be ignored). In the standard model of genetic hitchhiking or the PS model, recombination events during a selective sweep are likely to produce genealogies similar to the type I or type II trees shown in Figure 8: looking backward in time, each lineage may recombine onto a chromosome carrying the nonbeneficial allele (b) and escape the coalescence to other lineages linked to the beneficial allele. Then, the separate lineages that exit the selective phase undergo the neutral coalescent process, leading to long inner branches in the genealogy (FAY and WU 2000; KIM and NIELSEN 2004). Because the rate of recombination events is proportional to 1 – x and that of coalescent events is proportional to 1/x, where x is the frequency of the beneficial allele (KAPLAN et al. 1989), while x is decreasing backward in time, recombination events occur earlier than coalescent events on average. Therefore, each lineage that "migrates" to pop1 by recombination, not having experienced coalescence, is ancestral to only one chromosome in the current sample, thus producing a type I or II tree. On the other hand, in the GDCI model, both recombination and coalescent events occur at rates inversely proportional to the number of B haploids [
and
, respectively, in the derivation of Equation 9]. As these two events occur concurrently, when a lineage escapes the hitchhiking effect by a rare event of recombination onto a b chromosome, this lineage may be the common ancestor of a variable number of neutral lineages. This process can thus create the type III genealogy, in which the two lineages that exit the selective phase are ancestral to similar numbers of chromosomes in the sample. Both type I and type III trees can produce a distribution of derived-allele frequencies that is symmetrical around 0.5, because there are only two long inner branches where a mutation can occur and therefore the expected frequency of the mutant allele in the sample is 0.5. However, the expected heterozygosity is much lower with a type I than with a type III tree because the former results in derived alleles only at extreme frequencies (singleton polymorphism) in the sample. Then, it will be a type II tree (or any other with multiple independent lineages escaping coalescence by recombination), rather than type I, in the PS model that would produce a level of expected heterozygosity similar to that produced by a type III genealogy in the GDCI model. A type II genealogy does not produce a U-shaped distribution of derived-allele frequency: since new mutations are descended onto only one of three lineages that are connected by inner branches (i.e., three lineages that exit the selective phase) more often than they are descended onto two of the three lineages, the expected frequency of the derived allele in the sample is less than 0.5. We argue that this explains why there is a greater excess of high-frequency derived alleles in the GDCI model than in the PS model for a comparable reduction in the expected heterozygosity.
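The type II argument can be checked with a small calculation on three lineages that exit the selective phase and then coalesce neutrally. The sketch below is illustrative only: it assumes the standard neutral coalescent with time scaled so that each pair of lineages coalesces at rate 1, and it counts the derived allele among the three escaping lineages without weighting by the number of sampled chromosomes each one is ancestral to. The function names are hypothetical.

```python
import random


def expected_lengths_three_lineages():
    """Expected branch lengths for 3 lineages under the neutral coalescent."""
    t3 = 1.0 / 3.0  # mean time with 3 lineages (3 pairs, total rate 3)
    t2 = 1.0        # mean time with 2 lineages (1 pair, rate 1)
    subtend_one = 3.0 * t3 + t2  # 3 branches during t3, 1 unmerged branch during t2
    subtend_two = t2             # the merged branch during t2
    return subtend_one, subtend_two


def simulated_lengths_three_lineages(replicates, rng):
    """Monte Carlo check of the same two quantities."""
    one = 0.0
    two = 0.0
    for _ in range(replicates):
        t3 = rng.expovariate(3.0)
        t2 = rng.expovariate(1.0)
        one += 3.0 * t3 + t2
        two += t2
    return one / replicates, two / replicates


one, two = expected_lengths_three_lineages()
p_one = one / (one + two)  # chance a mutation is inherited by one lineage only
mean_freq = p_one * (1.0 / 3.0) + (1.0 - p_one) * (2.0 / 3.0)
print(p_one, mean_freq)
```

Under these assumptions a mutation lands on a branch ancestral to a single lineage twice as often as on a branch ancestral to two, so the mean derived frequency among the three lineages is 4/9, below 0.5, in line with the argument above.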
As a skew of the site frequency spectrum (deviation from the neutral equilibrium) and the pattern of linkage disequilibrium produced after a selective sweep are intimately related to each other due to a common underlying genealogy (KIM and NIELSEN 2004), it is also expected that a unique pattern of linkage disequilibrium will be generated under the GDCI model. In summary, our analyses predict significant differences in many aspects of genetic variation between the PS and GDCI models. This result further highlights the importance of correctly modeling the demographic/ecological background in the analysis of selective sweeps. For example, assuming the PS model, one may greatly overestimate the strength of selection based on the chromosomal span of reduced variation. It should also be noted that the estimation of demographic history from the genome-wide SFS (neutral variation at loci unlinked to the locus of selection) will be erroneous if the correct model is not explored (Figure 6).
APPENDIX A: DERIVATION OF...
Sampling probability in the derived population:
Under the infinite site model of molecular evolution, the probability of observing k neutral variants at a nucleotide site when n sequences are sampled (0 < k < n) is given by
[Equation A1]
(n, k, z, t) is the probability of a neutral mutant that starts at frequency z at time t and is found at frequency k/n in a sample of n sequences at present. Namely,
(n, k, z, t) is the expected contribution of neutral mutants at time t to the current polymorphism of size k. It decreases with increasing genetic drift between time t and present and thus depends on the profile of the effective population size during this period. With constant population size N, KIM (2006) found that
[Equation A2]
are obtained from recursions (KIM 2006). This solution can be easily generalized to populations experiencing step-wise size changes.
In the PS model given in Figure 1A, the probability that a neutral variant segregating at time t (originating from either the ancestral population or pop1) is found at frequency k/n1 in the current sample of n1 sequences in pop1 is
[Equation A3]
[Equation A4]
b and 0 otherwise.) Namely, we normalize time by the effective population size. The sampling probability of neutral variants in pop1 is obtained by integrating the contributions by all mutations in the past.
[Equation A5]
k
n1 – 1 and
0 = 2N0µ and
1 = 2N1µ are scaled mutation rates for ancestral population and pop1, respectively. Similarly, the sampling probability in pop2 (without selection at time Td) is
[Equation A6]
[Equation A7]
b = 2Nbµ and
2 = 2N2µ). The numerical solution to Equation A5 or A6 is very close to that of Equation 1 in MARTH et al. (2004), which was derived using the coalescent theory. As in the case of their solution, ours can be extended to single populations of more complicated demography as long as the population sizes change in steps. Furthermore, the current formula is more flexible than the solution of MARTH et al. (2004) in that it can be easily modified to include population divergence and selective sweeps (see below). We obtain this flexibility mainly because the derivation is based on the allele frequency dynamics forward in time, which allows more intuitive arrangement of terms.
Joint frequency spectrum from two divergent populations:
Let Pij be the probability that, at a given site, i copies of a derived allele are found in a sample of n1 sequences in pop1 and j derived alleles in a sample of n2 sequences in pop2. Here, in addition to the case of simultaneous polymorphism (0 < i < n1 and 0 < j < n2), we consider segregation in one population only (0 < i < n1 and j = 0 or n2, or i = 0 or n1 and 0 < j < n2) and fixed difference (i = 0 and j = n2, or i = n1 and j = 0). First, we examine the expected contribution of neutral mutations that arose in the ancestral population to the joint sampling probability. If such a mutation is either lost or fixed in the population before Td, it cannot generate any polymorphism or difference between two populations. Only an allele segregating in the ancestral population at time Td can thus contribute. The probability of finding a derived allele at a frequency interval [p, p+dp] in the ancestral population at Td is given by (
0/p)dp because the ancestral population is assumed to be in neutral equilibrium. Then the contribution to Pij is given by
[Equation A8]
(i, j) is well defined for all 0
i
n1, 0
j
n2.
Derived alleles that originate after Td can also be sampled in pop1 or pop2. Define
[Equation A9]
n1. This is the probability of sampling k neutral mutations that occurred in pop1 between Td and present. We consider sufficiently small Td and small
1 so that
(therefore,
1(n1) is well defined). Similarly, for pop2
[Equation A10]
[Equation A11]
Adding selective sweeps:
Next, we add directional selection to the model. We consider a hard selective sweep caused by a beneficial mutation arising at Td in pop2. The probability of joint polymorphism (sampling i mutants in pop1 and j mutants in pop2) at a linked neutral locus is given by
[Equation A12]
[Equation A13]
[Unnumbered equation]
(i, j) in Equation A11 by
hh(i, j).
APPENDIX B: SIMULATION METHODS
Individual-based simulation:
This method of simulating the GDCI model assumes that, when the successful migration of B haploids to pop2 occurs (Td generations back from the present in Figure 1B), the pattern of variation in pop1 follows that of the standard neutral equilibrium. Then, to obtain the expected heterozygosity at present, we need to observe only the increased identity by descent (due to events of coalescence) in both pop1 and pop2 in the period of Td generations. The simulation thus starts with N1 = K1 haploid individuals, represented as chromosomes carrying alleles at one neutral and one selected locus, in pop1. Then, M1 haploids are randomly chosen from pop1 and move to pop2, where M1 is Poisson distributed with mean N1m. Then, in both pop1 and pop2, each haploid produces a Poisson-distributed number of offspring according to the absolute fitness given by Equations 3 and 4. While the allele at the selected locus is passed to every offspring, the allele at the neutral locus is passed to the offspring with probability 1 – r. With probability r, the offspring receives the neutral allele of a randomly chosen individual in the parental generation. This process of reproduction is repeated for Td generations. (If N2 > 0, M2 haploids are randomly chosen from pop2 and move to pop1, where M2 is Poisson distributed with mean N2m.)
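The migration and reproduction steps described above might be sketched as follows. This is a minimal illustration, not the authors' program: the function names are hypothetical, the absolute fitness is passed in as a user-supplied function because Equations 3 and 4 are not reproduced in this appendix, and the Poisson sampler is a simple textbook method that is adequate only for small means.

```python
import math
import random


def poisson(rng, mean):
    """Poisson draw by Knuth's multiplication method (adequate for small means)."""
    if mean <= 0.0:
        return 0
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def migrate(source, dest, m, rng):
    """Move a Poisson(N * m) number of randomly chosen haploids from source to dest."""
    shuffled = source[:]
    rng.shuffle(shuffled)
    n_move = min(len(shuffled), poisson(rng, len(shuffled) * m))
    return shuffled[n_move:], dest + shuffled[:n_move]


def reproduce(pop, fitness, r, rng):
    """One generation of reproduction for haploids stored as (selected, neutral)."""
    offspring = []
    for selected, neutral in pop:
        # fitness(allele, population size) is a placeholder for Equations 3 and 4.
        for _ in range(poisson(rng, fitness(selected, len(pop)))):
            if rng.random() < r:
                # Recombination: take the neutral allele of a random parent.
                child_neutral = rng.choice(pop)[1]
            else:
                child_neutral = neutral
            offspring.append((selected, child_neutral))
    return offspring


# Toy run with hypothetical values; neutral alleles are unique ancestor labels.
rng = random.Random(0)
pop1 = [("B", 0)] + [("b", i) for i in range(1, 50)]
pop1, pop2 = migrate(pop1, [], 0.02, rng)
pop1 = reproduce(pop1, lambda allele, size: 1.0, 0.01, rng)
print(len(pop1), len(pop2))
```

Storing a unique label at the neutral locus of each starting haploid is one convenient way to follow identity by descent over the Td generations.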
The expected heterozygosity at the neutral locus is determined by measuring the increase of the identity by descent during simulation as described in KIM and STEPHAN (2003) and KIM and WIEHE (2009). We obtain
j, the probability that two randomly selected gene lineages in population j do not coalesce between time Td and 0 (when time runs backward). This quantity determines the expected heterozygosity at present: if new mutations at neutral loci can be ignored between time Td and 0, two randomly selected alleles from population j are different only if the two lineages do not coalesce between time Td and 0 and if the two ancestral alleles at time 0 are different (with probability 2K1µ, where µ is the mutation rate, assuming that the expected heterozygosity is approximated by that in the Wright–Fisher population with size K1). Therefore, the expected heterozygosity in population j is given by 2K1µ
j.
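As a rough illustration of this calculation, the sketch below estimates the probability of no coalescence from the ancestor labels carried by the haploids at the end of a simulation and scales it by 2K1µ. It assumes that each haploid at time Td was given a unique label at the neutral locus, which is one possible bookkeeping scheme and not necessarily the one used in the cited papers; the function names are hypothetical.

```python
from collections import Counter


def prob_not_coalesced(ancestor_labels):
    """Chance that two distinct sampled haploids trace to different ancestors."""
    n = len(ancestor_labels)
    if n < 2:
        raise ValueError("need at least two haploids")
    # Ordered pairs of distinct haploids that share the same ancestor label.
    same = sum(c * (c - 1) for c in Counter(ancestor_labels).values())
    return 1.0 - same / (n * (n - 1))


def expected_heterozygosity(k1, mu, ancestor_labels):
    """Expected heterozygosity: 2 * K1 * mu times the no-coalescence probability."""
    return 2.0 * k1 * mu * prob_not_coalesced(ancestor_labels)


# Hypothetical example: four haploids descending from two ancestors.
print(prob_not_coalesced([1, 1, 2, 2]))
print(expected_heterozygosity(1000, 1e-6, [1, 1, 2, 2]))
```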
Frequency-based simulation:
Next, to obtain the frequency spectrum in the GDCI model, we use the method of frequency-based forward-in-time simulation (KIM and WIEHE 2009). However, the two alleles at the neutral locus, A and a, here are defined by identity-by-state. Each generation, the numbers of individuals, n1, n2, n3, and n4 corresponding to AB, Ab, aB, and ab haploids, in both populations are updated to
according to the deterministic equations for recombination and migration. Then, the number in the next generation,
is determined by Poisson distribution with mean
for haplotype i, where Wi is obtained from the absolute fitness in Equations 4 and 5. We simulate hard selective sweeps in this model. Therefore, we start the simulation when there is one founding copy of B in pop2: at time 0, with the frequency of A being p0, {n1, n2, n3, n4} = {1, M1p0, 0, M1(1 – p0)} with probability p0 (the founding B copy is linked to A) or {0, M1p0, 1, M1(1 – p0)} with probability 1 – p0 (the founding B copy is linked to a) in pop2. The initial frequencies for pop1 are {n1, n2, n3, n4} = {0, K1p0, 0, K1(1 – p0)}; we simply ignore the frequency of B in pop2 to make sure the hard sweep happens in pop2. We draw p0 from the standard distribution of derived-allele frequency under neutral equilibrium (probability density of
). We run the simulation for Td generations, conditional on B being established in pop2, and then observe the final frequency of A (= p). By repeating this procedure, the site frequency spectrum at the neutral locus is obtained: the probability of observing j copies of A in a sample of k sequences is
, where f(p) is the empirical distribution of p obtained in the simulation.
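The first and last steps of this procedure might be sketched as below. Two assumptions are made because the corresponding expressions are not legible here: the neutral-equilibrium density of p0 is taken to be proportional to 1/p and truncated to [1/N, 1 – 1/N] so that it can be normalized, and the sampling step is taken to be binomial sampling averaged over the simulated final frequencies. The function names and the truncation bounds are hypothetical.

```python
import math
import random


def draw_initial_frequency(rng, n_pop):
    """Draw p0 with density proportional to 1/p on [1/n_pop, 1 - 1/n_pop]."""
    low = 1.0 / n_pop
    high = 1.0 - low
    # Inverse-CDF sampling: log(p) is uniform between log(low) and log(high).
    return low * (high / low) ** rng.random()


def sample_spectrum(final_freqs, k):
    """Average binomial sampling probabilities over simulated frequencies of A."""
    spectrum = [0.0] * (k + 1)
    for p in final_freqs:
        for j in range(k + 1):
            spectrum[j] += math.comb(k, j) * p ** j * (1.0 - p) ** (k - j)
    return [value / len(final_freqs) for value in spectrum]


# Hypothetical usage: final frequencies would come from the forward simulation.
rng = random.Random(1)
starts = [draw_initial_frequency(rng, 1000) for _ in range(5)]
print(starts)
print(sample_spectrum([0.1, 0.7, 0.95], 10))
```

Entries 1 through k – 1 of the returned list correspond to segregating sites in the sample; entries 0 and k collect replicates in which A is absent from or fixed in the sample.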
LITERATURE CITED
AKEY, J. M., 2009 Constructing genomic maps of positive selection in humans: Where do we go from here? Genome Res. 19: 711–722.
BARTON, N. H., 2000 Genetic hitchhiking. Philos. Trans. R. Soc. B. Biol. Sci. 355: 1553.
BEGUN, D. J., A. K. HOLLOWAY, K. STEVENS, L. W. HILLIER, Y. P. POH et al., 2007 Population genomics: whole-genome analysis of polymorphism and divergence in Drosophila simulans. PLoS Biol. 5: e310.
BIRKY, C. W., and J. B. WALSH, 1988 Effects of linkage on rates of molecular evolution. Proc. Natl. Acad. Sci. 85: 6414–6418.
CHEVIN, L-M., and F. HOSPITAL, 2008 Selective sweep at a quantitative trait locus in the presence of background genetic variation. Genetics 180: 1645–1660.
DOEBLEY, J. F., B. S. GAUT and B. D. SMITH, 2006 The molecular genetics of crop domestication. Cell 127: 1309–1321.
EYRE-WALKER, A., R. L. GAUT, H. HILTON, D. L. FELDMAN and B. S. GAUT, 1998 Investigation of the bottleneck leading to the domestication of maize. Proc. Natl. Acad. Sci. 95: 4441–4446.
FAY, J. C., and C.-I. WU, 2000 Hitchhiking under positive Darwinian selection. Genetics 155: 1405–1413.
GILLESPIE, J. H., 2000 Genetic drift in an infinite population: the pseudohitchhiking model. Genetics 155: 909–919.
GRIFFITHS, R. C., 2003 The frequency spectrum of a mutation, and its age, in a general diffusion model. Theor. Popul. Biol. 64: 241–251.
HAHN, M. W., 2008 Toward a selection theory of molecular evolution. Evolution 62: 255–265.
HERMISSON, J., and P. S. PENNINGS, 2005 Soft sweeps: molecular population genetics of adaptation from standing genetic variation. Genetics 169: 2335–2352.
INNAN, H., and Y. KIM, 2004 Pattern of polymorphism after strong artificial selection in a domestication event. Proc. Natl. Acad. Sci. 101: 10667.
INNAN, H., and Y. KIM, 2008 Detecting local adaptation using the joint sampling of polymorphism data in the parental and derived populations. Genetics 179: 1713–1720.
JENSEN, J. D., Y. KIM, V. BAUER DUMONT, C. F. AQUADRO and C. D. BUSTAMANTE, 2005 Distinguishing between selective sweeps and demography using DNA polymorphism data. Genetics 170: 1401–1410.
KAPLAN, N. L., R. R. HUDSON and C. H. LANGLEY, 1989 The "hitchhiking effect" revisited. Genetics 123: 887–899.
KIM, Y., 2006 Allele frequency distribution under recurrent selective sweeps. Genetics 172: 1967–1978.
KIM, Y., and R. NIELSEN, 2004 Linkage disequilibrium as a signature of selective sweeps. Genetics 167: 1513–1524.
KIM, Y., and W. STEPHAN, 2002 Detecting a local signature of genetic hitchhiking along a recombining chromosome. Genetics 160: 765–777.
KIM, Y., and W. STEPHAN, 2003 Selective sweeps in the presence of interference among partially linked loci. Genetics 164: 389–398.
KIM, Y., and T. WIEHE, 2009 Simulation of DNA sequence evolution under models of recent directional selection. Briefings Bioinformatics 10: 84–96.
LAMASON, R. L., M. MOHIDEEN, J. R. MEST, A. C. WONG, H. L. NORTON et al., 2005 SLC24A5, a putative cation exchanger, affects pigmentation in zebrafish and humans. Science 310: 1782–1786.
LEE, C. E., 2002 Evolutionary genetics of invasive species. Trends Ecol. Evol. 17: 386–391.
LEE, C. E., and G. W. GELEMBIUK, 2008 Evolutionary origins of invasive populations. Evolutionary Appl. 1: 427–448.
LI, H., and W. STEPHAN, 2006 Inferring the demographic history and rate of adaptive substitution in Drosophila. PLoS Genet. 2: e166.
MACPHERSON, J. M., G. SELLA, J. C. DAVIS and D. A. PETROV, 2007 Genomewide spatial correspondence between nonsynonymous divergence and neutral polymorphism reveals extensive adaptation in Drosophila. Genetics 177: 2083.
MARTH, G. T., E. CZABARKA, J. MURVAI and S. T. SHERRY, 2004 The allele frequency spectrum in genome-wide human variation data reveals signals of differential demographic history in three large world populations. Genetics 166: 351–372.
MAYNARD SMITH, J., and J. HAIGH, 1974 The hitch-hiking effect of a favorable gene. Genet. Res. 23: 23–35.
MYLES, S., M. SOMEL, K. TANG, J. KELSO and M. STONEKING, 2007 Identifying genes underlying skin pigmentation differences among human populations. Hum. Genet. 120: 613–621.
NAIR, S., J. T. WILLIAMS, A. BROCKMAN, L. PAIPHUN, M. MAYXAY et al., 2003 A selective sweep driven by pyrimethamine treatment in Southeast Asian malaria parasites. Mol. Biol. Evol. 20: 1526–1536.
NIELSEN, R., 2005 Molecular signatures of natural selection. Annu. Rev. Genet. 39: 197–218.
NIELSEN, R., S. WILLIAMSON, Y. KIM, M. J. HUBISZ, A. G. CLARK et al., 2005 Genomic scans for selective sweeps using SNP data. Genome Res. 15: 1566.
ORR, H. A., and A. J. BETANCOURT, 2001 Haldane's sieve and adaptation from the standing genetic variation. Genetics 157: 875–884.
PARRISH, C. R., E. C. HOLMES, D. M. MORENS, E. C. PARK, D. S. BURKE et al., 2008 Cross-species virus transmission and the emergence of new epidemic diseases. Microbiol. Mol. Biol. Rev. 72: 457–470.
PENNINGS, P. S., and J. HERMISSON, 2006 Soft sweeps II—molecular population genetics of adaptation from recurrent mutation or migration. Mol. Biol. Evol. 23: 1076–1084.
PRZEWORSKI, M., 2003 Estimating the time since the fixation of a beneficial allele. Genetics 164: 1667–1676.
PRZEWORSKI, M., G. COOP and J. D. WALL, 2005 The signature of positive selection on standing genetic variation. Evolution 59: 2312–2323.
ROSENBERG, N. A., and M. NORDBORG, 2002 Genealogical trees, coalescent theory and the analysis of genetic polymorphisms. Nat. Rev. Genet. 3: 380–390.
SABETI, P. C., S. F. SCHAFFNER, B. FRY, J. LOHMUELLER, P. VARILLY et al., 2006 Positive natural selection in the human lineage. Science 312: 1614–1620.
SCHLENKE, T. A., and D. J. BEGUN, 2004 Strong selective sweep associated with a transposon insertion in Drosophila simulans. Proc. Natl. Acad. Sci. 101: 1626–1631.
STEPHAN, W., T. H. E. WIEHE and M. W. LENZ, 1992 The effect of strongly selected substitutions on neutral polymorphism: analytical results based on diffusion theory. Theor. Popul. Biol. 41: 237–254.
TESHIMA, K. M., and M. PRZEWORSKI, 2006 Directional positive selection on an allele of arbitrary dominance. Genetics 172: 713–718.
THORNTON, K. R., and J. D. JENSEN, 2007 Controlling the false-positive discovery rate in multilocus genome scans. Genetics 175: 737–750.
THORNTON, K. R., J. D. JENSEN, C. BECQUET and P. ANDOLFATTO, 2007 Progress and prospects in mapping recent selection in the genome. Heredity 98: 340–348.
WANG, R. L., A. STEC, J. HEY, L. LUKENS and J. DOEBLEY, 1999 The limits of selection during maize domestication. Nature 398: 236–239.
WILLIAMSON, S. H., M. J. HUBISZ, A. G. CLARK, B. A. PAYSEUR, C. D. BUSTAMANTE et al., 2007 Localizing recent adaptive evolution in the human genome. PLoS Genet. 3: e90.
WOOTTON, J. C., X. FENG, M. T. FERDIG, R. A. COOPER, J. MU et al., 1999 Genetic diversity and chloroquine selective sweeps in Plasmodium falciparum. Plant Mol. Biol. 50: 333–359.
WRIGHT, S. I., I. V. BI, S. G. SCHROEDER, M. YAMASAKI, J. F. DOEBLEY et al., 2005 The effects of artificial selection on the maize genome. Science 308: 1310–1314.
Communicating editor: R. NIELSEN