Abstract
Genetic diversity is essential for population survival and adaptation to changing environments. Demographic processes (e.g., bottleneck and expansion) and spatial structure (e.g., migration, number, and size of populations) are known to shape the patterns of the genetic diversity of populations. However, the impact of temporal changes in migration on genetic diversity has seldom been considered, although such events might be the norm. Indeed, during the millions of years of a species’ lifetime, repeated isolation and reconnection of populations occur. Geological and climatic events alternately isolate and reconnect habitats. We analytically document the dynamics of genetic diversity after an abrupt change in migration given the mutation rate and the number and sizes of the populations. We demonstrate that during transient dynamics, genetic diversity can reach unexpectedly high values that can be maintained over thousands of generations. We discuss the consequences of such processes for the evolution of species based on standing genetic variation and how they can affect the reconstruction of a population’s demographic and evolutionary history from genetic data. Our results also provide guidelines for the use of genetic data for the conservation of natural populations.
Genetic diversity in a population of constant size results from the balance between the occurrence of new mutations and the loss of alleles by genetic drift (Fisher 1922; Wright 1931; Kimura and Crow 1964). The expected population genetic diversity can thus be estimated from the effective population size and the mutation rate in the population. In subdivided populations this estimate should further account for the strength of migration (Maruyama 1970; Smith 1970; Nei 1973): limited migration allows for strong differentiation between populations, while strong migration tends to homogenize genetic diversity between populations. Genetic diversity is also known to be affected by population demographic changes; following bottlenecks and founder events, a loss of genetic diversity is expected to occur (Nei et al. 1975). Recently, spatial population expansions were shown to lead to increased differentiation between populations and to generate a low level of genetic diversity at the front of the expansion (Excoffier et al. 2009).
Although theoretical studies on the dynamics of genetic diversity in subdivided populations started appearing in the 1970s (Nei and Feldman 1972; Latter 1973; Nei 1973; Nagylaki 1974, 1977), the transient dynamics and nonequilibrium states of genetic diversity still do not have a good theoretical basis. Early authors characterized the ultimate rate of change of genetic diversity after a perturbation (either a change in population size or gene flow; Nei and Feldman 1972; Latter 1973; Nei 1973; Nagylaki 1974, 1977). They found that changes in genetic diversity are related to the total effective population size, which results in a slow dynamics of genetic diversity change. They thus first highlighted that nonequilibrium states and transient dynamics are expected to act on very large temporal scales. In particular, they showed that decreases in migration rates (population fragmentation or isolation) have long-term effects on genetic diversity: they reduce the amount of genetic diversity within populations and allow for population differentiation (Latter 1973; Takahata and Nei 1985). Additionally, it has been shown that short timescale random fluctuations in migration increase population differentiation (Nagylaki 1979; Whitlock 1992; Rice and Papadopoulos 2009) while cyclic fluctuations of gene flow (such as seasonal fluctuations) mainly affect genetic diversity within populations (Karlin 1982; Shpak et al. 2010). Although the genetic consequences of migration events (admixture) have recently received much attention (e.g., Pritchard et al. 2000; Falush et al. 2003; Price et al. 2009; Gravel 2012), their impact on genetic diversity and more particularly the expected induced transient dynamics have not received much attention.
Genetic diversity is crucial for identifying populations at risk of extinction and for assessing species’ adaptive potential. Current genetic diversity characterizes species at risk of extinction through inbreeding depression, loss of genetic diversity, and accumulation of deleterious mutations (Gilpin and Soule 1986; Jimenez et al. 1994; Frankham 1995; Hedrick and Kalinowski 2000). The current level of genetic diversity (or standing genetic variation) is now widely recognized as a determinant for the adaptation of a population to a novel environment (Turner et al. 1993; Feder et al. 2003; Pelz et al. 2005; Colosimo et al. 2005; Hermisson and Pennings 2005; Myles et al. 2005; Hernandez et al. 2011; Jones et al. 2012). First, under new selective pressures, the adaptive value of a preexisting allele can switch from neutral or deleterious to beneficial (Gibson and Dworkin 2004; Hermisson and Pennings 2005). Second, alleles from the standing genetic variation are present at higher frequencies in the population than any newly arisen (de novo) mutation; thus, they have higher fixation probabilities and lower times to fixation (Barrett and Schluter 2008). Finally, these alleles have already passed successive selective filters and are consequently more likely to be compatible with the background genome (Orr and Betancourt 2001; Schluter et al. 2004; Barrett and Schluter 2008).
Measures of genetic diversity are widely used to understand and infer the demographic and evolutionary history of populations. Indeed, statistical tests using polymorphism data can detect departure from neutrality and infer demographic or selective processes (e.g., Ewens 1972; Watterson 1978; Tajima 1983; Fu and Li 1993; Fay and Wu 2000; see review in Kreitman 2000). Furthermore, due to recent modeling advances in coalescent theory and increased genomic data and computational power, it is now possible to distinguish different demographic scenarios (e.g., population bottleneck and subdivision; Peter et al. 2010) and estimate demographic and selective parameters (e.g., population size and growth rate, proportion of admixture, selection coefficient) using polymorphism data (Beaumont et al. 2002; Kim and Stephan 2002; Kim and Nielsen 2004; Nielsen et al. 2005; Price et al. 2009). Nevertheless, it is often difficult to distinguish between the transient effects of demographic changes and the effects of selection on polymorphism data (Jensen et al. 2005; Nielsen 2005; Li and Stephan 2006; Kim and Gulisija 2010; Pavlidis et al. 2010). It is also difficult to distinguish between the signatures of different demographic changes such as changes in population size, number, or migration rate (Wakeley 1999). A better understanding of the impact on genetic data of transient dynamics during demographic changes is necessary to disentangle these processes.
Interestingly, although the impact of population subdivision and short timescale population demographic changes on genetic diversity has received considerable attention, other processes, such as long-term isolation and subsequent population reconnection, have received little attention. Such events have, without a doubt, occurred several times in the past, at long and short timescales. Repeated environmental changes have modified habitats and species distribution and created isolation and reconnection of populations. For example, during the climatic oscillations of the Quaternary period, temperate and tropical species were successively isolated into refugia and experienced habitat and population expansion, allowing for population reconnection (Hewitt 2000, 2004; Zhang et al. 2008; Young et al. 2009). At the same time, the reduction of sea levels (120 m lower than present; Lambeck et al. 2004) allowed the formation of land bridges that connected isolated lands in several parts of the world (Hewitt 2000). Repeated changes in water level resulted in fragmentation and fusion of basins within continents (as in the Great African Lakes; Galis and Metz 1998; Sturmbauer et al. 2001). Similarly, geological events such as volcanic eruptions induced periodic isolation and reconnection of islands (Cook 2008), while tectonic processes such as the formation of mountains isolated populations and reconnected others (Hughes and Eastwood 2006; Antonelli et al. 2009; Antonelli and Sanmartín 2011). More recently, climatic, environmental, and anthropogenic changes (e.g., global warming, urbanization, and agriculture) have also played important roles in modifying the connectivity pattern between populations (Miller and Hobbs 2002; Delaney et al. 2010). Consequently, some species are currently subdivided into poorly connected or completely isolated populations; for example, ground beetles (Keller et al. 2004), salamanders (Noel et al. 2007), and crickets (Vandergast et al. 2009).
In the meantime, other species experience habitat and population expansion (e.g., sparrows, white-tailed deer, zebra mussels; Waples 2010). Isolation and reconnection of populations not only reflect abiotic processes, but they can also represent spatial and temporal interactions of populations (e.g., secondary contacts; Green et al. 2010; Domingues et al. 2012). Consequently, transient states of genetic diversity are expected to be the norm and deserve much more attention.
In this study, we analytically characterized the dynamics of genetic diversity following a change in migration rate between populations, given any migration rate, mutation rate, population size, and degree of fragmentation. We first analyzed how genetic diversity is affected by an event of isolation of populations and by an event of reconnection of populations. We then generalized our results for situations where the migration rate between populations displays strong variation. We demonstrate that temporal changes of migration generate periods where genetic diversity reaches unexpectedly high values that can be maintained over thousands of generations. We also show that migration changes can produce a signature on summary statistics such as Tajima’s D and Ewens–Watterson’s statistics that cannot be differentiated from a signature of population size change or from the signature of selection. Finally, we discuss how such processes can affect observed macroevolutionary patterns of species diversity and how they can affect the reconstruction of populations’ demographic and evolutionary history from genetic data.
Genetic Diversity of Populations
To study the dynamics of genetic diversity after connectivity changes, we consider diploid individuals in a finite island model composed of n random mating populations of size N, so that the total population size is nN. The populations exchange migrants at a rate m. The mutations follow the infinite allele model (each mutation produces a new allele; Kimura and Crow 1964) and occur at a rate μ. The generations are nonoverlapping (Wright–Fisher model; Fisher 1930; Wright 1931).
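As a rough numerical companion to the model described above, the following Python sketch writes one generation of the finite island model as a linear recursion on the within- and between-population identities by descent and solves for its equilibrium. It assumes a commonly used textbook form of this recursion and takes diversity as one minus identity; the coefficient names `a`, `b`, `c`, `u` and the function names are illustrative choices, not notation taken from the paper, and the parameter values are only examples.

```python
import numpy as np

def identity_recursion(n, N, m, mu):
    """Return (A, k) such that F' = A @ F + k, with F = (F_s, F_b).

    Assumed form of the finite island recursion (hypothetical helper,
    not the paper's own notation):
      a : probability that two genes now in the same population
          came from the same population before migration
      b : same probability for two genes now in different populations
      c : probability that two genes sampled in one population
          are not copies of the same parental gene
      u : probability that neither gene mutated
    """
    a = (1 - m) ** 2 + m ** 2 / (n - 1)
    b = (1 - a) / (n - 1)
    c = 1 - 1 / (2 * N)
    u = (1 - mu) ** 2
    A = u * np.array([[a * c, 1 - a],
                      [b * c, 1 - b]])
    k = u * np.array([a, b]) / (2 * N)
    return A, k

def equilibrium_diversity(n, N, m, mu):
    """Equilibrium (H_s, H_b), taking diversity as 1 - identity."""
    A, k = identity_recursion(n, N, m, mu)
    F_eq = np.linalg.solve(np.eye(2) - A, k)
    return 1 - F_eq

# Example parameter values (chosen for illustration)
Hs_iso, Hb_iso = equilibrium_diversity(n=10, N=2500, m=0.0, mu=1e-5)
Hs_con, Hb_con = equilibrium_diversity(n=10, N=2500, m=1e-4, mu=1e-5)
print(f"isolated:  Hs = {Hs_iso:.4f}, Hb = {Hb_iso:.4f}")
print(f"connected: Hs = {Hs_con:.4f}, Hb = {Hb_con:.4f}")
```

With migration set to zero the two identities decouple, so between-population diversity goes to 1 while within-population diversity settles at the single-population mutation-drift balance.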
Genetic diversity, H, is estimated using the identity-by-descent F between pairs of alleles, through the relationship provided by Nei and Feldman (1972): H = 1 − F.
Predicting the Dynamics of Genetic Diversity
To characterize the impact of connectivity changes on genetic diversity, we analyzed the trajectories of within- and between-population genetic identities from any initial genetic identity state. Using Equation 2, Smith (1970) and Maruyama (1970) showed that genetic identities converge toward an equilibrium value Feq (value given in Supporting Information, File S1). Extending the results obtained by Nei and Feldman (1972) for n = 2 populations, we show that the temporal dynamics of genetic identities follow (see Appendix A for more details)
In the next section, we provide from Equations 4a and 5 the temporal change of genetic diversity and derive the corresponding time to reach genetic diversity equilibrium after a connectivity change.
Time to Reach Genetic Diversity Equilibrium
The change of genetic diversity can be decomposed in two main temporal dynamics: a long-term and a short-term dynamics. Indeed, the temporal change of genetic diversity depends on two components:
When migration and mutation rates are small (i.e., m ≪ 1 and μ ≪ 1) and local population sizes are large (i.e., N ≫ 1), the decay constants r1 and r2 follow
We can estimate the durations of the ultimate and transient changes of genetic diversity, denoted t1 and t2, respectively. Formally, we define t1 and t2 as the times (in number of generations) needed for
The genetic diversity changes as follows (Figure 1): (i) a convergence of duration t2 from the initial genetic diversity value to a transient genetic diversity value and then (ii) a convergence of duration t1 to the genetic diversity equilibrium Heq. The time to reach genetic diversity equilibrium, t1, depends only on two terms: the mutation rate (term 2μ) and the genetic drift at the total population level (term
The time (in number of generations) t1 to reach genetic diversity equilibrium and the length of the transient dynamics period t2 as a function of the migration rate m. The solid line corresponds to n = 2 populations, the dashed line to n = 10, and the dotted line to n = 100. t1 is always at least one order of magnitude higher than t2. This separation of the two periods becomes even greater when . Parameter values are N = 2500, μ = 10−5, α = 5%.
Dynamics of Genetic Diversity After an Isolation Event
We analyzed the dynamics of genetic diversity after an isolation event, starting with a situation in which populations are connected and at their equilibrium value; i.e., within- and between-population genetic diversity Hs and Hb are at the expected connection equilibrium values
We observe (Figure 2) that immediately after an isolation event, within-population genetic diversity decreases due to genetic drift to the point where it reaches the mutation-drift equilibrium of an isolated population
Dynamics of (A) within-population Hs and (B) between-population Hb genetic diversity after an isolation event. Within- and between-population diversities (solid lines) were previously at their respective connection equilibria. After the isolation event, they reach their respective isolation equilibria (dashed lines) at rates determined by r2 and r1 (Equation 6). t2 and t1 estimate the time to reach the within- and between-population genetic diversity equilibrium, respectively (Equation 8). Under the effect of genetic drift, within-population diversity reaches its equilibrium value faster than between-population genetic diversity. Parameters are n = 10, N = 2500, m = 10−4 before isolation and m = 0 afterward, and μ = 10−5. t1 ≃ 149,000 generations and t2 ≃ 13,600 generations (for α = 5%).
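The two timescales quoted in this caption can be approximated numerically. The sketch below is a hedged illustration: it assumes a standard finite-island identity recursion and reads t1 and t2 as the number of generations after which the slow and fast exponential terms have decayed to a fraction α of their initial size. Both the recursion form and that reading of α are assumptions, and `decay_times` is a hypothetical helper name.

```python
import numpy as np

def decay_times(n, N, m, mu, alpha=0.05):
    """Return (t1, t2): generations for the slow and fast modes
    of the identity recursion to shrink to a fraction alpha."""
    a = (1 - m) ** 2 + m ** 2 / (n - 1)   # same-population origin, genes sampled together
    b = (1 - a) / (n - 1)                 # same-population origin, genes sampled apart
    c = 1 - 1 / (2 * N)                   # two distinct parental genes
    u = (1 - mu) ** 2                     # no mutation on either lineage
    A = u * np.array([[a * c, 1 - a],
                      [b * c, 1 - b]])
    # Eigenvalues are real here; sort so the slow mode (largest) comes first
    lam = np.sort(np.linalg.eigvals(A).real)[::-1]
    t1, t2 = np.log(alpha) / np.log(lam)
    return t1, t2

# After isolation (m = 0) and after reconnection (m = 1e-4), other values as in the captions
t1_iso, t2_iso = decay_times(n=10, N=2500, m=0.0, mu=1e-5)
t1_con, t2_con = decay_times(n=10, N=2500, m=1e-4, mu=1e-5)
print(f"isolation:    t1 = {t1_iso:,.0f}, t2 = {t2_iso:,.0f} generations")
print(f"reconnection: t1 = {t1_con:,.0f}, t2 = {t2_con:,.0f} generations")
```

Under these assumptions the computed values fall close to the figures given in the captions (on the order of 10^5 generations for t1 and 10^4 for t2), which suggests, without proving, that this reading of the definitions is the intended one.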
The decrease of population genetic diversity (within) can occur quickly relative to population differentiation (between-population genetic diversity; see Figure 1). After an isolation event, within-population genetic diversity (Hs) remains above its expected equilibrium
Dynamics of Genetic Diversity After a Connection Event
We analyzed the dynamics of genetic diversity after a connection event, starting with a situation in which populations are isolated and at their equilibrium value
Within- and between-population diversities change successively according to two timescales: first, a fast transient dynamics, followed by a slow asymptotic dynamics (separation of timescales is derived in Appendix A and illustrated in Figure 3). Because the transient dynamics can be shorter than the asymptotic dynamics, the excess of genetic diversity (ΔHs and ΔHb) can be maintained for a very long period (from Figure 1, t1 is longer than 10,000 generations).
Peak of Genetic Diversity Generated by a Connection Event
In this section, we characterize the peak of within-population genetic diversity, ΔHs, and the excess of between-population genetic diversity, ΔHb, observed after a connection event as a function of the mutation rate, the genetic drift, the number of populations, and the migration rate after connection. The exact value of the within-population genetic diversity peak is represented in Figure 4. Assuming that migration and mutation rates are small, we can show that good approximations of the values of ΔHs and ΔHb are (see derivations in Appendix B)
Dynamics of (A) within-population genetic diversity Hs and (B) between-population genetic diversity Hb after a reconnection event. Within- and between-population diversities were previously at their respective isolation equilibria (Equation 4b). After the reconnection event, they reach their respective connection equilibria (dashed lines). As shown in Equation 8, the time to reach genetic diversity equilibrium t1 and the length of the transient period t2 are well separated. The two periods are: (1) fast convergence at a rate determined by r2 (Equation 6) that is driven by the spread of diversity that had accumulated within populations during isolation, which creates the peak of within-population diversity (ΔHs) and the excess of between-population diversity (ΔHb) and (2) slow dynamics at a rate determined by r1 (Equation 6) that is caused by the gradual loss of genetic diversity. A large number of generations is needed to reach equilibrium. When n = 10, N = 2500, m = 10−4 after reconnection, and μ = 10−5, t1 ≃ 97,000 generations, t2 ≃ 6,900 generations (for α = 5%) and ΔHs ≃ ΔHb ≃ 0.11.
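The reconnection scenario in this caption can be explored by iterating an identity recursion forward in time. The sketch below is illustrative only: it assumes a standard finite-island recursion, starts from the isolation equilibrium, switches migration on, and takes the peak ΔHs to be the maximum of Hs over time minus the connection equilibrium (my reading of "peak", not a definition quoted from the paper). The helper names are hypothetical.

```python
import numpy as np

def recursion(n, N, m, mu):
    """Return (A, k) with F' = A @ F + k for F = (F_s, F_b)."""
    a = (1 - m) ** 2 + m ** 2 / (n - 1)
    b = (1 - a) / (n - 1)
    c = 1 - 1 / (2 * N)
    u = (1 - mu) ** 2
    A = u * np.array([[a * c, 1 - a], [b * c, 1 - b]])
    k = u * np.array([a, b]) / (2 * N)
    return A, k

def equilibrium(n, N, m, mu):
    """Equilibrium identities (F_s, F_b)."""
    A, k = recursion(n, N, m, mu)
    return np.linalg.solve(np.eye(2) - A, k)

n, N, mu, m_after = 10, 2500, 1e-5, 1e-4

F = equilibrium(n, N, 0.0, mu)              # populations start isolated
A, k = recursion(n, N, m_after, mu)         # migration is switched on
Hs_eq = 1 - equilibrium(n, N, m_after, mu)[0]

Hs = []
for _ in range(30_000):                     # long enough to pass the peak
    F = A @ F + k
    Hs.append(1 - F[0])

peak_generation = int(np.argmax(Hs)) + 1
delta_Hs = max(Hs) - Hs_eq
print(f"peak at generation {peak_generation}, excess over equilibrium = {delta_Hs:.3f}")
```

Within-population diversity overshoots its eventual equilibrium because the fast mode spreads previously private alleles across populations well before the slow mode removes the surplus.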
Peak of within-population genetic diversity ΔHs generated by a reconnection event. (A) Contour plot of ΔHs as a function of θ and M, for n = 100. We can clearly see the highest peak of diversity in the high M and low θ region. (B) Contour plot of the peak of genetic diversity after a reconnection event as a function of θ and n, for M ≫ 1 (high M region identified in A). In the high M region, the within-population diversity peak ΔHs and the between-population diversity excess ΔHb are equal. The dashed line represents the number of populations that maximizes the peak of diversity, n*.
The peak of genetic diversity increases with the difference between the two timescales (ΔHs and ΔHb increase with
In the domain where the peak is the largest (M ≫ 1 and θ ≪ 1), ΔHs and ΔHb reach the same value:
The corresponding peak of genetic diversity, reached at n*, is
In summary, high peaks of genetic diversity (ΔHs > 0.25 in Figure 4) can occur for a large range of the parameter space: when mutation is weak (θ < 0.05) and migration is moderate to strong (M > 0.5). Under these conditions, drastic genetic diversity changes can be observed (ΔHs values >0.95; Figure 4B for M ≥ 50 and θ < 5 × 10−4). The number of populations that maximizes the peak of diversity, n*, ranges from a few populations when θ ≃ 1, up to a few hundred populations when θ = 10−6 (values of θ < 10−6 are expected to be very rare, and they would require a mutation rate lower than 2.5 × 10−12/bp for a 1-kb gene and a population size of 100). Interestingly, a significant peak of genetic diversity is also observed when only two populations reconnect (ΔHmax|n=2 = 0.5; Figure 4B).
Peak of Genetic Diversity Resulting from a Migration Rate Increase
Complete isolation of populations is not required to generate peaks of genetic diversity. Indeed, an abrupt increase of migration can generate the peak of genetic diversity characterized in the previous sections. In File S3, we determined that if migration crosses a threshold value MT, peaks of genetic diversity can occur. The value of the threshold MT, assuming that m ≪ 1 and μ ≪ 1, is
Implications for the Inference of Demography and Selection
To describe the impact of migration changes on the inference of demography and selection from genetic data, we described the dynamics of two broadly used summary statistics: the Ewens–Watterson statistics (Watterson 1978) and Tajima’s D (Tajima 1989). Both the Ewens–Watterson statistics and Tajima’s D are known to detect an excess (resp. deficit) of rare alleles, which induces negative (resp. positive) values of the statistics, compared with the expected neutral equilibrium (constant size population without selection). Usually, an excess of rare alleles is interpreted either as the signature of balancing selection or population expansion, and a deficit of rare alleles is interpreted as the signature of directional selection or as a population bottleneck. We used the Ewens–Watterson statistics, which we denote HEW and follows (Watterson 1978)
We simulated samples of 50 sequences of 1 kb, with a per-nucleotide mutation rate of 2 × 10−8, in four populations of size 2500, and ran 5000 replicate simulations. We simulated an isolation event, where the migration rate changed from 0.002 to 0 and a reconnection event in which the migration rate changed from 0 to 0.002. The simulations were performed with the software fastsimcoal (Excoffier and Foll 2011), and the data analysis was performed with Arlequin (Excoffier and Lischer 2010). We simulated samples from the same population (with the parameter values that were used, the sampling scheme had a very weak impact; see Chikhi et al. 2010 for a discussion of how the sampling scheme affects the values of Tajima’s D). To allow for convergence of the coalescent algorithm, we always assumed that the populations were connected prior to the isolation phase. In the reconnection event simulations, we set the duration of the isolated phase to 10N, which allowed the genetic diversity values to reach their equilibrium value.
We followed the dynamics of the statistics and estimated their distribution as a function of time. The results in Figure 5 show that an isolation event produces the same signature on the Ewens–Watterson statistics (Figure 5A) and Tajima’s D (Figure 5D), as expected from a bottleneck event and from directional selection. Indeed, following an isolation event, genetic drift first causes the elimination of rare alleles and then eliminates more common alleles. Consequently, the number of alleles K decreases more quickly than genetic diversity Hs (Figure 5, B and C). Similarly, the number of segregating sites S decreases faster than the number of pairwise differences π (Figure 5, E and F). Therefore, DT and HEW are skewed toward positive values, as expected after a bottleneck or under the effect of directional selection. Moreover, the statistics remain skewed for a long period of time (<10,000 generations in our simulations, see Figure 5).
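To make the link between S, π, and the sign of Tajima’s D concrete, here is a short Python sketch of the statistic as defined by Tajima (1989). The function name `tajimas_d` and the argument `n_samples` (the number of sampled sequences, not the number of populations n used elsewhere) are illustrative, and the input values are invented for the example rather than taken from the simulations.

```python
import math

def tajimas_d(n_samples, S, pi):
    """Tajima's D from the sample size, the number of segregating
    sites S, and the mean number of pairwise differences pi."""
    if S <= 0:
        raise ValueError("Tajima's D is undefined without segregating sites")
    a1 = sum(1 / i for i in range(1, n_samples))
    a2 = sum(1 / i ** 2 for i in range(1, n_samples))
    b1 = (n_samples + 1) / (3 * (n_samples - 1))
    b2 = 2 * (n_samples ** 2 + n_samples + 3) / (9 * n_samples * (n_samples - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n_samples + 2) / (a1 * n_samples) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Numerator: difference between two estimators of the scaled mutation rate
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))

# Invented example values: same S, different pi
d_high_pi = tajimas_d(n_samples=50, S=10, pi=4.0)   # pi large relative to S
d_low_pi = tajimas_d(n_samples=50, S=10, pi=1.0)    # pi small relative to S
print(d_high_pi, d_low_pi)
```

When S falls faster than π, the term S/a1 drops below π and the statistic becomes positive, which is the direction of skew described above for an isolation event.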
Effect of an isolation event on Ewens–Watterson and Tajima’s D neutrality tests and on related summary statistics. (A) Ewens–Watterson statistics (HEW), (B) genetic diversity (Hs), (C) number of alleles (K), (D) Tajima’s D (DT), (E) number of pairwise differences (π), and (F) number of segregating sites (S). For each statistic, the solid line represents the median of the distribution and the light shading represents the 97.5 and 2.5% quantiles of the distribution as a function of the number of generations t after the isolation event. Dark shading in A and D represents the expected distribution of the statistics in an isolated population at equilibrium. Values of HEW and D after an isolation event are skewed toward positive values (signature of a bottleneck or directional selection), while there was no change in the size of the population. K and S decrease more quickly than Hs and π, because rare alleles are eliminated by genetic drift more quickly than common alleles. Coalescence simulations of a 1-kb locus with a mutation rate of 2 × 10−8/bp, where four populations of size 2500 are isolated; 5000 replicates.
Results in Figure 6 show that a reconnection event can successively produce the same signature as expected from a population expansion or from a bottleneck event on HEW (Figure 6A) and DT (Figure 6D). Indeed, following a reconnection event, migrants first create an excess of rare variants. The number of alleles K increases more quickly than the genetic diversity Hs (Figure 6, B and C), and the number of segregating sites S increases faster than the number of pairwise differences π (Figure 6, E and F), which skews HEW and DT toward negative values. Second, new alleles brought by migrants increase in frequency, creating an excess of common variants. Consequently Hs increases more than K, and π increases more than S, which skews DT and HEW toward positive values.
Effect of a reconnection event on Ewens–Watterson and Tajima’s D neutrality tests and on related summary statistics. (A) Ewens–Watterson statistics (HEW), (B) genetic diversity (Hs), (C) number of alleles (K), (D) Tajima’s D (DT), (E) number of pairwise differences (π), and (F) number of segregating sites (S). For each statistic, the solid line represents the median of the distribution, and the light shading represents the 97.5% and 2.5% quantiles of the distribution, as a function of the number of generations t after the reconnection event. Dark shading in A and D represents the expected distribution of the statistics in an isolated equilibrium population. Values of HEW and D after a reconnection event are first skewed toward negative values (signature of a population expansion or balancing selection) and then toward positive values (signature of a bottleneck or directional selection), while there was no change in the size of the population. K and S first increase more quickly than Hs and π because immigrants bring rare alleles, and then Hs and π reach a higher value because immigrant alleles increase in frequency. Finally, alleles are eliminated by genetic drift until the statistics reach their expected equilibrium value when populations are connected. Coalescence simulations of a 1 kb locus with a mutation rate of 2 × 10−8 per bp, where 4 populations of size 2500 isolated during 25,000 generations are reconnected with a migration rate m = 0.002; 5000 replicates.
Interestingly, the observed durations of the periods in which both statistics are skewed are similar to the expected durations of the dynamics of genetic diversity (from Equation 8). After an isolation event, we observe, in Figure 5, that all statistics reach their equilibrium value within ∼10,000 generations (4N generations). This duration corresponds to the time required to reach within-population genetic diversity equilibrium after an isolation event, t2 ≃ 12,000 generations (4.8N generations, estimated from Equation 8 with α = 5%). In this example, genetic drift is stronger than mutation (2μ ≪ 1/2N) and thus t2 ∼ 2N. t2 corresponds to the duration of the period in which the deficit of rare alleles skews the distribution of HEW and DT. After a reconnection event, we observe (Figure 6) that HEW and DT reach a “peak” within ∼600 generations (0.24N generations). This duration corresponds to the duration of the transient dynamics following a reconnection event, t2 ≃ 540 generations (0.216N generations, estimated from Equation 8 with α = 5%). In this example, migration is stronger than genetic drift and mutation (m ≫ 1/2N and m ≫ μ), and thus, t2 ∼1/2m. t2 corresponds to the period during which the distribution of HEW and DT is skewed. Subsequently, HEW and DT reach their equilibrium value in ∼80,000 generations (32N generations); this duration corresponds to the time required to reach the genetic diversity equilibrium value after a reconnection event, t1 ≃ 75,000 generations (30N generations, estimated from Equation 8 with α = 5%). t1 corresponds to the period during which the deficit of rare alleles is eliminated.
In conclusion, both an isolation and a reconnection event induce changes in the proportion of rare alleles, which skews the values of HEW and DT, thus producing a signature that cannot be differentiated from the signature of past demographic events or of selection.
Discussion
We documented a simple neutral mechanism, which creates long-term peaks of genetic diversity. This peak of genetic diversity appears shortly after an abrupt increase in migration and is conserved for a long time. We also demonstrated that such genetic diversity peaks can occur for a large and plausible range of population sizes, migration rates, mutation rates, and numbers of populations. Subsequent to the genetic diversity peak, the rate of decay of genetic diversity was slow. Consequently, the mechanisms described here leave a strong and long-term footprint on genetic diversity that affects the Ewens–Watterson statistics and Tajima’s D that are commonly used to infer the history of populations from genetic data.
The peak of genetic diversity is due to the spread of the genetic diversity accumulated during (partial) isolation. Therefore, the migration model that is assumed (island model of migration) is a leading factor in determining the strength of the observed genetic diversity peak. Assuming isolation by distance, the within-population genetic diversity is expected to have locally lower peaks. At the same time, under this assumption, the between-population genetic diversity is expected to be higher. Indeed, once populations are connected, each population shares with its neighboring populations alleles accumulated during isolation; thus, differentiation between distant populations will be maintained. Additionally, the amount of genetic diversity accumulated during isolation determines the size of the peak of genetic diversity. The maximum value is reached when populations are completely differentiated, i.e., when no alleles are shared between populations. Our results are robust to the relaxation of the complete isolation and complete differentiation assumptions: when isolation is not complete (because of small migration or nonequilibrium genetic diversity), we show that a genetic diversity peak is still observed (see File S3).
A connection event that occurs after an isolation period might play an important role in species diversification. Indeed, we have demonstrated that such events create an excess of genetic diversity. A high level of genetic diversity has often been hypothesized as being a key factor for species diversification. First, evolution from standing genetic variation might be stronger than from de novo mutation (Gibson and Dworkin 2004; Hermisson and Pennings 2005; Myles et al. 2005; Barrett and Schluter 2008). Second, both theoretical (Gavrilets 2003; Gavrilets and Losos 2009) and empirical work suggest that a high level of preexisting genetic diversity in a population increases its rate of diversification (Harmon et al. 2003; Seehausen 2004; Barrett and Schluter 2008). Interestingly, in several cases of adaptive radiation, a high genetic diversity of founder populations has been documented (e.g., Barrier et al. 1999; Bezault et al. 2011). Several authors argued that the connection of populations after a period of isolation might have played an important role in many adaptive radiations (Hughes and Eastwood 2006; Antonelli and Sanmartín 2011; Bezault et al. 2011; Joyce et al. 2011). Therefore, species that experienced population isolation followed by reconnection events could have benefited from a temporary genetic diversity peak, which may have promoted the diversification of those species. Numerous species are known to have experienced such connectivity changes in the past and show remarkable levels of genetic and species diversity (Arnegard et al. 1999). For example, cichlid fishes in the Great African Lakes experienced periods of habitat fragmentation and reconnection due to lake water level fluctuations (Arnegard et al. 1999); there is some evidence that these processes might have played a role in the explosive radiation of the species (Owen et al. 1990; Young et al. 2009).
Additionally, a high rate of speciation is correlated with the timeframe surrounding the uplift of the Northern Andes (Sedano and Burns 2010). The mechanisms described here are thus expected to considerably affect the ability of species to adapt to novel environmental conditions and to diversify over a very long period of time.
Statistics on allelic frequencies such as the Ewens–Watterson statistics (Watterson 1978) and Tajima’s D (Tajima 1989) allow the inference of either selection or population demographic changes. Here, we demonstrate that migration changes can lead to signatures that cannot be differentiated from a selection process or a population size change when using the Ewens–Watterson test and Tajima’s D. Therefore, past migration changes must be considered more carefully and should be viewed as an alternative explanation of bias in neutrality tests and of bottleneck or expansion signals. Recent studies have shown that population structure can bias neutrality tests and produce false bottleneck signals (Leblois et al. 2006; Städler et al. 2009; Chikhi et al. 2010) and that, shortly after an isolation event, a departure from neutrality can be incorrectly inferred (as shown with simulations by Broquet et al. 2010 and discussed in Waples 2010). The proper interpretation of genetic signatures is crucial for the understanding of the evolutionary history of populations. An interesting extension of this work would be to analyze in more detail the molecular signature of the mechanisms described here and to provide methods that allow the differentiation of such events from selection or demographic changes. Moreover, our results are also relevant for the study of genealogies. Indeed, genetic identities as considered here are commonly used to describe coalescence time distributions (Slatkin 1991; Rousset 1996; Wakeley 1999). Future work should also examine the consequences of isolation and connection events on phylogenetic tree reconstruction. Statistical tools that are available to estimate demographic parameters classically focus on a priori specific scenarios (e.g., population bottleneck, expansion, population with constant migration, population split with subsequent migration; see review in Kuhner 2009).
Given the strong impact of migration changes on genetic diversity, accounting for such scenarios is necessary. Recent methods allowing a larger range of population demographic scenarios, such as approximate Bayesian computation (Beaumont et al. 2002; Beaumont 2010), may be powerful tools with which to disentangle the signature of demographic processes from the observed genetic diversity.
The major goals of conservation genetics are to maintain genetic diversity, decrease extinction risks, avoid inbreeding depression, maintain species evolutionary potential, and decrease species vulnerability to environmental change (Gilpin and Soule 1986; Newman and Pilson 1997; Jump et al. 2009). In this context, conservationists need to estimate the genetic diversity of a population and its effective size. Such measures are commonly obtained from genetic data and are estimated with standard statistics (Wright 1950; Jorde and Ryman 2007). Although new approaches that consider populations in a nonequilibrium state are emerging to estimate population size changes and instantaneous migration rates (e.g., Hey and Nielsen 2004), the expected level of genetic diversity is still commonly estimated assuming that populations are at equilibrium. As shown here, genetic diversity is more likely to be in a transient state. We have demonstrated that reconnecting isolated populations increases genetic diversity above the expected equilibrium value, while isolating populations induces a slow decrease of genetic diversity. Consequently, any estimate inferred from data collected from a population that underwent strong migration changes will not reflect the demographic situation of the population (e.g., census size, genetic diversity). This can have drastic consequences for the selection of conservation strategies and for the management of species (Pearse and Crandall 2004; Caballero et al. 2010).
Acknowledgments
The authors thank two anonymous reviewers and Noah Rosenberg for their valuable comments and suggestions, which increased the clarity and the scope of the article. This project was funded by the Swiss National Research Foundation (SNRF) grant nos. PZ00P3–121702, PZ00P3–139421/1, and 31003A–130065.
Appendix A: Dynamics of Genetic Diversity
In this appendix, we describe the temporal change of genetic diversity (derivation of Equations 4a and 8 and the separation of the dynamics of genetic diversity into two timescales).
Temporal change of genetic diversity
The solution to Equation 3 is
Ft changes according to two exponential decay functions,
Therefore, the eigenvalues of matrix A can be used to compute the rates of change of genetic diversity. As λ1 > λ2 and both eigenvalues are <1, we have |r2| > |r1|, and thus
We now want to simplify the expression of the rates of change of genetic diversity. To do so, we can rewrite Equation 5 as
Considering that migration rates and mutation rates are small, we can neglect terms in m²,
Respective length of the asymptotic and transient dynamics periods
We denote t1 and t2 as the times needed for
Timescales separation
This section presents the conditions for t1 ≫ t2. When t1 ≫ t2,
Equation A10 decomposes the dynamics of Ft into two timescales: a transient period of length t2 and an asymptotic period of length t1 − t2 ≃ t1. For t > t1, the genetic identity is close to its equilibrium value Feq, so t1 can be interpreted as the duration of the disequilibrium period.
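The separation into a fast and a slow timescale can be illustrated numerically. The following minimal Python sketch computes the two eigenvalues of a 2 × 2 transition matrix for within- and between-population identity, the associated rates (taken here as r_i = ln λ_i), and characteristic times 1/|r_i|. The recursion in `island_matrix` is a textbook-style finite-island model assumed for this sketch and is not necessarily the exact form of Equation 3; the function names, the parameter values, and the use of 1/|r_i| as a stand-in for t1 and t2 are likewise assumptions.

```python
import math

def island_matrix(n, N, m, mu):
    """Assumed finite-island recursion F_{t+1} = A F_t + c, with F = (Fs, Fb).

    n: number of populations, N: diploid size of each population,
    m: migration rate, mu: mutation rate (all illustrative).
    """
    gamma = (1.0 - mu) ** 2                  # neither gene mutated
    a = (1.0 - m) ** 2 + m ** 2 / (n - 1)    # two genes of one deme shared a parental deme
    b = (1.0 - a) / (n - 1)                  # two genes of different demes shared a parental deme
    k = 1.0 - 1.0 / (2 * N)                  # two genes do not copy the same parental gene
    A = [[gamma * a * k, gamma * (1.0 - a)],
         [gamma * b * k, gamma * (1.0 - b)]]
    c = [gamma * a / (2 * N), gamma * b / (2 * N)]
    return A, c

def eigenvalues_2x2(A):
    """Eigenvalues of a 2 x 2 matrix from its trace and determinant."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = math.sqrt(tr * tr - 4.0 * det)    # non-negative: off-diagonal terms are positive
    return (tr + disc) / 2.0, (tr - disc) / 2.0

# Illustrative parameter values (assumptions, not taken from the article).
A, c = island_matrix(n=10, N=1000, m=0.01, mu=1e-5)
lam1, lam2 = eigenvalues_2x2(A)              # lam1 > lam2
r1, r2 = math.log(lam1), math.log(lam2)      # slow and fast rates of change
t1, t2 = 1.0 / abs(r1), 1.0 / abs(r2)        # characteristic times (proxy for t1, t2)

print(f"lambda1 = {lam1:.6f}, lambda2 = {lam2:.6f}")
print(f"slow time ~ {t1:.0f} generations, fast time ~ {t2:.0f} generations")
```

With these values the slow characteristic time exceeds the fast one by more than two orders of magnitude, which is the regime in which a two-timescale description is expected to be useful.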
Equation A10 is true if a t exists such that
We can demonstrate that proposition (A12) depends only on the ratio
From Equation A13 we can derive the conditions of the timescales separation of the dynamics of genetic diversity (i.e., the parameter values for which t1 ≫ t2). When n > 14, differences are the highest (t1 ≫ t2), in the domain where
For θ ≫ 1, we can neglect terms that do not contain θ, and conditions (A14) simplify to
Domain of validity of Equation A10 (white contours) as a function of the strength of migration (M) and mutation (θ) for (A) n = 2 and (B) n = 20. The dark shading represents the domains where the validity of Equation A10 is poor (i.e., the exact value of ; from Equation 5).
For θ ≪ 1, terms that contain θ can be neglected in Equation A14, which yields the following conditions:
Appendix B: Peak of Genetic Diversity Generated by a Reconnection Event
We derive first the value of the peak of within-population genetic diversity, ΔHs, and the transient excess of between-population genetic diversity, ΔHb, generated by a reconnection event. Second, we characterize their dependency on the migration rate, mutation rate, size, and number of populations.
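Before the derivation, the qualitative behavior can be checked by direct iteration. The sketch below starts from the equilibrium of fully isolated populations (m = 0), switches migration on, and tracks within-population diversity Hs = 1 − Fs. It relies on an assumed textbook-style finite-island recursion, which may differ in detail from Equation 3; the function names and parameter values are hypothetical, and the excess is measured here simply as the maximum of Hs minus its new equilibrium value, which is an illustrative definition rather than the one used for ΔHs below.

```python
def island_params(n, N, m, mu):
    """Coefficients of the assumed finite-island identity recursion."""
    gamma = (1.0 - mu) ** 2                  # neither gene mutated
    a = (1.0 - m) ** 2 + m ** 2 / (n - 1)    # same deme now, same deme before migration
    b = (1.0 - a) / (n - 1)                  # different demes now, same deme before migration
    k = 1.0 - 1.0 / (2 * N)
    return gamma, a, b, k

def step(Fs, Fb, n, N, m, mu):
    """One generation of the recursion for (Fs, Fb)."""
    gamma, a, b, k = island_params(n, N, m, mu)
    same = 1.0 / (2 * N) + k * Fs            # identity of two genes drawn from one parental deme
    return (gamma * (a * same + (1.0 - a) * Fb),
            gamma * (b * same + (1.0 - b) * Fb))

def equilibrium(n, N, m, mu):
    """Fixed point of the recursion, solved as a 2 x 2 linear system."""
    gamma, a, b, k = island_params(n, N, m, mu)
    a11, a12 = 1.0 - gamma * a * k, -gamma * (1.0 - a)
    a21, a22 = -gamma * b * k, 1.0 - gamma * (1.0 - b)
    c1, c2 = gamma * a / (2 * N), gamma * b / (2 * N)
    det = a11 * a22 - a12 * a21
    return (c1 * a22 - a12 * c2) / det, (a11 * c2 - a21 * c1) / det

# Illustrative values (assumptions): 10 populations of 1000 diploids.
n, N, mu, m_new = 10, 1000, 1e-5, 0.01

Fs, Fb = equilibrium(n, N, 0.0, mu)          # isolation equilibrium (Fb = 0)
Hs_isolated = 1.0 - Fs
Hs_connected = 1.0 - equilibrium(n, N, m_new, mu)[0]

trajectory = []
for _ in range(5000):                        # generations after reconnection
    Fs, Fb = step(Fs, Fb, n, N, m_new, mu)
    trajectory.append(1.0 - Fs)

peak_Hs = max(trajectory)
t_peak = trajectory.index(peak_Hs) + 1       # generation at which Hs is maximal

print(f"Hs at isolation equilibrium : {Hs_isolated:.3f}")
print(f"Hs at connected equilibrium : {Hs_connected:.3f}")
print(f"peak Hs = {peak_Hs:.3f} at generation {t_peak}")
```

Under these assumptions, Hs rises quickly after reconnection, overshoots the equilibrium value expected under the new migration rate, and then relaxes toward it only slowly.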
Derivation of ΔH
We denote the vector of genetic diversity excess at the end of the transient dynamics phase as
The genetic diversity at the end of the transient dynamics phase is approximately
Denoting the initial value of genetic identity
Assuming isolation equilibrium for the initial identity leads to
Assuming that migration and mutation rates are always small, we can neglect terms in m², μ²,
Equation 9 provides a good approximation of the size of the peak of within-population genetic diversity ΔHs and of the transient excess of between-population genetic diversity ΔHb in the entire parameter domain. Figure B1, A and C, and Figure B2, A and C, represent the exact (solid line) and approximate (dashed line; from Equation 9) values of ΔHs and ΔHb, respectively, as a function of θ and M, for n = 2 and n = 20; we can see that the true and approximate values are very close. Figure B1, B and D, and Figure B2, B and D, represent the absolute error resulting from the use of Equation 9 as an approximation of ΔHs and ΔHb, respectively, instead of its exact value, for n = 2 and n = 20. Discrepancies between Equation 9 and the true values of ΔHs and ΔHb can first come from the assumption that m ≪ 1, μ ≪ 1, and N ≫ 1 and second from the assumption of the existence of t2 such that
(A and C) Exact (solid lines) and approximate (dashed lines, from Equation 9) values of the peak of genetic diversity ΔHs, as a function of the strength of migration (M) and mutation (θ). In both A and C, the exact and approximate values of ΔHs are very close. (B and D) Absolute error when using Equation 9 to approximate ΔHs, as a function of M and θ. The maximum absolute error is reached when M ≃ 5 and θ < 1 in both B and D. The error decreases when M ≫ 5 or M ≪ 5. The absolute error increases when n decreases, but remains small: (B) the maximum absolute error is 0.025 for n = 2, and (D) 0.018 for n = 20. Consequently, Equation 9 is a good approximation for the peak of genetic diversity for all values of θ, M, and n considered.
(A and C) Exact (solid lines) and approximate (dashed lines, from Equation 9) values of the transient excess of between-population genetic diversity ΔHb, as a function of the strength of migration (M) and mutation (θ). In both A and C, the exact and approximate values of ΔHb are very close. (B and D) Absolute error when using Equation 9 to approximate ΔHb, as a function of M and θ. The maximum absolute error is reached when M ≃ 1 and θ < 1; it is 0.09 for n = 2 (B) and 0.016 for n = 20 (D). Equation 9 is a good approximation for ΔHb for all values of θ, M, and n considered.
Maximum peak of diversity after a connection event
The peak of genetic diversity increases monotonically with M (
Footnotes
Communicating editor: N. A. Rosenberg
- Received August 27, 2012.
- Accepted December 26, 2012.
- Copyright © 2013 by the Genetics Society of America