## Abstract

Genetic diversity is essential for population survival and adaptation to changing environments. Demographic processes (*e.g.*, bottleneck and expansion) and spatial structure (*e.g.*, migration, number, and size of populations) are known to shape the patterns of the genetic diversity of populations. However, the impact of temporal changes in migration on genetic diversity has seldom been considered, although such events might be the norm. Indeed, during the millions of years of a species’ lifetime, repeated isolation and reconnection of populations occur. Geological and climatic events alternately isolate and reconnect habitats. We analytically document the dynamics of genetic diversity after an abrupt change in migration given the mutation rate and the number and sizes of the populations. We demonstrate that during transient dynamics, genetic diversity can reach unexpectedly high values that can be maintained over thousands of generations. We discuss the consequences of such processes for the evolution of species based on standing genetic variation and how they can affect the reconstruction of a population’s demographic and evolutionary history from genetic data. Our results also provide guidelines for the use of genetic data for the conservation of natural populations.

Genetic diversity in a population of constant size results from the balance between the occurrence of new mutations and the loss of alleles by genetic drift (Fisher 1922; Wright 1931; Kimura and Crow 1964). The expected population genetic diversity can thus be estimated from the effective population size and the mutation rate in the population. In subdivided populations this estimate should further account for the strength of migration (Maruyama 1970; Smith 1970; Nei 1973): limited migration allows for strong differentiation between populations, while strong migration tends to homogenize genetic diversity between populations. Genetic diversity is also known to be affected by population demographic changes; following bottlenecks and founder events, a loss of genetic diversity is expected to occur (Nei *et al.* 1975). Recently, spatial population expansions were shown to lead to increased differentiation between populations and to generate a low level of genetic diversity at the front of the expansion (Excoffier *et al.* 2009).

Although theoretical studies on the dynamics of genetic diversity in subdivided populations started appearing in the 1970s (Nei and Feldman 1972; Latter 1973; Nei 1973; Nagylaki 1974, 1977), the transient dynamics and nonequilibrium states of genetic diversity still do not have a good theoretical basis. Early authors characterized the ultimate rate of change of genetic diversity after a perturbation (either a change in population size or gene flow; Nei and Feldman 1972; Latter 1973; Nei 1973; Nagylaki 1974, 1977). They found that changes in genetic diversity are related to the total effective population size, which results in a slow dynamics of genetic diversity change. They thus first highlighted that nonequilibrium states and transient dynamics are expected to act on very large temporal scales. In particular, they showed that decreases in migration rates (population fragmentation or isolation) have long-term effects on genetic diversity: they reduce the amount of genetic diversity within populations and allow for population differentiation (Latter 1973; Takahata and Nei 1985). Additionally, it has been shown that short timescale random fluctuations in migration increase population differentiation (Nagylaki 1979; Whitlock 1992; Rice and Papadopoulos 2009) while cyclic fluctuations of gene flow (such as seasonal fluctuations) mainly affect genetic diversity within populations (Karlin 1982; Shpak *et al.* 2010). Although the genetic consequences of migration events (admixture) have recently received much attention (*e.g.*, Pritchard *et al.* 2000; Falush *et al.* 2003; Price *et al.* 2009; Gravel 2012), their impact on genetic diversity and more particularly the expected induced transient dynamics have not received much attention.

Genetic diversity is crucial for identifying populations at risk of extinction and for assessing species’ adaptive potential. Current genetic diversity characterizes species at risk of extinction through inbreeding depression, loss of genetic diversity, and accumulation of deleterious mutations (Gilpin and Soule 1986; Jimenez *et al.* 1994; Frankham 1995; Hedrick and Kalinowski 2000). The current level of genetic diversity (or standing genetic variation) is now widely recognized as a determinant for the adaptation of a population to a novel environment (Turner *et al.* 1993; Feder *et al.* 2003; Pelz *et al.* 2005; Colosimo *et al.* 2005; Hermisson and Pennings 2005; Myles *et al.* 2005; Hernandez *et al.* 2011; Jones *et al.* 2012). First, under new selective pressures, the adaptive value of a preexisting allele can switch from neutral or deleterious to beneficial (Gibson and Dworkin 2004; Hermisson and Pennings 2005). Second, alleles from the standing genetic variation are present at higher frequencies in the population than any newly arisen (*de novo*) mutation is; thus, they have higher fixation probabilities and shorter times to fixation (Barrett and Schluter 2008). Finally, these alleles have already passed successive selective filters and are consequently more likely to be compatible with the background genome (Orr and Betancourt 2001; Schluter *et al.* 2004; Barrett and Schluter 2008).

Measures of genetic diversity are widely used to understand and infer the demographic and evolutionary history of populations. Indeed, statistical tests using polymorphism data can detect departure from neutrality and infer demographic or selective processes (*e.g.*, Ewens 1972; Watterson 1978; Tajima 1983; Fu and Li 1993; Fay and Wu 2000; see review in Kreitman 2000). Furthermore, due to recent modeling advances in coalescent theory and increased genomic data and computational power, it is now possible to distinguish different demographic scenarios (*e.g.*, population bottleneck and subdivision; Peter *et al.* 2010) and estimate demographic and selective parameters (*e.g.*, population size and growth rate, proportion of admixture, selection coefficient) using polymorphism data (Beaumont *et al.* 2002; Kim and Stephan 2002; Kim and Nielsen 2004; Nielsen *et al.* 2005; Price *et al.* 2009). Nevertheless, it is often difficult to distinguish between the transient effects of demographic changes and the effects of selection on polymorphism data (Jensen *et al.* 2005; Nielsen 2005; Li and Stephan 2006; Kim and Gulisija 2010; Pavlidis *et al.* 2010). It is also difficult to distinguish between the signatures of different demographic changes such as changes in population size, number, or migration rate (Wakeley 1999). A better understanding of the impact on genetic data of transient dynamics during demographic changes is necessary to disentangle these processes.

Interestingly, although the impact of population subdivision and short timescale population demographic changes on genetic diversity has received a lot of attention, other processes, such as long-term isolation and subsequent population reconnection, have received little attention. Such events have, without a doubt, occurred several times in the past, at long and short timescales. Repeated environmental changes have modified habitats and species distribution and created isolation and reconnection of populations. For example, during the climatic oscillations of the Quaternary period, temperate and tropical species were successively isolated into refugia and experienced habitat and population expansion, allowing for population reconnection (Hewitt 2000, 2004; Zhang *et al.* 2008; Young *et al.* 2009). At the same time, the reduction of sea levels (120 m lower than present; Lambeck *et al.* 2004) allowed the formation of land bridges that connected isolated lands in several parts of the world (Hewitt 2000). Repeated changes in water level resulted in fragmentation and fusion of basins within continents (as in the Great African Lakes; Galis and Metz 1998; Sturmbauer *et al.* 2001). Similarly, geological events such as volcanic eruptions induced periodic isolation and reconnection of islands (Cook 2008), while tectonic processes such as the formation of mountains isolated populations and reconnected others (Hughes and Eastwood 2006; Antonelli *et al.* 2009; Antonelli and Sanmartín 2011). More recently, climatic, environmental, and anthropogenic changes (*e.g.*, global warming, urbanization, and agriculture) have also played important roles in modifying the connectivity pattern between populations (Miller and Hobbs 2002; Delaney *et al.* 2010).
As a result, some species are currently subdivided into poorly connected or completely isolated populations, for example, ground beetles (Keller *et al.* 2004), salamanders (Noel *et al.* 2007), and crickets (Vandergast *et al.* 2009). In the meantime, other species experience habitat and population expansion (*e.g.*, sparrows, white-tailed deer, zebra mussels; Waples 2010). Isolation and reconnection of populations not only reflect abiotic processes, but they can also represent spatial and temporal interactions of populations (*e.g.*, secondary contacts; Green *et al.* 2010; Domingues *et al.* 2012). Consequently, transient states of genetic diversity are expected to be the norm and deserve much more attention.

In this study, we analytically characterized the dynamics of genetic diversity following a change in migration rate between populations, given any migration rate, mutation rate, population size, and degree of fragmentation. We first analyzed how genetic diversity is affected by an event of isolation of populations and by an event of reconnection of populations. We then generalized our results for situations where the migration rate between populations displays strong variation. We demonstrate that temporal changes of migration generate periods where genetic diversity reaches unexpectedly high values that can be maintained over thousands of generations. We also show that migration changes can produce a signature on summary statistics such as Tajima’s *D* and Ewens–Watterson’s statistics that cannot be differentiated from a signature of population size change or from the signature of selection. Finally, we discuss how such processes can affect observed macroevolutionary patterns of species diversity and how they can affect the reconstruction of populations’ demographic and evolutionary history from genetic data.

## Genetic Diversity of Populations

To study the dynamics of genetic diversity after connectivity changes, we consider diploid individuals in a finite island model composed of *n* random mating populations of size *N*, so that the total population size is *nN*. The populations exchange migrants at a rate *m*. The mutations follow the infinite allele model (each mutation produces a new allele; Kimura and Crow 1964) and occur at a rate *μ*. The generations are nonoverlapping (Wright–Fisher model; Fisher 1930; Wright 1931).

Genetic diversity, *H*, is estimated using the identity-by-descent *F* between pairs of alleles, through the relationship provided by Nei and Feldman (1972): $H = 1 - F$ (Equation 1). We distinguish within-population genetic diversity *H*_{s} and between-population genetic diversity *H*_{b}, using within- and between-population genetic identities, respectively. Within-population genetic identity, *F*_{s}, corresponds to the probability that two genes randomly chosen from the same population are identical by descent. Between-population genetic identity, *F*_{b}, corresponds to the probability that two genes randomly chosen from different populations are identical by descent. Considering that within- and between-population genetic identities *F*_{s} and *F*_{b} at a given time *t* are, respectively, *F*_{s,t} and *F*_{b,t}, their values at the next generation (forward in time), respectively *F*_{s,t+1} and *F*_{b,t+1}, follow the recursion of Equation 2 (Smith 1970; Maruyama 1970; Latter 1973), in which *a* is the probability that two genes at the same location before migration are still at the same location after migration (either both migrate to the same location or both do not migrate); *b* is the probability that two genes that were at the same location before migration migrated to different locations; *c* is the probability that two genes within a population are copies of the same gene; and (1 − *μ*)^{2} is the probability that neither of the two randomly chosen genes mutated.
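The recursion can be iterated numerically. The sketch below is only an illustration under stated assumptions, not the authors' code: it assumes the classical finite island forms *a* = (1 − *m*)^{2} + *m*^{2}/(*n* − 1), *b* = (1 − *a*)/(*n* − 1), and *c* = 1/(2*N*), which are not spelled out in the text, and the function names are hypothetical.

```python
def island_coefficients(m, n, N):
    """Assumed classical finite island coefficients (not given in the text)."""
    a = (1 - m) ** 2 + m ** 2 / (n - 1)  # two genes now in one deme came from the same deme
    b = (1 - a) / (n - 1)                # two genes now in different demes came from the same deme
    c = 1 / (2 * N)                      # two genes of one deme are copies of the same gene
    return a, b, c


def step_identities(Fs, Fb, m, mu, n, N):
    """Advance within- (Fs) and between-population (Fb) identities by one generation."""
    a, b, c = island_coefficients(m, n, N)
    same = c + (1 - c) * Fs  # identity of two genes that were in the same deme
    keep = (1 - mu) ** 2     # neither of the two genes mutated
    return keep * (a * same + (1 - a) * Fb), keep * (b * same + (1 - b) * Fb)


def equilibrium_identities(m, mu, n, N):
    """Fixed point of the recursion, from the 2 x 2 linear system (I - A) F = B."""
    a, b, c = island_coefficients(m, n, N)
    keep = (1 - mu) ** 2
    m11, m12 = 1 - keep * a * (1 - c), -keep * (1 - a)
    m21, m22 = -keep * b * (1 - c), 1 - keep * (1 - b)
    b1, b2 = keep * a * c, keep * b * c
    det = m11 * m22 - m12 * m21
    return (b1 * m22 - m12 * b2) / det, (m11 * b2 - m21 * b1) / det


Fs_eq, Fb_eq = equilibrium_identities(m=0.01, mu=1e-6, n=10, N=2500)
print("H_s =", 1 - Fs_eq, "H_b =", 1 - Fb_eq)
```

Genetic diversities follow from the identities as *H* = 1 − *F*, so the same functions can be used to track *H*_{s} and *H*_{b} over time.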

## Predicting the Dynamics of Genetic Diversity

To characterize the impact of connectivity changes on genetic diversity, we analyzed the trajectories of within- and between-population genetic identities from any initial genetic identity state. Using Equation 2, Smith (1970) and Maruyama (1970) showed that genetic identities converge toward an equilibrium value **F**^{eq} (value given in Supporting Information, File S1). Extending the results obtained by Nei and Feldman (1972) for *n* = 2 populations, we show that the temporal dynamics of genetic identities follow (Equation 4a; see Appendix A for more details)

$$\mathbf{F}_t = \mathbf{F}^{\mathrm{eq}} + \mathbf{C}_1 \lambda_1^{t} + \mathbf{C}_2 \lambda_2^{t},$$

where *λ*_{1} and *λ*_{2} are respectively the largest and smallest eigenvalues of matrix **A**, the matrix of the recursion in Equation 2 written in matrix form; their expressions are given in Equation 5. **C**_{1} and **C**_{2} are column vectors of dimension 2 composed of constant values, which depend on the parameters of the model (*m*, *μ*, *n*, and *N*) and on the initial genetic identity **F**_{0} (Appendix A).

In the next section, we provide from Equations 4a and 5 the temporal change of genetic diversity and derive the corresponding time to reach genetic diversity equilibrium after a connectivity change.

## Time to Reach Genetic Diversity Equilibrium

The change of genetic diversity can be decomposed into two main temporal dynamics, one long term and one short term. Indeed, the temporal change of genetic diversity depends on two components, *r*_{1} = ln(*λ*_{1}) and *r*_{2} = ln(*λ*_{2}) (Appendix A). As 1 > *λ*_{1} > *λ*_{2} > 0, *r*_{1} determines the ultimate (or long-term) change of genetic diversity and *r*_{2} determines the transient (or short-term) change of genetic diversity.

When migration and mutation rates are small (*i.e.*, *m* ≪ 1 and *μ* ≪ 1) and local population sizes are large (*i.e.*, *N* ≫ 1), the decay constants *r*_{1} and *r*_{2} follow Equation 6, where *M* = 4*Nm* is the scaled migration rate and *N*_{e} is the effective population size of the total population (inbreeding, eigenvalue, variance, and mutation effective size are equivalent in the finite island model; Whitlock and Barton 1997). As expected from theory (Whitlock and Barton 1997; Wakeley 1999), in the strong migration limit (*M* ≫ 1), the effective size is equal to the total population size *nN*, while in the weak migration limit (*M* ≪ 1), the effective size is higher than the total population size.

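We can estimate the durations of the ultimate and transient changes of genetic diversity, denoted *t*_{1} and *t*_{2}, respectively. Formally, we define *t*_{1} and *t*_{2} as the times (in number of generations) needed for the exponential terms $e^{r_1 t}$ and $e^{r_2 t}$, respectively, to decay to a fraction *α*, where *α* ∈ ]0; 1] (Equation 7). With the approximations of Equation 6, *t*_{1} and *t*_{2} simplify to Equation 8 (Appendix A).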

The genetic diversity changes as follows (Figure 1): (i) a convergence of duration *t*_{2} from the initial genetic diversity value to a transient genetic diversity value and then (ii) a convergence of duration *t*_{1} to the genetic diversity equilibrium **H**^{eq}. The time to reach genetic diversity equilibrium, *t*_{1}, depends only on two terms: the mutation rate (term 2*μ*) and the genetic drift at the total population level (term 1/(2*N*_{e})). The duration of the transient dynamics, *t*_{2}, depends on four terms, among them the mutation rate (term 2*μ*), the migration rate (term 2*m*), and the genetic drift in each population (term 1/(2*N*)). The two durations can be very different (*i.e.*, *t*_{1} ≫ *t*_{2}) depending on the parameter values: the timescales *t*_{1} and *t*_{2} can differ by several orders of magnitude. When *n* > 14, differences are the highest (*t*_{1} ≫ *t*_{2}) in the domain where *m* > *μ*. When *n* ≤ 14, the same conditions apply for *t*_{1} ≫ *t*_{2} except in a restricted domain of the parameter space. For example, the duration of the transient dynamics is *t*_{2} ≃ 134 generations and the time to reach equilibrium is *t*_{1} ≃ 1.5 × 10^{5} generations (with *α* = 5%) when 10 populations of size 2500 with a mutation rate of 10^{−6} are connected with a migration rate of 0.01.
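The two timescales can also be obtained numerically from the eigenvalues of the recursion matrix. The following sketch rests on assumptions that are not stated explicitly in the text: the classical finite island coefficients *a* = (1 − *m*)^{2} + *m*^{2}/(*n* − 1), *b* = (1 − *a*)/(*n* − 1), *c* = 1/(2*N*), and the definition *t*_{i} = ln(*α*)/*r*_{i} with *r*_{i} = ln(*λ*_{i}). The function name is hypothetical.

```python
import math


def decay_times(m, mu, n, N, alpha=0.05):
    """Return (t1, t2), the durations of the ultimate and transient dynamics."""
    # Assumed finite island coefficients (see lead-in).
    a = (1 - m) ** 2 + m ** 2 / (n - 1)
    b = (1 - a) / (n - 1)
    c = 1 / (2 * N)
    keep = (1 - mu) ** 2
    # Entries of the 2 x 2 recursion matrix acting on (F_s, F_b).
    a11, a12 = keep * a * (1 - c), keep * (1 - a)
    a21, a22 = keep * b * (1 - c), keep * (1 - b)
    # Closed-form eigenvalues of a 2 x 2 matrix.
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    root = math.sqrt(trace ** 2 - 4 * det)
    lam1, lam2 = (trace + root) / 2, (trace - root) / 2
    r1, r2 = math.log(lam1), math.log(lam2)  # both negative since 0 < lam2 < lam1 < 1
    return math.log(alpha) / r1, math.log(alpha) / r2


t1, t2 = decay_times(m=0.01, mu=1e-6, n=10, N=2500)
print(f"t1 ~ {t1:.0f} generations, t2 ~ {t2:.0f} generations")
```

With the parameter values of the example above, this sketch should return durations of the same order as those quoted in the text, with *t*_{1} about three orders of magnitude larger than *t*_{2}.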

## Dynamics of Genetic Diversity After an Isolation Event

We analyzed the dynamics of genetic diversity after an isolation event, starting with a situation in which populations are connected and at their equilibrium value; *i.e.*, within- and between-population genetic diversity *H*_{s} and *H*_{b} are at the expected connection equilibrium values.

We observe (Figure 2) that immediately after an isolation event, within-population genetic diversity decreases due to genetic drift to the point where it reaches the mutation-drift equilibrium of an isolated population (at a rate determined by *r*_{2} from Equation 6). Meanwhile, between-population genetic diversity slowly increases due to the differentiation of populations induced by mutations (at a rate determined by *r*_{1} from Equation 6). Populations ultimately reach complete differentiation. Within- and between-population genetic diversities reach values close to their equilibrium in *t*_{2} and *t*_{1} generations, respectively (File S2).

The decrease of within-population genetic diversity can occur quickly relative to population differentiation (the increase of between-population genetic diversity; see Figure 1). After an isolation event, within-population genetic diversity (*H*_{s}) remains above its expected equilibrium, while between-population genetic diversity (*H*_{b}) remains below its expected equilibrium. When populations are isolated (*m* = 0), *H*_{s} reaches a value close to its equilibrium value in *t*_{2} generations, whereas *H*_{b} reaches a value close to its equilibrium value in *t*_{1} generations. *H*_{s} converges much more quickly than *H*_{b}: when *μ* = 10^{−5} and *N* = 1,000, *H*_{b} is significantly lower than the equilibrium value for approximately *t*_{1} ≃ 150,000 generations while *H*_{s} is significantly higher than the equilibrium value for approximately *t*_{2} ≃ 6000 generations (given *α* = 5%).
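The orders of magnitude quoted above can be checked with a short calculation. This is a sketch under the assumption that, for isolated populations (*m* = 0), within-population identity is multiplied each generation by (1 − *μ*)^{2}(1 − 1/(2*N*)) and between-population identity by (1 − *μ*)^{2}, and that each duration is ln(*α*) divided by the logarithm of the corresponding factor. The function name is hypothetical.

```python
import math


def isolation_times(mu, N, alpha=0.05):
    """Return (t1, t2) for completely isolated populations (m = 0)."""
    keep = (1 - mu) ** 2                        # neither gene mutated
    r_between = math.log(keep)                  # only mutation acts between populations
    r_within = math.log(keep * (1 - 1 / (2 * N)))  # mutation and local drift act within populations
    t1 = math.log(alpha) / r_between            # slow dynamics of H_b
    t2 = math.log(alpha) / r_within             # fast dynamics of H_s
    return t1, t2


t1, t2 = isolation_times(mu=1e-5, N=1000)
print(f"t1 ~ {t1:.0f} generations, t2 ~ {t2:.0f} generations")
```

For small *μ* and large *N*, the two rates reduce to approximately 2*μ* and 2*μ* + 1/(2*N*), which is why *H*_{s} converges much faster than *H*_{b} whenever drift is stronger than mutation.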

## Dynamics of Genetic Diversity After a Connection Event

We analyzed the dynamics of genetic diversity after a connection event, starting with a situation in which populations are isolated and at their equilibrium value. Once populations are connected, migration spreads among populations the genetic diversity accumulated during isolation (fast dynamics, at a rate determined by *r*_{2} from Equation 6). Consequently, within-population genetic diversity quickly increases and reaches a high value that is above its expected connected equilibrium value. This generates a transient peak of within-population genetic diversity Δ*H*_{s} and a transient excess of genetic diversity between populations Δ*H*_{b} (see Figure 3). Then, due to genetic drift, both the within- and the between-population diversities decrease (slow dynamics in dark shading in Figure 3, at a rate determined by *r*_{1} from Equation 6), to the point where the diversities reach the expected value of mutation–migration–drift equilibrium.

Within- and between-population diversities change successively according to two timescales: first, a fast transient dynamics, followed by a slow asymptotic dynamics (separation of timescales is derived in Appendix A and illustrated in Figure 3). Because the transient dynamics can be shorter than the asymptotic dynamics, the excess of genetic diversity (Δ*H*_{s} and Δ*H*_{b}) can be maintained for a very long period (from Figure 1, *t*_{1} is longer than 10,000 generations).
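The two-timescale behavior after a connection event can be reproduced by iterating the identity recursion. The sketch below is illustrative only and makes several assumptions not spelled out in the text: the classical finite island coefficients *a* = (1 − *m*)^{2} + *m*^{2}/(*n* − 1), *b* = (1 − *a*)/(*n* − 1), *c* = 1/(2*N*), a starting state with *F*_{b} = 0 (complete differentiation), and arbitrarily chosen parameter values. The function names are hypothetical.

```python
def step(Fs, Fb, m, mu, n, N):
    """One generation of the identity recursion under assumed island coefficients."""
    a = (1 - m) ** 2 + m ** 2 / (n - 1)
    b = (1 - a) / (n - 1)
    c = 1 / (2 * N)
    same = c + (1 - c) * Fs
    keep = (1 - mu) ** 2
    return keep * (a * same + (1 - a) * Fb), keep * (b * same + (1 - b) * Fb)


def reconnection_trajectory(m, mu, n, N, generations):
    """Within-population diversity H_s after isolated populations are reconnected."""
    keep = (1 - mu) ** 2
    c = 1 / (2 * N)
    Fs = keep * c / (1 - keep * (1 - c))  # mutation-drift equilibrium of an isolated population
    Fb = 0.0                              # completely differentiated populations
    hs = [1 - Fs]
    for _ in range(generations):
        Fs, Fb = step(Fs, Fb, m, mu, n, N)
        hs.append(1 - Fs)
    return hs


# Illustrative values: theta = 4*N*mu = 0.1 and M = 4*N*m = 10 with n = 10 populations.
hs = reconnection_trajectory(m=0.0025, mu=2.5e-5, n=10, N=1000, generations=200_000)
peak = max(hs)
print("initial H_s:", hs[0])
print("peak H_s:", peak, "at generation", hs.index(peak))
print("final H_s (close to the connected equilibrium):", hs[-1])
```

In this sketch the excess Δ*H*_{s} is read as the difference between the peak and the final value; the peak is reached early in the trajectory, whereas the return to equilibrium takes far longer.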

## Peak of Genetic Diversity Generated by a Connection Event

In this section, we characterize the peak of within-population genetic diversity, Δ*H*_{s}, and the excess of between-population genetic diversity, Δ*H*_{b}, observed after a connection event as a function of the mutation rate, the genetic drift, the number of populations, and the migration rate after connection. The exact value of the within-population genetic diversity peak is represented in Figure 4. Assuming that migration and mutation rates are small, we can show that good approximations of the values of Δ*H*_{s} and Δ*H*_{b} are given by Equation 9 (see derivations in Appendix B), which depends on the scaled migration rate *M*, the number of populations *n*, and the scaled mutation rate *θ* = 4*Nμ* (Maruyama 1970; Smith 1970). These approximations lead to the largest absolute error when a small number of populations (*n* = 2) is combined with weak mutation (*θ* < 1) and intermediate migration (*M* ≃ 5). Nevertheless, this error is small (error <0.025 for Δ*H*_{s} and <0.08 for Δ*H*_{b}), so Equation 9 provides a good approximation of Δ*H*_{s} and Δ*H*_{b} for all *n*, *M*, and *θ* values (see Appendix B for more details about the validity of approximation 9).

The peak of genetic diversity increases with the difference between the two timescales *t*_{1} and *t*_{2}.

In the domain where the peak is the largest (*M* ≫ 1 and *θ* ≪ 1), Δ*H*_{s} and Δ*H*_{b} reach the same value, denoted Δ*H*^{max}. Δ*H*^{max} is maximized at a particular number of populations, denoted *n** (dashed line in Figure 4B).

The corresponding peak of genetic diversity is reached at *n**: an intermediate number of populations maximizes the peak of genetic diversity. This can be explained by the following processes. During isolation, a small number of populations accumulates less between-population genetic diversity; thus, once reconnected, they share a smaller amount of diversity. In contrast, a large number of populations accumulates a higher level of genetic diversity but also has a higher connection equilibrium value; thus, once reconnected, diversity reaches its expected equilibrium and no peak of diversity is observed.

In summary, high peaks of genetic diversity (Δ*H*_{s} > 0.25 in Figure 4) can occur for a large range of the parameter space: when mutation is weak (*θ* < 0.05) and migration is moderate to strong (*M* > 0.5). Under these conditions, drastic genetic diversity changes can be observed (Δ*H*_{s} values >0.95; Figure 4B for *M* ≥ 50 and *θ* < 5 × 10^{−4}). The number of populations that maximizes the peak of diversity, *n**, ranges from a few populations when *θ* ≃ 1 up to a few hundred populations when *θ* = 10^{−6} (values of *θ* < 10^{−6} are expected to be very rare, as they would require a mutation rate lower than 2.5 × 10^{−12}/bp for a 1-kb gene and a population size of 100). Interestingly, a significant peak of genetic diversity is also observed when only two populations reconnect (Δ*H*^{max}|_{n=2} = 0.5; Figure 4B).

## Peak of Genetic Diversity Resulting from a Migration Rate Increase

Complete isolation of populations is not required to generate peaks of genetic diversity. Indeed, an abrupt increase of migration can generate the peak of genetic diversity characterized in the previous sections. In File S3, we determined that if migration crosses a threshold value *M*_{T}, peaks of genetic diversity can occur. The value of the threshold *M*_{T}, assuming that *m* ≪ 1 and *μ* ≪ 1, is given by Equation 12. An increase in migration from *M*_{0} to *M* that crosses the threshold value *M*_{T} (*i.e.*, *M*_{0} ≪ *M*_{T}) generates a peak of genetic diversity that can be approximated by Equation 9 (see File S3). For example, in a subdivided population of *n* = 10 and *θ* = 0.1, an increase in migration from *M*_{0} = 0.01 to *M* = 10 (which crosses the migration threshold *M*_{T} = 0.495; Equation 12) generates a peak of within-population diversity of 0.350, while a reconnection event in a similar situation would generate a peak of similar intensity (0.358).

## Implications for the Inference of Demography and Selection

To describe the impact of migration changes on the inference of demography and selection from genetic data, we described the dynamics of two broadly used summary statistics: the Ewens–Watterson statistics (Watterson 1978) and Tajima’s *D* (Tajima 1989). Both the Ewens–Watterson statistics and Tajima’s *D* are known to detect an excess (resp. deficit) of rare alleles, which induces negative (resp. positive) values of the statistics, compared with the expected neutral equilibrium (constant size population without selection). Usually, an excess of rare alleles is interpreted either as the signature of balancing selection or population expansion, and a deficit of rare alleles is interpreted as the signature of directional selection or as a population bottleneck. We used the Ewens–Watterson statistics, which we denote *H*_{EW} (Watterson 1978) and which is computed from *H*_{s}, the genetic diversity, and *H*_{A}, the expected genetic diversity given the observed number of alleles *K*. We also used Tajima’s *D*, which we denote *D*_{T} (Tajima 1989) and which is computed from the sample size *l*, the average number of pairwise nucleotide differences *π*, and the number of segregating sites *S*.
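For readers who want to compute *D*_{T} on their own alignments, the standard definition from Tajima (1989) can be sketched as follows. This is a minimal illustration, not the *Arlequin* implementation used in this study; the function name and the toy sequences are hypothetical, and the input is assumed to be a list of aligned sequences of equal length without missing data.

```python
import math
from itertools import combinations


def tajimas_d(sequences):
    """Tajima's D from aligned sequences; returns nan when it is undefined."""
    n = len(sequences)  # sample size (l in the text)
    length = len(sequences[0])
    # S: number of segregating (polymorphic) sites.
    S = sum(1 for i in range(length) if len({s[i] for s in sequences}) > 1)
    if n < 4 or S == 0:
        return math.nan  # the variance term vanishes for n < 4; no signal when S = 0
    # pi: average number of pairwise nucleotide differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(x != y for x, y in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))


rare = ["AAAA", "TAAA", "ATAA", "AATA"]    # every variant is a singleton
common = ["AAAA", "AAAA", "TTTA", "TTTA"]  # every variant is at frequency 1/2
print(tajimas_d(rare), tajimas_d(common))
```

In the toy samples, the alignment made only of singletons yields a negative value (excess of rare alleles) and the alignment with intermediate-frequency variants yields a positive value (deficit of rare alleles).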

We simulated samples of 50 sequences of 1 kb, with a per-nucleotide mutation rate of 2 × 10^{−8}, in four populations of size 2500, and ran 5000 replicate simulations. We simulated an isolation event, where the migration rate changed from 0.002 to 0 and a reconnection event in which the migration rate changed from 0 to 0.002. The simulations were performed with the software *fastsimcoal* (Excoffier and Foll 2011), and the data analysis was performed with *Arlequin* (Excoffier and Lischer 2010). We simulated samples from the same population (with the parameter values that were used, the sampling scheme had a very weak impact; see Chikhi *et al.* 2010 for a discussion of how the sampling scheme affects the values of Tajima’s *D*). To allow for convergence of the coalescent algorithm, we always assumed that the populations were connected prior to the isolation phase. In the reconnection event simulations, we set the duration of the isolated phase to 10*N*, which allowed the genetic diversity values to reach their equilibrium value.

We followed the dynamics of the statistics and estimated their distribution as a function of time. The results in Figure 5 show that an isolation event produces the same signature on the Ewens–Watterson statistics (Figure 5A) and Tajima’s *D* (Figure 5D) as expected from a bottleneck event and from directional selection. Indeed, following an isolation event, genetic drift first causes the elimination of rare alleles and then eliminates more common alleles. Consequently, the number of alleles *K* decreases more quickly than genetic diversity *H*_{s} (Figure 5, B and C). Similarly, the number of segregating sites *S* decreases faster than the number of pairwise differences *π* (Figure 5, E and F). Therefore, *D*_{T} and *H*_{EW} are skewed toward positive values, as expected after a bottleneck or under the effect of directional selection. Moreover, the statistics remain skewed for a long period of time (up to ∼10,000 generations in our simulations; see Figure 5).

Results in Figure 6 show that a reconnection event can successively produce the same signature as expected from a population expansion and then from a bottleneck event on *H*_{EW} (Figure 6A) and *D*_{T} (Figure 6D). Indeed, following a reconnection event, migrants first create an excess of rare variants. The number of alleles *K* increases more quickly than the genetic diversity *H*_{s} (Figure 6, B and C), and the number of segregating sites *S* increases faster than the number of pairwise differences *π* (Figure 6, E and F), which skews *H*_{EW} and *D*_{T} toward negative values. Second, new alleles brought by migrants increase in frequency, creating an excess of common variants. Consequently, *H*_{s} increases more than *K*, and *π* increases more than *S*, which skews *D*_{T} and *H*_{EW} toward positive values.

Interestingly, the observed durations of the periods in which both statistics are skewed are similar to the expected durations of the dynamics of genetic diversity (from Equation 8). After an isolation event, we observe in Figure 5 that all statistics reach their equilibrium value within ∼10,000 generations (4*N* generations). This duration corresponds to the time required to reach within-population genetic diversity equilibrium after an isolation event, *t*_{2} ≃ 12,000 generations (4.8*N* generations, estimated from Equation 8 with *α* = 5%). In this example, genetic drift is stronger than mutation (2*μ* ≪ 1/2*N*) and thus *t*_{2} ∼ 2*N*. *t*_{2} corresponds to the duration of the period in which the deficit of rare alleles skews the distribution of *H*_{EW} and *D*_{T}. After a reconnection event, we observe (Figure 6) that *H*_{EW} and *D*_{T} reach a “peak” within ∼600 generations (0.24*N* generations). This duration corresponds to the duration of the transient dynamics following a reconnection event, *t*_{2} ≃ 540 generations (0.216*N* generations, estimated from Equation 8 with *α* = 5%). In this example, migration is stronger than genetic drift and mutation (*m* ≫ 1/2*N* and *m* ≫ *μ*), and thus *t*_{2} ∼ 1/2*m*. *t*_{2} corresponds to the period during which the distribution of *H*_{EW} and *D*_{T} is skewed. Subsequently, *H*_{EW} and *D*_{T} reach their equilibrium value in ∼80,000 generations (32*N* generations); this duration corresponds to the time required to reach the genetic diversity equilibrium value after a reconnection event, *t*_{1} ≃ 75,000 generations (30*N* generations, estimated from Equation 8 with *α* = 5%). *t*_{1} corresponds to the period during which the deficit of rare alleles is eliminated.

In conclusion, both an isolation and a reconnection event induce changes in the proportion of rare alleles, which skews the values of *H*_{EW} and *D*_{T}, thus producing a signature that cannot be differentiated from the signature of past demographic events or of selection.

## Discussion

We documented a simple neutral mechanism, which creates long-term peaks of genetic diversity. This peak of genetic diversity appears shortly after an abrupt increase in migration and is conserved for a long time. We also demonstrated that such genetic diversity peaks can occur for a large and plausible range of population sizes, migration rates, mutation rates, and numbers of populations. Subsequent to the genetic diversity peak, the rate of decay of genetic diversity was slow. Consequently, the mechanisms described here leave a strong and long-term footprint on genetic diversity that affects the Ewens–Watterson statistics and Tajima’s *D* that are commonly used to infer the history of populations from genetic data.

The peak of genetic diversity is due to the spread of the genetic diversity accumulated during (partial) isolation. Therefore, the migration model that is assumed (island model of migration) is a leading factor in determining the strength of the observed genetic diversity peak. Assuming isolation by distance, the within-population genetic diversity is expected to have locally lower peaks. At the same time, under this assumption, the between-population genetic diversity is expected to be higher. Indeed, once populations are connected, each population shares with its neighboring populations alleles accumulated during isolation; thus, differentiation between distant populations will be maintained. Additionally, the amount of genetic diversity accumulated during isolation determines the size of the peak of genetic diversity. The maximum value is reached when populations are completely differentiated, *i.e.*, when no alleles are shared between populations. Our results are robust to the relaxation of the complete isolation and complete differentiation assumptions: when isolation is not complete (because of small migration or nonequilibrium genetic diversity), we show that a genetic diversity peak is still observed (see File S3).

A connection event that occurs after an isolation period might play an important role in species diversification. Indeed, we have demonstrated that such events create an excess of genetic diversity. A high level of genetic diversity has often been hypothesized as being a key factor for species diversification. First, evolution from standing genetic variation might be stronger than from *de novo* mutation (Gibson and Dworkin 2004; Hermisson and Pennings 2005; Myles *et al.* 2005; Barrett and Schluter 2008). Second, both theoretical (Gavrilets 2003; Gavrilets and Losos 2009) and empirical work suggest that a high level of preexisting genetic diversity in a population increases its rate of diversification (Harmon *et al.* 2003; Seehausen 2004; Barrett and Schluter 2008). Interestingly, in several cases of adaptive radiation, a high genetic diversity of founder populations has been documented (*e.g.*, Barrier *et al.* 1999; Bezault *et al.* 2011). Several authors argued that the connection of populations after a period of isolation might have played an important role in many adaptive radiations (Hughes and Eastwood 2006; Antonelli and Sanmartín 2011; Bezault *et al.* 2011; Joyce *et al.* 2011). Therefore, species that experienced population isolation followed by reconnection events could have benefited from a temporary genetic diversity peak, which may have promoted their diversification. Numerous species are known to have experienced such connectivity changes in the past and show remarkable levels of genetic and species diversity (Arnegard *et al.* 1999). For example, cichlid fishes in the great African lakes experienced periods of habitat fragmentation and reconnection due to lake water level fluctuations (Arnegard *et al.* 1999); there is some evidence that these processes might have played a role in the explosive radiation of the species (Owen *et al.* 1990; Young *et al.* 2009).
Additionally, a high rate of speciation is correlated with the timeframe surrounding the uplift of the Northern Andes (Sedano and Burns 2010). The mechanisms described here are thus expected to considerably affect the ability of species to adapt to novel environmental conditions and to diversify over a very long period of time.

Statistics on allelic frequencies such as the Ewens–Watterson statistics (Watterson 1978) and Tajima’s *D* (Tajima 1989) allow the inference of either selection or population demographic changes. Here, we demonstrate that migration changes can lead to signatures that cannot be differentiated from a selection process or a population size change when using the Ewens–Watterson test and Tajima’s *D*. Therefore, past migration changes must be considered more carefully and should be viewed as an alternative explanation of bias in neutrality tests and bottleneck or expansion signals. Recently, authors have shown that population structure can bias neutrality tests and produce false bottleneck signals (Leblois *et al.* 2006; Städler *et al.* 2009; Chikhi *et al.* 2010) and that, shortly after an isolation event, a departure from neutrality can be incorrectly inferred (as shown with simulations by Broquet *et al.* 2010 and discussed in Waples 2010). The proper interpretation of genetic signatures is crucial for the understanding of the evolutionary history of populations. An interesting extension of this work would be to analyze in more detail the molecular signature of the mechanisms described here and to provide methods that allow the differentiation of such events from selection or demographic changes. Moreover, our results are also relevant for the study of genealogies. Indeed, genetic identities as considered here are commonly used to describe coalescence time distributions (Slatkin 1991; Rousset 1996; Wakeley 1999). Future work should also examine the consequences of isolation and connection events on phylogenetic tree reconstruction. Statistical tools that are available to estimate demographic parameters classically focus on *a priori* specific scenarios (*e.g.*, population bottleneck, expansion, population with constant migration, population split with subsequent migration; see review in Kuhner 2009). Given the strong impact of migration changes on genetic diversity, accounting for such scenarios is necessary. Recent methods allowing a larger range of population demographic scenarios, such as approximate Bayesian computation (Beaumont *et al.* 2002; Beaumont 2010), may be powerful tools with which to disentangle the signature of demographic processes from the observed genetic diversity.
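As a concrete reference for one of the neutrality statistics discussed above, the following minimal sketch computes Tajima’s *D* from summary statistics, using the standard constants of Tajima (1989). The function name `tajimas_d` and its argument names are illustrative choices and are not part of the analyses presented here.

```python
import math


def tajimas_d(n_samples: int, seg_sites: int, pi: float) -> float:
    """Tajima's D from the number of sampled sequences, the number of
    segregating sites S, and the mean number of pairwise differences pi."""
    # For fewer than 4 sequences the variance term is zero, so D is undefined.
    if n_samples < 4 or seg_sites < 1:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n_samples))
    a2 = sum(1.0 / i ** 2 for i in range(1, n_samples))
    b1 = (n_samples + 1) / (3.0 * (n_samples - 1))
    b2 = 2.0 * (n_samples ** 2 + n_samples + 3) / (9.0 * n_samples * (n_samples - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n_samples + 2) / (a1 * n_samples) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Estimated variance of the difference between the two diversity estimators.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    # Numerator: pairwise-difference estimator minus Watterson's estimator.
    return (pi - seg_sites / a1) / math.sqrt(variance)
```

Under this definition, *D* is positive when the mean pairwise difference exceeds the value expected from the number of segregating sites, and negative in the opposite case; either sign can arise from selection, population size changes, or, as argued here, changes in migration.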

One of the major goals of conservation genetics is to maintain genetic diversity, decrease extinction risks, avoid inbreeding depression, maintain species evolutionary potential, and decrease species vulnerability to environmental change (Gilpin and Soule 1986; Newman and Pilson 1997; Jump *et al.* 2009). In this context, conservationists need to estimate the genetic diversity of a population and its effective size. Such measures are commonly obtained from genetic data and are estimated with standard statistics (Wright 1950; Jorde and Ryman 2007). Although new approaches that consider populations in a nonequilibrium state are emerging to estimate population size changes and instantaneous migration rates (*e.g.*, Hey and Nielsen 2004), the expected level of genetic diversity is still commonly estimated assuming that populations are at equilibrium. As shown here, genetic diversity is more likely to be in a transient state. We have demonstrated that reconnecting isolated populations increases genetic diversity above the expected equilibrium value, while isolating populations induces a slow decrease of genetic diversity. Consequently, any estimate inferred from data collected from a population that underwent strong migration changes will not reflect the demographic situation of the population (*e.g.*, census size, genetic diversity). This can have drastic consequences for the selection of conservation strategies and for the management of species (Pearse and Crandall 2004; Caballero *et al.* 2010).

## Acknowledgments

The authors thank two anonymous reviewers and Noah Rosenberg for their valuable comments and suggestions, which increased the clarity and the scope of the article. This project was funded by the Swiss National Research Foundation (SNRF) grant nos. PZ00P3–121702, PZ00P3–139421/1, and 31003A–130065.

## Appendix A: Dynamics of Genetic Diversity

In this appendix, we describe the temporal change of genetic diversity (derivation of Equations 4a and 8 and the separation of the dynamics of genetic diversity into two timescales).

### Temporal change of genetic diversity

The solution to Equation 3 is expressed using **P**, the transformation matrix with eigenvector **U**_{1} (associated with *λ*_{1}) as first column and eigenvector **U**_{2} (associated with *λ*_{2}) as second column. Denoting **C**_{1} = *y*_{10}**U**_{1} and **C**_{2} = *y*_{20}**U**_{2} leads to Equation 4a.
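As a sketch of the standard diagonalization argument behind this solution, assume that Equation 3 is a linear recursion of the form $\mathbf{F}_{t+1} = \mathbf{A}\mathbf{F}_t + \mathbf{B}$ with equilibrium $\mathbf{F}^{eq}$. The constant vector $\mathbf{B}$, the diagonal matrix $\mathbf{D}$, the initial value $\mathbf{F}_0$, and this exact form are assumptions made for illustration. The deviation from equilibrium then evolves as follows:

```latex
\begin{align}
\mathbf{F}_t - \mathbf{F}^{eq}
  &= \mathbf{A}^t \left(\mathbf{F}_0 - \mathbf{F}^{eq}\right)
   = \mathbf{P}\mathbf{D}^t\mathbf{P}^{-1}\left(\mathbf{F}_0 - \mathbf{F}^{eq}\right),
  \qquad \mathbf{D} = \operatorname{diag}(\lambda_1, \lambda_2), \\
(y_{10},\, y_{20})^{\mathsf{T}}
  &= \mathbf{P}^{-1}\left(\mathbf{F}_0 - \mathbf{F}^{eq}\right), \\
\mathbf{F}_t
  &= \mathbf{F}^{eq} + \mathbf{C}_1 \lambda_1^{t} + \mathbf{C}_2 \lambda_2^{t},
  \qquad \mathbf{C}_i = y_{i0}\,\mathbf{U}_i .
\end{align}
```

Under these assumptions, the last line has the two-term structure implied by the definitions of **C**_{1} and **C**_{2} given in the text.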

**F**_{*t*} changes according to two exponential decay functions, where *r*_{1} = ln(*λ*_{1}) and *r*_{2} = ln(*λ*_{2}) are the decay constants that determine the rates of change of these functions.

Therefore, the eigenvalues of matrix **A** can be used to compute the rates of change of genetic diversity. As *λ*_{1} > *λ*_{2} and both eigenvalues are <1, we have |*r*_{2}| > |*r*_{1}|, and thus *r*_{2} determines the transient rate of change of genetic diversity, while *r*_{1} determines the asymptotic rate of change of genetic diversity.
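To make the link between eigenvalues and rates of change concrete, here is a minimal numerical sketch. It assumes the standard identity-by-descent recursion for a finite island model with *n* populations of *N* haploid individuals (so that *c* = 1/*N*), which may differ in detail from the matrix **A** of Equation 3. The function names and the parameter values are hypothetical.

```python
import math


def island_matrix(n: int, N: int, m: float, mu: float):
    """2x2 matrix of an assumed island-model recursion for the
    within- and between-population genetic identities."""
    gamma = (1.0 - mu) ** 2                # neither of the two genes mutates
    a = (1.0 - m) ** 2 + m ** 2 / (n - 1)  # genes sampled in one population came from the same population
    b = (1.0 - a) / (n - 1)                # genes sampled in two populations came from the same population
    c = 1.0 / N                            # coalescence probability (haploid assumption)
    return [[gamma * a * (1 - c), gamma * (1 - a)],
            [gamma * b * (1 - c), gamma * (1 - b)]]


def eigenvalues_2x2(A):
    """Eigenvalues (largest first) of a 2x2 matrix with real eigenvalues."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = math.sqrt(trace ** 2 - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


# Hypothetical parameter values, chosen only for illustration.
A = island_matrix(n=10, N=1000, m=0.01, mu=1e-4)
lam1, lam2 = eigenvalues_2x2(A)
r1, r2 = math.log(lam1), math.log(lam2)
print(f"lambda1 = {lam1:.6f}, lambda2 = {lam2:.6f}")
print(f"r1 = {r1:.3e} (asymptotic), r2 = {r2:.3e} (transient)")
```

With both eigenvalues between 0 and 1, both rates are negative, and the smaller eigenvalue gives the rate of larger magnitude, which is why it governs the fast, transient part of the dynamics.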

We now want to simplify the expression of the rates of change of genetic diversity. To do so, we rewrite Equation 5 and use the fact that *bc* ≪ (1 − *a*(1 − *c*) + *b*)^{2} (as *m* ≪ 1 and *N* ≫ 1), which gives Equation A6.

Considering that migration rates and mutation rates are small and that *N* ≫ 1, we can neglect terms in *m*^{2}, *μ*^{2}, *m*/*N*, and *mμ*. Equation A6 then simplifies to Equation 6, as ln(1 − *x*) = −*x* + *o*(*x*).

### Respective length of the asymptotic and transient dynamics periods

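We denote *t*_{1} and *t*_{2} as the times needed for the two exponential functions to decay to a fraction *α* of their initial values, where *α* ∈ ]0; 1] (Equation A8). We substitute *r*_{1} and *r*_{2} from Equation 6 into Equation A8, and we obtain Equation 8. **F**_{*t*} then approximately follows the two-exponential dynamics with these rates.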

### Timescales separation

This section presents the conditions for *t*_{1} ≫ *t*_{2}. When *t*_{1} ≫ *t*_{2}, the dynamics of **F**_{*t*} can be approximated as in Equation A10.

Equation A10 decomposes the dynamics of **F**_{*t*} into two timescales: a *transient period* of length *t*_{2} and an *asymptotic period* of length *t*_{1} − *t*_{2} ≃ *t*_{1}. For *t* > *t*_{1}, the genetic identity is close to its equilibrium value **F**^{eq}, so *t*_{1} can be interpreted as the duration of the disequilibrium period.
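The durations *t*_{1} and *t*_{2} can be related to the decay constants with a short worked step. As a sketch, assume that *t*_{*i*} is the time at which the exponential term with rate *r*_{*i*} has decayed to a fraction *α* of its initial value (this reading of the definition is an assumption):

```latex
\begin{align}
e^{r_i t_i} = \alpha
  \;\Longrightarrow\;
  t_i = \frac{\ln \alpha}{r_i}, \quad i \in \{1, 2\},
  \qquad \text{so that} \qquad
  \frac{t_1}{t_2} = \frac{r_2}{r_1} .
\end{align}
```

Because both *r*_{*i*} and ln *α* are nonpositive, the durations are nonnegative, and because |*r*_{2}| > |*r*_{1}|, we have *t*_{1} > *t*_{2}. Under this reading, the ratio of the two durations does not depend on the choice of *α*.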

Equation A10 holds if a *t* exists that satisfies proposition (A12).

We can demonstrate that proposition (A12) depends only on the ratio of *t*_{1} to *t*_{2} (Equation 7). Proposition (A12) leads to a condition on this ratio. Assuming that *m* and *μ* are small, and that *N* is large, this ratio is approximately given by Equation A13 (from Equation 8).

From Equation A13 we can derive the conditions of the timescales separation of the dynamics of genetic diversity (*i.e.*, the parameter values for which *t*_{1} ≫ *t*_{2}). When *n* > 14, differences are the highest (*t*_{1} ≫ *t*_{2}) in the domain where *m* > *μ*. When *n* ≤ 14, the same conditions apply for *t*_{1} ≫ *t*_{2}, except in a restricted domain (see Figure A1). Indeed, these conditions (A14) can be expressed in terms of *θ* and *M*.

For *θ* ≫ 1, we can neglect terms that do not contain *θ*, and conditions (A14) simplify to *M* ≫ *θ*.

For *θ* ≪ 1, terms that contain *θ* can be neglected in Equation A14, which yields the following conditions:*A* = 4*n* − 1. The size of this domain decreases when *n* increases (Figure A1), and Equation A10 is valid for any value of *M* in the domain *θ* ≪ 1 when *n* > 14 (as conditions (A16) are relaxed when 4*n* > *A* + 1, with

## Appendix B: Peak of Genetic Diversity Generated by a Reconnection Event

We first derive the value of the peak of within-population genetic diversity, Δ*H*_{s}, and the transient excess of between-population genetic diversity, Δ*H*_{b}, generated by a reconnection event. Second, we characterize their dependence on the migration rate, mutation rate, size, and number of populations.
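Before the analytical derivation, the phenomenon can be illustrated numerically. The sketch below iterates an assumed standard island-model recursion for genetic identities (haploid populations, *c* = 1/*N*), starting from the isolation equilibrium and switching to a migration rate *m* > 0. This recursion may differ in detail from Equation 3, the function names and parameter values are hypothetical, and the reported excess (maximum of *H*_{s} minus its connected equilibrium value) is a numerical analog of Δ*H*_{s} rather than its exact definition.

```python
def coefficients(n, N, m, mu):
    """Coefficients of the assumed island-model recursion."""
    gamma = (1.0 - mu) ** 2                # neither gene mutates
    a = (1.0 - m) ** 2 + m ** 2 / (n - 1)  # same origin, genes sampled within a population
    b = (1.0 - a) / (n - 1)                # same origin, genes sampled between populations
    c = 1.0 / N                            # coalescence probability (haploid assumption)
    return gamma, a, b, c


def step(fs, fb, params):
    """One generation for the within- (fs) and between-population (fb) identities."""
    gamma, a, b, c = params
    same = c + (1.0 - c) * fs              # identity of two genes drawn from one population
    return (gamma * (a * same + (1.0 - a) * fb),
            gamma * (b * same + (1.0 - b) * fb))


def equilibrium(params):
    """Fixed point of the recursion, solved as a 2x2 linear system (Cramer's rule)."""
    gamma, a, b, c = params
    m11, m12 = gamma * a * (1.0 - c), gamma * (1.0 - a)
    m21, m22 = gamma * b * (1.0 - c), gamma * (1.0 - b)
    v1, v2 = gamma * a * c, gamma * b * c
    det = (1.0 - m11) * (1.0 - m22) - m12 * m21
    return ((v1 * (1.0 - m22) + m12 * v2) / det,
            ((1.0 - m11) * v2 + m21 * v1) / det)


def reconnection_peak(n, N, m, mu, generations=20000):
    """Track H_s = 1 - F_s after reconnecting populations that were fully isolated."""
    fs, fb = equilibrium(coefficients(n, N, 0.0, mu))  # isolation equilibrium (m = 0)
    connected = coefficients(n, N, m, mu)
    hs_eq = 1.0 - equilibrium(connected)[0]
    peak_hs, peak_t = 1.0 - fs, 0
    for t in range(1, generations + 1):
        fs, fb = step(fs, fb, connected)
        if 1.0 - fs > peak_hs:
            peak_hs, peak_t = 1.0 - fs, t
    return peak_hs, peak_t, hs_eq


# Hypothetical parameter values, chosen only for illustration.
peak_hs, peak_t, hs_eq = reconnection_peak(n=10, N=1000, m=0.01, mu=1e-4)
print(f"peak H_s = {peak_hs:.3f} at generation {peak_t}; equilibrium H_s = {hs_eq:.3f}")
```

For these illustrative values, the within-population diversity rises well above its connected equilibrium shortly after reconnection and then relaxes slowly toward it, which is the qualitative behavior that the derivation below quantifies.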

### Derivation of ΔH

We denote the vector of genetic diversity excess at the end of the transient dynamics phase as **ΔH**, with components Δ*H*_{s} and Δ*H*_{b}.

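The genetic diversity at the end of the transient dynamics phase (*t* ≃ *t*_{2}) is approximately determined by the asymptotic term alone. Thus **ΔH** is approximately determined by **C**_{1} = *y*_{10}**U**_{1} (from Equations A2 and A3). To derive the value of **C**_{1}, we compute **U**_{1} and *y*_{10}. The eigenvector **U**_{1} is associated with *λ*_{1}, and its expression involves Δ_{red} = (1 − *a*(1 − *c*) + *b*)^{2} − 4*bc*. We set *u*_{11} = 2(1 − *a*), which fixes the second component of **U**_{1}. The eigenvector **U**_{2}, associated with *λ*_{2}, is obtained in the same way.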

Given the initial value of genetic identity, substituting **U**_{1} and **U**_{2} in Equation A3 leads to the value of *y*_{10}.

Assuming isolation equilibrium for the initial identity simplifies the expression of **ΔH**. Given that *bc* ≪ (1 − *a*(1 − *c*) + *b*)^{2} (as *m* ≪ 1 and *N* ≫ 1), **ΔH** simplifies further, leading to Equation B5.

Assuming that migration and mutation rates are always small, we can neglect terms in *m*^{2}, *μ*^{2}, and *mμ*. Using *α* = 0.05 in Equation B5, we obtain Equation 9.

Equation 9 provides a good approximation of the size of the peak of within-population genetic diversity Δ*H*_{s} and of the transient excess of between-population genetic diversity Δ*H*_{b} in the entire parameter domain. Figure B1, A and C, and Figure B2, A and C, represent the exact (solid line) and approximate (dashed line; from Equation 9) values of Δ*H*_{s} and Δ*H*_{b}, respectively, as a function of *θ* and *M*, for *n* = 2 and *n* = 20; we can see that the true and approximate values are very close. Figure B1, B and D, and Figure B2, B and D, represent the absolute error resulting from the use of Equation 9 as an approximation of Δ*H*_{s} and Δ*H*_{b}, respectively, instead of their exact values, for *n* = 2 and *n* = 20. Discrepancies between Equation 9 and the true values of Δ*H*_{s} and Δ*H*_{b} can first come from the assumption that *m* ≪ 1, *μ* ≪ 1, and *N* ≫ 1 and second from the assumption that a time *t*_{2} exists at which the transient term has vanished while the asymptotic term has barely changed.

### Maximum peak of diversity after a connection event

The peak of genetic diversity increases monotonically with *M* and decreases monotonically with *θ* (for all *m*, *μ*, *n*, and *N*). The peak of diversity is maximized for intermediate values of *n*, at the value *n** that solves Equation B6. Neglecting terms in *μ*^{2}, *m*^{2}, and *μm*, and for low *θ* and high *M*, solving Equation B6 using Equation 10 for the peak of genetic diversity leads to Equation 11. When *n* = *n**, Equation 10 gives the corresponding maximum peak of genetic diversity.

## Footnotes

*Communicating editor: N. A. Rosenberg*

- Received August 27, 2012.
- Accepted December 26, 2012.

- Copyright © 2013 by the Genetics Society of America