Recent range expansions have probably occurred in many species, as they often happen after speciation events, after ice ages, or after the introduction of invasive species. While it has been shown that range expansions lead to patterns of molecular diversity distinct from those of a pure demographic expansion, the fact that many species do live in heterogeneous environments has not been taken into account. We develop here a model of range expansion with a spatial heterogeneity of the environment, which is modeled as a gamma distribution of the carrying capacities of the demes. By allowing temporal variation of these carrying capacities, our model becomes a new metapopulation model linking ecological parameters to molecular diversity. We show by extensive simulations that environmental heterogeneity induces a loss of genetic diversity within demes and increases the degree of population differentiation. We find that metapopulations with low average densities are much more affected by environmental heterogeneity than metapopulations with high average densities, which are relatively insensitive to spatial and temporal variations of the environment. Spatial heterogeneity is shown to have a larger impact on genetic diversity than temporal heterogeneity. Overall, temporal heterogeneity and local extinctions are not found to leave any specific signature on molecular diversity that cannot be produced by spatial heterogeneity.
RANGE expansions are a recurrent phenomenon in the history of many species. If one assumes that speciation occurs in a restricted portion of an ancestral species range, the fact that current species have a wide geographic distribution implies that they have spread from their initial location. This is, for instance, thought to be the case of modern humans (e.g., Cavalli-Sforza and Feldman 2003). Climatic changes such as ice ages have also triggered episodes of range contractions and expansions in many plant and animal species (e.g., Taberlet et al. 1998; Hewitt 2000; Shapiro et al. 2004; Tzedakis et al. 2004; Smith and Farrell 2005). Finally, invasive species that are introduced into new and favorable habitats often spread from their introductory site and then quickly expand to a large territory by a series of migrations and colonization events (e.g., Silva et al. 2002; Smith et al. 2004).
Austerlitz et al. (1997) studied the pattern of molecular diversity after a colonization process in a one-dimensional stepping-stone model. They showed that the expansion leads to a series of bottlenecks inducing a gradient of differentiation from the initial deme unless migration rates between demes are very large. Thus, range expansions can somehow be viewed as a series of successive founder events (Austerlitz et al. 1997). Le Corre and Kremer (1998) analyzed the effects of these founder events on within-population heterozygosity and on genetic differentiation (as measured by FST) in two models of range expansions: a linear array of demes (one-dimensional stepping-stone model), where dispersal occurs only between neighboring demes, and an island model representing an extreme case of long-distance dispersal. While an increase in FST and a decrease in intrademe heterozygosity were found in both models, these effects were much more pronounced in the linear stepping-stone model. A recent study has shown that the pattern of molecular diversity observed after a range expansion is different from that expected after a pure demographic expansion and depends mainly on the age of the expansion as well as the number of migrants exchanged between neighboring demes (Ray et al. 2003). By simulating a range expansion in a two-dimensional stepping-stone world, Ray et al. (2003) showed that the expected distribution of the number of differences between pairs of genes (also called the mismatch distribution) is bimodal if the number of genes entering each deme per generation is not large (<50). This is in contrast to the unimodal distributions observed for larger migrations rates between neighboring demes, which are very similar to those obtained after a large demographic expansion in an unsubdivided population (Slatkin and Hudson 1991; Rogers and Harpending 1992). These results were confirmed analytically for a simple instantaneous expansion in an infinite-island model (Excoffier 2004). Ray et al. (2003) also found that summary statistics used in tests of selective neutrality (like Tajima's D and Fu's FS) showed significant negative values only in demes exchanging a larger number of migrants. Contrastingly, demes receiving < ∼20 genes per generation would not show any sign of departure from demographic equilibrium despite being part of a large expanding population.
All expansion models described so far assumed that deme sizes and migration rates had been constant over time, as expected in a homogeneous and stationary environment. However, many species do not meet these assumptions, since they are actually subdivided into locally breeding demes of different size showing metapopulation dynamics, with extinction and recolonization (e.g., Hanski and Gilpin 1997). In a pioneering work, Slatkin (1977) extended the work of Maruyama (1970) and introduced a genetic metapopulation model with two different modes of recolonization: the propagule-pool model in which all colonists originate from a single, randomly chosen deme and the migrant-pool model, in which the colonists are drawn randomly from the whole population. The resulting pattern of genetic differentiation is complex and depends on the pattern of extinctions and recolonizations, since random extinctions can be considered as another form of genetic drift increasing differentiation between demes, while recolonizations can be seen as founder effects contributing to gene flow between demes, making them potentially more similar. Pannell and Charlesworth (1999) observed that colonization was more likely to follow the propagule-pool model in highly structured metapopulations, since colonists would tend to originate from neighboring colonies. Wade and McCauley (1988) reframed the central equations from Slatkin's (1977) models in terms of FST, a measure of the genetic differentiation among demes. They showed that in Slatkin's (1977) metapopulation model without mutation, population differentiation is indeed reduced when the number of colonists exceeds twice the number of migrants exchanged between demes. In all other cases, however, the process of extinction and recolonization increased population differentiation, and often very substantially. Whitlock and McCauley (1990) then generalized these recolonization models by introducing the probability of a common origin of colonizing gametes (Φ). In particular, they showed that population differentiation increases if the number of colonists is less than four times the number of migrants exchanged between demes when Φ = 0.5. This is an interesting result, since several studies of natural populations (e.g., Whitlock 1992a; Ingvarsson et al. 1997) estimated Φ to be ∼0.5.
More recently, Wakeley and Aliacar (2001) and Pannell (2003) studied the genealogical (coalescent) structure of samples drawn from an island model with recurrent extinctions and recolonizations under the generalized recolonization process of Whitlock and McCauley (1990). In line with previous studies of simple finite-island models without extinctions (Wakeley 1999), he showed that if the number of demes is large, there is a separation of timescales within the genealogical history of a sample, which consists of a brief “scattering” phase followed by a more ancient “collecting” phase that dominates the genealogical history. While the scattering phase is a stochastic and structured process shaping the pattern of variability within vs. between demes, the collecting phase is an unstructured coalescent process (Kingman 1982a,b) with an effective size determined by the migration and extinction–recolonization pattern of the metapopulation. While this approach leads to many insights about the effects of metapopulation dynamics on genetic diversity, it is not spatially explicit and assumes (like all previously presented metapopulation models) that all demes have the same size. In an early attempt to account for variability in population densities, Whitlock (1992b) studied the effect of variation in deme size, migration rate, and extinction rate on the probability of identity by descent. He concluded that this kind of temporal or spatial variation can significantly affect the amount of genetic variation among demes. Wakeley (2001) also studied a finite-island model with variation in deme sizes and derived expressions for the effective populations size Ne. He showed that size variation among demes always reduces Ne because the decrease in coalescent times in small demes more than offsets its increase in large demes. Wakeley and Aliacar (2001) then analyzed this model with extinction and recolonization following the migrant-pool model of Slatkin (1977) and showed that the expected number of segregating sites within deme was correlated with deme size and migration rate. Demes facing a higher extinction rate would also be less polymorphic unless the number of new incoming colonists was very large compared to the extinction rate.
Contrary to genetic metapopulation models where an arbitrary probability of extinction is the key parameter distinguishing metapopulation from other subdivided population models (Levins 1969; Slatkin 1977), ecological metapopulation models are often spatially explicit and more realistic, as they often use environmental information like patch size and connectivity to compute extinction probabilities (see, e.g., Hanski and Ovaskainen 2003). An example of such an ecological model is the incidence function model developed by Hanski (1994), which uses deme area, deme quality, and deme coordinates to predict the dynamics or the fraction of occupied demes in any metapopulation. However, while incorporating more biological information than genetic metapopulation models, ecological metapopulation models do not predict patterns of genetic diversity but rather metapopulation stability or the number of occupied demes.
In an attempt to explicitly link environmental information to genetic data, Currat et al. (2004) have developed a computer program (SPLATCHE) to simulate genetic data, taking spatial environmental heterogeneity into account. Under the SPLATCHE framework, the demographic and migration history of a set of subpopulations arranged on a two-dimensional lattice is first simulated and then used to generate the genealogy of genes sampled at one or several locations under a coalescent backward approach (Currat et al. 2004). The separation of demographic and genetic simulations allows one to include more realism into the demographic model, while keeping the genetic simulations simple and fast. Spatial environmental heterogeneity is handled by assigning demes to particular eco-vegetation types, which are assumed to sustain different numbers of individuals and where migration may be more or less difficult. One can thus explicitly associate a given carrying capacity and a given migration friction to each ecotype, which will influence the demographic history of the demes and therefore their expected neutral genetic diversity. Even though the current SPLATCHE framework is relatively flexible and can even deal with some form of temporal changes of the environment (Ray et al. 2003), it assumes that a given type of environment is uniformly occupied, which is quite unrealistic. It may indeed be more realistic to assume that a particular environment is not uniformly occupied by a given species, which could therefore have a patchy distribution due to chance alone. Even though a given environment may be perceived as uniform, it will almost always present microheterogeneities in the distribution of resources (e.g., water, sunlight, nutrients, preys) or various physical conditions (e.g., humidity, temperature), which should translate into different population densities at different locations. For instance, even though most populations living in unfavorable environments (e.g., deserts) have low densities, a few populations may show much larger densities due to microclimatic fluctuations or access to rare resources (e.g., water).
In this article, we extend the SPLATCHE framework (Currat et al. 2004), and we introduce a convenient way to model spatial and temporal environmental heterogeneity with just two new parameters. We model spatial heterogeneity of the environment by assuming that the carrying capacities of the demes found in a given type of environment follow a gamma distribution, and we use the shape parameter (α) of this distribution to control the degree of spatial heterogeneity. Temporal heterogeneity is controlled by a second parameter, which is the autocorrelation of carrying capacities over time (ρ). We then study the effect of various types of environmental heterogeneity on different aspects of molecular diversity within and between demes by extensive simulations.
MATERIALS AND METHODS
Our simulation framework extends that introduced by Currat et al. (2004), who proposed to simulate the effects of heterogeneous environments on the molecular diversity after a range expansion in two steps: a first forward step simulates population densities and migration events, and a second backward step simulates molecular diversity on the basis of the demography simulated in the first step using a coalescent approach. While this approach is more fully described elsewhere (e.g., Ray et al. 2003; Currat et al. 2004), we briefly outline here these two steps.
Following Ray et al. (2003), we performed simulations on a two-dimensional grid of 50 × 50 demes, representing subpopulations (demes) of haploid individuals interconnected by migration. The colonization of the grid arbitrarily starts at the center of the lattice (at position <25;25>). Each generation, a growth phase is followed by an emigration phase from every occupied deme to neighboring demes with an emigration rate per gene lineage m, which may vary over simulations, but which is assumed constant over time and space for a given simulation. The density of each deme is logistically regulated, with an intrinsic growth rate r (we used here a constant r value of 0.5) and a carrying capacity K depending on the environment assigned to the deme. In our simulations, only one type of environment is used for the whole metapopulation. Note, however, that the carrying capacity of the ith deme, Ki, may vary according to spatial and temporal heterogeneity, as described later in this section. The number of emigrants sent by the ith deme during the migration phase is thus Nitm, where Nit is the local deme size at time t. The Nitm emigrants are then distributed equally among the four neighboring demes. The density of the demes is then regulated in the next growth phase. The way migration is modeled here differs from Slatkin's (1977) model and its followers (see, e.g., Pannell and Charlesworth 2000), where migration is considered as the proportion of individuals that are replaced by new immigrants in each deme. In our model, the number of emigrants depends on local deme size, whereas the number of immigrants depends on the size of surrounding demes. In the case of a spatial variation in deme size, this would result in small demes receiving more migrants than they send to larger neighbors, this unbalance being then logistically regulated. In the SPLATCHE framework, migration also depends on a friction parameter that may differ according to the environment, but for the sake of simplicity we have chosen here to assume a uniform friction in the whole environment, such that migration patterns will just depend on deme densities and emigration probability m per lineage.
To get results comparable to those of a previous study of a range expansion in a uniform environment (Ray et al. 2003), we simulated the same expansion time of 4000 generations in the past. This time was then arbitrarily selected because these authors were interested in human evolution and 4000 generations roughly correspond to the emergence of modern humans ∼120,000 years ago. However, results for expansion times of 1000 or 10,000 generations are reported in supplemental Tables S1–S4 at http://www.genetics.org/supplemental/. Note, however, that we did not consider very recent range expansions, because up to 1000 generations are necessary to colonize our simulated world, such that our results would not necessarily apply to the case of invasive species. We always performed 1000 demographic simulations per set of parameters.
The program SPLATCHE was modified to simulate the genealogy of sampled genes to take environmental heterogeneity into account as described below. The genealogy of a sample of 30 genes in a deme located at position <5;5> was simulated to study intrademe DNA sequence diversity, and we allowed for multiple coalescent events per generation per deme. Mutations were assumed to follow a Poisson process with rate μ = 0.001 for a sequence of 300 bp in length, such that four mutations were expected to have occurred since the expansion along any lineage, which roughly corresponds to the range of values observed for human mtDNA HV1 sequences (e.g., Santos et al. 2005). Several summary statistics were computed on the simulated DNA sequences using the software package ARLEQUIN ver. 3 (Excoffier et al. 2005): (i) measures of genetic diversity within deme: average number of pairwise differences π and number of segregating sites S; (ii) measures of genetic diversity among demes: FST and average number of pairwise differences between demes πxy; as well as (iii) statistics to detect departure from mutation–drift equilibrium: Tajima's D (Tajima 1989) and Fu's FS (Fu 1997). To calculate FST and πxy values, we sampled 30 additional genes at position <25;45> and we calculated FST between these two demes as (Weir and Cockerham 1984). Because the number of demes is high (2500), the average number of pairwise differences between demes πxy is equivalent to the expected number of pairwise differences between two randomly chosen genes in the population. We always performed 10 genetic simulations per demographic simulation, such that a total of 10,000 genetic simulations were performed for each set of parameters.
Spatially heterogeneous environments:
We assume that differences in environmental conditions affect the carrying capacity (K) of the demes. While a heterogeneity in carrying capacities over space could certainly result from adaptive or selective processes, we concentrate on neutral diversity at an unlinked marker, the diversity of which depends only on population demography. Environmental heterogeneity was thus modeled by assuming that the carrying capacity of each deme was a random variable drawn from a gamma distribution with mean , shape parameter α, and variance 2/α. The gamma distribution is commonly used to reflect heterogeneous processes like spatial heterogeneity of mutation rates along chromosomes (e.g., Nei et al. 1976; Wakeley 1993; Aris-Brosou and Excoffier 1996). In the context of environmental heterogeneity, small α-values correspond to environments that are spatially more heterogeneous than those simulated with larger α-values. We considered here four different degrees of spatial heterogeneity: large (α = 0.5), intermediate (α = 1), small (α = 5), and none (, implying that ). The distributions of the resulting carrying capacities for = 100 are reported in Figure 1, as well as random examples of their corresponding spatial distribution. As can be seen in Figure 1, α-values <1 lead to L-shaped distributions of carrying capacities, indicating that most of the demes have very small carrying capacities, while a few may have large densities, which could be considered as density hotspots. Such a distribution could correspond to population densities observed in very inhospitable habitats such as deserts, in unsaturated environments, or in mixed environments such as cultivated landscapes, with a very patchy distribution of resources and individuals. α-values >1 lead to bell-shaped distributions of carrying capacities and would correspond to mildly heterogeneous and saturated environments, where resources are more evenly distributed. Our simulations results should therefore cover a wide range of situations for species living in very different environments. Globally, the proportion of empty demes (indicated by open squares in Figure 1B) is largest for highly heterogeneous environments. While the simulated samples are always drawn from the same deme across simulations, the initial size (going backward in time) of this deme varies over simulations. Cases where the initial deme size was zero have been discarded. Note also that we did not assume any spatial autocorrelation between the carrying capacities of neighboring demes.
Temporally heterogeneous environments:
Temporal heterogeneity was incorporated into the demographic simulations by changing the carrying capacity of each deme every generation, while keeping the mean carrying capacity of the demes constant over time. We propose the autocorrelation coefficient (ρ) of the carrying capacities between generations as a measure of the degree of temporal heterogeneity. Small ρ-values therefore indicate a large temporal variability, whereas large ρ-values indicate small temporal changes in carrying capacities for a given deme. Note that temporally stable environments can be simulated by setting ρ to 1. For fixed α- and ρ-values, the absolute magnitude of the temporal change in carrying capacity of a deme will depend on its current value. It implies that the probability for a deme to go extinct will depend on its current size and will thus be larger in currently small than in currently large demes, which seems realistic. This situation differs from a previous genetic metapopulation model (e.g., Slatkin 1977), where the extinction probability is assumed constant over time and identical for all demes. Our model is therefore closest to a conventional genetic metapopulation model when we assume very low ρ-values (ρ → 0), where the probability of extinction is not correlated to deme size and is equal to the expected proportion of empty demes for given α- and -values. In temporally stable environments (ρ ≈ 1), only demes with currently small carrying capacities face the risk of extinction and empty demes will stay empty longer, since they cannot be recolonized as long as their carrying capacity is zero.
The results of our simulations are divided into three sections. We first present results obtained in the absence of environmental heterogeneity and outline here several aspects of DNA sequence diversity within and between samples. We then present comparable results under a model with spatial heterogeneity and finally results obtained with both spatial and temporal environmental heterogeneity.
The pattern of genetic diversity within and between demes is reported in Table 1 for various amounts of spatial heterogeneity, various average carrying capacities (), and two different rates of migration (m = 0.05 and m = 0.2). When considering homogeneous environments (α = ∞), we find results equivalent to those described in Ray et al. (2003). Measures of intrademe diversity, like the average number of pairwise differences (π) and the number of polymorphic sites (S), increase with the number of migrants exchanged between neighboring demes (m), while their coefficient of variation decreases. On the other hand, demes become less differentiated (as measured by FST) with increasing m values. Note that in all cases the overall genetic diversity as measured by πxy is not much affected by the different demographic parameters (but its coefficient of variation is slightly larger for small m values), suggesting that the increase of FST with a lower number of migrants exchanged between demes is primarily due to a reduction in genetic diversity within deme. Note, however, that we have measured FST by considering two demes geographically quite far from each other. The coalescence time of two genes drawn from these two demes will always be close to the expansion time, irrespective of the heterogeneity of the environment. It is possible that the coalescence time of two genes drawn from two geographically close demes would be more affected by environmental heterogeneity and therefore that FST would also depend on the level of genetic diversity between demes in that case.
In Table 2 we present the performance of two statistics used to detect departure from selective neutrality and population equilibrium: Tajima's D and Fu's FS. In homogeneous environments (α = ∞), both statistics change from very significant negative values when m is large (>20) to mostly nonsignificant negative values when m ≤ 20, even reaching positive values when m < 10. These results are again consistent with those from Ray et al. (2003).
Results on the amount of genetic diversity within and between populations after an expansion in a heterogeneous environment are presented in Table 1. We find that an increase in the degree of environmental heterogeneity has globally the same effect for different sets of - and m-values: with decreasing α-values, genetic diversity within sample is reduced (π and S decrease), while it is more variable between replicates (the coefficients of variation of π and S increase with α), and FST is increased while its coefficient of variation remains relatively unaffected by α. However, these effects appear much less marked for large than for small m values. Interestingly, FST levels seem more affected by average levels of gene flow (as measured by m) than by environmental heterogeneity: for instance, FST changes from 0.01 for m = 200 to 0.35 for m = 2.5 in a constant environment, while FST is increased only by a factor of 2–4 for a fixed m value between a homogeneous (α = ∞) and a highly heterogeneous environment (α = 0.5) (Table 1). Note that πxy is also marginally affected by the degree of environmental heterogeneity. While the age of the expansion will affect the extent and pattern of molecular diversity, the effect of environmental heterogeneity is qualitatively similar for different expansion times and is not discussed hereafter (see supplemental Tables S1–S4 at http://www.genetics.org/supplemental/, reporting results analogous to those in Tables 1 and 2 for expansion times of 1000 or 10,000 generations).
In Figure 2, we report FST values as a function of the average carrying capacity of the environment for different degrees of spatial heterogeneity, but for a constant m value of 10. As already shown in Table 1, the level of genetic structure is lowest for the homogeneous environment and increases with environmental heterogeneity. However, Figure 2 shows that FST is also very sensitive to the average size of the demes in heterogeneous environments, which is much less the case in homogeneous environments. The level of genetic structure actually increases with lower -values, and thus FST does not simply depend on the product m, as shown previously in a uniform and spatially unstructured environment (Excoffier 2004).
The variability of Tajima's D- and Fu's FS-statistics computed under scenarios with environmental heterogeneity is reported in Table 2. Both statistics are shifted toward less negative or more positive values with increasing levels of environmental heterogeneity, and they tend to become less often significant. For instance, for m = 40 the power of Tajima's D-statistic to detect past expansion is reduced from 96% in homogeneous environments down to only 13% in heterogeneous environments with α = 0.5.
In Table 3, we report the effect of different levels of temporal heterogeneity (ρ) on genetic diversity within samples for the case of m = 40, because it is for this case that Fu's FS- and Tajima's D-statistics show the largest sensitivity to spatial heterogeneity (Table 2). We can see that temporal heterogeneity does not affect much molecular diversity when spatial heterogeneity is weak (α = 5) or intermediate (α = 1). It has, however, a marked effect when environmental heterogeneity is strong (α = 0.5), where π and S drop sharply in highly changing environments (ρ < 0.4). It seems that S is more affected by temporal heterogeneity than π, since there is only a 14% decrease in π (6.5 to 5.6) when the autocorrelation coefficient (ρ) changes from 1 to 0, while S decreases by 38% (35 to 22 segregating sites) when ρ goes to 0. Interestingly, the coefficients of variation of S and π seem almost unaffected by temporal heterogeneity for all levels of spatial heterogeneity considered here. Qualitatively similar results are shown for m = 10 in supplemental Table S5 at http://www.genetics.org/supplemental/.
In highly fluctuating environments (i.e., when ρ < 0.4 and α = 0.5 in Table 3), genetic diversity within demes is considerably reduced, while the overall genetic diversity as measured by πxy is only slightly decreased, resulting in a sharp increase in FST. We therefore see that increasing temporal heterogeneity has a similar effect as increasing spatial heterogeneity. Note that while it is quite difficult to strictly compare our model to previous metapopulation models, the observed magnitude of the increase in FST due to temporal changes in spatially highly heterogeneous environments is similar to the predictions of Wade and McCauley (1988). For instance, in the case of = 200, ρ = 0, and α = 0.5, we observe that 5.6% of the demes go extinct per generation on average and this extinction rate is expected to result in a 50% increase in FST under Wade and McCauley's (1988) migrant-pool colonization model as compared to the no-extinction case, whereas our simulations show a 46% increase (from 0.168 for ρ = 1 to 0.248 for ρ = 0). So, despite the spatially explicit context of our simulations and the assumed past expansion, our results are in relatively good agreement with former predictions. However, our simulations reveal a previously unnoted behavior for spatially very heterogeneous but temporally slightly changing environments. Indeed, as compared to temporally stable environments, we find for ρ = 0.8 and α = 0.5 (see Table 3) that genetic diversity within deme (π) increases and that FST decreases, before increasing again for lower ρ-values.
In Table 4, we report the influence of temporal heterogeneity on the performance of Fu's FS- and Tajima's D-statistics. When environmental heterogeneity is strong or intermediate, both statistics are shown to change from negative values in the case of a temporally constant environment to less negative or even positive values when ρ = 0. In the case of spatially highly heterogeneous environments (α = 0.5) the power of these statistics to detect departure from population equilibrium is considerably lowered in the presence of temporal heterogeneity, since the probability to get significant statistics is considerably less than the significance level and even drops to zero for Fu's FS when ρ ≤ 0.4. For a smaller m value of 10, these statistics will have even lower power to evidence past range expansion (see supplemental Table S6 at http://www.genetics.org/supplemental/).
Effect of environmental heterogeneity on mismatch distributions:
We report empirical distributions of the number of differences between pairs of genes (mismatch distributions) in Figure 3 for a subset of the cases described in Table 1. As reported previously (Ray et al. 2003), the expected mismatch distributions are unimodal for (m > 50) and this mode is indicative of the age of the spatial expansion (Slatkin and Hudson 1991; Rogers and Harpending 1992; Excoffier 2004). When the environment is homogeneous, expected mismatch distributions are clearly bimodal for m values <50, with the first mode at zero due to recent coalescent events (Figure 3). We find overall that spatial heterogeneity affects mainly mismatch distributions if m is also <50 (Figure 3). In that case, the first mode of the mismatch distribution becomes more important with increasing levels of spatial heterogeneity, which is again due to an increase in the proportion of recent coalescent events. We also note a shift of the second mode toward lower values with increasing environmental heterogeneity, suggesting that coalescent events occur just after the onset of the expansion and thus probably during the range expansion and not mainly when genes are found in the ancestral deme as in homogeneous environments (Ray et al. 2003; Currat and Excoffier 2005). When m > 50, the mismatch distribution is only weakly affected by any degree of spatial heterogeneity.
In Figure 4, we show the mismatch distributions expected after a range expansion in spatially heterogeneous environments for different degrees of temporal heterogeneity ρ. Temporal and spatial heterogeneity have also similar effects on the mismatch distribution, as pairs of genes have more recent ancestors in temporally more fluctuating environments. This pattern is, however, found only in spatially highly heterogeneous environments (Figure 4, α = 0.5), since we find no difference between the mismatch distributions taken from samples in spatially more homogeneous environments (α = 5, Figure 4). Finally, we note that temporal heterogeneity also adds to the variance of the mismatch distribution when environmental heterogeneity is not weak (if α ≤ 1, see Figure 5).
Effect of environmental heterogeneity on genetic diversity:
We have shown that levels of genetic diversity both within and between demes are affected by environmental heterogeneity after a relatively old range expansion. While genetic diversity is smaller in samples taken in more heterogeneous environments, the overall degree of genetic structure is usually stronger. These results can be understood by considering two factors that are affected by environmental heterogeneity. First, the total number of demes where gene lineages can actually migrate will decrease with larger environmental heterogeneity, implying that pairs of genes will have a larger chance to occur in the same deme and thus to coalesce with smaller α-values. Second, the rate of a coalescent event within deme is increased in environments with larger environmental heterogeneity, because the increase in coalescent probability in small demes is more than compensated by its decrease in large demes (Wakeley and Aliacar 2001). Therefore, increasing environmental heterogeneity has the same effect as a decrease in the Km value in a homogeneous environment. Note, however, that there is one exception to this general rule, which occurs in spatially highly heterogeneous environments, where low levels of temporal heterogeneity lead to a slight decrease in FST. In that case, it seems that slight temporal deme size fluctuations can prevent some recent coalescent events from happening as small demes can have their size increased with temporal heterogeneity. Therefore, slight temporal heterogeneity leads to a virtual homogenization of a highly fragmented and heterogeneous landscape.
Contrary to what we found in homogeneous environments (Ray et al. 2003; Excoffier 2004), the pattern of diversity is not just sensitive to the average number of genes exchanged between neighboring demes (m), but it is highly affected by the degree of environmental heterogeneity (Table 1, Figure 2). FST also depends strongly on and m separately in highly heterogeneous environments and very little in a homogeneous environment. For instance, when α ≤ 1, FST is much larger for small ( = 20) than for large ( = 10,000) average deme sizes (Figure 5). In more homogeneous environments, the separate effects of and m tend to be very small on FST (Figure 5), but it also affects patterns of diversity within demes (see Table 1 for two distinct entries with m = 10 and α = ∞). Note that the relationship between genetic diversity and absolute deme size was previously unnoted in homogeneous environments (Ray et al. 2003), since deme sizes >100 were always used in previous simulation studies and this relationship is barely visible for large deme size. Overall, our results suggest that FST can reach high values in a structured population where demes exchange a large number of migrants if environmental heterogeneity contributes to a large variance in deme size. This effect is reinforced by additional temporal heterogeneity, such that the relationship between FST and migration rate in realistic and heterogeneous environments is far more complex than its expectation under the infinite-island model (Wade and McCauley 1988; Whitlock and McCauley 1999).
While our implementation of immigration rate depending on the size of the surrounding demes is certainly more realistic than previous approaches, it has the disadvantage that any variation in deme sizes will affect immigration rates, making it impossible to separate the effect of these two forces on genetic diversity. Under a finite-island model, Wakeley (2001) found that in contrast to a variation in deme sizes that always reduces the effective size of the population, variation in migration rates would decrease the effective size if 4m > 1 and increase it otherwise. Since most of our simulations correspond to cases where 4m ≫ 1 we expect that variation in immigration rates will reduce the effective size of our metapopulation and thus enhance the effect due to the variation in deme sizes.
The role of extinctions:
In our model, extinctions simply follow from temporal fluctuations in deme densities. Extinction rates depend on the global degree of temporal and spatial environmental heterogeneity, as well as the current densities of particular demes. The probability of going extinct is thus larger for small than for large demes, unless there is no correlation in deme sizes between generations. This implementation differs from previous genetic metapopulation models (e.g., Slatkin 1977; Maruyama and Kimura 1980; Wakeley and Aliacar 2001), where the probability of extinction was assumed constant over space and time. However, our results suggest that extinction is not a key parameter to predict a pattern of molecular diversity, since we find that temporal and spatial heterogeneities have virtually the same effect (see Tables 1 and 3). Any local reduction in genetic diversity due to temporal heterogeneity added to spatial heterogeneity could thus be obtained by a more pronounced spatial heterogeneity, without any extinction. This suggests that extinction rates should be extremely difficult to estimate in practice if there is some degree of environmental heterogeneity.
The results we obtain under our metapopulation model after a range expansion differ from previous ones by two other aspects. First, a study of an island model (Pannell and Charlesworth 1999) suggested that population turnover would significantly affect genetic diversity only if the extinction rate was of the same order of magnitude or greater than the migration rate. We actually find that genetic diversity is clearly affected even for extinction rates much smaller than the migration rate. If we estimate the extinction rate as the proportion of empty demes when ρ = 0, then for the case of = 200 and m = 0.2, extinction rates (ε) ≈ 0.056 and 0.005 for α = 0.5 and 1, respectively. From Table 3, it is clear that all aspects of genetic diversity within and between demes are affected by temporal heterogeneity for α = 0.5 (ε/m = 0.28), and π and S are found slightly lower than in the absence of extinctions for α = 1 (ε/m = 0.025). Second, the increased level of genetic structure measured by FST observed in previous metapopulation models (e.g., Wade and McCauley 1988; Whitlock and McCauley 1990) can be attributed to the effect of extinction and recolonization on the total amount of diversity maintained in the metapopulation (Pannell and Charlesworth 1999). However, in our case the total amount of diversity (as measured by πxy in Tables 1 and 3) is primarily affected by the expansion time and is relatively insensitive to spatial or temporal heterogeneity, and therefore the increase in FST is primarily due to a decrease in genetic diversity within deme with increasing environmental heterogeneity.
Detecting range expansions:
Several statistics and approaches have been proposed to detect past range expansions from patterns of molecular diversity (Harpending et al. 1998; Kimmel et al. 1998; Ray et al. 2003; Excoffier 2004; Hamilton et al. 2005). The typical signature of such expansions consists of long external branches of gene genealogies (Slatkin and Hudson 1991; Rogers and Harpending 1992) leading, for instance, to large negative Fu's FS- and Tajima's D-statistics. In homogeneous environments, these signatures of population range expansion are observed provided that the number of migrants (Km) exchanged between neighboring demes is > ∼20 (Ray et al. 2003), while in heterogeneous environments, we find that these signatures are visible only for Km values >50. In highly heterogeneous environments (α = 0.5), Fu's FS- and Tajima's D-statistics can even become positive for m ≤ 20 (see Table 1), which is usually indicative of an excess of recent coalescent events and therefore of recent bottlenecks. This burst of recent coalescent events, which certainly occurs in demes of small sizes, is confirmed when considering mismatch distributions, as the fractions of pairs of genes showing no differences increase for small α-values when m < 50 (Figure 2). Spatial and demographic bottlenecks thus make it more difficult to evidence past range expansions from molecular data in heterogeneous than in homogeneous environments. The use of these statistics and FST to detect selection, such as in genome scans (see, e.g., Storz et al. 2004; Teshima et al. 2006), may also be problematic. Test based on Fu's FS or Tajima's D-statistics would be expected to show many false negative results in the case of environmental heterogeneity, if it is not explicitly taken into account, while tests based on FST (Beaumont 1999; Beaumont and Balding 2004) should show an excess of false positives. A future challenge would certainly be to estimate the extent of environmental heterogeneity from molecular data.
We are grateful to Pierre Berthier and Nicolas Ray for their computing assistance, to Lutz Duembgen for statistical advice, and to two anonymous reviewers for their helpful comments. This work was supported by a Swiss National Science Foundation grant (no. 3100A0-100800) to L.E.
Communicating editor: J. Wakeley
- Received June 30, 2006.
- Accepted September 11, 2006.
- Copyright © 2006 by the Genetics Society of America