Serial Founder Effects During Range Expansion: A Spatial Analog of Genetic Drift

Slatkin, Montgomery; Excoffier, Laurent

doi:10.1534/genetics.112.139022

Abstract

Range expansions cause a series of founder events. We show that, in a one-dimensional habitat, these founder events are the spatial analog of genetic drift in a randomly mating population. The spatial series of allele frequencies created by successive founder events is equivalent to the time series of allele frequencies in a population of effective size k_e, the effective number of founders. We derive an expression for k_e in a discrete-population model that allows for local population growth and migration among established populations. If there is selection, the net effect is determined approximately by the product of the selection coefficients and the number of generations between successive founding events. We use the model of a single population to compute analytically several quantities for an allele present in the source population: (i) the probability that it survives the series of colonization events, (ii) the probability that it reaches a specified threshold frequency in the last population, and (iii) the mean and variance of the frequencies in each population. We show that the analytic theory provides a good approximation to simulation results. A consequence of our approximation is that the average heterozygosity of neutral alleles decreases by a factor of 1 – 1/(2k_e) in each new population. Therefore, the population genetic consequences of surfing can be predicted approximately by the effective number of founders and the effective selection coefficients, even in the presence of migration among populations. We also show that our analytic results are applicable to a model of range expansion in a continuously distributed population.

RANGE expansion through a series of colonization events can produce geographic patterns in allele frequencies that are quite different from what is expected in equilibrium populations. One consequence of range expansion is the steady reduction of heterozygosity with increasing distance from the ancestral population (Austerlitz et al. 1997; DeGiorgio et al. 2011), a pattern well documented in human populations (Prugnolle et al. 2005; Ramachandran et al. 2005; Handley et al. 2007; Li et al. 2008; DeGiorgio et al. 2009; Deshpande et al. 2009). Another consequence is that some alleles may reach a high frequency because of repeated founder events (Edmonds et al. 2004), a process called genetic surfing (Excoffier et al. 2009). Even deleterious alleles may reach a high frequency because of surfing (Klopfstein et al. 2006; Travis et al. 2007; Excoffier and Ray 2008; Hallatschek and Nelson 2010; Hallatschek 2011). In addition to these simulation-based studies, the analytic theory of surfing in terms of reaction–diffusion equations has been developed (Vlad et al. 2004; Hallatschek and Nelson 2008; Hallatschek 2011). The serial-founder model has also been investigated by coalescent approaches (Austerlitz et al. 1997; DeGiorgio et al. 2011) and the theory has been used to date the time of the onset of human expansions from Africa (Liu et al. 2006).

In this article, we show that the effect of range expansion in a one-dimensional habitat is analogous to the effect of random mating in a single population. The spatial series of allele frequencies in a one-dimensional array of populations can be predicted from the standard theory of a single population in which the effective population size is set to an effective propagule size that depends on migration and population growth in each newly founded population, and the selection coefficients are set to effective selection coefficients that depend on the number of generations between successive colonization events. We use the theory of a single population to predict several quantities, including (i) the probability that an allele present in the initial population will persist throughout the range expansion, (ii) the probability that an allele will reach high frequency after the range expansion is complete, (iii) the average allele frequency in each population, and (iv) the rate of decrease in heterozygosity with increasing distance from the founding population. The analytic approximation also allows us to relate the discrete-population model to a model of range expansion in a continuously distributed population.

Our analytic approximation is not intended to replace simulations. In fact, even with relatively large effective propagule sizes, there is considerable stochastic variability in allele frequency after a range expansion, making it difficult to predict what will happen to any one allele. Instead, the theory is intended as a guide to intuition because it shows how each parameter in a model of range expansion influences the intensity of founder effects.

We present our results in several parts: (i) we define an idealized model of range expansion that ignores some of the complexity we allow for later, (ii) we develop analytic theory of a Wright–Fisher model of a single population that is analogous to the idealized model of range expansion, (iii) we compare analytic results for the Wright–Fisher model to simulation results for the idealized model, (iv) we define a more realistic model of range expansion and show that it can be matched to the idealized model by redefining parameters, (v) we compare simulation results of the realistic model with the predictions of the analytic theory of the Wright–Fisher model, and (vi) we discuss the relationship between discrete-population models of range expansion and range expansion in a continuously distributed population.

Idealized Model of Range Expansion

In our idealized model there are n + 1 sites at which populations can be established. They are arranged on a line and numbered 0–n. At t = 0, site 0 is occupied by a diploid population of effectively infinite size in which an allele A is present in frequency x₀ in zygotes. Selection changes the frequency deterministically to $x'_{0}$ among adults. Then, k adults are drawn randomly to found a new population at site 1. The propagule at site 1 grows in one generation into a population of zygotes of effectively infinite size. Selection then modifies the frequencies in populations 0 and 1, and finally k adults are chosen randomly from population 1 to found population 2. This process continues for n – 1 more generations, with selection affecting the frequency in each established population each generation. After n generations, all n sites are occupied. Our concern is with the frequency trajectory, $x = \{x_{0}, x_{1}, \dots, x_{n}\}$, at time n given that A has not been lost from the population. Note that if A is not neutral, the final x₀ will differ from the initial value because of selection acting for n generations.
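As a minimal sketch of the idealized model, the following Python function generates one frequency trajectory. The function names, the `rng` argument, and the use of a sum of Bernoulli draws for binomial sampling are illustrative assumptions, not a description of the program used for the simulations reported here. The selection step uses the standard viability-selection recursion for fitnesses 1, 1 – s₁, and 1 – s₂.

```python
import random


def select(x, s1, s2):
    """Deterministic change in the frequency of A after one round of selection."""
    mean_fitness = 1.0 - s2 * x * x - 2.0 * s1 * x * (1.0 - x)
    return x * (1.0 - s2 * x - s1 * (1.0 - x)) / mean_fitness


def idealized_replicate(n, k, x0, s1, s2, rng):
    """Return the trajectory [x_0, ..., x_n] for one replicate.

    Each generation, selection acts in every established population and
    then 2k gametes are sampled from the most recent one to found the next.
    """
    x = [x0]
    for _ in range(n):
        x = [select(xj, s1, s2) for xj in x]
        copies = sum(1 for _ in range(2 * k) if rng.random() < x[-1])
        x.append(copies / (2 * k))
    return x


if __name__ == "__main__":
    rng = random.Random(2012)  # arbitrary seed, chosen only for repeatability
    print(idealized_replicate(n=10, k=100, x0=0.05, s1=0.0, s2=0.0, rng=rng))
```

A caller interested only in replicates in which A survives would discard any trajectory whose last entry is 0.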

Wright–Fisher Model of a Single Population

The idealized model of range expansion is similar to a model of random mating in a single population containing k individuals. The time sequence of allele frequencies in the randomly mating population is preserved in the spatial sequence of frequencies in the model of range expansion. The only difference is that, if there is selection, then selection continues to modify the allele frequency after a population sends a propagule to found the next population.

Given the similarity of the two models, we can use the well-developed theory of drift and selection in single populations to predict what is seen after all populations are colonized. We assume a Wright–Fisher model of a population of size k. The relative fitnesses of individuals with 0, 1, and 2 copies of A are 1, 1 – s₁, and 1 – s₂, respectively. We are concerned with the case in which A is deleterious or neutral (s₂ ≥ s₁ ≥ 0). We follow the notation of Chap. 2.12 of Ewens (2004). At any time, the population is in a state E_i, i = 0, … , 2k, with i copies of A present. The transition probability from E_i to E_j is

p_{i j} = \binom{2 k}{j} {x'}^{j} {(1 - x')}^{2 k - j},

(1)

where

x = i / (2 k)

and

x' = \frac{x [1 - s_{2} x - s_{1} (1 - x)]}{1 - s_{2} x^{2} - 2 s_{1} x (1 - x)} .

(2)

Let

p_{i j}^{(t)}

denote the t-generation transition probability. The probability that A survives for n generations, given that it is present in i copies initially, is

S (i, n) = 1 - p_{i 0}^{(n)} .

(3)

For large n, S(i, n) approaches the probability of ultimate fixation.
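Equations 1–3 can be evaluated numerically for moderate k. The sketch below is one way to do so in plain Python; the function names are illustrative assumptions. It builds the full (2k + 1) × (2k + 1) matrix, so it is intended for values of k of the order of 100, not for large populations.

```python
from math import comb


def select(x, s1, s2):
    """Equation 2: frequency of A after selection."""
    mean_fitness = 1.0 - s2 * x * x - 2.0 * s1 * x * (1.0 - x)
    return x * (1.0 - s2 * x - s1 * (1.0 - x)) / mean_fitness


def transition_matrix(k, s1=0.0, s2=0.0):
    """Equation 1: one-generation transition probabilities p_ij."""
    size = 2 * k
    rows = []
    for i in range(size + 1):
        xp = select(i / size, s1, s2)
        rows.append([comb(size, j) * xp ** j * (1.0 - xp) ** (size - j)
                     for j in range(size + 1)])
    return rows


def survival_probability(k, i0, n, s1=0.0, s2=0.0):
    """Equation 3: S(i0, n) = 1 - p_{i0,0}^{(n)}."""
    P = transition_matrix(k, s1, s2)
    states = range(2 * k + 1)
    dist = [0.0] * (2 * k + 1)
    dist[i0] = 1.0
    for _ in range(n):
        # Propagate the distribution of copy number forward one generation.
        dist = [sum(dist[i] * P[i][j] for i in states) for j in states]
    return 1.0 - dist[0]


if __name__ == "__main__":
    print(survival_probability(k=100, i0=1, n=10))
```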

Given the initial number of copies of A, i₀, the Wright–Fisher model will generate a trajectory, i = (i₀, i₁, ..., i_n), where i_t is the number of copies of A at time t. The probability of any trajectory given that i_n > 0 (i.e., that A survives) is

\Pr (i | i_{n} > 0) = \frac{\prod_{t = 0}^{n - 1} p_{i_{t} i_{t + 1}}}{S (i_{0}, n)} .

(4)

We can use the Wright–Fisher model to calculate other quantities of interest. One is the probability that A will reach at least a specified number of copies, j, after n generations:

\Pr (i_{n} \geq j) = \sum_{i = j}^{2 k} p_{i_{0} i}^{(n)} .

(5)

And we can predict the decrease in heterozygosity as a function of t,

H (t) = {(1 - \frac{1}{2 k})}^{t} H (0),

(6)

where H(t) is the probability of heterozygosity in generation t (Equation 7.2.8 in Crow and Kimura 1970).

The probability of i copies of A at time t, given nonloss at time n, is

\Pr (i, t | i_{n} > 0) = \frac{p_{i_{0} i}^{(t)} (1 - p_{i 0}^{(n - t)})}{S (i_{0}, n)},

(7)

from which the mean and variance of the number of copies of A can be computed,

E (i, t | i_{n} > 0) = \sum_{i = 1}^{2 k} i \Pr (i, t | i_{n} > 0),

(8)

with a similar equation for the second moment of i.
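The threshold probability (Equation 5) and the conditional mean (Equations 7 and 8) can be computed from the same transition matrix. The following sketch is a hedged illustration with hypothetical function names; it obtains the loss probabilities for all starting states at once by repeated matrix–vector multiplication.

```python
from math import comb


def transition_matrix(k, s1=0.0, s2=0.0):
    """Equations 1 and 2: one-generation transition probabilities."""
    size = 2 * k
    P = []
    for i in range(size + 1):
        x = i / size
        xp = (x * (1.0 - s2 * x - s1 * (1.0 - x))
              / (1.0 - s2 * x * x - 2.0 * s1 * x * (1.0 - x)))
        P.append([comb(size, j) * xp ** j * (1.0 - xp) ** (size - j)
                  for j in range(size + 1)])
    return P


def distribution_after(P, i0, t):
    """Row i0 of the t-step matrix: p_{i0,i}^{(t)} for every i."""
    states = range(len(P))
    dist = [0.0] * len(P)
    dist[i0] = 1.0
    for _ in range(t):
        dist = [sum(dist[i] * P[i][j] for i in states) for j in states]
    return dist


def loss_probabilities(P, steps):
    """Column 0 of the steps-step matrix: p_{i,0}^{(steps)} for every i."""
    u = [1.0] + [0.0] * (len(P) - 1)
    for _ in range(steps):
        u = [sum(row[j] * u[j] for j in range(len(P))) for row in P]
    return u


def prob_at_least(P, i0, n, j):
    """Equation 5: probability of at least j copies after n generations."""
    return sum(distribution_after(P, i0, n)[j:])


def conditional_mean(P, i0, t, n):
    """Equations 7 and 8: mean copy number at time t given nonloss at n."""
    at_t = distribution_after(P, i0, t)
    lost_later = loss_probabilities(P, n - t)
    survive = 1.0 - loss_probabilities(P, n)[i0]
    return sum(i * at_t[i] * (1.0 - lost_later[i])
               for i in range(1, len(P))) / survive


if __name__ == "__main__":
    P = transition_matrix(50)
    print(prob_at_least(P, 1, 10, 10), conditional_mean(P, 1, 5, 10))
```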

Comparison of the Idealized and Wright–Fisher Models

The Wright–Fisher model is analogous to our idealized model of range expansion. If A is neutral, i_t is the number of copies of A in the propagule that founds population t. The assumption that each population immediately grows to an effectively infinite size ensures that the frequency of A will remain the same in each population and hence $x_{t} = i_{t} / (2 k)$.

The idealized model differs slightly from the Wright–Fisher model if A is not neutral because selection will modify the frequency in population t for n – t more generations. This additional effect of selection makes no difference in the calculation of either $S (i_{0}, n)$ or $\Pr (i_{n} \geq j)$ because they do not depend on frequencies in the intermediate populations. To predict the effect of selection on the average frequency in the intermediate populations, we first calculated $E (x_{t}) = E (i_{t} | i_{n} > 0) / (2 k)$ and then deterministically changed the frequency by applying Equation 2 for n – t generations.

We simulated the idealized model to test the accuracy of the analytic approximation on the basis of the Wright–Fisher model. Typical results are shown in Figure 1. The fit of the average allele frequency in the simulations to the expectation calculated from the analytic theory is quite good for both neutral and deleterious alleles. There is considerable variation among replicates, however; most trajectories deviate substantially from the average. Some trajectories are clinal with a roughly monotonic increase in frequency while others reach an intermediate maximum and then decrease. For Figure 1A, the predicted and observed survival probabilities are 0.159 and 0.158, and in Figure 1B, they are 0.158 and 0.147. The probability that an allele reaches at least a given frequency is also predicted accurately by the analytic theory (Figure 1, C and D).

Figure 1

Comparison of analytic predictions with the simulation results for the idealized model described in the text. (A and B) Thin lines show x_i for i = 0 to n = 10 for each replicate in which A was not lost before population 10. The thick black line shows the average of the 100 replicates and the thick blue line shows the expectation based on the analytic theory. (C and D) Probability that an allele reaches a frequency y in the final population when the range expansion is complete. In A–D, k = 100 and one copy of the mutant was present in the propagule founding population 1. In A and C, s₁ = s₂ = 0; in B and D, s₁ = 0.005 and s₂ = 0.01.

Realistic Model: Finite Population Size and Migration

To create a more realistic model, we assume each propagule grows in one generation to size N (> k) and remains at that size for T generations before generating a propagule that colonizes the next site. We also allow for migration: each established population, including population 0, receives immigrants at a rate m per generation from each neighboring population. The migration among established populations continues until all sites are occupied and the process is stopped. We are concerned with weak migration only and ignore the effect of migration on population growth and population size.

These additional features make the resulting model intractable because it is no longer analogous to a Wright–Fisher model. However, the realistic model can be approximated by the Wright–Fisher model if the parameters of that model are defined in a way that equalizes the mean and variance of the change in allele frequencies between successive colonization events. In Appendix A, we show that the variance in the change in allele frequency in a realistic model is the same as that in the Wright–Fisher model if k is replaced by

k_{e} = \frac{1}{a^{T} / k + (1 - a^{T}) / [N (1 - a)]},

(9)

where $a = 1 - 2 m - 1 / (2 N)$, which is ∼ $1 - 1 / (2 N F)$, where $F \approx 1 / (1 + 4 N m)$ if m is small. The effective number of founders, k_e, is similar to the effective population size (N_e) defined by Hallatschek and Nelson (2008) for a model of range expansion in a continuously distributed population. Like k_e, Hallatschek and Nelson’s N_e quantifies the rate of loss of heterozygosity at the leading edge of a range expansion. They estimated N_e from simulations.

The composite parameter k_e includes the effects of two opposing forces. Both the founder effect and the period of random mating after the population is founded increase the importance of genetic drift and hence reduce k_e. Immigration from the neighboring population (there is only one for the population most recently founded) reduces the variance in allele frequency and hence increases k_e. Whether k_e is larger or smaller than k depends on the balance reached. A little algebra shows that if m ≪ 1, N ≫ 1, and T ≪ 2N, then $k_{e} > k$ if $2 N m > k - 1 / 2$, independently of T.
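Equation 9 is simple to evaluate directly. The sketch below, with an illustrative function name, computes k_e and checks the stated condition for a few parameter values. With a = 1 – 2m – 1/(2N), the condition 2Nm > k – 1/2 is the same as N(1 – a) > k.

```python
def effective_founders(k, N, T, m):
    """Equation 9, with a = 1 - 2m - 1/(2N)."""
    a = 1.0 - 2.0 * m - 1.0 / (2.0 * N)
    return 1.0 / (a ** T / k + (1.0 - a ** T) / (N * (1.0 - a)))


if __name__ == "__main__":
    # No migration: drift during the T generations after founding lowers k_e.
    print(effective_founders(k=100, N=10_000, T=5, m=0.0))
    # With migration, compare k_e to k on either side of 2Nm = k - 1/2.
    for m in (0.04, 0.06):
        k_e = effective_founders(k=100, N=1000, T=5, m=m)
        print(m, 2 * 1000 * m > 100 - 0.5, k_e > 100)
```

The integer nearest the first value is what would be used as the number of founders in the Wright–Fisher calculations for those parameter values.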

Additional generations between successive colonization events also increase the effective strength of selection. If A is in low frequency and s₁ ≠ 0, then only the fitness of heterozygous carriers of A is important. We show in Appendix B that, in this case, the effective selection coefficient against heterozygotes is

s_{1,e} = T s_{1} .

(10)

If s₁ = 0, then the effective selection coefficient against AA individuals is

s_{2,e} = T s_{2} .

(11)

When comparing simulation results to the predictions based on the Wright–Fisher model when A is not recessive, we used Equations 10 and 11 to compute the effective selection coefficients, and when A is recessive, we used Equation 11. These effective selection coefficients indicate the strength of selection in a single generation that is equivalent to T generations of weaker selection.
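The quality of the approximation in Equations 10 and 11 can be checked numerically by comparing T generations of selection under Equation 2 with a single generation using Ts₁ and Ts₂. The sketch below does this for one arbitrary set of values; the starting frequency and the function names are illustrative assumptions.

```python
def select(x, s1, s2):
    """Equation 2: frequency of A after one generation of selection."""
    mean_fitness = 1.0 - s2 * x * x - 2.0 * s1 * x * (1.0 - x)
    return x * (1.0 - s2 * x - s1 * (1.0 - x)) / mean_fitness


def iterate_selection(x, s1, s2, generations):
    """Apply Equation 2 repeatedly, with no drift or migration."""
    for _ in range(generations):
        x = select(x, s1, s2)
    return x


if __name__ == "__main__":
    x0, s1, s2, T = 0.01, 0.005, 0.01, 5
    exact = iterate_selection(x0, s1, s2, T)    # T generations of weak selection
    approx = select(x0, T * s1, T * s2)         # one generation with T*s1, T*s2
    print(exact, approx, abs(exact - approx))
```

For a rare allele and small selection coefficients the two values differ only in terms of order s², which is why the product of T and the selection coefficient is an adequate summary.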

Simulation Test of Realistic Model

We simulated the realistic model of range expansion described above. Each replicate begins with the frequency of A set to the specified initial frequency, x₀. Then one cycle of colonization results in a frequency x₁ in population 1. A cycle consists of the sampling of gametes to form a propagule; selection in the propagule; creation of population 1; and then T generations of migration, selection, and genetic drift in populations 0 and 1. The details of one cycle are described below. If x₁ > 0, the next cycle begins with the formation of the propagule that will establish population 2. Cycles continue until either A is lost or all n populations are established. If x_n ≥ 1/(2N), A survived the range expansion and the set of frequencies $\{x_{0}, \dots, x_{n}\}$, the allele frequency trajectory, was retained for further analysis. This process was continued until a specified number of replicates in which A survived was obtained. The probability of survival was estimated to be the ratio of the number of replicates in which A survived to the total number of replicates run.

The events during one cycle are as follows. If i populations have been established, then the number of copies of A in the propagule that establishes population i + 1, $j_{i + 1}$, is generated from a binomial distribution with probability $x_{i}$ and sample size 2k. Thus $x_{i + 1}$ in the propagule is $j_{i + 1} / (2 k)$. Then migration deterministically modifies all the allele frequencies according to

{x'}_{j} = (1 - 2 m) x_{j} + m (x_{j - 1} + x_{j + 1})

(12)

for j = 1, … , i. In the two end populations (0 and i + 1), 1 – 2m is replaced by 1 – m, and $x_{- 1}$ and $x_{i + 2}$ are set to 0. Then, selection deterministically changes the frequencies in all populations to ${x''}_{j}$, for j = 0, … , i + 1, using Equation 2. Finally, the frequency of A in the next generation in each population (including population 0) is obtained by generating a binomially distributed random variate with success probability ${x''}_{j}$ and sample size 2N.
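The cycle just described can be sketched in a few lines of Python. This is an illustrative reading of the text, not the program used to produce the results: the function names are hypothetical, binomial variates are generated as sums of Bernoulli draws for simplicity, and a replicate is abandoned when the frequency in the most recently founded population is 0 at the end of a cycle.

```python
import random


def binomial(size, p, rng):
    """Binomial variate as a sum of Bernoulli draws (adequate for a sketch)."""
    return sum(1 for _ in range(size) if rng.random() < p)


def select(x, s1, s2):
    """Equation 2."""
    mean_fitness = 1.0 - s2 * x * x - 2.0 * s1 * x * (1.0 - x)
    return x * (1.0 - s2 * x - s1 * (1.0 - x)) / mean_fitness


def migrate(x, m):
    """Equation 12; the two end populations keep 1 - m instead of 1 - 2m."""
    last = len(x) - 1
    out = []
    for j, xj in enumerate(x):
        left = x[j - 1] if j > 0 else 0.0
        right = x[j + 1] if j < last else 0.0
        stay = 1.0 - m if j in (0, last) else 1.0 - 2.0 * m
        out.append(stay * xj + m * (left + right))
    return out


def one_replicate(n, k, N, T, m, s1, s2, x0, rng):
    """Return [x_0, ..., x_n], or None if A is lost at the expanding edge."""
    x = [x0]
    for _ in range(n):
        # Found the next population with 2k gametes from the current edge.
        x.append(binomial(2 * k, x[-1], rng) / (2 * k))
        for _ in range(T):
            x = migrate(x, m)
            x = [select(xj, s1, s2) for xj in x]
            x = [binomial(2 * N, xj, rng) / (2 * N) for xj in x]
        if x[-1] == 0.0:
            return None
    return x


if __name__ == "__main__":
    rng = random.Random(7)  # arbitrary seed
    print(one_replicate(n=5, k=20, N=200, T=5, m=0.02,
                        s1=0.0, s2=0.0, x0=0.2, rng=rng))
```

The survival probability would then be estimated as the fraction of calls that do not return `None`.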

To compare the simulation results from the realistic model with the Wright–Fisher model, we used the analytic theory described above with k replaced by the integer nearest k_e (Equation 9) and s₁ and s₂ by Ts₁ and Ts₂. The restriction to integer values of k is necessary because the Markov chain has to have an integer number of states.

Figure 2 shows two examples of the fit of the predictions of the Wright–Fisher model to the simulation results of the realistic model without migration. Figure 2, A–C, shows the results for a neutral allele and Figure 2, D–F, shows the results for a deleterious recessive allele (s₁ = 0, s₂ = 0.03). Figure 2, A and D, compares the probability that A reaches a specified frequency. Figure 2, B and E, compares the average frequencies, using the same format as Figure 1. Figure 2, C and F, compares the variances among replicates. These results are typical for other data sets. Without migration, the fit of the analytic predictions to the simulations is quite good for neutral and recessive deleterious alleles, at least for ks₂ < 2. The predicted variances among replicates do not fit as well as the predicted averages, which is not surprising given that second moments are more variable than first moments. For selected alleles, the predicted $\Pr (x \geq y)$ tends to be slightly larger than the simulated values and the predicted $\bar{x}$ tends to be slightly larger than the simulated values. Those tendencies are more pronounced for alleles with an additive deleterious effect on relative fitness (s₂ = 2s₁), particularly if ks₂ > 1, which is large enough that such deleterious alleles would have little chance of increasing to high frequency. The extent to which all three predicted quantities fit the simulations is roughly similar.

Figure 2

Comparison of analytic predictions of the Wright–Fisher model with the simulation result for the realistic model with no migration. The format is the same as in Figure 1. The analytic predictions were obtained by replacing k by the integer nearest k_e and s₁ and s₂ by Ts₁ and Ts₂. In A–F, k = 100, N = 10,000, T = 5, and one copy of the mutant was present in the propagule founding population 1. The averages are over 100 replicate simulations. In A–C, s₁ = s₂ = 0; in D–F, s₂ = 0.3 and s₁ = 0 (recessive deleterious alleles).

With migration, the fit of the predictions to the simulations is not quite as good. Four examples are shown in Figure 3. Only the results for the average frequency are presented. With migration but no selection (Figure 3A), the analytic prediction tends to be slightly less than the average of the simulations. That tendency is also seen when there is migration and selection (Figure 3, B–D). Even when the number of colonization events is large enough that A is fixed in some of the replicates (Figure 3D), the average frequency in the simulations does not deviate by much from the analytic prediction.

Figure 3

Comparison of analytic predictions with the simulation results for a model with finite population size and migration. The format is the same as in Figure 2, B and E. In A–D, k = 100, N = 1000, m = 0.04, and one copy of the mutant was present in the propagule founding population 1. In A, s₂ = s₁ = 0 (neutral); in B, s₂ = 0.02 and s₁ = 0 (recessive deleterious); in C, s₂ = 0.02 and s₁ = 0.01 (additive deleterious); in D, s₂ = 0.01 and s₁ = 0.005 (additive deleterious).

After the range expansion is complete, continuing migration will tend to smooth patterns created by the range expansion and continuing selection will cause the frequencies of deleterious alleles to gradually decay. Once the last population is colonized, our approximations for the expansion phase no longer apply. Subsequent evolution will be governed by the effective population sizes and selection intensities in each population. If the populations are large and selection is relatively strong, the decay will roughly follow the deterministic theory of selection and migration until allele frequencies become quite small. One example is presented in Figure 4.

Figure 4

Illustration of the effects of continued evolution after all populations are colonized. k = 100, N = 1000, T = 5, s₂ = 0.01, s₁ = 0.005, and m = 0.05. (A) Trajectories for 100 replicates immediately after the last population is colonized. The format is the same as in Figure 1, A and B. The prediction of the analytic model is shown in A by the thick blue line. The average of the 100 replicates immediately after all populations are colonized is shown by the thick black line in A and B. (B) The same set of trajectories 100 generations later. The thick red line shows the averages after 100 generations. Note the difference in the vertical scale in A and B.

The analytic approximation also provides an estimate of the probability that an allele will not be lost during the range expansion. Table 1 shows that the predicted and actual probabilities for the six cases shown in Figures 2 and 3 are in reasonable agreement, although the predictions tend to be larger than the simulated values.

Table 1

Predicted and actual probabilities that an allele initially present in one copy in the propagule founding population 1 will be present in population n

	Predicted	Actual
Figure 2, A–C	0.048	0.043
Figure 2, D–F	0.047	0.038
Figure 3A	0.034	0.048
Figure 3B	0.047	0.031
Figure 3C	0.039	0.007
Figure 3D	0.013	0.0012

The parameter values are given in the legends to Figures 2 and 3.

We can explore further the probability that A will not be lost during the range expansion. Deleterious alleles of additive effect are lost quickly but deleterious recessive alleles have a substantial probability of not being lost even if their ultimate probability of fixation is low. Figure 5 shows some typical results for deleterious alleles. The probability that an allele survives is much lower for an allele of additive effect than for one that has a recessive effect.

Figure 5

Illustration of the difference between alleles with an additive (red line) and a recessive (black line) effect on fitness. Pr(survival) is the probability that an allele initially present in one copy in population 0 is still present in population n. For both curves, k = n = 50 and s₂ = s. For additive selection, s₁ = s/2 and for recessive selection, s₁ = 0. The results were obtained by iterating the transition matrix of the Markov chain.

The similarity of the model of range expansion and a model of a single population leads to a simple prediction about the loss in heterozygosity during a range expansion. The heterozygosity of neutral loci will be reduced by a factor of $1 - 1 / (2 k_{e})$ in each successive population.

Continuously Distributed Populations

Our results are based on a model in which populations are discrete. Much of the interest in range expansion comes from human and other populations that are continuously distributed in space. The relationship between continuous and discrete population models is not simple. As Felsenstein (1975) first noted, it is difficult to formulate a consistent model of a finite population that is continuously distributed in space and that maintains a uniform population density. The reason is that the assumption of uniform density is not compatible with the assumption that individuals reproduce independently of one another (Sawyer 1976). The usual resolution of this problem is to approximate a continuous model by a sequence of discrete models (Nagylaki 1978a,b).

To express our results in terms of a model of range expansion in a continuously distributed population, assume the populations in a discrete-population model are a distance l apart. Our results predict that heterozygosity decreases as an approximately linear function of distance, $H (i) = H_{0} (1 - i / (2 k_{e}))$, in line with previous analytical results (DeGiorgio et al. 2011); this is the first-order form of the geometric decline $H_{0} {(1 - 1 / (2 k_{e}))}^{i}$, valid when $i \ll 2 k_{e}$. To express this result as a function of distance, d, rather than population number, we write

H (d) = H_{0} (1 - \frac{d}{2 κ_{e}}),

(13)

where κ_e = lk_e indicates the net effect of genetic drift when the leading edge of the population moves a distance l.
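As a quick numerical illustration, the sketch below compares the geometric decline in heterozygosity across discrete populations with the linear form in Equation 13. The parameter values (H₀ = 0.7, k_e = 100, l = 10) and the function names are arbitrary assumptions chosen only to show the size of the discrepancy.

```python
def heterozygosity_discrete(h0, i, k_e):
    """Heterozygosity in population i: a factor 1 - 1/(2 k_e) per founding event."""
    return h0 * (1.0 - 1.0 / (2.0 * k_e)) ** i


def heterozygosity_linear(h0, d, kappa_e):
    """Equation 13: linear decline with distance d, with kappa_e = l * k_e."""
    return h0 * (1.0 - d / (2.0 * kappa_e))


if __name__ == "__main__":
    h0, k_e, spacing = 0.7, 100, 10.0   # spacing plays the role of l
    for i in (1, 10, 50):
        d = i * spacing
        print(i,
              heterozygosity_discrete(h0, i, k_e),
              heterozygosity_linear(h0, d, spacing * k_e))
```

The two forms agree closely while the number of founding events is small relative to 2k_e and drift apart as it grows.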

We can relate the parameters of a discrete-population model to those of a model of a continuously distributed population as follows. A continuous-population model at equilibrium is characterized by the population density (ρ), the root-mean-square dispersal distance (σ), and the total length of the habitat (L). The correspondence between discrete and continuous models in a one-dimensional habitat is well established for populations at equilibrium. If L is large enough that end effects do not dominate, the heterozygosity and decrease in the probability of identity-by-descent of neutral alleles in a continuous-population model are the same as in a discrete-population model when $ρ = N / l$ and $σ = l \sqrt{m}$, where l = L/n is the distance between adjacent populations (Malécot 1975). Therefore, to recover the parameters of the discrete-population model, for which we have an analytic approximation, we set $N = l ρ$, $m = σ^{2} / l^{2}$, and $n = L / l$, where l is not yet specified.

To determine l, we assume that new individuals beyond the leading edge of the population are randomly sampled from k individuals at or near the leading edge. Range expansion occurs as new individuals appear in such a way that their average density is ρ. This assumption fits the observation of Hallatschek et al. (2007) who found that, in an experimental study of range expansion in Escherichia coli, colonists appeared to come predominantly from the small number of cells at the expanding edge of the population. In a continuously distributed population expanding at a uniform rate, there is no delay corresponding to the T generations allowed for in the discrete-population model. Hence T can be set to 1. In the continuous model, a habitat of length L is colonized in total time τ, which corresponds to n generations in the discrete-population model. Therefore, n = τ, which we have already determined to be L/l, and consequently l = L/τ.

With these assumptions, k_e is given by Equation 9 with T = 1 and

a = {(1 - \frac{σ^{2}}{l^{2}})}^{2} - \frac{1}{2 l ρ},

(14)

where l = L/τ. Therefore

κ_{e} = \frac{L}{τ \{a / k + (1 - a) / [l ρ (1 - a)]\}} .

(15)

The value of κ_e corresponds to N_e of Hallatschek and Nelson (2008). Their simulation model is more realistic and allows for the interaction of range expansion and dispersal in a way that ours does not. We are assuming weak migration and deterministic expansion of the population front.
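Equations 14 and 15 can be combined into a single function of the continuous-model parameters. The sketch below is a hedged illustration; the function name and the example values are assumptions made for the purpose of the example. Note that with T = 1 the second term in the denominator of Equation 15 reduces to 1/(lρ).

```python
def kappa_e_continuous(L, tau, k, rho, sigma):
    """Equations 14 and 15 with l = L / tau and T = 1."""
    l = L / tau
    a = (1.0 - sigma ** 2 / l ** 2) ** 2 - 1.0 / (2.0 * l * rho)
    # (1 - a) / [l * rho * (1 - a)] simplifies to 1 / (l * rho).
    k_e = 1.0 / (a / k + 1.0 / (l * rho))
    return l * k_e


if __name__ == "__main__":
    # Hypothetical values: habitat of length 1000 colonized in 100 generations.
    print(kappa_e_continuous(L=1000.0, tau=100.0, k=20, rho=10.0, sigma=1.0))
```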

We can estimate κ_e from the regression of heterozygosity with linear distance. Ramachandran et al. (2005) fitted the regression line

H (d) = 0.7682 - 6.52 \times 10^{- 6} d

(16)

(where d is the distance in kilometers from Addis Ababa) to heterozygosities computed for 783 microsatellite loci in a worldwide sample of 1027 individuals. From this regression we conclude $κ_{e} = 1.5 \times 10^{5}$ km. To get a rough idea of what this result means, assume that modern human populations expanded a total distance, L, of 25,000 km in 2000 generations (i.e., 50,000 yr assuming 25 yr per generation). This implies l = 125 km. To make further progress we need to assume a historical equilibrium density, ρ, which we arbitrarily set to 100/km. It is reasonable to assume that the average per generation dispersal distance, σ, was small enough that $σ^{2} / l^{2}$ is small. Together these assumptions imply that a in Equation 15 is nearly 1. Consequently,

\frac{1}{1 / k + 1 / 12,500} = \frac{1.5 \times 10^{5}}{125} = 1.2 \times 10^{3}

(17)

or k ≈ 1327. Using our simple model of range expansion in humans, we conclude that the data of Ramachandran et al. (2005) imply the expansion of modern humans into Asia and North America did not require extreme founder events at the leading edge. This application is intended to illustrate how our results can be interpreted in terms of a continuously distributed population rather than to infer details of human history. Human populations did not expand their range at a uniform rate and, more importantly, the expansion had a two-dimensional component that we have not attempted to model here.
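The arithmetic in this example can be reproduced in a few lines. With a ≈ 1 and T = 1, Equation 9 reduces to k_e = 1/(1/k + 1/(lρ)), and k_e = κ_e/l, so the number of founders follows by solving for k. The sketch below takes κ_e, l, and ρ as given in the text; the function name is an illustrative assumption.

```python
def founders_from_kappa(kappa_e, l, rho):
    """Solve 1 / (1/k + 1/(l * rho)) = kappa_e / l for k (a ~ 1, T = 1)."""
    k_e = kappa_e / l
    inverse_k = 1.0 / k_e - 1.0 / (l * rho)
    if inverse_k <= 0.0:
        raise ValueError("no positive k is consistent with these values")
    return 1.0 / inverse_k


if __name__ == "__main__":
    # Values as stated in the text: kappa_e = 1.5e5 km, l = 125 km, rho = 100/km.
    print(founders_from_kappa(kappa_e=1.5e5, l=125.0, rho=100.0))
```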

Discussion and Conclusions

We show that range expansion in a one-dimensional habitat is similar in some ways to random mating in a single population. The succession of colonization events during range expansion creates a spatial sequence of allele frequencies that is analogous to the time sequence of allele frequencies in a single population. This is true in an idealized model of range expansion and is approximately true in a more realistic model that allows for some delay before the next colonization event and for weak gene flow among established populations. The similarity of the two models allows the theory of random mating to be adapted to make analytic predictions about the consequences of range expansions. There is a strong stochastic component to the process that makes prediction of individual allele frequency trajectories difficult, but the average trajectory and the extent of variation among them are well predicted by the analytic theory.

Although our simulation results are for a model of instantaneous population growth, our analytic theory makes clear that the effective number of colonists, k_e, depends partly on the net effect of genetic drift between successive colonization events and hence can be defined for other models of population growth including the logistic model.

Previous work on the effects of surfing has emphasized that surfing can drive some initially rare alleles to high frequency (Travis et al. 2007; Hallatschek and Nelson 2010; Hallatschek 2011). The probability that an initially rare allele is fixed in a randomly mating population can be calculated from a diffusion approximation (Kimura 1962). Roughly speaking, deleterious alleles have a significant probability of being fixed by genetic drift if Ns ≤ 1, where N is the population size and s is the selection coefficient. In the context of range expansions, our approximate theory tells us that the probability that an initially rare allele is driven to fixation during a range expansion depends on the product k_e s_e, where k_e is the effective propagule size (Equation 9) and s_e is the effective selection coefficient (Equations 10 and 11). This provides an approximate way to determine whether a deleterious allele has a significant probability of surfing to a high frequency. It implies that expanding populations could accumulate deleterious mutations at a faster rate than equilibrium populations, which could potentially explain the observed excess of deleterious alleles in Europeans (Lohmueller et al. 2008) or the collapse of some invading species (Cooling et al. 2011).
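
To make the role of the product of population size and selection coefficient concrete, the sketch below evaluates the standard diffusion result for genic selection in a randomly mating population (Kimura 1962), u(p) = (1 − e^{−4Nsp})/(1 − e^{−4Ns}), where s is the selective advantage of the allele (negative for a deleterious allele) and the effective size is taken equal to N. Reading N as k_e and s as s_e is an assumption made here only to illustrate the analogy; it is not the expression given by Equations 9 to 11, and the function name and parameter values are hypothetical.

```python
import math


def fixation_probability(n, s, p):
    """Kimura's (1962) diffusion approximation for genic selection.

    n: population size (standing in for k_e here, which is an assumption)
    s: selective advantage of the allele (negative if deleterious)
    p: initial allele frequency
    """
    if abs(n * s) < 1e-12:
        return p  # neutral limit
    # expm1(x) = e^x - 1; the ratio equals (1 - e^{-4nsp}) / (1 - e^{-4ns})
    return math.expm1(-4.0 * n * s * p) / math.expm1(-4.0 * n * s)


if __name__ == "__main__":
    n = 100                      # hypothetical effective number of founders
    p = 1.0 / (2 * n)            # a single initial copy
    for ns in (0.1, 1.0, 5.0):   # values of the product N*s
        u = fixation_probability(n, -ns / n, p)
        print(f"Ns = {ns}: u / u_neutral = {u / p:.3g}")
```

For these illustrative values, the fixation probability of a deleterious allele stays within the same order of magnitude as the neutral value while Ns is well below 1 and falls off steeply once Ns exceeds 1, which is the rule of thumb stated above.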

Clines in allele frequency are often attributed to geographic variation in selection intensities. In humans, clines in the frequencies of alleles that cause monogenic diseases are observed and sometimes attributed to unknown environmental conditions (Novembre and Di Rienzo 2009). For example, the Δ508 allele of CFTR associated with cystic fibrosis (Bertranpetit and Calafell 1996) and the C282Y mutation of hemochromatosis (Lucotte and Dieterlen 2003) have a higher frequency in northern than in southern Europe. Our results show that such clines could be created by range expansion in the absence of any geographic variation in selection intensity. Several authors (Handley et al. 2007; DeGiorgio et al. 2009; Hunley et al. 2009) have emphasized the importance of nonequilibrium processes in structuring human and other populations and the need to consider neutral explanations for apparently adaptive patterns.

Our theory also predicts the slope of the gradient in heterozygosity that results from range expansion in a one-dimensional habitat, and it can be recast in terms of a continuous habitat. This correspondence allows us to obtain a rough estimate of the effective propagule size on the basis of the data of Ramachandran et al. (2005).
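
The heterozygosity gradient can be illustrated numerically. As stated in the Abstract, average heterozygosity falls by a factor 1 − 1/(2k_e) with each new population, so after n founding events H_n = H_0 (1 − 1/(2k_e))^n. The sketch below compares this expression with a small simulation of the idealized model, in which each population is founded by drawing 2k gene copies with replacement from the previous one. The parameter values, the random seed, and the function names are illustrative assumptions, not values used in the article.

```python
import random


def expected_heterozygosity(h0, k_e, n):
    """Expected heterozygosity after n founder events."""
    return h0 * (1.0 - 1.0 / (2.0 * k_e)) ** n


def simulate_heterozygosity(x0, k, n, reps, seed=1):
    """Mean of 2x(1-x) in population n under serial binomial founding."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        x = x0
        for _ in range(n):
            # draw 2k gene copies with replacement from the previous population
            copies = sum(rng.random() < x for _ in range(2 * k))
            x = copies / (2.0 * k)
        total += 2.0 * x * (1.0 - x)
    return total / reps


if __name__ == "__main__":
    print(expected_heterozygosity(0.5, 10, 5))
    print(simulate_heterozygosity(0.5, 10, 5, reps=2000))
```

With no migration and immediate recolonization, k_e equals k, so the simulated mean should agree with the analytic decay up to sampling error.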

Acknowledgements

M.S. was partially supported by a grant from the U.S. National Institutes of Health, R01-GM40282. L.E. was partially supported by a Swiss National Science Foundation grant, 3100A0-126074.

Literature Cited

Austerlitz F, Jung-Muller B, Godelle B, Gouyon P H, 1997 Evolution of coalescence times, genetic diversity and structure during colonization. Theor. Popul. Biol. 51: 148–164.

Bertranpetit J, Calafell F, 1996 Genetic and geographical variability in cystic fibrosis: evolutionary considerations, pp. 97–118 in Variation in the Human Genome, edited by Chadwick D, Cardew E. John Wiley & Sons, New York.

Cooling M, Hartley S, Sim D A, Lester P J, 2011 The widespread collapse of an invasive species: Argentine ants (Linepithema humile) in New Zealand. Biol. Lett. DOI: 10.1098/rsbl.2011.1014 (in press).

Crow J F, Kimura M, 1970 An Introduction to Population Genetics Theory. Harper & Row, New York.

DeGiorgio M, Jakobsson M, Rosenberg N A, 2009 Explaining worldwide patterns of human genetic variation using a coalescent-based serial founder model of migration outward from Africa. Proc. Natl. Acad. Sci. USA 106: 16057–16062.

DeGiorgio M, Degnan J H, Rosenberg N A, 2011 Coalescence-time distributions in a serial founder model of human evolutionary history. Genetics 189: 579–593.

Deshpande O, Batzoglou S, Feldman M W, Cavalli-Sforza L L, 2009 A serial founder effect model for human settlement out of Africa. Proc. Biol. Sci. 276: 291–300.

Edmonds C A, Lillie A S, Cavalli-Sforza L L, 2004 Mutations arising in the wave front of an expanding population. Proc. Natl. Acad. Sci. USA 101: 975–979.

Ewens W J, 2004 Mathematical Population Genetics: I. Theoretical Introduction. Springer-Verlag, New York.

Excoffier L, Ray N, 2008 Surfing during population expansions promotes genetic revolutions and structuration. Trends Ecol. Evol. 23: 347–351.

Excoffier L, Foll M, Petit R J, 2009 Genetic consequences of range expansions. Annu. Rev. Ecol. Evol. Syst. 40: 481–501.

Felsenstein J, 1975 A pain in the torus: some difficulties with models of isolation by distance. Am. Nat. 109: 359–368.

Hallatschek O, 2011 The noisy edge of traveling waves. Proc. Natl. Acad. Sci. USA 108: 1783–1787.

Hallatschek O, Nelson D R, 2008 Gene surfing in expanding populations. Theor. Popul. Biol. 73: 158–170.

Hallatschek O, Nelson D R, 2010 Life at the front of an expanding population. Evolution 64: 193–206.

Hallatschek O, Hersen P, Ramanathan S, Nelson D R, 2007 Genetic drift at expanding frontiers promotes gene segregation. Proc. Natl. Acad. Sci. USA 104: 19926–19930.

Handley L J L, Manica A, Goudet J, Balloux F, 2007 Going the distance: human population genetics in a clinal world. Trends Genet. 23: 432–439.

Hunley K L, Healy M E, Long J C, 2009 The global pattern of gene identity variation reveals a history of long-range migrations, bottlenecks, and local mate exchange: implications for biological race. Am. J. Phys. Anthropol. 139: 35–46.

Kimura M, 1962 On the probability of fixation of mutant genes in a population. Genetics 47: 713–719.

Klopfstein S, Currat M, Excoffier L, 2006 The fate of mutations surfing on the wave of a range expansion. Mol. Biol. Evol. 23: 482–490.

Li J Z, Absher D M, Tang H, Southwick A M, Casto A M et al., 2008 Worldwide human relationships inferred from genome-wide patterns of variation. Science 319: 1100–1104.

Liu H, Prugnolle F, Manica A, Balloux F, 2006 A geographically explicit genetic model of worldwide human-settlement history. Am. J. Hum. Genet. 79: 230–237.

Lohmueller K E, Indap A R, Schmidt S, Boyko A R, Hernandez R D et al., 2008 Proportionally more deleterious genetic variation in European than in African populations. Nature 451: 994–997.

Lucotte G, Dieterlen F, 2003 A European allele map of the C282Y mutation of hemochromatosis: Celtic vs. Viking origin of the mutation? Blood Cells Mol. Dis. 31: 262–267.

Malécot G, 1975 Heterozygosity and relationship in regularly subdivided populations. Theor. Popul. Biol. 8: 212–241.

Nagylaki T, 1978a Random genetic drift in a cline. Proc. Natl. Acad. Sci. USA 75: 423–426.

Nagylaki T, 1978b The geographical structure of populations. Stud. Math. 16: 588–624.

Novembre J, Di Rienzo A, 2009 Spatial patterns of variation due to natural selection in humans. Nat. Rev. Genet. 10: 745–755.

Prugnolle F, Manica A, Balloux F, 2005 Geography predicts neutral genetic diversity of human populations. Curr. Biol. 15: R159–R160.

Ramachandran S, Deshpande O, Roseman C C, Rosenberg N A, Feldman M W et al., 2005 Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa. Proc. Natl. Acad. Sci. USA 102: 15942–15947.

Sawyer S, 1976 Branching diffusion processes in population genetics. Adv. Appl. Probab. 8: 659–689.

Travis J M J, Munkemuller T, Burton O J, Best A, Dytham C et al., 2007 Deleterious mutations can surf to high densities on the wave front of an expanding population. Mol. Biol. Evol. 24: 2334–2343.

Vlad M O, Cavalli-Sforza L L, Ross J, 2004 Enhanced (hydrodynamic) transport induced by population growth in reaction-diffusion systems with application to population genetics. Proc. Natl. Acad. Sci. USA 101: 10249–10253.

Appendix A

Derivation of Effective Number of Founders

The model assumes that population 1 is founded by k individuals from population 0, in which the frequency of A is x₀. After population 1 is founded, it grows immediately to size N. For T generations, it receives immigrants from population 0 at a rate m per generation. Let the frequency of A in the newly founded population be x. Following the notation in Crow and Kimura (1970, Chap. 7.3), and assuming the 2k copies are drawn with replacement,

E (x) = x_{0}

and

E [{(x - x_{0})}^{2}] = \frac{x_{0} (1 - x_{0})}{2 k},

where k is the number of founding individuals and E(.) denotes the expectation.
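
These two moments are those of binomial sampling of 2k gene copies. As a quick check, the sketch below sums over the binomial distribution exactly; the function name and the example values of k and x₀ are arbitrary choices made for illustration.

```python
from math import comb


def founder_moments(k, x0):
    """Exact mean and mean squared deviation of x after sampling 2k copies."""
    n = 2 * k
    mean = 0.0
    msd = 0.0
    for j in range(n + 1):
        prob = comb(n, j) * x0**j * (1.0 - x0) ** (n - j)
        x = j / n
        mean += prob * x
        msd += prob * (x - x0) ** 2
    return mean, msd


mean, msd = founder_moments(k=10, x0=0.2)
# Compare with E(x) = x0 and E[(x - x0)^2] = x0 (1 - x0) / (2k)
print(mean, msd, 0.2 * 0.8 / (2 * 10))
```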

Next we find what happens after T generations of random mating with continued immigration from population 0. First, the population size grows to N by sampling with replacement from the k founders, which changes x to

x' = x + ε,

where E(ε) = 0 and

Var (ε) = x (1 - x) / (2 N) .

Migration changes x′ to

x^{″} = (1 - m) x^{'} + m x_{0} .

Therefore

E (x^{″}) = (1 - m) E (x) + m x_{0} = x_{0} .

To derive a recursion equation for the variance in x in generation t, V(t), we take the square

{x^{″}}^{2} = {(1 - m)}^{2} x^{2} + {(1 - m)}^{2} ε^{2} + m^{2} x_{0}^{2} + 2 {(1 - m)}^{2} x ε + 2 m (1 - m) x_{0} x + 2 m (1 - m) x_{0} ε

and the expectation of both sides. Writing the variance of x″ as V(t + 1), we obtain

V (t + 1) + x_{0}^{2} = {(1 - m)}^{2} (V (t) + x_{0}^{2}) + {(1 - m)}^{2} \frac{x_{0} (1 - x_{0}) - V (t)}{2 N} + 2 m (1 - m) x_{0}^{2} + m^{2} x_{0}^{2},

which implies

V (t + 1) = a V (t) + {(1 - m)}^{2} \frac{x_{0} (1 - x_{0})}{2 N},

where

a \approx 1 - 2 m - 1 / (2 N)

and the approximate equality assumes m << 1 and N >> 1, which is the case of interest in the present application.
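
The approximation for a follows from collecting the coefficients of V(t) in the recursion and expanding the product. Written out, as a restatement of the step above with no new assumptions:

```latex
a = (1 - m)^{2} \left(1 - \frac{1}{2N}\right)
  = 1 - 2m - \frac{1}{2N} + m^{2} + \frac{m}{N} - \frac{m^{2}}{2N}
  \approx 1 - 2m - \frac{1}{2N}
```

The neglected terms are of order m² and m/N, which are small when m << 1 and N >> 1.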

This is a linear recursion equation that has the solution

V (t) = c_{1} a^{t} + c_{2} .

The constants c₁ and c₂ are chosen to satisfy the initial condition and the recursion itself, which requires

\begin{matrix} c_{1} + c_{2} = V_{0} \\ c_{2} = a c_{2} + \frac{x_{0} (1 - x_{0})}{2 N}, \end{matrix}

where we are ignoring terms of order m/N and m².

Therefore

\begin{matrix} c_{2} = \frac{x_{0} (1 - x_{0})}{2 N (1 - a)} \\ c_{1} = V_{0} - \frac{x_{0} (1 - x_{0})}{2 N (1 - a)} . \end{matrix}

The solution has to satisfy V₀ = x₀(1 − x₀)/(2k), which implies

\begin{matrix} c_{1} = x_{0} (1 - x_{0}) (\frac{1}{2 k} - \frac{1}{2 N (1 - a)}) \\ c_{2} = \frac{x_{0} (1 - x_{0})}{2 N (1 - a)} . \end{matrix}

After T generations

\begin{array}{l} V (T) = x_{0} (1 - x_{0}) (\frac{1}{2 k} - \frac{1}{2 N (1 - a)}) a^{T} + \frac{x_{0} (1 - x_{0})}{2 N (1 - a)} \\ = \frac{x_{0} (1 - x_{0})}{2 k} a^{T} + \frac{x_{0} (1 - x_{0})}{2 N (1 - a)} (1 - a^{T}) . \end{array}
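
As a numerical check on this closed form, the sketch below iterates the recursion for V(t) without approximation and compares the result with the expression for V(T), which uses a ≈ 1 − 2m − 1/(2N) and ignores terms of order m/N and m². The parameter values and function names are hypothetical and chosen only to satisfy m << 1 and N >> 1.

```python
def variance_by_recursion(k, pop_size, m, t_gen, x0):
    """Iterate V(t+1) = a V(t) + (1-m)^2 x0 (1-x0) / (2N) with exact a."""
    a = (1.0 - m) ** 2 * (1.0 - 1.0 / (2.0 * pop_size))
    b = (1.0 - m) ** 2 * x0 * (1.0 - x0) / (2.0 * pop_size)
    v = x0 * (1.0 - x0) / (2.0 * k)  # initial condition V(0)
    for _ in range(t_gen):
        v = a * v + b
    return v


def variance_closed_form(k, pop_size, m, t_gen, x0):
    """Closed-form V(T) using the approximation a ~ 1 - 2m - 1/(2N)."""
    a = 1.0 - 2.0 * m - 1.0 / (2.0 * pop_size)
    h = x0 * (1.0 - x0)
    a_t = a**t_gen
    return h / (2.0 * k) * a_t + h / (2.0 * pop_size * (1.0 - a)) * (1.0 - a_t)


if __name__ == "__main__":
    args = dict(k=10, pop_size=1000, m=0.01, t_gen=10, x0=0.5)
    print(variance_by_recursion(**args), variance_closed_form(**args))
```

For these values the two results differ by well under 1%, consistent with the order of the neglected terms.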

To define an effective number of founders, we solve

\frac{x_{0} (1 - x_{0})}{2 k_{e}} = \frac{x_{0} (1 - x_{0})}{2 k} a^{T} + \frac{x_{0} (1 - x_{0})}{2 N (1 - a)} (1 - a^{T})

for k_e to obtain

k_{e} = \frac{1}{a^{T} / k + (1 - a^{T}) / [N (1 - a)]} .

k_e may be larger or smaller than k. In general, k_e increases with increasing m but may increase or decrease with increasing T. Continued gene flow increases the effective number of founders because it reduces the variance in x at the time the next population is founded.
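
The dependence of k_e on m can be explored with a few lines of code. The sketch below evaluates k_e = 1/(a^T/k + (1 − a^T)/[N(1 − a)]) with a ≈ 1 − 2m − 1/(2N); the function name and the parameter values are illustrative assumptions rather than values taken from the simulations.

```python
def effective_founders(k, pop_size, m, t_gen):
    """Effective number of founders k_e, using a ~ 1 - 2m - 1/(2N)."""
    a = 1.0 - 2.0 * m - 1.0 / (2.0 * pop_size)
    a_t = a**t_gen
    return 1.0 / (a_t / k + (1.0 - a_t) / (pop_size * (1.0 - a)))


if __name__ == "__main__":
    for m in (0.0, 0.01, 0.05):
        k_e = effective_founders(k=10, pop_size=1000, m=m, t_gen=10)
        print(f"m = {m}: k_e = {k_e:.2f}")
```

For these example values, k_e is slightly below k when m = 0, because drift during the T generations adds to the founder effect, and it rises above k as m increases.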

In this derivation, we have not included the effect of migration from the newly founded population back into the source of the colonists. This back migration would change x₀ slightly. We justify ignoring this change because it is of the same order of magnitude as the migration rate, m, and hence will modify the effect of immigration on the variance in the newly founded population by a term that is only of order m².

Appendix B

Derivation of the Effective Selection Coefficients

To model the effect of selection during the T generations, we assume that A is the deleterious allele and that the relative fitnesses of individuals with 2, 1, or 0 copies of A are 1 – s₂, 1 – s₁, and 1. The deterministic change in the frequency of A in one generation is

Δ x = \frac{- s_{2} x^{2} (1 - x) - s_{1} x (1 - x) (1 - 2 x)}{1 - s_{2} x^{2} - 2 s_{1} x (1 - x)},

where x is the frequency of A. Our interest is in the case in which the selection coefficients are small and A is in low frequency. Under these assumptions, when A has intermediate dominance (s₁ > 0), this equation can be approximated by

Δ x \approx - s_{1} x,

which has the solution

x (T) = x (0) {(1 - s_{1})}^{T} \approx x (0) (1 - T s_{1})

after T generations if Ts₁ is small. With this approximation, we can summarize the net effect of selection by replacing s₁ by an effective selection coefficient s_1,e = Ts₁.

When A is recessive (s₁ = 0), the equation of change can be approximated by

Δ x \approx - s_{2} x^{2} .

This difference equation does not have a closed-form solution but, if s₂ is small, it can be approximated by a differential equation that does,

\frac{d x}{d t} = - s_{2} x^{2},

which has the solution

x (T) = \frac{x (0)}{1 + s_{2} T x (0)} .

Once again, we can define the effective selection coefficient to be s_2,e = Ts₂.
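
The quality of both approximations can be checked by iterating the one-generation change directly. The sketch below uses Δx = −[s₂x²(1 − x) + s₁x(1 − x)(1 − 2x)] / [1 − s₂x² − 2s₁x(1 − x)], which follows from the fitnesses 1 − s₂, 1 − s₁, and 1, and compares the iterated frequency with x(0)(1 − s₁)^T and x(0)/(1 + s₂Tx(0)). The selection coefficients, initial frequencies, and function names are hypothetical values chosen to be small, as the approximations require.

```python
def delta_x(x, s1, s2):
    """One-generation change in the frequency of the deleterious allele A.

    Fitnesses of genotypes with 2, 1, 0 copies of A: 1 - s2, 1 - s1, 1.
    """
    mean_fitness = 1.0 - s2 * x * x - 2.0 * s1 * x * (1.0 - x)
    numerator = -s2 * x * x * (1.0 - x) - s1 * x * (1.0 - x) * (1.0 - 2.0 * x)
    return numerator / mean_fitness


def iterate(x0, s1, s2, t_gen):
    """Frequency of A after t_gen generations of deterministic selection."""
    x = x0
    for _ in range(t_gen):
        x += delta_x(x, s1, s2)
    return x


def approx_intermediate(x0, s1, t_gen):
    """Approximation for intermediate dominance: x(0) (1 - s1)^T."""
    return x0 * (1.0 - s1) ** t_gen


def approx_recessive(x0, s2, t_gen):
    """Approximation for a recessive allele: x(0) / (1 + s2 T x(0))."""
    return x0 / (1.0 + s2 * t_gen * x0)


if __name__ == "__main__":
    print(iterate(0.01, 0.001, 0.002, 50), approx_intermediate(0.01, 0.001, 50))
    print(iterate(0.05, 0.0, 0.01, 50), approx_recessive(0.05, 0.01, 50))
```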

Footnotes

Communicating editor: W. Stephan

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
