Genetics, Vol. 164, 767-779, June 2003, Copyright © 2003

Fixation Probability and Time in Subdivided Populations

Michael C. Whitlock
Department of Zoology, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada

Corresponding author: Michael C. Whitlock, University of British Columbia, 6270 University Blvd., Vancouver, BC V6T 1Z4, Canada. E-mail: whitlock@zoology.ubc.ca

Communicating editor: D. CHARLESWORTH


*  ABSTRACT
*TOP
*ABSTRACT
*KIMURA'S DIFFUSION RESULTS
*PROBABILITY OF FIXATION IN...
*DRIFT LOAD AND FIXATION...
*TIME TO FIXATION
*LIMITS TO THE APPROXIMATIONS
*DISCUSSION
*LITERATURE CITED

New alleles arising in a population by mutation ultimately are either fixed or lost. Either is possible, for both beneficial and deleterious alleles, because of stochastic changes in allele frequency due to genetic drift. Spatially structured populations differ from unstructured populations in the probability of fixation and the time that this fixation takes. Previous results have generally made many assumptions: that all demes contribute to the next generation in exact proportion to their current sizes, that new mutations are beneficial, and that new alleles have additive effects. In this article these assumptions are relaxed, allowing for an arbitrary distribution among demes of reproductive success, both beneficial and deleterious effects, and arbitrary dominance. The effects of population structure can be expressed with two summary statistics: the effective population size and a variant of Wright's FST. In general, the probability of fixation is strongly affected by population structure, as is the expected time to fixation or loss. Population structure changes the effective size of the species, often strongly downward; smaller effective size increases the probability of fixing deleterious alleles and decreases the probability of fixing beneficial alleles. On the other hand, population structure causes an increase in the homozygosity of alleles, which increases the probability of fixing beneficial alleles but somewhat decreases the probability of fixing deleterious alleles. The probability of fixing new beneficial alleles can be simply described by 2hs(1 - FST)Ne/Ntot, where hs is the change in fitness of heterozygotes relative to the ancestral homozygote, FST is a weighted version of Wright's measure of population subdivision, and Ne and Ntot are the effective and census sizes, respectively. 
These results are verified by simulation for a broad range of population structures, including the island model, the stepping-stone model, and a model with extinction and recolonization.


THE rate of evolution is determined by the balance between the changes in frequency of beneficial and deleterious alleles. If important alleles are always present in populations in large numbers of copies, then the rate of evolution will be deterministic, with beneficial alleles always increasing in frequency and deleterious alleles decreasing. However, if the product of the effective population size and mutation rates is small enough, then there will not always be "large" numbers of each allele in a population at any given time. Indeed, for the range of population sizes experienced by many (or most) species, all possible alleles are not present at any given time. Many alleles are therefore present initially as a single copy, and even in a large population the future of this allele will depend on stochastic processes. If the rate of evolution is limited by the occurrence and fixation of these new mutations, rather than relying on a pool of standing variation for adaptive change, then the rate of evolution will depend on the probability of fixation of these new mutations.

Furthermore, if population size is small enough, deleterious alleles will occasionally fix in species by genetic drift, and this can cause a decline in mean fitness, perhaps even to extinction (WRIGHT 1931; KIMURA et al. 1963; LYNCH and GABRIEL 1990; LANDE 1994; LYNCH et al. 1995A; LYNCH et al. 1995B). The rate of decline in fitness due to new deleterious mutations depends on the frequency at which these mutations appear and the probability that they fix. In many species, the population size is small enough that the balance between the fixation of new beneficial and new deleterious mutations is critical to the persistence of the species (WHITLOCK 2000).

The first approach to this problem of finding the probability of fixation of a new mutation was made by HALDANE (1927), based on a suggestion of FISHER (1922) and later refined by FISHER (1930). Initially, they considered only beneficial alleles in infinite, ideal populations. They showed that the probability of fixation of such an allele was approximately only twice the selective advantage of heterozygotes (2hs). An allele with a 1% advantage has only a 2% probability of fixation and a 98% chance of being lost. This is perhaps one of the most remarkable results in population genetics, that an allele with a substantial selective advantage may be lost a majority of the times it appears as a result of stochastic effects even in an infinitely large population.

KIMURA (1957, 1962; see also CABALLERO and HILL 1992) generalized these results to account for deleterious alleles, arbitrary dominance, and arbitrary effective population size. (See the next section.) For additive beneficial alleles, the effect of including effective population size in the calculations is that the probability of fixation becomes about twice the heterozygote selective advantage times the ratio of the effective size to the census size (2hsNe/N). These results did not apply directly to spatially structured populations, although we see below that their derivation is extremely useful for results with the spatially structured case.

Spatial population structure was introduced to the study of fixation by MARUYAMA (1970), who found that for the island model and other cases with conservative migration, the probability of fixation of a beneficial additive allele remained two times the selective advantage of heterozygotes. (Conservative migration means that migration does not change local population sizes. A further assumption to get this "invariance" result, however, is that all demes contribute to the next generation exactly in proportion to their size.) Subsequent articles have for the most part retained the assumption that migration is conservative (MARUYAMA 1974; NAGYLAKI 1980, 1982; SLATKIN 1981), leading to the widespread belief of population geneticists that population structure makes no difference in the probability of fixation. BARTON (1993) showed, however, that with local extinctions and colonizations, the probability of fixation of beneficial additive alleles in a subdivided population could be very different from that in a panmictic population. Furthermore, he showed that this difference was not fully described by accounting for the change in effective population size in such metapopulations.

This article uses Kimura's diffusion methods with some recent results on spatially structured populations to find the probability of fixation of beneficial or deleterious alleles with arbitrary dominance, allowing for a broad diversity of population structures. All previous results are obtained as special cases. Furthermore, the expected times to fixation and loss are calculated. These analytic results make several assumptions, but simulations show that the results are, perhaps surprisingly, fairly robust to deviations from these assumptions.


*  KIMURA'S DIFFUSION RESULTS

KIMURA (1957, 1962; CROW and KIMURA 1970) used a diffusion process to derive a fuller model of the probability of fixation, which allowed for both beneficial and deleterious alleles, nonideal populations (i.e., the effective population size could be different from the census size), and arbitrary dominance. The approach requires knowing the mean change in allele frequency over a generation ($M_{\delta x}$) and the variance in that change ($V_{\delta x}$) as a function of allele frequency (x). $V_{\delta x}$ is given by $x(1 - x)/(2N_e)$, where Ne is the diploid variance effective population size. Following Kimura, we write the indefinite integral $G(x) = \exp\left(-\int 2M_{\delta x}/V_{\delta x}\,dx\right)$ (or, more properly, $G(x) = \exp\left(-\int_0^x 2M_{\delta y}/V_{\delta y}\,dy\right)$). The probability of fixation can be found as

$$u[p] = \frac{\int_0^p G(x)\,dx}{\int_0^1 G(x)\,dx}, \qquad (1)$$

where u[p] is the probability of fixation for an allele initially at frequency p. If the function $M_{\delta x}$ is of the form $M_{\delta x} = \sigma x(1 - x)$ (as it is in the simplest case of additive, directional selection in an ideal population), then the integrals in Equation 1 can be solved to give

$$u[p] = \frac{1 - e^{-4N_e\sigma p}}{1 - e^{-4N_e\sigma}}. \qquad (2)$$

If the fitnesses of diploid individuals carrying 0, 1, or 2 copies of the allele in question are 1, 1 + s/2, and 1 + s, respectively, $\sigma$ in the previous equation can be replaced by s/2. If s > 0, such that this allele is beneficial, then for alleles of relatively weak effect the probability of fixation of an allele present in the population as a single copy turns out to be ~sNe/N. In an ideal population, such that Ne = N, this recreates Haldane's result of u = s, for p = 1/(2N). [Note that, for consistency with later notation, the heterozygous effect has been defined as s/2 here, rather than the more traditional s in HALDANE (1927) or CROW and KIMURA (1970).] For more complicated expressions of $M_{\delta x}$, such as that which occurs with arbitrary dominance, the integrals in Equation 1 may have to be solved numerically.
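The closed-form and numerical routes described above can be compared directly. The following is a minimal sketch for a single panmictic population, not code from the article. It assumes the standard weak-selection form M = s x(1 - x)[h + (1 - 2h)x] for genotype fitnesses 1, 1 + hs, 1 + s, under which 2M/V integrates to 4 Ne s [hx + (1 - 2h)x²/2]. All function names are hypothetical, and Simpson's rule is just one convenient choice of quadrature.

```python
import math

def fixation_prob_additive(p, sigma, ne):
    """Closed form (Equation 2) for M = sigma * x * (1 - x)."""
    a = 4.0 * ne * sigma
    if abs(a) < 1e-12:              # neutral limit: u[p] -> p
        return p
    # expm1 keeps precision when a * p is tiny
    return math.expm1(-a * p) / math.expm1(-a)

def g_dominance(x, s, h, ne):
    """G(x) for fitnesses 1, 1 + h*s, 1 + s in one panmictic population.

    Assumption (not from the article): M = s*x*(1-x)*(h + (1-2h)*x),
    so that 2M/V = 4*Ne*s*(h + (1-2h)*x).
    """
    return math.exp(-4.0 * ne * s * (h * x + (1.0 - 2.0 * h) * x * x / 2.0))

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3.0

def fixation_prob_numeric(p, s, h, ne):
    """Equation 1 evaluated by numerical integration."""
    g = lambda x: g_dominance(x, s, h, ne)
    return simpson(g, 0.0, p) / simpson(g, 0.0, 1.0)

if __name__ == "__main__":
    n = ne = 1000
    s = 0.01
    p = 1.0 / (2 * n)
    print("closed form, additive:", fixation_prob_additive(p, s / 2, ne))
    print("numeric, h = 1/2     :", fixation_prob_numeric(p, s, 0.5, ne))
    print("numeric, h = 0       :", fixation_prob_numeric(p, s, 0.0, ne))
    print("s * Ne / N           :", s * ne / n)
```

With h = 1/2 the numerical integral should reproduce the closed form, and both should sit close to sNe/N for weak positive selection. Setting h = 0 illustrates how much lower the fixation probability of a fully recessive allele is in an undivided population.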

The diffusion approach makes several assumptions. First, it assumes that the state of a population at any given time can be described by its allele frequency, meaning that the history of the population is unimportant except as described by the current allele frequency. With population structure, many different states of the metapopulation have the same mean allele frequency. An additional assumption required for the approach below is that some description of the population structure does not change much over time. Wright's FST can take this role.

The second assumption of the diffusion approach is that the changes in allele frequency per generation are small enough that the process can be adequately described by a continuous-time approximation. In practice this requires that the effective population size is not too small and the strength of selection is not too large (s << 1). Finally, the diffusion approach assumes that the third and higher moments of change in allele frequency per generation are negligibly small. As simulations indicate, the robustness of these assumptions is not much affected by adding population structure to the model. See EWENS (1979) for a fuller discussion of the assumptions of diffusion models.
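One way to get a feel for how well the diffusion holds up is to compare it with a direct Wright-Fisher simulation. The sketch below does this for a single ideal population only (Ne = N, additive fitnesses 1, 1 + s/2, 1 + s); it is an illustration under those assumptions, not the simulation code used in the article, and the function names, seed, and parameter values are arbitrary choices.

```python
import math
import random

def simulate_fixation(n_diploid, s, reps, seed=1):
    """Fraction of new additive mutants that fix in one ideal
    Wright-Fisher population of n_diploid individuals."""
    rng = random.Random(seed)
    genes = 2 * n_diploid
    fixed = 0
    for _ in range(reps):
        count = 1                                   # a single new copy
        while 0 < count < genes:
            x = count / genes
            # deterministic change from selection (Hardy-Weinberg, additive):
            # mean fitness is 1 + s*x, marginal fitness of the allele is
            # 1 + s*(1 + x)/2
            x_sel = x * (1.0 + 0.5 * s * (1.0 + x)) / (1.0 + s * x)
            # drift: binomial sampling of 2N genes
            count = sum(1 for _ in range(genes) if rng.random() < x_sel)
        fixed += (count == genes)
    return fixed / reps

def diffusion_prediction(n_diploid, s):
    """Equation 2 with sigma = s/2, Ne = N, and p = 1/(2N)."""
    a = 2.0 * n_diploid * s                         # equals 4 * Ne * sigma
    return math.expm1(-a / (2 * n_diploid)) / math.expm1(-a)

if __name__ == "__main__":
    est = simulate_fixation(50, 0.05, 5000)
    print("simulated :", est)
    print("diffusion :", diffusion_prediction(50, 0.05))
```

The estimate carries binomial sampling error, so agreement should be judged against a standard error of roughly sqrt(u(1 - u)/reps) rather than expected to be exact.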


*  PROBABILITY OF FIXATION IN A METAPOPULATION

Diffusion models and metapopulations:
With some recent results on evolution in subdivided populations, Kimura's approach can be easily modified to reach results about the probability of fixation of alleles in metapopulations. The amount of genetic drift in metapopulations has recently been described (WHITLOCK and BARTON 1997; WANG and CABALLERO 1999), such that the variance in change in allele frequency over the whole metapopulation is asymptotically given by $V_{\delta x} = \bar{x}(1 - \bar{x})/(2N_e)$, where $\bar{x}$ is the mean frequency of an allele (weighted by local population size). This formulation accounts for not only the random changes in allele frequency due to drift in a single generation, but also the long-term effects of these changes (caused, for example, by correlations over time in the differential success of demes). This formulation allows the demographic properties of each deme to change over time, but does assume that the metapopulation has reached a dynamic equilibrium, such that the distribution of demographic states is constant over time. Note that when each deme is weighted by its size, the mean frequency of an allele is also its frequency in the metapopulation as a whole. Thus it remains to find the expected change in allele frequency in a metapopulation due to selection. A recent article (WHITLOCK 2002) has derived this rate of change for weakly selected alleles (such that |s| is less than the reciprocal of the maximum local effective population size). With some assumptions (discussed below in LIMITS TO THE APPROXIMATIONS), these results on the mean and variance of allele frequency change can be used to get what turns out to be a very good approximation of the probability of fixation in subdivided populations. The validity of these approximations is tested by simulation later in this article.

For diploid individuals, the relative fitnesses of the three possible genotypes are defined to be 1, 1 + hs, and 1 + s, respectively, for 0, 1, or 2 copies of the x allele. Assume for now that there is random mating within each deme such that the genotypes are present in local Hardy-Weinberg proportions. The distribution of allele frequencies across populations can be described usefully by a variation of Wright's FST, where $F_{ST} = V[x]/(\bar{x}(1 - \bar{x}))$ and V[x] is the variance among demes in allele frequency, weighted by population size. Note that this differs from the standard definition of FST by weighting each individual, but not necessarily each population, equally.
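The size-weighted statistic can be computed directly from deme allele frequencies and deme sizes. The following is a minimal sketch, assuming FST is the size-weighted variance among demes divided by x̄(1 - x̄), with x̄ the size-weighted mean frequency; the function name and the example numbers are illustrative only.

```python
def weighted_fst(freqs, sizes):
    """Size-weighted F_ST = V[x] / (xbar * (1 - xbar)) across demes.

    freqs: allele frequency in each deme
    sizes: number of individuals in each deme (used as weights)
    """
    total = float(sum(sizes))
    weights = [n / total for n in sizes]
    # weighted mean frequency = frequency in the metapopulation as a whole
    xbar = sum(w * x for w, x in zip(weights, freqs))
    if xbar <= 0.0 or xbar >= 1.0:
        raise ValueError("F_ST is undefined when the allele is lost or fixed")
    # weighted variance among demes
    var = sum(w * (x - xbar) ** 2 for w, x in zip(weights, freqs))
    return var / (xbar * (1.0 - xbar))

if __name__ == "__main__":
    # a large deme at frequency 0.2 and a small deme at frequency 0.8
    print(weighted_fst([0.2, 0.8], [300, 100]))
```

Because each individual rather than each deme carries equal weight, a large deme pulls both the mean and the variance toward its own frequency, which is the distinction from the standard definition noted above.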

The individuals in each deme may contribute to the following generation in two ways: they can have offspring that remain in the same deme as residents and offspring that migrate to other demes. The total contribution of a deme to the next generation is the sum of these offspring individuals, regardless of whether they migrate or not. If the total contribution of a deme is independent of the genotypes in the deme (but instead depends on the total resources available to the deme and/or the luck of the deme with respect to local catastrophes, etc.), then I refer to this as soft selection (see WALLACE 1970). When the expected contribution of a deme to the next generation is proportional to its mean fitness determined by its distribution of genotypes, I call this hard selection. CHRISTIANSEN (1975) defines hard and soft selection in subdivided populations in this way. The two are not binary alternatives, but in general act as ends to a scale of possibilities (WHITLOCK 2002).

When all local populations contribute to the next generation independently of their genotype frequencies (soft selection), the change in allele frequency over a generation in a metapopulation is approximately

$$M_{\delta x} = s\,\bar{x}(1 - \bar{x})\,\vartheta_{soft}, \qquad (3)$$

where, to order $s^2$,

(4)

(WHITLOCK 2002). With an additively acting locus, such that h = 1/2, we find

$$M_{\delta x} = \frac{s(1 - F_{ST})}{2}\,\bar{x}(1 - \bar{x}), \qquad (5)$$

as expected. Note that this additive case is proportional to $\bar{x}(1 - \bar{x})$, so the corresponding ratio $M_{\delta x}/V_{\delta x}$ will not be a function of the allele frequency; therefore Kimura's equation can be solved directly.

When the expected contribution of a deme to the next generation is proportional to its mean fitness determined by its distribution of genotypes (hard selection), the allele frequency changes over generations as

$$M_{\delta x} = s\,\bar{x}(1 - \bar{x})\,\vartheta_{hard}, \qquad (6)$$

where

(7)

(WHITLOCK 2002). With hard selection and additively acting alleles, the change in allele frequency reduces to

$$M_{\delta x} = \frac{s(1 + F_{ST})}{2}\,\bar{x}(1 - \bar{x}), \qquad (8)$$

which is again a function of order $\bar{x}(1 - \bar{x})$. As pointed out in the previous article (WHITLOCK 2002), $\vartheta_{soft}$ is therefore approximately equal to $(1 - r)\,\vartheta_{hard}$, where r is the relatedness of individuals within a deme ($r \equiv 2F_{ST}/(1 + F_{ST})$). Hard selection in a structured population is always more effective than selection in an undivided population because of the increased frequency of homozygotes, as long as h < 1.
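For the additive case the relation between the soft and hard coefficients can be checked by hand. The short calculation below uses only the definition of r given above and the additive coefficients (1 - FST)/2 and (1 + FST)/2 that multiply s under soft and hard selection, respectively; it is a consistency check rather than a derivation of the general result.

```latex
1 - r \;=\; 1 - \frac{2F_{ST}}{1 + F_{ST}} \;=\; \frac{1 - F_{ST}}{1 + F_{ST}},
\qquad
(1 - r)\,\frac{1 + F_{ST}}{2} \;=\; \frac{1 - F_{ST}}{2}.
```

So with h = 1/2 the soft-selection coefficient equals (1 - r) times the hard-selection coefficient exactly; the approximation enters only for other values of h.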

With these equations for the mean and variance of the overall allele frequency of the set of populations, an approximation of the probability of fixation can be found by putting $M_{\delta x}$ and $V_{\delta x}$ into Equation 1. This is an approximation, not only because the diffusion itself is an approximation as discussed above, but also because the true system of equations describing a set of populations would be a complex set of separate diffusion equations for each local population. We see that the simpler approximation using the overall allele frequency change with FST values determined by neutral equilibrium expectations does extremely well, surprisingly so, to predict the actual probabilities of fixation.

Additive beneficial alleles:
General model: For the additive cases, as suggested above, the integration in Equation 1 is simple. In fact, $\sigma$ in the single population case in the previous section can be replaced by s(1 - FST)/2 with soft selection or by s(1 + FST)/2 with hard selection. Thus we can write

$$u[p] = \frac{1 - e^{-2s(1 - F_{ST})N_e p}}{1 - e^{-2s(1 - F_{ST})N_e}} \qquad (9)$$

for soft selection and

$$u[p] = \frac{1 - e^{-2s(1 + F_{ST})N_e p}}{1 - e^{-2s(1 + F_{ST})N_e}} \qquad (10)$$

for hard selection. For beneficial alleles in a large metapopulation, the denominator of these functions will be approximately unity, and the frequency of a new allele present as a single copy will be 1/(2Ntot) (where Ntot is the total census size of the metapopulation), so that the probability of fixation of a new weakly beneficial allele will be approximately

$$u \approx s(1 - F_{ST})\frac{N_e}{N_{tot}} \qquad (11)$$

for soft selection or

$$u \approx s(1 + F_{ST})\frac{N_e}{N_{tot}} \qquad (12)$$

with hard selection. Given that the variance effective population size of a subdivided population is likely to be lower than its census size and lower than the Ne of an undivided population with the same number of individuals (WHITLOCK and BARTON 1997), the probability of fixation is likely to be lower in a metapopulation than in an undivided population.
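The additive metapopulation result can be evaluated numerically by taking the single-population closed form and substituting s(1 - FST)/2 (soft selection) or s(1 + FST)/2 (hard selection) for σ, as described above. The sketch below does exactly that; the function names and the example parameter values are illustrative assumptions, and Ne and FST must be supplied from whatever demographic model is being considered.

```python
import math

def fixation_prob_metapop(p, s, fst, ne, hard=False):
    """Additive fixation probability in a metapopulation.

    Uses u[p] = (1 - exp(-4*Ne*sigma*p)) / (1 - exp(-4*Ne*sigma)) with
    sigma = s*(1 - fst)/2 for soft or s*(1 + fst)/2 for hard selection.
    """
    factor = (1.0 + fst) if hard else (1.0 - fst)
    a = 2.0 * s * factor * ne            # equals 4 * Ne * sigma
    if abs(a) < 1e-12:                   # neutral limit
        return p
    return math.expm1(-a * p) / math.expm1(-a)

def new_mutant_approx(s, fst, ne, ntot, hard=False):
    """Weak-selection approximation for a single new beneficial copy."""
    factor = (1.0 + fst) if hard else (1.0 - fst)
    return s * factor * ne / ntot

if __name__ == "__main__":
    ntot, ne, fst, s = 10000, 5000, 0.2, 0.002   # hypothetical values
    p = 1.0 / (2 * ntot)
    for hard in (False, True):
        label = "hard" if hard else "soft"
        print(label,
              fixation_prob_metapop(p, s, fst, ne, hard),
              new_mutant_approx(s, fst, ne, ntot, hard))
```

Setting `s` negative gives the corresponding probability for an additive deleterious allele from the same closed form, although the single-copy approximation then no longer applies.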

Strictly speaking, the value of FST used in these equations should be the expected value for an allele under the same selection as the alleles in question. Finding this value is difficult, however. Fortunately, for relatively weakly and uniformly selected alleles, the FST expected for neutral variation does an excellent job of predicting the FST of selected alleles (WHITLOCK 2002). Therefore, the large amount of theory predicting neutral FST can be applied in these equations. The weaknesses of this assumption are discussed in WHITLOCK (2002).

The island model and stepping-stone models: The island model and other models with conservative migration have been considered previously (MARUYAMA 1970, 1974; SLATKIN 1981; NAGYLAKI 1982). With these models (which include the island model, stepping-stone models, etc.), the effective population size is Ntot/(1 - FST) (WRIGHT 1939; WHITLOCK and BARTON 1997). The work of Maruyama and Nagylaki, etc., also assumes soft selection, so by Equation 9, the probability of fixation of a new additive beneficial allele should be

$$u \approx s(1 - F_{ST})\frac{N_e}{N_{tot}} = s(1 - F_{ST})\frac{1}{1 - F_{ST}} = s. \qquad (13)$$

Changing the fitness function from 1:1 + s/2:1 + s to the 1:1 + s:1 + 2s assumed in the previous work, the current equation is in complete agreement for these reduced cases with completely conservative migration. For example, with a one-dimensional stepping-stone model and an additive beneficial allele, simulation results show that Equation 13 is quite accurate for weak selection. However, these models are aberrant in one significant way: the variance among populations in reproductive success is set to a minimum, zero. Each population contributes exactly equally to the following generation. This mathematical convenience does not reflect biological reality, unfortunately. For more realistic cases where migration is not completely conservative, such that there is some variance in mean fitness among populations, the effective population size is no longer given by Ntot/(1 - FST) and the probability of fixation is no longer given by s. Such cases are dealt with next.
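The change of fitness scale mentioned above can be written out explicitly. In this short sketch s' is a label introduced here (it does not appear in the original) for the heterozygous advantage under the 1 : 1 + s' : 1 + 2s' parameterization of the earlier work.

```latex
1 : 1 + \tfrac{s}{2} : 1 + s
\quad\longleftrightarrow\quad
1 : 1 + s' : 1 + 2s'
\qquad\text{with } s = 2s',
\qquad\text{so that}\qquad
u \approx s = 2s'.
```

Read this way, a fixation probability of s in the present notation is the same statement as the earlier invariance result of twice the selective advantage of heterozygotes.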

Extinction-recolonization models: The only other models of population structure that, to my knowledge, have been considered analytically for the probability of fixation of alleles are two analyzed by BARTON (1993). These models allow two special cases of local extinction and recolonization. He focused on additive beneficial alleles and implicitly assumed soft selection.

The extinction-recolonization model to be considered here derives from a model introduced by SLATKIN (1977) and developed by WADE and MCCAULEY (1988) and WHITLOCK and MCCAULEY (1990). It assumes that there are d local populations of size N, except during extinction or recolonization. Extinction occurs with probability e per generation. Newly colonized populations have k individuals, with a probability $\phi$ that two individuals in a colonizing group come from the same source population. Otherwise the metapopulation functions like an island model, with each extant population exchanging m of its individuals equally among all other populations at random each generation.

Barton considered two special cases of recolonization, one where a new population was founded by a single haplotype (k = 1/2) and another where the population was founded at carrying capacity by individuals chosen at random from the rest of the large metapopulation (k = N, $\phi$ = 0). In these cases, the probabilities of fixing a new beneficial allele, making the nomenclature consistent with the rest of this article, are

(14)

and

(15)

respectively (BARTON 1993). The Ne for the extinction-recolonization case has been known since SLATKIN (1977) (see also MARUYAMA and KIMURA 1980; the more generalized case given here was found by WHITLOCK and BARTON 1997) to be

(16)

and

(17)

(WHITLOCK and MCCAULEY 1990). Using these results for Ne and FST, we can see that the probabilities of fixation of a new beneficial allele derived by BARTON (1993) are in fact given by s(1 - FST)Ne/Ntot, as shown above, to the order of approximation used by Barton. More importantly, Equation 16 and Equation 17 applied to Equation 1 generalize the result for deleterious alleles (see below) and a broader range of colonization types and allow us to predict the fixation probability of alleles present at an arbitrary starting frequency and with arbitrary dominance.

Simulations confirm that these equations are in fact successful at predicting the probability of fixing new beneficial alleles. Fig 1 shows the correspondence between the theoretical approximation and the simulation results. The approximations involved in using the diffusion approximation and in using the neutral value of FST work extremely well.



Figure 1. The probability of fixation predicted by the diffusion model is verified by simulation. The cases simulated ranged from the island model, the stepping-stone model, and a span of the parameter space for an extinction/colonization model, with extinction rates varying between 0 and 0.1, probabilities of common origin varying from 0 to 1, and migration rates ranging from 0.001 or 0.1. The dots each plot the results of $10^6$ simulations for a particular parameter set, while the line plots the expected value from the diffusion approximation. In each case there are 100 demes with a carrying capacity of 100 diploid individuals, and s = 0.002 with additive gene expression (h = 1/2).

Hard and soft selection: The examples in the previous sections have all assumed that the productivity of a deme is unrelated to its genotype frequencies; i.e., they have assumed soft selection. When the fitness of a deme does affect its productivity (hard selection), selection is more effective, because competition among relatives no longer hinders the response to selection. The productivity of a deme includes not only the migrants that it gives to other demes, but also the resident individuals that descend from local individuals (see WHITLOCK 2002). If demes are allowed to grow and shrink, then hard selection is possible even without migration, because the alleles in the growing demes make up increasing proportions of the alleles in the metapopulation.

For example, consider a modified island model, where the total contribution of a deme is determined by the product of its current size and its relative mean fitness. A deme with higher fitness grows proportionally, and its contribution to the migrant pool increases accordingly. Each individual has a constant probability of emigrating, and the total metapopulation size is held constant. All demes receive the same number of immigrants from the migrant pool each generation. As long as s < m, the new allele will migrate out of its original deme before it reaches fixation there, and the FST of the system will be well approximated by the neutral FST (WHITLOCK 2002). Fig 2 shows the probability of fixation over a range of migration rates in this model. Equation 10 predicts the probability of fixation with hard selection well. Note that the probability of fixation is now no longer constant even with the island model, but that this probability changes greatly as a function of the migration rate. When the migration rate is low, FST is high, and the resulting differences in fitness among demes cause greater response to selection.



Figure 2. The probability of fixation in a modified island model with hard selection. The size of each deme starts at 100, and then each deme has a total number of offspring proportional to its mean fitness, with the total metapopulation size held constant. A fraction (1 - m) of these offspring remain in their parental deme, and each diploid individual migrates via a migrant pool with the same probability. Thus demes with higher mean fitness contribute proportionally more individuals to the next generation both by population growth and by migration. Other parameters are h = 1/2, s = 0.001, with 100 demes. The dots represent the proportion of fixations from $10^7$ simulations.

Source-sink metapopulations: For many (or most) structured populations, the contribution to the next generation varies substantially among demes. This variance in reproductive success causes a reduction in the effective population size, and we have already seen the effects of such variance in the section on extinction and colonization. However, sometimes this variance among demes in reproductive success is maintained across generations; that is, the most successful demes continue to be successful and unproductive demes also remain so. This generates a correlation in reproductive success across generations, which also decreases the effective population size (NAGYLAKI 1982; WHITLOCK and BARTON 1997). Fig 3 shows that the effects on the probability of fixation caused by this sort of correlation across generations can be predicted usefully using the diffusion equations reported here. As a fraction of the metapopulation contributes less and less to the future of the species, the effective population size drops, and the probability of fixation of beneficial alleles likewise plummets.



Figure 3. The probability of fixation in a source-sink model. There are 100 demes, 20 of which are "sources" and the rest are "sinks." Each deme has 100 individuals, and the immigration rate to the sources is 0.2 while in the sinks it is 0.25. Demes exchange migrants by a modified island model, where each sink's contribution to the migrant pool is a fraction of that of each source. As this asymmetry increases, the effective population size is reduced, and the probability of fixation of beneficial alleles drops. For these examples, s = 0.002 and h = 1/2, and the dots represent the results of $10^7$ simulations.

Beneficial alleles with arbitrary dominance:
With partially recessive alleles (h ≠ 1/2), such that the fitness of the heterozygote is not exactly halfway between the fitness of the two homozygotes, the mean change in allele frequency is no longer of order $\bar{x}(1 - \bar{x})$, but it has higher-order terms in mean allele frequency. (See Equation 3, Equation 4, Equation 6, and Equation 7 above.) As such, it is not possible to explicitly solve the integrals to find a closed-form solution to the probability of fixation.

For beneficial alleles, however, an allele is very likely to fix once it is present in the metapopulation at a sufficient frequency that deterministic forces become paramount. For rare beneficial alleles, it is usually sufficient to calculate the value of $M_{\delta x}$ assuming the new allele is rare enough that terms of order $\bar{x}^3$ can be ignored. This gives

(18)

and

(19)

such that the probability of fixation is expected to be ~$2\vartheta s N_e/N_{tot}$, where the appropriate $\vartheta$ is used depending on the nature of competition among populations.
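The step from a rare-allele coefficient to this fixation probability can be sketched numerically. In the example below the coefficient is passed in as a plain number `theta` (standing for whichever rare-allele value of ϑ applies), and it is treated as constant over the whole frequency range so that the additive closed form with σ = ϑs can be reused. That constancy is an assumption of the sketch, as are the function names and the example values.

```python
import math

def fixation_prob_rare(theta, s, ne, ntot):
    """Closed form with sigma = theta * s for one new copy, p = 1/(2*Ntot).

    Assumes theta does not change with allele frequency, which is only
    reasonable for beneficial alleles whose fate is decided while rare.
    """
    a = 4.0 * ne * theta * s
    p = 1.0 / (2.0 * ntot)
    if abs(a) < 1e-12:                   # neutral limit
        return p
    return math.expm1(-a * p) / math.expm1(-a)

def linear_approx(theta, s, ne, ntot):
    """The approximation 2 * theta * s * Ne / Ntot."""
    return 2.0 * theta * s * ne / ntot

if __name__ == "__main__":
    theta, s, ne, ntot = 0.3, 0.01, 4000, 10000   # hypothetical values
    print(fixation_prob_rare(theta, s, ne, ntot))
    print(linear_approx(theta, s, ne, ntot))
```

The two values agree closely whenever 4·Ne·ϑ·s is large enough for the denominator to be near unity and ϑs·Ne/Ntot is small, which is the regime the approximation is meant for.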

These approximations work so well that the differences between the more exact and approximate results are not visible on a graph, except for the case when both h and FST are very close to zero. Complete recessivity (h = 0) causes the allele frequency change to behave very differently from the case of partial recessivity in panmictic populations (CROW and KIMURA 1970), although with population structure the approximate results still work well. This is because, with FST > 0, many alleles are expressed as homozygotes even when rare, and this effect is accounted for in the approximate answers above.

Fig 4 shows the probability of fixation of beneficial alleles with arbitrary dominance for a range of values of FST. While dominant alleles are always more likely than recessive alleles to fix for a given homozygous effect, population structure allows a much higher probability of fixation for recessive alleles than does panmixia, as long as the effective population size is not much affected (see Fig 4, left). Note that for hard selection, the probability of fixation is always higher (for h < 1) with population structure, if the effective population size is unaffected. With soft selection, recessive beneficial alleles may be much more likely to fix with structure, because they can be expressed during the critical phase when they are rare. However, soft selection with population structure hinders the fixation of dominant alleles, because these alleles gain nothing in fitness when expressed as homozygotes when rare, but lose from the competition between relatives within local populations caused by soft selection. This is all true with an island model or other models of conservative migration, although for models with a high rate of extinction or other variation among populations in reproductive success, the reduction in Ne that results can cause population structure to reduce the probability of fixation of even recessive alleles. With an extinction-colonization model, the probability of fixation can be much reduced relative to panmixia (see Fig 4 and Fig 5A).



Figure 4. The effects of the dominance coefficient of a beneficial allele on its fixation probability, as a function of FST. The probability of fixation is plotted for both hard (top) and soft selection (bottom). The solid line corresponds to the panmictic case when FST = 0, the dotted line shows results for FST = 0.05, and the dashed line shows FST = 0.2. In all cases, for the same homozygous effects, dominant alleles are more likely than recessive alleles to fix. Note that the relative effects of population structure can be quite different when changes in Ne are taken into account (see last two columns). Parameters used in these calculations are s = 0.01, Ntot = 10,000, and p = 1/2/Ne. In the first column, Ne is held constant and equal to Ntot to isolate the effects of the expression of the allele. In the second column, the island model increases Ne and therefore increases the probability of fixation of beneficial alleles relative to the first case. In the third column, with extinction and colonization, Ne is reduced and the probability of fixation decreases for beneficial alleles.



Figure 5. Examples of the fixation probabilities of nearly recessive beneficial alleles (h = 0.01) with soft selection. (A) Extinction and recolonization. In this example, the migration rate between populations was 0.05, colonization occurred by four individuals with a probability of common origin of 1/2, s = 0.002, and there were 100 demes with 100 diploid individuals each. (Each point represents the results from 10^7 simulations, so the standard error ranges from 6.9 × 10^-6 on the left to 3.9 × 10^-6 on the right.) As the extinction rate increases, the effective population size of the metapopulation decreases, and therefore so does the probability of fixation. (B) A one-dimensional stepping-stone model. With a stepping-stone model, FST (and therefore Ne) increases as the migration rate drops, so the probability of fixation also increases with lower migration. This is particularly true with recessive alleles, which are expressed often in the homozygous state with the concomitant increase in the efficacy of selection. (There are 100 demes with 100 diploid individuals each, s = 0.0002, and the dots represent 10^6 simulations each.)

A stronger test of the robustness of the assumptions of the diffusion approximation is a one-dimensional stepping-stone model. Here the probability of fixation is much increased relative to a panmictic population for recessive beneficial alleles, due to both the increase in Ne and the increased expression of recessive beneficial alleles when rare. Fig 5B shows simulation results for the stepping-stone model, where it can be seen that for all parameters examined with biologically reasonable FST values and weak selection, the diffusion approximation does remarkably well.

Deleterious alleles:
Equation 1 can be used to determine the probability of fixation of deleterious alleles as well. With deleterious alleles, the relative fitness of the allele at all frequencies affects the probability of fixation, not just its fitness when rare. This is because an unconditionally deleterious allele never increases deterministically at any allele frequency. Therefore fewer approximations can be made. For the additive case, the closed-form equations in (9) and (10) are accurate; for arbitrary dominance, the answer must be left in integral form. The probability of fixation, as given by Equation 1, resolves to

(20)

where erf[x, y] = erf[y] - erf[x], and A = (FST + (1 - FST)h)λ, B = (1 - (1 - FST)h)λ, C = (1 - (1 - FST)(1 - h - p + 2hp))λ, and

(21)

where b is a measure of the relative strength of hard selection. (b = 0 corresponds to soft selection; b = 1 means hard selection. See WHITLOCK 2002.) Fig 6 shows some simulation results compared to this rather ugly equation. The diffusion approximation is quite successful, even with a low dominance coefficient and stepping-stone model, where the assumptions of the diffusion model are most stretched.



Figure 6. The probability of fixation of deleterious alleles with (A) extinction and colonization or (B) a one-dimensional stepping-stone model. (A) The three lines plot, from bottom to top, the predicted probability of fixation for alleles with dominance coefficients of 0.5, 0.1, and 0.01, respectively. The symbols mark simulation results over a minimum of 10^7 replicates each, with the three dominance coefficients represented by triangles, squares, and crosses, respectively. Other parameters used for these examples were s = -0.0002, m = 0.1, 100 demes of 100 diploid individuals each, and colonization by four individuals with a probability of common origin equal to 1/2. The probability of fixation is substantially increased by the reduction in Ne that accompanies extinction dynamics. (B) The parameters in these examples were h = 0.01, s = -0.0002 with 100 demes of 100 diploid individuals. The points represent the results of 10^8 simulations.

Note that, unlike the case with beneficial alleles, the probability of fixation does not depend heavily on the dominance coefficient h but is almost entirely determined by Ne and s. This is especially the case with population structure, since many of the deleterious effects of alleles are now expressed in homozygous form anyway. The effect on fixation of deleterious alleles from population subdivision is determined mainly by Ne: if subdivision reduces Ne, then deleterious alleles are more likely to fix than those in an undivided population (see Fig 7).



Figure 7. The probabilities of fixation of deleterious alleles (s = -0.0001). With deleterious alleles, the probability of fixation is not very sensitive to the dominance coefficient. Other parameters are the same as in Fig 4. The reduction in Ne often associated with local extinction and recolonization allows a strong increase in the probability of fixation of deleterious alleles.


*  DRIFT LOAD AND FIXATION FLUX IN A METAPOPULATION

With recurrent deleterious mutations that have a nonzero probability of fixation, the mean fitness of a species would be expected to decline due to repeated fixations of harmful alleles via drift, all else being equal (LANDE 1994, LANDE 1998; LYNCH et al. 1995A, LYNCH et al. 1995B). Thus, recurrent mutation may be one of the biggest genetic threats to the persistence of small or intermediate populations. The key phrase in this claim, however, is "all else being equal" because new mutation also allows for fixation of beneficial alleles. It has been previously shown that a low rate of beneficial alleles (including compensatory and reverse mutations) can halt the decline in mean fitness due to drift at intermediate population sizes (WHITLOCK 2000). Only when populations become quite small does drift allow the fixation of deleterious alleles and interfere with fixation of beneficial alleles sufficiently to expect a decline in fitness. In this section, the joint effects of fixation of beneficial and deleterious alleles will be analyzed, to find the critical value of effective size that might allow indefinite persistence of the species.

Too little is known about the rate and distribution of effects of new mutations to specify them empirically, so we must use an arbitrary distribution that has many of the properties one might expect of the actual distribution. Let us therefore assume that the effects of new mutations, whether beneficial or deleterious, are drawn from a gamma distribution. For this section, let us also assume that new mutations are all additive. Then the expected change in fitness per generation due to new deleterious alleles is the product of the number of new deleterious mutations per genome times the number of genomes times the probability of their fixation times their effect, integrated over all possible mutational effects. Putting these together we can find the rate of change per generation of mean fitness due to new deleterious mutations,

(22)

where UD is the diploid genome deleterious mutation rate, u[p, s] is the probability of fixation of an additive allele given initial frequency p and selection against homozygotes of s, and ΨD[s] is the distribution of deleterious homozygous effects. [Note that this differs from the usage in WHITLOCK 2000, where the distribution was given in terms of heterozygous effects.] If we assume that the absolute values of the effects of new deleterious mutations are gamma distributed with mean |λD| and coefficient of variation CD, we find

(23)

where ω = (1 - FST) or (1 + FST) (for soft or hard selection, respectively) and ζ is the generalized Riemann zeta function. This corresponds exactly to the results given in WHITLOCK 2000 except for the change in the fitness notation and the addition of the FST term via ω. If CD^2 ω Ne |λD| >> 1, the ζ term will be between 1 and 1.65.
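The verbal recipe for the deleterious flux (new mutations per generation, times fixation probability, times effect, integrated over a gamma distribution of effects) can be sketched numerically. The sketch below is not the closed form of Equation 23. It assumes that the additive fixation probability takes Kimura's form with selection scaled by ω, that UD × Ntot new mutations arise each generation at initial frequency 1/(2Ntot), and it uses a plain midpoint rule. The function names and the integration cutoff are hypothetical choices made for illustration.

```python
import math

def fixation_prob_deleterious(x, p, ne, omega):
    """Fixation probability of an additive allele with homozygous effect -x (x > 0).

    Assumed form: u = (exp(b*p) - 1) / (exp(b) - 1) with b = 2*omega*ne*x,
    rewritten with expm1 so that large b does not overflow.
    """
    b = 2.0 * omega * ne * x
    return math.exp(-b * (1.0 - p)) * math.expm1(-b * p) / math.expm1(-b)

def gamma_pdf(x, mean, cv):
    """Gamma density parameterized by its mean and coefficient of variation."""
    shape = 1.0 / cv**2
    scale = mean * cv**2
    return x**(shape - 1.0) * math.exp(-x / scale) / (math.gamma(shape) * scale**shape)

def deleterious_flux(u_d, ntot, ne, mean, cv=1.0, omega=1.0, steps=20000):
    """Expected per-generation change in mean fitness from deleterious fixations."""
    p = 1.0 / (2.0 * ntot)                    # initial frequency of a new mutation
    x_max = 50.0 * mean * max(1.0, cv * cv)   # hypothetical cutoff for the integral
    width = x_max / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * width                 # midpoint of the i-th slice
        total += -x * fixation_prob_deleterious(x, p, ne, omega) * gamma_pdf(x, mean, cv)
    return u_d * ntot * total * width

# Illustrative comparison: same census size, different Ne and omega.
base = deleterious_flux(1.0, 10000, 10000, 0.01)
low_ne = deleterious_flux(1.0, 10000, 1000, 0.01)
soft = deleterious_flux(1.0, 10000, 10000, 0.01, omega=0.8)
print(base, low_ne, soft)
```

Under these assumptions the flux becomes more negative when Ne is reduced, and also when ω < 1 (soft selection with FST > 0) at fixed Ne, which is the qualitative pattern described in the text.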

Similarly, we can find the flux in fitness due to new beneficial mutations:

(24)

For values of 2CB^2 ω Ne λB > 1, ζ[2 + 1/CB^2, 1/(2CB^2 ω Ne λB)] is ~(2CB^2 ω Ne λB)^(2 + 1/CB^2). Thus we get

(25)
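The zeta-function approximation used just before Equation 25 can be motivated as follows, assuming that the "generalized Riemann zeta function" is the Hurwitz form ζ[s, a] (this reading is an assumption, since the defining expression is not shown here). When the second argument is small, the first term of the series dominates and the remainder is bounded.

```latex
\zeta[s, a] = \sum_{k=0}^{\infty} \frac{1}{(k + a)^{s}}
            = a^{-s} + \sum_{k=1}^{\infty} \frac{1}{(k + a)^{s}},
\qquad
0 < \sum_{k=1}^{\infty} \frac{1}{(k + a)^{s}} < \zeta(s) < \zeta(2) \approx 1.645
\quad (s > 2,\ a > 0).

% With s = 2 + 1/C_B^2 and a = 1/(2 C_B^2 \omega N_e \lambda_B):
\zeta\!\left[2 + \frac{1}{C_B^2},\; \frac{1}{2 C_B^2 \omega N_e \lambda_B}\right]
\approx \left(2 C_B^2 \omega N_e \lambda_B\right)^{2 + 1/C_B^2}
\quad \text{when } 2 C_B^2 \omega N_e \lambda_B \gg 1 .
```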

The equilibrium point where fitness is unchanging (ΔWB + ΔWD = 0) can be found using the approximations for the zeta functions mentioned above. Then, the critical Ne (above which the species can persist indefinitely) can be found. If both beneficial and deleterious alleles follow an exponential distribution (so CD and CB are both 1), this becomes

(26)

The critical effective size is changed by a factor of 1/ω compared to a panmictic population with the same mutational properties.

Note that Ne may not stay constant as new alleles fix in a population. If fixation of deleterious alleles increases the extinction rate of demes or decreases the effective migration rate among demes, then Ne can be reduced, potentially by a substantial margin. Then population structure can feed back into the probability of fixation and increase the probability of mutational meltdown. HIGGINS and LYNCH 2001 have observed this in stepping-stone simulations, where the structured population was much more likely than a similar unstructured one to go extinct.


*  TIME TO FIXATION

KIMURA and OHTA 1969 showed that the time to fixation of an allele that starts at frequency p (considering only those alleles that fix) is given by

(27)

where

(28)

These equations are robust when applied to fixation in a subdivided population. Evaluation of these equations requires numerical integration (even for the panmictic case with selection), and this has been done using Mathematica 4.0.
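The numerical integration mentioned above can be illustrated with a short sketch. It uses the standard Kimura-Ohta form for the conditional time to fixation, restricted to additive alleles, and it assumes that population structure enters only through Ne and a scaling of selection by ω. That scaling, the function names, and the midpoint integration rule are assumptions of this sketch rather than the method used for the figures.

```python
import math

def fixation_prob(x, a):
    """u(x) for an additive allele; a = 2 * omega * Ne * s (assumed scaling)."""
    if a == 0.0:
        return x
    return math.expm1(-a * x) / math.expm1(-a)

def one_minus_u_over_g(x, a):
    """(1 - u(x)) / G(x) with G(x) = exp(-a * x), in a numerically stable form."""
    if a == 0.0:
        return 1.0 - x
    return math.expm1(-a * (1.0 - x)) / math.expm1(-a)

def midpoint(f, lo, hi, steps=20000):
    """Midpoint-rule integral; it never evaluates f at the endpoints."""
    width = (hi - lo) / steps
    return width * sum(f(lo + (i + 0.5) * width) for i in range(steps))

def time_to_fixation(p, ne, s, omega=1.0):
    """Mean generations to fixation, conditional on fixation, from frequency p."""
    a = 2.0 * omega * ne * s
    g_int = 1.0 if a == 0.0 else -math.expm1(-a) / a   # integral of G over [0, 1]

    def two_g_over_v(x):
        # 2 * integral(G) / V(x), with drift variance V(x) = x(1 - x) / (2 Ne)
        return 4.0 * ne * g_int / (x * (1.0 - x))

    def upper(x):
        # psi(x) * u(x) * (1 - u(x)), with psi(x) = 2 * integral(G) / (V(x) * G(x))
        return two_g_over_v(x) * fixation_prob(x, a) * one_minus_u_over_g(x, a)

    def lower(x):
        # psi(x) * u(x)^2
        u = fixation_prob(x, a)
        return two_g_over_v(x) * math.exp(a * x) * u * u

    u_p = fixation_prob(p, a)
    return midpoint(upper, p, 1.0) + (1.0 - u_p) / u_p * midpoint(lower, 0.0, p)

# Illustrative values: a beneficial allele and its deleterious mirror image.
print(time_to_fixation(0.0005, 1000, 0.002))
print(time_to_fixation(0.0005, 1000, -0.002))
```

In the neutral limit this sketch recovers the classical value -4Ne(1 - p)ln(1 - p)/p. For additive alleles it also gives the same conditional fixation time for s and -s, the panmictic symmetry attributed to MARUYAMA and KIMURA 1974.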

These equations match well with simulation results. Fig 8 shows that the time to fixation is well predicted by Equation 27 for metapopulations with local extinction and recolonization. Fig 9 shows that even with a stepping-stone structure, the time to fixation is well predicted by the diffusion results.



Figure 8. The time to fixation for a variety of cases with local extinction and recolonization. For each case there were 100 demes of 100 diploid individuals. Analytical approximations based on diffusion do extremely well to predict the time to fixation of new alleles. As the extinction rate increases, the effective size drops, and the time to fixation also drops rapidly. (A) Beneficial alleles, over a range of migration rates (m), extinction rates, and the probability of common origin of colonists (φ). For comparison, for this type of selection the time to fixation would be on average 8480 generations in an undivided population of the same census size. The dots are averages over the cases that fixed out of 10^6 simulations. (B) Deleterious alleles, over a range of dominance values. Symbols and parameter values are the same as in Fig 6A. Dominance does not strongly affect the time to fixation, even with the relatively high FST values associated with high extinction rate.



Figure 9. The time to fixation of (A) beneficial and (B) deleterious recessive (h = 0.01) alleles in a one-dimensional stepping-stone model. The curve shows analytic approximations from Equation 27, and the dots represent simulation results from the number of fixations out of 10^7 (A) or 8 × 10^8 (B) simulations. In all cases there were 100 demes with 100 diploid individuals, with a dominance coefficient of 0.01 for the new allele. For the beneficial alleles in A, s = 0.0002, while for the deleterious alleles in B, s = -0.0002. For comparison, the expected time to fixation in an undivided population with the same selection would be 29,327 or 39,068 generations for the two cases, respectively.

MARUYAMA and KIMURA 1974 have shown that for a panmictic population the time to fixation of a beneficial allele is the same as for a deleterious allele with the same strength of selection and dominance of 1 - h. Their derivations will extend to cover the present cases, given the same approximations made above. The symmetry for the time to fixation of beneficial and deleterious alleles is not broken by population structure, according to simulations (results not shown).

The calculations for the time to loss of an allele, also given by KIMURA and OHTA 1969, are also adaptable for subdivided populations. These equations also match simulation results (not shown).


*  LIMITS TO THE APPROXIMATIONS

The diffusion approximations used in this article have their limitations. In particular, it has been assumed that (1) selection is weak, (2) FST is constant, and (3) neutral descriptions of population structure are approximately valid for weak selection. Let us explore these assumptions in turn.

The weak selection assumption is common to all diffusion models. In principle, the strength of selection should be weak enough that terms involving s^2 and higher can be ignored. Fig 10 shows that the accuracy of the probability of fixation results drops as the strength of selection increases. In particular, the probability of fixation predicted by Equation 1 is an underestimate of the true probability of fixation. The same underestimate occurs for undivided populations, although to a lesser degree. The time to fixation is still accurately predicted with strong selection (see Fig 10B).



Figure 10. The probability and time to fixation of strongly beneficial alleles. (A) As the strength of selection becomes very large, the diffusion approximation underestimates the probability of fixation of beneficial alleles, although the approximation does well up to the point where s = m. (B) Even when the probability of fixation is poorly estimated the diffusion approximation still gives a valid estimate of the time to fixation. Parameters used in these calculations: 100-deme extinction-recolonization model with soft selection and N = 100, m = 0.05, e = 0.025, φ = 1/2, k = 4, and h = 1/2. The lines represent the results from the diffusion approximations and the dots represent results from 10^6 simulations.

The second assumption, that FST is constant, is required for the time homogeneity assumption of the diffusion model to hold. In reality, the actual value of FST is likely to fluctuate as a result of the stochastic nature of pedigrees, but it seems from the agreement with the simulations that it is sufficient that the expected value of FST be the same for all stages of the selection process. This assumption seems to be potentially the weakest of the assumptions made by the application of the diffusion equation to population structure. Any setting in which the equilibrium FST would be reached slowly relative to the change of allele frequencies by selection would seem a priori to be unlikely to be well described by the diffusion model with a constant FST. Nevertheless, the simulations show that the model works well, much better than might be expected.

Finally, to use the results from this article it is necessary that the expected value of FST is known, and for most cases we know only the expected values of FST for neutral alleles. While FST for selected loci can differ strongly from that expected with neutral loci (LEWONTIN and KRAKAUER 1973), for weakly selected alleles (such that either Ns < 1 or s < m), FST is well predicted by the neutral models (WHITLOCK 2002). Thus the results presented here should be more sensitive to the assumption of weak selection than would be true for the unsubdivided case. With stronger selection (such that s > m for some demes), FST of the selected alleles will be different from that predicted by the neutral FST; moreover, the dynamics of the fixation of the allele will be substantially different. For example, with very strong selection and weak migration, the future of an allele could largely be determined before the allele even migrated to another deme, making the population structure nearly irrelevant to its probability of success.


*  DISCUSSION

Over the last century, we have come to know more and more about the fates of new mutations. The history of the study of the probability of fixation is one of increasing generality. First, the classic results of HALDANE 1927 and FISHER 1922 found the probability of fixation of beneficial alleles in a panmictic ideal population to be ~2hs. KIMURA 1964 refined this in several important ways, allowing for nonideal populations and for both beneficial and deleterious alleles. For beneficial alleles, Kimura found that the probability of fixation was given by ~2hsNe/Ntot. Moreover, KIMURA's (1964) method turns out to be useful in even more general scenarios, such as those considered here. Population structure was added by MARUYAMA 1970, who found that, for the island and stepping-stone models, the probability of fixation of beneficial additive alleles was again 2hs, for the special case he considered where h = 1/2. BARTON 1993 showed that this result was not true for all models of population structure; in particular it failed for two special cases with extinction-recolonization. Our article simplifies and generalizes the results of Maruyama and Barton by placing them into the same framework as previous work. Here we have seen that the probability of fixation of a new beneficial allele (with soft selection as assumed by Maruyama and Barton implicitly) is given by 2hs(1 - FST)Ne/Ntot. The older results cited in this paragraph are all special cases of this formula.

We have seen that the diffusion model generates results that are surprisingly useful for systems with spatial population structure. KIMURA's (1964; KIMURA and OHTA 1969) method to obtain the probability of fixation, probability of loss, and time to fixation or loss of alleles can be applied to cases with population structure, including arbitrary dominance and deleterious alleles.

This generality is a bit surprising, given that the assumptions of the diffusion are flagrantly abused by the inclusion of spatial structure. Most importantly, the pattern of spatial genetic differentiation will change as an allele increases in frequency, in a way that is not expected to be perfectly described by neutral FST, as has been assumed here. Yet simulations show that diffusion equations do very well to describe the probability and time to fixation of new alleles, even with neutral predictions of FST.

One of the most striking implications of the success of these equations is that the values of Ne and FST derived from neutral models of subdivided populations can be used to predict the probability and time of fixation of beneficial and deleterious alleles. There is extensive literature about the expected patterns of FST for a variety of models; moreover, it is relatively straightforward to calculate FST for any arbitrary neutral model. The variance Ne of a subdivided population can generally be calculated from the equations in WHITLOCK and BARTON 1997 and WANG and CABALLERO 1999. Moreover, FST and Ne are readily measured from genetic data. With these two quantities, the effects of the population subdivision on the selection process can be effectively summarized, without further information about the specifics of population structure, provided that the strength of selection considered is within the limits of the assumptions detailed above.

The most important effects of population structure entering the equations on probability or time of fixation are caused by changes in the effective population size. Population structure can either increase (as in the island model) or decrease (as is the case with local extinction and colonization or source-sink dynamics) the effective size of a species, for a given census size. When Ne/Ntot is reduced, the probability of fixation of beneficial alleles is decreased proportionally, and the probability of fixing deleterious alleles is increased. For both beneficial and deleterious alleles, the time to fixation is substantially shortened as Ne is reduced.

In addition to the changes in Ne caused by spatial structure, the pattern of genetic differentiation associated with population structure can cause an increase in competition between relatives (with soft selection). As a result, the mean change of allele frequency associated with selection is reduced. This in turn makes beneficial alleles less likely to fix, deleterious alleles more likely to fix, and the total time to fixation increase. With a traditional island model or stepping-stone model, where there is zero variance in reproductive success among demes and pure soft selection, the effect of this change in the efficacy of selection exactly counterbalances the effects of increasing Ne for additive beneficial alleles; thus the probability of fixation appears unaffected by population structure. Even in these simple cases, however, with deleterious alleles or if there is any dominance to either allele, population structure affects fixation probability.
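The cancellation described above for additive beneficial alleles can be checked with a few lines of arithmetic. The sketch uses the approximation 2hs(1 - FST)Ne/Ntot for soft selection, and it assumes the standard island-model relation Ne = Ntot/(1 - FST) for demes with no variance in reproductive success. That relation is not stated in this section and is introduced here as an assumption, and the function names are hypothetical.

```python
def fixation_probability(h, s, fst, ne, ntot):
    """Approximate fixation probability of a new beneficial allele, soft selection."""
    return 2.0 * h * s * (1.0 - fst) * ne / ntot

def island_model_ne(ntot, fst):
    """Assumed effective size of an ideal island model: Ne = Ntot / (1 - FST)."""
    return ntot / (1.0 - fst)

ntot, h, s = 10000, 0.5, 0.01
for fst in (0.0, 0.05, 0.2):
    ne = island_model_ne(ntot, fst)
    # The (1 - FST) factor and the inflated Ne cancel for every FST.
    print(fst, fixation_probability(h, s, fst, ne, ntot))

# With Ne reduced instead (for example by extinction and recolonization),
# nothing cancels and the fixation probability falls.
print(fixation_probability(h, s, 0.2, 2000, ntot))
```

Under these assumptions the island-model values all equal 2hs, whereas any demography that lowers Ne/Ntot lowers the fixation probability in proportion.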

If a new allele is partially recessive to the ancestral allele, population structure can have yet another effect: allowing the expression of recessive phenotypes because of increased homozygosity. A new beneficial recessive allele can be much more likely to fix with population structure because of this effect. Ignoring the effects of the change in Ne, which could be in either direction depending on the case, a new beneficial allele is ~(FST + (1 - FST)h)/h times more likely to fix relative to a panmictic population.
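As a worked illustration of this ratio, take the dominance coefficient and FST values used in the figures (h = 0.01 with FST = 0.05 or 0.2), and hold Ne fixed as the text specifies. An additive allele (h = 1/2) is included for comparison. The numbers are plain arithmetic on the expression above, not additional results.

```latex
\frac{F_{ST} + (1 - F_{ST})h}{h}\bigg|_{h = 0.01,\; F_{ST} = 0.05}
  = \frac{0.05 + 0.95 \times 0.01}{0.01} = 5.95,
\qquad
\frac{F_{ST} + (1 - F_{ST})h}{h}\bigg|_{h = 0.01,\; F_{ST} = 0.2}
  = \frac{0.2 + 0.8 \times 0.01}{0.01} = 20.8,

\frac{F_{ST} + (1 - F_{ST})h}{h}\bigg|_{h = 0.5,\; F_{ST} = 0.2}
  = \frac{0.2 + 0.8 \times 0.5}{0.5} = 1.2 .
```

The gain from increased homozygosity is therefore large for nearly recessive alleles and modest for additive ones.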

In contrast, the fate of deleterious alleles is not much affected by dominance. This difference between beneficial and deleterious alleles reflects a difference in the critical period for fixation of the two types of allele. The fate of a beneficial allele is determined while it is rare and therefore its expression when rare is key. A deleterious allele is constantly subject to loss by deterministic forces and will never reach a frequency at which its future is determined. For the deleterious allele, its fitness at all frequencies is important, and therefore its dominance is less critical.

Most of the examples in this article have focused on soft selection, where the contribution of a deme to the next generation is independent of the local allele frequency. With hard selection, population structure increases the efficacy of selection for all dominance coefficients. In this case, there is an increased expression of homozygotes and therefore more efficient selection, but this is not accompanied by an increase in competition between relatives. Hard selection, all else being equal, allows a higher rate of fixation of beneficial alleles and a lower rate of fixation of deleterious alleles. However, "all else" may not be equal, since the very fact of hard selection requires the possibility of variance among populations in reproductive success. With hard selection, a model with strictly "conservative" migration is impossible, and effective population sizes may be therefore substantially lower.

The results in this article have focused on the case where selection is applied uniformly in all places. Clearly there are many exciting cases in which selection varies dramatically over space. A case can be made that most new mutations are consistently beneficial or deleterious throughout a species, although the proportion of such mutations is presently unknown. TACHIDA and IIZUKA 1991 have begun the analysis of this more interesting case, although their derivations rely on MARUYAMA's (1970) results and therefore apply only to models of conservative migration. It will be very interesting to see extensions of these results to more realistic population structure models.

Finally, we should consider the relevance of the amount of time required for fixation of alleles. Population structure affects the time to fixation mainly in the same way it influences the effective population size. If Ne is much reduced, then the time to fixation is on average much reduced, and the converse is true if Ne is increased. The time to fixation of a beneficial allele is not invariant with population structure but can change dramatically depending on the details of the demography. A shortened time to fixation of a beneficial allele should, for example, cause the effect of genetic hitchhiking to be increased, since there is less time for recombination with other genotypes. Population structure can be an important determinant for the rate of evolution of a species.


*  ACKNOWLEDGMENTS

This article benefited greatly from discussion