Genetics, Vol. 162, 395-411, September 2002, Copyright © 2002

The Effect of Deleterious Alleles on Adaptation in Asexual Populations

Toby Johnson(a) and Nick H. Barton(b)
(a) Department of Zoology, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
(b) Institute of Cell, Animal and Population Biology, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom

Corresponding author: Toby Johnson, University of British Columbia, 6270 University Blvd., Vancouver, BC V6T 1Z4, Canada. E-mail: johnson@zoology.ubc.ca

Communicating editor: W. STEPHAN


*  ABSTRACT

We calculate the fixation probability of a beneficial allele that arises as the result of a unique mutation in an asexual population that is subject to recurrent deleterious mutation at rate U. Our analysis is an extension of previous works, which make a biologically restrictive assumption that selection against deleterious alleles is stronger than that on the beneficial allele of interest. We show that when selection against deleterious alleles is weak, beneficial alleles that confer a selective advantage that is small relative to U have greatly reduced probabilities of fixation. We discuss the consequences of this effect for the distribution of effects of alleles fixed during adaptation. We show that a selective sweep will increase the fixation probabilities of other beneficial mutations arising during some short interval afterward. We use the calculated fixation probabilities to estimate the expected rate of fitness improvement in an asexual population when beneficial alleles arise continually at some low rate proportional to U. We estimate the rate of mutation that is optimal in the sense that it maximizes this rate of fitness improvement. Again, this analysis relaxes the assumption made previously that selection against deleterious alleles is stronger than on beneficial alleles.


It is often useful to view adaptive evolution in an asexual population (for example, on a nonrecombining chromosome) as two separate processes. The first process is the origin of new beneficial alleles by mutation, and the second process is the fixation of some of those alleles by natural selection. This article is concerned with the second process and specifically with calculating the probability of fixation of a beneficial allele, assuming that it starts at a low frequency. This problem was first studied by modeling the copy number of the beneficial allele as a branching process (FISHER 1922, FISHER 1930; HALDANE 1927), assuming that the number of descendants from each copy of the beneficial allele is drawn from a Poisson distribution with mean W, where W ≡ 1 + sb is the absolute fitness of an individual carrying the beneficial allele. In a large population of fixed size N, the probability of ultimate fixation of a single copy of a beneficial allele is Pfix = p[sb] (for 1/N << sb), where p[·] is the unique positive function satisfying

$$1 - p[s] = \exp\{-(1 + s)\,p[s]\} \tag{1}$$

(FISHER 1922) and Pfix ≈ 2sb when 1/N << sb << 1 (HALDANE 1927).
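As a numerical illustration, the following minimal sketch solves for p[s] by bisection and compares it with Haldane's approximation 2s. It assumes that Equation 1 has the standard Poisson branching-process form 1 - p = exp[-(1 + s)p]; the function name `p_single` and the bracketing interval are illustrative choices, not part of the original analysis.

```python
import math

def p_single(s, iterations=200):
    """Positive root p of 1 - p = exp(-(1 + s) * p), found by bisection.

    Assumed form of Equation 1. For s <= 0 only the trivial root p = 0
    exists, so 0.0 is returned.
    """
    if s <= 0:
        return 0.0
    w = 1.0 + s

    def g(p):
        # g(p) = 1 - p - exp(-w p), written with expm1 for accuracy near 0
        return -p - math.expm1(-w * p)

    # g(lo) > 0 and g(hi) < 0 whenever s > 0, so the root is bracketed
    lo, hi = s / (w * w), 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for s in (0.001, 0.01, 0.1):
        print(f"s = {s}: p[s] = {p_single(s):.6f}, 2s = {2 * s:.6f}")
```

For small s the two values agree closely, and the approximation 2s increasingly overestimates p[s] as s grows.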

The geographical invariance principle states that the fixation probability of a beneficial allele is unaffected by spatial structuring of the population when there is no variation in W between demes (MARUYAMA 1970, MARUYAMA 1971; NAGYLAKI 1982). However, almost any other departure from an idealized Wright-Fisher population at linkage equilibrium violates the assumptions made by FISHER 1922 and HALDANE 1927. Specifically, their result does not hold when there is variation in W, and this is likely under a range of biologically realistic conditions. Examples include variation in space caused by local extinction and recolonization (BARTON 1993; BARTON and WHITLOCK 1997), variation in time caused by fluctuating population size (EWENS 1967; OTTO and WHITLOCK 1997), variation across genetic backgrounds caused by selection at linked loci (FISHER 1930; HILL and ROBERTSON 1966; ROBERTSON 1970; MANNING and THOMPSON 1984; BARTON 1994, BARTON 1995; CHARLESWORTH 1994; PECK 1994; CABALLERO and SANTIAGO 1995; GERRISH and LENSKI 1998; STEPHAN et al. 1999; ORR 2000B; GERRISH 2001; RICE and CHIPPINDALE 2001), and variation in the intensity of selection on the beneficial allele itself (for example, POLLAK 1966B; KIMURA and OHTA 1970; BARTON 1987).

An accurate formula for fixation probabilities, based on a biologically appropriate model, is desirable for at least two reasons. First, it is one of the building blocks of more complex evolutionary models, in which the behavior of rare beneficial alleles is not explicitly modeled. Instead, the convenient assumption is made that a fraction Pfix of beneficial alleles reach frequencies large enough to actually be considered in the model, and the remaining fraction (1 - Pfix) are lost while still rare and can be ignored altogether (for example, KIMURA 1983; BERG 1995; HARTL and TAUBES 1996, HARTL and TAUBES 1998; GERRISH and LENSKI 1998; ORR and KIM 1998; ORR 1998, ORR 2000A, ORR 2000B; POON and OTTO 2000; GERRISH 2001). In some models this assumption is in fact a limiting property of the model, obtained when selection is strong and mutation is weak relative to drift (GILLESPIE 1983A, GILLESPIE 1983B). Second, laboratory evolution experiments with Escherichia coli (GERRISH and LENSKI 1998; IMHOF and SCHLOTTERER 2001) and vesicular stomatitis virus (MIRALLES et al. 1999) have allowed inferences to be made about the number of beneficial mutations that have arisen over the course of the experiment, on the basis of an observation of the number of selective sweeps that have occurred. Clearly, making accurate estimates requires an accurate formula for fixation probability (as pointed out by ORR 2000B).

This article is primarily concerned with how the fixation probability of a beneficial allele at one locus is influenced by segregating deleterious alleles at other loci, in a completely asexual population or along a completely nonrecombining chromosome. In the absence of recombination, beneficial mutations that arise in a given genetic background are effectively "trapped" in it, unable to recombine into other backgrounds (FISHER 1930, p. 122). Variation in fitness between the different backgrounds, caused by segregating deleterious alleles, can reduce or outweigh the advantage conferred by a beneficial allele, resulting in a substantially reduced fixation probability. The magnitude of this reduction has been the subject of several theoretical investigations (MANNING and THOMPSON 1984; CHARLESWORTH 1994; PECK 1994; ORR 2000B) and also of a recent empirical study (RICE and CHIPPINDALE 2001). This effect has also been studied theoretically for sexual populations, but with the emphasis on either a small number of tightly linked loci (BARTON 1995; STEPHAN et al. 1999) or a large number of loosely linked loci (PECK 1994; BARTON 1995).

Previous analytical work on asexual models has either assumed a fixed selection coefficient against deleterious alleles (sd) with sb < sd (MANNING and THOMPSON 1984; CHARLESWORTH 1994; PECK 1994) or has allowed sd to vary across loci assuming that sb < min{sd} (BARTON 1995; ORR 2000B). Under either of these assumptions, the frequency of the beneficial allele is expected to increase only if it is present in the most fit genetic background, as first pointed out by FISHER 1930 (p. 122). If, prior to the origination of the beneficial allele, the frequency of the most fit background or genotype is f0, and if a beneficial allele that originates on a lower fitness background has negligible probability of subsequently moving onto the most fit background, then

$$P_{\mathrm{fix}} = f_0\,p[s_b] \tag{2}$$

for 1/N << sb < sd (CHARLESWORTH 1994; PECK 1994). When all deleterious alleles have a fixed effect, the value of f0 in a population at equilibrium can be calculated by knowing that if the number of new deleterious alleles per individual per generation is Poisson distributed with mean U then the number of deleterious mutations per individual is Poisson distributed with mean U/sd (HAIGH 1978) and hence f0 = exp[-U/sd]. ORR 2000B proved that when selection coefficients against deleterious alleles are distributed with harmonic mean Eh(sd) then at equilibrium f0 = exp[-U/Eh(sd)]. (This result was stated previously without proof by GESSLER 1995 and CHARLESWORTH 1996.)
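The arithmetic behind this special case can be sketched as follows. The snippet assumes that Equation 2 has the form Pfix = f0 p[sb] and, for brevity, replaces p[sb] with Haldane's approximation 2sb (reasonable only for sb << 1); the function names and the example parameter values are hypothetical.

```python
import math

def f0_equilibrium(U, sd):
    """Equilibrium frequency of the mutation-free class, exp(-U/sd) (HAIGH 1978)."""
    return math.exp(-U / sd)

def pfix_fittest_class_only(sb, sd, U):
    """Approximate Pfix when only the fittest background matters (sb < sd).

    Uses the assumed form Pfix = f0 * p[sb] with p[sb] ~ 2*sb.
    """
    if not 0 < sb < sd:
        raise ValueError("this approximation is intended for 0 < sb < sd")
    return 2.0 * sb * f0_equilibrium(U, sd)

if __name__ == "__main__":
    # Hypothetical values: sb = 0.001, sd = 0.01, U = 0.01, so f0 = exp(-1)
    print(f0_equilibrium(0.01, 0.01))
    print(pfix_fittest_class_only(0.001, 0.01, 0.01))
```

With U = sd the mutation-free class makes up a fraction exp(-1), about 37%, of the population, so the fixation probability is reduced by the same factor relative to 2sb.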

In this article we assume deleterious alleles of fixed effect but relax the requirement that sb < sd. This case has been studied previously by MANNING and THOMPSON 1984. However (as discussed by PECK 1994), their analysis made an erroneous assumption that Pfix = 0 if the beneficial allele was lost from the background in which it first arose, and it presented only a table of numerical results. PECK 1994 conducted some simulation work for sb > sd. We present an algorithm for exact solution of the model studied by MANNING and THOMPSON 1984 and PECK 1994. This solution is valid when sb >> 1/N and exp[-U/sd] >> 1/N. We also study the case where the beneficial mutation arises in a population that is not at equilibrium with respect to deleterious allele frequencies. A population will be perturbed from equilibrium by a selective sweep or a population bottleneck, and if selection against deleterious alleles is weak then the approach back to mutation-selection equilibrium will be very slow (JOHNSON 1999). Indeed, in a rapidly evolving asexual population it is likely that frequencies of deleterious alleles will fluctuate continually and it is therefore of some interest to consider how fixation probabilities are altered in out-of-equilibrium populations. The more specific question addressed here is: What effect does the sweep to fixation of one beneficial allele have on the probability of subsequently arising beneficial alleles also becoming fixed?

We apply our results for fixation probabilities by estimating the expected rate of fitness improvement in an asexual population when beneficial alleles arise continually by mutation at rate kU per individual per generation, with k << 1. Our analysis is essentially an extension of the work of ORR 2000B, who used Equation 2 to estimate fixation probabilities and so derived results that are valid only when sb < sd. Following ORR 2000B, we estimate the rate of mutation that is optimal in the sense that it maximizes this rate of fitness improvement.


*  GENERAL METHODS FOR CALCULATING FIXATION PROBABILITIES

The branching process model as originally developed (FISHER 1922, FISHER 1930; HALDANE 1927) assumes that W is the same for all copies of the beneficial allele and is constant over time. Building on work by POLLAK 1966A, POLLAK 1972, BARTON 1995 (and references therein) has applied the theory of multitype branching processes (see, e.g., HARRIS 1963) to study how fixation probabilities are influenced when W varies according to the "site" in which a given copy of the beneficial allele finds itself. This is a flexible approach in which sites can represent demes or microhabitats (POLLAK 1966A, POLLAK 1972; BARTON 1987, BARTON 1993), genetic backgrounds (BARTON 1994, BARTON 1995), age classes, or any other aspect of population structure. Relatively complex models can be analyzed in a straightforward way by considering the probability Qi,t that a single copy of the beneficial allele present in site i in generation t is ultimately lost from the population. An expression for Qi,t can be found by summing over all possible numbers of offspring and all possible movements of offspring between sites. If each copy of the allele present in site i at time t independently gives rise to n offspring, where n is drawn from a Poisson distribution with mean Wi,t, and the independent probability that each of these offspring will be in site j at time t + 1 is mi,j, and each offspring in site j has independent probability Qj,t+1 of being lost, then

$$Q_{i,t} = \sum_{n=0}^{\infty} \frac{W_{i,t}^{\,n}\,e^{-W_{i,t}}}{n!}\,\bigl(Q^{*}_{i,t}\bigr)^{n} = \exp\bigl[-W_{i,t}\,(1 - Q^{*}_{i,t})\bigr] \tag{3}$$

where

$$Q^{*}_{i,t} = \sum_{j} m_{i,j}\,Q_{j,t+1} \tag{4}$$

is the probability of loss given that the allele at time t has exactly one offspring (BARTON 1995). These equations extend in a straightforward way to give the generating function for the distribution of the number of copies of the beneficial allele.
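The recursion can be sketched numerically for the time-homogeneous case, in which Wi and mi,j do not change between generations so that Qi,t = Qi,t+1 = Qi. The sketch below assumes that Equations 3 and 4 take the forms Qi = exp[-Wi(1 - Q*i)] and Q*i = Σj mi,j Qj, and iterates them from Q = 0, which converges to the smallest fixed point (the loss probability). The function name and the two-site example are hypothetical.

```python
import math

def loss_probabilities(W, m, tol=1e-12, max_iter=100_000):
    """Loss probability Q[i] for a single copy starting in each site i.

    W[i]    : mean (Poisson) offspring number in site i (assumed constant in time)
    m[i][j] : probability that an offspring of a parent in site i is in site j
    """
    n = len(W)
    Q = [0.0] * n
    for _ in range(max_iter):
        # Assumed Equation 4: loss probability given exactly one offspring
        Q_star = [sum(m[i][j] * Q[j] for j in range(n)) for i in range(n)]
        # Assumed Equation 3: Poisson generating function evaluated at Q_star
        Q_new = [math.exp(-W[i] * (1.0 - Q_star[i])) for i in range(n)]
        if max(abs(a - b) for a, b in zip(Q_new, Q)) < tol:
            return Q_new
        Q = Q_new
    raise RuntimeError("fixed-point iteration did not converge")

if __name__ == "__main__":
    # Hypothetical example: site 0 is favorable, site 1 is a sink
    W = [1.5, 0.8]
    m = [[0.9, 0.1],
         [0.0, 1.0]]
    Q = loss_probabilities(W, m)
    print("establishment probabilities:", [1.0 - q for q in Q])
```

Plain fixed-point iteration is simple but converges slowly when the process is close to critical (Wi near 1); a root finder would be a better choice for small selective advantages.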

The assumption that separate copies of the allele are independent means that the branching process is an appropriate model only when the number of copies of the beneficial allele is small relative to the total population size. When sb >> 1/N it is reasonable to assume that deterministic forces will prevail when the branching process model breaks down in this way, and in this case an allele that arises in background i at time t and that is never lost is said to become established. This occurs with probability Pi,t = 1 - Qi,t. The probability of establishment for a beneficial mutation that arises in a random genetic background is denoted Pfix,t and is calculated as an average of the Pi,t, weighted by the probability of the beneficial mutation initially occurring in each background i. If an allele is established, it is not actually guaranteed fixation; its frequency might instead approach a polymorphic equilibrium, or it might become fixed only in some sites.

Because the branching process model assumes that the fate of an allele of interest is determined while it is rare, it cannot be used to calculate fixation probabilities for slightly deleterious mutations or for beneficial mutations that confer an advantage that is weak relative to the effects of genetic drift. For the same reason, branching process models cannot be used to determine the distribution of times taken until ultimate fixation of an allele. To address these types of questions KIMURA 1957 used a diffusion approximation, which is valid when change is approximately continuous in time (large population sizes and weak selection). This approach has been used extensively to study the expected time to fixation (KIMURA and OHTA 1969) and how fixation probabilities are influenced by population subdivision (MARUYAMA 1970, MARUYAMA 1971; NAGYLAKI 1982; BARTON 1987, BARTON 1993; BARTON and ROUHANI 1987), by "background selection" due to segregating deleterious alleles (CHARLESWORTH 1994), or by changing population size (OTTO and WHITLOCK 1997). Despite the advantages of the diffusion approximation, the branching process model remains a useful tool because it can often be analyzed much more easily.


*  MODEL

The model used is identical to the one studied by MANNING and THOMPSON 1984 and PECK 1994. The notation used is summarized in Table 1. We consider the fate of a beneficial allele within a haploid population of fixed size N. Individuals without the beneficial allele are called "wild type." We assume that N is sufficiently large for deterministic results to apply to the wild-type subpopulation and for the effect of Muller's ratchet (loss of the most fit genotype by genetic drift; see MULLER 1964; HAIGH 1978; GORDO and CHARLESWORTH 2000) within the wild-type subpopulation to be negligible over the timescale of interest. (A sufficient condition for this is exp[-U/sd] >> 1/N.) We assume that deleterious mutations arise at a constant rate U per genome per generation, that there are an effectively infinite number of equivalent biallelic loci, and that at each locus the deleterious allele reduces fitness by a factor (1 - sd) with multiplicative effects across loci. Genetic backgrounds (sites) can therefore be enumerated by i, where i is the number of deleterious alleles and takes discrete values i = 0, 1, 2, ... Back mutations are ignored. We first consider a wild-type population at mutation-selection equilibrium. We then consider the more complex scenario of a wild-type population that is initially free of deleterious alleles and is approaching equilibrium. The correspondingly more complex calculations have been placed in Appendixes A and B.


 

 
Table 1. Frequently used notations

We assume that a single beneficial allele arises in a randomly chosen wild-type individual, and its presence increases relative fitness by a factor (1 + sb) regardless of the genetic background on which it is expressed; that is, we assume no epistasis for fitness. Except for its small size, the subpopulation carrying the beneficial allele is identical to the large wild-type (sub)population. That is, deleterious mutations arise at the same rate U and have the same effect on fitness (1 - sd). Because the number of copies of the beneficial allele is initially small, we consider the progress of Muller's ratchet within this subpopulation. We calculate the fixation probability for the beneficial allele, Pfix, by considering its copy number in different genetic backgrounds as a multitype branching process. We assume a Poisson distribution of offspring number.

To estimate the long-term average rate of adaptation, measured as the rate of fitness improvement, we embed our model for fixation probabilities within a more complex model. This is a generalization of the model of ORR 2000B. We make the standard assumption that the beneficial mutation origination process is Poisson and occurs at rate kU per individual per generation, where k << 1 is the ratio of beneficial to deleterious mutations. Because the population size is N the total number of new beneficial alleles arising per generation is NkU. We then make the questionable assumption that each such mutation is fixed with equal and independent probability Pfix, so that the mutation fixation process is also Poisson and occurs at rate NkUPfix. This assumption requires that the perturbations due to beneficial substitutions are small and transient enough that the fixation probabilities of most beneficial mutations are well approximated by the equilibrium population results. In making this assumption we also ignore interference between multiple beneficial alleles (the Hill-Robertson effect; FISHER 1930; MULLER 1932; HILL and ROBERTSON 1966; FELSENSTEIN 1974; BARTON 1995; GERRISH and LENSKI 1998; GERRISH 2001). We assume that successively fixed beneficial alleles have multiplicative effects on fitness.


*  ANALYSIS

Fixation probabilities:
Fitness relative to the fittest wild-type individual is denoted w. Hence, when the beneficial allele is present in the fittest possible individual it has relative fitness w = w0 = (1 + sb). A beneficial allele present in genetic background i has relative fitness

$$w_i = (1 + s_b)(1 - s_d)^{i} \tag{5}$$

At this stage, a minor technical point should be made about two factors that have not been made very explicit in some previous analyses, although they were discussed in Appendix A of PECK 1994. In the branching-process model the expected number of offspring of a given genotype is its absolute fitness Wi, which with a constant population size is its fitness relative to the population mean fitness, w̄. Because the beneficial allele is rare by assumption, w̄ is equal to the mean fitness of the wild-type subpopulation, and because we assume here that the wild-type population is at equilibrium, w̄ = e^-U (KIMURA and MARUYAMA 1966) and therefore

$$W_i = \frac{w_i}{\bar{w}} = w_i\,e^{U} \tag{6}$$

The importance of this difference between absolute and relative fitness is not necessarily apparent when both the wild-type population is at equilibrium and only unmutated offspring are of interest. A fraction e^-U of the offspring of a given individual carrying the beneficial allele are free from additional deleterious mutations, and so in this special case these two factors cancel out and correct results can be derived by assuming that the "effective absolute fitness" is e^-U · wi · e^U = wi. This simplification does not apply generally, however.

In Equation 3 and Equation 4, Qi,t is the probability of loss of a single copy of the beneficial allele present after selection followed by movement between sites. Since here "movement between sites" represents deleterious mutation, the probability of the beneficial allele arising in site i is calculated using the frequencies of the different sites after deleterious mutation. Because the number of deleterious alleles, i, carried by a randomly chosen wild-type individual is Poisson distributed with mean λ = U/sd (HAIGH 1978), we have that

$$f_i = \frac{e^{-\lambda}\,\lambda^{i}}{i!} \tag{7}$$

is the probability of the beneficial mutation arising with i deleterious alleles (i.e., the frequency of background i). Equation 7 holds only when exp[-U/sd] >> 1/N.

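It is more convenient to rewrite Equation 4 in terms of the probability P*i that an allele in background i is never lost (where P*i = 1 - Q*i). The probability that a given copy of the beneficial allele in site i is moved to site i + j by deleterious mutation is simply

$$m_{i,i+j} = \frac{e^{-U}\,U^{j}}{j!} \tag{8}$$

When we substitute (4) and then (8) into (3) we obtain

$$1 - P_{i,t} = \exp\Bigl[-W_{i,t}\sum_{j=0}^{\infty}\frac{e^{-U}\,U^{j}}{j!}\,P_{i+j,\,t+1}\Bigr] \tag{9}$$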

As mentioned briefly earlier in this article and discussed at more length by PECK 1994, there may be backgrounds in which the beneficial allele confers a net advantage in the short term but that will ultimately give rise to a lineage with a lower mean fitness than the wild-type population; such beneficial alleles are assumed not to ultimately fix. Consider a beneficial allele subpopulation that has become large and that includes individuals with i, i + 1, i + 2, ... deleterious alleles. The most fit individual in this subpopulation has fitness max{wB} = (1 + sb)(1 - sd)^i and so the subpopulation will ultimately approach mutation-selection balance with mean fitness w̄B = (1 + sb)(1 - sd)^i e^-U. The wild-type subpopulation ultimately approaches w̄W = e^-U, and if w̄B > w̄W, equivalently (1 + sb)(1 - sd)^i > 1 or

$$i \le i_{\max} = \left\lfloor \frac{-\ln(1 + s_b)}{\ln(1 - s_d)} \right\rfloor \tag{10}$$

then the beneficial allele subpopulation will ultimately replace the wild-type subpopulation. In condition (10) ⌊·⌋ denotes the integer part and imax is the largest value of i where the condition is satisfied, noting that it is always satisfied for i = 0. Here we have assumed that the wild-type population is sufficiently large that the beneficial allele subpopulation does not have time to fix before reaching approximate mutation-selection balance, and we have ignored back mutation of deleterious alleles. (The validity of these assumptions is discussed below.) To find the probability that the beneficial allele ultimately fixes we need to follow only the number of copies in backgrounds 0 ≤ i ≤ imax, because if it is lost from all of these backgrounds then it can never ultimately fix. Therefore we can replace the ∞ in the upper limit of the sum in Equation 9 with (imax - i).
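The number of relevant backgrounds is easy to compute directly from the inequality (1 + sb)(1 - sd)^i > 1. The short sketch below does so; the function name `i_max` is an illustrative choice, and the guard for the boundary case (where the product equals 1 exactly) reflects the strict inequality stated in the text.

```python
import math

def i_max(sb, sd):
    """Largest i with (1 + sb) * (1 - sd)**i > 1, for sb > 0 and 0 < sd < 1."""
    i = math.floor(-math.log1p(sb) / math.log1p(-sd))
    # Guard the strict inequality if the ratio happens to be an exact integer
    if i > 0 and (1.0 + sb) * (1.0 - sd) ** i <= 1.0:
        i -= 1
    return i

if __name__ == "__main__":
    print(i_max(5e-3, 5e-4))  # parameter values used for Figure 1
    print(i_max(1e-3, 1e-2))  # sb < sd: only the fittest background matters
```

For sb < sd the result is always 0, which recovers the case treated by Equation 2.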

As t → ∞ with N constant the state of the wild-type population becomes constant over time, and at equilibrium both Pi,t+1 and Pi,t converge to the single value Pi (an abbreviation for Pi,∞).

By making the simplifications described in the previous two paragraphs, we obtain a set of simultaneous equations in Pi for i = 0, 1, 2, ... , imax. In general, we can start by solving

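$$P_{i_{\max}} = p_{i_{\max}} \tag{11}$$

where pi is an abbreviation for p[wi - 1] and p[s] is the fixation probability of a Poisson branching process with mean 1 + s (see Equation 1). Each Pi can then be calculated numerically in descending order of i by solving

$$1 - P_i = \exp\Bigl[-w_i\sum_{j=0}^{i_{\max}-i}\frac{U^{j}}{j!}\,P_{i+j}\Bigr] \tag{12}$$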

Once the Pi are known, the net fixation probability Pfix can be calculated by averaging over all of the different genetic backgrounds

$$P_{\mathrm{fix}} = \sum_{i=0}^{i_{\max}} f_i\,P_i \tag{13}$$

where the fi are given by setting λ = U/sd in Equation 7. There does not appear to be a more concise general expression for Pfix, except for the special case where imax = 0, which was solved by PECK 1994 (see Equation 2 above). The calculations described here have been automated in a Mathematica (WOLFRAM 1996) notebook available from T. Johnson on request.
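The procedure described in this section can be sketched in a few lines. This is not the authors' Mathematica notebook; it is a minimal illustration that assumes the equilibrium recursion has the form 1 - Pi = exp[-wi Σj (U^j/j!) Pi+j], with the sum running over j = 0, ..., imax - i, and that Pfix = Σi fi Pi with Poisson frequencies fi of mean U/sd. All function names are hypothetical.

```python
import math

def solve_escape(w, S, iterations=200):
    """Root P in [0, 1) of 1 - P = exp(-w * (P + S)), with S >= 0."""
    if S == 0.0 and w <= 1.0:
        return 0.0
    def g(P):
        return -P - math.expm1(-w * (P + S))
    # Lower bracket: g(0) > 0 when S > 0; otherwise start just above 0
    lo = (w - 1.0) / (w * w) if S == 0.0 else 0.0
    hi = 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def fixation_probabilities(sb, sd, U):
    """Return (imax, [P_0..P_imax], Pfix) for an equilibrium wild-type population."""
    imax = math.floor(-math.log1p(sb) / math.log1p(-sd))
    w = [(1.0 + sb) * (1.0 - sd) ** i for i in range(imax + 1)]
    P = [0.0] * (imax + 1)
    for i in range(imax, -1, -1):          # descending order of i
        S, term = 0.0, 1.0
        for j in range(1, imax - i + 1):   # contribution of mutated offspring
            term *= U / j                  # term = U**j / j!
            S += term * P[i + j]
        P[i] = solve_escape(w[i], S)
    lam = U / sd
    f = [math.exp(-lam)]
    for i in range(1, imax + 1):           # Poisson frequencies f_i
        f.append(f[-1] * lam / i)
    pfix = sum(fi * Pi for fi, Pi in zip(f, P))
    return imax, P, pfix

if __name__ == "__main__":
    imax, P, pfix = fixation_probabilities(sb=5e-3, sd=5e-4, U=3e-3)
    print("imax =", imax)
    print("P_0 =", P[0], " P_imax =", P[-1])
    print("Pfix =", pfix)
```

The example call uses the parameter values of Figure 1. Solving each equation by bisection is a deliberately simple choice; any one-dimensional root finder would serve, because each Pi depends only on the already-computed values for backgrounds i + 1, ..., imax.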

The rate of adaptation:
Our model for long-term adaptation assumes that beneficial mutations are rare enough that they can be considered, to a good approximation, to arise in close-to-equilibrium populations, so that Pfix is relevant. We wish to find an approximation for C, the expected increase in mean log-fitness per generation. To do this we first find an approximation for ΔC, the expected increase in log-fitness per beneficial substitution, which in turn requires an approximation for ΔCi, the expected increase in log-fitness per beneficial substitution conditional on the beneficial allele arising in background i.

If the beneficial allele arises on background i and is ultimately fixed, then some number hi of deleterious alleles will be fixed by hitchhiking (allowing the possibility of h0 = 0). Clearly hi ≥ i, but the problem is that hi conditional on fixation is a random variable and may exceed i in the event that the beneficial mutation is lost from the background on which it originally arose but ultimately does become fixed. We make use of the decomposition Pi = pi + xi, which is detailed in Appendix C. Here pi = p[wi - 1] is the probability that the beneficial allele fixes in the background in which it arose (i), and xi is the probability that it fixes after being lost from that background. xi can be calculated directly by numerical solution of Equation C2 or simply by taking the difference between Pi and pi. This partitioning of fixation events into two mutually exclusive possibilities suggests that for U << 1 it might be reasonable to suppose that the fate of a beneficial allele lost by mutation from background i, conditional on eventual fixation, is identical to the fate of a beneficial allele that arises in background i + 1, again conditional on ultimate fixation. This leads to the approximation

$$h_i \approx \frac{p_i\,i + x_i\,h_{i+1}}{P_i} \tag{14}$$

which can be calculated in decreasing order of i because x_imax = 0 and hence h_imax = imax. The expected increase in population mean fitness given that a beneficial allele arising on background i becomes fixed can be similarly approximated, and because for small sb the mean increases in fitness and log-fitness are roughly equal we obtain

(15)

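Both of these quantities can be averaged over the distribution of fiPi. Then we obtain the expected number of deleterious alleles that will hitchhike with a beneficial allele arising on a random background, conditional on its ultimate fixation

$$\bar{h} = \frac{1}{P_{\mathrm{fix}}}\sum_{i=0}^{i_{\max}} f_i\,P_i\,h_i \tag{16}$$

and the expected increase in mean log-fitness per fixation

$$\Delta C = \frac{1}{P_{\mathrm{fix}}}\sum_{i=0}^{i_{\max}} f_i\,P_i\,\Delta C_i \tag{17}$$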

As explained above, the assumption of our model for continued adaptation is that the mutation fixation process is Poisson and occurs at rate NkUPfix. Because we assume multiplicative effects of multiple beneficial alleles, the long-term average rate of mean log-fitness increase is given approximately by

$$C \approx N k U\,P_{\mathrm{fix}}\,\Delta C \tag{18}$$

When U is varied and the other model parameters are held constant, C is maximized at a particular value of U, which we call the optimum mutation rate Uopt. ORR 2000B proved that when sb < sd (so that imax = 0) the optimum mutation rate is given simply by Uopt = sd. This result can be seen by substituting (2) into (18) and differentiating to obtain

$$\frac{dC}{dU} \approx N k\,p[s_b]\,\Delta C\,\Bigl(1 - \frac{U}{s_d}\Bigr)\exp\Bigl[-\frac{U}{s_d}\Bigr] \tag{19}$$

(because ΔC ≈ sb does not depend on U for imax = 0) and noting that dC/dU = 0 and d²C/dU² < 0 when U = sd (ORR 2000B). For imax > 0 no such simple derivation is possible, but Uopt can be found by numerical calculation reasonably efficiently using a golden section search, automated in a Mathematica (WOLFRAM 1996) notebook available from T. Johnson on request. It is notable that because both Pfix and ΔC are functions only of U, sb, and sd and do not depend on N or k, Uopt must be a function of sb and sd only and also will not depend on N or k.
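A golden section search of the kind mentioned above can be sketched as follows. To keep the example self-contained it is applied only to the special case sb < sd, where (assuming Pfix = exp[-U/sd] p[sb] and a ΔC that does not depend on U) the rate of adaptation is proportional to U exp[-U/sd], so the search should recover Uopt = sd. The function names, the search interval, and the value of sd are hypothetical; for imax > 0 the objective would instead be built from the numerically calculated Pfix and ΔC.

```python
import math

def golden_section_max(f, a, b, tol=1e-9):
    """Locate the maximum of a unimodal function f on [a, b]."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0   # 1/phi, about 0.618
    while (b - a) > tol:
        c = b - invphi * (b - a)
        d = a + invphi * (b - a)
        if f(c) > f(d):
            b = d      # maximum lies in [a, d]
        else:
            a = c      # maximum lies in [c, b]
    return 0.5 * (a + b)

def relative_rate(U, sd):
    """Rate of adaptation up to constant factors (N, k, p[sb], Delta C) when imax = 0."""
    return U * math.exp(-U / sd)

if __name__ == "__main__":
    sd = 0.01
    U_opt = golden_section_max(lambda U: relative_rate(U, sd), 0.0, 10.0 * sd)
    print("numerical U_opt =", U_opt, " expected sd =", sd)
```

This version re-evaluates both interior points on every iteration, which costs one extra function call per step compared with the usual bookkeeping but keeps the logic easy to follow.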


*  NUMERICAL RESULTS

Fixation probabilities when many genetic backgrounds are relevant:
As we described in the Introduction (see also FISHER 1930; MANNING and THOMPSON 1984; CHARLESWORTH 1994; PECK 1994; BARTON 1995; ORR 2000B), when sb ≤ sd a beneficial allele must arise in the single most fit genetic background to have any chance of fixation, and the net fixation probability is given by Equation 2. However, when sb >> sd there are many genetic backgrounds in which a beneficial allele can arise and have some probability of fixation and so the situation is more complex. This is illustrated in Fig 1, which shows Pi (top, line), the fixation probability of a beneficial allele arising in a background with i deleterious alleles, assuming a wild-type population at equilibrium and parameter values sb = 5 × 10^-3, sd = 5 × 10^-4, and U = 3 × 10^-3. The probability of arising in each background, fi (top, dots), and the way that Pi and fi combine (bottom) to determine the net fixation probability Pfix are also shown. For beneficial alleles arising in the least fit background of relevance, with i = imax = 9 deleterious alleles, the fixation probability is very small (w_imax ≲ 1 + sd and hence P_imax ≲ 2sd for sd << 1) because the advantage of the beneficial allele is almost totally eliminated by the deleterious alleles it is linked to and because any new deleterious mutation will eliminate that advantage altogether. As i decreases Pi increases and approaches an asymptote, which is P0 ≈ 2(U + sb) = 1.6 × 10^-2 for U << 1, sb << 1. This asymptote occurs because the beneficial allele is in a genotype with absolute fitness approaching (1 + sb)exp[U] and because many new deleterious mutations must occur to reduce that advantage. A mathematical description of this behavior, which is a good approximation when U >> sb, is detailed in Appendix C. In this example the main contribution to the net fixation probability Pfix is from moderate fitness backgrounds with i intermediate between zero and imax.
For the parameter values used in Fig 1 a beneficial allele that fixes is most likely to be one that arose on a background of i = 5 deleterious alleles, and therefore at least one-half its selective advantage will be negated by the deleterious alleles that hitchhike to fixation with it.



 
Figure 1. A beneficial allele has probability fi (top, dots) of arising in a background of i deleterious alleles. Given that it does, it has fixation probability Pi (top, line). The sum of fiPi (bottom) over all backgrounds gives the net fixation probability Pfix ≈ 7.8 × 10^-3 for the parameter values used here (sb = 5 × 10^-3, sd = 5 × 10^-4, U = 3 × 10^-3).

Fixation probabilities in an equilibrium population:
Fig 2 shows results for the situation where the beneficial allele arises in a population at equilibrium under selection and deleterious mutations of fixed effect. It explores the region of the parameter space where U/sd ≤ 10, so that f0 ≥ e^-10 ≈ 4.5 × 10^-5 will represent a large number of individuals for population sizes that are realistic, at least for bacteria. Shown is the relative fixation probability when there is interference, R = Pfix/p[sb], where the fixation probability of the beneficial allele is Pfix and the value it would take in the absence of any interference is p[sb]. A built-in Mathematica algorithm (WOLFRAM 1996) was used to fit contours to a lattice of values calculated by numerical solution of Equation 11 and Equation 12. The irregularities in the contour lines are not caused by numerical inaccuracies but occur near where the number of relevant genetic backgrounds imax changes from one integer value to the next, which causes the gradient of Pfix to change abruptly.



Figure 2. The fixation probability Pfix of a beneficial allele conferring a selective benefit sb is reduced by interference from segregating deleterious alleles causing disadvantage sd that arise at completely linked loci at rate U. The relative fixation probability in an equilibrium population (R = Pfix/p[sb], measured relative to the fixation probability without interference) is shown as a function of sb and sd. From top to bottom U = 10^-3, U = 10^-2, and U = 10^-1.

For the region of the parameter space considered in Fig 2 the relative fixation probability spans almost five orders of magnitude. A logarithmic scale of fixation probabilities is appropriate if one wishes to understand in which regions of the parameter space evolution essentially cannot proceed. However, for other questions a linear scale is more appropriate. In the case of competing subpopulations, a small difference in rates of adaptation is critical, and the rate of adaptation is approximately linear in the fixation probability. In a log-log-linear space the (sb, sd, R) surfaces show regions of roughly constant fixation probability, either high (R ~= 1) or low (R ~= 0). To help visualize this we show additional contour lines at R = 0.9 (dotted lines) and R = 0.5 (dashed lines), which together with the first solid contour at R = 0.1 roughly define the transition from no effect to a severe effect of interference from deleterious alleles.

In all of Fig 2, the area above and to the left of the diagonal sb = sd represents the region of the parameter space where Equation 2 applies, and the area below and to the right of this diagonal represents the region of the parameter space where the methods developed above are necessary to calculate the fixation probability.

Several trends worthy of comment are evident in Fig 2. First, by comparing across all three panels we can see that fixation probabilities are reduced more severely by interference as the rate of deleterious mutation U increases. This result is expected. Second, the severity of the reduction in fixation probability often but not always increases monotonically as the strength of selection against deleterious alleles sd decreases. An example of nonmonotonicity in sd can be seen when U = 10^-3 and sb = 10^-2.5 (Fig 2, top). This perhaps counterintuitive result occurs because there are two opposing forces at work here. As sd decreases the frequency of deleterious alleles increases, and hence they are more likely to be present in the background on which the beneficial allele arises, but the effect of any one deleterious allele in reducing the advantage of the beneficial allele is less. The graphs show that, for the parameter space explored, the former force tends to dominate the latter. For sb < sd examination of Equation 2 shows that R decreases as U increases and as sd decreases, but it was not at all clear in advance that this dependency would hold for most but not all sb.

A third visible trend is that beneficial alleles of larger effect have fixation probabilities that are less influenced by segregating deleterious alleles. This is an intuitively reasonable result, but one that is not true when sb < sd. Equation 2 shows that R is independent of sb, which can be seen to be true in the parts of Fig 2 where this equation applies, i.e., everything above and left of the line sb = sd. To the right of and below the line we see that the situation is more complex. The final trend worth noting is that in some parts of the parameter space there is a catastrophic reduction in fixation probability caused by interference from segregating deleterious alleles. To a coarse approximation, it can be said that R ~= 0.5 when sb = max{sd,U} and R will be very small when both sb << sd and sb << U.

The three parts of Fig 2 are almost perfect replicas of each other, offset by the value of U, suggesting that R depends only on two compound parameters sb/U and sd/U. Equation 2 shows that this is true when sb < sd. In Appendix C we show that this is also true when sd << sb << 1 and U << 1, that is, when selection and mutation are both weak and many genetic backgrounds are relevant.

Nonequilibrium populations:
In Appendixes A and B we derive results for fixation probabilities when the population of interest was free of segregating deleterious alleles at some time t = 0 in the past. This initial condition approximates the effect of a rapid selective sweep, a severe population bottleneck with rapid recovery, or the founding of a laboratory evolution experiment from a single clone. The fixation probability of a beneficial mutation that arises at some subsequent time t = τ is denoted Pfix,τ. The results above for an equilibrium population are a special limiting case of this scenario, where τ -> ∞.

Consider first the simplest case τ = 0. At that time a beneficial mutation is guaranteed to arise in a background free of deleterious alleles (f0 = 1), which would increase its net fixation probability. However, at the same time the population mean fitness would be unity, causing the absolute fitness of genotypes containing the beneficial allele to be lower than in an equilibrium population (where the mean fitness is exp[-U]; see Equation 6), which would cause a decrease in its net fixation probability. There are therefore two opposing forces at work, and their combined effect on the fixation probability of a beneficial allele is not obvious. It is not necessarily sufficient to argue that the net fixation probability is reduced by variance in fitness across backgrounds and is therefore higher when there is zero variance at τ = 0, because this argument does not take into account the changing mean fitness of the wild-type population. (Indeed a variance-based argument fails to predict how Pfix depends on sd in an equilibrium population.)

In Appendix B we prove that, for the special case where sb < sd, the fixation probability for a beneficial mutation occurring at time τ < ∞ is always greater than in an equilibrium population (τ = ∞). To see the importance of this result, consider further the case of a weakly selected beneficial allele with sb << U, sb << sd arising at time τ = 0. Genotypes containing such a beneficial allele all give rise to, on average, less than one offspring of the same genotype at t = 0 and for some period of time thereafter (because e^-U Wi,0 < 1; see Appendix A). To have any probability of fixation, at least one copy of the allele must persist until the wild-type mean fitness has decayed far enough for the genotype carrying it to have absolute fitness greater than one. It is therefore quite surprising to find that such beneficial alleles always have greater fixation probability than they would if they arose in an equilibrium population. We offer the following explanation for this perhaps counterintuitive result. As we go backward in time there is a decrease in fixation probability due to increasing wild-type mean fitness and an increase in fixation probability due to increasing f0,t. These two forces are coupled, and the nature of this coupling ensures that, working backward in time from an equilibrium fixation probability, the fixation probability will always increase. On the basis of our numerical results for sb > sd we speculate that this is true for all values of sb and sd.

We measure the effect of a nonequilibrium population by the inflation in fixation probability, I = Pfix,τ/Pfix,∞, measured relative to an equilibrium population. Fig 3 shows I as a function of sb and of τ. In these calculations we assumed that after 1000 generations the population would be close to equilibrium. The choice of sd = 10^-2 for these plots was influenced by the computer time required for the calculations (see Appendix A), and the choices of U = 10^-2 (left plot) and U = 10^-1 (right plot) represent situations where there is a moderate and a large reduction in fixation probability at equilibrium. For these parameter values, any inflation in fixation probability is a relatively short-lived effect and is negligible [in the sense that (I - 1)/(max{I} - 1) < 0.05] after 300 generations. This is to be expected because a population perturbed from mutation-selection balance decays toward its equilibrium state on a timescale proportional to 1/sd = 100 generations (JOHNSON 1999). What is more remarkable is that the inflation in fixation probability occurs only for beneficial alleles with certain selection coefficients. For U = 10^-2 only beneficial alleles with sb ~= 10^-2 have inflated fixation probabilities and the effect is moderate (I <= 1.6). For U = 10^-1 beneficial alleles with selection coefficients sb ~= 10^-2 or greater have inflated fixation probabilities and the effect is substantial (I ~= 40). To understand why this is so, it is necessary to consider these fixation probabilities relative to the case with no interference, measured by R = Pfix,τ/p[sb], which are shown in Fig 4. The abrupt changes in gradient visible in the graph are caused when the number of relevant genetic backgrounds imax changes from one integer value to the next.
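
The 1/sd timescale can be illustrated with a minimal sketch of how the wild-type population approaches mutation-selection balance after a purge. It assumes multiplicative fitness (1 - sd)^i, Poisson mutation at rate U, and selection acting before mutation in each generation; under those assumptions the number of deleterious alleles per genome stays Poisson, so only its mean needs to be tracked. The function name and the ordering of the life cycle are assumptions of this sketch, not a restatement of Appendix A.

```python
import math

def mean_deleterious_load(U, sd, generations):
    """Mean number of deleterious alleles per genome in each generation after a purge.

    Assumes a Poisson-distributed load: selection scales the mean by (1 - sd),
    then mutation adds U. Returns a list of length generations + 1.
    """
    lam = 0.0                      # population is mutation-free at t = 0
    trajectory = [lam]
    for _ in range(generations):
        lam = lam * (1 - sd) + U   # selection, then mutation
        trajectory.append(lam)
    return trajectory

U, sd = 1e-2, 1e-2
traj = mean_deleterious_load(U, sd, 1000)
for t in (0, 100, 300, 1000):
    # fraction of the equilibrium load U/sd reached, and the mutation-free frequency f0,t
    print(t, traj[t] / (U / sd), math.exp(-traj[t]))
```

In closed form the mean is (U/sd)[1 - (1 - sd)^t], so with sd = 10^-2 the load is within about 5% of its equilibrium value after 300 generations under these assumptions.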



Figure 3. The fixation probability Pfix of a beneficial allele conferring a selective benefit sb is inflated if it arises some short time τ after segregating deleterious alleles have been purged from an asexual population. The degree of inflation (I = Pfix,τ/Pfix,∞, measured relative to the fixation probability in a population at mutation-selection equilibrium) is shown as a function of τ and sb. Left, U = 10^-2, sd = 10^-2, and the maximum inflation is I ~= 1.58. Right, U = 10^-1, sd = 10^-2, and the maximum inflation is I ~= 37.7. In both panels the same shading scheme is used; solid contours are at geometric intervals of ~1.44, and dashed contours are at geometric intervals of ~1.048.



Figure 4. Relative fixation probabilities, R, in a nonequilibrium population. Top, sd = 10^-2, U = 10^-2; bottom, sd = 10^-2, U = 10^-1. Note the very different scales of the y-axes. Curves show τ = 0 (dotted), τ = 100 (dashed), and τ = 1000 ~= ∞ (solid).

Assume that the fixation probability is always less than it would be in the complete absence of deleterious mutation (we prove this for sb < sd in Appendix B and speculate that it is always true). Then if fixation probabilities are only moderately reduced, as in the case sd = 10^-2, U = 10^-2 (Fig 4, top), the inflation in fixation probability I can be at most moderate. On the other hand, when fixation probabilities are substantially reduced, as in the case sd = 10^-2, U = 10^-1 (Fig 4, bottom), the inflation in fixation probability I can also be substantial. The slightly mysterious peak of inflation I at sb ~= 10^-2 for U = 10^-2 can be partly explained because, for sb >> 10^-2, there is no reduction in fixation probability at equilibrium and hence there can be no transient inflation.
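
The bound implied by this argument can be written out explicitly. The subscripts on R below are notation introduced here for illustration: R_τ is the relative fixation probability for a mutation arising at time τ, and R_∞ is its equilibrium value. The numerical illustration uses Equation 2 and therefore applies only where sb < sd.

```latex
I \;=\; \frac{P_{\mathrm{fix},\tau}}{P_{\mathrm{fix},\infty}}
  \;=\; \frac{P_{\mathrm{fix},\tau}/p[s_b]}{P_{\mathrm{fix},\infty}/p[s_b]}
  \;=\; \frac{R_\tau}{R_\infty} \;\le\; \frac{1}{R_\infty},
\qquad \text{provided } R_\tau \le 1 .
% Illustration for s_b < s_d, where Equation 2 gives R_\infty = \exp[-U/s_d]:
%   U = s_d = 10^{-2}:            I \le e \approx 2.7
%   U = 10^{-1}, s_d = 10^{-2}:   I \le e^{10} \approx 2.2 \times 10^{4}
```

The bound is loose, but it shows why a large inflation is possible only where the equilibrium reduction in fixation probability is itself large.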

Fig 3 and Fig 4 show that, for the specific departure from equilibrium that we have studied, there is little effect on fixation probability for weakly selected beneficial alleles. It is possible that this is because the fate of such alleles takes a long time to be determined, and the transient nonequilibrium state of the wild-type population is therefore of little relevance.

The rate of adaptation and the optimum mutation rate:
Fig 5 shows numerical calculations of the optimum mutation rate Uopt as a function of the two parameters on which it depends, sb and sd. The optimum mutation rate is the rate that maximizes the long-term average rate of fitness increase, as estimated using Equation 18. These results are in agreement with ORR's (2000b) finding that Uopt = sd when sb <= sd (Fig 5, left). However, when sb > sd the dependence on sd is much weaker, and to a very coarse approximation Uopt ~= max{sb, sd} for the whole parameter space examined here. This result can be explained in terms of our results for fixation probabilities. For any chosen combination of sb and sd imagine the appropriate points in Fig 2. Consider first a very small value of U. The fixation probability is high. Now imagine increasing U so that a "hole" starts to appear in the corner of the plane. At first, there is a less-than-linear decline in fixation probability with U and the rate of beneficial mutations increases linearly with U, so the rate of fitness improvement C increases. Suddenly, when U exceeds max{sb, sd} there is a catastrophic decline in fixation probability and an associated decline in C. Hence the rate of adaptation is maximized just before this catastrophe, at Uopt ~= max{sb, sd}. This is illustrated in Fig 6.
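
For the regime sb <= sd, the location of the optimum can be checked with a minimal sketch. Equation 18 is not reproduced here; the sketch assumes only that the rate of fitness improvement is proportional to the supply of beneficial mutations (linear in U) multiplied by the fixation probability of Equation 2 (proportional to exp[-U/sd]). The function names and the search grid are hypothetical.

```python
import math

def relative_adaptation_rate(U, sd):
    """C(U) up to a constant factor, assuming C is proportional to U * exp(-U/sd)."""
    return U * math.exp(-U / sd)

def optimum_mutation_rate(sd, lo=1e-5, hi=1.0, n_grid=2001):
    """Grid search for the U that maximizes the assumed C(U) on a log-spaced grid."""
    grid = [lo * (hi / lo) ** (j / (n_grid - 1)) for j in range(n_grid)]
    return max(grid, key=lambda U: relative_adaptation_rate(U, sd))

for sd in (1e-3, 1e-2, 1e-1):
    print(sd, optimum_mutation_rate(sd))
```

Setting the derivative of U exp[-U/sd] to zero gives U = sd directly, in agreement with the result quoted above for sb <= sd. The regime sb > sd has no comparably simple form and requires the numerical fixation probabilities.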



Figure 5. Under the model assumptions (see text and Fig 6), the optimum deleterious mutation rate Uopt maximizes the long-term average rate of fitness improvement in an asexual population. Uopt depends only on sb and sd, against which it is plotted here.



Figure 6. Assuming the ratio of beneficial to deleterious mutations, k, is fixed and small, the long-term average rate of fitness improvement C in an asexual population is maximized when the deleterious mutation rate U takes a particular value, Uopt, which we term the optimum. Uopt is independent of k and of the population size N. The optimum is clear on plots of C against U, shown here for three choices of parameter values: sb = sd = 10^-2 (solid); sb = 10^-2, sd = 10^-3 (dotted); and sb = 10^-3, sd = 10^-2 (dashed). For all curves Nk = 1.


*  DISCUSSION

We have described a method for calculating fixation probabilities when segregating deleterious alleles of fixed effect jointly influence the fate of a beneficial allele in an asexual population. Our analysis includes as a special case the situation studied previously where any single deleterious allele overwhelms the advantage of the beneficial allele (MANNING and THOMPSON 1984; CHARLESWORTH 1994; PECK 1994; ORR 2000B). However, our analysis applies for any strength of selection on both the deleterious alleles and the beneficial allele, provided that selection is strong relative to drift and that Muller's ratchet is not operating rapidly. In particular, it allows study of the fate of a beneficial allele of large effect, where "large" means larger than the effect of a single deleterious allele. Our method allows fixation probabilities to be calculated to arbitrary precision by solution of a set of equations, but we cannot in general write down an equation for the fixation probability that improves our understanding of the effect we are studying. The value of our numerical results is that a reasonable intuition about the effect of interfering deleterious alleles on fixation probabilities can be developed by studying Fig 2. This numerical work together with the approximate analytical results derived in Appendix C justifies the following "rules of thumb" concerning R = Pfix/p[sb], a statistic that describes the fixation probability in the presence of segregating deleterious alleles (Pfix) in an equilibrium population, relative to what it would be in the absence of deleterious alleles (p[sb]).

  1. For weak selection and mutation R depends only on the compound parameters sb/U and sd/U, which describe the strength of selection relative to the rate of deleterious mutation.

  2. The relative fixation probability R cannot be predicted from only the variance in fitness in the wild-type population, which is Usd.

  3. For sb < sd the relative fixation probability is R = exp[-U/sd].

  4. For sd << U and sb > U there is negligible reduction in fixation probability and R ~= 1.

  5. For sd << U and sb < U there is substantial reduction in fixation probability and R ~= 0.

Conclusions 1, 2, 4, and 5 are novel.
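
Rules 3 to 5 can be collected into a small lookup, sketched below. The function name is hypothetical, and the threshold used to decide when sd << U (a factor of 10 by default) is an arbitrary assumption of this sketch. The function returns None wherever none of the rules applies, in which case the full numerical solution is needed.

```python
import math

def rule_of_thumb_R(sb, sd, U, much_less=0.1):
    """Coarse relative fixation probability R from rules 3-5, or None if no rule applies.

    `much_less` is an assumed cutoff: sd << U is read as sd < much_less * U.
    """
    if sb < sd:
        return math.exp(-U / sd)   # rule 3: R = exp[-U/sd]
    if sd < much_less * U:
        if sb > U:
            return 1.0             # rule 4: negligible reduction, R ~= 1
        if sb < U:
            return 0.0             # rule 5: substantial reduction, R ~= 0
    return None                    # outside the stated rules

print(rule_of_thumb_R(sb=1e-3, sd=1e-2, U=1e-2))   # rule 3
print(rule_of_thumb_R(sb=1e-1, sd=1e-4, U=1e-2))   # rule 4
print(rule_of_thumb_R(sb=1e-3, sd=1e-4, U=1e-2))   # rule 5
```

The values 1.0 and 0.0 stand in for "approximately one" and "approximately zero"; they are qualitative labels, not computed probabilities.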

Previous studies of the effect of linked deleterious alleles on fixation probabilities have all assumed sb < min{sd}, with three relatively minor exceptions. PECK 1994 presented simulation results where sb > sd, but only for two choices of parameter values and thus gave little indication of how Pfix depends on model parameters. The earlier numerical work of MANNING and THOMPSON 1984 was similarly cursory, and their analysis contained an error (as discussed by PECK 1994 and described above also). ORR 2000B (Figure 2) showed simulation results where both sb and sd were drawn from continuous distributions for each mutational event and in a fraction (~2.5%) of pairwise cases sb > sd.

The emphasis on sb < sd in previous studies has been influenced by a combination of mathematical convenience and the belief that a major component of adaptation is due to beneficial alleles of small effect. The view that sb is typically small was argued by FISHER 1930 on the basis of a "geometrical" model, which involves evolution toward an optimum in high-dimensional phenotype space. In this model, mutations are assumed to alter the phenotype in a random "direction" and the probability that a given mutation produces a phenotype closer to the optimum increases as the effect of that mutation becomes smaller. Thus, the density function for the distribution of sb (conditional on sb > 0) is expected to be monotonically decreasing. This tendency becomes more marked as the dimensionality of phenotype space increases. Later studies (KIMURA 1983, pp. 154–155; ORR 1998, ORR 1999, ORR 2000A) describe the distribution of sb conditional on fixation and so do not alter FISHER's (1930) conclusion concerning the distribution of effects of newly arising beneficial alleles. We note that these later studies have all assumed Pfix ~= 2sb, and so a possible area for further work is to examine expressions for fixation probability that are nonlinear in sb, such as those derived here, in the context of Fisher's geometrical model.

There is an accumulating body of empirical evidence (reviewed