Waiting for Two Mutations: With Applications to Regulatory Sequence Evolution and the Limits of Darwinian Evolution
Rick Durrett, Deena Schmidt


Results of Nowak and collaborators concerning the onset of cancer due to the inactivation of tumor suppressor genes give the distribution of the time until some individual in a population has experienced two prespecified mutations and the time until this mutant phenotype becomes fixed in the population. In this article we apply these results to obtain insights into regulatory sequence evolution in Drosophila and humans. In particular, we examine the waiting time for a pair of mutations, the first of which inactivates an existing transcription factor binding site and the second of which creates a new one. Consistent with recent experimental observations for Drosophila, we find that a few million years is sufficient, but for humans with a much smaller effective population size, this type of change would take >100 million years. In addition, we use these results to expose flaws in some of Michael Behe's arguments concerning mathematical limits to Darwinian evolution.

THERE is a growing body of experimental evidence that in Drosophila, significant changes in gene regulation can occur in a short amount of time, compared to divergence time between species. Ludwig et al. (1998, 2000, 2005) studied the evolution of the even-skipped stripe 2 enhancer in four Drosophila species (Drosophila melanogaster, D. yakuba, D. erecta, and D. pseuodoobscura). While expression is strongly conserved, they found many substitutions in the binding sites for bicoid, hunchback, Kruppel, and giant, as well as large differences in the overall size of the enhancer region. In addition, they uncovered several binding sites that have been gained and lost among these four species: a lineage-specific addition of the bicoid-3 binding site in D. melanogaster that is absent in the other species, a lineage-specific loss of the hunchback-1 site in D. yakuba, and the presence of an extra Kruppel site in D. pseudoobscura relative to D. melanogaster. These differences are nicely summarized in Figure 2B of Ludwig et al. (2005).

In a simulation study, Stone and Wray (2001) estimated the rate of de novo generation of regulatory sequences from a random genetic background. They found that for a given six-nucleotide sequence, the time until it arose in a 2-kb region in some individual was ∼5950 years for humans and 24 years for Drosophila. However, as MacArthur and Brookfield (2004) have already pointed out, there is a serious problem with Stone and Wray's computation. They assumed individuals in the population evolve independently, while in reality there are significant correlations due to their common ancestors.

Motivated by this simulation study, Durrett and Schmidt (2007) have recently given a mathematical analysis for regulatory sequence evolution in humans, correcting the calculation mentioned above. They assumed an effective population size of 10,000 and a per nucleotide mutation rate of μ = 10−8. In this situation, the expected number of segregating sites in a 1-kb sequence is 1000(4Neμ) = 0.4 so it makes sense to talk about a population consensus sequence. The authors defined this as the nucleotide at the site if there is no variability in the population and if the site is variable, the most frequent nucleotide at that site in the population. Using a generation time of 25 years, they found that in a 1-kb region, the average waiting time for words of length six was 100,000 years. For words of length eight, they found that the average waiting time was 375,000 years when there was a seven- of eight-letter match to the target word in the population consensus sequence (an event of probability ∼5/16) and 650 million years when there was not.

Fortunately, in biological reality, the match of a regulatory protein to the target sequence does not have to be exact for binding to occur. Biological reality is complicated, with the acceptable sequences for binding described by position weight matrices that indicate the flexibility at different points in the sequence. To simplify, we assume that binding will occur to any eight-letter word that has seven letters in common with the target word. If we do this, then the mean waiting time reduces to ∼60,000 years.

To explain the intuition behind the last result, consider the case of eight-letter words. If all nucleotides are equally likely and independent, then using the binomial distribution we see that a six of eight match to a given eight-letter target word has probability 63/16,384 ≈ 0.00385, so in a region of 1000 nucleotides, we expect to find 3.85 such approximate matches in the population consensus sequence. Simple calculations then show that the waiting time to improve one of these six of eight matches to seven of eight has a mean of 60,000 years. This shows that new regulatory sequences can come from small modifications of existing sequence.

Extending our previous work on the de novo generation of binding sites, this article considers the possibility that in a short amount of time, two changes will occur, the first of which inactivates an existing binding site, and the second of which creates a new one. This problem was studied earlier by Carter and Wagner (2002). In the next section, we present the model and then a simpler theoretical analysis based on work of Komarova et al. (2003) and Iwasa et al. (2004, 2005), who studied the onset of cancer due to the inactivation of tumor suppressor genes. Finally, we compare the theory with simulations and experimental results.


Consider a population of 2N haploid individuals. The reader should think of this as the chromosomes of N diploid individuals evolving under the assumptions of random union of gametes and additive fitness. However, since we use the continuous-time Moran model, it is simpler and clearer to state our results for haploid individuals.

We start with a homogeneous population of wild-type individuals. We have two sets of possible mutant genotypes A and B. Wild-type individuals mutate to type A at rate u1 and type A individuals mutate to type B at rate u2. We assume there is no back mutation. We think of the A mutation as damaging an existing transcription factor binding site and the B mutation as creating a second new binding site at a different location within the regulatory region. We assign relative fitnesses 1, r, and s to wild-type, A mutant, and B mutant individuals, respectively. See Figure 1 for a diagram of our model. We used the word damage above to indicate that the mutation may only reduce the binding efficiency, not destroy the binding site. However, even if it does, the mutation need not be lethal. In most cases the B mutation will occur when the number of A mutants is a small fraction of the population, so most individuals with the A mutation will also carry a working copy of the binding site.

Figure 1.—

An example of our general two-stage mutation process used in this article is as follows. The regulatory region contains two possible binding sites, a and b, where a prime denotes an inactivated site. Wild-type individuals can undergo a type A point mutation (a b′ Formula a′ b′ at rate u1), which inactivates site a, and type A individuals can undergo a type B point mutation (a′ b′ Formula a′ b at rate u2), which creates a new active site b. The relative fitnesses of wild type, A mutant, and B mutant are 1, r, and s, respectively. Note that in this case, wild-type individuals cannot produce individuals with a second active binding site. For a different example of this general process, see p. 955 of Carter and Wagner (2002).

We could also assume that the mutations occur in the other order: B first and then A. This is also a two-stage process that falls into the general framework of our analysis below under the appropriate fitness assumptions. The problems in population genetics to be solved are as follows: How long do we have to wait (i) for a type B mutation to arise in some individual or (ii) for the B mutant to become fixed in the population? These problems were investigated by Komarova et al. (2003) and Iwasa et al. (2004, 2005) for tumor suppressor genes, whose inactivation can lead to cancer, with A being the inactivation of one copy of the gene and B the inactivation of the other. A nice account of these results can be found in Section 12.4 of Nowak's (2006) book on evolutionary dynamics. Here, we apply these results to estimate the waiting time for a switch between two transcription factor binding sites, as defined in the statement of our problem above.

First we need to describe the population genetics model we are considering. Rather than use the discrete-time Wright–Fisher model, we use the continuous-time Moran model. We prefer the Moran process because it is a birth-and-death chain, which means that the number of type A individuals increases or decreases by one on each event. Biologically the Moran model corresponds to a population with overlapping generations and in the case of tumor suppressor genes is appropriate for a collection of cells in an organ that is being maintained at a constant size.

As the reader will see from the definition, the Moran process as a genetic model treats N diploids as 2N haploids and replaces one chromosome at a time. In this context it is common to invoke random union of gametes and assume fitnesses are additive, but that is not necessary. Since homozygous mutants are rare, the fitness of an A mutant is its fitness in the heterozygous state. Supposing that the relative fitnesses have been normalized to have maximum value 1, the dynamics may be described as follows:

  • Each individual is subject to possible replacement at rate 1.

  • A copy is made of an individual chosen at random from the population.

  • Mutation changes the copy from wild type to A with probability u1 and from A to B with probability u2.

  • If the relative fitness of the proposed new individual is 1 − q after mutation (where q is the selection coefficient), then the replacement occurs with probability 1 − q. Otherwise, nothing happens.

For more on this model, see Section 3.4 of Ewens (2004).


Neutral case:

Returning to the problem, we first consider the case in which the fitness of the A mutant r = 1 and the population is of intermediate size as compared to the mutation rates: that is, one in which the population size and mutation rates satisfyMath(1)where for any numbers a and b, Math is read “a is much less than b” and means a/b is small.

Theorem 1. If Math and Math, the probability P(t) that a B mutation has occurred in some member of the population by time t isMath(2)whereis readapproximately.If B mutants become fixed with probability β, then the result for the fixation time is obtained by replacing u2 by βu2 in (2).

In words, the waiting time τB for the first B mutation is roughly exponential with mean Math while the waiting time TB for B to become fixed is roughly exponential with mean Math. Hence, if β < 1, this increases the waiting time by a factor of Math rather than the 1/β that one might naively expect. The last conclusion in the theorem should be intuitive since successful mutations (i.e., those that go to fixation) occur at rate βu2. In each of the next three theorems, the results for the fixation time can be obtained by replacing u2 by βu2 in the waiting-time result.

Sketch of proof. The mathematical proof of this result involves some technical complications, but the underlying ideas are simple. Here and in what follows, readers not interested in the underlying theory can skip the proof sketches. A simple calculation, see Section 2 of Iwasa et al. (2005), shows that the probability a type A mutant will give rise to a type B mutant before its family dies out is asymptotically Math . Since type A mutations arise at rate 2Nu1, the time σB until an A mutant arises that will have a descendant of type B is exponential with mean Math. The amount of time after σB it takes for the B mutant to appear, τB – σB, is of order Math. Since Math, the difference τB – σB is much smaller than σB and the result holds for τB as well.

To give some intuition about how the B mutation arises, we note that in the neutral case, r = 1, the number of copies of the A allele increases and decreases with equal probability, so if we ignore the transitions that do not change the number of mutant alleles, the result is an unbiased random walk. Since such a walk represents the winnings of a gambler playing a fair game, the probability that the number will rise to Math before hitting 0 is 1/k. When this occurs, the central limit theorem of probability theory implies that the number of steps required to return to 0 is of order k2 = 1/u2, since this requires the random walk to move by k, and by the central limit theorem this will take time of order k2. Since B mutants have probability u2, there is a reasonable chance of having a B mutation before the number of A mutants returns to 0. ▪

Iwasa et al. (2004) call this stochastic tunneling, since the second mutant (type B) arises before the first one (type A) fixes. Carter and Wagner (2002) had earlier noted this possibility but they did not end up with a very nice formula for the average fixation time; see their (2.2) and the formulas for the constants given in their Appendix. The assumption Math implies that throughout the scenario we have just described, the number of type A mutants is a small fraction of the population, so we can ignore the probability that the A mutants become fixed in the population. This means that in an intermediate-sized population (1) with r = 1, B mutations arise primarily through stochastic tunneling.

In contrast to populations of intermediate size, populations that are small (compared to the mutation rates) have fixation of the type A mutation before the type B mutation arises.

Theorem 2. If Math and Math, then the probability P(t) that a B mutation has occurred in some member of the population by time t isMath(3)

Sketch of proof. To explain this, we note that the waiting times between A mutations are exponential with mean 1/(2Nu1) and each one leads to fixation with probability 1/(2N), so the time we have to wait for the first A mutation that will go to fixation is exponential with mean 1/u1. The condition Math implies that it is unlikely for the B mutation to appear before A reaches fixation. The average time required for an A mutation to reach fixation conditional on fixation, which is 2N by a result of Kimura and Ohta (1969), and the average time required for the B mutation to appear after fixation of the A mutation, which is 1/(2Nu2), are each short compared to 1/u1 and can be ignored. ▪

When 2N and Math are about the same size, fixation of an A mutation and stochastic tunneling are both possible situations in which a type B mutation can arise, and the analysis becomes very complicated. Iwasa et al. (2005) obtained some partial results; see their Equation 15. Recent work of Durrett et al. (2008) addresses this borderline case and gives the following result.

Theorem 3. If Math and Math, then the probability P(t) that a B mutation has occurred in some member of the population by time t is P(t) ≈ 1 – exp(– α(γ)u1t), whereMath(4)Hence, the mean waiting time in this case is 1/(α(γ)u1).

To summarize the first three theorems, if Math, then the waiting time for the first B mutant to appear in the population, τB, is approximately exponential under the following conditions:

View this table:

Deleterious A mutants:

Suppose now that A mutants have fitness r < 1, and return to the case of intermediate population size defined by (1).

Theorem 4. Suppose Math and that Math, where ρ is a constant that measures the strength of selection against type A mutants. The probability P(t) that the B mutation has occurred in some member of the population by time t isMath(5)

The proof of this is somewhat involved so we refer the reader to Iwasa et al. (2005) for details. In words, the conclusion says that the waiting time in the nonneutral case is still exponential, but the mean has been multiplied by 1/R. Note that when ρ = 0, R = 1, which is the neutral case. Thus, A mutants are essentially neutral when ρ ≈ 0, which is true when Math. When ρ is large, Math [since (ρ + 2/ρ)2 = ρ2 + 4 + 4/ρ2] and we have 1/R ≈ ρ. Therefore, as ρ increases the waiting time increases.

Kimura (1985) considered compensatory mutations that are related to the situation studied in Theorem 4. His model has four genotypes AB, AB, AB′, and AB′, where A and B are wild-type alleles with corresponding mutant alleles A′ and B′. The single mutant genotypes AB and AB′ have fitness 1 – s while AB and AB′ have fitness 1. Assuming Math, the mutation rate, he used diffusion theory to conclude that the average time for the fixation of the double mutant was, see his (16) and (17) and take h = 0,Mathwhere S = 4Nes and V = 4Nev. Evaluating the expression above numerically, he concluded that the fixation time was surprisingly short. Note that his result covers a different range of parameters since Theorem 4 supposes Math. However, stochastic tunneling still occurs. Kimura shows that the frequency of single mutants remains small until the second mutation occurs.


The results in the previous section are theorems about the limit as N → ∞, and their proofs are based on arguing that various complications can be ignored, so we now turn to simulations to show that the approximations are good for even relatively small values of N. We use a standard algorithm, described in the next paragraph, to simulate the continuous-time Markov chain X(t) that counts the number of A mutants in the population at time t. Readers not interested in the details of our simulation algorithm can skip the next paragraph.

Let T0 = 0 and for m ≥ 1 let Tm be the time of the mth jump of X(t). If X(Tm) = 0, we let tm+1 be exponential with mean 1/(2Nu1) and set Tm+1 = Tm + tm+1 and X(Tm+1) = 1. If X(Tm) = k with 1 ≤ k < 2N, then we let tm+1 be exponential with mean 1/(pk + qk + rk), where pk is the rate of jumps to k + 1, qk is the rate of jumps to k – 1, and rk is the rate an A mutant replaces an A mutant, as defined in the following equations. Note that the second term in pk accounts for new A mutants that enter the population:MathWe set X(Tm+1) = k + 1 with probability pk/(pk + qk + rk), X(Tm+1) = k with probability rk/(pk + qk + rk), and X(Tm+1) = k – 1 with probability qk/(pk + qk + rk). In the first two cases there is a probability u2 of a B mutation. We stop the simulation the first time a B mutant appears or X(Tm) = 2N. If an A mutant goes to fixation, we add an exponential with mean 1/(2Nu2) to the final time to simulate waiting for the B mutation to appear.

Let n = 2N. Since our aim is to show that the theoretical predictions work well even for small values of n, we will, in most cases, consider the values n = 1000 and n = 10,000. Table 1 gives the seven simulation scenarios we study. Table 2 compares the predicted mean time from Theorem 1 with the average time found in 10,000 replications of each simulation. In making predictions for the examples, we consider that any numbers a and b satisfy Math if a/b ≤ 1/10. In case 3 our assumption Math (Equation 1) about intermediate population size holds, and we can see that the simulated mean is very close to the predicted mean. In cases 1 and 5, we replace the upper and lower inequalities in (1) by equality, respectively, so in each case one of the two assumptions is not valid. Cases 2 and 4 are intermediate, meaning that the upper and lower inequalities in (1), respectively, do not quite hold since the ratios are 1/4. Yet, cases 2 and 4 show good agreement with the predicted mean.

View this table:

Parameter values for our simulations in terms of n = 2N

View this table:

Comparison of mean waiting times

The last two cases are specific examples related to regulatory sequence evolution in Drosophila and humans, which we consider in more detail in the next section. The Drosophila effective population size is too large to use the true value in the simulations, but this is feasible for humans. In addition, we multiply u2 by Math for these special cases so that the ratio u1/u2 = 30. This comes from our assumption that a mutation at any position in the binding site will damage it, but to create a new binding site we require one position to mutate to the correct letter. Given a 10-letter target binding site, then u1 is 3 × 10 times bigger than u2.

In Figure 2, we plot the observed waiting time/predicted mean for case 3 and see a good fit to the exponential distribution, which agrees with our theoretical prediction. Figure 3 corresponds to case 1 and shows that the tail of the distribution looks exponential, but the simulated mean time is ∼1.5 times larger than the predicted mean. This is caused by the fact that since u1 = 1/(2N), the time τB – σB we have to wait for the B mutant to be produced is of the same order of magnitude as σB. Hence, the total waiting time τB is significantly larger than σB.

Figure 2.—

Waiting time for case 3 with n = 1000 and u1 = u2 = 0.0001. The assumptions for intermediate population size as compared to mutation rates (1) hold and, as predicted by Theorem 1, the waiting time is a good fit to the exponential distribution.

Figure 3.—

Waiting time for case 1 with n = 1000, u1 = 0.001, and u2 = 0.0001. The tail of the waiting-time distribution appears to be exponential, but the simulated mean is ∼1.5 times larger than predicted by Theorem 1. The thick curve corresponds to the waiting-time calculation done by Wodarz and Komarova (2005). This computation matches our Moran model simulation exceptionally well.

To explain the observed shape of the distribution, recall from the sketch of the proof of Theorem 1 that σB has exactly an exponential distribution. Adding the independent random variable τB – σB, which we assume has density g(s), yields the following distribution for τB,Mathsince the integrand is close to g(0) for all s ∈ [0, t], and consequently the exponential fit is not good for small t. Wodarz and Komarova (2005) have done an exact calculation of the waiting time in the branching process approximation of the Moran model, which applies to case 1. As Figure 3 shows, the computation matches the Moran model simulation exceptionally well.

Figure 4 corresponds to case 5 and yields a simulated mean of ∼78% of its predicted value. The curve looks exponential, but it has the incorrect mean. In this case, Theorem 3 shows that fixation of an A mutation and stochastic tunneling are both possible scenarios in which a B mutation can arise, producing a shorter waiting time. More specifically, the assumptions of Theorem 3 hold with γ = 1, which gives α = 1.433, and Figure 4 shows that the exponential distribution with this rate fits the simulated data reasonably well.

Figure 4.—

Waiting time for case 5 with n = 1000, u1 = 0.0001, and u2 = 0.000001. The simulated mean is only ∼78% of the mean predicted by Theorem 1; however, the waiting-time distribution still is approximately exponential. Theorem 3 holds with γ = 1 so α = 1.433, and the exponential with this rate gives a reasonable fit to the simulated data.


In the following two examples, we consider a 10-nucleotide binding site and suppose that transcription factor binding requires an exact match to its target. We assume that any mutation within the binding site will damage it (A mutation) and that at least one 10-nucleotide sequence exists within the regulatory region that can be promoted to a new binding site by one mutation (B mutation). Our previous results show, as MacArthur and Brookfield (2004) observed earlier, that the existence of these so-called “presites” is necessary for the evolution of new binding sites on a reasonable timescale (Durrett and Schmidt 2007).


We assume a per nucleotide mutation rate of 10−8 per generation, a simplification of the values that can be found in the classic article of Drake et al. (1998) and the recent direct measurements of Haag-Liautard et al. (2007). If transcription factor binding involves an exact match to a 10-nucleotide target, then inactivating mutations have probability u1 = 10−7 and those that create a new binding site from a 10-letter word that does not match the target in one position have probability u2 = Math × 10−8. If the target word is 6–9 nucleotides long or inexact matches are possible, then these numbers may change by a factor of 2 or 3. Such details are not very important here, since our aim is to identify the order of magnitude of the waiting time.

We set the effective population size N = 2.5 × 106, which agrees with the value given on p. 1612 of Thornton and Andolfatto (2006). To apply Theorem 1 we needMathThe ratio of the left number to the middle number ≈ 1/300, but the ratio of the middle number to the one on the right is 1/2, which says that Math is not a valid assumption. Ignoring this for a moment, Theorem 1 predicts a mean waiting time ofMathwhich translates into 3460 years if we assume 10 generations per year.

Since the assumption Math is not valid, we use our simulation result for case 6, which has a small population size with parameter values of 2Nu1 = 0.5 and Math similar to the Drosophila example, to see what sort of error we expect (see Tables 1 and 2). We see that in the simulation, the observed mean is ∼25% higher than the theoretical mean, so adding 25% to the prediction gives a mean waiting time of 4325 years.

A second and more important correction to our prediction is that Theorem 1 assumes that the A mutation is neutral and the B mutation is strongly advantageous. If we make the conservative assumption that the B mutation is neutral, then the fixation probability β = 1/2N = 2 × 10−7, and by Theorem 1 the waiting time increases by a factor of Math to ∼9 million years. If the B mutation is mildly advantageous, i.e., s – 1 = 10−4, then β ≈ 10−4 and the waiting time increases by a factor of only 100 to 400,000 years.

If we assume that A mutants have fitness r < 1 where Math, then Theorem 4 implies that the waiting time is not changed, but if Math, then the waiting time is increased by a factor Math if ρ is large. If we use the value of 1 – r = 10−4, the increase is roughly a factor of 2. From this we see that if both mutations are almost neutral (i.e., relative fitnesses r ≈ 1 – 10−4 and s ≈ 1 + 10−4), then the switch between two transcription factor binding sites can be done in <1 million years. This is consistent with the results for the even-skipped stripe 2 enhancer mentioned earlier.


We now show that two coordinated changes that turn off one regulatory sequence and turn on another without either mutant becoming fixed are unlikely to occur in the human population. We assume a mutation rate of 10−8, again see Drake et al. (1998), and an effective population size of N = 104 because this makes the nucleotide diversity 4Neμ close to the observed value of 0.1%. If we again assume that transcription factor binding involves an exact match to a 10-nucleotide target, then inactivating mutations have probability u1 = 10−7, and those that create a new binding site from a 10-letter word that does not match the target in one position have probability u2 = 3.3 × 10−9. For the assumptions of Theorem 1 to be valid we needMathThe ratio of the middle number to the one on the right is 1/500, but the ratio of the left number to the middle one ≈ 1.

Ignoring for the moment that one of the assumptions is not satisfied, Theorem 1 predicts a mean waiting time ofMathMultiplying by 25 years per generation gives 216 million years.

As shown in Tables 1 and 2, we have simulation results for humans using the exact parameters above. In 10,000 replications, the simulation mean is 6.46 million generations, which is only ∼75% of the predicted value. Multiplying by 0.75 reduces the mean waiting time to 162 million years, still a very long time. Our previous work has shown that, in humans, a new transcription factor binding site can be created by a single mutation in an average of 60,000 years, but, as our new results show, a coordinated pair of mutations that first inactivates a binding site and then creates a new one is very unlikely to occur on a reasonable timescale.

To be precise, the last argument shows that it takes a long time to wait for two prespecified mutations with the indicated probabilities. The probability of a seven of eight match to a specified eight-letter word is 8(3/4)(1/4)7 ≈ 3.7 × 10−4, so in a 1-kb stretch of DNA there is likely to be only one such match. However, Lynch (2007, see p. 805) notes that transcription factor binding sites can be found within a larger regulatory region (104 – 106 bp) in humans. If one can search for the new target sequence in 104 – 106 bp, then there are many more chances. Indeed since (1/4)8 ≈ 1.6 × 10−5, then in 106 bp we expect to find 16 copies of the eight-letter word.

The edge of evolution?

Our final example of waiting for two mutations concerns the emergence of chloroquine resistance in P. falciparum. Genetic studies have shown, see Wooton et al. (2002), that this is due to changes in a protein PfCRT and that in the mutant strains two amino acid changes are almost always present—one switch at position 76 and another at position 220. This example plays a key role in the chapter titled “The mathematical limits of Darwinism” in Michael Behe's book, The Edge of Evolution (Behe 2007).

Arguing that (i) there are 1 trillion parasitic cells in an infected person, (ii) there are 1 billion infected persons on the planet, and (ii) chloroquine resistance has arisen only 10 times in the past 50 years, he concludes that the odds of one parasite developing resistance to chloroquine, an event he calls a chloroquine complexity cluster (CCC), are ∼1 in 1020. Ignoring the fact that humans and P. falciparum have different mutation rates, he then concludes that “On the average, for humans to achieve a mutation like this by chance, we would have to wait a hundred million times ten million years” (Behe 2007, p. 61), which is 5 million times larger than the calculation we have just given.

Indeed his error is much worse. To further sensationalize his conclusion, he argues that “There are 5000 species of modern mammals. If each species had an average of a million members, and if a new generation appeared each year, and if this went on for two hundred million years, the likelihood of a single CCC appearing in the whole bunch over that entire time would only be about 1 in 100” (Behe 2007, p. 61). Taking 2N = 106 and μ1 = μ2 = 10−9, Theorem 1 predicts a waiting time of 31.6 million generations for one prespecified pair of mutations in one species, with Math having reduced the answer by a factor of 31,600.

We are certainly not the first to have criticized Behe's work. Lynch (2005) has written a rebuttal to Behe and Snoke (2004), which is widely cited by proponents of intelligent design (see the Wikipedia entry on Michael Behe). Behe and Snoke (2004) consider evolutionary steps that require changes in two amino acids and argue that to become fixed in 108 generations would require a population size of 109. One obvious problem with their analysis is that they do their calculations for N = 1 individual, ignoring the population genetic effects that produce the factor of Math. Lynch (2005) also raises other objections.


For population sizes and mutation rates appropriate for Drosophila, a pair of mutations can switch off one transcription factor binding site and activate another on a timescale of several million years, even when we make the conservative assumption that the second mutation is neutral. This theoretical result is consistent with the observation of rapid turnover of transcription factor binding sites in Drosophila and gives some insight into how these changes might have happened. Our results show that when two mutations with rates u1 and u2 have occurred andMaththen the first one will not have gone to fixation before the second mutation occurs, and indeed A mutants will never be more than a small fraction of the overall population. In this scenario, the A mutants with fitness r are significantly deleterious if Math is large, a much less stringent condition than the usual condition that 2N(1 – r) is large. Also, the success probability of the B mutant is dictated by its fitness relative to the wild type rather than relative to the A mutant. This follows because the fraction of A mutants in the population is small when the B mutant arises, and hence most individuals are wild type at that time.

The very simple assumptions we have made about the nature of transcription factor binding and mutation processes are not crucial to our conclusions. Our results can be applied to more accurate models of binding site structure and mutation processes whenever one can estimate the probabilities u1 and u2. However, the assumption of a homogeneously mixing population of constant size is very important for our analysis. One obvious problem is that Drosophila populations undergo large seasonal fluctuations, providing more opportunities for mutation when the population size is large and a greater probability of fixation of an A mutation during the recurring bottlenecks. Thus, it is not clear that one can reduce to a constant size population or that the effective population size computed from the nucleotide diversity is the correct number to use for the constant population size. A second problem is that in a subdivided population, A mutants may become fixed in one subpopulation, giving more opportunities for the production of B mutants or perhaps leading to a speciation event. It is difficult to analyze these situations mathematically, but it seems that each of them would increase the rate at which changes occur. In any case one would need to find a mechanism that changes the answer by a significant factor to alter our qualitative conclusions.


We thank Eric Siggia for introducing us to these problems and for many helpful discussions and Jason Schweinsberg for his collaboration on related theoretical results. We are also grateful to Nadia Singh, Jeff Jensen, Yoav Gilad, and Sean Carroll who commented on previous drafts and to two referees (one anonymous and Michael Lynch) who helped to improve the presentation. Both authors were partially supported by a National Science Foundation/National Institutes of General Medical Sciences grant (DMS 0201037). R.D. is also partially supported by a grant from the probability program (DMS 0202935) at the National Science Foundation. D.S. was partially supported by a National Science Foundation graduate fellowship at Cornell and after graduation was a postdoc at the Institute for Mathematics and its Applications in Minnesota, 2007–2008.


  • Communicating editor: J. B. Walsh

  • Received September 30, 2007.
  • Accepted August 19, 2008.


View Abstract