Models of Experimental Evolution: The Role of Genetic Chance and Selective Necessity
Lindi M. Wahl and David C. Krakauer
Institute for Advanced Study, Princeton, New Jersey 08540
Corresponding author: Lindi M. Wahl, Department of Applied Math, University of Western Ontario, London, Ontario N6A 5B7, Canada. E-mail: lmw{at}scratchy.dhis.org
Communicating editor: M. W. FELDMAN
| ABSTRACT |
|---|
We present a theoretical framework within which to analyze the results of experimental evolution. Rapidly evolving organisms such as viruses, bacteria, and protozoa can be induced to adapt to laboratory conditions on very short human time scales. Artificial adaptive radiation is characterized by a list of common observations; we offer a framework in which many of these repeated questions and patterns can be characterized analytically. We allow for stochasticity by including rare mutations and bottleneck effects, demonstrating how these increase variability in the evolutionary trajectory. When the product Np, the population size times the per locus error rate, is small, the rate of evolution is limited by the chance occurrence of beneficial mutations; when Np is large and selective pressure is strong, the rate-limiting step is the waiting time while existing beneficial mutations sweep through the population. We derive the rate of divergence (substitution rate) and rate of fitness increase for the case when Np is large and illustrate our approach with an application to an experimental data set. A minimal assumption of independent additive fitness contributions provides a good fit to the experimental evolution of the bacteriophage
φX174.
HISTORY has been termed the "realm of contingencies" […]
φX174 to growth at high temperatures in two different virus hosts (E. coli and Salmonella typhimurium) within a chemostat. Starting with an "ancestral" population at 38°, the phage were propagated over several generations through a graded increase of 1° daily, reaching a maximum of 43.5°. Fitness of isolates sampled from the chemostat was estimated as the $\log_2$ increase in phage concentration per hour, that is, the number of phage doublings per hour. Sequencing for each of the replicates was conducted every 10 days. An additional four lines were grown from two isolates sampled from the primary lineages in each of the two hosts. It was observed that the average fitness increase was greatest for the early substitutions in the new environment. The substitution rate remained approximately constant across time. Many of the same substitutions were observed in independent lineages within the same host. When reintroduced into the ancestral host, some lineages were able to reverse the changes, thereby readapting to the original host. Thus extensive convergent and parallel evolution was observed. This pattern of substitution made it impossible to correctly infer the phylogeny of lineages.
φX174 (natural host, E. coli) to growth at high temperatures in a novel host, S. typhimurium. Chemostats were sampled every 24 hr and complete genomes sequenced. Fitness was measured as doublings of phage concentrations per hour at 43°. In one replicate (ID) there was a 4000-fold increase in the number of descendants per hour, while in replicate TX, there was an 18,000-fold increase. High degrees of both parallel evolution and adaptive substitution were observed, but the order of substitutions differed between the replicates. The ID lineage lost its ability to replicate in E. coli whereas TX retained this ability.
From this set of experiments and others on RNA viruses […]
In the following sections we present a theoretical framework in which many of the questions and patterns described above can be investigated. We first describe the general model and the equilibrium solution of the model, noting that a spectrum of genetic neighbors is maintained by the dominant genotype at any time. We complete the model description by including stochastic effects, both rare mutations and bottlenecks, demonstrating how these increase variability in the evolutionary trajectory.
Using this model, we find that the behavior of evolutionary systems depends critically on the value of the product $Np$, the population size times the per locus error rate. When $Np$ is small, the rate of evolution is limited by the chance occurrence of beneficial mutations. In contrast, when $Np$ is large and selection pressures are strong, evolution may proceed rapidly through the best possible point mutations, and the rate-limiting step is the time necessary for each beneficial mutation to sweep through the population. These parameter regimes are akin to the two distinct "dynamics of divergence" investigated by […] φX174.
| THE QUASI-SPECIES MODEL |
|---|
We consider a population of $N$ individuals; the genome of each individual is specified by a sequence of length $L$. Each locus or "bit" of this sequence has one of $A$ discrete values. $A$ is thus the number of alleles per locus, such that the total number of unique genome sequences, or genotypes, is given by $A^L$. We use the term "alleles" here loosely, often setting $A = 2$ in our simulations for simplicity; more realistic values of $A$ might be 4 (nucleotides) or 20 (amino acids).
Each locus of the genome is faithfully reproduced in the next generation with fidelity q, and therefore the probability that an individual of genotype i produces an offspring of genotype j is given by
$$Q_{ij} = q^{L - h_{ij}} \left(\frac{1 - q}{A - 1}\right)^{h_{ij}}, \tag{1}$$
where $L$ is the length of the genome, $A$ is the number of alleles per locus, and $h_{ij}$ is the Hamming distance between sequences $i$ and $j$ (the total number of bits that differ between the two sequences). Note that $h_{ij} = h_{ji}$ and therefore $Q_{ij} = Q_{ji}$; forward and back mutation rates are equivalent. The use of a constant factor for the fidelity, $q$, assumes that all mutations are equally "accessible" to the genome.
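The mutation probabilities can be tabulated directly for a small genome. The sketch below is only an illustration: it assumes that an erroneous copy is equally likely to carry any of the $A - 1$ other alleles, so that a specific neighbor at Hamming distance $h$ is produced with probability $q^{L-h}[(1-q)/(A-1)]^{h}$. The function names and the parameter values (3 loci, 4 alleles, $q = 0.99$) are hypothetical choices, not taken from the article.

```python
from itertools import product


def hamming(a, b):
    """Number of loci at which two equal-length genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def mutation_matrix(length, alleles, q):
    """Return (genotypes, Q) where Q[i][j] is the chance that i yields offspring j."""
    genotypes = list(product(range(alleles), repeat=length))
    per_allele = (1.0 - q) / (alleles - 1)  # chance of one specific wrong allele
    Q = []
    for gi in genotypes:
        row = []
        for gj in genotypes:
            h = hamming(gi, gj)
            row.append(q ** (length - h) * per_allele ** h)
        Q.append(row)
    return genotypes, Q


# Illustrative values only: 3 loci, 4 alleles per locus, fidelity 0.99.
genotypes, Q = mutation_matrix(length=3, alleles=4, q=0.99)
row_sums = [sum(row) for row in Q]
```

Each row of `Q` sums to one (every parent produces some offspring genotype), and the matrix is symmetric because the Hamming distance is.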
We assign a unique fitness, $\beta_i$, to the $i$th genotype. For example, we may consider the case when sequence $i$ is the "wild-type" genome, and all mutations are deleterious: $\beta_j < \beta_i$ for all $j \neq i$. Or we may allow that a few rare mutations increase the fitness of the wild type, either independently or with some sort of epistasis. In the sections that follow, we illustrate several possible models for the distribution of the $\beta_i$ values.
Letting $y_i$ denote the frequency of the $i$th genotype in the population, we have a standard quasi-species model:
$$\dot{y}_i = \sum_j \beta_j Q_{ji}\, y_j - \bar{\beta}\, y_i. \tag{2}$$
Here $\bar{\beta}$ is the sum over all $k$ of $\beta_k y_k$ and is simply a normalization term, ensuring that the sum of all the frequencies in the population is equal to one. We note that if the unit of time is generations, $\beta_i$ corresponds to the number of offspring per individual of genotype $i$ per generation, and $\dot{y}_i$ is the change in the frequency of genotype $i$ per generation.
To integrate Equation 2 deterministically, we set $y_i = 0$ whenever the frequency of genotype $i$ falls below one individual in the population, or $1/N$. Likewise, if the frequency $y_i$ is equal to zero at any time, $\dot{y}_i$ must exceed $1/N$ before the new genotype will be added to the population. This provides an approximation of the underlying stochasticity, which we treat more formally in a later section.
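The thresholded integration can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the rate $\dot{y}_i = \sum_j \beta_j Q_{ji} y_j - \bar{\beta} y_i$, a forward Euler step of one generation, and a renormalization after truncation. The two-locus genome, the fitness values, and the function name `euler_step` are hypothetical.

```python
def hamming(a, b):
    """Number of loci at which two equal-length genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def euler_step(y, beta, Q, N, dt=1.0):
    """Advance frequencies by dt generations, applying the 1/N cutoff."""
    n = len(y)
    mean_beta = sum(b * f for b, f in zip(beta, y))
    new_y = []
    for i in range(n):
        ydot = sum(beta[j] * Q[j][i] * y[j] for j in range(n)) - mean_beta * y[i]
        if y[i] == 0.0:
            # An absent genotype enters only if its rate exceeds one individual.
            new_y.append(ydot * dt if ydot > 1.0 / N else 0.0)
        else:
            f = y[i] + ydot * dt
            # A genotype below one individual is removed from the population.
            new_y.append(f if f >= 1.0 / N else 0.0)
    total = sum(new_y)  # renormalization is an assumption of this sketch
    return [f / total for f in new_y]


# Hypothetical two-locus, two-allele example.
genotypes = ["00", "01", "10", "11"]
q = 0.9999
Q = [[q ** (2 - hamming(a, b)) * (1 - q) ** hamming(a, b) for b in genotypes]
     for a in genotypes]
beta = [1.0, 1.1, 0.9, 1.2]
N = 1e7

y = [1.0, 0.0, 0.0, 0.0]
first = euler_step(y, beta, Q, N)
for _ in range(1000):
    y = euler_step(y, beta, Q, N)
```

After the first step both one-mutant neighbors are present while the double mutant is still absent, since its production rate of about $10^{-8}$ is below $1/N = 10^{-7}$; it enters only once a single mutant has become common enough.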
Equilibrium solution:
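We can also rewrite Equation 2 in matrix form: at equilibrium, $Q\boldsymbol{\beta}\mathbf{y} = \bar{\boldsymbol{\beta}}\mathbf{y}$, where $\bar{\boldsymbol{\beta}}$ and $\boldsymbol{\beta}$ are diagonal matrices of $\bar{\beta}$ and the $\beta_i$, respectively, $\mathbf{y}$ is a vector of the $y_i$, and $Q$ is the matrix with $Q_{ij}$ in the $i$th row and $j$th column. The equilibrium of the system will then be given by the eigenvector corresponding to the largest eigenvalue of $Q\boldsymbol{\beta}$. (Because $Q$ is symmetric, $\boldsymbol{\beta}Q$ has the same eigenvalues.)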
If the $\beta_i$ are distinct, a unique stable equilibrium will exist, in which the genotype with the highest $\beta_i$ dominates the system; a spectrum of other genotypes that are genetic neighbors of the dominant genotype will be maintained at low frequencies by the mutation-selection balance. This equilibrium state may not, of course, be accessible from all starting points, since some of the "steps" on the evolutionary trajectory may involve genotypes with low fitness or initial frequencies that would be much less than $1/N$. This situation can be better modeled by allowing rare mutations to occur probabilistically, as described in the following section of the article. Finally, we note that even when the system is not in equilibrium, any high-frequency genotype will continually create a suite of low-frequency genetic neighbors. This feature becomes important in later sections.
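The equilibrium spectrum can be approximated numerically by power iteration. The sketch below is only an illustration under stated assumptions: it iterates the matrix with entries $Q_{ji}\beta_j$ (mutation applied to offspring of parent $j$), renormalizing at each step, and uses a hypothetical two-locus genome with made-up fitness values.

```python
def hamming(a, b):
    """Number of loci at which two equal-length genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def equilibrium(beta, Q, iterations=5000):
    """Power iteration for the dominant eigenvector of the replication matrix."""
    n = len(beta)
    y = [1.0 / n] * n
    for _ in range(iterations):
        y = [sum(Q[j][i] * beta[j] * y[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        y = [f / total for f in y]
    # With frequencies summing to one, mean fitness equals the dominant eigenvalue.
    mean_fitness = sum(b * f for b, f in zip(beta, y))
    return y, mean_fitness


# Hypothetical two-locus, two-allele example.
genotypes = ["00", "01", "10", "11"]
q = 0.9999
Q = [[q ** (2 - hamming(a, b)) * (1 - q) ** hamming(a, b) for b in genotypes]
     for a in genotypes]
beta = [1.0, 1.1, 0.9, 1.2]

y_eq, mean_fitness = equilibrium(beta, Q)
```

In this example the fittest genotype dominates, while its neighbors persist at low but nonzero frequencies, which is the mutation-selection balance described above.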
| ASSIGNING FITNESS |
|---|
Independent fitness contributions:
Before we can analyze the behavior of this system, we must make some assumption about the fitness of each genotype, $\beta_i$. One way to determine the fitness of each genotype is to assume that the fitness contribution of each locus is independent; there is no epistasis in the system. In this case we write a general function for the fitness contribution of the $k$th locus, $b_k(a_{ik})$, where $a_{ik}$ gives the allele at locus $k$ in genotype $i$ (e.g., for a two-allele system, $a_{ik}$ is either zero or one). The overall fitness of genotype $i$ can then be written as
$$\beta_i = \sum_{k=1}^{L} b_k(a_{ik}), \tag{3}$$
where $L$ is the length of the genome. Thus we assume that the fitness contributions of the alleles in the genome are independent and additive.
As an example, consider an eight-bit genome with two possible alleles per locus. This gives $2^8 = 256$ possible genotypes. For simplicity we write the wild-type genome as a sequence of eight zeros, 00000000, and assume that the fitness contribution of each "zero" allele is one-eighth ($b_k(0) = 1/8$ for all $k$), such that the wild-type genome has an overall fitness $\beta_0 = 1$.
Fig 1 illustrates one possible time course of the evolution of such a system. Here we have chosen the fitness contribution of each mutant "one" allele from a random distribution, but have arbitrarily imposed that mutations at certain loci will reduce the fitness contribution of that locus, while mutations at other loci will slightly increase the fitness contribution. In particular, we set $b_k(1) < 1/8$ when $k \leq 5$ and $b_k(1) > 1/8$ when $k \geq 6$. Thus mutations in the first five bits of the genome are deleterious, while a mutation in bits 6, 7, or 8 gives some (randomly determined) increase in fitness. We start with a uniform population of the wild-type genome at time $t = 0$ and assume a generation time of 20 min and population size of $10^7$.
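The additive fitness assignment for this eight-bit example can be sketched in a few lines. The random ranges, the seed, and the names used here are assumptions made for illustration; the article specifies only that mutant contributions are below 1/8 at loci 1 to 5 and above 1/8 at loci 6 to 8.

```python
import random
from itertools import product


def additive_fitness(genotype, contributions):
    """Sum of per-locus contributions b_k(a_ik) for one genotype."""
    return sum(contributions[k][allele] for k, allele in enumerate(genotype))


rng = random.Random(0)  # fixed seed, chosen only for reproducibility
contributions = []      # contributions[k] = (b_k(0), b_k(1))
for k in range(1, 9):
    wild = 1.0 / 8.0
    if k <= 5:
        mutant = wild * rng.uniform(0.50, 0.95)  # deleterious (assumed range)
    else:
        mutant = wild * rng.uniform(1.05, 1.50)  # beneficial (assumed range)
    contributions.append((wild, mutant))

wild_type = (0,) * 8
fitness = {g: additive_fitness(g, contributions)
           for g in product((0, 1), repeat=8)}
fittest = max(fitness, key=fitness.get)
```

With independent additive contributions, the fittest of the 256 genotypes is necessarily the one carrying exactly the three beneficial mutations, 00000111.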
The top of Fig 1 shows the evolution of the resulting system. We see that the population frequency of the wild type (dashed line) drops as the frequency of the genotype with a mutation in the eighth locus increases. This genotype is gradually overtaken by the genotype with mutations in the sixth and eighth loci, which in turn is overtaken by the genotype with the highest fitness (loci 6, 7, and 8 are mutated).
The middle of Fig 1 shows the total frequency in the population of mutations at loci 6, 7, and 8. These lines show the probability at each time that a randomly chosen member of the population would have a mutation at the specified locus. The bottom shows the mean fitness of the population, that is, the fitness of each genotype multiplied by its frequency and summed for all genotypes.
It is interesting to note that the variant with the highest fitness, 00000111, is not the first to sweep the population. This is because the probability of creating a three-mutant neighbor of the wild type is low; in this example we have used $q = 0.9999$, and therefore $Q_{ij}$ for a Hamming distance of three is $\approx 1$ in $10^{12}$. For the population size of $10^7$, three-mutant neighbors of the wild type do not exist at the beginning of the simulation. In contrast, all the one-mutant neighbors of the wild type exist after one generation of replication. The fittest one-mutant neighbor is therefore the first to sweep through the population; when its frequency is sufficiently high the two-mutant genotype appears and outcompetes it, followed by the three-mutant genotype.
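The arithmetic behind this accessibility argument can be checked directly. The short sketch below assumes two alleles per locus, so that a specific neighbor at Hamming distance $h$ arises with probability $q^{8-h}(1-q)^h$; the variable names are illustrative.

```python
q = 0.9999      # per-locus fidelity used in the example
p = 1.0 - q     # per-locus error rate
length = 8      # eight-bit genome
N = 1e7         # population size


def neighbor_probability(h):
    """Chance that a wild-type parent yields one specific h-mutant offspring."""
    return q ** (length - h) * p ** h


# Expected number of copies of a specific neighbor made in one generation
# by a population consisting entirely of the wild type.
expected_copies = {h: N * neighbor_probability(h) for h in (1, 2, 3)}
```

Roughly a thousand copies of each one-mutant neighbor are expected per generation, whereas a specific three-mutant neighbor is expected about once in $10^5$ generations.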
We also note a characteristic feature of the population fitness function; as illustrated in the bottom of Fig 1, we found that mutations that cause the largest boosts in fitness are the first to sweep the population. This is true whenever (i) fitness contributions are independent and additive and (ii) every one-mutant neighbor of the wild type is likely to exist. This is a result that was previously observed by […]
Epistasis:
In the simulated examples that follow, we often assign fitness to genotypes using independent additive fitness contributions as described above. Obviously, the interaction of genetic loci must be more complicated than this simplified model, and complete independence is unlikely; we consider some possible forms of epistasis in the DISCUSSION. We emphasize that the analytical work presented in this article (deriving the rates of substitution and fitness increase) does not necessarily assume the absence of epistasis in the genome.
| STOCHASTIC EFFECTS |
|---|
The model described above is completely deterministic. For a given fitness landscape, defined by the $\beta_i$, and a given starting genome, precisely the same evolutionary trajectory would be followed in every trial. To model experimental evolution with greater fidelity, we add two stochastic features to the system: we allow rare mutations to be generated with some small probability, and we use stochastic sampling to model potential "bottlenecks" in the evolutionary process. For example, one such bottleneck may occur during reinoculation of a chemostat, when tubes are changed.
Rare mutations:
To allow for the possibility of rare mutations, we relax the condition for the emergence of a new genotype, namely that the change in the frequency, $\dot{y}_i$, must exceed $1/N$ before the new genotype is added to the population. Instead, we consider the probability that a new genotype is generated within one generation time.
The product $N\dot{y}_i$ gives the expected value of the number of type $i$ individuals generated in one generation. These individuals might be generated by a large number of independent processes, that is, by mutation from any of the individuals currently replicating in the population. Thus we can imagine some underlying distribution that gives the probability that zero, one, two, or more type $i$ individuals are generated; the mean value of this random variable is $N\dot{y}_i$.
When $N\dot{y}_i > 1$, genotype $i$ should clearly be added to the simulation. When $N\dot{y}_i < 1$, we should in principle generate a random number distributed as the random variable described above, and add 0, 1, 2 (etc.) type $i$ individuals to the simulation accordingly (an exponential distribution with mean $N\dot{y}_i$ would be an appropriate model, for example). To approximate this process, we instead draw a random number from a uniform distribution and add a single type $i$ individual if $N\dot{y}_i$ exceeds the value of the random number. Thus we use a random number that is either 0 or 1 to approximate the more complicated distribution described above; we add either 0 or 1 individual(s) in such frequencies that the same expected value is achieved. This approximation would not be appropriate if, for example, our dynamical system were subject to invasion barriers (minimum frequencies at which new genotypes can invade).
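The zero-or-one approximation described above amounts to a single Bernoulli draw per candidate genotype per generation. A minimal sketch, in which the function name, the seed, and the test value of 0.3 are hypothetical:

```python
import random


def add_rare_genotype(expected_count, rng):
    """Decide whether to add one individual, given expected_count = N * ydot_i."""
    if expected_count >= 1.0:
        return True  # at least one individual expected: always add
    # Uniform draw on [0, 1): add one individual with probability expected_count.
    return expected_count > rng.random()


rng = random.Random(1)  # fixed seed, for reproducibility only
expected_count = 0.3
trials = 100_000
added = sum(add_rare_genotype(expected_count, rng) for _ in range(trials))
observed_mean = added / trials
```

Averaged over many generations, the number of individuals added matches the expected value, which is the property the approximation is designed to preserve.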
Fig 2 shows the effect of including a stochastic description of rare mutations in the model. In this example, a mutation at either locus 1 or 2 is deleterious, while mutations in locus 3 or 4, or in both loci 1 and 2 together are advantageous. The top shows the results of a strictly deterministic simulation: the simultaneous mutation of loci 1 and 2 is improbable and therefore does not occur. For the simulation shown in the middle, rare mutations occur probabilistically, and eventually the evolving genome finds the "peak" in the fitness landscape.
Stochastic sampling:
In experimental models of evolution, it is often necessary for chemostat tubes, media, etc. to be changed at regular intervals. In this case, a sample of the evolving population may be transferred to reinoculate a new system. This sampling may introduce stochastic effects in the evolving system, because genotypes that are rare in the population may be lost during the transfer. Similarly, oversampling of rare genotypes may considerably boost their frequencies after the transfer.
To model such effects, we halt the integration of Equation 2 at regular sampling intervals, producing a simulated population of $N$ individuals with the appropriate genotypic frequencies. We then choose individuals randomly from this population to produce a new starting population (the inoculant) of size $fN$, for $f \in [0, 1]$. The frequencies $y_i$ in the inoculant can vary markedly from the presample population frequencies, especially when $f$ or $y_i$ is small.
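A bottleneck of this kind can be sketched as a random draw of $fN$ individuals from the presample frequencies. The example below samples with replacement, which is an assumption of this sketch; it is a close approximation to drawing from a finite population only when $f$ is small. The frequencies, seed, and function name are illustrative.

```python
import random
from collections import Counter


def bottleneck(y, sample_size, rng):
    """Return genotype frequencies in an inoculant of sample_size individuals."""
    drawn = rng.choices(range(len(y)), weights=y, k=sample_size)
    counts = Counter(drawn)
    return [counts[i] / sample_size for i in range(len(y))]


rng = random.Random(2)  # fixed seed, for reproducibility only
before = [0.989, 0.0, 0.01, 0.001]  # hypothetical presample frequencies
after = bottleneck(before, sample_size=1000, rng=rng)
```

A genotype at frequency 0.001 is expected to contribute a single individual to an inoculant of 1000, so it may be lost entirely or doubled in frequency by chance.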
Fig 3 shows three trials of simulated evolution for a six-locus genome. The starting conditions (100% wild type, 000000) and the fitness landscape (mutations in bits three to six confer a fitness advantage) were identical for each simulation. The population of $10^7$ individuals was resampled once per day, producing a new starting population of size $10^3$. Although the genome with the highest fitness eventually emerges in each of these trials, the evolutionary trajectory from the wild type to the fittest variant is remarkably different. To construct this example we have used an extremely short genome and a severe sampling ratio, 1 in $10^4$. The illustrated effect, however, will be relevant at much gentler sampling ratios, especially if the genome is long and the fitness function complex.
| DIVERGENCE AND FITNESS |
|---|
At what rate does evolution occur in these experimental systems? To answer this question, we consider the rate of divergence, that is, the rate at which the Hamming distance between the original genome and the consensus genome in the population increases. Given the distribution of fitness coefficients, we can also transform this rate of divergence, or substitution rate, into an expected rate of fitness increase.
For a population size $N$ and an error rate (per locus per replication) given by $p = 1 - q$, the probability that a one-mutant neighbor of the dominant genome is produced in the next generation is $Np(1 - p)^{L-1} \approx Np$, where $L$ is the length of the genome. We hypothesize that the divergence behavior of the system depends critically on whether $Np$ is greater or less than one.
As an example, consider the experimental evolution of E. coli […] φX174 […] $\approx 3.3 \times 10^7$, and the error rate per base pair is $\approx 6 \times 10^{-10}$. This gives $Np = 0.02$; each position along the genome has a 2% chance of being replicated erroneously in each generation. Since there are three possible errors at each locus, each specific one-mutant neighbor of the dominant genotype has <1% chance of being created in each generation. Two-mutant neighbors would be rare ($Np^2(1 - p)^{L-2}/9 \approx Np^2/9 \approx 10^{-12}$).
Examining the substitution rate for systems such as this, where $Np < 1$, […]
For the bacteriophage experiments, however, the population size is similar ($10^7$), but the mutation rate is much higher (of the order of $10^{-6}$ or $10^{-7}$). Thus $Np \geq 1$; there are perhaps one to three copies of every possible one-mutant neighbor of the dominant genome produced in every generation. (Once again, the probability that a specific two-mutant neighbor is produced is low, about $10^{-6}$ or $10^{-8}$ per generation. Since there are >100 million different two-mutant neighbors, however, we find that 10 to 100 new two-mutant neighbors might be produced on average per generation.) In this situation, all of the one-mutant neighbors are produced almost simultaneously and are immediately in competition with one another. This differs from the situation considered by […]
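The two regimes can be compared numerically using the population sizes and error rates quoted above. This sketch assumes three alternative nucleotides per site, so that a specific one-mutant neighbor is produced at rate $Np/3$ and a specific two-mutant neighbor at rate $Np^2/9$; the function and key names are illustrative.

```python
def production_rates(N, p, alternatives=3):
    """Expected numbers produced per generation for population N, error rate p."""
    return {
        "Np": N * p,
        "specific_one_mutant": N * p / alternatives,
        "specific_two_mutant": N * p ** 2 / alternatives ** 2,
    }


e_coli = production_rates(N=3.3e7, p=6e-10)
phage_low = production_rates(N=1e7, p=1e-7)
phage_high = production_rates(N=1e7, p=1e-6)
```

For the E. coli values, $Np$ is about 0.02 and each specific one-mutant neighbor is expected less than once per hundred generations; for the phage values, $Np$ lies between 1 and 10.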
When $Np > 1$, many copies of each mutant are produced in each generation, and there is no real risk of a particular beneficial mutation being eliminated by stochastic drift. Therefore, if any of the one-mutant neighbors of the original genome have a higher fitness than the original, they will immediately grow toward fixation. The stronger the selective pressure, the more likely it is that at least one of the many genetic neighbors of the wild type will carry some small fitness advantage. The rate at which this mutation will then sweep through the population depends on the fitness difference between the fittest one-mutant neighbor (genotype $j$) and the founding genotype (genotype $i$), irrespective of the fitness coefficients of the other, less fit, competing mutants. [We see this from Equation 2, where to first order we have $\dot{y}_j = (1 - q)\beta_i y_i + \beta_j y_j - \beta_i y_j$; a fuller derivation appears in the following section.]
When a handful of copies of genotype $j$ first appear, the probability of producing a new mutant during the replication of these individuals is extremely low, equivalent to the probability of producing new two-mutant neighbors of type $i$. As the number of $j$ individuals increases, however, the probability of producing a one-mutant neighbor of genotype $j$ also increases exponentially. At some point before genotype $j$ reaches fixation (in fact, when the frequency of $j$ individuals exceeds $1/(Np)$), it is more or less certain that all of the one-mutant neighbors of $j$ have been and are continually being produced. If any of these are fitter than $j$, they will compete with both $j$ and the remaining population of $i$ and will grow toward fixation.
At this point, we note another critical difference between the Np < 1 and Np > 1 cases. When Np > 1, the new mutants that occur while j is growing toward fixation share the mutation that makes j different from the original sequence. That is, they are more likely (by perhaps three orders of magnitude) to be one-mutant neighbors of j, as opposed to two-mutant neighbors of i, which are unrelated to j. Although it is quite possible that genotype j never reaches fixation because of competition from some fitter variant k, k will carry the mutation that characterizes j to fixation.
Finally, we note that experimentally determined substitution frequencies, deduced from a consensus genotype, follow the fixation time course of individual mutations and not individual genotypes.
The increase of a favorable mutation when rare:
We are interested in the rate at which favorable mutations are able to sweep through the population, as illustrated in Fig 1. To approximate this rate, we consider the simple case of an asexual population that is dominated by a single genotype, i, and the emergence of a favorable mutation at a single locus. This mutation changes genotype i to genotype j and increases the relative fitness of the genome by an amount sj. We assume that genotypes i and j are the only genotypes with significantly large frequencies in the population; they are the major players in the population dynamics at this time.
Following […], we take $Q_{ij} = Q_{ji} \approx (1 - q)$ and $Q_{ii} = Q_{jj} \approx 1$. Each type will then replicate as described by Equation 2,
$$y_i[n+1] \propto \beta_i y_i[n] + (1 - q)\beta_j y_j[n], \qquad y_j[n+1] \propto \beta_j y_j[n] + (1 - q)\beta_i y_i[n], \tag{4}$$
under the condition $y_i[n + 1] + y_j[n + 1] = 1$.
Since the constants of proportionality are the same, we can solve for $y_j[n + 1]$. We can then derive the change in the frequency of the favorable allele as $\Delta y_j = y_j[n + 1] - y_j[n]$. Letting $p$ denote the error rate, $1 - q$, and writing $\beta_j = (1 + s_j)\beta_i$, we find
$$\Delta y_j = \frac{s_j y_j(1 - y_j) + p\,(1 - 2y_j - s_j y_j^2)}{(1 + p)(1 + s_j y_j)}. \tag{5}$$
Finally, if the change in $y_j$ is small in each generation, we can replace this expression with a differential equation, yielding, for $p \ll 1$,
$$\frac{dy_j}{dt} = \frac{s_j y_j(1 - y_j) + p\,(1 - 2y_j - s_j y_j^2)}{1 + s_j y_j}. \tag{6}$$
When $y_j$ is small ($s_j y_j \ll p$), the second term in the numerator dominates, and we have $dy_j/dt \approx p$. Thus initially the frequency of a favorable mutation grows linearly at rate $p$; this growth is strictly by mutation from the prevailing genotype. At intermediate times, when $p \ll s_j y_j$ and $y_j \ll 1$, we find that $dy_j/dt \approx s_j y_j$. At these times $y_j$ grows exponentially, with rate constant $s_j$. This means that a favorable mutation will sweep through the population at an exponential growth rate given by its fitness advantage. Finally, when $y_j$ approaches fixation, we find that $dy_j/dt \approx s_j y_j(1 - y_j)/(1 + s_j y_j)$; growth is eventually curbed.
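The three growth phases can be checked numerically. The sketch below assumes the two-genotype rate $dy_j/dt = [s_j y_j(1 - y_j) + p(1 - 2y_j - s_j y_j^2)]/(1 + s_j y_j)$, which follows from the replication rule with $Q_{ii} \approx 1$ and $Q_{ij} \approx p$; the values $s_j = 0.5$ and $p = 10^{-6}$ are hypothetical.

```python
def dydt(y, s, p):
    """Rate of change of the favorable genotype's frequency (two-genotype model)."""
    selection = s * y * (1.0 - y)
    mutation = p * (1.0 - 2.0 * y - s * y * y)
    return (selection + mutation) / (1.0 + s * y)


s, p = 0.5, 1e-6  # illustrative fitness advantage and error rate

early = dydt(1e-9, s, p)   # s*y << p: growth by mutation alone
middle = dydt(1e-3, s, p)  # p << s*y and y << 1: exponential growth
late = dydt(0.99, s, p)    # near fixation: growth is curbed

crossover = p / s  # frequency at which selection overtakes mutation
```

The early rate is close to $p$, the intermediate rate close to $s_j y_j$, and the late rate close to $s_j y_j(1 - y_j)/(1 + s_j y_j)$; the changeover from linear to exponential growth occurs near $y_j = p/s_j$.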
The rate at which new genotypes are accessible:
As the frequency yj increases, the chance that it will produce an erroneous copy of itself increases. As discussed previously, once yj has exceeded 1/Np, all of the one-mutant neighbors of j are continually being produced by mutation from j. Most of these neighbors will be originally produced at some time before yj hits this threshold, of course, as the probability Nyjp approaches 1.
In experimental evolution protocols, the fitness benefit of a point mutation is often remarkably high (for example, $s$ = 3.2 to 13.9) […] 20 generations. During these 20 generations, clonal interference will dominate; the currently best mutation will continually be challenged by novel mutant strains. However, this interlude is relatively brief, and it is effectively deterministic: all mutations will be tried and the best is almost certain to win. (During the interlude of clonal interference, a number of mutations will occur probabilistically while $y_j \ll 1/(Np)$. If one of these randomly occurring mutants is much fitter than $j$, it is possible that a point mutation might fix before $y_j$ crosses the $1/(Np)$ threshold, i.e., before all the one-mutant neighbors of $j$ have been explored. We assume that this is an unlikely occurrence.)
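Thus, when $Np > 1$ and selection is strong, the rate-limiting process in the evolution of the system is the rate at which entire new classes of genotypes are accessible, when the currently best genotype crosses the $1/(Np)$ frequency threshold. Recall that $s_j$ is the normalized difference in fitness, $\beta$, between the new genotype $j$ and the prevailing genotype, $i$; $s_j = (\beta_j - \beta_i)/\beta_i$. If we assume that a new mutant $j$ grows linearly from frequency $1/N$ to $p/s_j$ and then exponentially from $p/s_j$ to $1/(Np)$ (as described previously), the waiting time before a new set of one-mutant neighbors is explored is
$$t_j = \frac{1}{s_j} - \frac{1}{Np} + \frac{1}{s_j}\ln\frac{s_j}{Np^2} \approx \frac{1}{s_j}\left[1 + \ln\frac{s_j}{Np^2}\right]. \tag{7}$$
Here the first two terms give the duration of the linear phase at rate $p$, the logarithmic term gives the duration of the exponential phase at rate $s_j$, and the approximation neglects $1/(Np)$ when $Np$ is large. Thus $t_j$ is the time, neglecting the brief period of clonal interference, before all the one-mutant neighbors of $j$ have been produced, and the best of these begins to grow in the population.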
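The waiting time can be computed by adding the durations of the two growth phases. This sketch assumes linear growth at rate $p$ from $1/N$ to $p/s_j$ followed by exponential growth at rate $s_j$ from $p/s_j$ to $1/(Np)$, as described above; the parameter values are hypothetical.

```python
import math


def waiting_time(s, N, p):
    """Generations for a new mutant to climb from 1/N to the 1/(Np) threshold."""
    linear = (p / s - 1.0 / N) / p                       # from 1/N to p/s at rate p
    exponential = math.log((1.0 / (N * p)) / (p / s)) / s  # from p/s to 1/(Np)
    return linear + exponential


def waiting_time_approx(s, N, p):
    """Same quantity with the 1/(Np) term of the linear phase neglected."""
    return (1.0 + math.log(s / (N * p * p))) / s


# Illustrative values: s = 0.2, N = 1e7, p = 1e-6 (so Np = 10).
t_exact = waiting_time(0.2, 1e7, 1e-6)
t_approx = waiting_time_approx(0.2, 1e7, 1e-6)
```

For these values the two expressions differ by exactly $1/(Np) = 0.1$ generations out of roughly 54, which indicates how little the simplification costs when $Np$ is large.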
In reality, there may not be a one-mutant neighbor of $j$ that has a higher fitness than $j$. In this situation, $Np^2$ becomes relevant: if $Np^2 > 1$ evolution continues at a rate limited by the accessibility of two-mutant neighbors; if $Np^2 < 1$ the system progresses at a rate limited by clonal interference. Along a given evolutionary trajectory, there may be a mix of both types of steps. Whenever one of the one-mutant neighbors of the dominant genotype has any selective advantage, evolution will proceed stepwise through single point mutations; this is very likely the case when the selective pressure is strong. The time between the initial occurrence of the new mutation at each step will then be given by Equation 7, while Equation 6 approximates the time to fixation.
The long-term rate of divergence from the original genotype, however, depends on the distribution of the relative fitness increments, $s_i$. To estimate these values, we can imagine that $3L$ fitness values (corresponding to point mutations in $L$ positions along the genome and three possible mutations at each position) have been chosen at random from some parent distribution. Suppose $s_j = \beta_j/\beta_i - 1$ is the relative fitness increment of the most fit of these samples. […] approaches infinity, regardless of the initial distribution of fitness values to all possible genotypes. This is because both $\beta_i$ and $\beta_j$ are extreme values of the parent distribution. […]
Therefore, if the largest fitness increment conferred by each new set of mutational neighbors is distributed as $p(s) = \alpha e^{-\alpha s}$, then the mean fitness increment will be $1/\alpha$ and the median fitness increment will be $\ln 2/\alpha$. Substituting into Equation 7, we find that the median rate of divergence from the original genotype, in units of Hamming distance per generation, is given by
$$\frac{\ln 2}{\alpha\left[1 + \ln\left(\dfrac{\ln 2}{\alpha N p^2}\right)\right]}. \tag{8}$$
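The mean and median of the exponential distribution, and the resulting divergence rate, can be checked by sampling. The sketch assumes that the waiting time for an increment $s$ is $[1 + \ln(s/Np^2)]/s$ and evaluates it at the median increment $\ln 2/\alpha$; the value $\alpha = 4$, the seed, and the sample size are illustrative choices.

```python
import math
import random
import statistics

alpha = 4.0             # illustrative rate parameter of p(s) = alpha * exp(-alpha * s)
rng = random.Random(3)  # fixed seed, for reproducibility only
samples = [rng.expovariate(alpha) for _ in range(200_000)]

sample_mean = sum(samples) / len(samples)   # should approach 1 / alpha
sample_median = statistics.median(samples)  # should approach ln(2) / alpha


def median_divergence_rate(alpha, N, p):
    """Substitutions per generation when each step has the median increment."""
    s_median = math.log(2.0) / alpha
    return s_median / (1.0 + math.log(s_median / (N * p * p)))


rate = median_divergence_rate(alpha, N=1e7, p=1e-7)
generations_per_substitution = 1.0 / rate
```

The reciprocal of the rate gives the typical number of generations between successive substitutions for the chosen parameters.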
The rate of fitness increase:
As new, fitter genomes successively sweep through the population, the fitness of the most fit genotype in the population, $\beta_j$, will increase stepwise. The rate of this increase is given by the fitness increment divided by the waiting time,
$$\frac{\Delta\beta}{\Delta t} = \frac{\beta_j - \beta_i}{t_j} = \frac{\beta_i s_j^2}{1 + \ln\left(s_j/Np^2\right)}, \tag{9}$$
and, again, the population fitness will lag behind the most fit genotype by a delay given by the fixation time.
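It is clear that $\Delta\beta/\Delta t$ depends at each step on $s_j$, the fitness increment at that step, as well as on the current fitness, $\beta_i$. To find the average rate at which fitness increases in each step of the process, we must evaluate the integral
$$\left\langle \frac{\Delta\beta}{\Delta t} \right\rangle = \int_0^\infty \frac{\beta_i s^2}{1 + \ln\left(s/Np^2\right)}\, \alpha e^{-\alpha s}\, ds, \tag{10}$$
where $\langle \cdot \rangle$ denotes the expected value. This integral can be approximated by
$$\left\langle \frac{\Delta\beta}{\Delta t} \right\rangle \approx \frac{2\beta_i}{\alpha^2\left(2 - \ln Np^2\right)}. \tag{11}$$
The approximation holds as long as $\ln(Np^2) < 2$, which could break down for large populations.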
If many substitutions occur in rapid succession, such that the change in $\beta$ appears continuous, Equation 11 predicts that the change in fitness with time will be an exponential function of the original fitness. The magnitude of this exponent will change for each substitution, but on average will be equal to $2/[\alpha^2(2 - \ln Np^2)]$. Thus we have derived two independent estimates of the parameter $\alpha$. To get the first estimate, we approximate the total change in the population fitness as a series of $m$ equal steps, where $m$ is the total number of substitutions, and each step has mean size $\langle s \rangle = 1/\alpha$. Thus the population changes from an initial fitness, $\beta_0$, to the final fitness, $\beta_m$:
$$\beta_m = \beta_0\left(1 + \langle s \rangle\right)^m. \tag{12}$$
Alternatively, we can estimate the total number of natural generations, $T$, and use the continuous approximation of Equation 11 to find
$$\beta_T = \beta_0 \exp\left[\frac{2T}{\alpha^2\left(2 - \ln Np^2\right)}\right]. \tag{13}$$
We demonstrate an application of these formulas in a later section of the article.
Divergence and fitness depend on epistasis:
The sections above derive the mean rates of divergence and fitness increase, assuming that the fitness of each new set of one-mutant neighbors is drawn randomly from a set of coefficients with roughly the same distribution. The shape of these curves will be affected, however, by the epistasis between mutations.
Under strong epistasis, the distribution of fitness values among the neighbors of genotype j will have no relation to the distribution around genotype i. In this case the time to fixation will have no correlation with the order in which each mutation appears. In this case, divergence increases roughly linearly at the rate given by Equation 8, while fitness increases exponentially (Equation 10).
Alternatively, if the fitness contributions of each locus in the genome are completely independent and additive (no epistasis), the total fitness for closely related genomes is highly correlated. In this case, beneficial mutations will sweep through the population in order, starting with the "best" mutations. Each new mutant will grow more slowly than the last, times to fixation will grow increasingly long, and the fitness benefit accrued by each successive mutation will be smaller. In this case both divergence, as measured via the consensus genotype, and fitness will be saturating functions, although the mean rates as derived above will still hold.
| MODELING THE EVOLUTION OF φX174 |
|---|
As discussed previously, recent experiments involving the bacteriophage φX174 are amenable to analysis using the model proposed here. For example, the derivation of the rate of fitness increase allows for two independent estimates of $\alpha$, the parameter describing the distribution of fitness coefficients.
![]()
-5 to 10 doublings per hour, or from 0.32 to 10 individuals per 20-min generation. This increase in fitness was conferred by 13 or 14 nucleotide substitutions and one intergenic deletion, giving a mean fitness increase per substitution of 
= 0.25, and predicting that
= 4.0. Alternatively, we can use Equation 13, with T = 500 natural generations, N = 107, and p = 10-7, obtaining
= 4.0. Taking T = 720 (10 days at 20 min per generation) and p = 10-6 increases our estimate of
to 5.6. The close agreement of these estimates confirms our assumption that the rate-limiting step in this evolutionary trajectory appears to be the rate at which new classes of mutational neighbors are explored. We can use this value of
to compute the median rate of divergence from the original genotype; Equation 8 gives estimates of one substitution every 12 days (89 generations).
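The unit conversions behind the first estimate can be checked with a few lines of arithmetic. The sketch below makes two assumptions that are not stated explicitly here: that the mean increase per substitution is measured in natural-log fitness per generation, and that the distribution parameter is the reciprocal of that mean (as it would be for an exponential distribution). All variable names are illustrative.

```python
import math

GENERATION_MIN = 20                       # generation time given in the text
GENS_PER_HOUR = 60 / GENERATION_MIN       # 3 generations per hour

def per_generation_fitness(doublings_per_hour):
    """Individuals per 20-min generation implied by a doubling rate."""
    return 2 ** (doublings_per_hour / GENS_PER_HOUR)

w_start = per_generation_fitness(-5)      # initial fitness
w_end = per_generation_fitness(10)        # final fitness

# Total gain on a natural-log scale (assumed scale of the 0.25 figure).
log_gain = math.log(w_end / w_start)

# 13 or 14 substitutions plus one deletion gives 14 or 15 events.
estimates = {}
for n_events in (14, 15):
    mean_gain = log_gain / n_events
    estimates[n_events] = (mean_gain, 1 / mean_gain)  # (mean, assumed parameter)
    print(n_events, round(mean_gain, 3), round(1 / mean_gain, 2))

# Generation bookkeeping used elsewhere in the paragraph.
generations_in_10_days = 10 * 24 * 60 / GENERATION_MIN
days_per_substitution = 89 * GENERATION_MIN / (60 * 24)
print(generations_in_10_days, round(days_per_substitution, 2))
```

Under these assumptions, 14 events give a mean gain close to the quoted 0.25 and a parameter close to 4; counting 15 events gives a slightly smaller mean and a correspondingly larger parameter.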
This parameter estimate also allows us to predict the overall divergence and fitness functions, given some assumption about epistasis in the model. Fig 4 (left) compares the estimated divergence rate with the divergence measured in the two experimental trials, assuming independent fitness contributions at each site in the genome. Given that there is only one free parameter, which has not been fit to the data illustrated here but has been calculated as described above, the agreement is rather striking. The fact that the observed divergence is slightly faster than predicted could indicate a small degree of nonadaptive substitution; two substitutions may have been nonadaptive according to a previous analysis.
This offset is also visible when the predicted fitness increase (solid line) is compared with the measured fitness (Fig 4, right). Again we find good agreement between model and measured data, using the same parameter value of 4 as before.
| DISCUSSION |
|---|
We have demonstrated that the behavior of genetic systems evolving under strong selective pressure may be characterized by the product Np, the population size times the per locus per replication mutation rate. This product is akin to the dimensionless parameter k derived in earlier work.
Alternatively, when Np is large, a substantial number of mutations are produced in every generation. In this case the entire neighborhood of genotype space surrounding the dominant genotype is explored thoroughly and simultaneously. This corresponds to what has been described as "coincident-event collective replacement."
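The role of Np can be made concrete with a rough calculation. The sketch below assumes that each of N replications independently produces a given one-mutant neighbor with probability p, so that the chance the neighbor appears at least once in a generation is 1 - (1 - p)^N, approximately 1 - exp(-Np). The function name and the particular values of N and p in the loop are illustrative.

```python
import math

def prob_mutant_appears(N, p):
    """Probability that a specific one-mutant neighbor arises at least
    once in a generation, assuming N independent replications, each with
    per-locus error rate p."""
    # 1 - (1 - p)**N, computed stably for very small p.
    return -math.expm1(N * math.log1p(-p))

for N, p in [(1e4, 1e-7), (1e7, 1e-7), (1e9, 1e-7)]:
    print(f"N = {N:.0e}, p = {p:.0e}, Np = {N * p:g}, "
          f"P(appears) = {prob_mutant_appears(N, p):.4f}")
```

When Np is far below 1 the probability is close to Np itself, so any particular beneficial mutation is a rare event; when Np is well above 1 the probability approaches 1, and every one-mutant neighbor is expected to be present in each generation.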
In analyzing the experimental evolution of φX174, we found that the assumption of independent fitness contributions at each locus fit the experimental data very well, particularly for the replicate in which Np was consistently >1. This assumption implies both a high degree of parallel evolution and a conserved order of substitutions, the latter of which was not observed experimentally. We note that the small population size early in one replicate and possible bottleneck effects would be expected to add variability to the genetic trajectory.
It is possible that a small number of mutations could contribute to the same "functional" change in a protein, for example, changing the polarity of some structural element. In this scenario there might be a maximum benefit achieved when all of the relevant amino acids have the appropriate polarity, but a change in one or two may confer a substantial fraction of this maximum benefit. In fact, in any situation where several mutations contribute to the same functional benefit, an epistasis of diminishing returns is likely: the same mutation will confer smaller increases in fitness to fitter genotypes. This effect has been convincingly demonstrated for bacteriophage adaptations to heat.
An alternative way to treat the interaction of genetic loci is through multiplicative fitness contributions.
| ACKNOWLEDGMENTS |
|---|
We thank Jim Bull and Holly Wichman for suggesting that we undertake this project. We gratefully acknowledge the support of The Leon Levy and Shelby White Initiatives Fund, The Florence Gould Foundation, The Ambrose Monell Foundation, The Seaver Institute, and The Alfred P. Sloan Foundation.
Manuscript received December 14, 1999; Accepted for publication July 18, 2000.
| LITERATURE CITED |
|---|
BELL, G. and X. REBOUD, 1997 Experimental evolution in Chlamydomonas II. Genetic variation in strongly contrasted environments. Heredity 78:498-506.
BULL, J. J., M. R. BADGETT, H. A. WICHMAN, J. P. HUELSENBECK, and D. M. HILLIS et al., 1997 Exceptional convergent evolution in a virus. Genetics 147:1497-1507.
BULL, J. J., M. R. BADGETT, and H. A. WICHMAN, 2000 Big-benefit mutations in a bacteriophage inhibited with heat. Mol. Biol. Evol. 17:942-950.
EIGEN, M., 1971 Self-organization of matter and the evolution of biological macromolecules. Naturwissenschaften 58:465-526.
EIGEN, M. and P. SCHUSTER, 1977 The hypercycle: a principle of natural self-organization. Part A: emergence of the hypercycle. Naturwissenschaften 64:541-565.
EIGEN, M. and P. SCHUSTER, 1978 The hypercycle: a principle of natural self-organization. Part C: the realistic hypercycle. Naturwissenschaften 65:341-369.
EWENS, W., 1979 Mathematical Population Genetics. Springer, New York.
FRANKLIN, I. and R. C. LEWONTIN, 1970 Is the gene the unit of selection? Genetics 65:707.
GERRISH, P. J. and R. E. LENSKI, 1998 The fate of competing beneficial mutations in an asexual population. Genetica 102/103:127-144.
GILLESPIE, J. H., 1991 The Causes of Molecular Evolution. Oxford University Press, London/New York/Oxford.
HUYNEN, M. A., 1996 Exploring phenotype space through neutral evolution. J. Mol. Evol. 43:165-169.
HUYNEN, M. A., P. F. STADLER, and W. FONTANA, 1996 Smoothness within ruggedness: the role of neutrality in adaptation. Proc. Natl. Acad. Sci. USA 93:397-401.
JOHNSON, P. A., R. E. LENSKI, and F. C. HOPPENSTEADT, 1995 Theoretical analysis of divergence in mean fitness between initially identical populations. Proc. R. Soc. Lond. Ser. B 259:125-130.
JONES, B. L., 1978 Some principles governing selection in self-reproducing macromolecular systems. An analog of Fisher's fundamental theorem. J. Math. Biol. 6:169-175.
KAUFFMAN, S. A., 1993 The Origins of Order. Oxford University Press, London/New York/Oxford.
KRACAUER, S., 1969 History: The Last Things Before the Last (reprinted by Markus Wiener Publishers, Princeton, NJ, 1995).
LENSKI, R. E., M. R. ROSE, S. C. SIMPSON, and S. C. TADLER, 1991 Long-term experimental evolution in Escherichia coli. I. Adaptation and divergence during 2000 generations. Am. Nat. 138:1315-1341.
LENSKI, R. E. and M. TRAVISANO, 1994 Dynamics of adaptation and diversification: a 10,000-generation experiment with bacterial populations. Proc. Natl. Acad. Sci. USA 91:6808-6814.
LEWONTIN, R. C., 1974 The Genetic Basis of Evolutionary Change. Columbia University Press, New York.
MAYNARD SMITH, J., 1998 Evolutionary Genetics. Oxford University Press, London.
MIRALLES, R., P. J. GERRISH, A. MOYA, and S. F. ELENA, 1999 Clonal interference and the evolution of RNA viruses. Science 285:1745-1747.
MIRALLES, R., A. MOYA, and S. F. ELENA, 2000 Diminishing returns of population size in the rate of RNA virus adaptation. J. Virol. 74:3566-3571.
NOVELLA, I. S., J. QUER, and E. DOMINGO et al., 1999 Exponential fitness gains of RNA virus populations are limited by bottleneck effects. J. Virol. 73:1668-1671.
PAPADOPOULOS, D., D. SCHNEIDER, J. MEIER-EISS, W. ARBER, and R. E. LENSKI et al., 1999 Genomic evolution during a 10,000-generation experiment with bacteria. Proc. Natl. Acad. Sci. USA 96:3807-3812.
RAINEY, P. B. and M. TRAVISANO, 1998 Adaptive radiation in a heterogeneous environment. Nature 394:69-72.
ROSENZWEIG, R. F., R. R. SHARP, D. S. TREVES, and J. ADAMS, 1994 Microbial evolution in a simple unstructured environment: genetic differentiation in Escherichia coli. Genetics 137:903-917.
THOMPSON, C. L. and J. L. MCBRIDE, 1974 On Eigen's theory of the self-organization of matter and the evolution of biological macromolecules. Math. Biosci. 21:127-142.
TREVES, D. S., S. MANNING, and J. ADAMS, 1998 Repeated evolution of an acetate-crossfeeding polymorphism in long-term populations of Escherichia coli. Mol. Biol. Evol. 15:789-797.
TSIMRING, L. S., H. LEVINE, and D. A. KESSLER, 1996 RNA virus evolution via a fitness-space model. Phys. Rev. Lett. 76:4440-4443.
WICHMAN, H. A., M. R. BADGETT, L. A. SCOTT, C. M. BOULIANNE, and J. J. BULL, 1999 Different trajectories of parallel evolution during viral adaptation. Science 285:422-424.