Abstract
A twolocus model of reversible mutations with compensatory fitness interactions is presented; single mutations are assumed to be deleterious but neutral in appropriate combinations. The expectation of the time of compensatory nucleotide substitutions is calculated analytically for the case of tight linkage between sites. It is shown that selection increases the substitution time dramatically when selection intensity Ns > 1, where N is the diploid population size and s the selection coefficient. Computer simulations demonstrate that recombination increases the substitution time, but the effect of recombination is small when selection is weak. The amount of linkage disequilibrium generated in the process of compensatory substitution is also investigated. It is shown that significant linkage disequilibrium is expected to be rare in natural populations. The model is applied to the mRNA secondary structure of the bicoid 3′ untranslated region of Drosophila. It is concluded that average selection intensity Ns against single deleterious mutations is not likely to be much larger than 1.
MODELS of compensatory evolution involve mutations from two or more loci. These mutations are assumed to be deleterious when they occur independently, but, in combination, they (at least partially) compensate each other for their deleterious effects. Kimura (1985) proposed a twolocus, twoallele model, with alleles A and a at the first locus and B and b at the second locus, as illustrated in Figure 1A. He assumed that the two intermediate haplotypes, Ab and aB, are deleterious, while the wildtype AB and the double mutant ab are neutral. He further assumed that selection intensity against the deleterious intermediates is so strong that their frequencies in a population are very low. Accordingly, he considered only unidirectional mutations from A to a and from B to b and ignored back mutations (Figure 2A).
An important example of compensatory evolution is found in RNA secondary structures. In singlestranded RNAs, WatsonCrick (WC) pairing of complementary nucleotide bases is the basic mechanism in the formation of stemloop structures. It is believed that an individual mutation that breaks up a WC pairing is deleterious and that a second “compensatory” mutation at the complementary site can reestablish pairing and restore fitness. The relatively simple pattern of intramolecular WC basepairing involved in RNA structures has made them a suitable model for the study of compensatory evolution (Stephan and Kirby 1993; Golding 1994; Schöniger and von Haeseler 1994; Kirbyet al. 1995; Muse 1995; Rzhetsky 1995; Tillier and Collins 1995; Stephan 1996; Higgs 2000; Savillet al. 2001). Most of these authors analyzed rRNA secondary structures.
Phylogenetic analysis has revealed a number of compensatory nucleotide changes between species. On the other hand, however, mismatches, including not only GU wobble pairs but also other noncanonical pairs, are frequently observed, indicating that selection against deleterious intermediates may not be very strong (Roussetet al. 1991; Parschet al. 2000). This suggests that Kimura's compensatory evolution model, which assumes strong selection against deleterious intermediates, may not be generally applicable to the evolution of RNA secondary structures.
In this article, a compensatory evolution model is described in which selection against deleterious single mutations is not necessarily strong but covers a broad range of selection coefficients. Under weak selection, deleterious haplotypes may increase in frequency and even fix in the population as illustrated in a simulation run shown in Figure 2B. Since back mutations play an important role when the frequencies of deleterious haplotypes become large, we use a twolocus, twoallele model in which bidirectional mutations are considered (Figure 1B). This model is different from those used by Kimura (1985), Iizuka and Takefu (1996), and Stephan (1996), who assumed that selection against deleterious intermediates is so strong that the mutation process may be considered unidirectional (see above). Our analysis is also different from that of Higgs (1998), who assumed bidirectional mutation but analyzed the model only in the parameter range of strong selection (i.e., Ns ⪢ 1, where N is the diploid population size and s the selection coefficient).
Our goal is to calculate the time to proceed from the wildtype state AB to the fixation of the double mutant ab in this bidirectional mutation model. It should be noted, however, that neither AB nor ab is an absorbing state. That means we do not calculate a “fixation” time sensu stricto but do calculate the time to “flip” back and forth between AB and ab. This is possible as the nucleotide mutation rate is much smaller than 1/N (see below). To describe the transition between AB and ab, it is therefore reasonable to use the term “fixation” (or “substitution”). We present analytical results for the rate of this compensatory substitution event when there is no recombination between the two loci and use computer simulation to obtain this time under the influence of recombination. The theoretical results are applied to DNA sequence data of the bicoid 3′ untranslated region (UTR) of Drosophila to estimate the selection intensity against deleterious mutations.
Another purpose of this article is to evaluate the amount of linkage disequilibrium generated by the compensatory evolution model. It is known that epistatic selection may produce significant linkage disequilibrium in natural populations (Lewontin 1974). Schaeffer and Miller (1993) reported linkage disequilibria in two clusters of DNA polymorphisms in introns of Adh in Drosophila pseudoobscura. These disequilibria are likely due to epistatic selection maintaining premRNA secondary structures (Kirbyet al. 1995). Here we examine whether these findings are consistent with our model of compensatory evolution.
THEORY
Consider a twolocus, twoallele model for a randomly mating population with N diploids. There are alleles A and a at the first locus and B and b at the second locus. The (bidirectional) mutation rate between A and a is given by μ_{1} and that between B and b is given by μ_{2} (Figure 1B). The mutation rates are assumed to be much smaller than 1/(2N), as is the case for nucleotide substitution rates. The relative fitnesses of the haplotypes AB, Ab, aB, and ab, are given by 1, 1 − s_{1}, 1 − s_{2}, and 1, respectively. Effects of fitness are assumed additive within locus and therefore the diploid model is equivalent to a haploid model. The haplotype frequencies are denoted by x_{0}, x_{1}, x_{2}, and x_{3}, respectively.
In the following, we consider the process of compensatory substitution under the joint action of drift, selection, mutation, and recombination. We assume the process starts at x_{0} = 1 at t = 0. First, mutations in AB produce Ab and aB types. Next, mutations in Ab and aB may create ab, and some of them fix in the population. Recombination between Ab and aB may also produce ab. We are interested in the expected time of compensatory substitution, defined as the time from t = 0 (when the system is at state x_{0} = 1) to the time point when the double mutant ab is fixed (x_{3} = 1). Because the mutation process is bidirectional, the latter state is not an absorption state. The time we are calculating is therefore a first passage time. Because of the assumption μ_{i} ⪡ 1/(2N), double mutants ab either get lost by drift or go to fixation; the proportion of ab haplotypes reverting back to Ab or aB is very small. Recombination can only retard the fixation process (Kimura 1985; Stephan 1996). The fixation time can be divided into two parts, T_{1} and T_{2}. T_{1} is the waiting time for the “successful” double mutant ab (that will eventually get fixed) to appear in the population, and T_{2} is the time from the appearance of ab to the fixation event (see Figure 2).
Symmetrical model: Consider a symmetrical model with s = s_{1} = s_{2} and μ = μ_{1} = μ_{2}. In the analytical derivations, recombination is neglected. The four haplotypes are divided into two groups: one consists of AB and ab, and the other is the group of the deleterious intermediates Ab and aB. Let X be the frequency of the group of deleterious intermediates (X = x_{1} + x_{2}) and Y the frequency of the other group (Y = x_{0} + x_{3}). Denote the distribution of X by Φ(X). In phase 1 when the system waits for a successful double mutant to appear, it will reach a quasiequilibrium after a short initial period. At this quasiequilibrium, its distribution is given approximately by
Thus, we assume that when a new ab appears by a mutation in Ab or aB, the distribution of X before the mutation is given by (1). After the mutation, Y (= 1 − X) changes to Y′ = Y + 1/(2N). Let p be the fixation probability of the Y group. Since the mutation rate is assumed to be very small, p can be approximated by
In the case of neutrality, since p′ is 1/(2N), the time becomes
When selection is very strong, X is maintained in very low frequency (approximately at the deterministic mutationselection balance). The expected frequency is then given by
T_{2} is the time from the appearance of a successful double mutant haplotype to the fixation event. In the case of neutrality, the expectation of T_{2} is ~4N (Kimura and Ohta 1969) since we assumed μ_{i} ⪡ 1/(2N). When selection is very strong, T_{2} may be close to 4N again. That is expected because x_{1} and x_{2} are very small and ab has almost no selective advantage in the population. T_{2} for moderate selection intensities is expected to be smaller than 4N because ab has a selective advantage when x_{1} and x_{2} are not very small.
General model: Consider a general model where μ_{1} ≠ μ_{2} and/or s_{1} ≠ s_{2}. Recombination is again neglected in the analytical treatment. In this case, it is necessary to investigate the frequency distributions of Ab and aB separately. Let ϕ_{1}(x) and ϕ_{2}(x) be the frequency distributions of Ab and aB, respectively. To obtain these two distributions, we reconsider the symmetrical model, where the distribution of the sum of x_{1} and x_{2} is given by (1). In a population with N diploids, the probability that i haplotypes are Ab or aB is given by
Let P_{1}(i) and P_{2}(i) be the probability distributions of the numbers of Ab and aB, respectively. If we assume that μ is so small and Ns is so large that Ab and aB do not coexist frequently, P_{1}(i) and P_{2}(i) are given approximately by
Next we consider ϕ_{1}(x) and ϕ_{2}(x) in the general model, where μ_{1} ≠ μ_{2} and/or s_{1} ≠ s_{2}. If we assume that Ab and aB do not coexist frequently, the frequencies of Ab and aB follow two independent distributions. From the arguments in Equations 9, 10 and 11, it is expected that ϕ_{1}(x) and ϕ_{2}(x) are given for x > 1/(4N) by
Denote by
Therefore, the expected number of ab that will fix in the population per generation is given by
T_{2} for the general model is the same as for the symmetrical model. That is, T_{2} ≈ 4N when selection is very strong and T_{2} < 4N when selection intensity is moderate.
COMPUTER SIMULATIONS
Computer simulations were carried out for the following reasons. The first one is to check the theoretical results for T_{1} shown above, because we use the following assumptions in the derivation. We use approximate formulas for the distribution of the haplotype frequencies ignoring the initial condition (X = 0 at t = 0). In the general asymmetric model, we assume that two haplotypes, Ab and aB, do not coexist. The second reason is to examine the effect of recombination on T_{1} under a broad range of selection coefficients. In contrast to the strong selection case (Kimura 1985; Stephan 1996), we were unable to find analytical expressions for T_{1} with recombination when selection is weak. T_{2} is also investigated by simulations with and without recombination. Another purpose of the simulations is to evaluate the amount of linkage disequilibrium generated in the process of compensatory substitution.
Monte Carlo simulations with mutation, selection, recombination, and random genetic drift were conducted in a constant size population of N diploids as follows. Each replication of the simulations starts from the initial condition (x_{0}, x_{1}, x_{2}, x_{3}) = (1, 0, 0, 0). In every generation, the frequencies are determined by the pseudosampling method (Kimura and Takahata 1983). The recombination rate between the two loci is assumed to be r per generation. Every 4N generations, the frequencies (x_{0}, x_{1}, x_{2}, x_{3}) are scored to investigate their frequency distributions. The amount of linkage disequilibrium (D = x_{0}x_{3} − x_{1}x_{2}) is also calculated. Each replication ends when (x_{0}, x_{1}, x_{2}, x_{3}) = (0, 0, 0, 1) is reached for the first time, and T_{1} and T_{2} are recorded. When 0 ≤ t′ ≤ T_{2} (with t′ = t − T_{1}), (x_{0}, x_{1}, x_{2}, x_{3}) is recorded every N/10 generation to calculate D.
Computer simulations with no recombination were conducted assuming N = 100, s = s_{1} = s_{2}, and μ = μ_{1} = μ_{2}, and the results for T_{1} and T_{2} are summarized in Table 1. The theoretical results for T_{1} in the symmetrical model were compared with the simulation results with no recombination (Figure 3). In Figure 3A, the theoretical expectation calculated by (5a) is shown with the simulation results of θ = 0.01. Ns was changed from 0 to 5. The theory is in very good agreement with the results of the simulations. Figure 3B shows the results of theory and simulations for a relatively high mutation rate (θ = 0.1) although our model assumes very low mutation rates. It is shown that Equation 5a gives a quite good approximation even for θ = 0.1, unless Ns = 0. The underestimation for Ns = 0 may occur because frequent recurrent mutations reduced the fixation probability of ab. The degree of reduction is expected to be relatively small when T_{1} is large. For all parameter sets examined, the standard deviation of T_{1} is similar to the mean (Table 1), supporting the exponential distribution of T_{1} as suggested by (5b). Similar results were obtained from computer simulations with N = 1000 (data not shown).
The effect of recombination on T_{1} was also investigated by computer simulations with θ = 0.01 and 0.1 (Table 1). The effect of recombination is very small when selection is weak. The effect, however, can be seen for selection intensity Ns > 1. In Figure 4, a clear positive correlation is observed between 4Nr and T_{1} when Ns = 2 and 5. These results are consistent with those of Kimura (1985) and Stephan (1996). Similar relationships between 4Nr and T_{1} were observed in other twolocus models with epistatic interactions (Michalakis and Slatkin 1996; Christiansenet al. 1998).
The results for T_{2} are also shown in Table 1. T_{2} is much smaller than T_{1} in all the parameter sets investigated. First, we consider T_{2} with no recombination. In the case of neutrality, T_{2} obtained from simulations for θ = 0.01 is close to 400 as expected from the theory, although T_{2} for θ = 0.1 is a little >4N. When selection is very strong (Ns ≥ 5), T_{2} is close to 4N again. T_{2} for moderate selection intensity is <4N.
T_{2} may be negatively correlated with recombination rate (Table 1). The degree of reduction in T_{2} is larger when selection is stronger. T_{2} for 4Nr = 0 is similar to that for 4Nr = 10 when Ns ≤ 1, while T_{2} for 4Nr = 10 is much smaller than that for no recombination when Ns ≥ 2. The result for the T_{2} phase may be understood as follows. Since recombination usually occurs between AB and ab in T_{2}, it reduces the fixation probability of ab and increases T_{1}. As the fixation probability is reduced, only ab, which increases its frequency quickly, can successfully fix in the population.
Theoretical results for T_{1} in the general model were compared with the results of computer simulations, and the results for θ_{1} = 0.02 and θ_{2} = 0.01 are shown in Figure 5 with the theoretical expectations calculated from (16). It is shown that the theoretical expectations are in good agreement with the results of simulations when Ns_{1} ≥ 0.5 and Ns_{2} ≥ 0.5. If one of the selection intensities is very small, Equation 16 overestimates T_{1} because the assumption used to derive (16) does not hold. In the derivation, we obtained approximate formulas for ϕ_{1}(x) and ϕ_{2}(x) under the assumption that Ab and aB do not coexist at the same time. If one of the selection intensities, say Ns_{1}, is very small, ϕ_{1}(x) given by (12) does not agree with the frequency distribution of Ab obtained by simulations. Similar results were obtained for other values of θ_{1} and θ_{2}. Good agreement between the theory and simulations was observed when neither Ns_{1} nor Ns_{2} is small (data not shown).
Table 2 shows the amount of linkage disequilibrium (D = x_{0}x_{3} − x_{1}x_{2}) in the process of compensatory substitution obtained by simulations under the symmetrical model with no recombination. D was calculated every 4N generations as long as 0 < t < T_{1}. For this interval of 4N generations, we found almost no correlation between sampling. Mean and variance were calculated for all runs. For θ = 0.01 the level of linkage disequilibrium is extremely low. Even for θ = 0.1 D is very small although much larger than that for θ = 0.01. This indicates that almost no linkage disequilibrium is expected during the waiting time for the appearance of a successful double mutant haplotype. Simulations with recombination showed that recombination reduces the level of linkage disequilibrium (data not shown).
On the other hand, strong positive linkage disequilibrium is generated after a successful double mutant has appeared and is on its way to fixation (0 < t′ < T_{2}). The average of linkage disequilibrium for t′ ≥ 0 is plotted in Figure 6. Selection increases the level of linkage disequilibrium significantly (Figure 6, A and B). Each distribution of D has a peak near t′ = 100. As Ns increases, the peak is getting larger and seems to saturate at D ≈ 0.2. Note that the theoretical maximum value of D is 0.25, which is reached when x_{0} = x_{3} = 0.5 and x_{1} = x_{2} = 0. When selection is weak, the level of linkage disequilibrium is higher for θ = 0.1 than for θ = 0.01.
Figure 6C shows the effect of recombination on the level of linkage disequilibrium when θ = 0.01 and Ns = 2. As the recombination rate increases, the level of linkage disequilibrium is getting weaker. Similar results were obtained for other mutation rates and selection intensities (data not shown).
DISCUSSION
Theory: We analyzed a model of reversible mutations with compensatory fitness interactions; i.e., single mutations are assumed to be deleterious but harmless (neutral) in appropriate combinations. In proceeding under mutation pressure, epistatic selection, and genetic drift from one fitness peak to another, a population must pass through a valley of lower individual fitness. This process of compensatory evolution is investigated by analytical approximation and computer simulation. Our model is more general than Kimura's (1985), which assumed that mutation pressure is unidirectional and that deleterious intermediates are very strongly selected against. In contrast, our model covers a broad range of selection coefficients, including very small ones, and agrees with analyses of Kimura's model when selection is strong (Kimura 1985; Stephan 1996).
As in these latter analyses, we focus on the expected time for the compensatory substitution process to go from one fitness peak to another. In addition, we study the structure of variation (linkage disequilibrium) within a population during this transition. The process of compensatory substitution is analyzed by dividing it into two phases defined by the time periods T_{1} and T_{2}. T_{1} is the waiting time for a successful double mutant haplotype (ab) that will fix in the population, and T_{2} is the time from the appearance of a successful double mutant to the fixation event. The results of theory and computer simulations show that T_{1} ⪢ T_{2} (Table 1) as suggested by Stephan (1996).
The expectation of T_{1} is obtained analytically for the case without recombination. It is shown that selection dramatically increases T_{1} (Figure 3). Although the theory is based on the assumption of very low mutation rates, it is shown that it is also applicable to very high nucleotide mutation rates relative to 1/N (e.g., θ = 0.1) unless Ns is very small. Thus, our analysis is useful for natural populations for which θ is generally ⪡0.1 (Kimura 1983; Nei 1987; Gillespie 1991). Our computer simulations demonstrate that there is almost no effect of recombination on T_{1} when selection is relatively weak (Ns ≤ 1), while T_{1} increases significantly with increasing recombination rates when selection is strong (Figure 4).
Parameter estimation: An important parameter of the compensatory evolution model is the intensity of selection Ns against deleterious intermediates. In the following, we attempt to estimate this parameter for mRNA secondary structures. Our theoretical results predict that T_{1} is much larger than T_{1neu} unless Ns is very small (Figure 3), suggesting that nucleotide substitutions occur very slowly in pairing regions of mRNA secondary structures. It is known that such regions are highly conserved between species in contrast to other (unpaired) regions (Muse 1995; Parschet al. 2000). Thus, it may be possible to estimate Ns in pairing regions from DNA sequence comparisons.
We compare the rates of substitutions between species in pairing regions with those in regions that are considered selectively neutral. As an example, we analyze the bicoid 3′ UTR of Drosophila. It has been shown that bicoid mRNA has a complex secondary structure in the 3′ UTR (Macdonald 1990; Seeger and Kaufman 1990; Ferrandonet al. 1997; Macdonald and Kerr 1998). Based on the alignment of DNA sequences from nine Drosophila species, Parsch et al. (2000) identified eight highly conserved pairing regions, of which seven have been supported by mutational analysis (Ferrandonet al. 1997; Macdonald and Kerr 1998). We consider these eight stems as pairing regions and the remainder of the 3′ UTR as unpaired. It is also assumed that there is no selection in the unpaired regions. The total length of the paired segments is 142 nucleotides, corresponding to 71 bp.
We first compare the nucleotide sequences between D. melanogaster and D. simulans, using the alignment suggested by Parsch et al. (2000). In 142 nucleotides of the pairing regions, 3 nucleotide differences are observed and the number of substitutions per site (d_{p}) is estimated to be 0.0214 by the Jukes and Cantor (1969) method (Table 3). In the remaining regions of the 3′ UTR, 31 nucleotide differences are observed and the number of substitutions per site (d_{n}) is estimated to be 0.0435. This suggests that the rate of nucleotide substitutions is reduced by roughly a factor of 2 in the pairing regions in comparison with the remainder of the 3′ UTR. For the pair of D. melanogaster and D. pseudoobscura, d_{p} and d_{n} are estimated to be 0.2099 and 0.5087, respectively. The ratio of d_{n} to d_{p} is ~2.4, similar to that of the comparison between D. melanogaster and D. simulans.
Since it is assumed that the unpaired regions are selectively neutral, the ratio of d_{n} to d_{p} is comparable to T_{1}/T_{1neu}. In Figure 7, T_{1}/T_{1neu} is plotted as a function of Ns for the symmetrical model without recombination. T_{1}/T_{1neu} is calculated by (5a) and (6). The results show that T_{1}/T_{1neu} increases rapidly with increasing Ns, as soon as Ns > 1. This is particularly the case for θ ≤ 0.01. T_{1}/T_{1neu} depends weakly on θ when θ ≤ 0.01.
In the bicoid 3′ UTR of Drosophila (Table 3), the ratio of d_{n} to d_{p} is ~2.0–2.4. The estimate of θ in the unpaired regions for a D. melanogaster population from Zimbabwe is ~0.003 (J. F. Baines, Y. Chen and W. Stephan, unpublished results). Figure 7 suggests therefore that the observed ratio of d_{n} to d_{p} can be explained if Ns is ~0.6–0.7. If we consider only the number of complete compensatory substitutions (WC/WC in Table 3) for the comparison between D. melanogaster and D. pseudoobscura, the ratio of d_{n} to d_{p} becomes ~5 and the estimate of Ns ≈ 1.
There are, however, some caveats.
Ns could be larger than this estimate because d_{n} is underestimated if selection is acting in the regions that we consider as unpaired. There may be some evidence for weak selection in these regions. First, Macdonald (1990) suggested the possibility of longrange pairings encompassing almost the entire 3′ UTR. However, his suggestion was not supported by a strict phylogenetic analysis (Parschet al. 2000). Second, average silent divergence in the unpaired segments of the 3′ UTR between D. melanogaster and D. simulans is ~0.0435, which is about a factor of 2.5 lower than in the rest of the bicoid gene upstream of the 3′ UTR (J. F. Baines, Y. Chen and W. Stephan, unpublished results). On the other hand, even if weak selection is acting in the unpaired regions, the estimate of Ns for the pairing regions does not increase much, as the average time of compensatory substitutions becomes extremely large for Ns > 1 when mutation pressure is relatively weak (i.e., θ ≤ 0.01; Figure 7). Thus, it may be concluded that the selection intensity in the pairing regions of the bicoid 3′ UTR is on average not much >1. This estimate is similar to the estimate of average selection intensity for codon usage in Drosophila (Akashi 1995).
To estimate the selection intensity, we used the average d_{p} of eight pairing regions. In other words, Ns is the average selection intensity of these eight pairing regions. Parsch et al. (2000) found heterogeneity for d_{p} among pairing regions, caused by variation in both stem length and the physical distance between basepairing residues. One reason is that long stems are under less selective constraints than short ones (see Figure 3A of Parschet al. 2000). Another factor is that shortrange pairings (hairpins) experience a higher rate of evolution than longrange pairings because of the retarding effects of recombination when selection is sufficiently strong (Figure 4A of Parschet al. 2000). As a consequence, the estimates of Ns appear to vary substantially among pairing regions.
Parsch et al. (2000) were able to distinguish the effect of stem length from that of physical distance when only pairing regions with covariations were considered. According to their Figure 4A, they found an approximately fivefold drop in the rate of compensatory evolution over a physical distance of nearly 200 bp between basepairing residues. We have to ask whether such a large decrease of the rate of compensatory evolution is consistent with an estimate of Ns≈1. Assuming that a physical distance of 200 bp of the bicoid 3′ UTR corresponds to a value of 4Nr in the order of 10 (i.e., using the standard estimates of effective population size and recombination rate for D. melanogaster that are in the order of 10^{6} and 10^{−8}, respectively), this distance effect can be explained by our model only if Ns is ~5 (see Table 1). Thus, it appears that this value is not compatible with our estimate of Ns ≈ 1 obtained without taking recombination into account. However, one has to keep in mind that for technical reasons the analysis of Parsch et al. (2000) is based on pairing regions with covariations only. A much weaker distance effect would presumably result if all pairing regions were considered, including those with no covariations (Parschet al. 2000). A much larger data set and more sophisticated methods are needed to take the distance effect into account.
Linkage disequilibrium: Strong linkage disequilibrium is sometimes considered as evidence for epistatic selection (Lewontin 1974). We investigated the amount of linkage disequilibrium in the process of compensatory substitution. It is demonstrated that the level of linkage disequilibrium is very low during time period T_{1}. Strong positive linkage disequilibrium, however, is observed in phase T_{2} if selection is strong. This suggests that significant linkage disequilibria due to compensatory interactions should be rarely observed in natural populations because T_{2} is much smaller than T_{1} if selection is strong. On the other hand, if selection is weak, linkage disequilibrium is not very large even in phase T_{2}.
Schaeffer and Miller (1993) detected two clusters of polymorphisms in the Adh introns of D. pseudoobscura that exhibit significant linkage disequilibrium. In both cases, the disequilibria are due to two highly diverged haplotypes, ha1 and ha2, that have been shown to form different premRNA secondary structures (Kirbyet al. 1995). It was also revealed that this structural polymorphism has predated the species split of D. pseudoobscura, D. persimilis, and D. miranda because ha1 and ha2 are similar to D. persimilis and D. miranda haplotypes, respectively (Kirbyet al. 1995). This observation is not consistent with our results, which show that a compensatory substitution requires a long waiting time for a successful double mutant to occur and that the fixation event of ab follows relatively quickly. In other words, our model does not predict that the secondarystructureforming haplotypes of Adh are maintained for such a long time, as observed in these species. This suggests that our model of compensatory evolution is either too simple, as it allows only two sites to undergo base changes, or that some additional form of selection (for instance, balancing selection) may be maintaining the haplotypes ha1 and ha2. While there is no evidence for the latter suggestion, the fact that in both examples the haplotypes were subject to significant rearrangement during evolutionary time (due to insertions and deletions of bases) may indicate that the underlying compensatory process is much more complicated than our twolocus model assumes. Therefore, to model such complex compensatory changes, models need to be developed that include compensatory insertions and deletions in addition to base substitutions.
Acknowledgments
We thank an anonymous reviewer for helpful suggestions that improved the manuscript. John F. Baines and Ying Chen kindly provided unpublished results. This research was supported in part by a fellowship from the Japan Society for the Promotion of Science to H.I. and by National Institutes of Health grant GM58405 to W.S.
Footnotes

Communicating editor: G. B. Golding
 Received February 17, 2001.
 Accepted June 4, 2001.
 Copyright © 2001 by the Genetics Society of America