Abstract
A two-locus model of reversible mutations with compensatory fitness interactions is presented; single mutations are assumed to be deleterious but neutral in appropriate combinations. The expectation of the time of compensatory nucleotide substitutions is calculated analytically for the case of tight linkage between sites. It is shown that selection increases the substitution time dramatically when selection intensity Ns > 1, where N is the diploid population size and s the selection coefficient. Computer simulations demonstrate that recombination increases the substitution time, but the effect of recombination is small when selection is weak. The amount of linkage disequilibrium generated in the process of compensatory substitution is also investigated. It is shown that significant linkage disequilibrium is expected to be rare in natural populations. The model is applied to the mRNA secondary structure of the bicoid 3′ untranslated region of Drosophila. It is concluded that average selection intensity Ns against single deleterious mutations is not likely to be much larger than 1.
MODELS of compensatory evolution involve mutations from two or more loci. These mutations are assumed to be deleterious when they occur independently, but, in combination, they (at least partially) compensate each other for their deleterious effects. Kimura (1985) proposed a two-locus, two-allele model, with alleles A and a at the first locus and B and b at the second locus, as illustrated in Figure 1A. He assumed that the two intermediate haplotypes, Ab and aB, are deleterious, while the wild-type AB and the double mutant ab are neutral. He further assumed that selection intensity against the deleterious intermediates is so strong that their frequencies in a population are very low. Accordingly, he considered only unidirectional mutations from A to a and from B to b and ignored back mutations (Figure 2A).
An important example of compensatory evolution is found in RNA secondary structures. In single-stranded RNAs, Watson-Crick (WC) pairing of complementary nucleotide bases is the basic mechanism in the formation of stem-loop structures. It is believed that an individual mutation that breaks up a WC pairing is deleterious and that a second “compensatory” mutation at the complementary site can reestablish pairing and restore fitness. The relatively simple pattern of intramolecular WC base-pairing involved in RNA structures has made them a suitable model for the study of compensatory evolution (Stephan and Kirby 1993; Golding 1994; Schöniger and von Haeseler 1994; Kirbyet al. 1995; Muse 1995; Rzhetsky 1995; Tillier and Collins 1995; Stephan 1996; Higgs 2000; Savillet al. 2001). Most of these authors analyzed rRNA secondary structures.
Phylogenetic analysis has revealed a number of compensatory nucleotide changes between species. On the other hand, however, mismatches, including not only GU wobble pairs but also other noncanonical pairs, are frequently observed, indicating that selection against deleterious intermediates may not be very strong (Roussetet al. 1991; Parschet al. 2000). This suggests that Kimura's compensatory evolution model, which assumes strong selection against deleterious intermediates, may not be generally applicable to the evolution of RNA secondary structures.
In this article, a compensatory evolution model is described in which selection against deleterious single mutations is not necessarily strong but covers a broad range of selection coefficients. Under weak selection, deleterious haplotypes may increase in frequency and even fix in the population as illustrated in a simulation run shown in Figure 2B. Since back mutations play an important role when the frequencies of deleterious haplotypes become large, we use a two-locus, two-allele model in which bidirectional mutations are considered (Figure 1B). This model is different from those used by Kimura (1985), Iizuka and Takefu (1996), and Stephan (1996), who assumed that selection against deleterious intermediates is so strong that the mutation process may be considered unidirectional (see above). Our analysis is also different from that of Higgs (1998), who assumed bidirectional mutation but analyzed the model only in the parameter range of strong selection (i.e., Ns ⪢ 1, where N is the diploid population size and s the selection coefficient).
Models of compensatory evolution. (A) Kimura's (1985) model in which very strong selection is assumed and only unidirectional mutations are considered. (B) The model used in this article. Bidirectional mutations are considered. The fitnesses and frequencies are presented in parentheses.
Our goal is to calculate the time to proceed from the wild-type state AB to the fixation of the double mutant ab in this bidirectional mutation model. It should be noted, however, that neither AB nor ab is an absorbing state. That means we do not calculate a “fixation” time sensu stricto but do calculate the time to “flip” back and forth between AB and ab. This is possible as the nucleotide mutation rate is much smaller than 1/N (see below). To describe the transition between AB and ab, it is therefore reasonable to use the term “fixation” (or “substitution”). We present analytical results for the rate of this compensatory substitution event when there is no recombination between the two loci and use computer simulation to obtain this time under the influence of recombination. The theoretical results are applied to DNA sequence data of the bicoid 3′ untranslated region (UTR) of Drosophila to estimate the selection intensity against deleterious mutations.
Another purpose of this article is to evaluate the amount of linkage disequilibrium generated by the compensatory evolution model. It is known that epistatic selection may produce significant linkage disequilibrium in natural populations (Lewontin 1974). Schaeffer and Miller (1993) reported linkage disequilibria in two clusters of DNA polymorphisms in introns of Adh in Drosophila pseudoobscura. These disequilibria are likely due to epistatic selection maintaining pre-mRNA secondary structures (Kirbyet al. 1995). Here we examine whether these findings are consistent with our model of compensatory evolution.
THEORY
Consider a two-locus, two-allele model for a randomly mating population with N diploids. There are alleles A and a at the first locus and B and b at the second locus. The (bidirectional) mutation rate between A and a is given by μ1 and that between B and b is given by μ2 (Figure 1B). The mutation rates are assumed to be much smaller than 1/(2N), as is the case for nucleotide substitution rates. The relative fitnesses of the haplotypes AB, Ab, aB, and ab, are given by 1, 1 − s1, 1 − s2, and 1, respectively. Effects of fitness are assumed additive within locus and therefore the diploid model is equivalent to a haploid model. The haplotype frequencies are denoted by x0, x1, x2, and x3, respectively.
In the following, we consider the process of compensatory substitution under the joint action of drift, selection, mutation, and recombination. We assume the process starts at x0 = 1 at t = 0. First, mutations in AB produce Ab and aB types. Next, mutations in Ab and aB may create ab, and some of them fix in the population. Recombination between Ab and aB may also produce ab. We are interested in the expected time of compensatory substitution, defined as the time from t = 0 (when the system is at state x0 = 1) to the time point when the double mutant ab is fixed (x3 = 1). Because the mutation process is bidirectional, the latter state is not an absorption state. The time we are calculating is therefore a first passage time. Because of the assumption μi ⪡ 1/(2N), double mutants ab either get lost by drift or go to fixation; the proportion of ab haplotypes reverting back to Ab or aB is very small. Recombination can only retard the fixation process (Kimura 1985; Stephan 1996). The fixation time can be divided into two parts, T1 and T2. T1 is the waiting time for the “successful” double mutant ab (that will eventually get fixed) to appear in the population, and T2 is the time from the appearance of ab to the fixation event (see Figure 2).
Symmetrical model: Consider a symmetrical model with s = s1 = s2 and μ = μ1 = μ2. In the analytical derivations, recombination is neglected. The four haplotypes are divided into two groups: one consists of AB and ab, and the other is the group of the deleterious intermediates Ab and aB. Let X be the frequency of the group of deleterious intermediates (X = x1 + x2) and Y the frequency of the other group (Y = x0 + x3). Denote the distribution of X by Φ(X). In phase 1 when the system waits for a successful double mutant to appear, it will reach a quasi-equilibrium after a short initial period. At this quasi-equilibrium, its distribution is given approximately by
(A) Illustration of the process of compensatory substitution when selection is very strong and the intermediates Ab and aB are maintained in low frequencies. Solid circles present births of ab double mutants. (B) Illustration of the process of compensatory substitution when selection is weak. Ab and aB sometimes fix in the population.
Thus, we assume that when a new ab appears by a mutation in Ab or aB, the distribution of X before the mutation is given by (1). After the mutation, Y (= 1 − X) changes to Y′ = Y + 1/(2N). Let p be the fixation probability of the Y group. Since the mutation rate is assumed to be very small, p can be approximated by
In the case of neutrality, since p′ is 1/(2N), the time becomes
When selection is very strong, X is maintained in very low frequency (approximately at the deterministic mutation-selection balance). The expected frequency is then given by
T2 is the time from the appearance of a successful double mutant haplotype to the fixation event. In the case of neutrality, the expectation of T2 is ~4N (Kimura and Ohta 1969) since we assumed μi ⪡ 1/(2N). When selection is very strong, T2 may be close to 4N again. That is expected because x1 and x2 are very small and ab has almost no selective advantage in the population. T2 for moderate selection intensities is expected to be smaller than 4N because ab has a selective advantage when x1 and x2 are not very small.
General model: Consider a general model where μ1 ≠ μ2 and/or s1 ≠ s2. Recombination is again neglected in the analytical treatment. In this case, it is necessary to investigate the frequency distributions of Ab and aB separately. Let ϕ1(x) and ϕ2(x) be the frequency distributions of Ab and aB, respectively. To obtain these two distributions, we reconsider the symmetrical model, where the distribution of the sum of x1 and x2 is given by (1). In a population with N diploids, the probability that i haplotypes are Ab or aB is given by
Let P1(i) and P2(i) be the probability distributions of the numbers of Ab and aB, respectively. If we assume that μ is so small and Ns is so large that Ab and aB do not coexist frequently, P1(i) and P2(i) are given approximately by
Next we consider ϕ1(x) and ϕ2(x) in the general model, where μ1 ≠ μ2 and/or s1 ≠ s2. If we assume that Ab and aB do not coexist frequently, the frequencies of Ab and aB follow two independent distributions. From the arguments in Equations 9, 10 and 11, it is expected that ϕ1(x) and ϕ2(x) are given for x > 1/(4N) by
Denote by
Therefore, the expected number of ab that will fix in the population per generation is given by
T2 for the general model is the same as for the symmetrical model. That is, T2 ≈ 4N when selection is very strong and T2 < 4N when selection intensity is moderate.
COMPUTER SIMULATIONS
Computer simulations were carried out for the following reasons. The first one is to check the theoretical results for T1 shown above, because we use the following assumptions in the derivation. We use approximate formulas for the distribution of the haplotype frequencies ignoring the initial condition (X = 0 at t = 0). In the general asymmetric model, we assume that two haplotypes, Ab and aB, do not coexist. The second reason is to examine the effect of recombination on T1 under a broad range of selection coefficients. In contrast to the strong selection case (Kimura 1985; Stephan 1996), we were unable to find analytical expressions for T1 with recombination when selection is weak. T2 is also investigated by simulations with and without recombination. Another purpose of the simulations is to evaluate the amount of linkage disequilibrium generated in the process of compensatory substitution.
Monte Carlo simulations with mutation, selection, recombination, and random genetic drift were conducted in a constant size population of N diploids as follows. Each replication of the simulations starts from the initial condition (x0, x1, x2, x3) = (1, 0, 0, 0). In every generation, the frequencies are determined by the pseudosampling method (Kimura and Takahata 1983). The recombination rate between the two loci is assumed to be r per generation. Every 4N generations, the frequencies (x0, x1, x2, x3) are scored to investigate their frequency distributions. The amount of linkage disequilibrium (D = x0x3 − x1x2) is also calculated. Each replication ends when (x0, x1, x2, x3) = (0, 0, 0, 1) is reached for the first time, and T1 and T2 are recorded. When 0 ≤ t′ ≤ T2 (with t′ = t − T1), (x0, x1, x2, x3) is recorded every N/10 generation to calculate D.
Computer simulations with no recombination were conducted assuming N = 100, s = s1 = s2, and μ = μ1 = μ2, and the results for T1 and T2 are summarized in Table 1. The theoretical results for T1 in the symmetrical model were compared with the simulation results with no recombination (Figure 3). In Figure 3A, the theoretical expectation calculated by (5a) is shown with the simulation results of θ = 0.01. Ns was changed from 0 to 5. The theory is in very good agreement with the results of the simulations. Figure 3B shows the results of theory and simulations for a relatively high mutation rate (θ = 0.1) although our model assumes very low mutation rates. It is shown that Equation 5a gives a quite good approximation even for θ = 0.1, unless Ns = 0. The underestimation for Ns = 0 may occur because frequent recurrent mutations reduced the fixation probability of ab. The degree of reduction is expected to be relatively small when T1 is large. For all parameter sets examined, the standard deviation of T1 is similar to the mean (Table 1), supporting the exponential distribution of T1 as suggested by (5b). Similar results were obtained from computer simulations with N = 1000 (data not shown).
The effect of recombination on T1 was also investigated by computer simulations with θ = 0.01 and 0.1 (Table 1). The effect of recombination is very small when selection is weak. The effect, however, can be seen for selection intensity Ns > 1. In Figure 4, a clear positive correlation is observed between 4Nr and T1 when Ns = 2 and 5. These results are consistent with those of Kimura (1985) and Stephan (1996). Similar relationships between 4Nr and T1 were observed in other two-locus models with epistatic interactions (Michalakis and Slatkin 1996; Christiansenet al. 1998).
The results for T2 are also shown in Table 1. T2 is much smaller than T1 in all the parameter sets investigated. First, we consider T2 with no recombination. In the case of neutrality, T2 obtained from simulations for θ = 0.01 is close to 400 as expected from the theory, although T2 for θ = 0.1 is a little >4N. When selection is very strong (Ns ≥ 5), T2 is close to 4N again. T2 for moderate selection intensity is <4N.
T2 may be negatively correlated with recombination rate (Table 1). The degree of reduction in T2 is larger when selection is stronger. T2 for 4Nr = 0 is similar to that for 4Nr = 10 when Ns ≤ 1, while T2 for 4Nr = 10 is much smaller than that for no recombination when Ns ≥ 2. The result for the T2 phase may be understood as follows. Since recombination usually occurs between AB and ab in T2, it reduces the fixation probability of ab and increases T1. As the fixation probability is reduced, only ab, which increases its frequency quickly, can successfully fix in the population.
Theoretical results for T1 in the general model were compared with the results of computer simulations, and the results for θ1 = 0.02 and θ2 = 0.01 are shown in Figure 5 with the theoretical expectations calculated from (16). It is shown that the theoretical expectations are in good agreement with the results of simulations when Ns1 ≥ 0.5 and Ns2 ≥ 0.5. If one of the selection intensities is very small, Equation 16 overestimates T1 because the assumption used to derive (16) does not hold. In the derivation, we obtained approximate formulas for ϕ1(x) and ϕ2(x) under the assumption that Ab and aB do not coexist at the same time. If one of the selection intensities, say Ns1, is very small, ϕ1(x) given by (12) does not agree with the frequency distribution of Ab obtained by simulations. Similar results were obtained for other values of θ1 and θ2. Good agreement between the theory and simulations was observed when neither Ns1 nor Ns2 is small (data not shown).
Table 2 shows the amount of linkage disequilibrium (D = x0x3 − x1x2) in the process of compensatory substitution obtained by simulations under the symmetrical model with no recombination. D was calculated every 4N generations as long as 0 < t < T1. For this interval of 4N generations, we found almost no correlation between sampling. Mean and variance were calculated for all runs. For θ = 0.01 the level of linkage disequilibrium is extremely low. Even for θ = 0.1 D is very small although much larger than that for θ = 0.01. This indicates that almost no linkage disequilibrium is expected during the waiting time for the appearance of a successful double mutant haplotype. Simulations with recombination showed that recombination reduces the level of linkage disequilibrium (data not shown).
On the other hand, strong positive linkage disequilibrium is generated after a successful double mutant has appeared and is on its way to fixation (0 < t′ < T2). The average of linkage disequilibrium for t′ ≥ 0 is plotted in Figure 6. Selection increases the level of linkage disequilibrium significantly (Figure 6, A and B). Each distribution of D has a peak near t′ = 100. As Ns increases, the peak is getting larger and seems to saturate at D ≈ 0.2. Note that the theoretical maximum value of D is 0.25, which is reached when x0 = x3 = 0.5 and x1 = x2 = 0. When selection is weak, the level of linkage disequilibrium is higher for θ = 0.1 than for θ = 0.01.
Figure 6C shows the effect of recombination on the level of linkage disequilibrium when θ = 0.01 and Ns = 2. As the recombination rate increases, the level of linkage disequilibrium is getting weaker. Similar results were obtained for other mutation rates and selection intensities (data not shown).
DISCUSSION
Theory: We analyzed a model of reversible mutations with compensatory fitness interactions; i.e., single mutations are assumed to be deleterious but harmless (neutral) in appropriate combinations. In proceeding under mutation pressure, epistatic selection, and genetic drift from one fitness peak to another, a population must pass through a valley of lower individual fitness. This process of compensatory evolution is investigated by analytical approximation and computer simulation. Our model is more general than Kimura's (1985), which assumed that mutation pressure is unidirectional and that deleterious intermediates are very strongly selected against. In contrast, our model covers a broad range of selection coefficients, including very small ones, and agrees with analyses of Kimura's model when selection is strong (Kimura 1985; Stephan 1996).
Results of computer simulations for T1 and T2
Relationship between T1 and Ns under the symmetrical model without recombination. The theoretical expectation is obtained from (5a) and represented by a solid curve. The simulation results are based on Table 1 and presented by open circles. (A) Results for θ = 0.01. (B) Results for θ = 0.1.
As in these latter analyses, we focus on the expected time for the compensatory substitution process to go from one fitness peak to another. In addition, we study the structure of variation (linkage disequilibrium) within a population during this transition. The process of compensatory substitution is analyzed by dividing it into two phases defined by the time periods T1 and T2. T1 is the waiting time for a successful double mutant haplotype (ab) that will fix in the population, and T2 is the time from the appearance of a successful double mutant to the fixation event. The results of theory and computer simulations show that T1 ⪢ T2 (Table 1) as suggested by Stephan (1996).
Effect of recombination on T1 under the symmetrical model with θ = 0.01. The results of computer simulations are from Table 1.
The expectation of T1 is obtained analytically for the case without recombination. It is shown that selection dramatically increases T1 (Figure 3). Although the theory is based on the assumption of very low mutation rates, it is shown that it is also applicable to very high nucleotide mutation rates relative to 1/N (e.g., θ = 0.1) unless Ns is very small. Thus, our analysis is useful for natural populations for which θ is generally ⪡0.1 (Kimura 1983; Nei 1987; Gillespie 1991). Our computer simulations demonstrate that there is almost no effect of recombination on T1 when selection is relatively weak (Ns ≤ 1), while T1 increases significantly with increasing recombination rates when selection is strong (Figure 4).
Relationship between T1 and selection intensities Ns1 and Ns2 under the general model without recombination. The theoretical expectations are obtained from (16) and represented by solid curves. θ1 = 0.02 and θ2 = 0.01 are assumed. Solid circles, solid squares, open circles, and open squares represent the results of simulations for Ns1 = 0, 0.5, 1, and 2, respectively.
Results of computer simulations for linkage disequilibrium
Parameter estimation: An important parameter of the compensatory evolution model is the intensity of selection Ns against deleterious intermediates. In the following, we attempt to estimate this parameter for mRNA secondary structures. Our theoretical results predict that T1 is much larger than T1neu unless Ns is very small (Figure 3), suggesting that nucleotide substitutions occur very slowly in pairing regions of mRNA secondary structures. It is known that such regions are highly conserved between species in contrast to other (unpaired) regions (Muse 1995; Parschet al. 2000). Thus, it may be possible to estimate Ns in pairing regions from DNA sequence comparisons.
We compare the rates of substitutions between species in pairing regions with those in regions that are considered selectively neutral. As an example, we analyze the bicoid 3′ UTR of Drosophila. It has been shown that bicoid mRNA has a complex secondary structure in the 3′ UTR (Macdonald 1990; Seeger and Kaufman 1990; Ferrandonet al. 1997; Macdonald and Kerr 1998). Based on the alignment of DNA sequences from nine Drosophila species, Parsch et al. (2000) identified eight highly conserved pairing regions, of which seven have been supported by mutational analysis (Ferrandonet al. 1997; Macdonald and Kerr 1998). We consider these eight stems as pairing regions and the remainder of the 3′ UTR as unpaired. It is also assumed that there is no selection in the unpaired regions. The total length of the paired segments is 142 nucleotides, corresponding to 71 bp.
Linkage disequilibrium in phase T2. (A) Results of computer simulations without recombination for θ = 0.01. The effect of selection intensity is investigated. (B) Results of computer simulations without recombination for θ = 0.1. (C) Results of computer simulations for θ = 0.01 and Ns = 2. The effect of recombination is investigated.
Summary of nucleotide differences in the bicoid 3′ UTR of Drosophila
We first compare the nucleotide sequences between D. melanogaster and D. simulans, using the alignment suggested by Parsch et al. (2000). In 142 nucleotides of the pairing regions, 3 nucleotide differences are observed and the number of substitutions per site (dp) is estimated to be 0.0214 by the Jukes and Cantor (1969) method (Table 3). In the remaining regions of the 3′ UTR, 31 nucleotide differences are observed and the number of substitutions per site (dn) is estimated to be 0.0435. This suggests that the rate of nucleotide substitutions is reduced by roughly a factor of 2 in the pairing regions in comparison with the remainder of the 3′ UTR. For the pair of D. melanogaster and D. pseudoobscura, dp and dn are estimated to be 0.2099 and 0.5087, respectively. The ratio of dn to dp is ~2.4, similar to that of the comparison between D. melanogaster and D. simulans.
Relationship between T1/T1neu and Ns for various values of θ.
Since it is assumed that the unpaired regions are selectively neutral, the ratio of dn to dp is comparable to T1/T1neu. In Figure 7, T1/T1neu is plotted as a function of Ns for the symmetrical model without recombination. T1/T1neu is calculated by (5a) and (6). The results show that T1/T1neu increases rapidly with increasing Ns, as soon as Ns > 1. This is particularly the case for θ ≤ 0.01. T1/T1neu depends weakly on θ when θ ≤ 0.01.
In the bicoid 3′ UTR of Drosophila (Table 3), the ratio of dn to dp is ~2.0–2.4. The estimate of θ in the unpaired regions for a D. melanogaster population from Zimbabwe is ~0.003 (J. F. Baines, Y. Chen and W. Stephan, unpublished results). Figure 7 suggests therefore that the observed ratio of dn to dp can be explained if Ns is ~0.6–0.7. If we consider only the number of complete compensatory substitutions (WC/WC in Table 3) for the comparison between D. melanogaster and D. pseudoobscura, the ratio of dn to dp becomes ~5 and the estimate of Ns ≈ 1.
There are, however, some caveats.
Ns could be larger than this estimate because dn is underestimated if selection is acting in the regions that we consider as unpaired. There may be some evidence for weak selection in these regions. First, Macdonald (1990) suggested the possibility of long-range pairings encompassing almost the entire 3′ UTR. However, his suggestion was not supported by a strict phylogenetic analysis (Parschet al. 2000). Second, average silent divergence in the unpaired segments of the 3′ UTR between D. melanogaster and D. simulans is ~0.0435, which is about a factor of 2.5 lower than in the rest of the bicoid gene upstream of the 3′ UTR (J. F. Baines, Y. Chen and W. Stephan, unpublished results). On the other hand, even if weak selection is acting in the unpaired regions, the estimate of Ns for the pairing regions does not increase much, as the average time of compensatory substitutions becomes extremely large for Ns > 1 when mutation pressure is relatively weak (i.e., θ ≤ 0.01; Figure 7). Thus, it may be concluded that the selection intensity in the pairing regions of the bicoid 3′ UTR is on average not much >1. This estimate is similar to the estimate of average selection intensity for codon usage in Drosophila (Akashi 1995).
To estimate the selection intensity, we used the average dp of eight pairing regions. In other words, Ns is the average selection intensity of these eight pairing regions. Parsch et al. (2000) found heterogeneity for dp among pairing regions, caused by variation in both stem length and the physical distance between base-pairing residues. One reason is that long stems are under less selective constraints than short ones (see Figure 3A of Parschet al. 2000). Another factor is that short-range pairings (hairpins) experience a higher rate of evolution than long-range pairings because of the retarding effects of recombination when selection is sufficiently strong (Figure 4A of Parschet al. 2000). As a consequence, the estimates of Ns appear to vary substantially among pairing regions.
Parsch et al. (2000) were able to distinguish the effect of stem length from that of physical distance when only pairing regions with covariations were considered. According to their Figure 4A, they found an approximately fivefold drop in the rate of compensatory evolution over a physical distance of nearly 200 bp between base-pairing residues. We have to ask whether such a large decrease of the rate of compensatory evolution is consistent with an estimate of Ns≈1. Assuming that a physical distance of 200 bp of the bicoid 3′ UTR corresponds to a value of 4Nr in the order of 10 (i.e., using the standard estimates of effective population size and recombination rate for D. melanogaster that are in the order of 106 and 10−8, respectively), this distance effect can be explained by our model only if Ns is ~5 (see Table 1). Thus, it appears that this value is not compatible with our estimate of Ns ≈ 1 obtained without taking recombination into account. However, one has to keep in mind that for technical reasons the analysis of Parsch et al. (2000) is based on pairing regions with covariations only. A much weaker distance effect would presumably result if all pairing regions were considered, including those with no covariations (Parschet al. 2000). A much larger data set and more sophisticated methods are needed to take the distance effect into account.
Linkage disequilibrium: Strong linkage disequilibrium is sometimes considered as evidence for epistatic selection (Lewontin 1974). We investigated the amount of linkage disequilibrium in the process of compensatory substitution. It is demonstrated that the level of linkage disequilibrium is very low during time period T1. Strong positive linkage disequilibrium, however, is observed in phase T2 if selection is strong. This suggests that significant linkage disequilibria due to compensatory interactions should be rarely observed in natural populations because T2 is much smaller than T1 if selection is strong. On the other hand, if selection is weak, linkage disequilibrium is not very large even in phase T2.
Schaeffer and Miller (1993) detected two clusters of polymorphisms in the Adh introns of D. pseudoobscura that exhibit significant linkage disequilibrium. In both cases, the disequilibria are due to two highly diverged haplotypes, ha1 and ha2, that have been shown to form different pre-mRNA secondary structures (Kirbyet al. 1995). It was also revealed that this structural polymorphism has predated the species split of D. pseudoobscura, D. persimilis, and D. miranda because ha1 and ha2 are similar to D. persimilis and D. miranda haplotypes, respectively (Kirbyet al. 1995). This observation is not consistent with our results, which show that a compensatory substitution requires a long waiting time for a successful double mutant to occur and that the fixation event of ab follows relatively quickly. In other words, our model does not predict that the secondary-structure-forming haplotypes of Adh are maintained for such a long time, as observed in these species. This suggests that our model of compensatory evolution is either too simple, as it allows only two sites to undergo base changes, or that some additional form of selection (for instance, balancing selection) may be maintaining the haplotypes ha1 and ha2. While there is no evidence for the latter suggestion, the fact that in both examples the haplotypes were subject to significant rearrangement during evolutionary time (due to insertions and deletions of bases) may indicate that the underlying compensatory process is much more complicated than our two-locus model assumes. Therefore, to model such complex compensatory changes, models need to be developed that include compensatory insertions and deletions in addition to base substitutions.
Acknowledgments
We thank an anonymous reviewer for helpful suggestions that improved the manuscript. John F. Baines and Ying Chen kindly provided unpublished results. This research was supported in part by a fellowship from the Japan Society for the Promotion of Science to H.I. and by National Institutes of Health grant GM-58405 to W.S.
Footnotes
-
Communicating editor: G. B. Golding
- Received February 17, 2001.
- Accepted June 4, 2001.
- Copyright © 2001 by the Genetics Society of America