Abstract
Recombination is a main factor determining nucleotide variability in different regions of the genome. Chromosomal inversions, which are ubiquitous in the genus Drosophila, are known to reduce and redistribute recombination, and thus their specific effect on nucleotide variation may be of major importance as an explanatory factor for levels of DNA variation. Here, we use the coalescent approach to study this effect. First, we develop analytical expressions to predict nucleotide variability in old inversion polymorphisms that have reached mutation-drift-flux equilibrium. The effects on nucleotide variability of a new arrangement appearing in the population and reaching a stable polymorphism are then studied by computer simulation. We show that inversions modulate nucleotide variability in a complex way. The establishment of an inversion polymorphism involves a partial selective sweep that eliminates part of the variability in the population. This is followed by a slow convergence to the equilibrium values. During this convergence, regions close to the breakpoints exhibit much lower variability than central regions. However, at equilibrium, regions close to the breakpoints have higher levels of variability and differentiation between arrangements than regions in the middle of the inverted segment. The implications of these findings for overall variability levels during the evolution of Drosophila species are discussed.
CHROMOSOMAL inversion polymorphisms have been a cornerstone in the study of evolution all through the history of population genetics. Since the establishment of the modern synthesis, inversions have been a privileged system to study such diverse subjects as phylogenies, geographical clines, temporal cycles, meiotic drive, and, of course, to look for evidence of natural selection (see Krimbas and Powell 1992 for a review). In fact, the first studies on the role of natural selection in the maintenance of genetic polymorphisms, either in nature or in experimental populations, used inversions because they could be detected by means of simple cytological techniques and their frequency changes could easily be followed through the generations (Dobzhansky 1970; Lewontin 1981). The onset of electrophoresis and the allozymes era was followed by an intense search for linkage disequilibria between allozyme loci associated with inversion polymorphisms and between allozyme loci and the inversions themselves, because they were thought to be generated by epistatic selection (Prakash and Lewontin 1968; Zapata and Álvarez 1987, 1992, 1993; Krimbas and Powell 1992). Now, in the DNA era, inversions may still be useful places to look for selection (Kreitman and Wayne 1994; Depaulis et al. 1999), because, by reducing recombination, inversions can act as amplifiers of the effects on nucleotide polymorphism of selective phenomena such as hitchhiking with favorable mutations (Kaplan et al. 1989; Aquadro and Begun 1993; Aquadro et al. 1994) or deleterious background selection (Charlesworth et al. 1993; Charlesworth 1994; Hudson 1994; Hudson and Kaplan 1995).
Recombination affects levels of nucleotide polymorphism. In Drosophila, it accounts for one-quarter of the variance among genes in nucleotide diversity (see Moriyama and Powell 1996 for a review) and an increasing amount of evidence of the same trend is being gathered in other organisms, such as Mus domesticus (Nachman 1997), several species of the genus Lycopersicon (Stephan and Langley 1998), and humans (Nachman et al. 1998). It is also well established that recombination rates are strongly influenced by inversions (Sturtevant and Beadle 1936), the main reason being that, in the heterokaryotypes, crossing-over events within the inversion loop give rise to nonfunctional or nonviable aneuploid meiotic products, and recombination results only from multiple crossovers and gene conversion. Also, inversions are ubiquitous in Drosophila: more than three-quarters of all the species in the genus are polymorphic for paracentric inversions (Sperlich and Pfriem 1986; Krimbas and Powell 1992). It is, therefore, clear that inversions can strongly influence nucleotide polymorphism levels. This influence can take several forms. First, inversions reduce and redistribute recombination in heterokaryotypes (Navarro and Ruiz 1997; Navarro et al. 1997). Hence, the dynamics of selective sweeps and background selection in chromosomes segregating for different arrangements may be different (and their effects conceivably larger) than in chromosomes without inversions. Second, recombination is not uniformly distributed along chromosomes (Lindsley and Sandler 1977; Ashburner 1989; True et al. 1996) and when inversions change the position of genes, they are also changing their recombinational context in homokaryotypes. Third, because inversion polymorphisms are maintained by balancing selection (Dobzhansky 1970), they will increase the average life expectancy of nucleotide variability linked to them (Strobeck 1983; Hudson and Kaplan 1988; Kaplan et al. 1988).
Finally, the latter effect can be just the opposite with new inversions. A recently appeared inversion increasing its frequency may produce a selective sweep that can potentially eliminate variability in large segments of the chromosome. All these different factors may have powerful and contradictory effects on variability. The aims of this work are to obtain theoretical predictions concerning the amount and pattern of nucleotide variability associated with inversion polymorphisms and to shed some light on the overall effect of inversions on the level of nucleotide variability within species.
The basic tools to carry out such studies have been developed in recent years using the coalescent approach. Theoretical and simulation studies concerning DNA variability under balancing selection (Hudson and Kaplan 1988; Kaplan et al. 1988; Hey 1991; Nordborg 1997) or under subdivision and migration (Slatkin 1987; Strobeck 1987; Tajima 1989a, 1993; Notohara 1990; Nordborg 1997) are providing a detailed picture of the properties of the amount of DNA polymorphism in a population. Although the analogy between inversion systems and balancing selection or subdivided populations is clear, because all of them produce a structured population, the coalescent approach has never been explicitly applied to inversions. One of the causes of this vacuum may be found in the scarcity and somewhat contradictory nature of empirical information about the degree of exchange of genetic information between arrangements along the inverted chromosome (Krimbas and Powell 1992; Navarro et al. 1997). Also, the lack of detailed theoretical predictions of the effect of inversions on recombination has made it difficult to obtain realistic recombination values for every position along the inverted chromosomal region. A recent theoretical study by Navarro et al. (1997) provides such results. Given the physical and genetic lengths of an inversion, theoretical predictions of recombination and gene flux caused by crossing over and gene conversion between arrangements can be obtained for every site along the chromosome for heterokaryotypes.
The work presented here deals with the effect on neutral nucleotide variability of both new and old inversion polymorphisms. That is, we exclusively consider the effect on variability of inversions themselves. We focus on the two most common measures of DNA variability: the number of segregating sites and the average number of pairwise differences in a sample of DNA sequences (Watterson 1975; Tajima 1983, 1993). We study these two variability measures in a population of DNA sequences linked to a chromosome segregating for two arrangements. First, we analytically develop equations for the mutation-drift-flux equilibrium case, in which the inversion polymorphism is precisely balanced (i.e., inversion frequencies do not change from one generation to another). It is assumed that the polymorphism was established long enough ago for the DNA variability in the population to have reached mutation-drift equilibrium. Second, we explore by means of computer simulation cases in which DNA variability is not at equilibrium because the inversion polymorphism was recently established. Changing the age of the polymorphism in the simulations allows us to study the approach to the equilibrium values of DNA variability previously derived using analytical methods.
MODELS AND METHODS
We study the properties of a sample of n DNA sequences at a locus linked to a chromosome segregating for two arrangements, Standard (St) and Inversion (In), at frequencies p and q, respectively. The two arrangements differ by a single paracentric inversion and St is the older one. We denote by N the population size and by Φ the per-generation probability of gene exchange between arrangements, i.e., the probability that a DNA sequence recombines with the inversion, which happens only in heterokaryotypes and is referred to as the probability of gene flux (Navarro et al. 1997). It is assumed that karyotype frequencies are maintained approximately at Hardy-Weinberg equilibrium. Accordingly, the probability that a DNA sequence linked to a St chromosome descends from a sequence linked to an In chromosome in the previous generation is qΦ; the converse probability for an In chromosome is pΦ. Following the infinite-sites model (Kimura 1969), we assume that the DNA sequence is so large that every new mutation takes place in a previously unmutated site. The mutants are selectively neutral and μ is the mutation rate per sequence and per generation.
In the development of our analytical results, we make use of the analogy between inversion polymorphisms and subdivided populations. Thus, although results can be obtained in several other ways (see, for example, Hudson and Kaplan 1988; Hudson 1990; or Nordborg 1997), we follow Tajima's (1989a, 1993) approach to obtain equations giving the number of segregating sites in a sample of n alleles taken at random from a population that has reached mutation-drift-flux equilibrium. For the simulation studies, we use the standard principles of the coalescent process to construct the genealogical tree of a sample and the associated time for each branch (Hudson 1990). The analogy between inversion polymorphisms and subdivided populations is also used to adapt the general coalescent process to a population segregating for two chromosomal arrangements.
We follow a method analogous to the one described in Strobeck (1987) to construct phylogenetic trees for samples taken from such a population. For every sample, the simulation starts by generating a maximum of four random numbers, each derived from the appropriate exponential distribution, which represent the time until one of the four possible events affecting the sample (no simultaneous events are allowed): the time until the most recent flux event (for each arrangement with one or more alleles in the sample, tΦ(St → In) and tΦ(In → St)) and the time until the most recent coalescence event (within each arrangement with two or more alleles in the sample, tC(St) and tC(In)). The smallest of these four times is chosen and the sample is modified by the creation of the corresponding branches and nodes. The chosen time is associated with the newly created branches and the process starts over. The simulation stops when the most recent common ancestor of the sample is reached.
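The event loop just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the program used for the results reported here: the function names are hypothetical, and the per-generation rates used for the four events (coalescence i(i − 1)/(4Np) and j(j − 1)/(4Nq), flux iqΦ and jpΦ) are assumed forms that follow from a diploid population of size N and the flux probabilities qΦ and pΦ defined above. Only the total branch length and the time to the most recent common ancestor are tracked.

```python
import random


def simulate_genealogy(i, j, N, p, phi, rng):
    """Simulate one genealogy backward in time for i St and j In sequences.

    Returns (total_branch_length, tmrca), both in generations.
    Assumes 0 < p < 1 and phi > 0 so that the sample can always coalesce.
    """
    q = 1.0 - p
    total_length = 0.0
    tmrca = 0.0
    while i + j > 1:
        # Per-generation rates of the four possible events (assumed forms).
        rates = [
            i * (i - 1) / (4.0 * N * p),  # coalescence within St
            j * (j - 1) / (4.0 * N * q),  # coalescence within In
            i * q * phi,                  # flux: a St lineage traces back to In
            j * p * phi,                  # flux: an In lineage traces back to St
        ]
        # One exponential waiting time per event; impossible events never occur.
        waits = [rng.expovariate(r) if r > 0.0 else float("inf") for r in rates]
        t = min(waits)              # the earliest event is the one that happens
        event = waits.index(t)
        total_length += (i + j) * t
        tmrca += t
        if event == 0:
            i -= 1
        elif event == 1:
            j -= 1
        elif event == 2:
            i, j = i - 1, j + 1
        else:
            i, j = i + 1, j - 1
    return total_length, tmrca


def mean_pairwise_differences(i, j, N, p, phi, mu, replicates, seed=1):
    """Monte Carlo estimate of E(k) for a sample of two sequences (i + j == 2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replicates):
        length, _tmrca = simulate_genealogy(i, j, N, p, phi, rng)
        total += mu * length  # expected number of mutations on both branches
    return total / replicates
```

Calling `mean_pairwise_differences(1, 1, 1e6, 0.5, 1e-6, 1.25e-7, 20000)` estimates the expected number of differences between one St and one In sequence; under the assumed rates the estimate should settle near θ + 2μ/Φ.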
To study nucleotide variability in a new arrangement appearing in the population and reaching a stable polymorphism, we use the simulation method outlined in Braverman et al. (1995) after adapting it to overdominant, instead of directional, selection. Let the three karyotypes St/St, St/In, and In/In have fitnesses 1 − s1, 1, and 1 − s2, respectively. We start by constructing the tree in the way described in the previous paragraph, with the frequencies of the arrangements in the population being
The events of coalescence or gene flux between arrangements during the selective phase are simulated in the same way as in Braverman et al. (1995). We advance time on a per-generation basis. For each generation the probabilities of the four possible events are computed. Subtracting the sum of the four probabilities from one gives the probability that no event takes place during that generation. The probabilities of zero events are multiplied every generation until the product is less than a random number drawn from a uniform distribution between zero and one. When that happens, one of the four events is chosen in proportion to its probability in the current generation. Of course, the simulation is also exited if the most recent common ancestor of the sample is reached during the selective phase. The main difference between our algorithm and that used by Braverman et al. (1995), which considered directional selection, is that the differential equation describing the change of allele frequencies with time under overdominant selection lacks an analytical solution (Nei 1987; Nagylaki 1992) and, therefore, we compute the transition probabilities and change q on a per-generation basis.
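The two per-generation ingredients of this selective phase can be illustrated as follows. This is a hedged sketch: the function names are hypothetical, the frequency update is the standard deterministic recursion for the fitnesses 1 − s1, 1, and 1 − s2 under Hardy-Weinberg proportions, and the uniform draw is passed in as an argument `u` so that the waiting-time rule can be checked deterministically.

```python
def next_inversion_frequency(q, s1, s2):
    """One generation of overdominant selection on the In frequency q.

    Fitnesses: St/St = 1 - s1, St/In = 1, In/In = 1 - s2 (Hardy-Weinberg assumed).
    """
    p = 1.0 - q
    mean_fitness = 1.0 - s1 * p * p - s2 * q * q
    return q * (1.0 - s2 * q) / mean_fitness


def inversion_trajectory(q0, s1, s2, generations):
    """Deterministic forward trajectory of q, starting from frequency q0."""
    trajectory = [q0]
    for _ in range(generations):
        trajectory.append(next_inversion_frequency(trajectory[-1], s1, s2))
    return trajectory


def generations_until_event(event_probabilities, u):
    """Generations elapsed until the product of no-event probabilities drops below u.

    event_probabilities: iterable giving, for each successive generation, the
        summed probability of the four possible events in that generation.
    u: a draw from Uniform(0, 1), supplied by the caller.
    Returns None if the iterable is exhausted before an event occurs.
    """
    no_event_product = 1.0
    for generation, prob in enumerate(event_probabilities, start=1):
        no_event_product *= 1.0 - prob
        if no_event_product < u:
            return generation
    return None
```

With this recursion the In frequency rises sigmoidally toward s1/(s1 + s2), the fixed point of the update; in a backward-in-time simulation the stored trajectory would be traversed from its end toward the origin of the inversion.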
We illustrate and discuss our results using parameter values from Drosophila because most of the evidence on inversions and on nucleotide variability comes from this genus. The mutation rate per nucleotide per generation of Drosophila melanogaster ranges between 10⁻⁸ and 10⁻⁹ (Powell 1997). In the same species, the population size has been estimated to be on the order of 10⁶ (Kreitman 1983; Powell 1997), the average θ (= 4Nμ) per nucleotide being ~0.005 (Hudson 1993). To avoid the use of many decimals we will focus all through this article on alleles of 100 nucleotides and, therefore, on a θ value of 0.5.
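As a quick check of this scaling, the arithmetic can be written out. The only value taken from elsewhere is the population size of one million used in the figure legends; the per-allele mutation rate that results is the one quoted there.

$$
\theta_{\text{allele}} = 100 \times \theta_{\text{nucleotide}} = 100 \times 0.005 = 0.5,
\qquad
\mu = \frac{\theta_{\text{allele}}}{4N} = \frac{0.5}{4 \times 10^{6}} = 1.25 \times 10^{-7}.
$$

Dividing by 100 gives a per-nucleotide rate of 1.25 × 10⁻⁹, which lies within the range quoted above for D. melanogaster.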
Inversions affect our model by modifying gene flux rates all along the inverted segment. According to Navarro et al. (1997) and Navarro and Ruiz (1997), the gene flux per nucleotide and per generation between arrangements will range between Φ = 10⁻² in the center of a large inversion and Φ = 10⁻⁸ in regions close to the breakpoints of a short inversion. This predicted range includes most of the empirically estimated gene flux values available in the literature: 10⁻⁴ in the central region of inversion In(3L)Payne of D. melanogaster (Payne 1924); 10⁻⁵ in the central region of inversion In(3R)P18 of D. melanogaster (Chovnick 1973); and 10⁻⁷ near the breakpoints of O3+4/OSt heterokaryotypes in D. subobscura (Rozas and Aguadé 1994). Details on how to obtain Φ values for any site along the chromosome can be found in Navarro et al. (1997).
ANALYTICAL RESULTS
We use the coalescent approach to study the variability of n DNA sequences, among which i sequences are randomly chosen from St chromosomes and j (= n − i) sequences from In chromosomes. Let Q(i, j) represent the state of the sample. In terms of the genealogical relationships of the sequences in the sample, that is, going back in the past, there are four possible adjacent states into which Q(i, j) can move in a single generation, namely, Q(i − 1, j), Q(i, j − 1), Q(i − 1, j + 1), and Q(i + 1, j − 1). The first two changes represent common ancestor events and the last two represent gene flux events. The probabilities of these events are (derived following Hudson 1983 and Tajima 1989a):
Let S(i, j) be the expected number of segregating sites in a sample in state Q(i, j) taken at random from a population at mutation-drift-flux equilibrium. Given the infinite-sites model, the number of segregating sites is the number of mutations that take place while Q(i, j) is converging to Q(1, 0) or Q(0, 1). To calculate this number we must first consider the sojourn time of the sample, i.e., the expected number of generations during which Q(i, j) does not change. The probability that Q(i, j) changes to one of the four adjacent states in a single generation is
Given that Q(i, j) changes, the conditional probabilities that it changes to each one of the four adjacent states are easily obtained from Equation 1. With those probabilities and (3) we can readily obtain an iterative expression for S(i, j),
From (4), S(i, j) can be computed for every value of i and j. For instance, when n = 2,
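A recursion of this kind can be evaluated numerically. The sketch below is an illustration under stated assumptions rather than a transcription of Equations 1–4: it assumes per-generation event rates of i(i − 1)/(4Np) and j(j − 1)/(4Nq) for common ancestor events and iqΦ and jpΦ for gene flux events, adds nμ mutations per generation of sojourn, and sets S(1, 0) = S(0, 1) = 0. Because flux events link states with the same n, each sample size is solved as a small linear system. All function names are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[k]] for k, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def expected_segregating_sites(n, N, p, phi, mu):
    """Return {(i, j): S(i, j)} for every state with i + j == n (n >= 1)."""
    q = 1.0 - p
    previous = {(1, 0): 0.0, (0, 1): 0.0}  # one sequence: no segregating sites
    for size in range(2, n + 1):
        matrix = [[0.0] * (size + 1) for _ in range(size + 1)]
        rhs = [0.0] * (size + 1)
        for i in range(size + 1):  # unknown number i stands for state (i, size - i)
            j = size - i
            coal_st = i * (i - 1) / (4.0 * N * p)  # (i, j) -> (i - 1, j)
            coal_in = j * (j - 1) / (4.0 * N * q)  # (i, j) -> (i, j - 1)
            flux_st = i * q * phi                  # (i, j) -> (i - 1, j + 1)
            flux_in = j * p * phi                  # (i, j) -> (i + 1, j - 1)
            matrix[i][i] = coal_st + coal_in + flux_st + flux_in
            if i > 0:
                matrix[i][i - 1] = -flux_st
            if j > 0:
                matrix[i][i + 1] = -flux_in
            rhs[i] = size * mu  # mutations accumulated per generation of sojourn
            if i > 1:
                rhs[i] += coal_st * previous[(i - 1, j)]
            if j > 1:
                rhs[i] += coal_in * previous[(i, j - 1)]
        solution = solve_linear_system(matrix, rhs)
        previous = {(i, size - i): solution[i] for i in range(size + 1)}
    return previous
```

For n = 2 the number of segregating sites equals the number of pairwise differences, so `expected_segregating_sites(2, 1e6, 0.5, 1e-6, 1.25e-7)` gives within- and between-arrangement expectations directly. Under the assumed rates the between-arrangement value works out to θ + 2μ/Φ whatever the arrangement frequencies.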
The results presented so far allow us to study the effect on variability of a precisely balanced inversion polymorphism that reached mutation-drift-flux equilibrium a long time ago. Table 1 gives the values of
Table 1. Theoretical expectations for mutation-flux-drift equilibrium
Flux rates affect the variability in the population as a whole, which increases as flux decreases. Flux rates of 10⁻² or higher make E(k) and
Table 2. Theoretical expectations for mutation-flux-drift equilibrium
The frequency of the chromosome arrangements in the population also has a remarkable effect on variability. The maximum values of E(k) and
Both increases in variability can be explained by the same mechanism. With low flux and a lot of drift (mainly in the low-frequency arrangement), the two kinds of chromosomes are highly differentiated and, therefore, almost every allele coming by recombination from the other arrangement will be absent in the recipient arrangement. These new alleles add new variability at a higher rate than mutation. This effect overpowers drift and increases with decreasing flux. It only disappears with gene flux rates ≪10⁻⁸ (i.e., very close to zero and smaller than the mutation rate we assume). In that case E(k) and
Differentiation between arrangements can be measured by means of the number of pairwise differences between an In allele and a St allele (Equation 6b). As we can see in Table 2, equilibrium pairwise differences between arrangements do not depend on inversion frequencies and standard deviations are practically unaffected by them, which agrees with previous results (Nordborg 1997).
SIMULATION RESULTS
The simulation program described in models and methods allows us to obtain E(k) and
In Tables 3 and 4 we can see the values of E(k) and
Table 3. Simulation results
However, the variability differences caused by selection coefficients differing by as much as an order of magnitude are not very important (compare Tables 3 and 4). The reason for that must be sought in the approach of arrangement frequencies to equilibrium. Under overdominant selection, In frequencies increase in a sigmoidal way and, therefore, for much of the time since the appearance of the inversion its frequency is either close to zero, which makes it irrelevant, or close to the equilibrium point,
Table 4. Simulation results
The convergence to the equilibrium variability in the population as a whole is drawn in Figure 1a. During the first million generations, almost no new diversity is added to the population. Gene flux, having higher rates than mutation, plays a very important role during this phase because it homogenizes variabilities within the two arrangements. Only after the first several million generations has mutation added enough variability to reach the equilibrium. With high gene flux (Φ = 10⁻²) the equilibrium point is independent of the frequency of inversions. On the other hand, lower gene flux (Φ = 10⁻⁶) makes the equilibrium variabilities higher for intermediate arrangement frequencies. Note that the equilibrium points obtained by simulation are equivalent to those obtained analytically in the previous section.
Figure 1b shows the changes in E(k) between two In alleles during the convergence to equilibrium. The convergence process within an average inversion is plotted in Figure 2a. As we can see in these figures, gene flux plays a key role in determining both the amount of variability that is lost during the origin of the inversion polymorphism and the speed at which this lost variability is recovered. With low gene flux, nucleotide variability within the newly appeared arrangement is zero, or very close to zero, during the first 10⁵–10⁶ generations. Convergence to mutation-drift-flux equilibrium is slow because of the scarce amount of variability incoming from St chromosomes. On the other hand, with high rates of gene flux the variability within In chromosomes is very close to the variability left in St chromosomes after the partial sweep and the convergence to equilibrium is faster. Moreover, during the first million generations, gene flux is the main cause of the increase of variability within inversion chromosomes because it adds new variability (imported from standard chromosomes) at higher rates than mutation.
The process that is meanwhile taking place within St chromosomes is represented in Figures 1c and 2b. In this case, of course, gene flux has little influence on the initial variability. It does, however, affect the way in which variability changes, as well as the equilibrium points. We can see that with low flux rates, during the first 10⁵–10⁶ generations, variability within St chromosomes decreases. This can be explained by a sink-source mechanism: the relatively great allele diversity stored in St chromosomes is transferred by flux to In chromosomes, where low flux rates forced an initial elimination of variability. This process lasts until the homogenization of the two arrangements; hence, variability decreases within St chromosomes while increasing within In chromosomes. In chromosomes can undergo a similar temporary variability decrease if flux rates are high and the new inversion reaches a high frequency (Figure 1b).
Figure 1. Simulation results. Approach of nucleotide variability to mutation-drift-flux equilibrium. Abscissa: decimal logarithm of the number of generations since the stabilization of the polymorphism. Ordinate: (a) E(k) for the entire population; (b) E(k) for the In chromosomes; (c) E(k) for the St chromosomes; (d) E(k) for two alleles taken at random, one from the pool of St chromosomes and the other from the pool of In chromosomes. Φ = 10⁻² stands for gene flux in the center of an average inversion and Φ = 10⁻⁶ for gene flux around the breakpoints. N = 10⁶ and μ = 1.25 × 10⁻⁷, so the equilibrium E(k) for a population without inversions is 0.5 (dotted line).
In relation to the time dynamics of the pairwise differences between arrangements, Figures 1d and 2c show that, as shown in the ANALYTICAL RESULTS section, the equilibrium values of E(k) depend only on gene flux. On the other hand, during the first 10⁵–10⁶ generations of polymorphism, the pairwise differences between In and St chromosomes depend only on arrangement frequencies.
DISCUSSION
On its way to the establishment of a balanced polymorphism, a newly arisen inversion sweeps a lot of variability from the population. Just after the stabilization of the arrangement frequencies, the chromosomes bearing the newly appeared arrangement will have almost no variability (Tables 3 and 4). As the polymorphism grows older, a slow convergence to mutation-drift-flux equilibrium starts. At this equilibrium, the level of DNA polymorphism in the population as a whole can be higher than in a population of the same size without segregating arrangements (Table 1 and Figure 1). Which of these two effects of an inversion, to reduce or to increment variability, will prevail depends on the time that it takes to reach equilibrium. The convergence to the equilibrium values proceeds at very different speeds, strongly depending on gene flux rates, and, thus, it proceeds differently in different regions of the inverted segment.
In regions close to the breakpoints, flux rates are very low (Figure 2) and, therefore (1) the strength of the partial sweep is greater, and almost no variability is left within the new arrangement; and (2) the linkage disequilibria generated by the frequency increment of the new inversion will persist for a long time (Navarro et al. 1996). Around the breakpoints, only new mutations supply variability to the new arrangement and it is unlikely that new mutants will be exchanged between arrangements. Therefore, in regions close to the breakpoints of In chromosomes, variability will be very low until enough new mutants are added, which will take >10⁶ generations (Figures 1b and 2a). Around the breakpoints of St chromosomes, variability is initially greater than that of In chromosomes, but decreases afterward over several generations (Figure 1c) because gene flux tends to homogenize the two arrangements and no variability was left in the In chromosomes. Although nucleotide variability in relation to inversions has been studied by several authors (Aquadro et al. 1986; Aguadé 1988; Bénassi et al. 1993; Rozas and Aguadé 1993, 1994; Wesley and Eanes 1994; Popadić and Anderson 1995; Popadić et al. 1995; Andolfatto et al. 1999; Cáceres et al. 1999; Depaulis et al. 1999; Rozas et al. 1999), the only available study in which nucleotide variability was surveyed simultaneously at different positions of a chromosome segregating for two arrangements separated by a single inversion has been carried out by Hasson and Eanes (1996). Our theoretical expectations are consistent with their findings. First, the breakpoints of inversion In(3L)Payne of D. melanogaster host 20 times less polymorphism (π = 0.0003) than the breakpoints of St chromosomes (π = 0.0060).
Furthermore, the Hsp83 gene locus, which is close to, but not exactly at, the distal breakpoint, presents higher levels of variability (π = 0.0067) and lower levels of differentiation between arrangements (Nei's d = 0.0053) than the breakpoint itself (π = 0.0058, d = 0.0068), although the differences are not statistically significant (Hasson and Eanes 1996).
Figure 2. Simulation results. E(k) for different positions along the inverted chromosome. Arrangement frequencies reached equilibrium an infinite time ago (solid lines), 1 million generations ago (long dashed lines), or 1 hundred generations ago (short dashed lines). Note how the mutation-drift-flux equilibrium is built up. (a) E(k) for two random In alleles. (b) E(k) for two random St alleles. (c) E(k) for two alleles taken at random from the entire population. N = 10⁶, μ = 1.25 × 10⁻⁷ (θ = 0.5). We consider an average inversion (30 cM long, lying 10 cM from the centromere) at frequency q = 0.5; gene flux values for 21 evenly spaced sites along the inverted segment are obtained according to Navarro et al. (1997).
In the population as a whole, nucleotide variability around breakpoints is low during the first 10⁵–10⁶ generations. The higher the frequency of In chromosomes, the lower the levels of DNA polymorphism (Figure 1a). At equilibrium, on the other hand, low gene flux rates induce substantial differentiation between St and In chromosomes, which causes an increment in the level of variability of the whole population (Table 1, Figures 1d and 2c). This enhancement of polymorphism levels is due to the extension of the average lifetime of mutants caused by balancing selection (Hudson and Kaplan 1988; Kaplan et al. 1988) and it is greater with lower gene flux and intermediate arrangement frequencies.
Gene flux rates are higher in the center of the inverted regions and hence (1) gene flux preserves some of the starting variability from the initial sweep by sheltering it in the new arrangement; and (2) the differentiation between arrangements will decrease at a steady rate, as new mutations are exchanged and some of the variability stored in St chromosomes enters the inversion by gene flux. Inverted chromosomes, therefore, have a higher starting level of polymorphism with higher flux rates (Figure 1b). The smaller the frequency of inversions and the greater the flux, the higher the starting polymorphism level. On the contrary, neither the initial variability within St chromosomes (Figure 1c) nor the initial differentiation between St and In (Figure 1d) is affected by gene flux rates. The main differences between the central region and the regions around the breakpoints arise from the buildup of the equilibrium in the central region of inversions and from the equilibrium state itself. When equilibrium has been achieved, higher flux rates make the amount of variability in the central zone smaller than that of the regions around breakpoints (Figure 2c). On the other hand, higher flux rates during the convergence to equilibrium allow for a rapid increase in polymorphism levels and a decrease in differences between arrangements, which starts at about generation 10⁵ (Figure 1). Again this result is consistent with the finding by Hasson and Eanes (1996) of higher levels of nucleotide variability in the Est-6 gene (π = 0.0192), which lies approximately in the middle of inversion In(3L)Payne, than at the breakpoints of the inversion (π = 0.0058). Also, the silent polymorphism levels for Est-6 were roughly similar in both arrangements (π = 0.0162 in St and π = 0.0200 in In).
In this analysis we have focused on neutral variability without considering any explicit source for the overdominance of the inversion. The selective maintenance of inversion polymorphisms has been the subject of abundant theoretical work. Some models consider associations of the inversion with either a single gene or a group of genes with additive relationships (e.g., Nei et al. 1967; Ohta and Kojima 1968). Under those models, the establishment of an inversion polymorphism would be a rare event, because, eventually, the linkage disequilibrium between the selected loci and the inversion would break down, rendering the polymorphism unstable and allowing the inversion to drift away until fixation or loss. The most widely accepted models for the maintenance of inversion polymorphisms consider their association with a complex of genes where epistatic selection maintains gametic disequilibrium. Essentially, a new inversion can reach a stable polymorphism only if it occurs in chromosomes carrying an excess gametic type (Charlesworth and Charlesworth 1973; Charlesworth 1974; see Krimbas and Powell 1992 for a review). Under these conditions, a stable polymorphism can be reached because certain recombination events within heterokaryotypes generate unfavored gametic types that would be eliminated by selection. This would have the additional effect of reducing recombination even further and, thus, increasing differentiation between arrangements.
Inversions are the most common form of chromosomal change in the evolutionary history of Drosophila. More than 28,000 paracentric inversions are estimated to be currently segregating in natural populations of Drosophila and >42,000 paracentric inversions have become fixed during the evolution of the genus (Sorsa 1988). To what extent has all this continuous chromosomal reorganization affected DNA variability? Ranz et al. (1997) studied the divergence of chromosomal element E between D. melanogaster and D. repleta. Although evolutionary rates may vary between elements and lineages, a rate of fixation of inversions of approximately one inversion per million years was estimated for the E element (Ranz et al. 1997, 1999). Because most inversions will never reach fixation and will be lost after segregating for some time, we can consider a million years as an overestimate of the average lifespan of an inversion. This time is equivalent to 5 × 10⁶–10⁷ generations if we take 5–10 generations per year as an average for the Drosophila genus (Ashburner 1989; Powell 1997). An examination of Figure 1 yields the conclusion that at least 10⁷ generations are needed to reach mutation-drift-flux equilibrium. It follows that it is very unlikely to achieve equilibrium within the inverted segment. If gene flux rates are ≤10⁻⁴, inversions can increase variability levels when mutation-drift-flux equilibrium is reached (Table 1). However, at least 10⁷ generations are needed to achieve equilibrium, and most of the time variability is lower in low gene flux regions (Figures 1 and 2). This fact may imply that, other things being equal, chromosomes and/or species having high levels of inversion polymorphism will have lower levels of DNA polymorphism. It has been pointed out by Akashi (1996), Begun (1996), and Moriyama and Powell (1996) that the autosomes of D. melanogaster, which are moderately polymorphic for inversions, have lower variability than those of D. simulans, which have no known polymorphic inversions. Stronger hitchhiking and background selection in D. melanogaster have been proposed as possible causes of this striking correlation (Begun 1996). Our results show that the presence of inversions in the evolutionary history of D. melanogaster may help to explain its lower levels of nucleotide polymorphism without appealing to other selective forces. Although some caution must be raised because D. melanogaster levels of variability may have been underestimated (Begun 1996; Labate et al. 1999), the extant data fit the expected pattern. Of course, had the inversions been maintained for a longer time, or had the mutation rates been higher, inversions would produce the opposite effect and increase variability. On the other hand, factors, like fluctuating selection, that cause oscillations in the frequencies of the arrangements will boost the loss of neutral variability. Further research concerning all these possibilities is currently under way.
Acknowledgments
We thank P. Andolfatto, N. Barton, A. Berry, E. Betrán, B. Charlesworth, A. Clark, F. Depaulis, J. Rozas, and two anonymous reviewers for valuable discussion and criticism. Work was supported by a Formació del Personal Investigador (FPI) fellowship from the DGU (Generalitat de Catalunya, Spain) to A.N. and grant PB95-0607 from the DGICYT (Ministerio de Educación y Ciencia, Spain) to A.R.
APPENDIX
Variances for k(n), the number of pairwise differences in a sample of n alleles, can be easily found for n = 2 by developing an expression for pairwise identities and using it as the moment generating function of the distribution of coalescence times (Hudson 1990). Doing so, we obtain the rather bulky expressions
If n increases, the mean number of pairwise differences remains the same but, of course, the variance decreases. However, as variances decrease the expressions giving them increase in size and in number (for example, if n = 10 one has to obtain 11 different enormous expressions).
Variances for the simplest case (p = q) can be obtained with some pain following Wakeley (1996). We end up with two expressions equivalent to those in Wakeley (1996). The first one gives the variance of k(n) when all the n alleles are linked to the same arrangement:
The second one gives the variance of k(n) when i alleles are linked to St and j alleles to In:
Footnotes
- Communicating editor: A. G. Clark
- Received August 15, 1999.
- Accepted February 14, 2000.
- Copyright © 2000 by the Genetics Society of America