Abstract
There has recently been considerable debate over the relative importance of selection against hybrids (“endogenous” selection) vs. adaptation to different environments (“exogenous”) in maintaining stable hybrid zones and hence in speciation. Singlelocus models of endogenous and exogenous viability selection generate clines of similar shape, but the comparison has not been extended to multilocus systems, which are both quantitatively and qualitatively very different from the singlelocus case. Here we develop an analytical multilocus model of differential adaptation across an environmental transition and compare it to previous heterozygote disadvantage models. We show that the shape of clines generated by exogenous selection is indistinguishable from that generated by endogenous selection. A stochastic simulation model is used to test the robustness of the analytical description to the effects of drift and strong selection, and confirms the prediction that pairwise linkage disequilibria are predominantly generated by migration. However, although analytical predictions for the width of clines maintained by heterozygote disadvantage fit well with the simulation results, those for environmental adaptation are consistently too narrow; reasons for the discrepancy are discussed. There is a smooth transition between a system in which a set of loci effectively act independently of each other and one in which they act as a single nonrecombining unit.
SPECIATION occurs when selection generates barriers to gene flow between two populations. The structure of these barriers to gene flow can be investigated in cases where they are not entirely effective: that is, in hybrid zones where the effect of natural selection in maintaining the integrity of the parental genomes is undermined by the homogenizing effects of migration and recombination, generating a set of clines (Barton and Hewitt 1985). Such intermediate stages—systems that endure despite having neither achieved nor abandoned reproductive isolation—provide indirect means of observing the mechanisms preventing gene flow and hence of elucidating the components of speciation. Hybrid zones therefore offer a “window on the evolutionary process” (Harrison 1990). In this article, we use theoretical models of hybrid zones to describe and compare the effects of alternative types of selection, specifically adaptation to different environments or hybrid unfitness, in maintaining divergence.
Hybrid zones have been described in a wide range of organisms (for reviews, see Barton and Hewitt 1981b, 1985; Hewitt 1988; Harrison 1990, 1993; Arnold 1997). They usually involve divergence at many loci and can often be described as a set of parallel clines in gene frequency and morphological traits, which are maintained by a balance between dispersal and selection. Earlier theoretical descriptions of such clines concentrated on single (or at most two) locus models, with continuous approximations for time and space, and gene flow described by diffusion (Fisher 1937, 1950; Haldane 1948; Bazykin 1969, 1973; Slatkin 1973, 1975; Nagylaki 1975, 1976a; Endler 1977). However, if clines at multiple loci coincide, as is the case in many hybrid zones, associations between loci, or linkage disequilibria, will be generated by the mixing of populations with different allele frequencies (Li and Nei 1974). Selection on one gene spills over onto any genes that are in linkage disequilibria with it, increasing the effective selection on each locus and generating a sharp step in gene frequency. Singlelocus models cannot adequately represent such a system: consider, for example, the dynamics of a neutral locus linked to a set of selected loci, the effect of which would be lost in a singlelocus model. The extension of the above models to clines at multiple loci offers quantitative estimates of the magnitude of natural selection (Barton and Hewitt 1983; Barton 1986; Barton and Bengtsson 1986). For example, measurements of the shape of clines in genetic markers and the statistical associations between them can be used to estimate the strength of the barrier to gene flow, and even the number of genes under selection, as well as dispersal rates. These methods have been applied to a variety of hybrid zones [Bombina (Szymura and Barton 1986, 1991); Heliconius (Malletet al. 1990); Podisma (Barton 1980; Barton and Hewitt 1981a; Jackson 1992); Pontia (Porteret al. 1998); Sceloporus (Siteset al. 1995); and Uroderma (Barton 1982)].
Various different forms of clines can be distinguished. Dispersal may be negligible, in which case selection maintains a stable equilibrium at each locality (p. 116, Barton and Hewitt 1985). More often, as considered in the diffusion models cited above, clines are maintained in a balance between dispersal and some form of selection. The selection that acts to maintain ancestral divergence can be classified according to whether it maintains different alleles in different environments or in different genetic backgrounds. After Moore and Price (1993), these scenarios can be termed, respectively, (i) “exogenous” selection, with fitness defined in relation to the environment and a geographic selection gradient determining the relative fitness of genotypes, and (ii) “endogenous” selection, with fitness defined by withingenome interactions such as heterozygote disadvantage or epistasis, and being independent of environment. With purely exogenous selection, each allele has an effect on fitness that depends on location and not on the other alleles with which it is combined (i.e., no dominance or epistasis). With purely endogenous selection, fitness is independent of location, though fitness differences may of course still be mediated by interactions with the environment (for example, by different abilities to escape predators).
In reality, natural hybrid zones may well include both kinds of selection. Nevertheless, the relative predominance of either type has been the subject of considerable debate in the hybrid zone literature. Reviews of studies of hybrid zones in >100 taxa (Barton and Hewitt 1981b, 1985) concluded that most are maintained by selection against hybrids independent of the environment, i.e., they are “tension zones” (Key 1968). With this justification, Barton's(1983, 1986) multilocus models invoked endogenous selection, specifically heterozygote disadvantage; simulations showed that the effects of epistatic endogenous selection were qualitatively similar (Barton and Gale 1993; Barton and Shpak 1999). However, there is also substantial evidence of an important role for exogenous selection in hybrid zones (Arnold 1997; Harrison 1990). In singlelocus models, the shape of the clines generated by exogenous vs. endogenous viability selection is effectively indistinguishable (Barton and Hewitt 1989; Barton and Gale 1993), but it is not clear whether the similarity extends to multiplelocus systems. Here, we develop an analogous treatment to Barton's earlier model under the alternative selection regime: exogenous selection. We consider the simplest case of two parapatric populations adapted to different environments (Haldane 1948). The shape of clines generated by the two forms of selection, their effects on linkage disequilibria, and the transition from a situation in which multiple loci act independently of each other to that in which the linkage disequilibria are strong enough to bind alleles of shared ancestry together are compared. For simplicity, we consider throughout selection pressures defined for individual genotypes, as would be the case for viability selection— see Gavrilets (1997b) for an illustration of the extra complexity introduced by fertility selection in even a singlelocus model.
In this analytical treatment of clines maintained in a dispersalselection balance, we rely on a key assumption that selection is weak. The analysis then becomes tractable as higherorder terms in s (the selection coefficient) and linkage disequilibria of third and higher order can be treated as negligible. We develop a simulation model to test the robustness of the analytical descriptions and their assumptions as selection becomes stronger. Finite population sizes in the simulation incorporate the effects of drift, necessarily ignored in the deterministic analytical models but known to widen clines (Slatkin and Maruyama 1975). We also investigate the assumption that linkage disequilibria are generated primarily by migration (and hence can yield estimates of gene flow) and test whether the shapes of clines observed in simulations match those predicted by analytical models of both environmental selection and heterozygote disadvantage.
SIMULATION METHODS
Simulation model: The simulation model is based on that used by Barton and Gale (1993; Jackson 1992), extended to allow selection on diploids rather than haploids. A onedimensional steppingstone model is used to simulate contact between two populations initially fixed for two alternative alleles at a set of loci. We refer to the two types of alleles as 0, from the lefthand population, or 1, from the right. The population is distributed along a chain of demes, each of which contains N diploid individuals. Migration occurs between adjacent demes, with a fixed proportion m of individuals moving per generation, half in either direction; migrants are chosen at random. With this model of nearestneighbor migration, the migration rate m is identical to the variance of distance moved per generation along the cline axis, σ ^{2} (distance being measured in deme spacings).
Migration is followed by reproduction. Recombination occurs at a rate r between adjacent loci, with chiasma positions determined randomly and independently (i.e., no interference); unlinked loci are represented by setting r = 0.5. Within each deme, individuals then mate randomly, and each pair produces two offspring. The new adult population of N individuals is chosen by sampling with replacement from the offspring population; the probability of an individual being sampled is determined by its fitness as defined under the particular selection regime (see below).
End demes remain fixed in gene frequency, to simulate an infinite pure population on either side of the cline (after Feldman and Christiansen 1975), with sufficient numbers of demes being used to ensure that end effects do not disrupt the dynamics. Statistics are calculated after migration and reproduction at 200 generation intervals between generations 600 and 3000 of each run. The results were checked to ensure that (i) the system had reached equilibrium; (ii) these intervals were long enough to avoid autocorrelation, so values could be treated as independent data points. The 95% confidence intervals presented are twice the standard errors. The model was tested to check that it was performing as expected under the random processes, for example, with respect to the decay of linkage disequilibrium in the absence of selection or migration but incorporating the effects of drift (Hill and Robertson 1966), or with respect to the decay with time of a step cline in the frequency of a neutral allele.
Selection regimes: Let L be the number of loci, s the selection coefficient, and x the distance across the cline, with x = 0 defining the center. For each individual, the number of heterozygous loci it carries is given by H, and the number of alleles originating from the population to the right of the environmental step (those of type 1) by a “hybrid index” Z. The individual fitnesses for the two possible selection regimes considered are then defined as
Environmental adaptation:
Heterozygote disadvantage:
Statistical methods: Maximum gradient and cline width: We estimate the gradient of a cline by first transforming the gene frequency p to a logit scale, y = ln(p/(1 − p)). On this scale, a singlelocus cline generated by such viability selection should appear approximately linear with distance (Barton and Hewitt 1989). The gradient in allele frequency at the center (the maximum gradient) is then estimated by regression between the points at which y = −2 and y = +2. The gradient on the logit scale (dy/dx) is exactly four times the gradient of allele frequency (dp/dx) at the center (where p = 0.5), and the width of the cline is defined as the inverse of its maximum gradient (Slatkin 1973). In the simulation runs, the width of the cline at each locus is calculated individually, and all widths are then averaged to give a mean value.
Linkage disequilibrium: The average pairwise linkage disequilibrium across all loci D is estimated using the variance in a hybrid index (see Barton and Gale 1993). Define X_{i} as the number of type 1 alleles at the ith locus (X_{i} = 0, 1, or 2) and the hybrid index Z, as above, as the number of type 1 alleles an individual possesses out of the total possible 2L, so that
The maximum value of linkage disequilibrium should occur at the center of the cline, where p = 0.5. However, if the cline is steep, there may be no single deme at p = 0.5. Rather than using the maximum D or the value of D nearest the center observed in the simulations, we use a linear regression of D on (pq)^{2} to give the value at p = 0.5. This approach relies on the fact that singlelocus clines follow a tanh curve, the gradient of which will be proportional to pq, so D at any point should be proportional to the square of the gradient at that point (see below). However, with this method it is possible for impossibly high values to be obtained (D > 0.25), as cline shape departs from the simple tanh curve when selection is strong.
SINGLELOCUS CLINES
Consider the simple case of a cline at a single locus that is segregating for two alleles. Two populations fixed for alternate alleles are in contact and interbreed at their interface. If selection is weak, the change in gene frequency in one generation due to selection will be small, allowing a continuous time approximation for the effect of selection. Neglecting the secondorder interaction between dispersal and selection, the rate of change in gene frequency p with respect to time t will be determined by the sum of the effects of dispersal (from Fisher 1937) and selection (from Wright 1931),
For a given selection function, Equation 3 can be solved at equilibrium (∂p/∂t = 0) to give an explicit description of gene frequency as a function of distance across the cline (Haldane 1948; Bazykin 1969). The typical spatial scale of the resulting cline depends on the ratio of the dispersal distance and the strength of selection, the “characteristic length” of the cline (Slatkin 1973).
MULTILOCUS CLINES
Selection acting on multiple loci: Extending a singlelocus system to multiple loci can have several effects. Epistasis will affect the marginal selection on each locus in a manner that changes across the cline. Second, statistical associations between loci (i.e., linkage disequilibrium) will increase the effective selection on each locus. We concentrate here on this second component [see Gavrilets (1997a) and Barton and Shpak (1999) for treatment of epistasis in multilocus clines]. If linkage disequilibria are weak, the presence of other loci under selection will not affect the fate of a particular allele. At the other extreme, with very strong linkage disequilibria, a given allele will behave as if subject to the total selection pressure acting across the genome. To describe this variation, a measure of how “congealed” (Turner 1967) the genome is under the given selection pressures is required. Barton's (1983) analytical results show that, for heterozygote disadvantage, the behavior of such a system will be defined by the relative strengths of the selection acting on each locus (s), the rate of recombination (r), and the total number of loci under selection (L). For the sake of tractability, the analysis that follows makes the assumption that this selection is weak, and in particular that the ratio s/r, the “coupling coefficient,” is small (Barton 1983). We show that the dynamics of the system can still be profoundly different from those of singlelocus clines and furthermore that the assumption of a low coupling coefficient need not be unduly restrictive.
Because of the statistical associations between loci, a list of the gene frequencies at each locus [derived from a series of equations like (3)] will not fully describe the system. Gametic frequencies would give the complete picture but soon become unwieldy if more than a few loci are considered as a system with L loci has 2^{L} gametic frequencies to be tracked. An alternative method is to track individual gene frequencies and the statistical associations between alleles at different loci; the two approaches are interchangeable. Here, we make the assumption (which can be checked by comparison with the simulations) that under weak selection pressures the associations higher than pairwise become negligible. The latter method is then analytically considerably more tractable (Barton 1983).
In general, the frequency p of an allele at any point in the cline is determined by four factors: (i) migration; (ii) selection acting directly on that locus; (iii) selection acting on gene combinations containing the locus in question (epistasis); and (iv) selection acting directly on other loci (or other combinations of loci) with which the locus is associated. The first two effects are incorporated in the singlelocus model (Equation 3), and, as mentioned above, the effects of epistasis are not considered here. The fourth effect, that of the selection acting on associated genes, is determined by the degree of association between loci. For two loci, with alleles A/a at one locus and B/b at the other, the pairwise linkage disequilibrium D is defined as the excess of coupling gametes over that expected under random association, for example,
The description can be extended to a system involving L > 2 selected genes, where the allele frequency p_{i} at any one locus is also affected by selection on the other L − 1 loci (each with frequencies p_{j}). For the multiplicative model of selection used here, it can be assumed that higherorder disequilibria make a negligible contribution (Barton 1983). The equilibrium solution to first order in s is therefore given by
The magnitude of linkage disequilibria: The dynamics of the pairwise linkage disequilibria can be described in the same way as those of gene frequencies, as the cumulative effects of diffusion, the mixing of populations at different gene frequencies (Li and Nei 1974), recombination, selection on the disequilibria themselves, and selection on other loci that are in higherorder associations with the pair in question. Under the assumption of weak selection and no epistasis, only the first three factors are relevant, giving
The most questionable assumption in the derivation of (9) is that linkage disequilibrium changes continuously in time. If (as will often be the case) linkage is loose or absent (r = 0.5), then linkage disequilibrium will change substantially between generations. To take account of this case, and to facilitate comparison with the simulation results, we describe in the appendix the discrete analogue of the diffusion model, in which genes are exchanged among three demes. This shows that Equation 9 is exactly equivalent to the threedeme discretetime analogue at a quasiequilibrium (when differences between demes in D are negligible relative to differences in p; Nagylaki 1976b), measured after migration and reproduction. The magnitude of D in a threedeme system is also greater than the value predicted for exchange between two demes (Li and Nei 974; Barton and Gale 1993).
The simulation model can now be used to test the analytical predictions. Figure 1 compares the maximum values of linkage disequilibria observed in simulations with the prediction in Equation 9. The results are presented for environmental selection (Figure 1, a–c) and heterozygote disadvantage (Figure 1, d–f), and for varying degrees of linkage. For unlinked and loosely linked loci, the simulation results fit very well with the predicted values for both fitness regimes. This is so even with strong selection, where fitness at the center is reduced by as much as 80%. However, the predictions break down for tightly linked loci, presumably because the assumption that s/r is small is violated. Note also that the lower migration rates in the simulations presented in Figure 1, a and d, or the number of loci used, do not affect the correspondence. Thus, except where adjacent loci are tightly linked, linkage disequilibrium is determined primarily by a balance between dispersal and recombination. Barton and Gale (1993, Figures 2 and 3) found a similarly close agreement for models of endogenous selection that incorporated epistasis.
The consequences of associations between genes: With the description of linkage disequilibrium (Equation 9) confirmed by the simulation results, (6) can be simplified to give, at equilibrium,
The mean fitness in a population under environmental selection is approximated as
The shapes of clines generated by these alternative selection regimes can be found by numerical integration of the formulas for ∂p/∂x (Mathematica 3.0; Wolfram 1996). Ignoring a common scaling factor of
On the logit scale (Figure 2b), heterozygote disadvantage in a singlelocus model (ϕ = 0) yields a linear cline, while environmental selection gives a slight step. This is because the selection gradient tends to zero at the center under heterozygote disadvantage but remains large under Haldane's (1948) model of a sharp ecotone. By contrast, the multilocus clines for both regimes show a stepped pattern. The step is due to the linkage disequilibria in the center increasing the effective selection pressures experienced by each locus, with positive feedback occurring as steeper gradients in turn generate more linkage disequilibria. The feedback is represented in the exponential terms in Equations 13 and 14. The clines under heterozygote disadvantage fall away more steeply in the tails than do those for environmental selection. The difference is predictable for the fitness functions used: the selection strength s was scaled up by a factor of two so as to give equivalent fitness in the center of the cline, but in the tails, where introgressing alleles are rare and therefore likely to be found only in heterozygous loci, the average selection per locus will inevitably be twice as great under heterozygote disadvantage. However, there is no qualitative difference in the shape of the clines under the alternative selection regimes.
Once again, stochastic simulations give an indication of the robustness of Equations 13 and 14. Here, we compare the average cline width observed in simulations with that predicted from (13) and (14), defined as the inverse of the gradient at p = 0.5. Simulations fit the predictions for heterozygote disadvantage for unlinked loci (Figure 3c) and for weak to moderate selection when loci are linked (Figure 3d). The correspondence suggests that higherorder disequilibria, ignored in the analytical derivations but inevitably present in the simulations, are not generating further positive feedback, steepening the cline beyond what is expected from the pairwise disequilibria. However, with environmental selection the width is systematically greater than predicted (Figure 3, a and b), even for unlinked loci. Wider clines suggest that the maximum reduction in fitness is less than that expected, and we consider possible reasons for the discrepancy in the discussion.
Cohesion of the genome: With weak selection and loose linkage, the genome behaves as a set of effectively statistically independent loci. Then, the width of each cline is
The concept of the cohesion of the genome can be used to compare the analytical descriptions of environmental selection with heterozygote disadvantage. Define s* as the effective selection pressure that would have to act on a single allele to give a cline of the observed width (s ≤ s* ≤ Ls). The ratio s*/Ls gives a measure of the proportion of the selection on the whole genome acting on a single locus and increases from 1/L to 1 as ϕ increases. Figure 5 depicts the ratio for the two selection regimes, with ϕ set so that the expected minimum fitness is equivalent. As before, the relationship becomes meaningless for high values of ϕ, where the weak selection assumptions are breaking down. More interestingly, the behavior of the system is remarkably similar for the two different regimes: the degree of interaction within the genome is analytically equivalent. However, the lack of correspondence between the stochastic simulations and the analytical predictions for environmental selection implies that the realized values of s* will be less and that heterozygote disadvantage may actually keep the genome together more effectively than the environmental selection described here. This is confirmed in Figure 4, where greater selection strengths are required for environmental selection than for heterozygote disadvantage before the width of the observed cline corresponds to that for an entirely congealed genome.
DISCUSSION
Using the machinery set up in previous studies of clines maintained by hybrid dysfunction (Barton 1983, 1986; Barton and Bengtsson 1986), we have developed a multilocus model in which different alleles are favored on either side of a sharp environmental transition. An analytical description confirms the conclusion of singlelocus models, that, at least with the simplest viability selection, the clines produced by either type of regime are indistinguishable in shape. The requirement that selection be relatively weak may seem unduly restrictive, particularly in the application of the theory to data from natural populations, but note that if, for example, a single allele crossing into the other environment type reduces an individual's fitness by only 5%, the width of the resulting gene frequency cline will be equal to less than eight times the average dispersal distance. Furthermore, because linkage disequilibrium increases the effective selection on each locus, the selection pressure acting on each locus increases disproportionately as more loci are considered. Comparisons of the analytical predictions with results from simulations also imply that the weak selection approximation holds for an apparently quite wide range of fitness values (e.g., Figure 1, a and b)—including, for example, the estimated fitness reduction of 0.42 (support limits 0.32–0.46) in hybrid populations in the center of the hybrid zone between the firebellied toads Bombina bombina and B. variegata (p. 255, Szymura and Barton 1991).
The comparison with stochastic simulations reveals an interesting distinction between the two systems: for environmental selection, the observed clines are wider than expected, i.e., the gradient at the center is shallower, but with heterozygote disadvantage the observed cline width fits the prediction of Equation 14 well. The discrepancy is clearly not due to measurement error in the simulations as the observed width corresponds to that generating the linkage disequilibrium for both selection regimes (Figure 1). Furthermore, exact simulations (as used by Barton and Shpak 1999) show the same discrepancy from the analytical predictions (data not shown): drift is not the culprit. We suggest two explanations for the discrepancy.
First, the expected gradient at the center is defined at the point where p = 0.5, which for environmental selection should fall at the point of the step transition in fitness (x = 0). In a chain of discrete demes with a step selection function, there will be one deme immediately to the left of the transition and one to the right, but effectively no deme actually covering the center. If no population is at p = 0.5, then the minimum fitness will never be realized and the clines will be shallower.
The second reason involves the variance in z. Let the frequency of type 1 alleles carried by an individual be z (z = Z/2L; 0 ≤ z ≤ 1), so that the mean value of z in a population is the gene frequency p: z = p. Under environmental selection, fitness is defined by z (Equation 1a). For Equation 11, it was assumed that the mean fitness in a population at mean gene frequency p was equal to that of an individual with frequency p, i.e., that
A brief point should be made with respect to the assumption that the system reaches equilibrium. In an alternative approach to analyzing the consequences of interbreeding, Baird (1995) used Fisher's (1953) junction theory to track the fate of introgressing blocks of genetic material entering a population. Baird's simulation models show that the approach to an analytic equilibrium is slow, of the order of thousands of generations. The results from the simulations presented here fit more closely with equilibrium predictions within a shorter time span and were also checked to ensure that statistics were measured once clines had stabilized. The difference between these conclusions is probably due to the difference in the rates of recombination modeled. Baird considers the fate of a single chromosome of infinitely many loci, so recombination will be much lower than in our model, where recombination rates were chosen to represent either unlinked or loosely linked genes. Infinite deme sizes may also have slowed the approach to equilibrium relative to that in the finite demes considered here. Further comparison of the two systems would be interesting, particularly in the light of evidence that genes for reproductive isolation may be linked (Wu and Palapoli 1994).
We considered here the limit of weak selection. An alternative approach would have been to increase the number of loci L, ultimately reaching a system with infinitely many loci (e.g., Baird 1995). As L increases, so do the number of higherorder disequilibria maintained by migration (Kirkpatrick and Servedio 1999). Their magnitude may be less, but whether this balances the concurrent increase in abundance is not clear— analysis in the limit of L rather than s would generate interesting comparisons. For the moment, our simulations afford a check on the magnitude of higherorder disequilibria, at least in the weak selection limit, as it is explicitly assumed that they are negligible in the analytical derivations (see appendix a, Barton 1983), but they are not excluded from the simulation models. If higherorder disequilibria are affecting the simulation dynamics, they will increase the effective selection pressure experienced at each locus in the same way that the pairwise disequilibria do. This would generate narrower clines than predicted by the analytical description, whatever the selection regime. We do not observe narrower clines either under environmental adaptation or heterozygote disadvantage, thus confirming the assumption that higherorder disequilibria are having a negligible effect.
The method presented here can theoretically be used to represent any form of selection. Our simulations and analysis show that pairwise linkage disequilibria are closely approximated by a balance between dispersal and recombination, and Barton and Gale (1993) and Barton and Shpak (1999) confirm this for epistatic models. The epistatic selection coefficient ∂ Log(W)/∂D_{ij} does not contribute significant linkage disequilibrium, while its effect on allele frequencies can be absorbed in a marginal selection coefficient. Genotype frequencies across a set of clines will be influenced primarily by the marginal selection gradient on each gene, ∂ Log(W)/∂p_{i}, regardless of whether this is due to endogenous or exogenous selection. It would therefore be possible to construct an “exogenous” fitness function that would mimic the equilibrium maintained by any kind of endogenous selection: the effect of any change in genetic background can be simulated by a change in the environment. Note that even the distinction between viability and fertility selection is ultimately arbitrary. For example, the fertility coefficients defined in Gavrilets' (1997b) singlelocus model are effectively equivalent to a twolocus “viability selection” model with epistatic interactions generating nonadditive selection pressures on the offspring produced by different pairings—again, it is the marginal selection gradient on each locus that determines the cline shape. The analysis here is restricted to a set of selected loci assumed to be each experiencing the same, weak selection pressures; but the extension to a system in which selection varies among loci, or some loci are even neutral, is straightforward (see Barton 1986; Barton and Bengtsson 1986; Piálek and Barton 1997; and Gavrilets and Cruzan 1998).
The exclusive action of either endogenous or exogenous selection is undoubtedly an unlikely simplification. Even if divergence were initially driven by differential adaptation to different environments, for a cline to be maintained entirely by exogenous selection requires that no epistatic interactions are ever involved and that selection acts independently on each locus (Barton and Hewitt 1985). Equally unlikely is a scenario in which selection is purely endogenous and fitness entirely independent of the environment. However, there is no reason for either type of selection to be exclusive, either theoretically [for example, Slatkin (1973) describes a singlelocus model incorporating both types of selection] or in nature [for example, in the hybrid zone between two species of firebellied toads (Bombina), there is evidence of both adaptation to different environments (Nürnbergeret al. 1995; Kruuk and Gilchrist 1997) and hybrid dysfunction (Szymura and Barton 1986; Kruuket al. 1999)]. For evolutionary biology the distinction between the two selection regimes is important for our understanding of factors responsible for divergence: Is species diversity determined by rates of accumulation of genetic incompatibilities or by the diversity of alternative ecological niches? However, the above results confirm that with the most straightforward of multilocus systems there is nothing intrinsically different in the population genetics of the two scenarios, and gene flow or selection pressures can be estimated in the same way for each.
Acknowledgments
We are grateful to J. Piálek, C. MacCallum, B. Nürnberger, S. Gavrilets, and M. Kirkpatrick for comments and discussion. This work was supported by Natural Environment Research Council (NERC) grants NERC GR3/11635 and GR3/9353, and the Darwin Trust of Edinburgh.
APPENDIX: EXCHANGE AMONG A CHAIN OF DEMES
We describe here the generation of pairwise linkage disequilibrium by migration between populations at different gene frequency. In a chain of demes, individuals in a population migrate to and from the two demes on either side of it; we therefore consider exchange between three demes. The model is an extension of the treatment of two demes considered by Li and Nei (1974) and Barton and Gale (1993). We show here that the threedeme scenario results in higher linkage disequilibria than migration between two demes.
Suppose that gene frequency is lowest in deme 0, intermediate in deme 1, and highest in deme 2. Each generation, a proportion m of the diploid individuals in the central deme, are exchanged with individuals from deme 0 and deme 2, half in either direction. Let p_{1,}_{x}, p_{2,}_{x}, PP_{x}, and D_{x} be, respectively, the allele frequencies at the two loci, the frequency of the coupling gamete, and the linkage disequilibrium in deme x. After migration, the gene and gamete frequencies in deme 1 are given by
Equation A3 can be solved for a “quasiequilibrium” (Nagylaki 1976b), in which a balance has been reached between the loss due to recombination and the generation by migration. This assumes that differences between the linkage disequilibria in each deme change less than differences in the gametic frequencies, so that the diffusion of linkage disequilibrium can be neglected [second term in (A3)]. The system is taken to be symmetric, so that Δp_{1} = p_{12} − p_{11} = p_{11} − p_{10} etc. Equation A3 then leads to
Note that Equation A4 is without the factor of (1 − m) in the equivalent formula for exchange between two demes (Li and Nei 1974; Barton and Gale 1993). This is because migrants enter a deme from two sides, so that the genetic makeup of the demes on either side of a central deme is important. If the neighboring demes are identical, the system collapses to the twodeme case, but in a chain of demes they will differ in opposite directions, and so the linkage disequilibria generated by exchange between three demes will be greater. Similarly, recursion equations developed by Asmussen and Arnold (1991) to describe the associations between cytoplasmic and nuclear loci generated by admixture imply that, when immigrants into a hybrid population come from two different source populations, the resulting associations will be inflated by differences between the latter; see also Asmussen et al. (1989). In the case considered here, the relationship applies to values measured after mating and recombination. If measurements are made after migration, mixing will have increased D by the amount mΔp_{1}Δp_{2}, and so the QLE approximation will be greater by a factor of (1 + r).
In a system of L > 2 loci, migration will also generate disequilibria of higher order (Kirkpatrick and Servedio 1999). Although the magnitude of each decreases rapidly with order, so that threeway disequilibria are roughly three times smaller than pairwise and so forth, the number of associations increases with order, so their cumulative effect may be substantial. Here, we use the simulation model to test our explicit assumption (Equation 6) that selection coefficients involving higherorder disequilibria (∂ Log(W/∂D_{ijk} etc.) have a negligible effect on gene frequencies.
Footnotes

Communicating editor: M. Slatkin
 Received April 13, 1999.
 Accepted August 4, 1999.
 Copyright © 1999 by the Genetics Society of America