Abstract
Previous studies have shown that genetic exchange in bacteria is too rare to prevent neutral sequence divergence between ecological populations. That is, despite genetic exchange, each population should diverge into its own DNA sequence-similarity cluster. In those studies, each selective sweep was limited to acting within a single ecological population. Here we postulate the existence of globally adaptive mutations, which may confer a selective advantage to all ecological populations constituting a metapopulation. Such adaptations cause global selective sweeps, which purge the divergence both within and between populations. We found that the effect of recurrent global selective sweeps on neutral sequence divergence is highly dependent on the mechanism of genetic exchange. Global selective sweeps can prevent populations from reaching high levels of neutral sequence divergence, but they cannot cause two populations to become identical in neutral sequence characters. The model supports the earlier conclusion that each ecological population of bacteria should form its own distinct DNA sequence-similarity cluster.
It is becoming increasingly clear that a full accounting of ecological diversity in the bacterial world requires a molecular approach. Molecular techniques have demonstrated that only a small fraction of bacterial species are culturable (Amman et al. 1995; Huber et al. 1995; Ohkuma and Kudo 1996), so our best hope of identifying the full scope of bacterial biodiversity is to characterize the sequence diversity of genes that can be amplified directly from natural habitats (Knight et al. 1992; Pace 1997). Such surveys typically yield clusters of organisms with similar sequences, and each sequence-similarity cluster is typically interpreted as a distinct ecological population (Britschgi and Giovannoni 1991; Murray and Stackebrandt 1995; Boivin-Jahns et al. 1996).
This interpretation is justified because in studies of more familiar and culturable taxa, bacterial systematists have found an empirical correspondence between ecologically distinct populations and sequence-similarity clusters. That is, groups of bacteria known to be ecologically different generally fall into separate sequence-similarity clusters (Vandamme et al. 1996; Palys et al. 1997); conversely, ecologically uncharacterized strains that fall into separate sequence clusters have subsequently been found to have different ecological properties (Balmelli and Piffaretti 1996; Normand et al. 1996). Sequence surveys appear to be an efficient method for discovering the ecological diversity of culturable as well as nonculturable bacteria (Vandamme et al. 1996; Palys et al. 1997).
While ecologically distinct groups of bacteria are frequently distinguishable as separate sequence-similarity clusters, it is important to find a strong theoretical basis for this observation. If there are times when multiple ecological populations of bacteria fall together into the same sequence cluster, molecular approaches may severely underestimate bacterial biodiversity (Cohan 1994a,b, 1995, 1996, 1999).
Recent theory has shown why ecological populations should correspond to sequence clusters (Cohan 1994a,b; Palys et al. 1997). In this theory, ecological populations are defined so that (1) each adaptive mutation confers a benefit only in the genetic background of its original population, and (2) mutant cells bearing an adaptive mutation can outcompete only members of their own population. Natural selection favoring adaptive mutants within a particular population purges that population of genetic diversity at all loci, owing to the low rate of recombination in bacteria. [Each such purging event is called a “selective sweep” (Guttman and Dykhuizen 1994b); we refer to recurrent selective sweeps as “periodic selection” (Atwood et al. 1951; Koch 1974; Levin 1981).] Because an adaptive mutant does not outcompete cells from other populations, periodic selection purges only the diversity within populations and not the divergence between populations. Each round of periodic selection thereby enhances the distinctness of ecological populations at all loci and fosters the divergence of different ecological populations into separate sequence-similarity clusters.
The tendency for bacterial populations to form separate sequence clusters is opposed by recombination between populations (Cohan 1994a). Depending on the rates of interpopulation recombination and the intensity of periodic selection, the model has shown three possible classes of outcomes of neutral sequence divergence between populations: (1) under extremely low rates of recombination, populations will diverge without bound, so that every nucleotide site that can be substituted harmlessly will eventually become substituted; (2) under higher rates of recombination, populations will reach an equilibrium level of divergence, so that populations fall into distinct sequence clusters, but divergence between them never becomes saturated; and (3) under yet higher rates of recombination, ecologically distinct populations will not be distinguishable by neutral sequence data, as the levels of divergence within and between populations will be nearly equal. Given the low rates of recombination estimated thus far for bacteria (Selander and Musser 1990; Maynard Smith et al. 1993; Whittam and Ake 1993; Guttman and Dykhuizen 1994a; Roberts and Cohan 1995), the model predicts that each population should be distinct as a separate sequence-similarity cluster, as described in cases 1 and 2 above (Cohan 1994a, 1995; Palys et al. 1997).
Nevertheless, it is not clear that the existing model adequately predicts the degree of sequence divergence between ecological populations. Here we present an alternative and more general model for periodic selection, in which some mutations may be adaptive outside of the context of their original populations. In this model, the domain of competitive superiority of an adaptive mutant (i.e., the cell) is still its own ecological population, but the adaptive mutation (i.e., the allele) can be recombined into other populations, where it can confer higher fitness and cause a local selective sweep within each recipient population (Figure 1). This process may homogenize the populations for any segment that is cotransferred between populations along with the adaptive mutation. We have hypothesized that globally adaptive mutations could homogenize populations for neutral sequence diversity at all gene loci, provided that the size of fragments recombined is large enough and that universally adaptive mutations recur throughout the genome.
In this article, we present a coalescence model to explore the conditions under which universally adaptive mutations can homogenize neutral sequence diversity across ecological populations. We tested whether different ecological populations might fail to diverge into separate sequence-similarity clusters under the rates of recombination observed in bacteria. We also tested whether universally adaptive mutations may prevent populations with low recombination rates from diverging without bound.
THE MODEL
Ecological populations and adaptive mutations: A metapopulation consists of n closely related ecological populations, each containing N cells (Table 1). Each population is adapted to a different ecological niche. Recombination occurs rarely within and between these populations, and the metapopulation is closed to recombination with other such metapopulations.
Following Cohan (1994a), we define an ecological population as the domain of competitive superiority of an adaptive mutant. Thus, an adaptive mutant would outcompete to extinction all other strains from the same population (because they are adapted to the same niche) but would not drive to extinction strains from other populations. While an adaptive mutant (i.e., the cell) has a competitive advantage only within its own ecological population, an adaptive mutation (i.e., the mutant allele) may be either locally or globally adaptive. A locally adaptive mutation confers a benefit only in the genetic background of its original population, whereas a globally adaptive mutation can confer a benefit in any genetic background within the metapopulation. A global selective sweep occurs when a globally adaptive mutation recombines from its original population into other populations: any cell receiving the adaptive mutation from the original population is then able to outcompete other members of its own population (Figure 1). Whereas a globally adaptive mutation confers fitness globally to all cells in the metapopulation, natural selection acts only locally to favor the adaptive genotype within each population.
We assume that selective sweeps are rare events and that the duration of the sweep is short relative to the time between sweeps.
Rate of fixation of adaptive mutations: Following Cohan (1994a), adaptive mutation is modeled as a one-step process that occurs randomly over time at a rate μ_{g} (for global adaptations) or μ_{l} (for local adaptations) per capita per generation. Each adaptive mutation confers a selective advantage z. Taking into account that only the fraction 2z of adaptive mutations is expected to become fixed (assuming that the population size N ⪢ 1/z; Wright 1931), locally adaptive mutations are fixed within a population by directional selection at a rate σ_{l} = 2zμ_{l}N. It is assumed that once a globally adaptive mutation becomes fixed in its original population (with probability 2z), recurrent recombination and subsequent selection will cause the mutation to eventually become fixed in all populations. Therefore, globally adaptive mutations are fixed at a rate σ_{g} = 2zμ_{g}nN.
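The two fixation rates defined above can be written out as a short Python sketch. The function name and the mutation rates passed to it are hypothetical and chosen only for illustration; only the formulas σ_{l} = 2zμ_{l}N and σ_{g} = 2zμ_{g}nN come from the text.

```python
def fixation_rates(z, mu_local, mu_global, N, n):
    """Return (sigma_l, sigma_g) per generation.

    sigma_l: rate at which locally adaptive mutations fix within one population.
    sigma_g: rate at which globally adaptive mutations fix across the metapopulation.
    """
    p_fix = 2.0 * z                        # fraction of adaptive mutations expected to fix
    sigma_l = p_fix * mu_local * N         # mutations arise in the N cells of one population
    sigma_g = p_fix * mu_global * n * N    # mutations arise in any of the n * N cells
    return sigma_l, sigma_g


# Hypothetical adaptive mutation rates; z, N and n are illustrative inputs.
sigma_l, sigma_g = fixation_rates(z=1e-2, mu_local=1e-20, mu_global=1e-20, N=5e14, n=2)
```

With equal per capita rates for the two kinds of mutation, σ_{g} exceeds σ_{l} by the factor n, because a globally adaptive mutation can originate in any population of the metapopulation.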
Recombination within and between populations: Recombination in bacteria is unidirectional and the segment recombined is usually a small fraction of the genome (Smith 1988). We therefore model recombination as a gene conversion process in which a segment of the recipient DNA is replaced with the homolog of the donor. The model is concerned with recombination at two loci: a “gene of interest,” whose sequence divergence we wish to predict, and a selected gene, whose adaptive mutation is favored by selection. Each gene is assumed to be short enough so that it is not split by recombination. A single recombination event may involve one or both of the genes, depending on the size of the recombining fragment (h) and the distance between the loci (y).
Recombination follows a modified island model, where c_{s} is the rate (per gene segment per genome per generation) at which individuals integrate (as recipients) DNA at a gene segment of interest from other individuals of the same ecological population; c_{d} is the rate at which individuals integrate DNA from any other ecological population; c is the total rate of recombination at which an individual integrates DNA from any other individual in the metapopulation; thus c = c_{s} + c_{d}. The value c_{δ} is the rate at which individuals integrate DNA from a particular ecological population (other than their own). In a metapopulation consisting of n ecological populations, c_{δ} = c_{d}/(n − 1).
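As a small illustration of how these rates relate, the following Python sketch derives c and c_{δ} from c_{s}, c_{d}, and n. The function name and the numerical inputs are hypothetical.

```python
def recombination_rates(c_s, c_d, n):
    """Return (c, c_delta) for a metapopulation of n ecological populations.

    c:       total rate at which a cell integrates DNA from any other cell.
    c_delta: rate at which a cell integrates DNA from one particular other population.
    """
    if n < 2:
        raise ValueError("c_delta is defined only when there are at least two populations")
    c = c_s + c_d
    c_delta = c_d / (n - 1)
    return c, c_delta


# Hypothetical rates; with n = 2 there is a single donor population, so c_delta equals c_d.
c, c_delta = recombination_rates(c_s=1e-8, c_d=1e-8, n=2)
```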
Probability that a selective sweep leads to coalescence: Our model determines the expected time (going backward from the present) to coalescence into a common ancestor for two homologous gene segments occurring today in two different individuals. These individuals may be cells from the same or different ecological populations of the metapopulation.
We define p as the probability that a selective sweep leads to coalescence at a gene segment of interest. This is the probability that two cells chosen from the metapopulation immediately following a selective sweep are identical by descent for the gene segment of interest. Whether a selective sweep results in coalescence at a gene of interest depends on the relative magnitudes of the selective advantage of the adaptive mutation, the rate at which recombination separates the gene of interest from the selected gene, and the population size. If the rate of recombination is high and the selective advantage low, the event is unlikely to lead to coalescence.
We consider several instances of the variable p, corresponding to the probabilities of coalescence within and between populations, for globally and locally adaptive mutations: p_{l} is defined as the probability that a local selective sweep within a population leads to coalescence of segments from that population; p_{gs} is the probability that a global selective sweep leads to coalescence of segments from the same population; and p_{gd} is the probability that a global selective sweep leads to coalescence of segments from different populations of the metapopulation.
In appendix a, we derive a method (adapted from Kaplan et al. 1989) for calculating p_{l}, p_{gs}, and p_{gd}, for the special case of two ecological populations, i.e., n = 2. The variables p_{l}, p_{gs}, and p_{gd} are functions of c_{s}, c_{d}, N, and q, where q is the probability that a recombination event results in corecombination of the adaptive allele with the segment of interest. This probability is a function of the length h of the DNA taken up by a recipient cell during a recombination event and of the distance y between the adaptive mutation and the segment of interest (both h and y are measured as fractions of the genome):
Because the coalescence of homologous segments from different populations requires that the transfer of the adaptive mutation from population 1 to population 2 includes the segment of interest (Figure 1), p_{gd} will be highly dependent on the probability of cotransfer.
We assume that the size (h) of the recombining DNA fragment is constant, while adaptive mutations occur randomly throughout the genome. Because we are interested in modeling the consequences of many selective sweeps, we need to calculate the mean probability (P) that a selective sweep leads to coalescence, averaged over all possible distances (y) between the neutral marker and the adaptive mutations (i.e., between 0 and 1/2 because the bacterial chromosome is circular):
Both the probabilities p and the above integral were evaluated numerically.
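The averaging over distances can be sketched numerically in Python. The form of q used here, q = 1 − y/h for y < h and 0 otherwise, is an assumption: it corresponds to a fragment of fixed length h that contains the adaptive mutation and is positioned uniformly around it. It is consistent with the later statement that the averaged between-population probability approaches h, but it is not taken from the text. The function names are hypothetical, and the sketch averages q itself rather than the full probabilities p of appendix a.

```python
def cotransfer_probability(h, y):
    """Assumed cotransfer probability for fragment size h and distance y (genome fractions)."""
    return max(0.0, 1.0 - y / h)


def mean_over_distance(p_of_y, steps=100_000):
    """Midpoint-rule average of p_of_y over distances y in [0, 1/2] (circular chromosome)."""
    width = 0.5 / steps
    total = sum(p_of_y((i + 0.5) * width) for i in range(steps))
    return total * width / 0.5


h = 0.05  # hypothetical fragment size: 5% of the genome
mean_q = mean_over_distance(lambda y: cotransfer_probability(h, y))
```

Under this assumed form the average works out analytically to h, so for h = 0.05 the numerical mean is very close to 0.05.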
The coalescence model: Our coalescence model calculates the expected time that two homologous gene segments (occurring in different organisms) have diverged since their last common ancestor. These gene segments are postulated to be short enough so that they are not split by recombination. The following are the expected times to coalescence for two strains from the same and different ecological populations, E(t_{s}) and E(t_{d}) (derived in appendix b):
Calculation of the expected nucleotide divergence: The expected nucleotide sequence divergence is predicted using the probability density functions for t_{s} and t_{d} following Cohan (1994a). Nucleotide substitutions are postulated to consist only of synonymous mutations, and every third base substitution is taken to be synonymous, with no synonymous substitutions allowed at the first or second bases of codons. The number of neutral substitutions per third base site (Δ) is then obtained by multiplying the time to coalescence by twice the per third base site rate of mutation (μ_{0}),
The nucleotide sequence divergence over all sites, π, may be calculated by correcting for multiple substitutions per site (Jukes and Cantor 1969) and correcting for substitutions occurring only at third base sites:
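The two corrections just described can be illustrated with a short Python sketch. It evaluates the divergence at a single coalescence time rather than integrating over the probability density of t as the model does, and the function names are hypothetical. The Jukes-Cantor form used is the standard one, p = (3/4)[1 − exp(−4Δ/3)], and the factor 1/3 reflects the assumption that only third base sites are substituted.

```python
import math


def substitutions_per_third_site(t, mu0):
    """Neutral substitutions per third base site separating two lineages after t generations."""
    return 2.0 * mu0 * t  # both lineages accumulate mutations, hence the factor 2


def divergence_all_sites(delta):
    """Observed divergence over all sites, given delta substitutions per third base site."""
    third_site = 0.75 * (1.0 - math.exp(-4.0 * delta / 3.0))  # Jukes-Cantor correction
    return third_site / 3.0                                    # only 1 site in 3 can differ


# Hypothetical coalescence time of 1e9 generations with mu0 = 3e-10 per site per generation.
delta = substitutions_per_third_site(t=1e9, mu0=3e-10)
pi = divergence_all_sites(delta)
```

Under these assumptions the divergence saturates at 1/4 as the coalescence time grows, which matches the value the Results associate with unbounded neutral divergence.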
RESULTS
The following parameter values were used in all numerical calculations: the neutral mutation rate per third base site, μ_{0} = 3 × 10^{−10}; the selective advantage, z = 10^{−2}; the population size, N = 5 × 10^{14}; and the number of populations in the metapopulation, n = 2. Recombination rates within and between populations were set as equal (c_{s} = c_{δ}; i.e., no sexual isolation between populations) to maximize the homogenizing effect of recombination.
The diversity-purging effect of an adaptive mutation: The probability that a particular global selective sweep causes coalescence, within or between populations, is shown in Figure 2. The probability of coalescence within a population, p_{gs}, is always near 1 because recombination is so rare in bacteria (see also Cohan 1994b). The probability of coalescence of segments from different populations, p_{gd}, is approximately equal to q (Figure 2). This is because a globally adaptive mutation causes coalescence between populations at a gene of interest only when it causes coalescence within each population (occurring with probability p_{gs}) and the gene of interest is cotransferred between populations along with the adaptive mutation (occurring with probability q). Thus, the probabilities of coalescence within and between populations are most similar for genes most closely linked to the adaptive mutation (i.e., q = 1).
The ratio of globally to locally adaptive mutation rates: We next explore the effect of recurrent adaptive mutations on population structure. We focus on the significance of the ratio of globally to locally adaptive mutations. We maintain the total frequency of adaptive mutations constant, while allowing the ratio of global:local adaptations to vary. We consider three relative frequencies of global:local events, 1:0, 1:1, and 0:1 (Figure 3).
Figure 3 shows that globally adaptive mutations reduce neutral sequence divergence between populations compared to the case with only local adaptations. This effect is most pronounced at low recombination rates. When only local selective sweeps are possible, the model shows that a recombination rate of 10^{−10} leads to unbounded neutral divergence between populations (i.e., π_{d} ≈ 1/4). Increasing the global:local ratio decreases the divergence between populations by up to 50-fold. The divergence within populations also decreases, but to a much lower extent. Thus, increasing the proportion of globally adaptive mutations makes the populations less distinct in neutral characters.
Consider next whether globally adaptive mutations can prevent different ecological populations from diverging into separate sequence clusters. We define populations as falling into separate sequence-similarity clusters when E(π_{d}) > 2E(π_{s}) (Palys et al. 1997). Using this criterion, Figure 4 shows that the critical recombination rates necessary for populations to diverge into separate clusters are nearly the same whether or not globally adaptive mutations occur.
Analysis of a simplified model with no locally adaptive mutations: We concentrated on the effect of globally adaptive mutations by considering the special case of a two-component metapopulation in which all the adaptive mutations are global (Figure 5). For this special case we treated the coalescence equations analytically to gain further insight into the behavior of the sequence divergence functions presented in Figure 3. Noting that bacterial populations are always large enough so that the probability of coalescence by drift is negligible relative to coalescence by periodic selection (i.e., 1/N ⪡ σP), Equations 3 and 4 reduce to
We may consider σ_{g}P_{gs} and σ_{g}P_{gd} as pseudoparameters, representing the diversity-purging effect of periodic selection (i.e., the rate of selective sweeps times the probability of coalescence within each sweep). The times to coalescence are then determined by only three factors: c_{δ}, the rate of recombination between populations; σ_{g}P_{gs}, the within-population diversity-purging effect of global periodic selection; and σ_{g}P_{gd}, the between-population diversity-purging effect of global periodic selection.
Consider the relative magnitudes of σ_{g}P_{gs} and σ_{g}P_{gd}. We used Equation A4 of appendix a to calculate the values of P_{gs} and P_{gd} across the range of frequency of recombination (c) and selective advantage (z) considered in this article, and we found that P_{gs} ≥ P_{gd}/h. We assume that the size of the recombination fragment h is usually <10% of the genome (see discussion). Therefore σ_{g}P_{gs} ⪢ σ_{g}P_{gd}. This leaves four regions of magnitude for c_{δ}: c_{δ} ⪢ σ_{g}P_{gs}, c_{δ} ∼ σ_{g}P_{gs}, c_{δ} ∼ σ_{g}P_{gd}, and c_{δ} ⪡ σ_{g}P_{gd}. These regions correspond to regions I through IV, respectively, of Figure 5.
Region I of Figure 5, c_{δ} ⪢ σ_{g}P_{gs} ⪢ σ_{g}P_{gd}, corresponds to very high recombination rates, yielding the following approximation of Equations 8 and 9:
The conditions in region II, c_{δ} ∼ σ_{g}P_{gs} ⪢ σ_{g}P_{gd}, yield
In region II of Figure 5, recombination is no longer sufficient to prevent populations from diverging. The divergence between populations is greater than that within populations and is determined by the equilibrium between recombination (which acts to homogenize the populations) and local diversitypurging events (which tend to keep the populations distinct).
The conditions of region III, σ_{g}P_{gs} ⪢ σ_{g}P_{gd} ∼ c_{δ}, yield
Region III reflects the increasing significance of global periodic selection. Divergence between populations is determined by the combined homogenizing effects of recombination (c_{δ}) and global periodic selection (σ_{g}P_{gd}).
Region IV corresponds to the case of extremely rare recombination, σ_{g}P_{gs} ⪢ σ_{g}P_{gd} ⪢ c_{δ}, yielding
This is the limiting case, where recombination between populations becomes so infrequent that its effects are entirely overwhelmed by periodic selection. In this limit, the divergence within populations is determined solely by the intensity of local purging of diversity, while the divergence between populations is only limited by the intensity of global purging of diversity.
Under the conditions of rare recombination (region IV), the ratio of the times to coalescence (i.e., E[t_{d}]:E[t_{s}]) approaches 1/h. This follows from two consequences of rare recombination. First, because the locus of interest and the adaptive mutation are rarely separated by recombination, a selective sweep almost certainly leads to coalescence of gene segments from the same population (i.e., P_{gs} ≈ 1). Second, the transmission of the adaptive mutation from population 1 to population 2 is likely to be the result of a single transfer event. Hence, the probability of coalescence of two gene segments from different populations (P_{gd}) approaches the probability that the transfer event was a cotransfer of the adaptive mutation and the segment of interest (averaged over all distances between the two loci). That is, P_{gd} ≈ h, and E[t_{d}]/E[t_{s}] ≈ 1/h.
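The limiting behavior described for the regions of high and rare recombination can be illustrated with a minimal two-state sketch in Python. This is not the authors' Equations 8 and 9. It assumes that a pair of gene segments is either in the same population or in different populations, that the pair switches between these states at rate 2c_{δ} (either lineage may have been imported), and that it coalesces at rate σ_{g}P_{gs} in the first state and σ_{g}P_{gd} in the second. The function name and all numerical values are hypothetical.

```python
def expected_coalescence_times(c_delta, purge_same, purge_diff):
    """Return (E[t_s], E[t_d]) for the assumed two-state model.

    purge_same: sigma_g * P_gs, coalescence rate for two segments in the same population.
    purge_diff: sigma_g * P_gd, coalescence rate for two segments in different populations.
    Solves E_s = (1 + m*E_d)/(purge_same + m) and E_d = (1 + m*E_s)/(purge_diff + m),
    where m = 2 * c_delta is the assumed rate of switching between the two states.
    """
    m = 2.0 * c_delta
    denom = purge_same * purge_diff + m * (purge_same + purge_diff)
    t_same = (purge_diff + 2.0 * m) / denom
    t_diff = (purge_same + 2.0 * m) / denom
    return t_same, t_diff


sigma_g = 1e-6   # hypothetical rate of global selective sweeps
h = 0.05         # hypothetical fragment size, so that P_gs is about 1 and P_gd is about h
for c_delta in (0.0, 1e-9, 1e-7, 1e-3):
    t_s, t_d = expected_coalescence_times(c_delta, sigma_g * 1.0, sigma_g * h)
    print(c_delta, t_d / t_s)  # ratio moves from 1/h toward 1 as recombination increases
```

In this sketch the ratio E[t_{d}]/E[t_{s}] equals P_{gs}/P_{gd} when c_{δ} is zero, which is about 1/h, and it tends to 1 when recombination dominates both purging terms.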
Effect of recombination fragment size on population divergence: We consider next the effect of the recombination fragment size (h) on the distinctness of ecological populations. In general, larger recombination fragments increase the probability (q) that a gene of interest will cotransfer across populations with an adaptive mutation (Equation 1), thus fostering coalescence of segments between populations (Figure 2). Hence, larger sizes of recombination fragments tend to make ecological populations appear less distinct in neutral characters (A, Figure 5). The effect of h on the distinctness of populations is most important at low between-population recombination rates (Figure 5).
The effect of h on population distinctness (quantified as E[π_{d}]/E[π_{s}]) is shown explicitly in Figure 6. Under very low rates of between-population recombination, the distinctness ratio approaches 1/h for large fragment sizes (i.e., h > 10%; Figure 6). Thus, global periodic selection alone (i.e., with little recombination between populations) cannot reduce the distinctness ratio of populations to 1 (so that E[π_{d}] ≈ E[π_{s}]) unless the recombination fragment size reaches 100% of the genome.
We explored in more detail the effect of h on population distinctness for the case when within-population divergence levels are 1% (i.e., E[π_{s}] = 0.01), because this is the divergence level frequently observed within bacterial sequence-similarity clusters (Palys et al. 1997). With this level of divergence, global periodic selection is quite ineffective in reducing divergence between populations when recombination fragments are small (Figure 7). For example, global periodic selection with a recombination fragment of 1% of the genome cannot reduce the between-population divergence by >4%; however, with larger recombination fragments (e.g., h = 5%), global periodic selection may significantly reduce the between-population divergence from unbounded neutral divergence (π_{d} = 0.25) to a much more limited level of divergence (π_{d} = 0.09; Figure 7).
DISCUSSION
This study presents a coalescence model for investigating the effect of globally adaptive mutations on neutral sequence divergence in bacteria. We used this model to test whether interpopulation transfer of globally adaptive mutations might prevent neutral sequence divergence between ecologically distinct populations of bacteria.
Assumptions of the model: If globally adaptive mutations are to reduce divergence between ecological populations at every locus in the genome, we must assume that every gene locus has the opportunity to hitchhike from population to population along with globally adaptive mutations (Figure 1). We therefore assume that globally adaptive mutations that confer benefits in more than one population exist, that they are numerous, and that they appear throughout the genome. The latter two assumptions are required because only a limited fraction of the genome can be cotransferred (and subsequently homogenized) across populations with any given adaptive mutation: the segments transferred in bacterial recombination are generally small (Smith 1988), and the transfer of large segments across populations is probably disfavored by natural selection (Cohan 1994b; Zawadzki and Cohan 1995).
Consider next the central premise of the model, that globally adaptive mutations exist and are numerous. The likelihood of globally adaptive mutations must depend on the degree of ecological divergence between populations. In the early stages of population divergence, a mutation that is adaptive in one population is likely to be adaptive in others. As the populations become progressively more finely tuned to their respective niches, accumulating many niche-specific adaptations, we should see fewer adaptive mutations that can benefit more than one population. We therefore expect globally adaptive mutations to prevent neutral sequence divergence genome-wide only between the most closely related populations.
Does a typical adaptive mutation confer a benefit in more than one population? Recently, Guttman and Dykhuizen (1994b) provided evidence that one adaptive mutation precipitated selective sweeps in all the ecological populations included within Escherichia coli. A selective sweep apparently purged sequence diversity within a small chromosomal region from all the various sequence clusters of E. coli, while these clusters retained their distinctness for all other chromosomal regions studied. This is exactly the pattern expected soon after a global selective sweep. As shown in Figure 2, for genes that are closely linked to the adaptive mutation (q ≈ 1), there is nearly total purging of diversity both within and between populations; for genes that are not linked to the adaptive mutation (q ≈ 0), there is purging of diversity within populations but none between populations. Provided that each of the E. coli sequence clusters is actually a separate ecological population (Cohan 1994a,b, 1999; Palys et al. 1997), the selective sweep demonstrated by Guttman and Dykhuizen (1994b) appears to have been driven by a globally adaptive mutation.
Globally adaptive mutations as a homogenizing force in neutral sequence evolution: Analysis of our model has shown that, in general, globally adaptive mutations tend to make populations less distinct. Especially under extremely low recombination rates, globally adaptive mutations severely depress neutral sequence divergence between populations while having only a minor effect on within-population diversity (Figure 3). Populations that would diverge without bound in the absence of global periodic selection may be prevented from diverging without bound in the presence of global periodic selection.
Nevertheless, global periodic selection does not homogenize neutral sequence divergence to the extent that populations become indistinguishable. Consider, for example, ecological populations whose average within-population sequence divergence is ∼1%, a value typical for sequence-similarity clusters in bacteria (Palys et al. 1997). In the absence of global periodic selection, such populations diverge into separate sequence-similarity clusters whenever the between-population recombination rate is <10^{−7.6}; in the presence of global periodic selection, even for a large recombination fragment (h = 10%), the critical recombination rate decreases only slightly to 10^{−7.8} (Figure 4). Recombination rates between most bacterial populations are unlikely to exceed either critical value (Whittam and Ake 1993; Roberts and Cohan 1995; Palys et al. 1997; Cohan 1999). We therefore conclude that in spite of the homogenizing effect of global periodic selection, ecological populations should diverge into separate sequence-similarity clusters.
Analysis of the model has shown that the effect of global adaptations on between-population divergence is highly dependent on the size of the fragment recombined (Figure 7). If the recombination fragment is small (<1% of the genome), global periodic selection is virtually ineffective in reducing between-population divergence; however, if the recombining fragment is large (e.g., 5% of the genome), global periodic selection may significantly reduce the between-population divergence (Figure 7).
The effect of global periodic selection on sequence divergence may therefore depend on the mode of genetic transfer between populations, because the various modes of transfer differ greatly in the length of DNA recombined. In naturally competent taxa, such as Streptococcus and Bacillus, transformation may be the predominant mode of DNA exchange. The average fragment of DNA incorporated in both Streptococcus and Bacillus transformation is <1% of the genome (Humbert et al. 1995; Zawadzki and Cohan 1995). To the extent that transformation is the primary mode of transferring adaptive mutations across populations in these taxa, global periodic selection should have virtually no effect on sequence divergence (Figure 7).
Other modes of recombination, such as transduction and conjugation, can transfer much larger segments of DNA. A generalized transducing phage can, in principle, transfer segments as large as the phage’s own genome, which could be ∼10% of the bacterium’s genome (Fraenkel-Conrat 1985; Arber 1994). Conjugating plasmids can transfer even larger segments: in the case of the Hfr plasmid of E. coli, most of the genome can be transferred (but there may be additional fitness constraints on the size of large transferred fragments; see above). Therefore, when transduction and conjugation are the principal means of transfer of adaptive mutations, global periodic selection can have an important role in reducing divergence between populations.
In summary, global periodic selection can limit the sequence divergence between ecological populations. The effect of global periodic selection is most pronounced for groups of populations with low between-population recombination, such that global periodic selection is the only constraint on divergence between populations. Global periodic selection is unlikely to prevent the divergence of ecological populations into separate sequence clusters. A quantitative prediction of the homogenizing effect of global periodic selection would require more information about the rate of mutations that confer adaptations in multiple populations, information about how evenly globally adaptive mutations are distributed throughout the genome, and information about the size of fragments that can be transferred between populations and then successfully accommodated by the receiving population.
APPENDIX A: Probability That a Periodic Selection Event Leads to Coalescence
We consider the special case of a metapopulation consisting of two ecological populations. The adaptive mutation driving the periodic selection event begins in population 1 and is subsequently passed into population 2 by recombination. We use a two-locus, four-allele model. A is the locus under selection, while B is the segment of interest whose neutral sequence divergence we are investigating. Alleles in population 1 are designated by subscript 1; those in population 2 are designated by subscript 2. Within population 1, the advantageous allele is designated as A_{1}, and all other alleles at the selected locus are designated a_{1}. A_{2} is the advantageous allele in population 2; a_{2} designates all the other alleles at the selected locus in this population. An allele at locus B can be attached to any of the four A alleles, i.e., A_{1}, a_{1}, A_{2}, a_{2}. The frequencies of the alleles A_{1} and A_{2} in their respective populations are x_{1} and x_{2}.
Let g_{X}(Y,t) be the conditional probability that if a randomly selected gene B from generation t of the metapopulation is attached to the allelic type Y (at locus A), its ancestor in generation (t − 1) was attached to allelic type X. [g_{X}(Y,t) is equivalent to the quantity f_{X}(Y,t)/f(Y,t) of Hudson and Kaplan (1988).] Following Hudson and Kaplan (1988), and keeping only the highest-order terms in N (where N is the population size), 8 of the 16 g probabilities are
We now define the Q process. Suppose that m B genes are selected at random at the end of the selective sweep (time t = 0). Let Q(0) = (i, j, k, l), where i, j, k, l represent the number of B genes attached to A_{1}, a_{1}, A_{2}, a_{2}, respectively. Going back in time, Q(t) describes the number of ancestral B genes attached to each A allele at time t (i.e., t generations before time 0). The total number of ancestral B genes in generation t is denoted by |Q(t)|. Note that |Q(t)| never increases, because the number of ancestral alleles can only stay constant or decrease (if two or more of the sampled alleles had a common ancestor in the previous generation). We are interested in the cases where Q(t) changes states, i.e., Q(t − 1) ≠ Q(t). There are two possible cases.
Case 1. |Q(t − 1)| = |Q(t)|: The only possible state changes allowed by this condition result from recombination between parental genes. Given Q(t) = (i, j, k, l), there are 12 possible states of Q(t − 1):
Note that all other jumps would require more than a single recombination event; their probabilities are therefore of the order 1/N^{2} and are negligible. We are interested in the probabilities that the process jumps from (i, j, k, l) to any of the above states, e.g.,
The probability that a selected gene B from generation t is attached to a_{1} while its ancestor was attached to A_{1} is given by g_{A_{1}}(a_{1},t). Because we are sampling j a_{1} alleles,
Equations of the same form can be obtained for the remaining 11 jumps.
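The single-recombination jumps of Case 1 can be enumerated mechanically. The sketch below assumes, as the a_{1} example suggests, that the probability of each jump is the number of sampled genes on the current background Y multiplied by g_{X}(Y,t); the callable `g` is a hypothetical stand-in for those probabilities, which are not reproduced here.

```python
from typing import Callable, Dict, Tuple

State = Tuple[int, int, int, int]          # (i, j, k, l)
BACKGROUNDS = ("A1", "a1", "A2", "a2")     # order matches (i, j, k, l)


def single_jump_probabilities(
    state: State, g: Callable[[str, str], float]
) -> Dict[State, float]:
    """Map each state of Q(t - 1) reachable by one recombination to its probability.

    `g(x, y)` is a user-supplied placeholder for g_X(Y, t) at a fixed t:
    the chance that a B gene now on background `y` had an ancestor on `x`.
    """
    jumps: Dict[State, float] = {}
    for y, y_name in enumerate(BACKGROUNDS):
        if state[y] == 0:
            continue                        # no sampled gene on this background
        for x, x_name in enumerate(BACKGROUNDS):
            if x == y:
                continue
            ancestral = list(state)
            ancestral[y] -= 1               # one lineage leaves background y ...
            ancestral[x] += 1               # ... and is found on x one generation back
            jumps[tuple(ancestral)] = state[y] * g(x_name, y_name)
    return jumps
```

When all four counts are positive the function returns the 12 states referred to above; when some counts are zero, fewer jumps are possible because there is no lineage to move.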
Case 2. |Q(t − 1)| ≠ |Q(t)|: We have already noted that the number of ancestral alleles can only decrease going backward in time. This case implies that some of the genes sampled at time t must have a common ancestor at time (t − 1). Kaplan et al. (1988) have shown that the probability that two genes of a particular allelic type at time t have a common ancestor in generation (t − 1) is, to the first order in N, given by the probability of coalescence by drift, i.e.,
Calculation of p_{gd} and p_{gs}: At the end of the selective sweep we sample two B alleles from the metapopulation. We want to know the probability that the two alleles had a common ancestor during the selective sweep. We consider two cases:
Case 1: the two genes are sampled from different ecological populations. The probability of coalescence during the sweep is p_{gd}.
Case 2: the two genes are sampled from the same ecological population. The probability of coalescence during the sweep is p_{gs}.
We begin by considering case 1, calculation of p_{gd}. We follow the Q process back in time, going from t = τ_{f} (end of the selective sweep) to t = τ_{b} (beginning of selective sweep; Figure 8). We can write (1 − p_{gd}) as the probability of escaping coalescence, i.e., leaving the selective sweep (t = τ_{f}) with one B gene in population 1 (attached to A_{1}) and one B gene in population 2 (attached to A_{2}) and entering it (t = τ_{b}) with two ancestral B genes:
We also need to define P_{ijkl}(t), the probability of finding the Q process in the state (i, j, k, l) at time (t), given that at the end of the selective sweep Q(τ_{f}) = (1,0,1,0),
To calculate all the relevant P_{ijkl}(τ_{b}) values, we use the differential equations governing their behavior:
To be able to treat the model deterministically, we follow Kaplan et al. (1989) and limit t to
It remains for us to establish the boundary conditions
Because the adaptive mutation enters population 2 later than it enters population 1, we must determine the value of x_{1} at the time when x_{2} = 1 − ε. For that purpose we need to first run the frequency equations forward in time (starting at x_{1} = ε, x_{2} = 0). We then allow for the transfer of a single adaptive mutation to population 2. This takes place at time E(τ_{c}), which is the expectation of the transfer of the first adaptive allele that will become fixed in population 2. (This is in fact equal to the time at which 1/2z alleles have crossed over.) We run the equations forward in time until τ_{2}(1 − ε) to establish the end boundary conditions for x_{1}; then we can run both x_{1} and x_{2} backward, along with the equations for P_{ijkl}, following Equation A3 (see Figure 8).
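The forward pass can be sketched under a stand-in model. The frequency equations themselves are not reproduced in this appendix, so the code below simply assumes deterministic logistic growth, dx/dt = s x (1 − x), in each population, under which logit(x) increases linearly at rate s. The selection coefficient `s`, the starting frequency `x2_start`, and the function names are all hypothetical.

```python
import math


def logit(x: float) -> float:
    return math.log(x / (1.0 - x))


def inv_logit(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def x1_when_x2_nearly_fixed(s: float, eps: float, tau_c: float, x2_start: float):
    """Return (elapsed time, x1) at the moment x2 reaches 1 - eps.

    Assumed model (not the paper's equations): logistic growth at rate `s`
    in both populations. Population 1 starts at frequency `eps` at time 0;
    population 2 starts at `x2_start` at the transfer time `tau_c`.
    """
    # Time needed for x2 to climb from x2_start to 1 - eps.
    sweep_time_2 = (logit(1.0 - eps) - logit(x2_start)) / s
    elapsed = tau_c + sweep_time_2
    # x1 has been growing logistically for the whole elapsed time.
    x1_end = inv_logit(logit(eps) + s * elapsed)
    return elapsed, x1_end


elapsed, x1_end = x1_when_x2_nearly_fixed(s=0.1, eps=1e-3, tau_c=50.0, x2_start=1e-3)
print(elapsed, x1_end)
```

The pair returned here plays the role of the end boundary condition for x_{1}: it is the value from which both frequencies would then be run backward.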
The transfer of the first adaptive allele from population 1 to population 2 is a stochastic event and hence introduces a discontinuity in the solutions to the differential equations for P_{ijkl}. At the time immediately preceding the initial transfer of the adaptive allele into population 2 (time τ_{c}+), no A_{2} alleles existed. Therefore, at time τ_{c}+ the probabilities of a B allele being attached to an A_{2} allele [P_{ijkl}(τ_{c}+) for k ≠ 0] must be zero. However, these P terms might have nonzero values at τ_{2}(ε). Following Kaplan et al. (1989), we may neglect the contribution of P_{0020}(τ_{2}[ε]) to p_{gd}. However, a nonzero value of P_{ijkl}(τ_{2}[ε]) implies that immediately after the transfer event, one of the sampled B genes is attached to an A_{2} allele. In this case, there exist two possible states of the Q process immediately preceding the transfer: Q(τ_{c}+) = (i + 1, j, 0, l) if the transfer was a corecombination of A and B (with a probability q); or Q(τ_{c}+) = (i, j, 0, l + 1) if only the A allele was transferred (with a probability 1 − q). Hence, the additional boundary conditions at τ_{c} needed to account for the transfer are
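The redistribution of probability at the transfer event, as described verbally above, can be sketched as follows. The dictionary representation and the function name are illustrative choices; states with more than one B gene attached to A_{2} are dropped, mirroring the stated neglect of P_{0020}.

```python
from typing import Dict, Tuple

State = Tuple[int, int, int, int]  # (i, j, k, l)


def apply_transfer_boundary(p_after: Dict[State, float], q: float) -> Dict[State, float]:
    """Map state probabilities just after the transfer to those just before it.

    `p_after` holds P_ijkl at tau_2(eps); the result holds P_ijkl at tau_c+,
    where no A2 allele exists, so every returned state has k == 0.
    `q` is the probability that A and B were co-recombined.
    """
    p_before: Dict[State, float] = {}
    for (i, j, k, l), mass in p_after.items():
        if k == 0:
            targets = [((i, j, 0, l), mass)]
        elif k == 1:
            targets = [
                ((i + 1, j, 0, l), q * mass),          # B arrived together with A
                ((i, j, 0, l + 1), (1.0 - q) * mass),  # only A arrived; B was on a2
            ]
        else:
            continue  # k >= 2 is neglected, as for P_0020 in the text
        for state, m in targets:
            p_before[state] = p_before.get(state, 0.0) + m
    return p_before
```

Probability mass is conserved across the boundary for every state with k ≤ 1, which provides a simple check on any numerical implementation.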
APPENDIX B: The Expected Time to Coalescence
Coalescence within populations
Following Cohan (1994a), we first consider the time to coalescence t_{s} for two segments currently residing in cells of the same ecological population. The statistical properties of t_{s} are investigated by dividing t_{s} into two constituent quantities: the time k_{s} necessary to go back into the past of the two lineages to reach a “key event” (defined below) and the additional time necessary to go beyond the key event to reach a coalescence of the two lineages into a common ancestor (if the key event did not result in coalescence). A key event is any event that changes the expected time to coalescence. In the case of segments from the same ecological population, a key event may be any of the following: coalescence by drift acting on N cells of the population, a local selective sweep, a global selective sweep, or a genetic exchange event in which one of the segments is transferred into its present population from another ecological population.
The variable t_{s} is thus defined below, where the first term k_{s} represents the time to reach the most recent key event, and the other three terms represent the additional time necessary to go beyond the key event to reach a coalescence (Table 2):
Local selective sweep as the key event: The random variable χ_{σl} indicates whether the key event was a local selective sweep (χ_{σl} = 1 if the key event was a local selective sweep; χ_{σl} = 0 otherwise). Two lineages that are in the same population at the end of the selective sweep may begin the sweep in one of three states: as a single lineage (i.e., the lineages coalesce), as two lineages in the same population, or as two lineages in different populations. The random variable χ_{p_{l}} indicates whether the lineages escape coalescence and begin in the same population (1 if yes, 0 otherwise, as above), and
Global periodic selection as the key event: The random variable χ_{σg} indicates whether the key event was a global selective sweep (1 if yes, 0 otherwise). The random variable χ_{p_{gs}} indicates whether the lineages escape coalescence and begin the sweep in the same population (1 if yes, 0 otherwise), and the random variable
Recombination between populations as the key event: The random variable χ_{c_{d}} indicates whether the key event is a recombination between populations, in which one of the lineages enters the current population (1 if yes, 0 otherwise). The random variable
Coalescence between populations
Next consider the time to coalescence, t_{d}, for segments that are now in two cells belonging to different ecological populations. As above, we divide t_{d} into the time k_{d} to go back to the most recent key event (a global selective sweep or a genetic exchange event in which one of the lineages enters its current population from the population of the other lineage) and the additional time required to go beyond the key event to reach coalescence:
Global selective sweep as the key event: The random variable χ_{p_{gd}} indicates whether the two lineages presently in different populations escape coalescence and begin the sweep in different populations, and
Between-population recombination as the key event: The random variable χ_{c_{δ}} indicates whether the key event was a recombination event between populations, such that before the event the lineages were in the same population and afterward were in different populations (1 if yes, 0 otherwise). The random variable
Expected values of t_{s} and t_{d}: The expected values of all indicator variables were calculated following Cohan (1994a), except when the key event is a selective sweep. The expectation of χ_{p_{l}} is the probability that two segments from the same population escape coalescence and begin the sweep in the same population (1 − P_{l}) and likewise for the expected values of other indicator variables: E(χ_{p_{gs}}) = 1 − P_{gs}; E(χ_{p_{gd}}) = 1 − P_{gd}; E(χ_{γl}) = P_{l_{c}}. We also define E(χ_{γg}) = P_{gs_{c}} and E(χ_{γd}) = P_{gd_{c}}. All the relevant expected values are calculated in appendix A.
The expected values of t_{s} and t_{d} are as follows:
Probability density functions of t_{s} and t_{d}: We define P_{t_{s}}(t) as the probability that the time t_{s} to coalescence for two segments currently in the same population is equal to t. Similarly, P_{t_{d}}(t) is the probability that the time t_{d} to coalescence for two segments currently in different populations is equal to t. The values P_{t_{s}}(t) and P_{t_{d}}(t) can then be expressed as the probability that the most recent key event occurred at exactly time t and led to coalescence or, if the key event occurred at any other time τ < t and did not lead to coalescence, the additional time necessary to reach coalescence was (t − τ). We can consider the key events to be exponentially distributed (Hudson and Kaplan 1988). Hence,
The above recursions were solved numerically to yield numerical representations of the two probability distribution functions.
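The key-event decomposition used throughout this appendix can be illustrated with a deliberately simplified first-step calculation of the two expected coalescence times. This is a sketch under stated assumptions, not the expressions derived above: key events are taken to occur at constant rates, and a pair of lineages that escapes coalescence during a sweep is assumed to stay in its current configuration (the full model also allows the pair to switch between the same-population and different-population states during a sweep). All parameter names are hypothetical.

```python
def expected_coalescence_times(coal_same: float, coal_diff: float,
                               move_out: float, move_in: float):
    """Expected times to coalescence (E[t_s], E[t_d]) in a two-state sketch.

    coal_same: rate at which two lineages in the same population coalesce
               (e.g. drift rate + local sweep rate * P_l + global sweep rate * P_gs)
    coal_diff: rate at which lineages in different populations coalesce
               (e.g. global sweep rate * P_gd)
    move_out:  rate of recombination events that, looking backward, place the
               two lineages in different populations
    move_in:   rate of recombination events that place them in the same population

    First-step analysis gives the linear system
        (coal_same + move_out) * E_s - move_out * E_d = 1
        (coal_diff + move_in)  * E_d - move_in  * E_s = 1
    which is solved here by Cramer's rule.
    """
    det = coal_same * coal_diff + coal_same * move_in + move_out * coal_diff
    if det <= 0.0:
        raise ValueError("coalescence is unreachable from at least one state")
    e_same = (coal_diff + move_in + move_out) / det
    e_diff = (coal_same + move_out + move_in) / det
    return e_same, e_diff


print(expected_coalescence_times(coal_same=0.01, coal_diff=0.001,
                                 move_out=1e-4, move_in=1e-4))
```

In this sketch the difference E[t_d] − E[t_s] equals (coal_same − coal_diff)/det, so lineages in different populations take longer to coalesce whenever coalescence is faster within a population than between populations.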
Acknowledgments
We thank Michael Feldgarden for suggesting that we explore a model of global periodic selection and Richard Hudson for suggesting important improvements to the model. This work was supported by Environmental Protection Agency grants R821388010 and R825348010 and by research funds from Wesleyan University.
Footnotes

Communicating editor: R. R. Hudson
 Received September 20, 1996.
 Accepted April 23, 1999.
 Copyright © 1999 by the Genetics Society of America