Abstract
An approximate method is developed to predict the number of strongly overdominant alleles in a population of which the size varies with time. The approximation relies on the strongselection weakmutation (SSWM) method introduced by J. H. Gillespie and leads to a Markov chain model that describes the number of common alleles in the population. The parameters of the transition matrix of the Markov chain depend in a simple way on the population size. For a population of constant size, the Markov chain leads to results that are nearly the same as those of N. Takahata. The Markov chain allows the prediction of the numbers of common alleles during and after a population bottleneck and the numbers of alleles surviving from before a bottleneck. This method is also adapted to modeling the case in which there are two classes of alleles, with one class causing a reduction in fitness relative to the other class. Very slight selection against one class can strongly affect the relative frequencies of the two classes and the relative ages of alleles in each class.
STRONG balancing selection can result from overdominance in fitness or from disassortative mating of the kind created by selfincompatibility systems in plants. Models of balancing selection are difficult to analyze completely because of the large number of alleles present, so numerous approximations have been used to provide some quantitative understanding of how a balance is achieved between mutation, selection, and genetic drift. Approximations that have been used—including those of Wright (1939), Kimura and Crow (1964), Sasaki (1989, 1992), Takahata (1990), and Vekemans and Slatkin (1994)—have all assumed that a stochastic equilibrium was reached in a population of constant size. Yet many applications, such as the estimation of the ancestral population size of modern humans (Kleinet al. 1993; Ayala 1995), are to populations that have been variable in size. Of particular concern in this and other applications is the effects of a bottleneck in size. The problem is to obtain quantitative results without resorting to extensive computer simulations. Simulations are important and useful, but they cannot usually show how a model’s behavior depends on simple combinations of parameters.
In this article, we introduce an approximate analytic theory that allows us to obtain relatively simple results for populations of variable size and to generalize models of balancing selection to allow for slight differences in relative fitness of different classes of alleles. Our approximate method is based on the strongselection weakmutation (abbreviated as SSWM and pronounced “swim”) approximation of Gillespie (1984, 1991). This approximation leads naturally to a Markov chain for the number of common alleles in a population and provides a way to compute the rates of gain and loss of common alleles when population size varies. Our approximation is the same as that used previously by Sasaki (1989, 1992), and our results for a population of constant size are the same as his.
The Markov chain can be further approximated by an ordinary differential equation for the expected numbers of alleles and thus can provide a rough prediction of the rate at which alleles are gained and lost because of a bottleneck. We verify that our method provides an accurate approximation by comparing our predictions with computer simulations. Our results are very similar to and consistent with those of Takahata (1990) for populations of constant size and with those of Takahata (1993) for a bottleneck in population size.
ONE SELECTIVE CLASS OF ALLELES
Analytic theory: Throughout, we consider a single autosomal locus in a randomly mating diploid population in which the population size, N(t), may vary with time. We assume that mutations occur at rate u and that every mutation is to an allele not previously found in the population—the infinite alleles model of Kimura and Crow (1964). We let i be the number of common alleles in the population at any time, where a common allele is one with a frequency near its deterministic equilibrium frequency, as described later. We assume overdominance in fitness, in which every heterozygote has a fitness of 1 + s relative to every homozygote. This parameterization is different from that of Takahata (1990), who assumed every homozygote has a fitness 1  s relative to every heterozygote. In the diffusion limit (large population size and small s), on which both Takahata’s and our theories depend, there is no difference in the parameterizations. A value of s in Takahata’s parameterization corresponds to s/(1  s) in ours.
New mutations arise and if they become common, they persist in the population as common alleles for a long time. This situation is suitable for Gillespie’s (1984, 1991) SSWM approximation. We treat i, the number of alleles, as a random variable for which transitions from one time to the next can be modeled by a Markov chain. In fact, the Markov chain we use when there is one class of alleles is of a particularly simple kind because it allows for an increase or decrease in i by only one. Consider a population with i common alleles and assume that they are at the deterministic equilibrium with each having a frequency 1/i. Now suppose that a new allele appears by mutation. Let the frequency of the new mutant be z. The change in z is approximately
With a mutation rate u per generation, there are on average 2N(t)u mutations per generation, so the probability per generation that a new mutant appears and becomes common is
We also need the probability of loss of a common allele. Takahata (1990, Equation 6) found that in a population of constant size, the average time to loss of an allele in frequency x is approximately
When i alleles are equally frequent, F = 1/i. Equation 3 differs from Takahata’s equation because time is measured in generations and not in units of 2N generations. When S is large, common alleles are held relatively close to their expected frequency, x = 1/i, with only a weak force of genetic drift pulling them toward the boundary (x = 0). This stochastic process is similar to that described by Newman et al. (1985), in which the transition from being common to being lost occurs in a very short time. In that case, it is reasonable to approximate the process of loss of a common allele as a Poisson process occurring with an instantaneous rate 1/t¯(x).
When population size varies with time, it is still reasonable to use the assumption that loss of an allele occurs instantaneously with the probability of loss per generation determined by N(t), provided that 2N(t)s is large enough that the selection dominates genetic drift. In that case, the instantaneous rate of loss is still 1/t¯(x) with S replaced by 2N(t)s. Because there are i alleles that are liable to be lost by drift, the probability of loss per generation is
We can compare our approximation with Takahata’s (1990) by assuming a constant population size N and finding the value of i for which μ_{i} = λ_{i}. The result is
We can use the Markov chain to predict the stationary probability distribution of i in the population, α_{i}. Because of the simple form of the chain, we can use standard theory to find α_{i}
Simulation results: To test the accuracy of our analytic approximation, we developed a computer simulation of our model. We assumed a population of N diploid individuals. A generation consisted of mutation, viability selection, and random mating. Beginning with N zygotes, the program generated a Poisson distributed random number of mutations with mean 2Nu. For each mutation, one of the 2N copies was chosen at random to mutate to an allele not found in the population. It was possible for the same copy to mutate more than once, but because 2Nu was one or less in our simulations, double mutations were extremely rare. Viability selection was modeled by computing the deterministic change in allele frequencies given the allele frequencies and selection parameter. Then random mating was modeled by randomly drawing 2N new alleles from a multinomial distribution with allele frequencies given by the values after selection. We began each simulation with the population initially fixed for one allele and waited until a stochastic equilibrium was reached before recording summary statistics. The waiting time depended primarily on the mutation rate and was always <1/u generations.
In the simulations, we recorded the numbers of common alleles at each time. We used Takahata’s (1990) threshold for a common allele, his value ∂ = 1/(4Nsm), where m = F  u/s and F is the homozygosity. We found that the exact value of this threshold made little difference. Figure 1 shows some typical results for the distribution of i, the number of common alleles across replicates, compared to normal distribution with the mean and variance computed from the Markov chain approximation. The fit is close although not perfect. There are slightly more common alleles found than predicted. Figure 2 shows the average number of common alleles as a function of N. These results show that there is little difference between the predictions of Equation 5 and Takahata’s (1990) result (his Equation 5) and that both provide good approximations to the simulated results. Although in Figure 2 the results from the Markov chain are slightly closer to the simulation results, that is not always the case for other combinations of parameters.
POPULATION BOTTLENECK
We can use our method to predict the response to changes in population size. These predictions can be obtained in two ways. The first is to use the solution to the equation that treats i as a deterministic variable with λ_{i} and μ_{i} as the deterministic rates of increase and decrease. That is, we use the equation
The second and more accurate way to predict the effect of a change in population size is to iterate the Markov chain using the timedependent values of λ_{i} and μ_{i}. Figure 3 compares the simulation results with the results from iterating the Markov chain and with the linear approximation, Equation 9.
In many discussions of the effect of a population bottleneck on variability at an overdominant locus, an important question is how rapidly are alleles lost as a result of the bottleneck. Our results support the conclusions of Takahata (1993). Figure 4 shows the probability of loss of an allele per generation, comparing the exact result based on the evaluation of t¯(x) by numerical integration with values based on assuming no selection and values based on Equation 3. We can see that the probability of loss under selection is always smaller than under neutrality, as expected. Overdominant selection retards the loss of common alleles. The analytic approximation, Equation 3, is quite accurate for a reduction in size by a factor of two but not for a larger reduction. The neutral approximation is accurate if the reduction is by a factor of five or more.
Takahata (1993) discusses the quantitative effects of a bottleneck in population size on the number of common alleles and discusses the survival of allelic lineages from before the bottleneck. Our results are consistent with Takahata’s. The probability of loss of a lineage from before the bottleneck to any later time is, as Takahata (1993) argues, the integral of t¯(x) over that time period. Because all common alleles are equivalent in survival probability, the probability distribution of the number of lineages surviving from before the bottleneck is a binomial distribution with the probability being the probability of survival of a single lineage and the sample size being the number of lineages immediately before the bottleneck. The distribution of the number of new lineages is then obtained by subtraction.
Both Takahata’s (1993) and our models of a bottleneck in population size assume a diffusion limit in which at most one common allele can be lost per generation. That assumption is appropriate if a bottleneck is not too sudden or small, and it appears to be appropriate for modeling the history of human populations. If the reduction in population size is extreme and rapid as might happen in the colonization of an island or other isolated region, then many alleles could be lost in a single generation. In that case, a single generation of random sampling would have to be included in the analysis to allow for the loss of several alleles in one generation.
These results support the discussions of Takahata (1990) and Ayala (1995) that for a bottleneck to lead to a substantial reduction in the number of alleles the ratio of the length of time during which the bottleneck occurs and the size during the bottleneck must exceed one, provided that the reduction in size is sufficiently great that the neutral approximation can be used. When the neutral approximation is not valid, the rate of loss is somewhat smaller and more alleles survive the bottleneck than expected on the basis of neutral theory. Although the differences between the rates of loss for neutral and overdominant alleles appear small in Figure 4, the rate for neutral alleles is two to three times that for overdominant alleles.
We have not been able to obtain a simple criterion for the ranges of parameter values for which the asymptotic approximation for strong selection, Equation 4, applies and when the asymptotic approximation for neutral alleles applies. The exact result depends in a complex way on i. The integral is easy to evaluate using a mathematical algebra program, so that is the best way to determine whether either of the asymptotic approximations can be used in any case.
TWO SELECTIVE CLASSES OF ALLELES
Our analysis of overdominant selection follows a long tradition of assuming equal fitnesses of all homozygotes and all heterozygotes. Even that case has proved difficult enough to analyze. The method we have developed allows us to test the robustness of results based on assuming homogeneity of selection on heterozygotes and homozygotes. We explore the possibility that there is heterogeneity in selection and show that relatively small differences in fitness can lead to large differences from the results previously obtained. We use a fitness scheme that leads to relatively simple algebra, but other assumptions about heterogeneity in fitness can be analyzed in the same way. Satta (1997) carried out computer simulations of a different model of heterogeneity in fitness (her asymmetric balancing selection model), in which the relative fitness of a heterozygote increased with the number of codons that differed between two alleles.
We assume that there are two classes of alleles, with i different alleles in class A and j different alleles in class a. The relative fitnesses are
We first investigate the deterministic theory of this selection model. We let x be the equilibrium frequency of each allele in class A and y be the equilibrium frequency of each allele in class a (ix + jy = 1). It is relatively easy to show that at equilibrium
Using the same approach as in the previous section, we assume that the population contains i class A alleles, each in frequency x, and j class a alleles, each in frequency y, and then find the change in frequency of a new mutant of each type. It is straightforward to show that if z_{A} is the frequency of a new class A mutant and z_{a} is the frequency of a new class a mutant, then
If the population size is constant, we can estimate the equilibrium numbers of alleles in both classes by solving the pair of equations,
We tested the accuracy of the Markov chain approximation for the case of two classes of alleles. The simulation program was the same as described above except for the modifications necessary to account for two classes of alleles. As shown in Figure 5, the average numbers of alleles are as predicted by the analytic theory.
If the population is constant in size, the average time to loss of alleles in the two classes is approximately the average allele age. The average ages of class A and a alleles can be quite different, even for relatively small values of the ratio r/s. By substituting into (3), we find
Figure 6 shows the ratio of average ages as a function of r/s. Figure 6 also shows the ratio of allele ages found in the simulations. For relatively small values of r/s the Markov chain accurately predicts the ratio of average ages (and both average ages as well), but for larger r/s the ratio of ages is even larger than predicted because alleles in the a class are so rare that their loss is no longer slowed by overdominant selection. Note that even very small values of r/s can result in ratios of average ages of five or larger.
DISCUSSION
We have shown that the SWMM approximation of Gillespie (1984, 1991) provides a simple, accurate way to predict the numbers of common alleles when there is strong overdominant selection. This approach allows us to make simple predictions about the loss of alleles during a bottleneck in population size and to explore the consequences of slight fitness differences among overdominant alleles. The cases we examined were chosen to illustrate the effects of varying population size and heterogeneous selection in simple contexts but the method can be easily generalized to provide approximate results for more complex and realistic cases. The success of the method depends only on the decoupling of the dynamics of new mutations from the dynamics of common alleles.
Our results are of relevance to the problem of understanding the effects of a bottleneck in population size on the number of alleles at a locus such as those in the major histocompatibility complex (MHC) in humans and other mammals. Many loci in the MHC are highly polymorphic and the high level of polymorphism is thought to be maintained by overdominance in fitness (Hughes and Nei 1988). Reduced levels of variation in the MHC can be caused by a population bottleneck (Ayala 1995; Ellegrenet al. 1996; Milleret al. 1997, Weninket al. 1998). In general, more overdominant alleles will survive a bottleneck than would be expected on the basis of neutral theory, but the neutral approximation will accurately predict the initial rate of loss of overdominant alleles after a large reduction in population size (Takahata 1993).
Our results are also of relevance to discussions of the relative ages of overdominant alleles. Under Takahata’s (1990) theory, common alleles follow a coalescent model with a rescaled population size. According to that theory, the relative ages of alleles follow a geometric distribution. Our results with two selective classes of alleles show that even very slight heterogeneity in selective effects of different alleles can lead to large deviations from a geometric distribution. Ratios of ages of 5 or 10 are easily obtained when there is much less than a 1% difference in relative fitnesses. Slight differences in relative fitness are amplified by the long persistence time of both classes of alleles. Our model of two selective classes is not the only one possible; a similar approach could be used for three or more selective classes. The result for two selective classes tells us that, unfortunately, pleiotropic effects far too small to measure could cause substantial deviations from the coalescent approximation of Takahata (1990).
Acknowledgments
This research was supported in part by U.S. Public Health Service grant R01GM40282 to M.S. C.A.M. also received support from U.S. Public Health Service grant T32GM0712724. We thank J. Gillespie and M. Turelli for helpful discussions of this topic, M. Uyenoyama for telling us about Sasaki’s previous work on the SSWM approximation, and A. Sasaki for providing additional information about his Ph.D. dissertation.
Footnotes

Communicating editor: N. Takahata
 Received November 16, 1998.
 Accepted February 26, 1999.
 Copyright © 1999 by the Genetics Society of America