Abstract
We consider a diffusion model with neutral alleles whose population size is fluctuating randomly. For this model, the effects of fluctuation of population size on the effective size are investigated. The effective size defined by the equilibrium average heterozygosity is larger than the harmonic mean of population size but smaller than the arithmetic mean of population size. To see explicitly the effects of fluctuation of population size on the effective size, we investigate a special case where population size fluctuates between two distinct states. In some cases, the effective size is very different from the harmonic mean. For this concrete model, we also obtain the stationary distribution of the average heterozygosity. Asymptotic behavior of the effective size is obtained when the population size is large and/or autocorrelation of the fluctuation is weak or strong.
The average heterozygosity has been one of the most frequently used measures of genetic diversity. A large amount of data has been accumulated to estimate the average heterozygosity of various species using protein electrophoresis (see, for example, Nevo 1978; Nei and Graur 1984; Hamrick and Godt 1990). Recent developments in molecular techniques such as randomly amplified polymorphic DNA (Williams et al. 1990) and amplified fragment length polymorphism (Vos et al. 1995) enable us to estimate the average heterozygosity at the DNA level by randomly sampling many short sequences in the genome and examining their variation. Using methods such as those developed by Clark and Lanigan (1993), Lynch and Milligan (1994), and Innan et al. (1999), it is now possible to estimate genomewide heterozygosity fairly easily with these techniques (e.g., Miyashita et al. 1999).
One of the reasons that heterozygosity has been used for measuring genetic diversity is a simple relationship between its expectation and population genetic parameters under the neutrality assumption (Kimura 1968). For example, if we assume the infinite allele model in a haploid population of constant size N, the expected heterozygosity is expressed as H = 2Nu/(1 + 2Nu), where u is the mutation rate (Kimura and Crow 1964). Since H is a monotone increasing function of Nu, one can obtain information on Nu from estimates of H.
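The relation H = 2Nu/(1 + 2Nu) and its inversion can be sketched in a few lines of Python. This is only an illustration of the formula quoted above for the infinite allele model in a haploid population of constant size; the function names are hypothetical and do not come from the article.

```python
def expected_heterozygosity(n: float, u: float) -> float:
    """Expected heterozygosity H = 2Nu / (1 + 2Nu) for constant haploid size n."""
    theta = 2.0 * n * u
    return theta / (1.0 + theta)


def nu_from_heterozygosity(h: float) -> float:
    """Invert H = 2Nu / (1 + 2Nu) to recover the product Nu = H / (2(1 - H))."""
    if not 0.0 <= h < 1.0:
        raise ValueError("heterozygosity must lie in [0, 1)")
    return h / (2.0 * (1.0 - h))


if __name__ == "__main__":
    # Illustrative values only: N = 1000 and u = 1e-4 give 2Nu = 0.2.
    h = expected_heterozygosity(1000, 1e-4)
    print(h, nu_from_heterozygosity(h))
```

Because H is monotone increasing in Nu, the inversion is unique, which is why an estimate of H carries information on Nu but not on N and u separately.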
The size of a population is, however, rarely constant; it may fluctuate from generation to generation. In such cases, it is necessary to understand how fluctuation of population size affects genetic diversity and summary statistics such as H. To this end, the effects of fluctuation of population size on the effective population size N_{e} must be clarified and a representation for N_{e} must be obtained. With this representation, the expected heterozygosity may be expressed as H = 2N_{e}u/(1 + 2N_{e}u), which shows how the fluctuation of population size affects H.
Fluctuation of population size is, in general, not independent from generation to generation, as in the case of stochastic selection (Gillespie 1972; Takahata et al. 1975; Gillespie and Guess 1978; Iizuka and Matsuda 1982; Seno and Shiga 1984; Iizuka 1987). In other words, the fluctuation of population size is most likely autocorrelated. In the literature, the effective size is said to be equal to the harmonic mean of the population size when population sizes are not constant (Wright 1938; Crow 1954; Nei et al. 1975; Gillespie 1998). Iizuka (2001) showed, however, that for the Wright-Fisher model with fluctuating population size the effective size is not equal to the harmonic mean unless the fluctuation of population size is uncorrelated. To obtain this result, however, no mutation was assumed and the population size fluctuated between two distinct states. These assumptions are restrictive. It is important to include the effect of mutation to investigate the influence of fluctuation of population size on the genetic diversity of a population. Further, two-state models may be very special and some of their conclusions may not hold for the general case. It is necessary to see whether or not the effective size is different from the harmonic mean for a general pattern of fluctuation of population size.
In this article, we consider the diffusion model with neutral K (2 ≤ K ≤ ∞) alleles whose population size is fluctuating randomly, incorporating the effect of mutation. First, we consider a general case with respect to the fluctuation of population size and we show that the effective size defined by the equilibrium average heterozygosity is larger than the harmonic mean of population size but smaller than the arithmetic mean of population size. Then we consider a special case of a two-valued Markov chain as a model of the fluctuation of population size. This simplification enables us to obtain explicit formulas for the stationary distribution of the average heterozygosity and the effective size. We can see quantitatively how the effective size is different from the harmonic mean using the latter formula.
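To make the two bounds concrete, the following Python sketch computes the harmonic and arithmetic means of population size under a stationary distribution on finitely many sizes. The function names and the numerical values are hypothetical and chosen only for illustration; the sketch does not compute the effective size itself, only the interval that is stated above to contain it.

```python
from typing import Sequence


def arithmetic_mean(sizes: Sequence[float], probs: Sequence[float]) -> float:
    """Arithmetic mean E[N] under the stationary probabilities."""
    return sum(p * n for n, p in zip(sizes, probs))


def harmonic_mean(sizes: Sequence[float], probs: Sequence[float]) -> float:
    """Harmonic mean 1 / E[1/N] under the stationary probabilities."""
    return 1.0 / sum(p / n for n, p in zip(sizes, probs))


if __name__ == "__main__":
    # Illustrative two-state example: sizes 100 and 10,000, equally likely.
    sizes, probs = [100.0, 10000.0], [0.5, 0.5]
    print("harmonic mean:", harmonic_mean(sizes, probs))
    print("arithmetic mean:", arithmetic_mean(sizes, probs))
```

When the two sizes differ by orders of magnitude, the harmonic mean is dominated by the smaller size and the arithmetic mean by the larger one, so the interval between them can be very wide.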
NEUTRAL MODEL WITH FLUCTUATING POPULATION SIZE
Before we introduce the fluctuation of population size, we summarize some of the known results on the constant population model (Crow and Kimura 1970; Ewens 1979). We consider a neutral locus with K alleles A_{1}, A_{2},..., A_{K} in a randomly mating haploid population of constant size N. The mutation rate from A_{i} to all the other alleles is u per generation. Mutation occurs from A_{i} to A_{j} with the rate u/(K − 1) per generation (i, j = 1, 2,..., K, j ≠ i). Under the diffusion approximation (diffusion model), let x_{i}(t) be the gene frequency of A_{i} at time t. We denote by Ẽ[·] the operation taking the expectation with respect to the random sampling drift. Then the average heterozygosity
A large amount of ecological data suggests that numbers of individuals in natural populations fluctuate considerably in each epoch and from generation to generation (Elton and Nicholson 1942; Andrewartha and Birch 1954; Odum 1959). The variations in population size are influenced by such factors as climate, the abundance of available resources, fluctuation in prey-predator balance, and competition with other species using the same habitat (Nicholson 1957). In addition to those short-term changes demonstrated by ecological data, long-term changes of population size have been inferred from past climate and fossil data. It is well known that there were at least seven glacial and interglacial cycles with a period of ∼100,000 years in the last 700,000 years. Organismal populations are thought to have responded to such climate shifts by changing their habitats (Webb and Bartlein 1992). For example, many plant and animal species retreated to a few refugia in the southern parts of Europe during the last glacial period (Bennett 1997; Hewitt 2000). Although these climate changes have strong cyclic components, biotic responses have many stochastic elements due to the existence of physical barriers and species interactions. Thus, many species are thought to have experienced long-term stochastic changes of population size. Causes for long-term changes are not restricted to glacial cycles. Longer-term climate changes such as those in the last 3 million years (Webb and Bartlein 1992) and mountain building are among them. Thus, it is important to investigate fluctuation of population size in a general setting.
Now we consider the case in which population size fluctuates and let N(t) be the size of a haploid population at time t. In this article, we assume that {N(t)}_{−∞<t<∞} is a stationary stochastic process that does not depend on gene frequencies {(x_{1}(t), x_{2}(t),..., x_{K−1}(t))}_{t≥0}. In other words, the stochastic process that governs the change in population size is independent of the genetic structure of the population. We consider a diffusion model whose population size at time t is N(t) (for the precise meaning of this model, see Appendix A). This model is referred to as the neutral diffusion model with fluctuating population size and the case of K = ∞ is referred to as the infinite allele model with fluctuating population size. For this model, the average heterozygosity H(t) satisfies
Let
TWO-VALUED MARKOV CHAIN MODEL
To see how the effective size of population N_{e} depends on the probability law of {N(t)}_{−∞<t<∞} and to what extent N_{e} is different from N_{h} and N_{a}, we consider a special case of a continuous time two-valued Markov chain for {N(t)}_{−∞<t<∞}. Let {N(t)}_{−∞<t<∞} be a Markov chain on {N_{1}, N_{2}} such that
Let p(h, N_{i}) and p(h) be the stationary probability density functions of (H(t), N(t)) and H(t), respectively (i = 1, 2). Applying the results of Matsuda and Ishii (1981), we have
Now we have an explicit expression for N_{e}. By (11) and (29), we have
Noting that
The size of population N(t) may be very large in natural populations. Further, the autocorrelation of {N(t)}_{−∞<t<∞} may be very weak or strong. In such cases, we can consider the asymptotic behavior of N_{e}. To this end, we parameterize N_{1}, N_{2}, N_{e}, N_{h}, N_{a}, γ_{1}, γ_{2}, γ, V, u, and R_{K} by ε (ε → 0) such as
The Wright-Fisher model with fluctuating population size was investigated by Iizuka (2001). Let N^{(k)} be the size of the haploid population in generation k, where {N^{(k)}}_{k=0,±1,±2,...} is a two-valued Markov chain. This model is defined as the Wright-Fisher model with no mutation and no selection whose population size in generation k is N^{(k)}. For this model, the effective size
The Wright-Fisher model with fluctuating population size is more fundamental than the diffusion model with fluctuating population size, since in the latter model the stochastic effect of fluctuation of population size is introduced after the diffusion approximation. It does not seem easy, however, to incorporate the effect of mutation into the Wright-Fisher model with fluctuating population size. In contrast, it is easy to incorporate the effect of mutation into the diffusion model with fluctuating population size because the differential equation for H(t) is linear. Furthermore, as we have shown in this article, the diffusion model with fluctuating population size accommodates a very general pattern of fluctuation of population size.
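The practical value of this linearity can be illustrated with a short Python sketch for a two-state population size. It rests on assumptions that are not copied from the article's numbered equations: the K = ∞ heterozygosity equation is taken in the standard linear form dH/dt = 2u(1 − H) − H/N(t), which is consistent with the constant-size equilibrium H = 2Nu/(1 + 2Nu), and the switching rates `q12` (from N_{1} to N_{2}) and `q21` (from N_{2} to N_{1}) are hypothetical names that need not match the article's γ parameters. Under these assumptions, the stationary moments m_{i} = E[H(t); N(t) = N_{i}] satisfy a 2 × 2 linear system, and the effective size follows from inverting H = 2N_{e}u/(1 + 2N_{e}u).

```python
def effective_size_two_state(n1: float, n2: float, u: float,
                             q12: float, q21: float) -> float:
    """Effective size from the stationary mean heterozygosity (assumed model).

    Assumes dH/dt = 2u - (1/N(t) + 2u) H with N(t) a two-state Markov chain
    that switches N1 -> N2 at rate q12 and N2 -> N1 at rate q21.
    """
    total = q12 + q21
    pi1, pi2 = q21 / total, q12 / total      # stationary probabilities
    a1 = 1.0 / n1 + 2.0 * u                  # decay rate of H in state 1
    a2 = 1.0 / n2 + 2.0 * u                  # decay rate of H in state 2
    r1, r2 = 2.0 * u * pi1, 2.0 * u * pi2    # mutation input per state

    # Stationary moment equations:
    #   (a1 + q12) m1 - q21 m2 = r1
    #   -q12 m1 + (a2 + q21) m2 = r2
    # solved by Cramer's rule; the determinant is written in expanded form
    # to avoid cancellation when the switching rates are large.
    det = a1 * a2 + a1 * q21 + a2 * q12
    m1 = (r1 * (a2 + q21) + q21 * r2) / det
    m2 = (r2 * (a1 + q12) + q12 * r1) / det

    h_bar = m1 + m2                          # stationary mean heterozygosity
    return h_bar / (2.0 * u * (1.0 - h_bar))  # invert H = 2 Ne u / (1 + 2 Ne u)


if __name__ == "__main__":
    # Illustrative values only: sizes 100 and 10,000, u = 1e-4, symmetric rates.
    for q in (1e-9, 1e-3, 1.0, 1e6):
        print(q, effective_size_two_state(100.0, 10000.0, 1e-4, q, q))
```

In this sketch, rapid switching drives the computed effective size toward the harmonic mean, whereas slow switching moves it away from the harmonic mean while keeping it below the arithmetic mean.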
DISCUSSION
Here, we discuss the biological relevance of our results. Suppose that we are interested in the effects of selection on genetic variation in a species. Effects of weak selection depend on population size (Ohta 1973, 1992). Thus, it is important to obtain information on population size, although this is usually very difficult by nongenetic means (see Basset et al. 2001 for the problems that researchers face when trying to estimate effective size using demographic parameters). One of the easiest things we can do is to estimate the effective size defined here by measuring the average heterozygosity at neutral loci such as those of pseudogenes. Then, we can guess what variation pattern would be expected for alleles with a selection coefficient s. In fact, effects of selection depend not only on the effective size but also on the details of how population size changes (see Ohta 1997a, 1998). For example, the behavior of Tajima’s (1989) D as a function of the intensity of selection is very different if the rate of change of population size differs while the effective size is kept constant (Tachida 2000). Nevertheless, we can use (39) to determine which parameter combinations lead to a given effective population size under the assumption of the two-state model and then examine effects of selection on the basis of this information. Although the two-state model is unrealistic and we need to extend theoretical studies to more general cases, at least we can obtain a rough idea as to how selection affects genetic variation in the species by measuring the effective population size. For the inference of the mechanism of molecular evolution under fluctuating population size, see also Araki and Tachida (1997) and Ohta (1997b).
Tachida (1985) developed a method to calculate the probabilities that two neutral genes taken at random from a population have certain allelic states, which is called the joint frequencies of alleles (see also Griffiths 1981). Using (29), we can extend this method to the case where population size is fluctuating. Let q(k) be the probability that two neutral genes taken at random from a population have k mutations since they diverged from their most recent common ancestor. Then the probability generating function of q(k)
APPENDIX A
Let f(x_{1}(t), x_{2}(t),..., x_{K−1}(t)) be an arbitrary function of gene frequencies in the neutral diffusion model. We denote by Ẽ[·] the operation taking the expectation with respect to the random sampling drift. By the general theory of diffusion processes (see Ewens 1979, pp. 136–137, or Karlin and Taylor 1981, pp. 213–216), the expectation of f(x_{1}(t), x_{2}(t),..., x_{K−1}(t)) satisfies
The neutral diffusion model with fluctuating population size is defined as a diffusion model with a random parameter by replacing N in (A2) with the stationary stochastic process N(t). In other words, this model is defined as a diffusion process in random environments. We have (7) in the same way as we have (2).
APPENDIX B
For a function f(x) with
APPENDIX C
For the two-valued Markov chain model, (7) can be expressed as
By (2.17) of Matsuda and Ishii (1981), the stationary probability density function p(h, N_{i}) of (H(t), N(t)) is given by
Acknowledgments
We thank A. Shimizu for noting the result of Dudley (1989) on a modification of Jensen’s inequality and two anonymous reviewers for valuable comments. M.I. was partially supported by a grant-in-aid (no. 12640139) from the Ministry of Education, Culture, Sports, Science and Technology of Japan. H.T. was supported in part by a grant from Program for Promotion of Basic Research Activities for Innovative Biosciences (PROBRAIN) and a grant from Uehara Memorial Foundation.
Footnotes

Communicating editor: W. Stephan
 Received August 13, 2001.
 Accepted February 11, 2002.
 Copyright © 2002 by the Genetics Society of America