Corresponding author: Valérie Laporte, Animal and Population Biology, University of Edinburgh, W. Mains Rd., Edinburgh EH9 3JT, UK. E-mail: valerie.laporte@ed.ac.uk
Communicating editor: G. B. GOLDING
| ABSTRACT |
|---|
A fast-timescale approximation is applied to the coalescent process in a single population, which is demographically structured by sex and/or age. This provides a general expression for the probability that a pair of alleles sampled from the population coalesce in the previous time interval. The effective population size is defined as the reciprocal of twice the product of generation time and the coalescence probability. Biologically explicit formulas for effective population size with discrete generations and separate sexes are derived for a variety of different modes of inheritance. The method is also applied to a nuclear gene in a population of partially self-fertilizing hermaphrodites. The effects of population subdivision on a demographically structured population are analyzed, using a matrix of net rates of movement of genes between different local populations. This involves weighting the migration probabilities of individuals of a given age/sex class by the contribution of this class to the leading left eigenvector of the matrix describing the movements of genes between age/sex classes. The effects of sex-specific migration and nonrandom distributions of offspring number on levels of genetic variability and among-population differentiation are described for different modes of inheritance in an island model. Data on DNA sequence variability in human and plant populations are discussed in the light of the results.
IN an ideal (Wright-Fisher) population of N breeding adult diploid individuals, the rate of genetic drift is equal to 1/(2N). [...] θ = 4Neµ for the scaled mutation rate, where µ is the neutral mutation rate per nucleotide site.
Population subdivision is a major factor that can influence Ne and has attracted much attention.
The total effective size of a subdivided population can be expressed as a function of the mean within-deme effective population size and of FST.
In particular, the inbreeding effective size of a subdivided population determines the asymptotic rate of increase of the probability of identity of a pair of alleles sampled randomly from the population at large. The variance effective size determines the variance of the change in allele frequency across a generation for genes drawn from the population as a whole. With constant population sizes across generations, these two parameters converge to the same value.
These measures of effective population size were originally derived from recursion equations for identity probabilities of pairs of alleles or variances in allele frequencies.
There is, however, increasing interest in comparing the patterns of genetic variation and population differentiation for genes with different modes of inheritance.
Here we first present a general framework for studying the pairwise coalescent times and identity probabilities of genes with different modes of inheritance in a metapopulation consisting of demes connected by an arbitrary migration scheme. Individuals within each deme are divided into different age and sex classes, with different demographic properties and migration probabilities. By using the "fast-timescale" approximation of [...]
| A GENERAL MODEL OF MIGRATION AND DRIFT IN STRUCTURED POPULATIONS |
|---|
We assume that individuals within a local population (deme) can be classified into different classes, e.g., with respect to age and/or sex. Let fijrs be the probability of identity of a pair of genes sampled from an individual of class r in deme i and an individual of class s in deme j. We can write mijru for the probability that a gene sampled from an individual of class r in deme i originated from an individual of class u in deme j.
When the probability per time interval of mutation of an allele to a new state, µ, is independent of allelic state, class, and deme, fijrs is equivalent to the moment-generating function for the time to coalescence of a pair of genes of the specified type, with parameter -2µ.
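As a hedged numerical illustration of this equivalence, the sketch below takes the simplest possible case: a single closed population in which a pair of genes coalesces with a constant probability per time interval. The function names and parameter values are assumptions made for illustration and do not come from the text. It compares the identity probability obtained from the one-step recursion f = (1 - µ)^2 [P + (1 - P) f] with the moment-generating function of an exponential coalescence time evaluated at -2µ, which is P/(P + 2µ).

```python
def identity_exact(p_coal, mu):
    """Solve f = (1 - mu)^2 * (p_coal + (1 - p_coal) * f) for f."""
    survive = (1.0 - mu) ** 2  # neither lineage mutates in one interval
    return survive * p_coal / (1.0 - survive * (1.0 - p_coal))


def identity_mgf(p_coal, mu):
    """MGF of an exponential coalescence time (rate p_coal) at -2 * mu."""
    return p_coal / (p_coal + 2.0 * mu)


if __name__ == "__main__":
    p_coal, mu = 1.0 / 2000, 1e-6  # illustrative values only
    print(identity_exact(p_coal, mu), identity_mgf(p_coal, mu))
```

The two values agree closely when both the coalescence probability and the mutation rate are small, which is the regime assumed throughout.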
Assuming that different individuals migrate independently of each other, and using the standard approach to modeling a geographically structured population (![]()
![]()
![]() |
(1) |
where Pijkrsuv is the probability of coalescence over one time interval of two genes sampled from individuals of class r in deme i and s in deme j, derived from individuals from deme k who belong to classes u and v, respectively (note that Pijkrsuv = 0 unless u = v).
For i = j and r = s, and if migration involves individuals rather than gametes, we should strictly include a term on the right-hand side that covers the case when the two genes sampled come from the same migrant individual. The approximations leading to Equation 3 mean that this is unnecessary, since ignoring it introduces only a second-order term into the final result. (This is because the chance that a pair of genes from the same deme are sampled from the same individual is of the order of the reciprocal of the deme size, and we ignore products of this and the migration probabilities.)
Writing hijrs = 1 - fijrs for the probability of nonidentity of a pair of alleles, Equation 1 can conveniently be rewritten as
![]() |
(2) |
To make further progress, we assume that migration rates between demes are of the same order as the probabilities that two genes sampled from the same deme coalesce over one time unit and that both are sufficiently small that second-order terms in these, as well as mutation rates, can be neglected. This assumption is commonly made in applications of coalescent theory.
To this order of approximation, the last term in braces on the right-hand side of (2) is nonzero only when i = j = k, corresponding to the case of two genes drawn from the same deme,
![]() |
(3) |
where δij is the Kronecker delta (δij = 1 when i = j and is otherwise 0); Pirsu is the probability that two genes sampled from deme i from individuals belonging to classes r and s coalesce in an individual of class u in the previous time interval, treating the deme as a closed population.
To obtain manageable results, we simplify further by removing the dependence of hijrs on r and s. We can average hijrs values over classes within demes by using a weight of αir for the contribution from class r in the ith deme. αir is chosen to be equal to the rth element of the left leading eigenvector of the matrix that describes the flow of genes among classes within deme i, scaled such that the elements of this eigenvector sum to one. This element represents the stationary-state probability that a randomly sampled gene from deme i originates from class r. αir gives an accurate approximation to the probability of origin of the sampled gene.
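The class weights can be illustrated with a minimal sketch for two sex classes under discrete generations. The matrices below are assumptions based on standard transmission rules (they are not given in the text): entry [r][u] is the probability that a gene carried by an individual of class r was inherited from a parent of class u. The left leading eigenvector is found by simple power iteration.

```python
def left_leading_eigenvector(matrix, iterations=200):
    """Power iteration for the left eigenvector of a row-stochastic matrix.

    matrix[r][u] is the probability that a gene in class r came from class u.
    The result is scaled so that its elements sum to one.
    """
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        # One backward step in time: row vector times matrix.
        weights = [sum(weights[r] * matrix[r][u] for r in range(n))
                   for u in range(n)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights


# Classes: 0 = female, 1 = male (illustrative transmission matrices).
AUTOSOMAL = [[0.5, 0.5], [0.5, 0.5]]
X_LINKED = [[0.5, 0.5], [1.0, 0.0]]  # a male's X always comes from his mother

if __name__ == "__main__":
    print(left_leading_eigenvector(AUTOSOMAL))
    print(left_leading_eigenvector(X_LINKED))
```

Under these assumptions an autosomal gene traces back to each sex with weight 1/2, whereas an X-linked gene spends two-thirds of its ancestry in females.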
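We can then conveniently define the net probability of origin from deme k of a gene sampled from deme i, mik, as
$$m_{ik} = \sum_{r}\sum_{u} \alpha_{ir}\, m_{ikru}, \tag{4}$$
where αir is the weight for class r in deme i described above. Substituting into Equation 3 and summing αir αjs hijrs over r and s, we obtain
![]() |
(5) |
where Pi is the probability of coalescence of a pair of genes sampled randomly from deme i, such that
$$P_i = \sum_{r}\sum_{s}\sum_{u} \alpha_{ir}\,\alpha_{is}\,P_{irsu}. \tag{6}$$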
Using the argument of ![]()
$$P_i = \frac{1}{2 N_{ie} t_i} \tag{7}$$
where ti is the generation time of individuals from deme i and Nie is the deme's effective population size. In the case of discrete generations, ti = 1 for all demes.
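The definition of effective population size as the reciprocal of twice the product of generation time and coalescence probability can be written as a one-line helper. This is a minimal sketch; the function name and the numerical values are illustrative assumptions.

```python
def effective_size(p_coal, generation_time=1.0):
    """Ne = 1 / (2 * t * P), with P the coalescence probability per time interval."""
    return 1.0 / (2.0 * generation_time * p_coal)


if __name__ == "__main__":
    n = 1000
    # Wright-Fisher case with discrete generations: P = 1/(2N), so Ne = N.
    print(effective_size(1.0 / (2 * n)))
    # Hypothetical age-structured case: same P per time interval, t = 2.5 intervals.
    print(effective_size(1.0 / (2 * n), generation_time=2.5))
```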
By differentiating hij with respect to -2µ, we obtain the following expression for the mean coalescent time for a pair of alleles sampled from demes i and j (![]()
![]() |
(8) |
This provides a general set of linear relations that can be used to solve for the Tij, to the assumed order of approximation, for any specific model. Some examples are described below. Higher moments of the distribution of pairwise coalescence times can similarly be obtained by successive differentiation of Equation 5.
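To show what solving such linear relations looks like in practice, the sketch below treats the symmetric island model with d demes and discrete generations. The two first-step equations used here are the standard weak-migration forms and are an assumption on our part, not a transcription of Equation 8: for genes in the same deme, T_w = 1 + (1 - P - 2m) T_w + 2m T_b; for genes in different demes, T_b = 1 + (1 - k) T_b + k T_w, with P = 1/(2Ne) and k = 2m/(d - 1).

```python
def island_coalescence_times(ne, m, d):
    """Solve the 2x2 linear system for within- and between-deme mean coalescence times."""
    p = 1.0 / (2.0 * ne)       # coalescence probability within a deme
    k = 2.0 * m / (d - 1)      # chance that two separated lineages are brought together
    # (p + 2m) * T_w - 2m * T_b = 1
    #      -k  * T_w +  k * T_b = 1
    a11, a12, b1 = p + 2.0 * m, -2.0 * m, 1.0
    a21, a22, b2 = -k, k, 1.0
    det = a11 * a22 - a12 * a21
    t_within = (b1 * a22 - a12 * b2) / det   # Cramer's rule
    t_between = (a11 * b2 - a21 * b1) / det
    return t_within, t_between


if __name__ == "__main__":
    print(island_coalescence_times(ne=100, m=0.01, d=10))
```

Under these assumptions the solution reduces to T_w = 2 Ne d and T_b = 2 Ne d + (d - 1)/(2m), so the within-deme time does not depend on the migration rate.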
It is also useful to apply the method of ![]()
, whose elements can be assumed to sum to one without loss of generality. Using the argument of ![]()
![]() |
(9) |
where the coalescent time for each deme is weighted by the product of the reciprocal of the coalescence probability and the square of the corresponding component of
. In the discrete generation case, T0 corresponds to twice the "migration effective size" as defined by ![]()
| COALESCENT PROBABILITIES AND EFFECTIVE POPULATION SIZES |
|---|
Separate sexes with discrete generations:
The utility of the approach described above for determining the effective population size of a deme can be illustrated by its application to the discrete-generation case with separate sexes, which has been treated previously.
From the assumption that the products of coalescence probabilities and migration probabilities are negligible, used to derive (3) and its successors, it is legitimate to treat each deme as a closed population. We have to consider the origins of genes sampled from individuals of all permissible combinations of sexes and parents (which combination is permissible depends on the mode of genetic transmission). For a pair of maternally derived genes sampled from two different females, the probability that they both come from the same mother is approximately
![]() |
(10) |
(![]()
i is the mean number of daughters per female in deme i.
This can be simplified by noting that the Poisson value for the variance in numbers of daughters per female, Viff, is
i, so that in general the deviation of this variance from the Poisson value can be written as
![]() |
(11) |
so that
![]() |
(12) |
Similar expressions can be obtained for other pairs of individuals and their parents (see Table 1).
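The probability that two daughters share a mother can be checked numerically. The sketch below is written under our own assumptions about notation: it computes the exact probability from a list of daughter counts per female, and compares it with the large-population approximation (1 + ΔV/mean^2)/Nf, where ΔV is the excess of the variance in daughter number over the Poisson value (the mean) and Nf is the number of breeding females. Function names and data are hypothetical.

```python
def prob_same_mother_exact(daughter_counts):
    """Chance that two distinct daughters, drawn at random, have the same mother."""
    total = sum(daughter_counts)
    same_mother_pairs = sum(k * (k - 1) for k in daughter_counts)
    return same_mother_pairs / (total * (total - 1))


def prob_same_mother_approx(daughter_counts):
    """Large-population approximation (1 + excess_variance / mean^2) / n_females."""
    n_females = len(daughter_counts)
    mean = sum(daughter_counts) / n_females
    variance = sum((k - mean) ** 2 for k in daughter_counts) / n_females
    excess = variance - mean  # deviation from the Poisson value (variance = mean)
    return (1.0 + excess / mean ** 2) / n_females


if __name__ == "__main__":
    counts = [0, 1, 2, 3, 4] * 20  # 100 females, mean 2, variance 2 (Poisson-like)
    print(prob_same_mother_exact(counts), prob_same_mother_approx(counts))
```

With variance equal to the mean, the approximation returns 1/Nf, and the two values differ only by a term of order 1/(total number of daughters).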
Let βirsu be the probability that a pair of genes sampled from individuals of sexes r and s in deme i both come from parents of sex u. Let
irsu be the probability that this pair of genes shared a common ancestral allele in the previous generation, given that they have a parent in common; i.e., they coalesce. From Equation 7, the net probability that a randomly chosen pair of genes coalesce in the previous generation is
![]() |
(13) |
where Qirsu is the probability that a pair of genes sampled from individuals of sexes r and s in deme i with a parent of sex u derives from the same parental individual.
Table 2 gives the values of these transmission probabilities for different modes of inheritance; these can be combined with the Q values from Table 1 and substituted into Equation 15, to obtain the final expressions for effective population sizes. Simplified expressions can be obtained when there is a fixed sex ratio among breeding individuals, so that there is a binomial distribution of the proportions of sons and daughters of a given individual, counted at maturity (see the Appendix).
For convenience, we henceforth drop the superscripts and subscripts indicating deme identities. Using Table 1 and Table 2, substituting into Equation 15, and noting that the Poisson expectation for variance in total offspring number is 1/(1 - c) for females and 1/c for males, where c is the proportion of males among breeding adults, we obtain the expression for effective population size with autosomal inheritance,
![]() |
(14) |
where ΔVf and ΔVm are the excesses over Poisson expectation of the total numbers of offspring per capita for female and male parents, respectively; F is the inbreeding coefficient associated with departure from random mating within demes.
For X-linked inheritance with male heterogamety, we have
![]() |
(15) |
For Y-linked inheritance
![]() |
(16) |
With female heterogamety, males and females are interchanged in these expressions.
For maternal inheritance
![]() |
(17) |
Partially self-fertilizing hermaphrodites:
It is also of interest to consider the case of a nuclear gene in a population of partially self-fertilizing hermaphrodites; this is particularly relevant for plants. (Maternally transmitted genes are clearly unaffected by the selfing rate, except insofar as changes in the breeding system alter the distribution of numbers of successful offspring produced through seed.) With a Poisson distribution of offspring number for both seed and pollen, the effective size is reduced below that for random mating by a factor of 1/(1 + F), where F is the equilibrium inbreeding coefficient produced by the level of self-fertilization.
Let there be N breeding individuals in the deme under consideration; the kth individual is assumed to produce dkS progeny through self-fertilized seed that survive to form part of the breeding population next generation, and dkO such progeny through outcrossed seed. In addition, it contributes pk progeny through pollen that fertilizes other individuals. The overall probability that a progeny individual is the product of self-fertilization is then
$$S = \frac{\bar{d}_S}{\bar{d}_S + \bar{d}_O} \tag{18}$$
where d̄S and d̄O are the mean numbers of successful progeny per capita produced by selfed and outcrossed seed, respectively.
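A minimal sketch of these quantities follows. The selfing probability is computed as the share of successful seed progeny produced by selfing; the relation F = S/(2 - S) for the equilibrium inbreeding coefficient is the standard result for partial selfing and is an assumption here, since it is not written out in the text. Function names and data are hypothetical.

```python
def selfing_probability(selfed, outcrossed):
    """S = mean selfed progeny / (mean selfed + mean outcrossed progeny).

    selfed[k] and outcrossed[k] are the successful seed progeny of individual k.
    """
    total_selfed = sum(selfed)
    return total_selfed / (total_selfed + sum(outcrossed))


def equilibrium_inbreeding(s):
    """Standard equilibrium inbreeding coefficient under partial selfing."""
    return s / (2.0 - s)


def ne_reduction_factor(s):
    """Factor 1/(1 + F) by which Ne is reduced with Poisson offspring numbers."""
    return 1.0 / (1.0 + equilibrium_inbreeding(s))


if __name__ == "__main__":
    selfed, outcrossed = [1, 0, 2, 1], [0, 2, 1, 1]  # illustrative counts
    s = selfing_probability(selfed, outcrossed)
    print(s, equilibrium_inbreeding(s), ne_reduction_factor(s))
```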
Details of the derivation of probabilities of gene origins are given in the Appendix. In the case of constant population size, we have
, and
. The expression for the effective population size for a single deme, NeH, is
![]() |
(19a) |
(definitions of the terms in the numerator are given in the Appendix).
If the number of successful offspring through selfed seed, conditioned on the total number of successful offspring produced through seed, can be regarded as a binomial variate with parameter S (i.e., there is the same expected selfing rate for each individual), this simplifies further to
![]() |
(19b) |
In the limit of complete self-fertilization, the contributions from outcrossing pollen vanish, and F = 1, so that
![]() |
(20) |
A model is presented in the Appendix for analyzing the case of intermediate selfing rates. The overall conclusion is that non-Poisson variation in reproductive capacity may not greatly alter the standard result that Ne for a selfing population is reduced by a factor of 1/(1 + F), relative to the value for an outcrossing population.
| THE ISLAND MODEL WITH SEX-SPECIFIC MIGRATION PARAMETERS |
|---|
In this section, we apply the above principles to the simple case of an island model.
In plant species, male gametes can disperse through pollen but female gametes cannot disperse, so that only mm and mz can be nonzero. For animals, migration of male and female gametes occurs via dispersal of males and unmated females; dispersal of zygotes occurs when females migrate after mating but before laying eggs or giving birth. All three migration parameters can therefore be nonzero.
Using the argument leading to Equation 4, we can then define the net migration rates for different modes of inheritance (autosomal, X-linked, Y-linked, and cytoplasmic) as mA, mX, mY, and mC, respectively. We have
![]() |
(21a) |
![]() |
(21b) |
![]() |
(21c) |
![]() |
(21d) |
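The net migration rates can be sketched by weighting each dispersal route by the fraction of a gene's ancestry spent in each sex. The code below is an assumption-laden illustration rather than a transcription of Equations 21a-21d: the symbol m_f for female-gamete migration is our own, the weights (1/2, 1/2 for autosomes; 2/3, 1/3 for the X) are the standard stationary values, and zygote migration is taken to move all genes equally. With m_f = 0 it reproduces the plant values quoted in the text (mz + mm/2, mz + mm/3, and mz + mm).

```python
def net_migration_rates(m_f, m_m, m_z):
    """Net gene migration rates by mode of inheritance (male heterogamety assumed)."""
    return {
        "A": m_z + 0.5 * m_f + 0.5 * m_m,                  # autosomal
        "X": m_z + (2.0 / 3.0) * m_f + (1.0 / 3.0) * m_m,  # X-linked
        "Y": m_z + m_m,                                    # Y-linked: males only
        "C": m_z + m_f,                                    # cytoplasmic: females only
    }


if __name__ == "__main__":
    # Plant-like case: pollen (male gamete) and seed (zygote) dispersal only.
    print(net_migration_rates(m_f=0.0, m_m=0.02, m_z=0.01))
```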
Coalescence times and expected nucleotide site diversities:
Results for mean pairwise coalescence times in the island model can be obtained by substituting these expressions into standard formulas.
![]() |
(22a) |
where NeG is the appropriate effective population size for a single deme.
Using Equation 8, or the argument of ![]()
![]() |
(22b) |
where mG is the migration rate defined by the relevant choice of Equations 21a-21d for mode of inheritance G.
The expected coalescence time for a pair of genes sampled randomly from the set of populations is
![]() |
(22c) |
The expected pairwise nucleotide site diversity under the infinite-sites model is given by the product of coalescence time and the mutation rate µG for the given mode of inheritance. The expected diversities for a pair of alleles sampled from the same deme, πSG, and from the population as a whole, πTG, are given, respectively, by
![]() |
(23a) |
![]() |
(23b) |
Mutation rates may differ among genetic systems because of differences in mutation rates between males and females.
The absolute magnitude of between-population subdivision can be measured by the difference between the nucleotide diversity for the population as a whole and the mean for a pair of alleles sampled from the same deme.
![]() |
(24a) |
With a large number of demes, this yields the familiar result
$$F_{ST,G} \approx \frac{1}{1 + 4N_{eG}m_G} \tag{24b}$$
We now consider the application of these formulas to some specific examples. We focus initially on the effects of population subdivision, assuming a 1:1 sex ratio, Poisson variances in fertility for both sexes, and equal mutation rates for males, females, and different chromosomes. Discrete generations and a deme size of N breeding adults for all demes are also assumed. In this case, the ratios of effective population sizes and within-deme diversities for different modes of inheritance G and G' (rGG') are simply equal to the ratios of the respective numbers of gene copies in the population: rXA = 3/4, rXY = rXC = 3, rAY = rAC = 4, the same values as for panmixia, which are often quoted in the literature on molecular population genetics.
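The ratios quoted above follow from simple counts of gene copies. The sketch below assumes N breeding adults with a 1:1 sex ratio and male heterogamety; counting one transmitted cytoplasmic genome per female is our way of expressing maternal inheritance. Function names are illustrative.

```python
from fractions import Fraction


def gene_copies(n_adults):
    """Numbers of gene copies per mode of inheritance with a 1:1 sex ratio."""
    females = males = Fraction(n_adults, 2)
    return {
        "A": 2 * (females + males),  # two copies in every individual
        "X": 2 * females + males,    # two in females, one in males
        "Y": males,                  # one per male
        "C": females,                # maternally transmitted: one per female
    }


def copy_ratio(copies, g, g_prime):
    """Ratio r_GG' of the numbers of gene copies for modes G and G'."""
    return copies[g] / copies[g_prime]


if __name__ == "__main__":
    copies = gene_copies(1000)
    for pair in ("XA", "XY", "XC", "AY", "AC"):
        print(pair, copy_ratio(copies, pair[0], pair[1]))
```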
The effects of sex-specific migration and population subdivision in dioecious plants:
In Fig 1, we plot the values of FST for nuclear and cytoplasmic genes and the ratios of the total diversities for different modes of inheritance, as functions of the total number of migrants per generation (given by the sum of pollen and seed migration rates times the deme size). Male heterogamety is assumed.
As expected, reduced migration increases FST for all modes of inheritance. However, the rate of increase of FST as the amount of migration decreases differs among modes of inheritance and also depends on the mode of migration. For instance, with equal pollen and seed migration rates, FST,Y and FST,C increase faster than FST,A and FST,X (Fig 1A). These differences result from differences in the numbers of effective migrants, due to two factors: (i) different effective population sizes and (ii) different gene migration rates. The effective number of X chromosomal migrants is lower than the effective number of autosomal migrants, because of the lower effective population size of the X chromosome (3N/4) as well as its lower migration rate (mz + mm/3 vs. mz + mm/2). The effective number of Y chromosomal migrants is lower than the effective numbers of both X chromosomal and autosomal migrants, because of the lower effective population size of the Y chromosome (N/4), which is not compensated for by its higher migration rate (mz + mm) relative to the X chromosome. A similar argument applies to cytoplasmic genes.
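The argument about effective numbers of migrants can be made concrete with a short calculation. This sketch assumes the large-number-of-demes island formula FST = 1/(1 + 4 Ne m), effective sizes of N, 3N/4, N/4, and N/4 for autosomal, X-linked, Y-linked, and cytoplasmic genes, and the plant migration rates given in the paragraph above, with cytoplasmic genes moving only through seed. The names and parameter values are illustrative assumptions.

```python
NE_FACTOR = {"A": 1.0, "X": 0.75, "Y": 0.25, "C": 0.25}  # Ne as a multiple of N


def plant_migration_rates(m_pollen, m_seed):
    """Net gene migration rates in the dioecious plant model."""
    return {
        "A": m_seed + m_pollen / 2.0,
        "X": m_seed + m_pollen / 3.0,
        "Y": m_seed + m_pollen,
        "C": m_seed,
    }


def fst_by_mode(n, m_pollen, m_seed):
    """FST = 1 / (1 + 4 * Ne * m) for each mode of inheritance."""
    rates = plant_migration_rates(m_pollen, m_seed)
    return {g: 1.0 / (1.0 + 4.0 * NE_FACTOR[g] * n * rates[g]) for g in rates}


if __name__ == "__main__":
    # Equal pollen and seed migration rates, deme size N = 100.
    print(fst_by_mode(n=100, m_pollen=0.01, m_seed=0.01))
```

With these values the product of effective size and migration rate is largest for autosomal genes and smallest for cytoplasmic genes, which gives the ordering FST,A < FST,X < FST,Y < FST,C.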
Population subdivision also modifies the ratios of the total diversities for different modes of inheritance: the ratios πTX/πTY (RXY) and πTA/πTY (RAY) decrease sharply with reduced gene flow and πTX/πTA (RXA) increases slightly (Fig 1D). From Equations 23a and 23b, it is easily seen that, in the limit when the migration rates tend toward zero, the R values are equal to the reciprocals of the ratios of the respective migration rates given by Equations 21a-21d, since the right-hand side is dominated by the term involving the reciprocal of the product of effective size and migration rate. In this case, the limiting total diversities for autosomal and X-linked genes are only 1.33- and 1.5-fold higher than for Y-linked loci, respectively, with the diversity for X-linked loci being 1.12 times that for autosomal loci. Similarly, the limiting ratio of autosomal to cytoplasmic total diversities becomes 0.67.
Differences between the pollen and seed migration rates also influence the rate of increase in FST. For instance, if migration occurs primarily through pollen (e.g., with mm = 100mz, Fig 1B), FST,A and FST,X increase faster with decreasing migration than when migration occurs equally through pollen and seeds (Fig 1A), or primarily through seeds (mm = 0.01mz, Fig 1C), whereas FST,C increases more slowly, but is higher throughout the range of Nm displayed. This is because X-linked and autosomal genes are haploid in pollen grains but diploid in seeds, and cytoplasmic genes move only through seed. In contrast, the Y chromosome, which is haploid in both dispersal units, is not affected by the relative pollen and seed migration rates. In consequence, the curves relating the diversity ratios to total migration rate are also modified. In particular, when migration occurs primarily through pollen, the increased migration rate of the Y chromosome compared to the X chromosome (mm vs. mm/3) almost completely compensates for its reduced effective population size, so that RXY is nearly independent of population subdivision (Fig 1E). In contrast, RXA is now slightly more affected by population subdivision, because the difference between the X and autosomal migration rates increases with the pollen migration rate. Hence, in the limit when the migration rates tend toward zero, RXY barely declines below 3, but RAY declines to 2, and RXA increases to 1.5.
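The limiting values quoted for the plant model can be reproduced by taking reciprocals of the ratios of net migration rates. The sketch below assumes the plant migration rates mz + mm/2 (autosomal), mz + mm/3 (X-linked), mz + mm (Y-linked), and mz (cytoplasmic), with equal mutation rates; function names are illustrative.

```python
def plant_migration_rates(m_pollen, m_seed):
    """Net gene migration rates in the dioecious plant model."""
    return {
        "A": m_seed + m_pollen / 2.0,
        "X": m_seed + m_pollen / 3.0,
        "Y": m_seed + m_pollen,
        "C": m_seed,
    }


def limiting_ratio(rates, g, g_prime):
    """Limiting diversity ratio R_GG' as migration tends to zero: m_G' / m_G."""
    return rates[g_prime] / rates[g]


if __name__ == "__main__":
    equal = plant_migration_rates(m_pollen=1.0, m_seed=1.0)
    pollen = plant_migration_rates(m_pollen=100.0, m_seed=1.0)
    for label, rates in (("equal", equal), ("mostly pollen", pollen)):
        print(label,
              round(limiting_ratio(rates, "X", "Y"), 2),
              round(limiting_ratio(rates, "A", "Y"), 2),
              round(limiting_ratio(rates, "X", "A"), 2))
```

Only the relative sizes of the pollen and seed rates matter for these limits, so the rates are given in arbitrary units.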
The effects of sex-specific migration and population subdivision in animals:
We consider a model of animal migration with up to a sixfold difference between male and female migration rates and no zygotic migration (see above), assuming male heterogamety (for female heterogamety, male and female parameters are interchanged). Compared with the plant case, predominantly male migration (Fig 2, B and E) is similar to predominantly pollen migration in the plant model. The case of equal male and female migration rates is similar to the case of equal pollen and seed migration rates, except that limited migration has a much stronger effect on FST,Y relative to FST,A and FST,X (Fig 2A). The effect is even stronger with predominantly female migration, where FST,Y increases much faster with reduced migration than FST,A or FST,X (Fig 2C). Consequently, RAY and RXY show a greater reduction when migration is restricted (Fig 2, D and F) than in the plant model (Fig 1, D and F). These differences are easily understood, since Y-linked genes now migrate through only one dispersal unit (male individuals), whereas they had two in the plant model (pollen and seeds).
The effects of sex- or chromosome-specific mutation rates:
Under the infinite-sites model, the mutation rate is so low compared with any realistic migration rate that differences in mutation rates among different modes of inheritance will not affect the relative FST values. Absolute values of diversities will, however, be affected, since the ratios for different modes of inheritance will be multiplied by the ratios of the respective mutation rates. From Equations 23a and 23b, this does not affect the shapes of the R values as functions of migration rates, but does affect their heights. For example, if the male mutation rate is higher than the female mutation rate [...]
The effects of nonrandom variation in fertility:
It is well known that an increase in the variance of fertility reduces Ne. Sex differences in the distributions of fertility modify the relative Ne values of genes with different modes of inheritance.
As far as total diversity measures are concerned, population subdivision always counteracts the effect of an increased variance in male fertility by reducing the ratios RXY and RAY, as illustrated in the animal model with large ΔVm (Fig 3). Since NeY is reduced by an increased male fertility variance, the effective number of Y migrants is much lower than the number of X or autosomal migrants. As the migration rate decreases, FST,Y now increases much faster than FST,X or FST,A, and the ratios RXY and RAY decrease toward one with equal male and female migration (Fig 3, B and E). With predominantly female migration, this effect is manifest even with relatively high migration (Fig 3F). For instance, RXY = 6.2 and RAY = 5.7 with very high migration, but are already halved with as many as 10 effective migrants. With an increased variance in female fertility, the effect of population subdivision depends on both the relative effective population sizes and the relative migration rates for different modes of inheritance. Population subdivision increases RXY (RAY) if NeX·mX < NeY·mY (NeA·mA < NeY·mY). However, this effect occurs only with very restricted migration, as illustrated in the plant model with ΔVf = 5 (Fig 4). Moreover, the maximum value of RXY (RAY) is always lower than the expectation for a panmictic population with Poisson distributions of offspring numbers.
Selection on a nonrecombining genome, such as organelle genomes and the Y chromosome, can further reduce Ne.
| COMPARISONS WITH DATA ON NATURAL POPULATIONS |
|---|
In this section, we compare some of the results derived above with data from surveys of DNA sequence variation in populations of humans and plants.
Human populations:
There is a large literature on genetic diversity in humans, and many different types of markers have been employed (including protein polymorphisms, restriction fragment length polymorphisms, microsatellites, Alu insertions, and single-nucleotide polymorphisms). These data have, however, several biases, which limit their utility for our purposes. First, worldwide population structure has rarely been investigated using markers with different modes of inheritance in the same samples.
In Table 3, we present a compilation of FST estimates from worldwide samples of different types of markers, with cytoplasmic, Y, X, and autosomal inheritance. These all provide evidence for population structure, with autosomal markers yielding the lowest mean FST estimates, as expected from the results shown in Fig 2. The estimates obtained from Y chromosome markers are, however, highly variable between studies (0.09-0.65). Microsatellites often show smaller FST values (0.09-0.23), as outlined in studies that have compared both kinds of markers.
DNA sequence data have recently been obtained for 4 Y-linked and 15 autosomal loci, using similar worldwide samples.
At present, it is hard to reach firm conclusions about the influence of sex-specific migration vs. differences in effective population sizes on the relative levels of diversity and divergence between human populations for different inheritance modes; there does not, however, seem to be strong evidence for a greatly reduced effective size of the human Y chromosome, in contrast to what is observed in Silene (see below) or in Drosophila.
Plant populations:
There are relatively few species of plants with sex chromosomes, and diversity data on nuclear genes with different modes of inheritance are available only for the close relatives Silene latifolia and S. dioica (Table 4). In S. dioica, there is only a single polymorphic site on the Y, which is too little to provide any useful information. In S. latifolia, nine polymorphic sites were found on the Y. All nuclear genes display quite strong population structure, with the lowest FST for the autosomal gene and the highest for a Y-linked gene, as expected from the above analyses. The ratio RXY over all demes (RXY = 23) is smaller than the ratio rXY for within-deme diversity (rXY = 29), as expected from the effects of subdivision (Fig 5).
It is, however, impossible to reconcile the estimates of FST for all three modes of inheritance under a simple island model of population subdivision: given the estimates of FST,A and FST,X, FST,Y is expected to be much higher (and RXY to be much lower) than is observed (Fig 5). Deviations from the model assumed here (e.g., the occurrence of a selective sweep on the Y) might account for these discrepancies, if they are not simply the result of sampling error arising from the small number of informative sites on the Y.
| DISCUSSION |
|---|
In the first part of this article, we have shown that the use of the fast-timescale approximation [...]
By use of a suitable definition of generation time, we can also define the effective size of a population as the reciprocal of twice the product of generation time and the probability of coalescence per time interval (Equation 7). This expression can be used to generate explicit formulas for effective population size with discrete generations and separate sexes, under a variety of different modes of inheritance, and with arbitrary distributions of offspring numbers (Equations 14-17). The same approach can be used for the case of a nuclear gene in a population of partially self-fertilizing hermaphrodites (Equations 19a, 19b, and 20). An important conclusion in this case is that the standard formula for Ne for selfing populations may not be greatly altered by non-Poisson variation in reproductive capacity.
The approach can also be applied to the standard model of an age-structured population with discrete time intervals, which has recently been revisited.
We also show how to simplify the analysis of the effects of population subdivision of a demographically structured population by defining a migration matrix that describes the net rates of movement of genes between different local populations. This involves weighting the migration probabilities of individuals of a given age-sex class by the contribution of this class to the leading left eigenvector of the matrix describing movements of genes between age and sex classes (Equation 4). This enables the determination of the moment-generating functions for the distributions of coalescent times for pairs of genes sampled from a given pair of populations, under any well-defined migration model (Equation 5), under the standard assumption that migration and drift are both weak evolutionary forces. From these, the expected coalescent times and higher moments can easily be found.
Under the infinite-sites model, the expected number of nucleotide differences between a pair of alleles sampled from a prescribed pair of populations is equal to the product of the mutation rate and the corresponding expected coalescent time.
An important general conclusion is that population subdivision makes it very hard to describe the expected level of genetic variability in a population by a single formula such as θ = 4Neµ. While it is possible to define a simple expression for a weighted mean coalescence time for a pair of alleles sampled from the same deme for a general migration model (Equation 9), this involves both the effective population sizes of all the individual demes and their contributions to the leading left eigenvector of the migration matrix defined by Equation 4. In general, these are unknowable quantities, making it very hard to equate any empirical estimate of the mean within-population nucleotide site diversity to a simple scaled mutation rate parameter. The details of the demography and migration parameters of a species may greatly influence the estimated scaled mutation rate based on unweighted mean within-population nucleotide site diversities, making comparisons between different species difficult to interpret.
The situation is even worse for the nucleotide diversity for a pair of alleles sampled from the population at large, since this is related to the within-deme value by 1/(1 - FST). In general, FST depends in a complex way on migration rates and deme sizes, and simple formulas are available for only a few limiting cases, such as the island model. Only when there is negligible genetic differentiation between local populations can one confidently relate mean nucleotide site diversity to the coalescent time for a randomly sampled pair of alleles and hence to a scaled mutation rate parameter. Many empirical investigations of DNA sequence variation in natural populations do not state explicitly what population parameters are of primary interest and often make little distinction between measures of variation based on whole-population and within-population estimates. More attention to these issues in presenting analyses of data on DNA sequence variability is desirable.
Our investigation of the island model also shows that strong population subdivision, coupled with sex-specific migration rates, can greatly affect the relative values of the expected total genetic diversities for different modes of inheritance and may even reverse some of the patterns expected under panmixia. For example, Fig 3 displays the expected patterns of genetic variability for animal populations with a high variance in male reproductive success. From Fig 3C, it may be seen that predominantly female migration can cause the ratios of autosomal or X-linked variability to Y-linked variability to decline below one with extreme subdivision, in contrast to a value of over four with panmixia. Encouragingly, however, the relative values of expected autosomal and X-linked total population diversities are insensitive to all but the most extreme population subdivision, for both plant and animal models (Figs 1-4). This suggests that the use of ratios of X-linked to autosomal diversities to make inferences about the strength of sexual selection [...]
We have confined ourselves to deriving expressions for expected coalescence times and nucleotide site differences between alleles. However, expressions for the distribution of coalescent times for a set of n alleles sampled from a specified set of populations can easily be written down, using the migration probabilities defined by Equation 4 and coalescence probabilities defined by Equation 7 to generate the expectations of competing exponential distributions of waiting times to migration or coalescence events.
| ACKNOWLEDGMENTS |
|---|
We thank N. Barton, D. Charlesworth, F. Depaulis, and two anonymous reviewers for their comments on the manuscript. B.C. acknowledges support by the Royal Society and the Engineering and Physics Research Council, and V.L. acknowledges support by the Biotechnology and Biological Sciences Research Council.
Manuscript received April 2, 2002; Accepted for publication June 18, 2002.
| APPENDIX |
|---|