Effective Population Size and Population Subdivision in Demographically Structured Populations
Valérie Laporte and Brian Charlesworth
Institute of Cell, Animal and Population Biology, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom
Corresponding author: Valérie Laporte, Animal and Population Biology, University of Edinburgh, W. Mains Rd., Edinburgh EH9 3JT, UK. E-mail: valerie.laporte{at}ed.ac.uk
Communicating editor: G. B. GOLDING
| ABSTRACT |
|---|
A fast-timescale approximation is applied to the coalescent process in a single population, which is demographically structured by sex and/or age. This provides a general expression for the probability that a pair of alleles sampled from the population coalesce in the previous time interval. The effective population size is defined as the reciprocal of twice the product of generation time and the coalescence probability. Biologically explicit formulas for effective population size with discrete generations and separate sexes are derived for a variety of different modes of inheritance. The method is also applied to a nuclear gene in a population of partially self-fertilizing hermaphrodites. The effects of population subdivision on a demographically structured population are analyzed, using a matrix of net rates of movement of genes between different local populations. This involves weighting the migration probabilities of individuals of a given age/sex class by the contribution of this class to the leading left eigenvector of the matrix describing the movements of genes between age/sex classes. The effects of sex-specific migration and nonrandom distributions of offspring number on levels of genetic variability and among-population differentiation are described for different modes of inheritance in an island model. Data on DNA sequence variability in human and plant populations are discussed in the light of the results.
IN an ideal (Wright-Fisher) population of N breeding adult diploid individuals, the rate of genetic drift is equal to 1/(2N). […] θ = 4Neµ for the scaled mutation rate, where µ is the neutral mutation rate per nucleotide site.
Population subdivision is a major factor that can influence Ne and has attracted much attention.
The total effective size of a subdivided population can be expressed as a function of the mean within-deme effective population size and of FST.
In particular, the inbreeding effective size of a subdivided population determines the asymptotic rate of increase of the probability of identity of a pair of alleles sampled randomly from the population at large. The variance effective size determines the variance of the change in allele frequency across a generation for genes drawn from the population as a whole. With constant population sizes across generations, these two parameters converge to the same value.
These measures of effective population size were originally derived from recursion equations for identity probabilities of pairs of alleles or variances in allele frequencies.
There is, however, increasing interest in comparing the patterns of genetic variation and population differentiation for genes with different modes of inheritance.
Here we first present a general framework for studying the pairwise coalescent times and identity probabilities of genes with different modes of inheritance in a metapopulation consisting of demes connected by an arbitrary migration scheme. Individuals within each deme are divided into different age and sex classes, with different demographic properties and migration probabilities. By using the "fast-timescale" approximation […]
| A GENERAL MODEL OF MIGRATION AND DRIFT IN STRUCTURED POPULATIONS |
|---|
We assume that individuals within a local population (deme) can be classified into different classes, e.g., with respect to age and/or sex. Let fijrs be the probability of identity of a pair of genes sampled from an individual of class r in deme i and an individual of class s in deme j. We can write mijru for the probability that a gene sampled from an individual of class r in deme i originated from an individual of class u in deme j.
When the probability per time interval of mutation of an allele to a new state, µ, is independent of allelic state, class, and deme, fijrs is equivalent to the moment-generating function for the time to coalescence of a pair of genes of the specified type, with parameter -2µ.
Assuming that different individuals migrate independently of each other, and using the standard approach to modeling a geographically structured population, […]
![]() |
(1) |
where Pijkrsuv is the probability of coalescence over one time interval of two genes sampled from individuals of class r in deme i and s in deme j, derived from individuals from deme k who belong to classes u and v, respectively (note that Pijkrsuv = 0 unless u = v).
For i = j and r = s, and if migration involves individuals rather than gametes, we should strictly include a term on the right-hand side that covers the case when the two genes sampled come from the same migrant individual. The approximations leading to Equation 3 mean that this is unnecessary, since ignoring it introduces only a second-order term into the final result. (This is because the chance that a pair of genes from the same deme are sampled from the same individual is of the order of the reciprocal of the deme size, and we ignore products of this and the migration probabilities.)
Writing hijrs = 1 - fijrs for the probability of nonidentity of a pair of alleles, Equation 1 can conveniently be rewritten as
![]() |
(2) |
To make further progress, we assume that migration rates between demes are of the same order as the probabilities that two genes sampled from the same deme coalesce over one time unit and that both are sufficiently small that second-order terms in these, as well as mutation rates, can be neglected. This assumption is commonly made in applications of coalescent theory.
To this order of approximation, the last term in braces on the right-hand side of (2) is nonzero only when i = j = k, corresponding to the case of two genes drawn from the same deme,
![]() |
(3) |
where δij is the Kronecker delta (δij = 1 when i = j and is otherwise 0); Pirsu is the probability that two genes sampled from deme i from individuals belonging to classes r and s coalesce in an individual of class u in the previous time interval, treating the deme as a closed population.
To obtain manageable results, we simplify further by removing the dependence of hijrs on r and s. We can average hijrs values over classes within demes by using a weight of αir for the contribution from class r in the ith deme. αir is chosen to be equal to the rth element of the left leading eigenvector of the matrix that describes the flow of genes among classes within deme i, scaled such that the elements of this eigenvector sum to one. This element represents the stationary-state probability that a randomly sampled gene from deme i originates from class r. […] αir gives an accurate approximation to the probability of origin of the sampled gene.
We can then conveniently define the net probability of origin from deme k of a gene sampled from deme i, mik, as
$$m_{ik} = \sum_{r}\sum_{u} \alpha_{ir}\, m_{ikru} \qquad (4)$$
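The weighting step can be illustrated with a minimal sketch for an X-linked gene with male heterogamety. The gene-flow matrix, the first-order per-class migration probabilities, the rate names `m_f`, `m_m`, and `m_z` (taken here to mean female-gamete, male-gamete, and zygote migration), and all numerical values are assumptions chosen for illustration; they are not taken from the article.

```python
def left_leading_eigenvector(matrix, iterations=200):
    """Power iteration for the left leading eigenvector, scaled to sum to one."""
    n = len(matrix)
    v = [1.0 / n] * n
    for _ in range(iterations):
        # Left multiplication: new_v[u] = sum over r of v[r] * matrix[r][u]
        v = [sum(v[r] * matrix[r][u] for r in range(n)) for u in range(n)]
        total = sum(v)
        v = [x / total for x in v]
    return v


# Hypothetical gene-flow matrix for an X-linked gene with male heterogamety.
# Rows: class of the sampled gene (female, male); columns: class of origin.
x_linked_flow = [
    [0.5, 0.5],  # a gene in a female is maternal or paternal with equal probability
    [1.0, 0.0],  # a gene in a male is always maternal
]
weights = left_leading_eigenvector(x_linked_flow)  # class weights (female, male)


def net_migration(class_weights, class_migration):
    """Weight each class's probability of origin from another deme by its class weight."""
    return sum(w * m for w, m in zip(class_weights, class_migration))


m_f, m_m, m_z = 0.002, 0.01, 0.005  # illustrative rates, not from the article
per_class = [
    0.5 * (m_f + m_z) + 0.5 * (m_m + m_z),  # gene sampled from a female
    m_f + m_z,                              # gene sampled from a male
]
m_x = net_migration(weights, per_class)
```

Under these assumptions the weights converge to 2/3 for females and 1/3 for males, and the net rate reduces to m_z + (2 m_f + m_m)/3, with products of migration rates ignored.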
Substituting into Equation 3 and summing αir αjs hijrs over r and s, we obtain
![]() |
(5) |
where Pi is the probability of coalescence of a pair of genes sampled randomly from deme i, such that
![]() |
(6) |
Using the argument of […],
$$P_i = \frac{1}{2\, t_i\, N_{ie}} \qquad (7)$$
where ti is the generation time of individuals from deme i and Nie is the deme's effective population size. In the case of discrete generations, ti = 1 for all demes.
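As a quick numerical check of this definition, the following sketch computes the effective size as the reciprocal of twice the product of generation time and coalescence probability. The function name and the numbers are illustrative assumptions.

```python
def effective_size(coalescence_prob, generation_time=1.0):
    """N_e = 1 / (2 * generation time * pairwise coalescence probability)."""
    if coalescence_prob <= 0 or generation_time <= 0:
        raise ValueError("both arguments must be positive")
    return 1.0 / (2.0 * generation_time * coalescence_prob)


# Wright-Fisher check: N diploid adults, pairwise coalescence probability 1/(2N),
# discrete generations (generation time of one time interval).
N = 500
ne_discrete = effective_size(1.0 / (2 * N))

# Same per-interval coalescence probability, generation time of two intervals.
ne_two_intervals = effective_size(1.0 / (2 * N), generation_time=2.0)
```

With discrete generations the ideal population recovers N_e = N; doubling the generation time at a fixed per-interval coalescence probability halves N_e.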
By differentiating hij with respect to -2µ, we obtain the following expression for the mean coalescent time for a pair of alleles sampled from demes i and j:
![]() |
(8) |
This provides a general set of linear relations that can be used to solve for the Tij, to the assumed order of approximation, for any specific model. Some examples are described below. Higher moments of the distribution of pairwise coalescence times can similarly be obtained by successive differentiation of Equation 5.
It is also useful to apply the method of […], whose elements can be assumed to sum to one without loss of generality. Using the argument of […],
![]() |
(9) |
where the coalescent time for each deme is weighted by the product of the reciprocal of the coalescence probability and the square of the corresponding component of the leading left eigenvector of the migration matrix. In the discrete generation case, T0 corresponds to twice the "migration effective size."
| COALESCENT PROBABILITIES AND EFFECTIVE POPULATION SIZES |
|---|
Separate sexes with discrete generations:
The utility of the approach described above to determining the effective population size of a deme can be illustrated by its application to the discrete-generation case with separate sexes, which has been treated previously.
From the assumption that the products of coalescence probabilities and migration probabilities are negligible, used to derive (3) and its successors, it is legitimate to treat each deme as a closed population. We have to consider the origins of genes sampled from individuals of all permissible combinations of sexes and parents (which combination is permissible depends on the mode of genetic transmission). For a pair of maternally derived genes sampled from two different females, the probability that they both come from the same mother is approximately
![]() |
(10) |
(![]()
i is the mean number of daughters per female in deme i.
This can be simplified by noting that the Poisson value for the variance in numbers of daughters per female, Viff, is equal to the mean number of daughters per female, so that in general the deviation of this variance from the Poisson value can be written as
![]() |
(11) |
so that
![]() |
(12) |
Similar expressions can be obtained for other pairs of individuals and their parents (see Table 1).
Let ßirsu be the probability that a pair of genes sampled from individuals of sexes r and s in deme i both come from parents of sex u. Let
irsu be the probability that this pair of genes shared a common ancestral allele in the previous generation, given that they have a parent in common; i.e., they coalesce. From Equation 7, the net probability that a randomly chosen pair of genes coalesce in the previous generation is
![]() |
(13) |
where Qirsu is the probability that a pair of genes sampled from individuals of sexes r and s in deme i with a parent of sex u derives from the same parental individual.
Table 2 gives the values of these transmission probabilities for different modes of inheritance; these can be combined with the Q values from Table 1 and substituted into Equation 13, to obtain the final expressions for effective population sizes. Simplified expressions can be obtained when there is a fixed sex ratio among breeding individuals, so that there is a binomial distribution of the proportions of sons and daughters of a given individual, counted at maturity (see the Appendix).
For convenience, we henceforth drop the superscripts and subscripts indicating deme identities. Using Table 1 and Table 2, substituting into Equation 13, and noting that the Poisson expectation for variance in total offspring number is 1/(1 - c) for females and 1/c for males, where c is the proportion of males among breeding adults, we obtain the expression for effective population size with autosomal inheritance,
![]() |
(14) |
where ΔVf and ΔVm are the excesses over Poisson expectation of the variances in total numbers of offspring per capita for female and male parents, respectively; F is the inbreeding coefficient associated with departure from random mating within demes.
For X-linked inheritance with male heterogamety, we have
![]() |
(15) |
For Y-linked inheritance
![]() |
(16) |
With female heterogamety, males and females are interchanged in these expressions.
For maternal inheritance
![]() |
(17) |
Partially self-fertilizing hermaphrodites:
It is also of interest to consider the case of a nuclear gene in a population of partially self-fertilizing hermaphrodites; this is particularly relevant for plants. (Maternally transmitted genes are clearly unaffected by the selfing rate, except as far as changes in the breeding system alter the distribution of numbers of successful offspring produced through seed.) With a Poisson distribution of offspring number for both seed and pollen, the effective size is reduced below that for random mating by a factor of 1/(1 + F), where F is the equilibrium inbreeding coefficient produced by the level of self-fertilization.
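The Poisson-case reduction factor can be sketched numerically. The relation F = S/(2 - S) between the selfing rate S and the equilibrium inbreeding coefficient is the standard result for partial self-fertilization and is assumed here rather than taken from the article; the function names are illustrative.

```python
def equilibrium_inbreeding(selfing_rate):
    """Equilibrium inbreeding coefficient under partial selfing (standard result)."""
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing rate must lie in [0, 1]")
    return selfing_rate / (2.0 - selfing_rate)


def selfing_effective_size(n_individuals, selfing_rate):
    """Poisson-case effective size: N reduced by the factor 1 / (1 + F)."""
    f = equilibrium_inbreeding(selfing_rate)
    return n_individuals / (1.0 + f)


ne_outcrossing = selfing_effective_size(1000, 0.0)  # F = 0
ne_half_selfing = selfing_effective_size(1000, 0.5)  # F = 1/3
ne_full_selfing = selfing_effective_size(1000, 1.0)  # F = 1
```

Complete selfing halves the effective size relative to random mating, and a selfing rate of one-half reduces it by a quarter.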
Let there be N breeding individuals in the deme under consideration; the kth individual is assumed to produce dkS progeny through self-fertilized seed that survive to form part of the breeding population next generation, and dkO such progeny through outcrossed seed. In addition, it contributes pk progeny through pollen that fertilizes other individuals. The overall probability that a progeny individual is the product of self-fertilization is then
![]() |
(18) |
where d̄S and d̄O are the mean numbers of successful progeny per capita produced by selfed and outcrossed seed, respectively.
Details of the derivation of probabilities of gene origins are given in the Appendix. In the case of constant population size, we have […] and […]. The expression for the effective population size for a single deme, NeH, is
![]() |
(19a) |
(definitions of the terms in the numerator are given in the Appendix).
If the number of successful offspring through selfed seed, conditioned on the total number of successful offspring produced through seed, can be regarded as a binomial variate with parameter S (i.e., there is the same expected selfing rate for each individual), this simplifies further to
![]() |
(19b) |
In the limit of complete self-fertilization, the contributions from outcrossing pollen vanish, and F = 1, so that
![]() |
(20) |
A model is presented in the Appendix for analyzing the case of intermediate selfing rates. The overall conclusion is that non-Poisson variation in reproductive capacity may not greatly alter the standard result that Ne for a selfing population is reduced by a factor of 1/(1 + F), relative to the value for an outcrossing population.
| THE ISLAND MODEL WITH SEX-SPECIFIC MIGRATION PARAMETERS |
|---|
In this section, we apply the above principles to the simple case of an island model.
In plant species, male gametes can disperse through pollen but female gametes cannot disperse, so that only mm and mz can be nonzero. For animals, migration of male and female gametes occurs via dispersal of males and unmated females; dispersal of zygotes occurs when females migrate after mating but before laying eggs or giving birth. All three migration parameters can therefore be nonzero.
Using the argument leading to Equation 4, we can then define the net migration rates for different modes of inheritance (autosomal, X-linked, Y-linked, and cytoplasmic) as mA, mX, mY, and mC, respectively. We have
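$$m_A = m_z + \tfrac{1}{2}\,(m_f + m_m) \qquad (21a)$$
$$m_X = m_z + \tfrac{1}{3}\,(2 m_f + m_m) \qquad (21b)$$
$$m_Y = m_z + m_m \qquad (21c)$$
$$m_C = m_z + m_f \qquad (21d)$$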
Coalescence times and expected nucleotide site diversities:
Results for mean pairwise coalescence times in the island model can be obtained by substituting these expressions into standard formulas.
![]() |
(22a) |
where NeG is the appropriate effective population size for a single deme.
Using Equation 8, or the argument of […],
![]() |
(22b) |
where mG is the migration rate defined by the relevant choice of Equations 21a-21d for mode of inheritance G.
The expected coalescence time for a pair of genes sampled randomly from the set of populations is
![]() |
(22c) |
The expected pairwise nucleotide site diversity under the infinite-sites model is given by the product of coalescence time and the mutation rate µG for the given mode of inheritance. The diversities for a pair of alleles sampled from the same deme, πSG, and from the population as a whole, πTG, are given, respectively, by
![]() |
(23a) |
![]() |
(23b) |
Mutation rates may differ among genetic systems because of differences in mutation rates between males and females. […]
The absolute magnitude of between-population subdivision can be measured by the difference between the nucleotide diversity for the population as a whole and the mean for a pair of alleles sampled from the same deme.
![]() |
(24a) |
With a large number of demes, this yields the familiar result
![]() |
(24b) |
We now consider the application of these formulas to some specific examples. We focus initially on the effects of population subdivision, assuming a 1:1 sex ratio, Poisson variances in fertility for both sexes, and equal mutation rates for males, females, and different chromosomes. Discrete generations and a deme size of N breeding adults for all demes are also assumed. In this case, the ratios of effective population sizes and within-deme diversities for different modes of inheritance G and G' (rGG') are simply equal to the ratios of the respective numbers of gene copies in the population: rXA = 3/4, rXY = rXC = 3, rAY = rAC = 4, the same values as for panmixia, which are often quoted in the literature on molecular population genetics.
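The quoted ratios can be checked with a short sketch based on the classical Poisson-case, random-mating formulas for effective size with separate sexes. These are standard textbook expressions used here as an assumption; they are not the article's general Equations 14 to 17, and the function name is illustrative.

```python
def poisson_effective_sizes(n_females, n_males):
    """Classical effective sizes with Poisson offspring numbers and random mating,
    on the scale where the pairwise coalescence probability is 1 / (2 * Ne)."""
    nf, nm = float(n_females), float(n_males)
    return {
        "A": 4 * nf * nm / (nf + nm),          # autosomal
        "X": 9 * nf * nm / (4 * nm + 2 * nf),  # X-linked, male heterogamety
        "Y": nm / 2,                           # Y-linked
        "C": nf / 2,                           # maternally inherited
    }


ne = poisson_effective_sizes(50, 50)  # N = 100 breeding adults, 1:1 sex ratio
ratios = {
    "XA": ne["X"] / ne["A"],
    "XY": ne["X"] / ne["Y"],
    "XC": ne["X"] / ne["C"],
    "AY": ne["A"] / ne["Y"],
    "AC": ne["A"] / ne["C"],
}
```

With a 1:1 sex ratio this gives effective sizes of N, 3N/4, N/4, and N/4 for autosomal, X-linked, Y-linked, and cytoplasmic genes, and hence the ratios 3/4, 3, and 4.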
The effects of sex-specific migration and population subdivision in dioecious plants:
In Fig 1, we plot the values of FST for nuclear and cytoplasmic genes and the ratios of the total diversities for different modes of inheritance, as functions of the total number of migrants per generation (given by the sum of pollen and seed migration rates times the deme size). Male heterogamety is assumed.
As expected, reduced migration increases FST for all modes of inheritance. However, the rate of increase of FST as the amount of migration decreases is different for different modes of inheritance and also depends on the mode of migration. For instance, with equal pollen and seed migration rates, FST,Y and FST,C increase faster than FST,A and FST,X (Fig 1A). These differences result from differences in the numbers of effective migrants, due to two factors: (i) different effective population sizes and (ii) different gene migration rates. The effective number of X chromosomal migrants is lower than the effective number of autosomal migrants, due to the lower effective population size of the X chromosome (3N/4) as well as its lower migration rate (mz + mm/3 vs. mz + mm/2). The effective number of Y chromosomal migrants is lower than the effective numbers of both X chromosomal and autosomal migrants, due to the lower effective population size of the Y chromosome (N/4), which is not compensated for by its higher migration rate (mz + mm) relative to the X chromosome. A similar argument applies to cytoplasmic genes.
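The argument about effective numbers of migrants can be made concrete with a small sketch. It assumes the familiar large-number-of-demes island-model approximation FST = 1/(1 + 4 Ne m), applied with the effective size and net migration rate of each mode of inheritance; the deme size and migration rates are illustrative values only.

```python
def island_fst(ne, m):
    """Island-model approximation for many demes: F_ST = 1 / (1 + 4 * Ne * m)."""
    return 1.0 / (1.0 + 4.0 * ne * m)


N = 100           # breeding adults per deme, 1:1 sex ratio (illustrative)
m_z = m_m = 0.01  # equal seed and pollen migration rates (illustrative)

# Effective sizes under Poisson offspring numbers, as quoted in the text.
ne = {"A": N, "X": 3 * N / 4, "Y": N / 4, "C": N / 4}
# Net gene migration rates in the plant model (no female-gamete dispersal).
m = {"A": m_z + m_m / 2, "X": m_z + m_m / 3, "Y": m_z + m_m, "C": m_z}

effective_migrants = {g: ne[g] * m[g] for g in ne}
fst = {g: island_fst(ne[g], m[g]) for g in ne}
```

For these values the effective numbers of migrants are 1.5, 1.0, 0.5, and 0.25 for autosomal, X-linked, Y-linked, and cytoplasmic genes, so FST rises in the same order, from about 0.14 to 0.5.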
Population subdivision also modifies the ratios of the total diversities for different modes of inheritance: the ratios πTX/πTY (RXY) and πTA/πTY (RAY) decrease sharply with reduced gene flow and πTX/πTA (RXA) increases slightly (Fig 1D). From Equations 23a and 23b, it is easily seen that, in the limit when the migration rates tend toward zero, the R values are equal to the reciprocals of the ratios of the respective migration rates given by Equations 21a-21d, since the right-hand side is dominated by the term involving the reciprocal of the product of effective size and migration rate. In this case, the limiting total diversities for autosomal and X-linked genes are only 1.33- and 1.5-fold higher than for Y-linked loci, respectively, with the diversity for X-linked loci being 1.12 times that for autosomal loci. Similarly, the limiting ratio of autosomal to cytoplasmic total diversities becomes 0.67.
Differences between the pollen and seed migration rates also influence the rate of increase in FST. For instance, if migration occurs primarily through pollen (e.g., with mm = 100mz, Fig 1B), FST,A and FST,X increase faster with decreasing migration than when migration occurs equally through pollen and seeds (Fig 1A), or primarily through seeds (mm = 0.01mz, Fig 1C), whereas FST,C increases more slowly, but is higher throughout the range of Nm displayed. This is because X-linked and autosomal genes are haploid in pollen grains but diploid in seeds, and cytoplasmic genes move only through seed. In contrast, the Y chromosome, which is haploid in both dispersal units, is not affected by the relative pollen and seed migration rates. In consequence, the curves relating the diversity ratios to total migration rate are also modified. In particular, when migration occurs primarily through pollen, the increased migration rate of the Y chromosome compared to the X chromosome (mm vs. mm/3) compensates nearly totally for its reduced effective population size, so that RXY is almost independent of population subdivision (Fig 1E). In contrast, RXA is now slightly more affected by population subdivision, because the difference between the X and autosomal migration rates increases with the pollen migration rate. Hence, in the limit when the migration rates tend toward zero, RXY barely declines below 3, but RAY declines to 2, and RXA increases to 1.5.
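The limiting ratios quoted for the plant model can be reproduced with a brief sketch. It assumes equal mutation rates across modes of inheritance and uses the low-migration limit in which each R value is the reciprocal of the ratio of the corresponding net migration rates; rates are in arbitrary relative units and the function names are illustrative.

```python
def plant_rates(m_m, m_z):
    """Net gene migration rates in the plant model (pollen m_m, seed m_z)."""
    return {"A": m_z + m_m / 2, "X": m_z + m_m / 3, "Y": m_z + m_m, "C": m_z}


def limiting_ratio(rates, g, g_prime):
    """Low-migration limit of total diversity for mode g relative to mode g_prime."""
    return rates[g_prime] / rates[g]


equal = plant_rates(m_m=1.0, m_z=1.0)     # equal pollen and seed migration
pollen = plant_rates(m_m=100.0, m_z=1.0)  # migration mainly through pollen

r_equal = {
    "AY": limiting_ratio(equal, "A", "Y"),
    "XY": limiting_ratio(equal, "X", "Y"),
    "XA": limiting_ratio(equal, "X", "A"),
    "AC": limiting_ratio(equal, "A", "C"),
}
r_pollen = {
    "XY": limiting_ratio(pollen, "X", "Y"),
    "AY": limiting_ratio(pollen, "A", "Y"),
    "XA": limiting_ratio(pollen, "X", "A"),
}
```

With equal pollen and seed migration the limits are 4/3, 3/2, 9/8, and 2/3 for RAY, RXY, RXA, and RAC; with a 100-fold excess of pollen migration they approach 2, 3, and 1.5 for RAY, RXY, and RXA.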
The effects of sex-specific migration and population subdivision in animals:
We consider a model of animal migration with up to a sixfold difference between male and female migration rates and no zygotic migration (see above), assuming male heterogamety (for female heterogamety, male and female parameters are interchanged). As compared with the plant case, we can see that the case of predominantly male migration (Fig 2B and 2E) is similar to the case of predominantly pollen migration in the plant model. The case of equal male and female migration rates is similar to the case of equal pollen and seed migration rates, except that limited migration has a much stronger effect on FST,Y relative to FST,A and FST,X (Fig 2A). The effect is even stronger with predominantly female migration, where FST,Y increases much faster with reduced migration than FST,A or FST,X (Fig 2C). Consequently RAY and RXY show a greater reduction when migration is restricted (Fig 2D and 2F) than in the plant model (Fig 1D and 1F). These differences are easily understood, since Y-linked genes now experience migration through only one dispersal unit (male individuals), whereas they had two in the plant model (pollen and seeds).
The effects of sex- or chromosome-specific mutation rates:
Under the infinite-sites model, the mutation rate is so low compared with any realistic migration rate that differences in mutation rates among different modes of inheritance will not affect the relative FST values. Absolute values of diversities will, however, be affected, since the ratios for different modes of inheritance will be multiplied by the ratios of the respective mutation rates. From Equations 23a and 23b, this does not affect the shapes of the R values as functions of migration rates, but does affect their heights. For example, if the male mutation rate is higher than the female mutation rate […]
The effects of nonrandom variation in fertility:
It is well known that an increase in the variance of fertility reduces Ne. Sex differences in the distributions of fertility modify the relative Ne values of genes with different modes of inheritance.
As far as total diversity measures are concerned, population subdivision always counteracts the effect of an increased variance in male fertility by reducing the ratios RXY and RAY, as illustrated in the animal model with large ΔVm (Fig 3). Since NeY is reduced by an increased male fertility variance, the effective number of Y migrants is much lower than the number of X or autosomal migrants. As the migration rate decreases, FST,Y now increases much faster than FST,X or FST,A, and the ratios RXY and RAY decrease toward one with equal male and female migration (Fig 3B and 3E). With predominantly female migration, this effect is manifest even with relatively high migration (Fig 3F). For instance, RXY = 6.2 and RAY = 5.7 with very high migration, but are already halved with as many as 10 effective migrants. With an increased variance in female fertility, the effect of population subdivision depends on both the relative effective population sizes and the relative migration rates for different modes of inheritance. Population subdivision increases RXY (RAY) if NeX·mX < NeY·mY (NeA·mA < NeY·mY). However, this effect occurs only with very restricted migration, as illustrated in the plant model with ΔVf = 5 (Fig 4). Moreover, the maximum value of RXY (RAY) is always lower than the expectation for a panmictic population with Poisson distributions of offspring numbers.
Selection on a nonrecombining genome, such as organelle genomes and the Y chromosome, can further reduce Ne.
| COMPARISONS WITH DATA ON NATURAL POPULATIONS |
|---|
In this section, we compare some of the results derived above with data from surveys of DNA sequence variation in populations of humans and plants.
Human populations:
There is a large literature on genetic diversity in humans, and many different types of markers have been employed (including protein polymorphisms, restriction fragment length polymorphisms, microsatellites, Alu insertions, and single-nucleotide polymorphisms). These data have, however, several biases, which limit their utility for our purposes. First, worldwide population structure has rarely been investigated using markers with different modes of inheritance in the same samples. […]
In Table 3, we present a compilation of FST estimates from worldwide samples of different types of markers, with cytoplasmic, Y, X, and autosomal inheritance. These all provide evidence for population structure, with autosomal markers yielding the lowest mean FST estimates, as expected from the results shown in Fig 2. The estimates obtained from Y chromosome markers are, however, highly variable between studies (0.09-0.65). Microsatellites often show smaller FST values (0.09-0.23), as outlined in studies that have compared both kinds of markers.
DNA sequence data have recently been obtained for 4 Y-linked and 15 autosomal loci, using similar worldwide samples.
At present, it is hard to reach firm conclusions about the influence of sex-specific migration vs. differences in effective population sizes on the relative levels of diversity and divergence between human populations for different inheritance modes; there does not, however, seem to be strong evidence for a greatly reduced effective size of the human Y chromosome, in contrast to what is observed in Silene (see below) or in Drosophila.
Plant populations:
There are relatively few species of plants with sex chromosomes, and diversity data on nuclear genes with different modes of inheritance are available only for the close relatives S. latifolia and S. dioica (Table 4). In S. dioica, there is only a single polymorphic site on the Y, which is too little to provide any useful information. In S. latifolia, nine polymorphic sites were found on the Y. All nuclear genes display quite strong population structure, with the lowest FST for the autosomal gene and the highest for a Y-linked gene, as expected from the above analyses. The ratio RXY over all demes (RXY = 23) is smaller than the ratio rXY for within-deme diversity (rXY = 29), as expected from the effects of subdivision (Fig 5).
It is, however, impossible to reconcile the estimates of FST for all three modes of heredity under a simple island model of population subdivision: given the estimates of FST,A and FST,X, FST,Y is expected to be much higher (and RXY to be much lower) than is observed (Fig 5). Deviations from the model assumed here (e.g., the occurrence of a selective sweep on the Y) might account for these discrepancies, if they are not simply due to sampling error due to the small number of informative sites on the Y.
| DISCUSSION |
|---|
In the first part of this article, we have shown that the use of the fast-timescale approximation […]
By use of a suitable definition of generation time, we can also define the effective size of a population as the reciprocal of twice the product of generation time and the probability of coalescence per time interval (Equation 7). This expression can be used to generate explicit formulas for effective population size with discrete generations and separate sexes, under a variety of different modes of inheritance, and with arbitrary distributions of offspring numbers (Equations 14-17). The same approach can be used for the case of a nuclear gene in a population of partially self-fertilizing hermaphrodites (Equations 19a, 19b, and 20). An important conclusion in this case is that the standard formula for Ne for selfing populations […]
The approach can also be applied to the standard model of an age-structured population with discrete time intervals, recently revisited by […]
We also show how to simplify the analysis of the effects of population subdivision of a demographically structured population by defining a migration matrix that describes the net rates of movement of genes between different local populations. This involves weighting the migration probabilities of individuals of a given age-sex class by the contribution of this class to the leading left eigenvector of the matrix describing movements of genes between age and sex classes (Equation 4). This enables the determination of the moment-generating functions for the distributions of coalescent times for pairs of genes sampled from a given pair of populations, under any well-defined migration model (Equation 5), under the standard assumption that migration and drift are both weak evolutionary forces. From these, the expected coalescent times and higher moments can easily be found.
Under the infinite-sites model, the expected number of nucleotide differences between a pair of alleles sampled from a prescribed pair of populations is equal to the product of the mutation rate and the corresponding expected coalescent time.
An important general conclusion is that population subdivision makes it very hard to describe the expected level of genetic variability in a population by a single formula such as θ = 4Neµ. While it is possible to define a simple expression for a weighted mean coalescence time for a pair of alleles sampled from the same deme for a general migration model (Equation 9), this involves both the effective population sizes of all the individual demes and their contributions to the leading left eigenvector of the migration matrix defined by Equation 4. In general, these are unknowable quantities, making it very hard to equate any empirical estimate of the mean within-population nucleotide site diversity to a simple scaled mutation rate parameter. The details of the demography and migration parameters of a species may greatly influence the estimated scaled mutation rate based on unweighted mean within-population nucleotide site diversities, making comparisons between different species difficult to interpret.
The situation is even worse for the nucleotide diversity for a pair of alleles sampled from the population at large, since this is related to the within-deme value by 1/(1 - FST). In general, FST depends in a complex way on migration rates and deme sizes, and simple formulas are available for only a few limiting cases, such as the island model. Only when there is negligible genetic differentiation between local populations can one confidently relate mean nucleotide site diversity to the coalescent time for a randomly sampled pair of alleles and hence to a scaled mutation rate parameter. Many empirical investigations of DNA sequence variation in natural populations do not state explicitly what population parameters are of primary interest and often make little distinction between measures of variation based on whole-population and within-population estimates. More attention to these issues in presenting analyses of data on DNA sequence variability is desirable.
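The relationship between whole-population and within-deme diversity can be illustrated with a minimal sketch. The function names are hypothetical, and the FST formula used here is Wright's textbook approximation for an autosomal locus in an island model with many demes, FST ≈ 1/(1 + 4Nm). It is an assumption standing in for the sex- and inheritance-specific expressions of this article, which are not reproduced.

```python
def fst_infinite_island(deme_size: float, migration_rate: float) -> float:
    """Wright's approximation F_ST ~ 1 / (1 + 4*N*m).

    Assumed here for illustration only: autosomal locus, many demes,
    equal deme sizes, no sex differences in migration.
    """
    return 1.0 / (1.0 + 4.0 * deme_size * migration_rate)


def total_diversity(within_deme_diversity: float, fst: float) -> float:
    """Whole-population diversity implied by the factor 1 / (1 - F_ST)."""
    if not 0.0 <= fst < 1.0:
        raise ValueError("fst must lie in [0, 1)")
    return within_deme_diversity / (1.0 - fst)


if __name__ == "__main__":
    pi_within = 0.01  # hypothetical within-deme nucleotide diversity
    for nm in (10.0, 1.0, 0.1):  # number of migrants per generation, N*m
        fst = fst_infinite_island(deme_size=100.0, migration_rate=nm / 100.0)
        print(f"Nm={nm:5.1f}  FST={fst:.3f}  pi_total={total_diversity(pi_within, fst):.4f}")
```

The same within-deme diversity therefore maps onto very different whole-population values as migration weakens, which is why the two kinds of estimate should not be used interchangeably.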
Our investigation of the island model also shows that strong population subdivision, coupled with sex-specific migration rates, can greatly affect the relative values of the expected total genetic diversities for different modes of inheritance and may even reverse some of the patterns expected under panmixia. For example, Fig 3 displays the expected patterns of genetic variability for animal populations with a high variance in male reproductive success. From Fig 3C, it may be seen that predominantly female migration can cause the ratios of autosomal or X-linked variability to Y-linked variability to decline below one with extreme subdivision, in contrast to a value of over four with panmixia. Encouragingly, however, the relative values of expected autosomal and X-linked total population diversities are insensitive to all but the most extreme population subdivision, for both plant and animal models (Figs 1-4). This suggests that the use of ratios of X-linked to autosomal diversities to make inferences about the strength of sexual selection [...]
We have confined ourselves to deriving expressions for expected coalescence times and nucleotide site differences between alleles. However, expressions for the distribution of coalescent times for a set of n alleles sampled from a specified set of populations can easily be written down, using the migration probabilities defined by Equation 4 and coalescence probabilities defined by Equation 7 to generate the expectations of competing exponential distributions of waiting times to migration or coalescence events.
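The idea of competing exponential waiting times can be sketched for the simplest case of a pair of alleles. The simulation below assumes a symmetric island model with `n_demes` demes of `deme_size` diploid individuals, a backward migration rate `m` per lineage per generation, and a coalescence rate of 1/(2N) for a pair in the same deme. These plain rates are assumptions standing in for the migration probabilities of Equation 4 and the coalescence probabilities of Equation 7, and all function names are hypothetical.

```python
import random


def pair_coalescence_time(n_demes, deme_size, m, same_deme, rng):
    """Draw one coalescence time (in generations) for a pair of alleles."""
    coal_rate = 1.0 / (2.0 * deme_size)  # pair in the same deme coalesces
    move_rate = 2.0 * m                  # either of the two lineages migrates
    together, t = same_deme, 0.0
    while True:
        if together:
            # Coalescence and migration compete; the first event wins.
            total = coal_rate + move_rate
            t += rng.expovariate(total)
            if rng.random() < coal_rate / total:
                return t
            together = False  # a migrant always leaves the shared deme
        else:
            # Only migration can happen; the migrant picks one of the
            # other n_demes - 1 demes uniformly at random.
            t += rng.expovariate(move_rate)
            if rng.random() < 1.0 / (n_demes - 1):
                together = True


def mean_coalescence_time(n_demes, deme_size, m, same_deme,
                          replicates=20000, seed=1):
    """Monte Carlo estimate of the expected pairwise coalescence time."""
    rng = random.Random(seed)
    total = sum(
        pair_coalescence_time(n_demes, deme_size, m, same_deme, rng)
        for _ in range(replicates)
    )
    return total / replicates


if __name__ == "__main__":
    d, n, m = 4, 50, 0.01
    print("same deme:     ", mean_coalescence_time(d, n, m, True))
    print("different demes:", mean_coalescence_time(d, n, m, False))
```

For this simplified model, a first-step analysis of the two states (same deme, different demes) gives expected times of 2Nd for a within-deme pair and 2Nd + (d - 1)/(2m) for a between-deme pair, so the estimates should fall near 400 and 550 generations for the parameters shown. Extending the sketch to n alleles would require tracking the deme of every lineage and is not attempted here.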
| ACKNOWLEDGMENTS |
|---|
We thank N. Barton, D. Charlesworth, F. Depaulis, and two anonymous reviewers for their comments on the manuscript. B.C. acknowledges support by the Royal Society and the Engineering and Physics Research Council, and V.L. acknowledges support by the Biotechnology and Biological Sciences Research Council.
Manuscript received April 2, 2002; Accepted for publication June 18, 2002.
| APPENDIX |
|---|
Variances in offspring numbers with a binomial distribution of the proportion of sons vs. daughters:
Let the proportion of males among breeding individuals be c. From the properties of the binomial distribution, it is easily seen that the variance of the number of sons per female is
$$V_{fm} = c(1 - c)\,\bar{n}_f + c^2 V_f \tag{A1a}$$
where $\bar{n}_f$ and $V_f$ are the mean and variance of total offspring number per female. If the population size is stationary, we have $\bar{n}_f = 1/(1 - c)$, and so
$$V_{fm} = c + c^2 V_f \tag{A1b}$$
The Poisson expectation for $V_{fm}$ is c/(1 - c), so this yields
$$\Delta V_{fm} = V_{fm} - \frac{c}{1 - c} = c^2\left(V_f - \frac{1}{1 - c}\right) = c^2\,\Delta V_f \tag{A1c}$$
Similarly, we have
[Equation A2]
[Equation A3]
Corresponding expressions can be derived for the progeny of male parents, interchanging c and (1 - c).
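As a quick numerical check of the binomial argument, the sketch below compares an exact enumeration of the variance in the number of sons with the law-of-total-variance expression c(1 - c) × mean + c² × variance. It assumes that each offspring is independently a son with probability c. The offspring-number distribution `pmf` and the function names are hypothetical choices made for illustration.

```python
from math import comb


def sons_variance_exact(pmf, c):
    """Variance of the number of sons by direct enumeration.

    pmf maps total offspring number n to its probability; given n,
    the number of sons is Binomial(n, c).
    """
    m1 = m2 = 0.0
    for n, p_n in pmf.items():
        for k in range(n + 1):
            p = p_n * comb(n, k) * c**k * (1.0 - c) ** (n - k)
            m1 += p * k
            m2 += p * k * k
    return m2 - m1**2


def sons_variance_formula(pmf, c):
    """Law of total variance: c(1 - c) * mean + c^2 * variance."""
    mean = sum(n * p for n, p in pmf.items())
    var = sum(n * n * p for n, p in pmf.items()) - mean**2
    return c * (1.0 - c) * mean + c**2 * var


if __name__ == "__main__":
    c = 0.5
    # Mean 2 = 1 / (1 - c), as required for a stationary population;
    # variance 4, i.e. an excess of 2 over the Poisson expectation.
    pmf = {0: 0.5, 4: 0.5}
    v_exact = sons_variance_exact(pmf, c)
    v_formula = sons_variance_formula(pmf, c)
    excess = v_exact - c / (1.0 - c)  # excess over Poisson for sons
    print(v_exact, v_formula, excess)
```

With these values the excess variance for sons is c² times the excess variance for total offspring number, so a highly skewed distribution of family sizes is only partly transmitted to each sex of offspring.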
Effective population size with self-fertilization:
Consider first two maternally derived genes, sampled from different individuals (probability one-quarter). Their probability of origin from a common parent in the previous generation is independent of whether the individuals are the products of selfing or outcrossing and is simply
[Equation A4a]
where [...] is the mean per capita number of successful offspring produced through seed, and ΔVf is the excess over [...] of the variance in the number of successful offspring produced through seed.
Next, consider a maternally derived and a paternally derived gene from different individuals (probability one-half). The latter has a probability S of being derived by selfing, in which case the probability that it came from the same parent as the maternally derived gene is
[Equation A4b]
where CfOfS is the covariance between the number of offspring produced through outcrossed and selfed seeds, respectively, and ΔVfS is the excess over [...]S of the variance in number of offspring produced through selfing.
If the paternal allele was derived by outcrossing (probability 1 - S), the probability that the pair came from a common parent is
[Equation A4c]
where Cfm is the covariance between the total number of offspring produced by seed and the number of offspring produced through outcrossed pollen, and [...] is the mean number of offspring produced through outcrossed pollen.
Finally, consider a pair of paternally derived genes (probability one-quarter). There is a probability of approximately S² that they are both products of selfing, in which case the probability of sharing a common parent is
[Equation A4d]
There is a probability 2S(1 - S) that one is derived from selfing and the other from outcrossing, in which case the probability of common parentage is
[Equation A4e]
where CfSm is the covariance between the number of offspring produced through selfed seed and outcrossed pollen.
There is a probability (1 - S)² that both are products of outcrossing, in which case their probability of common parentage is
[Equation A4f]
where ΔVm is the excess over [...] of the variance in number of offspring produced through outcrossed pollen.
Conditioned on common parentage, each of these possible origins of gene pairs has a probability of (1 + F)/2 of resulting in coalescence. The reciprocal of the effective population size, NeH, is thus given by multiplying (1 + F) by the sum of the products of the probabilities of origins and common parentage conditioned on origin, where F = S/(2 - S) at equilibrium under selfing and random mating.
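The role of the factor (1 + F) can be seen in the simplest special case. The sketch below assumes N hermaphrodites with purely Poisson offspring numbers, so that every probability of common parentage reduces to 1/N and all excess variances and covariances vanish. Under that assumption the probabilities of the different origins sum to one, giving 1/NeH = (1 + F)/N. The function names are hypothetical.

```python
def equilibrium_inbreeding(selfing_rate: float) -> float:
    """Equilibrium inbreeding coefficient F = S / (2 - S)."""
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing_rate must lie in [0, 1]")
    return selfing_rate / (2.0 - selfing_rate)


def ne_selfing_poisson(n_individuals: float, selfing_rate: float) -> float:
    """Effective size N / (1 + F), assuming Poisson offspring numbers.

    Assumption: each probability of common parentage equals 1/N, so the
    weighted sum over origins is 1/N and 1/Ne = (1 + F) / N.
    """
    return n_individuals / (1.0 + equilibrium_inbreeding(selfing_rate))


if __name__ == "__main__":
    for s in (0.0, 0.5, 0.9, 1.0):
        f = equilibrium_inbreeding(s)
        print(f"S={s:.1f}  F={f:.3f}  Ne/N={ne_selfing_poisson(1.0, s):.3f}")
```

In this special case complete selfing halves the effective size relative to random mating. Non-Poisson variances and covariances in offspring number change the result through the common-parentage terms of Equations A4a-A4f.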
Assume that the kth individual has a total amount of resource Rk available for reproduction, of which a fraction ck is devoted to pollen production and 1 - ck to seed production. Let the net expected contribution of offspring through seed be a function f(Rk[1 - ck]) of the individual's allocation to seed production. If the population size is stationary, this has a mean of one.
If fitnesses are normalized so that mean fitnesses through male and female contributions are equal [...]
[Equation A5]
where the function g describes the dependence of relative male reproductive success on allocation to pollen (the expectation of g is 1).
The actual numbers of offspring produced are Poisson variates with [...] and [...] as parameters. The excess of the variance over Poisson expectation in the number of offspring produced through seed is
[Equation A6a]
(assuming that Rk and ck are independent), where the derivatives are evaluated at the population means.
[Equation A6b]
Similarly
[Equation A6c]
and
[Equation A6d]
Substituting these equations into Equation 19b, we obtain
[Equation A7]
The second term in brackets is equal to the square of the derivative of total fitness with respect to sex allocation [...]
More complex models, which allow for variation in the selfing rate and for covariances between R and c, can be written down, but the conclusions are not greatly changed. The covariance term carries a weight that involves a factor equal to the derivative of total fitness with respect to sex allocation, which, by the argument above, is expected to be close to zero. The variance in selfing rate is necessarily close to zero for outcrossing and highly selfing populations and so will not influence their relative Ne values.
| LITERATURE CITED |
|---|
BACHTROG, D. and B. CHARLESWORTH, 2000 Reduced levels of microsatellite variability on the neo-Y chromosome of Drosophila miranda. Curr. Biol. 10:1025-1031.
BACHTROG, D. and B. CHARLESWORTH, 2002 Reduced adaptation of an evolving neo-Y chromosome. Nature 416:323-326.
BARBUJANI, G., A. MAGAGNI, E. MINCH, and L. L. CAVALLI-SFORZA, 1997 An apportionment of human DNA diversity. Proc. Natl. Acad. Sci. USA 94:4516-4519.
BERTRANPETIT, J., 2000 Genome, diversity, and origins: the Y chromosome as a storyteller. Proc. Natl. Acad. Sci. USA 97:6927-6929.
BOWCOCK, A. M., J. R. KIDD, J. L. MOUNTAIN, J. M. HEBERT, and L. CAROTENUTO et al., 1991 Drift, admixture, and selection in human evolution: a study with DNA polymorphisms. Proc. Natl. Acad. Sci. USA 88:839-843.
CABALLERO, A., 1995 On the effective size of populations with separate sexes, with particular reference to sex-linked genes. Genetics 139:1007-1011.
CHARLESWORTH, B., 1998 Measures of divergence between populations and the effect of forces that reduce variability. Mol. Biol. Evol. 15:538-543.
CHARLESWORTH, B., 2001 The effect of life-history and mode of inheritance on neutral genetic variability. Genet. Res. 77:153-166.
CHARLESWORTH, B. and D. CHARLESWORTH, 2000 The degeneration of Y chromosomes. Philos. Trans. R. Soc. Lond. B 355:1563-1572.
CHARLESWORTH, D. and B. CHARLESWORTH, 1981 Allocation of resources to male and female functions in hermaphrodites. Biol. J. Linn. Soc. 15:57-74.
CHESSER, R. K. and R. J. BAKER, 1996 Effective sizes and dynamics of uniparentally and diparentally inherited genes. Genetics 144:1225-1235.
CHESSER, R. K., O. E. J. RHODES, D. W. SUGG, and A. SCHNABEL, 1993 Effective sizes for subdivided populations. Genetics 135:1221-1232.
COCKERHAM, C. C. and B. S. WEIR, 1993 Estimation of gene flow from F-statistics. Evolution 47:855-863.
ENNOS, R. A., 1994 Estimating the relative rates of pollen and seed migration among plant populations. Heredity 72:250-259.
EXCOFFIER, L., P. E. SMOUSE, and J. M. QUATTRO, 1992 Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial-DNA restriction data. Genetics 131:479-491.
FILATOV, D. A., F. MONÉGER, I. NEGRUTIU, and D. CHARLESWORTH, 2000 Evolution of a plant Y-chromosome: variability in a Y-linked gene of Silene latifolia. Nature 404:388-390.
FILATOV, D. A., V. LAPORTE, C. VITTE, and D. CHARLESWORTH, 2001 DNA diversity in sex linked and autosomal genes of the plant species Silene latifolia and S. dioica. Mol. Biol. Evol. 18:1442-1454.
HAMMER, M. F., A. B. SPURDLE, T. KARAFET, M. R. BONNER, and E. T. WOOD et al., 1997 The geographic distribution of human Y chromosome variation. Genetics 145:787-805.
HAMMER, M. F., T. M. KARAFET, A. J. REDD, H. JARJANAZI, and S. SANTACHIARA-BENERECETTI et al., 2001 Hierarchical patterns of global human Y-chromosome diversity. Mol. Biol. Evol. 18:1189-1203.
HUDSON, R. R., 1990 Gene genealogies and the coalescent process. Oxf. Surv. Evol. Biol. 7:1-45.
HUDSON, R. R., D. D. BOOS, and N. L. KAPLAN, 1992 A statistical test for detecting geographic subdivision. Mol. Biol. Evol. 9:138-151.
HURLES, M. E. and M. A. JOBLING, 2001 Haploid chromosomes in molecular ecology: lessons from the human Y. Mol. Ecol. 10:1599-1613.
HURST, L. D. and H. ELLEGREN, 1998 Sex biases in the mutation rate. Trends Genet. 14:446-452.
JORDE, L. B., W. S. WATKINS, M. J. BAMSHAD, M. E. DIXON, and C. E. RICKER et al., 2000 The distribution of human genetic diversity: a comparison of mitochondrial, autosomal, and Y-chromosome data. Am. J. Hum. Genet. 66:979-988.
KAYSER, M., M. KRAWCZAK, L. EXCOFFIER, P. DIELTJES, and D. CORACH et al., 2001 An extensive analysis of Y-chromosomal microsatellite haplotypes in globally dispersed human populations. Am. J. Hum. Genet. 68:990-1018.
KIMURA, M., 1971 Theoretical foundations of population genetics at the molecular level. Theor. Popul. Biol. 2:174-208.
LLOYD, D. G., 1977 Genetic and phenotypic models of natural selection. J. Theor. Biol. 69:543-560.
MCVEAN, G., 2000 Evolutionary genetics: what is driving male mutation? Curr. Biol. 10:R834-R835.
MCVEAN, G. A. T. and L. D. HURST, 1997 Evidence for a selectively favourable reduction in the mutation rate of the X chromosome. Nature 386:388-392.
NAGYLAKI, T., 1980 The strong migration limit in geographically structured populations. J. Math. Biol. 9:101-114.
NAGYLAKI, T., 1982 Geographical invariance in population genetics. J. Theor. Biol. 99:159-172.
NAGYLAKI, T., 1998a The expected number of heterozygous sites in a subdivided population. Genetics 149:1599-1604.
NAGYLAKI, T., 1998b Fixation indices in subdivided populations. Genetics 148:1325-1332.
NORDBORG, M., 1997 Structured coalescent processes on different time scales. Genetics 146:1501-1514.
NORDBORG, M. and P. DONNELLY, 1997 The coalescent process with selfing. Genetics 146:1185-1195.
OOTA, H., W. SETTHEETHAM-ISHIDA, D. TIWAWECH, T. ISHIDA, and M. STONEKING, 2001 Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat. Genet. 29:20-21.
POLLAK, E., 1987 On the theory of partially inbreeding populations. I. Partial selfing. Genetics 117:353-360.
POLONI, E. S., O. SEMINO, G. PASSARINO, A. S. SANTACHIARA-BENERECETTI, and I. DUPANLOUP et al., 1997 Human genetic affinities for Y-chromosome P49a,f/TaqI haplotypes show strong correspondence with linguistics. Am. J. Hum. Genet. 61:1015-1035.
QUINTANA-MURCI, L., O. SEMINO, E. S. POLONI, A. LIU, and M. VAN GIJN et al., 1999 Y-chromosome specific YCAII, DYS19 and YAP polymorphisms in human populations: a comparative study. Ann. Hum. Genet. 63:153-166.
ROUSSET, F., 1999 Genetic differentiation in populations with different classes of individuals. Theor. Popul. Biol. 55:297-308.
SEIELSTAD, M. T., E. MINCH, and L. L. CAVALLI-SFORZA, 1998 Genetic evidence for a higher female migration rate in humans. Nat. Genet. 20:278-280.
SHEN, P. D., F. WANG, P. A. UNDERHILL, C. FRANCO, and W. H. YANG et al., 2000 Population genetic implications from sequence variation in four Y chromosome genes. Proc. Natl. Acad. Sci. USA 97:7354-7359.
SLATKIN, M., 1991 Inbreeding coefficients and coalescence times. Genet. Res. 58:167-175.
STONEKING, M., 1998 Women on the move. Nat. Genet. 20:219-220.
THOMSON, R., J. K. PRITCHARD, P. D. SHEN, P. J. OEFNER, and M. W. FELDMAN, 2000 Recent common ancestry of human Y chromosomes: evidence from DNA sequence data. Proc. Natl. Acad. Sci. USA 97:7360-7365.
THORSTENSON, Y. R., P. D. SHEN, V. G. TUSHER, T. L. WAYNE, and R. W. DAVIS et al., 2001 Global analysis of ATM polymorphism reveals significant functional constraint. Am. J. Hum. Genet. 69:396-412.
UNDERHILL, P. A., L. JIN, A. A. LIN, S. Q. MEHDI, and T. JENKINS et al., 1997 Detection of numerous Y chromosome biallelic polymorphisms by denaturing high-performance liquid chromatography. Genome Res. 7:996-1005.
WAKELEY, J., 1999 Nonequilibrium migration in human history. Genetics 153:1863-1871.
WAKELEY, J., 2001 The coalescent in an island model of population subdivision with variation among demes. Theor. Popul. Biol. 59:133-144.
WAKELEY, J. and N. ALIACAR, 2001 Gene genealogies in a metapopulation. Genetics 159:893-905.
WANG, J., 1997a Effective size and F-statistics of subdivided populations: I. Monoecious species with partial selfing. Genetics 146:1453-1463.
WANG, J., 1997b Effective size and F-statistics of subdivided populations: II. Dioecious species. Genetics 146:1465-1474.
WANG, J., 1999 Effective size and F-statistics of subdivided populations for sex-linked loci. Theor. Popul. Biol. 55:176-188.
WANG, J. and A. CABALLERO, 1999 Developments in predicting the effective size of subdivided populations. Heredity 82:212-226.
WATKINS, W. S., C. E. RICKER, M. J. BAMSHAD, M. L. CARROLL, and S. V. NGUYEN et al., 2001 Patterns of ancestral human diversity: an analysis of Alu-insertion and restriction-site polymorphisms. Am. J. Hum. Genet. 68:738-752.
WHITLOCK, M. C. and N. H. BARTON, 1997 The effective size of a subdivided population. Genetics 146:427-441.
WILKINSON-HERBOTS, H. M., 1998 Genealogy and subpopulation differentiation under various models of population structure. J. Math. Biol. 37:535-585.
WRIGHT, S., 1931 Evolution in Mendelian populations. Genetics 16:97-169.
WRIGHT, S., 1943 Isolation by distance. Genetics 28:114-138.
WRIGHT, S., 1951 The genetical structure of populations. Ann. Eugen. 15:323-354.
YU, N., Y. X. FU, N. SAMBUUGHIN, M. RAMSAY, and T. JENKINS et al., 2001 Global patterns of human DNA sequence variation in a 10-kb region on chromosome 1. Mol. Biol. Evol. 18:214-222.
ZHAO, Z. M., L. JIN, Y. X. FU, M. RAMSAY, and T. JENKINS et al., 2000 Worldwide DNA sequence variation in a 10-kilobase noncoding region on human chromosome 22. Proc. Natl. Acad. Sci. USA 97:11354-11358.
ZIETKIEWICZ, E., V. YOTOVA, M. JARNIK, M. KORAH-LASKOWSKA, and K. K. KIDD et al., 1998 Genetic structure of the ancestral population of modern humans. J. Mol. Evol. 47:146-155.
ZUROVCOVA, M. and W. F. EANES, 1999 Lack of nucleotide polymorphism in the Y-linked sperm flagellar dynein gene Dhc-Yh3 of Drosophila melanogaster and D. simulans. Genetics 153:1709-1715.