- THIS ARTICLE
-
Abstract
- Full Text (PDF)
- Alert me when this article is cited
- Alert me if a correction is posted
- SERVICES
- Similar articles in this journal
- Similar articles in PubMed
- Alert me to new issues of the journal
- Download to citation manager
- Reprints & Permissions
- CITING ARTICLES
- Citing Articles via HighWire
- Citing Articles via Google Scholar
- GOOGLE SCHOLAR
- Articles by Beaumont, M. A.
- Search for Related Content
- PUBMED
- PubMed Citation
- Articles by Beaumont, M. A.
Estimation of Population Growth or Decline in Genetically Monitored Populations
Mark A. Beaumontaa School of Animal and Microbial Sciences, Reading RG6 6AJ, United Kingdom
Corresponding author: Mark A. Beaumont, Whiteknights, PO Box 228, Reading RG6 6AJ, United Kingdom., m.a.beaumont{at}reading.ac.uk (E-mail)
Communicating editor: G. CHURCHILL
| ABSTRACT |
|---|
This article introduces a new general method for genealogical inference that samples independent genealogical histories using importance sampling (IS) and then samples other parameters with Markov chain Monte Carlo (MCMC). It is then possible to more easily utilize the advantages of importance sampling in a fully Bayesian framework. The method is applied to the problem of estimating recent changes in effective population size from temporally spaced gene frequency data. The method gives the posterior distribution of effective population size at the time of the oldest sample and at the time of the most recent sample, assuming a model of exponential growth or decline during the interval. The effect of changes in number of alleles, number of loci, and sample size on the accuracy of the method is described using test simulations, and it is concluded that these have an approximately equivalent effect. The method is used on three example data sets and problems in interpreting the posterior densities are highlighted and discussed.
THE effect of inbreeding on population fitness is currently the focus of many studies, both empirical (![]()
![]()
![]()
![]()
Estimation of Ne is problematic. There are three general approaches. One way is to estimate it nongenetically from the mating system (![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
The most widely used method to estimate Ne from genetic samples is from the difference in gene frequency between serial samples taken from the same population. This is the "temporal method," first introduced by ![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
The above methods all assume that the sampling period is sufficiently short that the effects of mutations can be safely ignored. Recently a strand of research that is independent of that initiated by ![]()
![]()
![]()
![]()
![]()
This study assumes that the effects of mutations over the sampling period can be ignored and makes three contributions. First, it is shown how the Monte Carlo method of importance sampling can be used to update sets of genealogies in a Markov chain Monte Carlo simulation to estimate posterior distributions of parameters of interest. This method is very general and can be applied to all models of genealogical inference and may lead to increased efficiency in implementation and execution. This computational method is applied to the coalescent-based model of ![]()
![]()
![]()
| IMPLEMENTATION OF MARKOV CHAIN MONTE CARLO WITH INDEPENDENT SAMPLING OF GENEALOGICAL HISTORIES |
|---|
Background and motivations:
The potential for genetic data to shed light on the evolutionary history of populations has been well appreciated over the last decade, and in the development of the statistical methodology there has been a general interest in moving away from moment-based methods of estimation to the use of likelihood and Bayesian inference (![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
In the MCMC methods that give autocorrelated samples of parameter values the necessary integration for Bayesian inference is intrinsic, and the only additional computation, if required, is to estimate the posterior densities from the sampled points. By contrast, with importance sampling methods that give approximations of likelihood surfaces directly, further complex procedures are necessary for Bayesian inference. Given that it is generally easier to estimate densities (thereby enabling the choice of classical likelihood-based estimation, integrated likelihoods, or fully Bayesian inference) than to manipulate the approximated-likelihood surfaces, it would seem that methods that give autocorrelated samples of parameter values offer the greatest flexibility. However, these methods have currently two main drawbacks. They involve making small modifications to the genealogical history, and because of this (a) they are generally more difficult to program than pure IS methods, and (b) they can move quite slowly through the space of possible genealogical histories, making them potentially inefficient. This article introduces a method for overcoming these two disadvantages and applies it to a specific problem, the estimation of effective population size, Ne, from temporally spaced genetic samples.
The general motivation behind these computer-intensive methods is that from coalescent theory it is straightforward to calculate the probability p(D, G|
) = p(D|G)P(G|
) of any particular genealogical history, G, that gives rise to some data D, as a function of parameters specifying the demographic history and mutation model,
. There are a number of different representations of the genealogical history (see ![]()
![]() |
(1) |
where, following ![]()
) directly is most conveniently made using importance sampling (![]()
![]()

and this is estimated by sampling Gj from q(G|
) and evaluating
![]() |
(2) |
Generally the sampling distribution is chosen such that P(D|Gj) = 1 for all Gj. In the ideal case that q(G|
) = p(G|D,
), i.e., the posterior distribution of genealogical histories given the data and parameters, the variance in the estimate of p(D|
) is zero because each term in (2) evaluates to the likelihood axiomatically. The ratio p(G|
)/q(G|
) is called the importance ratio, or importance weight.
However, the evaluation or estimation of p(D|
) is not necessarily an ideal goal for population genetic inference. The problem is that
often has many components, and generally we wish to make inferences about one component (e.g., growth rate) independent of the others. Furthermore, for most population genetic problems, the likelihood surfaces do not approximate that of a multivariate normal distribution, and therefore asymptotic theory and methods often do not apply. These problems can be side-stepped by taking a Bayesian approach to inference, which also has the advantage that background information can be incorporated into the model (![]()

Inferences on particular parameters can be made from the marginal posterior distribution, where p(
|D) is integrated over all other parameters. If uniform improper priors are used p(
|D)
p(D|
) and the methods used to obtain marginal posterior distributions will also give the integrated (relative) likelihood surface. Considerations of how best to make inferences on single parameters in multiparameter models has led, for example, ![]()
The only method currently used to perform fully Bayesian analyses for population genetic inference has been Metropolis-Hastings sampling of parameter values (e.g., ![]()
![]()
![]()
![]()
To perform Metropolis-Hastings sampling it is not necessary to evaluate p(D|
), and we can work with p(D, G|
), which is easily calculated from coalescent theory. Starting with any Gi such that P(D|Gi) = 1, modify Gi
Gi+1 [where P(D|Gi+1) = 1] and
i
i+1 such that it is straightforward to calculate the probability, p(Gi+1,
i+1|Gi,
i), of obtaining Gi+1 and
i+1, conditional on being at Gi,
i, and the reverse. Then accept Gi+1 and
i+1, with probability
![]() |
(3) |
otherwise Gi+1 = Gi, and
i+1 =
i. The first term in the product is the likelihood ratio, the second is the Hastings term, and the third is the ratio of the priors. The Hastings term is the ratio of the probability of reaching the current state from the proposed state to that of the reverse and ensures a uniform coverage of the parameter space. This simulated Markov chain will then give a (serially autocorrelated) sample from p(
, G|D). Summaries of the marginal posterior density for a particular parameter or an estimate of the density itself can be obtained from the simulated sequence of values realized for that parameter, ignoring the others. The key point here is that it is possible to perform the simulation using p(D, G|
), which is easy to calculate, by updating the genealogical history G, and then the posterior distribution for the parameters of interest are obtained marginal to the genealogical histories. The price for this convenience is that the search space of the MCMC simulation is greatly increased. If it were possible to evaluateEquation 1, then p(
|D) marginal to G could have been obtained by running the simulation with p(D|
) and updating
i
i+1 alonei.e., accepting
i+1 with probability
![]() |
(4) |
and thus only
would have to be explored by the MCMC simulation.
Current methods that use independent sampling of genealogical histories within an MCMC framework:
Hitherto it has been easier to run the MCMC using p(G, D|
) and hence autocorrelated sampling of genealogical histories, but, as discussed above, there are programming problems and problems of efficiency with this approach. Therefore it is tempting to consider the use of importance sampling to obtain an approximation,
(D|
), which can then be implemented in an MCMC simulation to incorporate prior information, and obtain marginal posterior distributions or integrated likelihoods, as discussed above. One advantage of importance sampling is that it is often very straightforward to implement in a computer program. Also, because the importance-sampling function uses heuristics from coalescent theory to attempt to generate genealogies from their posterior distribution, given the data, it is a potentially more efficient method for sampling genealogical histories in comparison with MCMC.
This approach has been used in a series of articles (![]()
![]()
![]()
![]()
![]()
![]()

inEquation 4 is replaced by

estimated (in the genealogical analyses) usingEquation 2. Note that in normal MCMC, if the denominator p(D|
i) in the likelihood ratio were known without error there would be no need to reevaluate it each time that R was evaluated. By contrast, with
there is a choice whether to make independent estimates of
(D|
i) when it is evaluated at each update of the MCMC (evaluation ofEquation 4) or to reuse the earlier estimate. Intuitively it seems reasonable, though more time consuming, to reevaluate it each time so that the estimates of
are independent of each other and so that unusually large ratios arising by chance do not lead to sticking of the simulated Markov chain. Furthermore, the results in ![]()
![]()
and not for G as in the MCMC methods of ![]()
![]()
![]()
- Choose initial parameter values
i with i = 0. - Sample h independent genealogical histories using importance-sampling function and calculate
(D|
i) fromEquation 2. - Draw
i+1
p(
i+1|
i). - Sample h independent genealogical histories using importance-sampling function and calculate
(D|
i+1) fromEquation 2. - Accept
i+1 with probability 
Otherwise
i+1 =
i. - Set i = i + 1 and go to 2.
Bias correction:
Clearly, since
is based on an approximation of the likelihood ratio the posterior distribution will also be approximate. Simulation tests performed in ![]()
![]()

where R is the true likelihood ratio and
is its estimate. They suggest using the estimator R* =
2/
[
], where
[
] is an estimate of the expected value of the ratio, to correct for the bias in
. Details of how R* has been estimated for the genealogical model considered here are given in the Appendix. In the results below, simulations carried out with this bias correction are referred to here as MCWM with bias correction and the earlier method as MCWM without bias correction.
Independence Metropolis-Hastings simulation:
The methods described above all use importance sampling to approximate p(D|
) and, with or without bias correction, will lead the MCMC simulation to sample from an approximate posterior distribution. I now show how a small modification to the approach will guarantee that the MCMC will sample from the true posterior distribution.
Consider now an importance sample of size 1 (i.e., h = 1 inEquation 2 above). The importance weight,

is an (admittedly very poor) estimate of p(D|
) as described above, but is also the ratio of the probability of sampling the genealogy under the coalescent to the probability of sampling the genealogy under the importance-sampling function. Supposing this were used in the Metropolis-Hastings simulation described by (3), the ratio of importance weights for the ith and i + 1th MCMC update is

which, multiplied with the Hastings term for the parameter updates, p(
i|
i+1)/p(
i+1|
i), will give (3) above. Note that the Hastings term for updates to G is not conditional on any particular value of G. Thus, at least for G, this method is an example of the well-studied independence Metropolis-Hastings sampler (see, e.g., ![]()
![]()
![]()
![]()
, and although it would be possible to update the genealogical history independently of the demographic parameters, they are all updated together in the following simulations. To reiterate, the difference between the independence Metropolis-Hastings sampler and MCWM is that in MCWM the importance weight is viewed as an estimate of p(D|
i) in (4) and, although it would never be advisable to use it with a sample of size 1, is reevaluated with a new Gi at each evaluation of (4), whereas with the independence Metropolis-Hastings sampler the current Gi is retained at each evaluation of (3).
As is shown in the RESULTS, using a single genealogy in the independence sampler leads to very poor convergence of the MCMC, and, again this is a well-known property of the independence Metropolis-Hastings sampler when the sampling function is a poor approximation of the target density (![]()
and would generally be very good. Intuitively, therefore, we should get better convergence if we take larger sample sizes, but then the question arises whether the MCMC will converge to the required target density exactlyi.e., p(
, G|D).
As shown in the Appendix the target density for the MCMC becomes rather more complicated when we consider importance sample sizes greater than one. However, it can be proved (see the Appendix) that ifEquation 2 is used with values of h > 1 then the independence sampling procedure will always give the correct posterior densities for the demographic and genealogical parameters for any importance sample size. Although the sample of h genealogical histories observed at any point in the simulated chain is not drawn from the posterior distribution, if we consider the sample of genealogies simulated by the importance sampling procedure to be ordered and keep a track of, say, the genealogies occupying the jth position throughout the MCMC simulation, then these genealogies will (in the long run) be sampled from the correct posterior distribution and, jointly with the parameters, will be sampled from p(
, G|D). As described below, simulation tests suggest that acceptance rates increase rapidly with larger importance sample sizes, and for adequate importance sample sizes this procedure is, in general, more efficient than the other methods. This method differs from the normal independence sampler because we are using a group of sampled genealogies rather than one and is called grouped independence Metropolis-Hastings (GIMH) to distinguish it from the normal independence Metropolis-Hastings and from MCWM, which involves reevaluation of the likelihood. Since the grouped independence Metropolis-Hastings sampler can be shown to converge to the target densities exactly, whereas this is only approximate in the case of MCWM, with or without bias correction, the bulk of the analyses performed in this article are carried out using this approach.
A basic algorithm for GIMH (or the standard independence sampler when h = 1) is as follows:
- Choose initial parameter values
i with i = 0. - Sample h independent genealogical histories using importance-sampling function and calculate
i(D|
i) fromEquation 2. - Draw
i+1
p(
i+1|
i). - Sample h independent genealogical histories using importance-sampling function and calculate
i+1 (D|
i+1) fromEquation 2. - Accept
i+1 and
i+1(D|
i+1) with probability 
Otherwise
i+1 =
i and
i+1(D|
i+1) =
i(D|
i). - Set i = i + 1 and go to 3.
In this algorithm the importance sampling calculations are explicitly indexed for clarity. The essential difference from MCWM is that the iterations start at step 3 and genealogical histories are simulated only for the trial values.
| INFERENCE IN THE TEMPORAL METHOD BASED ON A COALESCENT MODEL WITH SAMPLES TAKEN AT MANY TIME POINTS |
|---|
The data are assumed to be sampled at different times, given by the sequence
= (x0, x1, ... , xd). Time is measured in units of generations, and the most recent sample is given subscript 0, and x0 = 0. The population is changing exponentially in size from a previously constant ancestral size NA at time X to size N0 at time 0. Time is taken to be increasing into the past, and terms such as "earlier" and "later" refer, respectively, to times nearer or farther from the most recent sample (see Fig 1). Corresponding to each time point is a sequence of sample sizes (number of chromosomes)
= (n0, n1, ... , nd), where lineages are added to the genealogy. The sequence of frequency counts of the different allelic types in each sample is given by
= (a0, a1, ... , ad), where the vectors are of length k, the total number of different allelic types observed in the data. For times >x0, at the time each set of lineages is added a number of lineages are present with descendants in earlier samples, lower down the genealogy (i.e., lower down in Fig 1). The allele frequency counts among these base lineages are denoted here as the random variable
= (f0, ... , fd), where, to ease the notation below, f0 is defined to be 0. The number of these lineages, also a random variable, H = (h1, ... , hd), depends on the number of coalescences that occur in the intervals between sampling points. These are given by the sequence
= (c1, ... , cd). Thus, at the ith sample point, the number of lineages deriving from earlier samples is given by

|
The notation used here is summarized in Fig 1.
The likelihood, assuming a model of drift without mutations, can be obtained as a straightforward extension of the two-sample case in ![]()
![]() |
(5) |
where
- p(ad + fd) is the probability of sampling the gene frequencies in the lineages extant at the earliest sampling time(at the top of Fig 1);
- p(fi + ai|fi+1, ci+1) is the probability of obtaining the gene frequencies among the lineages extant at sample i given the base lineages at i + 1 and the number of coalescences within the interval;
- p(ai+1, fi+1|ai+1 + fi+1) is the hypergeometric sampling probability of obtaining the frequencies in the base lineages and the frequencies in the sample lineages, given the frequencies of the combined lineages; and
- p(ci+1|(xi+1 - xi)/(2Ñi+1)) is the probability of obtaining c coalescences in the sampling interval, over which the harmonic mean effective size is Ñ
(see the Appendix for further details). The sum is over all possible numbers of coalescences between sampling intervals and all possible frequency counts among the base lineages at each interval. In the case of many unlinked loci, the likelihoods can be estimated for each locus separately and then multiplied together. Although in principle the possibilities can be straightforwardly enumerated, allowingEquation 5 to be solved, in practice there are far too many possibilities to make this useful. Instead, the importance sampling approach of ![]()
![]()
In this approach S independent sequences of coalescences of lineages are explicitly sampled by simulation (see the Appendix for details), and we obtain
![]() |
(6) |
Thus for the
th simulated sequence p(fi + ai|fi+1, ci+1) x p(ci+1|(xi+1 - xi)/(2Ñi+1)) in (5) is replaced by
c
ie=0 w
i(e+1), which is the ratio of the probability of obtaining the sampled sequence of lineages under the coalescent model, independent of the data, to the probability of obtaining it from the importance-sampling function.
The c
i coalescences are simulated using the coalescent model (see the Appendix for details of how this was done for a population of varying size). The distribution of the number of coalescences between data-sampling intervals is identical under the coalescent model and the importance-sampling function, and hence this term cancels out. This form of sampling is used for all the analyses described below. However, if importance weights are to be evaluated at parameter values other than those used to generate the samples, the terms in (6) need to be multiplied by a weight reflecting the different probability of obtaining the simulated number of coalescences under the coalescent compared to that under the importance-sampling function. This can be done in two ways. The weight p(ci,(xi+1 - xi)/2Ñi+1)/p(ci,(xi+1 - xi)/2Ñ*i+1) can be used from TAVARÉ's (1984) Equation 6.1 for each interval between samples, where Ñ and Ñ* are calculated for each interval from (A3), and Ñ* is used to generate the importance samples. Alternatively, the simulated coalescence times can be recorded, and an equivalent ratio can be calculated from their joint density under the coalescent compared to their joint density under the importance-sampling function. The advantage of the former is that it is marginal to the coalescence times and should therefore be more efficient; however, it is computationally time consuming to calculate and numerically unstable, and the latter is probably more practicable.
In the results described in the next sectionsEquation 6 has been used on its own to estimate likelihoods and also incorporated into the MCWM procedure with and without bias correction and into the GIMH) sampler. In general when MCWM and GIMH are used in the analyses rectangular priors are assumed for each parameter, as in ![]()
| SIMULATION TESTS |
|---|
Comparison of MCWM and GIMH with pure IS estimation:
To compare the accuracy of the three different MCMC approaches a data set was simulated from the model from a diploid population with effective size Ne = 51.2. The population did not change in size over the sampling period, and six samples each of size 20 chromosomes were taken at generations 0, 4, 8, 12, 16, and 20. The data set consisted of 10 loci each with five alleles in the population (although, due to sampling, some data sets had fewer than five alleles). The population frequencies were simulated from a uniform Dirichlet distribution, according to the assumption of the model.
The data set was then analyzed by four different approaches (in all models, Ne = N0 = NA):
- The likelihoods for a grid of 81 values of Ne from 20 to 80 were evaluated using (6). The likelihoods were evaluated at each point independently, using an IS size of 40,000. The standard errors were estimated using (A2). The approximate likelihood surface was normalized to have unit volume, and the standard errors were scaled accordingly. The standard deviation was estimated from this distribution.
- Nine different simulations using MCWM without bias correction were carried out in which the IS sizes used in the evaluation of (4) were 5000, 1000, 500, 100, 50, 10, 5, 2, and 1. The simulations were run for 20,000 updates, which appeared to give good convergence (as judged by eye from the output traces), and densities and standard deviations of the posterior distribution were estimated from the values of Ne generated by the simulation.
- Eight simulations using bias-corrected MCWM were carried out as for MCWM. Simulations using an IS size of one were not performed because SE[
(D/
)] cannot be estimated. - Nine simulations were carried out using GIMH as for MCWM. However, with GIMH the rate of convergence is heavily dependent on the IS size used. In particular, with an IS size of one the MCMC procedure tends to mix very poorly because the simulated chain will stick at chance high values of
(D/
). The length of simulation, the thinning interval (the number of MCMC iterations between successive recordings of parameter values), and the standard deviation of the trial parameter updates were varied between simulations to achieve satisfactory convergence, judged by eye from output traces.
Estimated densities using the four approaches are shown in Fig 2. It can be seen that the standard errors for the IS method are still large, even with 40,000 points. However, the posterior distributions estimated by MCWM with and without bias correction are very similar to each other and to the distribution estimated from the pure IS method, despite the variability in the estimates of the likelihood (for an IS size of 500 the standard errors are expected to be around nine times larger than those shown in Fig 2). The distribution for GIMH with an IS size of just 10 per MCMC update (evaluation ofEquation 4) is very close to that of the pure IS method.
|
Fig 3 shows how the width of the estimated posterior distribution varies with the IS size.
|
The standard deviation estimated from the pure IS method is 10.4. It can be seen that for MCWM with and without bias correction there is a strong relationship between the width of the distribution and the IS size. Bias correction does appear to ameliorate the problem to some extent, but still leads to inaccurate estimation of the posterior distribution when the IS size is small. For MCWM without bias correction, an IS size of
500 is the minimum required for accurate estimation, whereas
100 are needed with bias correction. The rate of convergence of the two MCWM methods appears to be independent of IS size.
GIMH is unaffected by the IS size, as expected from the result in the Appendix. This is not a free lunch, however. The tradeoff is that the amount of mixing is severely reduced when the IS size is low, and the length of time required to achieve convergence is correspondingly increased. For example, with an IS size of 1 the result shown in Fig 3 was obtained by pooling together results from seven independent simulations of 108 MCMC updates, thinned every 10,000 updates. Even in this case, there is still appreciable variability in the results between the independent simulations. The result for an IS size of 10 was obtained from a single simulation of 107 MCMC updates thinned every 200 updates. In this case convergence appears very good, with a very uniform trace for Ne, and the resulting density is plotted in Fig 2.
These points are further illustrated in Fig 4, where the proportion of trial updates that are accepted in the MCMC, divided by IS size, is plotted against IS size. The acceptance rate varied rapidly from 0.0011 with an IS size of 1 to 0.15 with an IS size of 10 and then more slowly to 0.85 with an IS size of 5000. It can be seen in Fig 4 that the scaled acceptance rate has an optimum at an IS size of
10.
|
In these simulations the standard deviation of the distribution of parameter updates was kept at 0.1 for all IS sizes, and therefore the scaled acceptance rate is a measure of efficiencyfor a given number of accepted trial updates, the total required number of IS evaluations is at a minimum if an IS size of 10 is used. This optimum will vary for different data sets and sizes of the trial parameter updates, and for the remaining analyses described in this article, which mostly used a standard deviation of 0.5 for the trial parameters, unless otherwise stated, GIMH was used with an IS size of 100 and 105 MCMC updates, thinned every 10 updates to give 10,000 points.
Effect of sample size and numbers of alleles and loci on the estimation of NA and N0:
To illustrate the effect of varying aspects of the sample, five independent simulations were performed for each combination of parameters in Table 1. As shown in the table the parameters were also summarized by a composite parameter SSAL = (sample size) x (k - 1) x (number of loci), where k is the number of alleles. Samples were simulated from populations that grew from NA = 20 to N0 = 200 and also populations that contracted from NA = 200 to N0 = 20. The samples were taken at six times,
= (0, 2, 4, 6, 8, 10). The population frequencies were simulated from a uniform Dirichlet distribution, as before. The aim of the analysis was to compare the effect of the parameters on the deviation of the joint posterior mode of NA and N0 from the value used in the simulations and also to illustrate typical posterior distributions obtained with different data sets. Ideally, of course, an analysis of the accuracy of estimators should use a larger number of replicates, but the time taken to run the MCMC precludes this. Five replicates are, however, sufficient to illustrate the general trend toward consistency in the estimator, as the amount of information in the data increases. This number was chosen because pairs of sets of five replicates could be run in parallel on a 10-node cluster of 700-Mhz Pentium 3 processors running under Linux. MCMC parameters are as described above for all simulations other than those with SSAL = 8000. In this case an IS size of 500 was used and the standard deviation (on a log scale) of the lognormal used for updating the demographic parameters was 0.1 rather than 0.5. The simulations took
4 hr for SSAL = 800 and
6 days for SSAL = 8000 (which has an IS size five times larger).
|
Examples of the joint posterior distributions for NA and N0 are shown in Fig 5. The posterior distributions are illustrated using highest posterior density (HPD) limits (as in ![]()
|
Using the mode from the joint posterior distribution as an estimator for NA and N0, the square root of the relative square error [defined as (
-
)2/N2A + (N0- N0)2/N20], referred to here as the "relative error," was calculated for each simulation and is shown plotted against SSAL in Fig 6, a and b. Each point in the figure is the relative error for a data set plotted against the SSAL value given in Table 1. Although there is substantial variability a general trend toward a reduction of the relative error with increasing SSAL can be seen. Given the variability of the results, the logarithm of the relative errors was analyzed using a linear model. In the model growth/decline was specified as a factor and the covariates (log transformed) were number of loci, number of independent alleles at each locus (k - 1), and sample size. The coefficients and standard errors were 4.96 (1.30), -0.629 (0.217), -0.517 (0.325), -0.794 (0.314), and -0.686 (0.164) for the intercept, effect of growth, number of loci, k - 1, and sample size, respectively. The effect of growth/decline was significant at P = 0.005, and the effect of the three covariates was significant at P = 0.0001. The residuals from this model were roughly normal with no obvious heteroscedasticity, although the limit on the relative errors arising from the rectangular priors has some effect on the residuals. A model with the coefficients for the three covariates forced to be -1 did not fit significantly less well than a model where the three covariates were free to vary (P = 0.21). This simple model gives the equations Relative Error = 2128/SSAL for a declining population and Relative Error = 1135/SSAL for a growing population. The fitted values from this model are shown in Fig 6. Thus the main conclusions of this analysis are: (a) there is, at least at the level of precision in this simulation study, an equivalence between the number of independent alleles, number of loci, and sample size; and (b) for the same value of SSAL the relative error is almost twice as large in a declining population in comparison with a growing population.
|
The Bayes factor favoring a model of population growth vs. decline was calculated for each MCMC simulation as the proportion of MCMC iterations where N0 > NA divided by the proportion of iterations where N0 < NA (each model has equal prior probability). The Bayes factor gives the relative likelihood of one model over the other (![]()
1600 have |log(Bayes factor)| < 2 (i.e., would be judged to be nonsignificant by conventional criteria), the great majority of results very strongly support the model under which they were generated. It should be noted that the Bayes factor is sensitive to the priors chosen, and this is discussed in more detail in the context of the example data sets analyzed below.
|
The bias in the joint estimation of NA and N0 apparent in Fig 5 does not appear to be caused by any systematic error in the estimation procedure, as judged by an examination of the coverage properties of the posterior distributions. The critical HPD P values corresponding to the true NA and N0 were estimated for each data set and plotted against the SSAL values given in Table 1 and Fig 8, a and b. If the posterior distribution was the same as the repeated sampling distribution the HPD P values should be uniformly distributed, irrespective of treatment. This will be true asymptotically when the posterior distribution approximates a multivariate normal. It can be seen that the estimated critical P values are broadly uniformly distributed, which is what would be expected under asymptotic theory. A Kolmogorov-Smirnoff one-sample test on the 60 P values shows no departure from a uniform (P = 0.93). There is no trend toward small P values with increasing SSAL, which would be expected if there was an error in the estimation procedure. The estimated critical P values for the examples in Fig 5, ad, are, respectively, 0.57, 0.29, 0.11, and 0.44. Thus, overall, the method appears to estimate changes in effective population size satisfactorily, but it is preferable to present results for the full posterior distribution rather than rely on the mode as a point estimate.
|
| ANALYSIS OF EXAMPLE DATA SETS |
|---|
To illustrate the behavior of the method on real data sets, three examples have been chosen: data from a population of Drosophila subobscura surveyed by ![]()
![]()
![]()
Drosophila:
The data were sampled from a population on Mount Parnes, 40 km north of Athens. The flies were genotyped for nine allozyme loci. The study site occupied
20,000 m2 of fir woodland at an elevation of 900 m. ![]()
![]()
150,000 individuals. These data have also been analyzed by ![]()
![]()
|
In addition, a separate analysis was carried out with Ne = NA = N0 to compare with the results obtained by ![]()
![]()
|
The mode of the posterior distribution is 449 with support limits of 253925 (in this case the support limit corresponds to the 0.922 HPD limit), which is very similar to the result of ![]()
![]()
![]()
![]()
The results for the varying population model are more in line with those of ![]()
![]()
for the second. In Fig 9 it can be seen that there is very little evidence of a change in population size. The joint mode is at NA = 337 and N0 = 890. The line of equal population sizes is well within the 90% HPD limits. The Bayes factor in favor of growth is 5.4. The marginal modes and HPD limits are 196 (57913) for NA and 726 (1124138) for N0. As noted above, the Bayes factor is sensitive to the priors chosen. Thus, for example, widening the rectangular bounds equally for both NA and N0 will tend to increase the Bayes factor, and narrowing them will decrease it. In general the tendency for the posterior distributions to reach an asymptote for large NA or N0 will cause sample size to affect the inferences. Considering three samples, as here, if, for example, the most recent sample is smaller than the oldest sample there will be greater uncertainty in N0 and the posterior distributions may be more likely to asymptote, and therefore there will be a tendency to suggest population growth, even if there is none. In fact, for the fly data, it is the oldest sample that is the smallest, and therefore this argument does not explain the broad posterior distribution for N0.
Northern pike:
Fish scale samples taken in 1961, 1977, and 1993 from Lake Escanaba, Wisconsin, were chosen from a collection of scales kept by the Wisconsin Department of Natural Resources and were genotyped at seven microsatellite loci. Five of these loci were biallelic and the remaining two were triallelic. The allele frequency counts used in the following analysis were obtained from the relat















