Population Subdivision and Molecular Sequence Variation: Theory and Analysis of Drosophila ananassae Data
Claus Vogl (a,c), Aparup Das (a), Mark Beaumont (b), Sujata Mohanty (a), and Wolfgang Stephan (a)
a Department Biologie II, Ludwig-Maximilians Universität, D-80333 München, Germany
b School of Animal and Microbial Sciences, University of Reading, Whiteknights, Reading, RG6 6AJ, United Kingdom
c Veterinärmedizinische Universität Wien, A-1210 Vienna, Austria
Corresponding author: Claus Vogl, Veterinärplatz 1, A-1210 Vienna, Austria. E-mail: vogl{at}i122server.vu-wien.ac.at
Communicating editor: M. VEUILLE
| ABSTRACT |
|---|
Population subdivision complicates analysis of molecular variation. Even if neutrality is assumed, three evolutionary forces need to be considered: migration, mutation, and drift. Simplification can be achieved by assuming that the process of migration among and drift within subpopulations occurs fast compared to mutation and drift in the entire population. This allows a two-step approach in the analysis: (i) analysis of population subdivision and (ii) analysis of molecular variation in the migrant pool. We model population subdivision using an infinite island model, where we allow the migration/drift parameter to vary among populations. Thus, central and peripheral populations can be differentiated. For inference of this parameter, we use a coalescence approach, implemented via a Markov chain Monte Carlo (MCMC) integration method that allows estimation of allele frequencies in the migrant pool. The second step of this approach (analysis of molecular variation in the migrant pool) uses the estimated allele frequencies in the migrant pool for the study of molecular variation. We apply this method to a Drosophila ananassae sequence data set. We find little indication of isolation by distance, but large differences in the migration parameter among populations. The population as a whole seems to be expanding. A population from Bogor (Java, Indonesia) shows the highest variation and seems closest to the species center.
POPULATION subdivision is centrally important for evolution and affects estimation of all evolutionary parameters from natural and domestic populations. Even if just a neutral model is considered (i.e., selection is ignored), analysis of molecular variation in subdivided populations requires modeling of at least three forces: mutation, migration, and drift. Straightforward incorporation of these three forces into a comprehensive model leads to formidable complexity and necessitates computer-intensive numerical integration schemes.
Because of its importance, many approaches to measuring population subdivision have been put forward. Among them, the infinite island model and the complementary F-statistics approach (![]()
![]()
![]()
![]()
Since the seminal work of ![]()
![]()
(![]()
= 4Nem (or 3Nem for X-linked variation) via the equation
= 1/(1 +
) (see Table 1 for abbreviations of key mathematical symbols). In the original model, it is assumed that all populations share the same migration parameter
. But, obviously, populations differ generally in their sizes or migration rates or both. ![]()
In many species with subdivided populations, the central or original populations show higher variability than the peripheral populations. This is the case for the only slightly subdivided populations of humans (![]()
![]()
![]()
![]()
![]()
D. ananassae exhibits more population structure than both D. melanogaster and D. simulans. It exists in many semi-isolated populations around the equator, particularly in mainland Southeast Asia and on the islands of the Pacific Ocean (![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
Current methods for inferring population differentiation with only a single parameter cannot capture variability among populations. With coalescence-based approaches (![]()
![]()
![]()
![]()
![]()
![]()
Herein, we follow ![]()
![]()
![]()
[or conversely the differentiation parameter
= 1/(
+ 1)] for each population separately and to estimate contributions to the pool of migrants. In the second step, we analyze variation within the pool of migrants (collecting phase), using traditional statistics for molecular sequence variation (![]()
![]()
Theory for analysis of molecular sequence data (e.g., ![]()
![]()
![]()
![]()
| MATERIALS AND METHODS |
|---|
Data collection:
Population samples:
A total of 69 isofemale lines of D. ananassae were sampled from the following eight locations (sample size in parentheses): Kakadu (3) and Darwin (6), Australia; Bogor (10), Java, Indonesia; Mandalay (10), Burma; Kathmandu (10), Nepal; and Bhubaneswar (10), Puri (10), and Chennai (10), India (Fig 1). Details on the collections of these samples will be reported elsewhere.
Identification of DNA fragments:
The data set consists of DNA sequence data from nine loci. Since DNA sequence information is limited in D. ananassae, we used genomic information from D. melanogaster to identify marker loci on the X chromosome of D. ananassae. (D. melanogaster is the only closely related species with a completely available genome sequence.) The forward and reverse primers for each fragment were designed (and kindly provided by David de Lorenzo) in adjacent exons of a randomly chosen gene to amplify the intron flanked by these exons. More than 200 fragments of the above specifications (sizes varying from 400 to 1000 bp in the D. melanogaster database) were identified and tested for amplification by PCR with genomic DNA from D. ananassae; <10% of the primer pairs actually amplified DNA fragments. For each of the successfully amplified fragments, the DNA sequence was aligned with the corresponding fragment of D. melanogaster. A fragment was used for the population genetic analysis if sequence homology to the exons of D. melanogaster at both ends of the fragment was observed and if the fragment length was 300-600 bp. Eight such fragments were identified and used in this study. An intron fragment of the Om(1D) (![]()
DNA sequencing: A single male from each isofemale line was used for genomic DNA preparation using the PUREGENE DNA isolation kit (Gentra Systems, Minneapolis) and then purified with EXOSAP-IT (USB Corporation, Cleveland). Both DNA strands were sequenced on an automated DNA sequencer (Megabace1000; Amersham Biosciences, Buckinghamshire, UK). Sequences were edited with SeqMan and aligned with MegAlign (DNAStar, Madison, WI). Manual alignment was used when required. Insertion-deletion polymorphisms were removed from sequences after alignment and thus not considered for the analyses.
Analysis of migration and drift among populations:
The main culprit for the slow computation speed of some coalescence-based approaches is the complex migration model used (![]()
![]()
![]()
![]()
i. For comparison, we also present simulation results using a splitting population model (Fig 2B). We define
θ_i as the probability that, looking back in time, two random alleles coalesce within the ith population before either of them migrates into the allele pool (Fig 3A). If applied to subdivided populations, the last part of this definition needs to be replaced by "before the split from the ancestral population" (Fig 3B).
The θ_i are drawn from a distribution. The advantage is that even if each population is rather uninformative, quite accurate estimates of the means and variances of the θ_i over populations can be obtained. We chose an independent β-distribution for each θ_i with parameters α and β. The β-distribution is flexible and can assume shapes from a wide "U" to a narrow bell, yet is controlled by only two parameters. We then have

$$\Pr(\theta_1, \ldots, \theta_I \mid \alpha, \beta) = \prod_{i=1}^{I} \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, \theta_i^{\alpha-1} (1-\theta_i)^{\beta-1} \qquad (1)$$

We note that the standard Weir and Cockerham model follows in the limit of α and β → ∞ with α/(α + β) = const. = θ.
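The role of the two hyperparameters can be illustrated with a short sketch. It evaluates the log density of independent β-distributions for a set of differentiation parameters and reports the mean and variance implied by α and β. The function names are illustrative assumptions and are not taken from the authors' program.

```python
import math


def beta_log_density(theta, alpha, beta):
    """Log density of a Beta(alpha, beta) distribution at theta in (0, 1)."""
    if not 0.0 < theta < 1.0:
        return float("-inf")
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return (log_norm
            + (alpha - 1.0) * math.log(theta)
            + (beta - 1.0) * math.log(1.0 - theta))


def log_prior(thetas, alpha, beta):
    """Sum of independent Beta log densities, one term per population."""
    return sum(beta_log_density(t, alpha, beta) for t in thetas)


def beta_mean_var(alpha, beta):
    """Mean and variance of the differentiation parameter implied by alpha, beta."""
    total = alpha + beta
    mean = alpha / total
    var = alpha * beta / (total ** 2 * (total + 1.0))
    return mean, var
```

With α/(α + β) held fixed, the variance returned by `beta_mean_var` shrinks toward zero as α and β grow, which corresponds to the single-parameter limit mentioned above.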
For further development of the model, we assume that loci are unlinked and therefore conditionally independent of each other, yet all have the same
i for each population. The latter assumption could be relaxed easily (![]()
![]()
![]()
![]()
![]()
The scattering phase is handled as follows. Conditional on the allelic proportions in the migrant pool and the hyperparameters α and β defined above, populations are independent of each other. Conditional on
i, coalescence is independent among loci. Hence, consideration of a single locus in a single population suffices for illustration, as all other loci and populations are conditionally independent. It can be shown that the number of coalescences within allelic class k is independent of other allelic classes and depends only on the migration parameter
i, the allelic proportion
k, and the current frequency in the sample nk (see Appendix): within allelic class k the decision between migration and coalescences can be made recursively. Suppose that there are currently nk individuals in class k; then the probability of the next event being a migration event is
i
k/(
i
k + nk - 1) (Appendix, Equation A6) and the probability of it being a coalescence event is its complement. Then nk is reduced by one and the procedure repeated until only one is left that is a migrant by default.
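The recursive decision between migration and coalescence within one allelic class can be sketched as follows. Because the original symbols are not legible in this copy, `migration` (the population's migration parameter) and `proportion` (the proportion of allele k in the migrant pool) are placeholder names, and the function is a minimal illustration rather than the authors' implementation.

```python
import random


def count_migrants(n_k, migration, proportion, rng):
    """Number of the n_k sampled lineages of one allelic class that trace
    back to the migrant pool; the others coalesce within the population."""
    if n_k == 0:
        return 0
    migrants = 0
    remaining = n_k
    while remaining > 1:
        weight = migration * proportion
        # probability that the next event in this class is a migration
        p_migration = weight / (weight + remaining - 1)
        if rng.random() < p_migration:
            migrants += 1
        # either event removes one lineage from the class
        remaining -= 1
    # the last remaining lineage is a migrant by default
    return migrants + 1
```

For example, `count_migrants(10, 5.0, 0.3, random.Random(1))` returns a value between 1 and 10; a migration parameter of zero always yields exactly one migrant per observed allelic class.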
Relevant parameters for calculation of allelic proportions in the migrant pool are the allele frequencies for each locus. These must be estimated from only those alleles within the I populations that originated directly from the migrant pool, whereas alleles that arose through sampling within populations (or, looking backward, through coalescence within a population) must be ignored. Otherwise, estimation of allelic proportions in the migrant pool is straightforward.
For integration, we employ a MCMC approach (![]()
i and the vector of inferred allelic proportions in the migrant pool {
1, ... ,
K}, we update the coalescence history for each population and locus; conditionally on the coalescence histories for each population, we update the allelic proportions in the migrant pool {
1, ... ,
K} for each locus; conditionally on the allelic proportions in the migrant pool {
1, ... ,
K}, the observed allele frequencies {n_1, ..., n_K}, and the two hyperparameters α and β, we update θ_i for each population; and, finally, conditionally on the θ_i's we update the hyperparameters α and β. This updating scheme converges relatively quickly such that, with reasonable starting conditions, good approximations to the posterior distribution may already be obtained after a "burn-in" period of approximately 1000 iterations.
For validation of our method and the computer program, we simulated data that were sampled from five populations distributed with varying θ_i (Table 2). The model parameters (Fst or θ_i) can be tuned such that probabilities of coalescence within populations are identical to first order for the infinite island model and the splitting population model. Parameters were five populations with θ_i from 0.1 to 0.9 and 20 loci. With the infinite island model, we observed a bias toward lower values; with the splitting population model the bias reverses for higher θ_i. Despite the bias, our estimator definitely improved on the Weir and Cockerham estimator.
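A simulation of this kind can be approximated with a small sketch for the infinite island model. It assumes that population allele frequencies follow a Dirichlet distribution around the migrant-pool proportions, with total concentration equal to a migration parameter obtained from the differentiation parameter as 1/θ - 1. The function name and arguments are hypothetical, and this is not the authors' simulation program.

```python
import random


def simulate_counts(n, theta, pool_proportions, rng):
    """Draw allele counts for one population sample of size n."""
    migration = 1.0 / theta - 1.0  # assumed relation theta = 1 / (1 + migration)
    # population allele frequencies ~ Dirichlet(migration * pool proportions)
    gammas = [rng.gammavariate(migration * p, 1.0) for p in pool_proportions]
    total = sum(gammas)
    freqs = [g / total for g in gammas]
    # multinomial sample of n alleles from those frequencies
    counts = [0] * len(freqs)
    for allele in rng.choices(range(len(freqs)), weights=freqs, k=n):
        counts[allele] += 1
    return counts
```

Calling `simulate_counts(10, 0.2, [0.5, 0.3, 0.2], random.Random(7))` gives one vector of three allele counts summing to 10; repeating the call over loci and populations with different θ values produces a data set of the shape described above.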
Analysis of molecular variation within the migrant pool:
The MCMC procedure described above provides, for each iteration, a sample of alleles that make it into the migrant pool. The collection of such samples over a run approximates the posterior frequencies of alleles in the migrant pool. Hence, we determine statistics describing molecular variation, the θ_w (not to be mistaken with the population differentiation parameter θ above),
(![]()
![]()
The variance of the estimators in the posterior distribution is only part of the total variance. In addition, there is the familiar variation of the estimates due to coalescence, drift, and sampling within the entire population (![]()
![]()
Strictly, the Bayesian approach we adopt does not allow for calculation of confidence intervals or significance. We follow conventional statistics, however, and declare results significant if the 0.95 posterior intervals do not overlap the expected value or if two 0.95 posterior intervals do not overlap.
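The decision rule based on 0.95 posterior intervals can be sketched from a list of MCMC draws. The quantile convention (simple order statistics) and all function names are assumptions made for illustration.

```python
import math


def posterior_interval(samples, mass=0.95):
    """Central posterior interval taken from sorted MCMC draws."""
    ordered = sorted(samples)
    tail = (1.0 - mass) / 2.0
    last = len(ordered) - 1
    lower = ordered[math.floor(tail * last)]
    upper = ordered[math.ceil((1.0 - tail) * last)]
    return lower, upper


def excludes_value(interval, expected):
    """True if the interval does not contain the expected value."""
    lower, upper = interval
    return expected < lower or expected > upper


def intervals_overlap(a, b):
    """True if two intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]
```

Under this rule a statistic such as D would be declared significantly negative when `excludes_value(interval, 0.0)` is true and the whole interval lies below zero.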
For validation of the method, we performed computer simulations assuming both the infinite island and the splitting population model. Parameters were 10 populations, 20 loci with a finite site mutation model with 500 sites each, a mutation rate of 7.5 × 10^-6, and a population size of 10^4 in the migrant pool or the ancestral population (for the infinite island model and the splitting population model, respectively). Note that, because the model is a finite site model rather than the more usual infinite site model, the expectation of
is
0.0013 (and thus slightly <0.0015) and the expectation for TAJIMA's (1989) D is slightly negative (
-0.1). From this ancestral population/gene pool, 10 subpopulations were sampled with (A) θ_i = 0.1, (B) θ_i = 0.2, and (C) θ_i = 0.5, using both the infinite island model and the splitting population model.
In Table 3, we find that the estimated θ_i is biased toward low values in the infinite island model and, generally, toward high values in the splitting population model, similar to the simulations with unequal population sizes above. Averages across populations of the estimators of population variation π and θ_w are lower than the true values in the ancestral population/gene pool and D is positive. This is expected, as drift reduces molecular variation, affecting θ_w more than π. Our new estimators of π, θ_w, and D in the migrant pool are very close to the true values. All these results hold true for both the infinite island and the splitting population model. In fact, we find few differences between these two models.
| DATA ANALYSIS |
|---|
We analyzed a sample of eight D. ananassae populations from Asia and Australia (Fig 1), where nine loci were sequenced in up to 10 isofemale lines per population. Pairwise Fst values for the eight populations (Table 4) show values between 0.04 and 0.19. The two Australian populations (Kakadu and Darwin) are most closely related; otherwise there is little evidence for isolation by distance. In particular, the two Indian populations of Bhubaneswar and Puri have an intermediate Fst of 0.09, although they are separated by only 50 km. The ![]()
Descriptive statistics of molecular variation (π, θ_w, and D) are given in Table 5. Populations with the highest molecular variation are Kakadu (Australia) and Bogor (Indonesia) with θ_w on average approximately 0.013. The Kakadu population has the smallest sample size and, accordingly, the highest variation among loci. Hence, the very high values for loci 1 and 2 may be statistical flukes. The overall high values for Bogor, on the other hand, seem quite trustworthy. The other Australian population (Darwin) and the Indian populations have lower levels of molecular variation, with θ_w of approximately 0.005. The D-statistic is mostly neutral, but Kathmandu (Nepal), Mandalay (Burma), and Bogor (Indonesia) have increasingly negative D's.
With our two-step analysis, we performed a single MCMC run, where 10^5 iterations were sampled after a burn-in period of 10^4 iterations. For each population, the mean θ_i and quantiles were calculated from the approximate posterior distribution (A5). Estimates of θ_i vary considerably among populations (see Table 6). The Bogor population is the most variable and least differentiated from the migrant pool, with a θ_i of approximately 0.02, whereas, at the opposite end, the population from Chennai in Southern India has a θ_i of approximately 0.25. Thus, the Bogor population seems closest to the species center and the Chennai population most peripheral. The average θ_i is approximately
0.11, slightly higher than the estimate according to ![]()
The molecular variation among migrants (Table 7) is higher than that in any of the populations. This is to be expected, as our method restores the variation that is reduced due to drift within populations. The large differences between π and θ_w are unexpected, though: on average, values for θ_w are much higher than those for π. Correspondingly, TAJIMA's (1989) D-statistic is very negative, on average -1.40. While only one individual D-value is statistically significantly negative, there is a clear trend toward negative values and the mean D-value is significantly negative.
| DISCUSSION |
|---|
Population subdivision is centrally important to evolution. Unfortunately, it complicates analysis of molecular variation. Herein, we follow ![]()
![]()
![]()
![]()
Our method of modeling the process of migration among and drift within subpopulations with the infinite island model is similar to those developed by ![]()
![]()
![]()
i and assume that the
i's are drawn from a common distribution. [This is similar to the random-effects model of WEIR and COCKERHAM (1984).] In contrast to the method developed by ![]()
We use an MCMC scheme, mainly Gibbs sampling, for integration. We validate the model with computer simulations. If the infinite island model is used for simulations, the statistical analysis performs very well, as expected. But even if a different model with splitting populations instead of the infinite island model is used for simulations, estimates of θ_i are quite accurate. Our analysis therefore seems quite robust to violations of assumptions as long as the population differentiation parameters θ_i remain the same. In the second step of our analysis, we analyze the allele spectrum within the migrant population, using traditional estimates of population variation and deviation from mutation-drift equilibrium.
The process of estimating statistics of molecular variation in the migrant pool/ancestral population described above can be thought of as a way of removing the effect of drift within subpopulations. The statistics therefore can be used exactly like those from a sample of an undivided population. A negative D, for example, may indicate population expansion.
With this method, we analyzed molecular variation from eight populations of D. ananassae from South Asia, Southeast Asia, and Australia. In tropical and subtropical regions of the world, D. ananassae is one of the most common Drosophila species, especially in and around human habitations. Although populations are separated by major geographical barriers such as mountains and oceans, recurrent transportation by human activity may lead to gene exchange. In spite of this, however, earlier studies with molecular genetic markers detected significant population subdivision with a different method (![]()
![]()
![]()
![]()
![]()
was
0.1. Migration parameters vary a lot among populations from a very low value of 0.02 in Bogor (Java, Indonesia) to values an order of magnitude higher in some Australian and Indian populations. These estimates suggest that the Bogor population is close to the species center, while the Australian and Indian populations are peripheral. Furthermore, we observe little isolation by distance in this data set: the Indian populations of Puri and Bhubaneswar are separated by only 50 km, yet share as much genetic variation as two random populations. At ranges different from those covered in this data set, isolation by distance may still be observed in D. ananassae, and appropriate adjustments need to be made in the method.
As with population differentiation, levels of molecular variation are quite variable among populations: the Bogor population is more than twice as variable as some Indian populations. In light of the previous analyses, the Indian populations seem to show reduction of variation due to drift. This is consistent with values of TAJIMA's (1989) D-statistics observed in these populations: the Indian populations show neutral or slightly positive D, while the Bogor population is negative. In a substructured population, drift leads to positive D values. For consistency with the negative D of the Bogor population, a migrant pool with negative D needs to be postulated. This may be caused by rapid population expansion. The process of drift in peripheral populations may then push these negative values back toward neutral or slightly positive values. Thus, in some Indian populations, two processes pushing in opposite directions might cancel out and lead to neutral or slightly positive D. In our two-step analysis, the allele spectrum in the migrant pool has an even more negative D than the sample from Bogor. Simulation results indicate, however, that this may be partly an artifact: in expanding populations, estimates of the D-statistic are increasingly negative with increasing sample size (data not shown).
From a biological point of view, our observation of negative values of the D-statistic for the Bogor population is particularly interesting, as our results also suggest that this population is close to the species center of D. ananassae. This observation contradicts the conventional notion (assumption) that ancestral populations are in an approximate equilibrium. A similar case was recently reported by ![]()
| ACKNOWLEDGMENTS |
|---|
We thank two reviewers and the editor for their comments on the manuscript, our colleagues at the evolutionary biology group at Ludwig-Maximilians Universität for discussion and critical reading of the manuscript, David de Lorenzo for designing the primers we used, and the Deutsche Forschungsgemeinschaft (grant no. STE 325/4-1) for financial support.
Manuscript received February 5, 2003; Accepted for publication August 6, 2003.
| APPENDIX |
|---|
Consider a haploid infinite island model and assume that the diffusion approximation or the Moran model holds. Since we assume that loci are unlinked and therefore conditionally independent, consideration of one locus suffices. Assume that the locus has K alleles in the pool of migrants; some of these may be missing in a sample from a particular population. The allelic proportions in the pool of migrants {
1, ... ,
K} are constant over time, but unknown and need to be estimated. The main parameter of interest is the migration parameter for each population,
i, or equivalently the probability of two random alleles to coalesce within the population before migration,
i = 1/(
i + 1). Given the allelic proportions in the pool of migrants, estimation of
i or, equivalently,
i is independent among populations. We thus leave out the indices i and l for population and locus, respectively. Data consist of allele frequencies in the sample, denoted by {n1, ... , nK}.
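The one-to-one relation between the migration parameter and the differentiation parameter stated above can be checked numerically. The function names below are illustrative assumptions.

```python
def theta_from_migration(migration):
    """Probability that two alleles coalesce before either one migrates."""
    return 1.0 / (migration + 1.0)


def migration_from_theta(theta):
    """Inverse relation: migration parameter implied by a differentiation value."""
    return 1.0 / theta - 1.0
```

Under this relation, the differentiation values of approximately 0.02 (Bogor) and 0.25 (Chennai) reported in the data analysis correspond to migration parameters of about 49 and 3, respectively.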
The likelihood:
The distribution of the data {n1, ... , nK} given the allelic proportions in the migrant pool {
1, ... ,
K} and the migration parameter
is a Dirichlet-multinomial distribution (see ![]()
![]()
![]() |
(A1) |
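Since the equation image is not available in this copy, the following sketch evaluates a Dirichlet-multinomial likelihood in its standard form, with concentration parameters given by the migration parameter times the migrant-pool proportions. The parameterization and the names `migration` and `proportions` are assumptions; whether the authors include the multinomial coefficient cannot be confirmed from the text.

```python
import math


def dirichlet_multinomial_log_pmf(counts, proportions, migration):
    """Log probability of allele counts given pool proportions and migration."""
    n = sum(counts)
    log_p = math.lgamma(n + 1) + math.lgamma(migration) - math.lgamma(n + migration)
    for n_k, p_k in zip(counts, proportions):
        a_k = migration * p_k  # concentration parameter of allelic class k
        log_p += math.lgamma(n_k + a_k) - math.lgamma(a_k) - math.lgamma(n_k + 1)
    return log_p
```

For a sample of a single allele, the probability reduces to the proportion of that allele in the migrant pool, which is a convenient sanity check.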
The coalescence history:
For simulating the coalescence history backward in time, we consider an infinitesimally short time interval,
. Two types of events may happen in this interval: a migration event and a coalescent event. With a coalescent event, two lineages with the same allele collapse into one. With a migration event, a lineage is substituted by another one by migration. Since migration is independent of the allelic type, no information on which allele was in the lineage before the migration event is available. Hence, the lineage is exchangeable with all other lineages in the population for which no data are available. The lineage is thus dropped from the analyses exactly as one is dropped through coalescence.
Subsequently we need the posterior distribution of the type of the next allele sampled given the previously sampled allelic types {n1, ... , nK} and the allelic proportions in the migrant pool {
1, ... ,
K}. This distribution has been derived previously (![]()
![]()
![]()
![]() |
(A2) |
which corresponds to formula (16) in ![]()
We index the sequence of coalescence or migration events with s. Obviously, the last event must be a migration event. Assume that the lineages remaining in the analysis after the sth coalescence or migration event are {n1(s), ... , nK(s)}. Introduce another characteristic vector {y1(s), ... , yK(s)} as {x1(s), ... , xK(s)}. The probability of a lineage being reduced by one by migration to {n1(s) - x1(s), ... , nk(s) - xk(s)} within the interval
given {n1(s), ... , nk(s)}, {
1, ... ,
K}, and
is
![]() |
(A3) |
where we took away the index s for notational convenience, and
y is the sum over all possible vectors {y1(s), ... , yk(s)}. Compare this equation to the first part of Equation 15 in ![]()
The probability of a lineage being reduced by one by coalescence to {n1(s) - x1(s), ... , nK(s) - xK(s)} within the interval
given {n1(s), ... , nK(s)} and
is
![]() |
(A4) |
where we again left out the index s for notational convenience. Compare this equation to the second part of Equation 15 in ![]()
Summing the relevant terms, one realizes that the posterior probability of class k to be chosen for either a migration or a coalescence event is nk/n. Thus, if we again use the characteristic vector {x1, ... , xk}, the probability of class k to be chosen is a Bernoulli distribution
![]() |
(A5) |
Given that class k is chosen, the conditional probability that the event is a coalescence or a migration is

and
![]() |
(A6) |
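The conditional probabilities described around Equation A6 can be written out under assumed notation. Because the symbols were lost in this copy, $M$ (migration parameter), $p_k$ (proportion of allele $k$ in the migrant pool), and $n = \sum_k n_k$ are placeholder names and not necessarily the authors' symbols. A sketch of the reasoning starts from the urn form of the Dirichlet-multinomial distribution:

$$\Pr(\text{next allele is of class } k \mid n_1, \ldots, n_K) = \frac{n_k + M p_k}{n + M}$$

In this expression, the term $M p_k$ corresponds to a new draw from the migrant pool and $n_k$ to a copy of an allele already present. Looking backward at one of the $n_k$ lineages of class $k$, the other $n_k - 1$ lineages are the possible coalescence partners, so that

$$\Pr(\text{migration} \mid k) = \frac{M p_k}{M p_k + n_k - 1}, \qquad \Pr(\text{coalescence} \mid k) = \frac{n_k - 1}{M p_k + n_k - 1}.$$

For $n_k = 1$ the migration probability is one, in line with the statement in the main text that the last remaining lineage is a migrant by default.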
As the temporal sequence of coalescences is irrelevant, each allelic class may be treated independently. The number of coalescences relative to the number of migrations within allelic class k is thus independent of the other classes and depends only on the frequency nk, the allelic proportions in the migrant pool
k, and the migration parameter
.
The coalescence history can thus be inferred by simulating backward until no more lineages remain in the population. For each locus, this requires at most ni - 1 samples from a Bernoulli distribution from the ith population. Records of the allelic types of all migrants need to be kept for updating the estimates of allelic proportions within the migrant pool.
The migration parameter:
Updating
or
involves sampling from a nonstandard distribution. [Note that even though we currently do not index
, we refer to the population-specific migration parameter and not to the ![]()
given the hyperparameters α and β in Equation 1 and the likelihood of Equation A1 multiplied over all loci. If a flat prior for
is used, the prior distribution is not proper; i.e., no constant can be found such that it integrates to one. Reparameterizing
= 1/(1 +
) assures a proper posterior distribution with flat priors. Introducing the index l for the loci, we then have
![]() |
(A7) |
With a Metropolis-Hastings step, it is possible to sample from this distribution by using a proposal or jumping distribution j(
|x) (![]()
old and the new value sampled from j(
|x) is
new, then
new is accepted in favor of retaining
old with probability one if the ratio
![]() |
(A8) |
is greater than one, and with probability a otherwise.
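The acceptance rule can be illustrated with a minimal Metropolis update for a parameter restricted to the unit interval. This sketch assumes a symmetric uniform proposal, so that the Hastings correction cancels and only the ratio of target densities remains; the authors' actual proposal distribution is not specified in this copy, and the function name is hypothetical.

```python
import math
import random


def metropolis_step(theta_old, log_target, rng, step=0.1):
    """One Metropolis update; log_target returns the unnormalized log posterior."""
    theta_new = theta_old + rng.uniform(-step, step)  # symmetric proposal
    if not 0.0 < theta_new < 1.0:
        return theta_old  # target density is zero outside (0, 1)
    log_ratio = log_target(theta_new) - log_target(theta_old)
    # accept with probability one if the ratio exceeds one,
    # otherwise with probability equal to the ratio
    if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
        return theta_new
    return theta_old
```

In the setting of this appendix, `log_target` would be the sum of the log prior for the differentiation parameter and the log likelihood over loci; working on the log scale avoids numerical underflow when many loci are multiplied.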
The hyperparameters:
Updating the hyperparameters α and β involves sampling from a nonstandard distribution proportional to Equation 1. Again we employ a Metropolis-Hastings step.
Allelic proportions in the migrant pool:
To get a new estimate of the allele frequencies in the migrant pool {
1, ... ,
K}, the sum over all populations of allelic frequencies in each allelic class that enter the migrant pool needs to be determined. Given
, the probability of this sum of alleles is multinomial. The new allele proportions {
1, ... ,
K} can thus be sampled from the sum of allelic frequencies using a Dirichlet distribution, where a flat or other Dirichlet prior distribution may be used.
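The Dirichlet update of the migrant-pool proportions can be sketched with gamma variates from the standard library. The argument `migrant_counts` stands for the summed numbers of migrant lineages per allelic class, and `prior=1.0` corresponds to a flat Dirichlet prior; both names are assumptions for illustration.

```python
import random


def sample_pool_proportions(migrant_counts, rng, prior=1.0):
    """Draw allelic proportions from Dirichlet(migrant_counts + prior)."""
    # a Dirichlet draw is a vector of independent gamma variates, normalized
    gammas = [rng.gammavariate(c + prior, 1.0) for c in migrant_counts]
    total = sum(gammas)
    return [g / total for g in gammas]
```

A call such as `sample_pool_proportions([9000, 1000], random.Random(3))` returns two proportions close to 0.9 and 0.1 that sum to one.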
The sampling scheme:
We alternate between (i) sampling the coalescence history recursively using Equation A6 conditional on the migrant allelic proportions
and the differentiation parameter
i for each population and locus, (ii) sampling the allelic proportions {
1, ... ,
K} in the migrant pool conditional on the coalescence histories in all populations using a Dirichlet distribution for each locus, (iii) sampling the population differentiation parameters θ_i from the Dirichlet-multinomial distribution conditional on the observed allele frequencies and the hyperparameters using (A7), and (iv) sampling the hyperparameters α and β conditional on the population differentiation parameters θ_i. This updating scheme converges to the joint posterior distribution of the parameters. From the joint posterior all marginal posterior distributions can be obtained easily. Furthermore, the sample of individuals that reach the migrant pool is used for calculating the sequence variation statistics once per 100 cycles.
| LITERATURE CITED |
|---|
BALDING, D. and R. NICHOLS, 1995 A method for quantifying differentiation between populations at multi-allelic loci and its implications for investigating identity and paternity. Genetica 96:3-12.
BALDING, D. and R. NICHOLS, 1997 Significant genetic correlations among caucasians at forensic DNA loci. Heredity 78:583-589.
BEERLI, P. and J. FELSENSTEIN, 1999 Maximum-likelihood estimation of migration rates and effective population numbers in two populations using a coalescence approach. Genetics 152:763-773.
BEERLI, P. and J. FELSENSTEIN, 2001 Maximum likelihood estimation of a migration matrix and effective population sizes in subpopulations by using a coalescent approach. Proc. Natl. Acad. Sci. USA 98:4563-4568.
CHEN, Y. B., J. MARSH, and W. STEPHAN, 2000 Joint effect of natural selection and recombination on gene flow between Drosophila ananassae populations. Genetics 155:1185-1194.
DAVID, J. and P. CAPY, 1988 Genetic variation of Drosophila melanogaster natural populations. Trends Genet. 4:106-111.
DOBZHANSKY, T. and A. DREYFUS, 1943 Chromosomal aberrations in Brazilian Drosophila ananassae. Proc. Natl. Acad. Sci. USA 29:368-375.
EXCOFFIER, L., 2001 Analysis of population subdivision, pp. 271-307 in Handbook of Statistical Genetics, edited by D. BALDING, M. BISHOP and C. CANNINGS. Wiley, New York.
GELMAN, A., J. CARLIN, H. STERN and D. RUBIN, 1995 Bayesian Data Analysis. Chapman & Hall, London/New York.
HAMBLIN, M. and M. VEUILLE, 1999 Population structure among African and derived populations of Drosophila simulans: evidence for ancient subdivision and recent admixture. Genetics 153:305-317.
HAWKES, J., 1990 The Potato: Evolution, Biodiversity and Genetic Resources. Belhaven Press, London.
MCEVEY, S. F., J. R. DAVID, and L. TSACAS, 1987 The Drosophila ananassae complex with description of a new species from French Polynesia (Diptera: Drosophilidae). Ann. Soc. Entomol. 23:377-385.
NICHOLSON, G., A. SMITH, F. JONSSON, O. GUSTAFSSON, and K. STEFANSSON et al., 2002 Assessing population differentiation and isolation from single-nucleotide polymorphism data. J. R. Stat. Soc. B 64:1-21.
PIPERNO, D. and K. FLANNERY, 2001 The earliest archeological maize (Zea mays L.) from highland Mexico: new accelerator mass spectrometry dates and their implications. Proc. Natl. Acad. Sci. USA 98:2101-2103.
ROSENBERG, N. A., J. K. PRITCHARD, J. L. WEBER, H. M. CANN, and K. K. KIDD et al., 2002 Genetic structure of human populations. Science 298:2381-2385.
ROUSSET, F., 2001 Inference from spatial population genetics, pp. 239-269 in Handbook of Statistical Genetics, edited by D. BALDING, M. BISHOP and C. CANNINGS. Wiley, New York.
STEPHAN, W., 1989 Molecular genetic variation in the centromeric region of the X chromosome in three Drosophila ananassae populations. II. The Om(1D) locus. Mol. Biol. Evol. 6:624-635.
STEPHAN, W. and C. H. LANGLEY, 1989 Molecular genetic variation in the centromeric region of the X chromosome in three Drosophila ananassae populations. I. Contrasts between the vermilion and forked loci. Genetics 121:89-99.
STEPHAN, W. and S. J. MITCHELL, 1992 Reduced levels of DNA polymorphism and fixed between-population differences in the centromeric region of Drosophila ananassae. Genetics 132:1039-1045.
STEPHAN, W., L. XING, D. A. KIRBY, and J. M. BRAVERMAN, 1998 A test of the background selection hypothesis based on nucleotide data from Drosophila ananassae. Proc. Natl. Acad. Sci. USA 95:5649-5654.
STEPHENS, M. and P. DONNELLY, 2000 Inference in molecular population genetics. J. R. Stat. Soc. B 62:605-655.
TAJIMA, F., 1983 Evolutionary relationship of DNA sequences in finite populations. Genetics 105:437-460.
TAJIMA, F., 1989 Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.
TOBARI, Y. N., 1993 Drosophila ananassae: Genetical and Biological Aspects. Japan Scientific Societies Press, Tokyo.
WAKELEY, J., 1998 Segregating sites in Wright's island model. Theor. Popul. Biol. 53:166-174.
WAKELEY, J., 1999 Nonequilibrium migration in human history. Genetics 153:1863-1871.
WAKELEY, J., 2001 The coalescent in an island model of population subdivision with variation among demes. Theor. Popul. Biol. 59:133-144.
WATTERSON, G., 1975 On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7:256-276.
WEIR, B. and C. COCKERHAM, 1984 Estimating F-statistics for the analysis of population structure. Evolution 38:1358-1370.
WRIGHT, S., 1931 Evolution in Mendelian populations. Genetics 16:97-159.
WRIGHT, S., 1969 Evolution and the Genetics of Populations. II. The Theory of Gene Frequencies. University of Chicago Press, Chicago.


Figure 2B. The splitting population model, where an ancestral population subdivides into subpopulations of effective population size Nei that evolve in isolation for t generations.










