Genetics, Vol. 158, 1347-1362, July 2001, Copyright © 2001

Estimation of Admixture Proportions: A Likelihood-Based Approach Using Markov Chain Monte Carlo

Lounès Chikhi (a,b), Michael W. Bruford (a,c), and Mark A. Beaumont (a,d)
a Institute of Zoology, Regent's Park, London NW1 4RY, United Kingdom
b School of Biological Sciences, Queen Mary and Westfield College, University of London, London E1 4NS, United Kingdom
c School of Biosciences, Cardiff University, Cardiff CF10 3TL, United Kingdom
d School of Animal and Microbial Sciences, University of Reading, Reading RG6 6AJ, United Kingdom

Corresponding author: Lounès Chikhi, Department of Biology, University College London, Wolfson House, 4 Stephenson Way, London NW1 2HE, United Kingdom. E-mail: l.chikhi@ucl.ac.uk

Communicating editor: D. CHARLESWORTH

When populations are separated for long periods and then brought into contact for a brief episode in part of their range, the result can be genetic admixture. To analyze this type of event, we considered a simple model under which two parental populations (P_1 and P_2) mix and create a hybrid population (H). After that event, the three populations evolve under pure drift, without exchange, for T generations. We developed a new method that allows the simultaneous estimation of the time since the admixture event (scaled by population size, t_i = T/N_i, where N_i is the effective population size of population i) and the contribution of one of the two parental populations (which we call p_1). This method takes into account drift since the admixture event, variation caused by sampling, and uncertainty in the estimation of the ancestral allele frequencies. The method is tested on simulated data sets and then applied to a human data set. We find that (i) for single-locus data, point estimates are poor indicators of the real admixture proportions, even when there are many alleles; (ii) biallelic loci provide little information about the admixture proportion and the time since admixture, even for very small amounts of drift, but can be powerful when many loci are used; (iii) the precision of the parameter estimates increases with sample size (n = 50 vs. n = 200), but this effect is larger for the t_i's than for p_1; and (iv) the increase in precision provided by multiple loci is quite large, even when there is substantial drift (we found, for instance, that it is preferable to use five loci rather than one locus, even when drift is 100 times larger for the five loci). Our analysis of a previously studied human data set illustrates that the joint estimation of drift and p_1 can provide additional insights into the data.
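The admixture model summarized above (two parental populations mix in proportions p1 and 1 - p1 to form H, after which all three populations drift independently for T generations) can be illustrated with a short forward simulation. The sketch below is not the likelihood-based MCMC estimator described in the abstract; it only generates allele frequencies under the stated model. The function names, the use of Wright-Fisher resampling, and the choice of 2N gene copies per generation (diploid populations) are assumptions made for illustration.

```python
import random
from collections import Counter


def admix(freqs_p1, freqs_p2, p1):
    """Allele frequencies in H at the admixture event: p1*x1 + (1 - p1)*x2."""
    return [p1 * a + (1.0 - p1) * b for a, b in zip(freqs_p1, freqs_p2)]


def drift(freqs, n_genes, generations, rng):
    """Pure drift (assumed Wright-Fisher): resample n_genes copies per generation."""
    alleles = list(range(len(freqs)))
    for _ in range(generations):
        counts = Counter(rng.choices(alleles, weights=freqs, k=n_genes))
        freqs = [counts[a] / n_genes for a in alleles]
    return freqs


def simulate(anc_p1, anc_p2, p1, T, sizes, seed=0):
    """Simulate one locus: admixture, then T generations of independent drift.

    sizes maps "P1", "P2", "H" to effective sizes N_i (hypothetical input
    format); drift in population i is governed by t_i = T / N_i.
    """
    rng = random.Random(seed)
    start = {"P1": list(anc_p1), "P2": list(anc_p2), "H": admix(anc_p1, anc_p2, p1)}
    # Assumption: diploid populations, so 2 * N_i gene copies per generation.
    return {name: drift(f, 2 * sizes[name], T, rng) for name, f in start.items()}


if __name__ == "__main__":
    result = simulate(
        anc_p1=[0.5, 0.3, 0.2],
        anc_p2=[0.1, 0.1, 0.8],
        p1=0.3,
        T=10,
        sizes={"P1": 100, "P2": 100, "H": 50},
        seed=1,
    )
    for name, freqs in result.items():
        print(name, [round(f, 3) for f in freqs])
```

Repeating the simulation with different seeds, or with a smaller N for H, gives a feel for how quickly drift blurs the admixture signal at a single locus, which is the situation findings (i) and (iv) address.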




