Originally published as Genetics Published Articles Ahead of Print on September 15, 2004.
Genetics, Vol. 168, 2373-2382, December 2004, Copyright © 2004
doi:10.1534/genetics.104.031039
Reconstituting the Frequency Spectrum of Ascertained Single-Nucleotide Polymorphism Data
Rasmus Nielsen*,1, Melissa J. Hubisz* and Andrew G. Clark
* Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853
Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York
Center for Bioinformatics, University of Copenhagen, 2100 Copenhagen, Denmark
1 Corresponding author: Center for Bioinformatics, Universitetsparken 15, 2100 Kbh Ø, Denmark.
E-mail: rasmus{at}binf.ku.dk
Most of the available SNP data have eluded valid population genetic analysis because most population genetical methods do not correctly accommodate the special discovery process used to identify SNPs. Most of the available SNP data have allele frequency distributions that are biased by the ascertainment protocol. We here show how this problem can be corrected by obtaining maximum-likelihood estimates of the true allele frequency distribution. In simple cases, the ML estimate of the true allele frequency distribution can be obtained analytically, but in other cases computational methods based on numerical optimization or the EM algorithm must be used. We illustrate the new correction method by analyzing some previously published SNP data from the SNP Consortium. Appropriate treatment of SNP ascertainment is vital to our ability to make correct inferences from the data of the International HapMap Project.
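As a rough illustration of the analytical case mentioned above, the following Python sketch corrects an observed frequency spectrum under one simple ascertainment scheme. The scheme is an assumption made here for illustration and is not spelled out in the abstract: a SNP is taken to be discovered only if it is polymorphic in a random subsample of `d` of the `n` genotyped chromosomes. The function names and the toy counts are hypothetical.

```python
from math import comb


def ascertainment_prob(x, n, d):
    """Probability that a site carrying x copies of an allele among n
    chromosomes is polymorphic in a random subsample of d chromosomes
    (assumed discovery scheme, see lead-in)."""
    if not 2 <= d <= n:
        raise ValueError("d must satisfy 2 <= d <= n")
    # The site is missed if all d sampled chromosomes carry the same allele.
    monomorphic = comb(x, d) + comb(n - x, d)
    return 1.0 - monomorphic / comb(n, d)


def corrected_spectrum(counts, n, d):
    """counts[x - 1] is the number of ascertained SNPs with x copies,
    for x = 1..n-1. Returns the estimated true frequency spectrum,
    obtained by weighting each class by 1 / Pr(ascertained | x)
    and renormalizing."""
    if len(counts) != n - 1:
        raise ValueError("counts must have n - 1 entries")
    weights = [c / ascertainment_prob(x, n, d)
               for x, c in enumerate(counts, start=1)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Toy data: 4 chromosomes genotyped, discovery panel of 2.
    print(corrected_spectrum([10, 20, 10], n=4, d=2))
```

Under this assumed scheme, rare alleles are the least likely to be discovered, so the correction shifts weight back toward the low- and high-count classes. More complex discovery protocols would need the numerical optimization or EM approaches the abstract refers to.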





