TY - JOUR
T1 - Usefulness of Single Nucleotide Polymorphism Data for Estimating Population Parameters
JF - Genetics
JO - Genetics
SP - 439
LP - 447
VL - 156
IS - 1
AU - Kuhner, Mary K.
AU - Beerli, Peter
AU - Yamato, Jon
AU - Felsenstein, Joseph
Y1 - 2000/09/01
UR - http://www.genetics.org/content/156/1/439.abstract
N2 - Single nucleotide polymorphism (SNP) data can be used for parameter estimation via maximum likelihood methods as long as the way in which the SNPs were determined is known, so that an appropriate likelihood formula can be constructed. We present such likelihoods for several sampling methods. As a test of these approaches, we consider use of SNPs to estimate the parameter Θ = 4N_eμ (the scaled product of effective population size and per-site mutation rate), which is related to the branch lengths of the reconstructed genealogy. With infinite amounts of data, ML models using SNP data are expected to produce consistent estimates of Θ. With finite amounts of data the estimates are accurate when Θ is high, but tend to be biased upward when Θ is low. If recombination is present and not allowed for in the analysis, the results are additionally biased upward, but this effect can be removed by incorporating recombination into the analysis. SNPs defined as sites that are polymorphic in the actual sample under consideration (sample SNPs) are somewhat more accurate for estimation of Θ than SNPs defined by their polymorphism in a panel chosen from the same population (panel SNPs). Misrepresenting panel SNPs as sample SNPs leads to large errors in the maximum likelihood estimate of Θ. Researchers collecting SNPs should collect and preserve information about the method of ascertainment so that the data can be accurately analyzed.
ER -