Originally published as Genetics Published Articles Ahead of Print on September 14, 2008.

Genetics, Vol. 180, 1511-1524, November 2008, Copyright © 2008
doi:10.1534/genetics.108.091116

Maximum-Likelihood Estimation of Site-Specific Mutation Rates in Human Mitochondrial DNA From Partial Phylogenetic Classification

* Department of Statistics and Operations Research, Tel Aviv University, Tel Aviv, 69978 Israel, † IBM T. J. Watson Research Center, Yorktown Heights, New York 10598, ‡ Missions Program, National Geographic Society, Washington, DC 20036, § The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom, ** Molecular Medicine Laboratory, Rambam Health Care Campus, Haifa 31096, Israel

1 Corresponding author: Department of Statistics, Tel Aviv University, Tel Aviv, 69978 Israel.
E-mail: saharon@post.tau.ac.il

The mitochondrial DNA hypervariable segment I (HVS-I) is widely used in studies of human evolutionary genetics, and therefore accurate estimates of mutation rates among nucleotide sites in this region are essential. We have developed a novel maximum-likelihood methodology for estimating site-specific mutation rates from partial phylogenetic information, such as haplogroup association. The resulting estimation problem is a generalized linear model with a nonstandard link function. We develop inference and bias correction tools for our estimates and a hypothesis-testing approach for site independence. We demonstrate our methodology using 16,609 HVS-I samples from the Genographic Project. Our results suggest that mutation rates among nucleotide sites in HVS-I are highly variable. The 16,400–16,500 region exhibits significantly lower rates than other regions, suggesting potential functional constraints. Several loci identified in the literature as possible termination-associated sequences (TAS) do not yield significantly slower rates than the rest of HVS-I, casting doubt on their functional importance. Our tests do not reject the null hypothesis of independent mutation rates among nucleotide sites, supporting the use of the site-independence assumption for analyzing HVS-I. Potential extensions of our methodology include its application to the estimation of mutation rates in other genetic regions, such as Y-chromosome short tandem repeats.
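
To make the idea of maximum-likelihood rate estimation from haplogroup-level information concrete, here is a minimal sketch under a simplifying assumption that is not taken from the abstract: a site with rate lam is observed as mutated within a haplogroup of total branch length t with probability 1 - exp(-lam * t). The function names, the input format, and the branch lengths are all hypothetical, and the likelihood used in the paper itself may differ.

```python
import math

def score(lam, lengths, mutated):
    """Derivative of the assumed log-likelihood with respect to the rate lam."""
    total = 0.0
    for t, y in zip(lengths, mutated):
        if y:
            # d/dlam of log(1 - exp(-lam * t)) equals t / (exp(lam * t) - 1)
            total += t / math.expm1(lam * t)
        else:
            # d/dlam of log(exp(-lam * t)) equals -t
            total -= t
    return total

def estimate_rate(lengths, mutated, tol=1e-10):
    """Maximum-likelihood rate for one site (hypothetical helper).

    lengths: assumed total branch length of each haplogroup
    mutated: 1 if the site is observed as mutated in that haplogroup, else 0
    """
    if not any(mutated):
        return 0.0        # no mutation seen anywhere: likelihood peaks at 0
    if all(mutated):
        return math.inf   # mutation seen everywhere: likelihood is unbounded
    # The score is strictly decreasing in lam, so bisection finds its root.
    lo, hi = 1e-12, 1.0
    while score(hi, lengths, mutated) > 0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid, lengths, mutated) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # Four haplogroups of equal (assumed) branch length; three show the mutation.
    print(estimate_rate([1.0, 1.0, 1.0, 1.0], [1, 1, 1, 0]))
```

With equal branch lengths the estimate has a closed form: three mutated haplogroups out of four of length 1 give 1 - exp(-lam) = 3/4, so lam = log 4. Bisection on the score is used here only because the assumed log-likelihood is concave in lam; it is an illustrative choice, not the authors' fitting procedure.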