Originally published as Genetics Published Articles Ahead of Print on September 9, 2008.
Genetics, Vol. 180, 1095-1105, October 2008, Copyright © 2008
doi:10.1534/genetics.107.085753
A Two-Stage Pruning Algorithm for Likelihood Computation for a Population Tree
Arindam RoyChoudhury*,1, Joseph Felsenstein and Elizabeth A. Thompson
* Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138 and
Department of Genome Sciences and
Department of Statistics, University of Washington, Seattle, Washington 98195
1 Corresponding author: Wakeley Lab, 4092-4100 Biological Laboratories, 16 Divinity Ave., Harvard University, Cambridge, MA 02138.
E-mail: aroy{at}fas.harvard.edu
We have developed a pruning algorithm for likelihood estimation of a tree of populations. This algorithm enables us to compute the likelihood for large trees. Thus, it gives an efficient way of obtaining the maximum-likelihood estimate (MLE) for a given tree topology. Our method utilizes the differences accumulated by random genetic drift in allele count data from single-nucleotide polymorphisms (SNPs), ignoring the effect of mutation after divergence from the common ancestral population. The computation of the maximum-likelihood tree involves both maximizing the likelihood over branch lengths of a given topology and comparing the maximized likelihoods across topologies. Here our focus is the maximization of likelihood over branch lengths of a given topology. The pruning algorithm computes arrays of probabilities at the root of the tree from the data at the tips of the tree; at the root, the arrays determine the likelihood. The arrays consist of probabilities related to the number of coalescences and allele counts for the partially coalesced lineages. Computing these probabilities requires an unusual two-stage algorithm. Our computation is exact and avoids time-consuming Monte Carlo methods. We can also correct for ascertainment bias.
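To make the idea of "probabilities related to the number of coalescences" concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard Kingman coalescent, in which k lineages coalesce at rate k(k-1)/2 per unit of scaled drift time, and computes the distribution of the number of lineages remaining at the top of a single branch. The function name `lineage_count_distribution`, its parameters, and the use of uniformization to evaluate the probabilities are all illustrative assumptions; the abstract does not specify how these quantities are computed, and the allele-count part of the arrays is not modeled here.

```python
import math


def lineage_count_distribution(n, t, tol=1e-12):
    """Return probs, where probs[k] is the probability that k ancestral
    lineages remain after scaled time t, starting from n sampled lineages.

    Assumption: Kingman coalescent, k lineages coalesce at rate k(k-1)/2.
    Evaluated by uniformization (a Poisson-weighted discrete-time chain).
    """
    if n < 1 or t < 0:
        raise ValueError("need n >= 1 and t >= 0")
    probs = [0.0] * (n + 1)
    if n == 1 or t == 0:
        probs[n] = 1.0
        return probs

    max_rate = n * (n - 1) / 2.0  # fastest coalescence rate, used as the clock
    if max_rate * t > 500:
        # exp(-max_rate * t) would lose precision; this sketch does not rescale.
        raise ValueError("max_rate * t too large for this simple sketch")

    state = [0.0] * (n + 1)  # distribution of the uniformized chain
    state[n] = 1.0
    weight = math.exp(-max_rate * t)  # Poisson probability of 0 clock ticks
    accumulated = 0.0

    for m in range(1, 1_000_000):
        for k in range(1, n + 1):
            probs[k] += weight * state[k]
        accumulated += weight
        if 1.0 - accumulated < tol:
            break
        # One tick of the clock: k -> k-1 with probability rate(k) / max_rate.
        stepped = [0.0] * (n + 1)
        for k in range(1, n + 1):
            move = (k * (k - 1) / 2.0) / max_rate
            stepped[k] += state[k] * (1.0 - move)
            if k > 1:
                stepped[k - 1] += state[k] * move
        state = stepped
        weight *= max_rate * t / m  # Poisson probability of m clock ticks
    return probs


if __name__ == "__main__":
    for k, p in enumerate(lineage_count_distribution(5, 0.3)):
        if k >= 1:
            print(f"P({k} lineages remain) = {p:.6f}")
```

In a pruning scheme of the kind the abstract describes, arrays like this would be computed branch by branch from the tips toward the root; how they are combined with allele counts at internal nodes is the subject of the paper's two-stage algorithm and is not reproduced in this sketch.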