Originally published as Genetics Published Articles Ahead of Print on October 14, 2008.
Genetics, Vol. 180, 2175-2191, December 2008, Copyright © 2008
doi:10.1534/genetics.108.087361
The Polymorphism Frequency Spectrum of Finitely Many Sites Under Selection
Michael M. Desai* and Joshua B. Plotkin†,1
* Lewis–Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544 and
† Department of Biology and Program in Applied Mathematics and Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania 19104
1 Corresponding author: Department of Biology, University of Pennsylvania, Philadelphia, PA 19104.
E-mail: jplotkin{at}sas.upenn.edu
The distribution of genetic polymorphisms in a population contains information about evolutionary processes. The Poisson random field (PRF) model uses the polymorphism frequency spectrum to infer the mutation rate and the strength of directional selection. The PRF model relies on an infinite-sites approximation that is reasonable for most eukaryotic populations, but that becomes problematic when θ is large (θ ≳ 0.05). Here, we show that at the large mutation rates characteristic of microbes and viruses, the infinite-sites approximation of the PRF model induces systematic biases that lead it to underestimate negative selection pressures and mutation rates and to erroneously infer positive selection. We introduce two new methods that extend our ability to infer selection pressures and mutation rates at large θ: a finite-site modification of the PRF model and a new technique based on diffusion theory. Our methods can be used to infer not only a "weighted average" of selection pressures acting on a gene sequence, but also the distribution of selection pressures across sites. We evaluate the accuracy of our methods, as well as that of the original PRF approach, by comparison with Wright–Fisher simulations.
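To make the simulation benchmark concrete, the following is a minimal sketch of a finite-sites Wright–Fisher simulation that produces a sample frequency spectrum. It is an illustration only and not the authors' simulation code. The function name, the haploid population, the two-allele symmetric mutation scheme, the independence of sites, and all parameter names (`pop_size`, `num_sites`, `mu`, `s`) are assumptions made for this example.

```python
import random
from collections import Counter


def wright_fisher_sfs(pop_size, num_sites, mu, s, generations, sample_size, seed=0):
    """Illustrative (hypothetical) finite-sites Wright-Fisher simulation.

    Assumptions: haploid population, independent biallelic sites,
    symmetric mutation at rate mu per site per generation, and a
    derived allele with relative fitness 1 + s (requires s > -1).
    Returns a list whose k-th entry is the number of sites at which
    k of the sampled individuals carry the derived allele.
    """
    rng = random.Random(seed)
    counts = [0] * num_sites  # derived-allele count at each site

    for _ in range(generations):
        for j in range(num_sites):
            x = counts[j] / pop_size
            # Selection shifts the derived-allele frequency.
            x_sel = x * (1 + s) / (1 + s * x)
            # Recurrent mutation in both directions: a site can be hit
            # more than once, unlike under the infinite-sites assumption.
            p = x_sel * (1 - mu) + (1 - x_sel) * mu
            # Binomial resampling of the next generation (genetic drift).
            counts[j] = sum(rng.random() < p for _ in range(pop_size))

    # Draw the sample with replacement (a binomial approximation).
    spectrum = Counter()
    for c in counts:
        x = c / pop_size
        k = sum(rng.random() < x for _ in range(sample_size))
        spectrum[k] += 1
    return [spectrum.get(k, 0) for k in range(sample_size + 1)]


if __name__ == "__main__":
    print(wright_fisher_sfs(pop_size=100, num_sites=50, mu=0.005,
                            s=-0.01, generations=200, sample_size=10))
```

Because mutation here can act repeatedly and in both directions at the same site, a sketch like this can be run at mutation rates where an infinite-sites description would be expected to break down. The parameter values in the usage line are arbitrary.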