Originally published as Genetics Published Articles Ahead of Print on February 1, 2008.
Genetics, Vol. 178, 1817-1829, March 2008, Copyright © 2008
doi:10.1534/genetics.107.081281
Bayesian Variable Selection for Detecting Adaptive Genomic Differences Among Populations
Andrea Riebler*,1,
Leonhard Held* and
Wolfgang Stephan†
* Biostatistics Unit, Institute of Social and Preventive Medicine, University of Zurich, CH-8001 Zurich, Switzerland and
† Section of Evolutionary Biology, Department of Biology II, University of Munich, D-82152 Planegg-Martinsried, Germany
1 Corresponding author: Biostatistics Unit, Institute of Social and Preventive Medicine, University of Zurich, Hirschengraben 84, CH-8001 Zurich, Switzerland.
E-mail: andrea.riebler{at}ifspm.uzh.ch
We extend an F_ST-based Bayesian hierarchical model, implemented via Markov chain Monte Carlo, for the detection of loci that might be subject to positive selection. This model divides the factors influencing F_ST into locus-specific effects, population-specific effects, and effects specific to each combination of locus and population. We introduce a Bayesian auxiliary variable for each locus effect to automatically select nonneutral locus effects. As a by-product, a reparameterization of the model improves the efficiency of the original approach. The statistical power of the extended algorithm is assessed with simulated data sets from a Wright–Fisher model with migration. The results suggest that including model selection clearly improves discrimination, as measured by the area under the receiver operating characteristic (ROC) curve. Additionally, we illustrate and discuss the quality of the newly developed method on the basis of an allozyme data set of the fruit fly Drosophila melanogaster and a sequence data set of the wild tomato Solanum chilense. For data sets with small sample sizes, high mutation rates, and/or long sequences, however, methods based on nucleotide statistics should be preferred.
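The decomposition and the ROC-based evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a logistic link between F_ST and the sum of the three effects, which the abstract does not specify, and it treats the auxiliary variable as a simple on/off switch for the locus effect. The names `fst_from_effects`, `include_locus`, and `roc_auc` are hypothetical, as is the idea of scoring each locus by some selection score such as a posterior inclusion probability.

```python
import math


def fst_from_effects(alpha, beta, gamma=0.0, include_locus=True):
    """Map additive effects to an F_ST value in (0, 1).

    alpha: locus-specific effect
    beta:  population-specific effect
    gamma: locus-by-population effect
    include_locus: stands in for the auxiliary indicator variable; when
        False, the locus effect is dropped (the locus is treated as neutral).

    ASSUMPTION: a logistic link is used here for illustration only.
    """
    locus_term = alpha if include_locus else 0.0
    eta = locus_term + beta + gamma
    return 1.0 / (1.0 + math.exp(-eta))


def roc_auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic.

    scores: one score per locus (higher means "more likely selected")
    labels: 1 for truly selected loci, 0 for neutral loci
    Returns the probability that a randomly chosen selected locus
    outranks a randomly chosen neutral locus; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one selected and one neutral locus")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical effect values, chosen only to show the switch in action.
    print(fst_from_effects(alpha=2.0, beta=-1.0))
    print(fst_from_effects(alpha=2.0, beta=-1.0, include_locus=False))
    # Hypothetical scores for four simulated loci (two selected, two neutral).
    print(roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

In a simulation study of the kind the abstract mentions, the true status of each locus is known, so a function like `roc_auc` can compare how well different scoring rules separate selected from neutral loci.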