Originally published as Genetics Published Articles Ahead of Print on June 4, 2006.

Genetics, Vol. 173, 2357-2370, August 2006, Copyright © 2006
doi:10.1534/genetics.105.053314

A Bayesian Heterogeneous Analysis of Variance Approach to Inferring Recent Selective Sweeps

* Department of Biomathematics, UCLA School of Medicine, Los Angeles, California 90095-1766 and † Department of Biostatistics, UCLA School of Public Health, Los Angeles, California 90095-1772

1 Corresponding author: Department of Biomathematics, UCLA School of Medicine, Box 951766, Los Angeles, CA 90095-1766.
E-mail: johnmm@ucla.edu

The distribution of microsatellite allele sizes in populations aids in understanding the genetic diversity of species and the evolutionary history of recent selective sweeps. We propose a heterogeneous Bayesian analysis of variance model for inferring loci involved in recent selective sweeps by analyzing the distribution of allele sizes at multiple loci in multiple populations. Our model is shown to be consistent with ln RV, a multilocus test statistic proposed for identifying microsatellite loci involved in recent selective sweeps. Our methodology differs in that it accepts the original allele size data rather than summary statistics and allows prior knowledge about allele frequencies to be incorporated through a hierarchical prior distribution consisting of lognormal and gamma probability distributions. Notable features of the model are its ability to analyze allele size data for any number of populations simultaneously and to cope with the presence of any number of selected loci. The utility of the method is illustrated by application to two sets of microsatellite allele size data for a group of West African Anopheles gambiae populations. The results are consistent with the suppressed-recombination model of speciation, and additional candidate loci on chromosomes 2 (079 and 175) and 3 (088), which escaped earlier analyses, are discovered.
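
For readers unfamiliar with the ln RV statistic mentioned in the abstract, the sketch below shows one common way it is computed. It assumes the usual definition from the literature, the natural log of the ratio of allele-size variances between two populations at a locus, standardized across loci so that outlying loci stand out. The abstract itself does not spell out this definition. The function names, locus labels, and allele sizes are hypothetical and purely illustrative, and this is not the Bayesian model proposed in the article.

```python
import math
from statistics import mean, stdev, variance


def ln_rv(sizes_pop1, sizes_pop2):
    """Log ratio of the sample variances of allele sizes in two populations.

    Assumes each population has at least two alleles and nonzero variance;
    a monomorphic locus would need special handling not shown here.
    """
    return math.log(variance(sizes_pop1) / variance(sizes_pop2))


def standardized_ln_rv(loci):
    """Standardize ln RV across loci.

    `loci` maps a locus label to a pair (sizes_pop1, sizes_pop2).
    Returns a dict of z-scores; large absolute values flag candidate loci.
    """
    raw = {name: ln_rv(a, b) for name, (a, b) in loci.items()}
    values = list(raw.values())
    mu = mean(values)
    sd = stdev(values)
    return {name: (v - mu) / sd for name, v in raw.items()}


# Made-up allele sizes (repeat counts) for three hypothetical loci.
toy_loci = {
    "locus_A": ([10, 12, 14, 11, 13], [10, 12, 15, 11, 13]),
    "locus_B": ([20, 22, 21, 23, 19], [20, 23, 21, 22, 18]),
    "locus_C": ([30, 30, 31, 30, 31], [26, 30, 34, 28, 32]),  # reduced variance in population 1
}

z_scores = standardized_ln_rv(toy_loci)
for name, z in sorted(z_scores.items()):
    print(f"{name}: standardized ln RV = {z:.2f}")
```

In this toy data, the locus with sharply reduced variance in one population receives the most negative standardized value, which is the kind of signal a summary-statistic approach looks for. The article's model differs in working from the allele size data directly rather than from such a summary.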