All Versions of this Article:
genetics.107.077263v1
177/2/861 (most recent)
doi:10.1534/genetics.107.077263
A more recent version of this article appeared on October 1, 2007.
REGULAR RESEARCH PAPERS
Empirical Bayes Inference of pairwise FST and its distribution in the genome
Shuichi Kitada 1*, Toshihide Kitakado 1 and Hirohisa Kishino 2
1 Tokyo University of Marine Science and Technology
2 University of Tokyo
* To whom correspondence should be addressed. E-mail: kitada{at}kaiyodai.ac.jp.
Submitted on June 11, 2007
Revised on July 17, 2007
Accepted on July 17, 2007
Populations often have complex hierarchical structure, so genetic monitoring and conservation biology depend on a reliable estimate of the pattern of population subdivision. Pairwise FSTs, computed for pairs of sampled localities or subpopulations, are key statistics for exploratory analyses of population structure such as cluster analysis and multidimensional scaling. However, conventional FST estimates are not precise enough to reliably estimate population structure and the extent of heterogeneity. This paper proposes an empirical Bayes procedure to estimate locus-specific pairwise FSTs. The posterior mean of the pairwise FST can be interpreted as a shrinkage estimator, which substantially reduces the variance of conventional estimators at the expense of a small bias. The global FST of a population generally varies among loci in the genome. Our maximum likelihood estimates of global FSTs can be used as sufficient statistics to estimate the distribution of FST in the genome. We demonstrate the efficacy and robustness of our model by simulation and by an analysis of the microsatellite allele frequencies of the Pacific herring. We discuss the heterogeneity of the global FST in the genome based on its estimated distribution for the herring and on examples of human SNPs.
Key Words: distribution of FST, empirical Bayes, genome-wide estimation, pairwise FST, shrinkage estimator
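
The shrinkage interpretation mentioned in the abstract can be illustrated with a generic normal-normal empirical Bayes sketch. This is not the model proposed in the paper, whose details the abstract does not give. The function name `shrink_estimates`, the assumption of a known common sampling variance, and the toy input values are all hypothetical and chosen only for illustration.

```python
from statistics import fmean, pvariance


def shrink_estimates(raw, sampling_var):
    """Shrink noisy estimates toward their mean (generic normal-normal
    empirical Bayes; an illustrative assumption, not the paper's model)."""
    grand_mean = fmean(raw)
    total_var = pvariance(raw, mu=grand_mean)
    # Method-of-moments estimate of the between-unit variance, floored at 0.
    between_var = max(total_var - sampling_var, 0.0)
    denom = between_var + sampling_var
    # Weight on the raw estimate: 0 means full shrinkage, 1 means none.
    weight = between_var / denom if denom > 0 else 0.0
    return [grand_mean + weight * (x - grand_mean) for x in raw]


# Hypothetical raw pairwise estimates and an assumed sampling variance.
raw = [0.00, 0.02, 0.04, 0.10]
shrunk = shrink_estimates(raw, sampling_var=0.0007)
print(shrunk)
```

Each shrunken value lies between its raw estimate and the overall mean, so the spread of the estimates is reduced while each individual estimate is pulled (biased) toward the mean. The noisier the raw estimates are relative to the true between-unit variation, the stronger the pull.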