Originally published as Genetics Published Articles Ahead of Print on July 14, 2005.

Genetics, Vol. 171, 813-823, October 2005, Copyright © 2005
doi:10.1534/genetics.105.044206

Ranks of Genuine Associations in Whole-Genome Scans

* National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, North Carolina 27709 and † N.I. Vavilov Institute of General Genetics, Russian Academy of Sciences, 117809, Moscow, Russia

1 Corresponding author: National Institute of Environmental Health Sciences, MD A3-03, South Campus Bldg. (101), P.O. Box 12233, Research Triangle Park, NC 27709.
E-mail: zaykind@niehs.nih.gov

With the recent advances in high-throughput genotyping techniques, it is now possible to perform whole-genome association studies to fine map causal polymorphisms underlying important traits that influence susceptibility to human diseases and efficacy of drugs. Once a genome scan is completed, the results can be sorted by the association statistic value. What is the probability that true positives will be encountered among the most strongly associated markers? When a particular polymorphism is found to be associated with the trait, it may represent either a "true" or a "false" association (TA vs. FA). Setting appropriate significance thresholds has been considered a way to provide assurance of sufficient odds that the associations found to be significant are genuine. However, the problem with genome scans involving thousands of markers is that the statistic values of FAs can reach quite extreme magnitudes. In such situations, the distributions corresponding to TAs and the most extreme FAs become comparable and significance thresholds tend to penalize TAs and FAs in a similar fashion. When sorting between true and false associations, the "typical" place (i.e., rank) of TAs among the most significant outcomes, ordered by the association statistic value, becomes important. The distribution of ranks that we study here allows calculation of several useful quantities. In particular, it gives the number of most significant markers that must be carried into a follow-up study to ensure that a true association is included with a specified probability. This can be calculated conditionally on having applied a multiple-testing correction. Effects of multilocus (e.g., haplotype association) tests and impact of linkage disequilibrium on the distribution of ranks associated with TAs are evaluated and can be taken into account.
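The idea of a rank distribution for a true association can be illustrated with a small Monte Carlo sketch. This is not the analytic method of the article; it is a simplified simulation under assumptions made here for illustration only: a single TA, independent markers (no linkage disequilibrium), normally distributed test statistics, and a hypothetical `effect` parameter standing in for the mean shift of the TA statistic. The function names `simulate_ta_ranks` and `markers_needed` are likewise invented for this example.

```python
import math
import random


def simulate_ta_ranks(n_markers, effect, n_reps, seed=0):
    """Simulate the rank of one true association (TA) in a scan.

    Assumptions (illustrative, not from the article): markers are
    independent, false associations (FAs) have N(0, 1) statistics,
    and the TA statistic is N(effect, 1). Markers are ranked by the
    absolute value of the statistic, with rank 1 the most extreme.
    """
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_reps):
        ta_stat = abs(rng.gauss(effect, 1.0))
        # Count the FAs whose statistic is more extreme than the TA's.
        n_larger = sum(
            1
            for _ in range(n_markers - 1)
            if abs(rng.gauss(0.0, 1.0)) > ta_stat
        )
        ranks.append(n_larger + 1)
    return ranks


def markers_needed(ranks, prob):
    """Smallest k such that the TA falls within the top k markers
    in at least a fraction `prob` of the simulated scans."""
    ordered = sorted(ranks)
    idx = math.ceil(prob * len(ordered)) - 1
    return ordered[max(idx, 0)]


if __name__ == "__main__":
    ranks = simulate_ta_ranks(n_markers=1000, effect=3.0, n_reps=500, seed=1)
    for prob in (0.5, 0.8, 0.95):
        print(f"top-k needed for P >= {prob}: {markers_needed(ranks, prob)}")
```

Under these assumptions, the empirical quantile of the simulated ranks plays the role of the "number of most significant markers" needed for follow-up. Correlation between markers, multiple TAs, or conditioning on a multiple-testing threshold would all change the result and are not modeled here.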



