Originally published as Genetics Published Articles Ahead of Print on May 16, 2007.

Genetics, Vol. 176, 1823-1833, July 2007, Copyright © 2007
doi:10.1534/genetics.107.075408

Correcting for Measurement Error in Individual Ancestry Estimates in Structured Association Tests

* Section on Statistical Genetics and Bioinformatics, Center for Public Health Genomics, Department of Biostatistical Sciences, Division of Public Health Services, Wake Forest University Health Sciences, Winston-Salem, North Carolina 27101 and † Department of Biostatistics, Section on Statistical Genetics and ‡ Department of Nutrition Sciences, § Clinical Nutrition Research Center, University of Alabama, Birmingham, Alabama 35294

1 Corresponding author: Section on Statistical Genetics and Bioinformatics, Center for Public Health Genomics, Department of Biostatistical Sciences, Division of Public Health Services, Wake Forest University Health Sciences, WC-23, 100 N. Main St., Winston-Salem, NC 27101.
E-mail: jdivers@wfubmc.edu

We present theoretical explanations and show through simulation that the individual admixture proportion estimates obtained from ancestry informative markers should be regarded as error-contaminated measurements of the underlying individual ancestry proportions. These estimates can be used in structured association tests as a control variable to limit the type I error inflation or loss of power caused by population stratification in studies of admixed populations. However, including such error-contaminated variables as covariates in regression models can bias parameter estimates and reduce the ability to control for the confounding effect of admixture in genetic association tests. Measurement error correction methods offer a way to overcome this problem but require an a priori estimate of the measurement error variance. We show how an upper bound on this variance can be obtained, present four measurement error correction methods that are applicable to this problem, and conduct a simulation study to compare their utility in the case where the admixed population results from intermating between two ancestral populations. Our results show that the quadratic measurement error correction (QMEC) method performs better than the other methods and maintains the type I error rate at its nominal level.
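
The effect described above can be illustrated with a small simulation. The sketch below is not taken from the article: the data-generating model, the function names (`simulate`, `ols`, `moment_corrected`), and all parameter values are assumptions chosen for illustration, and the correction shown is the classical moment-based correction for additive measurement error, not necessarily any of the four methods compared in the article (QMEC in particular is not implemented here). It also assumes the measurement error variance is known exactly, whereas the article works with an upper bound.

```python
import numpy as np


def simulate(n=50000, sigma_u=0.15, seed=0):
    """Simulate a two-way admixed sample (all values are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    a = rng.beta(2.0, 2.0, size=n)              # true individual ancestry proportion
    p = 0.2 + 0.6 * a                           # allele frequency: 0.2 vs 0.8 in the two ancestral populations
    g = rng.binomial(2, p).astype(float)        # marker genotype (0, 1, or 2 copies)
    y = 2.0 * a + rng.normal(0.0, 1.0, size=n)  # phenotype depends on ancestry only, not on the marker
    w = a + rng.normal(0.0, sigma_u, size=n)    # error-contaminated ancestry estimate
    return a, g, y, w


def ols(y, *covariates):
    """Ordinary least squares with an intercept; returns [intercept, coefficients...]."""
    X = np.column_stack([np.ones_like(y), *covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def moment_corrected(y, g, w, sigma_u):
    """Moment-based correction: subtract the assumed-known error variance
    from the second moment of w before solving the normal equations."""
    X = np.column_stack([np.ones_like(y), g, w])
    n = len(y)
    S = X.T @ X / n
    S[2, 2] -= sigma_u ** 2                     # only w is measured with error
    return np.linalg.solve(S, X.T @ y / n)


if __name__ == "__main__":
    sigma_u = 0.15
    a, g, y, w = simulate(sigma_u=sigma_u)
    oracle = ols(y, g, a)                       # adjusts for the true ancestry
    naive = ols(y, g, w)                        # adjusts for the noisy estimate
    corrected = moment_corrected(y, g, w, sigma_u)
    for name, beta in [("oracle", oracle), ("naive", naive), ("corrected", corrected)]:
        print(f"{name:>9}: marker effect = {beta[1]: .4f}, ancestry effect = {beta[2]: .4f}")
```

In this setup the marker has no causal effect, so any nonzero marker coefficient reflects residual confounding by ancestry. Adjusting for the noisy estimate attenuates the ancestry coefficient and leaves part of the ancestry signal to be absorbed by the marker, which is the mechanism behind the type I error inflation the abstract refers to.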