Originally published as Genetics Published Articles Ahead of Print on November 4, 2005.

Genetics, Vol. 172, 1349-1358, February 2006, Copyright © 2006
doi:10.1534/genetics.105.047241

A Logistic Regression Mixture Model for Interval Mapping of Genetic Trait Loci Affecting Binary Phenotypes

* Department of Statistics, George Washington University, Washington, District of Columbia 20052, † Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, Ohio 43403 and ‡ Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, DHHS, Bethesda, Maryland 20892

1 Corresponding author: Department of Statistics, George Washington University, 2140 Pennsylvania Ave., NW, Washington, DC 20052.
E-mail: zli@gwu.edu

Often in genetic research, the presence or absence of a disease is affected not only by the trait locus genotypes but also by covariates. Finite logistic regression mixture models, and methods based on them, are developed for detecting a binary trait locus (BTL) through an interval-mapping procedure. The maximum-likelihood estimates (MLEs) of the logistic regression parameters are asymptotically unbiased. The null asymptotic distributions of the likelihood-ratio test (LRT) statistics for detecting a BTL are found to be given by the supremum of a χ²-process. The limiting null distributions are free of the null model parameters and are determined explicitly through only four (backcross case) or nine (intercross case) independent standard normal random variables. A threshold for detecting a BTL in a flanking marker interval can therefore be approximated easily by a Monte Carlo method. It is pointed out that a threshold incorrectly determined by reading off a χ²-probability table can inflate the false BTL detection rate far more severely than many researchers might anticipate. Simulation results show that the BTL detection procedures based on thresholds determined by the limiting distributions perform quite well when the sample sizes are moderately large.
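To make the phrase "logistic regression mixture model" concrete, the following is a minimal sketch of the log-likelihood at one putative locus position in a backcross. It is an illustration under stated assumptions, not the authors' implementation: genotypes are coded 0/1, recombination is assumed to occur without interference, the locus effect enters the logit additively, and the function names `btl_genotype_prob` and `mixture_loglik` and the parameter layout are hypothetical.

```python
import numpy as np


def btl_genotype_prob(m1, m2, r1, r2):
    """P(Q = 1 | flanking marker genotypes) in a backcross.

    Assumptions (not taken from the abstract): genotypes coded 0/1, no
    interference, r1 and r2 are the recombination fractions between the
    putative locus and the left and right markers, and r1 + r2 > 0.
    """
    m1 = np.asarray(m1)
    m2 = np.asarray(m2)
    # Unnormalized joint weights of the marker pair with Q = 1 and Q = 0.
    w1 = np.where(m1 == 1, 1 - r1, r1) * np.where(m2 == 1, 1 - r2, r2)
    w0 = np.where(m1 == 0, 1 - r1, r1) * np.where(m2 == 0, 1 - r2, r2)
    return w1 / (w1 + w0)


def mixture_loglik(params, y, x, m1, m2, r1, r2):
    """Log-likelihood of a two-component logistic regression mixture.

    params = [intercept, locus effect, covariate coefficients...]
    (a hypothetical layout); y is a 0/1 phenotype vector and x an
    (n, k) covariate matrix.
    """
    params = np.asarray(params, dtype=float)
    beta0, gamma, beta = params[0], params[1], params[2:]
    y = np.asarray(y)
    w = btl_genotype_prob(m1, m2, r1, r2)   # mixing weight for Q = 1
    eta0 = beta0 + np.asarray(x) @ beta     # linear predictor when Q = 0
    p0 = 1.0 / (1.0 + np.exp(-eta0))
    p1 = 1.0 / (1.0 + np.exp(-(eta0 + gamma)))
    lik0 = np.where(y == 1, p0, 1 - p0)
    lik1 = np.where(y == 1, p1, 1 - p1)
    # Each individual's likelihood is a mixture over the unobserved genotype.
    return np.sum(np.log(w * lik1 + (1 - w) * lik0))
```

In this sketch, setting the locus effect to zero collapses the mixture to an ordinary logistic regression, which corresponds to the null hypothesis of no BTL. An LRT statistic at a given position would be twice the difference between the maximized mixture log-likelihood and the maximized null log-likelihood, and interval mapping takes the supremum over positions in the marker interval. That supremum is why the threshold should come from the limiting distribution described above rather than from a χ² table.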