Originally published as Genetics Published Articles Ahead of Print on December 15, 2005.

Genetics, Vol. 172, 2583-2599, April 2006, Copyright © 2006
doi:10.1534/genetics.105.042978

Modeling Haplotype Block Variation Using Markov Chains

Computer Science Department, Technion, Haifa 32000, Israel

1 Corresponding author: Computer Science Department, Technion, Technion City, Haifa 32000, Israel.
E-mail: gdg@cs.technion.ac.il

Models of background variation in genomic regions form the basis of linkage disequilibrium mapping methods. In this work we analyze a background model that groups SNPs into haplotype blocks and represents the dependencies between blocks by a Markov chain. We develop an error measure to compare the performance of this model against the common model that assumes blocks are independent. By examining data from the International Haplotype Mapping project, we show that the Markov model over haplotype blocks is most accurate when representing blocks in strong linkage disequilibrium. This contrasts with the independent model, which is rendered less accurate by linkage disequilibrium. We provide a theoretical explanation for this surprising property of the Markov model and relate its behavior to allele diversity.
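
To make the contrast between the two background models concrete, the following is a minimal sketch in Python, not the implementation used in this work. It assumes that each haplotype has already been partitioned into blocks and that each block is summarized by a single allele label. The toy data set and the function names `independent_prob` and `markov_prob` are hypothetical and serve only as illustration. The independent model multiplies per-block marginal frequencies, whereas the Markov model conditions each block on the block immediately preceding it.

```python
# Toy data (hypothetical): each haplotype is a tuple of block alleles,
# one label per haplotype block, ordered along the chromosome.
haplotypes = [
    ("A", "x", "p"),
    ("A", "x", "p"),
    ("A", "x", "q"),
    ("B", "y", "q"),
    ("B", "y", "q"),
    ("B", "x", "p"),
]


def independent_prob(hap, data):
    """Probability of `hap` when blocks are assumed independent:
    the product of the marginal frequency of each block allele."""
    n = len(data)
    prob = 1.0
    for i, allele in enumerate(hap):
        count = sum(1 for h in data if h[i] == allele)
        prob *= count / n
    return prob


def markov_prob(hap, data):
    """Probability of `hap` under a first-order Markov chain over blocks:
    P(b1) * P(b2 | b1) * P(b3 | b2) * ..., estimated by counting."""
    n = len(data)
    prob = sum(1 for h in data if h[0] == hap[0]) / n
    for i in range(1, len(hap)):
        prev = sum(1 for h in data if h[i - 1] == hap[i - 1])
        if prev == 0:
            # The preceding allele was never observed, so the
            # conditional probability cannot be estimated.
            return 0.0
        pair = sum(
            1 for h in data if h[i - 1] == hap[i - 1] and h[i] == hap[i]
        )
        prob *= pair / prev
    return prob


query = ("A", "x", "p")
empirical = haplotypes.count(query) / len(haplotypes)
print("empirical  :", empirical)
print("independent:", independent_prob(query, haplotypes))
print("markov     :", markov_prob(query, haplotypes))
```

In this toy sample adjacent blocks are strongly correlated, and the Markov estimate for the queried haplotype (0.375) lies closer to its empirical frequency (2/6) than the independent estimate (1/6) does. This is only an illustration on invented data; it does not reproduce the error measure developed in this work.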