Originally published as Genetics Published Articles Ahead of Print on October 11, 2005.
Genetics, Vol. 172, 693-699, January 2006, Copyright © 2006
doi:10.1534/genetics.105.049122
Detection of Genes for Ordinal Traits in Nuclear Families and a Unified Approach for Association Studies
Heping Zhang1, Xueqin Wang and Yuanqing Ye
Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, Connecticut 06520-8034
1 Corresponding author: Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, CT 06520-8034.
E-mail: heping.zhang{at}yale.edu
ABSTRACT
There is growing interest in genomewide association analysis using single-nucleotide polymorphisms (SNPs), because traditional linkage studies are not as powerful in identifying genes for common, complex diseases. Tests for linkage disequilibrium have been developed for binary and quantitative traits. However, since many human conditions and diseases are measured on an ordinal scale, methods need to be developed to investigate the association of genes and ordinal traits. Thus, in the current report we propose and derive a score test statistic that identifies genes that are associated with ordinal traits when gametic disequilibrium between a marker and trait loci exists. Through simulation, the performance of this new test is examined for both ordinal traits and quantitative traits. The proposed statistic not only accommodates and is more powerful for ordinal traits, but also has similar power to that of existing tests when the trait is quantitative. Therefore, our proposed statistic has the potential to serve as a unified approach to identifying genes that are associated with any trait, regardless of how the trait is measured. We further demonstrated the advantage of our test by revealing a significant association (P = 0.00067) between alcohol dependence and a SNP in the growth-associated protein 43.
To identify genes underlying an inheritable disease, it is critical to establish the linkage of the disease locus with a known gene or marker (usually a DNA polymorphism) (SPIELMAN et al. 1993). While classic linkage analysis has been applied successfully in mapping disease genes for many Mendelian diseases and for some complex diseases such as breast cancer (e.g., HALL et al. 1990), major challenges, limitations, and failures remain in using classic linkage analysis to map complex diseases. New techniques and methodologies must be developed to address these shortcomings and to identify gene-disease associations more accurately. Some challenges that have been studied thus far include population admixture (SPIELMAN et al. 1993) and limited and imprecise information on the density of combination (RABINOWITZ 1997).
Using insulin-dependent diabetes mellitus (IDDM) as the disease of interest, SPIELMAN et al. (1993) proposed a transmission/disequilibrium test (TDT) and demonstrated its power in establishing a strong linkage between a 5'-flanking polymorphism on chromosome 11 and susceptibility to IDDM. Specifically, the TDT compares the frequency with which heterozygous parents transmit the marker allele of interest to their affected children with the frequency of the nontransmitted marker allele. In contrast to the classic approaches, the TDT has two important features. First, it uses an affected offspring in a parent-child trio to serve as his or her own case and control in an artificially created matched pair, thus eliminating the effect of population admixture. Second, it tests for linkage when a population association has been established between alleles at a marker and the trait status.
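The comparison of transmitted and nontransmitted alleles can be sketched in a few lines of Python. This is a minimal illustration of the standard McNemar-type form of the TDT, not code from the article; the function name and the counts in the example are hypothetical.

```python
import math

def tdt_statistic(b, c):
    """Classic TDT for a diallelic marker.

    b: heterozygous parents who transmitted the allele of interest
       to an affected child
    c: heterozygous parents who transmitted the other allele
    Returns (chi-square statistic with 1 d.f., asymptotic p-value).
    """
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parental transmissions")
    stat = (b - c) ** 2 / (b + c)
    # Upper tail of chi-square(1): P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

stat, p = tdt_statistic(b=60, c=40)  # hypothetical transmission counts
print(f"TDT = {stat:.2f}, p = {p:.4f}")
```

Only heterozygous parents enter the counts, because a homozygous parent transmits the same allele regardless of the child's trait.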
The properties and success of the TDT have led to many useful extensions in two major directions. First, efforts have been made to consider data beyond the parent-child trio design. Examples include the use of sibships (SPIELMAN and EWENS 1998) and nuclear families (LUNETTA et al. 2000; RABINOWITZ and LAIRD 2000). Second, the TDT has been extended to deal with quantitative traits (ALLISON 1997; RABINOWITZ 1997). Furthermore, LIU et al. (2002) proposed a unified framework for the TDT when the trait distribution belongs to an exponential family.
While methods for linkage and association analysis have been well established for dichotomous and quantitative traits, there is a lack of methodological development in analyzing ordinal traits. As illustrated by ZHANG et al. (2003) and FENG et al. (2004), many human conditions (e.g., cancer and most behavioral and psychiatric disorders) are measured on discrete, ordinal scales. An unnecessary collapse of trait levels could reduce the power in genetic analyses (ZHANG et al. 2003; FENG et al. 2004). Although ZHANG et al. (2003) and FENG et al. (2004) developed a basic framework to conduct segregation and linkage analyses for pedigree data, methods have not been developed for the association analysis of ordinal traits.
The purpose of this study is to develop a score test statistic to detect genes that are associated with an ordinal trait when gametic disequilibrium between marker and trait loci exists. To demonstrate the benefit of this test, we compare the new test statistic with existing test statistics in terms of the type I error approximation and power estimation for ordinal as well as quantitative and binary traits. While the primary motive of the new test is to deal with ordinal traits, the new test becomes the standard TDT when the trait is binary. Our simulations demonstrate that the new test has comparable power to other established tests when the trait is quantitative. Thus, the new test can serve as a unified test for any trait.
METHOD AND MODEL
Let the trait vector from the ith family have components that take one of K ordered levels; the levels are ordered but are not necessarily on a linear scale. The trait value reflects the severity or stage of a certain condition such as cancer or diabetes.

At trait locus t, we assume that there is a trait-increasing allele D, and we use d to denote the wild-type allele(s). For the ith family, we consider the genotypes at trait locus t and the number of copies of allele D in each genotype.

We consider a diallelic marker with alleles A and a. The likelihood contributed by the ith family at locus t is the probability of the observed marker data, given the vector of observed phenotypes, and given that t is the disease locus. As a standard assumption, we assume that this probability is independent among different families. In addition, as in WHITTEMORE and TU (2000), we assume (a) that the trait and marker loci are closely linked such that, given the family's genotypes at trait locus t, the family's phenotypes and marker genotypes are independent and (b) that, given the family's genotypes at the trait locus, the traits of the family members are conditionally independent.

[The displayed equations for the resulting likelihood and for the trait model are missing from the source.]

The trait model has ascending level parameters and a genetic effect ß; the level parameters and ß are referred to as penetrance parameters. Here, the number of copies of allele D is used to reflect an additive model, but it can be modified to reflect a dominant or recessive model.
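The article states elsewhere that the score test is derived from a proportional odds model, but the displayed model is not available here. The following Python sketch therefore assumes one common parameterization, logit P(Y <= k | genotype) = alpha_k - beta * (copies of D), with K - 1 ascending cut points; the names `alphas`, `beta`, and `category_probs`, the sign convention, and all numeric values are assumptions made for illustration.

```python
import math

def category_probs(alphas, beta, n_copies_D):
    """P(Y = k | genotype) for k = 1..K under a cumulative-logit model.

    Assumed form: logit P(Y <= k | g) = alphas[k-1] - beta * n_copies_D,
    where alphas holds K - 1 strictly ascending level parameters.
    """
    if any(a >= b for a, b in zip(alphas, alphas[1:])):
        raise ValueError("level parameters must be ascending")
    cumulative = [1.0 / (1.0 + math.exp(-(a - beta * n_copies_D))) for a in alphas]
    cumulative.append(1.0)  # P(Y <= K) = 1
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = k) = P(Y <= k) - P(Y <= k-1)
        previous = c
    return probs

alphas = [-1.0, 0.0, 1.5]  # hypothetical cut points, K = 4 levels
for copies in (0, 1, 2):   # additive coding: 0, 1, or 2 copies of D
    print(copies, [round(p, 3) for p in category_probs(alphas, 0.5, copies)])
```

Under this assumed sign convention, a positive beta shifts probability toward higher trait levels as the number of D copies grows, which matches the description of D as a trait-increasing allele. A dominant or recessive model would replace the copy count with an indicator.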
For each child, consider the counts of children whose trait values are greater or less than that child's trait value, and the number of copies of the A allele transmitted to the child at the marker locus. We show in the APPENDIX that these quantities enter the score statistic T for testing the null hypothesis that ß = 0, namely, that none of the genes that are linked and in gametic disequilibrium with the marker is associated with the trait.

[The displayed equation for T is missing from the source.]

The statistic T asymptotically follows a chi-square distribution. We refer to it as the O-TDT.
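The counts of children with greater or smaller trait values can be computed as in the following Python sketch. Pooling all children in the sample is an assumption here, as is the function name; the article's exact weighting formula is not reproduced.

```python
from bisect import bisect_left, bisect_right

def rank_counts(trait_values):
    """For each value y, return (n_greater, n_less) among all pooled values."""
    ordered = sorted(trait_values)
    n = len(ordered)
    counts = []
    for y in trait_values:
        n_less = bisect_left(ordered, y)          # values strictly below y
        n_greater = n - bisect_right(ordered, y)  # values strictly above y
        counts.append((n_greater, n_less))
    return counts

traits = [1, 3, 2, 3, 4, 1]  # hypothetical ordinal levels for six children
print(rank_counts(traits))
```

Because the counts depend only on the ordering of the trait values, any order-preserving relabeling of the levels gives the same result, so no linear scale is needed. The same function applies unchanged to quantitative trait values.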
It is noteworthy that T belongs to a general class of score statistics in which a weight function depends on the trait distribution. For example, with particular choices of the weight function it reduces to the original TDT and, for a normally distributed quantitative trait, to RABINOWITZ's (1997) test, whose weight involves the sample average.
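A weighted statistic of this general kind can be sketched in Python as follows. The sketch rests on assumptions that are not taken from the article: both parents are genotyped at the marker, each child contributes its weight times the difference between the observed and the Mendelian expected count of A alleles, and transmissions to siblings are treated as independent given the parents. The function name and data layout are hypothetical, and the article's own variance estimate may differ.

```python
def weighted_tdt(families):
    """Weighted TDT-type score statistic, assuming both parents are genotyped.

    families: list of (father_A, mother_A, children), where father_A and
    mother_A are the parents' counts of allele A (0, 1, or 2) and children
    is a list of (child_A, weight) pairs.
    """
    score, variance = 0.0, 0.0
    for father_A, mother_A, children in families:
        expected = (father_A + mother_A) / 2.0     # E[child_A] under Mendelian transmission
        n_het = (father_A == 1) + (mother_A == 1)  # number of heterozygous parents
        for child_A, weight in children:
            score += weight * (child_A - expected)
            # each heterozygous parent adds variance 1/4; siblings assumed independent
            variance += weight ** 2 * n_het / 4.0
    if variance == 0.0:
        raise ValueError("no informative transmissions")
    return score ** 2 / variance

# Hypothetical trios: father A/a, mother a/a, one affected child with weight 1.
trios = [(1, 0, [(1, 1.0)])] * 60 + [(1, 0, [(0, 1.0)])] * 40
print(weighted_tdt(trios))
```

With one affected child per trio and weight 1, the statistic equals (b - c)^2 / (b + c), the classic TDT, where b and c count the heterozygous parents who did and did not transmit A. Other weights, such as trait values centered at the sample average, give other members of the class.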
SIMULATION
Simulation experiment design:
The data are generated as follows. First, the parents' genotypes were generated according to specified coefficients of linkage disequilibrium or haplotype frequencies as delineated in Table 1.
After the parental genotypes were generated, the offspring genotypes were generated according to the purpose of the simulation. To assess the type I error, data were generated under the null hypothesis that the trait is not associated with a locus linked to the marker. To evaluate the power, the trait and marker loci were placed 1 cM apart. Finally, conditional on the trait genotype, the trait was generated from one of two models, for different comparison purposes:
- A nonproportional odds model was used to generate an ordinal trait. Because our score test was derived from a proportional odds model, we deliberately generated data from nonproportional odds to assess the robustness of our score test with respect to the proportionality assumption.
- A Gaussian model was used to generate a quantitative trait, to evaluate the performance of the O-TDT for quantitative traits. Again, proportionality was not assumed.
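The last two steps of this design, transmitting trait-locus alleles to a child and drawing the trait given the child's genotype, can be sketched in Python as below. The penetrance table, the genotype means, the standard deviation, and the seed are hypothetical values chosen for illustration, not the settings of Tables 1 and 3. The sketch also omits the marker locus, the haplotype frequencies, and recombination between the loci.

```python
import random

def simulate_child(father, mother, rng):
    """Each parent transmits one of its two alleles with probability 1/2."""
    return (rng.choice(father), rng.choice(mother))

def ordinal_trait(copies_of_D, penetrance, rng):
    """Draw a level 1..K from the genotype-specific penetrance distribution."""
    probs = penetrance[copies_of_D]
    levels = list(range(1, len(probs) + 1))
    return rng.choices(levels, weights=probs, k=1)[0]

def gaussian_trait(copies_of_D, means, sd, rng):
    """Draw a quantitative trait with a genotype-specific mean."""
    return rng.gauss(means[copies_of_D], sd)

rng = random.Random(2006)           # fixed seed so the run is reproducible
PENETRANCE = {                      # hypothetical, K = 4; rows need not follow proportional odds
    0: [0.40, 0.30, 0.20, 0.10],
    1: [0.30, 0.30, 0.20, 0.20],
    2: [0.10, 0.20, 0.30, 0.40],
}
MEANS = {0: 0.0, 1: 0.5, 2: 1.0}    # hypothetical genotype means

child = simulate_child(("D", "d"), ("D", "d"), rng)
copies = child.count("D")
print(child, ordinal_trait(copies, PENETRANCE, rng),
      round(gaussian_trait(copies, MEANS, 1.0, rng), 2))
```

Specifying the penetrance directly as a table of level probabilities per genotype places no proportional odds restriction on the simulated data.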
Type I error comparison:
In Table 2, we compare the nominal levels of type I error with those estimated empirically by the simulation in 10,000 replications when ordinal traits were generated from nonproportional odds models. For the nonproportional odds model, Table 3 delineates the penetrance distribution, namely the distribution of the ordinal trait given the genotype at the trait locus.
In Table 4, we compare the nominal levels of type I error with those estimated empirically by the simulation in 10,000 replications when quantitative traits were generated from a Gaussian distribution. Once the quantitative traits are generated, the observed trait values are regarded as discrete and ordered quantities and hence can be treated as ordinal numbers, allowing the use of O-TDT.
It is clear from Tables 2 and 4 that the empirical type I errors estimated from the simulation replications are numerically close to the nominal levels. On a relative scale, however, some deviations between the empirical and nominal type I errors can be observed at the nominal level of 0.0001, but given the very small size of this nominal level, such deviations are not unexpected from our 10,000 replications.
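The remark about the smallest nominal level can be checked with a short binomial calculation, sketched in Python below. The level 0.0001 and the 10,000 replications come from the text; the other two levels and the function name are illustrative assumptions.

```python
import math

def monte_carlo_se(alpha, n_replications):
    """Binomial standard error of an empirical rejection rate when the true rate is alpha."""
    return math.sqrt(alpha * (1.0 - alpha) / n_replications)

for alpha in (0.05, 0.01, 0.0001):  # 0.05 and 0.01 are illustrative levels
    se = monte_carlo_se(alpha, 10_000)
    print(f"alpha={alpha}: expected rejections={alpha * 10_000:.0f}, "
          f"SE={se:.5f}, SE/alpha={se / alpha:.2f}")
```

At a level of 0.0001, only about one rejection is expected in 10,000 replications, and the standard error is about as large as the level itself, so large relative deviations are unsurprising.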
Power comparison:
Table 5 compares the power of the TDT and the O-TDT at three significance levels when ordinal traits were generated from nonproportional odds models as described in Simulation experiment design. We do not compare the Q-TDT with the O-TDT for ordinal traits, because the ordinal scale is not numerically meaningful, and the use of the Q-TDT is not appropriate. Table 6 compares the power of the Q-TDT and the O-TDT at three significance levels when quantitative traits were generated from the Gaussian model as described in Simulation experiment design.
Table 5 demonstrates that dichotomizing an ordinal trait can lead to a substantial loss of power. Figure 1 highlights the gain of power by O-TDT relative to the use of TDT when ordinal traits were generated from nonproportional odds models. Figure 1 includes two choices of K (4 and 5) and two choices for the number of families (200 and 400).
RESULTS
The alcohol dependence measure that we analyzed was based on several diagnostic systems, including the Diagnostic and Statistical Manual of Mental Disorders, Ed. 3, Revised (DSM-III-R). This measure was recorded on an ordinal scale with four levels (pure unaffected, never drank, unaffected with some symptoms, and affected).
We applied our test to the ordinal alcohol dependence measure and found a highly significant association (P = 0.00067) between rs714697 and alcohol dependence. However, when we employed a standard TDT by dichotomizing the ordinal alcohol dependence measure into affected and unaffected, the P-value was 0.01. Thus, the use of the original ordinal scale reveals a much more significant association.
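To see the two P-values on the scale of the test statistic, they can be converted to chi-square values, as in the Python sketch below. It assumes that both tests are referred to a chi-square distribution with 1 d.f.; the function name is hypothetical.

```python
from statistics import NormalDist

def chi2_1df_from_p(p_value):
    """Chi-square (1 d.f.) value whose upper-tail probability equals p_value."""
    # A chi-square(1) variable is the square of a standard normal variable.
    z = NormalDist().inv_cdf(1.0 - p_value / 2.0)
    return z * z

for label, p in (("ordinal test", 0.00067), ("dichotomized TDT", 0.01)):
    print(f"{label}: P = {p}, chi-square(1) = {chi2_1df_from_p(p):.2f}")
```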
DISCUSSION
Although we presented the O-TDT test for a diallelic marker, particularly a SNP, the test can be extended for association studies to detect multiple SNPs (HOH et al. 2001) or haplotypes (ZHANG et al. 2004) that may affect the trait.
APPENDIX: SCORE STATISTIC
[The displayed equations of this appendix are missing from the source; the connecting text that survives is given below, with each gap marked.]

[Equation missing.] Under the null hypothesis, we have [equations missing]. For convenience, we drop the two irrelevant parameters from the notation from now on. Therefore, [equation missing].

The derivation then introduces the coefficient of linkage disequilibrium. We have [equations missing]. In the absence of association between the marker and trait loci, the score function equals zero under the null hypothesis. However, in the presence of association, ignoring all constants, the score function becomes [equation missing].

We use the empirical distribution function of the trait values but do not estimate the model parameters directly. Hence, the corresponding terms are replaced with expressions based on the counts of children whose trait values are greater or less than a given child's trait value.

Under the null hypothesis, the two conditional expectation values are [equations missing]. The results of RABINOWITZ and LAIRD (2000) can be applied to estimate the remaining unknown quantities and then to obtain these conditional expectations.
LITERATURE CITED
ALLISON, D. B., 1997 Transmission-disequilibrium test for quantitative traits. Am. J. Hum. Genet. 60: 676–690.
BEGLEITER, H., T. REICH, V. HESSELBROCK, B. PORJESZ, T. K. LI et al., 1995 The collaborative study on the genetics of alcoholism. Alcohol Health Res. World 19: 228–236.
BLENNOW, K., 2004 Cerebrospinal fluid protein biomarkers for Alzheimer's disease. J. Am. Soc. Exp. Neurother. 1: 213–225.
FENG, R., J. LECKMAN and H. P. ZHANG, 2004 Linkage analysis of ordinal traits for pedigree data. Proc. Natl. Acad. Sci. USA 101: 16739–16744.
HALL, J. M., M. K. LEE, B. NEWMAN, J. E. MORROW, L. A. ANDERSON et al., 1990 Linkage of early-onset familial breast cancer to chromosome 17q21. Science 250: 1684–1689.
HOH, J., A. J. WILLE and J. OTT, 2001 Trimming, weighting, and grouping SNPs in human case-control association studies. Genome Res. 11: 2115–2119.
KLEIN, R. J., C. ZEISS, E. Y. CHEW, J. Y. TSAI, R. S. SACKLER et al., 2005 Complement factor H polymorphism in age-related macular degeneration. Science 308: 385–389.
LIU, Y., D. TRITCHLER and S. B. BULL, 2002 A unified framework for transmission-disequilibrium test analysis of discrete and continuous traits. Genet. Epidemiol. 22: 26–40.
LUNETTA, K. L., S. V. FARAONE, J. BIEDERMAN and N. M. LAIRD, 2000 Family based tests of association and linkage that used unaffected sibs, covariates, and interactions. Am. J. Hum. Genet. 66: 605–614.
RABINOWITZ, D., 1997 A transmission disequilibrium test for quantitative trait loci. Hum. Hered. 47: 342–350.
RABINOWITZ, D., and N. LAIRD, 2000 A unified approach to adjusting association tests for population admixture with arbitrary pedigree structure and arbitrary missing marker information. Hum. Hered. 50: 211–223.
RISCH, N., and K. MERIKANGAS, 1996 The future of genetic studies of complex human diseases. Science 273: 1516–1517.
SAUNDERS, D. E., J. H. HANNIGAN, C. S. ZAJAC and N. L. WAPPLER, 1995 Reversal of alcohol's effects on neurite extension and on neuronal GAP43/B50, N-myc, and c-myc protein levels by retinoic acid. Dev. Brain Res. 86: 16–23.
SPIELMAN, R. S., and W. J. EWENS, 1998 A sibship test for linkage in the presence of association: the sib transmission/disequilibrium test. Am. J. Hum. Genet. 62: 450–458.
SPIELMAN, R. S., R. E. MCGINNIS and W. J. EWENS, 1993 Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am. J. Hum. Genet. 52: 506–516.
WHITTEMORE, A. S., and I. P. TU, 2000 Detecting disease genes using family data. I. Likelihood-based theory. Am. J. Hum. Genet. 66: 1328–1340.
ZHANG, H. P., R. FENG and H. T. ZHU, 2003 A latent variable model of segregation analysis for ordinal traits. J. Am. Stat. Assoc. 98: 1023–1034.
ZHANG, K., Z. H. QIN, J. S. LIU, T. CHEN, M. S. WATERMAN et al., 2004 Haplotype block partitioning and tag SNP selection using genotype data and their applications to association studies. Genome Res. 14: 908–916.
Communicating editor: Y.-X. FU