## Abstract

Treating mRNA transcript abundances as quantitative traits and examining their relationships with clinical traits have been pursued by using an analytical approach of quantitative genetics. Recently, Kraft *et al*. presented a family expression association test (FEXAT) for correlation between gene expressions and trait values with a family-based (sibships) design. This statistic did not account for biological relationships of related subjects, which may inflate type I error rate and/or decrease power of statistical tests. In this article, we propose two new test statistics based on a variance-components approach for analyses of microarray data obtained from general pedigrees. Our methods accommodate covariance between relatives for unmeasured genetic effects and directly model covariates of clinical importance. The efficacy and validity of our methods are investigated by using simulated data under different sample sizes, family sizes, and family structures. The proposed LR method has correct type I error rate with moderate to large sample sizes regardless of family structure and family sizes. It has higher power with complex pedigrees and similar power to the FEXAT with sibships. The other proposed FEXAT(R) method is favorable with large family sizes, regardless of sample sizes and family structure. Our methods, robust to population stratification, are complementary to the FEXAT in expression-trait association studies.

IN the past few years, there has been increasing interest in genetic studies of complex diseases by combining information on clinical traits, marker genotypes, and comprehensive gene expressions. It was proposed that standard methods of quantitative genetics can be applied to microarray data analyses (Wolfinger* et al.* 2001; Kraft* et al.* 2003). Treating mRNA transcript levels as quantitative traits, efforts have been pursued in examining their relationships with clinical traits and mapping gene expression quantitative trait loci (eQTL) for these traits by using an analytical approach of quantitative genetics (Kraft* et al.* 2003; Schadt* et al.* 2003).

Gene-expression levels are represented in populations by continuous variation. The studies in model organisms illustrated the genome-wide view of gene-expression levels as heritable phenotypes (Cheung and Spielman 2002). Li *et al.* compared transcript levels in lymphoblastoid cell lines from twins by cDNA microarray and RT-PCR to study the heritability of gene expression in humans. The distribution of the heritability among all genes showed a moderate and homogeneous positive shift that affects the majority of genes (Li* et al.* 2003). Brem *et al.* used a genome-wide genetic linkage approach to mapping the determinants of variation in gene expression. Their results suggested that the expression of most genes is affected by more than one locus (Brem* et al.* 2002). Most complex human phenotypes are also influenced by multiple genes (Chakravarti 1999). In developing analytical tools, it is important to consider genetic structure of/between gene expressions or of/between gene expression and a complex trait.

Replication is important in microarray experiments (Kerr and Churchill 2001; Nguyen* et al.* 2002; Yang and Speed 2002). Technical replication, such as spotting genes multiple times per array and hybridizing multiple arrays to the same samples, can address only the measurement error of an experiment. Bakay* et al.* (2002) found that major unwanted variability in expression profiling experiments was from a substantial interindividual variability. Such variability can be addressed by using multiple individuals randomly sampled from a population. However, as in genetic epidemiology studies, random sampling of individuals is liable to population stratification that may not only inflate type I error but also mask real genetic effects (Deng 2001).

Families generally contain more information about inherited traits than do random unrelated individuals, and family-based studies usually rely on variation and covariation among relatives. Recently Kraft* et al.* (2003) made pioneering efforts to adopt a family-based design in microarray studies to minimize systematic biases such as those from stratification. Stratification was recognized as a confounding factor contributing to spurious association between transcript abundance and disease status in the sample (Gibson 2003; Kraft* et al.* 2003). Kraft* et al.* (2003) presented a stratified family expression association test (FEXAT) to examine the correlation between gene expressions and traits, which was claimed to account for family structure. The FEXAT has a smaller false-discovery rate than the standard Pearson's correlation coefficient test when within-family correlation is of interest. However, it considers only sibship means and does not account for biological relationships between subjects within families. In addition, for large and complex pedigrees, the FEXAT uses only data from sibships extracted from large and complex pedigrees and does not utilize the information from all the pedigree members fully and simultaneously.

Factors such as age, genotype, sex, and habitual physical activity are important sources of variation in microarray experiments (Jin* et al.* 2001; Nogalska and Swierczynski 2001; Roth* et al.* 2002; Yang* et al.* 2003) and for complex traits (Deng* et al.* 2002). These factors may affect statistical analyses if they are not accounted for. Hence, covariate effects should be considered simultaneously and explicitly for gene expression and quantitative trait analyses, which is done in the FEXAT (Kraft* et al.* 2003) and new analyses that are being developed.

Stimulated largely by the pioneering work of Kraft* et al.* (2003), to account for family structure and potential covariate effects, we propose a modified FEXAT statistic and a bivariate analysis based on a variance-components approach to quantify the correlation between gene-expression levels and complex clinical traits. The properties of these two methods in terms of power and type I error rate are explored in a range of situations, in comparison with the FEXAT.

## MATERIALS AND METHODS

### FEXAT statistic:

As a starting point, we briefly introduce the statistic FEXAT proposed by Kraft* et al.* (2003), 1where *i* = 1, … , *n* index the sibships in the study (extended pedigrees may contribute multiple sibships) and *j* = 1, … , *n _{i}* index the subjects in a sibship.

*x*is the expression level for subject

_{ij}*j*in family

*i*, which is measured as the (log of the) fold change of the expression for the gene under study in the subject's RNA relative to a reference sample as in cDNA arrays or as match-mismatch score as in oligonucleotide arrays.

*y*is the trait value. This FEXAT statistic is compared with its asymptotic χ

_{ij}^{2}

_{1}distribution or an empirical permutation distribution to derive

*P*-values in practical data analyses (Kraft

*et al.*2003). It accounts only for sibship means for gene expression levels and trait values and does not consider family structure—the biological relationship between subjects within a family. If the family structures are pedigrees, some subjects will need to be excluded to calculate the statistic, since it accommodates only data for sibs.

### FEXAT(R) statistic:

To account for complete family structure—the relationships between all subjects within a pedigree—we propose here a statistic FEXAT(R), a revised version of the FEXAT. Denote **X*** _{i}* = (

*x*

_{i}_{1}, … ,

*x*

_{ini})

^{T}and

**Y**

*= (*

_{i}*y*

_{i}_{1}, … ,

*y*

_{ini})

^{T}as gene-expression levels and trait values within family

*i*(with

*n*measured subjects), respectively. They can be expressed, respectively, as 2and 3where

_{i}**β**and

_{X}**β**, respectively, are the vectors of fixed effects of gene-expression levels and trait values, which may incorporate effects of any observable covariates (

_{Y}*e.g.*, sex and age) as well as overall means of gene-expression levels and trait values.

**W**

_{Xi}and

**W**

_{Yi}are the incidence matrices of

**β**and

_{X}**β**, respectively.

_{Y}**F**

_{Xi}and

**F**

_{Yi}, respectively, are the family-mean differences of gene-expression levels and trait values, which may be due to confounding factors such as population stratification (Kraft

*et al.*2003);

**F**

_{Xi}∼

*N*and

**F**

_{Yi}∼

*N*, where

**J**

*is a matrix with all elements being 1.*

_{i}**U**

_{Xi}and

**U**

_{Yi}are the incidence vectors of

**F**

_{Xi}and

**F**

_{Yi}with elements of 0 or 1, respectively.

**z**

_{Xi}and

**z**

_{Yi}are the vectors of additive genetic effects of gene-expression levels and trait values, respectively;

**z**

_{Xi}∼

*N*and

**z**

_{Yi}∼

*N*, where

**G**

*is the relationship matrix for the*

_{i}*n*observed individuals within family

_{i}*i*. The elements of

**G**

*are twice the coefficients of kinship.*

_{i}**e**

_{Xi}and

**e**

_{Yi}are the vectors of residual effects of gene-expression levels and trait values within family

*i*, respectively;

**e**

_{Xi}∼

*N*and

**e**

_{Yi}∼

*N*, where

**I**is an

*n*×

_{i}*n*identity matrix. The variances of

_{i}*X*and

_{i}*Y*are thus and , respectively. The FEXAT(R) can be expressed as 4

_{i}Similar to the FEXAT, in practical data analyses, *P*-values for this statistic can be obtained by χ^{2}_{1} test or permutation test.

### Likelihood-ratio statistic:

We present a new statistic [likelihood ratio (LR)] by conducting analyses in a bivariate variance-components framework (Lange and Boehnke 1983; Lange 1997). Define **X** = (**X _{1}**, … ,

**X**, … ,

_{i}**X**)

_{n}^{T}and

**Y**= (

**Y**, … ,

_{1}**Y**, … ,

_{i}**Y**)

_{n}^{T}as gene-expression levels and trait values from the whole sample where

**X**and

_{i}**Y**were defined in Equations 2 and 3.

_{i}**X**and

**Y**can be written in a matrix form as 5where and .

**β**

_{X},

**β**

_{Y},

**W**

_{Xi},

**W**

_{Yi},

**F**

_{Xi}, and

**F**

_{Yi}were defined for Equations 2 and 3. Thus, the variance-covariance matrix is 6where .

**J**and

_{i}**G**were defined for Equations 2 and 3.

_{i}**I**is an identity matrix. σ

^{2}

_{fx}and σ

^{2}

_{fy}denote the family mean variances of the studied gene-expression levels and trait values in the study population, respectively. σ

*denotes the corresponding covariance between them. σ*

_{fxy}^{2}

_{ax}and σ

^{2}

_{ay}denote the variances of additive effects of the studied gene-expression levels and trait values in the population, respectively. σ

*denotes the corresponding covariance between them. σ*

_{axy}^{2}

_{ex}and σ

^{2}

_{ey}denote the variances of residual effects of the studied gene-expression levels and trait values in the population, respectively. σ

*denotes the corresponding covariance between them. The genetic correlation between gene-expression levels and trait values can be calculated as .*

_{exy}Under the assumption of the multivariate normality of gene-expression levels and trait values, the ln-likelihood of **V**, **β _{X}**, and

**β**given the observed data (

_{Y}**X**,

**Y**,

**W**,

_{X}**W**) is 7where

_{Y}Maximum-likelihood estimates can be obtained via a Fisher-scoring algorithm implemented in each sibship and/or pedigree (Lange and Boehnke 1983). Once maximum-likelihood estimates are available, one can test hypotheses of interest by the LR statistic, 8

This test statistic is approximately chi-square distributed with 1 d.f. We can test the null hypothesis σ* _{axy}* = 0. If σ

*= 0 is accepted, one can conclude that there is no significant relationship between the gene-expression level*

_{axy}**X**and trait value

**Y**.

### Simulation study:

A series of simulations were carried out to examine the effects of different factors on the performance of these two methods, in comparison with the FEXAT. To investigate the effects of sample sizes, we simulated three samples of 48, 96, and 192 subjects, respectively. The first sample consisted of 4 families of size 12 (4 × 12; throughout in this article, the first number is for families and the second number for subjects within each family); and the second sample consisted of 8 families of size 12 (8 × 12); and the third sample consisted of 16 families of size 12 (16 × 12). To study the effects of family structures on these three methods, we simulated three family structures: two of them were pedigrees and another was sibship. Sibship structure is a group of full-sibs without parents. The pedigree structures are illustrated in Figure 1. According to the definition of the FEXAT (Kraft* et al.* 2003), a large sibship of size 8 in each pedigree was used to calculate the statistic FEXAT in pedigree type A, whereas in pedigree type B, the complex pedigree should be broken into three sibships to calculate the statistic FEXAT. To study the effects of family sizes on these three statistical methods, we simulated another sample with 24 sibships of size 4 (24 × 4) to compare it with the situations of 8 sibships of size 12 (8 × 12) under the same sample size. For simplicity, we considered a case in the simulation studies of Kraft* et al.* (2003); *i.e.,* the ratio of variance in family means to variance in within-family differences is 1:1. Our simulation results (not shown) indicated that this ratio has little effect on the comparison of the three methods. Total variation of gene-expression levels and trait values were fixed at 1. We studied four levels of genetic correlation between gene-expression levels and trait values (0.0, 0.3, 0.5, and 0.7) and three heritability levels for both the gene-expression levels and trait values (0.2, 0.4, and 0.6), respectively. In addition, we considered two situations about covariates: (1) there were no covariate effects; *i.e.*, β* _{cX}* = β

*= 0, where β*

_{cY}*and β*

_{cX}*are the regression coefficients of the gene-expression levels and trait values on a specific covariate*

_{cY}*c*, respectively; and (2) there were covariate effects on the gene-expression levels and trait values. We assume that β

*= β*

_{cX}*= 0.2 and covariate effects were drawn from a standardized normal distribution. When there were covariate effects on the gene-expression levels and trait values, the computation of the FEXAT statistic used the adjusted (for the covariate) gene-expression levels and trait values that were implemented in the standard regression analyses. We also assumed that the family-mean correlation between gene-expression levels and trait values is 0.3, which may be due to confounding factors such as population stratification (Kraft*

_{cY}*et al.*2003). Although our simulation results are presented for the above assumed parameter values, our results for other parameter values (not shown here) indicated similar conclusions obtained with those presented here.

For each scenario, we generated 10,000 replicate studies and calculated the FEXAT, FEXAT(R), and LR test statistics for two nominal significance levels of α = 0.05 and α= 0.01, respectively. The statistical power and type I error rate are used as criteria to compare the relative performance of these three statistics in quantifying the correlation between gene-expression levels and trait values. Power refers to the probability of declaring a statistical significance when a true correlation exists. Type I error rate is the probability of declaring that gene-expression levels are correlated with trait values when there is no relationship between them, *i.e*., σ* _{axy}* = 0. More detailed descriptions of these test statistics examined are provided in materials and methods.

## RESULTS

The type I error rate and power estimates over 10,000 replicated simulations are summarized in Tables 1–3. All three methods are not sensitive to the correlation between the family-mean expression levels and trait values. That is, they are robust to population stratification (that can generate correlation between family-mean expression levels and trait values; Kraft* et al.* 2003) in association analyses between gene-expression levels and trait values. The estimated type I errors of the FEXAT(R) are slightly conservative and those of the LR are slightly inflated. The type I errors of the FEXAT vary depending upon family structures (see below).

The heritabilities of the gene-expression levels and trait values have no significant effect on type I errors of the three methods. As expected, the statistical powers increase with increasing heritabilities for all three methods. The LR statistic has higher power than the other two methods in most cases.

Comparisons of results in Tables 1–3 show that type I errors of the LR approach nominal levels with increasing sample sizes. For example, when the sample size of sibships increases from 4 × 12 (4 sibships each with 12 sibs) to 16 × 12 (16 sibships each with 12 sibs) and heritability *h*^{2} = 0.4, the estimated type I errors of the LR decrease from 12.7 to 5.3% and from 5.7 to 0.9% under the nominal significance levels of α = 0.05 and α = 0.01, respectively. In contrast, the reverse tendency is observed for the FEXAT. When the sample size increases, the estimated type I errors of the FEXAT have a slight inflation especially when α = 0.01. As expected, the statistical powers increase as the sample size increases for all of the three methods.

Pedigree structure affects type I error rates and powers (Tables 1–3). The FEXAT performs relatively unsatisfactorily in multigeneration complex pedigree data. Compared with the results from the same sample size and family size but with different family structures, the estimated powers of the FEXAT are much lower in complex pedigrees than in sibships, and the difference is more pronounced with large sample sizes. In addition, the type I errors of the FEXAT depart far above the nominal levels for those pedigrees with small sibships. This is largely because (1) the FEXAT uses only the data of sibships by breaking a large pedigree into multiple sibs and may lose information, while the FEXAT(R) and LR use full data in pedigrees; and (2) the FEXAT does not consider the biological relationship (covariance due to polygene effects), while the FEXAT(R) and LR methods accounted for the biological relationships of family members. For example, with a sample size of 16 × 12 (16 pedigrees each with 12 members), heritabilities of 0.4 (for both gene expression levels and trait values), and an expression-trait correlation of 0.7, the estimated powers of the FEXAT are 89.7, 47.6, and 29.7% for the family structures of sibship, pedigree A, and pedigree B, respectively. The type I error rate of the FEXAT is 10.4% with a sample size of 16 × 12 and heritabilities of 0.4 under the family structure of pedigree B. In contrast, the type I error rates of the LR and FEXAT(R) are reasonably robust to family structures and their statistical powers vary to a lesser extent than that of the FEXAT. Taking into account both type I error rate and statistical power, we present a summary of the relative performance of the three methods under different situations in terms of sample sizes and family structures in Table 4. Table 4 also provides a recommendation for choosing different statistics in practical data analysis. This generalization is made under a prerequisite of a large family size (*e.g.*, >12). If the family size is small, the FEXAT(R) is overconservative along with having lower powers (Table 5).

Table 6 illustrates the results when a covariate had effects on both the gene-expression levels and trait values under study. Because both the LR and FEXAT(R) accommodate covariate effects in the mean value structure of the models, they can correctly estimate the regression coefficient of the gene-expression levels and trait values on covariates. A comparison of Tables 2 and 6 shows that both the LR and FEXAT(R) yield nearly identical type I error rates and powers whether or not there are covariate effects. The statistic FEXAT *per se* cannot account for covariate effects. When adjusting gene-expression levels and trait values that can be implemented in the standard regression analyses, the FEXAT can also obtain similar powers to those when covariate effects are absent, however, with slightly inflated type I error rates (Tables 2 and 6). The above results are obtained from samples with large family sizes (Table 6). If the family size is small, with the same total sample size, the FEXAT has relatively larger inflated type I errors (Table 5). In comparison, the LR and FEXAT(R) are reasonably robust to family size regardless of the presence of covariate effects (Tables 2, 5, and 6).

## DISCUSSION

In microarray experiments, measures of gene-expression levels are available for members of a set of families (Shannon* et al.* 2002; Schadt* et al.* 2003). Thus, standard methods of quantitative genetics can be applied to the family-based microarray data analyses. This may facilitate exploration of the association of gene expressions with multifactorial phenotypes of interest. Family-based designs have the advantages of being robust to population stratification. Recently, Kraft* et al.* (2003) proposed a stratified FEXAT to quantify the relationship between gene-expression levels and clinical traits. Stimulated by their work, we presented a modified FEXAT statistic and a bivariate analysis based on a variance-components approach for such experimental data analyses.

Kraft* et al.* (2003) showed that the observed type I errors for the FEXAT were somewhat conservative especially at stringent nominal levels in their simulations. However, in our simulations, the FEXAT tends to have slightly inflated type I errors, especially under a large sample size. As indicated in Kraft* et al.* (2003), the FEXAT was more favorable for large family sizes than for small family sizes (see Kraft* et al.* 2003, Tables 1 and 2), but the reverse trend was observed in our study (see Tables 2 and 4). This is presumably due to different simulation designs between the two studies. In this study, we took into account the biological relationship (covariances due to polygenes or shared environmental covariates) between subjects within the same families when simulating phenotypic data. In the study of Kraft* et al.* (2003), data for family members (siblings) appeared to be drawn independently from a normal distribution. Apparently, the former design is closer to the nature of quantitative-trait inheritance (Lynch and Walsh 1998). It seems that the FEXAT is favored under small family size and/or subjects with distant relationships. Dividing extended pedigrees into multiple sibships, as suggested by Kraft* et al.* (2003), and ignoring covariances among family members greatly decrease the statistical powers and inflate type I errors of the FEXAT as shown in our simulations.

All three methods are robust to population stratification that may result in correlation in family means and yield spurious association results (Kraft* et al.* 2003). The FEXAT(R) seems to be slightly conservative at the cost of a decrease in the powers, while the type I error rates of the LR statistic are slightly higher than nominal levels especially with small sample size. The LR has higher powers than the FEXAT and FEXAT(R) in most cases, except for sibships and large samples where the FEXAT(R) has the highest power. Both the LR and FEXAT(R) are favored by increasing sample sizes regardless of the family structures. This is particularly important given that the decreasing costs of microarray experiments make it practical for gene expression study on a large scale in complex pedigrees. As shown in results, the FEXAT is preferable to the FEXAT(R) and LR statistics under small sample sizes and for sibships, while the performances and the preference of the three statistics are the reverse for moderate to large sample sizes and/or pedigrees.

It is known that maximum-likelihood estimates of the variance and covariance components may be biased when only a small number of observations are considered (Hopper and Mathews 1982; Amos* et al.* 1996). As the sample size increases, the biases in maximum-likelihood estimates of variance and covariance components should be reduced; thus, the type I error rates of the LR statistic should approach their nominal levels (Amos* et al.* 1996) as reflected in our results for the LR method. We examined the empirical distribution of the LR statistic under the null hypothesis with different sample sizes. We simulated 10,000 data sets using pedigree A as described in *Simulation study*, and in each data set the LR statistic was calculated and shown in histograms (Figure 2). The LR statistic follows the chi-square distribution with 1 d.f. asymptotically. However, this statistic has a slightly heavy tail under small sample sizes.

The methods proposed in this study can also be used to characterize the genetic correlation between multiple gene-expression levels. Pairwise analysis of multiple gene expression profiles may provide a prediction of joint expression and regulation of these genes. Due to the high-dimensional gene-expression data from microarrays, multiple testing problems may arise in the trait-expression and/or expression-expression association studies. An initial data reduction to focus on those genes whose expressions varied nontrivially across samples is necessary before the tests are performed. Additionally, systematic measurement errors may contribute to the covariation between different gene-expression levels, which may lead to spurious association results. In a flexible variance-components framework, modeling such covariation as a covariance component in our analyses is straightforward. This may minimize bias in analyses.

Variance-components methods make a critical assumption that the quantitative trait values within a family either follow or can be transformed to follow a multivariate normal distribution (Amos* et al.* 1996). Permutation tests can be performed in situations, particularly for small sample sizes, in which the violation of multivariate normality assumption is difficult to detect. Simulations showed that permutation tests warrant correct type I errors under violations of multivariate normality assumption (Abecasis* et al.* 2000). The procedure of reshuffling the original data is similar to that proposed by Kraft* et al.* (2003). Empirical *P*-values can be computed by comparing the observed statistic with the permuted statistics under the null hypothesis of no trait-expression association. We simulated 10,000 data sets using pedigree A and each data set was resampled 10,000 times to obtain the empirical *P*-value of the FEXAT(R). We found that the FEXAT(R) using a permutation test performs slightly better than that using a chi-square test (Table 7). However, the permutation test is computationally demanding especially for gene-expression data.

Typical microarray experiments aim at characterizing differential gene-expression patterns under distinct treatments. Quantifying trait-expression and/or expression-expression associations at the population level may represent another growing trend toward high-throughput microarray data analyses. In the latter analysis, a large number of candidate genes for complex diseases can be reduced by restricting attention to genes whose expression levels show associations with complex trait values. In the near future, remarkable advances in microarray technology may greatly decrease the experimental costs, making it practical to perform microarray experiments in large sample sizes with large pedigrees. This has been exemplified by the current burgeoning of large-scale whole-genome linkage scans for complex traits, which were hampered by the prohibitory costs just about a decade ago. It is anticipated that the methods based on the variance-components approach that is advantageous under large sample sizes will be a good way to measure quantitatively the relationship between gene expressions and clinical traits in general pedigrees. The program for implementing the methods investigated here is available, upon request, from Y.L. (yanlu{at}creighton.edu) or H.W.D.

## Acknowledgments

This study was partially supported by grants from the Health Future Foundation, the National Institutes of Health, and the State of Nebraska (LB595 and LB692). H.W.D. benefited by partial support from Hunan Province, Chinese National Science Foundation, Huo Ying Dong Education Foundation, the Cheng Kong scholar program, and the Ministry of Education of the People's Republic of China.

## Footnotes

Communicating editor: Y. X. Fu

- Received May 25, 2004.
- Accepted September 13, 2004.

- Genetics Society of America