## Abstract

Testing for genetic association with multiple traits has become increasingly important, not only because of its potential to boost statistical power, but also for its direct relevance to applications. For example, there is accumulating evidence showing that some complex neurodegenerative and psychiatric diseases like Alzheimer’s disease are due to disrupted brain networks, for which it would be natural to identify genetic variants associated with a disrupted brain network, represented as a set of multiple traits, one for each of multiple brain regions of interest. In spite of its promise, testing for multivariate trait associations is challenging: if not appropriately used, its power can be much lower than testing on each univariate trait separately (with a proper control for multiple testing). Furthermore, differing from most existing methods for single-SNP–multiple-trait associations, we consider SNP set-based association testing to decipher complicated joint effects of multiple SNPs on multiple traits. Because the power of a test critically depends on several unknown factors such as the proportions of associated SNPs and of traits, we propose a highly adaptive test at both the SNP and trait levels, giving higher weights to those likely associated SNPs and traits, to yield high power across a wide spectrum of situations. We illuminate relationships among the proposed and some existing tests, showing that the proposed test covers several existing tests as special cases. We compare the performance of the new test with that of several existing tests, using both simulated and real data. The methods were applied to structural magnetic resonance imaging data drawn from the Alzheimer’s Disease Neuroimaging Initiative to identify genes associated with gray matter atrophy in the human brain default mode network (DMN). For genome-wide association studies (GWAS), genes *AMOTL1* on chromosome 11 and *APOE* on chromosome 19 were discovered by the new test to be significantly associated with the DMN. Notably, gene *AMOTL1* was not detected by single SNP-based analyses. To our knowledge, *AMOTL1* has not been highlighted in other Alzheimer’s disease studies before, although it was indicated to be related to cognitive impairment. The proposed method is also applicable to rare variants in sequencing data and can be extended to pathway analysis.

ALZHEIMER’S disease (AD) (MIM 104300) is the most common neurodegenerative disease, and every 67 sec, someone in the United States develops AD (Alzheimer’s Association 2015a). Currently there is no cure for AD, and most cases are diagnosed in the late stage of the disease. It is projected that the number of Americans of age 65 years and older with AD will increase from 5.1 million in 2015 to 13.5 million in 2050, a growth from an estimated 11% of the U.S. senior population in 2015 to 16% in 2050, costing >$1.1 trillion in 2050 (Alzheimer’s Association 2015b). To advance our understanding of the initiation, progression, and etiology of AD, the Alzheimer’s Disease Neuroimaging Initiative (ADNI) was started in 2004 and continues to the present, collecting extensive clinical, genomic, and multimodal imaging data (Shen *et al.* 2014). Many other genetic studies have been conducted, identifying multiple common and rare variants, shedding light on pathogenic mechanisms of AD (Marei *et al.* 2015; Saykin *et al.* 2015). In particular, the APOE allele has been consistently shown to be associated with AD. However, only 50% of AD patients carry an APOE allele, suggesting the existence of other genetic variants contributing to risk for the disease (Karch *et al.* 2014). A recent study indicates that 33% of total AD phenotypic variance is explained by common variants; APOE alone explains 6% and other known markers 2%, meaning >25% of phenotypic variance remains unexplained by known common variants (Ridge *et al.* 2013). Hence, as for other common and complex diseases and traits, many more genetic factors underlying late-onset AD are yet to be discovered. One obvious but costly approach is to have a larger sample size. Alternatively, more powerful analysis methods are urgently needed. For example, in contrast to the popular single single-nucleotide polymorphism (SNP)-based analysis, novel gene- and pathway-based analyses may be more powerful in discovering additional causal variants. As demonstrated by Jones *et al.* (2010), jointly analyzing functionally related SNPs sheds new light on the relatedness of immune regulation, energy metabolism, and protein degradation to the etiology of AD. The reason is due to the well-known genetic heterogeneity and small effect sizes of individual common variants, as observed from published genome-wide association study (GWAS) results (Manolio *et al.* 2009). To boost power in identifying aggregate effects of multiple SNPs, it may be promising to conduct association analysis at the SNP-set (or gene) level, rather than at the individual SNP level.

Another strategy is to use multiple endophenotypes, intermediate between genetics and the disease, for their potential to have stronger associations with genetic variants. In addition to boosting power, the use of intermediate phenotypes may provide important clues about causal pathways to the disease (Maity *et al.* 2012; Schifano *et al.* 2013). A recent GWAS demonstrated the effectiveness of the strategy: some risk genes such as *FRMD6* were first identified to be associated with some neuroimaging intermediate phenotypes (*e.g.*, hippocampal atrophy) (Shen *et al.* 2014) and then were later validated to be associated with AD (Hong *et al.* 2012; Sherva *et al.* 2014). A possibly useful but underutilized intermediate phenotype is the brain default mode network (DMN), consisting of several brain regions of interest (ROIs) remaining active in the resting state. Brain activity in the DMN may explain the etiology of AD (Metin *et al.* 2015) and is a plausible indicator for incipient AD (Damoiseaux *et al.* 2012; Greicius *et al.* 2004; He *et al.* 2009; Jones *et al.* 2011; Balthazar *et al.* 2014). Since there is growing evidence that genetic factors play a role in aberrant default mode connectivity (Glahn *et al.* 2010), it may be substantially more powerful to detect genetic variants associated with the DMN, a set of multiple intermediate phenotypes, than with AD.

Here we discuss gene-based multitrait analysis, aiming at discovering genes associated with multiple traits such as the DMN. To date, several but not many methods have been proposed for gene-based multitrait analysis (Maity *et al.* 2012; Guo *et al.* 2013; Van der Sluis *et al.* 2015; Wang *et al.* 2015). The simplest way is to use the minimum *P*-value (minP) test based on the most significant single-SNP–single-trait association, which, however, may lose power in the presence of multiple weak associations between multiple SNPs and multiple traits. Some methods, such as that in Van der Sluis *et al.* (2015) and M-TopQ25Stat (Guo *et al.* 2013), utilize only a few top association signals among the pairwise single-SNP–single-trait associations. Some methods based on principal components analysis (PCA) or principal components of heritability (PCH), originally proposed for multiple SNPs and a single trait (Wang and Abbott 2007; Klei *et al.* 2008), may be also applied. However, these methods and canonical correlation analysis (CCA) (Tang and Ferreira 2012) make use of only one or a few top components, and thus they share the same weakness of power loss in the presence of multiple associations; furthermore, the number of principal components (PCs) may be difficult to determine (Aschard *et al.* 2014). Another extreme is the burden test (Shen *et al.* 2010; Guo *et al.* 2013; Mukherjee *et al.* 2014), which is powerful in the presence of a dense association pattern, in which most SNP–trait pairs are associated with almost equal effect sizes and directions; otherwise, *e.g.*, when the association directions of some SNP–trait pairs are different, it does not perform well (as is well known for analysis of rare variants). A compromise between the above two extremes is a variance-component test (Maity *et al.* 2012; Wang *et al.* 2013), which is more robust to association density/sparsity and varying association directions. Nevertheless, as shown in the context of multiple rare variants and a single trait (Pan *et al.* 2014), it may still suffer from power loss in the presence of more sparse association patterns (*i.e.*, when there are fewer associated SNP–trait pairs). A fundamental challenge in multivariate analysis is the lack of a uniformly most powerful test: nonadaptive test may be powerful in some situations, but not in others. Nevertheless, we aim to construct an adaptive test such that it can maintain high power, not necessarily highest power, across a wide range of scenarios. In particular, the proposed test is adaptive at both the SNP and trait levels. Its key feature is the use of a weighting scheme to yield robust statistical power no matter whether the true and unknown association pattern is dense or sparse (or in whatever directions), and the weight is determined data adaptively. In addition, some chosen weights correspond to several existing tests, including a burden test and a variance-component test. Therefore, the high power range of the proposed test covers those of the burden test and the variance-component test. Moreover, the proposed test is based on the general framework of the generalized estimating equations (GEE), and hence it is flexible with the capability to incorporate covariates and various types of traits (Liang and Zeger 1986). It also avoids a difficulty in correctly specifying a joint multivariate distribution or likelihood for a set of multiple traits. Furthermore, we extend the proposed method to pathway analysis, in which it is adaptive to possibly varying gene-level associations.

We compare the performance of the new test with that of several existing tests, using both simulated and real data. The methods were applied to structural magnetic resonance imaging (MRI) data drawn from the ADNI to identify genes associated with the DMN. In the GWAS, 277,527 SNPs were mapped to 17,557 genes, among which genes *AMOTL1* on chromosome 11 and *APOE* on chromosome 19 were discovered by the new test to be significantly associated with the DMN. Notably, gene *AMOTL1* was not detected by single SNP-based analyses. We also illustrate the application of the methods to the ADNI whole-genome sequencing (WGS) data, although no significant genes were identified, presumably due to a relatively small sample size.

In the following, we briefly review GEE and an existing method before introducing the new test in *Materials and Methods*. In *Results*, the new and several existing methods are compared with applications to the ADNI data and simulated data mimicking the ADNI data. We end with a short summary of the conclusions.

## Materials and Methods

### Review

#### Generalized estimating equations:

Suppose for each individual we observe *k* traits *q* covariates and a set of SNPs with Denote and where *I* is a identity matrix, and ⊗ represents the Kronecker product. We model the mean of the phenotypes using a marginal generalized linear model(1)with parameters and a link function The regression coefficients are a vector, in which represents the effect of the *j*th SNP on the *t*th trait, while the element of is the effect size of the *s*th covariate on the *t*th trait. Liang and Zeger (1986) proposed estimating *φ* and *β* by solving the GEE(2)with and where *φ* is a dispersion parameter, models the variances with a variance function and is a working correlation matrix with possibly some unknown parameters *α*. Specifically, for quantitative traits () with the identity link function (or more generally, for any generalized linear model with a canonical link function), the score vector and its variance–covariance matrix areThe covariance matrix can be partitioned according to the score components for *ϕ* and *β*: For convenience, the working independence model is often used with as an identity matrix as done in this article unless specified otherwise.

Our primary concern is to test for overall genetic effects with while treating *ϕ* as nuisance parameters. To perform the score test, we evaluate Equation 1 under Under we have and the estimate of *ϕ*, denoted as is the solution to the generalized score equation The marginal mean is estimated by

For testing SNP-set effects, we consider the subcomponents of the score vector for *β*:(3) asymptotically follows a multivariate normal distribution under where can be written as Each element measures the association strength between SNP *j* and trait *k* for and and is asymptotically proportional to in Equation 1. implies there is no association between SNP *j* and trait *k*; similarly (or small) indicates no (or weak) association between SNP *j* and trait *k*.

For testing the GEE-Score test statistic is defined byUnder , the GEE-Score statistic asymptotically follows a central chi-square distribution with degrees of freedom. When is large, this standard score test loses power for large degrees of freedom. Another way to draw inference, especially convenient when combining the score test with other tests as discussed later, is to simulate for and obtain the null statistics The *P*-value can be calculated as where denotes the indicator function.

For ease of notation, we suppress *β* and take and hereafter.

#### An adaptive association test for a single SNP:

Zhang *et al.* (2014) proposed a class of sum of powered score (SPU) tests for testing association between an individual SNP and multiple traits, along with its data-adaptive version (aSPU). The SPU tests are a family of association tests based on the (generalized) score vector in the GEE framework, aiming for at least one of them to be powerful in any given situation. With only a single SNP *j*, then the score vector reduces to The association between the SNP and *k* traits can be quantified with a test statisticwhere a candidate integer is chosen from a preselected parameter set *e.g.*, The statistical power of an SPU(*γ*) test depends on the choice of When *γ* is an odd integer, the SPU(*γ*) test sums up the association signals across all the traits, retaining high power if all or most of the multiple traits have an almost equal effect size in the same association direction. A special case is giving a burden test commonly used for rare variants. With an even *γ*, the SPU(*γ*) test will be more powerful when some traits have different association directions. In particular, the SPU(2) test is the same as the sum of squared score (SSU) test (Pan 2011), closely related to multivariate distance matrix regression (MDMR) (McArdle and Anderson 2001), kernel machine regression (KMR) (Liu *et al.* 2007), and variance-component tests (Tzeng *et al.* 2011). Furthermore, as *γ* increases, the SPU test upweights the more strongly associated traits, while reducing the weights on other ones. In particular, when (as an even integer), only the maximum component of the score vector is used and the test statistic is defined as The SPU test is similar to the UminP test (when the variances of the score components are almost equal). To compute the significance of an SPU test, Monte Carlo (MC) simulations (or alternatively, permutations) are used; for the null score is generated from from which the null statistics SPU can be obtained for each *γ*. Then the *P*-value can be calculated as

However, it is not clear how to choose an optimal *γ* *a priori* for given data. Hence, Zhang *et al.* (2014) proposed an aSPU test to extract association evidence from multiple SPU(*γ*) tests. The statistic of the aSPU test is the minimum *P*-value of SPUs for some candidate values of *γ*,where is the *P*-value of SPU By MC simulations (or permutations), the *P*-value of aSPU, along with those of all SPU(*γ*) tests, can be efficiently calculated based on the same set of the null statistics in a single layer.

#### Existing gene-based tests:

We compare the proposed test with several existing gene-based tests for multiple traits, including multivariate analysis of variance (MANOVA), MDMR with the Euclidean distance (McArdle and Anderson 2001), multivariate KMR under linear kernel (Maity *et al.* 2012), and a multivariate functional linear model (MFLM) (Wang *et al.* 2015). We note that KMR can be derived based on a random-effects model while MFLM is built on a fixed-effect model. For implementation, the R package vegan was used for MDMR; R code for KMR and MFLM was downloaded from the authors’ websites, http://www4.stat.ncsu.edu/∼maity/software.html and https://www.nichd.nih.gov/about/org/diphr/bbb/software/fan/Pages/default.aspx, respectively. Since KMR (Maity *et al.* 2012) was computationally slow, it was excluded from the simulation studies.

### New methods

#### An adaptive test:

We introduce a novel gene-based adaptive sum of powered score test for a set of multiple traits, denoted as aSPUset, by extending the single SNP-based test of Zhang *et al.* (2014). Suppose that there are *p* SNPs in a gene and *k* traits of interests. Recall that is the generalized score vector of length in GEE, and *V* is the covariance matrix of the score vector; each element of the score, quantifies the association between SNP *j* and trait *t*. In practice, the true and unknown association patterns across multiple SNPs and multiple traits are complex: some SNPs may be associated with some traits, but not with other traits; different SNPs may be associated with different subsets of the traits with varying association strengths and directions. Since the use of nonassociated SNPs and traits in a test statistic could reduce the power of the test, we may want to give higher weights to more likely associated SNPs and traits. However, how much to optimally overweight these likely associated SNPs and traits depends on the true association pattern, which is unknown. The aSPUset test employs two positive integer parameters, and to control the degrees of weighting over the SNPs and over the traits, respectively, and the two parameters are chosen data adaptively. A larger puts more weights on the SNPs more likely to be associated with a given trait, while a larger upweights the traits more strongly associated with the SNPs.

We build the test statistic as follows. For each trait *t*, quantifies the association between the single trait and multiple SNPs, and then combines the single trait-based statistics:(4)Here candidate integers and are chosen from two preselected parameter sets and We used due to the good performance in our numerical studies.

In can be rewritten by an alternative form is a weight for each score element, which reflects the association strength (and direction) between SNP *j* and trait *t* of the given data. With the SPU test weights each SNP equally and yields the highest power if all the SNPs are associated with the trait *t* with similar effect sizes and association direction (*i.e.*, all positive or all negative). When the subsets of SNPs are associated with the trait *t*, or their association directions are different, SPU() is often more powerful. As increases, SPU() puts heavier weights on the SNPs that are more strongly associated with the trait *t*. At the end, as the parameter approaches ∞ (as an even integer), it considers the only most significant SNP; *i.e.*,

Similarly, controls how much to upweight the traits that are more likely to be associated with SNPs. SPU weights all traits equally and performs best when each trait is equally associated with the SNPs. Similarly, as increases, the SPU test overweights larger trait-based statistics in an extreme case, as we define If one is more interested in the most significantly associated single-SNP–single-trait pair, can be considered. Using various combinations of and one can target and fit different association patterns across multiple SNPs and multiple traits, including their varying sparsity levels. As a result, the SPU() tests cover several existing tests as special cases as will be shown.

The aSPUset test chooses () data adaptively by taking the minimum *P*-value of SPU()s as the test statistic for candidates and To assess the significance of all the and the aSPUset test, we use either permutations or MC simulations in a single layer to obtain their *P*-values. The permutation-based method is useful when the covariance matrix (*V*) is not easy to estimate (*e.g.*, in a high-dimensional setting) or when the usual Normal asymptotics may not hold (*e.g.*, *n* is not large compared to ); in contrast, the simulation-based method is more restrictive but computationally more efficient. For the permutation-based method, residual terms in Equation 3 are permuted to generate for from which the null score vector is computed as Alternatively, for the simulation method, we simulate the null score vectors independently from the null distribution: for

In either case, the null statistics can be calculated from the null score vectors for Because all SPU() tests are based on the same null score vectors we just need to simulate one set of null scores and efficiently compute the null statistics, tests simultaneously for candidate ’s. Then the *P*-value of isWe can also simultaneously and efficiently compute the *P*-value of the aSPUset test based on the same set of the null statistics being used for the SPU tests. Note that for each we can calculate its *P*-value as Denote its minimum as Then the significance of the aSPUset test is obtained as

#### Extensions:

As shown by Zhang *et al.* (2014), in some but not all situations, the GEE-Score test may perform better than the aSPU test for a single SNP and multiple traits; the opposite is true too. Hence, to take advantage of both tests, we combine them by taking their minimum *P*-value to form a new test statistic,(5)Its *P*-value can be calculated using simulations or permutations as for aSPUset. The null statistic GEE-Score^{(}^{b}^{)} is obtained from the same score that is used for Hence the null statistics for and GEE-Score^{(}^{b}^{)} can be computed simultaneously.

We can also consider a variance-weighted version of the SPU and aSPUset tests, called the SPUw and aSPUw-set, respectively. Each diagonal element of the covariance matrix (*V*) corresponds to the variance of the individual score element denote the variance of as The SPUw test is defined with the statisticThe aSPUw-set test statistic is defined as the one taking the minimum *P*-value of the multiple-SPUw tests in the same way as that for aSPUset and SPU The SPUw and aSPUw-set tests are invariant to the scale of each trait and hence may be useful when it is unclear how to standardize multiple traits that are in different scales. However, standardizing the traits (such that their sample variances are all equal to one) may or may not be beneficial; often, the power of the unweighted SPU tests and that of the weighted ones are similar as shown before in other contexts (Pan *et al.* 2014; Zhang *et al.* 2014).

#### Relationships with other methods:

The SPU tests are closely related to some existing tests, covering some as special cases. Guo *et al.* (2013) proposed a set of nonparametric methods for gene-based multiple-trait association analysis, called M-MeanStat, M-MaxStat, and M-TopQ25Stat. Each of the methods of Guo *et al.* (2013) is built on a generalized Kendall’s *τ*, which quantifies the pairwise association between a single SNP and a single trait. Comparing two sets of statistics, M-MeanStat *vs.* and M-Max *vs.* , we see their equivalence as described in *Appendix A*.

It is obvious that the SPU(1,1) test is a burden test, which is optimal if its implicit assumption that each SNP–trait pair is equally associated (with the same association direction) holds. The SPU(2,2) test has connections to several other tests. Zhang *et al.* (2014) showed that when testing on a single SNP, the SPU(2,2) test under the GEE working independence model is equivalent to MDMR with the Euclidean distance. However, for testing multiple SNPs, the equivalence does not hold (*Appendix B*). KMR with the linear kernel has the same test statistic as SPU(2,2) if the working correlation matrix of the latter in GEE is correctly specified as the true correlation matrix of [*i.e.*, ]; see *Appendix C* for derivation. This illustrates the flexibility of our proposed test under GEE, in contrast to the stronger modeling assumption in KMR. Since KMR can be derived based on a random-effects model while the burden test is formulated based on a fixed-effects model, our proposed method can be regarded as combining results from both fixed- and random-effects models.

As is shown in our numerical studies, the GEE-Score test and MANOVA performed similarly; we establish the equivalence between the GEE-Score test and MANOVA with the Pillai–Bartlett trace (*Appendix D*). Muller and Peterson (1984) discussed the close relationships among four versions of MANOVA (*i.e.*, with the Pillai–Bartlett trace, Hotelling–Lawley’s trace, Wilk’s *λ*, and Roy’s largest root), each of which can be written as a function of generalized canonical correlations (CCA). Hence the GEE-Score test is directly related to MANOVA and CCA.

#### Pathway analysis:

We extend the adaptive test for association analysis of a single trait and a pathway (*i.e.*, a set of genes) (Pan *et al.* 2015) to that of multiple traits and a pathway. The main idea is to allow adaptive weighting at the gene level, in addition to at the SNP and trait levels. Given a pathway *S* with genes and a single trait *t*, we partition the score vector according to the genes in *S* as with a subvector for gene *g* (with SNPs) as Denote and as the gene-specific SPU and the pathway-based SPU test statistics for a single trait *t*, respectively. Define a new test statistic as the pathway analysis for multiple traits,where the three scalars are specified to control the degrees of weighting the SNPs, genes, and traits, respectively; gives gene-specific weights for the SNPs in gene *g* as and gives gene-specific weights for each gene in the pathway *S*. These weights are specified based on some prior knowledge on the importance of the genes and SNPs; without prior knowledge, we can simply use an equal weight 1 on each gene and each SNP, as used in our later simulations. We employed and in later simulations.

Finally, a new adaptive test for pathway analysis, denoted the GEE-aSPUpath test, is defined aswhere is the *P*-value of the GEE-SPUpath test. The simulation or permutation procedure for generating the null statistics and calculating *P*-values for all the GEE-SPUpath and GEE-aSPUpath tests is similar to that for the GEE-aSPUset test.

Due to the limited space, we will not discuss the pathway-based tests in the sequel; some simulation results are presented in Supplemental Materials, File S4.

### Data availability

The authors state that all data necessary for confirming the conclusions presented in the article are represented fully within the article. The R code for the proposed tests and simulations is available in Supplemental Materials, File S5. An R package GEEaSPU is to be uploaded to CRAN.

## Results

### Real data example

#### ADNI data:

Data used in the preparation of this article were obtained from the ADNI database (adni.loni.usc.edu). The ADNI was launched in 2003 by the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical companies, and nonprofit organizations, as a $60 million, 5-year public–private partnership. The primary goal of ADNI has been to test whether serial MRI, positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early AD. Determination of sensitive and specific markers of very early AD progression is intended to aid researchers and clinicians to develop new treatments and monitor their effectiveness, as well as lessen the time and cost of clinical trials. The Principal Investigator of this initiative is Michael W. Weiner, VA Medical Center and University of California, San Francisco. ADNI is the result of efforts of many co-investigators from a broad range of academic institutions and private corporations, and subjects have been recruited from >50 sites across the United States and Canada. The initial goal of ADNI was to recruit 800 subjects but ADNI has been followed by ADNI-GO and ADNI-2. To date these three protocols have recruited >1500 adults, ages 55–90, to participate in the research, consisting of cognitively normal older individuals, people with early or late MCI, and people with early AD. The follow-up duration of each group is specified in the protocols for ADNI-1, ADNI-2, and ADNI-GO. Subjects originally recruited for ADNI-1 and ADNI-GO had the option to be followed in ADNI-2. For up-to-date information, see www.adni-info.org.

#### GWAS with ADNI-1 data:

One objective of ADNI is to elucidate genetic susceptibility to AD. We conducted a gene-based multitrait analysis for ADNI-1 data, by using gray matter volumes in the 12 ROIs corresponding to the DMN as intermediate phenotypes. The DMN is a network of brain regions that are active when an individual is at wakeful rest, which includes inferior temporal, medial orbitofrontal, parahippocampal, precuneus, and posterior cingulate ROIs (Greicius *et al.* 2004). Importantly, DMN activity distinguishes cognitively impaired patients such as with Alzheimer’s, attention deficit hyperactivity disorder (ADHD), or bipolar disorder from healthy controls (Greicius *et al.* 2004; Buckner *et al.* 2008; Meda *et al.* 2014; Metin *et al.* 2015). The gray matter volumetric measures related to the DMN were extracted from the ADNI-1 baseline data.

We included all SNPs with minor allele frequency (MAF) ≥ 0.05, with genotyping rate >90%, and surviving the Hardy–Weinberg equilibrium test at a significance threshold of 0.001. After all rounds of quality control, 519,286 SNPs remained, among which 277,527 SNPs were mapped to 17,557 genes. To consider SNPs in promoter or regulatory regions for each gene, we included SNPs upstream and downstream within 20 kb of each gene. Subjects with >10% missing genotypes were excluded, and only non-Hispanic Caucasians whose 12 gray matter volumes in the DMN were all measured at baseline were included, resulting in 144 patients with AD, 311 subjects with MCI, and 180 healthy elderly controls. For covariates, gender, years of education, handedness, age, and intracranial volume (ICV) measured at baseline were included.

To demonstrate the applicability and power of our approach, we applied MANOVA, MDMR (McArdle and Anderson 2001), KMR (Maity *et al.* 2012), MFLM (Wang *et al.* 2015) and GEE-based tests, GEE-Score, and aSPUset and aSPUset-Score tests. The number of MC simulations or permutations for each method was set at at the beginning, but was increased to if an obtained *P*-value was which ensured the identification of the genes at the genome-wide significance level (*P*-value with a Bonferroni adjustment). When any obtained *P*-value was <1.0*e*-8, we reported it as 1.0*e*-8. The *P*-values of permutation-based aSPUset and of simulation-based aSPUset agreed well (with a Pearson correlation of 0.98), and thus we reported only permutation-based results. For MFLM, we used *β*-smooth basis functions with the Pillai–Bartlett trace as a representative.

The aSPUset and MDMR tests uncovered two loci associated with the DMN. Table 1 lists the genes with the highest significance levels. Genes * (on chromosome 11) and **APOC1*, *APOE* (on chromosome 19) were identified by both aSPUset and MDMR, but not by other tests, while *TOMM40* (on chromosome 19) was detected only by aSPUset. *AMOTL1* is known to be involved in cell adhesion and cell signaling (Hamatani *et al.* 2004). A recent study using a pathway-enrichment strategy showed that the genes involved in neuronal cell adhesion and cell signaling are overrepresented in schizophrenia and bipolar disorder (Meda *et al.* 2014). Anney *et al.* (2008) identified *AMOTL1* as a gene associated with ADHD. The gene was also highly expressed in thalamus, a brain region implicated in the cognitive impairment of early stage Huntington’s disease (Schmouth *et al.* 2013). Three genes (*APOC1*, *APOE*, *TOMM40*) in chromosome 19 could not be readily discerned due to their physical closeness, although their gene sizes (*i.e.*, the numbers of SNPs) varied. The *P*-values of MDMR became less significant as the gene size increased, while the aSPUset was robust to the number of SNPs. This locus containing *APOE* is well known to be related to Alzheimer’s disease and cognitive impairment disorder (Seshadri *et al.* 2010; Kamboh *et al.* 2012; Liu *et al.* 2014).

Table 2 lists the SNPs included in the significant genes. We applied several single SNP-based tests for association with the default mode network. For each method, the permutation or simulation number was increased up to to satisfy the genome-wise significance level. As shown in Table 2, none of the SNPs in gene *AMOTL1* was significant, suggesting that a strong association signal was retained only in the gene level, rather than in the SNP level. On the other hand, SNP rs429358 contained in three genes (*APOC1*, *APOE*, *TOMM40*) was highly significant with a *P*-value of 1.0*e*-8. These results lend support to the proposed aSPUset test’s potential to be able to recover both multiple weak effects and single strong effects, due to its adaptiveness.

We explored each identified locus in detail in Figure 1 and Figure 2. In Figure 1, a LocusZoom plot (Pruim *et al.* 2010) illustrates local linkage disequilibrium (LD), recombination patterns, and *P*-values obtained from the single SNP-based aSPU test for the DMN. Figure 2 illustrates the association analyses for genes *AMOTL1* and *APOE*, respectively. First, we obtained *P*-values from the univariate test between each SNP and each individual trait composing the DMN, and then we applied the SNP-based test (aSPU) between each SNP and DMN (12 traits). Finally, we applied the aSPUset test at the gene level for the DMN. The SNPs contained in *AMOTL1* showed strong LD (Figure 1A), and their aggregate effects turned out to be significant at the gene level (Figure 2A). Among the SPU tests applied with SPU(3,2) showed the minimum *P*-value, implying that weak effects were aggregated for an overall association. In Figure 2B, only one variant (rs429358) in *APOE* was significant, but the significance level of aSPUset did not diminish in the gene-level analysis. In testing *APOE*, the *P*-values of SPU(2,1), SPU(4,1), SPU(6,1), SPU(8,1), and SPU(1) were tied and the most significant; this suggested that one SNP (rs429358) dominated in the gene level across all the traits.

Since the proposed test is based on combining all possible single-SNP–single-trait association pairs, if one wants to identify which pairs contribute most to an overall association, one can simply examine the significance levels of the univariate single-SNP–single-trait association tests. For example, Figure 2, A,a and B,a, illustrates the contribution of each SNP–trait pair for *AMOTL1* and *APOE*. In the gene *AMOTL1*, the SNP–trait pairs, (rs1367505, R-InferiorTemporal), (rs2033367, R-InferiorTemporal), and (rs333027, L-InferiorParietal), were ranked highest; for *APOE*, the top three significant pairs were (rs429358, R-Precuneus), (rs2075650, L-Precuneus), and (rs429358, L-InferiorParietal).

As shown in File S1, we conducted a single SNP-based GWAS scan for the ADNI-1 data. Interestingly, no SNP was significant from univariate single-SNP–single-trait analyses as shown in File S1, Figure A and Figure B. Furthermore, only one SNP, rs429358, was significant in single SNP-based multitrait analyses as shown in File S1, Figure C and Figure D. In contrast, two loci (*AMOTL1* and *APOE*) were uncovered by gene-based multitrait analyses by our proposed new test (File S1, Figure E and Figure F). In all analyses, covariates considered included gender, years of education, handedness, age, and ICV measured at baseline. Taken together, these results clearly demonstrated the advantage and power gain of our proposed gene-based multitrait analysis.

#### Validation with ADNI-GO/2 data:

Using the ADNI-1 data as the discovery sample, our GWAS identified two loci associated with the DMN. To validate the results, each method was applied to the two genes *AMOTL1* and *APOE*, using the ADNI-GO/2 data as the validation sample (with ). We applied the same SNP-filtering criteria as applied to ADNI-1. Table 3 presents the *P*-values obtained from each method; no significant association was identified. Due to different genotyping arrays, ADNI-GO/2 data contain different sets of SNPs from those of ADNI-1; we imputed missing SNPs that were originally included in the analysis of ADNI-1, based on the reference samples of HapMap 3 with MaCH (Liu *et al.* 2013), to apply each method to the identical SNP sets of ADNI-1. The aSPUset and aSPUset-Score tests identified gene *APOE* with *P*-values of 0.019 and 0.024, respectively, which passed the significance threshold of 0.05/2 as shown in Table 3, but gene *AMOTL1* was not significant by any test. File S2, Figure A illustrates *P*-values from single SNP-based testing after adjusting for covariates; SNP rs429358 was associated with the DMN (*P*-value 1.9*e*-3) by passing the Bonferroni-adjusted significance level of 0.05/12. File S2, Figure B presents *P*-values for the two candidate gene regions based on the ADNI-GO/2 data; the methods include the univariate single-SNP–single-trait test, the single SNP-based multitrait aSPU test, and the gene-based multitrait aSPUset test.

We should mention possible sample differences between ADNI-1 and ADNI-GO/2 cohorts. The ADNI-1 cohort includes three subject groups consisting of 25% patients with AD, 50% subjects with MCI, and 25% cognitively normal (CN) subjects; in contrast, the ADN-GO/2 study assigns 754 subjects into five groups: 20% CN, 12% significant memory concern (SMC), 35% early mild cognitive impairment (EMCI), 17% late mild cognitive impairment (LMCI), and 16% AD. At least the proportions of the CN subjects and patients with AD in the two cohorts are different, which might lead to different association results.

Finally, we combined the two cohorts to form ADNI-1/GO/2 with a larger sample size (∼1400 subjects) and obtained the *P*-values from the tests for the two candidate gene regions. The two genes were highly significantly associated with the default mode network as shown in Table 3.

#### Gene-based rare variant analysis of the ADNI sequencing data:

The proposed method was applied to analysis of rare variants with the ADNI WGS data, consisting of 254 and 500 subjects from ADNI-1 and ADNI-GO/2, respectively. In total, 26,142 genes were included for analyses; all variants inside a gene and those located 25 kb of upstream and downstream of the gene were mapped to the gene. Five covariates were adjusted: gender, years of education, handedness, age, and ICV. Due to the low frequency of rare variants, the asymptotic assumption for some tests may not hold; we modified each method to avoid using asymptotics. For MANOVA, rather than using the usual *F* distribution, we permuted residuals (under the null model) to estimate its null distribution; for aSPUset and MFLM, similarly the permutation-based method was applied. We included all rare variants within each gene region; the number of variants within each region ranged from 3 to 750. Sometimes permutation-based MANOVA suffered from rank deficiency when constructing the test statistic and could not be applied to ∼600 genes; MFLM also failed for some genes due to rank deficiency.

First, we included only rare variants (with MAF < 0.01) and then, both rare and low-frequency variants (with MAF < 0.05). No gene passed the genome-wide Bonferroni-adjusted significance threshold of The results for each set of rare variants are illustrated in File S3, Figure A and Figure B. MFLM was problematic with an inflation factor ∼1.5 in both analyses.

Given that two gene regions were significantly associated with DMN in the previous GWAS analysis, it would be of interest to see whether the rare variants in the two genes were associated. Table 4 reports the *P*-values for the two candidate genes. No significant associations were detected. File S3, Figure C depicts the *P*-values from single trait-based tests, including SKAT, SKAT-O, T1 (a burden test for rare variants with MAF < 0.01), T5 (a burden test for rare and low-frequency variants with MAF < 0.05), minP, and aSPU tests (Wu *et al.* 2011; Pan *et al.* 2014). T1 and T5 are equivalent to the SPU(1) test with MAF thresholds 0.01 and 0.05, respectively. The minP test is similar to the SPU(∞) test.

### Simulations

#### Simulation setups:

We evaluated the performance of our method along with several existing methods in simulation studies. The simulated data mimicked the association structures for the two genes (*AMOTL1* on chromosome 11 and *TOMM40* on chromosome 19) and the DMN in ADNI-1 data. Two factors were considered: association effect size (setup 1) and sparsity of association patterns (setup 2). For setup 1, various effect sizes were created by scaling the regression coefficient estimates obtained from a multivariate linear model (MLM) fitted to the original data. On each gene, an MLM was fitted to the ADNI-1 data, including the covariates (), SNPs (), and DMN (). For covariates, we included gender, education, handedness, age, and ICV as in the original data analysis. Denote the parameter estimates in an MLM as follows: is a vector for intercepts; is a matrix, in which represents the effect size of SNP *j* on trait *t*; the element in matrix stands for the *q*th covariate effect on the *t*th trait; and ∑ is the covariance estimate for the multivariate error term. To maintain the true correlation structures among genotype scores and five covariates we sampled pairs from the ADNI-1 data in each simulation. The multiple traits for subject *i* were generated from a multivariate normal distribution:(6)Here *φ* was a scaling parameter controlling the effect sizes of the SNPs (): with the null hypothesis held and type I error rates were evaluated; at the effect sizes were set to be equal to the estimated ones from the ADNI-1 data.

For setup 2, we varied the sparsity level of the association structure. At a fixed we increased the gene size by adding some null SNPs to gene *AMOTL1*. For the null SNPs, the genotype data adjacent to *AMOTL1* were used. As before, pairs were sampled from the ADNI-1 data. Throughout simulations, 10,000 replicates were used for each setup and the tests were conducted at the significance level

#### Type I error and power:

All the tests showed type I error rates controlled under the nominal level of 0.05 (Table 5). Of note, MDMR resulted in conservative type I error rates. In setup 1 (Table 5), as the association effect size (*φ*) decreased, the aSPUset and aSPUset-Score tests were more powerful than other tests, suggesting the potential usefulness of the proposed tests in identifying causal SNPs with weak effects. Since MFLM was proposed to reduce the dimensionality of the SNP data, it might not be desirable to use MFLM here; it might perform better with larger numbers of SNPs.

In setup 2 (Table 6), the aSPUset and aSPUset-Score yielded higher power than other tests as the proportion of the null SNPs in the SNP set increased. Throughout the simulations, the GEE-Score test performed similarly to MANOVA, confirming their equivalence.

#### Computational time:

We reported the computational requirement of each method in Table 7 by taking the average computation time for simulation setup 2. MANOVA was computationally most efficient, followed by MFLM. As the number of SNPs increased, the GEE-Score test and the aSPUset-Score test became computationally more demanding, but still feasible.

### Conclusions

We have presented a highly adaptive association test for multiple traits and multiple genetic variants. From the GWAS analyses of the ADNI-1 data (File S1), we observed its potential power gains in identifying cumulative weak effects of multiple associated SNPs in gene *AMOTL1* with multiple traits, which were undetectable by several other gene-based tests and single SNP-based tests. Given that most common variants have only weak effects for complex diseases and traits, developing testing strategies to improve power in identifying multiple SNPs with weak effects is very important. Our proposed method is developed along this direction. Furthermore, due to its adaptiveness, it also retains power in the presence of only one or a few associated SNPs (or traits), as shown for the *APOE* gene with the ADNI-1 data (while several existing gene-based tests failed to capture this). Our proposed adaptive test is in contrast to most of the existing tests, which may be powerful in one or more situations, but not across a wide range of situations. In practice, since the true association pattern for a given gene and traits is unknown, it is unclear which nonadaptive test should be used; it will be convenient and promising to apply an adaptive test such as our proposed one.

We emphasize the potential power gain with the use of multiple traits, especially of intermediate phenotypes for a complex disease such as AD (Mukherjee *et al.* 2014; Chen *et al.* 2015). However, since it is unknown how many of, and in what association patterns, the multiple traits are associated with a gene (or a set of SNPs), a straightforward use of any multivariate test may lose, not gain, power. Again, the availability of a powerful and adaptive test such as our proposed one will largely facilitate its easy and effective use in practice.

Finally, we summarize the use of our proposed tests and make some recommendations. To assess an overall association between a set of SNPs and a set of traits, we recommend the use of the *P*-value of the aSPUset test. If it is significant, one can check the individual *P*-values of the SPU tests to shed some light on the underlying association pattern. If a larger (or ) leads to a more significant *P*-value of the SPU test, it suggests a more sparse association pattern; that is, perhaps one or a fewer number of the SNPs (or traits) is or are associated. Furthermore, one can examine the *P*-value from the univariate test for each SNP–trait pair to identify which SNP–trait pairs contribute most to the overall association. For choosing candidate values of and based on our limited experience, we suggest using by default, although an optimal choice depends on the situation; using a too large or too small set or will lead to loss of power. A general guidance, taking as an example (and similarly for ), is to use such that the SPU test gives a *P*-value almost equal to that of SPU a larger number of SNPs may require a larger value of In addition, if some large univariate associations between various SNP–trait pairs are likely to be in opposite directions, only even integers are needed in and if it is known *a priori* that large univariate associations are mainly in one direction, then using only odd integers may be most powerful; otherwise, both even and odd integers should be used. Given the relationships among the tests, we recommend the use of our proposed aSPUset and aSPUset-Score tests, although MFLM may also perform well for large genes; further evaluations are needed.

## Acknowledgments

The authors are grateful to the reviewers for constructive comments. This research was supported by National Institutes of Health (NIH) grants R01GM113250, R01HL105397, and R01HL116720 and by the Minnesota Supercomputing Institute. J.K. was supported by a University of Minnesota Informatics Institute MnDRIVE fellowship. Data collection and sharing for this project were funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (NIH grant U01 AG024904) and Department of Defense (DOD) ADNI (DOD award W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, by the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: Alzheimer’s Association; Alzheimers Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen Idec, Inc.; Bristol-Myers Squibb Company; Eisai, Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd. and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development, LLC.; Medpace, Inc.; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer, Inc.; Piramal Imaging; Servier; Synarc, Inc.; and Takeda Pharmaceutical Company. The Canadian Institutes of Rev December 5, 2013 Health Research provides funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

## Appendix

Without loss of generality we center both and to have their sample means and We consider the case without covariates, since several methods are applicable only to the case without covariates.

We rewrite the data format as a design matrix. Denote **Λ** as an matrix, each row of which contains subject *i*’s genotype and **Θ** as an matrix, each row of which consists of multiple traits Multivariate analysis can be derived from partitioning of the total sum of squares and cross-products (SSCP) matrix, the inner product According to the multivariate linear model, where **B** is the matrix of model parameters, **E** is the matrix of errors, the fitted value matrix is defined as and the matrix of residuals is **R** where **H** is a hat matrix.

We define each covariance estimate as follows. is a covariance estimate for genotype scores and is a covariance estimate among *k* multiple traits and are covariance estimates between two sets of variables and

tr(**A**) stands for sum of diagonal elements of a matrix **A**. vec(**A**) represents a linear transformation that converts the matrix (**A**) into a column vector.

### Appendix A: SPUw(2,2) and M-MeanStat; SPUw(∞,1) and M-Max

For each trait *t* and SNP *j*, their pairwise association is quantified by which follows a normal distribution asymptotically with mean zero and variance under the null hypothesis. Guo *et al.* (2013) defined the generalized Kendall’s *τ* statistic, Based on this, Guo *et al.* (2013) proposed M-MeanStat and M-MaxStat:

If a canonical link function and a working independence model are used in GEE, the test statistics of and are defined by(A2)Comparing the two sets of statistics in (A1) and (A2), we see that M-MeanStat and and M-Max and are approximately equivalent, respectively.

### Appendix B: SPU(2,2) and MDMR

Under the working independence model, the test statistic of SPU(2,2) is stated as(B1)MDMR is a nonparametric modification of traditional Fisher’s MANOVA (McArdle and Anderson 2001). Wessel and Schork (2006) and Zapala and Schork (2012) introduced the method to applications in genetics and genomics. For a single trait, it is closely related to kernel methods (Schaid *et al.* 2005; Pan 2011).

Suppose represents the distance between subjects *i* and *j*; let and *G* be its centered version. An *F* statistic can be constructed to test the hypothesis that the *p* regressor variables have no relationship to variation in the distance or dissimilarity of the *n* subjects reflected in the distance/dissimilarity matrix. The pseudo-*F* statistic of MDMR is defined byIf the Euclidean distance (*i.e.*, norm) is used to construct the distance matrix **G** the MDMR test statistic is defined asAs usual, permutations are used to calculate *P*-values. Then is invariant across all permutations and can be ignored (Pan 2011). The test statistic arrives at(B2)If we have a single SNP to be tested, *i.e.*, **Λ** is an matrix, the test statistic (B2) reduces to with Hence, SPU(2,2) and MDMR are equivalent for a single SNP and multiple traits, as established by Zhang *et al.* (2014). However, for multiple SNPs and multiple traits, by comparing (B1) and (B2), we see that in general they are not equivalent.

### Appendix C: SPU(2,2) and KMR

With a working correlation matrix in GEE, the SPU(2,2) test can be rewritten as(C1)Maity *et al.* (2012) introduced multivariate phenotype association analysis by SNP set- or gene-based KMR. The authors assumed that the phenotypes are correlated while the individuals are independent. Suppose is the true correlation matrix for *k* traits with and Define and a kernel matrix The score test under the null for KMR (Maity *et al.* 2012) is defined bywhere each is an kernel matrix for each trait. Applying a linear kernel yields(C2)KMR (Equation C2) has the same test statistic as the GEE-SPU(2) test (Equation C1) if the working correlation is the true correlation structure of [*i.e.*, ].

### Appendix D: GEE-Score test and MANOVA

The GEE-Score test statistic with a working independence model in GEE isIn MANOVA, a measure of the strength of association between **Θ** (multiple traits) and **Λ** (genotype scores) for the multivariate model depends on a partition of matrix of total SSCP; *i.e.*, (Haase 2011). Considering the Pillai–Bartlett (PB) trace, the MANOVA test statistic is stated as tr which can be written in an alternate form Hence, the GEE-Score test and MANOVA using the PB trace are equivalent.

## Footnotes

*Communicating editor: S. Sen*Supplemental material is available online at www.genetics.org/lookup/suppl/doi:10.1534/genetics.115.186502/-/DC1.

↵2 Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

- Received December 22, 2015.
- Accepted April 2, 2016.

- Copyright © 2016 by the Genetics Society of America