## Abstract

A major goal in the study of complex traits is to decipher the causal interrelationships among correlated phenotypes. Current methods mostly yield undirected networks that connect phenotypes without causal orientation. Some of these connections may be spurious due to partial correlation that is not causal. We show how to build causal direction into an undirected network of phenotypes by including causal QTL for each phenotype. We evaluate causal direction for each edge connecting two phenotypes, using a LOD score. This new approach can be applied to many different population structures, including inbred and outbred crosses as well as natural populations, and can accommodate feedback loops. We assess its performance in simulation studies and show that our method recovers network edges and infers causal direction correctly at a high rate. Finally, we illustrate our method with an example involving gene expression and metabolite traits from experimental crosses.

NETWORK models derived from microarray experiments have shed light on the manner and extent of connectedness among expressed genes. However, these networks mostly summarize association, connecting phenotypes without causal direction. Genetical genomic studies, with microarray data in a segregating population, have shown evidence for *cis*-acting and *trans*-acting genomic regions (Brem *et al.* 2002; Schadt *et al.* 2003). These studies yield expression quantitative trait loci (eQTL) (Kendziorski and Wang 2006) that provide a causal experimental system where genotype drives phenotype. Directed graphs inferred from such a system (Jansen and Nap 2001; Zhu *et al.* 2004; Bing and Hoeschele 2005; Kulp and Jagalur 2006; Keurentjes *et al.* 2007) can yield causal gene networks that predict biochemical pathways, generating new hypotheses about functional relationships among expressed genes. These methods need a network node that consists of a gene physically located within the QTL support interval of one or more transcripts. This gene thus becomes a candidate regulator for the other transcripts. Unfortunately, phenotypes such as metabolites and physiological traits cannot serve as causal nodes in these networks. Schadt *et al.* (2005) allowed for causal clinical traits by using pairs of phenotypes that share multiple QTL. However, their method cannot infer causal direction between phenotypes without a common QTL.

In this article, we introduce a novel method for building causal networks among phenotypes that may not have common QTL, allowing the use of phenotypes other than gene expression. We first build an undirected graph that infers associations among phenotypes, using either an undirected dependency graph (UDG) (Shipley 2002; de la Fuente *et al.* 2004) or a skeleton derived from the PC (Peter-Clark) algorithm of Spirtes *et al*. (2000). We next use a LOD score to determine causal direction for every edge that connects a pair of phenotypes, conditional on connected QTL. Our method differs from the PC algorithm, which first infers a graph skeleton (with the “PC-skeleton algorithm”) and then uses partial correlations among phenotypes to infer the direction of some, but not all, edges of the inferred skeleton. Our QTL-directed dependency graph (QDG) can include feedback loops, which are common in biology. This integrated QDG approach overcomes serious limitations of both the genetical genomics network methods and the PC algorithm.

Our goal, in short, is to causally orient every edge connecting a pair of phenotypes in an undirected network. That is, does phenotype 1 drive phenotype 2, or vice versa?:Model selection procedures cannot distinguish between *M*_{1} and *M*_{2} because they are distribution or likelihood equivalent. That is, variation in the level of phenotype 1, *y*_{1}, causing a variation in the level of phenotype 2, *y*_{2}, yields the same joint density as the reverse situation,However, if in addition to the two phenotypes, we have measured genotypes, **q**_{1} = {*q*_{11}, …, *q*_{1k}} on *k* QTL affecting *y*_{1} and **q**_{2} = {*q*_{21}, …, *q*_{2l}} on *l* QTL affecting *y*_{2}, we can now resolve direction. The new modelsare *not* likelihood equivalent because the predictive densities disagree,(1)Therefore, we distinguish between models and by inferring direction of causation between phenotypes using a LOD score that conditions on genotypes at multiple QTL. We assume distinct QTL for different phenotypes and consider allegedly pleiotropic QTL in the discussion.

The methods of this article are widely applicable to human studies and outbred populations. Here, we confine attention to experimental crosses, such as a backcross or intercross derived from two disparate inbred lines. We focus on a mouse intercross with a set of already identified phenotypes of interest, ideally involved in or related to a common pathway. Further, we assume that multiple QTL associated with these traits had previously been determined.

We present a series of simulation studies where we evaluate the recovery rate of undirected edges and (1) compare the performance of the QDG and the PC algorithm for causal orientation in a network containing 100 phenotypes; (2) assess the accuracy of the QDG method for the inference of small cyclic graphs; and (3) study a causal model describing the relationship of 20 QTL, five gene expression traits, and one metabolite, constructed with the QDG method and presented in Ferrara *et al.* (2008).

## METHODS

#### Undirected graph:

We build an association network using UDGs (Shipley 2002) and PC skeletons (Spirtes *et al.* 2000). These methods are better suited to causal inference than other association networks such as Gaussian graphical models (GGMs) (Schafer and Strimmer 2005a,b), which include edges between any pair of nodes that has significant partial correlation. However, partial correlation can exist even when two nodes are uncorrelated, leading to spurious edges (Figure 1). In directed acyclic graphs, UDGs and PC skeletons avoid this problem by first removing edges where there is no significant correlation. In directed cyclic graphs, it may not be possible to remove all spurious edges. The PC-skeleton algorithm performs fewer computations than the UDG algorithm for sparse graphs, but it is less accurate in small sample sizes (Shipley 2002). An efficient implementation of the PC algorithm for sparse high-dimensional directed graphs (Kalisch and Buhlmann 2007) is available in the R package pcalg (R Development Core Team 2006).

#### Distinguishing QTL with direct and indirect effects:

The edge orientation procedure requires the inclusion of QTL that directly affect the phenotypes. Since it is possible that a strong QTL affecting an upstream trait may also be incorrectly detected as a QTL for a downstream phenotype, we first employ the following two-step preprocessing procedure to distinguish QTL with direct and indirect effects: (1) identify all pairs of connected phenotypes that share common QTL and (2) apply a generalization of the method proposed by Schadt *et al.* (2005) to each of these pairs to determine if the common QTL directly affects both traits or if it has an indirect effect in one of the phenotypes. If the pair of phenotypes is affected by a single common QTL (in addition to the QTL particular to each trait) we score the three graphical models presented in Figure 2, using the Bayesian information criterion (BIC) or Akaike's information criterion (AIC), and select the model with the smallest value. Figure 2a has both phenotypes directly affected by the common QTL, while in Figure 2b *q* has an indirect effect on phenotype *y*_{2} and no direct effect on *y*_{2}. Figure 2c represents the reverse situation. Supplemental Figure S1 illustrates the case where a pair of connected phenotypes map to two common QTL.

We point out that the original goal of Schadt *et al.* (2005) was to determine the causal direction between a pair of phenotypes. Here, we test the direct *vs.* the indirect effect of a common QTL on two phenotypes. We note in the discussion how causal direction inference for our method can be extended to orient edges that have only common QTL.

#### Preprocessing the undirected graph:

Whenever a pair of connected phenotypes have a common QTL, as in the situation described in Figure 2a, we further test the null hypothesis of no partial correlation between the phenotypes conditional on the common QTL. If we accept the null, we drop the edge between the phenotypes since their correlation is caused by the pleiotropic effect of the common QTL.

#### Edge orientation:

We orient causal edges between all pairs of connected phenotypes in an undirected network using associated multiple QTL to break likelihood equivalence. The QTL are assumed to come from earlier gene mapping of phenotypes. Edge orientation among pairs of phenotypes is based on a linear regression model with phenotypes regressed on QTL genotypes and on additive or interacting covariates such as sex, age, and other phenotypes. We orient each edge, conditioning on all other nodes (phenotypic, genotypic, or covariate) that are connected to that edge. For each edge, we evaluate a LOD score comparing the two possible orientations. We orient the edge in favor of the direction with the higher likelihood in the ratio. For the toy network presented in the Introducton this ratio iswhere *f*() represents the predictive density, that is, the sampling model with parameters replaced by the corresponding maximum-likelihood estimates, andrepresent standard LOD scores comparing the “full” model (a multiple-QTL model possibly containing covariates and interactions) with the “reduced” model (no QTL or covariates).

In more complex networks we orient edges in two steps: (1) build an initial directed network orienting each edge as above and (2) recompute the LOD score for each edge, connecting a pair of phenotype nodes by conditioning on all other phenotypes causative to either or both nodes. We repeat the second step for all edges until no edge switches direction and this involves some iteration to find the best orientation across the entire graph. With four phenotypes, an initial directed network might look like . The second step recomputes the LOD score for the edge between nodes 2 and 3, conditioning, respectively, on nodes 1 and 4. Similarly, we recompute the LOD score for the 3–4 edge, conditioning on node 2 for phenotype 3. If the direction of the 2–3 edge switches, the graph becomes . We then recompute the LOD scores for the 1–2 edge, using node 3 as a covariate for 2, and for the 2–3 edge conditioning, respectively, on 1 and 4.

In complex networks this algorithm may find more than one solution. That is, starting the algorithm from a different edge ordering may yield a different graph. To get the best solution we (1) rerun this algorithm using different initial edges to get all possible solutions, (2) score each solution using a likelihood-based overall measure of fit, and (3) select the graph with the highest score. This algorithm may oscillate between two graphs changing directions on some edges. In this case we choose the graph with the highest measure of fit.

#### QDG algorithm:

Construct an UDG (PC skeleton or an association graph).

Use QTL genotypes to direct each edge in the UDG. Call it DG.

Randomly choose an ordering of all edges in the DG. Call this list ODG.

For an edge in the ODG recompute the direction LOD score using all other causative phenotypes to either or both nodes, according to the DG, in addition to the QTL genotypes. If the direction changes, update the DG and move to the next edge in the ODG.

Repeat step 4 until no more edges change direction. The corresponding directed graph is one solution. If step 4 keeps oscillating between different graphs without converging to a solution in 30 steps, we include all distinct graphs as solutions.

Repeat steps 3, 4, and 5 1000 times and store all different solutions.

Score all solutions in step 6, using a likelihood-based measure of fit of the whole graph, and choose the graph with the highest score.

We point out that for each solution of this algorithm, the direction of each edge connecting two nodes is computed conditional on the parents of the nodes. While we start the QDG as a pairwise algorithm (step 2), the last steps actually represent a multivariate algorithm, where we perform local computations of direction on subsets of the graph. An R package (R Development Core Team 2006) implementing our procedure is under development and we intend to make it available at Bioconductor.

#### Simulations:

##### One hundred phenotypes network:

In each simulation experiment we generated synthetic data according to the network in Figure 3, using linear regression models. Each phenotype node was affected by two or three QTL (not shown in Figure 3), and we allowed only additive genetic effects. The QTL for each phenotype were randomly selected among 200 markers, with 10 markers unevenly distributed on each of 20 autosomes. We allowed different phenotypes to potentially share common QTL. For each phenotype, the regression coefficients with other phenotypes were chosen uniformly between 0.5 and 1; residual standard deviations were chosen between 0.1 and 0.5. The regression coefficient of the phenotype on the QTL ranged from 0.2 to 0.6. More specifically, we assumed intralocus additivity (no dominance) and an increment of 0.1 per allele. The coefficients for genotype *aa* were drawn uniformly between 0.2 and 0.4. The coefficients for genotypes *Aa* and *AA* were calculated by adding 0.1 and 0.2, respectively. For each simulation experiment, we generated 100 realizations and applied the QDG and PC algorithms to infer causal directions for the edges of the skeleton obtained by the PC-skeleton algorithm.

##### Cyclic networks:

For Figure 4, a and b, we generated data sets with 100 and 200 individuals, fixing regression coefficients at 0.5, error variances at 0.025, and the QTL effects at 0.2, 0.3, and 0.4 for the three F_{2} genotypes. We used a Gibbs sampling scheme to generate 100 data sets in each simulation study. For Figure 4c, we adopted a sample size of 100 and parameter values as above with the exception of the reciprocal parameters β_{25} and β_{52}. We used a burn-in of 2000 for the Gibbs sampler and inspection of Markov chains showed good mixing.

#### Permutation tests for directions:

*P*-values for the LOD scores are computed using 10,000 permutations. To get the null distribution for the direction we permute blocks of data in such a fashion that we break all the connections that affect the direction, keeping the others intact. As an example, consider Figure 5 and suppose we are interested in computing a permutation *P*-value for the arrow pointing from phenotype *y*_{1} to phenotype *y*_{2} conditional on covariates *x* and *z*. Here, *x* represents a covariate of both phenotypes (could be another phenotype, age, sex, etc.). *z* is a covariate only of *y*_{1}. **q**_{1} and **q**_{2} correspond to sets of QTL affecting *y*_{1} and *y*_{2}, respectively. To break the connections (brk) that affect the direction of an edge, we permute the corresponding pair of nodes (and their common covariates) as a block. Thus in Figure 5a we permute (*y*_{1}, *y*_{2}, *x*) as a block breaking the connections with *z*, **q**_{1}, and **q**_{2}. If instead we keep the connection between *z* and *y*_{1} by permuting only **q**_{1} and **q**_{2} (Figure 5b), we would not break all the connections affecting the directionality [resulting models *f*(*y*_{1} | *z*, *x*)*f*(*y*_{2} | *y*_{1}, *x*) and *f*(*y*_{1} | *y*_{2}, *z*, *x*)*f*(*y*_{2} | *x*) are not likelihood equivalent and can be used to infer the direction between *y*_{1} and *y*_{2}]. Finally, the common covariate *x* to *y*_{1} and *y*_{2} improves the fit of the linear model, even though it does not directly affect the direction between *y*_{1} and *y*_{2} since *f*(*y*_{1} | *x*)*f*(*y*_{2} | *y*_{1}, *x*) = *f*(*y*_{1}, *y*_{2} | *x*) = *f*(*y*_{1} | *y*_{2}, *x*)*f*(*y*_{2} | *x*).

#### Simulating data from cyclic networks:

Consider the set of structural equations(2)where *pa*(*y _{k}*) represents the set of parental nodes of

*y*, that is, the set of phenotypes that directly affect

_{k}*y*; ε

_{k}_{k}∼

*N*(0, ) are independent error terms; and

*q*= 1, 2, 3, where

_{u}*u*indexes all QTL associated with phenotype

*y*.

_{k}Let represent the vector of error terms. Using the Jacobian transformation from , we get the joint density of the phenotypes conditional on the QTL genotypes.

We simulate a data set of *n* individuals from a cyclic network in two steps. First, we simulate the QTL genotype data. Next, we generate phenotype data, conditional on the QTL genotype data, using a Gibbs sampling scheme. The distribution of each phenotype conditional on the remaining phenotypes, QTL genotypes, and parameters is univariate normal. Because each individual may have a unique multilocus genotype and the phenotype distributions are conditional on the QTL genotypes, we run separate Markov chains to generate each of the *n* data points.

#### Distinguishing graphs inside an equivalence class:

In the Introduction we demonstrated how the incorporation of driving QTL allows for causal direction inference in a network composed by two phenotypes linked by an edge. This result readily extends to more complex networks belonging to an equivalent class. Suppose, for example, that two *N*-node directed acyclic graphs belong to the same equivalence class. Thenwhere the indexes *k* and *v* represent different factorizations of the same joint predictive density *f*(*y*_{1}, …, *y _{N}*). By incorporating the driving QTL we are able to break the likelihood equivalence between these two phenotype networks since

This result is also true for directed cyclic graphs. Consider, for example, two phenotype networks composed by a cycle with three nodes, and . The predictive densities of the phenotypes agree by application of Bayes' formula:However, when we include the respective driving QTL,and the graphs are no longer likelihood equivalent.

Thus incorporating driving QTL in the analysis allows us to distinguish between phenotype networks that would otherwise be undistinguishable. Therefore, if the QDG algorithm converges to solutions belonging to the same equivalence class (considering the phenotypes only), it is still able to distinguish between then and reconstructs a single network.

#### Likelihood-based overall measure of fit:

For directed acyclic graphs the joint likelihood factors nicely and maximum-likelihood analysis is straightforward. Therefore, we adopted the maximized log likelihood of the whole graph as an overall measure of fit in this case.

For directed cyclic graphs, the likelihood cannot be completely factored (because of the normalization constant) and maximum-likelihood estimators do not have closed analytical forms, requiring numerical techniques for their evaluation. When comparing cyclic networks with reversed directions, we can avoid these computations because the normalization constants agree and we can simply compare the unnormalized likelihoods. To see why that is the case, recall that the QTL enter as a location parameter () in the structural Equations 2 defining the cyclic graphs, so that the hypervolume under the *N*-dimensional integral (the inverse of the normalization constant) is not affected by a shifting of the *N*-dimensional curve.

If we need to compare acyclic and cyclic graphs, the normalization constant of the cyclic graph cannot be ignored. But since an acyclic graph cannot be likelihood equivalent to a cyclic one, we can simply fit a path model to the respective phenotype networks (using any structural equation models package) and use the respective maximized log likelihood as an overall measure of fit.

## RESULTS

In simulations, we contrast our QDG method with the PC algorithm (Spirtes *et al.* 2000), which can orient some but not all edges in an undirected network using only partial correlations.

#### Network topology:

In all simulation studies, we assessed the performance of the PC skeleton or the UDG algorithm in terms of the average (across all edges in the respective graphs) of the percentage of times the algorithm recovered each of the true edges. Edge recovery rates were strong, increasing with sample size.

##### One hundred phenotypes network:

A complicated network with 100 phenotypes and 107 edges (Figure 3) illustrates how the percentage of edges recovered using the PC skeleton increases with sample size (Figure 6; supplemental Figures S2 and S3). Similar improvements were found for the true positive rate, while false positive rates were uniformly low and true discovery rates were uniformly high (Table 1).

##### Cyclic networks:

We investigated the properties of three cyclic graphs (Figure 4). As sample size increased from 100 to 200, the average percentage of true recovered edges using the PC skeleton improved from 83 to 99% for Figure 4a, and from 78 to 99% for Figure 4b (Tables 2 and 3). Figure 4c had two pairs of nodes, (1, 5) and (2, 4), that had no direct connection but were not d-separated (Pearl 1988). We expected the PC skeleton to detect spurious edges between them, but we found only 2 realizations in 100 with an edge between nodes 1 and 5 and none between 2 and 4 (Tables 4 and 5). Therefore, the PC skeleton successfully recovered important features of cyclic graph topology.

#### Causal direction inference:

##### One hundred phenotypes network:

Both QDG and PC algorithms improved in terms of correct causal direction as sample size increased. In all cases, the QDG method got more correct directions (Figure 7a) than the PC algorithm and, except for a sample size of 60, fewer incorrect directions (Figure 7b). Supplemental Figures S2 and S3 show the results per edge.

The network in Figure 3 demonstrates many features that can be inferred with varying degrees of difficulty by the PC algorithm. For instance, nodes organized in an unshielded collider pattern (Shipley 2002) such as are easier to direct than nodes organized in a bifurcation or line pattern such as or . In Figure 8, we present the average performances of the QDG and PC algorithms for nodes involved in unshielded collider structures separately from the performances of all other remaining patterns (that we pooled together). As expected, the PC algorithm performed much better in the unshielded collider pattern, relative to all other patterns, in terms of correct causal direction inference and of proportion of undirected edges. The QDG algorithm performed equally well for all patterns and was better than the PC. All nodes involved in unshielded collider patterns were highlighted in supplemental Figures S2 and S3.

The QDG method is robust with respect to the number of QTL per phenotype. We randomly selected only one QTL per phenotype (from the two or three QTL used to generate the data) for causal orientation. We found a relatively small decrease in performance when only one QTL was used (Figure 9). While the QDG algorithm required more iterations to converge on a solution when using only one driving QTL per phenotype, there was little loss in percentage of correct directions.

##### Cyclic networks:

For the three small cyclic graphs shown in Figure 4, the QDG inferred the correct direction most of the time. As sample size went from 100 to 200, the average percentage of correct direction relative to the proportion of recovered edges increased from 96 to 99% for Figure 4a (Table 2) and from 97 to 100% for Figure 4b (Table 3).

The simulations for Figure 4c illustrate that even though our method cannot detect reciprocal interactions (*e.g.*, between phenotypes 2 and 5), it can still infer the stronger direction, that is, the direction corresponding to the higher regression coefficient. Tables 4 and 5 present the results of three simulation studies where we adopted the following regression coefficients: (1) β_{25} = β_{52} = 0.5; (2) β_{25} = 0.5, β_{52} = 0.8; and (3) β_{25} = 0.8, β_{52} = 0.5.

## METABOLITE AND GENE EXPRESSION NETWORK

We inferred a causal network among expression and metabolite traits and validated the network with *in vitro* experiments (Ferrara *et al.* 2008). Here we simulate from a connected subset of this inferred network to study its statistical properties.

The causal network was constructed using *in vivo* gene expression and metabolite data from a sample of 58 mice from an F_{2} − *ob*/*ob* population generated from the C57BL/6J (B6) and BTBR founder strains (Lan *et al.* 2006). The B6 and BTBR strains, when made obese, differ in diabetes susceptibility (Stoehr *et al.* 2000); B6-*ob*/*ob* mice are diabetes resistant, whereas BTBR-*ob*/*ob* mice develop severe diabetes. The included transcripts had a high positive correlation with glutamine plus glutamate (Glx) and/or comapped eQTL and mQTL (Ferrara *et al.* 2008). Our QDG algorithm with random-edge orderings converged to two solutions. Figure 10 shows the major connected subset of this network.

We simulated 1000 realizations from the network depicted in Figure 10, inferring the correct edge directions for recovered edges on average 83% of the time. The average percentage of true recovered edges was 75%. All edges except were recovered at least 70% of the time, with correct direction at least 72% of the time (Figure 10). False edges were detected at rates <2% (Table 6). In the construction and simulations, age was used as an additive covariate and we allowed for sex-by-genotype interactions. Estimates from the real data were used as the parameter values in the simulations and we adopted α = 0.05 for the UDG algorithm.

Our network predicted six edge directions (Figure 10), which we tested by isolating hepatocytes from the B6 and BTBR parental strains. We used quantitative real-time PCR to measure gene expression changes in response to the 10-mm glutamine supplement *in vitro* (normalization to *Actb*). Our results indicated that glutamine does significantly change expression of *Agxt*, *Arg1*, and *Pck1* in both strains, as predicted by the network. Glutamine increased expression of *Ass1*, which is consistent with previous studies showing that glutamine increases urea cycle enzyme expression. Glutamine decreased expression of *Slc*1*a*2 in both strains and decreased *Ivd* in the B6 strain, contradicting our best network, but our confidence in these directions was marginal (Ferrara *et al.* 2008). Thus we used our inferred network for hypothesis generation, and we confirmed four of six predictions with subsequent laboratory work.

## DISCUSSION

In genetical genomics or eQTL studies, we are increasingly interested in inferring causal networks for sets of phenotypes that map to coincident genomic regions. We typically believe that there is a “master regulator” and that most comapping is due to indirect effects. Thus our objective is to untangle the direct and indirect effects of QTL and phenotypes. We provide a novel method that can first infer what phenotypes have proximal relationships (undirected graphs) and then orient those edges to infer causal relationships, leading to a unified causal network.

In recent years, there has been a large effort to model phenotypic networks. Several approaches have been employed to construct association networks (Schadt and Lum 2006). H. Li *et al*. (2006) based association networks on partial correlations among phenotypes with significant eQTL; Zhang and Horvath (2005) clustered phenotypes into modules on the basis of a power function of pairwise correlation; Lan *et al*. (2006) used phenotype correlation with strong eQTL to identify sets of functionally related expression traits.

Much work has been done to infer causal networks. R. Li *et al.* (2006) built causal networks among phenotypes and genotypes, using structural equation modeling involving a hierarchy of interconnected linear models. Their model selection procedure was computationally intensive, searching over a wide class of possible models. In the genetical genomics setting, Zhu *et al.* (2004) extended standard Bayesian network reconstruction methods to include detected major QTL driving gene transcript abundance. Due to computational restrictions, only a few QTL were included for each expression trait. Further, no feedback cycles were allowed in the model. Schadt *et al.* (2005) related expression to clinical traits, providing a conditional probability-based approach to distinguish whether expression changes associated with a QTL were causal, reactive, or independent from clinical changes. Bing and Hoeschele (2005) proposed a strategy to build a network extending from candidate genes in eQTL genomic intervals to *cis*- and *trans*-regulated mRNAs. However, their approach required identifying all candidate genes in genomic intervals and did not infer direction between phenotypes (expressed genes) except from a *cis*-acting gene affecting the expression of another gene in *trans*. Kulp and Jagalur (2006) built partially directed networks of expression phenotypes using a gene of interest as a seed and including *trans*-acting regulator–target pairs identified by gene mapping. Keurentjes *et al.* (2007) used the Bing and Hoeschele (2005) approach in Arabidopsis to rediscover a well-studied regulatory network associated with flowering time. However, none of these genetical genomics methods can infer the causal direction between two phenotypes that do not share a common QTL or when one expressed gene is not physically located within the QTL support interval of other transcripts.

We have proposed a simple method for causal network construction that extends existing causal network methods by accommodating phenotypes that have no common QTL. Our QDG method does not require that any phenotype correspond to a gene in a QTL support interval for another phenotype. We have shown how to infer causal direction in an association network of phenotypes by conditioning on causal QTL, even when graphs include cycles. This method is part of a larger strategy to reveal phenotypic networks. We begin with a feature selection to reduce from thousands of phenotypes to a manageable number (*cf*. Zhang and Horvath 2005; Lan *et al.* 2006) followed by a focus on a particular functional group. We then use QTL mapping methods to identify major QTL of phenotypes in this functional group (Yi *et al.* 2005, 2007). Finally, we use the approach described in this article to infer a phenotypic network for this functional group.

This proposed method can handle multiple QTL and covariates (additive or interactive) and has computing time proportional to the number of edges. It can be readily extended to include discrete driving factors other than genotypes, such as sex or treatment group. Furthermore, this method can be applied to a wide variety of phenotypes, and here we demonstrate its application to gene expression and metabolite data. The incorporation of genotypes (or discrete driving factors) allows the reconstruction of a single directed network of phenotypes (see methods). Previous work in gene expression network reconstruction using Bayesian networks (Friedman *et al.* 2000) was limited to the inference of equivalence classes, that is, a set of networks that have the same likelihood. Another noteworthy advantage is that the QDG does not require sequence information and can therefore be useful in the discovery process for a broader range of genetical genomics experiments. However, we can use sequence data to improve models in a similar fashion to Keurentjes *et al.* (2007).

Our QDG method shows superior performance over the PC algorithm (Spirtes *et al.* 2000) to infer correct direction of edges that bifurcate from a node. That is, the pleiotropic effect of one gene expression phenotype on two other phenotypes is readily detected by our QDG method, but missed by the PC algorithm. Both methods do well in detecting unshielded collider patterns, which correspond to the interactive effect of two gene expression phenotypes on a third phenotype. The enhanced ability of the QDG algorithm to infer both pleiotropy and epistasis will be valuable as eQTL and mQTL studies become more prevalent in experimental crosses.

In preliminary simulation studies (using data generated from Figure 3) we also investigated the performance of the method proposed by Opgen-Rhein and Strimmer (2007) to partially direct graph skeletons. Because their method was unable to recover any of the directions in our preliminary studies, we did not pursue further comparisons.

In this article we assume that a set of multiple QTL affecting each of the phenotypes has been previously determined. With sample sizes in the order of 300–500 we expect to have enough power to detect QTL for a set of functionally related phenotypes appropriate for our causal network methods. Furthermore, phenotypes with no detectable QTL can still be included in the analysis. Edge direction involving these nodes is largely determined by neighboring phenotypes and their QTL. For instance, if we did not detect any QTL for phenotype *y*_{1} we still can orient the edge since the causal models are not likelihood equivalent, *f*(*y*_{2} | *y*_{1}, *q*_{2})*f*(*y*_{1}) ≠ *f*(*y*_{1} | *y*_{2})*f*(*y*_{2} | *q*_{2}). Finally, we point out that previous studies have considered applying relaxed LOD score thresholds to detect minor QTL that would have otherwise been missed and have successfully uncovered interesting pathways (Chesler *et al.* 2005; Lan *et al.* 2006). We anticipate that such a strategy would work effectively with our proposed method.

Distinct QTL are necessary to infer causal direction in a phenotype network. We can orient an edge between a pair of phenotypes with common QTL if there is at least one additional QTL or causal phenotype that affects one of the pair but not the other. However, two phenotypes with only one common QTL have causal models that are likelihood equivalent, *f*(*q*)*f*(*y*_{1} | *q*)*f*(*y*_{2} | *y*_{1}, *q*) = *f*(**q**)*f*(*y*_{2} | *q*)*f*(*y*_{1} | *y*_{2}, *q*). In these cases, we can modify the network to determine directionality by breaking genotype-phenotype edges (Schadt *et al.* 2005), that is, in addition to the pleiotropic causal model, we also consider models and . Note, however, that if the data best support the pleiotropic model, we cannot infer a direction. Further, our QDG method determines a *P*-value to orient each edge. If that *P*-value is large, then we have little confidence in the direction. Such edges may be interpreted as undirected.

Our methods section describes an application of the Schadt *et al.* (2005) approach to test whether a QTL has a direct or an indirect effect on a phenotype. Furthermore, we are currently investigating a strategy to integrate QTL as nodes in the undirected dependency graph, allowing the algorithm to automatically test for direct and indirect QTL effects. We feel, however, that these further investigations are out of the scope of this article.

As is the case with any other network inference method, UDGs (and PC skeletons) can show false edges due to latent variables, including other phenotypes and genotypes. The absence of hidden variables is a common assumption in correlation-based network inference methods (*e.g*., de la Fuente *et al.* 2004; Schafer and Strimmer 2005a,b). Potential problems caused by hidden phenotypes include erroneous inference of direct causation between phenotypes where there is either (a) indirect causation through an intermediate missing phenotype or (b) independence of the phenotypes examined given a hidden variable (de la Fuente *et al.* 2004).

While our approach can detect circular paths involving three or more phenotypes, it merges reciprocal interactions between two phenotypes, . In these cases, a single edge is oriented in the direction of higher strength. Currently we are investigating a Bayesian approach to formally incorporate prior information about network topology and edge orientation.

In conclusion, we have presented a novel, efficient causal network construction method, QDG, with four main advantages: (1) we reduce computation time by avoiding a search over the entire graph space; (2) we screen out edges associated with partial correlation that are not causative in directed acyclic graphs; (3) we can infer feedback loops, which are common in genetic, metabolic, and biochemical networks; and (4) we can infer a single network, instead of an equivalence class of networks. The causal phenotype network method developed here may have broad applicability across a range of population structures. This can include, but is not limited to, inbred crosses, outbred crosses, and natural populations. The causal networks that emerge can form the basis for hypotheses about critical pathways that can be individually tested using knockouts or overexpression of genes.

## Acknowledgments

We acknowledge Angie Tebon Oler for her very generous technical help with hepatocyte experiments. We also thank Katherine Scheuler, Donald Stapleton, and the animal care facility at the University of Wisconsin for their services; Daniel Blasiole for his technical help with the hepatocyte protocol; and Christopher Newgard and the Sarah Stedman Nutrition and Metabolism Center for their generous help with processing of liver metabolites. Funding was supported by CNPq Brazil (E.C.); by National Institutes of Health grants DK66369, DK58037, and DK06639 (A.D.A., B.S.Y., E.C.); and by National Institute of Diabetes and Digestive and Kidney Diseases grant PO1 DK58398 (C.F.). Funding was also provided from the National Institute of General Medical Sciences (NIGMS) through the Duke University Medical Scientist Training Program grant 2T32GM007171 (C.F.) and NIGMS grants PA02110 and GM069430-01A2 (B.Y.).

## Footnotes

Communicating editor: R. W. Doerge

- Received November 28, 2007.
- Accepted April 6, 2008.

- Copyright © 2008 by the Genetics Society of America