Modeling Causality for Pairs of Phenotypes in System Genetics

Neto, Elias Chaibub; Broman, Aimee T; Keller, Mark P; Attie, Alan D; Zhang, Bin; Zhu, Jun; Yandell, Brian S

doi:10.1534/genetics.112.147124

Abstract

Current efforts in systems genetics have focused on the development of statistical approaches that aim to disentangle causal relationships among molecular phenotypes in segregating populations. Reverse engineering of transcriptional networks plays a key role in the understanding of gene regulation. However, transcriptional regulation is only one possible mechanism, as methylation, phosphorylation, direct protein–protein interaction, transcription factor binding, etc., can also contribute to gene regulation. These additional modes of regulation can be interpreted as unobserved variables in the transcriptional gene network and can potentially affect its reconstruction accuracy. We develop tests of causal direction for a pair of phenotypes that may be embedded in a more complicated but unobserved network by extending Vuong’s selection tests for misspecified models. Our tests provide a significance level, which is unavailable for the widely used AIC and BIC criteria. We evaluate the performance of our tests against the AIC, BIC, and a recently published causality inference test in simulation studies. We compare the precision of causal calls using biologically validated causal relationships extracted from a database of 247 knockout experiments in yeast. Our model selection tests are more precise, showing greatly reduced false-positive rates compared to the alternative approaches. In practice, this is a useful feature since follow-up studies tend to be time consuming and expensive and, hence, it is important for the experimentalist to have causal predictions with low false-positive rates.

causality, model selection, hypothesis tests, systems genetics, quantitative trait loci

A key objective of biomedical research is to unravel the biochemical mechanisms underlying complex disease traits. Integration of genetic information with genomic, proteomic, and metabolomic data has been used to infer causal relationships among phenotypes (Schadt et al. 2005; Li et al. 2006; Kulp and Jagalur 2006; Chen et al. 2007; Zhu et al. 2004, 2007, 2008; Aten et al. 2008; Liu et al. 2008; Chaibub Neto et al. 2008, 2009; Winrow et al. 2009; Millstein et al. 2009). Current approaches for causal inference in systems genetics can be classified into whole network scoring methods (Zhu et al. 2004, 2007, 2008; Li et al. 2006; Liu et al. 2008; Chaibub Neto et al. 2008, 2010; Winrow et al. 2009; Hageman et al. 2011) or pairwise methods, which focus on the inference of causal relationships among pairs of phenotypes (Schadt et al. 2005; Li et al. 2006; Kulp and Jagalur 2006; Chen et al. 2007; Aten et al. 2008; Millstein et al. 2009; Li et al. 2010; Duarte and Zeng 2011). In this article we develop a pairwise approach for causal inference among pairs of phenotypes.

Two key assumptions for causal inference in systems genetics are genetic variation preceding phenotypic variation and Mendelian randomization of alleles in unlinked loci. These conditions together, which provide temporal order and eliminate confounding of other factors, justify causal claims between QTL and phenotypes. Causal inference among phenotypes is justified by conditional independence relations under Markov properties (Li et al. 2006; Chaibub Neto et al. 2010).

Given a pair of phenotypes, Y₁ and Y₂, that co-map to the same quantitative trait locus, Q, our objective is to learn which of the four distinct models, M₁, M₂, M₃, and M₄, depicted in Figure 1, is the best representation for the true relation between Y₁ and Y₂. Models M₁, M₂, M₃, and M₄ represent, respectively, the causal, reactive, independence, and full models as collapsed versions of more complex regulatory networks. For instance, when the data are transcriptional and one gene is upstream of other genes, the regulation of the upstream gene may affect those downstream, even when the regulation takes place via post-transcriptional mechanisms and, hence, is mediated by unobserved variables. Transcriptional networks should be interpreted as collapsed versions of more complicated networks, where the presence of an arrow from a QTL to a phenotype or from one phenotype to another simply means that there is a directional influence of one node on another (that is, there is at least one path in the network where the node in the tail of the arrow is upstream of the node in the head). Supporting Information, Figure S1 shows a few examples of networks and their collapsed versions. Our goal in this article is to infer the causal direction between two nodes, and the term “causal” should be interpreted as causal direction, meaning either direct or indirect causal relations.

In this article, we propose novel causal model selection hypothesis tests and compare their performance to the AIC and BIC model selection criteria and to a causality inference test (CIT) proposed by Millstein et al. (2009). AIC (Akaike 1974) and BIC (Schwarz 1978) are widely used penalized likelihood criteria performing model selection among models of different sizes. Overparameterized models tend to overfit the data and, when comparing models with different dimension, it is necessary to counterbalance model fit and model parsimony by adding a penalty term that depends on the number of parameters. CIT is an intersection-union test, in which a number of equivalence and conditional F tests are conservatively combined in a single test. P-values are computed for models M₁ and M₂ in Figure 1, but not for the M₃ or M₄ models, and the decision rule for model calling goes as follows: (1) call M₁ if the M₁ P-value is less than a significance threshold α and the M₂ P-value is greater than α; (2) call M₂ if it is the other way around; (3) call M_i if both P-values are greater than α; and (4) make a “no call” if both P-values are less than α. The M_i call actually means that the model is not M₁ or M₂ and could correspond to an M₃ or M₄ model. Note that the CIT makes a no call when both M₁ and M₂ models are simultaneously significant.
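The CIT decision rule described above can be written as a small function. This is a minimal sketch, not the CIT authors' implementation: the function name `cit_call` and the default `alpha` are hypothetical, and treating a P-value exactly equal to α as nonsignificant is an assumption, since the text does not say how ties are handled.

```python
def cit_call(p_m1: float, p_m2: float, alpha: float = 0.05) -> str:
    """Map the two CIT P-values to a call, following rules (1)-(4) in the text."""
    sig1 = p_m1 < alpha  # M1 P-value below the threshold
    sig2 = p_m2 < alpha  # M2 P-value below the threshold
    if sig1 and not sig2:
        return "M1"       # rule (1)
    if sig2 and not sig1:
        return "M2"       # rule (2)
    if not sig1 and not sig2:
        return "Mi"       # rule (3): neither M1 nor M2, i.e. M3 or M4
    return "no call"      # rule (4): both P-values significant


print(cit_call(0.01, 0.40))  # M1
print(cit_call(0.01, 0.02))  # no call
```

The last branch is the case that cannot occur for the CMST tests, where at most one model P-value can be significant at a time.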

Our causal model selection tests (CMSTs) adapt and extend Vuong’s (1989) and Clarke’s (2007) tests to the comparison of four models. Vuong’s model selection test is a formal parametric hypothesis test devised to quantify the uncertainty associated with a model selection criterion, comparing two models based on their (penalized) likelihood scores. It uses the (penalized) log-likelihood ratio scaled by its standard error as a test statistic and tests the null hypothesis that both models are equally close to the true data generating process. While the (penalized) log-likelihood scores can determine only whether, for example, model A fits the data better than model B, Vuong’s test goes one step further and attaches a P-value to the scaled contrast of (penalized) log-likelihood scores. In this way it can interrogate whether the better fit of model A compared to model B is statistically significant. Vuong’s test tends to be conservative and low powered. Clarke (2007) proposed a nonparametric version that achieves an increase in power at the expense of higher miss-calling error rates by using the median rather than the mean of (penalized) log-likelihood ratio.

We propose three distinct versions of causal model selection tests: (1) the parametric CMST test, which corresponds to an intersection-union test of six separate Vuong’s tests; (2) the nonparametric CMST test, constructed as an intersection-union test of six of Clarke’s tests; and (3) the joint-parametric CMST test, which mimics an intersection-union test and is derived from the joint distribution of Vuong’s test statistics. These CMST tests inherit from Vuong’s test the property that none of the models being compared need be correct. That is, the true model may describe a more complicated network, including unobserved factors. Our approach simply selects the wrong model that is closest to the (unknown) true model.

Methods

Vuong’s model selection test

The Kullback–Leibler Information Criterion (KLIC) (Kullback 1959) measures the closeness of a probability model to the true distribution of data. Sawa (1978) showed that the KLIC orders approximate models by comparing the expected value of the log likelihood under the true model. Vuong (1989) used this result to develop an empirical test of two models based on the sample mean of the log-likelihood ratio scaled by its sample standard error.

Formally, {f(y|x; θ) : θ ∈ Θ} represents a parametric family of conditional models and

\begin{matrix} K L I C (h^{0}; f) = E^{0} [log h^{0} (y | x)] - E^{0} [log f (y | x; θ_{*})] \\ = \int_{x} [\int_{y} h^{0} (y | x) log \frac{h^{0} (y | x)}{f (y | x; θ_{*})} d y] h^{0} (x) d x, \end{matrix}

(1)

where E⁰ represents the expectation with respect to the true joint distribution h⁰(y, x) = h⁰(y|x)h⁰(x), and θ_* is the parameter value that minimizes the KLIC distance from f to the true model (Sawa 1978). Note that f need not belong to the same parametric family as h⁰.

A model f₁(y|x; θ_{1*}), denoted f₁ for short, is regarded as a better approximation to the true model h⁰(y|x) than the alternative model f₂(y|x; θ_{2*}) if and only if KLIC(h⁰; f₁) < KLIC(h⁰; f₂), or alternatively, E⁰[log f₁] > E⁰[log f₂] (Sawa 1978). Vuong’s model selection test is based on the latter criterion and the null and alternative hypotheses are defined as

H_{0} : E^{0} [L R_{12}] = 0, H_{1} : E^{0} [L R_{12}] > 0, H_{2} : E^{0} [L R_{12}] < 0,

(2)

where LR₁₂ = log f₁ − log f₂. The null hypothesis is that f₁ and f₂ are equally close to the true distribution. The alternative hypothesis H₁ means that f₁ is better than f₂, and conversely for the alternative H₂.
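The link between the KLIC ordering and the sign of E⁰[LR₁₂] takes one line to verify from Equation 1, because the E⁰[log h⁰] term is common to both models and cancels. The derivation below uses only the symbols already defined above.

```latex
\begin{aligned}
KLIC(h^{0}; f_{2}) - KLIC(h^{0}; f_{1})
  &= \{E^{0}[\log h^{0}] - E^{0}[\log f_{2}]\} - \{E^{0}[\log h^{0}] - E^{0}[\log f_{1}]\} \\
  &= E^{0}[\log f_{1}] - E^{0}[\log f_{2}] \\
  &= E^{0}[LR_{12}].
\end{aligned}
```

Hence E⁰[LR₁₂] > 0 holds exactly when KLIC(h⁰; f₁) < KLIC(h⁰; f₂), which is the alternative H₁.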

The quantity E⁰[LR₁₂] is unknown, but under fairly general conditions the sample mean and variance of $L\hat{R}_{12,i} = \log \hat{f}_{1,i} - \log \hat{f}_{2,i}$ converge almost surely to E⁰[LR₁₂] and Var⁰[LR₁₂] = σ_{12.12}, where $\hat{f}_{1,i} = f_{1}(y_{i} | x_{i}; \hat{θ}_{1})$ and $\hat{θ}_{1}$ is the maximum-likelihood estimate of θ₁ (Vuong 1989). Let $L\hat{R}_{12} = \sum_{i=1}^{n} L\hat{R}_{12,i}$; then, under H₀,

n^{- 1 / 2} L {\hat{R}}_{12} / \sqrt{{\hat{σ}}_{12.12}} \to^{d} N (0, 1) .

(3)

Under H₁ this test statistic converges almost surely to ∞, whereas, under H₂, it converges to −∞ (Vuong 1989).

Vuong’s test is based on the unadjusted log-likelihood ratio statistic. However, competing models may have different dimensions, requiring a complexity penalty. The penalized log-likelihood ratio is given by $L\hat{R}_{12}^{*} = L\hat{R}_{12} - D_{12}$, where the penalty D₁₂ is the difference in the number of parameters between models 1 and 2 (for AIC) or this value rescaled by (log n)/2 (for BIC). Because the penalty term is of smaller order than n^{1/2}, the adjusted log-likelihood ratio accounting for different model dimensions

Z_{12} = n^{- 1 / 2} L {\hat{R}}_{12}^{*} / \sqrt{{\hat{σ}}_{12.12}}

(4)

has the same asymptotic properties as $n^{-1/2} L\hat{R}_{12} / \sqrt{\hat{σ}_{12.12}}$ (Vuong 1989).

The P-value of Vuong’s test is given by p₁₂ = P(Z₁₂ ≥ z₁₂) = 1 − Φ(z₁₂), where Φ(·) represents the cumulative distribution function of a standard normal variable (Vuong 1989). Note that since Z₂₁ = −Z₁₂, p₂₁ = 1 − Φ(z₂₁) = Φ(z₁₂), so that p₁₂ + p₂₁ = 1. This property of Vuong’s test ensures that the P-values of the intersection-union tests cannot be simultaneously significant.
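To make Equation 4 and the P-value concrete, here is a minimal sketch of the penalized Vuong statistic computed from per-observation log-likelihoods. It is not the R/qtlhot code: the function and argument names are hypothetical, the input values are made up, and the use of the 1/n sample variance is an assumption.

```python
import math
from statistics import NormalDist


def vuong_test(loglik1, loglik2, k1, k2, penalty="bic"):
    """Penalized Vuong test of model 1 against model 2.

    loglik1, loglik2: per-observation log-likelihoods log f_hat_{1,i}, log f_hat_{2,i}
    k1, k2: numbers of parameters of the two models
    Returns (z12, p12) with p12 = 1 - Phi(z12).
    """
    n = len(loglik1)
    lr_i = [a - b for a, b in zip(loglik1, loglik2)]
    lr = sum(lr_i)
    mean = lr / n
    # Assumption: 1/n version of the sample variance of the LR_hat_{12,i}.
    var = sum((v - mean) ** 2 for v in lr_i) / n
    if var == 0.0:
        raise ValueError("log-likelihood ratios have zero variance")
    if penalty == "aic":
        d12 = k1 - k2
    elif penalty == "bic":
        d12 = (k1 - k2) * math.log(n) / 2.0
    else:
        raise ValueError("penalty must be 'aic' or 'bic'")
    # n^{-1/2} * LR* / sqrt(sigma_hat) is the same as LR* / sqrt(n * sigma_hat).
    z12 = (lr - d12) / math.sqrt(n * var)
    return z12, 1.0 - NormalDist().cdf(z12)


# Made-up per-observation log-likelihoods, for illustration only.
ll1 = [-1.0, -1.2, -0.8, -1.1, -0.9, -1.0]
ll2 = [-1.5, -1.1, -1.3, -1.6, -1.0, -1.4]
z12, p12 = vuong_test(ll1, ll2, k1=3, k2=4)
z21, p21 = vuong_test(ll2, ll1, k1=4, k2=3)
print(z12, p12, p21)
```

Swapping the roles of the two models flips the sign of both the log-likelihood ratio and the penalty, so `z21 == -z12` and the two P-values sum to 1, as stated above.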

Figure S2 illustrates how Vuong’s test trades a reduction in false positives against a reduction in statistical power. In our applications we need to account for both nested and nonnested models. While the presented test corresponds to Vuong’s test for strictly nonnested models, it is also valid for nested models when we adopt penalized likelihood scores (see File S1, for further details).

Clarke’s model selection paired sign test

The model selection paired sign test (Clarke 2007) is a nonparametric alternative to Vuong’s test, testing the null hypothesis that the median log-likelihood ratio is 0. Clarke’s test statistic, T₁₂, is a sign test on $L\hat{R}_{12,i}$. Under the null hypothesis that the median log-likelihood ratio is zero, T₁₂ has a binomial distribution, and the P-value for comparing models 1 and 2 is

p_{12} = P (T_{12} \geq t_{12}) = \sum_{k = t_{12}}^{n} C_{k}^{n} 2^{- n},

(5)

with $C_{k}^{n} = n! / [k! (n - k)!]$. The P-values for T₁₂ and T₂₁ do not add to 1 since the statistics are discrete: $p_{12} + p_{21} = 1 + C_{t_{12}}^{n} 2^{-n}$. Nonetheless, the $C_{t_{12}}^{n} 2^{-n}$ term decreases to 0 as n increases, and, in practice, p₁₂ + p₂₁ ≈ 1 even for moderate sample sizes. We adjust this test using the AIC or BIC penalty D₁₂,

T_{12} = \sum_{i = 1}^{n} \mathbb{1} \{L {\hat{R}}_{12, i} - n^{- 1} D_{12} > 0\},

(6)

to account for the varying dimensionality of the models.
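Equations 5 and 6 translate directly into a few lines of code. The sketch below is illustrative only: `clarke_test` is a hypothetical name, the log-likelihood values are made up, and the penalty `d12` is passed in already computed (0 means no penalty).

```python
from math import comb


def clarke_test(loglik1, loglik2, d12=0.0):
    """Penalized paired sign test of model 1 against model 2.

    Returns (t12, p12), where t12 counts the observations whose penalized
    log-likelihood ratio is positive and p12 = P(T >= t12), T ~ Binomial(n, 1/2).
    """
    n = len(loglik1)
    t12 = sum(1 for a, b in zip(loglik1, loglik2) if (a - b) - d12 / n > 0)
    p12 = sum(comb(n, k) for k in range(t12, n + 1)) / 2 ** n
    return t12, p12


# Made-up per-observation log-likelihoods, for illustration only.
ll1 = [-1.0, -1.2, -0.8, -1.1, -0.9, -1.0]
ll2 = [-1.5, -1.1, -1.3, -1.6, -1.0, -1.4]
t12, p12 = clarke_test(ll1, ll2)
t21, p21 = clarke_test(ll2, ll1)
print(t12, p12)   # 5 0.109375
print(t21, p21)   # 1 0.984375
```

With n = 6 and t₁₂ = 5, the two P-values sum to 1 + C(6, 5)/2⁶ = 1.09375, which matches the discreteness correction given in the text; the excess shrinks as n grows.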

Causal model selection tests

The four models M₁, M₂, M₃, and M₄ (Figure 1) are used to derive intersection-union tests based on the application of six separate Vuong (or Clarke) tests comparing, namely, f₁ × f₂, f₁ × f₃, f₁ × f₄, f₂ × f₃, f₂ × f₄, and f₃ × f₄. Sun et al. (2007) implicitly used intersection unions of Vuong’s tests to select among three nonnested models. Here, we present three distinct versions of the CMST: (1) parametric, (2) nonparametric, and (3) joint-parametric CMST tests. We implement the tests with penalized log likelihoods, but state results for log likelihoods.

Figure 1

Pairwise causal models. Y₁ and Y₂ represent phenotypes that co-map to the same QTL, Q. Models M₁, M₂, M₃, and M₄ represent, respectively, the causal, reactive, independent, and full model. In model M₁ the phenotype Y₁ has a causal effect on Y₂. In M₂, the phenotype Y₁ is actually reacting to a causal effect of Y₂, hence the name reactive model. In the independence model, M₃, there is no causal relationship between Y₁ and Y₂ and their correlation is solely due to Q. The full model, M₄, corresponds to three distribution equivalent models $M_{4}^{a}$, $M_{4}^{b}$, and $M_{4}^{c}$, which cannot be distinguished as their maximized-likelihood scores are identical. Model $M_{4}^{b}$ represents a causal independence relationship where the correlation between Y₁ and Y₂ is a consequence of latent causal phenotypes, common causal QTL, or of common environmental effects. Models $M_{4}^{a}$ and $M_{4}^{c}$ correspond to causal-pleiotropic and reactive-pleiotropic relationships, respectively.

Here we focus on model M₁ and P-value p₁, with analogous results and notation for the other three models. Starting with the parametric version, we test the null H₀: model M₁ is no closer to the true model than M₂, M₃ or M₄, against the alternative H₁: M₁ is closer to the true model than M₂, M₃, and M₄. More explicitly, we test,

H_{0} : {E^{0} [L R_{12}] = 0} \cup {E^{0} [L R_{13}] = 0} \cup {E^{0} [L R_{14}] = 0},

(7)

against

H_{1} : {E^{0} [L R_{12}] > 0} \cap {E^{0} [L R_{13}] > 0} \cap {E^{0} [L R_{14}] > 0} .

(8)

The rejection region for this test is min{z₁₂, z₁₃, z₁₄} > c_α, where c_α is the α-critical value of the standard normal. The intersection-union P-value is p₁ = max{p₁₂, p₁₃, p₁₄}. For any α, if p₁ ≤ α, then min{p₂, p₃, p₄} ≥ 1 − α, because, for example, p₂ ≥ p₂₁ = 1 − p₁₂ ≥ 1 − α. Therefore, for any α < 0.5, the proposed CMST test has at most one significant model P-value at a time, in contrast to the CIT approach.

The nonparametric CMST test corresponds to an intersection union of Clarke’s tests, exactly analogous to the parametric version. Because in practice p₁₂ + p₂₁ ≈ 1 for Clarke’s test, the nonparametric CMST test also does not allow the detection of more than one significant model P-value.
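The intersection-union step is the same for the parametric and nonparametric versions: each model's P-value is the largest of its three pairwise P-values. The following sketch assumes the six pairwise tests have already been run; the dictionary layout, the function names, and the numbers are all hypothetical.

```python
def cmst_iu_pvalues(pairwise_p):
    """pairwise_p[(j, k)] is the P-value of the test that model j beats model k.

    Returns {j: max over k != j of pairwise_p[(j, k)]}.
    """
    models = sorted({j for j, _ in pairwise_p})
    return {j: max(pairwise_p[(j, k)] for k in models if k != j) for j in models}


def cmst_call(model_p, alpha=0.05):
    """Return the single significant model, or 'no call' if there is none."""
    significant = [j for j, p in model_p.items() if p <= alpha]
    return f"M{significant[0]}" if len(significant) == 1 else "no call"


# Made-up Vuong-type P-values; each reverse comparison is the complement.
upper = {(1, 2): 0.01, (1, 3): 0.02, (1, 4): 0.03,
         (2, 3): 0.40, (2, 4): 0.30, (3, 4): 0.55}
pairwise = dict(upper)
pairwise.update({(k, j): 1.0 - p for (j, k), p in upper.items()})

model_p = cmst_iu_pvalues(pairwise)
print(model_p)
print(cmst_call(model_p))  # M1
```

In this example p₁ = 0.03, and the other three model P-values are all at least 1 − 0.03, so only M₁ can be called at α = 0.05.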

Simple application of separate Vuong tests overlooks the dependency among the test statistics. A multivariate extension, the joint-parametric CMST test, can be developed to address this caveat. For model M₁, and under the same general regularity conditions of Vuong (1989), the sample covariance of $L\hat{R}_{12,i}$ and $L\hat{R}_{13,i}$, $\hat{σ}_{12.13}$, converges almost surely to Cov⁰[LR₁₂, LR₁₃] = σ_{12.13} (and similarly for all other covariance terms). Therefore, the sample covariance matrix, $\hat{Σ}_{1}$, converges almost surely to $Σ_{1}$. From the multivariate central limit and Slutsky’s theorems (Shao 2003), if

(\begin{array}{l} E^{0} [L R_{12}] \\ E^{0} [L R_{13}] \\ E^{0} [L R_{14}] \end{array}) = (\begin{array}{l} 0 \\ 0 \\ 0 \end{array})

(9)

then

Z_{1} = diag {({\hat{Σ}}_{1})}^{- \frac{1}{2}} L {\hat{R}}_{1} / \sqrt{n} \to^{d} N_{3} (0, ρ_{1}),

where

L {\hat{R}}_{1} = {(L {\hat{R}}_{12}, L {\hat{R}}_{13}, L {\hat{R}}_{14})}^{T}

and

ρ_{1} = diag {(Σ_{1})}^{- \frac{1}{2}} Σ_{1} diag {(Σ_{1})}^{- \frac{1}{2}}

is the correlation matrix

ρ_{1} = (\begin{matrix} 1 & ρ_{12.13} & ρ_{12.14} \\ ρ_{12.13} & 1 & ρ_{13.14} \\ ρ_{12.14} & ρ_{13.14} & 1 \end{matrix}) .

(10)

The condition in (9) is the worst case of the more general null hypothesis that M₁ is not better than at least one of M₂, M₃, M₄, or

H_{0} : min {E^{0} [L R_{12}], E^{0} [L R_{13}], E^{0} [L R_{14}]} \leq 0.

(11)

We test this against the alternative that M₁ is better than all three, or

H_{1} : min {E^{0} [L R_{12}], E^{0} [L R_{13}], E^{0} [L R_{14}]} > 0,

(12)

using the statistic W₁ = min{Z₁}, with P-value

\begin{matrix} P (W_{1} \geq w_{1}) = P (min {Z_{12}, Z_{13}, Z_{14}} \geq w_{1}) \\ = P (Z_{12} \geq w_{1}, Z_{13} \geq w_{1}, Z_{14} \geq w_{1}) . \end{matrix}

(13)

The joint-parametric CMST test with W₁ follows the spirit of an intersection-union test while accounting for dependency among test statistics. Table 1 depicts the joint CMST tests for all models.
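The probability in Equation 13 is an orthant-type probability of a trivariate normal with correlation matrix ρ₁. This section does not say how it is evaluated numerically, so the sketch below swaps in plain Monte Carlo using only the Python standard library; a numerical integration routine for multivariate normal probabilities would normally be preferable. All names and the example correlation matrix are hypothetical.

```python
import math
import random


def cholesky(mat):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(mat)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(mat[i][i] - s)
            else:
                low[i][j] = (mat[i][j] - s) / low[j][j]
    return low


def joint_cmst_pvalue(z, rho, n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(Z_12 >= w, Z_13 >= w, Z_14 >= w), w = min(z),
    with (Z_12, Z_13, Z_14) ~ N_3(0, rho)."""
    w = min(z)
    low = cholesky(rho)
    rng = random.Random(seed)
    dim = len(z)
    hits = 0
    for _ in range(n_draws):
        e = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Correlated draw: low @ e has covariance low @ low^T = rho.
        draw = [sum(low[i][k] * e[k] for k in range(i + 1)) for i in range(dim)]
        if min(draw) >= w:
            hits += 1
    return hits / n_draws


# Made-up example: equicorrelated statistics with correlation 0.5.
rho_1 = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
p_1 = joint_cmst_pvalue([0.0, 1.0, 2.0], rho_1)
print(p_1)  # close to 0.25
```

The example is chosen so that the answer is known: with w₁ = 0 and all correlations equal to 0.5, the positive-orthant probability of a trivariate normal is 1/8 + 3·arcsin(0.5)/(4π) = 0.25, which gives a quick sanity check on the simulation.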

Table 1

Model selection tests for models M₁, M₂, M₃, and M₄

H₀	Null distribution	P-value
H₀^{M₁}	$Z_{1} = {(Z_{12}, Z_{13}, Z_{14})}^{T} \sim N_{3} (0, {\hat{ρ}}_{1})$	p₁ = P(Z₁₂ ≥ w₁, Z₁₃ ≥ w₁, Z₁₄ ≥ w₁)
H₀^{M₂}	$Z_{2} = {(Z_{21}, Z_{23}, Z_{24})}^{T} \sim N_{3} (0, {\hat{ρ}}_{2})$	p₂ = P(Z₂₁ ≥ w₂, Z₂₃ ≥ w₂, Z₂₄ ≥ w₂)
H₀^{M₃}	$Z_{3} = {(Z_{31}, Z_{32}, Z_{34})}^{T} \sim N_{3} (0, {\hat{ρ}}_{3})$	p₃ = P(Z₃₁ ≥ w₃, Z₃₂ ≥ w₃, Z₃₄ ≥ w₃)
H₀^{M₄}	$Z_{4} = {(Z_{41}, Z_{42}, Z_{43})}^{T} \sim N_{3} (0, {\hat{ρ}}_{4})$	p₄ = P(Z₄₁ ≥ w₄, Z₄₂ ≥ w₄, Z₄₃ ≥ w₄)

Here w_k = min{z_k} for k = 1,…,4, and ρ_k is defined in analogy with ρ₁ in Equation 10.

The CMST tests are implemented in the R/qtlhot package available at CRAN. Although not explicitly stated in the notation, the pairwise models can easily account for additive and interactive covariates, and our code already implements this feature. When using this package please cite this article.

Simulation studies

We conducted two simulation studies. In the first “pilot study,” we focus on performance comparison of the AIC, BIC, CIT, and CMST methods with data generated from simple causal models. The goal is to understand the behavior of our methods in diverse settings. In the second “large-scale study,” we perform a simulation experiment, with data generated from causal models emulating QTL hotspot patterns. The goal is to understand the impact of multiple testing on the performance of our causality tests.

The pilot simulation study has data generated from models A to E in Figure 2. We conducted 10 simulation studies, generating data from each of the five models under sample sizes 112 (the size of our real data example) and 1000. For each model, we simulated 1000 backcrosses. We chose simulation parameters to ensure that 99% of the R² coefficients between phenotypes and QTL ranged between 0.08 and 0.32 for the simulations based on a sample size of 112 subjects and between 0.01 and 0.20 for the simulations based on 1000 subjects (see File S2, File S3, and File S4 for details). We evaluated the methods’ performance using statistical power, miss-calling error rate, and precision. These quantities were computed as

\begin{matrix} \text{Power} = \frac{TP}{N}, \quad \text{Miss-calling error} = \frac{FP}{N}, \\ \text{Precision} = \frac{TP}{TP + FP}, \end{matrix}

where N is the total number of tests, and TP (true positives) and FP (false positives) are defined according to Table 2, which depicts possible calls against simulated models and tabulates whether a specific call correctly represents the causal relationship between the phenotypes in the model from which the data were generated.

Figure 2

Models used in the simulation study. Y₁ and Y₂ represent phenotypes that co-map to the same QTL, Q. Model A represents a causal effect of Y₁ on Y₂. Model B represents the same, with the additional complication that part of the correlation between Y₁ and Y₂ is due to a hidden-variable H. Model C represents a causal-pleiotropic model, where Q affects both Y₁ and Y₂ but Y₁ also has a causal effect on Y₂. Model D shows a purely pleiotropic model, where both Y₁ and Y₂ are under the control of the same QTL, but one does not causally affect the other. Model E represents the pleiotropic model, where the correlation between Y₁ and Y₂ is partially explained by a hidden-variable H.

Table 2

True and false positives

CMST	Model A	Model B	Model C	Model D	Model E
M₁	TP	TP	FP	FP	FP
M₂	FP	FP	FP	FP	FP
M₃	FP	FP	FP	TP	FP
M₄	FP	FP	TP	FP	TP
CIT	Model A	Model B	Model C	Model D	Model E
M₁	TP	TP	FP	FP	FP
M₂	FP	FP	FP	FP	FP
M_i	FP	FP	TP	TP	TP

Each entry i, j indicates whether the call on row i is a true positive (TP) or a false positive (FP) when the data are generated from the model on column j. For instance, when data are generated from models A or B, an M₁ call represents a true positive, whereas an M₂, M₃, or M₄ call represents a false positive for the AIC, BIC, and CMST approaches (for the CIT an M₂ or M_i call represents a false positive). Note that an M₄ call is considered a true positive for model C (in addition to model E) because model C corresponds to model $M_{4}^{a}$ in Figure 1 and, hence, is distribution equivalent to model M₄. Note also that because the CIT provides P-values for only the M₁ and M₂ calls, but not for the M₃ and M₄ calls, and its output is M₁, M₂, or M_i, we classify an M_i call as a true positive for models C, D, and E. By doing so we are actually giving an unfair advantage to the CIT approach, since when the data are generated from, say, model E, the CIT needs only to discard models M₁ and M₂ as nonsignificant to detect a “true positive.” The AIC, BIC, and CMST approaches, on the other hand, need to discard models M₁, M₂, and M₃ as nonsignificant and accept model M₄ as significant.
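Table 2 and the three performance measures can be combined into a short scoring routine. This is a sketch under stated assumptions: the dictionaries simply transcribe Table 2, the function name and the list-of-pairs input format are hypothetical, and a "no call" is counted in N but as neither a true nor a false positive.

```python
# Calls that count as true positives for each generating model (from Table 2).
TRUE_CALL_CMST = {"A": {"M1"}, "B": {"M1"}, "C": {"M4"}, "D": {"M3"}, "E": {"M4"}}
TRUE_CALL_CIT = {"A": {"M1"}, "B": {"M1"}, "C": {"Mi"}, "D": {"Mi"}, "E": {"Mi"}}


def performance(calls, true_call):
    """calls: list of (generating_model, call) pairs; call may be 'no call'."""
    n = len(calls)
    tp = sum(1 for model, call in calls if call in true_call[model])
    fp = sum(1 for model, call in calls
             if call != "no call" and call not in true_call[model])
    precision = tp / (tp + fp) if tp + fp > 0 else float("nan")
    return {"power": tp / n, "miss_calling_error": fp / n, "precision": precision}


# Made-up calls, for illustration only.
example = [("A", "M1"), ("A", "M2"), ("C", "M4"), ("D", "no call")]
scores = performance(example, TRUE_CALL_CMST)
print(scores)
```

Because "no call" outcomes enter the denominator N but not TP + FP, a method that forfeits difficult calls loses power while its precision can stay high.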

In the large-scale simulation study we investigate the empirical FDR (1 minus the precision) and power levels achieved by the CMST tests using the Benjamini and Hochberg (1995) and the Benjamini and Yekutieli (2001) FDR control procedures (denoted, respectively, by BH and BY), as well as no multiple testing correction. We simulate data from the models in Figure 3, which emulate eQTL hotspot patterns, i.e., genomic regions to which hundreds or thousands of traits co-map (West et al. 2007). In each simulation we generated 1000 distinct backcrosses with phenotypic data on 5001 traits on 112 individuals. We simulated unequally spaced markers for model F, but equally spaced markers for G, with Q₁ and Q set 1 cM apart. Because we performed almost three million hypothesis tests in this simulation study, we did not include the CIT tests in this investigation, restricting our attention to the computationally more efficient CMST tests. The details for our choice of simulation parameters and QTL mapping are presented in File S2, File S3, and File S4. A frequent goal in eQTL hotspot studies is to determine a master regulator, i.e., a transcript that regulates the transcription of the other traits mapping to the hotspot. A promising strategy toward this end is to test the cis traits (i.e., transcripts physically located close to the QTL hotspot) against all other co-mapping traits. Our simulations evaluate the performance of the CMST tests in this setting.
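For readers unfamiliar with the two FDR procedures named above, the step-up rules can be sketched as follows. This is a generic illustration of the BH and BY procedures, not the code used in the simulations; the function name, the target level `q`, and the P-values are made up.

```python
def fdr_reject(pvalues, q=0.05, method="BH"):
    """Step-up FDR control. Returns a list of booleans (True = reject).

    BH compares the i-th smallest P-value with i * q / m.
    BY uses the same rule with q divided by c(m) = sum_{i=1}^{m} 1/i.
    """
    m = len(pvalues)
    if method == "BH":
        c_m = 1.0
    elif method == "BY":
        c_m = sum(1.0 / i for i in range(1, m + 1))
    else:
        raise ValueError("method must be 'BH' or 'BY'")
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0  # largest rank whose P-value falls under its threshold
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / (m * c_m):
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Made-up P-values, for illustration only.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bh = fdr_reject(pvals, q=0.05, method="BH")
by = fdr_reject(pvals, q=0.05, method="BY")
print(sum(bh), sum(by))  # 2 1
```

BY is the more conservative of the two because its thresholds are shrunk by the harmonic sum c(m), which is what makes it valid under arbitrary dependence among the tests.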

Figure 3

Models generating hotspot patterns. Y₁ represents a cis-expression trait. Y_k, k = 2, …, 5001 represent expression traits mapping in trans to the hotspot QTL Q. H represents an unobserved expression trait. Model F generates a hotspot pattern derived from the causal effect of the master regulator, Y₁, on the transcription of the other traits. Model G gives rise to a hotspot pattern, due to the causal effect of H on Y_k, but where the cis-trait Y₁ maps to Q₁, a QTL closely linked to the true QTL hotspot Q, and is actually causally independent of the traits mapping in trans to the Q.

Results

Pilot simulation study results

Figure 4 depicts the power, miss-calling error rate, and precision of each of the methods based on the simulation results of all five models in Figure 2. The results of the AIC and BIC approaches are constant across all significance levels since these approaches do not provide a measure of statistical significance. For those methods, we simply fit the models to the data and select the model with the best (smallest) score.

Figure 4

Power (A and D), miss-calling error rate (B and E), and precision (C and F) across the simulated models depicted in Figure 2. The x-axis represents the significance levels used for computing the results. (A–C) The simulations based on sample size 112; (D–F) the results for sample size 1000. Dashed and solid curves represent, respectively, AIC- and BIC-based methods. Green: parametric CMST. Red: nonparametric CMST. Blue: joint-parametric CMST. Black: AIC and BIC. Orange: CIT. The shaded line on B and E corresponds to the α levels.

Overall, the AIC, BIC, and CIT showed high power, high miss-calling error rates, and low precision. The CMST methods, on the other hand, showed lower power, lower miss-calling error rates, and higher precision. The nonparametric CMST tended to be more powerful but less precise than the other CMST approaches. As expected, for sample size 1000, all methods showed an increase in power and precision and decrease in miss-calling error rate.

Figure S3, Figure S4, Figure S5, Figure S6, and Figure S7 show the simulation results for each of models A to E when the sample size is 112. Figure S8, Figure S9, Figure S10, Figure S11, and Figure S12 show the same results for sample size 1000. Some of the simulated models were intrinsically more challenging than others. For instance, in the absence of latent variables the causal and independence relations can often be correctly inferred by all methods (see the results for models A and D in Figure S3, Figure S6, Figure S8, and Figure S11). However, the presence of hidden variables in models B and E tends to complicate matters. Nonetheless, although the AIC, BIC, and CIT methods tend to detect false positives at high rates in these complicated situations, the CMST tests tend to forfeit making calls and tend to detect fewer false positives (see Figure S4, Figure S7, Figure S9, and Figure S12). Model C is particularly challenging (Figure S5 and Figure S10), showing the highest false-positive detection rates among all models.

In genetical genomics experiments we often restrict our attention to the analysis of cis-genes against trans-genes. In this special case it is reasonable to expect the pleiotropic causal relationship depicted in model C to be much less frequent than the relationships shown in models A, B, D, and E, so that the performance statistics shown in Figure 4 might be negatively affected to an unnecessary degree by the simulation results from model C.

To investigate the performance of the methods in the cis- against trans-case, we present in Figure 5 the simulation results based on models A, B, D, and E only. Comparison of Figures 4 and 5 shows an overall improvement in power, a decrease in miss-calling error rates, and an increase in precision.

Figure 5

Power (A and D), miss-calling error rate (B and E), and precision (C and F) restricted to the cis- vs. trans-cases. The x-axis represents the significance levels used for computing the results. The results were computed using only the simulated models A, B, D, and E in Figure 2, since the pleiotropic causal relationship depicted in model C is expected to be much less frequent than the others when testing cis- vs. trans-case. (A–C) The simulations based on sample size 112; (D–F) the results for sample size 1000. Dashed and solid curves represent, respectively, AIC- and BIC-based methods. Green: parametric CMST. Red: nonparametric CMST. Blue: joint-parametric CMST. Black: AIC and BIC. Orange: CIT. The shaded line on B and E corresponds to the α levels.

In the analysis of trans- against trans-genes there is no a priori reason to discard the relationship depicted in model C, and more false positives should be expected. The CMST approaches, especially the joint-parametric and parametric CMST methods, tend to detect a much smaller number of false positives than the AIC, BIC, and CIT approaches, as shown in Figure S5 and Figure S10.

Large-scale simulation study results

With the possible exception of the nonparametric version, the previous simulation study suggests that the CMST tests can be quite conservative. Therefore, it is reasonable to ask whether multiple testing correction is really necessary to achieve reasonable false discovery rates (FDR).

Figure 6 presents the observed FDR and power using uncorrected, BH-corrected, and BY-corrected P-values for the simulations based on model G. Figure 6, top, shows that, except for the AIC-based nonparametric CMST, the observed FDRs were considerably lower than the P-value cutoff, suggesting that multiple testing adjustment is not necessary for the CMST tests. Furthermore, comparison of the bottom panels shows that, for the joint and parametric tests, the BH and BY adjustments lead to a reduction in power (especially for the BY adjustment) in exchange for only a small drop in FDR levels (which were already low without any correction). For the nonparametric tests, on the other hand, the BH correction leads to bigger drops in FDR (especially for the AIC-based test) and smaller drops in power. The BY correction appears too conservative even for the nonparametric tests. The results for model F are similar (Figure S13).
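For readers who want to reproduce this kind of comparison, the following is a minimal Python sketch of the standard step-up BH and BY adjustments, together with an observed FDR computed from known simulation truth. The function names (adjust_pvalues, observed_fdr) are hypothetical and are not taken from the authors' software; the sketch only illustrates the textbook procedures.

```python
def adjust_pvalues(pvalues, method="BH"):
    """Return step-up adjusted P-values, in the original order.

    method="BH" gives the Benjamini-Hochberg adjustment; method="BY" gives
    the Benjamini-Yekutieli adjustment, which multiplies the BH adjustment
    by the harmonic sum 1 + 1/2 + ... + 1/m.
    """
    m = len(pvalues)
    if m == 0:
        return []
    scale = sum(1.0 / i for i in range(1, m + 1)) if method == "BY" else 1.0
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity and a cap of 1.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m * scale / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


def observed_fdr(called, is_true):
    """Fraction of calls that are false; 0.0 when no call is made."""
    truths_of_calls = [t for c, t in zip(called, is_true) if c]
    if not truths_of_calls:
        return 0.0
    return sum(1 for t in truths_of_calls if not t) / len(truths_of_calls)


# Illustrative P-values only (not from the paper).
raw = [0.01, 0.04, 0.03, 0.005]
bh = adjust_pvalues(raw, method="BH")
by = adjust_pvalues(raw, method="BY")
```

Because the BY adjustment is the BH adjustment inflated by a factor that grows with the number of tests, it can never be less conservative than BH, which is consistent with the larger power loss reported for BY above.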

Figure 6

Observed FDR and power for the simulations based on model G. The x-axis represents the P-value cutoffs used for computing the results. Dashed and solid curves represent, respectively, AIC- and BIC-based methods. Green: parametric CMST. Red: nonparametric CMST. Blue: joint-parametric CMST. Black: AIC and BIC. The shaded line in the top corresponds to the α levels.

Yeast data analysis and biologically validated predictions

We analyzed a budding yeast genetical genomics data set derived from a cross of a standard laboratory strain and a wild isolate from a California vineyard (Brem and Kruglyak 2005). The data consist of expression measurements on 5740 transcripts measured on 112 segregant strains with dense genotype data on 2956 markers. Processing of the raw expression measurements was performed as described in Brem and Kruglyak (2005), with an additional step of converting the processed measurements to normal scores. We performed QTL analysis using Haley–Knott regression (Haley and Knott 1992) with the R/qtl software (Broman et al. 2003). We used Haldane’s map function, a genotype error rate of 0.0001, and set the maximum distance between positions at which genotype probabilities were calculated to 2 cM. We adopted a permutation LOD threshold (Churchill and Doerge 1994) of 3.48, controlling the genome-wide error rate of falsely detecting a QTL at a significance level of 5%.
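Two of the preprocessing ingredients mentioned above can be illustrated with a short Python sketch: a rank-based conversion to normal scores and Haldane's map function. The plotting position (rank − 0.5)/n used here is one common convention and is an assumption, since the text does not state which variant was used; ties are not handled, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def normal_scores(values):
    """Replace each value by the standard normal quantile of its rank.

    Assumption: plotting position (rank - 0.5) / n; ties are broken by
    input order rather than averaged.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    standard_normal = NormalDist()
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = standard_normal.inv_cdf((rank - 0.5) / n)
    return scores


def haldane_recombination_fraction(distance_cm):
    """Haldane's map function: r = (1 - exp(-2d)) / 2, with d in Morgans."""
    d = distance_cm / 100.0  # centimorgans to Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * d))


# Recombination fraction across the 2 cM grid spacing used for genotype probabilities.
r_step = haldane_recombination_fraction(2.0)
```

At a 2 cM spacing the recombination fraction is just under 0.02, so distance in Morgans and recombination fraction are nearly interchangeable at this scale.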

To evaluate the precision of the causal predictions made by the methods we used validated causal relationships extracted from a database of 247 knockout experiments in yeast (Hughes et al. 2000; Zhu et al. 2008). In each of these experiments, one gene was knocked out, and the expression levels of the remaining genes in control and knocked-out strains were interrogated for differential expression. The set of differentially expressed genes forms the knockout signature (ko-signature) of the knocked-out gene (ko-gene) and shows direct evidence of a causal effect of the ko-gene on the ko-signature genes. The yeast cross data and knockout data analyzed in this section are available in the R/qtlyeast package on GitHub (https://github.com/byandell/qtlyeast).

To use this information, we: (i) determined which of the 247 ko-genes also showed a significant eQTL in our data set; (ii) for each of the ko-genes showing significant linkage, determined which other genes in our data set co-mapped to the same QTL as the ko-gene, generating, in this way, a list of putative targets of the ko-gene; (iii) for each ko-gene/putative target list, applied all methods using the ko-gene as the Y₁ phenotype, the putative target genes as the Y₂ phenotypes, and the ko-gene QTL as the causal anchor; (iv) for the AIC- and BIC-based nonparametric CMST tests, adjusted the P-values according to the Benjamini and Hochberg FDR control procedure; and (v) for each method, determined the “validated precision,” computed as the ratio of true positives to the sum of true and false positives, where a true positive is defined as an inferred causal relationship where the target gene belongs to the ko-signature of the ko-gene, and a false positive is given by an inferred causal relation where the target gene does not belong to the ko-signature.
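Step (v) can be sketched in a few lines of Python. The gene labels, the P-values, and the function name validated_precision are hypothetical, and calling a relationship when its P-value is at or below the target level is an assumption about the decision rule; the sketch only shows the bookkeeping of true and false positives against a ko-signature.

```python
def validated_precision(pvalues_by_target, ko_signature, alpha):
    """Count true and false positives for one ko-gene and return precision.

    pvalues_by_target: dict mapping putative target gene -> P-value of the
        causal call "ko-gene -> target" (hypothetical input format).
    ko_signature: set of genes differentially expressed in the knockout.
    alpha: target significance level; a call is made when P-value <= alpha.
    """
    called = {gene for gene, p in pvalues_by_target.items() if p <= alpha}
    true_positives = len(called & ko_signature)
    false_positives = len(called - ko_signature)
    total = true_positives + false_positives
    # Precision is undefined when no call is made.
    precision = true_positives / total if total else float("nan")
    return true_positives, false_positives, precision


# Hypothetical example: four putative targets of one ko-gene.
pvals = {"geneA": 0.001, "geneB": 0.02, "geneC": 0.3, "geneD": 0.004}
signature = {"geneA", "geneC"}
tp, fp, precision = validated_precision(pvals, signature, alpha=0.05)
```

In this toy example three genes are called, only one of which belongs to the ko-signature, giving a precision of one third. Overall curves across lists would pool the true- and false-positive counts before taking the ratio.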

In total 135 of the ko-genes showed a significant QTL, generating 135 putative target lists. A gene belonged to the putative target list of a ko-gene when its 1.5 LOD support interval (Lander and Botstein 1989; Dupuis and Siegmund 1999; Manichaikul et al. 2006) contained the location of the ko-gene QTL. The number of genes in each of the putative target lists varied from list to list, but in total we tested 31,975 “ko-gene/putative target gene” relationships.
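The co-mapping rule can be illustrated with a simplified Python sketch. It assumes a LOD profile evaluated on a grid of positions along one chromosome and takes the support interval to be the contiguous run of grid positions around the peak whose LOD stays within 1.5 units of the maximum. This is a deliberately simplified version of the idea, not the routine used by the authors, and the function names are hypothetical.

```python
def lod_support_interval(positions, lods, drop=1.5):
    """Return (low, high) positions of a simplified LOD support interval.

    Starting at the peak, extend left and right while the LOD score stays
    at or above (max LOD - drop). Secondary peaks separated from the main
    peak by a deeper dip are excluded.
    """
    peak = max(range(len(lods)), key=lambda i: lods[i])
    threshold = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= threshold:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= threshold:
        right += 1
    return positions[left], positions[right]


def co_maps(interval, qtl_position):
    """True when the ko-gene QTL location falls inside the support interval."""
    low, high = interval
    return low <= qtl_position <= high


# Hypothetical LOD profile on a 2 cM grid.
grid = [0, 2, 4, 6, 8, 10]
profile = [1.0, 3.0, 5.0, 4.0, 2.0, 4.5]
interval = lod_support_interval(grid, profile)
```

A gene with this profile would enter the putative target list of a ko-gene whose QTL lies between 4 and 6 cM, but not one whose QTL lies at 10 cM, even though the LOD score there is high.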

Figure 7 presents the number of inferred true positives, number of inferred false positives, and the prediction precision across varying target significance levels for each one of the methods. The CIT, BIC, and AIC had a higher number of true positives than the CMST approaches, with the AIC-based CMST methods having less power than the BIC-based CMST methods. However, the CIT, BIC, and AIC also inferred the highest numbers of false positives (Figure 7B) and showed low prediction precisions (Figure 7C). From Figure 7C we see that the CMST tests show substantially higher precision rates across all target significance levels compared to the AIC, BIC, and CIT methods. Among the CMST approaches, the joint-parametric CMST tended to show the highest precision, followed by the nonparametric and parametric CMST tests.

Figure 7

Overall number of true positives (A), number of false positives (B), and precision (C) across all 135 ko-gene/putative target lists. The x-axis represents the significance levels used for computing the results. Dashed and solid curves represent, respectively, AIC- and BIC-based methods. Green: parametric CMST. Red: nonparametric CMST. Blue: joint-parametric CMST. Black: AIC and BIC. Orange: CIT.

The results presented in Figure 7 were computed using all 135 ko-genes. However, in light of our simulation results, which suggest that the analysis of cis- against trans-genes is usually easier than the analysis of trans- against trans-genes, we investigated the results restricting ourselves to ko-genes with significant cis-QTL. Only 28 of the 135 ko-genes were cis-traits; nonetheless, they were responsible for 2947 of the total 31,975 “ko-gene/putative target gene” relationships. Figure 8 presents the results restricted to the cis-ko-genes. All methods show improvement in precision, corroborating our simulation results. Once again, the CMST tests showed higher precision than the CIT, AIC, and BIC.

Figure 8

Overall number of true positives (A), number of false positives (B), and precision (C) restricted to 28 cis ko-gene/putative target lists. The x-axis represents the significance levels used for computing the results. Dashed and solid curves represent, respectively, AIC- and BIC-based methods. Green: parametric CMST. Red: nonparametric CMST. Blue: joint-parametric CMST. Black: AIC and BIC. Orange: CIT.

Discussion

In this article, we proposed three novel hypothesis tests that adapt and extend Vuong’s and Clarke’s model selection tests to the comparison of four models, spanning the full range of possible causal relationships among a pair of phenotypes. Our CMST tests scale well to large genome-wide analyses because they are fully analytical and avoid computationally expensive permutation or resampling strategies.

Another useful property of the CMST tests, inherited from Vuong’s test, is their ability to perform model selection among misspecified models. That is, the correct model need not be one of the models under consideration. Accounting for the misspecification of the models is key. In general, any two phenotypes of interest are embedded in a complex network and are affected by many other phenotypes not considered in the grossly simplified (and thus misspecified) pairwise models.

Overall, our simulations and real data analysis show that the CMST tests are better at controlling miss-calling error rates and tend to outperform the AIC, BIC, and CIT methods in terms of statistical precision. However, they do so at the expense of a decrease in statistical power. While an ideal method would have high precision and power, in practice there is always a trade-off between these quantities. Whether a more powerful and less precise, or a less powerful and more precise, method is more adequate depends on the biologist’s research goals and resources. For instance, if the goal is to generate a rank-ordered list of promising candidate genes that might causally affect a phenotype of interest and the biologist can easily validate several genes, a larger list generated by more powerful and less precise methods might be more appealing. However, in general, follow-up studies tend to be time consuming and expensive, and only a few candidates can be studied in detail. A long list of putative causal traits is not useful if most are false positives. High power to detect causal relations alone is not enough. A more precise method that conservatively identifies candidates with high confidence can be more appealing (see also Chen et al. 2007).

Further, the exploratory goal is often to identify causal agents without attempting to reconstruct entire pathways. Therefore, much information about the larger networks in which the tested pairs of traits reside is unknown and generally unknowable and contributes to the large unexplained variation that in turn results in low power. Our method accurately reflects the difficulty of detecting causal relationships in the presence of noisy high-throughput data and poorly understood networks.

Interestingly, our data analysis and simulations also suggest that the analysis of cis- against trans-gene pairs is less prone to detect false positives than the analysis of trans- against trans-gene pairs. Our simulations suggest that model selection approaches have difficulty ordering the phenotypes when the QTL effect reaches the truly reactive gene by two or more distinct paths, only one of which is mediated by the truly causal gene (see Figure S1C, for an example).

When we test causal relationships among gene expression phenotypes, the true relationships might not be a direct result of transcriptional regulation. For instance, the true causal regulation might be due to methylation, phosphorylation, direct protein–protein interaction, transcription factor binding, etc. Margolin and Califano (2007) have pointed out the limitations of causal inference at the transcriptional level, where molecular phenotypes at other layers of regulation might represent latent variables. Model M₄ (see Figure 1) can account for these latent variables and can test this scenario explicitly.

Furthermore, as pointed out by Li et al. (2010), causal inference depends on the detection of subtle patterns in the correlation between traits. Hence, it can be challenging even when the true causal relations take place at the transcriptional level. The authors point out that reliable causal inference in genome-wide linkage and association studies requires large sample sizes and would benefit from: (i) incorporating prior information via Bayesian reasoning; (ii) adjusting for experimental factors, such as sex and age, that might induce correlations not explained by the causal relations; and (iii) considering a richer set of models than the four models considered in this article.

The CMST tests represent a step in the direction of reliable causal inference on three counts. First, they tend to be precise, declining to make calls in situations where alternative approaches usually deliver a flood of false-positive calls. Second, the CMST tests can adjust for experimental factors by modeling them as additive and interactive covariates. Third, the CMST tests can be applied to nonnested models of different dimensions and can be readily extended to incorporate a larger number of models by implementing intersection-union tests on a larger number of Vuong’s tests. For the joint-parametric test a higher-dimensional null distribution is required.
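The intersection-union construction mentioned above can be sketched in Python as follows. The sketch uses the standard form of Vuong's statistic with an AIC- or BIC-type penalty on the log-likelihood ratio and takes per-observation log-likelihoods as given inputs; the exact statistics used by the CMST tests are those defined in the Methods, so this is an illustration under stated assumptions rather than the authors' implementation. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def vuong_pvalue(loglik_a, loglik_b, k_a, k_b, penalty="bic"):
    """One-sided P-value for "model A is closer to the truth than model B".

    loglik_a, loglik_b: per-observation log-likelihoods of the fitted models.
    k_a, k_b: numbers of parameters, used for the AIC- or BIC-type penalty.
    """
    n = len(loglik_a)
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    lr = sum(diffs)
    mean = lr / n
    variance = sum((d - mean) ** 2 for d in diffs) / n
    if variance == 0.0:
        raise ValueError("log-likelihood ratios have zero variance")
    if penalty == "aic":
        correction = k_a - k_b
    else:
        correction = 0.5 * (k_a - k_b) * math.log(n)
    z = (lr - correction) / math.sqrt(n * variance)
    return 1.0 - NormalDist().cdf(z)


def intersection_union_pvalue(pairwise_pvalues):
    """A model is selected only if it beats every competitor, so the
    intersection-union P-value is the largest pairwise P-value."""
    return max(pairwise_pvalues)


# Hypothetical per-observation log-likelihoods for two models.
ll_a = [-1.0, -1.1, -0.9, -1.2, -0.8, -1.0]
ll_b = [-1.6, -1.5, -1.7, -1.4, -1.9, -1.5]
p_ab = vuong_pvalue(ll_a, ll_b, k_a=3, k_b=3)
```

Adding a further candidate model only adds one more pairwise P-value to the list passed to intersection_union_pvalue, which is why this form of the test extends readily to more than four models.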

FDR control for the CMST approaches is a challenging problem because our tests violate the key assumption, made by FDR control procedures, that the P-values are uniformly distributed under the null hypothesis (Benjamini and Hochberg 1995; Storey and Tibshirani 2003). Recall that the CMST P-values are computed as the maximum across other P-values, and the maximum of multiple uniform random variables no longer follows a uniform distribution. Additionally, the CMST tests are usually not independent, since we often test the same cis-trait against several trans-traits, so the additional assumption of independent test statistics made by the original Benjamini–Hochberg (BH) procedure does not hold. The Benjamini–Yekutieli (BY) procedure relaxes the independence assumption, and we explored both corrections in our simulations. Our results suggest that BH and BY multiple testing correction should not be performed for the joint and the parametric CMST tests: their FDR levels are already lower than the nominal level without any correction, and applying BH or BY control makes them too conservative, with a severe reduction in statistical power. The nonparametric CMST tests, on the other hand, seemed to benefit from BH correction, showing a slight decrease in power with a concomitant decrease in FDR, even though the nonparametric CMST tests are based on discrete test statistics and the BH procedure was developed to handle P-values from continuous statistics. Inspection of the P-value distributions (see Figure S14, Figure S15, Figure S16, and Figure S17) suggests that the smaller P-values of the nonparametric tests, relative to the other approaches, are the reason for the higher power achieved by the BH-corrected nonparametric tests. The BY procedure, on the other hand, tended to be too conservative even for the nonparametric CMST tests.
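The nonuniformity of a maximum of P-values can be checked with a small simulation. The Python sketch below makes the idealized assumption that the component P-values are independent and uniform, which does not hold for the CMST tests, whose component tests share data; it is meant only to show the direction of the effect. For k independent uniform P-values, the probability that their maximum falls at or below α is α raised to the power k, which is far smaller than α. The function name is hypothetical.

```python
import random


def rejection_rate_of_max(num_tests, alpha, num_draws=100_000, seed=1):
    """Empirical P(max of num_tests independent uniform P-values <= alpha)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(num_draws):
        p_max = max(rng.random() for _ in range(num_tests))
        if p_max <= alpha:
            rejections += 1
    return rejections / num_draws


# A single uniform P-value rejects at a rate near alpha ...
rate_single = rejection_rate_of_max(num_tests=1, alpha=0.05)
# ... whereas the maximum of three rejects at a rate near alpha ** 3.
rate_max_of_three = rejection_rate_of_max(num_tests=3, alpha=0.05)
```

Under this idealization the maximum is stochastically larger than a uniform variable, which is consistent with the conservative behavior of the joint and parametric CMST tests described above.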

The CMST approach is currently implemented for inbred line crosses. Extension to outbred populations involving mixed effects models is yet to be done. Although in this article we focused on mRNA expression traits, the CMST tests can be applied to any sort of heritable phenotype, including clinical phenotypes and other “omic” molecular phenotypes.

The higher statistical precision and computational efficiency achieved by our fully analytical hypothesis tests will help biologists to perform large-scale screening of causal relations, providing a conservative rank-ordered list of promising candidate genes for further investigations.

Acknowledgments

We thank Adam Margolin for helpful discussions and comments and the editor and referees for comments and suggestions that considerably improved this work. This work was supported by CNPq Brazil (E.C.N.); National Cancer Institute (NCI) Integrative Cancer Biology Program grant U54-CA149237 and National Institutes of Health (NIH) grant R01MH090948 (E.C.N.); National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) grants DK66369, DK58037, and DK06639 (A.D.A., M.P.K., A.T.B., B.S.Y., E.C.N.); National Institute of General Medical Sciences (NIGMS) grants PA02110 and GM069430-01A2 (B.S.Y.).

Literature Cited

Akaike H, 1974 A new look at the statistical model identification. IEEE Trans. Automat. Contr. 19: 716–723.

Aten J E, Fuller T F, Lusis A J, Horvath S, 2008 Using genetic markers to orient the edges in quantitative trait networks: the NEO software. BMC Syst. Biol. 2: 34.

Benjamini Y, Hochberg Y, 1995 Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc., B 57: 289–300.

Benjamini Y, Yekutieli D, 2001 The control of the False Discovery Rate in multiple testing under dependency. Ann. Stat. 29: 1165–1188.

Brem R, Kruglyak L, 2005 The landscape of genetic complexity across 5,700 gene expression trait in yeast. Proc. Natl. Acad. Sci. USA 102: 1572–1577.

Broman K, Wu H, Sen S, Churchill G A, 2003 R/qtl: QTL mapping in experimental crosses. Bioinformatics 19: 889–890.

Chaibub Neto E, Ferrara C, Attie A D, Yandell B S, 2008 Inferring causal phenotype networks from segregating populations. Genetics 179: 1089–1100.

Chaibub Neto E, Keller M P, Attie A D, Yandell B S, 2010 Causal graphical models in system genetics: a unified framework for joint inference of causal network and genetic architecture for correlated phenotypes. Ann. Appl. Stat. 4: 320–339.

Chen L S, Emmert-Streib F, Storey J D, 2007 Harnessing naturally randomized transcription to infer regulatory relationships among genes. Genome Biol. 8: R219.

Churchill G A, Doerge R W, 1994 Empirical threshold values for quantitative trait mapping. Genetics 138: 963–971.

Clarke K A, 2007 A simple distribution-free test for nonnested model selection. Polit. Anal. 15: 347–363.

Duarte C W, Zeng Z B, 2011 High-confidence discovery of genetic network regulators in expression quantitative trait loci data. Genetics 187: 955–964.

Dupuis J, Siegmund D, 1999 Statistical methods for mapping quantitative trait loci from a dense set of markers. Genetics 151: 373–386.

Hageman R S, Leduc M S, Korstanje R, Paigen B, Churchill G A, 2011 A Bayesian framework for inference of the genotype-phenotype map for segregating populations. Genetics 181: 1163–1170.

Haley C, Knott S, 1992 A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69: 315–324.

Hughes T R, Marton M J, Jones A R, Roberts C J, Stoughton R et al., 2000 Functional discovery via a compendium of expression profiles. Cell 102: 109–116.

Kullback S, 1959 Information Theory and Statistics. Wiley, New York.

Kulp D C, Jagalur M, 2006 Causal inference of regulator-target pairs by gene mapping of expression phenotypes. BMC Genomics 7: 125.

Lander E S, Botstein D, 1989 Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121: 185–199.

Li R, Tsaih S W, Shockley K, Stylianou I M, Wergedal J et al., 2006 Structural model analysis of multiple quantitative traits. PLoS Genet. 2: e114.

Li Y, Tesson B M, Churchill G A, Jansen R C, 2010 Critical preconditions for causal inference in genome-wide association studies. Trends Genet. 26: 493–498.

Liu B, de la Fuente A, Hoeschele I, 2008 Gene network inference via structural equation modeling in genetical genomics experiments. Genetics 178: 1763–1776.

Manichaikul A, Dupuis J, Sen S, Broman K W, 2006 Poor performance of bootstrap confidence intervals for the location of a quantitative trait locus. Genetics 174: 481–489.

Margolin A, Califano A, 2007 Theory and limitations of genetic network inference from microarray data. Ann. N.Y. Acad. Sci. 1115: 51–72.

Millstein J, Zhang B, Zhu J, Schadt E E, 2009 Disentangling molecular relationships with a causal inference test. BMC Genet. 10: 23. 10.1186/1471-2156-10-23.

Sawa T, 1978 Information criteria for discriminating among alternative regression models. Econometrica 46: 1273–1291.

Schadt E E, Lamb J, Yang X, Zhu J, Edwards S et al., 2005 An integrative genomics approach to infer causal associations between gene expression and disease. Nat. Genet. 37: 710–717.

Schwarz G E, 1978 Estimating the dimension of a model. Ann. Stat. 6: 461–464.

Shao J, 2003 Mathematical Statistics, Springer Texts in Statistics, Ed. 2. Springer, New York.

Storey J, Tibshirani R, 2003 Statistical significance for genomewide studies. Proc. Natl. Acad. Sci. USA 100: 9440–9445.

Sun W, Yu T, Li K C, 2007 Detection of eQTL modules mediated by activity levels of transcription factors. Bioinformatics 23: 2290–2297.

Vuong Q H, 1989 Likelihood ratio tests for model selection and non-nested hypothesis. Econometrica 57: 307–333.

West M A L, Kim K, Kliebenstein D J, van Leeuwen H, Michelmore R W et al., 2007 Global eQTL mapping reveals the complex genetic architecture of transcript-level variation in Arabidopsis. Genetics 175: 1441–1450.

Winrow C J, Williams D L, Kasarskis A, Millstein J, Laposky A D et al., 2009 Uncovering the genetic landscape for multiple sleep-wake traits. PLoS ONE 4: e5161.

Zhu J, Lum P Y, Lamb J, GuhaThakurta D, Edwards S W et al., 2004 An integrative genomics approach to the reconstruction of gene networks in segregating populations. Cytogenet. Genome Res. 105: 363–374.

Zhu J, Wiener M C, Zhang C, Fridman A, Minch E et al., 2007 Increasing the power to detect causal associations by combining genotypic and expression data in segregating populations. PLOS Comput. Biol. 3(4): e69. 10.1371/journal.pcbi.0030069.

Zhu J, Zhang B, Smith E N, Drees B, Brem R B et al., 2008 Integrating large-scale functional genomic data to dissect the complexity of yeast regulatory networks. Nat. Genet. 40: 854–861.

Footnotes

Communicating editor: L. M. McIntyre

Author notes

Supporting information is available online at http://www.genetics.org/lookup/suppl/doi:10.1534/genetics.112.147124/-/DC1.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

