Abstract
Quantitative trait loci (QTL) hotspots (genomic locations affecting many traits) are a common feature in genetical genomics studies and are biologically interesting since they may harbor critical regulators. Therefore, statistical procedures to assess the significance of hotspots are of key importance. One approach, randomly allocating observed QTL across the genomic locations separately by trait, implicitly assumes all traits are uncorrelated. Recently, an empirical test for QTL hotspots was proposed on the basis of the number of traits that exceed a predetermined LOD value, such as the standard permutation LOD threshold. The permutation null distribution of the maximum number of traits across all genomic locations preserves the correlation structure among the phenotypes, avoiding the detection of spurious hotspots due to nongenetic correlation induced by uncontrolled environmental factors and unmeasured variables. However, by considering only the number of traits above a threshold, without accounting for the magnitude of the LOD scores, relevant information is lost. In particular, biologically interesting hotspots composed of a moderate to small number of traits with strong LOD scores may be neglected as nonsignificant. In this article we propose a quantile-based permutation approach that simultaneously accounts for the number and the LOD scores of traits within the hotspots. By considering a sliding scale of mapping thresholds, our method can assess the statistical significance of both small and large hotspots. Although the proposed approach can be applied to any type of heritable high-volume “omic” data set, we restrict our attention to expression (e)QTL analysis. We assess and compare the performances of these three methods in simulations and we illustrate how our approach can effectively assess the significance of moderate and small hotspots with strong LOD scores in a yeast expression data set.
QUANTITATIVE trait loci (QTL) hotspots, groups of traits comapping to the same genomic location, are a common feature of genetical genomics studies. Genomic locations associated with many traits are biologically interesting since they may harbor influential regulators. Therefore, statistical procedures aiming to assess the significance of such hotspots are of key importance.
Brem et al. (2002) and Schadt et al. (2003) detected hotspots by dividing the genome of an organism into equally spaced bins, counting the number of expression traits with a QTL in each bin. A hotspot was considered significant if it had more traits with QTL than expected if the expression QTL were randomly distributed across the genome. Darvasi (2003) and Perez-Enciso (2004) pointed out that these hotspots may arise as an artifact of high correlation among expression traits, rather than by the action of a common master regulator. Nongenetic mechanisms, uncontrolled environmental factors, and unmeasured variables are capable of inducing strong correlations among clusters of traits. Hence, whenever a trait shows a spurious linkage, many correlated traits will likely map to the same locus, creating a spurious expression (e)QTL hotspot. Furthermore, multiple-testing and relaxed-mapping thresholds may inflate the hotspots (Darvasi 2003).
West et al. (2007) and Wu et al. (2008) proposed a permutation test where the positions of the eQTL detected in the original data set are permuted across the genome separately by trait, using the distribution of the maximal number of expression traits across the genome to assess hotspot significance. This Q-method permutes QTL positions, not phenotype or genotype data, and improves upon the permutation approaches of Brem et al. (2002) and Schadt et al. (2003) since it accounts for multiple testing across the genome. However, the Q-method implicitly assumes traits are uncorrelated and hence underestimates the clustered pattern of spurious eQTL for correlated traits.
Breitling et al. (2008) proposed a permutation test that randomized rows in the marker data relative to rows in the trait data, preserving the correlation structure among phenotypes. The null distribution for this N-method of hotspot sizes depends on the number N of traits with LOD score exceeding a predetermined LOD threshold at each locus. The choice of LOD threshold is important: higher LOD thresholds yield smaller-sized spurious hotspots by chance under the null hypothesis of no hotspots. Two natural LOD threshold choices are the Churchill–Doerge (Churchill and Doerge 1994) single-trait LOD threshold, controlling genome-wide error rate (GWER) for one trait, and a conservative permutation threshold for the maximum LOD score across all traits and all genomic locations. The former allows large hotspots by chance under the null distribution (Breitling et al. 2008). The latter favors small hotspots composed of traits with high LOD scores under the null. Which threshold is more appropriate?
We propose a quantile-based permutation approach, the NL-method, with a sliding scale of thresholds ranging from the conservative to the single-trait threshold, jointly considering hotspot size and the distribution of LOD scores among correlated traits. Hence, even a small hotspot, with a modest number of correlated traits all having high LOD scores at a location, can be significant. The NL-method controls the genome-wide error rate across a range of possible hotspot sizes. Explicitly, we examine spurious hotspot size n, ranging from 1 to N, with N the hotspot size threshold delivered by the N-method using the single-trait LOD threshold. While the N-method yields the minimum significant hotspot size for a fixed LOD threshold, we turn the problem around and determine an empirical LOD threshold given a spurious hotspot size.
We assessed and compared the performances of the NL-, N-, and Q-methods using (i) simulated examples, where we generated hotspots with varying LOD-score distributions for data sets with correlated and uncorrelated traits, and (ii) simulation studies, where we generated null data sets, i.e., data sets where none of the phenotypes had any QTL, and assessed the error rates of the three procedures under different levels of correlation among the traits. Application of the NL-method to a yeast data set detected additional moderate and small hotspots considered nonsignificant by the N-method and avoided spurious hotspots detected by the Q-method. This ability to assess the statistical significance of hotspots with varying sizes and LOD-score distributions has the potential to yield important additional biological discoveries.
Methods
The Q- and N-methods
The now standard permutation threshold method for QTL mapping (Churchill and Doerge 1994) estimates the null distribution of the genome-wide maximum LOD score by shuffling the phenotypes relative to the genotype data, breaking the association between the phenotype and the genotypes. Our interest, though, is to assess the significance of QTL hotspots. This section presents two different permutation schemes that have been used in hotspot analysis. Supporting Information, Figure S1 shows a schematic of the genotype data, the phenotype data, and the output of hotspot analysis in genetical genomics experiments.
The first permutation scheme for the Q-method (Figure S2) derives the null distribution of hotspot sizes by permuting the cells of the observed QTL matrix along its rows, independently for each column; that is, the QTL are permuted across genomic locations separately by trait. The Q-method does not account for the correlation structure among the phenotypes and, contrary to the Churchill–Doerge and Breitling’s permutation tests, does not break the connection between phenotypes and genotypes. The Q-method permutation null distribution is generated under the assumption that phenotypes are uncorrelated. Violation of this assumption leads to a severe underestimation of the null distribution of hotspot sizes and to detection of many spurious hotspots, as shown by Breitling et al. (2008) in the reanalysis of the Wu et al. (2008) data and illustrated in the simulation study and examples below.
The second permutation scheme (Breitling et al. 2008) (Figure S3) used for the N- and NL-methods breaks the connection between genotypes and phenotypes while preserving the correlation structure separately within each of the types of data by permuting the rows of the phenotype data matrix relative to the rows of the genotype data. Mapping analysis is redone for all traits with the permuted data. This scheme, a direct extension of Churchill and Doerge (1994), preserves the correlation structure among the phenotypes, accounting for spurious hotspots due to nongenetic correlation.
The NL-method
In linkage analysis of a single phenotype we are usually interested in controlling the GWER of falsely detecting a QTL. For a given error rate α, we determine a single-trait mapping LOD threshold λ such that
Which threshold is more adequate depends on the underlying situation. We discard small hotspots with strong LOD scores when we adopt λ, but we miss hotspots composed of many traits with linkages barely reaching the single-trait mapping threshold using λc. Therefore, we propose a sliding scale of thresholds λn,α for n varying from 1 to N, where N represents the hotspot size (i.e., the number of traits with significant LOD at a given genomic location) expected by chance under the N-method’s permutation scheme, computed using the LOD-score threshold λ.
Given a fixed mapping threshold, the N-method determines the hotspot size expected by chance. We turn the problem around: given a fixed spurious hotspot size n, the NL-method determines the associated mapping threshold λn,α. We adopt maxk{qLODk(n)} as a test statistic where qLODk(n) corresponds to the nth LOD value of an ordered sample of T LOD scores, ordered from highest to lowest. Note that by taking the maximum of qLODk(n) across the genome we are able to control the genome-wide error rate associated with the qLODk(n) statistic. Explicitly, we control
Observe that when n = N, the LOD threshold λ (that controls the detection of a false QTL at a GWER α) matches λN,α (that controls the detection of a false hotspot of size N or higher), and qLODk(N) corresponds to LODk. Therefore, when n = N, the quantity in (3) reduces to (1). Similarly, when n = 1, we have that λc = λ1,α and qLODk(1) = maxt{LODt,k}, so that maxk{qLODk(1)} = maxt,k{LODt,k} and (3) reduces to (2).
Finally, the quantity in (3) is the probability of detecting at least one spurious hotspot of size exactly n somewhere in the genome given that none of the traits have a QTL anywhere in the genome. However, under the null hypothesis of no QTL, detecting a hotspot of size n* > n is less likely than detecting a hotspot of size n; therefore, if a threshold λn,α controls the full-null GWER for a hotspot of size n, it will also control the full-null GWER for a hotspot of size larger than n. Below, we detail the permutation algorithm for LOD quantiles.
NL-method algorithm:
For a fixed hotspot size, n = 1, … , N, we obtain the permutation LOD threshold that controls the genome-wide error rate of detecting at least one hotspot of size n or higher, at a fixed α level as follows:
Permute the data according to the N-method to break the associations among genotypes and phenotypes, while keeping the correlation structure among phenotypes intact. Compute the LOD scores for all phenotypes across all genomic locations.
Process the LOD profile of each trait as follows: (i) determine the LOD peak for each chromosome, (ii) compute the LOD support interval around the peak (Lander and Botstein 1989; Dupuis and Siegmund 1999; Manichaikul et al. 2006), and (iii) set to zero the LOD scores outside the LOD support interval (and below the single-trait mapping threshold).
For a fixed hotspot size n, compute qLODk(n) for genomic positions k = 1, … , K and store its maximum.
Repeat steps 1–3, B times. The histogram of the B-permutation samples of maxk{qLODk(n)} is an estimate of the null distribution of the test statistic for at least one spurious hotspot of size n or higher anywhere in the genome, given that none of the traits have a QTL anywhere in the genome.
The upper (1 − α)-quantile of the permutation sample generated in step 4 is our threshold (denoted by λn,α).
This algorithm is analogous to the traditional permutation test, replacing LOD scores by LOD quantiles. Chen and Storey (2006) perform permutation tests for distinct quantile-based statistics, in a different context, where they consider a set of relaxed significance thresholds to detect multiple QTL for a single trait.
The LOD-score processing step described in step 2 of the NL-method algorithm confines the hotspot location on the chromosome. LOD support intervals are the most commonly used interval estimates for the location of a QTL. Following Manichaikul et al. (2006) we adopt 1.5-LOD support intervals for a backcross and 1.8-LOD support intervals for an intercross, decreasing the spread of the hotspot as illustrated in Figure S4.
Finally, note that instead of running a genuine, but infeasible, multiple-trait joint analysis to account for the correlation structure among the traits, our strategy is to perform multiple single-trait mapping analyses with an appropriate multiple-trait permutation threshold. Justification for permutation tests in the context of QTL mapping of a single phenotype is given by Churchill and Doerge (1994). A sufficient condition for a permutation test to have type I error rate held at the nominal level is that the observations are exchangeable (Good 1994, p. 203; Lehmann 1986, p. 231). Violation of the exchangeability assumption can lead to an inflation of type I error rates (Churchill and Doerge 2008). Observations are exchangeable if the joint probability of any outcome is the same irrespective of the order in which the observations are considered (Lehmann 1986, p. 231). Permutation tests remain valid for a multivariate response that can be reduced to a single-valued test statistic (Good 1994, Chap. 5). Exchangeability of subjects under the null distribution follows by the construction of an experimental cross. At a fixed genomic location our test statistic corresponds to qLODk(n). Across the whole genome, we adopt maxk{qLODk(n)} as our genome-wide and single-valued test statistic.
Results
Simulated examples
In this section we illustrate the application of the Q-, N-, and NL-methods to two simulated data sets: one with highly correlated traits and the other with uncorrelated traits. We generated data from backcrosses composed of 112 individuals with 16 chromosomes of length 400 cM containing 185 equally spaced markers each, and phenotype data on 6000 traits. The phenotype data were generated according to the models
In both examples we simulated three hotspots: (i) a small hotspot located at 200 cM on chromosome 5 showing high LOD scores (see Figure S5, A and D), (ii) a big hotspot located at 200 cM on chromosome 7 showing LOD scores ranging from small to high (see Figure S5, B and E), and (iii) a big hotspot located at 200 cM on chromosome 15 showing LOD scores ranging from small to moderate (see Figure S5, C and F).
In both simulations we set σ2 = 1 and γ = 2. QTL analysis was performed using Haley–Knott regression (Haley and Knott 1992) with the R/qtl software (Broman et al. 2003). We adopted Haldane’s map function and genotype error rate of 0.0001. Because we adopted a dense genetic map (our markers are ∼2.16 cM apart), we did not consider putative QTL positions between markers.
In the first example, denoted simulated example 1, we adopted a latent effect of 1.5. In the second example, denoted simulated example 2, we adopted a latent effect of 0 and simulated uncorrelated traits. Figure S6, A and B, shows the distribution of all pairwise correlations among the 6000 traits for both simulated examples. These extreme examples illustrate the effect of phenotype correlation on QTL hotspot sizes. The correlations of the real data are actually intermediate (see Figure S6C).
Figure 1 shows the results for the Q- and N-methods for simulated example 1, using α = 0.05. Figure 1A shows the hotspot architecture computed using a single-trait LOD threshold of 3.65; i.e., at each genomic location the plot shows the number of traits with LOD score >3.65. In addition to the simulated hotspots on chromosomes 5, 7, and 15, Figure 1A shows a few spurious hotspots, including a big hotspot on chromosome 8. The blue and red lines show the N- and Q-methods’ thresholds, 560 and 7, respectively. In this example the N-method was unable to detect any hotspots, whereas the Q-method detected false hotspots on chromosomes 3, 6, 8, 9, 12, and 16. Figure 1, B and C, shows the hotspot size null distributions and the 5% significance thresholds for the N- and Q-methods, respectively.
N- and Q-method analyses for simulated example 1. (A) Inferred hotspot architecture using a single-trait permutation threshold of 3.65 corresponding to a GWER of 5% of falsely detecting at least one QTL somewhere in the genome. The blue line at count 560 corresponds to the hotspot size expected by chance at a GWER of 5% according to the N-method permutation test. The red line at count 7 corresponds to the Q-method’s 5% significance threshold. The hotspots on chromosomes 5, 7, 8, and 15 have sizes 50, 500, 125, and 280, respectively. (B) N-methods permutation null distribution of the maximum genome-wide hotspot size. The blue line corresponds to the hotspot size 560 expected by chance at a GWER of 5%. (C) Q-methods permutation null distribution of the maximum genome-wide hotspot size. The red line at 7 shows the 5% threshold. Results are based on 1000 permutations.
Figures 2 and 3 show the NL-method analysis results for simulated example 1, using α = 0.05. Figure 2, A–D, presents the hotspot architecture inferred using four different quantile-based permutation thresholds. Figure 2A presents the hotspot architecture inferred using a LOD threshold of 7.07. Only the true hotspots (on chromosomes 5, 7, and 15) were significant by this conservative threshold. Figure 2B presents the hotspot architecture computed using a LOD threshold of 4.93 that aims to control GWER ≤ 0.05 for spurious hotspots of size 50. The hotspots on chromosomes 5, 7, and 15 were detected by this threshold. Figure 2, C and D, shows the hotspot architectures using LOD thresholds of 4.21 and 3.72, respectively. Only the hotspot on chromosome 7 was detected as significant for these thresholds. Note that neither the big spurious hotspot on chromosome 8 nor any of the other spurious hotspots shown in Figure 1A were picked up by the quantile-based thresholds.
NL-method analysis for simulated example 1. (A–D) Hotspot architecture inferred using different quantile-based permutation thresholds; i.e., for each genomic location it shows the number of traits that mapped there with a LOD threshold higher than the quantile-based permutation threshold. (A) Hotspot architecture inferred using a permutation LOD threshold of 7.07 corresponding to the LOD threshold that controls the probability of falsely detecting at least a single linkage for any of the traits somewhere in the genome under the null hypothesis that none of the traits have a QTL anywhere in the genome, at an error rate of 5%. (B, C, and D) Hotspot architectures computed using QTL mapping LOD thresholds of 4.93, 4.21, and 3.72 that aim to control GWER at a 5% error rate for spurious eQTL hotspots of sizes 50, 200, and 500, respectively.
Hotspot size significance profile derived with the NL-method for simulated example 1. For each genomic location (i.e., x-axis position) the hotspot sizes at which the hotspot was significant are shown, that is, at which the hotspot locus had more traits mapping to it with a LOD score higher than the threshold on the right, than expected by chance. The scale on the left shows the range of spurious hotspot sizes investigated by our approach. The scale on the right shows the respective LOD thresholds associated with the spurious hotspot sizes on the left. The range is from 7.07, the conservative empirical LOD threshold associated with a spurious “hotspot of size 1,” to 3.65, the single-trait empirical threshold, associated with a spurious hotspot of size 560. All permutation thresholds were computed targeting GWER ≤ 0.05, for n = 1, … , 560.
Figure 3 connects hotspot size to quantile-based threshold. This hotspot size significance profile depicts a sliding window of hotspot size thresholds ranging from n = 1, … , N, where N = 560 corresponds to the hotspot size threshold derived from the N method. For each genomic location, the hotspot size (left axis) is significant for the LOD threshold (right axis). For example, the chromosome 5 hotspot was significant up to size 49, meaning that >1 trait mapped to the hotspot locus with LOD > 7.07, >2 traits mapped to the hotspot locus with LOD > 6.46, and so on up to hotspot size 49 where >49 traits mapped to the hotspot locus with LOD > 4.93. The hotspot on chromosome 7 was significant up to size 499, and the hotspot on chromosome 15 (higher peak) was significant for hotspot sizes 2–129 and 132–143.
The NL-method detected only the real hotspots on chromosomes 5, 7, and 15, whereas the N-method did not detect any hotspots and the Q-method detected 6 spurious hotspots, in addition to the real hotspots. The sliding window of quantile-based thresholds detected the small hotspot composed of traits with high LOD scores on chromosome 5 as well the big hotspots on chromosomes 7 and 15. Equally important, the NL-method dismissed spurious hotspots, such as chromosome 8, composed of numerous traits with LOD scores <5.57.
Figure 4 shows the results for the Q- and N-methods for simulated example 2, using α = 0.05. Figure 4A shows the hotspot architecture. The blue and red lines show the N- and Q-method’s thresholds, 19 and 8, respectively. In this example, both the N- and Q-methods were able to correctly pick up the hotspots on chromosomes 5, 7, and 15.
N- and Q-method analyses for simulated example 2. (A) Inferred hotspot architecture using a single-trait permutation threshold of 3.65 corresponding to a GWER of 5% of falsely detecting at least one QTL somewhere in the genome. The blue line at count 19 corresponds to the hotspot size expected by chance at a GWER of 5% according to the N-method permutation test. The red line at count 8 corresponds to the Q-method’s 5% significance threshold. The hotspots on chromosomes 5, 7, and 15 have sizes 50, 464, and 220, respectively. (B) The N-method’s permutation null distribution of the maximum genome-wide hotspot size. The blue line at 19 corresponds to the hotspot size expected by chance at a GWER of 5%. (C) The Q-method’s permutation null distribution of the maximum genome-wide hotspot size. The red line at 8 shows the 5% threshold. Results are based on 1000 permutations.
Comparison of Figures 1A and 4A shows that the spurious hotspots tend to be much smaller when the traits are uncorrelated (compare chromosome 8 on both plots), leading to much smaller N-method thresholds (compare the blue lines). The Q-method thresholds, on the other hand, are quite close. This is expected since the Q-method threshold depends on the number of significant QTL (we observed 3162 significant linkages in simulated example 1, against 3586 significant linkages in example 2) and not on the correlation among the traits.
Figure S7 displays the hotspot size significance profile for simulated example 2. The NL-method also detected the hotspots on chromosomes 5, 7, and 15.
Simulation study
In this simulation study we assess and compare the error rates of the Q-, N-, and NL-methods under three different levels of correlation among the traits. To determine whether the methods are capable of controlling the GWER at the target levels, we conduct separate simulation experiments as follows:
We generate a “null genetical genomics data set” from a backcross composed of (i) 6000 traits, none of which is affected by a QTL, but that are nevertheless affected by a common latent variable to generate a correlation structure among the traits, and (ii) genotype data on 2960 equally spaced markers across 16 chromosomes of length 400 cM (185 markers per chromosome). Any detected QTL hotspot is spurious, arising from correlation among the traits.
We perform QTL mapping analysis, and 1.5-LOD support interval processing, of the 6000 traits. For each one of the following single-trait QTL mapping permutation thresholds (that control GWER at the α = 0.01, 0.02, … , 0.10 levels, respectively), we do the following:
-
a. We compute the observed QTL matrix and generate the Q-method hotspot size threshold on the basis of 1000 permutations of the observed QTL matrix. We record whether or not we see at least one spurious hotspot of size greater than the Q-method threshold anywhere in the genome.
-
b. For each genomic location we count the number of traits above the single-trait LOD threshold. We compute the N-method hotspot size threshold on the basis of 1000 permutations of the null data set. We record whether at least one spurious hotspot of size greater than the N-method threshold is anywhere in the genome.
-
c. We compute the NL-method LOD thresholds for spurious hotspot size thresholds ranging from 1 to the N-method threshold. For each NL-method LOD threshold, λn,α, where n = 1, … , N, we count, at each genomic location, how many traits mapped to that genomic location with a LOD > λn,α and record whether there is at least one spurious hotspot of size greater than n anywhere in the genome.
-
We repeat the first two steps 1000 times. For each one of the three methods, the proportion of times we recorded spurious hotspots, out of the 1000 simulations, gives us an estimate of the empirical GWER associated with the method.
QTL analysis was performed as described above. Figure 5 shows the simulation results for null data sets generated using latent variable effects of 0.0, 0.25, and 1.0. The Q- and N-methods, with observed GWER (red), and target error rate (black), have two α-levels, α1 for QTL mapping and α2 for the tail area of the hotspot size permutation null distribution. Figure 5 displays the results when α1 = α2 = 0.01, 0.02, … , 0.10. The NL-method has a single α-level; the red curves are the observed GWERs for spurious hotspot sizes n = 1, … , N, where N represents the N method’s permutation threshold.
Observed GWER for the Q-, N-, and NL-methods under varying strengths of phenotype correlation. Black lines show the targeted error rates. Red curves show the observed GWER. (A–C) Results for uncorrelated phenotypes. (D–F) Results for weakly correlated phenotypes generated using a latent variable effect of 0.25. (G–I) Simulation results for highly correlated phenotypes generated using latent effect set to 1. The left, middle, and right columns show the results for the Q-, N-, and NL-methods, respectively. Note the different y-axis scales for the Q-method panels. The red curves on the NL-method panels show the observed GWER for hotspot sizes ranging from 1 to N, where N is the median N-method threshold for α = 0.10.
Figure 5, A–C, shows that for uncorrelated traits the Q- and N-methods were conservative, below target levels, whereas the NL-method shows error rates about the right target levels for most hotspot sizes. Figure 5, D and G, shows that error rates for the Q-method are higher than target levels when the traits are correlated and increase with correlation strength among the phenotypes. These results are expected since the Q-method’s thresholds depend on the number of QTL detected in the unpermuted data and tend to increase with the number of phenotypes. Because we generated the same number of phenotypes in the three simulation studies, the Q-method’s thresholds were similar. Therefore, the number and the size of the spurious QTL tend to be proportional to the correlation strength of the phenotypes. The N- and NL-methods, on the other hand, are designed to cope with the correlation structure among the phenotypes and show error rates close to the target levels as shown in Figure 5, E, F, H, and I.
Yeast data set example
In this section we illustrate and compare the Q-, N-, and NL-methods using data generated from a cross between two parent strains of yeast: a laboratory strain and a wild isolate from a California vineyard (Brem and Kruglyak 2005). The data consist of expression measurements on 5740 transcripts measured on 112 segregant strains, with dense genotype data on 2956 markers. Processing of the expression measurements of raw data was done as described in Brem and Kruglyak (2005), with an additional step of converting the processed measurements to normal quantiles by the transformation
Hotspot analysis of the yeast data, based on the N-method (Figure 6A), detected significant eQTL hotpots on chromosomes 2 (second peak), 3, 12 (first peak), 14, and 15 (first peak), at a GWER of 5% according to null distribution of hotspot sizes shown in Figure 6B. The blue line represents the N method’s significance threshold of N = 96. The maximum hotspot size on chromosome 8 was 95 and almost reached significance. Nonetheless, Figure 6A also shows suggestive (although substantially smaller) peaks on chromosomes 1, 4, 5, 7, 9, 12 (second peak), 13, 15 (second peak), and 16 that did not reach significance according to the N-method’s significance threshold.
N- and Q-method analyses for the yeast data. (A) Inferred hotspot architecture using a single-trait permutation threshold of 3.44 corresponding to a GWER of 5% of falsely detecting at least one QTL somewhere in the genome. The blue and red lines at counts 96 and 28 correspond to the hotspot size expected by chance at a GWER of 5% according to the N- and the Q-method permutation tests, respectively. (B and C) The permutation null distributions of the maximum genome-wide hotspot size based on 1000 permutations. The blue and red lines at 96 and 28 correspond, respectively, to the hotspot size expected by chance at a GWER of 5% for the N- and Q-methods.
The red line in Figure 6A represents the Q-method’s significance threshold of 28, derived from the null distribution of hotspot sizes shown in Figure 6C. The Q-method detected significant hotspots on chromosomes 2 (both peaks), 3, 4, 5 (both peaks), 7, 8, 12 (both peaks), 13, 14, and 15 (both peaks).
Figure 7 shows the hotspot significance profile for the NL-method. The major hotspots on chromosomes 2, 3, 12 (first peak), 14, and 15 (first peak) were significant across all thresholds tested up, and the hotspot on chromosome 8 was significant up to size 93. Furthermore, the NL-method showed that the small hotspots detected by the Q-method on chromosomes 5, 12 (second peak), 13, and 15 (second peak) might indeed be real. Nonetheless, the small hotspots on chromosomes 4 and 7, detected by the Q-method, are less interesting than the small hotspot on chromosome 1 that was actually missed by the Q-method.
Hotspot size significance profile derived with the NL-method. The range is from 7.40, the conservative empirical LOD threshold associated with a spurious “hotspot of size 1,” to 3.45, the single-trait empirical threshold, associated with a spurious hotspot of size 96. All permutation thresholds were computed targeting GWER ≤ 0.05, for n = 1, … , 96.
Discussion
A common feature in genetical genomics studies of expression traits is the presence of eQTL hotspots where a single polymorphism leads to widespread downstream changes in the expression of distant genes. These genomic loci associated with many distant genes are biologically interesting since they may harbor important regulators. Statistical procedures aiming to assess the significance of such hotspots are of key importance.
Breitling et al. (2008) were the first to propose a permutation test (the N-method) for eQTL hotspots that accounts for the correlation structure among phenotypes due to the effect of confounders. However, the authors restricted their attention to the single-trait empirical threshold only and may have overlooked interesting hotspots composed of moderate to small numbers of traits with strong LOD scores.
In this article, we adopt the Breitling et al. permutation scheme and propose a method to determine a range of quantile-based permutation thresholds (the NL-method) that allows us to assess the significance of hotspots on the basis of the number and the linkage strength of the traits composing those hotspots. For a fixed error rate α, our approach investigates the significance of a hotspot, using a range of N distinct mapping thresholds, where N is the smallest hotspot size that is significant by the N-method. For each n = 1, … , N we determine the LOD threshold that controls the genome-wide error rate of detecting at least one spurious hotspot of size n or higher somewhere in the genome, at an error rate ≤α.
Our simulated examples and simulation studies show that the Q-method performs well when the traits are uncorrelated, but detects spurious hotspots at high rates when the traits are correlated. This result is not surprising since the Q-method implicitly assumes that the traits are uncorrelated. Molecular traits such as mRNA expression levels, metabolite concentrations, and protein levels are often highly correlated and the Q-method is not adequate in these situations. On the other hand, our simulations suggest that the N and N- and NL-methods perform adequately for correlated or uncorrelated traits, showing genome-wide error rates close the target levels.
The advantage of the NL-method over the N-method is that it can assess the significance of hotspots with any type of LOD-score distribution. For instance, (i) a hotspot composed of many traits with moderate LOD scores will be found with thresholds close to the single-trait threshold, (ii) a hotspot consisting of a few traits with strong LOD scores will be detected with thresholds close to the conservative threshold, and (iii) a large hotspot with a range of moderate to large LOD scores will be significant at all thresholds in our sliding scale. The ability to assess the significance of these different types of hotspots can lead to important additional biological findings that might be overlooked by previous approaches, while still avoiding the detection of spurious hotspots. In the analysis of the yeast data, the hotspots on chromosomes 5, 8, 12 (second peak), 13, and 15 (second peak) have a LOD distribution of type ii. The hotspots on chromosomes 2, 3, 12 (first peak), 14, and 15 (first peak) have LOD distributions of type iii. No hotspot with LOD distribution of type i is present in the yeast data set. Note that hotspots composed of moderate to small numbers of traits with moderate LOD scores will be missed by all thresholds in our sliding scale and will be discarded as nonsignificant by our analysis. Application of the N-method detected only the five big hotspots on chromosomes 2, 3, 12, 14, and 15. Additionally, the simulated example 1 shows an example were the NL-method was able to pick up the three simulated hotspots missed by the N-method.
The NL-method is in a certain sense analogous to the approach proposed by Chen and Storey (2006). In the same way that Chen and Storey relax the single-trait mapping threshold by controlling the probability that a trait falsely maps to k or more genomic locations, we relax the conservative threshold by controlling the probability that ≥n traits falsely map to a common genomic location.
Even though the sliding window of thresholds delivered by the NL-method method is more informative than the single-hotspot size threshold of the N method, these approaches have the same computational complexity. They use exactly the same permutations but summarize the results differently. Both methods are computationally intensive: reliable results require 1000 or more permutations and for each permuted data set we perform mapping analysis of several thousand traits. Thus, in general, parallel computations on a cluster are required. To reduce the computational burden, we adopted Haley–Knott regression and mapped traits with common missing phenotype data patterns as blocks. An R package called qtlhot is being submitted to Comprehensive R Archive Network (CRAN).
The approach in this article relied on single-QTL mapping methods. To examine whether an apparent hotspot could be an artifact, such as a ghost QTL (Haley and Knott 1992), we used multiple-QTL methods (Manichaikul et al. 2009) for some smaller hotspots (data not shown). Most traits from these hotspots continued to map to the same location detected by single-trait analysis when we allowed for other possible QTL on the same or other chromosomes. It would be possible to extend our quantile-based permutation approach to multiple-QTL mapping (Jansen 1993; Jansen and Stam 1994; Manichaikul et al. 2009) by considering the LOD profile for each QTL adjusted for all other QTL [e.g., using the addqtl or the multiple-QTL mapping functions in R/qtl (Broman and Sen 2009; Arends et al. 2010)]. However, this would require considerably more computation and is left for future research.
The analysis of data sets containing groups of repetitive traits (that is, distinct traits representing slightly different measurements of a same “baseline” phenotypic trait) must be conducted with care. Repetitive traits are artifacts of the experimental design rather than indications of underlying biological processes. For instance, traits derived from oligos of same gene are often highly correlated simply because they arise from the same gene and might be picked up as a hotspot. Thus, repetitive traits can introduce artifactual hotspots that are indistinguishable statistically from biologically driven hotspots, unless this is addressed by attention to the design. Other examples of repetitive traits include (i) protein traits where one protein can exist in many variants due to post-translational modifications and the abundance of each variant is measured and used as a separate trait and (ii) classical phenotypic traits such as flowering in Arabidopsis, where a major QTL has been investigated in a number of independent studies, under different environmental conditions, leading to a group of repetitive traits strongly mapping to the same QTL (see supplement in Fu et al. 2009). If repetitive traits are known ahead of time, they should be removed or otherwise accounted for in the analysis. For example, Fu et al. (2009) proposed organizing repetitive classical traits into disjunct phenotypic groups on the basis of trait annotations and performed hotspot analysis on the average trait per category.
Fu et al. (2009) point out that large eQTL hotspots may or may not persist when examining proteomic QTL, metabolic QTL, and phenotypic QTL gene mapping. Now that we can infer smaller hotspots composed of any of these QTL types, it may be possible to find more connections. A small hotspot could in fact be quite important to reveal genetic effects on whole-body phenotypes.
Acknowledgments
We thank Rachel Brem for sharing the yeast data set and Bill Taylor from the Center for High Throughput Computing of University of Wisconsin (Madison) for his assistance with cluster computation. This work was supported by National Council for Scientific and Technological Development (CNPq) Brazil (E.C.N.); National Cancer Institute Integrative Cancer Biology Program grant U54-CA149237 and National Institutes of Health (NIH) grant R01MH090948 (to E.C.N.); National Institute of Diabetes and Digestive and Kidney Diseases grants DK66369, DK58037, and DK06639 (to A.D.A., M.P.K., A.F.B., B.S.Y., and E.C.N.); National Institute of General Medical Sciences grants PA02110 and GM069430-01A2 (to B.S.Y.); The Netherlands Bioinformatics Centre, Distinguished Scientist Traveling Stipend (to R.C.J.); and by NIH grant R01GM074244 (to K.W.B.).
Footnotes
Communicating editor: J. Borevitz
- Received February 8, 2012.
- Accepted May 18, 2012.
- Copyright © 2012 by the Genetics Society of America
Available freely online through the author-supported open access option.


















