This article proposes a quick method for computing approximate threshold values that control the genome-wise type I error rate of tests for quantitative trait locus (QTL) detection in interval mapping (IM) and composite interval mapping (CIM). The procedure is completely general, allowing any population structure to be handled, e.g., BC1, advanced backcross, F2, and advanced intercross lines. Its main advantage is applicability in complex situations where no closed-form approximate thresholds are available. Extensive simulations demonstrate that the method works well over a range of situations. Moreover, the method is computationally inexpensive and may thus be used as an alternative to permutation procedures: given the values of the likelihood-ratio (LR) profile, computations take just a few seconds on a Pentium PC. Computations are simple to perform, requiring as input only the values of the LR statistics (or LOD scores) of a QTL scan across the genome; for CIM, the window size and the positions of cofactors are also needed. For the approximation to work well, it is suggested that scans be performed with a relatively small step size of between 1 and 2 cM.
MAPPING of quantitative trait loci (QTL) is of growing interest to both breeders and geneticists (Liu 1998; Lynch and Walsh 1998; Kao et al. 1999). QTL mapping procedures such as interval mapping (IM; Lander and Botstein 1989) and composite interval mapping (CIM; Jansen 1993; Zeng 1993, 1994) involve tests of the null hypothesis that a QTL is absent. For a putative QTL position, the likelihood-ratio (LR) test statistic asymptotically follows a χ2-distribution with degrees of freedom equal to the number of associated QTL effects. Since the QTL position is not known, multiple tests are performed in small steps across the whole genome. To control the genome-wise type I error rate, some form of adjustment of the critical threshold value of the test statistic is necessary. Lander and Botstein (1989, 1994) provided formulas for approximate critical thresholds in IM for a backcross (BC1) population, which are appropriate under the assumption of an infinitely dense map. Feingold et al. (1993) suggested an approximation that works well for rather dense maps (Rebaï et al. 1994). Further approximations for dense (<1 cM) and sparse maps are given by Dupuis and Siegmund (1999). Lander and Botstein (1989), van Ooijen (1992), and Darvasi et al. (1993) provided simulation-based critical values for a number of settings in the intermediate map density case. Using results of Davies (1977, 1987), Rebaï et al. (1994) provided formulas for approximate critical thresholds in BC1 and F2 populations, which are applicable in the intermediate map density case. They demonstrated good performance of these thresholds using simulations (see also Doerge and Rebaï 1996). The authors also indicated that the results of Davies (1977, 1987) are potentially useful for deriving critical thresholds in other populations and exemplified this for the case of a diallel cross.
Dupuis (1994) gave another approximation, which in simulations by Doerge and Rebaï (1996) has been shown to be somewhat less accurate than that of Rebaï et al. (1994), leading to slightly inflated type I errors. A different approach to deriving critical thresholds is the permutation test procedure advocated by Churchill and Doerge (1994) and Doerge and Churchill (1996). The great advantages of this approach are its conceptual simplicity, its distribution-free nature, and its general applicability to different population structures. A serious drawback is the computational workload. For example, to compute a critical threshold for a genome-wise type I error rate of 0.01, at least 10,000 permutations are required to obtain a reasonably accurate estimate of the threshold (Doerge and Rebaï 1996). For a type I error of 0.05, a sample of 1000 permutations is usually regarded as sufficient (Churchill and Doerge 1994). In routine applications, where many traits need to be analyzed, permutations can pose a formidable workload.
The purpose of this article is to suggest a quick method to compute approximate threshold values for both IM and CIM using the results of Davies (1977, 1987). An important aspect of the method is its generality, which allows any population structure to be handled easily. A further advantage is computational speed, which is only of the order of a few seconds, given the LR or LOD profile. The thresholds obtained for real data are compared to those derived by permutation. A simulation is conducted to check the appropriateness of the approximation.
The most common approach to the multiple testing problem is a Bonferroni adjustment of the significance level (Hochberg and Tamhane 1987). When performing M tests, each individual test is assigned a comparison-wise significance level of α/M, which guarantees the overall type I error rate to be below α. This method works fine so long as the number of tests is small. In QTL mapping the number of tests is bounded only by the step size chosen as we scan across the genome. When the step size tends to zero, the number of tests (M) approaches infinity, and the Bonferroni approach breaks down. Essentially this is because correlations among tests at adjacent points on the chromosome are not exploited.
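The breakdown of the Bonferroni approach as the grid becomes dense can be illustrated numerically. The sketch below (my own illustration, not part of the original article) computes per-test χ2 thresholds with 1 d.f. for increasing numbers of tests M, using only the Python standard library (the χ2 tail probability with 1 d.f. equals erfc(√(x/2))); the threshold grows without bound as M increases, because the correlation among adjacent tests is ignored.

```python
import math

def chi2_sf_1df(x):
    # Upper tail probability of chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x/2)).
    return math.erfc(math.sqrt(x / 2.0))

def bonferroni_threshold(alpha, m):
    # Solve P(chi2_1 > C) = alpha/m for C by bisection (chi2_sf_1df is decreasing).
    target = alpha / m
    lo, hi = 1e-9, 200.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_1df(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# The per-test threshold rises steadily with the number of tests M.
for m in (1, 10, 100, 1000):
    print(m, round(bonferroni_threshold(0.05, m), 2))
```

For M = 1 this recovers the familiar pointwise value of about 3.84; as the step size shrinks and M grows, the threshold keeps increasing, which is exactly the breakdown described above.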
A key to finding a useful procedure is to realize that in significance testing for QTL, we are in a situation where a nuisance parameter, i.e., the position of the QTL, is present only under the alternative hypothesis (Rebaï et al. 1994, 1995). Since the nuisance parameter is not known even under the alternative, a profile of the test statistic across the permissible interval for the nuisance parameter is constructed, and we choose the maximum of that profile to perform just one test. Under the null hypothesis, the nuisance parameter (QTL position) is undefined, so that standard techniques are not applicable for deriving the null distribution of the test statistic unconditionally on the nuisance parameter. For the situation of testing a hypothesis when the nuisance parameter is present only under the alternative, Davies (1977, 1987) provided an upper bound to the type I error rate, which is based on the assumption that the series of tests for different values of the nuisance parameter form a chi-square process. An evaluation of the upper bound provides an approximate critical threshold. Rebaï et al. (1994, 1995) found an explicit formula for the upper bound in case of QTL mapping for a BC1 and an F2 population. The derivations are algebraically quite involved, however, even for the simplest case of a BC1 population, and extension to other, more complex, designs seems rather complicated.
Davies (1987) had also suggested a “quick calculation of significance” (Equations 2.4 and 3.4) on the basis of an approximation of his upper bound, which is considered by Rebaï et al. (1994) as a useful method for simulation-based calculation of the significance value for any experimental design, particularly when an explicit expression for the upper bound is difficult to obtain. A need for simulation to compute the approximate threshold is not mentioned by Davies (1987), however. In fact, his quick procedure requires only very elementary calculations using the values of the test statistic for a fine grid of positions across the chromosome. Davies did use simulations to verify that his procedure does, in fact, control the type I error rate and has good power. Because of its simplicity and generality, Davies’ quick procedure promises to be very useful for computing approximate thresholds for a wide variety of experimental populations such as advanced backcrosses, for which the more refined approximations are difficult to obtain. In what follows, application of Davies’ quick method to IM and CIM is outlined.
Controlling the chromosome-wise error rate: In this article, it is assumed that QTL are mapped by either IM or CIM using the ML method (Lander and Botstein 1989; Zeng 1994). Thus, LR tests are performed across a grid of locations on the chromosome, so the problem is to find a critical threshold for the LR test statistic that will control the chromosome-wise type I error rate. The proposed method is completely general and not restricted to a particular type of population. The underlying model assumes a mixture of normal distributions with constant variance and location parameters depending on the QTL effects. The mixing proportions are given by the genotype frequencies and will be specific to the particular population structure under study:
- T(θ) is the LR test statistic at the putative QTL position θ in centimorgans; LOD(θ) = T(θ)/[2 log(10)].
- α is the chromosome-wise type I error rate.
- C is the critical threshold value for the LR test statistic.
- k is the number of genetic effects for the putative QTL (k depends on the population being studied; e.g., for a backcross population k = 1, one allelic substitution effect; for an F2 population k = 2, one additive and one dominance effect).
It is assumed that T(θ) is a continuous function of θ, except for a finite number of jumps in the first derivative with respect to θ. A further assumption is that, conditional on the QTL position, T(θ) follows a χ2-distribution with k d.f. To detect QTL, the chromosome is scanned and the maximum of T(θ) is determined over a grid for θ [max T(θ), say]. The null hypothesis of no QTL on a given chromosome is rejected when max T(θ) > C. For a given critical value C, the chromosome-wise type I error rate is bounded above by

α ≤ 1 − F_k(C) + V C^((k−1)/2) e^(−C/2) 2^(−k/2) / Γ(k/2),   (1)

where F_k(·) is the cumulative distribution function of χ2 with k d.f., Γ(·) is the gamma function, and V is the expected total variation of T(θ)^(1/2) across the chromosome. The upper bound in (1) is derived taking into account the fact that test statistics T(θ) computed at adjacent positions θ are stochastically dependent and in fact form a stochastic process. For details of the derivation the reader is referred to Davies (1977, 1987). Note that if we dropped the second term on the right-hand side of (1), C would just be the critical value for a conditional test at a given QTL position θ. Thus, it can be said that the second term takes care of the fact that multiple tests are performed and that we need to consider the unconditional distribution of max T(θ) across the chromosome, rather than the conditional distribution of T(θ) for a specific θ. The resulting threshold for a prespecified α will therefore be higher than that for the conditional test. Davies (1987) suggested estimating V in (1) by

V = Σ(i = 1 to r + 1) |T(θ_i)^(1/2) − T(θ_(i−1))^(1/2)|,   (2)

where θ_1, ..., θ_r are the successive turning points of T(θ)^(1/2), i.e., the values of θ where the first derivative changes sign, and θ_0 and θ_(r+1) denote the two ends of the chromosome. The sign changes occur at the local minima and maxima of T(θ)^(1/2). A sign change can (but need not) occur at the markers. The advantage of (2) is that it can be computed from the LR (LOD) profile alone, i.e., from the T(θ) values computed from the data for a grid of values of θ, and does not require further theoretical calculations.
Using (2) in place of the expected total variation in (1), the upper bound of the chromosome-wise type I error rate is estimated by

α = 1 − F_k(C) + V C^((k−1)/2) e^(−C/2) 2^(−k/2) / Γ(k/2),   (3)

where V is now computed from the observed LR profile by (2). For given α, C may be found from (3) by numerical methods. The problem in practice is to find the turning points θ_1, ..., θ_r. In most cases, this will have to be done numerically. Usually a grid search is done over all θ, so the turning points can be determined only to the accuracy given by the step size of the grid. We therefore suggest using a relatively fine grid, e.g., between 1 and 2 cM. The analysis is simplified by pretending that every point on the grid is a turning point. To see this, assume that θ_1, θ_2, and θ_3 are three successive positions on the chromosome and that T(θ) is monotonically increasing in the interval (θ_1, θ_3), so that θ_2 is not a turning point. Due to the monotonic increase of T(θ) we have

|T(θ_3)^(1/2) − T(θ_1)^(1/2)| = |T(θ_2)^(1/2) − T(θ_1)^(1/2)| + |T(θ_3)^(1/2) − T(θ_2)^(1/2)|.

It is clear from this result that by pretending that every point on the grid is a turning point, application of (2) will yield the correct result (to the accuracy of the grid), even though only a fraction of points will correspond to real turning points. Thus, to compute V, we simply compute the absolute differences between successive square roots of T(θ) on the grid and sum these across the chromosome.
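The computation just described can be sketched in a few lines of Python. This is a minimal illustration, not the article's own code: the function names are mine, and the χ2 tail probability is written in closed form only for k = 1 and k = 2 (a statistics library would be needed for general k). The bound follows (1)-(3), with V taken as the summed absolute differences of √T(θ) over the grid.

```python
import math

def chi2_sf(x, k):
    # Upper tail of chi-square with k d.f.; closed forms for k = 1 and k = 2.
    if k == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if k == 2:
        return math.exp(-x / 2.0)
    raise NotImplementedError("use a stats library for general k")

def total_variation(lr_profile):
    # V of (2): summed absolute differences of successive sqrt(T) values on
    # the grid, treating every grid point as a potential turning point.
    roots = [math.sqrt(max(t, 0.0)) for t in lr_profile]
    return sum(abs(b - a) for a, b in zip(roots, roots[1:]))

def davies_bound(c, v, k):
    # Right-hand side of (3): estimated upper bound on the chromosome-wise
    # type I error rate for threshold c.
    return (chi2_sf(c, k)
            + v * c ** ((k - 1) / 2.0) * math.exp(-c / 2.0)
            * 2.0 ** (-k / 2.0) / math.gamma(k / 2.0))

def threshold(v, k, alpha):
    # Solve davies_bound(C, v, k) = alpha for C by bisection
    # (the bound is decreasing in C over the relevant range).
    lo, hi = 1e-9, 200.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if davies_bound(mid, v, k) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

With V = 0 the threshold reduces to the pointwise χ2 critical value (about 3.84 for k = 1, α = 0.05); a positive V raises it, reflecting the multiple tests along the chromosome.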
Davies (1987) notes for the one-parameter case (k = 1) that the second term in (3) is important when T(θ) scans across a range of widely differing hypotheses, so that values of T(θ) tend to be independent for well-separated values of θ. This matches the situation of a scan across a whole chromosome, where tests >50 cM apart are virtually independent. He further conjectured that in this case the law of large numbers applies, so that the observed V gives a good estimate of its expectation. His simulations confirmed this conjecture. We therefore expect the approximation to work well for QTL mapping. A small-scale simulation is performed to check the appropriateness of the approximation.
Controlling the genome-wise error rate in IM: To guarantee a genome-wise type I error rate γ, Rebaï et al. (1994) proposed choosing the same α for each chromosome, using the formula α_i = 1 − (1 − γ)^(1/n), where α_i is the chromosome-wise error rate for the ith chromosome and n is the number of chromosomes. This allocation assumes that test statistics for different chromosomes are stochastically independent, which they are not, since the same phenotypic data are used for all chromosomes. The effect of the dependence will usually be small, but we nevertheless prefer to use the Bonferroni inequality (see below), which is guaranteed to control γ even when the test statistics are dependent.
A problem for the practitioner is that a separate threshold needs to be used for each chromosome when the same α_i is chosen for each chromosome. Rebaï et al. (1994) suggested that α_i “could be chosen by a manner which takes into account the relative lengths of all chromosomes.” This suggestion is taken up here. To control an overall type I error rate γ, we use the Bonferroni inequality (Hochberg and Tamhane 1987), from which it follows that choosing the α_i so that

Σ(i = 1 to n) α_i = γ,   (4)

where n is the number of chromosomes, will ensure that the overall type I error rate is at most γ. Inserting the approximation (3) into (4), we find

γ = n[1 − F_k(C)] + [Σ(i = 1 to n) V_i] C^((k−1)/2) e^(−C/2) 2^(−k/2) / Γ(k/2),   (5)

where V_i is the value of V for the ith chromosome. Instead of choosing the same α_i for each chromosome, we suggest that a common critical value C be used for all chromosomes, while α_i may differ among chromosomes. Using a numerical search procedure such as bisection, C is chosen to satisfy (5) for the desired γ. The resulting chromosome-wise α_i can be inferred from (3), though this is not necessary in practice. Assuming uniform coverage of the genome by markers, α_i will be relatively large for longer chromosomes. This seems a perfectly natural allocation of error rates.
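Solving (5) for the common threshold C can be sketched as follows; as before this is my own illustration with invented function names, restricted to k = 1 or 2 so that the χ2 tail has a closed form, and using bisection as the numerical search.

```python
import math

def chi2_sf(x, k):
    # Upper tail of chi-square with k d.f. (closed forms for k = 1, 2 only).
    return math.erfc(math.sqrt(x / 2.0)) if k == 1 else math.exp(-x / 2.0)

def genome_bound(c, vs, k):
    # Right-hand side of (5): n chromosomes, with V_i collected in the list vs.
    n = len(vs)
    davies_term = (c ** ((k - 1) / 2.0) * math.exp(-c / 2.0)
                   * 2.0 ** (-k / 2.0) / math.gamma(k / 2.0))
    return n * chi2_sf(c, k) + sum(vs) * davies_term

def genome_threshold(vs, k, gamma):
    # Common critical value C for all chromosomes, solving (5) by bisection.
    lo, hi = 1e-9, 300.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if genome_bound(mid, vs, k) > gamma:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

As a check on the allocation: with five chromosomes, all V_i = 0, k = 1, and γ = 0.05, the solution satisfies 1 − F_1(C) = 0.01, i.e., C ≈ 6.63; positive V_i push the common threshold upward, and the implied chromosome-wise α_i can then be read off from (3).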
Extension to CIM: In CIM, cofactors are used to reduce residual variation by controlling for the genetic background (Jansen 1993; Zeng 1993, 1994). Cofactors are determined by model selection procedures such as forward selection and backward elimination. When scanning the chromosome, a certain window size is imposed around a putative QTL. Cofactors within the window are ignored when computing T(θ). Thus, as we scan the chromosome, the set of cofactors changes at points bordering the window around the markers used as cofactors. These points correspond to jumps in T(θ) (and its first derivative). The method of Davies (1987) is not directly applicable to CIM because of the discontinuities in T(θ).
A simple solution to this problem is to consider in turn coherent intervals on a chromosome, within each of which the same set of cofactors is used in the analysis, so that T(θ) is continuous within the interval, and to control the interval-wise type I error rate. The genome-wise type I error rate may then be controlled using a Bonferroni adjustment. Thus, the upper bound for the genome-wise type I error rate is estimated as

γ = p[1 − F_k(C)] + [Σ(i = 1 to p) V_i] C^((k−1)/2) e^(−C/2) 2^(−k/2) / Γ(k/2),   (6)

where p is the number of coherent intervals having a constant set of cofactors. Note that the summation in the second term on the right-hand side of (6) is over the p coherent intervals. Also, the limits of summation in V_i according to (2) are now the borders of the coherent intervals. Thus, at the border of two adjacent intervals, the LR statistic T(θ) has to be computed twice, once with the set of cofactors for each of the two intervals. IM can be regarded as a limiting case of CIM, in which each chromosome forms a coherent interval with no cofactors, so that p = n, where n is the number of chromosomes. For CIM, we will have p > n. Unless a cofactor lies less than half the window size from one end of the chromosome, each cofactor increases the number of coherent intervals by two; otherwise the increase is one interval. Application of the Bonferroni procedure to combine intervals from the same chromosome will result in a conservative threshold, because the correlation structure among tests in different intervals, but on the same chromosome, is not exploited.
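The bookkeeping of coherent intervals can be sketched as follows. Positions are in centimorgans, and the function (names and conventions are mine, not from the article) simply collects the window borders around each cofactor that fall strictly inside the chromosome; the resulting interval count p behaves as described above.

```python
def coherent_intervals(chrom_length, cofactor_positions, window):
    # Borders where the cofactor set changes: half a window on either side
    # of each cofactor, kept only if strictly inside the chromosome.
    borders = set()
    for c in cofactor_positions:
        for b in (c - window / 2.0, c + window / 2.0):
            if 0.0 < b < chrom_length:
                borders.add(b)
    cuts = [0.0] + sorted(borders) + [chrom_length]
    return list(zip(cuts, cuts[1:]))

# A cofactor well inside the chromosome adds two intervals;
# one within half a window of an end adds only one.
print(len(coherent_intervals(100.0, [30.0, 70.0], 10.0)))  # → 5
print(len(coherent_intervals(100.0, [3.0], 10.0)))         # → 2
```

Within each interval, V_i would then be computed from the LR profile restricted to that interval, and the common threshold obtained by solving (6) numerically, exactly as for (5).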
We used data from an Oryza sativa × O. rufipogon BC2F2 population evaluated in an upland environment to exemplify the method and compare it to the permutation method. The data were obtained in two experiments, one with rice grown in monoculture and one with rice intercropped with Brachiaria. Details of these experiments are described in Moncada et al. (2000). The traits used for QTL mapping are described in Table 1. IM and CIM results obtained using QTL Cartographer (Basten et al. 1997) were kindly provided by the authors. The step size of the QTL scans was 2 cM. A stepwise selection procedure with a P value of 0.01 for F-to-enter and F-to-drop was used to select cofactors for CIM. The maximum number of cofactors was fixed at five. The window size was 10 cM. In permutations for CIM, the same cofactors were used as those identified in the original analysis. Permutation thresholds at the 0.05 significance level were estimated on the basis of 1000 permutations using the procedure of Churchill and Doerge (1994).
From the output of QTL Cartographer, we did not have available the values of T(θ) at the borders of coherent intervals. Thus, in computing the sum in (6), we ignored the discontinuity of T(θ) at the borders. The effect of this is a slightly too liberal critical threshold. We set p = n + 2nc for the term p on the right-hand side of (6), where n is the number of chromosomes and nc is the number of cofactors, which is a conservative choice. The critical LR thresholds for IM and CIM at γ = 0.05 as computed by permutation and the quick method are shown in Tables 2 and 3. The medians across traits are similar for both methods. With IM, the median threshold by the quick method is slightly larger than by permutation, while for CIM the situation is reversed. By both methods the median threshold is larger for CIM than for IM. A 95% confidence limit around the permutation threshold was computed using standard procedures (Mood et al. 1974, Chap. XI). There is a relatively large variation among permutation thresholds across traits, which cannot be explained by sampling variation alone (see confidence limits), while thresholds computed by the quick method are relatively stable. The more pronounced variability of permutation thresholds is partly due to the fact that the permutation distribution is unique for each trait, since it is conditional on the observed trait values. Thus, some variation in thresholds is expected even if all traits are sampled from a normal distribution. It should be stressed that despite the larger variation among traits in thresholds obtained by permutation, the theory of permutation tests (Lehmann 1986) guarantees that the procedure will control the genome-wise type I error. Both the permutation procedure and the quick method are adequate in the case of approximate normality. Note that differences of thresholds for a particular realized sample do not imply that over many experiments the two methods give widely different control of the type I error.
Clearly, any two test procedures may yield different results in a specific sample and yet give exactly the same type I error control over repeated samples, provided the distributional assumptions are met. Traits 31 and 40 have comparatively large permutation thresholds. Inspection of the data revealed that for these traits, marked nonnormality and/or presence of outliers were a problem. In these cases, the permutation thresholds seem preferable, since the quick method is based on the normality assumption.
We simulated the chromosome-wise type I error α for IM in a BC1 population of 200 individuals under the global H0 of no QTL. Equal spacing of markers along a 100-cM chromosome and absence of interference was assumed. The number of crossovers per chromosome was simulated according to a Poisson distribution with parameter equal to the length of the chromosome in morgans, which is in accordance with Haldane’s mapping function. The LR statistic for the null hypothesis of no QTL was computed using the expectation-maximization (EM) algorithm (Lynch and Walsh 1998). The step size was 1 cM in most simulations, but was also varied for some simulations to study the effect of step size. On each of 10,000 simulation runs, we determined the threshold by (5) and checked if H0 was rejected at levels α = 0.01 and α = 0.05 anywhere in the genome. For comparison, we also assessed the number of rejections based on thresholds by Rebaï et al. (1994). The percentage of rejections was recorded to give an estimate of the actual genome-wise type I error rates (Table 4). The results indicate that the quick method controls the type I error, tending to yield slightly more conservative thresholds than those of Rebaï et al. (1994). Also, the quick method was rather insensitive to the choice of step size between 1 and 5 cM. The approximation performs better for wider marker spacing. For two nonnormal distributions (uniform and ), the empirical type I error was on the conservative side.
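The genotype-simulation step described above can be sketched as follows. This is an illustrative reimplementation under the stated assumptions, not the article's code: the crossover count is Poisson with mean equal to the chromosome length in morgans (drawn here by Knuth's method, since the Python standard library has no Poisson generator), crossover positions are uniform, and a gamete is recombinant between two loci if an odd number of crossovers falls between them, which reproduces Haldane's recombination fraction r = (1 − e^(−2d))/2.

```python
import math, random

def rpoisson(lam, rng):
    # Knuth's method for a Poisson draw with mean lam.
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate_gamete_recombinant(d, chrom_len, rng):
    # True if the gamete is recombinant between loci at 0 and d (in morgans):
    # an odd number of crossovers falls in (0, d).
    n = rpoisson(chrom_len, rng)
    positions = [rng.random() * chrom_len for _ in range(n)]
    return sum(1 for x in positions if x < d) % 2 == 1

rng = random.Random(42)
d, chrom_len = 0.2, 1.0  # loci 20 cM apart on a 100-cM chromosome
est = sum(simulate_gamete_recombinant(d, chrom_len, rng)
          for _ in range(20000)) / 20000.0
haldane = (1.0 - math.exp(-2.0 * d)) / 2.0  # about 0.165
```

Marker genotypes for a whole BC1 chromosome follow the same scheme: the gamete's allele at any position is the starting allele flipped once per crossover to the left of that position.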
To broaden the scope of the study, simulations were also performed for an F2 population, an advanced backcross population (Tanksley and Nelson 1996), and a population of advanced intercross lines (AIL; Darvasi and Soller 1995). Advanced backcrossing involves repeated backcrossing to one of the parental lines. This strategy is especially useful for the discovery and transfer of valuable QTL alleles from unadapted donor lines into established elite breeding lines (Tanksley and Nelson 1996). We studied a BC2 population. AIL are obtained by crossing two inbred lines and then proceeding by random mating for several generations. Thus, instead of stopping at F2, we continue up to an Ft population. This provides increasing probability of recombination and hence increased mapping resolution. We studied an F3 population. The settings (map length, marker spacing, error distribution, step size, etc.) used for these three types of population (F2, BC2, F3) are the same as for BC1. Mapping was done by the EM algorithm assuming absence of interference and Haldane’s mapping function. The results reported in Table 5 show that the approximation works well across different population types.
This work was motivated by the high computational workload of permutations encountered in practical applications. The method advocated here for computing approximate thresholds is fast and easy to use. Implementation into existing packages for QTL mapping should be possible with minimal effort. Simulations have shown that the approximate thresholds provide reasonable, though somewhat conservative, control of the genome-wise type I error rate in a wide variety of population structures. Due to the generality of the method, any population structure can be accommodated. The method is especially useful in more complex designs, where closed-form thresholds are not available (advanced backcross, advanced intercross lines, etc.). The rice example has demonstrated that the quick method yields thresholds similar to those obtained by permutation. The method is reasonably robust to nonnormality, but should probably be used with caution if departure from normality is marked. Approximate thresholds can replace thresholds obtained by permutation if computing time is a limiting factor, e.g., when many traits need to be analyzed. Note that with permutation thresholds, the workload increases drastically as the type I error is reduced, since the permutation sample size needs to be increased to attain reasonable accuracy. By contrast, the quick method has the same small workload regardless of the targeted type I error. In summary, I recommend the quick method in the following situations: (i) a permutation test is computationally too expensive; (ii) approximate normality can be assumed; and (iii) closed-form critical thresholds are not readily available.
In this article, we have used the maximum-likelihood method for computing T(θ). The method should work equally well with IM and CIM methods on the basis of multiple regression (Haley and Knott 1992; Martinez and Curnow 1992). These methods also yield test statistics, which asymptotically follow a χ2-distribution conditional on the putative QTL, so that the theory of Davies (1977, 1987) applies. The method may also be used with IM/CIM on the basis of models that are extended to account for several random sources of variation, e.g., random effects due to genetic correlation and genotype-by-environment interaction (Piepho 2000a). Moreover, Davies’ method has been used successfully for marker difference regression (Lynch and Walsh 1998; Piepho 2000b), a viable alternative to IM and CIM.
When errors follow a normal distribution, the permutation procedure and the quick method are expected to yield similar thresholds, provided the null hypothesis is true. An advantage of permutation tests relative to the quick method is that normality of the errors under the null hypothesis need not be assumed. Note, however, that in our small simulation study, Davies’ (1987) quick method was robust to nonnormality of errors. Also, in simulations by Doerge and Rebaï (1996), the approximate thresholds of Rebaï et al. (1994), which are based on the same upper bound (1) as Davies’ quick method, proved to be relatively insensitive to nonnormality.
Rebaï et al. (1994) have used the quick method of Davies (1987) to derive simulation-based thresholds for the case of F3 progenies derived from a diallel cross between four inbred lines. However, they erroneously assumed that turning points of T(θ)^(1/2) occur only at the marker positions, which is implicit in their Equation 10. First, turning points need not occur at the markers. Second, further turning points will occur at local minima and maxima of T(θ)^(1/2) between markers. For example, in a BC1 population, a turning point of T(θ)^(1/2) occurs whenever the ML estimate of the QTL effect changes sign, corresponding to a local minimum. The omission of turning points may explain the liberal thresholds obtained by Rebaï et al. (1994) for their diallel example. Furthermore, it does not seem necessary to use simulations in deriving approximate thresholds. Our simulations indicate that computation of V from the data set at hand, i.e., without simulations, gives reasonable, though somewhat conservative, results. Some improvement may be possible by using simulations, as suggested by Rebaï et al. (1994), though this can no longer be called a quick method. Also, if one is prepared to use simulation in routine applications, it seems better to simulate exact thresholds rather than approximate thresholds. The associated computational costs are the same and the results are more accurate.
I thank Pilar Moncada for providing the output from QTL Cartographer for the rice data. Thanks are also due to Hugh Gauch Jr. for very helpful discussions on the article. This article was written while the author was visiting the Department of Biometrics, College of Agriculture and Life Sciences, Cornell University, Ithaca, New York. Support by the Heisenberg Programme of the Deutsche Forschungsgemeinschaft is gratefully acknowledged.
Communicating editor: Z-B. Zeng
- Received August 26, 1999.
- Accepted October 6, 2000.
- Copyright © 2001 by the Genetics Society of America