Genetics, Vol. 164, 1683-1687, August 2003, Copyright © 2003

Adjusted P Values for Genome-Wide Scans

Theodore C. Lystig
Department of Mathematical Statistics, Chalmers University of Technology, 412 96 Göteborg, Sweden

Corresponding author: Theodore C. Lystig, Chalmers University of Technology, Eklandagatan 86, 412 96 Göteborg, Sweden. E-mail: lystig@math.chalmers.se

Communicating editor: G. CHURCHILL


*  ABSTRACT

Genome-wide scans for quantitative trait loci (QTL) have traditionally been summarized with plots of logarithm of odds (LOD) scores. A valuable modification is to supplement such plots with an additional vertical axis displaying quantiles of adjusted P values and labeling local maxima of the LOD scores with location-specific adjusted P values. This provides a visible gradation of genome-wide significance for the LOD score curve, instead of the stark dichotomy that a single threshold yields. Adjusted P values give genome-wide significance of individual LOD scores and are obtained through a straightforward modification of the familiar algorithm for generating permutation-based thresholds.


Two of the most popular methods for performing genome scans to detect quantitative trait loci (QTL) are interval mapping (LANDER and BOTSTEIN 1989) and an approximation to interval mapping using least squares (HALEY and KNOTT 1992). Numerous extensions have been developed for both the interval mapping (ZENG 1994; JIANG and ZENG 1995; KAO et al. 1999) and the least-squares (HALEY et al. 1994; KNOTT and HALEY 2000) approaches. While the choice of analysis method for a genome scan may be debated, the method of summarizing the analysis is rarely questioned. Invariably, a scalar multiple of a logarithm of odds (LOD) score curve is used.

A major attraction of the LOD score is that it acts as a type of profile likelihood, evincing at each analysis point the relative support for the presence of a QTL at that location. The profile nature of the LOD score can make the interpretation of significance problematic. The issue is one of false positives and has been discussed by LANDER and BOTSTEIN (1989) and LANDER and KRUGLYAK (1995). Essentially, the LOD score threshold that must be exceeded to declare significance should be higher when examining a collection of LOD scores (as in a genome scan) than when examining a single analysis point. Thresholds may be computed in a variety of ways. Two of the most common approaches are an analytical method (LANDER and BOTSTEIN 1989; LANDER and KRUGLYAK 1995) and an empirical method based on permutation tests (CHURCHILL and DOERGE 1994; DOERGE and CHURCHILL 1996; NETTLETON and DOERGE 2000). The thresholds are calculated in such a manner as to control the false-positive rate at level α for the entire genome scan. More specifically, a threshold for a given study is the value of the LOD score that the maximum LOD score of the genome scan exceeds with probability α when the null hypothesis is true for that particular study. The threshold thus provides a sharp demarcation line between significance and nonsignificance for the genome scan. However, apart from the single case of a LOD score being exactly equal to the calculated value, a threshold does not provide a precise level of genome-wide significance for the individual LOD score values.

It is desirable to know the genome-wide significance or adjusted P values (WESTFALL and YOUNG 1993) of each of the analysis points represented in the LOD score curve. The adjusted P value for a given LOD score within a genome scan is interpreted as the probability that the maximum LOD score of a genome scan is at least as large as the LOD score in question, given that the null hypothesis (such as no QTL present anywhere in the genome) is true. As such, it is a statement about the significance of a particular test statistic within the context of a genome scan. This differs from a standard unadjusted P value, which considers only the marginal significance of the LOD score for a given analysis point, without accounting for the multitude of other LOD scores obtained for the entire genome scan. As originally discussed by NETTLETON and DOERGE (2000) and detailed later in this article, it is a relatively straightforward process to modify the algorithm used to generate permutation-based thresholds (CHURCHILL and DOERGE 1994) to obtain the adjusted P values. Furthermore, such a modified algorithm also provides a Monte Carlo approximation to the null distribution of the maximum LOD score from the genome scan.
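The distinction between the two kinds of P value can be written compactly. The notation below is an assumption introduced for illustration and does not appear in the article: LOD(t) is the LOD score at analysis point t, t_0 is one fixed analysis point, x is an observed LOD score, and H_0 is the null hypothesis.

```latex
p_{\mathrm{adj}}(x) = \Pr\Bigl(\max_{t}\,\mathrm{LOD}(t) \ge x \;\Big|\; H_0\Bigr),
\qquad
p_{\mathrm{unadj}}(x) = \Pr\bigl(\mathrm{LOD}(t_0) \ge x \mid H_0\bigr)
```

Because the maximum over all analysis points is never smaller than the score at any single point, the adjusted value is never smaller than the unadjusted value for the same x.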

The immediate use of the adjusted P values is as a means of stating the precise significance level for the test of the presence of a candidate QTL found in a genome scan. These values may appear in a textual description of the genome scan or may be used graphically to label local peaks of interest in a LOD score curve. The quantiles from the approximate null distribution of the maximum LOD score can be used to ascertain multiple threshold values, which then correspond to adjusted P values. The LOD score values of these quantiles can be used to place multiple threshold values for adjusted P values onto a traditional LOD score plot; the adjusted P values would appear as an additional labeled vertical axis. Of course, it is even possible to directly display the adjusted P values by plotting the entire genome scan with an adjusted P-value curve, similar to a LOD score curve.


*  METHODS

As noted by NETTLETON and DOERGE (2000), adjusted P values are readily obtained through a modification of the algorithm used to generate a permutation-based threshold (CHURCHILL and DOERGE 1994). The algorithm for generating a permutation-based threshold is presented below. This algorithm is appropriate when the null hypothesis states that no QTL are anywhere in the genome, so that the phenotypic and genotypic information are mutually independent.

  1. Individuals in the experiment are labeled with unique numbers one through n.

  2. The phenotypic data are shuffled by taking a random permutation of the indices 1, ... , n and matching the ith phenotypic trait value to the individual with index given by the ith element of permuted indices. This permuted vector of traits is matched with the original (unpermuted) genotype information for all individuals.

  3. A genome scan for QTL effects is performed on the resulting permuted data set, and the largest test statistic (LOD score) is recorded.

  4. Steps 2 and 3 are repeated a total of N times (N is often 1000), yielding N maximal test statistics, one from each genome scan.

  5. The N maximal test statistics are ordered from smallest to largest.

  6. The 100(1 - α) percentile of the N ordered values is the estimated experiment-wise threshold value for controlling the type I error at level α.

With α = 0.05 and N = 1000, the 950th value of the ordered maximal test statistics would be the estimated threshold value. Note that while N = 1000 is usually sufficient to obtain a threshold for α = 0.05, on the order of N = 10,000 shuffled data sets are recommended for α = 0.01 to obtain stable estimates (CHURCHILL and DOERGE 1994).
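Steps 1 through 6 can be sketched in Python as follows. This is a minimal illustration, not the article's implementation: the function names are hypothetical, genotypes are assumed to be coded numerically (for example 0, 1, 2) in an individuals-by-markers array, and a simple single-marker regression LOD is used as a stand-in for interval mapping.

```python
import numpy as np

def lod_scan(genotypes, phenotype):
    """Single-marker regression LOD at every marker.

    Assumption: this replaces interval mapping purely for illustration.
    genotypes: (n, m) numeric array; phenotype: length-n array.
    """
    n = phenotype.shape[0]
    y = phenotype - phenotype.mean()
    g = genotypes - genotypes.mean(axis=0)
    rss0 = np.sum(y ** 2)                      # residual SS under the null
    r2 = (g.T @ y) ** 2 / (np.sum(g ** 2, axis=0) * rss0)
    # LOD = (n / 2) * log10(RSS0 / RSS1), where RSS1 = RSS0 * (1 - r2).
    return -(n / 2.0) * np.log10(1.0 - r2)

def permutation_maxima(genotypes, phenotype, n_perm=1000, seed=0):
    """Steps 2 to 5: shuffle phenotypes, rescan, keep each maximum, sort."""
    rng = np.random.default_rng(seed)
    maxima = np.empty(n_perm)
    for b in range(n_perm):
        shuffled = rng.permutation(phenotype)            # step 2
        maxima[b] = lod_scan(genotypes, shuffled).max()  # step 3
    return np.sort(maxima)                               # step 5

def permutation_threshold(sorted_maxima, alpha=0.05):
    """Step 6: the 100(1 - alpha) percentile as an order statistic."""
    n_perm = sorted_maxima.shape[0]
    # round() guards against floating-point noise before taking the ceiling.
    k = int(np.ceil(round((1.0 - alpha) * n_perm, 8)))
    return sorted_maxima[k - 1]   # the 950th value when N = 1000, alpha = 0.05
```

The genotype matrix is left untouched and only the phenotype vector is permuted, which matches step 2 and preserves the correlation structure among markers.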

To calculate adjusted P values, the only portion of the above algorithm that must be modified is step 6. Instead of extracting a single value from the N ordered values to form a threshold, one simply calculates (for each observed LOD score statistic x from the original genome scan) the proportion of the N maximal test statistics that are greater than or equal to x. Note that determining the number of maximal statistics at least as large as a given x is facilitated by the fact that the N maximal test statistics were sequentially ordered in step 5. The modification necessary to obtain adjusted P values can be incorporated into the above algorithm as follows:

  6*. For each LOD score x from the genome scan of the original unpermuted data, calculate the number of maximal test statistics from step 5 that are greater than or equal to x and divide by N.

The adjusted P values calculated in step 6* effectively make use of the entire distribution of maximal test statistics from a set of N genome scans, while a permutation-based threshold uses only the 100(1 - α) percentile of those statistics.
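Step 6* might be sketched as below. The function name is hypothetical; the only input assumed is the sorted vector of N maximal LOD scores produced in step 5. A binary search exploits that ordering, so the whole observed LOD curve can be converted in one call.

```python
import numpy as np

def adjusted_p_values(observed_lod, sorted_maxima):
    """Proportion of permutation maxima that are >= each observed LOD score."""
    observed_lod = np.asarray(observed_lod, dtype=float)
    sorted_maxima = np.asarray(sorted_maxima, dtype=float)
    n_perm = sorted_maxima.shape[0]
    # Index of the first sorted maximum that is >= x; everything from that
    # index onward counts toward the adjusted P value.
    first_ge = np.searchsorted(sorted_maxima, observed_lod, side="left")
    return (n_perm - first_ge) / n_perm

# Toy usage with four permutation maxima.
p = adjusted_p_values([0.5, 2.0, 2.5, 5.0], [1.0, 2.0, 3.0, 4.0])
```

With only four maxima the toy values are coarse; in practice N is on the order of 1000 or more, as noted in the text.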

It is important to note that in addition to an α = 0.05 threshold, any other quantile of interest may be calculated from the set of N maximal LOD scores. In practice, this means that it is a simple task to add various threshold levels of adjusted P values onto standard LOD score plots. An example is provided with the simulated data below.
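As a sketch of how such an axis could be built, the function below maps a set of adjusted P levels to the LOD values at which tick marks would be drawn. The function name and the particular levels are illustrative assumptions; the order-statistic convention follows the one in step 6.

```python
import numpy as np

def lod_ticks_for_adjusted_p(sorted_maxima, p_levels):
    """Return {adjusted P level: LOD value} for a secondary vertical axis."""
    sorted_maxima = np.asarray(sorted_maxima, dtype=float)
    n_perm = sorted_maxima.shape[0]
    ticks = {}
    for p in p_levels:
        # Same order statistic as step 6, applied once per level.
        k = int(np.ceil(round((1.0 - p) * n_perm, 8)))
        k = min(max(k, 1), n_perm)          # keep the index inside the array
        ticks[p] = float(sorted_maxima[k - 1])
    return ticks

# Toy usage: the "maxima" are simply 1..1000 so the picked ranks are visible.
ticks = lod_ticks_for_adjusted_p(np.arange(1.0, 1001.0), [0.5, 0.1, 0.05, 0.01])
```

The resulting LOD values can be passed to any plotting library as tick positions on a second axis, labeled with the corresponding adjusted P levels.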


*  SIMULATED EXAMPLE

The simulated data set is for 200 animals from an F2 cross in the rat. The genetic length of the chromosomes is taken from DRACHEVA et al. (2000), but the content of the map was simulated and the true contents were used in the subsequent QTL analysis. The entire genome is 1872 cM in length. Markers were generated at regular 10-cM intervals (188 markers total). A single QTL with additive effect 0.5 (σ² = 1.0) and no dominance effect was simulated at 85 cM from the left end of chromosome 2. The data were evaluated with interval mapping to obtain LOD scores at 1-cM intervals, yielding a total of 2001 analysis points. A LOD score plot for a genome scan of these data appears in Fig 1.



Figure 1. LOD score plot of a genome scan for simulated data from an F2 cross in the rat. There is a single QTL near the middle of chromosome 2. The permutation threshold at LOD = 3.55 (based on 1000 resampled data sets) effectively splits the genome-wide significance of elements of the LOD score curve into two regions: those with genome-wide significance >0.05 and those with genome-wide significance <0.05. The scale for the unadjusted P values is shown on the right-hand vertical axis.

In addition to providing the scale for the LOD curve on the left-hand vertical axis, Fig 1 also provides the scale for the unadjusted P values on the right-hand vertical axis. Similar dual scale plots appeared in LANDER and KRUGLYAK (1995). One noteworthy feature of Fig 1 is how it demonstrates that the naïve unadjusted threshold of P = 0.05 corresponding to LOD = 1.30 is clearly too low. Because a true QTL is present only on chromosome 2, the 7 of the 20 remaining chromosomes that exceeded this threshold did so by chance variation alone. A value of the LOD score that maintains the correct size across the entire genome is the permutation-based threshold of LOD = 3.55 from 1000 resampled data sets. This LOD score corresponds to an unadjusted P value of P = 0.000283. However, as indicated in Fig 1, the permutation-based threshold merely serves to dichotomize the genome scan analysis points into two distinct groups: those with genome-wide significance >0.05, and those with genome-wide significance <0.05. The relative significance of portions of the LOD score curve within each of the two regions is not apparent from this LOD plot. The gradation of genome-wide significance is not discernible.
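The two LOD and unadjusted P pairs quoted above (1.30 with 0.05, and 3.55 with 0.000283) are numerically consistent with the relation P = 10^(-LOD). That is what a chi-square reference distribution with 2 degrees of freedom gives, since its upper tail probability at 2 ln(10) LOD equals 10^(-LOD). The article does not state the conversion it used, so the sketch below should be read as an inference from the reported numbers, with a hypothetical function name.

```python
def unadjusted_p_from_lod(lod):
    """Pointwise P value under an assumed chi-square(2 df) reference."""
    return 10.0 ** (-lod)

p_naive = unadjusted_p_from_lod(1.30)       # close to 0.05
p_threshold = unadjusted_p_from_lod(3.55)   # close to 0.00028
```

The small discrepancy in the last digit for the second pair is what one would expect if the reported threshold of 3.55 had been rounded before printing.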

The plot in Fig 2 depicts the genome scan results from chromosome 2, where the single QTL was generated. The genome scan is portrayed with a LOD score curve and supplemented with quantiles of adjusted P values (shown on the right-hand vertical axis). Furthermore, two of the peaks are labeled with position and adjusted P value. The lower peak is 110.28 cM from the left end of the chromosome and has a LOD score of 2.85, which corresponds to an adjusted P value of 0.240. This value was calculated by determining that 2.85 was less than 240 of the 1000 maximal genome scan LOD scores from the resampled data sets. The higher peak is at 82.84 cM and has a LOD score of 8.00, which corresponds to an adjusted P value <0.001. The observed LOD score of 8.00 was greater than the maximal genome scan LOD scores from all 1000 resampled data sets. Unlike the stark dichotomy of genome-wide significance seen in Fig 1, the adjusted P-value scale present on the right-hand vertical axis in Fig 2 provides a meaningful gradient of genome-wide significance for the LOD score curve.
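The arithmetic behind the two peak labels can be made explicit. In the sketch below (the function name is hypothetical), the count is the number of permutation maxima at least as large as the observed LOD score. When that count is zero, the resampling can only bound the adjusted P value from above by 1/N, which is why the higher peak is labeled <0.001 rather than 0.

```python
def format_adjusted_p(count_at_least, n_perm):
    """Label for a peak: count / N, or '<1/N' when no maximum reaches it."""
    if count_at_least == 0:
        return f"<{1.0 / n_perm:g}"
    return f"{count_at_least / n_perm:.3f}"

lower_peak = format_adjusted_p(240, 1000)   # LOD 2.85 at 110.28 cM
higher_peak = format_adjusted_p(0, 1000)    # LOD 8.00 at 82.84 cM
```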



Figure 2. LOD score plot of chromosome 2 from a genome scan for simulated data from an F2 cross in the rat. A single QTL was generated 85 cM from the left end of this chromosome. Note the scale for the adjusted P values (based on 1000 resampled data sets) shown on the right-hand vertical axis. The LOD score of 8.00 observed at 82.84 cM was greater than the maximum genome scan LOD scores of all 1000 resampled data sets. In contrast, the LOD score of 2.85 from the local peak at 110.28 cM was less than 240 of the 1000 maximal genome scan LOD scores.


*  DISCUSSION

Adjusted P values are a valuable tool to assist in the summary of genome-wide scans. Depicted graphically in a supplemented LOD score plot, they enable the LOD score evidence at a single analysis point for the presence of a QTL to be interpreted both marginally and within the context of a whole genome scan. Furthermore, labeling individual LOD score peaks with adjusted P values permits precise statements of genome-wide significance for particular locations of interest. Such supplemented LOD score plots retain all of the information of standard LOD score plots, enabling inferences to be made on values of the LOD score, the adjusted P values, or both.

The emphasis in this article has been on how adjusted P values may be combined with traditional LOD score plots to improve the summary of genome-wide scans. One major improvement is the use of LOD score values corresponding to quantiles of the adjusted P values to provide a meaningful gradient of genome-wide significance for the LOD score curve. Such emphasis differs markedly from that of NETTLETON and DOERGE (2000), who investigated the variability of permutation-based genome-wide thresholds. These authors presented adjusted P values (referred to in their article as permutation P values) as analogues of permutation thresholds. Their prime interest was in assessing when a LOD score exceeded a genome-wide threshold for a given fixed value of α, while accounting for the Monte Carlo error of the permutation procedure. As such, the emphasis was on a single individual value of α for genome-wide significance, instead of on presenting gradations of significance. Moreover, neither adjusted nor unadjusted P values were displayed on any of their LOD score plots.

The Monte Carlo error of the permutation procedure is important to consider, especially when one is interested in precise values of very small adjusted P values. CHURCHILL and DOERGE (1994) originally suggested using at least 1000 permutations for estimating thresholds corresponding to an adjusted P value of 0.05 and suggested that as many as 10,000 permutations might be required for stable estimates of a threshold corresponding to an adjusted P value of 0.01. Depending on the stability required for the adjusted P values, these numbers could be refined further using techniques described by NETTLETON and DOERGE (2000). Note, however, that the variability of the adjusted P values resides in their calculated values and not in their relative ordering. This may be seen in part by observing that both adjusted and unadjusted P values are monotonic transformations of the values from the LOD score curve.
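One rough way to gauge this Monte Carlo error is to treat the count of permutation maxima at least as large as x as a binomial draw, so that an estimated adjusted P value has standard error sqrt(p(1 - p)/N). This binomial approximation is an assumption made here for illustration and is not necessarily the technique of NETTLETON and DOERGE (2000); the function name is hypothetical.

```python
import math

def adjusted_p_standard_error(p, n_perm):
    """Binomial standard error of an adjusted P value estimated from N scans."""
    return math.sqrt(p * (1.0 - p) / n_perm)

se_05_1k = adjusted_p_standard_error(0.05, 1000)     # about 0.0069
se_01_1k = adjusted_p_standard_error(0.01, 1000)     # about 0.0031
se_01_10k = adjusted_p_standard_error(0.01, 10000)   # about 0.0010
```

Relative to the value being estimated, the error at P = 0.01 with N = 1000 is roughly 31%, compared with roughly 14% at P = 0.05, which is in line with the recommendation to use more permutations for smaller P values.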

An intriguing property of adjusted P values is that they provide a common scale on which to present the results from different models, phenotypes, or crosses. This might be accomplished by plotting the adjusted P values directly, instead of as a supplemental vertical axis. One example of multiple models would be when comparing models of varying complexity in the search for epistasis (genetic interaction). Another version of multiple models that could be compared would be both logarithmic and untransformed versions of a particular phenotype. For an initial application comparing multiple competing models in a one-dimensional genome scan, see BROMAN (2003). The comparison of multiple phenotypes would be particularly interesting with related measurements, such as alternative forms of assessing cognitive ability in Alzheimer's disease or different measures of severity of rheumatoid arthritis. With the case of multiple crosses, researchers have a meaningful way of comparing the relative evidence for a QTL from a backcross strain and an intercross strain, for example.

The P-value adjustments employed in this article were based on a single-step resampling method. It is also possible to perform a multiple-step resampling method adjustment, as described in WESTFALL and YOUNG (1993, Chap. 2). In the multiple-step method, the order (in terms of LOD score) of a given statistic within the observed genome scan is used to determine which simulated order statistic the observed statistic should be ranked against. That is, the observed maximum is ranked against the simulated maxima, the second-highest observed statistic is ranked against the distribution of the second-highest statistics from the simulations, and so on until the smallest observed statistic is ranked against the distribution of the minima from the simulated data sets. The greatest disparity between the single- and multiple-step methods occurs for the smallest observed statistics, which is where there is the least interest. For the simulated data used in this article, the differences between the adjusted P values from the single- and multiple-step methods were negligible. As calculation of the multiple-step adjustments takes substantially longer than that of the single-step adjustments, researchers are advised to use the single-step adjustments.
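The multiple-step ranking as described in the paragraph above can be sketched as follows. This follows the verbal description only (kth-highest observed statistic against the kth-highest simulated statistics) and is not claimed to reproduce every detail of the WESTFALL and YOUNG (1993) procedure. The function name is hypothetical, and the full LOD curve from every permuted scan is assumed to have been stored, not just its maximum.

```python
import numpy as np

def multiple_step_adjusted_p(observed_lod, permuted_scans):
    """Rank the kth-highest observed LOD against simulated kth-highest LODs.

    observed_lod: length-m array from the original scan.
    permuted_scans: (N, m) array, one full LOD curve per permuted data set.
    """
    observed_lod = np.asarray(observed_lod, dtype=float)
    permuted_scans = np.asarray(permuted_scans, dtype=float)
    # Sort every permuted scan from largest to smallest, so that column k
    # holds the (k+1)th-highest statistic of each simulated scan.
    perm_desc = -np.sort(-permuted_scans, axis=1)
    # Positions of the observed statistics, from highest to lowest.
    order = np.argsort(-observed_lod, kind="stable")
    p = np.empty_like(observed_lod)
    for rank, idx in enumerate(order):
        p[idx] = np.mean(perm_desc[:, rank] >= observed_lod[idx])
    return p

# Toy usage: two analysis points, two permuted scans.
p_multi = multiple_step_adjusted_p([3.0, 1.0], [[2.0, 0.5], [4.0, 1.5]])
```

In this toy case the smaller observed statistic (1.0) receives 0.5, whereas ranking it against the simulated maxima (2.0 and 4.0) would give 1.0, illustrating that the two methods differ most for the smallest statistics. Storing and sorting N full scans is also what makes this variant more expensive than the single-step adjustment.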

In summary, the calculation of single-step adjusted P values is straightforward and has been detailed above. The material necessary to perform the adjustments is generated naturally in the course of determining a permutation-based threshold. Quantiles of adjusted P values provide a meaningful gradation of genome-wide significance for elements of a LOD score curve. Furthermore, the adjusted P values are available to provide precise levels of significance for all analysis points in genome scans, in particular local and global maxima of LOD score curves. They provide a natural mechanism for the simultaneous presentation of genome scans involving multiple models, multiple phenotypes, or multiple crosses. They even provide an appropriate framework in which to investigate epistatic gene action. Adjusted P values are a valuable supplement to LOD score curves and are an asset for both exploring and summarizing genetic models of quantitative traits.


*  ACKNOWLEDGMENTS

I thank K. Broman, G. Churchill, O. Nerman, S. Schreyer, and two anonymous referees for providing critical feedback on earlier versions of this manuscript. Financial support for this work was provided by Arexis, AstraZeneca, and Chalmers University.

Manuscript received March 17, 2003; Accepted for publication April 10, 2003.


*  LITERATURE CITED

BROMAN, K. W., 2003 Mapping quantitative trait loci in the case of a spike in the phenotype distribution. Genetics 163:1169-1175.

CHURCHILL, G. A. and R. W. DOERGE, 1994 Empirical threshold values for quantitative trait mapping. Genetics 138:963-971.

DOERGE, R. W. and G. A. CHURCHILL, 1996 Permutation tests for multiple loci affecting a quantitative character. Genetics 142:285-294.

DRACHEVA, S. V., E. F. REMMERS, S. CHEN, L. CHANG, and P. S. GULKO et al., 2000 An integrated genetic linkage map with 1137 markers constructed from five F2 crosses of autoimmune disease-prone and -resistant inbred rat strains. Genomics 63:202-226.

HALEY, C. S. and S. A. KNOTT, 1992 A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69:315-324.

HALEY, C. S., S. A. KNOTT, and J.-M. ELSEN, 1994 Mapping quantitative trait loci in crosses between outbred lines using least squares. Genetics 136:1195-1207.

JIANG, C. and Z-B. ZENG, 1995 Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140:1111-1127.

KAO, C.-H., Z-B. ZENG, and R. D. TEASDALE, 1999 Multiple interval mapping for quantitative trait loci. Genetics 152:1203-1216.

KNOTT, S. A. and C. S. HALEY, 2000 Multitrait least squares for quantitative trait loci detection. Genetics 156:899-911.

LANDER, E. and L. KRUGLYAK, 1995 Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat. Genet. 11:241-247.

LANDER, E. S. and D. BOTSTEIN, 1989 Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121:185-199. [corrigendum: Genetics 136:705 (1994)].

NETTLETON, D. and R. W. DOERGE, 2000 Accounting for variability in the use of permutation testing to detect quantitative trait loci. Biometrics 56:52-58.

WESTFALL, P. H. and S. S. YOUNG, 1993 Resampling-Based Multiple Testing: Examples and Methods for P-Value Adjustment. John Wiley & Sons, New York.

ZENG, Z-B., 1994 Precision mapping of quantitative trait loci. Genetics 136:1457-1468.



