Meta-Analysis of Linkage Data under Worst-Case Conditions: A Demonstration Using the Human OB Region
David B. Allison and Moonseong Heo
Obesity Research Center, St. Luke's/Roosevelt Hospital Center, Columbia University College of Physicians & Surgeons, New York, New York 10025
Corresponding author: David B. Allison, Obesity Research Center, St. Luke's/Roosevelt Hospital, 1090 Amsterdam Ave., 14th floor, New York, NY 10025, dba8{at}columbia.edu (E-mail).
Communicating editor: R. R. HUDSON
| ABSTRACT |
|---|
To date, few methods have been developed explicitly for meta-analysis of linkage analyses. Moreover, the methods that have been developed or suggested generally depend on certain ideal situations and have not been widely applied. In this article, we apply standard statistical theory and meta-analytic techniques in novel ways to five published papers discussing the evidence of linkage of body mass index (BMI) to the region of the human genome containing the OB gene. These methods are "inference based," meaning that they allow one to make statements about the statistical significance of the entire body of evidence. As currently developed, they do not allow specific statements to be made about the amount of variance explained by any putative locus or allow precise confidence intervals to be placed around the putative location of a linked locus. By applying these techniques to the literature on linkage in the human OB gene region, we are able to show that the evidence for linkage somewhere in the region is extremely strong ($P = 1.5 \times 10^{-5}$).
Pooling data to increase the precision of one's estimates and conclusions dates back at least to the early 1900s.
Recently, several authors have suggested the possibility of developing methods for meta-analysis of linkage studies.
Published linkage studies, however, rarely meet ideal conditions for meta-analysis. Typical difficulties include the following:
- Different studies use different genetic markers.
- Different studies use different statistical techniques to test for linkage.
- Some studies include multiple hypothesis tests by using multiple markers, multiple statistical techniques, or multiple phenotypic cutoff points. This creates issues of nonindependent multiple testing that must be managed.
- Not all studies report all of the information required for easy extraction of the data.

The techniques we will present are applicable even under these difficult circumstances.
| GENERAL PRINCIPLES |
|---|
Assume that there are m independent studies assessing linkage of a disease or trait to markers within a region of the genome. Suppose that a P value $P_i$ can be obtained on each of the m data sets, where the P value indicates the probability of obtaining data as extreme or more extreme than the data observed under the null hypothesis of no linkage in the region. The m P values can then be pooled by Fisher's method, using the statistic

$$-2 \sum_{i=1}^{m} \ln P_i \qquad (1)$$

Under the null hypothesis, this quantity is distributed as $\chi^2$ with 2m degrees of freedom. One can then test the significance of the entire body of data by evaluating the probability of obtaining a $\chi^2$ greater than or equal to the observed $\chi^2$ under the null hypothesis, using the significance level of one's choice. (One could imagine situations in which the null hypothesis is rather different. However, in this article we confine our attention to the simple null hypothesis of no linkage in the region.)
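As a minimal sketch of Equation 1, the pooled statistic and its P value can be computed with the Python standard library alone. The function names and the example P values below are illustrative assumptions, not data from the studies discussed. For even degrees of freedom 2m, the chi-square upper-tail probability has a closed form, so no statistics package is required.

```python
from math import exp, log


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with even d.f."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0   # (x/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return exp(-half) * total


def fisher_pool(p_values):
    """Return (statistic, d.f., pooled P) for a list of independent P values."""
    stat = -2.0 * sum(log(p) for p in p_values)
    df = 2 * len(p_values)
    return stat, df, chi2_sf_even_df(stat, df)


# Hypothetical P values from m = 3 independent studies (not from the paper).
stat, df, pooled = fisher_pool([0.04, 0.20, 0.11])
print(f"chi-square = {stat:.2f}, d.f. = {df}, pooled P = {pooled:.4f}")
```

With a single study the pooled P value equals the input P value, which is a convenient sanity check on the implementation.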
In this article, we illustrate the application of this technique to published evidence for linkage in the human OB gene region to body mass index (BMI; kg/m²). Most of the exposition involves the application of standard statistical methods and meta-analytic "tricks of the trade" to derive one P value from each study. Following the EXAMPLE, a reiteration and discussion of the general approach are included.
| EXAMPLE |
|---|
To our knowledge, there are five published studies concerning linkage of BMI with markers in the human OB gene region.
However, the first problem of incomplete data can be solved rather easily in this case. Because the exact t values are provided along with n (the sample size), it is easy to obtain the exact P values by integrating the t distribution with n - 1 degrees of freedom (d.f.). In this case, the second problem of having two separate studies can also be solved. Although the two samples involved contain overlapping individuals, it has been shown that the IBD statuses of different sibling pairs from within the same sibship are pairwise independent.
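The conversion from a reported t value to a P value can be sketched as a direct numerical integration of the t density with n - 1 d.f. This is only an illustration: the helper names, the use of Simpson's rule, the assumption of a one-sided (upper-tail) test, and the example values of t and n are all assumptions made here rather than details taken from the study.

```python
from math import exp, lgamma, log, pi


def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def t_upper_tail(t, df, steps=2000):
    """One-sided P(T >= t), integrating the density on [0, |t|] by Simpson's rule."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area = total * h / 3          # P(0 <= T <= |t|)
    upper = 0.5 - area            # P(T >= |t|) by symmetry
    return upper if t >= 0 else 1 - upper


# Hypothetical example: t = 2.1 reported with n = 50 observations.
n = 50
print(f"one-sided P = {t_upper_tail(2.1, n - 1):.4f}")
```

For 1 d.f. the t distribution is the Cauchy distribution, for which P(T >= 1) is exactly 0.25; this gives a simple check of the integration.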
To accomplish this, we begin by converting each P value to a corresponding (standard normal) Z-score by means of the inverse standard normal distribution function $\Phi^{-1}$, that is, $Z = \Phi^{-1}(1 - P)$. Under the null hypothesis of no linkage, the eight Z-scores have a multivariate normal distribution with zero means, unit variances, and a correlation matrix R. According to CAREY and WILLIAMSON (1993), the correlation between the Z-scores at two markers is $(1 - 2\theta)^2$, where $\theta$ denotes the recombination fraction between the markers. Note that one centimorgan (cM) is equal to a $\theta$ of 0.01, equivalent, on average, to 1 million base pairs (bp) (DEPARTMENT OF ENERGY 1992).
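In this case, since the tests on individual markers are one-sided, we defined more extreme data in terms of the sum of the Z-scores,

$$S = \sum_{i=1}^{8} Z_i \qquad (2)$$

which, under the null hypothesis, is normally distributed with mean zero and variance equal to the sum of all elements of R.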
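The sum-of-Z-scores approach can be sketched in a few lines of Python. The sketch assumes that the correlation between Z-scores at two markers is $(1 - 2\theta)^2$ and approximates $\theta$ by the map distance in cM divided by 100 (reasonable only for short distances, and capped at 0.5 here). The function names, marker positions, and P values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def marker_correlations(positions_cm):
    """Correlation matrix with entries (1 - 2*theta)^2, theta ~ distance in cM / 100."""
    n = len(positions_cm)
    corr = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            theta = min(abs(positions_cm[i] - positions_cm[j]) / 100.0, 0.5)
            corr[i][j] = (1.0 - 2.0 * theta) ** 2
    return corr


def sum_z_pvalue(p_values, corr):
    """Single one-sided P value from the sum of correlated Z-scores."""
    z = [STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    total = sum(z)
    variance = sum(sum(row) for row in corr)  # Var(sum Z) = sum of all entries of R
    return 1.0 - STD_NORMAL.cdf(total / sqrt(variance))


# Hypothetical map positions (cM) and one-sided P values for three markers.
positions = [0.0, 4.0, 9.0]
p_values = [0.08, 0.03, 0.15]
print(f"single P = {sum_z_pvalue(p_values, marker_correlations(positions)):.4f}")
```

Two limiting cases are useful checks: perfectly correlated markers with identical P values return that same P value, and unlinked markers behave as independent tests.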
A multiple-marker (multipoint) linkage analysis was conducted by DUGGIRALA et al. (1996), who reported a single P value. We therefore applied the correction of LANDER and KRUGLYAK (1995), Equation 3, in which $\rho$ is the crossing-over rate between the genotypes being compared and T is the threshold level that yields the pointwise significance level $\alpha(T)$. Here, $\rho = 2$ (for sib-pair tests), $\alpha(T) = 0.003$ (the observed P value), and $T = \Phi^{-1}(1 - 0.003)$, which is the threshold level of the observed P value. This yields $\mu(T) = 0.073$, and hence $P^* = 0.070$. Therefore, after correcting the P value of 0.003 for the fact that it was obtained by a multipoint procedure with a 211-cM interval, the corrected single P value is 0.070.
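The last step, from µ(T) = 0.073 to P* = 0.070, is consistent with the Poisson-type relation P* = 1 - exp(-µ(T)). That relation is an assumption in the sketch below (the text reports only the two values), and the function name is hypothetical.

```python
from math import exp


def regionwide_p(mu):
    """P* = 1 - exp(-mu): probability of at least one exceedance when the
    expected number of exceedances is mu (Poisson approximation, assumed here)."""
    return 1.0 - exp(-mu)


mu_T = 0.073  # value of mu(T) reported in the text
print(f"P* = {regionwide_p(mu_T):.3f}")  # prints "P* = 0.070"
```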
The data for REED et al. (1996) are summarized in Tables 2 and 3.
We begin by considering the data for the sibling-pair sets in Table 2, in which the corresponding nominal P value was provided for each BMI cutoff. Hence, by applying $\Phi^{-1}$ to those P values as above, the corresponding Z-scores are obtained. In this situation, the 3 × 3 correlation matrix R for the Z-scores can be calculated by assuming that the IBD status for the sibling pairs is independently and identically distributed (i.i.d.). Under this assumption, the correlation between the Z-score for a sample and the Z-score for a subset within that sample is the square root of the proportion of the larger sample that falls in the subsample. For example, the correlation between the Z-scores for sibling pairs above 35 and sibling pairs above 30 is $\sqrt{n_{35}/n_{30}}$, where $n_c$ denotes the number of sibling pairs above the BMI cutoff c. As above, once the three Z's and their correlation matrix are obtained, a single P value, in terms of the sum, can be obtained. In this case, it was observed to be 0.159 for the sib-pair test. (Note that the sib-pair tests in Table 2 are one-sided.)
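The nested-cutoff calculation can be sketched as follows. The correlation between the Z-scores of two nested samples is taken to be the square root of the ratio of the smaller to the larger sample size, as described above. The sample sizes and P values are hypothetical (the actual counts are in Table 2), and the function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def nested_corr(sizes):
    """Correlation matrix for Z-scores computed on nested samples of the given sizes."""
    return [[sqrt(min(a, b) / max(a, b)) for b in sizes] for a in sizes]


def combined_sum_p(p_values, corr):
    """One-sided P value for the sum of correlated Z-scores."""
    z = [STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    variance = sum(sum(row) for row in corr)
    return 1.0 - STD_NORMAL.cdf(sum(z) / sqrt(variance))


# Hypothetical numbers of sibling pairs above BMI cutoffs 30, 35, and 40,
# with hypothetical one-sided nominal P values for each cutoff.
sizes = [80, 45, 20]
p_values = [0.20, 0.10, 0.30]
R = nested_corr(sizes)
print(f"corr(>35, >30) = {R[1][0]:.3f}")
print(f"single P = {combined_sum_p(p_values, R):.3f}")
```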
Turning to the TDT results in Table 3, a single P value can be obtained in a similar fashion. The chi-squares in Table 3 are converted to Z-scores by taking their square roots. Assuming that the data are i.i.d., the correlation among the Z's can again be estimated as the square root of the number of subjects in a subset divided by the number of subjects in the larger set. For example, the estimated correlation of the Z-score for the BMI cutoff of 40 and the Z-score for subjects with a BMI >30 is $\sqrt{n_{40}/n_{30}}$, where $n_c$ is the number of subjects in the set defined by cutoff c. With the vector Z of Z-scores and the 3 × 3 correlation matrix R among them derived, a single P value for the TDT in REED et al. (1996) can be obtained from the quadratic form

$$Q = Z^{T} R^{-1} Z \qquad (4)$$

which, under the null hypothesis, is chi-square distributed with 3 d.f.
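A quadratic-form test of this kind can be sketched with the standard library only. The sketch assumes the form Q = Z'R⁻¹Z referred to a chi-square distribution with 3 d.f.; the chi-square values and sample sizes are hypothetical (the real ones are in Table 3), and the helper names are illustrative. A small Gaussian elimination routine avoids explicitly inverting R.

```python
from math import erfc, exp, pi, sqrt


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def chi2_sf_3df(q):
    """Upper-tail probability of a chi-square variable with 3 d.f."""
    return erfc(sqrt(q / 2.0)) + sqrt(2.0 * q / pi) * exp(-q / 2.0)


def quadratic_form_pvalue(z, corr):
    """Return (Q, P) with Q = z' R^{-1} z for three correlated Z-scores."""
    q = sum(zi * xi for zi, xi in zip(z, solve(corr, z)))
    return q, chi2_sf_3df(q)


# Hypothetical TDT chi-squares and subject counts for three nested BMI cutoffs.
chi_squares = [4.0, 2.5, 1.2]
sizes = [60, 35, 15]
z = [sqrt(c) for c in chi_squares]
R = [[sqrt(min(a, b) / max(a, b)) for b in sizes] for a in sizes]
q, p = quadratic_form_pvalue(z, R)
print(f"Q = {q:.2f}, P = {p:.3f}")
```

Using the quadratic form rather than the sum reflects the two-sided nature of the TDT: large Z-scores of either sign count as evidence against the null hypothesis.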
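The final challenge with the data from REED et al. (1996) is that both a sib-pair P value and a TDT P value are available, whereas only one P value per study enters the pooled analysis.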
Overall meta-analysis:
The (single) P values for the five papers calculated as described above are presented in Table 4. In the primary overall meta-analysis, the P values were pooled using Fisher's method (Equation 1). For REED et al. (1996), only one of the two available P values (sib-pair or TDT) entered the primary analysis. The $\chi^2$ for the overall analysis was 44.10 with 12 d.f. ($P = 1.5 \times 10^{-5}$). These results apparently provide strong evidence of linkage somewhere in the OB region. When the other P value for REED et al. (1996) was used instead, the $\chi^2$ was 47.73 ($P = 3.5 \times 10^{-6}$). Thus, these results suggest that there is clear evidence for linkage of BMI to something in the OB region regardless of whether one uses the TDT or sib-pair analyses from REED et al. (1996).
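The two reported P values can be checked directly from the reported chi-square statistics and 12 d.f. The short sketch below uses the closed-form upper tail of the chi-square distribution for even d.f.; the function name is illustrative.

```python
from math import exp


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even d.f."""
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return exp(-half) * total


for stat in (44.10, 47.73):
    print(f"chi-square = {stat}, 12 d.f., P = {chi2_sf_even_df(stat, 12):.1e}")
# prints P = 1.5e-05 for 44.10 and P = 3.5e-06 for 47.73
```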
As a sensitivity analysis, each study result was removed from the analysis in turn, and the chi-square statistic with 10 d.f. (from the remaining study results) was computed. The corresponding P values are given in Table 4.
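A leave-one-out sensitivity analysis of this kind can be sketched as below. The study labels and P values are hypothetical placeholders (the actual values are those in Table 4), and the function names are illustrative. Six results are used so that each reduced analysis has 10 d.f., as in the text.

```python
from math import exp, log


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even d.f."""
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return exp(-half) * total


def fisher_pool(p_values):
    """Return (statistic, d.f., pooled P) by Fisher's method."""
    stat = -2.0 * sum(log(p) for p in p_values)
    df = 2 * len(p_values)
    return stat, df, chi2_sf_even_df(stat, df)


def leave_one_out(p_by_study):
    """Pool the P values again with each study result removed in turn."""
    results = {}
    for name in p_by_study:
        rest = [p for other, p in p_by_study.items() if other != name]
        results[name] = fisher_pool(rest)
    return results


# Hypothetical single P values for six study results.
p_by_study = {"A": 0.002, "B": 0.07, "C": 0.16, "D": 0.01, "E": 0.45, "F": 0.03}
for name, (stat, df, p) in leave_one_out(p_by_study).items():
    print(f"without {name}: chi-square = {stat:.2f}, d.f. = {df}, P = {p:.2e}")
```

If the pooled result stays significant no matter which result is dropped, the overall conclusion does not hinge on any single study.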
Apart from the overall significance of these results, one may question whether the variation of the strength of the evidence (in terms of the P value) seen from study to study is simply random variation or represents some statistically significant heterogeneity in the linkage. Although this is clearly a legitimate question, it is not possible to conduct such a heterogeneity test in the current situation when only the inferential information available from P values is used, since the heterogeneity of the strength of statistical evidence across the studies is not identifiable from the heterogeneity of the linkage per se.
Summary of meta-analytic approach used:
In this section we reiterate in general terms the approach taken in the meta-analysis. The first step is to extract a single P value from each study. The studies contained herein provide an illustration of several different techniques for extracting the single P value from each study. The techniques used can briefly be summarized as follows:
- If a single P value was provided from a multipoint procedure, then the Lander-Kruglyak correction in Equation 3 was applied (the DUGGIRALA et al. 1996 case).
- If a separate P value for each of several markers was used, then the Z-scores and the correlations among them were obtained by using the inverse standard normal distribution function $\Phi^{-1}$ and the formula in CAREY and WILLIAMSON (1993), respectively. Since the P values are one-sided for linkage studies, a single P value was determined in terms of the sum of the Z-scores in Equation 2, which is normally distributed (the cases of CLEMENT et al. 1996 and NORMAN et al. 1996).
- For the multiple-cutoff study defining affected or unaffected sib pairs (e.g., REED et al. 1996), the correlations among Z-scores were estimated by means of the proportion of sample sizes. For the one-sided sib-pair tests, the sum in Equation 2 was used for extracting the single P value. In contrast, for the two-sided TDT, the single P value was extracted by means of the quadratic form Q in Equation 4, which is chi-square distributed.
- For a single-marker study, no correction needs to be applied (the BORECKI et al. 1994 case).
Once a single P value is derived for each study, the P values are pooled using Fisher's method in Equation 1.
| DISCUSSION |
|---|
The above example indicates that by judicious use of standard statistical theory and meta-analytic techniques, one can meta-analyze data from multiple linkage studies of the same phenotype even under a number of worst-case conditions. We believe such methods will become essential to understanding the overall significance of the research literature when raw data are not available. This is all the more important in the field of gene mapping for complex traits because the power to detect significance in any one study is often well below what one would desire. This power can be enhanced by pooling the data in multiple studies.
The efficacy of these procedures is demonstrated by the example involving the assessment of linkage of BMI to the genome region containing the human OB gene. Taken individually, no study is terribly convincing, with a single exception.
We wish to stress again that we do not consider our approach outlined in the current paper to be the optimal approach. Clearly, a better approach is to obtain the raw data of all of the investigators, whether the data were published or not, and conduct a pooled analysis. However, for the near future, we suspect that there will be many situations in which meta-analysts have only one alternative available to them: the statistical integration of data from multiple published studies that have used different statistical methods and different markers and, in some cases, have presented incomplete information. Moreover, although we hope that our paper acts as a call for the presentation of more complete information, such calls have been issued by meta-analysts for over two decades.
A limitation of the meta-analysis conducted herein concerns the potential for publication bias. Publication bias occurs when the probability of a study being published is dependent upon the results of the study. Publication bias has been shown to exist in other fields.
Finally, the methods presented herein only allow one to conduct inferential tests of whether or not there is significant linkage in a particular region. The appropriate conclusion to such an analysis, assuming a statistically significant result is obtained, is that there is statistically significant evidence for linkage somewhere in the region examined, in this case the region from D7S531 to D7S483.
In conclusion, as more studies assessing the linkage between genetic markers and complex diseases or traits are possible, we suspect that there will be increasing need for meta-analysis to objectively and quantitatively pool the resulting information. We hope that the procedures outlined in this paper are a useful step in that direction.
| FOOTNOTES |
|---|
1 Since preparation of this article, we have become aware that there are at least three additional articles on this topic.
| ACKNOWLEDGMENTS |
|---|
This research was supported by National Institutes of Health grants R29DK47256, R01DK51716, and P30DK26687.
Manuscript received February 28, 1997; Accepted for publication October 23, 1997.
| LITERATURE CITED |
|---|
ALLISON, D. B., M. S. FAITH, and B. S. GORMAN, 1996 Publication bias in obesity treatment trials? Int. J. Obes. 20:931-937.
AMOS, C. I., R. C. ELSTON, A. F. WILSON, and J. E. BAILEY-WILSON, 1989 A more powerful robust sib-pair test of linkage for quantitative traits. Genet. Epidemiol. 6:435-449.
BORECKI, I. B., T. RICE, L. PÉRUSSE, C. BOUCHARD, and D. C. RAO, 1994 An exploratory investigation of genetic linkage with body composition and fatness phenotypes: the Québec family study. Obes. Res. 2:213-219.
BRAY, M. S., E. BOERWINKLE, and C. L. HANIS, 1996 OB gene not linked to human obesity in Mexican American affected sib pairs from Starr County, Texas. Hum. Genet. 98:590-595.
CAREY, G. and J. WILLIAMSON, 1993 Linkage analysis of quantitative traits: increased power by using selected samples. Am. J. Hum. Genet. 49:786-796.
CHALMERS, T. C., C. S. FRANK, and D. REITMAN, 1990 Minimizing the three stages of publication bias. J. Am. Med. Assoc. 263:1392-1395.
CLEMENT, K., C. GARNER, J. HAGER, A. PHILIPPI, and C. LEDUC et al., 1996 Indication for linkage of the human OB gene region with extreme obesity. Diabetes 45:687-690.
COX, D. R., and D. B. HINKLEY, 1974 Theoretical Statistics. Chapman & Hall, London.
DEPARTMENT OF ENERGY HUMAN GENOME PROGRAM, 1992 Primer on Molecular Genetics. Oak Ridge National Laboratory, Oak Ridge.
DICKERSIN, K. and Y. MIN, 1993 Publication bias: the problem that won't go away. Ann. NY Acad. Sci. 703:135-148.
DUGGIRALA, R., M. P. STERN, B. D. MITCHELL, L. J. REINHART, and P. A. SHIPMAN et al., 1996 Quantitative variation in obese-related traits and insulin precursors linked to the OB gene region on human chromosome 7. Am. J. Hum. Genet. 59:694-703.
FEINSTEIN, A. R., 1995 Meta-analysis: statistical alchemy for the 21st century. J. Clin. Epidemiol. 48:71-79.
FISHER, R. A., 1954 Statistical Methods for Research Workers, 12th ed. Hafner Publishing Company Inc., New York.
FULKER, D. W., S. S. CHERNEY, and L. R. CARDON, 1995 Multipoint interval mapping of quantitative trait loci, using sib pairs. Am. J. Hum. Genet. 56:1224-1233.
GLASS, G. V., 1976 Primary, secondary, and meta-analysis of research. Educ. Res. 5:3-8.
GREENHOUSE, J. B., and S. IYENGAR, 1994 Sensitivity analysis and diagnostics, pp. 383-409 in The Handbook of Research Synthesis, edited by H. COOPER and L. V. HEDGES. Russell Sage Foundation, New York.
HASEMAN, J. K. and R. C. ELSTON, 1972 The investigation of linkage between a quantitative trait and a marker locus. Behav. Genet. 2:3-19.
HASSTEDT, S. J., M. HOFFMAN, M. F. LEPPERT, and S. C. ELBEIN, 1997 Recessive inheritance of obesity in familial non-insulin-dependent diabetes mellitus, and lack of linkage to nine candidate genes. Am. J. Hum. Genet. 61:668-677.
HODGE, S. E., 1984 The information contained in multiple sibling pairs. Genet. Epidemiol. 1:109-122.
HYNE, V. and M. J. KEARSEY, 1995 QTL analysis: further uses of 'marker regression.' Theor. Appl. Genet. 91:471-476.
IYENGAR, S. and J. B. GREENHOUSE, 1988 Selection models and the file drawer problem. Stat. Science 3:109-135.
JACKSON, G. B., 1980 Methods of integrative reviews. Rev. Educ. Res. 50:438-460.
JENG, G. T., J. R. SCOTT, and L. F. BURMEISTER, 1995 A comparison of meta-analytic results using literature vs individual patient data. J. Am. Med. Assoc. 274:830-845.
KEARSEY, M. J. and V. HYNE, 1994 QTL analysis: a simple 'marker-regression' approach. Theor. Appl. Genet. 89:698-702.
LANDER, E. S. and L. KRUGLYAK, 1995 Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat. Genet. 11:241-247.
LI, Z. and D. C. RAO, 1996 Random effects model for meta-analysis of multiple quantitative sibpair linkage studies. Genet. Epidemiol. 13:377-384.
NORMAN, R. A., R. L. LEIBEL, W. K. CHUNG, L. POWER-KEHOE, and S. C. CHUA et al., 1996 Absence of linkage of obesity and energy metabolism to markers flanking homologues of rodent obesity genes in Pima Indians. Diabetes 45:1229-1232.
OKSANEN, L., M. ÖHMAN, M. HEIMAN, K. KAINULAINEN, and J. KAPRIO et al., 1997 Markers for the gene ob and serum leptin levels in human morbid obesity. Hum. Genet. 99:559-564.
PEARSON, K., 1904 Report on certain enteric fever inoculation statistics. Brit. Med. J. 3:1243-1246.
PEARSON, K., 1933a Appendix to Dr. Elderton's paper on "The Lanarkshire milk experiment." Ann. Eugen. 5:337-338.
PEARSON, K., 1933b On a method of determining whether a sample size n supposed to have been drawn from a parent population having a known probability integral has probably been drawn at random. Biometrika 25:379-410.
REED, D. R., Y. DING, W. XU, C. CATHER, and E. D. GREEN et al., 1996 Extreme obesity may be linked to markers flanking the human OB gene. Diabetes 45:691-694.
SPIELMAN, R. S., R. E. MCGINNIS, and W. J. EWENS, 1993 Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus. Am. J. Hum. Genet. 52:506-516.
STIRLING, B., N. J. COX, G. I. BELL, C. L. HANIS, and R. S. SPIELMAN et al., 1995 Identification of microsatellite markers near the human OB gene and linkage studies in NIDDM-affected sib pairs. Diabetes 44:999-1001.
TIPPETT, L. H. C., 1931 The Methods of Statistics. Williams & Norgate, London.
XU, S. and W. R. ATCHLEY, 1995 A random model approach to interval mapping of quantitative trait loci. Genetics 141:1189-1197.