Letter to the Editor
On a Misconception About Irreducibility of the Single-Site Gibbs Sampler in a Pedigree Application
C. Cannings and N. A. Sheehan
Division of Genomic Medicine, Royal Hallamshire Hospital, University of Sheffield, Sheffield S10 2JF, United Kingdom and Department of Epidemiology and Public Health, University of Leicester, Leicester LE1 6TP, United Kingdom
Corresponding author: N. A. Sheehan, University of Leicester, 22-28 Princess Rd. W., Leicester LE1 6TP, UK. E-mail: nas11@le.ac.uk
THE analysis of genetic data on groups of related individuals, or pedigrees, frequently necessitates the calculation of probabilities and likelihoods. There are well-known algorithms such as the peeling algorithm (![]()
The first application of MCMC methods in pedigree analysis (![]()
The pedigree Gibbs sampler, as defined above, will converge to the true posterior distribution of genotypes on the pedigree given available phenotypes if it defines an irreducible chain whereby all states communicate and if each individual genotype is updated infinitely often. This follows from the ergodic theorem for finite, aperiodic, irreducible, Markov chains (![]()
The issue, therefore, is whether the legal configurations of genotypes on a pedigree (i.e., those states that agree both with the data and the genetic model) communicate if genotypes are updated one at a time. It can be shown that the single-site Gibbs sampler is generally irreducible over the legal states of a diallelic genetic system. The only exception (![]()
Once the genetic system has three or more alleles, it is easy to see how the Gibbs sampler can define a reducible Markov chain. The classic example for a three-allele system, with alleles A, B, and C, say (see ![]()
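The reducibility described here can be checked by brute-force enumeration on a small family. The sketch below is illustrative only: it assumes a nuclear family with two untyped parents and two typed offspring, AB and CC, a configuration chosen for this example and not necessarily the one the text refers to. It also assumes every genotype has positive prior probability, so that a single-site update can reach any legal state differing in exactly one genotype. All function names are hypothetical.

```python
from itertools import combinations_with_replacement, product


def genotypes(alleles):
    """Unordered genotypes for the given alleles, e.g. AA, AB, AC, BB, BC, CC."""
    return ["".join(g) for g in combinations_with_replacement(sorted(alleles), 2)]


def can_produce(father, mother, child):
    """True if the child can inherit one allele from each parent.

    Genotypes are two-letter strings with the letters in sorted order.
    """
    return any("".join(sorted(a + b)) == child for a in father for b in mother)


def legal_states(children, alleles):
    """All (father, mother) genotype pairs consistent with the typed children."""
    return [
        (f, m)
        for f, m in product(genotypes(alleles), repeat=2)
        if all(can_produce(f, m, c) for c in children)
    ]


def communicating_classes(states):
    """Group legal states that are linked by chains of single-genotype changes."""
    remaining = set(states)
    classes = []
    while remaining:
        start = remaining.pop()
        members, stack = {start}, [start]
        while stack:
            current = stack.pop()
            # A single-site update changes exactly one parent's genotype.
            neighbours = [t for t in remaining
                          if sum(x != y for x, y in zip(current, t)) == 1]
            for t in neighbours:
                remaining.discard(t)
                members.add(t)
                stack.append(t)
        classes.append(sorted(members))
    return sorted(classes)


# Assumed example data: offspring AB and CC (three alleles), AB and BB (two alleles).
three_allele = communicating_classes(legal_states(["AB", "CC"], "ABC"))
two_allele = communicating_classes(legal_states(["AB", "BB"], "AB"))
print(len(three_allele), three_allele)
print(len(two_allele), two_allele)
```

With these assumed data the three-allele family has exactly two legal parental configurations, (AC, BC) and (BC, AC), and neither can be reached from the other by changing one parent at a time, so they form two noncommunicating classes. The diallelic family, by contrast, yields a single class.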
Reducibility depends on the data. If it were possible to enumerate the noncommunicating classes of the Markov chain described by the Gibbs sampler, a scheme that would move between the different classes using the reversible-jump method of ![]()
The practical implication of this is that it may not be possible to sample correctly from the true posterior distribution of genotype, given phenotype, using a single-site Gibbs sampler once there are three or more alleles in the genetic system, and any estimates of probabilities and likelihoods obtained in this way are hence unreliable.
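A small simulation makes this unreliability concrete. The following is a minimal sketch, not the sampler used in any of the cited work: it assumes a hypothetical nuclear family with untyped parents and typed offspring AB and CC, a uniform prior over the six parental genotypes (in place of population allele frequencies), and fully observed codominant genotypes. Under these assumptions the two legal states (AC, BC) and (BC, AC) are equally probable by symmetry, so the exact posterior probability that the father is AC is 0.5.

```python
import random
from itertools import combinations_with_replacement

GENOTYPES = ["".join(g) for g in combinations_with_replacement("ABC", 2)]
CHILDREN = ["AB", "CC"]  # assumed typed offspring; parents are untyped


def transmission_prob(father, mother, child):
    """Mendelian probability that these parents produce the child genotype."""
    hits = sum("".join(sorted(a + b)) == child for a in father for b in mother)
    return hits / 4.0


def likelihood(state):
    """Probability of the observed children given a (father, mother) state."""
    father, mother = state
    prob = 1.0
    for child in CHILDREN:
        prob *= transmission_prob(father, mother, child)
    return prob


def gibbs_sweep(state, rng):
    """Update father, then mother, each from its full conditional.

    With a uniform prior (an assumption of this sketch), the full conditional
    is proportional to the likelihood of the children.
    """
    for site in range(2):
        proposals = [state[:site] + (g,) + state[site + 1:] for g in GENOTYPES]
        weights = [likelihood(p) for p in proposals]
        state = rng.choices(proposals, weights=weights, k=1)[0]
    return state


def estimate_father_ac(start, sweeps=1000, seed=0):
    """Monte Carlo estimate of P(father = AC | data) from a legal start state."""
    rng = random.Random(seed)
    state, hits = start, 0
    for _ in range(sweeps):
        state = gibbs_sweep(state, rng)
        if state[0] == "AC":
            hits += 1
    return hits / sweeps


print(estimate_father_ac(("AC", "BC")))
print(estimate_father_ac(("BC", "AC")))
```

Because each full conditional puts all of its mass on the current genotype, the chain never leaves its starting state. The estimate is therefore 1.0 or 0.0 depending only on the starting configuration, however long the sampler runs, and never the exact value of 0.5.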
In animal breeding circles, especially those concerned with animals where artificial insemination is practiced, it is not uncommon to have a small number of males with large numbers of mates and offspring. In particular, full DNA information would usually be available on these males. The pedigrees resulting from such a breeding program are extremely complex and defy exact calculation of probabilities and likelihoods for any genetic system, however simple. Often, the pedigree structure is sacrificed and estimates are obtained on a simple subpedigree such as a half-sib design, using methods like least squares (![]()
It has been claimed in the animal genetics literature (see ![]()
For a half-sib design that includes only individuals over two generations and assumes that the dams are unrelated and distinct, the single-site Gibbs sampler is irreducible when the sires are fully typed, regardless of any observations on mates or offspring. However, this is not the case for general pedigrees, and misleading results can be obtained.
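The two-generation claim can be spot-checked by enumeration on a small instance. The sketch below is a check of one hypothetical pedigree, not a proof: it assumes a single sire typed AB, two unrelated untyped dams, one typed offspring (AC), one untyped offspring, a three-allele codominant system, and positive prior probability for every founder genotype. The individual labels and data are invented for illustration.

```python
from itertools import combinations_with_replacement, product

GENOTYPES = ["".join(g) for g in combinations_with_replacement("ABC", 2)]

# Assumed two-generation half-sib pedigree: offspring -> (sire, dam).
PARENTS = {"O1": ("S", "D1"), "O2": ("S", "D2")}
OBSERVED = {"S": "AB", "O1": "AC"}   # sire fully typed, one offspring typed
UNTYPED = ["D1", "D2", "O2"]         # genotypes the sampler must update


def can_produce(father, mother, child):
    """True if the child can inherit one allele from each parent."""
    return any("".join(sorted(a + b)) == child for a in father for b in mother)


def is_legal(values):
    """True if the untyped genotypes agree with the data and Mendelian inheritance."""
    genotype = {**OBSERVED, **dict(zip(UNTYPED, values))}
    return all(
        can_produce(genotype[sire], genotype[dam], genotype[child])
        for child, (sire, dam) in PARENTS.items()
    )


def count_classes():
    """Number of communicating classes under one-genotype-at-a-time updates."""
    legal = {v for v in product(GENOTYPES, repeat=len(UNTYPED)) if is_legal(v)}
    classes = 0
    while legal:
        stack = [legal.pop()]
        classes += 1
        while stack:
            current = stack.pop()
            # Legal states that differ from the current one at exactly one site.
            neighbours = [t for t in legal
                          if sum(x != y for x, y in zip(current, t)) == 1]
            for t in neighbours:
                legal.discard(t)
                stack.append(t)
    return classes


print(count_classes())
```

For this assumed pedigree the count is one, which is consistent with irreducibility in the two-generation half-sib case. Extending `PARENTS` over further generations is the natural way to probe the general claim, but no such pedigree is given here because the relevant example in the text is not recoverable.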
The belief that the single-site Gibbs sampler defines an irreducible Markov chain on the space of genotypic configurations over a general pedigree provided one member of each spouse pairing is observed would seem to derive from the erroneous statement in ![]()
However, while it is true that any analysis based on a reducible sampling scheme is potentially dubious, reducibility itself is not really a problem. There are several methods for getting around reducibility (see ![]()
Using meiosis indicators or descent graphs (![]()
ACKNOWLEDGMENTS
We acknowledge Rohan Fernando for bringing this problem to our attention. In addition, Nuala Sheehan acknowledges support from the Wellcome Trust Biomedical Research Collaboration Grant 056266/Z/98/Z and the TVW Telethon Institute for Child Health Research, Perth, Western Australia.
Manuscript received May 3, 2002; Accepted for publication August 13, 2002.
LITERATURE CITED
BESAG, J., 1986 On the statistical analysis of dirty pictures. J. R. Stat. Soc. Ser. B 48:259-302.
BINK, M. C. A. M. and J. A. M. VAN ARENDONK, 1999 Detection of quantitative trait loci in outbred populations with incomplete marker data. Genetics 151:409-420.
CANNINGS, C., E. A. THOMPSON, and M. H. SKOLNICK, 1978 Probability functions on complex pedigrees. Adv. Appl. Probab. 10:26-61.
CHURCHILL, G. A. and R. W. DOERGE, 1994 Empirical threshold values for quantitative trait mapping. Genetics 138:963-971.
COX, D. R. and H. D. MILLER, 1965 Stochastic Processes. Methuen & Co., London.
ELSTON, R. C. and J. STEWART, 1971 A general model for the genetic analysis of pedigree data. Hum. Hered. 21:523-542.
FERNÁNDEZ, S. A., R. L. FERNANDO, B. GULDBRANDTSEN, L. R. TOTIR, and A. L. CARRIQUIRY, 2001 Sampling genotypes in large pedigrees with loops. Genet. Sel. Evol. 33:337-367.
GEMAN, S. and D. GEMAN, 1984 Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Patt. Anal. Mach. Intell. 6:721-741.
GEYER, C. J., 1991 Markov chain Monte Carlo maximum likelihood, pp. 156-163 in Computer Science and Statistics: Proceedings of the 23rd Symposium on the Interface, edited by E. M. KERAMIDAS and S. M. KAUFMAN. Interface Foundation of North America, Fairfax Station, VA.
GEYER, C. J. and E. A. THOMPSON, 1995 Annealing Markov chain Monte Carlo with applications to ancestral inference. J. Am. Stat. Assoc. 90(431):909-920.
GILKS, W. R., D. CLAYTON, D. J. SPIEGELHALTER, N. G. BEST, A. J. MCNEIL et al., 1993 Modelling complexity: applications of Gibbs sampling to medicine. J. R. Stat. Soc. Ser. B 55:39-52.
GREEN, P. J., 1995 Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82:711-732.
HALEY, C. S., S. A. KNOTT, and J. M. ELSEN, 1994 Mapping quantitative trait loci in crosses between outbred lines using least squares. Genetics 136:1195-1207.
HEATH, S. C., 1997 Markov chain Monte Carlo segregation and linkage analysis for oligogenic models. Am. J. Hum. Genet. 61:748-760.
HOESCHELE, I., P. UIMARI, F. GRIGNOLA, Q. ZHANG, and K. GAGE, 1997 Advances in statistical methods to map quantitative trait loci in outbred populations. Genetics 147:1445-1457.
HURME, P., M. J. SILLANPÄÄ, E. ARJAS, T. REPO, and O. SAVOLAINEN, 2000 Genetic basis of climatic adaptation in Scots Pine by Bayesian quantitative trait locus analysis. Genetics 156:1309-1322.
JENSEN, C. S. and N. SHEEHAN, 1998 Problems with determination of noncommunicating classes for Monte Carlo Markov chain applications in pedigree analysis. Biometrics 54:416-425.
JENSEN, C. S., U. KJAERULFF, and A. KONG, 1995 Blocking Gibbs sampling in very large probabilistic expert systems. Int. J. Hum. Comput. Stud. 42:647-666.
JENSEN, F. V., 1996 An Introduction to Bayesian Networks. University College Press, London.
LAURITZEN, S. L. and D. J. SPIEGELHALTER, 1988 Local computations with probabilities on graphical structures and their applications to expert systems. J. R. Stat. Soc. Ser. B 50:157-224.
LIN, S., 1996 Multipoint linkage analysis via Metropolis jumping kernels. Biometrics 52:1417-1427.
LIN, S., E. THOMPSON, and E. WIJSMAN, 1993 Achieving irreducibility of the Markov chain Monte Carlo method applied to pedigree data. IMA J. Math. Appl. Med. Biol. 10:1-17.
LIN, S., E. THOMPSON, and E. WIJSMAN, 1994 Finding noncommunicating sets for Markov chain Monte Carlo estimations on pedigrees. Am. J. Hum. Genet. 54:695-704.
SHEEHAN, N., 1992 Sampling genotypes on complex pedigrees with phenotypic constraints: the origin of the B allele among the Polar Eskimos. IMA J. Math. Appl. Med. Biol. 9:1-18.
SHEEHAN, N. and A. THOMAS, 1993 On the irreducibility of a Markov chain defined on a space of genotype configurations by a sampling scheme. Biometrics 49:163-175.
SHEEHAN, N. A., 1990 Genetic restoration on complex pedigrees. Ph.D. Thesis, University of Washington, Seattle.
SHEEHAN, N. A., 2000 On the application of Markov chain Monte Carlo methods to genetic analyses on complex pedigrees. Int. Stat. Rev. 68:83-110.
SHEEHAN, N. A., B. GULDBRANDTSEN, M. S. LUND, and D. A. SORENSEN, 2002 Bayesian MCMC mapping of quantitative trait loci in a half-sib design: a graphical model perspective. Int. Stat. Rev. 70:241-267.
SILLANPÄÄ, M. J. and E. ARJAS, 1999 Bayesian mapping of multiple quantitative trait loci from incomplete outbred offspring data. Genetics 151:1605-1619.
SOBEL, E. and K. LANGE, 1996 Descent graphs in pedigree analysis: applications to haplotyping, location scores, and marker-sharing statistics. Am. J. Hum. Genet. 58:1323-1337.
THOMAS, A., A. GUTIN, V. ABKEVICH, and A. BANSAL, 2000 Multilocus linkage analysis by blocked Gibbs sampling. Stat. Comput. 10:259-269.
THOMPSON, E. A., 1986 Pedigree Analysis in Human Genetics. Johns Hopkins University Press, Baltimore.
THOMPSON, E. A., 1994 Monte Carlo likelihood in genetic mapping. Stat. Sci. 9(3):355-366.
THOMPSON, E. A., 2000a MCMC estimation of multi-locus genome sharing and multipoint gene location scores. Int. Stat. Rev. 68(1):53-73.
THOMPSON, E. A., 2000b Statistical Inference from Genetic Data on Pedigrees, NSF-CBMS Regional Conference Series in Probability and Statistics, Vol. 6. Institute of Mathematical Statistics and the American Statistical Association, Beachwood, OH.
THOMPSON, E. A., 2001 Monte Carlo methods on genetic structures, pp. 176-218 in Complex Stochastic Systems, edited by O. E. BARNDORFF-NIELSEN, D. R. COX and C. KLÜPPELBERG. Chapman & Hall, London.
THOMPSON, E. A. and S. C. HEATH, 2000 Estimation of conditional multilocus gene identity among relatives, pp. 95-113 in Statistics in Molecular Biology and Genetics, IMS Lecture Notes, edited by F. SEILLER-MOISEIWITSCH. Institute of Mathematical Statistics, American Mathematical Society, Providence, RI.
WANG, T., R. L. FERNANDO, C. STRICKER, and R. C. ELSTON, 1996 An approximation to the likelihood for a pedigree with loops. Theor. Appl. Genet. 93:1299-1309.