- THIS ARTICLE
-
Abstract
- Full Text (PDF)
- Alert me when this article is cited
- Alert me if a correction is posted
- SERVICES
- Email this article to a friend
- Similar articles in this journal
- Similar articles in PubMed
- Alert me to new issues of the journal
- Download to citation manager
- Reprints & Permissions
- CITING ARTICLES
- Citing Articles via HighWire
- Citing Articles via Google Scholar
- GOOGLE SCHOLAR
- Articles by George, A. W.
- Articles by Haley, C. S.
- Search for Related Content
- PUBMED
- PubMed Citation
- Articles by George, A. W.
- Articles by Haley, C. S.
Mapping Quantitative Trait Loci in Complex Pedigrees: A Two-Step Variance Component Approach
Andrew W. Georgea, Peter M. Visscherb, and Chris S. Haleyaa Roslin Institute, Midlothian EH25 9PS, United Kingdom
b Institute of Cell, Animal and Population Biology, Edinburgh EH9 3JG, Scotland
Corresponding author: Andrew W. George, Division of Genetics and Biometry, Roslin Institute (Edinburgh), Roslin, Midlothian EH25 9PS, United Kingdom., andrew.george{at}bbsrc.ac.uk (E-mail)
Communicating editor: T. F. C. MACKAY
| ABSTRACT |
|---|
There is a growing need for the development of statistical techniques capable of mapping quantitative trait loci (QTL) in general outbred animal populations. Presently used variance component methods, which correctly account for the complex relationships that may exist between individuals, are challenged by the difficulties incurred through unknown marker genotypes, inbred individuals, partially or unknown marker phases, and multigenerational data. In this article, a two-step variance component approach that enables practitioners to routinely map QTL in populations with the aforementioned difficulties is explored. The performance of the QTL mapping methodology is assessed via its application to simulated data. The capacity of the technique to accurately estimate parameters is examined for a range of scenarios.
WITH the widespread usage of genetic markers in helping to detect and localize quantitative trait loci (QTL), marker data are becoming available on human and livestock populations with increasingly complex pedigree structures. QTL analysis in such populations is challenging because the number of alleles segregating at the QTL is unknown, the marker phases may be unknown or only partially known, the marker and QTL allele frequencies must be estimated from the data, inbreeding loops may exist in the pedigree, and markers may be noninformative or ungenotyped. Although it is possible to simplify the analysis of complex pedigree data by fragmenting the pedigree into smaller component families, methods that fully account for complex relationships between individuals are expected to provide greater power to detect QTL (![]()
Literature surrounding the mapping of QTL in populations with complex pedigrees can be classified according to the allelic assumptions associated with the QTL. Mapping methods either assume the QTL is a fixed effect with a finite number of alleles or a random effect with an infinite number of alleles.
Analysis of statistical models that treat the QTL as a fixed effect range from simple regression-based methodologies (![]()
![]()
![]()
![]()
![]()
Random effects models, based upon the simple premise that individuals of like phenotype are more likely to share genes identical-by-descent (IBD), offer a less parameterized statistical environment in which to map QTL. This environment is obtained by assuming the QTL effects are normally distributedan assumption that circumvents the estimation of QTL allele frequencies and is robust to violation (![]()
Random effects models have long been utilized by human geneticists interested in partitioning the genetic variance of quantitative traits into effects due to specific chromosomal regions. As early as the 1970s, variance component approaches (i.e., analytical methods that estimate the parameters of random effects models) were being used to detect QTL in phase-known pedigrees (![]()
![]()
![]()
In contrast to the long association human geneticists have had with random effects models, animal geneticists' acceptance of QTL as random effects is relatively recent. ![]()
![]()
![]()
![]()
![]()
![]()
The aim of this article is to present to the animal genetics community a new two-step variance component method that is capable of routinely mapping QTL in populations with considerable missing marker information and complex pedigree structures. The methodology is based upon an interval mapping procedure and begins by utilizing available marker data and pedigree information to calculate the (co)variance matrices associated with a QTL at a particular position on the genome. Once the (co)variance matrix is calculated, the mixed linear model is constructed and parameter estimates are obtained. This two-step process of calculating the (co)variance matrix and estimating the parameters of the mixed linear model is repeated for each position on the genome. A test statistic measuring QTL presence is then obtained from which position and size can also be determined. The ability of this method to analyze complex pedigree data is owed to the recently upgraded and freely available software package Loki (![]()
| MATERIALS AND METHODS |
|---|
Mixed linear models:
When constructing a mixed linear model that accounts for a QTL, the quantitative trait is commonly assumed to be controlled by a linear combination of fixed effects, putative QTL, and additive residual (polygenic) effects. The polygenic effects account for the cumulative result of all loci affecting the quantitative trait that are unlinked to the QTL. Mixed linear models can be constructed at the animal or gametic level. In this article, an animal model is presented, which, in matrix notation, is defined as
![]() |
(1) |
where y is an (m x 1) vector of phenotypes, X is an (m x s) design matrix, ß is a (s x 1) vector of fixed effects, Z is an (m x q) incidence matrix relating animals to phenotypes, u is a (q x 1) vector of additive polygenic effects, v is a (q x 1) vector of additive QTL effects, and e is a residual vector.
The random effects u, v, and e are assumed to be uncorrelated and distributed as multivariate normal densities as follows: u
Nq(0, A
2u), v
Nq(0, G
2v), and e
Nm(0, R
2e), where the scalar variances
2u,
2v, and
2e are the polygenic variance, the additive variance of the QTL, and the residual variance, respectively; A is the standard additive genetic relationship matrix; G is the (q x q) (co)variance matrix for the additive effects of the QTL conditional on marker information; and R is a known (m x m) diagonal matrix. If individuals i and j are noninbred, then gij
G represents the proportion of alleles IBD at the QTL. However, if i and j are inbred, gij is interpreted as twice the coefficient of coancestry for the QTL. The coefficient of coancestry is defined as the probability that two randomly drawn alleles from individuals i and j are IBD (![]()
![]()
When no QTL is assumed to be segregating in the population, the mixed linear model in matrix notation becomes
![]() |
(2) |
with u
Nq(0, A
2u) and e
Nm(0, R
2e).
Calculating the IBD probabilities for the G matrix:
In practice, QTL genotypes are unobservable. Instead, linked markers are genotyped and used to infer QTL IBD status. The marker information in complex pedigrees is often incomplete. Unknown linkage phases, noninformative markers, and/or missing marker genotypes complicate the calculation of G. Several methods for calculating IBD probabilities in complex pedigrees have been developed. These methods fall into one of three classesrecursive algorithms, correlation-based algorithms, or simulation-based algorithms.
Recursive algorithms:
Recursive algorithms to calculate IBD probabilities for a QTL's gametic relationship matrix were developed by ![]()
![]()
s=m,p
t=m,pgisjt, where s, t
{maternal (m), paternal (p)} and gisjt represents the probability of the sth parental gamete inherited from individual i being IBD to the tth parental gamete inherited from individual j. The calculation of the gametic IBD probabilities is based upon information from a single fully genotyped marker linked to a QTL. Extensions to linked phase-known marker data were made by ![]()
Recursive algorithms are an effective and economical way of calculating IBD probabilities given the availability of full marker information; however, this requirement is difficult to guarantee for complex pedigrees. ![]()
Correlation-based algorithms:
![]()
![]()
![]() |
(3) |
where B(r,
) is the correlation matrix between the proportion of alleles shared IBD at the fully genotyped marker and a putative QTL, r denotes the rth kinship relationship,
represents the Hadamard product, and GM is the (q x q) (co)variance matrix conditioned on and calculated at the marker M. ![]()
![]()
![]()
Simulation-based algorithms:
For pedigrees with incomplete marker information, direct application of recursive or correlation-based IBD algorithms is impossible. In this situation, G is often replaced by its expectation conditioned on the observed marker data (Mobs) such that
![]() |
(4) |
where
is a single phase-known marker configuration for the pedigree from the set of all possible marker configurations (
), G
is the (co)variance matrix for the QTL conditional on
, and Pr(
|Mobs) is the conditional probability of the complete marker configuration
given the observed data Mobs. The (co)variance matrix G
can now be estimated via one of the above IBD algorithms as if full marker data are available.
Calculating the expectation of G for pedigrees containing substantial missing data presents two computational challenges. First, the number of configurations in
is potentially large, thus the order of the summation in (4) makes the calculation infeasible. In practice, a Monte Carlo approximation is used (see ![]()
|Mobs) is intractable. Exact methods such as the ![]()
![]()
A plethora of MCMC algorithms have been developed for the exploration of
and thus approximation of Pr(
|Mobs). Among the simplest are the "single-site" approaches (![]()
![]()
stem from the observed marker data constraining the set of missing marker configurations. Not all marker configurations are consistent with Mendelian inheritance rules. Several more complex samplers (![]()
![]()
![]()
![]()
These difficulties prompted ![]()
![]()
). This then allowed Gs, the (co)variance matrix for a QTL conditioned on the segregation indicators, and Pr(s|Mobs) to be estimated where s
. Also the expectation of G can be easily calculated as E(G|Mobs) =
s
GsPr(s|Mobs). The space of segregation indicators is far less constrained than the space of missing genotypes, culminating in Monte Carlo chains with improved convergence and irreducibility properties. A single-site Metropolis-Hastings sampler was developed by ![]()
![]()
Segregation indicators and their use in estimating G:
Using notation consistent with ![]()
Consider the pedigree depicted in Fig 1, where ordered marker information is recorded on a single locus (i.e., x|y implies x is the allele inherited from the maternal parent and y is the allele inherited from the paternal parent). Shown are three different sets of segregation indicators consistent with the observed marker data and pedigree structure. These segregation indicators give possible allelic pathways through the pedigree. Since segregation events are not directly observable, several segregation patterns may be consistent for the same set of marker data.
|
By obtaining a large number of s with probability Pr(s|Mobs), these segregation indicators can be used to estimate IBD probabilities between any pair of individuals in the pedigree. For example, in the first and third segregation patterns in Fig 1, the maternal allele of individual 6 originates from (or is IBD to) the maternal allele of individual 1, while in the second segregation pattern, the maternal allele of individual 6 originates from (or is IBD to) the maternal allele of individual 2. Therefore, based upon these realizations, Pr(6m
2m|Mobs) =
and Pr(6m
1m|Mobs) =
, where
represents IBD.
Multiple-site segregation sampler:
A brief introduction to the multiple-site segregation sampler, as developed by ![]()
![]()
Very simply, the multiple-site segregation sampler is a cleverly designed Gibbs sampler (![]()
![]() |
(5) |
where si· = {Si
; l = 1, · · · , n} and Pr(si·|{sj·; j
i}, Mobs) is the probability of the segregation indicators across n loci at the ith segregation conditional on all other segregation indicators and observed marker data. Since the number of loci may be large, drawing realizations directly from the conditional distributions in (5) remains challenging. Therefore, ![]()
The first step involves moving through the genome, calculating locus by locus, cumulative probabilities for Sij. These probabilities are relatively easy to calculate recursively. Once all n cumulative probabilities have been obtained, the second step involves moving back down the genome, sampling Sij from a univariate density that is a function of the associated cumulative probability, the previous sampled segregation indicator (Si j+1), and the recombination rate between loci j and j + 1. In this way, si· can be sampled from its conditional distribution. By repeating these two steps for i = 1, ... , m, a realization from (5) is obtained.
Implementation of the multiple-site segregation sampler:
Implementation of the multiple-site segregation sampler is via an adapted version of the QTL mapping software Loki. Loki was originally designed for multipoint linkage analysis in general pedigrees using MCMC methods; however, it has since been modified for IBD probability calculation. The user supplies Loki with the pedigree structure, marker genotypes, marker positions, and QTL positions for which the IBD matrices are to be calculated. Dependent chains of IBD probabilities are then obtained for each QTL position. Convergence is determined by monitoring the IBD probabilities over the iteration number. Once the probabilities stabilize, the sampler is deemed to have reached convergence.
Two-step variance component approach:
The variance component approach to map QTL in complex pedigrees is composed of two distinct steps:
- Step 1. For each QTL position on the chromosomal segment, the (co)variance matrix for the QTL (i.e., G) is calculated.
- Step 2. For each position considered in step 1, construct the mixed linear models (1) and (2), obtain estimates of the parameters, and test for the presence of a QTL.
These steps are common to all interval mapping-based variance component methods; however, their implementation differs greatly among practitioners. For example, there are various approaches to calculating the G matrix that have already been discussed and there are numerous analytical and simulation techniques for estimating the parameters of a mixed linear model.
With regard to the implementation strategy adopted in this article, in step 1 the IBD probabilities for the G matrix are obtained via the multiple-site segregation sampler. In step 2 ASREML (![]()
Distribution of the test statistic:
Statistical theory states that log LR follows a
2 distribution with the degrees of freedom equal to the number of parameters being tested (![]()
![]()
When a single chromosomal position is being tested, log LR follows a 50:50 mixture distribution, where one mixture component is a peak at 0 and the other component is a
21 distribution (![]()
![]()
2 distribution with between 1 and 2 d.f. QTL detection, however, is often carried out over large chromosomal regions and even the entire genome. For these situations, the distribution of log LR under H0 is unclear.
Since this article deals with simulated data, it is possible to replicate data under the null hypothesis, construct the empirical distribution of log LR, and derive empirical threshold values in which to determine QTL presence. For real data, permutation methods (![]()
| SIMULATION STUDY |
|---|
The simulation study begins with the analysis of data generated under H0. By constructing a histogram of the test statistic over replicates, an empirical distribution of log LR is obtained. QTL presence at a chromosomal position is then determined in subsequent analyses by comparing the respective test statistic to the empirical threshold.
To investigate the performance of the two-step variance component method for mapping QTL in complex pedigrees, data are generated under four simulation setups. The first setup, referred to as the "benchmark" setup, involves the generation of fully genotyped, highly polymorphic marker data. A biallelic QTL that explains 10% of the total variation (i.e., h2v = 0.1) is segregating in the population. Setups A, B, and C then change a single feature of the benchmark setup, enabling the effect on the variance component method's performance to be assessed. The four setups considered in this study are summarized in Table 1 and are discussed in detail below together with the generation of data under H0.
|
Generation of data under H0:
Replicates are generated according to the benchmark setup and setup A (which are described below) but without a segregating QTL in the population. These two setups are equivalent except the benchmark setup is based upon a sheep pedigree where setup A is based upon a pig pedigree. This allows the sensitivity of the test statistic's distribution to changes in pedigree structure to be assessed.
Benchmark setup:
The pedigree structure is based upon a real pedigree created to explore copper deficiency in a selected sheep population. The original experiment contained over 2000 individuals; however, for the purposes of demonstrating the methodology, a subset of 500 individuals is selected. In reducing the pedigree's size, careful attention is given to maintaining the structure's original complex nature. The reduced pedigree consists of 269 related families spanning four generations with 1.8 offspring on average per mating. The pedigree structure contains no inbreeding.
The marker information consists of four polymorphic markers segregating with eight equally frequent alleles and placed on a chromosomal segment of length 60 cM at positions 0, 20, 40, and 60 cM. A biallelic QTL with alleles Q and q segregating at equal frequencies is then placed between the second and third markers at position 35 cM. If an individual inherits QQ from its parents, its phenotypic contribution due to the QTL is vi = m + a, where m = 0 and a = 13.5. If the individual's QTL genotype is heterozygous or qq, the individual's phenotypic contribution is vi = m + d or vi = m - a, respectively, where d = 0. ![]()
The polygenic contribution made by an individual is dependent upon the polygenic contributions of its parents and Mendelian sampling. Since the complete parentage of every individual in the pedigree is not available, ui is generated according to the number of known parents. If both parents are unknown (i.e., the individual is a founder), then ui
N(0,
2u). Suppose, however, the sire of the ith individual (si) is in the pedigree. Then ui
N(0.5usi, (0.75 - 0.25 fsi)
2u), where fsi is the inbreeding coefficient of si. A similar distribution is used to generate ui when only the individual's dam is in the pedigree. If both parents are present, ui
N(0.5(usi + uDi), 0.5(1 - 0.5(fsi + fDi)
2u)), where fDi is the inbreeding coefficient for the dam of individual i. The environmental error term ei is generated from N(0,
2e). Fixed effects are not generated; thus Xß in (1) and (2) equals µ, the overall mean.
The value of
2v is dependent upon m, a, d, and pQ (the Q allele frequency), where
2v = 2pQ(1 - pQ)a2 when d = 0 (![]()
2v, the size of the QTL is characterized by the amount of total variation explained by the QTL. To obtain a QTL explaining 10% of the total variation,
2u and
2e are set to 300.0 and 500.0, respectively. For these values, the heritability of the trait is 44%.
Setup A:
The ability of the variance component method to map QTL in a pedigree with large numbers of offspring per mating and inbreeding is investigated. The pedigree used in this study is again based upon a real structure originating from a Meishan pig experiment. The initial experiment contained
2500 related individuals, but for meaningful comparisons to be made with the benchmark analysis, 500 related individuals are selected. The average number of offspring per family is 14.3 across five generations of matings, consisting of 35 related families. The average inbreeding coefficient is 4.5% with a maximum inbreeding coefficient of 17%.
Setup B:
Complex pedigrees often contain individuals with missing marker genotypes. This missing information introduces uncertainty into the analysis. To better understand the ability of the methodology to cope with this uncertainty, two approaches to removing the marker information generated according to the benchmark setup are explored. The first approach is where 50% of the marker genotypes are randomly removed. The second approach removes only the marker information of offspring that are not themselves parents. This results in a 53% loss in marker information.
Setup C:
The final setup investigates the effect a reduction in marker informativeness has upon the analysis. Data are generated according to the benchmark setup except three alleles as opposed to eight alleles are segregating at the markers.
| RESULTS |
|---|
Results from the application of the two-step variance component method to replicated data generated under the above-described simulation study are now reported. Due to the analyses being computationally demanding, only every third centimorgan is tested for the presence of a QTL. A single analysis across the chromosome can take up to 56 min on a Compaq Professional Workstation XP1000 utilizing a single Alpha 21264 processor running at 500 MHz. Four parallel analyses per replicate are performed, where Loki and ASREML runs begin from different well-dispersed starting values. The empirical distributions of the test statistic, however, are created from the analysis of 500 replicated data sets; therefore only a single run is performed per replicate due to obvious computational constraints.
Construction of the empirical distribution of log LR under H0:
Fig 2A reveals close agreement between the empirical and theoretical (i.e., 50:50 mixture where one component mixture is a peak at 0 and the other is a
21) distributions of log LR when a single position on the chromosome is tested for QTL presence (i.e., the 35-cM position). This was found to hold regardless of the position being tested.
|
In Fig 2B, when a chromosome-wide QTL search is performed, the empirical distributions appear to follow a
21. However, the 5% threshold value obtained from
21 is less conservative than the empirical thresholds. The 5% empirical threshold values are 4.74 and 4.32 for the benchmark setup and setup A, respectively, where
21,0.05 = 3.84. Empirical thresholds are used throughout this article. It is interesting to note that the empirical distributions are relatively unaffected by changes in pedigree structure.
Benchmark setup:
The mean log LR profile over 50 replications of data generated under the benchmark setup is shown in Fig 3. The profile peaks between markers 2 and 3 at the position of the simulated QTL (i.e., 35 cM). The mean peak is well below the 5% empirical threshold; however, this result is slightly misleading. The peak of the mean profile is biased downward because the estimated position, and thus corresponding peak of the profile, varies across replicates. In fact, 48% of the analyses yield a test statistic along the chromosomal segment in excess of the 5% threshold.
|
The ability of the methodology to accurately estimate the parameters of interest can be gauged from the results presented in Table 2. The mean parameter estimates of h2v, h2u,
2e, and dQ, where dQ represents the location of the QTL in centimorgans, are close to the simulated values; the parameter estimates' standard deviations (SD) are reasonable; and the mean between-run variance is small. However, a more appropriate measure of accuracy is the mean bias. The mean bias, E(
-
), is defined as the expected value of the difference between the estimator (
) and the parameter's true value (
). For example, the mean bias of the estimate of h2v is E(
2v - h2v) =
, where (
2v)ij represents the REML estimate of h2v from the analysis of the ith replicate and the jth parallel run, and (h2v)i represents the true parameter value of h2v.
|
For the parameters h2u,
2e, and dQ in Table 2 the mean bias is small and negative, implying a slight underestimating of the parameters. For h2v, the mean bias of
0.04 suggests that the method tends to overestimate the amount of variation explained by the QTL. A similar result is evident in ![]()
Setup A:
Increasing the average family size has an obvious effect on the performance of the methodology as evidenced in Fig 3. With larger families, the peak of the log LR profile (based upon 50 replicates) increases from 3.9 to 11.0, where 82% of the analyses yield a log LR in excess of the 5% empirical threshold. Once again, the parameters are well estimated (see Table 3), with mean biases slightly smaller than the biases obtained under the benchmark setup.
|
Setup B:
In setup B, five patterns of missing marker data are analyzed. Patterns 14 correspond to the random removal of 50% of the marker genotypes while pattern 5 is obtained by only genotyping the parents, which constitutes a 53% loss in marker genotypes. The proportions of individuals in the pedigree with 0, 1, 2, 3, and 4 missing marker genotypes, for each pattern, are given in Table 4. Each pattern consists of 50 replicates. These replicates, before the marker data are removed, are the same as those replicates generated under the benchmark setup. Thus, differences between the results obtained under setup B and the benchmark setup can be directly attributed to the effect of missing marker information.
|
The mean log LR profiles for patterns 15 together with the mean log LR profile for the benchmark setup are shown in Fig 4. There are two points of interest to note with respect to this figure. First, the mean log LR profile for data with partially genotyped markers lies below the profile obtained with complete marker information. Less marker information introduces extra uncertainty into the analysis and the method's ability to detect QTL decreases. In fact, the percentages of analyses yielding a log LR in excess of the 5% empirical threshold are only 24, 36, 24, 28, and 20% for patterns 15, respectively, well below the 48% achieved when the same data contain completely genotyped individuals.
|
Second, no real difference exists between the performance of the method across patterns 14. However, the mean profile for pattern 5, where only the parents are genotyped, does appear to differ from the other log LR profiles.
The difference in the method's performance across the five patterns of missing marker data is further evidenced in Table 5. The SD, average between-run variance, and mean bias are marginally higher for patterns 14 than they are under the benchmark stepup (given in Table 2). For pattern 5, though, the method struggles to obtain reasonable parameter estimates, with the mean bias for h2v being more than double the mean bias obtained under the other patterns.
|
Setup C:
The impact of less informative markers on the ability of the variance component method to detect QTL is evident from Fig 5. The mean log LR profile is well below the 5% threshold with only 26% of the analyses yielding a significant peak log LR. The parameter estimates (shown in Table 6), however, are similar to the estimates obtained under the benchmark setup with highly informative markers. Thus, the use of less polymorphic markers imparts greater uncertainty into the detection of QTL as opposed to the estimation of QTL.
|
|
| DISCUSSION |
|---|
To date, several statistical approaches have been developed to map QTL in outbred livestock populations; however, these methods focus on granddaughter or half-sib designs and are not easily extended to more challenging pedigree structures. In this article, a two-step variance component method is presented that is capable of detecting and estimating QTL in populations with complex pedigrees and considerable missing marker information. The methodology is illustrated through its application to simulated sheep and pig populations.
By formulating the QTL mapping problem within a mixed linear model framework, a less parameterized statistical environment is obtained, reducing the computational burden of the analysis. The complex relationships that may exist between individuals are included within the model, leading to more accurate parameter inferences, and additional fixed and random effects can be easily incorporated into the analysis with minimal adjustment to the methodology.
For example, to simultaneously map two linked QTL, the mixed linear model becomes y = Xß + Zu + Zv1 + Zv2 + e, where v1 and v2 are the additive effects of the linked QTL. Analogous to the two-step process of mapping a single QTL, Loki is used to calculate the (co)variance matrices of v1 and v2 at two separate test positions along the chromosome. Estimates of the parameters are then obtained via ASREML and the test statistic for the presence of two linked QTL is constructed. This process is repeated for each pair of test positions on the chromosome, enabling multiple QTL to be detected and localized. Note, when two QTL are being mapped, the QTL profile is a two-dimensional surface.
A two-step process to estimating the variance components of a mixed linear model is not new per se. ![]()
![]()
![]()
A variety of algorithms are available for the calculation of REML estimates. Standard algorithms such as ASREML require the inverse of the QTL's (co)variance matrix, which is singular at marker loci. In this article, G and thus the test statistic are calculated at a position slightly to the right or left of the marker, an approach also adopted by I. HOESCHELE (personal communication). ![]()
![]()
|
Pivotal to the success of any interval mapping procedure is the calculation of an appropriate threshold value in which QTL are declared significant. The threshold value is dependent upon the distribution of the test statistic, which is known for a single point [i.e., 50:50 mixture where one component mixture is a peak at 0 and the other is a
21 (![]()
)% threshold value using the distribution of the test statistic at a single point but with the level of confidence adjusted to
/N% (i.e., the Bonferroni correction for multiple testing). See ![]()
![]()
Problems also surround the construction of confidence intervals for QTL position estimates. These problems are not unexpected given that the construction of such intervals is challenging in even simple pedigrees. For an approximate confidence interval, the LOD drop-off method could be employed and more accurate confidence intervals obtained under parametric and/or nonparametric bootstrapping methods. However, as with permutation testing, resampling for nonparametric bootstrapping methods may be difficult. Clearly, further research is needed to resolve these issues.
This article has been catalytic to initiating work in three further areas of research. First, the simulation study suggests partial marker information on most individuals is to be desired over having a mixture of fully genotyped and completely ungenotyped individuals. This is currently being explored in greater detail for a range of missing marker scenarios. Second, a new recursive algorithm to calculate IBD probability in complex pedigrees has been developed and is currently being tested. Third, the methodology is to be applied to the analysis of real sheep and beef cattle data.
| ACKNOWLEDGMENTS |
|---|
The authors thank Simon Heath for his many useful comments and fine-tuning of Loki. This work was partly supported by a Biotechnology and Biological Sciences Research Council award.
Manuscript received May 12, 2000; Accepted for publication July 27, 2000.
| LITERATURE CITED |
|---|
ALMASY, L. and J. BLANGERO, 1998 Multipoint quantitative-trait linkage analysis in general pedigrees. Am. J. Hum. Genet. 62:1198-1211[Medline].
AMOS, C. I., 1994 Robust variance-components approach for assessing genetic linkage in pedigrees. Am. J. Hum. Genet. 54:535-543[Medline].
CANNINGS, C., E. A. THOMPSON, and M. H. SKOLNICK, 1978 Probability functions on complex pedigrees. Adv. Appl. Prob. 10:26-61.
CHURCHILL, G. and R. DOERGE, 1994 Empirical threshold values for quantitative trait mapping. Genetics 138:963-971[Abstract].
ELSTON, R. C. and J. STEWART, 1971 A general model for the genetic analysis of pedigree data. Hum. Hered. 21:523-542[Medline].
FALCONER, D. S., 1989 Introduction to Quantitative Genetics, Ed. 3. Longman Scientific and Technical, Essex, UK.
FERNANDO, R. L. and M. GROSSMAN, 1989 Marker assisted selection using best linear unbiased prediction. Genet. Sel. Evol. 21:467-477.
FULKER, D. W., S. S. CHERNY, and L. R. CARDON, 1995 Multipoint interval mapping of quantitative trait loci, using sib pairs. Am. J. Hum. Genet. 56:1224-1233[Medline].
GEMAN, S. and D. GEMAN, 1984 Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6:721-741.
GEORGE, A. W., K. L. MENGERSEN, and G. P. DAVIS, 2000 Localisation of a quantitative trait locus via a Bayesian approach. Biometrics 56:40-51[Medline].
GEYER, C. J. and E. A. THOMPSON, 1995 Annealing Markov chain Monte Carlo with applications to ancestral inference. J. Am. Stat. Assoc. 90:909-920.
GILMOUR, A. R., B. R. CULLIS, S. J. WELHAM and R. THOMPSON, 1998 ASREML. Program User Manual. Orange Agricultural Institute, New South Wales, Australia.
GRIGNOLA, F. E., I. HOESCHELE, and B. TIER, 1996a Mapping quantitative trait loci in outcross populations via residual maximum likelihood. I. Methodology. Genet. Sel. Evol. 28:479-490.
GRIGNOLA, F. E., I. HOESCHELE, Q. ZHANG, and G. THALLER, 1996b Mapping quantitative trait loci in outcross populations via residual maximum likelihood. I. A simulation study. Genet. Sel. Evol. 28:491-504.
GRIGNOLA, F. E., Q. ZHANG, and I. HOESCHELE, 1997 Mapping linked quantitative trait loci via residual maximum likelihood. Genet. Sel. Evol. 29:529-544.
HEATH, S. C., 1997 Markov chain Monte Carlo segregation and linkage analysis for oligogenic models. Am. J. Hum. Genet. 61:748-760[Medline].
HOESCHELE, I., 1993 Elimination of quantitative trait loci equations in an animal model incorporating genetic marker data. J. Dairy Sci. 76:1693-1713[Abstract].
HOESCHELE, I., P. UIMARI, F. E. GRIGNOLA, Q. ZHANG, and K. M. GAGE, 1997 Advances in statistical methods to map quantitative trait loci in outbred populations. Genetics 147:1445-1457[Abstract].
JANSEN, R. C., D. L. JOHNSON, and J. A. M. VAN ARENDONK, 1998 A mixture model approach to the mapping of quantitative trait loci in complex populations with an application to multiple cattle families. Genetics 148:391-399
JAYAKAR, S. D., 1970 On the detection and estimation of linkage between a locus influencing a character and a marker locus. Biometrics 26:451-464[Medline].
JENSEN, C. S. and H. SHEEHAN, 1998 Problems with determination of noncommunicating classes for Monte Carlo Markov chain applications in pedigree analysis. Biometrics 54:416-425[Medline].
KNOTT, S. A., J. M. ELSEN, and C. S. HALEY, 1996 Methods for multiple-marker mapping of quantitative trait loci in half-sib populations. Theor. Appl. Genet. 93:71-80.
LANDER, E. and L. KRUGLYAK, 1995 Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat. Genet. 11:241-247[Medline].
LIN, S., E. THOMPSON, and E. WIJSMAN, 1993 Achieving irreducibility of the Markov chain Monte Carlo method applied to pedigree data. IMA J. Math. Appl. Med. Biol. 10:1-17
LIN, S., E. THOMPSON, and E. WIJSMAN, 1994 Finding noncommunicating sets for Markov chain Monte Carlo estimations on pedigrees. Am. J. Hum. Genet. 54:695-704[Medline].
LUND, M., and C. S. JENSEN, 1998 Multivariate updating of genotypes in a Gibbs sampling algorithm in the mixed inheritance model. Proceedings of the 6th World Congress on Genetics Applied to Livestock Production Science, Vol. 25. Armidale, Australia, pp. 521524.
MALÉCOT, G., 1948 Les Mathématiques de l'Hérédité. Masson et Cie, Paris.
SELF, S. G. and K. Y. LIANG, 1987 Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J. Am. Stat. Assoc. 82:605-610.
SHEEHAN, N., 1990 Genetic restoration on complex pedigrees. Ph.D. Thesis, University of Washington, Seattle.
STRAM, D. O. and J. W. LEE, 1994 Variance component testing in the longitudinal mixed effects model. Biometrics 50:1171-1177[Medline].
THOMPSON, E. A., 1994 Monte Carlo estimation of multilocus autozygosity probabilities. Proceedings of the 1994 Interface Conference, Fairfax Station, VA, pp. 498506.
THOMPSON, E. A., and S. C. HEATH, 1999 Estimation of conditional multilocus gene identity among relatives, pp. 95113 in Statistics in Molecular Biology, edited by F. SEILLIER-MOSEIWITCH, P. DONNELLY and M. WATERMAN. Springer-Verlag IMS lecture note series, New York.
UIMARI, P. and I. HOESCHELE, 1997 Mapping two linked quantitative trait loci using Bayesian analysis and Markov chain Monte Carlo algorithms. Genetics 146:735-743[Abstract].
VAN ARENDONK, J. A. M., B. TIER, and B. P. KINGHORN, 1994 Use of multiple genetic markers in prediction of breeding values. Genetics 137:319-329[Abstract].
VISSCHER, P. M., C. S. HALEY, S. C. HEATH, W. J. MUIR, and D. H. R. BLACKWOOD, 1999 Detecting QTLs for uni and bipolar disorder using a variance component method. Psych. Genet. 9:75-84.
WANG, T., R. L. FERNANDO, S. VAN DER BEEK, M. GROSSMAN, and J. A. M. VAN ARENDONK, 1995 Covariance between relatives for a marked quantitative trait locus. Genet. Sel. Evol. 27:251-274.
WILKS, S. S., 1938 The large sample distribution of the likelihood ratio for testing composite hypotheses. Ann. Math. Stat. 9:60-62.
XIE, C., D. G. GESSLER, and S. XU, 1998 Combining different line crosses for mapping quantitative trait loci using the identical by descent-based variance component method. Genetics 149:1139-1146
XU, S. and W. R. ATCHLEY, 1995 A random model approach to interval mapping of quantitative trait loci. Genetics 141:1189-1197[Abstract].
This article has been cited by other articles:
![]() |
J. Hernandez-Sanchez, J.-A. Grunchec, and S. Knott A web application to perform linkage disequilibrium and linkage analyses on a computational grid Bioinformatics, June 1, 2009; 25(11): 1377 - 1383. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. H. Sneller, D. E. Mather, and S. Crepieux Analytical Approaches and Population Types for Finding and Utilizing QTL in Complex Plant Populations Crop Sci., March 17, 2009; 49(2): 363 - 380. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Uemoto, Y. Nagamine, E. Kobayashi, S. Sato, T. Tayama, Y. Suda, T. Shibata, and K. Suzuki Quantitative trait loci analysis on Sus scrofa chromosome 7 for meat production, meat quality, and carcass traits within a Duroc purebred population J Anim Sci, November 1, 2008; 86(11): 2833 - 2839. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Sandor and M. Georges On the Detection of Imprinted Quantitative Trait Loci in Line Crosses: Effect of Linkage Disequilibrium Genetics, October 1, 2008; 180(2): 1167 - 1175. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. D. Daetwyler, F. S. Schenkel, M. Sargolzaei, and J. A. B. Robinson A Genome Scan to Detect Quantitative Trait Loci for Economically Important Traits in Holstein Cattle Using Two Methods and a Dense Single Nucleotide Polymorphism Map J Dairy Sci, August 1, 2008; 91(8): 3225 - 3236. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. Kolbehdari, Z. Wang, J. R. Grant, B. Murdoch, A. Prasad, Z. Xiu, E. Marques, P. Stothard, and S. S. Moore A Whole-Genome Scan to Map Quantitative Trait Loci for Conformation and Functional Traits in Canadian Holstein Bulls J Dairy Sci, July 1, 2008; 91(7): 2844 - 2856. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Sargolzaei, F. S. Schenkel, G. B. Jansen, and L. R. Schaeffer Extent of Linkage Disequilibrium in Holstein Cattle in North America J Dairy Sci, May 1, 2008; 91(5): 2106 - 2117. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Druet, S. Fritz, M. Boussaha, S. Ben-Jemaa, F. Guillaume, D. Derbala, D. Zelenika, D. Lechner, C. Charon, D. Boichard, et al. Fine Mapping of Quantitative Trait Loci Affecting Female Fertility in Dairy Cattle on BTA03 Using a Dense Single-Nucleotide Polymorphism Map Genetics, April 1, 2008; 178(4): 2227 - 2235. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. H. E. Meuwissen and M. E. Goddard Multipoint Identity-by-Descent Prediction Using Dense Markers to Map Quantitative Trait Loci and Estimate Effective Population Size Genetics, August 1, 2007; 176(4): 2551 - 2560. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. F. McRae, N. A. Matigian, L. Vadlamudi, J. C. Mulley, B. Mowry, N. G. Martin, S. F. Berkovic, N. K. Hayward, and P. M. Visscher Replicated effects of sex and genotype on gene expression in human lymphoblastoid cell lines Hum. Mol. Genet., February 15, 2007; 16(4): 364 - 373. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. J. Buitenhuis, M. S. Lund, J. R. Thomasen, B. Thomsen, V. H. Nielsen, C. Bendixen, and B. Guldbrandtsen Detection of Quantitative Trait Loci Affecting Lameness and Leg Conformation Traits in Danish Holstein Cattle J Dairy Sci, January 1, 2007; 90(1): 472 - 481. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Druet, S. Fritz, D. Boichard, and J. J. Colleau Estimation of genetic parameters for quantitative trait Loci for dairy traits in the French holstein population. J Dairy Sci, October 1, 2006; 89(10): 4070 - 4076. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. H. Lee and J. H. J. Van der Werf Using Dominance Relationship Coefficients Based on Linkage Disequilibrium and Linkage With a General Complex Pedigree to Increase Mapping Resolution Genetics, October 1, 2006; 174(2): 1009 - 1016. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. H. Lee and J. H. J. Van der Werf Simultaneous Fine Mapping of Multiple Closely Linked Quantitative Trait Loci Using Combined Linkage Disequilibrium and Linkage With a General Pedigree Genetics, August 1, 2006; 173(4): 2329 - 2337. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Macgregor, S. A. Knott, I. White, and P. M. Visscher Quantitative Trait Locus Analysis of Longitudinal Quantitative Trait Data in Complex Pedigrees Genetics, November 1, 2005; 171(3): 1365 - 1376. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. F. McRae, J. M. Pemberton, and P. M. Visscher Modeling Linkage Disequilibrium in Natural Populations: The Example of the Soay Sheep Population of St. Kilda, Scotland Genetics, September 1, 2005; 171(1): 251 - 258. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. Gao and I. Hoeschele Approximating Identity-by-Descent Matrices Using Multiple Haplotype Configurations on Pedigrees Genetics, September 1, 2005; 171(1): 365 - 376. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. C Thomas The estimation of genetic relationships using molecular markers and their efficiency in estimating heritability in natural populations Phil Trans R Soc B, July 29, 2005; 360(1459): 1457 - 1467. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y.-M. Zhang, Y. Mao, C. Xie, H. Smith, L. Luo, and S. Xu Mapping Quantitative Trait Loci Using Naturally Occurring Genetic Variance Among Commercial Inbred Lines of Maize (Zea mays L.) Genetics, April 1, 2005; 169(4): 2267 - 2275. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. Varona, O. Vidal, R. Quintanilla, M. Gil, A. Sanchez, J. M. Folch, M. Hortos, M. A. Rius, M. Amills, and J. L. Noguera Bayesian analysis of quantitative trait loci for boar taint in a Landrace outbred population J Anim Sci, February 1, 2005; 83(2): 301 - 307. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Szyda, Z. Liu, F. Reinhardt, and R. Reents Estimation of Quantitative Trait Loci Parameters for Milk Production Traits in German Holstein Dairy Cattle Population J Dairy Sci, January 1, 2005; 88(1): 356 - 367. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Crepieux, C. Lebreton, B. Servin, and G. Charmet Quantitative Trait Loci (QTL) Detection in Multicross Inbred Designs: Recovering QTL Identical-by-Descent Status Information From Marker Data Genetics, November 1, 2004; 168(3): 1737 - 1749. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. J. de Koning, R. Pong-Wong, L. Varona, G. J. Evans, E. Giuffra, A. Sanchez, G. Plastow, J. L. Noguera, L. Andersson, and C. S. Haley Full pedigree quantitative trait locus analysis in commercial pigs using variance components J Anim Sci, September 1, 2003; 81(9): 2155 - 2163. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. Freyer, P. Sorensen, C. Kuhn, R. Weikard, and I. Hoeschele Search for Pleiotropic QTL on Chromosome BTA6 Affecting Yield Traits of Milk Production J Dairy Sci, March 1, 2003; 86(3): 999 - 1008. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Slate, P. M. Visscher, S. MacGregor, D. Stevens, M. L. Tate, and J. M. Pemberton A Genome Scan for Quantitative Trait Loci in a Wild Population of Red Deer (Cervus elaphus) Genetics, December 1, 2002; 162(4): 1863 - 1873. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Perez-Enciso, A. Clop, J. M. Folch, A. Sanchez, M. A. Oliver, C. Ovilo, C. Barragan, L. Varona, and J. L. Noguera Exploring Alternative Models for Sex-Linked Quantitative Trait Loci in Outbred Populations: Application to an Iberian x Landrace Pig Intercross Genetics, August 1, 2002; 161(4): 1625 - 1632. [Abstract] [Full Text] [PDF] |
||||
- THIS ARTICLE
-
Abstract
- Full Text (PDF)
- Alert me when this article is cited
- Alert me if a correction is posted
- SERVICES
- Email this article to a friend
- Similar articles in this journal
- Similar articles in PubMed
- Alert me to new issues of the journal
- Download to citation manager
- Reprints & Permissions
- CITING ARTICLES
- Citing Articles via HighWire
- Citing Articles via Google Scholar
- GOOGLE SCHOLAR
- Articles by George, A. W.
- Articles by Haley, C. S.
- Search for Related Content
- PUBMED
- PubMed Citation
- Articles by George, A. W.
- Articles by Haley, C. S.





, a female;
, a male; and , the mating of two individuals. Individuals 4 and 7 have missing marker genotypes. The marker genotypes are ordered such that x|y signifies that allele x has been inherited from the maternal parent and allele y has been inherited from the paternal parent. Three segregation patterns (i.e., s1, s2, s3) consistent with the marker data are shown. These segregation patterns are vectors of segregation indicators and they give the possible allelic pathways through the pedigree. For example, the first segregation pattern for individual 3 infers that this individual inherited its mother's paternal allele and its father's maternal allele. Note that 1|1 is not a valid set of segregation indicators for individual 3 since the paternal marker allele of 2 is not passed on to individual 3.

,
, 
and
, marker location;
, the simulated position of the QTL.








