Abstract
Augmentation of marker genotypes for ungenotyped individuals is implemented in a Bayesian approach via the use of Markov chain Monte Carlo techniques. Marker data on relatives and phenotypes are combined to compute conditional posterior probabilities for marker genotypes of ungenotyped individuals. The presented procedure allows the analysis of complex pedigrees with ungenotyped individuals to detect segregating quantitative trait loci (QTL). Allelic effects at the QTL were assumed to follow a normal distribution with a covariance matrix based on known QTL position and identity by descent probabilities derived from flanking markers. The Bayesian approach estimates variance due to the single QTL, together with polygenic and residual variance. The method was empirically tested through analyzing simulated data from a complex granddaughter design. Ungenotyped dams were related to one or more sons or grandsires in the design. Heterozygosity of the marker loci and size of QTL were varied. Simulation results indicated a significant increase in power when ungenotyped dams were included in the analysis.
RECENT advances in molecular genetics technology have led to the availability of moderate-resolution genetic marker maps for plant and livestock species (e.g., Barendse et al. 1994). Animal and plant breeders are currently using these genetic markers to identify chromosomal regions containing quantitative trait loci (QTL; e.g., Paterson et al. 1988; Stuber et al. 1992; Andersson et al. 1994; Georges et al. 1995). The power of QTL detection is an important factor in the analysis of such experiments: the aim is to maximize the chance of detecting QTL while minimizing the risk of false positives.
Weller et al. (1990) outlined the granddaughter design to map QTL in dairy cattle. In this design, marker genotypes are determined for grandsires and their sons (paternal half sibs), and quantitative trait phenotypes are measured on daughters of sons. This scheme capitalizes on the existing structure in dairy cattle populations and minimizes the number of marker genotypes needed for a given power of detection (Weller et al. 1990). Traditional methods such as (multiple) linear regression and maximum likelihood interval mapping assume unrelated grandsire families and only two generations of genotyped individuals. However, relationships between families, such as related grandsires and maternal grandsons, frequently occur in outbred populations. Furthermore, available data may involve multiple generations of genotyped or phenotyped individuals. Exploiting all relationships between individuals and all information collected over generations is therefore an appropriate way to increase the power of QTL detection.
Parameter estimation in complex animal (and plant) breeding pedigrees may be tackled by Bayesian analysis; a comprehensive overview is given by Wang (1998). In Bayesian analysis, prior assumptions and the likelihood of the data at hand form the joint posterior density of all unknown variables in a model underlying the observed phenotypes. Markov chain Monte Carlo (MCMC) methods provide means for exploration of complex nonstandard joint densities, and marginal posterior densities of parameters of interest can be approximated. There are a variety of techniques for their implementation (Gelfand 1994), of which Gibbs sampling (Geman and Geman 1984) is the most commonly used. Bayesian linkage analysis in combination with MCMC methods has been applied in human genetics (e.g., Thomas and Cortessis 1992; Heath 1997a), in plant genetics (e.g., Satagopan et al. 1996; Sillanpää and Arjas 1998), and in animal genetics (e.g., Thaller and Hoeschele 1996a; Uimari et al. 1996; Hoeschele et al. 1997).
A second assumption in methods currently employed for QTL linkage analysis of half-sib or full-sib designs is that all individuals have observed marker genotypes. The incompleteness of marker data may be due to genotyping expenses or lack of DNA. This has hampered the implementation of a full pedigree evaluation in QTL mapping. Augmentation of missing genotypes via the Gibbs sampler has been suggested (e.g., Thomas and Cortessis 1992). However, when genotypes are missing on parents and the locus has more than two alleles, the Gibbs sampler may be theoretically reducible, i.e., it may not be able to reach all permissible genotypes from the starting configuration (e.g., Sheehan and Thomas 1993). This reducibility problem does not occur if at least one parent has observed marker genotypes, which may hold for dairy cattle data, where semen of sires is stored and available for DNA typing. In this study, we concentrate on this latter situation.
In this study a Bayesian approach is presented that estimates variance due to a single QTL, together with polygenic and residual variances, allowing ungenotyped individuals. We adapt the method of Jansen et al. (1998) to describe marker information on an individual in terms of allelic constitution of its homologues and identity by descent (IBD) values. We extend the genotype sampling approach of Bink et al. (1998a) from single marker to multiple linked markers. The approach described will be evaluated by the analysis of simulated data from a granddaughter design with many maternal ties between sons, and between sires and sons. Emphasis is on the accuracy of estimates of dispersion parameters. The position of the QTL relative to multiple linked markers is fixed in this study, and possibilities to estimate this parameter are discussed. We also discuss an extension of our approach to pedigrees with no restrictions on incompleteness of marker data.
MATERIALS AND METHODS
Marker genotypes: Consider a q-member population on which marker scores are observed. Let g_{i} denote the ith individual's genotype at all marker loci (excluding the QTL genotype). The genotype g includes full multilocus information about alleles and their IBD pattern, but this information can be observed only partially. For each possible genotypic configuration g on the population (that is, each configuration consistent with the observed marker scores) a scalar probability of occurrence may be calculated. The number of possible genotypic configurations increases exponentially when marker data are considered on many individuals and many marker loci with many missing marker scores. The Gibbs sampler has been successfully used to explore a large number of genotypic configurations and their probability of occurrence (e.g., Guo and Thompson 1992; Janss et al. 1995). Jansen et al. (1998) introduced different descriptions of the genotype of founders (that is, individuals with both parents unknown) and nonfounders in the population. They specified the genotypic state of any founder by the alleles at each of its homologues, and they expressed the state of any nonfounder by IBD values indicating parental origin of its alleles.
For illustration, consider the small pedigree in Table 1. Two founder individuals (individual 1 and 2) have observed marker scores and the linkage phase is assumed to be known for convenience (it limits the number of genotypic configurations that are consistent with observed marker scores). Marker alleles of these individuals are arbitrarily assigned to their first and second homologues, where first and second correspond to paternally and maternally inherited gametes, respectively. Note that these individuals do not have IBD values because their parents are unknown. On the basis of observed marker scores, three genotypes are allowed for the ungenotyped nonfounder (individual 3). For completeness, alleles of nonfounders' homologues are also given. Marker data may provide full information on the IBD pattern. For example, the allele a at marker 1 (and 2) for individual 4 is identical by descent to the first allele in its sire, and the IBD value equals 1. More often the IBD patterns are not constant, due to allelic switches in parent or offspring. For example, the IBD values for the allele c at marker 1 for individual 4 depend on the genotypes of individual 3. Note that for a homozygous parent, the IBD value of alleles transmitted to its offspring can be either 1 or 2.
The major advantage of the approach of Jansen et al. (1998) is that in each state of the Markov chain, each marker is informative for each offspring. This means that one can directly use information on the closest flanking markers and one does not have to search for informative segregations. Uncertainty about transmission of alleles is incorporated into the analysis by updating allelic constitution of genotypes in founders and by updating the IBD pattern for nonfounders, as is described later.
QTL model: In animal genetics models, allelic effects at the QTL in an outbred population may be represented by a normally distributed random effect, where covariances between allelic effects depend on identical by descent probabilities that are derived from marker information (Fernando and Grossman 1989; Van Arendonk et al. 1994; Wang et al. 1995). Let v denote the vector of additive effects of QTL alleles, containing 2q elements for q individuals. That is, two unique QTL allelic effects are fitted for each individual. For individual i, let ν^{p}_{i} and ν^{m}_{i} denote the paternally and maternally inherited QTL allele, respectively. Let P(a ≡ b) denote the probability that alleles a and b are identical by descent. Then we can write
Let G denote the gametic relationship matrix for the QTL (2q × 2q), where the (i,j) element represents the probability of QTL allele i being identical by descent to QTL allele j. Then, the conditional density of v can be given as
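As a minimal sketch of how a matrix such as G could be assembled, assume a recursive tabular rule in the spirit of Fernando and Grossman (1989): each allele of a nonfounder is a probability-weighted copy of the two alleles of the corresponding parent, and founder alleles are mutually unrelated. The function name, the pedigree encoding, and the transmission probabilities 0.9 and 0.5 are hypothetical and serve only as illustration; in practice these probabilities would follow from the flanking marker information.

```python
def gametic_relationship(pedigree):
    """Build a gametic relationship matrix (list of lists).

    pedigree: list of (sire, dam, p_sire, p_dam), parents listed before
    offspring; sire/dam are list indices or None for unknown parents.
    p_sire = P(paternal allele of the individual is IBD to the sire's
    paternal allele); p_dam is defined likewise for the dam.
    Alleles are indexed 2*i (paternal) and 2*i + 1 (maternal).
    """
    n = 2 * len(pedigree)
    G = [[0.0] * n for _ in range(n)]
    for i, (sire, dam, p_sire, p_dam) in enumerate(pedigree):
        for allele, parent, p in ((2 * i, sire, p_sire), (2 * i + 1, dam, p_dam)):
            G[allele][allele] = 1.0  # an allele is IBD to itself
            if parent is None:
                continue  # founder allele: unrelated to all earlier alleles
            for k in range(allele):
                # weighted average of the parent's two alleles
                value = p * G[2 * parent][k] + (1 - p) * G[2 * parent + 1][k]
                G[allele][k] = G[k][allele] = value
    return G


# Hypothetical trio: founder sire (0), founder dam (1), one offspring (2).
pedigree = [(None, None, None, None), (None, None, None, None), (0, 1, 0.9, 0.5)]
G = gametic_relationship(pedigree)
```

Here the offspring's paternal allele (index 4) is related to the sire's two alleles with probabilities 0.9 and 0.1, and its maternal allele (index 5) is related to each of the dam's alleles with probability 0.5.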
Updating marker genotypes: Three classes of individuals are distinguished when updating genotypic information: (1) genotyped founders (with offspring); (2) genotyped nonfounders; and (3) ungenotyped parents (ungenotyped nonparents are not considered). Examples in Table 1 of each category are individuals 1 and 2, individuals 4, 5, and 6, and individual 3, respectively. The sampling of genotypes is described for each of these categories in the subsequent section. Note that when sampling genotypes for markers flanking the QTL, observed trait phenotypes are taken into account via the individuals' QTL effects.
Category 1: genotyped founders: To take all possible linkage phases in the genotypes of genotyped founders into account, linkage phases are sampled interval by interval and founder by founder, as suggested by Jansen et al. (1998). For a particular set of two neighboring markers, e.g., j and (j + 1), one can use information on the individual, its mates, and offspring to calculate the conditional probabilities for two options, “phase switch” and “no phase switch,” and subsequently sample one of the options. In a phase switch, the distal part of its homologue 1 (marker j + 1 to end) is attached to the proximal part of homologue 2 (map origin to marker j) and vice versa. Also, the IBD values at the distal part of the chromosome in its offspring are switched (1 becomes 2 and vice versa).
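The phase switch described above can be sketched as follows. This is an illustrative sketch only: the function name and data layout are hypothetical, the conditional probabilities of the two options are not computed, and the allele labels are invented for the example. It shows the bookkeeping, namely that exchanging the distal parts of the two homologues while flipping the offspring IBD values at the same markers leaves every offspring allele unchanged.

```python
def phase_switch(founder, offspring_ibd, j):
    """Apply a phase switch in the interval between markers j and j + 1.

    founder: (hom1, hom2), two lists of alleles in map order.
    offspring_ibd: one list per offspring with IBD values (1 or 2) per
        marker, indicating which founder homologue each allele came from.
    j: 0-based index of the left marker of the interval.
    Returns the new founder homologues and the new offspring IBD values.
    """
    hom1, hom2 = founder
    # proximal part stays, distal part (marker j + 1 to end) is exchanged
    new_hom1 = hom1[: j + 1] + hom2[j + 1 :]
    new_hom2 = hom2[: j + 1] + hom1[j + 1 :]
    # distal IBD values in offspring are switched: 1 becomes 2 and vice versa
    new_ibd = [ibd[: j + 1] + [3 - v for v in ibd[j + 1 :]] for ibd in offspring_ibd]
    return (new_hom1, new_hom2), new_ibd


def inherited_alleles(founder, ibd):
    """Alleles an offspring received from the founder, marker by marker."""
    return [founder[v - 1][m] for m, v in enumerate(ibd)]


# Hypothetical founder with three markers and two offspring.
founder = (["a", "b", "c"], ["A", "B", "C"])
ibd = [[1, 1, 2], [2, 2, 2]]
new_founder, new_ibd = phase_switch(founder, ibd, j=0)
```

Because the observed alleles are invariant under the switch, the two options differ only in the number of recombinations they imply, which is what the conditional probabilities weigh.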
Updating of linkage phase for the marker interval containing the QTL actually involves two interval updates, i.e., the interval “left flanking marker—QTL” and “QTL—right flanking marker.” The conditional probabilities of the two linkage phases now also include information from the random QTL, using Equation 5 (the QTL has no IBD patterns). For the left interval, the phase switch option involves a switch in founder QTL effects. This affects the computation of Equation 5, and in a phase switch the founder QTL effects do switch (nothing changes for the QTL effects in its offspring). For the right interval, order of QTL effects within a founder is unaffected.
Category 2: genotyped nonfounders: To generate complete genotypes of nonfounders, one can sample a new IBD pattern given genotype of parents. This can be done individual by individual and marker locus by marker locus. If we update the IBD at a certain marker locus, then the two flanking marker loci (with “known” IBD) are fully informative and no other marker loci are needed. We consider at most four IBD patterns (two per known parent), discarding the ones inconsistent with the individual's marker score. The IBD values of the individual's offspring are used when one of the consistent IBD patterns for the individual involves an allelic switch in the individual. When only one parent is known, population allelic frequencies are used. When the individual's alleles are switched (heterozygous), its offspring's IBD values are switched as well (1 becomes 2 and vice versa).
When a marker flanks the QTL, the conditional probabilities include information on the QTL effects of the individual and its parents (and also of its offspring if one of the consistent IBD patterns causes an allele switch within the individual) by using Equation 5 for each consistent IBD pattern.
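The marker-only part of this update can be illustrated with a small sketch for the gamete received from one parent. It assumes no interference, so that the IBD value at the marker being updated depends only on the IBD values at the two flanking markers; the contribution of the QTL effects (Equation 5) and of offspring is deliberately left out. The function name and the recombination fractions used in the example are hypothetical.

```python
def ibd_probabilities(left_ibd, right_ibd, r_left, r_right, consistent=(1, 2)):
    """Conditional probabilities of the IBD value (1 or 2) at one marker.

    left_ibd, right_ibd: current IBD values at the flanking markers.
    r_left, r_right: recombination fractions with the left and right flank.
    consistent: IBD values compatible with the individual's marker score
        (assumed to contain at least one value).
    """
    weights = {}
    for value in (1, 2):
        if value not in consistent:
            weights[value] = 0.0  # pattern discarded as inconsistent
            continue
        w_left = (1 - r_left) if value == left_ibd else r_left
        w_right = (1 - r_right) if value == right_ibd else r_right
        weights[value] = w_left * w_right
    total = sum(weights.values())
    return {value: w / total for value, w in weights.items()}


# Both flanks indicate homologue 1: a double recombinant is unlikely.
probs_same = ibd_probabilities(1, 1, r_left=0.1, r_right=0.1)
# Flanks disagree: with equal distances both values are equally likely.
probs_diff = ibd_probabilities(1, 2, r_left=0.1, r_right=0.1)
```

Under these assumptions the joint probability of a pattern for both parents is the product of the two single-parent probabilities, which gives the (at most) four patterns mentioned in the text.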
Category 3: ungenotyped parents: This is the most complicated category because genotypes cannot be updated individual by individual. To illustrate this, suppose a sire with genotype a/b, an ungenotyped dam, and their two offspring (g_{o1} = a/b, g_{o2} = a/c). Starting with g_{d} = b/c, the first offspring will have a/b, i.e., the a allele at its paternal homologue and the b allele at its maternal homologue. Then, updating individual by individual will not allow a switch to the configuration g_{d} = a/c that would be consistent with the first offspring having b/a instead of a/b. To avoid this problem, we update an ungenotyped parent and its offspring in a block, allowing an allelic switch in the offspring. This allelic switch must of course be consistent with the other parent's marker genotype. The genotype for the ungenotyped parent is sampled from its marginal (with regard to its offspring) distribution, and the IBD of its offspring is subsequently updated from its full conditional (with regard to the parent) distribution. Updates are done marker locus by marker locus. When one or both parents (of the ungenotyped parent) are unknown, the conditional probabilities also involve population allelic frequencies. Note that for an augmented homozygous genotype, the offspring's IBD value may equal 1 or 2 and both values are taken into account. This also holds for an augmented heterozygous genotype when parent and offspring have the same alleles. When a marker flanks the QTL, the conditional probabilities include information from the QTL using Equation 5. After updating an ungenotyped parent, its genotyped offspring are updated (as described under category 2).
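The example of the sire a/b with offspring a/b and a/c can be checked by brute-force enumeration. The sketch below is illustrative only: the function name is hypothetical, a single marker is considered, and no probabilities are attached to the genotypes. It lists the dam genotypes for which every offspring can receive one allele from the sire and the other from the dam.

```python
from itertools import combinations_with_replacement


def consistent_dam_genotypes(sire, offspring, alleles):
    """Unordered dam genotypes compatible with the sire and all offspring.

    sire: tuple of the sire's two alleles.
    offspring: list of tuples with each offspring's two observed alleles.
    alleles: collection of alleles segregating at the marker.
    """
    result = []
    for dam in combinations_with_replacement(sorted(alleles), 2):
        ok = all(
            any(
                paternal in sire and maternal in dam
                for paternal, maternal in ((o[0], o[1]), (o[1], o[0]))
            )
            for o in offspring
        )
        if ok:
            result.append(dam)
    return result


dam_options = consistent_dam_genotypes(
    sire=("a", "b"), offspring=[("a", "b"), ("a", "c")], alleles={"a", "b", "c"}
)
# dam_options == [("a", "c"), ("b", "c")]
```

Both b/c and a/c are permissible, but they imply opposite parental origins for the alleles of the first offspring, which is why the dam and her offspring have to be updated jointly.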
Allele frequencies: The allelic frequencies at a particular marker locus in a population are likely unknown and can be treated as such. Let η_{mi} denote the counts of allele i at marker locus m at “founder” homologues, i.e., homologues of founders plus the nonparental homologue of nonfounders with only one parent identified. Then, allelic frequencies at each marker locus can be sampled from a Dirichlet distribution with parameters η_{mi} + 1 (for Dirichlet distribution, see Gelman et al. 1995, p. 482).
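A draw from this Dirichlet distribution can be obtained by normalizing independent gamma variates, a standard construction. The sketch below uses only the Python standard library; the function name, the seed, and the allele counts are hypothetical.

```python
import random


def sample_allele_frequencies(counts, rng):
    """Draw allele frequencies from Dirichlet(counts + 1) for one marker.

    counts: allele counts at "founder" homologues, one entry per allele.
    rng: a random.Random instance.
    """
    # Dirichlet draw = independent Gamma(shape, 1) draws divided by their sum
    draws = [rng.gammavariate(count + 1, 1.0) for count in counts]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(1998)
# Hypothetical counts for a marker with four alleles.
frequencies = sample_allele_frequencies([12, 9, 11, 8], rng)
```

Adding 1 to each count corresponds to a uniform prior over the allele frequencies, so an allele that is currently unobserved at founder homologues still receives a positive sampled frequency.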
Mixed linear model: Let b be a vector of fixed effects, and let u be a q × 1 vector of residual additive (polygenic) effects (not linked to the marker linkage group under consideration). Then the model underlying the N phenotypes is given as
The model is parameterized in terms of the heritability,
In the remainder of the article γ is referred to as proportion of QTL. In this study, the QTL position relative to the origin of the marker map is assumed known, but this assumption may be removed as shown by Bink et al. (1998c).
Prior knowledge on dispersion parameters: Different priors may be useful to explore the amount of information coming from the data for a particular parameter in the model. In a previous study, Bink et al. (1998b) showed that the posterior density of γ was clearly affected by using different beta distributions to represent prior knowledge on the proportion of QTL (γ), indicating lack of information on γ from the data. In this study, two beta distributions are considered to represent prior knowledge on γ. A beta (1,1) prior is uniform between 0 and 1 with mean equal to 0.5, and is denoted UNIFORM. A beta (1,9) prior has the mode at zero with mean equal to 0.10 and is denoted PEAKED AT ZERO. On the basis of Bink et al. (1998b), priors on h^{2} and
Implementation of MCMC sampling: Bayesian inferences about the parameters are here computed using the Gibbs sampler and the Metropolis-Hastings (MH) algorithm (Metropolis et al. 1953; Hastings 1970) on the basis of the joint posterior distribution of the missing data and the parameters given the observed data (y) and marker data (m). The missing data are the fixed effects (b), the random QTL (v) and polygenic (u) effects, and marker genotypes (i.e., linkage phase between alleles at the markers and marker scores for ungenotyped individuals). Now let θ denote {b,u,v,h^{2},γ,
To reduce the number of genetic effects (polygenic and QTL) that must be sampled (in a granddaughter design), a reduced animal model (RAM; Quaas and Pollak 1980) is used. That is, the genetic effects of ungenotyped granddaughters are absorbed into the parental genetic effects, as described by Bink et al. (1998b).
The sampling distributions for all elements in θ are similar to those in Bink et al. (1998b). For parameters b, u, and v, the full conditional densities are normals and values are drawn by using the Gibbs sampler. A scalar-wise sampling strategy may lead to slow convergence of the Markov chain (Smith and Roberts 1993), especially when elements in θ are highly correlated. A full-block sampling strategy, i.e., sampling all correlated elements in θ at once, may improve convergence significantly (Liu et al. 1994), but may also be hard to implement in animal breeding applications (Garcia-Cortes and Sorensen 1996). Within the RAM, block sampling as suggested by Janss et al. (1995) is applied to polygenic effects of grandsires together with those of their sons. Similarly, block sampling is applied, within the RAM, to the QTL effects of grandsires together with the paternally derived QTL effects in their sons and also to the QTL effects of elite dams together with maternally derived QTL effects of their sons. First a new realization is drawn for the parental effect from the reduced conditional density, after absorption of genetic effects of sons. Second, new realizations are drawn for the sons, conditional on the new value of the parental genetic effect.
The full conditional density for
Data simulation: In this study, we simulated the segregation of a QTL in a granddaughter design. The pedigree material consisted of 20 unrelated grandsires, 400 elite dams, and 800 sons, equally distributed over the 20 grandsires. Two hundred elite dams were daughters of randomly assigned grandsires and the remaining 200 were unrelated to the grandsires. There were no maternal relationships between dams. Dams may have 1, 2, 3, 4, 5, or 6 sons with probability 0.50, 0.25, 0.10, 0.075, 0.050, and 0.025, respectively (relaxing fixed probabilities, a truncated Poisson distribution may apply). Mating of dams with grandsires was at random, but father-daughter matings were avoided. As a result of this strategy ∼300 dams are related to at least two males in the pedigree (e.g., multiple sons and/or grandsire). About 400 sons are also maternal grandsons of grandsires. These numbers approximately reflect a Dutch granddaughter experiment design as described by Spelman et al. (1996). Polygenic and QTL effects for grandsires and founder dams were sampled from
For each individual a 100-cM chromosome was simulated with six markers at 20-cM intervals. The position of the QTL was 30 cM from the origin of the linkage group. Each marker contained either two (low informative markers) or four (high informative markers) alleles with equal frequencies, assuming Hardy-Weinberg equilibrium within marker alleles and linkage equilibrium between alleles of different markers (Table 2).
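Two quantities implied by this design can be verified with a few lines of arithmetic: the expected number of sons per dam under the stated family-size probabilities, and the expected heterozygosity of a marker with equally frequent alleles under Hardy-Weinberg equilibrium. The sketch below is illustrative; the variable and function names are hypothetical, and how the simulation enforced exactly 800 sons is not specified here.

```python
SONS_PER_DAM = [1, 2, 3, 4, 5, 6]
PROBABILITIES = [0.50, 0.25, 0.10, 0.075, 0.050, 0.025]

# Expected number of sons per dam and implied total for 400 dams.
expected_sons = sum(n * p for n, p in zip(SONS_PER_DAM, PROBABILITIES))
expected_total = 400 * expected_sons


def heterozygosity(n_alleles):
    """Expected heterozygosity for n equally frequent alleles (1 - sum p^2)."""
    p = 1.0 / n_alleles
    return 1.0 - n_alleles * p * p


low_informative = heterozygosity(2)   # two alleles per marker
high_informative = heterozygosity(4)  # four alleles per marker
```

The probabilities sum to one and give an expectation of two sons per dam, which matches 800 sons from 400 dams. The two marker types correspond to expected heterozygosities of 0.50 and 0.75.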
Approaches to analyze data: Marker data in a granddaughter design typically comprise marker genotypes for grandsires and their sons. Three different approaches for analysis are presented in Table 3. The first approach (denoted PAT_RLT) considers only paternal relationships between males in the pedigree, all with marker genotypes. The second approach (denoted ALL_RLT) considers all relationships between individuals in the pedigree, and allows ungenotyped parents (dams) with the condition that all their mates (grandsires) have marker genotypes observed. The third approach (denoted ALL_GTP) also considers all relationships, as in ALL_RLT, but all dams had observed marker genotypes. This third approach was included as a control for two reasons, first to verify whether the results from approach ALL_RLT made sense and second whether approach ALL_RLT could compete with a situation where dams were genotyped.
Post MCMC analysis, Bayesian inferences: For each parameter an effective sample size (ES) was computed that estimates the number of independent samples with information content equal to that of the dependent samples (Sorensen et al. 1995). From the Bayesian perspective, inference about parameter vector θ can be addressed via the posterior density p(θ | y). The highest posterior density (HPD) region attempts to capture a comparatively small region of the parameter space that contains most of the mass of the posterior distribution (Tanner 1993). We compute a 90% HPD region (HPD90). The null hypothesis that γ = 0 (the QTL explains no genetic variance) was tested via a posterior odds ratio {mode{p(γ)}/f (0)}, where f (0) is max[p(γ = 0 | y), 0.001], with a critical value of 20 (Janss et al. 1995). In the results section the natural log [ln(odds)] of the posterior odds ratio is given and the critical value then equals 3.0. Note that for both priors on γ used in this study, UNIFORM and PEAKED AT ZERO, the prior odds ratio equals one.
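Both summaries can be sketched in a few lines. The sketch assumes that the HPD90 region is approximated by the shortest interval containing 90% of the stored samples, which is appropriate for a unimodal posterior, and that estimates of the posterior density at its mode and at γ = 0 are already available; how those density estimates were obtained is not specified here. All function names and the example inputs are hypothetical.

```python
import math


def hpd_interval(samples, mass=0.90):
    """Shortest interval containing a fraction `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


def ln_posterior_odds(density_at_mode, density_at_zero, floor=0.001):
    """ln of mode{p(gamma)} / f(0), with f(0) = max[p(gamma = 0 | y), floor]."""
    return math.log(density_at_mode / max(density_at_zero, floor))


# A critical odds ratio of 20 on the log scale, about 3.0.
CRITICAL_LN_ODDS = math.log(20)

# Hypothetical right-skewed sample: the HPD interval hugs the lower end.
samples = [(i / 100) ** 2 for i in range(101)]
lower, upper = hpd_interval(samples)
significant = ln_posterior_odds(2.0, 0.0) > CRITICAL_LN_ODDS
```

The floor of 0.001 on f(0) bounds the odds ratio when no stored sample falls near zero, so the reported ln(odds) stays finite.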
RESULTS
Running the MCMC sampler: The MCMC sampler was run for 100,000 cycles preceded by a burn-in period of 500 cycles. Each 250th sample was stored for further analysis. This chain length proved to be sufficient to obtain at least 100 effective samples (Sorensen et al. 1995) in most runs. When the effective sample size was <75, the particular replicate was repeated with a different seed and this procedure was sufficient to obtain enough effective samples. Among all parameters, lowest effective sample sizes were found for parameter γ, indicating that estimating this parameter is most difficult. Effective sample sizes decreased for smaller QTL and for lower informative markers (Table 4). The prior density of γ did not seriously affect the effective sample size (Table 4). The MCMC sampler was run on an HP 9000 K260 server, and computing times of a single chain for approach PAT_RLT, ALL_RLT, and ALL_GTP were 23 min, 2 hr 12 min, and 1 hr 1 min, respectively. This indicates that updating the marker haplotypes and IBD patterns for ungenotyped individuals was the most time-consuming part of the MCMC sampler.
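With 100,000 cycles and every 250th sample stored, each chain yields 400 stored samples, so a threshold of 100 effective samples corresponds to one quarter of the stored chain. The sketch below illustrates one common way to estimate an effective sample size, by dividing the number of samples by an autocorrelation time summed over the initial run of positive autocorrelations. This estimator is an assumption made for illustration and is not necessarily the one of Sorensen et al. (1995); the function name is hypothetical.

```python
def effective_sample_size(x):
    """Estimate n / (1 + 2 * sum of initial positive autocorrelations)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0.0:
        return float(n)  # constant chain: no autocorrelation can be estimated
    tau = 1.0
    for lag in range(1, n):
        cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n
        rho = cov / var
        if rho <= 0.0:
            break  # truncate at the first non-positive autocorrelation
        tau += 2.0 * rho
    return n / tau


# Number of stored samples per chain in this study.
stored_samples = 100_000 // 250
```

A slowly mixing parameter such as γ has long runs of positive autocorrelation, which inflates the autocorrelation time and lowers the estimate.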
Parameter estimates: Heritability: In all replicates, estimates for parameters h^{2} and
Small QTL, high informative markers: The marginal posterior density was flatter and shifted toward the mean of the UNIFORM prior (0.5), when using only paternal relationships compared to using all relationships (Figure 1). The posterior density for PAT_RLT was more similar to those of the other two approaches when using the PEAKED AT ZERO prior. Including all relationships led to posterior densities with a smaller standard deviation, that is, higher accuracy of estimates. Including genotypes for dams (ALL_GTP) did not further improve the accuracy. Including all relationships led to smaller estimated HPD90 regions for γ (Figure 1). The HPD90 regions were smaller when the PEAKED AT ZERO prior was used, especially when only paternal relationships were considered. Averaged over 10 replicates, the posterior mean of γ for approach PAT_RLT and the UNIFORM prior was 0.15, which was clearly larger than the simulated value (0.10). Apparently, the data did not provide sufficient information to reduce the effect of the UNIFORM prior, which has an expected mean of 0.5. When the PEAKED AT ZERO prior on γ was used, the estimated posterior mean was equal to the simulated value, which is also the expected mean of the prior (Table 4).
Large QTL, high informative markers: In approach PAT_RLT, the marginal posterior density for parameter γ was relatively flat when the UNIFORM prior was used (Figure 2). The marginal posterior density for γ was clearly shifted toward zero when applying the PEAKED AT ZERO prior in approach PAT_RLT. The other two approaches (ALL_RLT and ALL_GTP) gave similar and more stable densities with the two priors for γ, indicating more information coming from the data compared to PAT_RLT. The HPD90 region was largest for approach PAT_RLT with a UNIFORM prior (Figure 2). The PEAKED AT ZERO prior led to a downward shift of the HPD90 regions, in particular for approach PAT_RLT. The PEAKED AT ZERO prior led also to estimated posterior means that were smaller than the simulated values for all approaches (Table 4). The UNIFORM prior led to an upward bias in the estimated posterior mean for approach PAT_RLT but not for the other approaches.
Large QTL, low informative markers: Low informative markers (two alleles per locus) resulted in relatively flat posterior densities for γ (Figure 3), but differences were observed between the three approaches. The use of all relationships improved the accuracy, but in this case the use of all genotypes gave an additional improvement over ALL_RLT. The PEAKED AT ZERO prior led to posterior densities that were closer to zero in all approaches but especially for PAT_RLT. The estimated HPD90 region was again largest for approach PAT_RLT with the UNIFORM prior. The HPD90 regions for approaches ALL_GTP and ALL_RLT were very similar for the UNIFORM prior. However, the HPD90 region for approach ALL_RLT was shifted more toward zero than the region for approach ALL_GTP with the PEAKED AT ZERO prior (Figure 3). The posterior mean estimates were all higher than the simulated value for the UNIFORM prior and below the simulated value for the PEAKED AT ZERO prior. Differences between estimated and simulated values were largest for approach PAT_RLT.
Hypothesis testing, detection of QTL: The hypothesis of the presence of a QTL at a particular position in a linkage map was tested via a posterior odds ratio. For a small QTL the ln(odds) averaged over 10 replicates for approach PAT_RLT was 2.69, which was below the critical threshold of 3.0. For approach PAT_RLT only 3 out of 10 replicates yielded significant evidence for the presence of a QTL (Table 4). This was very similar to the power of QTL detection found by Bink et al. (1998b). Approach ALL_RLT resulted in an average ln(odds) of 5.58, and the QTL was significantly detected in 7 out of 10 replicates. Approach ALL_GTP failed to significantly detect the small QTL in only one of the replicates.
For a large QTL and high informative markers, approach PAT_RLT detected the QTL in at least 8 out of 10 replicates, i.e., two and one failures for UNIFORM and PEAKED AT ZERO priors, respectively (Table 4). The approaches ALL_RLT and ALL_GTP detected the QTL in all replicates. The average ln(odds) was clearly higher for the large QTL. Note that the posterior odds of approach ALL_RLT for a small QTL [ln(odds) = 5.64] was even a little higher than the posterior odds of approach PAT_RLT for a large QTL [ln(odds) = 5.58], when high informative markers were considered.
Reducing heterozygosity of the markers resulted in lower averaged estimates of the ln(odds) for all cases. The detection rate for approach PAT_RLT with a low informative marker was 50% or lower depending on the prior (Table 4). In all except one case, the QTL was still significantly detected by approaches ALL_RLT and ALL_GTP.
DISCUSSION
A variety of statistical gene mapping methods have been developed and applied to outbred populations (see Bovenhuis et al. 1997; Hoeschele et al. 1997). Computationally inexpensive methods, such as regression interval mapping, allow data permutation to determine genome-wide threshold values for test statistics and can be extended more easily to incorporate multiple QTL; however, these methods can only use certain types of relatives (e.g., half-sibships or full-sibships). Bayesian analysis is computationally more demanding but takes full account of the uncertainty associated with all unknowns in the QTL mapping problem and offers the opportunity to analyze general pedigree data and to fit other random components such as polygenic effects (e.g., Thaller and Hoeschele 1996a). Bayesian linkage analysis has been applied in animals (e.g., Thaller and Hoeschele 1996a; Uimari et al. 1996), plants (e.g., Satagopan et al. 1996), and humans (e.g., Thomas and Cortessis 1992). Application of these methods to large pedigrees with missing genotypes, as described in this article, has not been explored in depth (Hoeschele et al. 1997). The procedures of Janss et al. (1995), i.e., block sampling of ungenotyped dams and their offspring, and Jansen et al. (1998), i.e., sampling IBD patterns, were implemented to achieve good mixing of the sampler in the full pedigree analysis with incomplete marker information. To accommodate missing marker data, special precautions need to be taken for the sampling procedure to avoid reducibility, i.e., the situation in which not all possible genotype configurations can be reached from any valid starting configuration. Reducibility especially occurs in situations in which offspring are genotyped but both parents are not. In livestock, the number of offspring per sire is usually large and genetic material from males is often stored, which facilitates genotyping of the male parent. When genetic material is not available, genotypes of males can often be inferred from their offspring.
In the present study, it is assumed that marker genotypes on at least one parent are known. This assumption does not limit the application of the presented approach to livestock, but it might be limiting in situations where family sizes are smaller. Sheehan and Thomas (1993) allowed non-Mendelian segregation of alleles (e.g., genotype AB transmitting allele C) to solve the theoretical reducibility. Inferences were based on samples from only those Gibbs cycles with strict Mendelian segregation, which may be an inefficient procedure in large animal breeding populations. Instead of using a fixed probability on non-Mendelian segregation, one may consider a simulated tempering scheme (Geyer and Thompson 1995) that allows this probability to randomly increase from and decrease to zero.
Uimari et al. (1996), Grignola et al. (1996b), and Hoeschele et al. (1997) investigated the effect of ignoring relationships among families on estimates of QTL location and genetic parameters. Virtually no difference was found between analyses with and without relationships between families for situations with much and little information about the QTL. In our study a large impact of including additional relationships was found (Table 4). This apparent discrepancy in literature can be explained by the relationships considered. In the earlier studies, relationships between the grandsires were included, which leads to additional information on estimating the paternally inherited QTL alleles. In the present study, the ungenotyped dams of the sons were included, which provides information for estimating the maternally inherited QTL alleles. The impact of including additional relationships is clearly demonstrated in Figures 1, 2, 3. Including additional relationships resulted in improved estimates of parameter γ, i.e., lower posterior standard deviations and smaller HPD90 regions (Table 4, Figures 1, 2, 3). These results strongly suggest that including all relationships in complex pedigrees does improve power of QTL detection.
The pedigree we analyzed consisted of ∼100,000 individuals. The largest proportion of individuals was offspring of sires that only had phenotypic records. The complexity of the problem was reduced by applying a RAM (Quaas and Pollak 1980) in which genetic effects of ungenotyped nonparents are absorbed into those of their parents as presented by Bink et al. (1998b). The procedure presented in this article, which applies a RAM, offers the opportunity to combine the information from different experimental designs, e.g., a granddaughter design, a grandgranddaughter design (W. Coppieters, A. Kvasz, J.J. Arranz, B. Grisart, J. Riquet et al., unpublished results), or a daughter design, and also the information collected in a closed breeding population spanning several generations. Despite higher computational requirements, the application of a RAM in a Bayesian context more naturally treats missing genotypes than the restricted maximum-likelihood procedures described by Grignola et al. (1996a).
In this study, we assumed a fixed QTL position relative to known markers. Bink et al. (1998c) showed that the position of the QTL can be included as an additional parameter in the model. Appropriate sampling of QTL position was facilitated through the use of simulated tempering (Geyer and Thompson 1995). Simulated tempering, which has also been applied in radiation hybrid mapping (Heath 1997b), proved especially useful to improve mixing by relaxing the distance between closely linked loci. Alternatively, George et al. (1998) mapped a biallelic QTL relative to multiple markers via Bayesian model choice by implementing the reversible jump sampler (Green 1995).
In conclusion, the work presented shows that detection of QTL in data from complex pedigrees is feasible by the use of MCMC and Bayesian analysis. It is shown that using all existing relationships increases the power of detection and the accuracy of the estimates. This work also lays the foundation to study the number of QTL and their relative positions within marker linkage maps.
Acknowledgments
The authors thank Ritsert Jansen, Luc Janss, Henk Bovenhuis, and Dick Quaas for stimulating discussion. The authors acknowledge financial support from Holland Genetics.
Footnotes

Communicating editor: C. Haley
 Received May 13, 1998.
 Accepted October 5, 1998.
 Copyright © 1999 by the Genetics Society of America