Abstract
We describe the development of a multipoint nonparametric quantitative trait loci mapping method based on the Wilcoxon ranksum test applicable to outbred halfsib pedigrees. The method has been evaluated on a simulated dataset and its efficiency compared with interval mapping by using regression. It was shown that the rankbased approach is slightly inferior to regression when the residual variance is homoscedastic normal; however, in three out of four other scenarios envisaged, i.e., residual variance heteroscedastic normal, homoscedastic skewed, and homoscedastic positively kurtosed, the latter outperforms the former one. Both methods were applied to a real data set analyzing the effect of bovine chromosome 6 on milk yield and composition by using a 125cM map comprising 15 microsatellites and a granddaughter design counting 1158 HolsteinFriesian sires.
RECENT developments in DNA marker technology, such as the discovery of microsatellites (Weber and May 1989), random amplified polymorphic DNA (RAPDs; Williamset al. 1990), and amplified fragment length polymorphism (AFLPs; Voset al. 1995) as abundant sources of welldispersed genetic markers, have boosted the construction of marker maps across a broad taxonomic range. Not only are such maps now available for human and model organisms such as mouse and rat but for a number of agriculturally important animal and plant species as well.
These maps increasingly are applied to locate genes underlying inheritable phenotypes of interest. Several of the most relevant phenotypes are continuously distributed quantitative traits involving multiple polygenes or quantitative trait loci (QTL), as well as nongenetic effects. Experimental back and intercrosses are often the preferred design to map QTL. However, in a number of agriculturally important species (notably cattle and pine trees), reproductive cycles and breeding designs have led to the generation of extensive halfsib pedigrees that are readily available for QTL mapping. A welldocumented example of this is the socalled granddaughter design to map genes underlying milk production in commercial cattle populations (Welleret al. 1990). This design takes advantage of the numerous paternal halfbrother pedigrees that exist in dairy cattle populations, generated as part of the applied progenytest breeding design.
A number of mapping methods have been applied to such halfsib designs, including singlemarker regression (e.g., Cowanet al. 1990), interval mapping using regression (e.g., Knottet al. 1996), and maximum likelihood methods (e.g., Georgeset al. 1995). All these methods share a common assumption, namely the residual normal distribution and homoscedasiticity of the analyzed phenotypes or transformations thereof. These approaches therefore are not suitable for phenotypes that are known not to satisfy this normality assumption. Moreover, deviations from normality for traits that generally are assumed to be quasinormally distributed are likely to affect the power and robustness of these conventional approaches.
Recently, Kruglyak and Lander (1995a) described a nonparametric QTL interval mapping approach based on the Wilcoxon ranksum test applicable in experimental crosses. This method provided a robust alternative to conventional approaches, applicable to normally distributed traits with minimal loss of power and extending the scope of QTL mapping to a variety of traits not normally distributed, such as counts generated by a Poisson process, truncated data, probabilities, and qualitative data.
In this article, we describe the adaptation of this method to halfsib pedigrees in outbred populations and apply it to milk production in a granddaughter design. A computer program to implement this approach has been developed and is available from the authors upon request.
MATERIALS AND METHODS
A QTL interval mapping procedure based on the Wilcoxon ranksum test—general principles: To measure the evidence in favor of a QTL at a given map position, Kruglyak and Lander (1995a) define the following statistic (illustrated for an (A × B) × A backcross),
Under the null hypothesis of no QTL, Z_{W} is shown to behave asymptotically as a standard normal variable that reduces to a Wilcoxon ranksum test at the marker positions.
Adaptation to outbred halfsib designs: The method developed by Kruglyak and Lander (1995a) for experimental crosses was adapted to outbred halfsib designs, e.g., a founder sire mated to several dams to produce a large paternal halfsibship. The approach relies on the same Z_{W}(s) statistic. However, P[g_{i,A}(s)g_{i,}_{L},g_{i,}_{R}] (Equation 2) is now defined as the probability that progeny i has inherited QTL allele A from the founder sire—assumed to be heterozygous AB at the QTL—at map position (s) given its genotype at the left (g_{i,}_{L}) and right (g_{i,}_{R}) flanking markers. Only markers for which the founder sire is heterozygous are considered when computing P[g_{i,A}(s)g_{i,1},g_{i,R}]. Moreover, while the nearest flanking markers contain all information needed to compute P[g_{i,A}(s)g_{i,L},g_{i,R}] in a given interval when dealing with experimental crosses, information from more distant markers is considered in the outbred halfsib situation, when closer markers are not fully informative. This occurs in the case of missing genotypes or when the offspring has the same marker genotype as the sire, and the dam is either not genotyped or has the same heterozygous genotype as well. In the former case, part of the information can be recovered by considering marker allele frequencies in the population.
Calculation of P[g_{i,A}(s)g_{i,L},g_{i,R}] requires knowledge of the sire's marker linkage phase. In the absence of grandparental marker information, the most likely linkage phase is first estimated from the marker genotypes of the offspring. This is accomplished by calculating the likelihood of the pedigree data under the 2^{x}/2 possible phases (assuming x informative markers) as follows (Georgeset al. 1995):
All marker phases are a priori considered to be equally likely; i.e., linkage equilibrium is assumed to be reached between all markers. The marker phase maximizing the likelihood of the pedigree data is considered the true one and is selected for further analysis.
As pointed out by Kruglyak and Lander (1995a),
While ⟨[1 − 2·P[g_{i,A}(s)g_{i,L},g_{i,}_{R}]]^{2}⟨ or the expected value of [1 − 2·P[g_{i,A}(s)g_{i,L},g_{i,R}]]^{2} over all possible genotypes is computed easily for experimental crosses, its calculation is more cumbersome in outbred designs as it will depend on marker allele frequencies and genotype of the founder sire. The value of ⟨[1 − 2·P[g_{i,A}(s)g_{i,L},g_{i,R}]]^{2}⟨ is therefore calculated for each halfsib pedigree by simulating all possible offspring and calculating a frequency weighted mean of [1 − 2·P[g_{i,A} (s)g_{i,L},g_{i,R}]]^{2}.
Across family analysis: In practice, the available pedigree material is composed most often not of one halfsib pedigree but of a series of such halfsibships, such as in the granddaughter design (Welleret al. 1990). In outbred populations, however, the different sibships cannot be assumed to segregate for the same QTL or even QTL alleles; i.e., one cannot assume locus and allelic homogeneity across families.
Rather than analyze the pedigrees separately, however, and reduce power by multiple testing, the individual Z_{W}(s) scores were squared and summed over all k families yielding a χ^{2} statistic with k degrees of freedom:
Interval mapping by regression: The ranksumbased approach (hereafter referred to as method RS) was compared with interval mapping by using regression (hereafter referred to as method MR for multipoint regression; Knottet al. 1996). For each halfsib family, j, phenotypes were regressed on P[g_{i,A}(s)g_{i,L},g_{i,R}], calculated as described above, yielding leastsquares estimators of the y intercept, β_{0j}, and the slope, β_{1j}, the latter being an estimator of the QTL allele substitution effect in the corresponding family, j. The ratio
Significance thresholds: For both the RS and MR methods, chromosomewise significance thresholds were determined from the distribution of the test statistic over 10,000 permutations (simulated data set) or 100,000 permutations (real data set) of the phenotypes (or ranks) as suggested by Churchill and Doerge (1995). Phenotypes were permutated within family. For each permutation, the highest value of the test statistic over the entire chromosome was retained to yield “chromosomewise” distributions of the test statistic under the null hypothesis. For the real data set, a Bonferonni correction was applied to the chromosomewise significance level, considering that chromosome 6 represents 1/29 of the bovine autosomes and that we analyzed the equivalent of three independent traits (Spelmanet al. 1996) to obtain “experimentwise” significance thresholds.
Simulated data set: To test the efficacy of the proposed method, we simulated the segregation of a QTL in a granddaughter design. The pedigree material was composed of two paternal halfsib families of 100 sons, four families of 50 sons, and eight families of 25 sons, quite accurately reflecting a real data set. The 14 respective sires were considered to be unrelated.
A QTL was positioned in the center of the fourth interval of a map comprising seven markers spaced 15 recombination units apart. Markers were assumed to be polyallic markers with frequencies randomly assigned from a uniform distribution and rescaled to sum to unity, yielding a heterozygosity of
The QTL was assumed to be biallelic with frequencies p = 0.25 (Q) and q = 0.75 (q), respectively. Foundersires therefore had an a priori probability 2pq = 0.375 to be heterozygous Qq for the QTL. Following Falconer's notation (Falconer and MacKay 1996) and assuming additively acting alleles, the average phenotypic values of the QQ, Qq, and qq genotypic classes were set at +a, d = 0, and −a, respectively. Assuming HardyWeinberg equilibrium, this yields an average effect of an allele substitution, α = a, and a variance attributable to the segregation of the QTL:
Homoscedastic normal residual variance: Individual phenotypic values were generated as the mean of the genotypic class to which the individual belongs (QQ = a, Qq = 0, or qq = −a) plus a value drawn from a normal distribution with mean 0 and variance 1; i.e.,
Heteroscedastic normal residual variance: Individual phenotypic values were generated as the mean of the genotypic class to which the individual belongs (QQ = a, Qq = 0, or qq = −a) plus a value drawn from normal distributions with mean 0 and variances of
Homoscedastic, skewed residual variance: Skewness was simulated by assuming a residual effect distributed as a chisquared distribution with n degrees of freedom, with variance
Homoscedastic, positive kurtosis: Excess of kurtosis was simulated by assuming that the residual effect was distributed as a Student's tdistribution with n degrees of freedom, with variance
Homoscedastic, negative kurtosis: Negative kurtosis was simulated by assuming that the residual effect was distributed as a hemicircular distribution with mean 0 and variance
Figure 1 illustrates the expected phenotypic distributions of offspring from heterozygous foundersires, Qq, for the five examined models. Offspring are sorted in two genotypic classes depending on the QTL allele transmitted by the sire (Q or q). Each class therefore comprises two subpopulations: QQ (25%) and Qq (75%) for the Q class and Qq (25%) and qq (75%) for the q class.
At least 200 datasets (ranging from 200 to 866) were simulated under each of the five models of residual variation and analyzed with the RS and MR methods. Permutations were used to estimate the significance levels reached for each of these analyses (Churchill and Doerge 1995). For each replicate, 10,000 permutations were performed and analyzed with the RS and MR methods to yield a datasetspecific, chromosomewise distribution of the RS and MR statistics under the nullhypothesis, allowing us to measure the P value of the unpermutated data under the null hypothesis of no QTL. Average P values over replicates were calculated for each of the five models. For each model, the proportion of datasets yielding a P value less than 0.05 (=α) was used to measure the corresponding power (1 − β) of the RS and MR methods (Table 1). Within each model, we compared the relative merits of the RS vs. MR methods by applying the Wilcoxon matched pairs test on all resulting pairs of P values (Hollander and Wolfe 1973). Within each method, the effect of the model on the power to detect the QTL was evaluated by using the MannWhitney U test (Hollander and Wolfe 1973), using model 1 as reference.
Real data set: The real data set was a HolsteinFriesian granddaughter design comprising 1158 sons distributed over 29 paternal halfsib families, partially described in Spelman et al. (1996). The number of sons per family ranged from 11 to 153.
All animals were genotyped for a battery of 15 previously described (Kappeset al. 1997) microsatellite markers from bovine chromosome 6 (Table 2). Genotyping was performed as described (Georgeset al. 1995) or by using the “four dyeone lane” technology on an ABI373 or ABI377 sequencer.
Marker maps were built by using the CRIMAP program (Lander and Green 1987) to determine the most likely order and the ANIMAP program to refine the most likely recombination rates between adjacent markers (Georgeset al. 1995).
Information content along the marker map (Kruglyak and Lander 1995b) was measured as
QTL mapping was performed for five milk production traits: milk yield (kg), protein yield (kg), fat yield (kg), protein percentage, and fat percentage. The phenotypes used for QTL mapping were deviations of individual daughter yield deviations from the corresponding average of the parental predicted transmitting abilities (Van Raden and Wiggans 1991).
Marker allele frequencies, required for map construction, measuring of information content, and QTL mapping, were estimated from the dam population, separately for each pedigree, as
RESULTS
Simulated data: Using the approach described in materials and methods, we simulated GDDs segregating for a QTL explaining a fixed 9.4% of the phenotypic variance (corresponding to a = 0.5σ_{P}) but with five alternative residual components: homoscedastic normal, heteroscedastic normal, homoscedastic skewed, homoscedastic positive kurtosis, and homoscedastic negative kurtosis. The generated datasets were all analyzed by using both RS and MR methods. Table 1 reports, for each of the five scenarios, the average P values and the associated power at αvalue of 0.05, obtained by permutation as described in materials and methods.
The relative merit of the RS and MR methods was evaluated by using the Wilcoxon matched pairs test as described in materials and methods. As expected, multiple regression is superior to the ranksum approach under the basic model of homoscedastic normal residual variance (P = 0.000014). The loss of power when using the rankbased method is estimated at 8% at αvalue of 0.05. The MR method proved also significantly superior to the RS method in the negative kurtosis model (model 5; P = 0.000001); the loss of power with the RS method was estimated at 14% at αvalue of 0.05. For the three remaining scenarios, however, the RS approach outperformed MR, the gain in power ranging from 8 to 20% at αvalue of 0.05 (Table 1).
The effect of the model on the power to detect the QTL was evaluated by using the MannWhitney U test (see materials and methods), by using model 1 as reference. Comparisons were performed separately for the RS and MR approach. Interestingly, MR appears to be quite insensitive to the nonnormality of the residual variation, as the distribution of P values under the alternative models is never significantly different from those obtained under the basic model. This is likely due to the fact that significance levels are deduced from phenotype permutations rather than from the theoretical distribution of the test statistic. Using RS, on the contrary, significant increases in detection power are observed for models 2, 3, and 4 (respectively 9, 12, and 23% at αvalue of 0.05; Table 1), while the distribution of P values does not differ significantly between models 1 and 5.
Estimates of the precision in the estimation of QTL positions were also compared. Table 1 shows the standard deviation of the most likely QTL position for all simulations yielding a signal exceeding the 5% chromosomewise significant threshold. Comparing the difference between real and estimated position by using the MannWhitney U test, we found no evidence for a significant effect either of the statistical method or of the model for the underlying residual variance. In essence, precision was as poor in all circumstances, standard deviations of the estimated position being 20 to 25 cM. While the actual position of the QTL was at 62 cM counting from the first marker, the estimates ranged from 0 to 118 cM, i.e., the entire chromosome length. A total of 95% of the estimates were within 43 cM (=1.9 σ) from the actual position.
Real data: Table 2 and Figure 2 show the most likely marker map as obtained from our genotypes. The map covers 125 cM (Kosambi) with average interval of 9 cM. The most likely order was in agreement with Kappes et al. (1996). The same figure also compares information content when (1) exploiting marker allele frequency estimates to extract information from noninformative marker genotypes, and (2) when ignoring this information, i.e., when considering all microsatellite alleles to be equally frequent in the population. It can be seen that more than 80% of the maximal information is extracted for the central part of the chromosome; however, the information content drops at both extremities of the chromosome. Moreover, the figure shows that information content is improved only marginally by considering marker allele frequencies. This is especially true in the central, denser part of the marker map.
Figures 3a and 3b summarize the location score profiles obtained for the five different milk production traits by using both RS and MR approaches. Generally speaking, both methods clearly yield very similar curves for all traits along the entire chromosome length. For protein percentage, the location scores maximize at the same chromosome position (48 cM) using both approaches. The associated experimentwise significance levels are P = 0.03 for RS and P = 0.01 for MR, therefore slightly superior for the latter.
These results are in agreement with the report of a QTL affecting milk production on the centromeric half of chromosome 6, first identified by Georges et al. (1995) and later confirmed in independent studies in HolsteinFriesian by Spelman et al. (1996) and Kühn et al. (1996), in Finnish Ayrshire by Vilkki et al. (1997), and in Norwegian Red by GomezRaya et al. (1996). A detailed analysis of this chromosome region in the corresponding pedigree material is given in Spelman et al. (1996).
DISCUSSION
In this article, we have adapted a nonparametric QTL mapping method based on sum of ranks that was described previously for experimental crosses (Kruglyak and Lander 1995a) to outbred halfsib pedigrees. This is particularly relevant for mapping QTL in specific livestock and plant species where such pedigrees routinely are generated within the context of specific breeding designs. It extends the scope of QTL mapping in these pedigrees to a variety of not normally distributed traits, including counts generated by a Poisson process, truncated data, and probabilities and qualitative data (Kruglyak and Lander 1995a).
We confirm that this approach (the RS method) can be applied conveniently to normally distributed traits with minimal loss of power when compared to parametric methods. In the simulated example, we noticed a loss of power of 8% at αvalue of 5% when compared to the MR method. When simulating nonnormal or heteroscedastic residuals, however, the RS method outperformed the MR method in three out of four scenarios (models 2–4: heteroscedastic normal, homoscedastic skewed, and homoscedastic positively kurtosed). Interestingly, this was shown not to be due to a loss of power of the MR approach, which proved to be relatively robust in the scenarios that we simulated, but rather to a gain of power when applying the RS method. Our interpretation of this finding is that in the three scenarios where RS proved superior to MS, the phenotypic distribution is characterized by “outlyers” when compared to the normal distribution (see Figure 1). These outlyers contribute excessively to the residual variation, while the bulk of the observations actually are more centered around the mean (and therefore less variable) when compared to the normal distribution. When using ranks rather than the actual phenotypes, the contribution of the outlyers to the residual variation is tempered, therefore increasing the ratio QTL variance/residual variance and concommitantly increasing the power to detect the QTL.
A disadvantage of the rankbased methods is the fact that these do not provide convenient estimates of QTL effects. These methods therefore are suitable for the detection of QTLs but have to be complemented with alternative methods, such as leastsquares or maximum likelihood techniques when quantifying the QTL effects.
Recently, a number of QTL mapping methods that account for multiple linked or unlinked QTL have been proposed. These include two QTL models (e.g., Haley and Knott 1992), composite interval mapping (Zeng 1993), and multiple QTL mapping (Jansen 1993). Rankbased approaches have been described to test three or more classes, including the KruskalWallis test and the JonckheereTerpstra test, which would allow a twoQTL model to fit. Alternatively, it might be interesting to explore the possibility to use regression techniques directly on ranks, which, if applicable, would allow inclusion of additional markers as cofactors in the model.
Assuming paternal halfsib pedigrees, the proposed method allows for missing genotypes in the “dams.” In such cases, estimates of marker allele frequencies can be used to improve inference about the identity of the transmitted paternal chromosome. However, it is shown that when performing multipoint analyses with dense marker maps, this contributes only a marginal improvement of the information content. The benefit of including marker allele frequency is therefore doubtful. Indeed, errors in the estimation of the marker allele frequencies may even cause an increase in type I errors or a loss of power if accounting for inaccuracies in the frequency estimates (Charlieret al. 1996).
As expected, the precision in the estimation of the QTL position using both proposed parametric and nonparametric approaches is mediocre. This illustrates the need to develop alternative strategies for finemapping QTL in outbred populations.
Acknowledgments
We acknowledge the financial support of Holland Genetics, Livestock Improvement Corporation, the Vlaamse Rundvee Vereniging, and the Ministère des Classes Moyennes et de l'Agriculture, Belgium. Continuous support from Nanke den Daas, Brian Wickham, Denis Volckaert, and Pascal Leroy is greatly appreciated. We thank Johan van Arendonk, Richard Spelman, Henk Bovenhuis, Marco Bink, Dave Johnson, and Dorian Garrick for fruitful discussions.
Footnotes

Communicating editor: ZB. Zeng
 Received November 6, 1997.
 Accepted March 30, 1998.
 Copyright © 1998 by the Genetics Society of America