Abstract
A novel method using the nonparametric bootstrap is proposed for testing whether a quantitative trait locus (QTL) at one chromosomal position could explain effects on two separate traits. If the single-QTL hypothesis is accepted, pleiotropy could explain the effect on two traits. If it is rejected, then the effects on two traits are due to linked QTLs. The method can be used in conjunction with several QTL mapping methods as long as they provide a straightforward estimate of the number of QTLs detectable from the data set. A selection step was introduced in the bootstrap procedure to reduce the conservativeness of the test of close linkage vs. pleiotropy, so that the erroneous rejection of the null hypothesis of pleiotropy only happens at a frequency equal to the nominal type I error risk specified by the user. The approach was assessed using computer simulations and proved to be relatively unbiased and robust over the range of genetic situations tested. An example of its application on a real data set from a saline stress experiment performed on a recombinant population of wheat (Triticum aestivum L.) doubled haploid lines is also provided.
The gradual development of different types of molecular markers over the past decade has widened the applicability of quantitative trait locus (QTL) mapping to many species, even some for which variability between lines is limited. Increased marker density has improved the accuracy of QTL positioning, which, in turn, allows the development of comparative QTL mapping. Comparisons of coincident associations of markers and QTLs can take place between genomes of related genera within a family, a subfamily, or a tribe, thus exploiting any conserved synteny that exists between the genomes. They can also take place between different traits and across different environments.
Among the different sorts of recombinant populations derived from line crosses, doubled haploid and recombinant inbred lines are more amenable to comparative QTL mapping between traits and between environments because the same genotypes can be used in several experiments. The experimental power is thus increased because the different trait distributions are no longer independent and can be studied as a multivariate z-dimensional distribution (where z is the number of traits or environments studied). Thus, as far as a QTL comparison between traits is concerned, existing genetic and environmental correlations can be exploited. Ronin et al. (1995) and Korol et al. (1995) used the correlation between traits to increase the QTL detection power in experiments involving two traits with some correlation between them of genetic and/or environmental origin. In so doing, they assumed pleiotropy for every QTL mapped. In cases when this hypothesis was wrong, Jiang and Zeng (1995) demonstrated that the estimates of the QTL effects and positions were biased. While staying within the same framework, the latter authors extended the approach to the simultaneous analysis of more than two traits without having to assume pleiotropy for every QTL being mapped. They termed their method “Joint Mapping.” They were then able to develop a test of close linkage vs. pleiotropy for a set of n QTLs, i.e., one QTL per trait detected in the same genomic region for the n traits. For two traits, the test statistic is the likelihood ratio
Visscher et al. (1996) proposed a less biased alternative to the LOD support interval. They estimated empirically a curve of probability density of the QTL's position using the nonparametric bootstrap, in which the original data are resampled N times with replacement and then plotted the frequency of the different positions observed. This procedure yielded confidence intervals on the QTL's position that performed better than LOD support intervals but tended to be too conservative when the QTL accounted for <10% of the total trait variance. Lebreton and Visscher (1998) managed to reduce the conservativeness and the size of the confidence intervals to a nearly unbiased level by conditioning the bootstrap on the genetic model, provided there was no bias on the QTL's position estimate.
In this study, we have used the selective bootstrap concept to develop a novel method to test close linkage vs. pleiotropy. In so doing, we replaced the likelihood ratio test statistic of Jiang and Zeng (1995) by a confidence interval on the estimated distance between two QTLs affecting different traits that are suspected to correspond to the same locus (hypothesis of pleiotropy). Our aim is to propose a test that is easy to implement and program (in the C shell of a UNIX system, for example) with a variety of QTL mapping methods. The test must also be as little biased as possible over a range of genetic configurations commonly encountered. An absence of bias means that the null hypothesis is only rejected at the nominal frequency of type I error specified by the user. The bias and the power of the method are assessed by simulations using a range of genetic configurations. Finally, an example of its application to a real data set that comes from a saline stress experiment performed on a recombinant population of wheat (Triticum aestivum L.) doubled haploid lines is presented.
MATERIALS AND METHODS
QTL mapping procedures: The test of close linkage vs. pleiotropy that we propose can be implemented for any QTL mapping method that lends itself to an easy QTL model identification. However, as the method is based upon resampling, the speed of the QTL mapping procedure is important in allowing the assessment of the test on thousands of simulated data sets. Interval mapping-based methods, such as the interval mapping (stricto sensu) of Lander and Botstein (1989), the composite interval mapping of Zeng (1993, 1994) and Basten et al. (1996), or the multiple-QTL models of Jansen (1993) and Jansen and Stam (1994) provide asymptotically unbiased QTL parameter estimates due to the property of the maximum likelihood-based analysis. However, as these methods implement an iterative algorithm (the EM algorithm; Dempster et al. 1977) that must perform several cycles to converge to an estimate of the QTL parameters at each tested position of the putative QTL, they are slower than the multiple linear regression, for example, as simplified by Whittaker et al. (1996). Multiple linear regression constitutes an approximation of the maximum likelihood-based interval mapping in that only the within-marker class distribution of the residuals is considered and assumed normal (Martinez and Curnow 1992; Xu 1995). However, the method gives almost identical results to those from the maximum likelihood-based methods (Haley and Knott 1992). Walling et al. (1998) could nevertheless show, by using a large number of replicated simulations, the existence of a small but significant bias that tends to place the estimated QTL position closer to the nearest flanking marker. In this study we used two regression-based QTL mapping methods for the assessment of the test of linkage vs. pleiotropy.
The first QTL mapping procedure was used in situations of saturated coverage of the genetic map with one marker every centimorgan, to minimize any possible confounding effects due to the bias described above. Applying the principles established by Stam (1991), Zeng (1993), Rodolphe and Lefort (1993), and Wright and Mowers (1994), given the high and regular marker density over the genome, a simple marker-regressor selection was implemented to identify the QTLs. Because of their computational speed, regression methods lend themselves to resampling schemes. The QTLs were fitted at the positions of the selected markers, and therefore there were as many QTLs declared as there were markers selected. Their estimated additive effect was the partial regression coefficient related to the corresponding marker-regressor. During the marker selection, a first, relatively lax, forward procedure, with an inclusion F-ratio of 4.0, added a subset of all the markers of the chromosome, some of which were included by chance (as discussed by Lebreton and Visscher 1998). Then, a stringent backward procedure was implemented to reject some markers. The stringency chosen determines the risk of type I error, i.e., the risk of retaining a false QTL. The stringent F-threshold was calculated empirically according to the protocol presented in Lebreton and Visscher (1998) and follows the principles described by Churchill and Doerge (1994).
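The forward-then-backward marker selection described above can be sketched in Python as follows. This is an illustration only, not the program used in this study: the function names `rss` and `select_markers`, the ±1 genotype coding, and the rule of adding or dropping one marker at a time are assumptions, and the backward threshold `f_out` must be supplied by the user because the text determines it empirically.

```python
def rss(y, columns):
    """Residual sum of squares of y regressed on an intercept plus `columns`."""
    n = len(y)
    basis = []
    # Build an orthogonal basis of the design matrix (modified Gram-Schmidt).
    for col in [[1.0] * n] + columns:
        v = list(col)
        for b in basis:
            coef = sum(vi * bi for vi, bi in zip(v, b)) / sum(bi * bi for bi in b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        if sum(vi * vi for vi in v) > 1e-12:  # skip collinear columns
            basis.append(v)
    # Project y off every basis vector; what is left is the residual.
    r = list(y)
    for b in basis:
        coef = sum(ri * bi for ri, bi in zip(r, b)) / sum(bi * bi for bi in b)
        r = [ri - coef * bi for ri, bi in zip(r, b)]
    return sum(ri * ri for ri in r)


def select_markers(y, X, f_in, f_out):
    """Lax forward inclusion (F >= f_in), then stringent backward rejection (F < f_out).

    X is a list of rows (one per line), each row a list of marker genotypes.
    Returns the sorted indices of the retained markers.
    """
    n, m = len(y), len(X[0])

    def column(j):
        return [row[j] for row in X]

    def f_ratio(reduced, full):
        # Partial F for the single marker by which `full` exceeds `reduced`.
        rss_full = rss(y, [column(j) for j in full])
        rss_reduced = rss(y, [column(j) for j in reduced])
        if rss_full <= 0.0:
            return float("inf")
        return (rss_reduced - rss_full) / (rss_full / (n - len(full) - 1))

    chosen = []
    # Forward stage: add the best marker while it clears the lax threshold.
    while len(chosen) < min(m, n - 2):
        best_f, best_j = max(
            (f_ratio(chosen, chosen + [j]), j) for j in range(m) if j not in chosen
        )
        if best_f < f_in:
            break
        chosen.append(best_j)
    # Backward stage: drop the weakest marker while it fails the stringent threshold.
    while chosen:
        worst_f, worst_j = min(
            (f_ratio([k for k in chosen if k != j], chosen), j) for j in chosen
        )
        if worst_f >= f_out:
            break
        chosen.remove(worst_j)
    return sorted(chosen)
```

With `f_in=4.0` this mirrors the lax inclusion F-ratio quoted above; `f_out` would be the empirically determined stringent threshold.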
The second QTL mapping procedure was applied in situations where the robustness of the test was assessed with nonsaturated marker coverage of the genome. In a first stage, a subset of markers was chosen in the same way as the first method. Then, an “interpretation” stage followed, during which the number of QTLs and the sign of their additive effect (for the doubled haploid populations simulated in the context of this study) were estimated. A selected isolated marker, i.e., a marker for which no adjacent markers were selected, was interpreted as evidence of a single QTL situated nearby. If two adjacent markers were selected with “apparent” effects (i.e., the partial regression coefficient of the marker-regressor in the fitted multilinear model) of the same sign, a single QTL was assumed between the two markers. If the “apparent” effects of the two selected adjacent markers were of opposite signs, two QTLs were assumed nearby. In any case, the sign of the QTL effects was inferred to be that of the respective linked markers. This procedure constitutes a minimum estimate of the number of QTLs, but regardless, no linear method can fit more than one QTL per marker interval. This estimate of the number of QTLs on the chromosome of interest and the sign of their effect constitutes the basis on which the bootstrapped sample is rejected or not, as will be explained later. It is followed by a third stage, if the sample is retained, which consists of estimating the effect and position of the QTLs within marker intervals. Thus, if an isolated marker was selected in the first stage, both its flanking markers were tested in turn by including them in the model. The marker that reduced the residual sum of squares the most was retained. The most likely interval that contains the QTL is thereby identified.
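The interpretation rules above (isolated marker, adjacent pair with the same sign, adjacent pair with opposite signs) can be written down as a short Python sketch. The function name `interpret_selected_markers`, the dictionary input, and the tuple output are hypothetical conventions chosen for illustration; runs of more than two adjacent selected markers are deliberately left out of this sketch.

```python
def interpret_selected_markers(selected):
    """Infer a minimal QTL model from the selected markers.

    `selected` maps a marker index to its apparent effect (the partial
    regression coefficient). Returns a list of (left, right, sign) tuples:
    left == right means the QTL is tied to that single marker, otherwise the
    QTL is assumed to lie in the interval between the two markers.
    """
    qtls = []
    indices = sorted(selected)
    i = 0
    while i < len(indices):
        # Collect a run of adjacent selected markers.
        j = i
        while j + 1 < len(indices) and indices[j + 1] == indices[j] + 1:
            j += 1
        run = indices[i:j + 1]
        signs = [1 if selected[k] > 0 else -1 for k in run]
        if len(run) == 1:
            # Isolated marker: one QTL nearby, with the marker's sign.
            qtls.append((run[0], run[0], signs[0]))
        elif len(run) == 2 and signs[0] == signs[1]:
            # Two adjacent markers, same sign: one QTL between them.
            qtls.append((run[0], run[1], signs[0]))
        elif len(run) == 2:
            # Two adjacent markers, opposite signs: two QTLs, one per marker.
            qtls.append((run[0], run[0], signs[0]))
            qtls.append((run[1], run[1], signs[1]))
        else:
            raise NotImplementedError("runs of more than two markers are not covered here")
        i = j + 1
    return qtls
```

The number of tuples and their signs form the "QTL model" that can be compared between the original data set and a resampled one.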
The partial regression coefficients of the flanking marker-regressors fitted one at a time are used to estimate the QTL's additive effect and position within the interval according to Equations 2 and 3, as presented in Lebreton and Visscher (1998):
In the case when two adjacent markers are selected, with apparent effects of the same sign, the most likely interval that contains the QTL is de facto identified. One of the flanking markers, say the one on the left of the interval, is removed, and the partial regression coefficient of the one on the right is calculated. Then the two markers are swapped over in the model, and the partial regression coefficient of the marker on the left is calculated. Then Equations 2 and 3 are applied to these coefficients. When the adjacent markers have apparent effects of opposite signs, the QTLs are fitted at the respective marker loci, and the apparent effects are retained as the estimate of the QTLs' effects; this is an ad hoc compromise between bias in QTL parameter estimation and speed of computation. If x (x > 2) adjacent markers were selected with “apparent” effects of the same sign, x – 1 QTLs in coupling would be assumed. However, our linear approximation was no longer able to estimate the QTLs' 2(x – 1) parameters (position and additive effect for each QTL) because there were only x values (the partial regression coefficients), and hence only x equations, to solve for 2(x – 1) unknowns, and 2(x – 1) > x whenever x > 2. Whittaker et al. (1996) also demonstrated the insolvability of this configuration when only the markers' “apparent” additive effects are available. Thus, compared to other regression-based QTL mapping methods that include other markers as background parameters so that the full model tested explains as much genetic variance as possible, such as MQTL (Tinker and Mather 1995), our method should yield very similar results, with only a slightly lower QTL detection power. This small decrease in QTL detection power is due to the fact that the pairs of adjacent markers are not systematically included in the compared models.
The test of close linkage vs. pleiotropy: The principle of the test consists of using bootstrap resampling to obtain an empirical estimate of the sampling distribution of the “distance” between the two QTLs—one per trait—that map in the same genomic region. Here we use the word distance in the loose sense because this quantity can actually take negative values and is defined as
In this article, for simplicity we considered only a population of doubled haploid lines. Thus, let X_{n,m} be the matrix containing all the marker genotypes made of n row-vectors, n being the number of genotypes, and of m column-vectors, m being the total number of markers covering the genome. Also, let Y_{1} and Y_{2} be the vectors containing the n phenotypic values for trait one and trait two, respectively. The marker and the phenotypic data for the two traits of an individual are resampled jointly (x_{i,1}, x_{i,2}, ..., x_{i,m}, y_{1,i}, y_{2,i}) and constitute one “resampleable” data point.
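A minimal Python sketch of this joint resampling is given here for illustration. The function name `resample_jointly` and the list-of-rows data layout are assumptions; the only point it demonstrates is that one index is drawn per resampled line, so that a line's marker genotypes and both of its trait values always travel together.

```python
import random


def resample_jointly(markers, trait1, trait2, rng):
    """Draw n lines with replacement, keeping each line's data together.

    markers: list of n rows of marker genotypes (the rows of X)
    trait1, trait2: lists of the n phenotypic values (Y1 and Y2)
    rng: a random.Random instance
    """
    n = len(markers)
    picks = [rng.randrange(n) for _ in range(n)]  # one index per resampled line
    return (
        [markers[i] for i in picks],
        [trait1[i] for i in picks],
        [trait2[i] for i in picks],
    )
```

Resampling the three arrays with independent indices would break the correlation between traits and between markers and traits, which is why a single index vector is used.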
As in Lebreton and Visscher (1998), a selective nonparametric bootstrap was compared to a nonselective one. With the selective scheme, after the QTL mapping procedure was applied to the resampled data set, the parameter estimates contributed values to their estimated empirical distribution only if the QTL model, i.e., the number of QTLs and the sign of their effects on the chromosome under study, conformed, for both traits, to those inferred from the original data set (see Figure 1B). Data sets producing other outcomes were rejected, and the original data were resampled again until N conforming outcomes were recorded. In the nonselective bootstraps, the same number of QTLs as that observed in the original data set was fitted whether they were significant or not in every resampled data set. All of the QTLs contributed values to the estimated empirical distributions of the parameters. Visscher et al. (1996) demonstrated that 150 bootstrap samples gave a sufficiently low bootstrap resampling error component (due to the limited number of resampled data sets as opposed to the error due to the limited size of the original data set itself; Efron and Tibshirani 1993). So we retained that value for the assessment of our test.
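The selective scheme can be summarized by the following Python sketch, given under stated assumptions rather than as the implementation used here. The function `selective_bootstrap`, the callable `fit` (which stands for any QTL mapping procedure returning a model description and an estimate of d), and the safety cap `max_draws` are all hypothetical; the default of 150 retained samples is the value quoted above.

```python
import random


def selective_bootstrap(data, fit, reference_model, n_keep=150, rng=None, max_draws=100000):
    """Collect n_keep bootstrap estimates from samples whose model conforms.

    data: list of per-line records (markers and both trait values together)
    fit: callable taking a resampled data set and returning (model, d)
    reference_model: the model inferred from the original data set
    Returns (kept estimates of d, total number of bootstrap samples drawn).
    """
    rng = rng or random.Random()
    kept = []
    draws = 0
    while len(kept) < n_keep:
        if draws >= max_draws:
            raise RuntimeError("too few conforming bootstrap samples")
        draws += 1
        # Resample whole lines with replacement.
        sample = [rng.choice(data) for _ in data]
        model, d = fit(sample)
        # Selection step: only conforming outcomes contribute to the distribution.
        if model == reference_model:
            kept.append(d)
    return kept, draws
```

Dropping the `if model == reference_model` test, and always fitting the reference number of QTLs, would give the nonselective scheme.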
The general assessment protocol for our test: The aim of the protocol is to assess empirically the bias and the power of our test by evaluating the frequency of rejection of the null hypothesis of pleiotropy over many data sets that are the realizations of a given genetic configuration. The protocol was largely inspired by that described in Visscher et al. (1996) and is depicted in Figure 1A. In a real data set, it is trivial to say that the test of close linkage vs. pleiotropy would not be applied if either of the two QTLs tested did not show a significant effect. This is why, in our simulations, a prior selection was made in the original data sets (which we call replicates from now on) so that only the replicates that yielded the simulated number of QTLs and had effects of the right sign were kept. A genomewise risk of 5% type I error in this prior selection was chosen to provide a stringency equal to what would realistically be applied in a real data set. Then the bootstrap test to be assessed was applied to each of these retained replicates with a specified risk of 5% type I error (risk of wrongly rejecting the null hypothesis of pleiotropy, as specified above) for all the simulations. The proportion of outcomes for which the null hypothesis was rejected constituted either: (1) an estimate of the power of the test when a situation of close linkage for the QTL pair studied was simulated, or (2) an estimate of the frequency of type I errors when pleiotropy was simulated, i.e., an estimate of the bias of the method, because the unbiased expectation for this observed value should be the X% that was used to define the confidence interval on the value of d.
The genetic configurations tested: For simplicity, genomes were made of one chromosome only, with a length of either 160 or 180 cM. Populations of doubled haploid lines of sizes ranging between 100 and 200 lines were simulated, with one QTL per trait in general. Heritabilities were defined as the ratio of the phenotypic variance of genetic origin between the line means over the total phenotypic variance between the line means. The trait's heritability was either 0.1 or 0.2. Only the mean of each line was simulated. Three QTLs per trait, on the same chromosome and in repulsion with a trait heritability of 0.6, were also simulated to assess the test in more complex genetic configurations. No epistatic interactions were simulated. The residuals were normally distributed.
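For readers who wish to reproduce a configuration of this kind, the following Python sketch simulates one trait with a single QTL in a doubled haploid population. It rests on several assumptions that are not stated in the text: the Haldane mapping function (no interference) for recombination between adjacent markers, a QTL located exactly at a marker, genotypes coded –1 and +1 with equal frequency, and a residual variance derived from the heritability as a^{2}(1 – h^{2})/h^{2}. The function name `simulate_dh` is hypothetical.

```python
import math
import random


def simulate_dh(n_lines, n_markers, spacing_cm, qtl_marker, h2, effect=1.0, rng=None):
    """Simulate marker genotypes and line means for one trait with one QTL."""
    rng = rng or random.Random()
    # Haldane mapping function: recombination fraction for a distance in cM.
    r = 0.5 * (1.0 - math.exp(-2.0 * spacing_cm / 100.0))
    # With +/-1 coding at frequency 1/2 the genetic variance is effect**2,
    # so the residual variance follows from the heritability.
    resid_sd = math.sqrt(effect ** 2 * (1.0 - h2) / h2)
    genotypes, phenotypes = [], []
    for _ in range(n_lines):
        line = [rng.choice((-1, 1))]
        for _ in range(n_markers - 1):
            # Switch parental allele with probability r between adjacent markers.
            line.append(-line[-1] if rng.random() < r else line[-1])
        genotypes.append(line)
        phenotypes.append(effect * line[qtl_marker] + rng.gauss(0.0, resid_sd))
    return genotypes, phenotypes
```

For example, `simulate_dh(100, 181, 1.0, 90, 0.1)` would correspond to 100 lines, a 180-cM chromosome with one marker every centimorgan, and a centrally placed QTL of 10% heritability.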
Application of the test to real data: A recombinant population of 96 doubled haploid lines of bread wheat (T. aestivum L.) was used for some QTL studies on abiotic stress resistance by Quarrie et al. (1994). It is derived from the cross Chinese Spring × SQ1, with Chinese Spring being the maternal parent. SQ1 is a sibling line at F_{7} of the line 25/3/2 from the cross Highbury × TW269 and was selected for high abscisic acid production under drought conditions. It is also an early line that flowers ∼25 days before Chinese Spring in the absence of vernalization. Chinese Spring is known for its relative salinity tolerance.
Physiological, biochemical, and yield traits were measured on plants grown in a controlled environment and subjected to saline stress treatments comparing NaCl and Na_{2}SO_{4}, in two different experiments. In both experiments, five plants per genotype were used. The NaCl experiment was started at the end of May and finished in August 1994. The Na_{2}SO_{4} experiment was started on 11 January 1996 and finished in the middle of April 1996. Before sowing, seeds were soaked in water, in Petri dishes at 4° overnight and then left at room temperature to germinate. When coleoptiles reached 12 mm, the seedlings were transferred to rockwool blocks. For the first 5 days, plants were grown in water that was then replaced with “Sangral” nutrient medium. The nutrient medium was gradually salinized when leaf one was fully expanded to reach a final concentration of 200 mm · liter^{–1} in NaCl and 100 mm · liter^{–1} in Na_{2}SO_{4}, respectively. Both experiments were performed in a greenhouse with a 16-hr day length. Daytime and nighttime temperatures were maintained at ∼20° and 15°, respectively, and the air humidity was maintained at a level of 50–60%. Plants were harvested at maturity. The average spike length in the NaCl experiment and the length of leaf five in the Na_{2}SO_{4} experiment were then measured. The data analysis was carried out on the means of each genotype.
The genome was covered with 331 markers, mostly RFLPs and AFLPs, plus a few morphological markers including the Vrn1 gene (Galiba et al. 1995). This is the main gene controlling the vernalization requirements of European wheat varieties. It was scored as days to ear emergence in unvernalized plants. In our population, it produced a clear-cut segregation for the flowering time between the genotypes. Each of the 14 chromosomes of the A and B genomes was covered with 20 markers on average. However, due to a lack of variability in the D genome of T. aestivum L. as described by Cadalen et al. (1997), we could place only 3.7 markers per D genome chromosome on average.
The marker map was constructed using the software Mapmaker (version 3.0b) with the Haldane mapping function. A set of 21 nulli-tetrasomic lines, 1 for each chromosome, allowed us to assign unambiguously from one to eight markers to each chromosome as anchor points. Other markers were then grouped around these anchor points using a LOD threshold of 3.0 and a maximum recombination fraction of 0.3. Unlinked linkage blocks with anchor markers on the same chromosome were then forced into one linkage group and oriented relative to each other according to Gale et al.'s (1995) consensus map for the RFLP markers. Segregation distortion of the marker genotypes was investigated using a χ^{2} test with 1 d.f. The normality of the trait distribution among the genotype means was checked using Shapiro and Wilk's W-statistic (Royston 1982). The QTLs were mapped, and the test of close linkage vs. pleiotropy was performed using the procedures described above that we implemented in a program in Fortran 90 (Ellis et al. 1994). No epistatic effects were fitted. The F-significance thresholds to retain markers in the backward selection stage of the QTL mapping procedure were determined empirically using 1000 permutations as described in Churchill and Doerge (1994) to achieve a target risk of 10% type I error genomewise. Then, if the QTL was isolated (i.e., no QTLs in the adjacent intervals), the F-statistic of a QTL was calculated by comparing the full model, including the markers flanking the QTL, to the reduced one that did not contain them. Otherwise, if the QTL was fitted at the marker position in the specific cases described above, only this marker was removed from the full model to calculate the F-statistic.
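The permutation approach to an empirical genomewise threshold can be sketched as follows. This is a generic illustration of the idea in Churchill and Doerge (1994), not the Fortran 90 program referred to above: the function name `permutation_threshold`, the callable `max_statistic` (assumed to return the largest test statistic found in a genome scan), and the simple order-statistic quantile are all assumptions.

```python
import random


def permutation_threshold(genotypes, phenotypes, max_statistic, n_perm=1000, alpha=0.10, rng=None):
    """Empirical genomewise threshold from phenotype permutations.

    max_statistic(genotypes, phenotypes) must return the maximum test
    statistic over all tested positions for the data set it is given.
    """
    rng = rng or random.Random()
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        # Shuffling phenotypes against fixed genotypes breaks any true
        # marker-trait association while preserving the marker structure.
        rng.shuffle(shuffled)
        maxima.append(max_statistic(genotypes, shuffled))
    maxima.sort()
    # Order statistic near the (1 - alpha) quantile of the null maxima.
    k = min(n_perm - 1, int((1.0 - alpha) * n_perm))
    return maxima[k]
```

With `n_perm=1000` and `alpha=0.10` the settings match the 1000 permutations and the 10% genomewise risk quoted above.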
The output of our test of close linkage vs. pleiotropy was a P value calculated as twice the minimum percentage of outcomes with an estimated d either above or below zero. It corresponds to the integrated area of the empirical distribution's tail beyond zero and is multiplied by two because our test is two-tailed; it is the equality of a variable (d) to a specific value (zero) that is tested. Thus, P is the maximum risk of type I error one would need to adopt to be able to reject the null hypothesis of pleiotropy.
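The calculation of P from the retained bootstrap estimates of d can be illustrated with a few lines of Python. The function name `pleiotropy_p_value` is hypothetical, the result is expressed as a proportion rather than a percentage, and two details are assumptions not specified in the text: estimates exactly equal to zero are counted in both tails, and the result is capped at 1.

```python
def pleiotropy_p_value(d_values):
    """Two-tailed bootstrap P value for the null hypothesis d = 0."""
    n = len(d_values)
    # Assumption: an estimate of exactly zero counts toward both tails.
    at_or_below = sum(1 for d in d_values if d <= 0)
    at_or_above = sum(1 for d in d_values if d >= 0)
    # Twice the smaller tail proportion, capped at 1.
    return min(1.0, 2.0 * min(at_or_below, at_or_above) / n)
```

For instance, if 3 of 150 retained estimates fall at or below zero and the rest above, the function returns 2 × 3/150 = 0.04, so pleiotropy would be rejected at a 5% risk but not at 1%.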
RESULTS
Biases in conservativeness of the selective bootstrap: Table 1 presents the results of a series of simulations aimed at assessing the bias of the selective bootstrap method to test close linkage vs. pleiotropy in a low power experimental design (although the heritability of an individual QTL can reach 20%). Nine genetic configurations were explored. In all of them, only one QTL was simulated per trait, for two traits, on a genome composed of a 180-cM-long chromosome. Because it was the bias that was being explored, the QTLs were simulated at the same position for both traits. The simulated values for the additive effects were equal to unity. The correlation between the residuals, i.e., the environmental correlation between the traits, was equal to zero. Replicates were created until 1000 had been retained that conformed to the model simulated. For example, when a QTL of 10% heritability was simulated for both traits, 2260 replicates on average were generated to retain 1000 of them. The total number of replicates fell to 1200 when a QTL of 20% heritability was simulated for both traits (results not shown). A saturated marker coverage of one marker every centimorgan was compared to a density of only one marker every 20 cM.
The percentage estimates of inclusion of the value “0” by the 95% confidence intervals for the estimate of d were within confidence limits of ±1.3%. The estimates of the QTL additive effects were significantly biased upward due to the selection of the replicates and then of the bootstrapped samples. This bias is very small for QTL heritabilities of 20%: +5.6% ± 2% on average over the replicates and +9.5% ± 2% on average over the bootstrap samples, for both marker densities. For a heritability as low as 7.5%, the bias became more important: +33% ± 2% on average over the replicates and +49% ± 2% on average over the bootstrap samples with the dense marker coverage (results not shown). Biases were significantly higher with the sparse coverage of one marker every 20 cM: +40% ± 3% and +58% ± 2%, respectively.
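The quoted margin of ±1.3% is consistent with a normal approximation to the binomial sampling error of a proportion, taking an expected inclusion rate of 0.95 and the 1000 retained replicates mentioned in the example above. The text does not state how the margin was obtained, so the following is offered only as a plausible reconstruction:

```latex
1.96\,\sqrt{\frac{p\,(1-p)}{n}}
  = 1.96\,\sqrt{\frac{0.95 \times 0.05}{1000}}
  \approx 1.96 \times 0.00689
  \approx 0.0135
```

that is, about 1.35 percentage points on either side of the observed percentage of inclusion.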
Overall, the percentages of inclusion of the value “0” were quite close to their expected values (95%), which demonstrates the small bias of the method. The method was even unbiased when the QTLs were as close as 5 cM to a chromosome end, because 95.8% of the 95% confidence intervals contained the value “0” for a QTL with a heritability of 10% with the dense marker coverage and 94% with the sparse coverage (configuration 3 in Table 1). This last result contrasts with the properties of the confidence interval on a QTL's position, calculated using the same selective bootstrap, in which case it is anticonservative when the QTL is close to a chromosome end, as demonstrated by Lebreton and Visscher (1998). Only when the heritabilities of the two QTLs tested differed markedly, as in configuration 4 in Table 1, was the estimate of the “distance” significantly biased—by about 5 cM—for both marker densities. Even then, there was not a big discrepancy between the observed and the expected percentages of inclusions with the tight marker coverage. However, the bias was more important with the sparse coverage.
Selective vs. nonselective bootstrap: In Table 1, the selective bootstrap was also compared to the nonselective one. The 95% confidence intervals obtained with the nonselective bootstrap appeared consistently conservative, even very conservative when the QTLs were centrally situated (99.2%) in configuration 1. This implies that the hypothesis of pleiotropy is rejected too infrequently and therefore that the power of the test of close linkage vs. pleiotropy is not maximized with the nonselective scheme. The only configuration for which the confidence intervals were not too large was one in which two QTLs of high heritability (20%) were simulated near a telomere. The method was rather anticonservative, and the percentage of inclusion did not differ much from that obtained using the selective bootstrap. Average confidence interval widths on the estimated distance between the two tested QTLs vary from more than 210 cM in the least powerful design to ∼90 cM in the most powerful of the simulated set, with the nonselective scheme. They can be reduced to about 125 and 67 cM, respectively, with the selective scheme. These confidence interval widths were estimated within confidence margins of ±1 cM only. Thus, the greatest reduction in confidence interval size is achieved for the least powerful designs, which makes sense, because a greater proportion of bootstrap samples were then rejected with the selective bootstrap.
Figure 2 illustrates the effect of the selection step in the bootstrap resampling, where two QTLs at different positions were simulated. It appears that the selection, in addition to reducing the bias of the test, quite dramatically increases its power, especially when the heritability is low. However, heritabilities as low as 10% offer limited resolution in the population we studied because, even when the QTLs are 180 cM apart, the power of the selective bootstrap test reached only 88.4%.
Effect of the marker density on the power of the test: In Figure 3, three marker densities were tested for two population sizes, 100 and 200 genotypes. The results demonstrated that a higher marker density significantly increased the resolution of the test for both sizes. For a population size of 100 genotypes, a difference of 10–15% in power of the test was consistently observed between a sparse coverage of one marker every 20 cM and a saturated coverage of one marker every centimorgan, over the different distances between the QTLs. With the higher QTL detection power provided by the larger population size of 200, an even larger difference in QTL separation power was observed between the different marker densities.
Effect of the environmental correlation between the traits on the power of the test: In Figure 4, the results of a series of simulations, where different degrees of environmental correlation were combined with different marker spacings, are plotted. The population size was 100, the trait heritability was 0.2, and the effect of the environmental correlation was symmetrical around the value zero. The power of the test increased with a high absolute value of the environmental correlation (r). The effect of an increase in r was more pronounced for tight marker coverage than for loose marker coverage. Thus, with one marker every 20 cM, we observed an increase of only 5% in the power of the test between r = 0 and r = 0.9, whereas with one marker every centimorgan the increase was 20%. This increase was explained by a smaller confidence interval on the estimate of d, the distance between the two QTLs, when r had a high absolute value. For r = 0, the average width of the 95% confidence interval of d was 77 cM but only 64 cM for r = 0.9 (results not shown).
Results presented in Figure 5 explore the effect of the environmental correlation in a less powerful design. The simulations followed the same protocol as that in Figure 4, and the population size was also the same, but the heritabilities of both traits were only 0.1. The maximum absolute value of r was also increased to 0.95 to accentuate any possible effect of this parameter. A very different pattern for the effect of r was observed. First, it was very asymmetrical around the value zero. Negative values of r dramatically increased the power of the test, unlike positive values. Surprisingly, the lower density of one marker every 20 cM provided a slightly better resolution except when r = 0.45. The average widths of the 95% confidence interval of d varied significantly between the different simulated configurations, from 112 ± 5 cM when r = 0 to 97 ± 6 cM when r = –0.95, for a marker spacing of 20 cM. Besides the differences in the size of the confidence interval of d, a bias in the estimate of d seemed to further exacerbate the difference in power associated with r. Thus, in Table 2, this bias was studied further. The average observed value of d was calculated over 2000 retained replicates and for the same range of genetic configurations as that simulated in Figures 4 and 5. Biases were significant only with the lower simulated trait heritability of 0.1 and larger with the lower marker density of one every 20 cM. A negative value for r tended to bias the estimate of d upward (+7.2 cM for r = –0.95 and one marker every 20 cM) and a positive value, downward (–6.9 cM for r = 0.95 and one marker every 20 cM). Likewise, the correlation coefficients between the QTL position estimates were only significant with the lower simulated trait heritability of 0.1, but with both marker densities. A negative correlation between the residuals of the phenotypic trait values generated a negative correlation between the QTL position estimates and vice versa.
Behavior of the method in more complex genetic situations: The power and the bias of the test were also investigated in more complex configurations of several QTLs per trait. The model was three QTLs per trait, on a 160-cM-long chromosome. The QTLs were of equal effect and linked in repulsion. Both trait heritabilities were 0.6, and the residuals were correlated with r = 0.2. The population was made of 150 doubled haploid lines. One thousand replicates were simulated. The results are summarized in Table 3. QTLs 1, 3, and 5 affected trait 1, whereas QTLs 2, 4, and 5 affected trait 2. Thus, QTL 5 was pleiotropic by definition, and we had two testable pairs of QTLs: QTLs 1 and 2 in one region of the chromosome and QTLs 3 and 4 in another region. The QTLs were 13 cM apart for the first pair and only 8 cM apart for the second pair. This design was inspired partly by that of Jiang and Zeng (1995), although they simulated F_{2} populations. The simulated chromosome was covered with only one marker every 20 cM to assess the robustness of the test in a realistic configuration. Both testable pairs of QTLs in close linkage were situated in the same marker interval. QTL 5 was the only QTL in its interval and was separated from the nearest one by one empty marker interval. Three positions within this interval were simulated for QTL 5. The estimate of the distance between QTL 1 and QTL 2 was very slightly biased upward by 1.6 cM and that between QTL 3 and QTL 4 was biased downward by ∼4–5.6 cM, depending on the simulated position of QTL 5. The null hypothesis of pleiotropy, for QTL 5, was rejected with varying frequencies, from 3.6%, when the QTL was simulated in the middle of the interval, to 9.8% when it was simulated at only 2 cM away from the flanking marker on the left. Simulations with the saturated map showed biases in the same directions as those observed with one marker every 20 cM for the three positions of QTL 5, respectively (results not shown).
This indicates not only that the biases observed for QTL 5 are due to the within-marker-interval bias on QTL position estimates but also that the linkage with other strong-effect QTLs differing for the two tested QTLs tends to affect the estimated distribution of d and the conservativeness of the test. The power of the test was relatively low for the other two pairs of nonpleiotropic QTLs. It ranged from only 4.2 to 7.2%, for QTL 3 and QTL 4, which were 8 cM apart, depending on the simulated position of the linked QTL 5. The power of the test for QTL 1 and QTL 2 was not affected by the position of QTL 5 and was ∼44%.
Application of the test to real data: There was a significant difference between the parents for the length of leaf five. For Chinese Spring the value was 338.0 mm on average and for SQ1 it was 214.5 mm. The distribution of this trait among the recombinant genotypes did not differ significantly from normality. The intensity of the stress generated by Na_{2}SO_{4} reduced the number of spikes to fewer than one per plant. SQ1 produced on average one spike per plant, whereas Chinese Spring did not produce any spikes in any plant. Twenty-three genotypes did not produce any spikes at all. This generated as many missing data for the average spike length trait. However, the distribution of the mean values of the remaining genotypes did not differ significantly from normality.
Segregation distortion in the marker genotypes was observed on 11 chromosomes, representing every homoeologous group except group 6. In particular, chromosome 7A showed some segregation distortion in several regions. Markers at the telomeric end of the short arm of chromosome 5A, i.e., the two leftmost markers on our map, showed significant segregation distortion (P < 0.001) in favor of the Chinese Spring-type alleles in a 72:28 ratio. A relatively good correlation was observed between the lines' phenotypic means for the two traits (R^{2} = 0.3, results not shown).
The detected QTLs for the two traits are listed in Table 4. The 1000 permutations determined an F significance threshold of 13.35 for spike length and 12.87 for leaf length. Four QTLs were detected for spike length and these accounted for 58% of the phenotypic variance among the line means. Two QTLs mapped on chromosome 5A. The first QTL was quite strongly significant (F = 44.8) and was estimated at 168.8 cM from the leftmost marker on our map. It is close to Vrn1, which mapped at 179.1 cM. The empirical distribution of its position estimate, using the selective bootstrap procedure described in Lebreton and Visscher (1998), did not allow us to reject Vrn1 as its candidate gene. One would have had to adopt a 26% type I error risk for this rejection (results not shown elsewhere). The second QTL on 5A for this trait mapped in a poorly covered region of the chromosome at position 229.0 cM in a 34.2-cM-wide marker interval and was linked in repulsion with the first QTL. Vrn1 could be rejected as its candidate gene with a 1% type I error risk, again using Lebreton and Visscher's (1998) confidence intervals. Only one QTL was detected for leaf length in the genome scan and this was at 202.6 cM on 5A, ∼20 cM from Vrn1, which required too high a risk of type I error (7%) to be rejected as its candidate gene. Figure 6 shows the comparative empirical distributions of the first QTL's position for the spike length and that of the QTL for the leaf length on 5A. Applying our test of close linkage vs. pleiotropy to these two QTLs that mapped near Vrn1 showed that the hypothesis of pleiotropy could be rejected with a stringency that allowed a 1% type I error risk. Figure 7 shows the empirical distribution of D, and zero clearly appears as an outlier to the distribution. Figure 8 shows that the correlation between the two QTLs' estimated positions was not at all significant.
The test was also applied to the second QTL on 5A for the spike length, despite the poor marker coverage in its proximity, and to the QTL for the leaf length. Figure 9 shows the comparative distributions of these two QTLs' position estimates. Figure 10 shows the empirical distribution of D for this test.
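The F significance thresholds quoted for Table 4 were obtained from 1000 permutations. A minimal Python sketch of such a permutation threshold is given here for illustration only. It assumes a simple single-marker regression scan in place of the mapping method actually used, a genome-wide level alpha = 0.05 that is not stated in the text, and hypothetical function names (`max_f`, `permutation_threshold`).

```python
import numpy as np

def max_f(genotypes, phenotype):
    """Largest single-marker regression F statistic over all markers.

    genotypes : (n_lines, n_markers) array of 0/1 marker scores, with every
                marker polymorphic (assumed).
    phenotype : (n_lines,) array of line means.
    """
    n = len(phenotype)
    g = genotypes - genotypes.mean(axis=0)      # centred marker scores
    y = phenotype - phenotype.mean()            # centred phenotype
    sxx = (g ** 2).sum(axis=0)
    beta = (g * y[:, None]).sum(axis=0) / sxx   # regression slope per marker
    ss_reg = beta ** 2 * sxx
    ss_res = (y ** 2).sum() - ss_reg
    return float(np.max(ss_reg / (ss_res / (n - 2))))

def permutation_threshold(genotypes, phenotype, n_perm=1000, alpha=0.05, seed=0):
    """Empirical threshold: the (1 - alpha) quantile of the maximum F obtained
    after shuffling the phenotypes relative to the marker genotypes."""
    rng = np.random.default_rng(seed)
    null_max = [max_f(genotypes, rng.permutation(phenotype))
                for _ in range(n_perm)]
    return float(np.quantile(null_max, 1.0 - alpha))
```

Shuffling the phenotypes breaks any marker-trait association while preserving the marker map and the trait distribution, so the resulting threshold accounts for the number and correlation of the markers scanned.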
DISCUSSION
We have introduced a novel way of testing close linkage vs. pleiotropy. Our method is based upon the empirical distribution of the distance between QTLs mapped for different traits in the same chromosome region. The method is very easy to implement and does not make any assumptions about the distribution of data or test statistics. Over the range of configurations tested with one QTL per chromosome, the method presented little bias, in that under the null hypothesis of pleiotropy the proportion of nonrejection was close to its expected value.
The QTL mapping method: Although the estimation of our QTL effect and position used the information from both flanking markers, the detection of the QTLs was carried out by selecting individual markers as opposed to testing pairs of adjacent markers as in interval mapping or its multiple-linear approximations (Haley and Knott 1992). However, the consequent loss of power in our configurations with one marker every 20 cM is not very high, as shown by Knott and Haley (1992) and Rebai et al. (1995).
The selective bootstrap: The selection step inserted in the bootstrap resampling scheme was identical to that described in Lebreton and Visscher (1998). Because a significance threshold greater than zero was imposed to retain a QTL when analyzing an original data set, it was reasonable to apply the same threshold in order to follow the same procedure when analyzing the bootstrapped data sets to work out the empirical distribution of our d statistic (see Efron and Tibshirani 1993). This means that we only retained the bootstrap outcomes that showed the same QTLs for both traits. Selection was also imposed on the sign of the QTLs. This is because a confidence interval on a QTL's parameter is defined on the condition that this QTL exists. In other words, there is a genetic factor present somewhere along the chromosome, acting in a given direction on the trait. A QTL detected with an effect of opposite sign violates this condition and cannot logically be retained to contribute a value to the estimate of the statistic's empirical distribution. Nevertheless, the selection step generates an upward bias in the absolute value of the estimated QTL effect. As QTLs of higher effects have smaller confidence intervals on their position estimate, a prerequisite of the bootstrap is violated, namely that the statistic estimated from the bootstrapped data follows the same distribution law as that estimated from the original data set. Because we did not find the means to solve the issue analytically, we resorted to simulations.
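The resampling scheme with its selection step can be sketched in Python as follows. This is an illustrative sketch under several assumptions, not the program used in this study: a toy single-marker scan (`scan_trait`) stands in for the flanking-marker regression method, one QTL per trait is assumed on the chromosome of interest, D is taken as the signed difference between the two position estimates, and the decision rule (zero lying outside a two-sided percentile interval of D) is one plausible reading of the test. All function and parameter names are hypothetical.

```python
import numpy as np

def scan_trait(genotypes, phenotype, positions):
    """Toy single-marker scan (assumption). Returns (position, effect, F)
    for the marker with the largest regression F statistic."""
    n = len(phenotype)
    g = genotypes - genotypes.mean(axis=0)
    y = phenotype - phenotype.mean()
    sxx = (g ** 2).sum(axis=0)
    sxx[sxx == 0] = np.nan                  # marker monomorphic in a resample
    beta = (g * y[:, None]).sum(axis=0) / sxx
    ss_reg = beta ** 2 * sxx
    ss_res = (y ** 2).sum() - ss_reg
    f = ss_reg / (ss_res / (n - 2))
    best = int(np.nanargmax(f))
    return positions[best], beta[best], f[best]

def selective_bootstrap_d(genotypes, y1, y2, positions, f_threshold,
                          n_keep=200, seed=0, max_draws=100000):
    """Empirical distribution of D = position(trait 1) - position(trait 2).

    Lines are resampled jointly for both traits. A bootstrap outcome is kept
    only if both QTLs exceed the significance threshold and have the same
    sign of effect as in the original data (the selection step)."""
    rng = np.random.default_rng(seed)
    n = len(y1)
    _, eff1, _ = scan_trait(genotypes, y1, positions)
    _, eff2, _ = scan_trait(genotypes, y2, positions)
    kept, draws = [], 0
    while len(kept) < n_keep and draws < max_draws:
        draws += 1
        idx = rng.integers(0, n, size=n)    # sample lines with replacement
        p1, e1, f1 = scan_trait(genotypes[idx], y1[idx], positions)
        p2, e2, f2 = scan_trait(genotypes[idx], y2[idx], positions)
        if (f1 >= f_threshold and f2 >= f_threshold
                and e1 * eff1 > 0 and e2 * eff2 > 0):
            kept.append(p1 - p2)
    return np.array(kept, dtype=float), draws

def reject_pleiotropy(d_values, alpha=0.05):
    """Reject pleiotropy if zero lies outside the percentile interval of D."""
    lo, hi = np.quantile(d_values, [alpha / 2, 1 - alpha / 2])
    return not (lo <= 0.0 <= hi)
```

The ratio of retained outcomes to total draws gives a rough indication of how strongly the selection step is acting. As noted above, the more outcomes are rejected, the larger the upward bias in the estimated QTL effects.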
In contrast to the situation in which the bootstrap resampling was applied to a single QTL's position, as in Lebreton and Visscher (1998), no bias was observed in this study, in which the distance between two QTLs was resampled, when the pleiotropic QTL was situated near a telomere, provided that the percentages of variance that it accounts for are similar for both traits. In other words, on average, the distance is close to zero. Yet the individual estimated QTL positions were biased toward the middle of the chromosome, as investigated by Hyne et al. (1995). The fact that two QTLs of similar significance, at the same position, are subject to similar biases may explain why their estimated distance remained close to the expected value of zero. Conversely, if a pleiotropic QTL explained a markedly different proportion of the variance of the two traits, the biases in the two estimated QTL positions would also be different and, as observed in our simulations, the estimated distance could differ significantly from zero. Even then, the conservativeness of the test did not seem overly decreased, which demonstrates the general robustness of the method.
Concerning the biases observed in the estimates of the QTL effects after the two selection stages, they are not solely an artificial consequence of the experimental protocol to assess our bootstrap method, but correspond to what would be observed when a set of real data is analyzed. Indeed, regarding the bias due to the first selection, if a pair of QTLs is to be tested for close linkage vs. pleiotropy, their observed effects would have to be significant for the QTLs to be detected in the first place. Imposing a significance threshold on the QTL effect for their detection involves a bias on their estimated effects, as investigated by Hyne et al. (1995). As for the bias due to the selection stage introduced in the bootstrap resampling, it grows with the percentage of outcomes rejected, which is inversely related to the size of the QTL effects.
Effect of the environmental correlation and of the marker spacing: We observed that the environmental correlation had a different pattern of effect, depending on the heritability of the traits. Thus, when the heritability was such that the experimental power was high (power of the test of close linkage vs. pleiotropy >50%), then the resolution power of our test also increased significantly with a high absolute value of the environmental correlation r. This increase in resolution and the resolution of the test itself were greater with a higher marker density. When the traits' heritability was lower, a totally different pattern emerged. The effect of r was no longer symmetrical around the value zero, and it was the lower marker density that provided a slightly higher resolution power of the test. When the heritability was lower, the residuals represented a higher part of the phenotypic variation. Thus a strong correlation between the residuals such as those observed by Cheverud et al. (1997), for example, can have a dramatic effect on the resolution power as our simulations showed. The direction and magnitude of this effect are hard to predict intuitively. We observed that a positive r generated a positive correlation of the QTL position estimates between the two traits. This is expected to decrease the variance of D and therefore increase the power of the test. A negative r generates a negative correlation of the QTL position estimates between the two traits. This is expected to increase the variance of D and therefore decrease the power of the test. However, we also showed empirically that a positive r also generates a downward bias of D, which, if no other effects were present, would increase the apparent separation power of the test and vice versa.
The overall effect is thus difficult to predict, and the simulations showed that the negative correlation between the residuals had the greatest effect in increasing the resolution; therefore, the bias on the QTL position estimates had the prevailing effect.
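The variance argument used above can be written out explicitly. The notation is introduced here for illustration only and is an assumption: D is taken as the signed difference between the two position estimates, written \hat{x}_1 and \hat{x}_2, with variances \sigma_1^2 and \sigma_2^2 and correlation \rho between them.

```latex
\begin{align}
D &= \hat{x}_1 - \hat{x}_2, \\
\operatorname{Var}(D) &= \operatorname{Var}(\hat{x}_1) + \operatorname{Var}(\hat{x}_2)
  - 2\operatorname{Cov}(\hat{x}_1, \hat{x}_2) \\
  &= \sigma_1^2 + \sigma_2^2 - 2\rho\,\sigma_1\sigma_2 .
\end{align}
```

A positive \rho therefore narrows the distribution of D and a negative \rho widens it. Power, however, depends on the mean of D as well as on its variance, so a bias in D associated with r can outweigh the change in variance, as the simulations indicated.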
It also appears that the correlation between the environmental residuals of the two traits studied could not increase the resolution beyond a certain limit despite very high values of r. This is due to the fact that even if the environmental residuals were totally correlated, there would still be some discrepancy between the two traits' phenotypic scores—i.e., the two vectors corresponding to these scores would not be colinear—due to the recombination between the two QTLs. In reality, of course, it is very likely that the traits would be determined by some additional genetic factors differing between the two traits. This would further decrease the gain in resolution obtainable from the correlation of the environmental residuals.
In the absence of bias on the QTL position estimates, the test of close linkage vs. pleiotropy showed very little bias itself. It also turned out to be robust in the presence of such biases on the QTL position estimates, whether they were due to differences between the two traits' heritabilities or to linkage with other QTLs at different distances as in the example presented in Table 3, provided that the marker density was high. However, when the marker density decreased, the test became more sensitive to these biases, including the within-marker-interval bias on the QTL position estimate.
Analysis of the wheat data set: The population sizes were small because data for a maximum of 96 doubled haploid lines were present in the data set. However, the high heritabilities of the traits studied compensated for the small sizes of our sample. The relevant chromosome span was covered with one marker every 6 cM on average for the first pair of QTLs tested for close linkage vs. pleiotropy, which mapped at some 34 cM away from each other. This allowed us to separate them with a maximum 1% type I error risk. The risk of type I error at which the second QTL for the spike length on chromosome 5A can be separated from the QTL of leaf length is to be considered with caution because of the poor marker coverage around the former. Because our QTL mapping method fitted all the detectable QTLs on all the chromosomes, in the full model, no residual correlation of genetic origin was left. There was no correlation of environmental origin between the residuals because the data came from two separate experiments for the two traits. As a consequence, there was also no correlation between the reestimated QTL positions and no extra power to be gained from a joint resampling of the data in this particular case. More generally, for traits measured on the same individuals (as opposed to the case presented above), when implementing a QTL mapping method that includes all the detectable genetic effects in its full model, such as the composite interval mapping of Zeng (1993, 1994) or the multiple-QTL model of Jansen (1993), we also expect little extra power to be drawn from a joint resampling of the data if the correlation of environmental origin between the traits is low. However, in this configuration, the nonselective bootstrap method still has the merit of supplying an assumption-free nonparametric test that is robust in a variety of situations.
In conclusion, the selective nonparametric bootstrap offers a robust alternative, with little bias, to the LOD-based method to test close linkage vs. pleiotropy over an acceptable range of genetic configurations. It is computer intensive, but using our regression-based analysis the test is completed in, at most, a few minutes on any relatively new computer because only ∼150–200 bootstraps are necessary. Another element of its attractiveness is the simplicity of its principle, hence its ease of implementation in software using any QTL mapping method, provided the mapping method lends itself to the easy and quick estimation of the minimum number of QTLs along the chromosome of interest. It can be applied to the multiple-trait joint composite interval mapping of Jiang and Zeng (1995) through resampling of the estimated distance between the two QTLs of interest in the unconstrained full model, thereby taking advantage of the increased QTL detection power of their QTL mapping method. It is unlikely that our test brings any increase in separation power to the likelihood ratio-based test, but it is unlikely that any analytical method would bring the resolution of the test below 10–30 cM in common QTL mapping designs with realistic population sizes (Kearsey and Pooni 1996). To go below that resolution it will be necessary to resort to fine QTL mapping designs, such as advanced intercross lines or near-congenic lines, or to greatly increased population sizes.
Acknowledgments
We express our gratitude to M. J. Kearsey, J. W. Snape, and J. K. M. Brown for their helpful comments on the manuscript. C.S.H. is grateful to Ministry of Agriculture, Fisheries and Food (MAFF) and Biotechnology and Biological Sciences Research Council (BBSRC) for financial support. Finally, we are also grateful to the John Innes Foundation and Monsanto who funded the Ph.D. project of the corresponding author.
Footnotes

Communicating editor: T. F. C. Mackay
 Received December 30, 1997.
 Accepted June 30, 1998.
 Copyright © 1998 by the Genetics Society of America