Genetics, Vol. 156, 899-911, October 2000, Copyright © 2000

Multitrait Least Squares for Quantitative Trait Loci Detection

Sara A. Knott (a) and Chris S. Haley (b)
(a) Institute of Cell, Animal and Population Biology, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom
(b) Roslin Institute (Edinburgh), Midlothian EH25 9PS, United Kingdom

Corresponding author: Sara A. Knott, Institute of Cell, Animal and Population Biology, University of Edinburgh, W. Mains Rd., Edinburgh EH9 3JT, United Kingdom. E-mail: s.knott@ed.ac.uk

Communicating editor: T. F. C. MACKAY


ABSTRACT

A multiple-trait QTL mapping method using least squares is described. It is presented as an extension of a single-trait method for use with three-generation, outbred pedigrees. The multiple-trait framework allows formal testing of whether the same QTL affects more than one trait (i.e., a pleiotropic QTL) or whether more than one linked QTL are segregating. Several approaches to the testing procedure are presented and their suitability discussed. The performance of the method is investigated by simulation. As previously found, multitrait analyses increase the power to detect a pleiotropic QTL and the precision of its location estimate. With enough information, discrimination between alternative genetic models is possible.


FREQUENTLY in quantitative trait loci (QTL) mapping experiments a number of phenotypic traits are scored. The common procedure has been to search for QTL trait by trait. The traits, however, are often genetically correlated and, hence, the same QTL may affect two or more traits. Where a QTL has pleiotropic effects on two or more traits, using information from the traits simultaneously should improve the power to detect the QTL and the precision of the location estimate. Alternatively, separate QTL for two different traits may map to a similar location, and it is important to be able to test whether the same QTL could be affecting several traits or whether different linked QTL explain the observations. An obvious extension to the methods for single-trait analyses, therefore, is to methods for multitrait analyses, taking advantage of their potential improved power and precision and enabling additional tests about the genetic control of multiple traits to be addressed.

Several approaches to multitrait analyses have been proposed. WELLER et al. (1996) and MANGIN et al. (1998) have both proposed the use of canonical transformations of the original data followed by single-trait analyses on the resulting independent traits. The advantage of using an approach based on canonical transformation is that after transformation, existing single-trait software can be used for QTL analyses. If all canonical traits are analyzed, results for a given location can be back-transformed to give estimates of the effects of a putative QTL at that location on the original traits. Using canonical transformation is not wholly satisfactory, however, as a transformation that produces traits that are either phenotypically or genetically uncorrelated does not ensure that QTL only influence a single canonical trait. This is because different QTL affecting a trait may have different patterns of pleiotropy (e.g., some QTL affecting only one trait, others affecting two or more traits). In this case, it is not possible to find a canonical transform that ensures all QTL only influence one canonical trait. Consequently, it cannot be assumed that QTL found to be affecting two different canonical variables in the same location are actually different QTL, as stated by WELLER et al. (1996). One could only conclude that QTL affecting different canonical traits are indeed different if the genetic correlations between traits are the same as the phenotypic correlations and all individual QTL follow the same pattern. Indeed, if all individual QTL have the same pattern of effects on two or more traits, one would expect the genetic correlation between traits to be plus or minus unity or zero.

Maximum-likelihood (ML) approaches have been proposed for a number of different experimental designs, for example, for crosses between inbred lines (JIANG and ZENG 1995), for sib-pair analyses (EAVES et al. 1996), and for half-sib families (RONIN et al. 1995). The main drawbacks of ML approaches are that they are computationally intensive, require specialized software, and can lack robustness. In some circumstances, these multitrait approaches have been shown to have increased power for detecting a QTL and improved precision of parameter estimates. The testing procedure, however, is slow to carry out with maximum likelihood, as standard test-statistic distributions under the null hypothesis are not usually applicable because multiple correlated tests are being performed. Also, extensions to include many QTL are cumbersome within the maximum-likelihood framework.

Recently WU et al. (1999) used a multitrait least-squares method to analyze tiller number in rice measured at different times during development. Using a population of recombinant inbred lines and treating the measurement at each time as a different trait, they consider a model with a QTL with an effect on each trait. This is equivalent to the pleiotropic model presented in this article. They compare this with single-trait analyses of tiller number at the different times and find that the multitrait analyses are not necessarily more powerful when considering a single trait measured on several occasions over time, but suggest that the benefit comes in estimating the QTL location more precisely. They do not consider any other multitrait models.

CHEVERUD et al. (1997) and LEBRETON et al. (1998) have considered the problem of discriminating between linked QTL and a pleiotropic QTL. In both cases, however, the methods are based on single-trait analyses that may lack power compared with a multitrait alternative. CHEVERUD et al. (1997) propose a likelihood-ratio test to compare locations estimated for each trait separately vs. a weighted average location. This test has not been formally evaluated and it is not clear that an estimated location based on a weighted average will be the same as that obtained if the traits were analyzed simultaneously in a multitrait analysis. The method proposed by LEBRETON et al. (1998) does not rely on the use of likelihoods. Using a bootstrap approach, multiple sets of data are resampled and analyzed and the linked-QTL hypothesis is rejected (in favor of the null hypothesis of pleiotropy) if the confidence interval for the distance between the estimated QTL locations (one for each trait) includes zero. This approach examines traits a pair at a time and results would become difficult to interpret if many traits were considered. ALMASY et al. (1997) test for pleiotropy vs. linkage in a bivariate analysis using an identity-by-descent method with maximum likelihood. The extension to multiple traits is not clear and the use of maximum likelihood means that specialized software is required and analyses will be more time consuming than a least-squares alternative.

Testing procedures for multitrait analyses have not been well explored. For example, it is not clear whether one should start with one multitrait analysis of all traits, or with a series of single-trait analyses, or concentrate on multitrait analyses of subsets of traits clustered, for example, on their phenotypic or genetic correlations. The answer to such questions is likely to depend both on the objectives of the study (for example, is one interested in maximizing power to detect QTL that jointly affect several traits or, alternatively, to detect QTL with effects that run counter to a particular genetic correlation) and on the actual underlying genetic structure. Part of the reason for the difficulty in pursuing such questions is that performing replicated simulation studies is time consuming when using maximum-likelihood approaches.

In this article, we formally describe the implementation of a straightforward multitrait least-squares analysis for QTL detection and location, including models for pleiotropic and for linked QTL, and we explore alternative testing procedures. The basis of this approach is the single-trait analyses proposed by HALEY and KNOTT (1992) and HALEY et al. (1994). We have developed and applied the method to a three-generation pedigree where the QTL can be assumed to be fixed in the grandparental generation. The general principles, however, are applicable to the wide range of different population structures appropriate for QTL mapping and amenable to least-squares analysis (see, for example, KNOTT et al. 1996, 1997; VAN KAAM et al. 1998). Alternative approaches would be required for more general pedigree structures with many different relationships and multiple generations.


METHOD

Here we describe the use of standard multiple-trait multivariate regression for the detection of QTL. The approach is described for a population based on the F2 generation of a cross between two lines, where the original lines can be assumed to be homozygous for any QTL of interest. In its simplest form this could be a true F2 from a cross between inbred lines; alternatively, following procedures presented by HALEY et al. (1994), the initial lines may be segregating at the markers although fixed for alternative alleles at QTL of major effect. Traits are recorded on the F2 generation and all three generations are genotyped at the markers. We illustrate the method by reference to data on two traits, but the extension to multiple traits is straightforward.

The analysis is in two parts. In the first part, at locations throughout the genome, the probability of each F2 individual being each of the four possible genotypes (accounting for maternal or paternal origin of the alleles) is calculated based on observed marker genotypes. Note that in an inbred line cross the two heterozygous genotypes are indistinguishable. In the second part these probabilities are used as independent variates in a model to analyze the phenotypic data. The approach presented in this article replaces the single-trait least-squares analyses used in the second part by HALEY and KNOTT (1992) and HALEY et al. (1994) with a multiple-trait analysis.

Basic model:
The extension of the single-trait to the multitrait analysis is straightforward and standard statistical procedures can be followed. The basic model is

$$Y = XB + E,$$

where Y is a matrix (of dimension n x t, where n is the number of F2 individuals and t the number of traits) containing the trait values for all individuals; X is the design matrix (of dimension n x p, where p is the number of explanatory variables) containing fixed-effect levels, covariates, and functions of the genotype probabilities for the location(s) being considered; B (of dimension p x t) contains the estimates for each trait of the fixed effects, covariates, and genotype effects; and E (of dimension n x t) contains the error values for each trait and individual (under the general theory for testing, these are assumed to be multivariately normally distributed). Under this multitrait model, all traits have the same design matrix (X); i.e., the same fixed effects and covariates are fitted. For an additive model, for each QTL in the model the X matrix has an additional column. This column contains the coefficients associated with the considered location, which are a function of the genotype probabilities for each individual, as described in HALEY et al. (1994). For example, the additive coefficient is the probability of the individual being one homozygous genotype minus the probability of being the other. When the grandparental lines are inbred, up to two effects per QTL can be fitted (for example, an additive and a dominance effect at the location). When they are outbred, because we may be able to distinguish between the two heterozygous genotypes at the QTL, up to three effects may be possible (for example, additive, dominance, and imprinting effects as fitted in KNOTT et al. 1998, or a maternal, paternal, and interaction effect following KNOTT et al. 1997).

The solutions to the equations can be obtained as usual as

$$\hat{B} = (X'X)^{-1}X'Y.$$

For a given location the solutions and standard errors are identical to those obtained fitting the same model to the traits separately; the advantage of the multitrait approach comes in the testing procedure.
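The point that the multitrait solutions match the single-trait ones can be checked numerically. The following is a minimal sketch using NumPy, assuming the usual least-squares solution (X'X)^{-1}X'Y for the model Y = XB + E; the function name `fit_multitrait`, the simulated data, and the effect sizes are illustrative assumptions and not part of the original analysis.

```python
import numpy as np

def fit_multitrait(X, Y):
    """Least-squares solutions and residual SS matrix for the model Y = XB + E."""
    B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)  # p x t matrix of estimates
    resid = Y - X @ B_hat
    return B_hat, resid.T @ resid                  # t x t residual SS matrix

# Hypothetical data: 200 F2 individuals, two traits, one putative QTL location.
rng = np.random.default_rng(1)
n = 200
add_coef = rng.choice([-1.0, 0.0, 1.0], p=[0.25, 0.5, 0.25], size=n)
X = np.column_stack([np.ones(n), add_coef])        # mean + additive coefficient
true_B = np.array([[10.0, 5.0], [0.5, 0.25]])      # illustrative values only
Y = X @ true_B + rng.normal(size=(n, 2))

B_hat, rss = fit_multitrait(X, Y)

# Fitting each trait on its own gives the same solutions, column by column.
b_trait1, *_ = np.linalg.lstsq(X, Y[:, 0], rcond=None)
b_trait2, *_ = np.linalg.lstsq(X, Y[:, 1], rcond=None)
same_solutions = np.allclose(B_hat, np.column_stack([b_trait1, b_trait2]))
```

The off-diagonal element of `rss` is the only quantity that the separate single-trait fits do not provide, and it is what the multitrait tests draw on.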

Significance tests:
Focusing on the situation with only two traits, each being controlled by a maximum of one QTL in the linkage group being considered, we are primarily interested in two types of test: (1) Is there evidence for QTL in this linkage group affecting the traits? and (2) If there is evidence for QTL affecting both traits, are there two linked QTL (one affecting each trait) or one pleiotropic QTL? For both of these tests the significance thresholds cannot be obtained from standard tables.

Evidence for QTL?
For both traits we consider the test of control by a single QTL vs. no QTL in the linkage group. The procedure is identical to that for single-trait analyses, in that the model given above, with a single QTL, is fitted for a given location in the genome and then repeated for different locations until the whole genome has been scanned. At each location the residual sums of squares (SS) matrix, which we denote RSSp, can be calculated as $Y'Y - \hat{B}'X'Y$, where the X matrix contains columns for the fixed effects and covariates that are constant throughout the genome and additional columns associated with the QTL at a given location that change from location to location. The $\hat{B}$ matrix contains the relevant estimates. The location giving the best fit (see below) is the most likely location for a QTL. Many locations may give evidence for a QTL. Using results from a single scan of the genome, a number of alternative tests can be proposed. The procedure is detailed here and the tests are summarized in Table 1.


 

 
Table 1. Summary of the alternative tests performed

Single-trait test: For each trait separately, from the relevant diagonal element of the SS matrix RSSp, we can find the location giving the smallest residual SS when fitting the QTL. This is the most likely location for a single QTL that affects only this trait. The F-ratio of the mean square explained by fitting the QTL to the residual MS can be used to test whether the effect is significant. This is identical to the single-trait analyses described by HALEY et al. (1994) and HALEY and KNOTT (1992). For uniformity across the single- and multiple-trait analyses in this article, we present results as approximate likelihood ratios. For single-trait analyses, following SEAL (1964), these can be obtained as $-\{\mathrm{resdf} - \tfrac{1}{2}(2 - \mathrm{testdf})\}\ln(\mathrm{RSSp}/\mathrm{RSSr})$, where resdf are the residual degrees of freedom after fitting the QTL and testdf are the degrees of freedom for the test being performed (i.e., the degrees of freedom used to model the QTL effect, e.g., one for an additive QTL with no dominance component fitted). RSSp is the relevant SS from the residual SS matrix having fitted the QTL and RSSr is the SS obtained when fitting only fixed effects and any covariates but not the QTL. RSSr is constant across the genome. This test statistic is approximately equal to testdf x F. For a given location in the genome and under the null hypothesis of no QTL, the test statistic is expected to be distributed approximately as chi square with degrees of freedom equal to testdf.
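The relationship between the approximate likelihood ratio and the F-ratio can be illustrated with a small calculation. This sketch assumes the multiplier takes the Bartlett-type form resdf - (2 - testdf)/2 applied to ln(RSSp/RSSr); the function names and the numerical inputs are hypothetical.

```python
import numpy as np

def approx_lr_single(rss_p, rss_r, resdf, testdf):
    """Approximate likelihood ratio for one trait (assumed Bartlett-type multiplier)."""
    return -(resdf - 0.5 * (2 - testdf)) * np.log(rss_p / rss_r)

def f_ratio(rss_p, rss_r, resdf, testdf):
    """F-ratio of the mean square explained by the QTL to the residual mean square."""
    return ((rss_r - rss_p) / testdf) / (rss_p / resdf)

# Hypothetical values: 400 individuals, mean + one additive effect fitted.
rss_r, rss_p = 400.0, 380.0
resdf, testdf = 398, 1

lr = approx_lr_single(rss_p, rss_r, resdf, testdf)
f = f_ratio(rss_p, rss_r, resdf, testdf)
```

With these inputs the two quantities agree to within a few percent, in line with the statement that the statistic is approximately testdf x F.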

Multiple-trait tests, pleiotropic QTL model: The determinant of the residual SS matrix obtained fitting, at a given location, a QTL affecting both traits, |RSSp|, can be used to identify the location explaining the most variance in the two traits jointly. As stated above, for uniformity we present approximate likelihood-ratio tests, although other test statistics, such as Wilks' $\Lambda$, could be used. We can calculate the following statistic (following SEAL 1964) to test for the presence of the QTL

$$-\left\{\mathrm{resdf} - \tfrac{1}{2}(t + 1 - \mathrm{testdf})\right\}\ln\left(\frac{|\mathrm{RSSp}|}{|\mathrm{RSSr}|}\right) \qquad (1)$$

with resdf and testdf being the degrees of freedom of the residual after fitting the QTL and of the test, respectively, for each trait, t being the number of traits and RSSr being the residual SS matrix when the QTL is not fitted. For a given location in the genome and under the null hypothesis of no QTL, this test statistic is expected to be distributed as chi square with t x testdf degrees of freedom when considering a single location. As with the single-trait analyses, however, we have the problem that we are selecting the best test statistic from multiple correlated tests and, hence, the standard tables cannot be used. The null hypothesis test statistic distribution, therefore, has to be generated by some other means. A permutation test (CHURCHILL and DOERGE 1994) could be used to generate this distribution. This test is for the combined effect of the QTL on both traits and, therefore, does not test whether both traits are significantly affected by the QTL. Note that even where there is a single pleiotropic QTL, the location that explains most variance over both traits need not be the same as either of the locations that individually explain the most variance in each trait separately.
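A scan under the pleiotropic model, with a permutation threshold, might be sketched as follows. The statistic is assumed to have the form -{resdf - (t + 1 - testdf)/2} ln(|RSSp|/|RSSr|). All function names are hypothetical, and the additive coefficients are simulated as independent columns purely for illustration; real coefficients at neighbouring locations would be correlated through linkage.

```python
import numpy as np

def rss_matrix(X, Y):
    """Residual SS matrix after least-squares fitting of Y on X."""
    B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ B_hat
    return resid.T @ resid

def approx_lr(rss_full, rss_reduced, resdf, testdf, t):
    """Approximate likelihood ratio from determinants (assumed Bartlett-type multiplier)."""
    logdet_full = np.linalg.slogdet(rss_full)[1]
    logdet_reduced = np.linalg.slogdet(rss_reduced)[1]
    return -(resdf - 0.5 * (t + 1 - testdf)) * (logdet_full - logdet_reduced)

def scan(coefs, Y):
    """Test statistic at every location; coefs is n x (number of locations)."""
    n, t = Y.shape
    ones = np.ones((n, 1))
    rss_r = rss_matrix(ones, Y)          # no QTL fitted: mean only
    resdf, testdf = n - 2, 1             # mean + one additive effect
    return np.array([
        approx_lr(rss_matrix(np.column_stack([ones, coefs[:, j]]), Y),
                  rss_r, resdf, testdf, t)
        for j in range(coefs.shape[1])
    ])

def permutation_threshold(coefs, Y, n_perm=200, alpha=0.05, seed=0):
    """Shuffle whole phenotype rows against the marker data; keep the maximum each time."""
    rng = np.random.default_rng(seed)
    maxima = [scan(coefs, Y[rng.permutation(len(Y))]).max() for _ in range(n_perm)]
    return float(np.quantile(maxima, 1 - alpha))

# Illustrative data: a pleiotropic QTL at location index 5 affecting both traits.
rng = np.random.default_rng(2)
n, n_loc = 300, 11
coefs = rng.choice([-1.0, 0.0, 1.0], p=[0.25, 0.5, 0.25], size=(n, n_loc))
Y = rng.normal(size=(n, 2)) + 0.5 * coefs[:, [5]]

lr = scan(coefs, Y)
best = int(lr.argmax())
threshold = permutation_threshold(coefs, Y)
```

Permuting complete rows of Y keeps the correlation between the traits intact while breaking any association with the markers, which is why one threshold covers both traits.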

Multiple-trait tests, linked-QTL model: A more general multitrait model is one in which each trait is affected by a different, single QTL. The SS matrix for this model can be obtained by finding the best location for each trait separately (as described for the single-trait tests). The X matrix containing only fixed effects and covariates is augmented with columns for each trait (or QTL) that simply contain the relevant function of the genotype probabilities (additive or dominance coefficients, for example) for the best location of that trait. The $\hat{B}$ matrix contains the estimates obtained when fitting the best location alone for each trait and zero effect for all other locations. For example, in a model fitting only the mean and the additive effect at the putative QTL, the X and $\hat{B}$ matrices would be

$$X_l = \begin{pmatrix} 1 & c_{1j} & c_{1k} \\ \vdots & \vdots & \vdots \\ 1 & c_{nj} & c_{nk} \end{pmatrix}, \qquad \hat{B}_l = \begin{pmatrix} \hat{\mu}_1 & \hat{\mu}_2 \\ \hat{a}_{j1} & 0 \\ 0 & \hat{a}_{k2} \end{pmatrix},$$

where $c_{ij}$ is the coefficient for F2 individual i for the additive effect calculated at the best location for the QTL affecting trait 1 (location j) and $c_{ik}$ is the coefficient calculated at the best location for the QTL affecting trait 2 (location k); $\hat{\mu}_1$ and $\hat{\mu}_2$ are the estimated means for the traits (after fitting the QTL) and $\hat{a}_{j1}$ is the estimated additive effect for trait 1 at location j and $\hat{a}_{k2}$ for trait 2 at location k. These estimates are obtained from the analyses fitting a pleiotropic QTL (or, equivalently, single-trait analyses) at the required locations. The residual SS matrix can now be obtained as $(Y - X_l\hat{B}_l)'(Y - X_l\hat{B}_l)$. For a test of the joint effect of all the QTL vs. no QTL, the ratio of the determinant of this reconstituted residual SS matrix, which we denote |RSSl|, to the determinant of the residual SS matrix where no QTL are fitted, |RSSr|, can be used. Using Equation 1, this can be converted into an approximate likelihood ratio. Considering a given location for a QTL affecting each trait, the statistic should be distributed as a chi square with t x testdf degrees of freedom under the null hypothesis that there is not a QTL affecting either trait. If the best location for both traits is the same, this test statistic will have the same value as the pleiotropic model test statistic testing for the presence of a pleiotropic QTL vs. no QTL.
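The reconstitution of the residual SS matrix for the linked-QTL model could be sketched as below, for two traits and a mean-plus-additive model. The function names are hypothetical, and the coefficients are again simulated as independent columns for illustration only.

```python
import numpy as np

def single_fit(c, y):
    """Fit mean + additive effect for one trait at one location; return estimates and RSS."""
    X = np.column_stack([np.ones(len(y)), c])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return b, float(r @ r)

def linked_rss(coefs, Y):
    """Residual SS matrix when each of two traits has its own best location."""
    n = Y.shape[0]
    best = []
    for trait in range(2):
        fits = [single_fit(coefs[:, j], Y[:, trait]) for j in range(coefs.shape[1])]
        j_best = int(np.argmin([rss for _, rss in fits]))
        best.append((j_best, fits[j_best][0]))
    (j, b1), (k, b2) = best
    # Design matrix with one coefficient column per trait-specific location.
    X_l = np.column_stack([np.ones(n), coefs[:, j], coefs[:, k]])
    # Each trait gets its own mean and effect, and zero for the other trait's location.
    B_l = np.array([[b1[0], b2[0]],
                    [b1[1], 0.0],
                    [0.0, b2[1]]])
    resid = Y - X_l @ B_l
    return (j, k), resid.T @ resid

# Illustrative data: trait 1 affected at location index 2, trait 2 at index 7.
rng = np.random.default_rng(3)
n, n_loc = 300, 11
coefs = rng.choice([-1.0, 0.0, 1.0], p=[0.25, 0.5, 0.25], size=(n, n_loc))
Y = rng.normal(size=(n, 2))
Y[:, 0] += 1.0 * coefs[:, 2]
Y[:, 1] += 1.0 * coefs[:, 7]

(loc1, loc2), rss_l = linked_rss(coefs, Y)
```

The diagonal of `rss_l` equals the smallest single-trait residual SS for each trait; only the off-diagonal element is new information relative to the two single-trait scans.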

Two linked QTL or one pleiotropic QTL?
If both traits are found to be affected by QTL in a given region, we are generally interested in determining whether it is the same QTL affecting both traits or whether there is more than one QTL. We consider two approaches for comparing these alternative hypotheses.

Likelihood-ratio test: A test for two linked QTL vs. one pleiotropic QTL can be accommodated within the multiple-trait framework. The test is based on the ratio of the determinants of the residual SS matrix from the best pleiotropic QTL model, |RSSp|, to the residual SS from the best linked-QTL model, |RSSl|. Converting to an approximate likelihood ratio we wish the test to have 1 d.f. (for the additional location estimated in the linkage model). This can be achieved by setting t and testdf to 1 [or more generally for t traits, this test has (t - 1) d.f.]. If the best locations for the two QTL from single-trait analyses are the same, this test gives a value of 0. As stated for the previous tests, we are no longer within the conditions for the standard significance thresholds to apply and so the distribution of the test statistic under the null hypothesis is required. The null hypothesis for this test is a pleiotropic QTL and, hence, the standard permutation test (for which the null hypothesis is a model with no QTL) cannot be implemented. WALLING et al. (1998) investigated the use of the parametric bootstrap to obtain empirical distributions of the QTL location estimate to estimate standard errors. They proposed using the estimates from the data analysis with the relevant model as parameters for simulation and performed multiple replicate simulations and analyses to obtain the distributions. For the test of linkage vs. pleiotropy an analogous procedure can be used. The estimates obtained from the best pleiotropic QTL model are taken and used as parameters for replicate simulations. The test statistic for linkage vs. pleiotropy can be obtained from each replicate giving the empirical distribution over the replicate simulations, from which the significance threshold can be obtained. The test statistic obtained from the original data set can be compared with this threshold to determine whether the test is significant. WALLING et al. (1998) considered two situations, one where the original marker data were used and a second where the marker data were also simulated for each replicate. They found no consistent difference between these approaches and as the former is easier and faster to implement, this is the version considered here. From the original analysis we have already calculated the probability of an F2 individual being each of the possible genotypes at all locations throughout the linkage group and the probabilities for the location of the best-fitting pleiotropic QTL can be used in the simulation.
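A parametric bootstrap along these lines might look like the following sketch for two traits. It makes several simplifying assumptions that are not taken from the article: the statistic uses the multiplier resdf - 1/2 (t and testdf both set to 1), resdf is n - 2, the additive coefficient at the best pleiotropic location is used directly as the genotype score instead of sampling a genotype from the probabilities, and coefficients at different locations are simulated as independent. All function names are hypothetical.

```python
import numpy as np

def design(coefs, j):
    return np.column_stack([np.ones(coefs.shape[0]), coefs[:, j]])

def fit(X, Y):
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ B
    return B, resid.T @ resid

def lr_linkage_vs_pleiotropy(coefs, Y):
    """Statistic comparing the best linked-QTL model with the best pleiotropic model."""
    n, n_loc = coefs.shape
    fits = [fit(design(coefs, j), Y) for j in range(n_loc)]
    logdets = [np.linalg.slogdet(rss)[1] for _, rss in fits]
    jp = int(np.argmin(logdets))                        # best pleiotropic location
    j1 = int(np.argmin([rss[0, 0] for _, rss in fits])) # best location for trait 1
    j2 = int(np.argmin([rss[1, 1] for _, rss in fits])) # best location for trait 2
    r1 = Y[:, 0] - design(coefs, j1) @ fits[j1][0][:, 0]
    r2 = Y[:, 1] - design(coefs, j2) @ fits[j2][0][:, 1]
    R = np.column_stack([r1, r2])
    logdet_l = np.linalg.slogdet(R.T @ R)[1]
    stat = -((n - 2) - 0.5) * (logdet_l - logdets[jp])
    return stat, jp, fits[jp]

def parametric_bootstrap_threshold(coefs, Y, n_rep=200, alpha=0.05, seed=0):
    """Simulate under the fitted pleiotropic model, keeping the original marker data."""
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    _, jp, (B_hat, rss_p) = lr_linkage_vs_pleiotropy(coefs, Y)
    sigma = rss_p / (n - 2)                             # estimated residual covariance
    mean = design(coefs, jp) @ B_hat
    null_stats = []
    for _ in range(n_rep):
        Y_sim = mean + rng.multivariate_normal(np.zeros(2), sigma, size=n)
        null_stats.append(lr_linkage_vs_pleiotropy(coefs, Y_sim)[0])
    return float(np.quantile(null_stats, 1 - alpha))

# Illustrative data with two separate QTL (location indices 2 and 7).
rng = np.random.default_rng(4)
n, n_loc = 300, 11
coefs = rng.choice([-1.0, 0.0, 1.0], p=[0.25, 0.5, 0.25], size=(n, n_loc))
Y = rng.normal(size=(n, 2))
Y[:, 0] += 1.0 * coefs[:, 2]
Y[:, 1] += 1.0 * coefs[:, 7]

observed, _, _ = lr_linkage_vs_pleiotropy(coefs, Y)
threshold = parametric_bootstrap_threshold(coefs, Y)
```

Because the replicates are generated from the fitted pleiotropic model, the resulting threshold reflects the estimated size of the QTL effect, which a permutation under a no-QTL null could not do.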

Nonparametric bootstrap: LEBRETON et al. (1998) gave an alternative approach, not making direct use of the test statistic derived above, using the nonparametric bootstrap. If there was evidence for QTL affecting both traits, then the data were sampled with replacement to give a data set containing the same number of individuals and reanalyzed fitting a QTL for each trait. The distance between the best QTL locations was calculated and the 95% confidence interval (C.I.) for this distance obtained from replicate samplings. The test for linkage vs. pleiotropy is whether the C.I. for the estimate of the difference in position of the two QTL includes zero. If it does then there is no evidence to exclude the model of the same QTL affecting both traits (i.e., a pleiotropic QTL). This approach does not require the use of the multitrait framework but it is not obvious how it would be extended to more than two traits.
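The resampling scheme can be sketched as follows. This is an illustration under stated assumptions, not the published implementation: the interval is a simple percentile interval of the signed difference in position, the model has a mean and one additive effect per trait, and the coefficients at the 11 positions are simulated as independent. The names `best_position` and `bootstrap_distance_ci` are hypothetical.

```python
import numpy as np

def best_position(coefs, y, positions):
    """Position whose additive coefficient explains the most SS for one trait."""
    Cc = coefs - coefs.mean(axis=0)
    yc = y - y.mean()
    # SS explained by a simple regression of y on each coefficient column.
    explained = (Cc.T @ yc) ** 2 / (Cc ** 2).sum(axis=0)
    return positions[int(np.argmax(explained))]

def bootstrap_distance_ci(coefs, Y, positions, n_boot=1000, level=0.95, seed=0):
    """Percentile interval for the difference between the two best positions."""
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)    # sample individuals with replacement
        diffs[b] = (best_position(coefs[idx], Y[idx, 0], positions)
                    - best_position(coefs[idx], Y[idx, 1], positions))
    lo, hi = np.quantile(diffs, [(1 - level) / 2, (1 + level) / 2])
    return float(lo), float(hi)

positions = np.arange(0, 101, 10)           # 11 positions on a 100-cM group
rng = np.random.default_rng(5)
n = 300
coefs = rng.choice([-1.0, 0.0, 1.0], p=[0.25, 0.5, 0.25], size=(n, len(positions)))

# Two separate QTL at 20 and 80 cM.
Y_linked = rng.normal(size=(n, 2))
Y_linked[:, 0] += 1.0 * coefs[:, 2]
Y_linked[:, 1] += 1.0 * coefs[:, 8]
ci_linked = bootstrap_distance_ci(coefs, Y_linked, positions)

# One pleiotropic QTL at 50 cM.
Y_pleio = rng.normal(size=(n, 2)) + 1.0 * coefs[:, [5]]
ci_pleio = bootstrap_distance_ci(coefs, Y_pleio, positions)
```

An interval that excludes zero, as expected for `ci_linked`, would be read as evidence against a single pleiotropic QTL; an interval that covers zero, as expected for `ci_pleio`, would not.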


SIMULATION

To investigate the behavior of this approach to QTL detection, a number of different genetic scenarios were simulated. In all cases a single linkage group and only two traits were considered and there was a maximum of one QTL affecting each trait. The QTL could be pleiotropic, that is, affecting both traits, or affecting only one trait. At most two QTL were simulated in a replicate, one affecting each trait. A total of 400 F2 individuals were simulated, each with a 100-cM chromosome with markers every 10 cM (giving 11 markers in all). Three sizes of additive QTL effect were considered: 0.5, 0.25, and 0.125 residual standard deviations (i.e., explaining 0.11, 0.03, and 0.01 of the total variance for a single trait in the F2). With an F2 design we cannot distinguish between genetic (unlinked to the markers being considered) and environmental residual variance, hence an overall residual correlation of 0.75 was investigated in addition to no residual correlation for the QTL of intermediate effect. When only a single QTL was simulated, this QTL was placed at 50 cM (i.e., the midpoint of the linkage group). Two QTL were placed either 20 cM apart or 60 cM apart, equidistant from the center of the linkage group. A total of 100 replicates of each situation were simulated and analyzed.
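The proportions of variance quoted for the three effect sizes can be reproduced with a short calculation. The sketch assumes that the additive effect a is half the difference between the two homozygotes, expressed in residual standard deviations, so that the QTL variance in an F2 is a²/2 against a residual variance of 1; the function name is hypothetical.

```python
def proportion_explained(a):
    """Share of F2 trait variance due to an additive QTL of effect a (residual SD units)."""
    qtl_variance = a ** 2 / 2.0     # F2 genotype frequencies 1/4, 1/2, 1/4
    return qtl_variance / (qtl_variance + 1.0)

effects = [0.5, 0.25, 0.125]
proportions = [proportion_explained(a) for a in effects]
rounded = [round(p, 2) for p in proportions]
```

Rounded to two decimals, these match the 0.11, 0.03, and 0.01 given in the text.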

Significance thresholds:
For the tests with a null hypothesis of no QTL (see Table 1), significance thresholds were obtained by simulating F2 individuals with the markers described above (i.e., 100-cM chromosome with 11 equally spaced markers) and phenotypes depending only on a random environmental component without any QTL. A total of 5000 replicate simulations were performed to obtain the chromosomal 5% threshold (to be used as the experimental threshold for the purpose of this study). Residual correlations of 0.25 and 0.75 were investigated as well as no residual correlation. The relevant test statistics were calculated for each replicate and their distribution over multiple replicates was used to obtain the significance thresholds. For the single-trait tests, to account for the fact that two, possibly correlated, traits were being analyzed, for each replicate simulation the higher of the two single-trait test statistics was picked and the distribution of these was used to obtain the threshold. Alternatively, we can assume that the two traits are uncorrelated and use the 2.5% threshold for each trait, which is equivalent to 5% over both traits (following Bonferroni adjustment). For the multiple-trait tests, i.e., the pleiotropic QTL model and the linked-QTL model tests (Table 1), the significance thresholds already account for the fact that more than one trait is being analyzed.
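The two ways of setting the single-trait threshold can be compared in a small null simulation. The sketch below is a simplification under stated assumptions: additive coefficients at the 11 locations are drawn independently instead of being generated from a linked marker map, far fewer replicates are used than the 5000 in the study, and the single-trait statistic is assumed to use the multiplier resdf - (2 - testdf)/2. Function names are hypothetical.

```python
import numpy as np

def max_single_trait_lr(coefs, y):
    """Largest approximate likelihood ratio over all locations for one trait."""
    n = len(y)
    Cc = coefs - coefs.mean(axis=0)
    yc = y - y.mean()
    rss_r = yc @ yc                                   # mean only, no QTL
    explained = (Cc.T @ yc) ** 2 / (Cc ** 2).sum(axis=0)
    resdf, testdf = n - 2, 1
    lr = -(resdf - 0.5 * (2 - testdf)) * np.log((rss_r - explained) / rss_r)
    return lr.max()

def null_thresholds(n=400, n_loc=11, rho=0.75, n_rep=500, seed=0):
    """Empirical 5% thresholds with no QTL and residual correlation rho."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    stats = np.empty((n_rep, 2))
    for r in range(n_rep):
        coefs = rng.choice([-1.0, 0.0, 1.0], p=[0.25, 0.5, 0.25], size=(n, n_loc))
        Y = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        stats[r] = [max_single_trait_lr(coefs, Y[:, k]) for k in range(2)]
    joint = np.quantile(stats.max(axis=1), 0.95)      # higher of the two traits
    bonferroni = np.quantile(stats[:, 0], 0.975)      # 2.5% threshold for one trait
    return float(joint), float(bonferroni), stats

joint, bonferroni, stats = null_thresholds()
```

With uncorrelated traits the two thresholds would be expected to be close, and with a positive residual correlation the threshold based on the higher of the two statistics should tend to be the less stringent of the two, subject to sampling noise.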

To confirm that a straightforward implementation of the permutation test (or simulation with no QTL) would not be appropriate for the linkage vs. pleiotropy test (Table 1) where the null hypothesis is a pleiotropic QTL, sets of data were simulated with a single pleiotropic QTL. As before, three sizes of effect for the QTL were considered (0.5, 0.25, and 0.125 residual standard deviations). One thousand replicate simulations were performed for each situation. The distributions of the test statistics in the different situations were compared.

Testing procedure:
Three separate procedures were followed to investigate the optimum strategy for detection of QTL.

  1. Single-trait approach: Each trait was analyzed separately using the single-trait test. If both traits were significant using the empirical 5% thresholds obtained by simulation, the nonparametric version of the linkage vs. pleiotropy test was implemented.

  2. Pleiotropy model approach: Initially the pleiotropic QTL model (a single QTL affecting the two traits) was fitted. The test statistic for the QTL having an effect on both traits jointly was calculated and compared with the empirical experimental 5% significance threshold (obtained by simulation) for the null hypothesis of no QTL. If this was significant, a nominal 5% F-test for the effect of the QTL on each trait was used to determine whether both traits were affected by the QTL. If both traits were involved, then the pleiotropic QTL model was tested against a linked-QTL model (two linked QTL, one affecting each of the traits) using the linkage vs. pleiotropy test.

  3. Linkage model approach: To begin, the linked-QTL model was fitted. If this model was significant (tested against the null hypothesis of no QTL), for each trait the F-ratio for the effect of the QTL on that trait was calculated and tested against the 5% chromosomal threshold for the trait. These chromosomal thresholds are obtained from single-QTL analyses (calculated in the multitrait analyses, see above) not accounting for more than one trait. If both traits were significantly affected by a QTL, the linkage vs. pleiotropy test was carried out.

For the linkage vs. pleiotropy test, both the likelihood-ratio test and the nonparametric bootstrap were performed, each with 1000 replicate samples in the bootstrap.


RESULTS

Significance thresholds:
The thresholds for the chromosomal 5% significance level obtained by simulation for the various tests are given in Table 2. The single-trait thresholds obtained from analysis of the two traits are expected to be the same (other than differences due to sampling) and should be consistent over the different residual correlations. Accounting for multiple correlated traits when obtaining the single-trait threshold (T1 + T2 in Table 2) gave approximately the same threshold as using the Bonferroni correction (accounting for two uncorrelated traits) when the simulated traits were uncorrelated, as expected. With an increase in the simulated residual correlation, we expect no change in the single-trait thresholds obtained using the Bonferroni correction whereas those accounting for correlated traits should become less extreme. The results in Table 2 support this expectation, although the changes in threshold values are small. The residual correlation had virtually no effect on the 5% threshold for the pleiotropic QTL model test, whereas the linked-QTL model test threshold became more extreme (i.e., for this statistic the threshold value was slightly reduced).


 

 
Table 2. 5% significance thresholds

When simulating a null hypothesis of one pleiotropic QTL, there was some evidence that the distribution of test statistics for the linkage vs. pleiotropy test depended on the magnitude of the effect of the simulated QTL (see Table 3). Although the differences observed were relatively small, we opted to use the bootstrap approaches, which take account of the size of the effect of the QTL to obtain a significance threshold for testing linkage vs. pleiotropy.


 

 
Table 3. Test statistic distributions for linkage vs. pleiotropy

Power:
Single-trait approach: Table 4 gives the number of runs out of the 100 replicates resulting in the different possible outcomes: two linked QTL (one affecting each trait), or one pleiotropic QTL, or a QTL affecting only one of the traits. The results presented are based on the significance thresholds obtained assuming two correlated traits and using the same residual correlation as used to simulate the data being analyzed.


 

 
Table 4. Power of the single-trait approach

Multiple-trait approaches: The nonparametric bootstrap proposed by LEBRETON et al. (1998) to test for evidence for linked QTL vs. a pleiotropic one lacked power compared with the likelihood-ratio test (Table 5). When a pleiotropic QTL was simulated we would have expected to observe 5% of the runs resulting in two linked QTL (this being the type 1 error we accept in setting the significance thresholds). Both the likelihood-ratio test and the nonparametric methods found fewer significant replicate simulations than expected. As the QTL were simulated to be further apart, discrimination between the linked QTL and pleiotropic models was better with the likelihood-ratio test than with the nonparametric bootstrap approach. With the QTL of smaller effect, neither method was very successful at detecting linkage and nearly all replicates where evidence for a QTL affecting each trait was found resulted in the pleiotropic QTL model not being rejected.


 
Table 5. Power of tests for linkage vs. pleiotropy

Table 6 and Table 7 give the number of runs resulting in the different models: a QTL affecting only one of the traits, one pleiotropic QTL, or two linked QTL (one affecting each trait) for the two multitrait approaches (i.e., the pleiotropic model and the linkage model approaches; see METHOD for a description). The significance thresholds used were those obtained with the same residual correlation as used to simulate the data being analyzed. The results from the likelihood-ratio test for linkage vs. pleiotropy are presented. These are the same as presented in Table 5, but given as the percentage of all runs, rather than the percentage of runs where there was significant evidence for a QTL affecting both traits. The number of replicates in the resulting models does not necessarily sum to the total number of significant replicates. This is because, although the multitrait model (linkage or pleiotropy) was significant, the effects of the QTL were not significant when tested on the traits separately.


 
Table 6. Power of the pleiotropy model approach


 
Table 7. Power of the linkage model approach

When data were simulated with a QTL affecting only one of the two traits (and the other trait had no QTL affecting it in the linkage group), with no residual correlation between the traits, all three approaches (single-trait and two multitrait approaches) were similar in their power to detect QTL. All the approaches generally resulted in the correct model. By chance we expect the QTL to have a significant effect on the second trait in 5% of the significant runs, this being the type I error we were trying to achieve when setting the thresholds. The pleiotropy model approach resulted in both traits being significant more often than expected, suggesting that the significance criterion used to assess the effect of the QTL on each trait was not stringent enough. When the residual correlation was simulated to be 0.75, the power of the multitrait analyses to detect a QTL increased compared with when no residual correlation was simulated. The best results were observed with the pleiotropy model approach.

When data were simulated with QTL affecting both traits, the multitrait QTL models were significant in more runs than the single-trait analyses.

For the largest-effect QTL, all approaches detected QTL in all replicate simulations. The linkage model and single-trait approaches always resulted in a QTL affecting both traits. When the QTL were tightly linked, the pleiotropy model approach performed well; however, when the QTL were 60 cM apart, 50% of runs ended with a model with a QTL affecting only one of the traits. For this large-effect QTL, when there was evidence that QTL were affecting both traits, there was good discrimination between linkage and pleiotropy, especially for the multitrait approaches.

For the intermediate-effect QTL, the power to detect at least one significant QTL was high. The highest power to detect the pleiotropic QTL was found with the pleiotropy model approach. For the simulated linked QTL, the linkage model approach gave the highest proportion of runs resulting in the correct QTL model. The power to discriminate between linkage and pleiotropy was much lower for the intermediate-effect QTL than with the large-effect QTL.

When a residual correlation between the traits was simulated, the multitrait approaches gave less power when a pleiotropic QTL was simulated, compared with the situation with no residual correlation, although fewer runs resulted in only one trait being significantly affected by a QTL. When the simulated-QTL model was linked QTL, the power of the multitrait models was similar to the situation without a residual correlation. Except for the situation with QTL simulated to be 60 cM apart and analyzed following the pleiotropy model approach, both the single-trait and the multitrait approaches less frequently resulted in a model with only one of the traits affected by a QTL.

The power to detect the smallest-effect QTL was low. The pleiotropy model approach most frequently picked up that two traits were involved, but for the simulated linked QTL there was a lack of power to detect linkage. The improved performance of the pleiotropy model approach could also reflect the reduced stringency observed when testing the effect of a significant pleiotropic QTL on each trait.

Parameter estimates:
In QTL mapping experiments, we are interested not only in the power to detect any QTL but also in the magnitude of its effect and the location on the chromosome. Table 8 gives the mean parameter estimates obtained when fitting the pleiotropic QTL model and when fitting the single-trait model. The results in Table 8 are from the analyses of the two QTL with larger effect, which were detected with a high frequency.


 
Table 8. Parameter estimates

When the analyses resulted in the same QTL model as that used for the simulation (i.e., a pleiotropic QTL or linked QTL), the parameter estimates were, on average, close to those used for the simulation. Comparing the results from fitting the pleiotropic QTL model with the single-trait models (see Table 8), the multitrait analyses resulted in a reduction in the standard deviation of the estimate of location when a pleiotropic QTL was simulated, except, as expected, when the residual correlation between the traits was high. The additive effect was, on average, slightly overestimated in the single-trait analyses, as expected because of the selection of location for the QTL. The effect of selection was decreased when correctly fitting the pleiotropic QTL model, giving, on average, a lower overestimate.

If the pleiotropic model was incorrectly fitted when linked QTL were simulated, the mean estimate of location was, as expected, halfway between the simulated locations of the QTL and the standard deviation of location was inflated. Also, the estimates of the additive effect were lower than those simulated, because the QTL could not be at the optimum location for both traits, and the standard deviation was inflated. In this situation, estimates obtained from the single-trait analyses (which would be the same as the linked-model results if all replicates ended in a linked-QTL model) were closer to the simulated parameter values. The estimates from the single-trait analyses were not affected by whether the simulated model was a pleiotropic QTL or linked QTL.

When only one of the traits simulated was affected by a QTL, the pleiotropic QTL model and single-trait model gave very similar results when the QTL effect was large. The pleiotropic QTL model correctly detected the QTL and estimated it to have, on average, zero effect on the second trait. For the smaller-effect QTL, the pleiotropic QTL model was slightly less powerful, which resulted in a higher standard deviation of the location estimate.

The presence of correlated residuals in the data makes very little difference in the parameter estimates. Two main differences were observed, both concerning the variance of the location estimate. The first is the larger standard deviation seen when a pleiotropic QTL was simulated and the second is the lower standard deviation seen when a QTL affecting only one of the traits was simulated. In both cases, the observed change in the power of these analyses might cause this effect (decrease of power in the first case and increase in the second).


*  DISCUSSION

The multitrait multiple regression approach for QTL mapping performs well in detecting and characterizing QTL and is very easy to implement. If traits are affected by the same QTL, a multitrait analysis increases the power of detection of this QTL compared with single-trait analyses. If the pleiotropic QTL model is the correct one, we would expect that fitting this model would give the highest power and the smallest standard deviations, especially for location, as in this case both traits are being used to estimate the same parameter. When the simulated QTL are some distance apart, the pleiotropy model approach performs less well. The best estimate for the location of the pleiotropic QTL tends to be at the location of one of the simulated QTL and the evidence for the QTL affecting the second trait at this location will be low. The linkage model approach performs better in this situation.

We have implemented the analyses discussed in stand-alone software; however, the models presented here could be fitted in a standard statistical package in which multitrait least squares is available. To do this, the genotype probabilities would have to be calculated from the marker data prior to analysis, using specialized software (e.g., HALEY et al. 1994). The analysis would involve a call to the least-squares software for each location considered in the genome, altering the X matrix for subsequent calls as required. The tests for the linked-QTL and pleiotropic QTL models could be easily performed from output of such analyses, using the residual SS matrices. The likelihood ratio for linkage vs. pleiotropy would need to be calculated by constructing the relevant matrices and performing simple matrix manipulations.
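As an illustration of how such a scan might look in a general-purpose environment, the following minimal sketch calls an ordinary least-squares routine once per location and forms a test statistic from the determinants of the residual sums-of-squares matrices. The statistic shown, n ln(|E0|/|E1|), is one common form and is used here as an assumption; the exact statistic and degrees-of-freedom adjustment given in METHOD should be substituted. The names `multitrait_scan` and `coef_by_location` are hypothetical, and the QTL coefficients are assumed to have been computed beforehand from the marker data.

```python
import numpy as np

def residual_sscp(X, Y):
    """Residual sums-of-squares-and-cross-products matrix of Y regressed on X."""
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    R = Y - X @ B
    return R.T @ R

def multitrait_scan(coef_by_location, Y, X_fixed=None):
    """Test statistic at each location for a QTL affecting all traits jointly.

    coef_by_location: array (n_locations, n_individuals, n_coef), e.g. additive
        and dominance coefficients at each location (assumed precomputed).
    Y: array (n_individuals, n_traits).
    X_fixed: optional array (n_individuals, n_fixed) of fixed effects and
        covariates, shared by all traits.
    """
    n = Y.shape[0]
    ones = np.ones((n, 1))
    X0 = ones if X_fixed is None else np.hstack([ones, X_fixed])
    _, logdet0 = np.linalg.slogdet(residual_sscp(X0, Y))      # no-QTL model
    stats = np.empty(len(coef_by_location))
    for i, C in enumerate(coef_by_location):
        X1 = np.hstack([X0, C])                               # alter X per location
        _, logdet1 = np.linalg.slogdet(residual_sscp(X1, Y))
        stats[i] = n * (logdet0 - logdet1)
    return stats
```

The location with the largest statistic is taken as the estimate of the QTL position. Only the columns of X that hold the QTL coefficients change between calls, which is the sense in which the X matrix is altered for subsequent calls.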

Because this is a standard multitrait regression problem, the inclusion of fixed effects and covariates is straightforward. The restriction that the same design matrix is required for all traits should not cause problems, as although fixed effects and covariates may be included that do not have a significant effect on some traits, this should not cause a bias in the results. The extension to fit multiple QTL through cofactors or by a multidimensional search is also straightforward following KNOTT et al. 1998. There are two basic alternatives for determining the model to be fitted in terms of fixed effects, covariates, and cofactors. One is to find the best model for each trait separately and then include all the effects into the multiple-trait model; the alternative is to determine the best model in the multiple-trait framework, i.e., considering the effect of the explanatory variables on all traits simultaneously. If the explanatory variables were correlated, the latter approach would tend to result in fitting fewer explanatory variables.

Missing genotype data are not a problem as potential QTL genotype probabilities are obtained from neighboring informative markers (possibly more than two when markers are only partially informative). Individuals with no marker genotypes are best excluded because they provide no information about the location of potential QTL. With maximum likelihood, these individuals may be included, but they provide only distribution information, which, as shown previously (HALEY and KNOTT 1992), has little effect on the results of the analysis in terms of the detection of QTL. If selective genotyping has been carried out, the analyses can still be performed. The estimate of the location of any QTL should not be biased but the effect of any QTL will be overestimated. Making assumptions about the distribution of the trait, a correction for the overestimate could be made. HENSHALL and GODDARD 1999 consider this problem of selective genotyping in half-sib families and propose the use of a logistic regression, which considers the genotype as the dependent variable and the phenotype as the independent one. They find that this approach performs as well as maximum-likelihood alternatives but is much easier to implement. It is limited, however, to situations where the offspring can be one of only two genotypes (e.g., a backcross).
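The reversal of roles in the logistic regression can be shown with a minimal sketch in which a two-class genotype is regressed on the phenotypes and the fit is compared with an intercept-only model. This is an illustration of the general idea only and is not the published implementation; the Newton-Raphson fitting routine, the function names, and the coding of the genotype as 0 or 1 are assumptions.

```python
import numpy as np

def logistic_fit(X, g, n_iter=50, tol=1e-10):
    """Fit a logistic regression of g (coded 0/1) on X by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        W = p * (1.0 - p)
        grad = X.T @ (g - p)
        hess = X.T @ (X * W[:, None])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    p = np.clip(1.0 / (1.0 + np.exp(-(X @ beta))), 1e-12, 1.0 - 1e-12)
    loglik = np.sum(g * np.log(p) + (1.0 - g) * np.log(1.0 - p))
    return beta, loglik

def genotype_on_phenotype_lr(g, Y):
    """Likelihood-ratio statistic for the phenotypes Y as predictors of g.

    g: array (n_individuals,) of genotypes coded 0/1 (two classes only).
    Y: array (n_individuals, n_traits) of phenotypes.
    """
    ones = np.ones((len(g), 1))
    _, ll0 = logistic_fit(ones, g)                     # intercept only
    beta, ll1 = logistic_fit(np.hstack([ones, Y]), g)  # intercept + traits
    return beta, 2.0 * (ll1 - ll0)
```

Treating the genotype as the response requires a binary outcome, which is why the approach is restricted to offspring with only two possible genotypes.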

F2 individuals with missing phenotypes would not give information about the presence of a QTL and, hence, could be omitted. In the multitrait situation, however, individuals may be missing only one of several traits and hence they would be wanted in the analysis. In experimental situations, the frequency of missing phenotypes will usually be low, except where factors such as sex limitation are involved (e.g., traits that can only be measured in one sex). For more general application of the multitrait analysis, however, methods for missing phenotypes need to be investigated.

The significance thresholds used to detect the presence of QTL in this simulation study have been obtained by simulation using the same residual correlation between the traits as used to generate the data with QTL. In practice, a permutation test may be implemented. In this case, the phenotypic correlation between the traits will be treated as a residual correlation. In the simulated data, however, any additional correlation between the traits generated by the QTL is small and, hence, ignoring this should not bias the results.
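A permutation test of this kind might be sketched as follows. The rows of the phenotype matrix are shuffled as a unit, so each individual's traits stay together and the phenotypic correlation is preserved while the link to the genotypes is broken. The scan statistic, the function names, and the defaults (1000 permutations, 5% level) are assumptions made for illustration.

```python
import numpy as np

def log_det_rss(X, Y):
    """Log determinant of the residual SS matrix of Y regressed on X."""
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    R = Y - X @ B
    return np.linalg.slogdet(R.T @ R)[1]

def max_scan_statistic(coef_by_location, Y):
    """Largest multitrait test statistic over all locations."""
    n = Y.shape[0]
    ones = np.ones((n, 1))
    ld0 = log_det_rss(ones, Y)
    return max(n * (ld0 - log_det_rss(np.hstack([ones, C]), Y))
               for C in coef_by_location)

def permutation_threshold(coef_by_location, Y, n_perm=1000, alpha=0.05, seed=0):
    """Empirical threshold from the permutation distribution of the maximum."""
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    max_stats = np.empty(n_perm)
    for b in range(n_perm):
        shuffled = Y[rng.permutation(n)]   # permute whole rows: traits stay paired
        max_stats[b] = max_scan_statistic(coef_by_location, shuffled)
    return float(np.quantile(max_stats, 1.0 - alpha))
```

Taking the maximum over locations within each permutation gives a threshold that accounts for the multiple locations tested along the linkage group.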

In this study we did not consider the more general model where there could be two linked QTL, both with an effect on each trait (i.e., two linked pleiotropic QTL). Such a model can easily be included within the multitrait framework described here and could be fitted in a two-dimensional search analogous to the two-QTL model for the single-trait analyses (for example, HALEY and KNOTT 1992). This model could be tested against the nested ones of one pleiotropic QTL and two linked QTL using the parametric bootstrap to set the significance thresholds. Additionally the effect of the two QTL on each trait could be tested following the approach suggested here.

The results presented here are based on the situation with two traits. Obviously there are frequently more than two traits recorded and the models described here can easily be extended to accommodate more traits. The extreme models would be one QTL affecting each trait and one QTL possibly affecting all of them. In addition there could be a number of intermediate models, such as one QTL affecting some of the traits but not others. These models can easily be accommodated in the analysis, and test statistics similar to those described here for two traits can be determined (based on the ratio of the determinants of sums of squares matrices). The problem is one of testing, as the null hypothesis will frequently not be one of no genetic control and a series of tests for alternative models may be required. As shown here, a permutation test could be performed to test for the presence of QTL. The parametric bootstrap could be adapted for the involvement of more traits to test for linkage or pleiotropy. The null hypothesis model would be simulated and alternative models tested against it. In this case, strategies for looking for QTL become important, as the number of possible models is much greater than in the two-trait situation.
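The parametric bootstrap step described above (simulate the null hypothesis model, then test the alternative against it) might be sketched as follows for any number of traits. The sketch assumes multivariate normal residuals and leaves the test statistic as a user-supplied function, because the statistic depends on which pair of models is being compared. The names `parametric_bootstrap_threshold` and `stat_fn` and the default settings are hypothetical.

```python
import numpy as np

def parametric_bootstrap_threshold(X_null, Y, stat_fn,
                                   n_boot=500, alpha=0.05, seed=0):
    """Threshold for stat_fn from data simulated under the fitted null model.

    X_null: array (n_individuals, n_param), design matrix of the null model
        (e.g., mean plus coefficients of the best pleiotropic QTL).
    Y: array (n_individuals, n_traits) of observed phenotypes.
    stat_fn: function taking a simulated phenotype matrix and returning the
        test statistic for the alternative vs. the null model.
    """
    rng = np.random.default_rng(seed)
    n, t = Y.shape
    B, *_ = np.linalg.lstsq(X_null, Y, rcond=None)
    fitted = X_null @ B
    resid = Y - fitted
    df = n - np.linalg.matrix_rank(X_null)
    sigma = resid.T @ resid / df                       # residual covariance estimate
    stats = np.empty(n_boot)
    for b in range(n_boot):
        E = rng.multivariate_normal(np.zeros(t), sigma, size=n)
        stats[b] = stat_fn(fitted + E)                 # data under the null model
    return float(np.quantile(stats, 1.0 - alpha))
```

Because the simulated data carry the estimated QTL effects of the null model, the resulting threshold takes account of the size of the QTL effect, unlike a threshold obtained from a fixed reference distribution.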

Several strategies can be proposed to determine the genetic architecture of multiple correlated traits. If the traits are genetically uncorrelated, then we expect that QTL would not have pleiotropic effects on two or more traits so that there would be no benefit from a multitrait approach. (Although note that, theoretically at least, a lack of genetic correlation could result from pleiotropic effects at two or more QTL that counteract each other.) Thus, for uncorrelated traits, starting with single-trait analyses is appropriate, although tests of pleiotropy vs. linkage for specific QTL that map to a similar location should be performed by multitrait analysis. If the traits are highly genetically correlated, at least some QTL are expected to affect some or all traits and, if we wish to find these QTL, then the pleiotropy model approach will be the most suitable starting point. For traits that are less highly correlated, the linked QTL model may be a better starting point. This would also be appropriate where the investigator's interest is in locating QTL that run counter to the general trend for more highly correlated traits. In this case one might first analyze the data with a pleiotropic QTL model, fit the pleiotropic QTL found as cofactors, and repeat the analysis with a linked-QTL model. Further work is required to examine the efficiency of these alternative strategies for analyzing data on multiple traits and to determine how the strategies are influenced both by the biological structure that underlies the data and by the nature of the questions being asked.


*  ACKNOWLEDGMENTS

We thank Peter Visscher for very useful comments. We are grateful for support from the Royal Society and the Biotechnology and Biological Sciences Research Council.

Manuscript received December 14, 1999; Accepted for publication June 20, 2000.


*  LITERATURE CITED

ALMASY, L., D. D. DYER, and J. BLANGERO, 1997 Bivariate quantitative trait linkage analysis: pleiotropy vs. co-incident linkages. Genet. Epidemiol. 14:953-958.

CHEVERUD, J. M., E. J. ROUTMAN, and D. J. IRSCHICK, 1997 Pleiotropic effects of individual gene loci on mandibular morphology. Evolution 51:2006-2016.

CHURCHILL, G. A. and R. W. DOERGE, 1994 Empirical threshold values for quantitative trait mapping. Genetics 138:963-971.

EAVES, L. J., M. C. NEALE, and H. MAES, 1996 Multivariate multipoint linkage analysis of quantitative trait loci. Behav. Genet. 26:519-525.

HALEY, C. S. and S. A. KNOTT, 1992 A simple method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69:315-324.

HALEY, C. S., S. A. KNOTT, and J.-M. ELSEN, 1994 Mapping quantitative trait loci in crosses between outbred lines using least squares. Genetics 136:1195-1207.

HENSHALL, J. M. and M. E. GODDARD, 1999 Multiple trait mapping of quantitative trait loci after selective genotyping using logistic regression. Genetics 151:885-894.

JIANG, C. and Z-B. ZENG, 1995 Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140:1111-1127.

KNOTT, S. A., J.-M. ELSEN, and C. S. HALEY, 1996 Methods for multiple-marker mapping of quantitative trait loci in half-sib populations. Theor. Appl. Genet. 93:71-80.

KNOTT, S. A., D. B. NEALE, M. M. SEWELL, and C. S. HALEY, 1997 Multiple marker mapping of quantitative trait loci in an outbred pedigree of loblolly pine. Theor. Appl. Genet. 94:810-820.

KNOTT, S. A., L. MARKLUND, C. S. HALEY, K. ANDERSSON, and W. DAVIES et al., 1998 Multiple marker mapping of quantitative trait loci in a cross between outbred wild boar and Large White pigs. Genetics 149:1069-1080.

LEBRETON, C. H., P. M. VISSCHER, C. S. HALEY, A. SEMIKHODSKII, and S. A. QUARRIE, 1998 A nonparametric bootstrap method for testing close linkage vs. pleiotropy of coincident quantitative trait loci. Genetics 150:931-943.

MANGIN, B., P. THOQUET, and N. GRIMSLEY, 1998 Pleiotropic QTL analysis. Biometrics 54:88-99.

RONIN, Y. I., V. M. KIRZHNER, and A. B. KOROL, 1995 Linkage between loci of quantitative traits and marker loci: multitrait analysis with a single marker. Theor. Appl. Genet. 90:776-786.

SEAL, H. L., 1964 Multivariate Statistical Analysis for Biologists. Methuen, London.

VAN KAAM, J. B. C. H. M., J. A. M. VAN ARENDONK, M. A. M. GROENEN, H. BOVENHUIS, and A. L. J. VEREIJKEN et al., 1998 Whole genome scan for quantitative trait loci affecting body weight in chickens using a three generation design. Livest. Prod. Sci. 54:133-150.

WALLING, G. A., P. M. VISSCHER, and C. S. HALEY, 1998 A comparison of bootstrap methods to construct confidence intervals in QTL mapping. Genet. Res. 71:171-180.

WELLER, J. I., G. R. WIGGANS, P. M. VANRADEN, and M. RON, 1996 Application of a canonical transformation to detection of quantitative trait loci with the aid of genetic markers in a multitrait experiment. Theor. Appl. Genet. 92:998-1002.

WU, W.-R., W.-M. LI, D.-Z. TANG, H.-R. LU, and A. J. WORLAND, 1999 Time-related mapping of quantitative trait loci underlying tiller number in rice. Genetics 151:297-303.



