Corresponding author: Chen-Hung Kao, Institute of Statistical Science, Academia Sinica, Taipei 11529, Taiwan, Republic of China. E-mail: chkao{at}stat.sinica.edu.tw
| ABSTRACT |
|---|
A new statistical method for mapping quantitative trait loci (QTL), called multiple interval mapping (MIM), is presented. It uses multiple marker intervals simultaneously to fit multiple putative QTL directly in the model for mapping QTL. The MIM model is based on Cockerham's model for interpreting genetic parameters and the method of maximum likelihood for estimating genetic parameters. With the MIM approach, the precision and power of QTL mapping could be improved. Also, epistasis between QTL, genotypic values of individuals, and heritabilities of quantitative traits can be readily estimated and analyzed. Using the MIM model, a stepwise selection procedure with likelihood ratio test statistic as a criterion is proposed to identify QTL. This MIM method was applied to a mapping data set of radiata pine on three traits: brown cone number, tree diameter, and branch quality scores. Based on the MIM result, seven, six, and five QTL were detected for the three traits, respectively. The detected QTL individually contributed from ~1 to 27% of the total genetic variation. Significant epistasis between four pairs of QTL in two traits was detected, and the four pairs of QTL contributed ~10.38 and 14.14% of the total genetic variation. The asymptotic variances of QTL positions and effects were also provided to construct the confidence intervals. The estimated heritabilities were 0.5606, 0.5226, and 0.3630 for the three traits, respectively. With the estimated QTL effects and positions, the best strategy of marker-assisted selection for trait improvement for a specific purpose and requirement can be explored. The MIM FORTRAN program is available on the worldwide web (http://www.stat.sinica.edu.tw/~chkao/).
The basic principle of using genetic markers to study quantitative trait loci (QTL) is well established.
In recent years, the advent of fine-scale molecular genetic marker maps for various organisms by molecular biology techniques has greatly facilitated the systematic mapping and analysis of individual QTL.
The interval mapping (IM) approach considers one QTL at a time in the model for QTL mapping. Therefore, IM can bias identification and estimation of QTL when multiple QTL are located in the same linkage group.
Ideally, current QTL mapping models would be extended to a multiple-QTL model in which the QTL are controlled directly in the model, further improving QTL mapping. In this article, a new QTL mapping method named multiple interval mapping (MIM) is developed. MIM uses multiple marker intervals simultaneously to construct multiple putative QTL in the model for QTL mapping. Therefore, when compared with current methods such as IM and composite interval mapping (CIM), MIM tends to be more powerful and precise in detecting QTL, as shown by the example in this article. In addition, MIM can readily search for and analyze epistatic QTL and estimate the individual genotypic value and the heritabilities of quantitative traits. On the basis of the MIM result, genetic variance components contributed by individual QTL can also be estimated, and marker-assisted selection can be performed.
| GENETIC MODEL |
|---|
Consider m QTL, Q1, Q2, · · ·, and Qm, in a backcross population in which each QTL, say Qj, has two genotypes, QjQj and Qjqj, each with frequency one-half. For m QTL, there are 2^m possible QTL genotypes in the population. Cockerham's genetic model (C-H. KAO and Z-B. ZENG, unpublished results) is used to define the genetic parameters and model the relation between the genotypic value and the genetic parameters. If only up to digenic epistasis is considered, the relation between the genotypic value of individual i, Gi, and the genetic parameters can be expressed in the equation
$$G_i = \mu + \sum_{j=1}^{m} a_j x_{ij} + \sum_{j<k} w_{jk}\,(x_{ij} x_{ik}) \tag{1}$$
where xij is coded as 1/2 or -1/2 if the genotype of Qj is QjQj or Qjqj, respectively, aj is the corresponding main effect of Qj, and wjk is the epistatic effect between Qj and Qk. The main advantage of Cockerham's model is that it possesses an orthogonal property in modeling genetic parameters.
To assist with explaining the estimation of the genetic effects in the MIM model (Equation 3), the genetic model in Equation 1 is expressed in matrix notation as Equation 2 (Figure 1). In Equation 2, the column vector G contains the genotypic values of the 2^m possible genotypes. The subscripts of G (1 or 0) denote the homozygote or heterozygote of the QTL in the order of the first, second, third, · · ·, and mth QTL, respectively. The first m columns in the genetic design matrix D are the coefficients associated with the main effects of the m QTL, and the last m(m - 1)/2 columns represent the coefficients of the epistatic effects among them. Vector E contains the QTL main and epistatic effects. If there is no epistasis between some QTL, the corresponding epistasis columns should be dropped from matrix D. If higher-order epistasis is considered, the dimension of matrix D is easy to expand accordingly. The matrix D plays an important role in estimation of genetic parameters in the MIM model.
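The construction of D described above can be sketched in a few lines. This is a minimal illustration, not the authors' FORTRAN implementation: the function name `design_matrix` and its `epistatic_pairs` argument are hypothetical, and the row order (all homozygous first, all heterozygous last) is an assumption based on the 1/0 subscript convention in the text.

```python
from itertools import combinations, product


def design_matrix(m, epistatic_pairs=None):
    """Build the genetic design matrix D for m QTL in a backcross.

    Rows: the 2**m QTL genotypes, from all homozygous (1, ..., 1) to all
    heterozygous (0, ..., 0). Columns: m main effects, then one column per
    epistatic pair (every pair when epistatic_pairs is None).
    """
    if epistatic_pairs is None:
        epistatic_pairs = list(combinations(range(m), 2))
    rows = []
    for genotype in product((1, 0), repeat=m):
        # Cockerham coding: homozygote QjQj -> +1/2, heterozygote Qjqj -> -1/2.
        x = [0.5 if g == 1 else -0.5 for g in genotype]
        # Epistatic coefficients are products of the two main-effect codes.
        rows.append(x + [x[j] * x[k] for j, k in epistatic_pairs])
    return rows


for row in design_matrix(2):
    print(row)
```

For two QTL this gives a 4 x 3 matrix whose columns are mutually orthogonal, which is the orthogonal property the text attributes to Cockerham's model. Passing a shorter `epistatic_pairs` list drops the unused epistasis columns.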
| STATISTICAL MODEL OF MIM |
|---|
Multiple interval mapping:
Assume m QTL, Q1, Q2, · · ·, and Qm, located at positions p1, p2, · · ·, pm in m different marker intervals, I1, I2, · · ·, Im, along the genome, control a quantitative trait y. Among the m QTL, some may show epistasis and some may not. The quantitative trait value for an individual, i, can be related to the m putative QTL by the model
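$$y_i = \mu + \sum_{j=1}^{m} a_j x^*_{ij} + \sum_{j<k} \delta_{jk} w_{jk}\,(x^*_{ij} x^*_{ik}) + \varepsilon_i \tag{3}$$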
where µ is the mean, x*ij is the coded variable for the genotype of Qj, aj and wjk have the same definitions as those in the genetic model in Equation 1, δjk is an indicator variable for epistasis between Qj and Qk, and εi is assumed to follow N(0, σ²). The indicator variable δjk takes value one if Qj and Qk interact; otherwise its value is zero. In this model, the first summation is for the main effects of the m QTL, the second summation is for their possible epistasis, and εi is the environmental deviation. This is termed the MIM model because multiple (m) marker intervals are simultaneously used to construct multiple (m) putative QTL in the model for QTL mapping. If QTL genotypes are known, the model states that the quantitative trait value is the sum of the QTL main effects, their possible epistatic effects, and environmental deviation, and the MIM model is a regression model. However, the putative QTL genotypes denoted by x*ij's are usually not observed because QTL could be located within the intervals. Given observed flanking marker genotypes, the conditional distributions of QTL genotypes, x*ij's, for QTL at specific positions, pj's, can be inferred based on Haldane's mapping function.
The MIM model is a multiple QTL model and its likelihood is a finite normal mixture. There are two problems that need to be solved for the MIM model. The first is that of parameter estimation of the finite normal mixture model. As m becomes large, the derivation of the maximum-likelihood estimates (MLEs) of the QTL effects and positions quickly becomes unwieldy. To handle the estimation problem, previously derived general formulas are used.
| LIKELIHOOD OF THE MIM MODEL |
|---|
In the MIM model, the genotype of each putative QTL, Qj in interval Ij, is not observed, but its distribution can be inferred from the flanking markers of Ij based on the recombination frequency between them. For every QTL in the backcross population, the conditional probabilities of the QTL genotypes, given different flanking marker genotypes, have been tabulated previously.

The joint conditional probability of the m QTL is the product of the marginal conditional probabilities of individual QTL. We refer to pij, j = 1, 2, · · ·, 2^m, as the conditional probabilities of the 2^m possible QTL genotypes (note that pj's denote QTL positions and pij's denote the conditional probabilities). If multiple putative QTL within a single marker interval are considered, the individual and joint conditional probabilities of QTL genotypes can also be inferred directly or by a Markov chain procedure.
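The two ingredients above, per-QTL conditional probabilities and their product across QTL, can be illustrated as follows. This is a sketch under stated assumptions: it uses Haldane's mapping function with the exact no-interference probabilities (a published table may use approximations that ignore double recombination), it assumes each interval has positive length, and the names `haldane_r`, `qtl_conditional_matrix`, and `kron` are hypothetical helpers introduced here.

```python
import math


def haldane_r(d_cm):
    """Recombination fraction for a map distance in cM (Haldane, no interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def qtl_conditional_matrix(d_left_cm, d_right_cm):
    """4 x 2 matrix of P(QTL genotype | flanking marker genotypes) in a backcross.

    Rows: flanking marker genotypes (left, right) in the order
    (1,1), (1,0), (0,1), (0,0), where 1 = homozygote and 0 = heterozygote.
    Columns: QTL homozygote QQ, then QTL heterozygote Qq.
    """
    r1, r2 = haldane_r(d_left_cm), haldane_r(d_right_cm)
    r = r1 + r2 - 2.0 * r1 * r2  # recombination fraction between the two markers
    p_hom = [
        (1 - r1) * (1 - r2) / (1 - r),  # markers (1, 1)
        (1 - r1) * r2 / r,              # markers (1, 0)
        r1 * (1 - r2) / r,              # markers (0, 1)
        r1 * r2 / (1 - r),              # markers (0, 0)
    ]
    return [[p, 1.0 - p] for p in p_hom]


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


# Joint probabilities for two QTL in different intervals: 16 marker rows, 4 genotypes.
Q = kron(qtl_conditional_matrix(5.0, 15.0), qtl_conditional_matrix(10.0, 10.0))
print(len(Q), len(Q[0]))
```

Each row of the joint matrix still sums to one, because it is the product of independent per-interval distributions. The genotype columns come out ordered with the first QTL varying slowest.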
Given a sample with size n, the likelihood function of the MIM model for θ = (p1, p2, · · ·, pm, a1, · · ·, am, · · ·, wjk, · · ·, σ²) is

$$L(\theta) = \prod_{i=1}^{n} \left[ \sum_{j=1}^{2^m} p_{ij}\, \frac{1}{\sigma}\, \phi\!\left( \frac{y_i - \mu_{ij}}{\sigma} \right) \right] \tag{4}$$

where φ(·) is the standard normal probability density function, µij's correspond to the genotypic values of the 2^m different QTL genotypes in Equation 1, and pij's, which contain the information on QTL positions, are the corresponding joint conditional probabilities. Statistically, this is a normal mixture model. The density of each individual is a mixture of 2^m possible normal densities with different means µij's and mixing proportions pij's. To obtain the MLEs and the asymptotic variance-covariance matrix of the model, previously derived general formulas are used.
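To make the mixture structure concrete, the log of the likelihood just described can be evaluated directly. The sketch below is illustrative only: the function name `mim_log_likelihood` is hypothetical, each normal component is written as a proper density (including the 1/σ factor), and the component means are assumed to depend only on the QTL genotype j, not on the individual.

```python
import math


def mim_log_likelihood(y, mixing, means, sigma):
    """Log-likelihood of a finite normal mixture with known mixing proportions.

    y      : list of n trait values
    mixing : n x g list of mixing proportions p_ij (each row sums to 1)
    means  : list of the g genotypic means (shared by all individuals)
    sigma  : residual standard deviation
    """
    norm_const = sigma * math.sqrt(2.0 * math.pi)
    log_lik = 0.0
    for y_i, p_i in zip(y, mixing):
        density = 0.0
        for p_ij, mu_j in zip(p_i, means):
            z = (y_i - mu_j) / sigma
            density += p_ij * math.exp(-0.5 * z * z) / norm_const
        # Each individual contributes the log of its own mixture density.
        log_lik += math.log(density)
    return log_lik


# Two individuals, one QTL (two genotypes with means 2.5 and 1.5).
print(mim_log_likelihood([3.0, 1.2], [[0.9, 0.1], [0.2, 0.8]], [2.5, 1.5], 1.0))
```

Comparing this quantity between two nested models is what a likelihood ratio test needs; for very large m a log-sum-exp formulation would be numerically safer than summing densities directly.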
| PARAMETER ESTIMATION |
|---|
The likelihood of the MIM model is a finite normal mixture. In parameter estimation, the finite normal mixture model can be treated as an incomplete-data problem.
In the MIM model, when only one putative QTL (m = 1) is considered in a backcross population, the likelihood is a mixture of two normals (like IM and CIM), and four parameters need to be estimated. The derivation of the MLEs for the one putative QTL model using the EM algorithm has been provided.
To apply the general formulas to MIM, the genetic design matrix D of the MIM model has the same first m columns as those in Equation 2 for indicating the m main QTL effects and has some or none of the last m(m - 1)/2 columns for specifying epistasis. We refer to D as a 2^m × k matrix, where k is the column dimension. There are m individual conditional probability matrices, Q1, Q2, · · ·, and Qm for the m QTL. The components of the conditional probability matrix Qj of QTL Qj in the interval Ij with flanking markers Mj and Nj have been tabulated previously. The joint conditional probability matrix is Q = Q1 ⊗ Q2 ⊗ · · · ⊗ Qm, where ⊗ denotes the Kronecker product. The 2^m mixing proportions of any individual i, pij's, can be found to be one of the 4^m rows in Q according to its flanking marker genotype. Given the matrices D and Q, the MLEs and the asymptotic variance-covariance matrix can be readily obtained by the general formulas.
Note that, at the tested positions p1, p2, · · ·, and pm, the mixing proportions pij's in the likelihood are fixed and need not be estimated. For obtaining the MLEs of mean, environmental variance, and marginal and epistatic effects, the general equations formulate the iteration of the (t + 1) EM step as follows:
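E step:
Update the posterior probabilities of the 2^m possible QTL genotypes for each individual i,

$$\pi_{ij}^{(t+1)} = \frac{p_{ij}\,\phi\!\left(\dfrac{y_i - \mu_{ij}^{(t)}}{\sigma^{(t)}}\right)}{\sum_{l=1}^{2^m} p_{il}\,\phi\!\left(\dfrac{y_i - \mu_{il}^{(t)}}{\sigma^{(t)}}\right)} \tag{5}$$

M step:
Find θ(t + 1), which satisfies the solutions

$$E^{(t+1)} = r^{(t)} - M^{(t)} E^{(t)} \tag{6}$$

$$\mu^{(t+1)} = \frac{1}{n}\,\mathbf{1}'\left[\,Y - \Pi^{(t+1)} D E^{(t+1)}\right] \tag{7}$$

$$\sigma^{2(t+1)} = \frac{1}{n}\left[(Y - \mathbf{1}\mu^{(t+1)})'(Y - \mathbf{1}\mu^{(t+1)}) - 2\,(Y - \mathbf{1}\mu^{(t+1)})'\,\Pi^{(t+1)} D E^{(t+1)} + E'^{(t+1)} V^{(t+1)} E^{(t+1)}\right] \tag{8}$$

where Y is the n × 1 vector of trait values, 1 is a vector of ones, Π = {πij} is n × 2^m, V = {1'Π(Di#Dj)} is k × k, r = {(Y - 1µ)'ΠDi / 1'Π(Di#Di)} is k × 1, and M = {[1'Π(Di#Dj) / 1'Π(Di#Di)] × δ(i ≠ j)} is k × k, with r and M evaluated at the current µ and the updated Π. Di (Dj) is the ith (jth) column of the genetic design matrix D. The notation δ(i ≠ j) is an indicator variable that takes value 1 if i ≠ j, and 0 otherwise, and # denotes the Hadamard product, which is the element-by-element product of corresponding elements of two matrices of the same order. A more detailed derivation is given in the original reference for the general formulas.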
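One iteration of the E and M steps can be written out in plain Python to show how the pieces fit together. This is a sketch, not the authors' program: `em_step` and its argument names are hypothetical, the effect update is written elementwise as "right-hand side minus off-diagonal terms, divided by the diagonal" (which is how the r and M quantities are read here), and positions are held fixed so the mixing proportions P are constants.

```python
import math


def em_step(y, P, D, mu, E, sigma2):
    """One EM iteration for the MIM mixture at fixed QTL positions.

    y: n trait values; P: n x g mixing proportions; D: g x k design matrix;
    mu, E (length k), sigma2: current parameter values.
    Returns (Pi, mu_new, E_new, sigma2_new).
    """
    n, g, k = len(y), len(D), len(E)
    sigma = math.sqrt(sigma2)
    means = [mu + sum(D[j][s] * E[s] for s in range(k)) for j in range(g)]

    # E step: posterior genotype probabilities (normalizing constants cancel).
    Pi = []
    for i in range(n):
        w = [P[i][j] * math.exp(-0.5 * ((y[i] - means[j]) / sigma) ** 2)
             for j in range(g)]
        total = sum(w)
        Pi.append([v / total for v in w])

    # V[s][t] = 1' Pi (D_s # D_t): posterior-weighted cross products of columns.
    weight = [sum(Pi[i][j] for i in range(n)) for j in range(g)]
    V = [[sum(weight[j] * D[j][s] * D[j][t] for j in range(g))
          for t in range(k)] for s in range(k)]

    # M step for effects: E_new = r - M E, written elementwise.
    b = [sum((y[i] - mu) * Pi[i][j] * D[j][s]
             for i in range(n) for j in range(g)) for s in range(k)]
    E_new = [(b[s] - sum(V[s][t] * E[t] for t in range(k) if t != s)) / V[s][s]
             for s in range(k)]

    # M step for the mean, using the posterior-expected genetic value.
    fitted = [sum(Pi[i][j] * sum(D[j][s] * E_new[s] for s in range(k))
                  for j in range(g)) for i in range(n)]
    mu_new = sum(y[i] - fitted[i] for i in range(n)) / n

    # M step for the residual variance.
    resid = [y[i] - mu_new for i in range(n)]
    quad = sum(E_new[s] * V[s][t] * E_new[t] for s in range(k) for t in range(k))
    sigma2_new = (sum(v * v for v in resid)
                  - 2.0 * sum(resid[i] * fitted[i] for i in range(n)) + quad) / n
    return Pi, mu_new, E_new, sigma2_new
```

In use, the step is repeated from starting values until the parameters stop changing. When the genotypes are known with certainty (each row of P has a single 1) and the design is balanced, one step reproduces the ordinary least-squares estimates.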
| STRATEGY OF QTL MAPPING |
|---|
For the MIM approach, the second problem that needs to be considered is how to search for QTL to fit into the MIM model.
Critical value for claiming QTL detection:
When using the LRT statistic as a criterion in model selection for QTL detection, it is very important to determine the appropriate critical value for claiming QTL detection such that correct statistical inference about QTL parameters can be made.
The above considerations on critical value are for the single-QTL model. For a multiple-QTL model, a model selection procedure is required to determine the final model. If stepwise selection is used, the final model is selected from a sequence of nested tests, and the significance level of the sequence will depend on the unknown true model.
Stepwise selection procedure:
The stepwise selection begins with no QTL (m = 0). QTL are then added or deleted one by one in the model. Alternatively, a group of QTL can be added or deleted together. The testing hypotheses for adding or deleting one additional QTL Qi are
$$H_0: a_i = 0 \quad \text{vs.} \quad H_1: a_i \neq 0 \tag{9}$$

given other, say, k QTL in the model. In hypotheses 9, ai denotes the effect of Qi. An LRT statistic

$$\mathrm{LRT} = -2 \log \frac{L_0}{L_1}$$

is used for testing the hypotheses, where L0 and L1 are the likelihoods of the MIM models with k and k + 1 QTL, respectively. If a group of QTL is tested, the hypothesis testing would contain several QTL effects. The stepwise model selection procedure proceeds as follows:
Step 1:
Significant values for entry (SVE) and staying (SVS) of a LRT statistic are specified for adding and dropping a QTL in the MIM model. Note that SVE and SVS could be different in model selection.
Step 2:
For each position on the genome covered by markers, the LRT statistic reflecting the contribution of the putative QTL to quantitative trait variation is calculated (m = 1; IM). If there are positions with LRT statistics larger than SVE, the position with the largest value will be selected and added first in the model. When m = 1, it is important to note the shape of the likelihood profile and the direction of effect change along the genome for further mapping. Note that quite often no position is found with the LRT statistic larger than SVE when m = 1 because individual QTL contribute little to the trait variation. Two alternative approaches are proposed to prevent the procedure from stopping at a very early stage.
First, when m = 1, the position with the highest LRT statistic is automatically included in the model to initiate the procedure. In our experience, when only one QTL is considered in the model (m = 1), it is quite often found that the LRT statistic of a QTL could be less than SVE. But, when multiple QTL (if any) are accumulated in the model (m > 1), the partial LRT statistics of individual QTL might become significant because more genetic variation is removed from residual variation by taking multiple QTL into account.
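Second, chunkwise selection, in which a group of QTL is tested and added to the model together, can be used to initiate the procedure.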
Step 3:
After the first k QTL are added to the model, the MIM model with m = k + 1 QTL is considered. The position that produces the most significant partial LRT statistic at the SVE level is added into the model. After the k + 1 QTL are fitted to the model, stepwise selection checks all the QTL and deletes any QTL that does not produce a significant partial LRT statistic at the SVS level. Note that a QTL that enters at an early stage may become superfluous at a later stage in stepwise selection procedure. By the same argument, chunkwise selection (m = k + l, l > 1) can be implemented. The stepwise process ends when none of the other positions has a partial LRT statistic significant at the SVE level.
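The single-QTL version of the stepwise procedure in Steps 1 to 3 can be sketched as a generic loop. This is an illustration under assumptions, not the authors' implementation: `stepwise_select` is a hypothetical name, `partial_lrt` is a hypothetical callback that returns the partial LRT statistic of one position given the other QTL in the model (in practice it would refit the MIM model), chunkwise entry is not covered, and `max_steps` is a safeguard added here against add/drop cycles.

```python
def stepwise_select(candidates, partial_lrt, sve, svs, force_first=True, max_steps=100):
    """Forward-backward selection of QTL positions by partial LRT.

    candidates  : iterable of hashable position labels
    partial_lrt : callback (position, others) -> partial LRT statistic of
                  `position` given the QTL in the tuple `others`
    sve, svs    : thresholds for entering and for staying in the model
    force_first : include the best single position even if it is below sve
    """
    candidates = list(candidates)
    model = []
    for _ in range(max_steps):
        outside = [c for c in candidates if c not in model]
        if not outside:
            break
        # Forward step: the position with the largest partial LRT statistic.
        best = max(outside, key=lambda c: partial_lrt(c, tuple(model)))
        stat = partial_lrt(best, tuple(model))
        if stat < sve and not (force_first and not model):
            break
        model.append(best)
        # Backward step: drop earlier QTL that are no longer significant.
        for q in list(model[:-1]):
            others = tuple(x for x in model if x != q)
            if partial_lrt(q, others) < svs:
                model.remove(q)
    return model


# Toy statistics that do not depend on the other QTL in the model.
toy = {"A": 20.0, "B": 15.0, "C": 5.0}
print(stepwise_select(toy, lambda pos, others: toy[pos], sve=12.12, svs=12.12))
```

The `force_first` flag corresponds to the first alternative in Step 2, where the position with the highest LRT statistic is entered automatically to initiate the procedure.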
Separating linked QTL:
The evidence of multiple-linked QTL clustering in a region could be suggested by the shape of the likelihood profile, for example, a likelihood profile with a wide range of significant multiple peaks, or by significant change in the direction of estimated QTL effects on a chromosome region. To separate closely linked QTL in a certain chromosome region, we can compare the likelihood of the multiple-QTL model with that of a single-QTL model in this region for separation.
Analyses of epistasis:
For a backcross population, it can be shown that if epistasis is present and ignored in mapping, the estimates of main effects of epistatic QTL are asymptotically unbiased whether epistasis between QTL is considered in the model or not, and the power of the test for detecting epistatic QTL could be low (Appendix 1). Therefore, when mapping QTL without considering epistasis in a backcross population, the positions and effects of the identified QTL could still be unbiased. For l QTL being tested, there are k = l(l - 1)/2 possible digenic epistases. For each pair of QTL Qi and Qj, the hypotheses for testing their epistatic effect wij are
$$H_0: w_{ij} = 0 \quad \text{vs.} \quad H_1: w_{ij} \neq 0 \tag{10}$$

given the l QTL in the MIM model. Again, the LRT is used to test the hypotheses. The hypotheses in Equation 10 can also be used to identify QTL with no main effect but interacting with other QTL. To choose the critical value for epistasis detection, a Bonferroni argument can be used. The critical value for rejection of H0 is suggested as $\chi^2_{1,\alpha/k}$, where α is the overall significance level.
Fine tuning the estimates of QTL positions and effects:
In the above procedures, the estimates of QTL effects and positions were obtained individually. Therefore, the model likelihood might not be at the maximum, and the model is not the final model. To obtain the MLEs of the positions and effects, a multidimensional search around the regions of the identified QTL is suggested. By doing this, QTL estimates can be fine tuned and the final model can be determined. With estimates of QTL positions and effects, other composite genetic parameters (e.g., heritability and variance components) of a quantitative trait can be estimated and response to selection can be predicted.
Construction of the confidence interval for QTL positions and effects:
It is important to construct the confidence interval (C.I.) for QTL effects and positions. For example, when a particular QTL is to be transferred to a recipient, a C.I. of the QTL position estimate indicates how large a chromosome segment around the detected position should be transferred. There are several approaches to constructing a C.I. of the QTL positions and effects, including the lod support interval and the asymptotic standard deviation (ASD). The ASD approach uses ($\hat{p} - Z(\alpha/2)\,S_{\hat{p}}$, $\hat{p} + Z(\alpha/2)\,S_{\hat{p}}$), where $\hat{p}$ and $S_{\hat{p}}$ are the estimates of QTL position and its standard deviation, to construct a C.I.
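A normal-approximation interval of this kind is straightforward to compute. The sketch below assumes the usual two-sided standard normal quantile for the multiplier; the function name `asd_interval` and the example numbers are hypothetical and are not taken from the mapping results.

```python
from statistics import NormalDist


def asd_interval(estimate, asd, alpha=0.05):
    """Symmetric interval: estimate -/+ z * ASD, with z the (1 - alpha/2) normal quantile."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return estimate - z * asd, estimate + z * asd


# Hypothetical position estimate of 10 cM with an ASD of 2 cM.
low, high = asd_interval(10.0, 2.0)
print(round(low, 2), round(high, 2))
```

At the 0.05 level the multiplier is about 1.96, so the interval is roughly four ASDs wide. Unlike a lod support interval, it is symmetric around the estimate by construction.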
Estimation of variance components and heritability:
When the final model is determined, the variance components and the heritability of the quantitative trait can be estimated. The ratio VG/Vp, denoted by $h_b^2$, is called the heritability of a quantitative trait in the broad sense, where VG and Vp are the genetic and phenotypic variances. The genetic variance VG can be estimated by the sum of squares of the final model, and the phenotypic variance Vp can be estimated by the total sum of squares. The estimate of $h_b^2$ can be approximated by the coefficient of determination R2 of the MIM model.
To estimate the genetic variance components, for example, the total genetic variance contributed by m QTL in the backcross population by Equation 1 is
![]() |
(11) |
where Dij is the gametic linkage disequilibrium coefficient between Qi and Qj. However, a genetic variance component estimated by plugging in an estimated QTL effect is biased, and this bias can be corrected using the sampling variance of the estimate. The genetic covariance between Qi and Qj is defined by 2Dijaiaj. By the same argument, the estimated genetic covariance by 2D̂ijâiâj is also biased and can be corrected by 2D̂ij[âiâj + Cov(âi, âj)] under the assumption that the effect and location of QTL are independent. Other genetic components can also be estimated in the same way. For an F2 population or a backcross population with segregation distortion, the partition of genetic variance into components is presented by C-H. KAO and Z-B. ZENG (unpublished results).
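As a hedged illustration of where the main-effect and covariance terms come from, the short derivation below considers main effects only (epistatic terms are left out) in a backcross without segregation distortion. The symbol $r_{ij}$, the recombination fraction between Qi and Qj, is introduced here for the sketch, and the individual index on the coded variables is dropped.

$$
\begin{aligned}
\operatorname{Var}(x_i) &= \tfrac{1}{2}\left(\tfrac{1}{2}\right)^2 + \tfrac{1}{2}\left(-\tfrac{1}{2}\right)^2 = \tfrac{1}{4},\\
\operatorname{Cov}(x_i, x_j) &= \tfrac{1}{4}(1 - r_{ij}) - \tfrac{1}{4}\,r_{ij} = \frac{1 - 2r_{ij}}{4} = D_{ij},\\
\operatorname{Var}\Big(\sum_{i} a_i x_i\Big) &= \sum_{i} \frac{a_i^2}{4} + \sum_{i<j} 2 D_{ij}\, a_i a_j .
\end{aligned}
$$

Under these assumptions each QTL contributes $a_i^2/4$ and each linked pair contributes $2D_{ij}a_ia_j$, so two linked QTL with effects of opposite sign contribute a negative covariance, while unlinked QTL ($r_{ij} = 1/2$) contribute none.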
Estimation of individual genotypic value and marker-assisted selection:
In plant or animal breeding, individuals with high genotypic values or favorable genotypes are usually selected for producing progeny. With the estimated QTL effects and positions, the genotypic values of individuals can be estimated by Equation 1 and the favorable QTL genotypes can be determined for selection. To select individuals with large trait values, genotype AA (Aa) of nonepistatic QTL with positive (negative) effects is preferred. For QTL with epistasis, their epistatic effects must be considered in selecting the best combination of genotypes. If QTL controlling different traits are closely linked or at the same positions, traits are genetically correlated. Selecting individuals for improvement of one trait will affect the other trait due to linkage or pleiotropy. In practice, selecting individuals with the desired character for one trait will frequently accompany an undesired character for other traits. By considering circumstances such as genetic correlation between traits, the distances between markers and QTL, and the effects of QTL, the best strategies of marker-assisted selection for (multiple) trait improvement under specific purposes and requirements can be explored.
| DATA ANALYSIS |
|---|
Radiata pine:
Radiata pine is one of the most widely planted forestry species in the Southern Hemisphere. Two elite parents were crossed to produce 134 progeny. For each progeny, random amplified polymorphic DNA (RAPD) markers were generated, and traits measured included annual brown cone number at eight years of age, diameter of stem at breast height, and branch quality score. The cone number per tree, which varied from 0 to 45, was transformed to approximate a normal distribution using a square root transformation. The quality of branches of a tree was scored on a scale from 1 (poorest) to 6 (best). The mean of several branch quality scores denoted the branch quality of a tree. A pseudotestcross strategy is used to construct a linkage map for each parent, and then a backcross model can be used for mapping QTL for each parent separately.
As mentioned in STRATEGY OF QTL MAPPING, the choice of critical value is a very complicated issue for the multiple-QTL model. The value depends on the marker data structure and several unknown QTL parameters (true model). In data analysis, a critical value from IM based on a Bonferroni argument is used to evaluate and illustrate the MIM approach. The SVE and SVS of the LRT statistic for claiming a QTL detection at the overall α = 0.05 level were chosen as 12.12 ($\chi^2_{1,0.0005}$). For QTL selected as a chunk, the critical value at the overall α = 0.05 level was chosen as $\chi^2_{k,0.0005}$, where k is the number of tested parameters in the chunk.
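The chi-square critical values used as thresholds in this analysis (12.12 for one parameter, and 15.2 and 17.73 for chunks of two and three parameters) can be checked with a few lines of standard-library Python. This is a sketch: the function names are hypothetical, the upper-tail probability uses the closed form for integer degrees of freedom, and the quantile is found by bisection rather than by a statistics library. Reading the per-test level 0.0005 as 0.05 split over 100 tests is arithmetic only; the number of tests is not stated explicitly here.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df."""
    z = x / 2.0
    if df % 2 == 1:
        q, a = math.erfc(math.sqrt(z)), 0.5
    else:
        q, a = math.exp(-z), 1.0
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi2_critical(df, alpha):
    """Value c with P(X > c) = alpha, found by bisection on [0, 1000]."""
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bonferroni_critical(df, overall_alpha, n_tests):
    """Critical value when overall_alpha is split equally over n_tests tests."""
    return chi2_critical(df, overall_alpha / n_tests)


for k in (1, 2, 3):
    print(k, round(chi2_critical(k, 0.0005), 2))
```

The same helper gives the epistasis threshold for any number of pairwise tests by passing that number as `n_tests` with one degree of freedom.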
QTL detection:
For tree diameter (diameter at breast height, DBH), when m = 1, there is no position along the genome with an LRT statistic higher than SVE. The position with the largest LRT statistic (7.85; R2 = 0.0639) was found at position [12,5,0] (0 cM away from the left marker of the fifth marker interval on the twelfth linkage group). The chromosome region between C1M3 (the third marker of the first linkage group) and C1M7 showed opposite directions of effects. At C1M3, the effect was positive (P = 0.57), while at C1M4 and C1M5, the effects were negative (P = 0.0253 and 0.4181, respectively). The genetic distance between C1M3 and C1M4 is 74.8 cM. This could suggest that there are two closely linked QTL with opposite directions of effects in this region. If only one QTL (m = 1) is fitted in the model for search, the effect can be canceled out by opposing QTL effects, and the QTL escape detection, as shown by the LRT statistic profile of IM in Figure 1. Therefore, on linkage group 1, the MIM model with m = 2 selected two candidate QTL, at positions [1,3,63] and [1,4,0], as a chunk. The partial LRT statistic for fitting the two QTL in the model was 13.13 (SVE and SVS for two parameters are $\chi^2_{2,0.0005}$ = 15.2), and the model R2 was 0.2104. Although the LRT statistic was less than SVE, the two QTL were selected as a chunk to initiate the stepwise selection process.
The procedure restarted at m = 2 by fitting two QTL with effects of opposite directions at [1,3,63] and [1,4,0]. The partial LRT statistics were 8.034 and 8.458 for the two QTL, with estimated effects 65.65 and -73.48, respectively. Given QTL at [1,3,63] and [1,4,0] in the model, a QTL at [10,5,12] with partial LRT statistic 12.83 was selected into the model (m = 3). The partial LRT statistics became 14.89, 15.42, and 12.83, which were all larger than the SVS of 12.12, for the three QTL. The model R2 was 0.3202. Given these three QTL in the model, the largest partial LRT statistic 7.40 was found at position [2,2,0]. A chunkwise selection for epistatic QTL was attempted. If the candidate QTL at [2,2,0] and [12,5,12] with epistasis were selected as a chunk (m = 5 and one epistatic effect, k = 6), the partial LRT statistic of the chunk would be 24.76 (compared with $\chi^2_{3,0.0005}$ = 17.73). The partial LRT statistics were 23.48, 24.39, and 8.76 for the three preselected QTL at [1,3,63], [1,4,0], and [10,5,12], respectively. The QTL at [10,5,12] became nonsignificant and, therefore, was dropped from the model. Given the four QTL [1,3,63], [1,4,0], [2,2,0], and [12,5,12] in the MIM model, no other single position had a partial LRT statistic >8.76. The chunkwise selection was implemented again to find epistatic QTL. When the candidate QTL at [5,5,0] and [10,5,12] with epistasis were considered as the third chunk, the partial LRT statistic was 19.85. Adding these two epistatic QTL into the model (m = 6 and two epistatic effects, k = 8), the partial LRT statistics were 19.48, 20.69, and 26.91 for QTL at positions [1,3,63], [1,4,0], and the first chunk of QTL, respectively. Given the six QTL, no other QTL were identified.
Fine tuning the estimates of QTL position and effect:
Two epistatic pairs were identified as described above; no other epistatic interaction between QTL was found. No QTL without main effect but interacting with the identified QTL were found. A multidimensional search around the detected QTL was used to fine tune the estimates of QTL parameters. The locations changed to [1,3,61], [1,4,0], [2,2,0], [5,5,0], [10,5,9], and [12,5,9]. The estimated QTL effects are shown in Table 1. QTL at positions [1,3,61], [2,2,0], [10,5,9], and [5,5,0] had positive effects, and QTL at positions [1,4,0] and [12,5,9] had negative effects. The effects of QTL at positions [1,3,61] and [1,4,0] were larger than the others. The model R2 was 0.5226. Therefore, six QTL were identified for the diameter trait. The partial LRT statistic profiles for each QTL are shown in Figure 1.
Epistasis:
The estimated epistatic effect between QTL at positions [2,2,0] and [12,5,9] was 39.54 (partial LRT statistic 15.23), and the epistatic effect between QTL at [5,5,0] and [10,5,9] was 22.64 (partial LRT statistic 4.84). Figure 2 shows how the QTL interact. Figure 2A shows that the effect of QTL (GBB - GBb) at position [12,5,9] was positive in the background of homozygote QTL (AA) at position [2,2,0], but it was negative in the heterozygote background (Aa). Figure 2B shows that the QTL at position [10,5,9] had a large effect in the background of homozygote QTL (AA) at [5,5,0], but it had a small effect in the heterozygote background (Aa).
Heritability and variance components:
The broad sense heritability for tree diameter can be estimated by the R2 value of the final MIM model. The R2 of the model including six QTL and two epistases was 0.5226. QTL at positions [2,2,0], [5,5,0], [10,5,9], and [12,5,9] contributed ~4.50, 1.36, 5.25, and 1.76% of the total genetic variance, respectively. The percentage of genetic variance contributed by the two linked QTL separated by 13.8 cM on the first linkage group was 76.75%. There was a negative genetic covariance between the two linked QTL. Two epistatic pairs contributed ~10.38% to the total genetic variance.
QTL mapping for cone number and branch quality:
QTL mapping was also performed on the traits of cone number and branch score. The mapping results are listed in Table 1. For cone number, seven QTL were identified (although the QTL at [1,1,3] was not significant with partial LRT statistic 9.44, we considered it as a candidate QTL). Epistasis was found between two QTL pairs using chunkwise selection. The model R2 value of the MIM model fitted to the seven QTL and their epistasis was 0.5606. The two linked QTL, separated by 27.6 cM on linkage group 10, contributed 29.93% of the genetic variance. The other five QTL contributed ~55.93% of the total genetic variance. Epistasis contributed 14.14%. For branch quality, five QTL were identified (we also considered the two QTL with partial LRT statistic values 10.37 and 10.36 at [1,4,11] and [12,5,0] as candidate QTL). No epistasis was found for QTL controlling branch score. The model R2 was 0.3630. Two linked QTL, separated by 19.6 cM on linkage group 11, contributed 48.69% of the genetic variance. The remaining three QTL contributed from ~11 to 27% of the total genetic variance.
Confidence intervals of QTL positions and effects:
The lod support interval and the ASD of QTL effect and position are listed in Table 1. Out of the 18 QTL detected for three traits, 9 ([2,6,0], [5,10,0], and [10,9,0] for cone number; [1,4,0], [2,2,0], and [5,5,0] for tree diameter; [2,1,0], [11,6,0], and [12,5,0] for branch score) of them were localized at the markers, and 2 ([10,5,9] and [12,5,9]) had negative ASD. Therefore, the ASD of these QTL position estimates were not available for constructing C.I.'s. The asymmetric lod support intervals are typical in this case. For example, the diameter QTL at [5,5,0] has an asymmetric lod support interval ([5,4,7], [5,5,16]). In general, the interval constructed by ASD is much narrower than the lod support interval. For example, C.I.'s constructed using four times ASD were 6.52 and 7.04 cM for the cone QTL at [6,4,18] and [12,3,2], and the lod support intervals are 59.6 cM and 14.6 cM, respectively.
Marker-assisted selection:
Individuals with favorable QTL genotypes are selected as parents to produce progeny. Trees carrying all the favorable QTL genotypes were not found for each trait in the sample. Therefore, only a subset of the detected QTL was considered in selection. For tree diameter, three trees were found to carry favorable genotypes and two trees were found to carry unfavorable genotypes (consider epistasis) of the five QTL (out of the six detected QTL) at positions [1,4,0], [2,2,0], [5,5,0], [10,5,9], and [12,5,9]. The observed trait means for the two groups were 232.38 and 163.05 mm, respectively, through selection of these five diameter QTL. The estimated genotypic values of the two groups were 233.84 and 160.06 mm (Table 2). The observed and estimated values of performing selection for the other two traits on the sample based on four and five QTL are also shown in Table 2. The mapping results in Table 1 also allow us to estimate the genotypic values of certain genotypes. For example, if trees carrying all six favorable diameter QTL were selected with epistasis taken into consideration, the estimated tree diameter for those trees would be 314.17 mm and the estimated cone number would be 8.22. If trees carry all seven favorable QTL (epistasis considered) for reducing cone number, the estimated cone number would be 0.33 and the estimated tree diameter would be 196.45 mm. Consequently, the improvement of tree diameter would cause simultaneous increase in cone number, which is a reflection of the positive genetic correlation between the two traits. Generally, the estimated and observed results were quite close based on the MIM result as found in this sample.
| DISCUSSION |
|---|
A new QTL mapping approach named MIM is proposed. It uses multiple-marker intervals simultaneously to construct multiple QTL in the model for QTL mapping. The MIM model is based on Cockerham's model (C-H. KAO and Z-B. ZENG, unpublished results) for defining genetic parameters and on previously derived general formulas for obtaining the MLEs.
The MIM model is a multiple-QTL model. When the multiple-QTL model is considered, the likelihood is a finite normal mixture and becomes increasingly intractable in maximization as the number of QTL fitted into the model increases (![]()
Under the ad hoc critical value, MIM detected six QTL for tree diameter in this example, whereas CIM detected only two of them, both on the first linkage group, and IM failed to detect any QTL (Figure 1). The major reason for this difference is that CIM cannot control the two detected linked QTL simultaneously in further mapping. As a result, only the QTL at position [1,4,0] is controlled, but controlling it alone does not substantially reduce the genetic variation, because its effect is largely canceled by the ignored linked QTL with opposite effect at position [1,3,61]. Accordingly, most of the genetic variance (76.75%) contributed by the two linked QTL becomes part of the residual genetic variation, making the other four QTL undetectable. This illustrates a key advantage of MIM: the QTL already detected are fitted directly into the model when searching for the next QTL. Consequently, more QTL were detected by MIM than by the current methods in this example.
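The cancellation can be made concrete with the single-marker regression coefficient derived in the Appendix, $b_{yx_1} = a_1 + (1 - 2r)a_2$. As a purely illustrative case (not the estimates for this data set), assume a backcross, a test position exactly at the first QTL, and a linked second QTL with an effect of equal size and opposite sign, $a_2 = -a_1$, at recombination fraction $r$:

$$b_{yx_1} = a_1 - (1 - 2r)a_1 = 2r\,a_1 .$$

With $r = 0.1$, for example, only 20% of $a_1$ would be visible when the linked QTL is ignored, whereas fitting both QTL in the model recovers $a_1$ itself.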
In the data analyses, MIM localized two linked QTL with large effects in opposite directions in the third interval of linkage group 1 (Figure 1A). Together they contributed 76.75% of the total genetic variance. This interval is 74.8 cM long, so more markers should be added to it to permit further investigation. Two linked QTL, one controlling diameter and the other controlling cone number, were detected in the same fifth interval of linkage group 10 (Table 1). Their estimated locations are 2 cM apart. Further investigation is needed to determine whether they are the same (pleiotropic) QTL or different (closely linked) QTL. The likelihood profile of linkage group 12 in Figure 1E is conditional on the other five unlinked QTL. It shows multiple significant peaks, which could suggest multiple linked QTL on the same linkage group. However, further investigation of this linkage group gave no evidence of additional QTL given the peak at position [12,5,9] and the other five detected QTL. It is therefore concluded that there is only one QTL, at position [12,5,9], on linkage group 12.
Another benefit of MIM is that epistasis can be readily incorporated into the model, either for analysis or for searching for epistatic QTL. When both main and epistatic effects are taken into account in searching for QTL, the critical value for hypothesis testing needs to be adjusted for the extra degree of freedom for epistasis. Notably, the estimated main effects of linked QTL are asymptotically unbiased in the backcross population (Appendix 1), but they are biased in the F2 population if epistasis is present and ignored in mapping (![]()
It has been 76 yr since ![]()
An initial version of the MIM program source code (written in Fortran 77) is available on the worldwide web (http://www.stat.sinica.edu.tw/~chkao/). A more user-friendly package can be developed based on this program. Using the MIM program, we implemented stepwise and chunkwise selection with the LRT statistic as the selection criterion to search for QTL in the data analyses. In analyzing the data, we chose the two linked QTL with opposite directions of effect on the first linkage group as a starting point to initiate the selection process, and six QTL were found for tree diameter. We also tried another possible starting point, at [12,5,0], to initiate the process and obtained the same final model. The final model obtained by model selection might not be optimal. Even if the optimal model were obtained, there is no guarantee, for a limited sample size, that it is the true model (that is, that the estimated QTL are the true QTL). Ultimately, the reliability of the identified QTL will depend on further experiments to assess their validity. No single criterion is a panacea for the model selection problem. Other model selection techniques and criteria could also be implemented. It is a very important task to explore and automate the model selection procedures of the MIM approach for general use in the QTL mapping community.
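The general shape of forward stepwise selection with an LRT criterion can be sketched as follows. This is not the MIM program: as simplifying assumptions, putative QTL sit exactly at markers (so genotypes are observed and the likelihood is that of an ordinary normal linear model rather than a normal mixture), only main effects are searched, and the threshold of 3.84 (the 5% point of a chi-square with 1 d.f.) is illustrative rather than the critical value used in the analyses. All function names and the toy data are hypothetical.

```python
import math
from itertools import product

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for j in range(c, n + 1):
                M[i][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def rss(y, columns):
    """Residual sum of squares of a least-squares fit of y on an intercept plus columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(XtX, Xty)
    return sum((y[i] - sum(beta[a] * X[i][a] for a in range(k))) ** 2 for i in range(n))

def forward_select(y, candidates, threshold):
    """Add the candidate with the largest LRT = n*ln(RSS0/RSS1) until none exceeds threshold."""
    n, selected = len(y), []
    current = rss(y, [])
    while True:
        best = None
        for name, col in candidates.items():
            if name in selected:
                continue
            new = rss(y, [candidates[s] for s in selected] + [col])
            lrt = n * math.log(current / new)  # assumes new > 0 (no perfect fit)
            if best is None or lrt > best[1]:
                best = (name, lrt, new)
        if best is None or best[1] < threshold:
            return selected
        selected.append(best[0])
        current = best[2]

# Toy data: a replicated 2^3 factorial of marker genotypes coded +1/2 and -1/2.
# Markers m1 and m3 have effects +2 and -2; m2 has none; "noise" is a fixed +/-0.1 pattern.
cells = [c for c in product((0.5, -0.5), repeat=3) for _ in range(2)]
markers = {f"m{j + 1}": [c[j] for c in cells] for j in range(3)}
noise = [0.1 if i % 2 == 0 else -0.1 for i in range(len(cells))]
y = [2.0 * c[0] - 2.0 * c[2] + e for c, e in zip(cells, noise)]

selected = forward_select(y, markers, threshold=3.84)
```

In this toy example the two markers with effects enter the model and the search stops before the null marker is added. A different starting model or threshold can lead to a different final model, which is why the choice of starting point was checked in the analyses above.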
| ACKNOWLEDGMENTS |
|---|
We are greatly indebted to Dr. Chung-I Wu and three anonymous reviewers for their comments and criticisms. Chen-Hung Kao is grateful to Corinna Lange for her suggestions. C-H.K. was supported by grants NSC87-2313-B-324-001 and NSC88-2313-B-324-001 from the National Science Council, Taiwan, Republic of China; Z-B.Z. was funded by GM-45344 from the National Institutes of Health and no. 9600645 from the United States Department of Agriculture Plant Genome Program.
Manuscript received December 5, 1997; Accepted for publication March 24, 1999.
| APPENDIX |
|---|
THE PROBLEMS OF IGNORING EPISTASIS IN QTL MAPPING
To simplify the argument, consider the situation where the test positions for QTL are located precisely at the marker positions. If only two epistatic QTL, A ($x_1$) and B ($x_2$), control a quantitative trait $y$, the single-marker regression coefficient of $y$ on one of the QTL, say $x_1$, is given by $b_{yx_1} = \mathrm{Cov}(y, x_1)/V(x_1)$, where $\mathrm{Cov}(y, x_1)$ is the covariance between the trait and QTL A and $V(x_1)$ is the variance of QTL A. Assuming that there is no covariance between environmental deviation and QTL, it is easy to show that

$$\mathrm{Cov}(y, x_1) = a_1 V(x_1) + a_2\,\mathrm{Cov}(x_1, x_2) + w\,\mathrm{Cov}(x_1 x_2, x_1) = \tfrac{1}{4}a_1 + \tfrac{1}{4}(1 - 2r)a_2 \tag{12}$$

because $\mathrm{Cov}(x_1 x_2, x_1) = 0$ under Cockerham's model (C-H. KAO and Z-B. ZENG, unpublished results). The single-marker regression coefficient is $b_{yx_1} = a_1 + (1 - 2r)a_2$ because $V(x_1) = \tfrac{1}{4}$. The main effect of the linked QTL B is involved in the estimation, but the epistatic effect $w$ is not involved. If both QTL A and B are considered in the model, the partial regression coefficient $b_{yx_1 \cdot x_2}$ becomes

$$b_{yx_1 \cdot x_2} = \frac{\sigma_{yx_1 \cdot x_2}}{\sigma^2_{x_1 \cdot x_2}} = a_1 \tag{13}$$

where $\sigma_{yx_1 \cdot x_2}$ and $\sigma^2_{x_1 \cdot x_2}$ denote the conditional covariance of trait $y$ and QTL $x_1$ on QTL $x_2$ and the conditional variance of $x_1$ on $x_2$ (![]()
![]()
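The two regression coefficients in (12) and (13) can be checked numerically with exact arithmetic. The sketch below assumes a backcross in which each QTL genotype is coded +1/2 or -1/2, with the two parental genotype classes each having probability $(1 - r)/2$ and the two recombinant classes each $r/2$; the function name `backcross_coefficients` and the parameter values are hypothetical. Environmental deviation is omitted because it is assumed uncorrelated with the QTL and so does not affect the covariances.

```python
from fractions import Fraction as F

def backcross_coefficients(a1, a2, w, r, mu=F(0)):
    """Return (b_yx1, b_yx1.x2) from exact population moments; requires 0 < r <= 1/2."""
    half = F(1, 2)
    # Joint distribution of (x1, x2) for two linked loci in a backcross (assumed coding).
    dist = {
        (half, half): (1 - r) / 2,
        (-half, -half): (1 - r) / 2,
        (half, -half): r / 2,
        (-half, half): r / 2,
    }

    def mean(f):
        return sum(p * f(x1, x2) for (x1, x2), p in dist.items())

    def cov(f, g):
        return mean(lambda x1, x2: f(x1, x2) * g(x1, x2)) - mean(f) * mean(g)

    def y(x1, x2):  # genotypic value with main effects a1, a2 and epistatic effect w
        return mu + a1 * x1 + a2 * x2 + w * x1 * x2

    def X1(x1, x2):
        return x1

    def X2(x1, x2):
        return x2

    # Single-marker regression coefficient: Cov(y, x1) / V(x1).
    b_single = cov(y, X1) / cov(X1, X1)
    # Partial regression coefficient: conditional covariance over conditional variance.
    cond_cov = cov(y, X1) - cov(y, X2) * cov(X1, X2) / cov(X2, X2)
    cond_var = cov(X1, X1) - cov(X1, X2) ** 2 / cov(X2, X2)
    return b_single, cond_cov / cond_var

# Hypothetical effects: a1 = 3, a2 = -2, w = 5, recombination fraction r = 1/10.
b_single, b_partial = backcross_coefficients(F(3), F(-2), F(5), F(1, 10))
```

For these values the single-marker coefficient equals $a_1 + (1 - 2r)a_2 = 7/5$ and the partial coefficient equals $a_1 = 3$, whatever value is chosen for $w$.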
| LITERATURE CITED |
|---|
AITKEN, K. S., G. SMAIL, J. DRENTH, Y. LI, C.-H. KAO et al., 1997 Detection of quantitative trait loci (QTL) for cone production in Pinus radiata, pp. 337-341 in IUFRO '97 Genetics of Radiata Pine, edited by R. D. BURDON and J. M. MOORE. Proceedings of NZFRI-IUFRO Conference, 1-4 December, and Workshop, 5 December, Rotorua, New Zealand, FRI Bulletin no. 203.
AKAIKE, H., 1974 A new look at the statistical model identification. IEEE Trans. Auto. Control 19:716-723.
ATKINSON, A. C., 1980 A note on the generalized information criterion for the choice of a model. Biometrika 67(2):413-418.
CARBONELL, E. A., T. M. GERIG, E. BALANSARD, and M. J. ASINS, 1992 Inter