On the Differences Between Maximum Likelihood and Regression Interval Mapping in the Analysis of Quantitative Trait Loci
Chen-Hung Kao, Institute of Statistical Science, Academia Sinica, Taipei 11529, Taiwan, Republic of China
Corresponding author: Chen-Hung Kao
Communicating editor: Z-B. ZENG
| ABSTRACT |
|---|
The differences between maximum-likelihood (ML) and regression (REG) interval mapping in the analysis of quantitative trait loci (QTL) are investigated analytically and numerically by simulation. The analytical investigation is based on the comparison of the solution sets of the ML and REG methods in the estimation of QTL parameters. Their differences are found to relate to the similarity between the conditional posterior and conditional probabilities of QTL genotypes and depend on several factors, such as the proportion of variance explained by QTL, relative QTL position in an interval, interval size, difference between the sizes of QTL, epistasis, and linkage between QTL. The differences in mean squared error (MSE) of the estimates, likelihood-ratio test (LRT) statistics in testing parameters, and power of QTL detection between the two methods become larger as (1) the proportion of variance explained by QTL becomes higher, (2) the QTL locations are positioned toward the middle of intervals, (3) the QTL are located in wider marker intervals, (4) epistasis between QTL is stronger, (5) the difference between QTL effects becomes larger, and (6) the positions of QTL get closer in QTL mapping. The REG method is biased in the estimation of the proportion of variance explained by QTL, and it may have a serious problem in detecting closely linked QTL when compared to the ML method. In general, the differences between the two methods may be minor, but can be significant when QTL interact or are closely linked. The ML method tends to be more powerful and to give estimates with smaller MSEs and larger LRT statistics. This implies that ML interval mapping can be more accurate, precise, and powerful than REG interval mapping. The REG method is faster in computation, especially when the number of QTL considered in the model is large. 
Recognizing the factors affecting the differences between REG and ML interval mapping can help in outlining an efficient strategy that uses both methods in QTL mapping.
The likelihood of the interval mapping model is generally a finite normal mixture.
Although REG may approximate ML interval mapping well in some cases, the differences between the two methods can be significant in others, for example, when QTL interact or are closely linked.
| MAXIMUM-LIKELIHOOD INTERVAL MAPPING |
|---|
The differences between the ML and REG interval mappings can be illustrated by investigating the differences between their estimators of mean, genetic effects, and residual variance. To simplify the explanation of their differences, a one-QTL model for a backcross population is first used as an example. Their differences under a multiple-QTL model are discussed later. The one-QTL ML mapping model can be written as

$$
y_i = \mu + a x_i^* + \varepsilon_i, \tag{1}
$$

where yi is the quantitative trait value of individual i, µ is the mean, a is the effect of QTL Q, x*i, taking a value 1/2 (-1/2) for homozygote QQ (heterozygote Qq), denotes the genotype of Q, and εi is the environmental deviation and is assumed to follow N(0, σ²). Although the genotype of Q for an individual is usually unobserved and could be QQ or Qq, its distribution can be inferred from its flanking marker genotypes. Suppose the flanking markers are M and N. Then, there are four types of marker genotypes, type 1 MN/MN, type 2 MN/Mn, type 3 MN/mN, and type 4 MN/mn, as shown in Table 1. Given the four marker genotypes, the conditional probabilities for QTL genotypes QQ and Qq, denoted by pi1 and pi2, respectively, at a position between the markers can be calculated based on Haldane's mapping function (HALDANE 1919).
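The calculation of the conditional probabilities can be sketched as follows. This is a minimal illustration, assuming no crossover interference (Haldane's mapping function) and a backcross to the MQN/MQN parent; the function names and the parameters `d_left_cM` and `interval_cM` are hypothetical and do not come from the article.

```python
import math


def haldane_r(d_cM):
    """Recombination fraction for a map distance in centimorgans (Haldane)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cM / 100.0))


def qq_probabilities(d_left_cM, interval_cM):
    """Return P(QQ | flanking marker type) for backcross marker types 1-4.

    d_left_cM: assumed distance from the left marker M to the putative QTL.
    interval_cM: assumed distance between the flanking markers M and N.
    """
    r_mq = haldane_r(d_left_cM)
    r_qn = haldane_r(interval_cM - d_left_cM)
    r_mn = haldane_r(interval_cM)
    return {
        1: (1 - r_mq) * (1 - r_qn) / (1 - r_mn),  # MN/MN, nonrecombinant
        2: (1 - r_mq) * r_qn / r_mn,              # MN/Mn, recombinant
        3: r_mq * (1 - r_qn) / r_mn,              # MN/mN, recombinant
        4: r_mq * r_qn / (1 - r_mn),              # MN/mn, nonrecombinant
    }


# QTL in the middle of a 40-cM interval: recombinant types give p_i1 = 0.5.
probs = qq_probabilities(d_left_cM=20.0, interval_cM=40.0)
for marker_type, p_qq in probs.items():
    print(marker_type, round(p_qq, 4), round(1 - p_qq, 4))
```

For each marker type the second probability is pi2 = 1 - pi1, so only pi1 needs to be stored.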
Since the QTL genotype x*i could be homozygote (1/2) or heterozygote (-1/2) for an individual, the likelihood is then a normal mixture with mixing proportions equivalent to the conditional probabilities pi1 and pi2. For n individuals in the sample, the likelihood of the model in Equation 1 is

$$
L(\theta \mid Y, X) = \prod_{i=1}^{n} \left[ \sum_{j=1}^{2} p_{ij} \, N(\mu_{ij}, \sigma^2) \right], \tag{2}
$$

where θ denotes the parameters (pij, µ, a, σ²), Y and X denote the trait values and marker genotypes, N(µij, σ²) denotes the normal density function with mean µij and variance σ², and

$$
\mu_{i1} = \mu + \frac{a}{2}, \qquad \mu_{i2} = \mu - \frac{a}{2}
$$

are the genotypic values of QQ and Qq. In estimation, this normal mixture model can be treated as an incomplete data problem. Let the probability distribution of the missing data x*i be

$$
\Pr(x_i^* = \tfrac{1}{2} \mid X_i) = p_{i1}, \qquad \Pr(x_i^* = -\tfrac{1}{2} \mid X_i) = p_{i2}.
$$

The conditional distribution of the observed data, yi and Xi, given the missing data x*i can be considered as an independent sample from a population such that yi | (θ, Xi, x*i) ~ N(µ + ax*i, σ²), and the EM algorithm can be used to obtain the MLE. At a given position, pij's can be determined. By the definition of the EM algorithm, the iteration of the EM-step for obtaining µ, a, and σ² proceeds as follows:
E-step:
The posterior probabilities of the QTL genotypes x*i's of the n individuals are updated as

$$
\pi_{i1} = \frac{p_{i1} N(\mu_{i1}, \sigma^2)}{p_{i1} N(\mu_{i1}, \sigma^2) + p_{i2} N(\mu_{i2}, \sigma^2)}
$$

and

$$
\pi_{i2} = \frac{p_{i2} N(\mu_{i2}, \sigma^2)}{p_{i1} N(\mu_{i1}, \sigma^2) + p_{i2} N(\mu_{i2}, \sigma^2)}
$$

for i = 1, 2, ... , n, where each density is evaluated at yi. The posterior probability πi1 for individual i is a function of pi1, pi2, yi, µ, a, and σ². It is important to clarify the relationship between the conditional probability pi1 and the conditional posterior probability πi1 of the QTL genotype in the comparison of ML and REG interval mappings. It is shown later that the greater the similarity between πi1 and pi1 for each i, the better the approximation of REG to ML interval mapping. Note that pi1 = πi1 if pi1 = 1 or pi1 = 0 (i.e., the QTL is located at the marker) or a = 0. If 0 < pi1 < 1 and a ≠ 0, then πi1 generally differs from pi1.
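The E-step update and the equality conditions noted above can be checked numerically. The sketch below is illustrative only; the function names and the example values are hypothetical.

```python
import math


def normal_pdf(y, mean, var):
    """Normal density with the given mean and variance, evaluated at y."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posterior_qq(y, p1, mu, a, var):
    """Posterior probability of genotype QQ for one individual (E-step)."""
    f1 = p1 * normal_pdf(y, mu + a / 2.0, var)          # QQ component
    f2 = (1.0 - p1) * normal_pdf(y, mu - a / 2.0, var)  # Qq component
    return f1 / (f1 + f2)


# a = 0: the phenotype carries no information, so the posterior equals p_i1.
no_effect = posterior_qq(y=1.3, p1=0.3, mu=0.0, a=0.0, var=1.0)
# p_i1 = 1: the genotype is already known from the markers.
at_marker = posterior_qq(y=1.3, p1=1.0, mu=0.0, a=2.0, var=1.0)
# 0 < p_i1 < 1 and a != 0: a high phenotype shifts the posterior toward QQ.
shifted = posterior_qq(y=1.0, p1=0.5, mu=0.0, a=2.0, var=1.0)
print(no_effect, at_marker, shifted)
```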
M-step:
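Find µ, a, and σ² to satisfy

$$
\mu = \frac{1}{n} \sum_{i=1}^{n} (y_i - a w_i^*), \tag{3}
$$

$$
a = \frac{\sum_{i=1}^{n} (y_i - \mu) w_i^*}{\sum_{i=1}^{n} z_i^*}, \tag{4}
$$

$$
\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} \left[ (y_i - \mu)^2 - 2a (y_i - \mu) w_i^* + a^2 z_i^* \right], \tag{5}
$$

where

$$
w_i^* = E(x_i^* \mid y_i, X_i) = \tfrac{1}{2} (\pi_{i1} - \pi_{i2})
$$

and

$$
z_i^* = E(x_i^{*2} \mid y_i, X_i) = \tfrac{1}{4} (\pi_{i1} + \pi_{i2})
$$

are the conditional posterior expectations of x*i and x*i² given yi and Xi, respectively. In each iteration, new estimates of µ, a, and σ² are obtained in the M-step. These new estimates are then used to obtain new πi1's and πi2's for the next iteration. The converged values in the iteration are the MLE. A disadvantage of the EM algorithm is that it does not provide estimates of the covariance matrix of the MLE. However, this disadvantage can be removed by using appropriate methods for obtaining the asymptotic variance-covariance matrix.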
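The E-step and M-step can be put together in a short sketch. This is an illustration under stated assumptions, not the implementation used in the article: Equations 3 and 4 are solved jointly as a 2 x 2 linear system, the starting values are arbitrary choices, and the `simulate` helper with its pi1 values is hypothetical toy data rather than data derived from a marker map.

```python
import math
import random


def normal_pdf(y, mean, var):
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_one_qtl(y, p1, max_iter=500, tol=1e-8):
    """EM for the one-QTL backcross mixture; p1[i] = P(QQ | markers of i)."""
    n = len(y)
    mu = sum(y) / n
    var = sum((yi - mu) ** 2 for yi in y) / n
    a = 0.0  # arbitrary starting value (assumption)
    trace = []
    for _ in range(max_iter):
        # E-step: posterior expectation w_i* = pi_i1 - 1/2 and log-likelihood.
        w, loglik = [], 0.0
        for yi, pi in zip(y, p1):
            f1 = pi * normal_pdf(yi, mu + a / 2.0, var)
            f2 = (1.0 - pi) * normal_pdf(yi, mu - a / 2.0, var)
            loglik += math.log(f1 + f2)
            w.append(f1 / (f1 + f2) - 0.5)
        trace.append(loglik)
        if len(trace) > 1 and loglik - trace[-2] < tol:
            break
        # M-step: solve Equations 3 and 4 jointly, then Equation 5.
        sw, sz = sum(w), n / 4.0  # z_i* = 1/4 for every individual
        sy = sum(y)
        syw = sum(yi * wi for yi, wi in zip(y, w))
        det = n * sz - sw * sw
        mu = (sy * sz - sw * syw) / det
        a = (n * syw - sw * sy) / det
        var = sum((yi - mu) ** 2 - 2.0 * a * (yi - mu) * wi + a * a * 0.25
                  for yi, wi in zip(y, w)) / n
    return mu, a, var, trace


def simulate(n, mu, a, var, seed=1):
    """Toy backcross data; the p1 values below are hypothetical."""
    rng = random.Random(seed)
    y, p1 = [], []
    for _ in range(n):
        p = rng.choice([0.96, 0.96, 0.96, 0.5, 0.5, 0.04, 0.04, 0.04])
        x = 0.5 if rng.random() < p else -0.5
        y.append(mu + a * x + rng.gauss(0.0, math.sqrt(var)))
        p1.append(p)
    return y, p1


y, p1 = simulate(n=1000, mu=10.0, a=2.0, var=1.0)
mu_hat, a_hat, var_hat, trace = em_one_qtl(y, p1)
print(round(mu_hat, 3), round(a_hat, 3), round(var_hat, 3), len(trace))
```

The log-likelihood recorded in `trace` should not decrease from one iteration to the next, which is a convenient check on any EM implementation.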
| REGRESSION INTERVAL MAPPING |
|---|
The REG interval mapping model can be written as

$$
y_i = \mu + a w_i + \varepsilon_i, \tag{6}
$$

where µ, a, and εi have the same definitions as in the model of Equation 1, and

$$
w_i = E(x_i^* \mid X_i) = \tfrac{1}{2} (p_{i1} - p_{i2})
$$

is the conditional expectation of the QTL genotype given the flanking marker genotype. By treating wi as fixed, the model is a regression model, and this method is called REG interval mapping in QTL mapping. In estimation, both least-squares and maximum-likelihood techniques can be implemented to estimate µ, a, and σ² in Equation 6. Least-squares estimates (LSE) of µ and a are the solutions of

$$
\mu = \frac{1}{n} \sum_{i=1}^{n} (y_i - a w_i), \tag{7}
$$

$$
a = \frac{\sum_{i=1}^{n} (y_i - \mu) w_i}{\sum_{i=1}^{n} z_i}, \tag{8}
$$

where

$$
z_i = w_i^2 = \tfrac{1}{4} (p_{i1} - p_{i2})^2.
$$

Note that the estimates of the regression model will fail if zi = 0 (pi1 = pi2 = 1/2) for every i. However, this situation will not occur because pi1 ≠ pi2 for individuals with type 1 (MN/MN) and type 4 (MN/mn) flanking marker genotypes in the backcross population. The LSE of σ² is

$$
\hat{\sigma}^2 = \frac{1}{n - 2} \sum_{i=1}^{n} (y_i - \hat{\mu} - \hat{a} w_i)^2, \tag{9}
$$

where n - 2 is the degrees of freedom for the residual sum of squares. The likelihood of the REG mapping model is a normal density

$$
L(\theta \mid Y, X) = \prod_{i=1}^{n} N(\mu + a w_i, \sigma^2) \tag{10}
$$

rather than a normal mixture density. The mixing proportions pij's in the ML mapping likelihood (Equation 2) are blended into wi of Equation 10. If the maximum-likelihood principle is used in estimation, the MLE of µ and a for maximizing Equation 10 are the same as the solutions of Equation 7 and Equation 8. The MLE of σ² has a divisor n instead of n - 2 in Equation 9.
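The REG estimates can be computed in closed form. The sketch below solves Equations 7 and 8 jointly and returns both variance estimates (divisor n - 2 and divisor n); the function name and the small data set are hypothetical.

```python
def reg_one_qtl(y, p1):
    """Least-squares fit of y_i = mu + a * w_i + e_i with w_i = p_i1 - 1/2."""
    n = len(y)
    w = [p - 0.5 for p in p1]            # w_i = (p_i1 - p_i2) / 2
    sw = sum(w)
    sz = sum(wi * wi for wi in w)        # sum of z_i = w_i^2
    sy = sum(y)
    syw = sum(yi * wi for yi, wi in zip(y, w))
    det = n * sz - sw * sw               # zero only if every w_i is identical
    mu = (sy * sz - sw * syw) / det
    a = (n * syw - sw * sy) / det
    rss = sum((yi - mu - a * wi) ** 2 for yi, wi in zip(y, w))
    return {"mu": mu, "a": a, "var_lse": rss / (n - 2), "var_mle": rss / n}


# Hypothetical conditional probabilities and trait values for six individuals.
p1 = [0.96, 0.5, 0.5, 0.04, 0.96, 0.04]
y = [11.2, 9.8, 10.3, 8.9, 10.7, 9.4]
fit = reg_one_qtl(y, p1)
print(fit)
```

Unlike the EM iteration of the ML method, no iteration is needed, which is the source of the computational advantage of the REG method.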
| DIFFERENCES BETWEEN ML AND REG INTERVAL MAPPING |
|---|
By comparing the solution sets of ML and REG interval mapping, it can be seen that the two solution sets have similar expressions but different contents. In the REG method, the conditional expectations of QTL genotype, wi and zi, are used in estimation. In the ML method, the conditional posterior expectations of QTL genotype, wi* and zi*, play the same role in estimation. The conditional expectations consider only the conditional probabilities of QTL genotypes pi1's, and the conditional posterior expectations consider the posterior probabilities πi1's. The posterior probability πi1 utilizes phenotypic information in addition to marker information. Intuitively, the ML method can provide better estimates than the REG method because πi1 is more informative than pi1. Analytically, the differences between ML and REG interval mapping in estimation will depend on the differences between the two kinds of expectations. The two kinds of expectations are equivalent if and only if πi1 = pi1 and pi1 = 1 (or pi1 = 0) for each i (the QTL is located at a marker). How well REG approximates ML interval mapping depends on the similarity between pi1's and πi1's. Investigating the factors affecting the similarity between πi1 and pi1 can lead to identifying the differences between the two methods. These factors include (1) proportion of variance explained by a QTL (size of a QTL), (2) the relative QTL position within an interval, and (3) the size of the interval flanking the QTL.
Proportion of variance explained by a QTL:
If the proportion of variance explained by a QTL is small, the ratio of QTL effect a to the environmental standard deviation σ (a/σ) will be small. Consequently, the densities of normal mixture components are about the same for different genotypes, i.e.,

$$
N(\mu_{i1}, \sigma^2) \approx N(\mu_{i2}, \sigma^2),
$$

and pi1 ≈ πi1. The extreme case is a = 0 (µi1 = µi2 = µ) and pi1 = πi1. Therefore, REG mapping can approximate ML mapping well when the proportion is low (the QTL effect a is small). When the proportion is high, QTL effect a becomes relatively large when compared with the environmental standard deviation σ, and the difference between the two normal densities can become significant. As a result, the approximation of REG to ML interval mapping may not be good for QTL with large effect.
Relative QTL location in an interval:
If a QTL is located on the boundary of a marker interval, pi1 is close to 1 or 0, and the conditional and posterior probabilities will be similar (pi1 ≈ πi1). When the QTL position shifts from the boundary toward the middle of an interval, pi1 and πi1 become more dissimilar to each other. If the QTL is located in the middle, individuals with type 2 or 3 flanking marker genotype have pi1 = 0.5, and pi1 and πi1 will be the most dissimilar. Consequently, the approximation of REG to ML mapping will be better when the QTL is located near the boundary, but it becomes poor as the location moves toward the middle of an interval.
Interval size:
There are four types of flanking marker genotypes (Table 1). Types 1 and 4 are nonrecombinant, and types 2 and 3 are recombinant. Given a position in an interval, the conditional probability pi1 for QQ will be closer to 1 or 0, i.e., pi1 can be closer to πi1, for nonrecombinant individuals than for recombinant individuals. As there are more nonrecombinant flanking genotypes in a narrow interval than in a wider interval, the approximation of REG to ML mapping consequently is better for a QTL located in a narrow interval than in a wider interval. Therefore, if QTL are located in a dense marker region, the differences between the two methods will be minor.
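The effect of interval size on the share of nonrecombinant flanking genotypes can be illustrated with a short calculation. The sketch assumes Haldane's mapping function; the function names are hypothetical, and the three interval sizes are chosen only as examples.

```python
import math


def haldane_r(d_cM):
    """Recombination fraction for a map distance in centimorgans (Haldane)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cM / 100.0))


def nonrecombinant_fraction(interval_cM):
    """Expected share of backcross individuals with type 1 or type 4 markers."""
    return 1.0 - haldane_r(interval_cM)


fractions = {size: nonrecombinant_fraction(size) for size in (10, 20, 40)}
for size, fraction in fractions.items():
    print(size, "cM:", round(fraction, 3))
```

The fraction falls as the interval widens, so a larger share of individuals carries the less informative recombinant marker types.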
| MULTIPLE-QTL MODEL |
|---|
For the one-QTL model, it has been shown that the approximation of REG to ML interval mapping depends on the similarity between the conditional probability pi1 and the conditional posterior probability πi1 for each i. The same argument also applies to the multiple-QTL model. When multiple, say, m QTL are considered, the multiple interval mapping (MIM; KAO et al. 1999) model can be written as

$$
y_i = \mu + \sum_{j=1}^{m} a_j x_{ij}^* + \sum_{j<k} \delta_{jk} I_{jk} (x_{ij}^* x_{ik}^*) + \varepsilon_i, \tag{11}
$$

where xij* denotes the genotype of QTL Qj, aj and Ijk are the main and epistatic effects, δjk is an indicator variable for indicating whether the epistasis between Qj and Qk is present or not, and εi is the environmental deviation.
For m QTL, there are 2^m possible QTL genotypes; hence there are 2^m corresponding genotypic values, µij's, with probabilities pij's, j = 1, 2, ... , 2^m. The likelihood of the multiple-QTL model is then a 2^m-component normal mixture

$$
L(\theta \mid Y, X) = \prod_{i=1}^{n} \left[ \sum_{j=1}^{2^m} p_{ij} \, N(\mu_{ij}, \sigma^2) \right]. \tag{12}
$$

It seems that the derivation of the MLE of µ, a1, a2, ... , am, Ijk, and σ² and their asymptotic variance-covariance matrix becomes tedious as the number of QTL considered in the model increases. However, this tedious estimation problem can be easily solved by the general formulas proposed by KAO and ZENG (1997). Given the two matrices D and Q used by those formulas in the multiple-QTL mapping model, the MLE and the asymptotic variance-covariance matrix can be systematically obtained by the general formulas. Given the testing QTL positions, pij's can be determined. According to the general formulas, in the E-step, the 2^m posterior probabilities of QTL genotypes for n individuals,

$$
\pi_{ij} = \frac{p_{ij} N(\mu_{ij}, \sigma^2)}{\sum_{k=1}^{2^m} p_{ik} N(\mu_{ik}, \sigma^2)},
$$

are updated. In the M-step, the solutions of the parameter estimation are in closed form, as given by the general formulas.
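The E-step over the 2^m genotypes can be sketched by enumerating the genotypes directly. This is an illustration only and does not use the matrix formulation of the general formulas: the function names are hypothetical, epistasis is coded as the product of the two genotype codes, and the joint conditional probability pij is assumed to factor into per-QTL probabilities, which holds only when the QTL lie in separate intervals that do not share markers.

```python
import math
from itertools import combinations, product


def normal_pdf(y, mean, var):
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def genotypic_values(mu, main, epistasis):
    """Genotypic values for all 2^m backcross genotypes.

    main: list of the m main effects a_j.
    epistasis: dict {(j, k): I_jk} for the interacting pairs, with j < k.
    """
    m = len(main)
    genotypes = list(product((0.5, -0.5), repeat=m))
    values = []
    for g in genotypes:
        value = mu + sum(a * x for a, x in zip(main, g))
        value += sum(epistasis.get((j, k), 0.0) * g[j] * g[k]
                     for j, k in combinations(range(m), 2))
        values.append(value)
    return genotypes, values


def posterior(y, p_qq, mu, main, epistasis, var):
    """E-step for one individual; p_qq[j] = P(Q_j Q_j | flanking markers)."""
    genotypes, values = genotypic_values(mu, main, epistasis)
    weights = []
    for g, value in zip(genotypes, values):
        # Assumption: QTL genotypes are independent given the markers.
        prior = math.prod(p if x > 0 else 1.0 - p for p, x in zip(p_qq, g))
        weights.append(prior * normal_pdf(y, value, var))
    total = sum(weights)
    return [w / total for w in weights]


post = posterior(y=1.0, p_qq=[0.9, 0.5], mu=0.0,
                 main=[1.0, 1.0], epistasis={(0, 1): 1.0}, var=1.0)
print([round(p, 4) for p in post])
```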
The REG interval mapping model for taking m QTL into account can be written as

$$
y_i = \mu + \sum_{j=1}^{m} a_j w_{ij} + \sum_{j<k} \delta_{jk} I_{jk} (w_{ij} w_{ik}) + \varepsilon_i.
$$

In the model, wij is the conditional expectation of the genotype of Qj given its flanking markers. The LSE of µ, a1, a2, ... , am, Ijk, and σ² as well as their asymptotic variances can be obtained using the standard least-squares technique.
Differences due to QTL effects, epistasis, and linkage:
By the same argument, it is required that pij ≈ πij for each i and j for REG interval mapping to approximate ML interval mapping well in the multiple-QTL model. Besides the factors discussed in the previous section, such as the proportion of variance explained by QTL, the relative QTL position in an interval, and interval sizes, the relative sizes of the genotypic values of the 2^m possible genotypes, µij's, can affect the approximation of REG to ML interval mapping in the multiple-QTL model. If µij's are dissimilar to each other (more dispersed), πij's can be more dissimilar to pij's, and the differences between the two methods can become large. The difference between the sizes of QTL effects and the strength of epistasis between QTL seem to be appropriate measures to quantify the dissimilarity between the 2^m genotypic values. If QTL effects differ from each other significantly or epistasis between QTL is strong, πij's tend to be dissimilar to pij's. Consequently, the differences between REG and ML interval mapping will be larger if QTL effects differ significantly or the interaction between QTL is strong.
If QTL are linked, they are correlated. Their correlation is 1 - 2r, where r is the recombination fraction between QTL. As the QTL (predictors) in the model are correlated, the effects of collinearity on modeling, such as imprecise estimation and loss of power in testing individual parameters, will occur.
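The strength of the correlation between linked QTL can be made concrete with a small calculation. The sketch assumes Haldane's mapping function to convert map distance into the recombination fraction r; the function names are hypothetical, and the distances are chosen as examples.

```python
import math


def haldane_r(d_cM):
    """Recombination fraction for a map distance in centimorgans (Haldane)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cM / 100.0))


def qtl_correlation(d_cM):
    """Correlation 1 - 2r between the genotypes of two linked QTL."""
    return 1.0 - 2.0 * haldane_r(d_cM)


correlations = {d: qtl_correlation(d) for d in (10, 20, 30, 40)}
for distance, corr in correlations.items():
    print(distance, "cM apart:", round(corr, 3))
```

Under this assumption the correlation equals exp(-2d) for a distance of d Morgans, so the predictors become more collinear as the QTL get closer.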
| SIMULATION STUDIES |
|---|
Simulations were performed to verify the effects of the above factors, such as the proportion of variance explained by QTL, interval size, QTL position, the difference between QTL effects, epistasis, and linkage, on the approximation of REG to ML interval mapping. Two unlinked epistatic QTL with effects (a1 = 1, a2 = 1, I12 = 1) were assumed to affect a quantitative trait of interest in a backcross population (epistasis contributes 11.11% of the total genetic variation). For simplicity, 10 equally spaced marker intervals were simulated for each chromosome. Four proportions of variance explained by QTL (h²'s), 0.01, 0.1, 0.3, and 0.5, and three different interval sizes, 10, 20, and 40 cM, were simulated. The QTL were placed in the middle or on the boundary of a marker interval (1, 2, and 4 cM away from the left marker of the three different spaced intervals, respectively). When investigating the effect of epistasis, the main and epistatic QTL effects were further set at (a1 = 1, a2 = 1, I12 = 2) or (a1 = 1, a2 = 1, I12 = 3), and the QTL were placed in the middle of 40-cM intervals. Together the QTL contributed 50% of the quantitative trait variation (the percentages of epistatic variance in the total genetic variance are 33.33 and 52.94%, respectively). When investigating the effect of the difference between QTL effects on the approximation, five unlinked QTL were placed in the middle of 10- or 40-cM intervals with effects (a1 = 1, a2 = 1, a3 = 1, a4 = 1, a5 = 1), (a1 = 4, a2 = 1, a3 = 1, a4 = 2, a5 = 1), or (a1 = 4, a2 = 1, a3 = 1, a4 = -1, a5 = 1), respectively, and together contributed 50% of the trait variation. When investigating the effect of linkage, two QTL were placed in two neighboring 40-cM intervals, 10, 20, 30, or 40 cM apart from each other (5, 10, 15, and 20 cM from the marker between them). Their effects were set at (a1 = 1, a2 = -1) without epistasis or (a1 = 1, a2 = -1, I12 = 1) with epistasis, and the heritability was assumed to be 0.1, 0.3, or 0.5 for each case. The sample size was 200, and 500 replicates were simulated for all cases.
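The quoted percentages of epistatic variance can be reproduced by enumerating the four equally likely genotypes of two unlinked QTL in a backcross. The sketch assumes the genotype coding 1/2 and -1/2 and an epistatic term coded as the product of the two codes; the function names are hypothetical.

```python
from itertools import product


def variance(values):
    """Population variance of equally likely values."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def epistatic_share(a1, a2, i12):
    """Share of genetic variance due to epistasis for two unlinked QTL."""
    genotypes = list(product((0.5, -0.5), repeat=2))  # four equally likely
    total = variance([a1 * x1 + a2 * x2 + i12 * x1 * x2 for x1, x2 in genotypes])
    epistatic = variance([i12 * x1 * x2 for x1, x2 in genotypes])
    return epistatic / total


shares = [epistatic_share(1, 1, i12) for i12 in (1, 2, 3)]
print([round(100 * s, 2) for s in shares])
```

The three shares correspond to the 11.11, 33.33, and 52.94% stated for I12 = 1, 2, and 3.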
For simplicity of comparison, the QTL positions are assumed to be known, and the simulation is performed at those positions. When calculating the power of separating closely linked QTL, a successful separation requires the partial LRT statistic for each QTL to exceed the corresponding χ² critical value (a different χ² critical value applies in the epistasis case). Means of the estimated parameter values, their standard deviations (SDs), and mean squared errors (MSEs), as well as the LRT statistics, are recorded. MSE is used to evaluate the approximation of REG to ML interval mapping and the performance of the two methods under various cases. MSE, which is defined as

$$
\mathrm{MSE}(\hat{\theta}) = E(\hat{\theta} - \theta)^2 = \mathrm{Var}(\hat{\theta}) + [\mathrm{Bias}(\hat{\theta})]^2,
$$

incorporates two components, one measuring the variability of the estimate (precision), and the other measuring its bias (accuracy). A good method needs to control both variance and bias in estimation.
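The decomposition of MSE into variance and squared bias can be verified on a set of replicate estimates. The function name and the five estimates below are hypothetical values used only for illustration.

```python
def mse_components(estimates, true_value):
    """Return (MSE, variance, bias) of replicate estimates of one parameter."""
    n = len(estimates)
    mean = sum(estimates) / n
    variance = sum((e - mean) ** 2 for e in estimates) / n
    bias = mean - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return mse, variance, bias


# Hypothetical replicate estimates of an effect whose true value is 1.
mse, variance, bias = mse_components([0.9, 1.1, 1.3, 0.7, 1.2], true_value=1.0)
print(round(mse, 4), round(variance, 4), round(bias, 4))
```

With the divisor n used for the variance, the identity MSE = variance + bias² holds exactly for the sample of replicates.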
As expected, if h² is low (h² = 0.01), the two methods provide almost identical means and SDs of the estimates for µ, a1, a2, I12, σ², and h², and almost identical LRT statistics. These results for h² = 0.01 agree with earlier reports that REG approximates ML interval mapping well in some cases.
Proportion of variance explained by QTL, interval size, and QTL position:
The means of the estimated main and epistatic effects by the ML and REG methods are almost identical and very close to the true values for various h²'s, interval sizes, and QTL positions. However, the ML method tends to provide estimates with smaller SD (MSE) and larger LRT statistics when compared to the REG method. For example, the MSEs of â1 by the REG method are 0.178, 0.050, and 0.024 for h² = 0.1, 0.3, and 0.5, respectively, and the MSEs by the ML method are 0.175, 0.047, and 0.020, respectively (only the result for h² = 0.5 and QTL located in the middle of the intervals is shown in Table 2). There is a similar pattern for other estimates. The estimates of σ² and h² by the REG method are biased, and the estimates by the ML method are (asymptotically) unbiased. For example, the ĥ² by the REG method is 0.072 (SD 0.033), 0.187 (SD 0.047), and 0.302 (SD 0.051) for h² = 0.1, 0.3, or 0.5, respectively, and the ĥ² by the ML method is 0.128 (SD 0.055), 0.316 (SD 0.070), and 0.509 (SD 0.061), respectively. The bias of the REG method in estimating σ² and h² becomes obvious as h² becomes large. Also, the ML method gives larger LRT statistics than the REG method in all cases. The difference in mean LRT statistics between the two methods is negligible, 0.4 (15.2 - 14.8 = 0.4 for 40-cM marker spacing), for h² = 0.1 (results not shown), but it is 9.9 (82.5 - 72.6 = 9.9 for 40-cM marker spacing) for h² = 0.5. Therefore, the difference in the LRT statistic becomes larger as h² becomes higher. Similar patterns of difference in MSE and LRT statistics, caused by the change of h², can be observed for other interval sizes and QTL positions.
Epistasis:
The means, SDs, and MSEs of the estimates as well as the mean LRT statistics for effect ratios 1:1:1, 1:1:2, and 1:1:3 are listed in Table 3. As interaction becomes stronger, the MSEs of the estimates by both methods become larger, and their differences in MSE and LRT statistics become larger. For example, the MSEs of Î12 by the ML method are 0.075, 0.110, and 0.159 for the three ratios, respectively, and they are 0.127, 0.175, and 0.254 by the REG method, respectively. A similar trend can be observed for other estimates. The means of LRT statistics for the three ratios are 82.5, 80.3, and 75.2 for the ML method, respectively, and they are 72.6, 65.5, and 59.5 for the REG method, respectively. The bias of the REG method in the estimation of
σ² and h² also becomes much more serious as interaction between QTL gets stronger. The ĥ² by the REG method is 0.302 (SD 0.051), 0.277 (SD 0.053), and 0.255 (SD 0.053) for the three ratios, respectively (h² = 0.5). The ML method, however, can estimate h² and other parameters well for all ratios.
Difference between QTL effects:
Table 4 shows the means, SDs, and MSEs of the estimates as well as the mean LRT statistics for QTL effects (a1 = 1, a2 = 1, a3 = 1, a4 = 1, a5 = 1), (a1 = 4, a2 = 1, a3 = 1, a4 = 2, a5 = 1), and (a1 = 4, a2 = 1, a3 = 1, a4 = -1, a5 = 1). When QTL effects are of the same size, the mean LRT statistic of the ML method is 1.8 (81.5 - 79.7) larger than that of the REG method. If there are some relatively large and small QTL, their differences in LRT statistics are 3.7 (84.1 - 80.4) and 5.8 (84.7 - 78.9), respectively, for the other two cases. Also, the estimates by the REG method tend to have larger MSEs, and the
σ̂² and ĥ² by the REG method are biased. For the case (a1 = 1, a2 = 1, a3 = 1, a4 = 1, a5 = 1, and h² = 0.1) and 10-cM intervals, the difference in mean LRT statistic between the two methods is very small (52.3 - 52.2 = 0.08).
Linkage:
The powers of separating 10-, 20-, 30-, and 40-cM-apart QTL are 97.2, 99.0, 99.6, and 99.8% for the ML method, respectively, and are 22.0, 60.0, 91.6, and 98.4% for the REG method, respectively (Table 5 and Fig 1). Also, the difference in MSE between the two methods becomes larger as QTL get closer. The MSE ratios of â1 for the two methods are 4.52 (0.226/0.050), 8.67 (0.091/0.011), 4.00 (0.056/0.014), and 2.57 (0.036/ 0.011), respectively (Fig 1C). The estimated h2 by the REG method is seriously biased. The means of
ĥ² by the REG method are 0.037, 0.070, 0.112, and 0.162 for 10-, 20-, 30-, and 40-cM-apart QTL, respectively (h² = 0.5). The ML method, however, can estimate h² well (see also Fig 1C). If h² = 0.3 or 0.1, the power, the MSE ratio of â1, and ĥ² for the two methods are shown in Fig 1, A-C. If the linked QTL show epistasis, the advantage gained by the ML method becomes even more significant (Fig 1D).
| CONCLUSION AND DISCUSSION |
|---|
In this article, the differences in QTL parameter estimation and testing for the existence of QTL between the ML and REG methods are investigated both analytically and numerically. It is found that the REG method tends to give estimates with larger MSE and smaller LRT statistics in testing parameters, and it is less powerful in QTL detection when compared with the ML method. Also, the REG method is biased in estimating the residual variance and the proportion of total variance explained by QTL. Therefore, ML interval mapping is more accurate, precise, and powerful than REG interval mapping in QTL mapping. The differences in power, MSE, and LRT statistics between the two methods depend on factors such as size of QTL effect, interval size, relative QTL position in an interval, difference between QTL effects, epistasis, and linkage between QTL, as shown in the article. Their differences in general may be minor, but can be significant in certain situations. The differences become larger as the proportion explained by QTL becomes higher, the marker interval becomes wider, the QTL position moves from the boundary to the middle of an interval, the difference between QTL effects is larger, epistasis becomes stronger, and the QTL positions are closer. In particular, the REG method may have a serious problem in detecting closely linked QTL when compared with the ML method. As shown in Table 5 and Fig 1, the difference in detecting closely linked (10-20 cM apart) QTL with opposite effects is quite significant (power 0.22 vs. 0.97 for 10-cM-apart QTL and h² = 0.5; power 0.60 vs. 0.99 for 20-cM-apart QTL and h² = 0.5; power 0.09 vs. 0.54 for 10-cM-apart QTL and h² = 0.3; power 0.34 vs. 0.61 for 20-cM-apart QTL and h² = 0.3). In addition, the REG method is seriously biased in estimating the proportion of variance explained by QTL (Fig 1B), and it gives the estimates of the effects with much larger MSEs (Fig 1C). 
The problem of the REG method in detecting closely linked QTL becomes worse if epistasis is present (Fig 1D).
It has often been pointed out that there is no significant difference in the estimation of QTL parameters and the statistical power of QTL detection between the REG and ML methods, with the exception that the estimate of residual variance by the REG method is biased.
The cost in computation per iteration in the EM algorithm is generally not very expensive. In the simulation, the ML method takes ≈18 iterations to converge and is <10 times slower than the REG method for the 40-cM interval case (the REG method takes ≈65 sec and the ML method takes ≈608 sec to finish the computation of 500 replicates). Therefore, the ML method should not be regarded as formidably expensive in computation, especially as computer technology is advancing.
The distributions of most quantitative traits are approximately normal or can be scaled to normal through simple transformation. The estimation of the ML method depends on the posterior probabilities πij's, which take the distribution of the residual error into account using normal density. The estimation of the REG method depends on the conditional probabilities pij's, which ignore the distribution of residual error, and the iteratively reweighted least-squares (IRWLS; XU 1998b) method takes only the second moment of residual error into account whatever the underlying residual error distribution is. This is also the reason why the ML method can be better than the REG and IRWLS methods. If the residual error does not follow a normal distribution, the mixture model in Equation 2 should take its specific form into account to model the relation between the quantitative trait and QTL in estimation. In practice, although most of the residual errors are normally distributed and the use of the normal mixture model should be safe in most situations, it is important to examine the pattern of residuals, which is a requisite procedure in model selection, to ensure that the final QTL mapping model is appropriate.
The QTL mapping result will be used as a base for follow-up operations, such as marker-assisted selection or gene transfer, on QTL for trait improvement. To ensure the validity of trait improvement, the quality of QTL mapping should be more important than the ease of computation. Researchers using the REG method for mapping QTL need to be concerned with the factors affecting its approximation to the ML method in practice. For example, if there are wide marker intervals along the genome (known data structure), or the QTL effects are not sure to be equally small, or the QTL are linked with epistasis (unknown QTL parameters), the REG method may perform poorly when compared to the ML method. Then, after the analysis of the REG method, there is a need to further use the ML method to finalize the QTL mapping result. As far as computation is concerned, it is suggested that researchers may use the REG method as an initial procedure to obtain preliminary results and further use the ML method as a final procedure to obtain the conclusive results of QTL mapping.
| ACKNOWLEDGMENTS |
|---|
This study was supported by grant NCS89-2313-B-001-006 from the National Science Council, Taiwan, Republic of China.
Manuscript received July 26, 1999; Accepted for publication May 30, 2000.
| LITERATURE CITED |
|---|
CARBONELL, E. A., T. M. GERIG, E. BALANSARD, and M. J. ASINS, 1992 Interval mapping in the analysis of nonadditive quantitative trait loci. Biometrics 48:305-315.
CASELLA, G., and R. BERGER, 1990 Statistical Inference. Wadsworth, Belmont, CA.
DEMPSTER, A. P., N. M. LAIRD, and D. B. RUBIN, 1977 Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. 39:1-38.
DOERGE, R. W. and G. A. CHURCHILL, 1996 Permutation test for multiple loci affecting a quantitative character. Genetics 142:284-294.
DUPUIS, J. and D. SIEGMUND, 1999 Statistical methods for mapping quantitative trait loci from a dense set of markers. Genetics 151:373-386
FALCONER, D. S., and T. F. C. MACKAY, 1996 Introduction to Quantitative Genetics. Longman Group, London.
GOFFINET, B. and B. MANGIN, 1998 Comparing methods to detect more than one QTL on a chromosome. Theor. Appl. Genet. 96:628-633.
GRATTAPAGLIA, D., F. L. G. BERTOLUCCI, R. PENCHEL, and R. R. SEDEROFF, 1996 Genetic mapping of quantitative trait loci controlling growth and wood quality traits in Eucalyptus grandis using a maternal half-sib family and RAPD markers. Genetics 144:1205-1214.
HACKETT, C. A. and J. I. WELLER, 1995 Genetic mapping of quantitative trait loci for traits with ordinal distributions. Biometrics 51:1252-1263.
HALDANE, J. B. S., 1919 The combination of linkage values and the calculation of distances between the loci of linked factors. J. Genet. 8:299-309.
HALEY, C. S. and S. A. KNOTT, 1992 A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69:315-324.
HALEY, C. S., S. A. KNOTT, and J.-M. ELSEN, 1994 Mapping quantitative trait loci in crosses between outbred lines using least squares. Genetics 136:1195-1207.
HENSHALL, J. M. and M. E. GODDARD, 1999 Multiple-trait mapping of quantitative trait loci after selective genotyping using logistic regression. Genetics 151:885-894
HOESCHELE, I. and P. VANRADEN, 1993a Bayesian analysis of linkage between genetic markers and quantitative trait loci. I. Prior knowledge. Theor. Appl. Genet. 85:953-960.
HOESCHELE, I. and P. VANRADEN, 1993b Bayesian analysis of linkage between genetic markers and quantitative trait loci. II. Combining prior knowledge with experimental evidence. Theor. Appl. Genet. 85:946-952.
JANSEN, R. C., 1993 Interval mapping of multiple quantitative trait loci. Genetics 135:205-211.
JIANG, C. and Z.-B. ZENG, 1997 Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140:1111-1127.
KAO, C.-H. and Z.-B. ZENG, 1997 General formulas for obtaining the MLE and the asymptotic variance-covariance matrix in mapping quantitative trait loci when using the EM algorithm. Biometrics 53:359-371.
KAO, C.-H., Z.-B. ZENG, and R. D. TEASDALE, 1999 Multiple interval mapping for quantitative trait loci. Genetics 152:1203-1216
LANDER, E. S. and D. BOTSTEIN, 1989 Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121:185-199
LEBRETON, C. M., P. M. VISSCHER, C. S. HALEY, A. SEMIKHODSKII, and S. A. QUARRIE, 1998 A nonparametric bootstrap method for testing close linkage vs. pleiotropy of coincident quantitative trait loci. Genetics 150:931-943
LI, Z., S. R. M. PINSON, W. D. PARK, A. H. PATERSON, and J. W. STANSEL, 1997 Epistasis for three grain yield components in rice (Oryza sativa L.). Genetics 145:453-465.
LITTLE, R. J. A., and D. B. RUBIN, 1987 Statistical Analysis With Missing Data. John Wiley, New York.
LOUIS, T. A., 1982 Finding the observed information matrix when using the EM algorithm. J. R. Stat. Soc. Ser. B 44:226-233.
MARTINEZ, O. and R. N. CURNOW, 1992 Estimating the locations and the sizes of the effects of quantitative trait loci using flanking markers. Theor. Appl. Genet. 85:480-488.
MCLACHLAN, G. F., and T. KRISHNAN, 1997 The EM Algorithm and Extensions. John Wiley, New York.
MENG, X.-L. and B. RUBIN, 1991 Using EM to obtain asymptotic variance-covariance matrix: the SEM algorithm. J. Am. Stat. Assoc. 86:899-909.
NETER, J., W. WASSERMAN and M. H. KUTNER, 1990 Applied Linear Statistical Model. Richard D. Irwin, Tokyo.
REBAI, A. and B. GOFFINET, 2000 More about quantitative trait locus mapping with diallel designs. Genet. Res. 75:243-247.
SATAGOPAN, J. M., B. S. YANDELL, M. A. NEWTON, and T. C. OSBORN, 1996 A Bayesian approach to detect quantitative trait loci using Markov chain Monte Carlo. Genetics 144:805-816.
SILLANPAA, M. J. and E. ARJAS, 1999 Bayesian mapping of multiple quantitative trait loci from incomplete outbred offspring data. Genetics 151:1605-1619
SONG, J. Z., M. SOLLER, and A. GENIZI, 1999 The full-sib intercross line (FSIL): a QTL mapping design for outcross species. Genet. Res. 77:61-73.
WEBER, K., R. EISMAN, L. MOREY, A. PATTY, and J. SPARKS et al., 1999 An analysis of polygenes affecting wing shape on Chromosome 3 in Drosophila melanogaster. Genetics 153:773-786
WHITTAKER, J. C., R. THOMPSON, and P. M. VISSCHER, 1996 On the mapping of QTL by regression of phenotype on marker type. Heredity 77:23-32.
XU, S., 1995 A comment on the simple regression method for interval mapping. Genetics 141:1657-1659.
XU, S., 1996 Mapping quantitative trait loci using four-way crosses. Genet. Res. 68:175-181.
XU, S., 1998a Further investigation on the regression method of mapping quantitative trait loci. Heredity 80:364-373.
XU, S., 1998b Iteratively reweighted least squares mapping for quantitative trait loci. Behav. Genet. 28:341-355.
ZENG, Z.-B., 1994 Precision mapping of quantitative trait loci. Genetics 136:1457-1468.
ZENG, Z.-B., J. LIU, L. F. STAM, C.-H. KAO, and J. M. MERCER et al., 2000 Genetic architecture of a morphological shape difference between two Drosophila species. Genetics 154:299-310