Statistical Issues in the Analysis of Quantitative Traits in Combined Crosses
Fei Zou, Brian S. Yandell, and Jason P. Fine
Department of Statistics, University of Wisconsin, Madison, Wisconsin 53706
Corresponding author: Fei Zou, Department of Statistics, 1210 W. Dayton St., Madison, WI 53706. E-mail: feizou{at}stat.wisc.edu
Communicating editor: Z-B. ZENG
| ABSTRACT |
|---|
We consider some practical statistical issues in QTL analysis where several crosses originate in multiple inbred parents. Our results show that ignoring background polygenic variation in different crosses may lead to biased interval mapping estimates of QTL effects or loss of efficiency. Threshold and power approximations are derived by extending earlier results based on the Ornstein-Uhlenbeck diffusion process. The results are useful in the design and analysis of genome screen experiments. Several common designs are evaluated in terms of their power to detect QTL.
QUANTITATIVE trait analysis has many applications in plant and animal breeding and in human genetics. Mapping quantitative trait loci (QTL) that influence agriculturally important traits such as grain yield in rice or milk production in cows can help scientists produce specimens with more desirable qualities. Complex human diseases, like breast cancer and diabetes, are known to have genetic etiologies. Animal models may be useful in studying their origins.
Most existing statistical methods have been developed for experimental designs with a single cross from two inbred parents.
The effects of polygenes on standard approaches to major QTL mapping are not well understood. With a single cross, the progeny have identical relationships given the QTL genotypes, resulting in a compound symmetry structure.
Recently, methods were proposed to analyze all crosses simultaneously.
In this article, we consider an arbitrary number of crosses from multiple inbred lines. While we were preparing this manuscript, ![]()
| SIMULATION STUDY OF BIAS AND EFFICIENCY |
|---|
If one combines different crosses but ignores the different relationships among individuals, substantial bias may result. In this section, we show the effect of polygenes on the QTL estimates. We examine two crosses, BC1 and F2, from common inbred parents P1 and P2. Although the design is simple, it illustrates the key issues. The additive effect of a single major QTL is set to either 0 (i.e., no QTL) or 5, with no dominance effect. Five markers are located at 0, 20, 40, 60, and 80 cM. The major QTL is located at 30 cM. The environmental errors are identically distributed for BC1 and F2 and are sampled from N(0, 25). One hundred individuals from BC1 and F2 are simulated without background polygenes or with 10 background polygenes. The 10 background polygenes are in coupling phase and share a common additive effect (i.e., the allele substitution effect of polygene k, k = 1, 2, ..., 10) of either 1 or 2.
We observe that when there are no polygenes, both the polygenic model (model P) and the traditional interval mapping model that ignores polygenes (model N) consistently estimate the QTL effects. However, model P gives more accurate estimates than model N when there are polygenic effects. The bias of model N increases as the expected polygenic difference between BC1 and F2 increases. In summary, our simulations indicate that when analyzing combined crosses, the polygenic model produces more precise and less biased estimates than the traditional interval mapping method.
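The origin of the bias can be illustrated with a small simulation. The sketch below is not the simulation code used for the results above. It assumes that the 10 polygenes are unlinked, that BC1 is the cross F1 x P1, and that every P2 allele adds the common effect; it omits the markers and the major QTL. All function names are hypothetical.

```python
import random
import statistics

def polygenic_score(cross, n_loci, effect, rng):
    """Sum of allele substitution effects over unlinked polygenes in coupling phase."""
    score = 0.0
    for _ in range(n_loci):
        # One gamete always comes from an F1 parent: P2 allele with probability 1/2.
        count = 1 if rng.random() < 0.5 else 0
        if cross == "F2":
            # In an F2 the second gamete also comes from an F1 parent.
            count += 1 if rng.random() < 0.5 else 0
        # In BC1 (assumed F1 x P1) the second gamete carries no P2 allele.
        score += effect * count
    return score

def simulate_cross(cross, n, n_loci=10, effect=1.0, env_sd=5.0, seed=0):
    """Phenotypes = polygenic score + N(0, env_sd^2) environmental error."""
    rng = random.Random(seed)
    return [polygenic_score(cross, n_loci, effect, rng) + rng.gauss(0.0, env_sd)
            for _ in range(n)]

def mean_shift(n=20000, n_loci=10, effect=1.0):
    """Difference between the F2 and BC1 phenotypic means due to polygenes alone."""
    bc1 = simulate_cross("BC1", n, n_loci, effect, seed=1)
    f2 = simulate_cross("F2", n, n_loci, effect, seed=2)
    return statistics.mean(f2) - statistics.mean(bc1)
```

Under these assumptions the expected shift is `n_loci * effect / 2`, so an analysis that pools the two crosses without a cross-specific term attributes part of this difference to the QTL.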
| THRESHOLD AND POWER CALCULATIONS |
|---|
On the basis of the simulations in the above section, fitting combined crosses (![]()
For more general models (![]()
Our approach extends the Ornstein-Uhlenbeck large sample approximations. It is quite simple and practically useful. Calculating the threshold and power under different map distances can be accomplished with closed-form expressions arising from the Ornstein-Uhlenbeck setup. Simulations shown below indicate this works well with realistic sample sizes.
Two inbred strains:
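In this section, we consider combined crosses from two inbred parents (P1 and P2): BC1, F2, and BC2. Our goal is to extend the earlier Ornstein-Uhlenbeck threshold and power approximations to this design. The model is

$$y = Xb + e, \qquad X = (X_1, X_2), \quad b = (b_1', b_2')',$$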
where X is the design matrix and e is the random error. n1, n2, n3 are the number of observations for BC1, F2, and BC2, respectively, with n observations in total. The submatrix X1 corresponds to the covariates identifying crosses or other measurements that do not involve the QTL effects and X2 corresponds to the QTL effects. Suppose the allele from parent P1 is q and from P2 is Q. The possible QTL genotypes are qq, Qq, and QQ. Ignoring other covariate effects, we let

where $x_{1ki} = 1$ if individual $i$ in cross $k$ has genotype Qq and 0 otherwise, and $x_{2ki} = 1$ if individual $i$ in cross $k$ has genotype QQ and 0 otherwise. The random error $e$ is normally distributed with mean 0 and $\mathrm{Var}(e_{ki}) = \sigma_k^2$, $i = 1, 2, \ldots, n_k$, $k = 1, 2, 3$. In general $\mathrm{Var}(e) = \sigma_3^2 G$ with $G^{-1} = \mathrm{diag}(\gamma_1, \ldots, \gamma_1; \gamma_2, \ldots, \gamma_2; 1, \ldots, 1)$, where $\gamma_k = \sigma_3^2/\sigma_k^2$ for $k = 1, 2$. In the following, we assume that $\gamma_k$ is known. If $\gamma_k$ is unknown, then consistent maximum-likelihood (ML) estimates may be substituted and the result still holds. Without loss of generality, assume that $\sigma_3^2 = 1$. At locus $d$, the hypothesis of no QTL effect is $H_0: b_2 = 0$ vs. $H_1: b_2 \neq 0$, or equivalently, $H_0: Hb = 0$ vs. $H_1: Hb \neq 0$, where

From the normal regression theory, the likelihood ratio statistic is

where $\tilde{y} = G^{-1/2}y$ and $\tilde{X}(d) = G^{-1/2}X(d)$. Under $H_0$,

where $(\hat{\alpha}_1, \hat{\alpha}_2)'$ is the maximum-likelihood estimate of $b_2 = (\alpha_1, \alpha_2)'$ and $A_{22}$ is given in (A2) in the Appendix.
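The distribution of $2\,\mathrm{LR}(d)$ depends on $\hat{\alpha}_1$ and $\hat{\alpha}_2$, which are correlated. Thus, earlier results cannot be applied directly. Instead, consider the linear combinations $\eta_1 = a_1\hat{\alpha}_1$, $\eta_2 = b_1\hat{\alpha}_1 + b_2\hat{\alpha}_2$ (see Appendix for $a_1$ and $b_i$), and $Z_i = $  $\eta_i$, $i = 1, 2$. It is shown in the Appendix that $Z_1$ and $Z_2$ are asymptotically independent and distributed $N(0, 1)$. Thus,

$$2\,\mathrm{LR}(d) \approx Z_1^2 + Z_2^2,$$

which depends on two uncorrelated normal variates and is asymptotically $\chi^2_2$. Note that $\hat{\alpha}_i$, $\eta_i$, and $Z_i$, $i = 1, 2$, all depend on the locus $d$. In the sequel, when necessary, we use $\hat{\alpha}_1(d)$, $\hat{\alpha}_2(d)$, $\eta_i(d)$, and $Z_i(d)$, $i = 1, 2$, to emphasize their dependence on $d$.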
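The decomposition of a quadratic form in two correlated estimates into a sum of two squared uncorrelated variates can be checked numerically. The sketch below uses a Cholesky-based lower-triangular transformation, which is one valid choice and is not claimed to be the matrix defined in (A3). The 2 x 2 matrix with entries `a11`, `a12`, `a22` stands in for a generic covariance matrix of the two estimates, and the function names are hypothetical.

```python
import math

def whitening_matrix(a11, a12, a22):
    """Lower-triangular P with P A P' = I for a 2x2 covariance matrix A."""
    # Cholesky factor A = L L' with L = [[l11, 0], [l21, l22]].
    l11 = math.sqrt(a11)
    l21 = a12 / l11
    l22 = math.sqrt(a22 - l21 * l21)
    # P is the inverse of L, which is again lower triangular.
    return ((1.0 / l11, 0.0), (-l21 / (l11 * l22), 1.0 / l22))

def quadratic_form(b1, b2, a11, a12, a22):
    """b' A^{-1} b computed from the explicit 2x2 inverse."""
    det = a11 * a22 - a12 * a12
    return (a22 * b1 * b1 - 2.0 * a12 * b1 * b2 + a11 * b2 * b2) / det

def sum_of_squares(b1, b2, a11, a12, a22):
    """Z1^2 + Z2^2 with Z = P b; Z1 uses b1 only, Z2 uses both components."""
    p = whitening_matrix(a11, a12, a22)
    z1 = p[0][0] * b1
    z2 = p[1][0] * b1 + p[1][1] * b2
    return z1 * z1 + z2 * z2
```

The triangular form mirrors the structure in the text: the first transformed variate involves only the first estimate, and the second is a linear combination of both.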
To demonstrate the Ornstein-Uhlenbeck equivalence, the covariances at different loci $d_1$ and $d_2$ are proved to be

$$\mathrm{Cov}(Z_1(d_1), Z_1(d_2)) = 1 - \beta_1 r + O(r^2)$$

and

$$\mathrm{Cov}(Z_2(d_1), Z_2(d_2)) = 1 - \beta_2 r + O(r^2).$$

This means that for large $n$, $Z_1(d)$ and $Z_2(d)$ are approximately independent Ornstein-Uhlenbeck processes with mean zero and covariance $1 - \beta_1 r + O(r^2)$ and $1 - \beta_2 r + O(r^2)$, respectively. Adapting the argument in ![]()
![]() |
(1) |
where $\Delta$ = the distance between markers (in Morgans), $C$ = the number of chromosomes, and $L$ = total length of the genome (in Morgans). The definition of $v(x)$ can be found in ![]()

when a QTL is located at a marker locus. Here

For a QTL between markers, the noncentrality parameters $\xi_1^*$ and $\xi_2^*$ become $\xi_1^* \exp(-\beta_1 \Delta_1)$ and $\xi_2^* \exp(-\beta_2 \Delta_1)$, respectively, where $\Delta_1$ is the distance between the QTL and the marker. In the case of an F2 population, the formulas above reduce to those of Dupuis and Siegmund (1999).
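The covariance structure behind the Ornstein-Uhlenbeck approximation is easiest to see in the single backcross case, which is simpler than the combined design treated above. There the genotype indicators at two loci have correlation exactly 1 - 2r, which equals exp(-2d) when the recombination fraction r and the map distance d (in Morgans) are related by Haldane's map function. The sketch below assumes Haldane's function (no interference) and checks this by simulation; it does not compute the rates for combined crosses, which depend on (A4). Function names are hypothetical.

```python
import math
import random

def haldane_r(d_morgans):
    """Recombination fraction for map distance d (Morgans) under Haldane's map function."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))

def backcross_correlation(d_morgans, n=50000, seed=0):
    """Empirical correlation of backcross genotype indicators at two linked loci."""
    rng = random.Random(seed)
    r = haldane_r(d_morgans)
    first, second = [], []
    for _ in range(n):
        # Genotype at the first locus: heterozygote with probability 1/2.
        g1 = 1 if rng.random() < 0.5 else 0
        # The second locus differs from the first only if a recombination occurs.
        g2 = 1 - g1 if rng.random() < r else g1
        first.append(g1)
        second.append(g2)
    m1 = sum(first) / n
    m2 = sum(second) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(first, second)) / n
    var1 = sum((a - m1) ** 2 for a in first) / n
    var2 = sum((b - m2) ** 2 for b in second) / n
    return cov / math.sqrt(var1 * var2)
```

For example, `backcross_correlation(0.1)` should be close to `math.exp(-0.2)`, the covariance of an Ornstein-Uhlenbeck process with rate 2 at lag 0.1.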
General Results:
The derivations above can be generalized to more complicated models, and models with covariates are also possible. Our framework can be modified for a wide variety of designs.
As before, let the model be

$$y = X_1 b_1 + X_2 b_2 + e,$$

where $X_1$ is an $n \times p$ submatrix not involving the QTL effects, $X_2$ is an $n \times m$ matrix corresponding to the $m$ QTL effects, and $e$ is the random error. Following the procedure in the Appendix, we first compute $A_{22}$ using (A2) and then derive the orthogonal transformation matrix $P$ from (A3). Note that both $A_{22}$ and $P$ involve only the design matrix $X$ and not $\beta$ or the correlation parameters. Next, $2\,\mathrm{LR}(d)$ can be partitioned into the sum of squares of $m$ asymptotically independent $N(0, 1)$ random variables $Z_1, \ldots, Z_m$, where $Z = (Z_1, \ldots, Z_m)' = P\hat{b}_2$. To calculate $\mathrm{Cov}(Z_j(d_1), Z_j(d_2))$, $j = 1, 2, \ldots, m$, we find $D$ in (A4) on the basis of the specific designs. It is straightforward to establish

$$\mathrm{Cov}(Z_j(d_1), Z_j(d_2)) = 1 - \beta_j r + O(r^2),$$

where $\beta_j$ is the $j$th diagonal element of $-PA_{22}DA_{22}'P'$. Now, the tail distribution of $2\,\mathrm{LR}$ under the null hypothesis is approximately

where $\chi^2_m$ is a $\chi^2$ random variable with $m$ degrees of freedom and $\beta = (\beta_1 + \beta_2 + \cdots + \beta_m)/m$.
The formula for power may also be obtained. However, it is quite complicated and is omitted here.
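An analytic threshold of this kind can be cross-checked by simulating the limiting process directly. The sketch below is a Monte Carlo check, not an implementation of the approximation above. It simulates m independent stationary Ornstein-Uhlenbeck processes at equally spaced markers on a single chromosome and returns an empirical quantile of the maximum of the sum of squares. The rates passed in `betas`, the spacing, and the chromosome length are placeholder inputs, and the function names are hypothetical.

```python
import math
import random

def max_sum_sq_ou(betas, delta, n_markers, rng):
    """Maximum over markers of sum_j Z_j^2 for independent stationary OU processes."""
    rhos = [math.exp(-b * delta) for b in betas]
    z = [rng.gauss(0.0, 1.0) for _ in betas]
    best = sum(v * v for v in z)
    for _ in range(n_markers - 1):
        # Exact discretisation of an OU process: AR(1) with correlation exp(-beta * delta).
        z = [rho * v + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
             for rho, v in zip(rhos, z)]
        best = max(best, sum(v * v for v in z))
    return best

def empirical_threshold(betas, delta, length, level=0.05, reps=2000, seed=0):
    """Empirical (1 - level) quantile of the maximum statistic along one chromosome."""
    rng = random.Random(seed)
    n_markers = int(round(length / delta)) + 1
    maxima = sorted(max_sum_sq_ou(betas, delta, n_markers, rng) for _ in range(reps))
    return maxima[int(math.ceil((1.0 - level) * reps)) - 1]
```

With two processes, the resulting threshold should lie above the pointwise 95% quantile of a chi-square with 2 degrees of freedom (about 5.99) and should approach the Bonferroni bound as the markers become less correlated.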
| SIMULATION STUDY OF THRESHOLDS AND POWER |
|---|
We investigated the performance of (1) with different marker distances and different polygenic backgrounds. Thresholds for the log-likelihood were based on interval mapping with combined BC1, F2, and BC2 crosses, with n1 = n2 = n3 = 100 (300 observations in total) and a chromosome length of 100 cM. The marker interval lengths are set at 10, 5, and 2 cM, respectively. Different polygenic effects are sampled, as reflected by models a-d (see legend of Table 2 for details; Table 3). The approximations from (1) with $v(a\{2\beta\Delta\}^{1/2})$ are always smaller than the empirical thresholds derived in the simulations. However, as the interval length decreases, our approximations become closer to the empirical thresholds. In general, the dense map assumption ($v = 1$) produces conservative thresholds.
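Next, we evaluate the power with different proportions of BC1, F2, and BC2. The power is calculated for dominant ($\alpha_1 = \alpha_2$) and additive ($\alpha_2 = 2\alpha_1$) models, where $\alpha_1$ and $\alpha_2$ are the effects of genotypes Qq and QQ. We compare our results with those of Dupuis and Siegmund (1999).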
Fig 1 and Fig 2 exhibit the power curves. When the polygenes are in linkage equilibrium and have only additive effects, the phenotypic variation due to polygenes and environment satisfies $\sigma^2_{BC_1} = \sigma^2_{BC_2} = \sigma^2_P + \sigma^2_e$ and $\sigma^2_{F_2} = 2\sigma^2_P + \sigma^2_e$, respectively, where $\sigma^2_P$ is the total polygenic variation in the BC population and $\sigma^2_e$ is the environmental variation. For this reason, we take $\gamma_1 \equiv 1$ and choose the BC-to-F2 variance ratio $\gamma_2 = \sigma^2_{BC_2}/\sigma^2_{F_2} = 1, 0.75, 0.67, 0.57, 0.5$, which correspond to $\sigma^2_P = 0$, $\sigma^2_e/2$, $\sigma^2_e$, $3\sigma^2_e$, or $\sigma^2_P \gg \sigma^2_e$, respectively. We also evaluate the power by using F2's only, which quantifies the loss in power when discarding data from the BCs (see Fig 1A).
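The five variance ratios quoted above follow from the two variance identities for the BC and F2 populations. A minimal sketch, with a hypothetical function name and with the polygenic variance expressed as a multiple of the environmental variance:

```python
def variance_ratio(poly_to_env):
    """BC-to-F2 variance ratio when the polygenic variance is poly_to_env times the environmental variance."""
    bc = poly_to_env + 1.0        # polygenic + environmental variance in a backcross
    f2 = 2.0 * poly_to_env + 1.0  # twice the polygenic variance + environmental variance in an F2
    return bc / f2

# Polygenic variance equal to 0, 1/2, 1, and 3 times the environmental variance.
ratios = [round(variance_ratio(c), 2) for c in (0.0, 0.5, 1.0, 3.0)]
```

The ratio decreases from 1 toward its limit of 0.5 as the polygenic variance grows, which is why 0.5 is listed as the case of polygenic variance dominating environmental variance.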
In Fig 1, the proportions of BC1 and BC2 are assumed equal. When the QTL is dominant, power is gained by using BC populations unless there is no polygenic effect (i.e., $\gamma_2$ is close to 1). The larger the polygenic effects, the greater is the gain with BCs. However, when the QTL is additive, F2's tend to have more information for detecting a QTL than do BCs, unless $\sigma_P \gg \sigma_e$ (i.e., the polygenic effects are very large). Note that when the proportion of F2 approaches 1, our results again match those of Dupuis and Siegmund.
In Fig 2, we allow the proportions of BC1 and BC2 to be unequal with a dominant QTL. In this case, BC1 is more powerful than F2, which is expected. When the QTL is additive, both BC1 and BC2 individuals have identical contributions in detecting the QTL, so only the total proportion of BC1 and BC2 influences the power, as shown in Fig 1A.
| CONCLUSION |
|---|
In this article, we addressed some important practical issues in the analysis of closely related crosses derived from multiple inbred lines when both QTL and polygenes influence a trait. We showed that biased and inefficient estimates of the QTL effects may occur if the polygenic effect is ignored. We derived simple and general approximations for the threshold and power to detect a QTL, allowing different designs to be compared.
Based on our power calculations, we find that the F2 population is more robust in detecting QTL than the two backcross populations.
| ACKNOWLEDGMENTS |
|---|
We thank anonymous reviewers for their critical reading of this manuscript. This research was supported in part by the U.S. Department of Agriculture Hatch project through the University of Wisconsin, College of Agricultural and Life Sciences.
Manuscript received August 14, 2000; Accepted for publication April 18, 2001.
| APPENDIX |
|---|
In this section, for combined crosses from two inbred parents, we prove that the likelihood ratio 2 LR(d) can be partitioned into the sum of the squares of two asymptotically independent Ornstein-Uhlenbeck processes through an orthogonal transformation. Define
![]() |
(A1) |
Note that B does not depend on locus d. Let
![]() |
(A2) |
Define
![]() |
(A3) |
where $a_1 = $ , and $b_1 = c_3/$, $b_2 = $ .
Making the orthogonal transformation

gives

For the same reason,

This indicates that Z1 and Z2 are approximately independent N(0, 1). Furthermore,

Now, let $\tilde{X}(d_1)$ and $\tilde{X}(d_2)$ be the corresponding transformed incidence matrices at loci $d_1$ and $d_2$. Note that the first and second columns of $\tilde{X}$ depend only on the proportions of BC1, F2, and BC2 and not on $d_1$ and $d_2$. The third and fourth columns depend on $d_1$ and $d_2$. Also, $(x_{1ki}(d), x_{2ki}(d)) = (0, 0)$, $(1, 0)$, or $(0, 1)$ if the genotype of individual $i$ in cross $k$ at locus $d$ is qq, Qq, or QQ, respectively.
For BC1,

where $r$ is the recombination fraction between loci $d_1$ and $d_2$. Enumerating the probabilities of $(x_{1ki}(d_1), x_{1ki}(d_2))$, $(x_{1ki}(d_1), x_{2ki}(d_2))$, and $(x_{2ki}(d_1), x_{2ki}(d_2))$ for BC1, F2, and BC2, and using the fact that $\tilde{X}(d) = G^{-1/2}X(d)$, we obtain
|
(A4) |
with

Thus,

Therefore,

and

| LITERATURE CITED |
|---|
BERNARDO, R., 1994 Prediction of maize single-cross performance using RFLPs and information from related hybrids. Crop Sci. 34:20-25.
CHURCHILL, G. A. and R. W. DOERGE, 1994 Empirical threshold values for quantitative trait mapping. Genetics 138:963-971.
DAVIES, R. B., 1977 Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 64:247-254.
DAVIES, R. B., 1987 Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 74:33-43.
DOERGE, R. W., Z-B. ZENG, and B. S. WEIR, 1997 Statistical issues in the search for genes affecting quantitative traits in experimental populations. Stat. Sci. 12:195-219.
DUPUIS, J. and D. SIEGMUND, 1999 Statistical methods for mapping quantitative trait loci from a dense set of markers. Genetics 151:373-386.
ELSTON, R. C., 1990 Models for discrimination between statistical alternative modes of inheritance, pp. 41-55 in Advances in Statistical Methods for Genetic Improvement of Livestock, edited by D. GIANOLA and K. HAMMOND. Springer-Verlag, New York.
FERNANDO, R. L. and M. GROSSMAN, 1989 Marker-assisted selection using best linear unbiased prediction. Genet. Sel. Evol. 21:467-477.
FERNANDO, R. L., C. STRICKER, and R. C. ELSTON, 1994 The finite polygenic mixed-model: an alternative formulation for the mixed-model of inheritance. Theor. Appl. Genet. 88:573-580.
HALEY, C. S. and S. A. KNOTT, 1992 A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69:315-324.
JANSEN, R. C. and P. STAM, 1994 High resolution of quantitative traits into multiple loci via interval mapping. Genetics 136:1447-1455.
LANDER, E. S. and D. BOTSTEIN, 1989 Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121:185-199.
LANDER, E. S. and L. KRUGLYAK, 1995 Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat. Genet. 11:241-247.
LIU, Y. and Z-B. ZENG, 2000 A general mixture model approach for mapping quantitative trait loci from diverse cross designs involving multiple inbred lines. Genet. Res. 75:345-355.
PIEPHO, H. P., 2001 A quick method for computing approximate thresholds for quantitative trait loci detection. Genetics 157:425-432.
REBAI, A., B. GOFFINET, B. MANGIN, and D. PERRET, 1994a Detecting QTLs with diallel schemes, pp. 170-177 in Biometrics in Plant Breeding: Applications of Molecular Markers. 9th Meeting of the EUCARPIA, edited by J. W. VAN OOIJEN and JANSEN. CPRO-DLO, Wageningen, The Netherlands.
REBAI, A., B. GOFFINET, and B. MANGIN, 1994b Approximate thresholds of interval mapping tests for QTL detection. Genetics 138:235-240.
REBAI, A., B. GOFFINET, and B. MANGIN, 1995 Comparing power of different methods for QTL detection. Biometrics 51:87-99.
SIEGMUND, D., 1985 Sequential Analysis: Tests and Confidence Intervals. Springer-Verlag, New York.
YANDELL, B. S., 1997 Practical Data Analysis for Designed Experiments. Chapman & Hall/CRC Press, London/Cleveland.
ZENG, Z-B., 1993 Theoretical basis for separation of multiple linked gene effects in mapping quantitative trait loci. Proc. Natl. Acad. Sci. USA 90:10972-10976.
ZENG, Z-B., 1994 Precision mapping of quantitative trait loci. Genetics 136:1457-1468.