Genetics, Vol. 157, 433-444, January 2001, Copyright © 2001

Marker Pair Selection for Mapping Quantitative Trait Loci

Hans-Peter Piepho^a and Hugh G. Gauch, Jr.^b
^a Institut für Nutzpflanzenkunde, Universität Kassel, 37213 Witzenhausen, Germany
^b Department of Plant Breeding, College of Agriculture and Life Sciences, Cornell University, Ithaca, New York 14853

Corresponding author: Hans-Peter Piepho, Institut für Nutzpflanzenkunde, Universität Kassel, Steinstrasse 19, 37213 Witzenhausen, Germany. E-mail: piepho@wiz.uni-kassel.de

Communicating editor: C. HALEY


*  ABSTRACT

Mapping of quantitative trait loci (QTL) for backcross and F2 populations may be set up as a multiple linear regression problem, where marker types are the regressor variables. It has been shown previously that flanking markers absorb all information on isolated QTL. Therefore, selection of pairs of markers flanking QTL is useful as a direct approach to QTL detection. Alternatively, selected pairs of flanking markers can be used as cofactors in composite interval mapping (CIM). Overfitting is a serious problem, especially if the number of regressor variables is large. We suggest a procedure denoted as marker pair selection (MPS) that uses model selection criteria for multiple linear regression. Markers enter the model in pairs, which reduces the number of models to be considered, thus alleviating the problem of overfitting and increasing the chances of detecting QTL. MPS entails an exhaustive search per chromosome to maximize the chance of finding the best-fitting models. A simulation study is conducted to study the merits of different model selection criteria for MPS. On the basis of our results, we recommend the Schwarz Bayesian criterion (SBC) for use in practice.


PROCEDURES for detecting multiple quantitative trait loci (QTL) are of growing interest to plant breeders and geneticists. The currently most widely used methods are interval mapping (IM; LANDER and BOTSTEIN 1989, 1994) and composite interval mapping (CIM; JANSEN 1993; ZENG 1993). In CIM, a chromosome is scanned for the presence of a QTL, while controlling for the genetic background using some markers as cofactors in a multiple regression framework (ZENG 1993; JANSEN and STAM 1994). IM and CIM may be implemented using the maximum-likelihood (ML) method (ZENG 1994). Alternatively, an approximate least-squares method can be used (HALEY and KNOTT 1992; MARTINEZ and CURNOW 1992; WHITTAKER et al. 1996). In this article we use the least-squares method.

WHITTAKER et al. (1996) showed that the least-squares method for IM and CIM in backcross (BC1) and F2 populations can be cast as a standard multiple linear regression of phenotype on marker type, since information on the QTL is absorbed by the flanking markers (STAM 1991). Therefore, the problem of QTL detection essentially reduces to the problem of finding the appropriate pairs of markers. In CIM, we have the additional task of selecting cofactors for controlling the genetic background. It may be argued that cofactors are useful for controlling genetic background only if they are closely linked to a QTL. In fact, the best control is expected for markers flanking the QTL. Thus, the dual problem of QTL detection and selection of cofactors is seen to be a single problem of finding the flanking markers of QTL (LEBRETON and VISSCHER 1998). This puts us into the general framework of model selection in multiple regression, for which there is a vast literature (see, e.g., MILLER 1990; DRAPER and SMITH 1998; MCQUARRIE and TSAI 1998).

A peculiarity of multiple regression for QTL mapping is that there is no single true model, because there is no fixed set of markers. If we drop a pair of flanking markers from the analysis, the dropped pair can be replaced by adjacent markers. Similarly, if a different marker system is used, marker loci will change, but still the flanking markers will absorb the QTL effects, leading to a different model conditional on the markers. Thus, the term "true model" has to be used with this peculiarity in mind.

We believe that multiple likelihood-ratio (LR) testing (equivalently F-testing if linear least squares is used) for model selection is problematic for several reasons. Most importantly, multiple LR tests without adjustments are known to tend toward overfitting (GELFAND and GHOSH 1998). For example, in multiple regression, using an F-to-enter statistic at the nominal α = 5% level in a forward selection procedure can easily give a true significance level (false positive rate) >50% (MILLER 1990). Furthermore, only nested models can be compared with LR tests. Also, the sequence of models to be compared in a model-building process is not unique.

One reaction to the problems connected with multiple LR testing is to consider a Bayesian framework (see, e.g., DRAPER 1995; SILLANPAA and ARJAS 1998). A somewhat intermediate approach, which is computationally much less demanding than many of the Bayesian methods, is to use information criteria such as Akaike's information criterion (AIC) or criteria assessing the mean squared error of prediction such as Mallows' Cp (BURNHAM and ANDERSON 1998; GELFAND and GHOSH 1998; MCQUARRIE and TSAI 1998). Some selection criteria, such as the Bayesian criterion of SCHWARZ (1978) (SBC), involve a Bayesian approach to model selection. A number of different selection criteria are compared in the present article.

Our suggested procedure is denoted as marker pair selection (MPS). Instead of implementing a standard model selection procedure, we exploit knowledge of the genetic mechanisms underlying the data. Our MPS procedure has three distinctive features: (i) markers are selected in adjacent pairs to increase the chance of selecting flanking markers while reducing the risk of selecting nonflanking markers; (ii) an exhaustive search per chromosome is used in place of simple forward selection, which increases the chance of finding the best-fitting model; and (iii) a model selection criterion such as SBC is employed to select the final model among a sequence of models.


*  MATERIALS AND METHODS

In this section, we describe the model for mapping QTL in a backcross population as well as the method for parameter estimation. We then develop an MPS algorithm on the basis of a modified forward selection procedure, which generates a sequence of models with an increasing number of markers. From this sequence, a best-fitting model may be selected according to one of the criteria given in Table 2. The performance of MPS is studied by means of simulation.


 

 
Table 1. Expectation of g conditional on flanking markers for a BC1 population


 

 
Table 2. List of model selection criteria used

Model and parameter estimation:
Consider a backcross m_L m_L qq m_R m_R × M_L m_L Qq M_R m_R, where M_i and m_i (i = L, R) denote the left and right flanking marker alleles, while Q and q are the QTL alleles. The recombination frequency between the left and right markers is denoted as θ, while r_L and r_R are the recombination frequencies between the QTL and the left and right markers, respectively. Let Y_j be the phenotypic value (e.g., yield) of the jth individual in the backcross population. Then, conditional on the QTL genotype, we write

(1)

and

(2)

where γ is an intercept term and α is the allele substitution effect for the QTL. Let x_L = 1 when an individual of the backcross population has the genotype M_L m_L at the left flanking marker and x_L = 0 when the genotype is m_L m_L. The dummy variable x_R for the right flanking marker is similarly defined. Let g = 0 if the QTL genotype is qq and g = 1 if it is Qq. Assuming no crossover interference, the expectation of g conditional on the flanking markers and QTL position is as given in Table 1 (MARTINEZ and CURNOW 1992). Conditional on the markers we have

(3)

Note that this model is nonlinear in the parameters α and r_L. It can be shown, however, that E(g | r_L, x_L, x_R) is linear in x_L and x_R, i.e.,

with

(4)

(compare to WHITTAKER et al. 1996; note that we use a slightly different coding here; the formula for a BC1 given in that article is for a coding of x_L and x_R as 1 and -1, not 0.5 and -0.5 as erroneously stated in the article; JOHN WHITTAKER, personal communication). Defining β_0 = γ + aα, β_L = λα, and β_R = ρα, the expectation for Y_j, conditional on the markers, can be written

(5)
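The displayed equations (1) to (5) and Table 1 are not reproduced in this copy. The block below is a reconstruction derived only from the definitions in the text (0/1 coding of x_L, x_R, and g; no crossover interference, so that θ = r_L + r_R − 2 r_L r_R). It is a sketch under those assumptions; the arrangement and exact algebraic form in the original article may differ.

```latex
\begin{align*}
% QTL genotype model (cf. equations 1 and 2)
E(Y_j \mid g = 0) &= \gamma, \qquad E(Y_j \mid g = 1) = \gamma + \alpha, \\[4pt]
% conditional on the markers (cf. equation 3)
E(Y_j \mid x_L, x_R) &= \gamma + \alpha\, E(g \mid r_L, x_L, x_R), \\[4pt]
% conditional expectation of g (cf. Table 1), with \theta = r_L + r_R - 2 r_L r_R
E(g \mid x_L = 1, x_R = 1) &= \frac{(1-r_L)(1-r_R)}{1-\theta}, &
E(g \mid x_L = 1, x_R = 0) &= \frac{(1-r_L)\, r_R}{\theta}, \\
E(g \mid x_L = 0, x_R = 1) &= \frac{r_L (1-r_R)}{\theta}, &
E(g \mid x_L = 0, x_R = 0) &= \frac{r_L r_R}{1-\theta}, \\[4pt]
% linear form (cf. equation 4)
E(g \mid r_L, x_L, x_R) &= a + \lambda x_L + \rho x_R, \\
a = \frac{r_L r_R}{1-\theta}, \quad
\lambda &= \frac{r_R (1-r_R)(1-2 r_L)}{\theta(1-\theta)}, \quad
\rho = \frac{r_L (1-r_L)(1-2 r_R)}{\theta(1-\theta)}, \\[4pt]
% regression on the flanking markers (cf. equation 5)
E(Y_j \mid x_L, x_R) &= \beta_0 + \beta_L x_L + \beta_R x_R .
\end{align*}
```

As a consistency check, a + λ + ρ = (1 − r_L)(1 − r_R)/(1 − θ), so the linear form reproduces all four cells of the conditional expectation exactly.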

Dividing β_L by β_R and rearranging shows that r_L is a root of the quadratic

(6)

For r_L ∈ (0, 0.5), the only feasible solution is

(7)

Back substitution of this solution into β_L = λα yields

(8)

This shows that a single regression on the adjacent marker covariates x_L and x_R suffices to estimate α and r_L. The model may be extended to cover more than one QTL in a straightforward manner. To estimate the QTL effects and position, we just apply (7) and (8) to the pairs of markers corresponding to the putative QTL in question. This implies the model

(9)

Here, H is the number of QTL and β_0 = γ + Σ_h a_h α_h, where a_h and α_h are the coefficient a and the genetic effect for the hth QTL. Application of this model for estimation of QTL effects and position requires QTL to be isolated; i.e., no pair of QTL shares a common flanking marker. For estimation in the case of nonisolated QTL see WHITTAKER et al. (1996). In this article, we assume that QTL are isolated for simplicity of exposition.
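The step from regression coefficients to QTL position and effect can be illustrated with a short Python sketch. Because equations (6) to (8) are not reproduced in this copy, the formulas in the code are re-derived from the definitions in the text under no interference (θ = r_L + r_R − 2 r_L r_R) and 0/1 marker coding; they should be read as an assumption rather than as the article's printed expressions. The function name `qtl_position_and_effect` is hypothetical.

```python
import math


def qtl_position_and_effect(beta_l, beta_r, theta):
    """Recover (r_L, alpha) from the coefficients of one flanking marker pair.

    Assumed derivation (not copied from the article): dividing beta_L by
    beta_R gives the quadratic
        r_L**2 - r_L + theta*(1 - theta)*beta_R / (beta_R + (1 - 2*theta)*beta_L) = 0,
    whose root in (0, 0.5) is returned.
    """
    if beta_l * beta_r < 0:
        raise ValueError("coefficients differ in sign: not consistent with a QTL")
    denom = beta_r + (1.0 - 2.0 * theta) * beta_l
    if denom == 0.0:
        raise ValueError("both coefficients are zero: no QTL signal")
    disc = 1.0 - 4.0 * theta * (1.0 - theta) * beta_r / denom
    r_l = 0.5 * (1.0 - math.sqrt(disc))
    # Recombination fraction to the right marker implied by theta and r_L.
    r_r = (theta - r_l) / (1.0 - 2.0 * r_l)
    lam = r_r * (1.0 - r_r) * (1.0 - 2.0 * r_l) / (theta * (1.0 - theta))
    rho = r_l * (1.0 - r_l) * (1.0 - 2.0 * r_r) / (theta * (1.0 - theta))
    # beta_L / lam equals beta_R / rho at the solved r_L; using the sums gives
    # the same value and avoids dividing by zero when the QTL sits on a marker.
    alpha = (beta_l + beta_r) / (lam + rho)
    return r_l, alpha
```

For example, coefficients generated from r_L = 0.03, r_R = 0.07, and α = 2 are mapped back to those values, which is a quick way to check the algebra.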

Model selection criteria:
We use criteria in Table 2 to select the best-fitting model among candidate models. A very thorough and concise review of these criteria can be found in MCQUARRIE and TSAI (1998). Model selection criteria can be broadly classified as either efficient or consistent (MCQUARRIE and TSAI 1998). Efficient criteria are based on the presumption that the generating or true model is of infinite dimension and/or that the set of candidate models does not contain the true model. The goal is to select the model that best approximates the true model. In large samples, a selection criterion that chooses the model with minimum mean squared error (MSE) is said to be asymptotically efficient. Examples of efficient criteria are AIC, AICc, Cp, final prediction error (FPE), Rp, and leave-one-out cross-validation (PRESS; Table 2). Efficient criteria seek to minimize some measure of discrepancy between true model and selected model. The two most common measures are the Kullback-Leibler discrepancy and mean squared error. Some efficient measures seek to minimize the former (e.g., AIC), while others try to minimize the latter (e.g., FPE). Both discrepancy measures are asymptotically equivalent (MCQUARRIE and TSAI 1998, p. 7).

Consistent criteria are designed for cases where the true model has low dimension and is assumed to be among the candidate models. A consistent criterion identifies the correct model asymptotically (as sample size increases) with probability one. Examples are SBC, HQ (HANNAN and QUINN 1979), HQc, and GM (GEWEKE and MEESE 1981; Table 2; MCQUARRIE and TSAI 1998). It is not clear, on a priori grounds, which type of criterion is more appropriate. The objective of QTL mapping is to detect as many of the true QTL as possible, while not detecting false QTL, i.e., to find the true genetic model. If the number of QTL is small, we might expect a consistent criterion to stand a better chance of correctly detecting QTL, while efficient criteria may perform better in more complex cases. Note, however, that optimality of different model selection criteria is based on asymptotic arguments. Therefore, in this article we study the small sample behavior of different criteria by means of simulation.

In case there are more markers than observations, the full model is not estimable, and hence Mallows' Cp and GM are not applicable due to lack of an error variance estimate based on the full model. We might continue the forward selection until the error variance estimate stabilizes, but this raises the problem of determining when stabilization has taken place. Incidentally, sequential F-testing will not work for our procedure, since models in the sequence are not necessarily nested.
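To make the contrast between an efficient and a consistent criterion concrete, the following Python sketch scores a sequence of fitted models by AIC and SBC. Table 2 is not reproduced in this copy, so the code uses common textbook least-squares forms, n·log(RSS/n) + 2p for AIC and n·log(RSS/n) + p·log(n) for SBC; these forms, the function names, and the RSS values in the usage note are assumptions for illustration and may differ from the definitions the authors used.

```python
import math


def aic(rss, n, p):
    """Assumed least-squares form of AIC: n*log(RSS/n) + 2p."""
    return n * math.log(rss / n) + 2 * p


def sbc(rss, n, p):
    """Assumed least-squares form of SBC: n*log(RSS/n) + p*log(n)."""
    return n * math.log(rss / n) + p * math.log(n)


def select_model(models, n, criterion):
    """Return the index of the model with the smallest criterion value.

    models: list of (rss, p) tuples, where p counts regression parameters
    (intercept plus two coefficients per marker pair).
    """
    scores = [criterion(rss, n, p) for rss, p in models]
    return min(range(len(scores)), key=scores.__getitem__)
```

With a hypothetical sequence `[(200.0, 1), (150.0, 3), (146.0, 5), (144.5, 7)]` and n = 200, `select_model(..., aic)` picks the model with two marker pairs (index 2), whereas `select_model(..., sbc)` stops at one pair (index 1), because the log(n) penalty per parameter is heavier than AIC's penalty of 2.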

Subset selection of markers:
In what follows, we first point out the need to select adjacent pairs of markers rather than individual markers. We then make a few remarks regarding applicability of standard subset selection procedures to our problem. Finally, suggestions are given for modifications exploiting the biology of the problem at hand and the procedure is described in algorithmic form.

The effect and position of a QTL can be estimated from the regression coefficients of two flanking markers. A subset selection procedure can be used to find markers that are likely to flank a QTL. If one marker is selected, we will also have to include one of the adjacent markers, because two flanking markers are needed in the estimation procedure. WHITTAKER et al. (1996) state that "an exception to this rule might be when markers are fitted as cofactors to absorb the effect of QTL which, although too small to be mapped individually, contribute a significant portion of genetic variance." In our procedure, we include pairs of adjacent markers as a general rule. An obvious requirement for entry of a pair of adjacent markers into the model is that the sign of their estimates be the same, for otherwise the estimated model is not consistent with the presence of a QTL between the pair (WHITTAKER et al. 1996).

Since we are in a multiple regression framework, standard procedures for subset selection could be used, such as forward selection, etc. (MILLER 1990; DRAPER and SMITH 1998). By so doing, however, we would ignore all we know about the relationship among markers. A potential payoff is expected if this knowledge is taken into account. Backward selection is not used here, because it does not work when the number of markers exceeds the sample size. If the number of markers is large, an overall exhaustive search is usually prohibitive due to the large number of possible models. Forward selection or "stepwise" regression (EFROYMSON 1960) are the most feasible approaches among standard techniques. It is well known, however, that these methods are not guaranteed to find the best-fitting subsets (MILLER 1990). They work best when the regressor variables are nearly uncorrelated (WEISBERG 1985, p. 195). Marker data from the same chromosome are correlated, so simple forward selection is problematic, mainly because the best-fitting submodel is likely to be missed, while spurious variables may enter the model (WEISBERG 1985). In particular, some of the variables selected first may not be included in the best model (see MILLER 1990, p. 48, for a striking example). A genome-wide exhaustive search assures that the best-fitting model will not be missed, but has the disadvantage of a high computational burden.

In this article, we propose a modified forward selection strategy based on an article by GABRIEL and PUN (1979) (see also MILLER 1990, p. 64). These authors suggested that in some situations it may be possible to find groups of regressors, within which an exhaustive search is possible. The grouping needs to be such that if two variables x_i and x_j are in different groups then their regression sum of squares is additive. This requirement is fulfilled for orthogonal variables. For orthogonal groups, performing an exhaustive search over all possible models is equivalent to an exhaustive search per group and is thus guaranteed to find the best-fitting model, with enormous savings in computational effort. Marker data from different chromosomes are stochastically independent. Thus, in large samples, they are nearly orthogonal, conditional on the observed data. This suggests that it is useful to do an exhaustive search for each chromosome and that the regression sum of squares for markers from different chromosomes is nearly additive. Of course, in small samples we may fail to find the best-fitting model due to chance correlation among markers from different chromosomes. However, the probability of missing the best-fitting model is expected to be very much smaller than with simple forward selection. In this article, a model will be called sign consistent if its estimated regression coefficients are of the same sign for each marker pair in the model. Our MPS procedure is described as Algorithm 1.

ALGORITHM 1: Make the following definitions: i_c is a counter for the number of marker pairs selected for the cth chromosome; k is the total number of marker pairs in the current model; C is the total number of chromosomes; RSS_min is the smallest residual sum of squares of sign-consistent models of order k found so far; M_k is the selected model of order k.

  1. For each chromosome set i_c = 0. Set k = 0. Fit the model with just an intercept and record the residual sum of squares (RSS_total). Record this model as M_0.

  2. Set k ⇒ k + 1. Set RSS_min = RSS_total (from step 1). For c = 1 to C do the following: From the current model drop the i_c marker pairs from the cth chromosome (but keep all pairs from other chromosomes) and do an exhaustive search for models with i_c + 1 marker pairs from the cth chromosome. Consider entry of a set of i_c + 1 pairs of markers only if the resulting model is sign consistent. For a current model that is sign consistent, compute the residual sum of squares (RSS_current). If RSS_current < RSS_min then set RSS_min = RSS_current, set c_min = c, and record the current model as M_k.

  3. If in step 2 no sign-consistent model of order k can be found, stop. Else set i_c ⇒ i_c + 1 for chromosome c_min and go back to step 2.

  4. Apply a model selection criterion to select the best-fitting model in the sequence of models M_k (k = 0, 1, 2, ...) generated by steps 1, 2, and 3.
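Steps 1 to 3 of Algorithm 1 can be sketched in Python with NumPy as follows. This is an illustrative reading of the algorithm, not the authors' code. The function names, the data layout (one n × m_c genotype matrix per chromosome), the cap `max_pairs_per_chrom`, the rejection of rank-deficient fits, and the restriction to pairs that share no marker (the isolated-QTL assumption) are choices made for this sketch.

```python
from itertools import combinations

import numpy as np


def fit(y, chroms, pairs_by_chrom):
    """Least-squares fit of intercept plus all selected marker pairs.

    Pair j on a chromosome means the adjacent markers (j, j + 1).
    Returns (rss, ok), where ok is True if the design has full column rank
    and both coefficients within every pair share the same sign.
    """
    cols = [np.ones(len(y))]
    for c, pairs in enumerate(pairs_by_chrom):
        for j in pairs:
            cols += [chroms[c][:, j], chroms[c][:, j + 1]]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    b = beta[1:].reshape(-1, 2)              # one row per marker pair
    sign_ok = bool(np.all(b[:, 0] * b[:, 1] > 0))
    full_rank = np.linalg.matrix_rank(X) == X.shape[1]
    return rss, sign_ok and full_rank


def disjoint(combo):
    """True if no two pairs in the sorted combination share a marker."""
    return all(b - a >= 2 for a, b in zip(combo, combo[1:]))


def marker_pair_selection(y, chroms, max_pairs_per_chrom=2):
    """Return the model sequence [(rss, pairs_by_chrom), ...] for M_0, M_1, ..."""
    current = [() for _ in chroms]           # step 1: i_c = 0 for every chromosome
    rss_total = float(np.sum((y - y.mean()) ** 2))
    sequence = [(rss_total, list(current))]  # M_0: intercept only
    while True:
        rss_min, best = rss_total, None      # step 2
        for c, markers in enumerate(chroms):
            size = len(current[c]) + 1
            if size > max_pairs_per_chrom:
                continue
            # Exhaustive search on chromosome c; pairs on other chromosomes stay.
            for combo in combinations(range(markers.shape[1] - 1), size):
                if not disjoint(combo):
                    continue
                trial = current[:c] + [combo] + current[c + 1:]
                rss, ok = fit(y, chroms, trial)
                if ok and rss < rss_min:
                    rss_min, best = rss, trial
        if best is None:                     # step 3: no sign-consistent model
            return sequence
        current = best
        sequence.append((rss_min, best))
```

Step 4 then amounts to scoring each entry of the returned sequence with a criterion such as SBC, using p = 1 + 2k parameters for a model with k marker pairs, and keeping the entry with the smallest value.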

A remark regarding step 2 is in order. If a sign inconsistency is observed for a pair of markers to be entered, this suggests that the pair may not flank a QTL. Thus, such pairs should not be considered. Checking sign consistency upon entry does not, however, prevent a sign change in an entered pair later in the model-building process. If, while other pairs are being added, a sign change occurs in a pair from another chromosome, that pair may be a false positive, suggesting there is an increasing risk of detecting false positives and that the selection procedure should be terminated. Therefore, we stop the selection process when no sign-consistent model of order k is found.

We should point out that the order in which chromosomes are processed cannot affect the result of Algorithm 1. This is because step 2 tries to add a pair of markers on each chromosome and retains only the one chromosome for which addition of a pair gives the best fit; step 3 then updates the counter for that chromosome. This will be the same chromosome, regardless of the order in which chromosomes are tried.

Note that in the model sequence obtained from Algorithm 1, the best model with k pairs does not necessarily contain all markers that are in the best model with k - 1 pairs or less. An important reason for allowing the implicit drop of one or two markers during each step of the model-building process is that there may be two adjacent QTL on the same chromosome with the same sign of the associated genetic effect. The pair of markers selected first is likely to lie between the two QTL. If left in the model, a ghost QTL will be detected. Allowing a pair to be dropped from the model during model building reduces the risk of detecting ghost QTL. For a chromosome with six markers and two QTL in the intervals (2, 3) and (5, 6) the model sequence may look like the hypothetical example shown in Table 3. The first pair tries to explain as much of the phenotypic variation as possible. However, only marker 3 is a flanking marker. Marker 4 is included because it accounts for the QTL in the interval (5, 6). In the next step, marker 4 is dropped while the flanking pairs (2, 3) and (5, 6) enter. SBC selects the four-marker model as fitting best (smallest value of criterion), while the full model fits slightly worse. Were simple forward selection applied to the above example, we would first select the pair (3, 4), and this would remain in the model throughout. Thus, there is no more flexibility to end up with the "true" marker model (2, 3, 5, 6), and a ghost QTL will be detected.


 

 
Table 3. Hypothetical MPS model fitting sequence (n = 200)

Simulation study:
We simulated BC1 populations for various settings. The number of chromosomes ranged from 12 to 20, while the number of QTL was between zero and five. Equal spacing of markers (10 or 20 cM) along a 100-cM chromosome and absence of interference were assumed. The number of crossovers per chromosome was simulated according to a Poisson distribution with parameter equal to the length of the chromosome in morgans, which is in accordance with Haldane's mapping function. For each setting, we performed 100 simulation runs. Assuming Poisson sampling, the standard error for an expected count µ (e.g., number of false positives) is (µ/100)^0.5, e.g., 0.2 for µ = 4. Due to high positive correlation among statistics of the same type as computed for different selection criteria (number of false positives, etc.), the accuracy of comparisons was deemed reasonable. Algorithm 1 was used, allowing a maximum of two QTL per chromosome to limit the computational burden of the exhaustive search. This does not imply, however, that such limitation is needed in practical applications where there is only one sequence of models to be generated instead of 100 or more in simulations. A QTL was considered as detected when an estimated QTL position was within 15 cM of the true QTL position. While the 15-cM margin is somewhat arbitrary, rankings of model selection criteria according to different performance measures were rather insensitive to changes in the margin. If the hth QTL is detected, α̂_h is the estimate of the hth QTL effect based on (8). Otherwise α̂_h = 0. As an aggregate measure of bias we computed

(10)

where α̂_h is an estimate of the hth QTL effect. For each model selection criterion we counted the number of correctly detected QTL as well as the number of false positives. From these counts we computed the fraction of correct detections among all detections. If for a given QTL there was more than one pair yielding a QTL position estimate within 15 cM of the true QTL position, only the pair with position estimate closest to the true QTL was considered as a detecting pair. All pairs of markers not detecting any QTL were considered as false positives.
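The scoring rules just described can be sketched in Python as below. Equation (10) is not reproduced in this copy, so the aggregate measure is assumed here to be the sum over true QTL of (α̂_h − α_h)², with α̂_h = 0 for undetected QTL. The function name, the tuple layout, the inclusive margin, and the rule that one selected pair can detect at most one QTL are likewise assumptions of this sketch.

```python
def score_detections(true_qtl, estimates, margin=15.0):
    """Count detections and false positives and accumulate squared effect errors.

    true_qtl:  list of (chromosome, position_cM, alpha) for the simulated QTL.
    estimates: list of (chromosome, position_cM, alpha_hat), one per selected pair.
    Returns (n_detected, n_false_positives, sse_alpha).
    """
    used = set()                      # selected pairs already credited to a QTL
    n_detected, sse = 0, 0.0
    for chrom, pos, alpha in true_qtl:
        candidates = [
            (abs(p - pos), i)
            for i, (c, p, _) in enumerate(estimates)
            if c == chrom and abs(p - pos) <= margin and i not in used
        ]
        if candidates:
            _, i = min(candidates)    # only the closest pair counts as detecting
            used.add(i)
            n_detected += 1
            sse += (estimates[i][2] - alpha) ** 2
        else:
            sse += alpha ** 2         # undetected QTL: estimate taken as 0
    return n_detected, len(estimates) - n_detected, sse
```

For instance, with true QTL `[(1, 25.0, 1.5), (2, 60.0, 0.75)]` and selected pairs `[(1, 30.0, 1.0), (1, 38.0, 2.0), (3, 10.0, 0.5)]`, the first QTL is detected by the pair at 30 cM, the second QTL is missed, and the two remaining pairs count as false positives.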

We considered 14 examples with different QTL numbers, positions, and effect sizes (Table 4). Heritabilities were computed as described in the APPENDIX. Example 1 is adapted from LANDER and BOTSTEIN (1989). In this example, there are five QTL with decreasing effect size. This pattern of "tapered effects" (BURNHAM and ANDERSON 1998), i.e., few large effects and many small effects, is very typical of many real applications (KEARSEY and FARQUHAR 1998). LANDER and BOTSTEIN (1989) used a marker spacing of 20 cM for IM. From DARVASI et al. (1993) and PIEPHO (2000) it can be conjectured that using a much smaller spacing does not usually provide a dramatic gain in accuracy and power for loosely linked QTL. For detecting closely linked QTL, however, a finer spacing is necessary. In all examples except two, we used a spacing of 10 cM. We also included one example with a spacing of 5 cM. We do not consider finer marker spacings since this would increase the problem of multicollinearity and thus of instability of parameter estimates (MELCHINGER et al. 1998). The LANDER and BOTSTEIN (1989) example was modified in different ways, i.e., by changing the error variance (heritability) and the marker spacing. We included some other examples with fewer and with more markers. Examples 9 and 10 are adapted from BEAVIS (1994), who used examples with 10 and 40 QTL of equal but small effect. Also, examples with two QTL on the same chromosome were included (examples 7, 8, 13, and 14; see Table 4).
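The simulation of BC1 genotypes described in this section can be sketched as follows. The text specifies only that the number of crossovers per chromosome is Poisson with mean equal to the chromosome length in morgans; placing the crossovers uniformly along the chromosome and choosing the starting strand with probability 0.5 are the usual no-interference assumptions and are labeled as such here. The function name and arguments are hypothetical.

```python
import numpy as np


def simulate_bc1_chromosome(n, positions_cm, length_cm=100.0, rng=None):
    """Simulate 0/1 BC1 genotypes at the given loci for n individuals.

    Returns an n x len(positions_cm) integer matrix; 1 codes the heterozygote
    (e.g., Mm) and 0 the recurrent-parent homozygote (mm).
    """
    rng = np.random.default_rng() if rng is None else rng
    pos = np.asarray(positions_cm, dtype=float)
    geno = np.empty((n, pos.size), dtype=int)
    for i in range(n):
        # Number of crossovers: Poisson with mean = chromosome length in morgans.
        k = rng.poisson(length_cm / 100.0)
        # Assumption: crossover locations are uniform along the chromosome.
        crossovers = np.sort(rng.uniform(0.0, length_cm, k))
        # Assumption: the strand at 0 cM is chosen with probability 0.5.
        start = rng.integers(0, 2)
        # The allele switches at each crossover to the left of a locus.
        n_before = np.searchsorted(crossovers, pos)
        geno[i] = (start + n_before) % 2
    return geno
```

Under these assumptions, two loci d morgans apart recombine with frequency 0.5(1 − exp(−2d)), i.e., about 0.09 for a 10-cM interval, which agrees with Haldane's mapping function. QTL genotypes can be generated by including the QTL positions in `positions_cm` alongside the marker positions.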


 

 
Table 4. Examples considered in simulations of BC1 populations

If markers are densely spaced, it may happen that two adjacent markers are perfectly correlated, so that the design matrix for a model that includes these two markers is not of full column rank. If two markers are perfectly correlated, there is no information as to the position and effect of a QTL between the markers and the approach of WHITTAKER et al. (1996) breaks down, unless some constraint is imposed. For simplicity, we rejected the corresponding model in simulations. The problem occurred very rarely with a marker spacing of 10 cM and never with a marker spacing of 20 cM, but became more serious with spacings of 5 cM and smaller (results not shown). In practice one would include a check for collinearity (SARI-GORLA et al. 1997), drop one of two perfectly (or very highly) correlated markers, and include instead whichever of the adjacent (pseudo-)noncollinear markers fits best.
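A check for perfectly correlated adjacent markers can be as simple as the Python sketch below. It only flags exact collinearity for 0/1-coded markers (identical or fully complementary columns); screening for "very highly" correlated markers would need a correlation threshold, which is not specified in the text. The function name is hypothetical.

```python
import numpy as np


def collinear_adjacent_pairs(markers):
    """Return indices j for which marker columns j and j + 1 are perfectly correlated.

    markers: n x m array of 0/1 genotype codes for one chromosome.
    """
    m = np.asarray(markers)
    flagged = []
    for j in range(m.shape[1] - 1):
        a, b = m[:, j], m[:, j + 1]
        # For 0/1 codes, perfect correlation means identical or complementary columns.
        if np.array_equal(a, b) or np.array_equal(a, 1 - b):
            flagged.append(j)
    return flagged
```

Pairs flagged in this way would be excluded from the search, mirroring the rejection of the corresponding models in the simulations.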


*  RESULTS

The number of detected QTL is usually quite stable across criteria (Table 5). SBC tends to select the simplest models and thus the average number of correctly detected QTL is usually smaller than for other criteria, but the difference is very small most of the time, except for more extreme cases such as examples 9 and 10, where the difference is somewhat more pronounced. HQc and FPE4 also tend to select simpler models. In all cases investigated, SBC clearly has the smallest false positive rate (Table 6) and the most favorable percentage of correct detections among all detections (Table 7), often followed by HQc and FPE4. For these two types of counts, SBC is generally markedly superior to some other quite popular criteria such as s² and Cp. For example, with example 2, SBC has an average number of 0.63 false detections, while s² and Cp have 6.06 and 7.36 false detections, respectively. SBC is followed by HQc (1.07) and FPE4 (1.31) in this example. In the example with no QTL (example 12), SBC picks the correct model (model with no markers) 92% of the time, which is by far better than any other criterion. Only FPE4 and HQc come anywhere near this figure (59 and 74%, respectively). It should be noted that all criteria select from the same sequence of models. The difficult task is to strike the right balance between underfitting and overfitting, i.e., to find Ockham's Hill (MACKAY 1992), and it is this task for which model selection criteria are designed. In our simulations, SBC is best at finding a suitable cut-off; i.e., it detects when the sequence starts picking up more noise than pattern.


 

 
Table 5. Number of true QTL detected by MPS for selected fitted models based on different selection criteria and different examples, averaged across 100 simulations


 

 
Table 6. Number of false positives among detections for fitted models selected by MPS based on different selection criteria and different examples, averaged across 100 simulations


 

 
Table 7. Proportion of number of detected true QTL among total number of detections for fitted models selected by MPS based on different selection criteria and different examples, averaged across 100 simulations

Examples 9 and 10, which were chosen mainly to see how the criteria performing best in most cases would perform under circumstances very favorable to other criteria such as AIC, are extreme cases in many respects. The effects are all equal and not tapered as in many of the other examples. In contrast to other examples, SBC has a markedly smaller number of correct detections in example 9 (Table 5), so the fact that it still has the most favorable rate of correct detections relative to the total number of detections does not have an unambiguous interpretation. If we are more concerned about false positives, SBC is clearly favorable, while other criteria fare better regarding the number of correct detections.

Examples 13 and 14 have the same QTL as example 7, but are different in that the number of markers exceeds the number of individuals. Thus, there is a larger potential for overfitting. Note that the criteria Cp and GM are not applicable because the number of markers exceeds the sample size. While for both examples 13 and 14 the number of correct detections is about the same for all criteria, SBC is the clear winner in terms of the ratio of true detections among all detections (Table 7). For many of the other criteria, the number of false detections (Table 6) increases dramatically for examples 13 and 14 compared to example 7, showing that the problem of overfitting increases with the number of markers. SBC is the only criterion for which the number of false positives does not change markedly relative to example 7.

A comparison of examples 2, 3, 4, and 5 in Table 6 and Table 7 shows that all criteria select simpler models as σ² increases and as sample size decreases. Increasing the sample size from 200 (example 2) to 500 (example 5) results in a mild increase in the number of correct detections (Table 5) and in the proportion of correct detections among all detections (Table 7). Reducing marker spacing (compare examples 1 and 2 and examples 7 and 14) increased the number of false positives and reduced the proportion of correct detections, indicating that the risk of overfitting increases with the number of markers. Note, however, that in example 2 the number of correct detections is also increased relative to example 1.

Bias, as assessed by the overall measure SSE(α), is comparable for all selection criteria (Table 8). The only exception to this rule is SBC, which due to its tendency to select simpler models than other criteria has a notably smaller number of detected QTL and so has somewhat larger aggregate bias than other criteria in some examples, mainly due to undetected QTL. Bias decreases with smaller variance σ² (examples 2–4 and examples 9 and 10). This corroborates the finding of UTZ and MELCHINGER (1994) that heritability is among the main factors determining bias. Our results for examples 9 and 10 are somewhat more diverse than those of BEAVIS (1994), who almost exclusively found large upward biases for examples similar to ours on the basis of a comparable range of heritabilities. Note that BEAVIS (1994) used IM, which has been shown by UTZ and MELCHINGER (1994) to be associated with more severe biases than CIM. We should point out that MPS is more akin to CIM, which may go some way toward explaining the contrasting results. Moreover, bias assessment is necessarily somewhat arbitrary, for it depends on the definition of QTL detection. We consider a QTL as detected when the model has a position estimate within 15 cM of the true QTL. Changing the margin to 10 or 20 cM leads to different bias estimates.


 

 
Table 8. SSE(α) for fitted models selected by MPS based on different selection criteria and different examples, averaged across 100 simulations

We summarize the results as follows: Since the number of correct detections is usually quite constant across criteria, we think that the number of false positives and the fraction of correct detections are the most meaningful performance measures. From the overall picture of simulation results, SBC emerges as the best criterion, with FPE4 and HQc as the closest competitors. While SBC, HQc, and FPE4 tend to find slightly fewer QTL than other criteria, they do much better in avoiding the risk of detecting spurious QTL.


*  DISCUSSION

Features of MPS:
MPS is a new procedure that addresses the goal of finding as many QTL as possible, while limiting the risk of detecting spurious QTL. It contains three important building blocks specifically designed to achieve this goal: (i) selection of marker pairs, (ii) augmentation of a forward selection procedure by an exhaustive search per chromosome, and (iii) application of a model selection criterion to select the final model from a sequence of models. None of these building blocks is in itself new. The novelty here is the way in which these components are integrated into a single algorithm and how they are applied to QTL mapping, exploiting our knowledge of the underlying biology. The two main differences between MPS and CIM are the way in which cofactors are selected and how the final model is selected. MPS implicitly uses marker pairs of other QTL as cofactors, while conventional CIM can use a wide variety of ways in which cofactors are selected (forward selection, using the best five markers, using two markers per chromosome, etc.). MPS uses criteria such as SBC to select a model, while CIM uses multiple LR tests.
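Building block (iii) can be sketched as follows. The sketch assumes one common least-squares form of the criteria, SBC = n log(RSS/n) + p log(n) and AIC = n log(RSS/n) + 2p; the definitions actually used in this article are those of Table 2 and may differ, e.g., by scaling. The model sequence and its RSS values are hypothetical.

```python
import math


def sbc(rss, n, p):
    """Schwarz Bayesian criterion in a common least-squares form (assumed here)."""
    return n * math.log(rss / n) + p * math.log(n)


def aic(rss, n, p):
    """Akaike information criterion in the matching least-squares form."""
    return n * math.log(rss / n) + 2 * p


def select(models, n, criterion):
    """Return the label of the model with the smallest criterion value."""
    return min(models, key=lambda m: criterion(m[1], n, m[2]))[0]


# Hypothetical model sequence: (label, residual sum of squares, number of
# regression parameters including the intercept). Each marker pair adds two.
n = 100
models = [("no pair", 200.0, 1), ("1 pair", 150.0, 3), ("2 pairs", 142.0, 5)]

best_sbc = select(models, n, sbc)
best_aic = select(models, n, aic)
print(best_sbc, best_aic)
```

In this constructed sequence the second pair reduces the RSS only slightly. The heavier penalty of SBC, log(n) per parameter rather than 2, leads it to stop at one pair, whereas AIC admits the second pair.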

MPS can be used as a stand-alone procedure for detecting QTL and estimating their effects and positions in BC1 populations, or equivalently in recombinant inbred and doubled haploid lines. It is also applicable to F2 populations if one is interested only in additive effects and not in dominance effects (WHITTAKER et al. 1996). Alternatively, for any kind of unselected population, MPS may serve as a supplement to CIM in two ways. First, the LOD profile produced by CIM can be overlaid with the positions of the pairs of markers selected by MPS. Peaks in the profile associated with selected pairs can then be given more credibility than other peaks. Second, MPS can be used to select cofactors. While Algorithm 1 will likely select pairs of markers accounting for QTL, it may not be efficient to use all markers selected by Algorithm 1 in CIM. Instead, it may be preferable to use only a subset of them. We therefore suggest taking the markers detected by Algorithm 1 and performing an exhaustive search among them, using some model selection criterion. The resulting subset may be submitted to CIM. It is as yet an open question whether use of all pairs of markers selected by Algorithm 1 in CIM is inferior to using a subset of them. This question deserves further study.

The equivalence of IM/CIM as applied to adjacent marker pairs and the approach by WHITTAKER et al. (1996) used in MPS is restricted to the case where IM/CIM does not map a QTL exactly at a marker, i.e., where the RSS has a local minimum in the open interval r_L ∈ (0, θ) and this minimum is smaller than the RSS at r_L = 0 and r_L = θ. In our experience this case will be the rule in real applications. As pointed out by a referee, if IM/CIM maps a true QTL exactly at a marker, it is possible that with the approach of WHITTAKER et al. (1996) the estimates of β_L and β_R have opposite sign, where one marker corresponds to the mapped QTL. In this case it may happen that Algorithm 1 fails to detect the QTL due to the requirement of sign consistency, while IM/CIM finds the QTL. Our procedure can be modified to account for the problem. Consider three adjacent markers 1, 2, and 3 and assume there is a QTL close to marker 2. As we scan the chromosome, adjacent pairs (1, 2) and (2, 3) will be tried, possibly in conjunction with a set of additional pairs on the same chromosome. We suggest scrutinizing fits of pairs (1, 2) and (2, 3) with the same set of additional pairs on the same chromosome (this set may be empty). If the sign of the regression coefficient for marker 2 is the same for both pairs, and if the signs of regression coefficients for markers 1 and 3 agree with each other and are opposite to the sign for marker 2, the QTL would go unnoticed by Algorithm 1. Thus, in such cases we could fit the pair (1, 3), again with the same set of additional pairs. The pair (1, 3) would be considered further only if the signs of the regression coefficients for markers 1 and 3 change compared to the corresponding fits with pairs (1, 2) and (2, 3). If the pair (1, 3) is selected for the current model order, the dropped marker 2 could be considered again for higher model orders. We have not incorporated this modification in our description of Algorithm 1 and in the simulation, because in our experience the problem is not very common, and the expected gain in power is small. Also, the modification increases the risk of detecting false positives.
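The sign pattern that would trigger a trial fit of the outer pair (1, 3) can be written down compactly. The following is a minimal sketch of that check only; the function names and coefficient values are hypothetical, the regression fits themselves are assumed to be available, and the subsequent check that the signs for markers 1 and 3 change in the fit of pair (1, 3) is not shown.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def should_try_outer_pair(fit_12, fit_23):
    """Decide whether pair (1, 3) deserves a trial fit.

    fit_12 = (b1, b2) are the coefficients of markers 1 and 2 when pair
    (1, 2) is fitted; fit_23 = (b2, b3) are those of markers 2 and 3 when
    pair (2, 3) is fitted, both with the same set of additional pairs.
    """
    b1, b2_left = fit_12
    b2_right, b3 = fit_23
    s2 = sign(b2_left)
    return (
        s2 != 0
        and sign(b2_right) == s2          # marker 2 agrees across both fits
        and sign(b1) == sign(b3) == -s2   # markers 1 and 3 agree, oppose marker 2
    )


# Hypothetical coefficients: marker 2 carries the effect, and its neighbors
# receive small estimates of opposite sign in both fits.
print(should_try_outer_pair((-0.2, 1.1), (1.0, -0.3)))
# Sign-consistent pairs need no special treatment.
print(should_try_outer_pair((0.4, 0.6), (0.5, 0.3)))
```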

Instead of an exhaustive search per chromosome as implemented in Algorithm 1, we could adopt a simple forward selection procedure, possibly improved by some measures that exploit knowledge of the biology. For example, if at one step markers 2 and 3 have been selected on a chromosome, it is sensible to allow the pair (1, 4) to be selected in subsequent steps, provided that the resulting pairs (1, 2) and (3, 4) each yield regression estimates of the same sign within the pair. This makes sure that the "correct" pairs can be selected in case there are two isolated QTL in the intervals (1, 2) and (3, 4) on the same chromosome, and at an earlier step pair (2, 3) was selected. Also, we could allow a selected pair of markers to move one position to the left or to the right as more marker pairs are being added. Thus, e.g., having selected the pair (2, 3) at some stage, we would allow this pair to be replaced by (1, 2) or (3, 4) later in the selection process, if this improves the fit. One can think of more modifications of simple forward selection. In fact, the modified algorithm may become fairly complicated and impractical to program when one attempts to cover all the QTL and marker configurations that may occur in reality. While our partially exhaustive search is computationally more demanding, it has the virtue of simplicity and at the same time covers many of the features lacking in a simple forward selection algorithm.

We observed that MPS occasionally selects more than one pair of markers for a large QTL, leading to overfitting of that QTL. We could augment Algorithm 1 by a step that tries to reduce the model whenever there are two or more pairs of markers on the same chromosome. Our investigations (results not shown) suggest that this modification will slightly increase efficiency when in fact there is only one QTL on the chromosome, while it may degrade the performance of the algorithm when there is more than one QTL. For simplicity, we have not included such modifications in our simulation study.

Comparison of MPS to other procedures:
In conventional CIM, cofactors are usually selected on the basis of simple forward selection, with markers entering the model individually rather than in pairs. Often, the selection is semiautomated or fully automated, with no check of whether the selected cofactors match QTL detected later in the CIM scan across the chromosome. Such a check would be useful as a guard against overfitting. It has been observed that inclusion of too many cofactors that are not associated with a QTL will reduce power to identify QTL relative to IM (ZENG 1993; BEAVIS 1994). Since MPS selects markers in pairs, the likelihood of selecting spurious markers as cofactors is reduced. Moreover, a stringent criterion such as SBC further reduces the risk of overfitting.

A more detailed comparison of CIM and MPS would be rewarding, but is beyond the scope of this article. For CIM there are many parameters that would have to be considered in simulations: window size, definition of critical threshold, definition of when a LOD peak detects a QTL and when it must be considered a "sub-peak" of another detecting peak, selection of cofactors, ML or least squares, etc. In fact, CIM could be modified by taking up some or all of the ingredients that make up MPS, i.e., selecting markers in pairs, requiring sign consistency, using SBC or some other criterion instead of LOD thresholds, doing an exhaustive search per chromosome to select cofactors, etc. Thus, a detailed comparison will have to be fairly extensive and should include various blends of MPS and CIM. Such a study is left for future work.

Recently, KAO et al. (1999) suggested a method termed multiple interval mapping (MIM) that was judged superior to CIM. Using MIM, several QTL can be fitted simultaneously, allowing for complex models of gene action including epistasis. Due to the potentially large complexity of models fitted by MIM, there is a severe danger of overfitting. The authors mention a number of model selection strategies, including use of AIC and SBC, but, for their suggested procedure, they adopt a stepwise selection procedure in conjunction with multiple LR testing and a Bonferroni adjustment on tests for epistasis. The results presented in our article strongly suggest that the performance of MIM could be considerably improved by using a selection criterion such as SBC in place of multiple LR tests. In fact, MPS can be used to first select potential regions in the genome for fitting QTL. This can be followed by MIM restricted to the selected regions. Restricting application of MIM to the regions selected by MPS is expected to reduce the inherent danger of overfitting.

Further remarks on model selection:
Procedures for mapping QTL aim at finding as many true QTL as possible, while avoiding the risk of detecting spurious QTL. Significance testing as is commonly used for IM, CIM, and MIM is not necessarily the best strategy to achieve this goal (GELFAND and GHOSH 1998). For IM, there are a number of methods for determining the appropriate threshold so that the genome-wise type I error rate is controlled at a predetermined value, such as 5% (LANDER and BOTSTEIN 1989, 1994; CHURCHILL and DOERGE 1994; REBAÏ et al. 1994, 1995; DOERGE and CHURCHILL 1996; GOFFINET and MANGIN 1998; DUPUIS and SIEGMUND 1999). Most of these methods operate under a global null hypothesis of no QTL anywhere in the genome, which is rather restrictive, but see DOERGE and CHURCHILL (1996) and GOFFINET and MANGIN (1998). Controlling the genome-wise error rate in a sequential model-building process is an inherently difficult problem. Also, significance tests do not allow a comparison of nonnested models. Forcing a nested model sequence entails the risk of detecting ghost QTL and missing better-fitting models. Moreover, having controlled the genome-wise rate of false positives among tests under the null hypothesis of no QTL at 5%, the rate of false positives among QTL detections, which is a different quantity that is usually of greater interest, can easily be 50% or more (SOUTHEY and FERNANDO 1998).
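The distinction between the two error rates can be made concrete with a back-of-the-envelope calculation. This is an illustration only and not the calculation of SOUTHEY and FERNANDO (1998): it assumes a long run of independent tests, a known fraction of which have no QTL, a fixed type I error rate, and a fixed power. All three input values below are hypothetical.

```python
def false_positive_share(null_fraction, alpha, power):
    """Expected share of false positives among significant results.

    null_fraction: fraction of tests with no QTL (assumed known here),
    alpha: type I error rate per test, power: detection probability
    when a QTL is present.
    """
    false_pos = null_fraction * alpha
    true_pos = (1.0 - null_fraction) * power
    return false_pos / (false_pos + true_pos)


# Hypothetical inputs: 90% of tests have no QTL, tests are run at the 5%
# level, and power to detect a real QTL is 45%.
share = false_positive_share(null_fraction=0.9, alpha=0.05, power=0.45)
print(round(share, 3))
```

Under these assumed inputs half of all significant results are false, although the type I error rate is held at 5%. The share grows as power drops or as true QTL become rarer among the tests.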

Model selection criteria are based on a philosophy that is essentially different from that underlying significance testing (BURNHAM and ANDERSON 1998). A common basis of many criteria is the notion that the more information is gathered, the greater is the model complexity that the data can support (BUCKLAND et al. 1997). While not guaranteeing the absence of false positives among detections, criteria such as SBC do a better job of striking the balance between the contrasting objectives of finding as many real QTL as possible and at the same time keeping the risk of fitting spurious QTL low. Moreover, the difficult task of finding an appropriate adjustment for multiple testing to control a genome-wise error rate is obviated, and nonnested models can be compared.

In this article we have not used computer-intensive methods of model selection, such as leave-d-out cross-validation and bootstrapping (HJORTH 1994; MCQUARRIE and TSAI 1998), to limit the computational burden in simulations. It is interesting to note, however, that there exist several asymptotic equivalence relationships between cross-validation and the selection criteria used here (see Table 2), for example, between FPE and PRESS as well as between SBC and leave-d-out cross-validation, when d = n(1 - 1/(log(n) - 1)) (SHAO 1996; MCQUARRIE and TSAI 1998).
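To give a feeling for the size of d implied by this relationship, the expression can be evaluated directly. The sketch below assumes that log denotes the natural logarithm; the function name and the sample sizes are hypothetical.

```python
import math


def leave_out_size(n):
    """Evaluate d = n(1 - 1/(log(n) - 1)) with the natural logarithm.

    The expression yields a value strictly between 0 and n only when
    log(n) > 2, so smaller sample sizes are rejected.
    """
    if math.log(n) <= 2.0:
        raise ValueError("n must exceed exp(2) for 0 < d < n")
    return n * (1.0 - 1.0 / (math.log(n) - 1.0))


# Hypothetical sample sizes.
for n in (50, 100, 500):
    d = leave_out_size(n)
    print(n, round(d, 1), round(d / n, 3))
```

For n = 100 the expression gives d of about 72, i.e., well over half of the observations are left out for validation in each split, and the fraction d/n increases with n.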


*  ACKNOWLEDGMENTS

We thank John Whittaker and three anonymous reviewers for helpful comments. This article was written while the first author was visiting the Department of Biometrics and the Department of Plant Breeding, College of Agriculture and Life Sciences, Cornell University, Ithaca, New York. Support of the Heisenberg Programm of the Deutsche Forschungsgemeinschaft is gratefully acknowledged.

Manuscript received January 27, 2000; Accepted for publication October 6, 2000.


*  APPENDIX

Let $z = g_1\alpha_1 + g_2\alpha_2$, where $\alpha_1$ and $\alpha_2$ are additive genetic effects of two QTL and $g_1$ and $g_2$ are coded 0 and 1 depending on the genotype at the QTL. For the nonrecombinant genotypes we have either $g_1 = g_2 = 0$ or $g_1 = g_2 = 1$. For the recombinant genotypes, $g_1 = 0$ and $g_2 = 1$ or $g_1 = 1$ and $g_2 = 0$. Let $r$ be the recombination fraction between the two QTL. It can be shown that

$$\mathrm{var}(z) = \tfrac{1}{4}\left[\alpha_1^2 + \alpha_2^2 + 2(1 - 2r)\,\alpha_1\alpha_2\right] \tag{A1}$$

(see WEIR 1996 for a more general result). Note that in the case of independence $r = 0.5$ and $\mathrm{var}(z) = \tfrac{1}{4}(\alpha_1^2 + \alpha_2^2)$. Thus, the contribution to the total genetic variance of the $h$th QTL is $\alpha_h^2/4$, provided it is independent of all other QTL in the genome. The joint contribution of two linked QTL, which are independent of all other QTL, is given in (A1). Since in our simulations there are no more than two QTL in a linkage group, these results suffice to compute the total genetic variance and the heritability.
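The variance of z can be checked numerically. The sketch below assumes that each nonrecombinant class has probability (1 - r)/2 and each recombinant class r/2, as in a backcross; direct enumeration over the four classes then gives var(z) = (1/4)[α1² + α2² + 2(1 - 2r)α1α2], which reduces to the independence case at r = 0.5. Heritability is taken as genetic variance over genetic plus error variance, which is an assumption about the definition used. Function names and numerical values are hypothetical.

```python
def var_two_qtl(a1, a2, r):
    """Closed-form variance of z = g1*a1 + g2*a2 for two linked QTL."""
    return 0.25 * (a1 ** 2 + a2 ** 2 + 2.0 * (1.0 - 2.0 * r) * a1 * a2)


def var_by_enumeration(a1, a2, r):
    """Same variance from the four genotype classes and their probabilities."""
    classes = [
        ((0, 0), (1.0 - r) / 2.0),  # nonrecombinant
        ((1, 1), (1.0 - r) / 2.0),  # nonrecombinant
        ((0, 1), r / 2.0),          # recombinant
        ((1, 0), r / 2.0),          # recombinant
    ]
    mean = sum(p * (g1 * a1 + g2 * a2) for (g1, g2), p in classes)
    second = sum(p * (g1 * a1 + g2 * a2) ** 2 for (g1, g2), p in classes)
    return second - mean ** 2


def heritability(genetic_var, error_var):
    """Genetic variance as a fraction of total variance (assumed definition)."""
    return genetic_var / (genetic_var + error_var)


# Hypothetical effects and recombination fraction.
a1, a2, r = 1.0, 0.5, 0.1
vg = var_two_qtl(a1, a2, r)
print(vg, var_by_enumeration(a1, a2, r), heritability(vg, error_var=1.0))
```

When the two effects have the same sign, tight linkage (small r) inflates the joint contribution relative to the independence case; with effects of opposite sign it reduces it.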


*  LITERATURE CITED

AKAIKE, H., 1973 Information theory and an extension of the maximum likelihood principle, pp. 267–281 in 2nd International Symposium on Information Theory, edited by B. N. PETROV and F. CSAKI. Akademiai Kiado, Budapest.

ALLEN, D. M., 1974 The relationship between variable selection and data augmentation and a method for prediction. Technometrics 16:125–127.

BEAVIS, W. D., 1994 The power and deceit of QTL experiments: lessons from comparative QTL studies, pp. 250–266 in Report of the Forty-Ninth Annual Corn and Sorghum Research Conference, edited by D. B. WILKINSON. American Seed Trade Association, Washington, DC.

BHANSALI, R. J., and D. Y. DOWNHAM, 1977 Some properties of the order of an autoregressive model selected by a generalized Akaike's EPF criterion. Biometrika 64:547–551.

BREIMAN, L., and D. FREEDMAN, 1983 How many variables should be entered in a regression equation? J. Am. Stat. Assoc. 78:131–136.

BUCKLAND, S. T., K. P. BURNHAM, and N. H. AUGUSTIN, 1997 Model selection: an integral part of inference. Biometrics 53:603–618.

BURNHAM, K. P., and D. R. ANDERSON, 1998 Model Selection and Inference. Springer, New York.

CHURCHILL, G. A., and R. W. DOERGE, 1994 Empirical threshold values for quantitative trait mapping. Genetics 138:963–971.

DARVASI, A., A. WEINREB, V. MINKE, J. I. WELLER, and M. SOLLER, 1993 Detecting marker-QTL linkage and estimating QTL gene effect and map location using a saturated genetic map. Genetics 134:943–951.

DOERGE, R. W., and G. A. CHURCHILL, 1996 Permutation tests for multiple loci affecting a quantitative character. Genetics 142:285–294.

DRAPER, D., 1995 Assessment and propagation of model uncertainty. J. R. Stat. Soc. B 57:45–97.

DRAPER, N. R., and H. SMITH, 1998 Applied Regression Analysis. Wiley, New York.

DUPUIS, J., and D. SIEGMUND, 1999 Statistical methods for mapping quantitative trait loci from a dense set of markers. Genetics 151:373–386.

EFROYMSON, M. A., 1960 Multiple regression analysis, pp. 191–203 in Mathematical Methods for Digital Computers, edited by A. RALSTON and H. S. WILF. Wiley, New York.

GABRIEL, K. R., and F. C. PUN, 1979 Binary prediction of weather events with several predictors, pp. 248–253 in 6th Conference on Probability and Statistics in Atmospheric Sciences. American Meteorological Society, Boston, MA.

GELFAND, A. E., and S. K. GHOSH, 1998 Model choice: a minimum posterior predictive loss approach. Biometrika 85:1–11.

GEWEKE, J., and R. MEESE, 1981 Estimating regression models of finite but unknown order. Int. Econ. Rev. 22:55–70.

GOFFINET, B., and B. MANGIN, 1998 Comparing methods to detect more than one QTL on a chromosome. Theor. Appl. Genet. 96:628–633.

HALEY, C. S., and S. A. KNOTT, 1992 A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69:315–324.

HANNAN, E. J., and B. G. QUINN, 1979 The determination of the order of an autoregression. J. R. Stat. Soc. B 41:190–195.

HJORTH, J. S. U., 1994 Computer Intensive Statistical Methods. Validation, Model Selection and Bootstrap. Chapman and Hall, London.

HURVICH, C. M., and C.-L. TSAI, 1989 Regression and time series model selection in small samples. Biometrika 76:297–307.

JANSEN, R. C., 1993 Interval mapping of multiple quantitative trait loci. Genetics 135:205–211.

JANSEN, R. C., and P. STAM, 1994 High resolution of quantitative traits into multiple loci via interval mapping. Genetics 136:1447–1455.

KAO, C. H., Z.-B. ZENG, and R. D. TEASDALE, 1999 Multiple interval mapping for quantitative trait loci. Genetics 152:1203–1216.

KEARSEY, M. J., and A. G. L. FARQUHAR, 1998 QTL analysis in plants: where are we now? Heredity 80:137–142.

LANDER, E. S., and D. BOTSTEIN, 1989 Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121:185–199.

LANDER, E. S., and D. BOTSTEIN, 1994 Corrigendum. Genetics 136:705.

LEBRETON, C. M., and P. M. VISSCHER, 1998 Empirical nonparametric bootstrap strategies in quantitative trait loci mapping: conditioning on the genetic model. Genetics 148:525–535.

MACKAY, D. J. C., 1992 Bayesian interpolation. Neural Comput. 4:415–447.

MALLOWS, C. L., 1973 Some comments on Cp. Technometrics 15:661–675.

MARTINEZ, O., and R. N. CURNOW, 1992 Estimating the locations and the sizes of the effects of quantitative trait loci using flanking markers. Theor. Appl. Genet. 85:480–488.

MCQUARRIE, A. D. R., and C.-L. TSAI, 1998 Regression and Time Series Model Selection. World Scientific Publishers, Singapore.

MELCHINGER, A. E., H. F. UTZ, and C. C. SCHÖN, 1998 Quantitative trait loci (QTL) mapping using different testers and independent population samples in maize reveals low power of QTL detection and large bias in estimates of QTL effects. Genetics 149:383–403.

MILLER, A. J., 1990 Subset Selection in Regression. Chapman & Hall, London.

PIEPHO, H. P., 2000 Optimal marker density for interval mapping in a backcross population. Heredity 84:437–440.

REBAÏ, A., B. GOFFINET, and B. MANGIN, 1994 Approximate thresholds of interval mapping tests for QTL detection. Genetics 138:235–240.

REBAÏ, A., B. GOFFINET, and B. MANGIN, 1995 Comparing power of different methods for QTL detection. Biometrics 51:87–99.

SARI-GORLA, M., T. CALINSKI, Z. KACZMAREK, and P. KRAJEWSKI, 1997 Detecting QTL × environment interaction in maize by a least squares interval mapping method. Heredity 78:146–157.

SCHWARZ, G., 1978 Estimating the dimension of a model. Ann. Stat. 6:461–464.

SHAO, J., 1996 Bootstrap model selection. J. Am. Stat. Assoc. 91:655–665.

SILLANPÄÄ, M. J., and E. ARJAS, 1998 Bayesian mapping of multiple quantitative trait loci from incomplete inbred line cross data. Genetics 148:1373–1388.

SOUTHEY, B. R., and R. L. FERNANDO, 1998 Controlling the proportion of false positives among significant results in QTL detection. Proc. 6th World Congr. Genet. Appl. Livest. Prod. 26:221–224.

STAM, P., 1991 Some aspects of QTL analysis, pp. 23–31 in Proceedings of the VIIIth Meeting of the Eucarpia Section Biometrics in Plant Breeding, edited by J. PESEK, M. HERMAN and J. HARTMANN. Brno, Czechoslovakia.

UTZ, H. F., and A. E. MELCHINGER, 1994 Comparison of different approaches to interval mapping of quantitative trait loci, pp. 195–204 in Biometrics in Plant Breeding: Application of Molecular Markers, Proceedings of the Ninth Meeting of the EUCARPIA Section Biometrics in Plant Breeding, edited by J. W. VAN OOIJEN and J. JANSEN. CPRO-DLO, Wageningen, The Netherlands.

WEIR, B., 1996 Genetic Data Analysis II. Sinauer, Sunderland, MA.

WEISBERG, S., 1985 Applied Linear Regression. Wiley, New York.

WHITTAKER, J. C., R. THOMPSON, and P. M. VISSCHER, 1996 On the mapping of QTL by regression of phenotype on marker-type. Heredity 77:23–32.

ZENG, Z.-B., 1993 Theoretical basis of separation of multiple linked gene effects on mapping quantitative trait loci. Proc. Natl. Acad. Sci. USA 90:10972–10976.

ZENG, Z.-B., 1994 Precision mapping of quantitative trait loci. Genetics 136:1457–1466.



