Genetics, Vol. 165, 2259-2268, December 2003, Copyright © 2003

Theoretical Basis of the Beavis Effect

Shizhong Xua
a Department of Botany and Plant Sciences, University of California, Riverside, California 92521

Corresponding author: Shizhong Xu

Communicating editor: S. OTTO


*  ABSTRACT
*TOP
*ABSTRACT
*STATISTICAL THEORY
*NUMERICAL EVALUATION
*DISCUSSION
*LITERATURE CITED

The core of statistical inference is based on both hypothesis testing and estimation. The use of inferential statistics for QTL identification thus includes estimation of genetic effects and statistical tests. Typically, QTL are reported only when the test statistics reach a predetermined critical value. Therefore, the estimated effects of detected QTL are actually sampled from a truncated distribution. As a result, the expectations of detected QTL effects are biased upward. In a simulation study, William D. Beavis showed that the average estimates of phenotypic variances associated with correctly identified QTL were greatly overestimated if only 100 progeny were evaluated, slightly overestimated if 500 progeny were evaluated, and fairly close to the actual magnitude when 1000 progeny were evaluated. This phenomenon has subsequently been called the Beavis effect. Understanding the theoretical basis of the Beavis effect will help interpret QTL mapping results and improve success of marker-assisted selection. This study provides a statistical explanation for the Beavis effect. The theoretical prediction agrees well with the observations reported in Beavis's original simulation study. Application of the theory to meta-analysis of QTL mapping is discussed.


THE primary goal of genetic mapping experiments is to identify the locations of genes that affect variable expression of a trait among individuals. Most researchers also use the data from a sampled population to estimate the genetic effects of quantitative trait loci (QTL). Information about the magnitude of the genetic effects of significant QTL is useful in prioritizing subsequent uses of the loci as candidate genes for consideration in genetic engineering and marker-assisted selection (LANDE and THOMPSON 1990 Down). Many statistical methods produce unbiased or asymptotically unbiased estimates of parameters. However, this property does not apply to QTL effects estimated from genome scans. This phenomenon was first discussed by LANDE and THOMPSON 1990 Down.

Almost all QTL mapping procedures can detect QTL with large effects, but not all can detect QTL with intermediate and small effects. Quantitative traits are defined as traits controlled primarily by intermediate and small QTL. BEAVIS 1994 Down, BEAVIS 1998 Down designed a large-scale simulation experiment to evaluate the efficiency of interval mapping for detecting and estimating polygenes. Hereafter, this simulation experiment is called the Beavis experiment. BEAVIS 1998 Down simulated either 10 or 40 independent QTL, examining situations where each QTL explained a proportion of the phenotypic variance ranging from 0.75 to 9.5% with sample sizes ranging from 100 to 1000. The Beavis experiment showed that the average estimates of phenotypic variances associated with correctly identified QTL were greatly overestimated if only 100 progeny were evaluated, slightly overestimated if 500 progeny were evaluated, and fairly close to the actual magnitude when 1000 progeny were evaluated. When the sample size was small, say 100, the statistical power of detecting a small QTL was as low as 3% and the estimated effects were typically inflated 10-fold. This phenomenon has since been referred to as the Beavis effect and has formed the basis of a number of subsequent analyses (e.g., OTTO and JONES 2000 Down; HAYES and GODDARD 2001 Down).

Estimating the number of quantitative trait loci is another goal of QTL mapping experiments (SILLANPAA and ARJAS 1998 Down). OTTO and JONES 2000 Down acknowledged that the estimated number of QTL using molecular markers (interval mapping) can be more informative than that obtained from analyses using only phenotype (LANDE 1981 Down; ZENG 1992 Down). However, a QTL is reported from the analysis only if it is detected. Therefore, the distribution of the detected QTL is actually inferred from censored data due to missing small-effect QTL. OTTO and JONES 2000 Down incorporated the Beavis effect into their maximum-likelihood analysis to recover the potential number of QTL using the number of detected QTL, so called "detecting the undetected." It should be emphasized that the purpose of OTTO and JONES 2000 Down was to infer the number of QTL for a quantitative trait within a single experiment. Some QTL, however, may have large effects and some may have small effects, and estimating the total number depends on the distribution of QTL effects. If the overall distribution of the effects is described by an exponential distribution, the distribution of detected QTL effects becomes a truncated exponential distribution after incorporating the Beavis effect. This should not be confused with the original Beavis experiment where all simulated QTL had an equal genetic effect.

HAYES and GODDARD 2001 Down conducted a meta-analysis for QTL mapping in livestock to infer the distribution of the effects of genes. They fit the estimated gene effects surveyed from all QTL mapping experiments in pigs and cattle to a gamma distribution. The difference between this meta-analysis and the usual analysis is that estimated genetic effects from multiple experiments were fit in a model rather than estimated from observed measurements. As a consequence, they had to take into account two additional sources of error for the estimated gene effects: the experimental error due to limited sample sizes and bias caused by data censoring. They found that the estimated genetic effects fit a gamma distribution very well. Again, one should not confuse this meta-analysis with the Beavis experiment because HAYES and GODDARD 2001 Down intended to infer the distribution of genetic effects throughout the genome from multiple experiments. Even if two QTL were identified roughly at the same location by two investigators, they were still treated as different QTL because the estimated positions were not precise enough to be assigned to exactly the same position within the genome. The original Beavis experiment demonstrated that a bias exists in estimating QTL effects. Herein, I present a theoretical basis for the Beavis effect. The theory predicts the amount of bias as a function of the type of progeny used in the experiment, the estimation procedure, the marker density, and the sample size.


*  STATISTICAL THEORY
*TOP
*ABSTRACT
*STATISTICAL THEORY
*NUMERICAL EVALUATION
*DISCUSSION
*LITERATURE CITED

Marker analysis:
The theory is developed using a backcross (BC) mating design, which provides the simplest genetic model in QTL mapping. The result is then extended to other progeny derived from common mating designs. Let yj be the phenotypic value of a quantitative trait measured from individual j sampled from a BC population. The linear model describing yj is

(1)

where µ is the population mean, a is the additive genetic effect at the locus of interest, and {epsilon}j is the residual error, which is assumed to follow a normal distribution, N(0, {sigma}2). In the BC progeny, there are only two possible genotypes at the QTL, heterozygote and homozygote for the backcross parent allele. The independent variable xj is an indicator variable defined as xj = 0 for the homozygote and xj = 1 for the heterozygote.

Let â be the estimated genetic effect and {sigma}2â be the variance of the estimate for a QTL. For simplicity, let us assume that the QTL is closely linked to a marker so that

(2)

where n is the sample size (the number of BC progeny) and

(3)

is the sampling variance of the genotype. Note that n may be replaced by n - 1 for a more precise result.

The variance of x depends on the type of progeny. In a BC population without segregation distortion, half of the individuals will be homozygous and half heterozygous, leading to in the limit of large n. We have implicitly made the assumption that there is no dominance effect. If there is dominance, it will be confounded with the additive effect; i.e., the genetic effect is actually a + d instead of a. Therefore, all subsequent analyses are based on the additive model.

If the positions of all QTL are known a priori (uncensored), the estimate of each QTL effect is approximately unbiased (depending on the method used) so that the distribution of â can be assumed to be approximately normal, i.e., N(a, {sigma}2â). The normal distribution is mathematically convenient because it allows the standard statistical machinery for censored/truncated data to be used (COHEN 1991 Down). Of course, if the assumption of normal distribution is violated, the theory developed here is not valid. The assumption of a normal distribution for â holds better if the residual error is normally, independently, and identically distributed. Some traits may be distributed in a discrete manner but analyzed as a quantitative trait, and some traits may have a skewed distribution with unequal variances. These traits may not have a normal residual error, and thus the theory developed in this study should be used with extra caution for such traits.

In the practice of QTL mapping, estimated genetic effects are reported only for significant QTL. Thus, the reported QTL consists of a censored sample so that the distribution of the estimated QTL effects conditional on statistical significance is a truncated normal distribution with a mean and variance different from those of the original normal distribution.

I now proceed to calculate this truncated distribution. Let us assume that a z-test statistic is used for the significance test so that the critical value for the z-test statistic is defined as Z1-{alpha}/2, where {alpha} is a controlled type I error rate. A QTL is reported only if |z| > Z1-{alpha}/2, where z = â/{sigma}â is the z-test statistic. In other words, all the QTL reported satisfy |â| > {sigma}âZ1-{alpha}/2, i.e., â < -{sigma}âZ1-{alpha}/2 or â > {sigma}âZ1-{alpha}/2. The two-tailed test leads to the possibility that even if a is positive, it may be detected as a significantly negative effect due to sampling. Denote the truncated â by âT so that âT ~ NT(aT, {sigma}2âT), where NT represents a truncated normal distribution, aT = E(âT) is the expectation, and {sigma}2âT = Var(âT) is the variance of the truncated normal distribution. The truncated normal distribution is not symmetric. Let us define the two truncation points in the standardized (z-test statistic) scale as

(4)

and

(5)

Further define

(6)

where {phi}({xi}) and {Phi}({xi}) are the standardized normal probability density function (pdf) and cumulative distribution function (cdf), respectively.

Often, when using likelihood methods, the LOD score criterion may be used for QTL detection. The critical value of the LOD score can be converted into the critical value of the z-test statistic, using the following approximate relationship,

(7)

where {chi}21,1-{alpha} is the critical value of the chi-square distribution with 1 d.f. and is also the critical value for the likelihood-ratio test statistic. The relationship between the likelihood-ratio test statistic and LOD can be found in LYNCH and WALSH 1998 Down.

According to the standard statistical machinery of truncated normal distributions

(8)

(COHEN 1991 Down), so that the bias is

(9)

The variance of the truncated sample is

(10)

This truncated variance is used later when we discuss the bias in the QTL variance estimate.

The phenotypic variance may be partitioned into the true QTL variance ({sigma}2G) and the residual variance ({sigma}2). The genetic variance of a QTL in a BC population is {sigma}2G = a2/4. The corresponding variance in an F2 population is {sigma}2G = a2/2. In general, the genetic variance can be expressed as , where {gamma} is a constant depending on the filial relationship among segregating progeny; e.g., {gamma} = 1/2 for BC and {gamma} = 1/2 for F2. Therefore, a commonly used approach to estimating the genetic variance is . However, this estimation is also biased because

(11)

where

(12)

Therefore, the bias in the estimated variance is

(13)

Note that there are two sources of bias: the contribution of environmental variance to the estimate of a2 given by [n{sigma}2x]-1{gamma}{sigma}2 plus the Beavis effect given by [n{sigma}2x]-1{gamma}{sigma}2 ({xi}'2{psi}2 - {xi}'1{psi}1). Thus, this estimator is biased even in the absence of the Beavis effect. The bias due to the first source of error has been investigated by LUO et al. 2003 Down, who also provided a method to correct the bias.

The one-tailed test is a special case of the two-tailed test in that we simply set {xi}1 = -{infty} and {psi}1 = 0. For the two-tailed test, we can further simplify B into [1 + ({xi}2{psi}1 - {xi}1{psi}2)] due to and . We did not present this simplified version of B as a general formula because it would not allow us to extend the results simply to the one-tailed test.

The proportion of phenotypic variance explained by the QTL is defined as

(14)

where is the total phenotypic variance among the test progeny. The h2Q notation is adopted from classical quantitative genetics and is known as the heritability. When used as a proportion of variance contributed by an individual QTL, it is no longer called the heritability. The typical estimator for h2Q is . It is hard to find the expectation of a ratio. However, if we assume that the estimation error of the denominator is negligible, we can take . This assumption is justified for many quantitative traits because the phenotypic variance is usually accurately estimated. On the basis of this assumption, we get

(15)

Letting , the bias of the genetic variance can be rewritten as

(16)

which leads to

(17)

Therefore, the proportion of the phenotypic variance associated with the QTL also is biased. Note that the term ({xi}'2{psi}2 - {xi}'1{psi}1) in Equation 17 involves the genetic effect, which can be expressed using (14) as

(18)

If we substitute Equation 18 into Equation 4 and Equation 5, {sigma} cancels out from ({xi}'2{psi}2 - {xi}'1{psi}1). Therefore, the bias given in Equation 17 is only a function of h2Q and the parameters associated with the sampled progeny.

We now extend the results to other types of progeny. For simplicity, we assume that dominance is absent. In an F2 population, there are three possible genotypes whose genotypic values are defined as a for the homozygote with the "high allele," 0 for the heterozygote, and -a for the homozygote with the "low allele." The linear model given in Equation 1 applies here in the F2 population except that the x variable is now defined as xj = 1, 0, -1 for the three genotypes, respectively. Without segregation distortion, the variance of x in an F2 population is . Therefore, the results derived above apply here with in BC progeny replaced by in F2 progeny.

The method also can be applied to double-haploid (DH) and recombinant inbred line (RIL) populations. In both DH and RIL, the heterozygous genotype is absent. The x variable is defined as xj = 1 for one homozygote and xj = -1 for the other homozygote. The two types of homozygote have an equal frequency, and thus in both DH and RIL. The results derived above apply by substituting in the BC by in the DH and RIL.

Note that {sigma}2x and {gamma} are identical for all types of progeny if a marker provides a fully informative genotype for the QTL. Different notation is used because there can be differences when the QTL is not tightly linked to a marker.

Interval mapping:
In interval mapping, a QTL may be identified at an intermediate position between markers by inferring the genotype of the QTL from flanking marker information. This will affect both the parameter estimates and the statistical tests of inference. We need to substitute the variance of x by the variance of the estimated x, denoted by {sigma}2. This variance depends on the relative position of the QTL within the interval. The explicit expression is tedious, but numerical values may be computed conveniently. For example, in a BC population, there are four possible marker classes, and the estimated x is denoted by j = E(xj|m1j, m2j) = pj(1), where m1j and m2j indicate the flanking marker genotypes and pj(1) = Pr(xj = 1|m1j, m2j). Let Pr(m1j, m2j) be the joint probability of the flanking marker genotype. These values are given in Table 1, where r1 and r2 are the recombination fractions between the QTL and the two markers, respectively, and r is the recombination between the two markers. From this table, we can calculate {sigma}2 using

(19)

Note that when {sigma}2 is used in place of {sigma}2x, the assumption of a normal distribution for â may be less valid.


 
View this table:
In this window
In a new window

 
Table 1. Predicted genotypic indicator variable and its probability distribution for the four flanking marker genotypes in backcross progeny

The method used to derive {sigma}2 implicitly assumes that the least-squares method (HALEY and KNOTT 1992 Down) is used because the expectation of the QTL genotype indicator is calculated conditional on markers only. In maximum-likelihood analysis, the expectation is actually calculated conditional on both markers and the phenotypic values. However, it is extremely difficult to derive the variance using expectation conditional on both markers and phenotypes. Therefore, the way we calculate {sigma}2 is considered as an approximation.

For a DH population, we define j in a slightly different way (see Table 2) but still use (19) to calculate {sigma}2. For RIL, we use the same table (Table 2), but replace r1, r2, and r by c1, c2, and c, where c = 2r/(1 + 2r) (LYNCH and WALSH 1998 Down).


 
View this table:
In this window
In a new window

 
Table 2. Predicted genotypic indicator variable and its probability distribution for the four flanking marker genotypes in double-haploid progeny

In an F2 population, there are nine flanking marker genotypic classes. The definitions of j's and their probabilities for the nine classes are given in Table 3. The same formula (Equation 19) is used to calculate {sigma}2 except that the summation is taken over nine categories. For example, if the QTL position is in the middle of a 20-cM interval, we have r1 = r2 = 0.0906 and r = 0.1648, leading to . Fig 1 shows the relationship between {sigma}2 and the position of the QTL in the 20-cM interval flanked by two fully informative markers. Note, as the QTL position approaches a marker, {sigma}2 approaches 1/2.



View larger version (13K):
In this window
In a new window
Download PPT slide
 
Figure 1. Relationship between {sigma}2 and the position of a QTL flanked by two fully informative markers in an interval of 20 cM in an F2 population in the limit of large n.


 
View this table:
In this window
In a new window

 
Table 3. Predicted genotypic indicator variable and its probability distribution for the nine flanking marker genotypes in F2 progeny


*  NUMERICAL EVALUATION
*TOP
*ABSTRACT
*STATISTICAL THEORY
*NUMERICAL EVALUATION
*DISCUSSION
*LITERATURE CITED

Bias was evaluated numerically by considering the following factors: the sample size, the genetic effect measured by h2Q, and the LOD score criterion. In evaluating the bias of QTL effects, the residual variance was set at unity, i.e., {sigma}2 = 1.0. If the actual residual variance is not unity, one can always standardize the genetic effect using a/{sigma} in place of a.

The numerical evaluation was conducted only in BC populations because the general trends are similar for all types of progeny (data not shown). In addition, only a one-tailed test was evaluated. The one-tailed test is a special case of the two-tailed test with {xi}1 = -{infty} and {psi}1 = 0. The functional relationships between the size of the detected QTL and the true sizes are shown in Fig 2. The diagonal lines in the first column of Fig 2 represent the case where aT = a and those in the second column represent the case where , which holds only when the sample size is infinitely large. With finite sample sizes, the curves deviate from this straight line and the deviation increases as the sample size decreases. The deviation also increases as the LOD criterion increases. The deviation is negligible when (corresponding to a/{sigma} = 0.2), even if the sample size is as small as n = 50. Assume that the commonly used LOD criterion is 3 and the sample size is 200; the bias is within 7% of the true value as long as h2Q >= 0.10. The bias becomes more severe, however, for small detectable QTL. This observation is consistent with that observed by BEAVIS 1998 Down. It is interesting to note that when and n = 50, the bias in h2Q is as high as 0.18 for LOD = 2 and 0.33 for LOD = 5.



View larger version (31K):
In this window
In a new window
Download PPT slide
 
Figure 2. Functional relationships of the detected QTL parameters to the true parametric values under various sample sizes and LOD score critical values in BC mapping. The diagonal lines represent the relationship between the parameter of detected QTL and the true parametric value when the sample size is infinitely large (unbiased relationship). Note that as the LOD criterion increases, the bias associated with detecting a very small QTL (a -> 0) becomes worse but the probability of detecting such a QTL becomes much smaller (not shown). Note that in the Beavis experiment, LOD = 2.5 and n = 100, 500, 1000.

The functional relationships between h2Q and the sample size under various LOD criteria and h2Q are shown in the right column of Fig 3. The corresponding plots for the effects are given in the left column. When n >= 200 and h2Q >= 0.10, all the biases are negligible (within 10% of the true value), regardless of the LOD criterion. For small h2Q, a large sample size, even as large as n = 500, is not sufficient to eliminate the bias, again consistent with results of the Beavis experiment (BEAVIS 1998 Down).



View larger version (32K):
In this window
In a new window
Download PPT slide
 
Figure 3. Changes in effect and variance of detected QTL as a function of their true parametric values as sample size increases for various sizes of QTL and LOD score critical values in BC mapping.

If the average estimated h2Q is 0.14 among all the experiments surveyed where the average sample size is ~100 and the average LOD criterion is ~3, the true h2Q may actually be zero (found from the second panel from the top of the right column of Fig 2). If h2Q is 0.15 (just a slight increase), however, the true h2Q is ~0.08. If , the true h2Q is about the same as h2Q; i.e., very small bias is expected. From these graphs or using Equation 17, one can find the true h2Q retrospectively from h2Q for all other settings.

We now use the parameter values of the Beavis experiment to compare the biases with those observed by BEAVIS 1998 Down. Table 4 shows the original data reported by Beavis as well as the predicted biases for both the QTL effects and their variances. The average QTL position is in the middle of a 20-cM interval. However, for an interval this short, the position of a detected QTL rarely coincided with the true position. We observed that the detected position within the interval varied almost uniformly across the interval. Therefore, we choose an average as the estimated variance of x, where 0.5 is the variance when the QTL is completely linked with a marker and 0.4 is the variance when the QTL is exactly in the middle of the interval. The agreements between the observed and the predicted biases, judged by the differences between the two, are quite good except when the sample size is small (n = 100). However, the percentage differences, (observed-predicted)/predicted, show an opposite trend with >8% discrepancies in the variance observed only when n = 1000. The percentage difference may be misleading, however, as the predicted effects become smaller and more sensitive to error with increasing sample size and because, ultimately, one is interested in the absolute error made in inferring the effects of QTL. The large absolute deviations of the predicted biases from the observed values for small sample size may be explained as follows. The critical value of LOD = 2.5 was used by Beavis for a test statistic involving both the additive and dominance effects. Our prediction, however, was based on a test statistic for additive effects only. For small samples, some QTL were detected due to large estimated dominance effects, even though the dominance was absent in the simulations (see Table 10.5 of BEAVIS 1998 Down). To verify this, we redid the simulations for all the cases with n = 100 and used the pure additive model (dominance was included in neither the simulation nor the model). We used our own interval-mapping program known as QTL-BY-SAS to detect QTL (XU and XU 2003 Down). Here, we replicated the experiment 500 times, instead of 200 times. The results are listed in Table 5, which shows very good agreement between the predicted and observed biases. Even though the LOD test statistic of the Beavis experiment involved 2 d.f. (additive and dominance), our model still predicted the biases quite accurately when the sample size was relatively large (n >= 500). This was because for large samples, the LOD score test statistic was determined primarily by the additive effect. The 2-d.f. LOD score and 1-d.f. LOD score were virtually identical.


 
View this table:
In this window
In a new window

 
Table 4. Comparisons of predicted and observed (estimated) biases in estimated QTL effects and variances from Beavis F2 simulation experiments


 
View this table:
In this window
In a new window

 
Table 5. Comparisons of predicted and estimated biases in estimated QTL effects and variances from our own F2 simulation experiments at sample size 100


*  DISCUSSION
*TOP
*ABSTRACT
*STATISTICAL THEORY
*NUMERICAL EVALUATION
*DISCUSSION
*LITERATURE CITED

The Beavis effect describes a phenomenon that occurred in the Beavis experiment where all QTL were simulated to have the same effect and distributed independently throughout the genome. The average effect of the detected QTL was biased upward due to censoring. It is more likely that QTL effects vary across the genome and the distribution of the QTL effects may be described by a negative exponential distribution (XU 2003 Down). In addition, some QTL may be linked within the same chromosomes and thus they do not segregate independently. On the one hand, the Beavis effect will cause the estimated number of QTL to be biased downward (BEAVIS 1994 Down, BEAVIS 1998 Down; OTTO and JONES 2000 Down), because the undetected QTL are not reported. On the other hand, the average effect of the detected QTL will be biased upward.

Using the Beavis effect to interpret results of a meta-analysis of QTL mapping is more straightforward. If a QTL mapping experiment can be repeated many times, the average effect of a chromosome location calculated only from the significant replicates will definitely be biased unless this QTL is detected in all replicates. If one considers incorporating a particular marker into a marker-assisted selection program for an economically important quantitative trait, the Beavis effect will affect the decision. The investigator may decide to search the literature to see how much genetic variance is accounted for by this marker from all published experiments.

The theory developed here helps predict the potential bias in the estimated effect of QTL. The theory may also be used to correct the bias but should be used with caution. Let be the average of the QTL effect estimated from N replicated experiments. To correct the bias, we may simply substitute the expectation given in Equation 17 by the observed average,

(20)

and solve for h2Q to obtain the unbiased estimate of h2Q. The equation is highly nonlinear, but the solution can be easily solved numerically. Unfortunately, the solution is very sensitive to the sample size (n). It works only when n is sufficiently large, say n >= 500. For a single experiment, we have only one estimate and the above equation becomes

(21)

The solution is even more sensitive to n. Therefore, the theoretical prediction of the Beavis effect may not be used retrospectively to correct for the bias when the sample size is small. The correction is necessary when the estimate is obtained by summarizing the results of many experiments. In that case, we should incorporate not only the mean of the estimates but also the variance of these estimates. The optimal method is the maximum-likelihood method that treats the estimates of the multiple experiments as censored data and infers (or recovers) the parameter of the uncensored data (COHEN 1991 Down).

The theory developed herein applies to segregating populations with two alternative genotypes. For populations with more than two alternative genotypes, e.g., F2, the model is restricted to either the additive or the dominance model but not both. This is because the test statistic utilized is the z-test statistic, which is a 1-d.f. test. Further investigation is necessary to predict the biases in both the additive and dominance effects using a 2-d.f. test. Similar extensions can be made for a test with >2 d.f., e.g., four-way crosses (XU 1996 Down) and diallel crosses (REBAI and GOFFINET 1993 Down).

The variance of the genotype indicator (x) determines the estimation error of the QTL effect and thus plays an important role in the Beavis effect. When the QTL under investigation is tightly linked to a fully informative marker, {sigma}2x is a constant depending on the filial relationships among progeny. If the marker is not fully informative, e.g., a certain percentage of individuals have missing genotypes at the marker, the variance {sigma}2x will decrease and this information loss will have the same ultimate effect as increasing the interval between markers when the QTL is not at the marker. We use {sigma}2x conditional on not only the type of progeny but also the marker information content and the QTL position. This conditional variance is denoted by {sigma}2 and it is used whenever the QTL genotype is not directly observable.

The theory of the Beavis effect is derived on the basis of a single QTL model. However, it applies approximately to multiple QTL experiments, as shown in the Beavis experiment where multiple QTL were actually simulated (Table 4 and Table 5). When we predict the bias, the residual variance includes the environmental variance plus the sum of the variances of the remaining QTL excluding the one in question. For example, when we predicted the bias for the QTL explaining 3% of the phenotypic variance, the residual variance was chosen as 100 - 3 = 97, where 100 is the total phenotypic variance. If the data were analyzed by a multiple-QTL model, e.g., composite interval mapping (JANSEN 1993 Down; ZENG 1994 Down), the residual variance should include only the environmental variance given that the remaining QTL have been included in the model. When QTL are linked, the situation will be complicated and further investigation is necessary.


*  ACKNOWLEDGMENTS

The author thanks Dr. Chenwu Xu for his help in the simulating experiment. The author also appreciates editor S. Otto and two anonymous reviewers for their constructive comments and suggestions on the manuscript. This work was supported by the National Institutes of Health grant R01-GM55321 and the U.S. Department of Agriculture National Research Initiative Competitive Grants Program 00-35300-9245.

Manuscript received November 11, 2002; Accepted for publication August 26, 2003.


*  LITERATURE CITED
*TOP
*ABSTRACT
*STATISTICAL THEORY
*NUMERICAL EVALUATION
*DISCUSSION
*LITERATURE CITED

BEAVIS, W. D., 1994 The power and deceit of QTL experiments: lessons from comparitive QTL studies, pp. 250–266 in Proceedings of the Forty-Ninth Annual Corn & Sorghum Industry Research Conference. American Seed Trade Association, Washington, DC.

BEAVIS, W. D., 1998 QTL analyses: power, precision, and accuracy, pp. 145–162 in Molecular Dissection of Complex Traits, edited by A. H. PATERSON. CRC Press, New York.

COHEN, A. C., 1991 Truncated and Censored Samples. Marcel Dekker, New York.

HALEY, C. S. and S. A. KNOTT, 1992  A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69:315-324.[Medline]

HAYES, B. and M. E. GODDARD, 2001  The distribution of the effects of genes affecting quantitative traits in livestock. Genet. Sel. Evol. 33:209-229.[Medline]

JANSEN, R. C., 1993  Interval mapping of multiple quantitative trait loci. Genetics 135:205-211.[Abstract]

LANDE, R., 1981  The minimum number of genes contributing to quantitative variation between and within populations. Genetics 99:541-553.[Abstract/Free Full Text]

LANDE, R. and R. THOMPSON, 1990  Efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics 124:743-756.[Abstract]

LUO, L., C. MAO, and S. XU, 2003  Correcting the bias in estimation of genetic variances contributed by individual QTL. Genetica 119:107-113.[Medline]

LYNCH, M., and B. WALSH, 1998 Genetics and Analysis of Quantitative Traits. Sinauer Associates, Sunderland, MA.

OTTO, S. P. and C. D. JONES, 2000  Detecting the undetected: estimating the total number of loci underlying a quantitative trait. Genetics 156:2093-2107.[Abstract/Free Full Text]

REBAI, A. and B. GOFFINET, 1993  Power of tests for QTL detection using replicated progenies derived from a diallel cross. Theor. Appl. Genet. 86:1014-1022.

SILLANPAA, M. J. and E. ARJAS, 1998  Bayesian mapping of multiple quantitative trait loci from incomplete inbred line cross data. Genetics 148:1373-1388.[Abstract/Free Full Text]

XU, C., and S. XU, 2003 A SAS/IML program for mapping QTL in line crosses, pp. 1–6 in Proceedings of the Twenty-Eighth Annual SAS Users Group International Conference (SUGI 28). SAS Institute, Cary, NC.

XU, S., 1996  Mapping quantitative trait loci using four-way crosses. Genet. Res. 68:175-181.

XU, S., 2003  Estimating polygenic effects using markers of the entire genome. Genetics 163:789-801.[Abstract/Free Full Text]

ZENG, Z-B., 1992  Correcting the bias of Wright estimates of the number of genes affecting a quantitative character—a further improved method. Genetics 131:987-1001.[Abstract]

ZENG, Z-B., 1994  Precision mapping of quantitative trait loci. Genetics 136:1457-1468.[Abstract]




This article has been cited by other articles:


Home page
GeneticsHome page
A. R. Whiteley, N. Derome, S. M. Rogers, J. St-Cyr, J. Laroche, A. Labbe, A. Nolte, S. Renaut, J. Jeukens, and L. Bernatchez
The Phenomics and Expression Quantitative Trait Locus Mapping of Brain Transcriptomes Regulating Adaptive Divergence in Lake Whitefish Species Pairs (Coregonus sp.)
Genetics, September 1, 2008; 180(1): 147 - 164.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
P. K. Ingvarsson, M. V. Garcia, V. Luquez, D. Hall, and S. Jansson
Nucleotide Polymorphism and Phenotypic Associations Within and Around the phytochrome B2 Locus in European Aspen (Populus tremula, Salicaceae)
Genetics, April 1, 2008; 178(4): 2217 - 2226.
[Abstract] [Full Text] [PDF]


Home page
J. Clin. Endocrinol. Metab.Home page
B. Raz, M. Janner, V. Petkovic, D. Lochmatter, A. Eble, M. T. Dattani, P. C. Hindmarsh, C. E. Fluck, and P. E. Mullis
Influence of Growth Hormone (GH) Receptor Deletion of Exon 3 and Full-Length Isoforms on GH Response and Final Height in Patients with Severe GH Deficiency
J. Clin. Endocrinol. Metab., March 1, 2008; 93(3): 974 - 980.
[Abstract] [Full Text] [PDF]


Home page
Crop Sci.Home page
R. Tuberosa, S. Salvi, S. Giuliani, M. C. Sanguineti, M. Bellotti, S. Conti, and P. Landi
Genome-wide Approaches to Investigate and Improve Maize Response to Drought
Crop Sci., December 18, 2007; 47(Supplement_3): S-120 - S-141.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
S. Zhong and J.-L. Jannink
Using Quantitative Trait Loci Results to Discriminate Among Crosses on the Basis of Their Progeny Mean and Variance
Genetics, September 1, 2007; 177(1): 567 - 576.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
G. Willems, D. B. Drager, M. Courbot, C. Gode, N. Verbruggen, and P. Saumitou-Laprade
The Genetic Basis of Zinc Tolerance in the Metallophyte Arabidopsis halleri ssp. halleri (Brassicaceae): An Analysis of Quantitative Trait Loci
Genetics, May 1, 2007; 176(1): 659 - 674.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
S. Xu and Z. Jia
Genomewide Analysis of Epistatic Effects for Quantitative Traits in Barley
Genetics, April 1, 2007; 175(4): 1955 - 1963.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
J. J. B. Keurentjes, L. Bentsink, C. Alonso-Blanco, C. J. Hanhart, H. Blankestijn-De Vries, S. Effgen, D. Vreugdenhil, and M. Koornneef
Development of a Near-Isogenic Line Population of Arabidopsis thaliana and Comparison of Mapping Power With a Recombinant Inbred Line Population
Genetics, February 1, 2007; 175(2): 891 - 905.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
S. H. Lee and J. H. J. Van der Werf
Using Dominance Relationship Coefficients Based on Linkage Disequilibrium and Linkage With a General Complex Pedigree to Increase Mapping Resolution
Genetics, October 1, 2006; 174(2): 1009 - 1016.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
R. Yang, Q. Tian, and S. Xu
Mapping Quantitative Trait Loci for Longitudinal Traits in Line Crosses
Genetics, August 1, 2006; 173(4): 2339 - 2356.
[Abstract] [Full Text] [PDF]


Home page
GeneticsHome page
N. H. Martin, A. C. Bouck, and M. L. Arnold
Detecting Adaptive Trait Introgression Between Iris fulva and I. brevicaulis in Highly Selective Field Conditions
Genetics, April 1, 2006; 172(4): 2481 - 2489.
[Abstract] [Full Text] [PDF]