Genetic and Nongenetic Bases for the L-Shaped Distribution of Quantitative Trait Loci Effects
Bruno Bost, Dominique de Vienne, Frédéric Hospital, Laurence Moreau, and Christine Dillmann
Station de Génétique Végétale, INRA/UPS/INA P-G, F-91190 Gif sur Yvette, France
Corresponding author: Christine Dillmann, Station de Génétique Végétale, INRA/UPS/INA P-G, Ferme du Moulon, F-91190 Gif sur Yvette, France. E-mail: dillmann{at}moulon.inra.fr
Communicating editor: P. D. KEIGHTLEY
| ABSTRACT |
|---|
The L-shaped distribution of estimated QTL effects (R2) has long been reported. We recently showed that a metabolic mechanism could account for this phenomenon. But other nonexclusive genetic or nongenetic causes may contribute to generate such a distribution. Using analysis and simulations of an additive genetic model, we show that linkage disequilibrium between QTL, low heritability, and small population size may also be involved, regardless of the gene effect distribution. In addition, a comparison of the additive and metabolic genetic models revealed that estimates of the QTL effects for traits proportional to metabolic flux are far less robust than for additive traits. However, in both models the highest R2's repeatedly correspond to the same set of QTL.
WITH the use of molecular markers, mapping of loci involved in quantitative variation, i.e., quantitative trait loci (QTL; ![]()
20 marker loci and 82 traits in two F2 populations of maize, each of about 1900 individuals. With a type I error of 5%, they found 2460 significant associations, with a very L-shaped distribution of the fractions of phenotypic variance explained by the detected QTL (R2). The maximum R2 value was 16.3%, and 94.5% of the associations exhibited R2 values <5%, with a minimum at 0.3%. Other examples can be found in the literature (e.g., ![]()
These L-shaped distributions of QTL effects could partly be due to several statistical artifacts. Most of the results cited above are compilations from several traits and/or several populations/environments and result in a mixture of possibly different distributions. Moreover, QTL effects expressed as fractions of phenotypic variation are proportional to the square of genetic effects, and thus even a normal distribution of true genetic effects would appear skewed toward the smallest values at the R2 level. Finally, the experimental distributions of QTL effects only concern detected QTL and may be misleading. First, the distribution of estimated R2 is typically L-shaped when truncated at a given threshold significance value. Second, undetected QTL can inflate the effect of linked detected QTL, as shown very early by ![]()
However, a survey of the literature shows that for various traits the same major QTL can be found in different populations or environments, which seems quite unlikely in case of erroneous estimates of the QTL effects. For example, in two different F2 populations derived from crosses between maize and teosinte, ![]()
From the biological point of view, there is no reason for a discontinuity between all-or-null (wild-type/mutant) variation and quantitative variation: a continuity between high-effect QTL and low-effect QTL is rather expected. Intermediate cases are found: for example, in pea, a major gene for Ascochyta blight resistance was mapped on chromosome 4 using both a QTL detection approach and a Mendelian analysis after partitioning the distribution of the resistance in the progeny into two classes (![]()
The issue of this article is to determine whether an apparent L-shaped distribution of estimated QTL effects allows us to make inferences about the true underlying distribution of gene effects, or not. In a recent article, we showed that for any trait proportional to a flux through a linear metabolic pathway at the steady state, the L-shaped distribution is expected as a consequence of the summation property of the control coefficients (![]()
| METHODS |
|---|
Definitions:
Strictly speaking, the "effect" of a locus is classically described by the difference between genetic values of alleles in a population (additive effect). Nevertheless, the "QTL effect" is often referred to in the literature as the fraction of phenotypic variance explained by the QTL (e.g., ![]()
- i. The "additive allelic effect" ($a_q$) is half the difference between the genetic values of "high" and "low" homozygous genotypes at locus q (we assumed no epistasis).
- ii. The "true QTL effect" ($r^2_q$) is the fraction of phenotypic variance explained by the QTL in the model used to decompose the effects,

  $$r^2_q = h^2_b \, \frac{\sigma^2_q}{\sigma^2_G},$$

  where $h^2_b$ is the broad sense heritability of the trait, $\sigma^2_G$ is the total genetic variance of the trait in the population, and $\sigma^2_q$ is the genetic variance contributed by QTL q in the model.
- iii. The "estimated QTL effect" ($R^2_q$) is the statistic that estimates $r^2_q$ in the experiment,

  $$R^2_q = \frac{SS_q}{SST},$$

  where $SS_q$ is the sum of squares corresponding to the QTL q and SST is the total sum of squares.
Model:
We consider a cross between two inbred lines and the F2 population obtained by selfing their F1 hybrid. In this population, 50 polymorphic QTL determine the value of a quantitative trait, with two alleles, "high" or "low," for each QTL, no dominance, and no epistasis. The genetic value, $g_{qi}$, of individual i at locus q in the F2 population is

$$g_{qi} = m_q + (n_{qi} - 1)\,a_q, \tag{1}$$

where $m_q$ is the midhomozygote value at locus q, $n_{qi}$ is the number of high alleles (0, 1, or 2) of individual i at locus q, and $a_q$ is the additive allelic effect at locus q. The genetic value of individual i for the trait is computed under an additive model (ADD) as

$$G^A_i = \sum_{q=1}^{50} g_{qi}, \tag{2}$$
or under a metabolic model (MET) where the quantitative trait is proportional to a flux through a linear metabolic pathway at the steady state, as
![]() |
(3) |
where K is a constant that characterizes the metabolic pathway (![]()
qi depends on the maximal velocity and on the Michaelis constant of enzyme q in individual i (![]()
We compared four distributions of additive allelic effects (aq) among the loci: (i) constant distribution, with all the aq having the same value; (ii) normal distribution; (iii) exponential distribution, with the mode of the distribution corresponding to high aq values [the probability density function is $f(x) = \frac{1}{\sigma} \exp\left(-\frac{a_{max} - x}{\sigma}\right)$ for $x \le a_{max}$, where $\sigma$ is the standard deviation of the distribution and $a_{max}$ is the maximal aq value]; and (iv) uniform distribution, where aq can take any value between amin and amax with the same probability. Whatever the distribution, the mq are kept constant and identical across loci. To compare the additive and metabolic models, mq values, as well as the parameters of the aq distributions, are fitted in the additive model to get approximately the same genetic coefficient of variation as in the metabolic model (Table 1). For each distribution, a fixed sample of 50 aq values was used in all simulations. The distributions of aq values in those samples are presented in the first column of Fig 1.
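The four distributions of allelic effects can be sketched in a few lines of Python. This is only an illustration: the function name, the default parameter values, and the choice to redraw nonpositive normal draws are assumptions made here, not the settings of Table 1. The exponential case is written as a reflected exponential, so that the mode sits at the maximal effect.

```python
import random

def sample_allelic_effects(kind, n_loci=50, a_min=0.0, a_max=1.0,
                           mean=0.5, sd=0.15, seed=0):
    """Draw n_loci additive allelic effects a_q (hypothetical parameter values)."""
    rng = random.Random(seed)
    if kind == "constant":
        # (i) every locus has the same effect
        return [mean] * n_loci
    if kind == "normal":
        # (ii) normal effects; nonpositive draws are redrawn (assumption)
        effects = []
        while len(effects) < n_loci:
            a = rng.gauss(mean, sd)
            if a > 0:
                effects.append(a)
        return effects
    if kind == "exponential":
        # (iii) reflected exponential: mode at a_max, standard deviation sd
        # (very large draws could fall below zero; not handled here)
        return [a_max - rng.expovariate(1.0 / sd) for _ in range(n_loci)]
    if kind == "uniform":
        # (iv) any value between a_min and a_max with equal probability
        return [rng.uniform(a_min, a_max) for _ in range(n_loci)]
    raise ValueError("unknown distribution: " + kind)
```

Fixing the seed mirrors the use of one fixed sample of 50 effects per distribution across all simulations.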
The phenotypic value $z_i$ of each F2 individual was computed by adding an environmental effect ($\varepsilon_i$) to the genetic value $G_i$,

$$z_i = G_i + \varepsilon_i, \tag{4}$$

where $\varepsilon_i$ is randomly drawn from a normal distribution with mean zero and variance $\sigma^2_E$. $\sigma^2_E$ is computed from the broad sense heritability of the trait ($h^2_b$) and the total genetic variance in the F2 population ($\sigma^2_G$):

$$\sigma^2_E = \sigma^2_G \, \frac{1 - h^2_b}{h^2_b}. \tag{5}$$
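As a minimal sketch of this step, the environmental variance can be derived from the heritability and the realized genetic variance, and the deviations then added to the genetic values. The function names are hypothetical, and the genetic variance is computed here with a divisor of N (an assumption; the source does not state which divisor was used).

```python
import random

def environmental_variance(sigma2_g, h2b):
    """Environmental variance giving heritability h2b for genetic variance sigma2_g."""
    if not 0.0 < h2b <= 1.0:
        raise ValueError("h2b must be in (0, 1]")
    return sigma2_g * (1.0 - h2b) / h2b

def add_environment(genetic_values, h2b, rng):
    """Return phenotypic values z_i = G_i + e_i with e_i ~ N(0, sigma2_E)."""
    n = len(genetic_values)
    mean_g = sum(genetic_values) / n
    # total genetic variance of the sample (divisor n: an assumption)
    sigma2_g = sum((g - mean_g) ** 2 for g in genetic_values) / n
    sd_e = environmental_variance(sigma2_g, h2b) ** 0.5
    return [g + rng.gauss(0.0, sd_e) for g in genetic_values]
```

With h2b = 0.5 the environmental variance equals the genetic variance, and with h2b = 1 the phenotypic values coincide with the genetic values.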
As defined previously, R2q = SSq/SST, where SSq is the sum of squares corresponding to the QTL q, and SST is the total sum of squares of the model. With ANOVA, SSq is computed straightforwardly, and the residuals of the model contain all the variation due to segregation at the other QTL, as well as the random environmental deviation. It is well known that, in this case,

$$R^2_q = \frac{(\nu - 1)F}{(\nu - 1)F + N - \nu}, \tag{6}$$

where N is the population size, $\nu$ is the number of genotypic classes at QTL q (in an F2 population, $\nu = 3$), and F is the test statistic for the effect of QTL q. F follows a noncentral Fisher distribution $\mathcal{F}(\nu - 1, N - \nu, \lambda_q)$, with

being the noncentrality parameter (![]()
When several QTL are taken simultaneously into account via multiple regression, a relationship similar to the one in Equation 6 is found between the global R square of the model and the corresponding F-statistics. However, the partial sum of squares for QTL q (SS*q) takes into account all the other QTL of the model, and the SST is not equal to the sum of all SS*q plus the residual sum of squares (SSR*; see Appendix A for details). Hence, there is no simple relationship between F and R2q, and the theoretical distribution of R2q values is unknown.
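The one-QTL relationship between R2 and F can be checked numerically. The sketch below is a plain one-way ANOVA written for illustration (function names are hypothetical); it computes the sums of squares directly and compares SSq/SST with the value obtained from F, the sample size, and the number of genotypic classes.

```python
def one_way_anova(groups):
    """groups: one list of phenotypic values per genotypic class.
    Returns (SSq, SSR, SST, F, R2)."""
    values = [z for g in groups for z in g]
    n, nu = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    # between-class sum of squares (QTL effect)
    ss_q = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # within-class (residual) sum of squares
    ss_r = sum((z - m) ** 2 for g, m in zip(groups, means) for z in g)
    ss_t = ss_q + ss_r
    f_stat = (ss_q / (nu - 1)) / (ss_r / (n - nu))
    return ss_q, ss_r, ss_t, f_stat, ss_q / ss_t

def r2_from_f(f_stat, n, nu=3):
    """R2 recovered from F, the population size n, and the number of classes nu."""
    return (nu - 1) * f_stat / ((nu - 1) * f_stat + n - nu)
```

In the one-way case SST = SSq + SSR exactly, which is why the identity holds; as noted above, it has no counterpart for partial sums of squares in multiple regression.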
Genetic maps and parental genotypes:
Linkage disequilibrium between QTL depends on both the genetic map and the genotype of the parental inbred lines. We considered four different genetic structures for the pairs of parental inbred lines. The pair RandomU had independent QTL and random gametic phase: the genotype of one parent at each locus was drawn at random, so that the parental gametic phase (succession of high and low alleles along the genome) was random; the other parent had the complementary succession of high and low alleles, since all the loci are polymorphic. The pair RandomL had linked QTL and random parental gametic phase. The pair CouplingL had linked QTL that were in coupling, with one parent having the low alleles at all loci and the other parent having the high alleles at all loci. The pair RepulsionL had linked QTL that were in repulsion, with an alternation of high and low alleles along the chromosomes in each parent. The same genetic map (set of QTL locations on chromosomes) was used for all the simulations involving linked QTL, where QTL locations were randomly spread over 10 chromosomes of 200 cM each.
Simulations:
For each situation, we performed Monte Carlo simulations to obtain 100 replicates of the F2 population. Each replicate consisted of the following sequence:
- N F2 individuals were drawn at random by selfing the F1 hybrid (N = 200, 500, 1000, 5000), assuming recombinations with no interference (HOSPITAL and CHEVALET 1993).
- The genetic values GAi or GMi of all the F2 individuals were computed. The total genetic variance ($\sigma^2_G$) was then computed, allowing us to compute the environmental variance ($\sigma^2_E$) from the given broad sense heritability (h2b), following Equation 5. Then the random environmental deviations were added to get phenotypic values of the F2 individuals.
- Finally, multiple regression of the phenotypic values was directly performed on the genotype at each of the 50 QTL. The fraction of the phenotypic variation explained by the qth QTL in the kth sample of a given F2 population was estimated as $R^2_{qk} = SS^*_{qk}/SST_k$. Sums of squares were computed with SAS GLM procedures (SAS INSTITUTE 1988).
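The first step of each replicate, drawing F2 individuals from the F1 with recombination and no interference, might be sketched as follows. This is not the authors' program: it assumes a single chromosome, uses the Haldane map function to convert map distances into recombination fractions, and all function names are hypothetical.

```python
import math
import random

def haldane_recombination(distance_cm):
    """Recombination fraction for a map distance in cM (no interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))

def make_gamete(hap_a, hap_b, positions_cm, rng):
    """One gamete of the F1 hybrid carrying parental haplotypes hap_a and hap_b.
    Alleles are coded 1 (high) or 0 (low); positions_cm must be sorted."""
    haplotypes = (hap_a, hap_b)
    current = rng.choice((0, 1))              # strand copied at the first locus
    gamete = [haplotypes[current][0]]
    for k in range(1, len(positions_cm)):
        r = haldane_recombination(positions_cm[k] - positions_cm[k - 1])
        if rng.random() < r:                  # odd number of crossovers: switch strand
            current = 1 - current
        gamete.append(haplotypes[current][k])
    return gamete

def make_f2_individual(hap_a, hap_b, positions_cm, rng):
    """Number of high alleles (0, 1, or 2) at each QTL after selfing the F1."""
    g1 = make_gamete(hap_a, hap_b, positions_cm, rng)
    g2 = make_gamete(hap_a, hap_b, positions_cm, rng)
    return [x + y for x, y in zip(g1, g2)]
```

A coupling pair corresponds to hap_a = [1, 1, ...] and hap_b = [0, 0, ...], a repulsion pair to alternating alleles, and unlinked QTL can be approximated by very large distances between loci.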
Distribution of estimated QTL effects (R2q):
The 50 x 100 = 5000 R2qk values obtained for each genetic model were used to characterize the distribution of estimated QTL effects. The shape of the distributions was first characterized by their skewness, Skew (![]()
As each QTL is known in our model, we should explain 100% of the phenotypic variation by adding the effects of each QTL. Thus we computed the proportion of the phenotypic variation explained by the model, averaged over the 100 replicates ($\overline{\Sigma R^2}$):

$$\overline{\Sigma R^2} = \frac{1}{100} \sum_{k=1}^{100} \sum_{q=1}^{50} R^2_{qk}$$
Finally, to measure the repeatability of the ranking of the estimated QTL effects among the replicates, we ranked the 50 QTL in each replicate according to their R2 values and computed three parameters: (i) Kendall's concordance coefficient, W (![]()
| RESULTS |
|---|
Distribution of estimated QTL effects:
We used simulations to study the factors likely to influence the distribution of the estimated effects (R2) of a set of 50 known QTL controlling an additive trait in an F2 population: distribution of the additive allelic effects (aq) of the underlying genes, linkage disequilibrium, parental gametic phase (coupling or repulsion), environmental effects, and population size. In addition to the simulations, we performed analytical calculations on simpler situations to explain the mechanisms involved for the different factors. The details of these analytical calculations are given in Appendixes A and B.
Independent QTL, without environmental variation:
In that case, the only source of variation for a given QTL is the sampling of individuals in the population, which leads to some deviations of R2 around the true QTL effect r2. At each QTL, the observed proportions of the three different genotypes randomly deviate from the theoretical (1:2:1) proportions. It is easy to show, using the notations introduced in Appendix B, that this sampling of individuals leads to an underestimation of the variance contributed by the QTL,

where $\hat{f}_0$ and $\hat{f}_2$ are the observed frequencies of the homozygote genotypes, and $\sigma^2_q$ is the genetic variance contributed by QTL q in the infinite F2 population. As shown in Table 2, a consequence is that the average sum of R2 ($\overline{\Sigma R^2}$) is always <100%, even with a population of 1000 individuals (between 95.2 and 96.3%). $\overline{\Sigma R^2}$ increases when population size increases and tends toward 100% for a population of 5000 individuals.
When underlying genes have identical additive allelic effects, the true effect (r2) is obviously the same for all the QTL. As shown in Fig 1 (Constant, RandomU), the random deviations of R2 around r2 are moderate with N = 1000 individuals. When underlying genes have nonidentical effects, the distribution of R2 values roughly corresponds to the distribution of true QTL effects (Fig 1, RandomU).
Linked QTL, without environmental variation: When the QTL are randomly located over the genome, linkage between QTL occurs, which consistently results in L-shaped distributions of R2 values, regardless of the distribution of additive allelic effects (Fig 1, RandomL). The data in Table 2 confirm this tendency: the values of skewness (Skew) are high and positive, and PCT values show that most (between 53 and 100%) of the R2 values are <2%.
To explain these observations, we calculated the values of true QTL effects (r2) in the simple case of three linked QTL (Q1, Q2, and Q3; see Appendix B for details). Whether the detection method is one-way ANOVA or multiple regression as used in our simulations, the r2 values appear to depend on both additive allelic effects (aQ1, aQ2, and aQ3) and linkage disequilibria (D12, D23, and D13) between QTL Q1, Q2, and Q3 in the F2 population. The total genetic variance in the F2 population ($\sigma^2_G$) is
![]() |
(7) |
(Appendix B). Thus, for given values of the additive allelic effects, the total genetic variance increases with coupling (D > 0) and decreases with repulsion (D < 0).
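The effect of gametic phase on the total genetic variance can be illustrated with a reduced case of two linked additive QTL, which is simpler than the three-QTL derivation of Appendix B. The sketch below enumerates the gametes of the F1 for a recombination fraction r and computes the exact genetic variance of an infinite F2; the function name and the two-locus simplification are assumptions made for illustration.

```python
from itertools import product

def f2_genetic_variance(a1, a2, r, coupling=True):
    """Exact variance of G = a1*(n1 - 1) + a2*(n2 - 1) in an infinite F2,
    where n1 and n2 are the numbers of high alleles at two linked QTL."""
    parental, recombinant = (1.0 - r) / 2.0, r / 2.0
    if coupling:     # parental haplotypes high-high and low-low
        gametes = {(1, 1): parental, (0, 0): parental,
                   (1, 0): recombinant, (0, 1): recombinant}
    else:            # repulsion: parental haplotypes high-low and low-high
        gametes = {(1, 0): parental, (0, 1): parental,
                   (1, 1): recombinant, (0, 0): recombinant}
    mean = second_moment = 0.0
    # an F2 individual is the union of two independent F1 gametes
    for (g, pg), (h, ph) in product(gametes.items(), repeat=2):
        value = a1 * (g[0] + h[0] - 1) + a2 * (g[1] + h[1] - 1)
        mean += pg * ph * value
        second_moment += pg * ph * value ** 2
    return second_moment - mean ** 2
```

For two loci the result reduces to (a1^2 + a2^2)/2 plus a1 a2 (1 - 2r) in coupling and minus the same term in repulsion, so the variance of unlinked loci (r = 0.5) lies between the two.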
With ANOVA, we have
![]() |
(8) |
![]() |
(9) |
![]() |
(10) |
(Appendix B). Thus, the genetic variances contributed by linked QTL (numerators of Equation 8, Equation 9, and Equation 10) increase with coupling and decrease with repulsion.
With multiple regression, we have
![]() |
(11) |
![]() |
(12) |
![]() |
(13) |
(Appendix B). Thus, in the multiple regression model, whatever the gametic phase (coupling or repulsion), linkage disequilibrium decreases the genetic variance contributed by linked QTL (numerators of Equation 11, Equation 12, and Equation 13). In this case, if all the additive allelic effects are identical, the QTL with the highest r2 are the independent ones (D = 0). However, the sum of individual r2 still depends on the gametic phase. From Equation 7 and Equations 11, 12, and 13, it is expected to be <100% in case of coupling (D > 0) and >100% in case of repulsion (D < 0). This is consistent with our simulations when we compare the sum, $\overline{\Sigma R^2}$, of the estimated QTL effects (R2) between CouplingL and RepulsionL parental inbred lines (Table 2). With equal additive allelic effects (constant), the CouplingL pair gives $\overline{\Sigma R^2}$ = 14.0%, while the RepulsionL pair gives $\overline{\Sigma R^2}$ = 125.5%. The sum is intermediate (≈50.0%) with RandomL parental inbred lines.
In general, the genetic variance of a QTL q should also depend on linkage disequilibria between QTL q and all the QTL linked to q, or linkage disequilibria of higher order. However, if the genetic variance contributed by the QTL q is computed by multiple regression, we showed that it depends only on the first order linkage disequilibria between QTL q and its nearest neighbors, the QTL q - 1 and q + 1 (Equation 11 and Equation 13), in the case of an F2 population. Thus, the flanking QTL tend to absorb the effects of all nearby QTL. ![]()
In our simulations, we investigated the relationship between R2 and the squared linkage disequilibrium value between each QTL and its two nearest neighbors, in the F2 population resulting from the cross between RandomL parental inbred lines, and with identical additive allelic effects for all the 50 QTL. A nonlinear multiple regression of R2 values on squared linkage disequilibria was performed on these data to check Equation 12. The regression showed that the simulation results closely fit the analytical results: the determination coefficient of the regression is 99.9%, and the parameter estimates are very close to the theoretical coefficients in Equation 12. This result is illustrated in Fig 2. This observation confirms that the r2 value of a given QTL depends only on pairwise linkage disequilibria between the QTL and its nearest neighbors. Hence, unequal map distances between QTL, which result in an L-shaped distribution of squared linkage disequilibria between QTL, should contribute to the L-shaped distribution of the estimated QTL effects. On the other hand, as QTL effects are confounded with the genetic distances between the QTL and its two neighbors, equally spaced QTL should result in two different R2 values: one for QTL flanked by two other QTL and one for QTL at the beginning or the end of the chromosomes. Simulations with equally spaced QTL indeed showed that all the QTL have the same R2 values, once the QTL located at the ends of the chromosomes were excluded (not shown). In the general case, the true QTL effects also depend on the additive allelic effects of each QTL. Relying on Equations 11, 12, and 13, we expect QTL with the highest r2 in a given population to be either independent QTL or QTL with high additive allelic effect. This point was confirmed by checking the R2 values of individual QTL in the simulations (not shown).
Environmental variation and reduced population size: When there is environmental variation on the trait (h2b < 1), or small sample size, the distribution of R2 is again L-shaped, whatever the distribution of additive allelic effects, and whether there is linkage or not: all Skew values are significant and positive, and PCT values are between 86 and 100% (Table 2).
To explain this phenomenon, we considered the simplest case of one-QTL analysis of variance (Appendix A). Using Equation 6, we performed numerical computations of the distribution of R2 from the distribution of F. We consider Q independent QTL with the same additive allelic effect on an additive trait. The total genetic variance is $\sigma^2_G = Q\sigma^2_q$, and the percentage of phenotypic variance explained by each QTL is $r^2_q = h^2_b/Q$. Thus the noncentrality parameter for the distribution of F is

For each R2 value, the probability P(X ≤ R2) = P(Y ≤ F) was computed to get the theoretical distribution of ANOVA R2 values (open bars in Fig 3) for various heritabilities (h2b = 0.2, 0.5, 1.0) and various population sizes (N = 200 and 1000). The resulting distributions were compared to those obtained by simulation with multiple regression (solid bars in Fig 3), taking the same values for the heritability (h2b) and population size (N). With ANOVA, the sampling distribution of R2 depends on both parameters, N and h2b (![]()
decreases as well as r2 and the average
2, but the sampling variance of R2 increases. On the contrary, the population size does not influence r2, but as N decreases, the sampling mean and sampling variance of R2 increase.
The higher the sampling variance of R2, the more pronounced the L-shaped distribution of QTL R2 values. An intuitive explanation for such a distribution shape is that, with random errors, the most likely event is that the intraclass variability hides the difference between genotypic classes at one QTL, leading to R2 values low or close to zero. However, it may occur, by chance, that random samples of individuals (genotypes) or environments reinforce the differences between genotypic classes and lead, for some QTL, to R2 values >r2. The higher the r2, the lower the chance of such event. With ANOVA, the L-shaped distribution occurs in small populations (N = 200), even with h2b = 1, because of the segregation at the other QTL. For this reason, the distribution is very sensitive to the population size. With multiple regression all known QTL are taken into account in the model and the L-shaped distribution of R2 mainly occurs because of environmental noise. Thus, the L-shaped distribution is not observed with multiple regression in small populations when h2b = 1. As the heritability of the trait decreases, the differences between ANOVA and multiple regression decrease (Fig 3).
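The sensitivity of one-QTL ANOVA estimates to population size can be reproduced with a small Monte Carlo sketch. It is a simplification of the simulations described above, not a replication: the variation due to all other QTL and to the environment is lumped into a single normal residual (an assumption), and the function names and replicate number are hypothetical.

```python
import random

def simulate_r2(n_individuals, r2_true, rng):
    """One-QTL ANOVA R2 for one simulated F2 sample of an additive QTL."""
    a = 1.0
    sigma2_q = a * a / 2.0                         # QTL variance in an F2
    # residual variance chosen so that the QTL explains r2_true of the total
    sd_resid = (sigma2_q * (1.0 - r2_true) / r2_true) ** 0.5
    groups = ([], [], [])
    for _ in range(n_individuals):
        n_high = rng.choice((0, 1, 1, 2))          # 1:2:1 segregation
        groups[n_high].append(a * (n_high - 1) + rng.gauss(0.0, sd_resid))
    values = [z for g in groups for z in g]
    grand_mean = sum(values) / len(values)
    ss_t = sum((z - grand_mean) ** 2 for z in values)
    ss_q = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups if g)
    return ss_q / ss_t

def mean_r2(n_individuals, r2_true, n_replicates=200, seed=1):
    """Average estimated R2 over independent replicates."""
    rng = random.Random(seed)
    total = sum(simulate_r2(n_individuals, r2_true, rng) for _ in range(n_replicates))
    return total / n_replicates
```

Comparing, for instance, mean_r2(200, 0.01) with mean_r2(1000, 0.01) shows the upward shift of the average R2 in the smaller population, in line with the statement that the sampling mean and variance of R2 increase as N decreases.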
Comparing ranking order of QTL effects in different samples of a population:
Given that the distribution of estimated QTL effects (R2q) may be quite different from the distribution of the corresponding additive allelic effects (aq), the question of the reproducibility of the ranking order of the R2q in independent experiments performed from a given population arises.
Independent QTL without environmental variation: The parental inbred lines are RandomU and the heritability is h2b = 1. When genes have identical allelic effects, there is of course no correlation between replicates (Table 2). On the contrary, when genes have unequal effects, the distributions of r2 and R2 reflect the distribution of additive allelic effects. Hence, the ranking of the QTL is well conserved between replicates: W values between 0.985 and 0.999 for populations of 1000 individuals, N2 is between 3 and 10, and HPF is between 42 and 100% (Table 2).
Linked QTL, without environmental variation: The parental inbred lines are RandomL, CouplingL, or RepulsionL and the heritability is h2b = 1. In this case, the distributions of r2 and R2 reflect both the additive allelic effects and linkage disequilibria between QTL. Hence, the ranking of the QTL is again well conserved between replicates because of unequal linkage disequilibrium distribution (Table 2). However, a gene with a large allelic effect may repeatedly be found as a QTL with a small R2 if this gene is close to other genes (Equation 9).
Environmental variation and reduced population size: QTL are independent (RandomU) or linked (RandomL). As expected, whatever the distribution of r2, the correlation between the rankings of the R2's decreases when heritability decreases. This correlation is close to the heritability of the trait, since heritability describes the quality of the prediction of genetic values by phenotypic values. For a given heritability, the rank correlation decreases with the population size.
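The agreement between QTL rankings across replicates can be summarized by Kendall's concordance coefficient. The sketch below uses the standard formula without tie correction; the function names are hypothetical, and it is not the program used to produce Table 2.

```python
def ranks_from_r2(r2_values):
    """Rank the QTL of one replicate by decreasing R2 (rank 1 = largest; ties ignored)."""
    order = sorted(range(len(r2_values)), key=lambda i: -r2_values[i])
    ranks = [0] * len(r2_values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

def kendalls_w(rankings):
    """Kendall's W for m rankings (replicates) of the same n items (QTL)."""
    m, n = len(rankings), len(rankings[0])
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2.0
    # squared deviations of the rank sums from their common expectation
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12.0 * s / (m ** 2 * (n ** 3 - n))
```

W equals 1 when all replicates rank the QTL identically and 0 when the rank sums are all equal, for example with two exactly opposite rankings.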
Comparison between additive and metabolic models:
In a previous article, we studied the distribution of r2 for a trait related to a metabolic flux (![]()
When all genes have the same additive allelic effect, the distribution of R2 is slightly more skewed with MET than with ADD. The differences are greater when the population size is smaller. Even with unequal additive allelic effects, the differences between MET and ADD are reduced when QTL are linked, as compared to independent QTL.
Actually, the main feature of MET is that correlation between the rankings of the QTL in different replicates is lower than for ADD, in particular for small population sizes (Table 3): at least 5000 individuals are needed with MET to obtain rank correlations similar to those observed for ADD with only 200 individuals. In fact, the variances of each R2q over replicates are always
10 times larger with MET than with ADD (not shown). Such low correlations between replicates can be explained by dominance and epistatic effects, which are inherent to MET (![]()
50%, we checked that only two or three QTL are ranked first in the 100 replicates. The rank correlation between replicates also decreases with the heritability of the trait, though the correlation is always slightly lower than the heritability. But, with MET, the robustness of estimated QTL effect is approximately equally sensitive to heritability and population size.
| DISCUSSION |
|---|
We analyzed different factors likely to influence the estimates of the effects of a finite set of QTL, whose "real" individual effects and positions are known. Even though in experimental situations the number of detected QTL is not very high for a given trait (usually <10 with common sizes of population), the numbers of QTL really contributing to the variation of complex traits are expected to be much higher. Accordingly, a set of 50 QTL was chosen for the simulations. The question of mapping these QTL by the use of molecular markers was not considered, because it is analyzed at length in other articles, and because we focused on the influence of particular genetic and nongenetic factors on the distribution of estimated QTL effects.
QTL effects are generally estimated by the statistic R2, which measures the fraction of phenotypic variation explained by the variation at the QTL. Our simulations have shown that the distribution of the estimated QTL effects (R2) reflects the distribution of additive allelic effects only if several conditions are combined: no linkage disequilibrium, linear relationship between gene effects and trait values (additive model), no environmental variance, and large population size. Otherwise the R2 distributions are clearly L-shaped, even with an additive model and J-shaped distributions of additive allelic effects (exponential). We showed here that the distribution of R2 depends on both genetic and nongenetic factors. Genetic factors such as the distribution of additive allelic effects, the model (additive or metabolic), the linkage disequilibrium between QTL, or the parental gametic phase determine the distribution of r2, the "true" QTL effects. Then, for each QTL, the R2 value is a random variable, which depends of course on r2, but also on the amount of residual noise that confuses the estimation of r2.
Among the factors likely to increase the residual noise, the heritability of the trait and the population size do not play the same role. The heritability of the trait buffers the relationship between phenotypic and genetic values. It is of course always possible to enhance the heritability of a trait by using refined experimental designs involving progeny tests or cloning. The population size determines the amount of sampling variation for the genotypic composition of the experimental population. In a given genotypic class at a given QTL, the latter mainly affects the segregation at the other QTL. The resulting overestimation of detected QTL effects and lack of repeatability with small population sizes have already been documented by ![]()
Another cause for the L-shaped distribution of the R2 values is the L-shaped distribution of r2 values themselves. Unequal linkage disequilibrium between QTL appears to be a cause for the L-shaped distribution of the r2 values. We showed analytically that when using one-QTL analysis of variance for estimating the QTL effects, coupling in the parents increases the r2 values, while repulsion decreases them. When using multiple regression, analytical developments as well as simulations showed that linkage disequilibrium decreases the r2 values, as a function of the squared linkage disequilibria, whatever the gametic phase and the additive effects of the nearby QTL. Beyond the estimation methods, it is clear that the R2 values reflect not only the additive allelic effects but also the relative position of the QTL on the genetic map. It is likely that the linkage disequilibrium bias occurs in actual situations and that it is unequally distributed. As unequal map distances between QTL would result in an L-shaped distribution of squared linkage disequilibria between QTL, linkage disequilibrium would therefore contribute to the L-shaped distribution of R2.
For some applications of QTL methodology, for example, in the context of the candidate gene approach, it is important to get accurate estimates of the effects/positions of the QTL. Our results emphasize that the multiple regression or composite interval mapping methods, which take into account the other QTL as cofactors (![]()
However, these methods also have some drawbacks. First, only detectable QTL can be used as cofactors in the multiple regression, and the residual variation is still expected to include the genetic variation at unknown QTL. Second, linkage disequilibria between markers and QTL, as well as the apparent effect of several QTL located in the same interval between two markers, will influence the r2 values. Thus, even with these methods, we can predict neither the distribution of the true effects of the QTL (r2) nor the distribution of additive allelic effects from the estimated effects (R2). But here the question arises of which parameter is the most relevant to describe the effect of a gene: its true effect or its additive allelic effect?
Besides the statistical tools, different methods may be used to confirm the presence of a QTL in a given chromosomal region. First, QTL detection may be done in different genetic backgrounds, i.e., with different gametic phases. Second, the fine mapping methods, using near isogenic lines or introgression lines (![]()
| ACKNOWLEDGMENTS |
|---|
We are very grateful to A. Leonardi for helpful discussions and reading the manuscript, and also to P. Keightley and anonymous referees for helpful comments on the manuscript. B. Bost was supported by a Ph.D. grant from the French Ministry of National Education, Research, and Technology (MENRT).
Manuscript received August 11, 1999; Accepted for publication December 20, 2000.
| APPENDIX A |
|---|
Effect of environmental variation on the percentage of phenotypic variation explained by a QTL, estimated by ANOVA or multiple regression
We consider here a trait controlled by Q QTL.
One-QTL analysis of variance:
The SST of the trait in an F2 population of N individuals can be decomposed as in Table A1, where $\nu$ is the number of genotypic classes at locus q ($\nu = 3$ in an F2 population), $\sigma^2_q$ is the genetic variance at the QTL q, $\sigma^2_R$ is the residual variance, and
=
is a term taking into account the sampling of individuals in the genotypic classes at the QTL q (![]()
- 1 and N -
as degrees of freedom, and a noncentrality parameter
q (![]()

Multiple regression:
The SST of the trait in an F2 population of N individuals can be decomposed as in Table A2, where $\sigma^{2*}_q$ is the genetic variance at the QTL q, taking into account the other QTL, $\sigma^{2*}_R$ is the residual variance, and the other parameters are the same as in Table A1. The test statistic of the effect of the locus q is $F = \frac{SS^*_q/(\nu - 1)}{SSR^*/[N - \nu - (Q - 1)(\nu - 1)]}$, and follows a noncentral Fisher distribution, with $\nu - 1$ and $N - \nu - (Q - 1)(\nu - 1)$ as degrees of freedom, and a noncentrality parameter $\lambda^*_q$:

As $SST \neq \sum_{q=1}^{Q} SS^*_q + SSR^*$, there is no simple relationship between F and the R2q estimated with the multiple regression. However, we can note that ![]()
| APPENDIX B |
|---|
Effect of linkage disequilibrium on the percentage of phenotypic variation explained by a QTL, estimated by ANOVA or multiple regression
We consider here the simple case of a trait controlled by three QTL (Q1, Q2, Q3) located on the same chromosome. The phenotypic value of the individual i of an F2 population is
![]() |
(B1) |
where j, k, l are indices for the number of high alleles (0, 1, or 2) at QTL Q1, Q2, Q3, respectively;
jkl is the genetic value of individual i; and
εi is the random environmental deviation.
jkl is determined according to an additive model,
![]() |
(B2) |
where
is a constant, and aQ1, aQ2, and aQ3 are the allelic additive effects of QTL Q1, Q2, and Q3, respectively.
Let
12,
23,
13 be the recombination rates between QTL Q1 and Q2, Q2 and Q3, and Q1 and Q3, respectively. pabc is the frequency of the gamete produced by the F1 hybrid with allele a, b, and c at QTL Q1, Q2, and Q3, respectively. The eight gametic frequencies are

There are four cases for the parental gametic phase (coupling or repulsion) between QTL Q1 and Q2, and Q2 and Q3 (Table B1), and the corresponding gametic frequencies E, F, G, and H are given in Table B2.
In the F2 population, the linkage disequilibria between Q1 and Q2 (D12), Q2 and Q3 (D23), and Q1 and Q3 (D13) are

It is worth noting that

The genotype frequencies (fjkl) can be deduced from gametic frequencies. For example, f011 = 2p000p011 + 2p010p001 = 2EF + 2GH.
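The passage from gametic to genotypic frequencies can be checked by enumeration. The sketch below is an illustration under stated assumptions, not the authors' derivation: crossovers in the two intervals are taken to be independent (no interference), the recombination rates are written `r12` and `r23`, and the function names and the `phase` argument are hypothetical. It builds the eight gametic frequencies of the F1, the pairwise gametic disequilibria, and the F2 genotype frequencies obtained by random union of gametes.

```python
from collections import defaultdict
from itertools import product


def gamete_frequencies(r12, r23, phase=(1, 1, 1)):
    """Frequencies of the eight gametes produced by the F1 hybrid.

    `phase` lists the alleles (0 = low, 1 = high) carried by one parental
    chromosome; the other parent carries the complementary alleles.
    Assumption: crossovers in the two intervals are independent.
    """
    freqs = {}
    for gamete in product((0, 1), repeat=3):
        # 1 where the allele comes from the `phase` chromosome, else 0.
        origin = [int(g == p) for g, p in zip(gamete, phase)]
        p12 = r12 if origin[0] != origin[1] else 1.0 - r12
        p23 = r23 if origin[1] != origin[2] else 1.0 - r23
        freqs[gamete] = 0.5 * p12 * p23
    return freqs


def linkage_disequilibrium(freqs, a, b):
    """Gametic disequilibrium D = p(1 at a, 1 at b) - p(1 at a) * p(1 at b)."""
    p_ab = sum(p for g, p in freqs.items() if g[a] == 1 and g[b] == 1)
    p_a = sum(p for g, p in freqs.items() if g[a] == 1)
    p_b = sum(p for g, p in freqs.items() if g[b] == 1)
    return p_ab - p_a * p_b


def genotype_frequencies(freqs):
    """F2 genotype frequencies f[(j, k, l)] from random union of gametes."""
    geno = defaultdict(float)
    for (g1, p1), (g2, p2) in product(freqs.items(), repeat=2):
        # Genotype = number of high alleles at each QTL.
        geno[tuple(x + y for x, y in zip(g1, g2))] += p1 * p2
    return dict(geno)


if __name__ == "__main__":
    p = gamete_frequencies(0.1, 0.2)
    f = genotype_frequencies(p)
    print("D12, D23, D13:", [linkage_disequilibrium(p, a, b)
                             for a, b in ((0, 1), (1, 2), (0, 2))])
    print("f011:", f[(0, 1, 1)])
```

Under the no-interference assumption, each pairwise disequilibrium equals (1 - 2r)/4 in coupling and -(1 - 2r)/4 in repulsion, where r is the recombination rate between the two QTL, and the enumeration reproduces f011 = 2p000p011 + 2p010p001.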
In this model, the mean phenotypic value of the population is

and the total genetic variance is
![]() |
(B3) |
One-QTL analysis of variance:
With one-QTL ANOVA, the statistical model for the phenotypic value of an individual i, with the genotype j at the QTL considered, is
![]() |
(B4) |
where Gj is the effect of genotype j at the QTL, gij is a genetic component due to segregation at the two other QTL, and
εij is the random environmental deviation. The mean phenotypic value for genotype j at the QTL is

which is the conditional expectation for the phenotype, given the genotype at the QTL. We show below that, in this simple case, Ei(gij) ≠ 0 if there is linkage between the QTL considered and the other ones.
For example, for the QTL Q2, the mean phenotypic value depends on genotype frequencies fjkl at the three QTL, as well as on genetic values
jkl,

and the variance contributed by QTL Q2 is

In the infinite F2 population that we consider, there are three different genotypes at one QTL, with genotypic frequencies f0 = 1/4, f1 = 1/2, and f2 = 1/4. As an example, for the QTL Q2, we then have

Thus, we can calculate the contribution of each QTL:
![]() |
(B5) |
![]() |
(B6) |
![]() |
(B7) |
Hence, the sum of the individual contributions of each QTL is not equal to the total genetic variance (B3) when QTL are linked (D ≠ 0).
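The discrepancy between the summed one-QTL contributions and the total genetic variance can be verified by brute-force enumeration. The sketch below makes several assumptions that are not taken from the source: coupling phase, no crossover interference, a genetic value equal to the sum over QTL of the allelic effect times the number of high alleles (constant set to zero), and hypothetical function names. It is meant as a numerical check, not as the authors' computation.

```python
from collections import defaultdict
from itertools import product


def f2_genotype_frequencies(r12, r23):
    """F2 genotype frequencies for three QTL in coupling phase.

    Assumption: independent crossovers in the two intervals.
    """
    gametes = {}
    for g in product((0, 1), repeat=3):
        p12 = r12 if g[0] != g[1] else 1.0 - r12
        p23 = r23 if g[1] != g[2] else 1.0 - r23
        gametes[g] = 0.5 * p12 * p23
    geno = defaultdict(float)
    for (g1, p1), (g2, p2) in product(gametes.items(), repeat=2):
        geno[tuple(x + y for x, y in zip(g1, g2))] += p1 * p2
    return dict(geno)


def genetic_value(genotype, effects):
    """Additive value: allelic effect times number of high alleles, summed."""
    return sum(a * x for a, x in zip(effects, genotype))


def total_genetic_variance(geno, effects):
    """Variance of the genetic value over all three-locus genotypes."""
    mean = sum(f * genetic_value(g, effects) for g, f in geno.items())
    return sum(f * (genetic_value(g, effects) - mean) ** 2
               for g, f in geno.items())


def one_qtl_variance(geno, effects, q):
    """Variance among the mean genetic values of the classes at QTL q."""
    mean = sum(f * genetic_value(g, effects) for g, f in geno.items())
    class_freq = defaultdict(float)
    class_sum = defaultdict(float)
    for g, f in geno.items():
        class_freq[g[q]] += f
        class_sum[g[q]] += f * genetic_value(g, effects)
    return sum(fq * (class_sum[c] / fq - mean) ** 2
               for c, fq in class_freq.items())


if __name__ == "__main__":
    effects = (1.0, 0.5, 0.25)  # hypothetical allelic effects
    for r12, r23 in ((0.5, 0.5), (0.1, 0.2)):
        geno = f2_genotype_frequencies(r12, r23)
        parts = [one_qtl_variance(geno, effects, q) for q in range(3)]
        print(r12, r23, sum(parts), total_genetic_variance(geno, effects))
```

With free recombination the two printed quantities coincide. With linked QTL in coupling, each one-QTL variance absorbs part of the effects of its neighbours; in this parameterization the Q2 term is (aQ2 + 4 D12 aQ1 + 4 D23 aQ3)²/2, so the sum exceeds the total genetic variance.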
Multiple regression:
With multiple regression, we take into account all the QTL, and the statistical model for the phenotypic value of an individual i, with the genotypes j, k, and l at the QTL Q1, Q2, and Q3, respectively, is
![]() |
(B8) |
where
εijkl is the random environmental deviation. The genetic variances contributed by each QTL are computed conditionally on the other QTL declared in the model,
![]() |
(B9) |
where

are the genetic variances contributed by the other QTL, and

are, respectively, the mean phenotypic values for genotype jk at QTL Q1 and Q2, jl at QTL Q1 and Q3, and kl at QTL Q2 and Q3. Using the genetic model defined in (B2), we find

Thus, following (B9), the variances contributed by each QTL are
![]() |
(B10) |
![]() |
(B11) |
![]() |
(B12) |
and the sum of the individual contributions of each QTL is

and is not equal to the total genetic variance (B3) when QTL are linked (D ≠ 0).
Note that, in an F2 population, with multiple regression, the effect of a QTL q does not involve the effects of the QTL that are linked to q, but only the linkage disequilibria between these QTL and QTL q.
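This property of the multiple-regression contributions can also be checked by enumeration. The sketch below is a hedged illustration: it assumes coupling phase, no crossover interference, and an additive genetic value equal to the allelic effect times the number of high alleles summed over QTL. The function names are hypothetical. The contribution of a QTL is computed as the total genetic variance minus the variance among the mean genetic values of the joint genotypes at the two other QTL.

```python
from collections import defaultdict
from itertools import product


def f2_genotype_frequencies(r12, r23):
    """F2 genotype frequencies for three QTL in coupling phase.

    Assumption: independent crossovers in the two intervals.
    """
    gametes = {}
    for g in product((0, 1), repeat=3):
        p12 = r12 if g[0] != g[1] else 1.0 - r12
        p23 = r23 if g[1] != g[2] else 1.0 - r23
        gametes[g] = 0.5 * p12 * p23
    geno = defaultdict(float)
    for (g1, p1), (g2, p2) in product(gametes.items(), repeat=2):
        geno[tuple(x + y for x, y in zip(g1, g2))] += p1 * p2
    return dict(geno)


def genetic_value(genotype, effects):
    """Additive value: allelic effect times number of high alleles, summed."""
    return sum(a * x for a, x in zip(effects, genotype))


def conditional_variance(geno, effects, q):
    """Genetic variance left to QTL q once the two other QTL are fitted."""
    others = [i for i in range(3) if i != q]
    mean = sum(f * genetic_value(g, effects) for g, f in geno.items())
    total = sum(f * (genetic_value(g, effects) - mean) ** 2
                for g, f in geno.items())
    class_freq = defaultdict(float)
    class_sum = defaultdict(float)
    for g, f in geno.items():
        key = tuple(g[i] for i in others)
        class_freq[key] += f
        class_sum[key] += f * genetic_value(g, effects)
    # Variance among the joint-genotype means at the two other QTL;
    # classes with zero frequency (complete linkage) are skipped.
    by_others = sum(fq * (class_sum[c] / fq - mean) ** 2
                    for c, fq in class_freq.items() if fq > 0)
    return total - by_others


if __name__ == "__main__":
    geno = f2_genotype_frequencies(0.1, 0.2)
    # Same effect at Q1, very different effects at the linked QTL.
    print(conditional_variance(geno, (1.0, 0.5, 0.25), 0))
    print(conditional_variance(geno, (1.0, 5.0, -3.0), 0))
```

The two printed values agree up to rounding: changing the effects of the linked QTL leaves the conditional contribution of Q1 unchanged, whereas changing the recombination rates does not.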
| LITERATURE CITED |
|---|
BARTON, N. H. and M. TURELLI, 1987 Adaptive landscapes, genetic distances and evolution of quantitative characters. Genet. Res. 49:157-173.
BEAVIS, W. D., 1998 QTL analyses: power, precision and accuracy, pp. 145-162 in Molecular Dissection of Complex Traits, edited by A. H. PATERSON. CRC Press, Boca Raton/New York.
BOST, B., C. DILLMANN, and D. DE VIENNE, 1999 Fluxes and metabolic pools as model traits for quantitative genetics. I. L-shaped distribution of gene effects. Genetics 153:2001-2012.
CARBONELL, E. A., T. M. GERIG, E. BALANSARD, and M. J. ASINS, 1992 Interval mapping in the analysis of nonadditive quantitative trait loci. Biometrics 48:305-315.
CARBONELL, E. A., M. J. ASINS, M. BASELGA, E. BALANSARD, and T. M. GERIG, 1993 Power studies in the estimation of genetic parameters and the localization of quantitative trait loci for backcross and doubled haploid populations. Theor. Appl. Genet. 86:411-416.
CHARCOSSET, A. and A. GALLAIS, 1996 Estimation of the contribution of quantitative trait loci (QTL) to the variance of quantitative traits by means of genetic markers. Theor. Appl. Genet. 93:1193-1201.
CRAMER, J. S., 1987 Mean and variance of R2 in small and moderate samples. J. Econometrics 35:253-266.
DIRLEWANGER, E., P. G. ISAAC, S. RANADE, M. BELAJOUZA, and R. COUSIN et al., 1994 Restriction fragment length polymorphism analysis of loci associated with disease resistance gene and developmental traits in Pisum sativum L. Theor. Appl. Genet. 88:17-27.
DOEBLEY, J. and A. STEC, 1993 Inheritance of the morphological differences between maize and teosinte: comparison of results for two F2 populations. Genetics 134:559-570.
EDWARDS, M. D., C. W. STUBER, and J. F. WENDEL, 1987 Molecular-marker-facilitated investigations of quantitative-trait loci in maize. I. Numbers, genomic distribution and types of gene action. Genetics 116:113-125.
ESHED, Y. and D. ZAMIR, 1995 An introgression line population of Lycopersicon pennellii in the cultivated tomato enables the identification and fine mapping of yield associated QTL. Genetics 141:1147-1162.
FATOKUN, C. A., D. I. MENANCIO-HAUTEA, D. DANESH, and N. D. YOUNG, 1992 Evidence for orthologous seed weight genes in cowpea and mung bean based on RFLP mapping. Genetics 132:841-846.
GELDERMANN, H., 1975 Investigations on inheritance of quantitative characters in animal by gene markers. I. Methods. Theor. Appl. Genet. 46:319-330.
GEORGES, M. and L. ANDERSSON, 1996 Livestock genomics comes of age. Genome Res. 6:907-921.
GRANDILLO, S. and S. D. TANKSLEY, 1996 QTL analysis of horticultural traits differentiating the cultivated tomato from the closely related species Lycopersicon pimpinellifolium. Theor. Appl. Genet. 92:935-951.
HOSPITAL, F. and C. CHEVALET, 1993 Effect of population size and linkage on optimal selection intensity. Theor. Appl. Genet. 86:775-780.
JANSEN, R. C., 1993 Interval mapping of multiple quantitative trait loci. Genetics 135:205-211.
JANSEN, R. C. and P. STAM, 1994 High resolution of quantitative traits into multiple loci via interval mapping. Genetics 136:1447-1455.
KACSER, H. and J. A. BURNS, 1973 The control of flux. Symp. Soc. Exp. Biol. 27:65-104.
KACSER, H. and J. A. BURNS, 1981 The molecular basis of dominance. Genetics 97:639-666.
KEARSEY, M. J. and A. G. L. FARQUHAR, 1998 QTL analysis in plants; where are we now? Heredity 80:137-142.
KENDALL, M. G., 1955 Rank Correlation Methods. Griffin, London.
LEE, S. H., M. A. BAILEY, M. A. R. MIAN, T. E. CARTER, JR, and E. R. SHIPE et al., 1996 RFLP loci associated with soybean seed protein and oil content across populations and locations. Theor. Appl. Genet. 93:649-657.
LIN, Y. R., K. F. SCHERTZ, and A. H. PATERSON, 1995 Comparative analysis of QTL affecting plant height and maturity across the Poaceae, in reference to an interspecific sorghum population. Genetics 141:391-411.
LIU, S.-C., S. P. KOWALSKI, T.-H. LAN, K. A. FELDMANN, and A. H. PATERSON, 1996 Genome-wide high resolution mapping by recurrent intermating using Arabidopsis thaliana as a model. Genetics 142:247-258.
MACKAY, T. F. C., 1996 The nature of quantitative genetic variation revisited: lessons from Drosophila bristles. Bioessays 18:113-121.
MAUGHAN, P. J., M. A. S. MAROOF, and G. R. BUSS, 1996 Molecular-marker analysis of seed-weight: genomic locations, gene action, and evidence for orthologous evolution among three legume species. Theor. Appl.















is the theoretical linkage disequilibrium coefficient of r2q in the three-QTL model (











