Abstract
Heterosis is widely used in breeding, but the genetic basis of this biological phenomenon has not been elucidated. We postulate that additive and dominance genetic effects as well as two-locus interactions estimated in classical QTL analyses are not sufficient for quantifying the contributions of QTL to heterosis. A general theoretical framework for determining the contributions of different types of genetic effects to heterosis was developed. Additive × additive epistatic interactions of individual loci with the entire genetic background were identified as a major component of midparent heterosis. On the basis of these findings we defined a new type of heterotic effect denoted as augmented dominance effect di* that comprises the dominance effect at each QTL minus half the sum of additive × additive interactions with all other QTL. We demonstrate that genotypic expectations of QTL effects obtained from analyses with the design III using testcrosses of recombinant inbred lines and composite-interval mapping precisely equal genotypic expectations of midparent heterosis, thus identifying genomic regions relevant for expression of heterosis. The theory for QTL mapping of multiple traits is extended to the simultaneous mapping of newly defined genetic effects to improve the power of QTL detection and distinguish between dominance and overdominance.
THE concept of heterosis is widely used in plant and animal breeding. In many species, the controlled crossing of selected parental components, mainly inbred lines, is employed to maximize heterosis and thus agronomic performance of the resulting F1 hybrids. However, the understanding of this biological phenomenon is limited and the genetic basis of heterosis has yet to be elucidated.
Quantitative genetics has contributed to our understanding of heterosis by (i) formulating genetic models explaining heterosis on the basis of different modes of gene action such as dominance, overdominance, and epistasis (for review see Lamkey and Edwards 1999); (ii) devising the theory for the design and analysis of experiments investigating these types of gene action (for review see Lynch and Walsh 1998); and (iii) gathering a plethora of experimental data supporting or questioning these theories. One of the experimental approaches proposed for investigating the relative importance of different types of gene action is the analysis of generation means (Kearsey and Pooni 1996). Six basic generations (the two parental lines and their F1 and F2 as well as backcrosses of the F1 to each of the parents) are used to estimate the magnitude of additive, dominance, and epistatic effects affecting the quantitative trait under study. However, these parameters capture the net contribution of gene effects summed over all loci and consequences of summation may be pronounced if positive and negative effects at individual loci cancel each other. An alternative approach is the partitioning of the genetic variance into independent components due to additive, dominance, and epistatic effects (Fisher 1918). However, as pointed out by Lynch and Walsh (1998), unless information on gene frequencies of the reference population is available, variance components provide limited information on the relative importance of the different modes of gene action because dominance and epistasis can greatly affect additive or dominance components of variance. This limitation can be overcome by the use of populations derived from a cross between two inbred lines such as the North Carolina experiment III (design III) proposed by Comstock and Robinson (1952). 
A random sample of F2 individuals derived from a cross between two homozygous inbred lines is backcrossed to each of the parents, yielding a population with gene and genotype frequencies equivalent to an F2. A one-way analysis of variance (ANOVA) of phenotypic means and differences of the F2 backcrosses yields estimates of the dominance and additive genetic variance with nearly equal precision and their ratio provides a weighted estimate of the squared degree of dominance.
A major step toward the analysis of the type of gene action at individual genetic loci affecting quantitative traits (QTL) was the advent of molecular marker technology. The first marker-aided study estimated the predominant type of gene action in segregating F2 populations of maize (Edwards et al. 1987). Additive and dominance gene effects at individual QTL as well as digenic epistatic interactions were estimated on the basis of contrasts of marker genotype classes. However, these estimates are not sufficient for making inferences about the genetic basis of heterosis. The performance of an F1 hybrid or its filial generations and midparent heterosis (MPH) have distinct genetic components. Assuming the presence of digenic epistasis, hybrid performance is a function of the sum of dominance effects and dominance × dominance (dd) epistasis. To date, a general mathematical derivation of the contributions of different genetic effects to MPH is still lacking, but quantitative genetic expectations of MPH have been presented in a descriptive manner (e.g., Van Der Veen 1959). While additive × additive (aa) interactions contribute to MPH, dd interactions may not, depending on the metric used for describing genotypic values. In this study we show that QTL exhibiting significant dominance effects may not contribute to MPH if the sum of their aa epistatic interactions with the genetic background is positive. Furthermore, we show that significant heterotic QTL can result from aa epistatic interactions with the genetic background if positive and negative alleles are contributed with equal frequency by each parent. Consequently, we define a new genetic effect (di*) that allows us to express MPH as the sum of individual QTL effects and demonstrate that design III is suitable for mapping of QTL with genotypic expectations equivalent to the augmented dominance effect di*.
The objectives of this study were to (i) develop a general theoretical framework for determining the contributions of the different types of genetic effects to heterosis, (ii) dissect MPH into its underlying components using composite-interval mapping (CIM) and a variant of design III, (iii) give quantitative genetic expectations of heterotic QTL, and (iv) extend the theory for the joint analysis of multiple traits (Jiang and Zeng 1995) to the simultaneous mapping of different newly defined genetic effects to improve the power of QTL detection and distinguish between dominance and overdominance. Results from an experiment on heterosis in Arabidopsis for which the presented theory has formed the quantitative genetic basis are in an accompanying article (Kusterer et al. 2007, this issue).
GENETIC EFFECTS CONTRIBUTING TO HETEROSIS
When assessing the type of gene action contributing to heterosis, we need to express heterosis as a function of additive, dominance, and epistatic gene effects. To date, no generalized mathematical derivation showing the respective contributions of the different genetic effects has been developed for an arbitrary number of QTL and all types of higher-order epistatic effects.
MPH for a quantitative trait is defined as the difference between the genotypic value of an F1 hybrid ($G_{F_1}$) and the mean genotypic value of its two homozygous parents P1 and P2 ($\bar{G}_P = (G_{P1} + G_{P2})/2$):
$$\mathrm{MPH} = G_{F_1} - \tfrac{1}{2}\,(G_{P1} + G_{P2}) \tag{1}$$
Let P1 and P2 differ at the loci set Q = {1,…, q} affecting the quantitative trait of interest. Let vi be an indicator variable for the genotype at QTL i taking values 0, 1, 2 if homozygous P1, heterozygous, or homozygous P2, respectively. The types of genetic effects contributing to G are described by the $3^q$ parameters αAD, which define the genetic effects of type additive at the loci set A ($A \subseteq Q$) and of type dominance at the loci set D ($D \subseteq Q \setminus A$, with Q\A the complement of A in Q). For exemplification of αAD for q = 3 see supplemental Table S1 at http://www.genetics.org/supplemental/.
The coefficients and meaning of parameters αAD in the genotypic value G depend on the choice of the metric. In the quantitative genetic literature, two main metrics have been described for populations derived from a cross between two inbred lines: the F2 metric and the F∞ metric (Van Der Veen 1959; Yang 2004). The F2 metric model, a special case of Cockerham's (1954) model for partitioning the genetic variance into eight orthogonal contrasts due to additive, dominance, and epistatic effects, defines genetic effects as deviations from the mean of the F2 population in linkage equilibrium. The F∞ model defines genetic effects as contrasts between different genotypes without reference to any population (Yang 2004). Both models can be translated into each other by a linear transformation. In the presence of epistasis they differ with respect to interpretation of genetic effects and the structure of variance components. With the F2 metric and an F2 population in linkage equilibrium the genetic variance is partitioned into orthogonal components that represent sums of squared additive, dominance, and epistatic effects, providing insight into the relative importance of different types of gene action. Consequently, in the analysis of heterosis with design III, we prefer the F2 metric. With the F∞ metric, additional complexity is introduced, because besides the dominance effect at a QTL and its aa epistatic interactions, also dd epistatic interactions with the genetic background contribute to MPH (Van Der Veen 1959). Consequently, genetic expectations of individual heterotic QTL become unwieldy and more difficult to interpret with the F∞ metric.
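Under the F2 metric, variables $x_{V,i}$ and $y_{V,i}$ determine the coefficient of αAD in G, with xV,i = −1, 0, 1 and yV,i = −1/2, 1/2, −1/2 when vi = 0, 1, 2, respectively.
Thus, we can express the genotypic value of genotype V = (v1,…, vq) as
$$G_V = \sum_{A \subseteq Q}\ \sum_{D \subseteq Q \setminus A} \Big(\prod_{i \in A} x_{V,i}\Big)\Big(\prod_{j \in D} y_{V,j}\Big)\,\alpha_{AD} \tag{2}$$
with $\sum_{A \subseteq Q}$ indicating summation over all possible subsets A within the set Q. For A = Ø and D = Ø, αAD = μ; for D = Ø, αAD = αA; for A = Ø, αAD = αD.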
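Let us assume an F2 individual with V = (vi, vj, vk) = (2, 1, 0). Then, xV,i = 1, xV,j = 0, xV,k = −1, yV,i = −1/2, yV,j = 1/2, and yV,k = −1/2. Thus, every term with an additive effect at locus j vanishes, and
$$G_V = \mu + a_i - a_k - \tfrac12 d_i + \tfrac12 d_j - \tfrac12 d_k - aa_{ik} - \tfrac14 dd_{ij} + \tfrac14 dd_{ik} - \tfrac14 dd_{jk} + \tfrac18 ddd_{ijk} + \dots$$
where the remaining terms are the mixed additive × dominance interactions, each weighted by the corresponding product of x and y coefficients from Equation 2.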
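The coefficients for this example genotype follow mechanically from the x and y values. As a minimal sketch (the function name `f2_coefficients` and the tuple-based encoding of the sets A and D are illustrative choices, not part of the original derivation), one can enumerate all 3^q assignments of loci to A, to D, or to neither set:

```python
from itertools import product

X = {0: -1.0, 1: 0.0, 2: 1.0}   # additive coefficient x for v = 0, 1, 2
Y = {0: -0.5, 1: 0.5, 2: -0.5}  # dominance coefficient y for v = 0, 1, 2

def f2_coefficients(genotype):
    """Map each (A, D) pair of locus-index tuples to its coefficient in G_V."""
    q = len(genotype)
    coeffs = {}
    # Each locus is in neither set ("-"), in A ("a"), or in D ("d").
    for roles in product("-ad", repeat=q):
        c = 1.0
        for v, role in zip(genotype, roles):
            if role == "a":
                c *= X[v]
            elif role == "d":
                c *= Y[v]
        A = tuple(i for i, r in enumerate(roles) if r == "a")
        D = tuple(i for i, r in enumerate(roles) if r == "d")
        coeffs[(A, D)] = c
    return coeffs

# Loci i, j, k are indexed 0, 1, 2; V = (2, 1, 0) as in the text.
coeffs = f2_coefficients((2, 1, 0))
nonzero = {key: c for key, c in coeffs.items() if c != 0.0}
print(len(coeffs), len(nonzero))
print(coeffs[((0, 2), ())])      # coefficient of the aa effect between loci i and k
```

For V = (2, 1, 0), 18 of the 27 coefficients are nonzero, because every term with the heterozygous locus j in the additive set A is multiplied by xV,j = 0.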
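The parameter μ denotes the genotypic expectation of the F2 generation in linkage equilibrium. In accordance with the definition of Falconer and Mackay (1996, p. 109) ai denotes the additive or homozygous effect at QTL i and di the dominance or heterozygous effect. The additive effect at locus i is positive (+ai) when the trait-increasing allele is contributed by P2 and negative (−ai) when contributed by P1. The degree of dominance is expressed as di/ai. Epistatic interactions between loci i, j, and k are denoted by the sequence of effect types followed by the loci involved (e.g., aaij, daij), adopting the notation of Yang (2004). Analogously, the genotypic values of the parental homozygous lines P1 and P2 and the F1 hybrid are
$$G_{P1} = \sum_{A \subseteq Q}\ \sum_{D \subseteq Q \setminus A} (-1)^{|A|} \left(-\tfrac12\right)^{|D|} \alpha_{AD},$$
$$G_{P2} = \sum_{A \subseteq Q}\ \sum_{D \subseteq Q \setminus A} \left(-\tfrac12\right)^{|D|} \alpha_{AD},$$
and
$$G_{F_1} = \sum_{D \subseteq Q} \left(\tfrac12\right)^{|D|} \alpha_{D},$$
where |A| and |D| denote the number of elements in sets A and D, respectively. Then, MPH can be calculated as
$$\mathrm{MPH} = \sum_{\substack{D \subseteq Q \\ |D|\ \mathrm{odd}}} \left(\tfrac12\right)^{|D|-1} \alpha_{D} \;-\; \sum_{\substack{A \subseteq Q,\ A \neq \emptyset \\ |A|\ \mathrm{even}}}\ \sum_{D \subseteq Q \setminus A} \left(-\tfrac12\right)^{|D|} \alpha_{AD} \tag{3}$$
This generalized derivation shows that under the F2 metric the quantitative genetic expectation of MPH is affected by the dominance effects at QTL and by epistatic interactions including an odd number of dominance terms (e.g., ddd but not dd). In addition, epistatic effects including additive terms also contribute to MPH, because MPH is based on the deviation of the F1 hybrid from the mean of the two homozygous parental lines. However, only effects with an even number of additive terms (e.g., aa and aad but not ad and aaa) contribute to MPH.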
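Which effect types enter MPH can be checked directly from the F2-metric coefficients of the F1 (x = 0, y = 1/2) and of the two parents (x = ∓1, y = −1/2). The following sketch is only an illustration under those coefficients; the helper name `mph_coefficient` is hypothetical:

```python
def mph_coefficient(n_add, n_dom):
    """Coefficient in MPH of an effect with n_add additive and n_dom dominance terms."""
    f1 = 0.5 ** n_dom if n_add == 0 else 0.0      # F1: x = 0, y = 1/2
    p1 = (-1.0) ** n_add * (-0.5) ** n_dom        # P1: x = -1, y = -1/2
    p2 = (-0.5) ** n_dom                          # P2: x = +1, y = -1/2
    return f1 - 0.5 * (p1 + p2)

effect_types = {"d": (0, 1), "dd": (0, 2), "ddd": (0, 3),
                "aa": (2, 0), "ad": (1, 1), "aad": (2, 1), "aaa": (3, 0)}
for label, (n_add, n_dom) in effect_types.items():
    print(label, mph_coefficient(n_add, n_dom))
```

Effects of type d, ddd, aa, and aad receive nonzero coefficients, whereas dd, ad, and aaa drop out, in line with the statement above.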
Considering only digenic epistasis, Equation 3 can be written as
$$\mathrm{MPH} = \sum_{i \in Q} d_i - \tfrac12 \sum_{i \in Q}\ \sum_{j \in Q_i} aa_{ij}$$
with Qi denoting the loci set Q excluding element i.
To express MPH as the sum of individual QTL effects we define a new type of heterotic genetic effect di* that we denote as augmented dominance effect and that includes the dominance effect of QTL i (di) minus half the sum of its additive × additive epistatic interactions (aaij) with all other QTL irrespective of linkage:
$$d_i^* = d_i - \tfrac12 \sum_{j \in Q_i} aa_{ij} \tag{4}$$
Then, MPH can be expressed as the sum of augmented dominance QTL effects
$$\mathrm{MPH} = \sum_{i \in Q} d_i^* \tag{5}$$
The effect of additive × additive epistasis on MPH and di* is demonstrated with a numerical example for four QTL in the supplemental information (supplemental Table S2 at http://www.genetics.org/supplemental/). Depending on gene dispersion and the magnitude of the aa epistatic effects, QTL with significant dominance effects di may not contribute to MPH if the sum of their aa epistatic interactions with the genetic background is positive [supplemental Table S2, S = (+, +, +, +)]. Furthermore, significant heterotic QTL can result from aa epistatic interactions with the genetic background if positive and negative alleles are contributed with equal frequency by each parent, as would be expected for an elite × elite cross [supplemental Table S2, S = (+, +, −, −)].
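The two situations described above can be illustrated with a small sketch. The effect sizes below are hypothetical and are not the values of supplemental Table S2; the assumption that aaij takes the sign of the product of the parental origin signs in S is made purely for illustration, and the helper names are invented for this example.

```python
def augmented_dominance(d, aa):
    """d_i* = d_i - 1/2 * sum of aa_ij over all other loci j (aa keyed by i < j)."""
    q = len(d)
    d_star = []
    for i in range(q):
        background = sum(aa[min(i, j), max(i, j)] for j in range(q) if j != i)
        d_star.append(d[i] - 0.5 * background)
    return d_star

def pairwise_aa(signs, size):
    """Hypothetical aa effects: common magnitude, sign = product of origin signs."""
    q = len(signs)
    return {(i, j): size * signs[i] * signs[j]
            for i in range(q) for j in range(i + 1, q)}

# S = (+, +, +, +): dominance at every QTL, but positive aa sums cancel it.
same = augmented_dominance([1.5] * 4, pairwise_aa([1, 1, 1, 1], 1.0))
# S = (+, +, -, -): no dominance at all, yet aa epistasis creates heterotic QTL.
mixed = augmented_dominance([0.0] * 4, pairwise_aa([1, 1, -1, -1], 1.0))

print(same, sum(same))
print(mixed, sum(mixed))
```

In the first case every QTL has di = 1.5 but di* = 0, so MPH is zero; in the second case every QTL has di = 0 but di* = 0.5, so MPH is 2.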
The quantitative genetic expectation of the parental difference (PD) is found to be
$$\mathrm{PD} = G_{P2} - G_{P1} = 2 \sum_{\substack{A \subseteq Q \\ |A|\ \mathrm{odd}}}\ \sum_{D \subseteq Q \setminus A} \left(-\tfrac12\right)^{|D|} \alpha_{AD} \tag{6}$$
Note that under the F2 metric the formulas for homozygous genotypes such as parents P1 and P2 involve also dominance terms (αAD with D ≠ Ø) besides purely additive terms (αA). As a consequence, PD is affected by the additive effects at the QTL as well as by epistatic interactions including an odd number of additive and an arbitrary number of dominance terms (e.g., ad but not aa). Considering only digenic epistasis, PD reduces to
$$\mathrm{PD} = 2 \sum_{i \in Q} a_i^*$$
where
$$a_i^* = a_i - \tfrac12 \sum_{j \in Q_i} da_{ij} \tag{7}$$
In accordance with the term suggested for the effect di*, ai* is denoted as augmented additive effect. It includes the additive effect for QTL i (ai) minus half the sum of dominance × additive epistatic interactions (daij) with all other QTL, corresponding exactly to the net contribution of QTL i to the parental difference.
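The digenic expressions for MPH and PD can be checked by brute force from the genotypic values of P1, P2, and the F1. The sketch below assumes that PD is taken as G(P2) − G(P1) and that da[i][j] denotes dominance at locus i by additive at locus j; all effect values are hypothetical.

```python
X = {0: -1.0, 1: 0.0, 2: 1.0}   # additive coefficient for v = 0, 1, 2
Y = {0: -0.5, 1: 0.5, 2: -0.5}  # dominance coefficient for v = 0, 1, 2

def genotypic_value(v, mu, a, d, aa, da, dd):
    """Digenic F2-metric model; aa and dd are read for i < j, da for i != j."""
    q = len(v)
    g = mu
    for i in range(q):
        g += X[v[i]] * a[i] + Y[v[i]] * d[i]
        for j in range(q):
            if j == i:
                continue
            g += Y[v[i]] * X[v[j]] * da[i][j]
            if i < j:
                g += X[v[i]] * X[v[j]] * aa[i][j] + Y[v[i]] * Y[v[j]] * dd[i][j]
    return g

# Hypothetical effects for q = 3 loci.
q = 3
effects = dict(mu=10.0, a=[2.0, -1.0, 0.5], d=[1.0, 0.5, 0.25],
               aa=[[0, 0.5, -0.25], [0, 0, 1.0], [0, 0, 0]],
               da=[[0, 0.5, 1.0], [-0.5, 0, 0.25], [0.75, -1.0, 0]],
               dd=[[0, 2.0, -1.0], [0, 0, 0.5], [0, 0, 0]])
a, d, aa, da = effects["a"], effects["d"], effects["aa"], effects["da"]

g_p1 = genotypic_value((0,) * q, **effects)
g_p2 = genotypic_value((2,) * q, **effects)
g_f1 = genotypic_value((1,) * q, **effects)
mph = g_f1 - 0.5 * (g_p1 + g_p2)
pd_value = g_p2 - g_p1

d_star = [d[i] - 0.5 * sum(aa[min(i, j)][max(i, j)] for j in range(q) if j != i)
          for i in range(q)]
a_star = [a[i] - 0.5 * sum(da[i][j] for j in range(q) if j != i) for i in range(q)]
print(mph, sum(d_star))
print(pd_value, 2 * sum(a_star))
```

Note that the dd effects, although nonzero here, cancel out of both MPH and PD.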
IDENTIFICATION OF AUGMENTED QTL EFFECTS
On the basis of the results from the previous section it becomes obvious that we need to revise our approaches for the identification of genomic regions contributing to heterosis. Instead of identifying QTL with maximum dominance and dd interactions that increase F1 performance we need to identify genomic regions that contribute to MPH, i.e., that yield significant results in the QTL analysis due to the dominance effect at QTL i and its epistatic interaction with the genetic background. Thus, specific experimental designs are needed that identify QTL with genotypic expectations that precisely equal the augmented dominance effect di*. To our knowledge this criterion is met only by genetic effects estimated with design III (Comstock and Robinson 1952). The original design III comprised the analysis of F2 individuals backcrossed to their parental inbred lines and was devised for estimating the average degree of dominance over all loci. We modified the statistical analysis of design III to accommodate the analysis of testcrosses of recombinant inbred lines (RILs) because they are “immortal” test units that can be shared between research groups and can be repeatedly phenotyped.
Design III with RILs:
Let us assume a random population of RILs derived from the cross between the two homozygous lines P1 and P2. Further, we assume that the RILs are backcrossed to their parental lines, yielding testcross progenies H1 and H2. Gene effects of the design III testcross progenies are expressed with regard to the corresponding gene-orthogonal population, i.e., the F2. The parental line exhibiting superior average testcross performance across both testers is denoted as P2. With testcrosses evaluated in a randomized complete block design we obtain phenotypic trait values Ytpk of the testcross progeny Ht of RIL p (p = 1,…, n) crossed with tester t (t = 1, 2 for design III) in the kth block (k = 1,…, r). Following Comstock and Robinson (1952), we perform two linear transformations Zs (s = 1, 2) on the performance data Ytpk of testcross progenies Ht with pair means Z1pk = (Y1pk + Y2pk)/2 and pair differences Z2pk = Y1pk − Y2pk. Thus, Zspk denotes the phenotypic value of transformation Zs for RIL p grown in the kth block and Zsp the progeny mean value of Zs for RIL p. Expected mean squares from the ANOVA are given in Table 1.
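The two linear transformations are simple to compute. The following sketch uses hypothetical phenotypic values for two RILs in two blocks; the function name `design3_transforms` and the nested-list data layout are illustrative assumptions.

```python
def design3_transforms(y1, y2):
    """Return progeny means of Z1 (pair means) and Z2 (pair differences).

    y1[p][k] and y2[p][k] hold the testcross value of RIL p with tester
    P1 and P2, respectively, in block k.
    """
    z1_means, z2_means = [], []
    for row1, row2 in zip(y1, y2):
        z1 = [(u + w) / 2 for u, w in zip(row1, row2)]   # Z1pk = (Y1pk + Y2pk)/2
        z2 = [u - w for u, w in zip(row1, row2)]         # Z2pk = Y1pk - Y2pk
        z1_means.append(sum(z1) / len(z1))
        z2_means.append(sum(z2) / len(z2))
    return z1_means, z2_means

# Hypothetical data: n = 2 RILs, r = 2 blocks.
y1 = [[10.0, 12.0], [8.0, 9.0]]    # testcrosses with P1
y2 = [[14.0, 16.0], [7.0, 8.0]]    # testcrosses with P2
z1_means, z2_means = design3_transforms(y1, y2)
print(z1_means, z2_means)
```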
Table 1. Analysis of variance of design III evaluated in a randomized complete block design
Marker-based estimation of augmented effects ai* and di*:
The first to present the statistical theory for estimation of the type of gene action at individual QTL with design III were Cockerham and Zeng (1996). They presented genotypic expectations of marker contrasts using single-marker ANOVA and extended the analysis to test for two-locus interactions between linked QTL. They demonstrated that genotypic expectations of QTL mapped with design III were biased with epistasis. Dominance effects were confounded with aa epistasis, additive effects with da interactions regardless of linkage. However, interactions of individual QTL with the entire genetic background were not accounted for and, most importantly, they did not make the connection to genetic expectations of MPH. While Cockerham and Zeng (1996) considered the confounding epistatic effects in their analysis a limitation, we claim that they are favorable for the identification of genomic regions contributing to heterosis. The following derivations show that genotypic expectations of QTL identified with design III precisely equal the augmented dominance effect di*.
Cockerham and Zeng (1996) defined four orthogonal single-marker contrasts (C1, C2, C3, C4) among the means of testcross progenies from F2 individuals or F3 lines, estimating additive (C1) and dominance effects (C3) as well as digenic epistasis of linked QTL (C2, C4). Here, we extend their methods to the analysis of heterosis and to the analysis of RILs. We defined two orthogonal single-marker contrasts on the basis of progeny mean values Zsp for pair means (Z1(m)) and pair differences (Z2(m)) (see the appendix for a generalized derivation). While contrasts C1 and C3 in the notation of Cockerham and Zeng (1996) correspond to Z1(m) and Z2(m) in our notation, contrasts C2 and C4, which provide tests for epistasis among linked QTL, cannot be calculated for RILs because they rely on comparisons of the heterozygous vs. homozygous marker classes.
With epistasis restricted to digenic effects, we obtain
(8)
(9)
with Dmi being the linkage disequilibrium between QTL i and marker m (Weir 1996). For RILs, Dmi can be calculated from the recombination frequency rmi between QTL i and marker m (for further details see the appendix).
As can be seen from Equation 9, the expectation of Z2(m) is a multiple of the sum of the augmented dominance effects di* for all QTL with Dmi > 0, weighted by their linkage disequilibrium Dmi to the marker m. Thus, estimation of dominance effects of QTL linked to marker m is confounded with digenic epistatic interactions of type additive × additive with the entire genetic background. Analogously, the linear contrast Z1(m) is a multiple of the sum of augmented additive effects ai* for all QTL with Dmi > 0, weighted by their linkage disequilibrium Dmi to the marker.
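How the weight Dmi declines with map distance can be sketched as follows. This is not taken from the appendix: it assumes RILs developed by repeated selfing (so that the proportion of recombinant lines is 2r/(1 + 2r)), allele frequencies of 1/2, and a marker contrast defined as the difference between the means of the two homozygous marker classes. The function names are hypothetical.

```python
def ld_ril_selfing(r):
    """Linkage disequilibrium between two loci in selfed RILs (assumption)."""
    recombinant_fraction = 2.0 * r / (1.0 + 2.0 * r)
    return 0.25 * (1.0 - 2.0 * recombinant_fraction)

def ld_doubled_haploid(r):
    """Linkage disequilibrium after a single meiosis, for comparison."""
    return 0.25 * (1.0 - 2.0 * r)

def expected_z2_marker_contrast(d_star, r_to_marker):
    """Sum of d_i* weighted by D_mi; the factor 8 follows from the assumed
    contrast definition, so that a fully linked QTL contributes 2 d_i*."""
    return 8.0 * sum(ld_ril_selfing(r) * e for e, r in zip(d_star, r_to_marker))

for r in (0.0, 0.1, 0.3, 0.5):
    print(r, ld_ril_selfing(r), ld_doubled_haploid(r))
print(expected_z2_marker_contrast([1.0, 0.5], [0.0, 0.5]))
```

Under these assumptions the two population types agree at r = 0 and r = 0.5 and differ only at intermediate linkage, and an unlinked QTL does not contribute to the marker contrast.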
Expected mean squares for an ANOVA that includes contrasts Z1(m) and Z2(m) for design III evaluated in a randomized complete block design are presented in Table 1. Appropriate statistical tests can be derived by standard statistical theory. From expected mean squares, it becomes evident that the power of detecting significant QTL effects linked to marker m depends mainly on the segregation variances of the variables Z1 and Z2 with respect to QTL effects not accounted for by the marker m, i.e., the genetic background variation.
With single-marker ANOVA the QTL position (here reflected by parameter Dmi) and QTL effects (here ai* or di*) are confounded. Hence, the effects of linked QTL cannot be separated, and effects of other unlinked QTL, which cause background noise, are not accounted for. The above shortcomings have been overcome by CIM (Zeng 1994; Jansen and Stam 1994), which allows separate estimation of the QTL position and the QTL effect as well as separation of linked QTL and control of genetic background variation. We have adopted this method for identification of genomic regions affecting heterosis, using the model
$$Z_{sp} = b_{0s} + b_s^*\, x_p^* + \sum_{l=1}^{h} b_{ls}\, x_{lp} + \varepsilon_{sp} \tag{10}$$
where Zsp is the linear function Zs calculated for the pth RIL, b0s is the mean effect of the model for Zs, bs* is the augmented effect of the putative QTL on Zs, xp* is the conditional probability of the dummy variable θm given the observed genotypes at the marker loci m and (m + 1) flanking the putative QTL (θm takes value 0 or 2 if the genotype of the RIL at the QTL is homozygous P1 or P2, respectively), bls are partial regression coefficients of Zsp on the lth marker assuming h markers included in the model as cofactors, xlp is an indicator variable taking value 0 or 2 depending on the genotype at marker l, and εsp is the residual effect on Zs for the pth RIL. CIM analyses can be individually performed for the two linear functions Zs from design III. Tests of significance are straightforward as described for CIM (Zeng 1994) with the null hypothesis H0: bs* = 0 and the alternative hypothesis HA: bs* ≠ 0.
If we assume complete linkage between the marker m and one of the QTL (rmi = 0), we obtain the following genotypic expectations for the contrast of the two homozygous marker classes at marker m:
(11)
(12)
The advantages of CIM in comparison to single-marker ANOVA become obvious immediately: (i) the position of the QTL i can be estimated and (ii) the effect of the linked QTL j should be blocked by the use of cofactors (Jansen and Stam 1994; Zeng 1994). Thus, it is possible to test contrasts for QTL (not markers) and the genotypic expectation for Z2(i), the contrast of Z2 between the two unobservable homozygous genotype classes at QTL i, reduces to 2di*. Consequently, a genome scan with Z2 localizes genomic regions affecting MPH and b2*/2 = di*. Accordingly, ai*, the contribution of QTL i to the parental difference, is equal to b1*. The advantage of design III is that genotypic expectations of QTL mapped with Z2 precisely equal their net contribution to MPH. However, the contribution of the main dominance effect di cannot be assessed independently from the sum of aa epistatic interactions of QTL i with the genetic background, which is a limitation of design III.
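The relations b1* = ai* and b2*/2 = di* can be checked numerically in an idealized setting. The sketch below enumerates all 2^q RIL genotypes with equal weight (i.e., it assumes unlinked QTL, no error, and digenic epistasis only), builds the two testcross genotypes of each RIL, and contrasts Z1 and Z2 between the two homozygous classes at each QTL. Effect values and helper names are hypothetical; da[i][j] is read as dominance at locus i by additive at locus j.

```python
from itertools import product

X = {0: -1.0, 1: 0.0, 2: 1.0}
Y = {0: -0.5, 1: 0.5, 2: -0.5}

def genotypic_value(v, a, d, aa, da):
    """Digenic F2-metric genotypic value (mu and dd omitted; they cancel here)."""
    q = len(v)
    g = 0.0
    for i in range(q):
        g += X[v[i]] * a[i] + Y[v[i]] * d[i]
        for j in range(q):
            if i < j:
                g += X[v[i]] * X[v[j]] * aa[i][j]
            if i != j:
                g += Y[v[i]] * X[v[j]] * da[i][j]
    return g

def qtl_contrasts(a, d, aa, da):
    q = len(a)
    z1, z2 = {}, {}
    for ril in product((0, 2), repeat=q):
        h1 = tuple(0 if v == 0 else 1 for v in ril)   # RIL x P1
        h2 = tuple(1 if v == 0 else 2 for v in ril)   # RIL x P2
        y1 = genotypic_value(h1, a, d, aa, da)
        y2 = genotypic_value(h2, a, d, aa, da)
        z1[ril], z2[ril] = (y1 + y2) / 2, y1 - y2

    def contrast(z, i):
        p2_class = [val for ril, val in z.items() if ril[i] == 2]
        p1_class = [val for ril, val in z.items() if ril[i] == 0]
        return sum(p2_class) / len(p2_class) - sum(p1_class) / len(p1_class)

    return ([contrast(z1, i) for i in range(q)],
            [contrast(z2, i) for i in range(q)])

a, d = [2.0, -1.0, 0.5], [1.0, 0.5, 0.25]
aa = [[0, 0.5, -0.25], [0, 0, 1.0], [0, 0, 0]]
da = [[0, 0.5, 1.0], [-0.5, 0, 0.25], [0.75, -1.0, 0]]
q = len(a)
d_star = [d[i] - 0.5 * sum(aa[min(i, j)][max(i, j)] for j in range(q) if j != i)
          for i in range(q)]
a_star = [a[i] - 0.5 * sum(da[i][j] for j in range(q) if j != i) for i in range(q)]
c1, c2 = qtl_contrasts(a, d, aa, da)
print(c1, a_star)
print(c2, [2 * e for e in d_star])
```

In this idealized case the Z1 contrast at QTL i equals ai* and the Z2 contrast equals 2di*.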
Extension of heterotic QTL analyses:
In addition to separate CIM of Z1 and Z2 we propose to apply the theory of a joint analysis for multiple traits as suggested by Jiang and Zeng (1995) to the simultaneous mapping of augmented QTL effects ai* and di*. Depending on the correlation structure of data on multiple traits a joint analysis may increase the power of the likelihood-ratio (LR) tests for QTL detection and may allow us to distinguish between pleiotropy and close linkage of QTL for individual traits. Following Jiang and Zeng (1995), we regard Z1 and Z2 as two different traits and rewrite the model from Equation 10 in matrix notation,
$$Z = x^* b^* + X B + \mathcal{E} \tag{13}$$
where Z is a matrix of Zsp, x* is a column vector of xp*, b* is a row vector of bs*, X and B are two matrices controlling the genetic background variation, and ℰ is a matrix of εsp.
The power of QTL detection using the model from Equation 13 is significantly increased by a joint LR test if the product of the effects b1* (corresponding to ai*) and b2* (corresponding to 2di*) and the correlation ρ12 between residuals ε1p and ε2p have different signs (Jiang and Zeng 1995). In the joint QTL analysis, the hypotheses to be tested are: H0: b1* = 0, b2* = 0 vs. HA: b1* ≠ 0 or b2* ≠ 0.
On the basis of experimental results from the analysis of complex traits it seems reasonable to assume that marker–trait associations detected with the model in Equation 10 will rarely explain >50% of the genotypic variance even with large sample sizes (Schön et al. 2004). Thus, the residual component must be subdivided into two components; i.e., εsp = gsp + esp, with gsp reflecting the genetic component in εsp not accounted for by the putative QTL and the cofactors in the model, and esp being the experimental error. As a consequence, residual components ε1p and ε2p can be correlated within RILs but are independent among RILs. The correlation ρ12 between residuals ε1p and ε2p can be approximated using the derivation of covariances presented in the appendix, assuming digenic epistasis and absence of linkage:
(14)
Hence, it follows that ρ12 depends mainly on the variation in sign and magnitude of ai* and di*. When P1 and P2 are elite inbred lines, we expect that about half of the trait-increasing alleles are contributed by P1 (i.e., ai* < 0) and half by P2 (i.e., ai* > 0). With directional dominance (di* > 0), the correlation of residuals is in this case expected to be close to zero. Thus, the joint LR test statistic will approximate the sum of the individual LR test statistics and no increase in the power of QTL detection can be expected from a genome scan using the multivariate model from Equation 13. However, when analyzing progenies from crosses between an elite parent (P2) and an exotic donor line (P1) with a much smaller proportion of positive alleles than the elite parent the joint LR test statistic can be advantageous for finding heterotic QTL contributed by P1. If P2 contributes the majority of positive dominant alleles, i.e., ai* > 0 and di* > 0, then ρ12 > 0. In genome regions where a positive dominant allele originates from P1 (ai* < 0; di* > 0), the product of effects ai* and di* will be negative, while ρ12 > 0. Thus, the power of detecting heterotic QTL from exotic donor parents will be increased with the joint LR test.
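The sign argument can be made concrete with a small sketch. It is not a reconstruction of Equation 14: it assumes unlinked QTL, keeps only the main augmented effects (epistatic cross-product terms are ignored), and uses the fact that under these assumptions the covariance of Z1 and Z2 across RILs is half the sum of the products ai*di*. Effect values and function names are hypothetical.

```python
def main_effect_covariance(a_star, d_star):
    """Covariance of Z1 and Z2 over RILs from main augmented effects only
    (assumes unlinked loci and ignores epistatic product terms)."""
    return 0.5 * sum(a * d for a, d in zip(a_star, d_star))

def joint_test_gains_power(b1, b2, rho12):
    """Condition of Jiang and Zeng (1995): the product of the effects and the
    residual correlation have different signs."""
    return b1 * b2 * rho12 < 0

d_star = [1.0, 1.0, 1.0, 1.0]                       # directional dominance
elite_by_elite = main_effect_covariance([1.0, -1.0, 1.0, -1.0], d_star)
elite_by_exotic = main_effect_covariance([1.0, 1.0, 1.0, -1.0], d_star)
print(elite_by_elite, elite_by_exotic)

# QTL whose positive dominant allele comes from P1: a* < 0, d* > 0, rho12 > 0.
print(joint_test_gains_power(b1=-1.0, b2=2.0, rho12=0.4))
# QTL whose positive dominant allele comes from P2: a* > 0, d* > 0, rho12 > 0.
print(joint_test_gains_power(b1=1.0, b2=2.0, rho12=0.4))
```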
In addition to increasing the power of QTL detection with multiple traits, the statistical method suggested by Jiang and Zeng (1995) also provides a test for separating pleiotropy from close linkage. In our case, the analogous test can be applied to distinguish between the hypothesis of dominance vs. overdominance in genomic regions with significant effects ai* and di*. LR tests for the null hypothesis H0: z(1) = z(2) vs. HA: z(1) ≠ z(2), where z(1) and z(2) are the positions of the QTL affecting Z1 and Z2, respectively, have been described by Jiang and Zeng (1995). Under the null hypothesis, the same genetic locus contributes to the augmented effects ai* and di*. If both effects are of similar magnitude, acceptance of H0 would imply that dominant gene action is prevalent. In contrast, acceptance of HA would imply that one locus shows additive gene action (ai* > 0; di* = 0) and at the second locus the heterozygote outperforms both homozygotes, thus exhibiting overdominance (aj* = 0; dj* > 0).
GENOTYPIC EXPECTATION OF BETTER PARENT HETEROSIS
When analyzing heterosis in self-pollinating crops such as wheat, rice, or tomato the reference base for the calculation of heterosis is often not the midparent value but the superiority over the better parent (e.g., Semel et al. 2006). However, if one is interested in the genetic causes of heterosis, it is most plausible to compare the hybrid with the average performance of both parental lines (and not only the better parent) because the F1 inherited half its nuclear genome from each parent. As can be seen from the following derivations, better parent heterosis (BPH) can be expressed as a function of augmented dominance and additive effects. Consequently, the genetic causes of BPH are more complex than those of MPH and include the latter.
Defining P2 to be the better-performing parent, the genotypic expectation of BPH can be calculated as
$$\mathrm{BPH} = G_{F_1} - G_{P2} = \sum_{\substack{D \subseteq Q \\ |D|\ \mathrm{odd}}} \left(\tfrac12\right)^{|D|-1} \alpha_{D} \;-\; \sum_{\substack{A \subseteq Q \\ A \neq \emptyset}}\ \sum_{D \subseteq Q \setminus A} \left(-\tfrac12\right)^{|D|} \alpha_{AD}$$
Thus, we obtain that BPH is affected by genetic effects with an odd number of dominance terms and effects with at least one additive and an arbitrary number of dominance terms. Considering only digenic epistasis,
$$\mathrm{BPH} = \sum_{i \in Q} \left(d_i^* - a_i^*\right)$$
BPH can be estimated using phenotypic data −H2 (i.e., backcrosses of RILs to P2). If we assume complete linkage between the marker m and one of the QTL contributing to BPH (rmi = 0), we obtain the following genotypic expectations for the contrast of the two homozygous genotypic marker classes at marker m:
(15)
Using CIM for QTL mapping with −H2, the position of the QTL i can be estimated and the effect of the linked QTL j is blocked by the use of cofactors. Hence, analogously to Z2(i) the genotypic expectation for −H2(i), the contrast of −H2 between the two unobservable homozygous genotype classes at QTL i, reduces to di* − ai*.
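The link between BPH, MPH, and the parental difference can be written out in a few lines. The following is a sketch under the assumptions that P2 is the better parent, that PD is taken as the difference between the genotypic values of P2 and P1, and that epistasis is restricted to digenic effects, so that MPH and PD/2 are the sums of the augmented dominance and augmented additive effects, respectively:

```latex
\begin{align*}
\mathrm{BPH} &= G_{F_1} - G_{P2} \\
             &= \Big[ G_{F_1} - \tfrac12 \left( G_{P1} + G_{P2} \right) \Big]
                - \tfrac12 \left( G_{P2} - G_{P1} \right) \\
             &= \mathrm{MPH} - \tfrac12\, \mathrm{PD}
              = \sum_{i \in Q} d_i^* - \sum_{i \in Q} a_i^*
\end{align*}
```

This also makes explicit why the genetic causes of BPH include those of MPH: every QTL contributes through di*, and in addition through ai*.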
EXTENSION OF ANALYSES TO QUADRIGENIC EPISTASIS
Considering the special case of quadrigenic epistasis and using Equation 1, the genotypic expectations of PD and MPH can be calculated as
$$\mathrm{PD} = 2\sum a - \sum da + \tfrac12 \sum dda + 2\sum aaa - \sum daaa - \tfrac14 \sum addd$$
$$\mathrm{MPH} = \sum d - \sum aa + \tfrac12 \sum daa + \tfrac14 \sum ddd - \sum aaaa - \tfrac14 \sum aadd$$
where each sum runs over all distinct effects of the given type among the q loci. It becomes obvious that effects of higher-order epistasis contribute to PD and MPH. Even if individual effects are small, their summed effects may be large. In the case of quadrigenic interactions, the sum of effects aaaa comprises $\binom{q}{4}$ individual effects aaaaijkl and the sum of aadd effects comprises $6\binom{q}{4}$ individual effects aaddijkl. When looking at expectations of the marker contrast at marker m for pair means Z1 and pair differences Z2 and assuming rmi = 0 and linkage equilibrium between QTL, we obtain
$$Z_1(m) = a_i^* + \tfrac14 \sum_{j<k} aaa_{ijk} - \tfrac18 \sum_{j<k<l} daaa_{ijkl}$$
$$Z_2(m) = 2 d_i^* + \tfrac12 \sum_{j<k} daa_{ijk} - \tfrac14 \sum_{j<k<l} aaaa_{ijkl}$$
with j, k, l ∈ Qi and the dominance term of daa and daaa referring to QTL i.
Thus, the estimate b2*/2 from CIM yields QTL effect estimates that account not only for di* but also for the contribution of daa and aaaa interactions of QTL i with the genetic background. However, compared with the contribution of digenic interactions to MPH, effects of type daa and aaaa are only half accounted for and effects ddd and aadd are not accounted for at all when estimating the heterotic effect at QTL i with CIM. Similar results are obtained for a comparison between genotypic expectations of PD and the estimate b1* from CIM, which differ mainly in the contribution of effects dda and addd. With arbitrary linkage disequilibrium among QTL the expectations above become rather unwieldy, but it can be shown that bs* explains a larger proportion of all types of higher-order epistatic effects contributing to midparent heterosis or to the parental difference if linkage between QTL is present.
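The remark that many small higher-order effects may sum to a large contribution can be illustrated by counting how many distinct effects of each type exist for a given number of QTL. This is a minimal sketch; the function name `count_effects` is hypothetical, and an effect type is identified only by its numbers of additive and dominance terms.

```python
from math import comb

def count_effects(q, n_add, n_dom):
    """Number of distinct effects with n_add additive and n_dom dominance
    terms among q loci: choose the additive set A, then D from the rest."""
    return comb(q, n_add) * comb(q - n_add, n_dom)

for q in (4, 10, 20):
    print(q,
          count_effects(q, 2, 0),   # aa
          count_effects(q, 4, 0),   # aaaa
          count_effects(q, 2, 2))   # aadd
```

With q = 10 loci there are 45 aa effects but already 210 aaaa and 1260 aadd effects.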
GENETIC CONSTITUTION OF VARIANCES
Generalized derivations of the genetic constitution of the variances of Z1 and Z2 for two QTL are given in the appendix. Contributions of genetic effects to the variances of pair means (Z1) and pair differences (Z2) for RILs summed over all QTL assuming arbitrary linkage and digenic epistasis are presented in Table 2. If segregating QTL show intermediate linkage, the linkage disequilibrium coefficient Dij and consequently the genotypic expectations of both variances differ for RILs and double-haploid lines (DHLs), but deviations are small (see appendix). For unlinked (rij = 0.5) and completely linked (rij = 0) loci, variances for RILs and DHLs are identical.
Table 2. Contribution of genetic effects to variances of pair means (Z1) and pair differences (Z2) of recombinant inbred lines (RILs) and F2 progenies backcrossed to parental lines P1 and P2, assuming linkage and digenic epistasis
Using the definitions of ai* given in Equation 7 and di* given in Equation 4 and defining the quadratic forms
(16)
and
(17)
the variances of pair means (Z1) and pair differences (Z2) can be expressed as shown in Table 2. For a direct comparison with the results obtained for F2 progenies, Table 2 also presents the corresponding variances for F2 progenies, which can be readily transformed into those presented by Cockerham and Zeng (1996). Regardless of the type of population used for producing the testcrosses (i.e., RIL or F2), it can be seen that the quadratic forms of augmented additive and augmented dominance effects are the main components of the variances of Z1 and Z2, respectively. The bias due to digenic epistatic interactions of types aa and dd as well as ad and da is small, especially for F2, as pointed out also by Cockerham and Zeng (1996). Thus, we postulate that the variances of Z1 and Z2 for RILs are reasonable approximations for 1/4 of the quadratic form of augmented additive effects and for the quadratic form of augmented dominance effects, respectively. As a result, the estimate of the variance of pair differences obtained from the ANOVA of design III is a close approximation of the variance of heterotic effects at QTL segregating in the cross P1 × P2.
AVERAGE DEGREE OF DOMINANCE
Originally, design III was devised to provide an estimate of the average degree of dominance over loci calculated from the ratio of dominance to additive variance. The ratio proposed by Comstock and Robinson (1952) is an approximation of the ratio of the quadratic forms of augmented dominance (di*) and additive (ai*) effects, rather than of the ratio of dominance and additive variance. Therefore, this ratio should be denoted the augmented degree of dominance.
Estimation of the augmented degree of dominance can be biased by linked QTL, if linkage equilibrium among them has not been reached. Genetic effects at linked QTL contributing to the quadratic form of augmented additive effects (i.e., the cross-products of ai* and aj*) have the same sign when loci are in coupling, while signs are different when loci are in repulsion. If two elite parents are crossed and coupling and repulsion linkages occur with equal probabilities, the effects of linked loci are likely to cancel in this quadratic form. However, estimates of the quadratic form of augmented dominance effects are likely to be inflated because Dij is positive by definition and in hybrid breeding, loci with high positive augmented dominance effects (di*) are favored in reciprocal recurrent selection. Thus, the contribution of genetic effects at linked loci to this quadratic form (i.e., the cross-products of di* and dj*) is generally positive irrespective of their linkage phase and the augmented degree of dominance will be strongly affected by the presence of epistasis and the magnitude of the linkage disequilibrium between QTL. The numerical example in supplemental Table S2 (http://www.genetics.org/supplemental/) clearly shows the effect of aa epistasis on estimates of the augmented degree of dominance for unlinked loci. Altogether, the ratio of variance components obtained with design III is not always a useful estimate of the average degree of dominance, thus questioning the interpretation of many experimental results on the relative importance of dominance, overdominance, and epistasis in the expression of quantitative traits.
DISCUSSION
Genotypic components of trait performance and heterosis:
Elucidating the genetic basis of heterosis has been the aim of a number of studies making use of advances in molecular biology. Some studies compared specific molecular traits such as differential gene expression in the F1 hybrid and the parental inbred lines (e.g., Guo et al. 2006; Swanson-Wagner et al. 2006). A number of authors used testcross progenies of F3 lines or RILs with the parental inbred lines as testers for QTL studies on heterosis (Stuber et al. 1992; Xiao et al. 1995; Li et al. 2001; Luo et al. 2001). Estimation of QTL effects was performed separately for the backcrosses to each parental line, except for a recent study with triple testcross progenies in maize, which also used 2Z1 and Z2 for QTL mapping (Frascaroli et al. 2007). For the backcrosses to the better-performing parent, this corresponds to the marker contrast −H2(m) defined in Equation 15. Thus, with interval mapping (Lander and Botstein 1989), their analyses yielded estimates for QTL contributing to BPH (di* − ai*) for the progenies backcrossed to one parent and for poorer-parent heterosis (di* + ai*) for the progenies backcrossed to the other parent. Consequently, conclusions on gene action at individual QTL can be made only within this context.
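The dependence on context can be illustrated with a small sketch. If the same QTL were estimated in both backcross series, the two estimates would have the expectations di* − ai* and di* + ai* given above, so the augmented effects follow as half their sum and half their difference. The function name and the numerical values are hypothetical, and estimation error is ignored.

```python
def augmented_effects(bc_minus, bc_plus):
    """Recover (a_star, d_star) from two backcross QTL estimates.

    bc_minus: estimate with expectation d* - a*
    bc_plus:  estimate with expectation d* + a*
    """
    d_star = (bc_plus + bc_minus) / 2.0  # half the sum
    a_star = (bc_plus - bc_minus) / 2.0  # half the difference
    return a_star, d_star


# Hypothetical estimates for one QTL (illustrative numbers only).
a_star, d_star = augmented_effects(bc_minus=1.0, bc_plus=3.0)
print("a* =", a_star, "d* =", d_star)
```

An estimate from a single backcross series fixes only one of the two combinations, which is why neither ai* nor di* can be read off from it alone.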
Semel et al. (2006) used a set of tomato introgression lines to exclude the confounding effects of genetic background variation and epistasis from the analysis of heterosis. Parental genotypes and the F1 hybrid differed exclusively in one defined chromosome segment while the entire genetic background originated from the elite parent. For a multitude of traits, the authors compared the phenotypic means of the elite parent, the homozygous introgression line, and the hybrid between the two. On the basis of these comparisons, the type of gene action at QTL was determined. However, as pointed out earlier, trait performance and heterosis have different genotypic expectations. Employing the quantitative genetic theory derived in this article, it can be shown in terms of the F2 metric that genotypic expectations of QTL effects contributing to MPH and BPH comprise both main and epistatic effects, despite the fact that the entire genetic background originates from the elite parent (Melchinger et al. 2007).
The use of an immortalized F2 population was proposed by Hua et al. (2003) for identifying genomic regions that contribute to MPH. RILs were derived from a heterotic rice cross and intermated for construction of an immortalized F2 population. Heterosis was calculated as the phenotypic deviation of each immortal F2 from the mean of its two RIL parents. Digenic interactions were estimated on the basis of the interaction effect of two marker loci. Genotypic expectations of the QTL main effects and interactions were not given by the authors. On the basis of our findings, we postulate that the immortalized F2 design has great value for estimating the dominance effect di and certain types of digenic epistatic interactions but does not identify QTL with genotypic expectations that equal precisely di*, their contribution to MPH (derivations not shown).
We conclude that none of the available experimental designs of quantitative genetics has the potential to separate the dominance effect di and half the sum of aa epistatic interactions confounded in the augmented dominance effect di* of heterotic QTL. Design III identifies heterotic QTL but does not allow separation of dominance and aa epistasis. For the time being, the only solution to this problem is to identify QTL contributing to heterosis with design III and to estimate the augmented QTL effect di*. In genomic regions exhibiting significant heterosis, the augmented dominance effect di* can be dissected into its components (di and half the sum of aa epistatic interactions) by additionally employing the immortal F2 design and estimating the magnitude of the dominance effect di. With RILs, the same lines can be used for generating the progenies for design III and the immortal F2 population, and marker data generated for the RILs can readily be employed in both types of analysis. The optimal dimensions of experimental studies applying our approach with respect to population size and population type as well as marker density have yet to be determined. Epistatic interactions of QTL i with the genetic background are expected to vary considerably across RILs. Implications of this biological variation on estimates of di* will be addressed in a separate study.
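The relation between di, di*, and the aa interactions can be made concrete with a short numerical sketch. It assumes only the definition stated in the abstract (di* equals the dominance effect di minus half the sum of aa interactions of QTL i with all other QTL) and a symmetric matrix of aa effects. The function name and all effect values are hypothetical.

```python
import numpy as np


def augmented_dominance(d, aa):
    """Return d_i* = d_i - 0.5 * sum over j != i of aa_ij."""
    d = np.asarray(d, dtype=float)
    aa = np.asarray(aa, dtype=float)
    off_diag = aa - np.diag(np.diag(aa))  # drop any diagonal entries
    return d - 0.5 * off_diag.sum(axis=1)


# Hypothetical effects for three QTL (illustrative numbers only).
d = np.array([2.0, 1.0, 0.5])
aa = np.array([[0.0, 0.4, 0.2],
               [0.4, 0.0, 0.6],
               [0.2, 0.6, 0.0]])

d_star = augmented_dominance(d, aa)

# Given d_i* (design III) and d_i (immortalized F2), the confounded
# epistatic component is obtained by subtraction.
half_sum_aa = d - d_star

print("d*          :", d_star)
print("half sum aa :", half_sum_aa)
```

In this sketch the third QTL has a clearly positive di but a di* close to zero, showing how positive aa interactions with the background can mask a dominance effect in design III.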
Epistasis and heterosis:
As has been shown in this study, even small epistatic interactions can be important for the expression of heterosis because their contribution to the genotypic expectation of augmented QTL effects sums up over many effects. On the basis of results from classical quantitative genetic experiments, we have reason to assume that epistasis plays an important role in the inheritance of quantitative traits and heterosis (e.g., Jinks and Jones 1958), but results from QTL studies on the importance of epistasis have been rather ambiguous (e.g., Stuber et al. 1992; Cockerham and Zeng 1996; Frascaroli et al. 2007). In QTL analyses, the power of detecting QTL with epistatic effects is generally low, mainly due to the problem of multiple testing in two- or multidimensional genome scans (Lander and Botstein 1989) or due to the necessity of a priori model selection with one-dimensional scans (Kao et al. 1999). We need to keep in mind that QTL with significant epistatic interaction effects might not be representative of the majority of QTL with small effects contributing to gene networks that control the expression of quantitative traits. Hence, we are likely to introduce an ascertainment bias as pointed out by Kroymann and Mitchell-Olds (2005).
Type of epistasis:
When estimating the most prominent type of epistatic interactions in the expression of quantitative traits and heterosis in rice, a preponderance of aa epistatic effects was identified compared with ad or dd interactions (Yu et al. 1997; Hua et al. 2003). In self-pollinated crops, it is well known that coadapted gene complexes are favored by selection. As a consequence, half the sum of mainly positive aa interactions enters di* with a negative sign, thus decreasing MPH and the power for detection of heterotic QTL. It is questionable if the results on heterosis from self-pollinated crops are directly applicable to cross-pollinated crops (Frascaroli et al. 2007). Economic seed production, however, requires the development of inbred lines with high grain yield. Combined with the management of separate heterotic pools, it is highly probable that coadapted gene complexes are selected during inbred line development. As a consequence, if the sum of aa epistatic interactions in the parents increases due to selection, MPH should decrease over time unless the sum of dominance effects at QTL influencing heterosis increases proportionally. A decrease in relative superiority of hybrids compared with their inbred lines has been described for maize (Duvick 1999). This decrease in relative heterosis can be the result of the accumulation of favorable dominant alleles at individual QTL, but it can also be explained by overdominance in the presence of aa epistatic effects contributed by the parents. If this is the case, the outcome of marker-assisted selection programs aiming at the transfer of QTL for maximization of heterosis will strongly depend on the presence of favorable epistatic interactions with the genetic background in the respective germplasm and will be difficult to predict.
In conclusion, the results presented here are important in two ways. First, they provide the quantitative genetic theory to express heterosis as the sum of individual QTL effects. Second, they allow the assessment of epistatic interactions of individual QTL with the entire genetic background, thus extending the concept of epistasis from single-gene to system-level interactions. We suggest the use of CIM and design III with RILs to identify QTL expressing maximum heterosis (i.e., maximum di*). All analyses can be performed with an extended version of the software PLABQTL (Utz and Melchinger 1996; http://www.uni-hohenheim.de/plantbreeding/software/index.html). Permutation tests for determining the significance threshold (Doerge and Churchill 1996) and cross-validation for unbiased QTL estimation (Utz et al. 2000) can be readily applied. Using the joint likelihood-ratio test for augmented effects ai* and di* will improve identification of heterotic QTL from elite × exotic crosses and provide a first test to distinguish between dominance and overdominance. Applying models accounting for multilocus epistasis and using molecular tools to finely dissect genomic regions contributing to heterosis will allow an assessment of the relative contribution of epistatic interactions in the manifestation of heterosis.
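The permutation approach to significance thresholds mentioned above can be sketched as follows. This is not the PLABQTL implementation and it does not use CIM: for brevity it substitutes single-marker regression, with the squared marker-trait correlation as the test statistic, and it runs on simulated data. All names, the simulated marker matrix, and the simulated trait are assumptions made for illustration. Monomorphic markers are assumed to have been removed beforehand.

```python
import numpy as np


def max_stat(markers, y):
    """Largest squared marker-trait correlation over all markers."""
    m = markers - markers.mean(axis=0)
    yc = y - y.mean()
    num = (m * yc[:, None]).sum(axis=0) ** 2
    den = (m ** 2).sum(axis=0) * (yc ** 2).sum()
    return float(np.max(num / den))


def permutation_threshold(markers, y, n_perm=200, alpha=0.05, seed=0):
    """Genome-wide threshold: the (1 - alpha) quantile of the maximum
    statistic obtained after shuffling the trait values across lines."""
    rng = np.random.default_rng(seed)
    null = np.array([max_stat(markers, rng.permutation(y))
                     for _ in range(n_perm)])
    return float(np.quantile(null, 1.0 - alpha))


# Simulated example: 200 lines, 50 unlinked markers coded 0/2,
# one marker (index 10) carrying a QTL effect on the trait.
rng = np.random.default_rng(1)
markers = (rng.integers(0, 2, size=(200, 50)) * 2).astype(float)
y = 0.5 * markers[:, 10] + rng.normal(size=200)

threshold = permutation_threshold(markers, y)
observed = max_stat(markers, y)
print("threshold:", threshold, "observed maximum:", observed)
```

Shuffling the trait values breaks any marker-trait association while preserving the marker correlation structure, so taking the maximum over markers within each permutation accounts for multiple testing across the genome.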
APPENDIX: GENERAL DERIVATION OF EXPECTATIONS, VARIANCES, AND COVARIANCES OF PAIR MEANS (Z1) AND PAIR DIFFERENCES (Z2) AS WELL AS MARKER CONTRASTS Z1(m) AND Z2(m)
Let denote the coefficient of parameter
AD in the conditional genotypic expectation of the testcross progeny Ht (t = 1, 2) of a RIL with genotype vi at the ith QTL. Then, we obtain from Equation 1
(A1)with
and
Let E denote the vector of genetic effects AD for two QTL i and j under digenic epistasis; i.e.,
and
denote the vector of conditional genotypic expectations of testcross progeny Ht (t = 1, 2) with design III given the parental RIL has genotype vivj (with vi = 0, 2; vj = 0, 2). Then, Gt = HtE with elements of the matrices Ht equal to coefficients calculated according to Equation A1.
By calculating with pst equal to the stth element of
for design III, we get
where
denotes the vector of conditional genotypic expectations of Zs, given the genotype of the parental RIL.
To simplify formulas, to allow extension to multiple QTL, and to provide a generalized formula for RILs and DHLs, we use the parameter Dij to quantify linkage disequilibrium between loci i and j. Dij can be calculated from the recombination frequency rij between loci i and j by the formulasand
with g being the number of random-mating generations prior to selfing for the development of RILs or production of DHLs (Frisch and Melchinger 2006).
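As a hedged illustration of how linkage disequilibrium depends on the population type, the sketch below uses the standard result that the expected proportion of recombinant lines is R = r for DHLs derived from the F1 and R = 2r/(1 + 2r) for RILs developed by selfing (Haldane and Waddington 1931). It assumes g = 0 and expresses disequilibrium on the correlation scale, 1 − 2R. The scaling and the dependence on g in the formulas for Dij referred to above may differ from this simplification, and the function names are hypothetical.

```python
def recombinant_fraction(r, population):
    """Expected proportion of recombinant lines for two loci.

    r: recombination frequency per meiosis, 0 <= r <= 0.5
    population: "DHL" (doubled haploids from the F1) or
                "RIL" (recombinant inbred lines by selfing)
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("r must lie between 0 and 0.5")
    if population == "DHL":
        return r
    if population == "RIL":
        return 2.0 * r / (1.0 + 2.0 * r)
    raise ValueError("population must be 'DHL' or 'RIL'")


def ld_correlation(r, population):
    """Disequilibrium on the correlation scale (assumed scaling, g = 0)."""
    return 1.0 - 2.0 * recombinant_fraction(r, population)


for r in (0.0, 0.1, 0.3, 0.5):
    print(r, ld_correlation(r, "RIL"), ld_correlation(r, "DHL"))
```

Under these assumptions the two population types coincide for completely linked (r = 0) and unlinked (r = 0.5) loci and differ only at intermediate linkage, where the additional meioses during selfing reduce the disequilibrium in RILs.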
Under the assumption of Mendelian segregation we defineand
Expectations, variances, and covariances of Zs are given byand
Expectations of marker contrasts Z1(m) and Z2(m) for the case of two QTL i and j linked to marker locus m with marker classes u (u = 0, 2) are given as follows. We define the parental genotypes as 0m0i0j (P1) and 2m2i2j (P2). Recombination frequencies between the three loci are denoted rmi, rmj, and rij, respectively. The frequencies of the four possible QTL genotypes (ij = 22, 20, 02, 00) conditional on the marker genotype u at the marker locus m are given for RILs and DHLs in Table A1. Thus, we obtain the vector
and the conditional expectations of linear functions Zs can be calculated as Zs|m
From this, we obtain the orthogonal marker contrasts Zs(m) = Zs|m(u = 2) − Zs|m(u = 0) = (
(u = 2) −
(u = 0))
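The structure of this marker contrast can be sketched numerically: the contrast is the difference between the two conditional frequency vectors, multiplied by the vector of conditional genotypic expectations. In the sketch below the QTL genotypes are ordered ij = 22, 20, 02, 00 as in the text. The frequencies and expectations are hypothetical placeholders, not values taken from Table A1, and the function name is an assumption.

```python
import numpy as np


def marker_contrast(f_u2, f_u0, g_cond):
    """Zs(m) = (f(u = 2) - f(u = 0)) . G

    f_u2, f_u0: frequencies of QTL genotypes 22, 20, 02, 00 given
                marker genotype u = 2 and u = 0, respectively
    g_cond:     conditional expectations of Zs for the same genotypes
    """
    f_u2 = np.asarray(f_u2, dtype=float)
    f_u0 = np.asarray(f_u0, dtype=float)
    g_cond = np.asarray(g_cond, dtype=float)
    if not (np.isclose(f_u2.sum(), 1.0) and np.isclose(f_u0.sum(), 1.0)):
        raise ValueError("conditional frequencies must sum to 1")
    return float((f_u2 - f_u0) @ g_cond)


# Hypothetical inputs (illustrative numbers only).
f_u2 = [0.70, 0.10, 0.15, 0.05]
f_u0 = [0.05, 0.15, 0.10, 0.70]
g_cond = [4.0, 1.0, 2.0, 0.0]

contrast = marker_contrast(f_u2, f_u0, g_cond)
print("marker contrast:", contrast)
```

Because both frequency vectors sum to 1, any constant added to all conditional expectations cancels in the contrast, so only differences among QTL genotypes contribute.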
Table A1. Frequencies (fij|m) of the four possible QTL genotypes of recombinant inbred line (RIL) parents at QTL i and j conditional on the marker genotype u at the marker locus m calculated using the linkage disequilibrium parameter D
Acknowledgments
We thank two anonymous reviewers for their valuable contributions. This project was supported by the Deutsche Forschungsgemeinschaft (German Research Foundation) under the priority research program “Heterosis in Plants” (research grants ME931/4-1 and ME931/4-2, PI 377/7-1 and PI 377/7-2).
Footnotes
Communicating editor: J. B. Walsh
- Received June 13, 2007.
- Accepted August 28, 2007.
- Copyright © 2007 by the Genetics Society of America