# Unraveling Additive from Nonadditive Effects Using Genomic Relationship Matrices

- Patricio R. Muñoz*,
^{†},^{1},^{2}, - Marcio F. R. Resende Jr
^{†},^{‡},^{1}, - Salvador A. Gezan
^{†}, - Marcos Deon Vilela Resende
^{§},**, - Gustavo de los Campos
^{††}, - Matias Kirst
^{†},^{‡‡}, - Dudley Huber
^{†}and - Gary F. Peter
^{†},^{‡‡},^{2}

^{*}Agronomy Department, University of Florida, Gainesville, Florida 32611^{†}School of Forest Resources and Conservation, University of Florida, Gainesville, Florida 32611^{‡}Genetics and Genomics Graduate Program, University of Florida, Gainesville, Florida 32611^{‡‡}University of Florida Genetics Institute, University of Florida, Gainesville, Florida 32611^{§}EMBRAPA Forestry, Estrada da Ribeira, Colombo, PR 83411-000 Brazil^{**}Department of Forest Engineering, Universidade Federal de Viçosa, Viçosa, MG 36571-000 Brazil^{††}Department of Biostatistics, University of Alabama at Birmingham, Birmingham, Alabama 35294

- 2Corresponding authors: School of Forest Resources and Conservation, P. O. Box 110410, University of Florida, Gainesville, FL 32611-0410. E-mail: gpeter{at}ufl.edu; and Agronomy Department, P. O. Box 110965, University of Florida, Gainesville, FL 32611-0965. E-mail: p.munoz{at}ufl.edu

## Abstract

The application of quantitative genetics in plant and animal breeding has largely focused on additive models, which may also capture dominance and epistatic effects. Partitioning genetic variance into its additive and nonadditive components using pedigree-based models (P-genomic best linear unbiased predictor) (P-BLUP) is difficult with most commonly available family structures. However, the availability of dense panels of molecular markers makes possible the use of additive- and dominance-realized genomic relationships for the estimation of variance components and the prediction of genetic values (G-BLUP). We evaluated height data from a multifamily population of the tree species *Pinus taeda* with a systematic series of models accounting for additive, dominance, and first-order epistatic interactions (additive by additive, dominance by dominance, and additive by dominance), using either pedigree- or marker-based information. We show that, compared with the pedigree, use of realized genomic relationships in marker-based models yields a substantially more precise separation of additive and nonadditive components of genetic variance. We conclude that the marker-based relationship matrices in a model including additive and nonadditive effects performed better, improving breeding value prediction. Moreover, our results suggest that, for tree height in this population, the additive and nonadditive components of genetic variance are similar in magnitude. This novel result improves our current understanding of the genetic control and architecture of a quantitative trait and should be considered when developing breeding strategies.

- Genomic selection
- G-BLUP
- nonadditive
- realized relationship matrices
- dominance relationship matrix
- GenPred
- shared data resource

QUANTITATIVE genetics and its applications in plant and animal breeding have largely focused on additive models. Under idealized conditions, such as those described by Cockerham (1954) and Kempthorne (1954), genetic values due to additive and nonadditive effects are orthogonal. However, these conditions are often not met in breeding populations, with the consequence that genetic values due to additive and nonadditive effects may be confounded. Under these conditions, a large proportion of variance due to interactions of alleles (dominance and epistasis) can manifest as additive variance (Hill *et al.* 2008). For the same reason, with most commonly used family structures, it is difficult to dissect genetic variance into additive, dominance, and epistatic effects. With standard pedigree models, variance estimates of these elements are highly correlated, reflecting confounding effects (Lynch and Walsh 1998; Hill 2010). The proportion of additive variance attributable to interactions of alleles largely depends on the distribution of allele frequencies at causal loci (Lu *et al.* 1999; Hill *et al.* 2008; Zuk *et al.* 2012). This affects the estimation of variance components and breeding value (BV) predictions (Vanderwerf and Deboer 1989; Palucci *et al.* 2007), as well as the ability to dissect the genetic architecture of the trait at the causal level. Understanding the genetic architecture of a trait is also useful for defining breeding strategies and for maximizing genetic gains. For instance, individual genetic differences due to nonadditive effects can be exploited by designing mating schemes that maximize favorable allelic combinations, particularly if family or clonal propagation are possible in the breeding program.

Separation of additive and nonadditive genetic components with standard pedigree-based models requires specific family structures, which are commonly available in plant or animal breeding programs. In practice, estimation of variance due to dominance and additive effects involves mating designs with large numbers of close, typically full-sib relatives. Partitioning epistasis requires, in addition, either inbreds or vegetatively propagated (clonal) populations. In perennial plants, inbreds are not used because of their long generation time and because severe inbreeding depression often occurs. Thus, clonal populations are an alternative to explore the full genetic architecture in these species (Foster and Shaw 1988). Several studies aimed at partitioning genetic variance into its various components detected small dominance and negligible epistatic effects (Foster and Shaw 1988; Mullin *et al.* 1992; Wu 1996; Isik *et al.* 2003, 2005; Costa E Silva *et al.* 2004, 2009; Baltunis *et al.* 2007, 2008, 2009; Araujo *et al.* 2012). These results do not necessarily imply that such effects are not important. Instead, the contribution of nonadditive effects may be masked by effects due to the distribution of allele frequencies (*e.g.*, Hill *et al.* 2008). These results may also reflect the limitations imposed by the data/family structure available or the genetic information used (pedigrees), which only allows for estimation of the expected degree of genetic similarity.

Genome-wide genotypic data can identify, with a high level of certainty, the actual fraction of allele sharing between pairs of individuals. In pedigree-based genetic relationships, each element in the numerator relationship matrix (**A** matrix) is defined as the expected fraction of shared alleles assuming an infinitesimal model. However, due to Mendelian sampling, the values of the realized genomic relationships (**A _{G}** matrix), constructed from molecular marker information, deviate from their expected value (Vanraden 2008; Hill and Weir 2011). One way of incorporating molecular marker information for prediction of genetic values consists of replacing, in a genomic best linear unbiased predictor (BLUP) analysis, the pedigree-based relationship matrices (P-BLUP) with marker-based counterparts (or genetic, G-BLUP) (Vanraden 2008). Indeed, G-BLUP is one of the most frequently used methods that combine molecular information to predict BVs and has shown remarkably good predictive performance in animal and plant breeding populations (Hayes

*et al.*2009; Habier

*et al.*2010; Veerkamp

*et al.*2011; de los Campos

*et al.*2012; Heslot

*et al.*2012).

Genomic BLUP is a well-known and easily understood methodology. In the context of genome-wide selection (GWS), it is equivalent to ridge regression BLUP (RR-BLUP) (Vanraden 2008; de los Campos *et al.* 2012). Similar to P-BLUP, G-BLUP can be extended to account for nonadditive effects by replacing pedigree-based relationship matrices due to nonadditive effects (Mrode 2005), with their marker-based counterpart. This is because dominance and epistatic interaction (*e.g.*, additive by additive, dominance by dominance, and additive by domiance) relationship matrices can also be constructed using molecular information, as is currently done with **A _{G}**. Use of dominance and epistasis matrices of realized genetic relationships may increase the precision of estimates derived from data in poorly structured populations and may also increase the power to dissect genetic variance into components due to main and interaction effects.

Evidence indicates that G-BLUP based on **A _{G}** yields more accurate predictions of breeding value and of future phenotypes than its pedigree-based counterpart (

**A**) (Vanraden 2008; de los Campos

*et al.*2009; Hayes

*et al.*2009; Crossa

*et al.*2010; Heslot

*et al.*2012; Resende

*et al.*2012b; Muñoz

*et al.*2013). This suggests that use of the realized genomic similarity (

**A**) increases (relative to

_{G}**A**) the ability of the model to uncover the genetic components of the phenotypic data. However, it is not clear whether the power to partition genetic variance into additive and nonadditive components can also be improved by the use of the realized genomic relationships. If so, this would lead to a finer dissection of the genetic architecture of complex traits that could have profound impacts on the future design and implementation of breeding strategies. The objective of this study is to assess the extent to which the use of marker-based additive and nonadditive relationship matrices improves the precision of partitioning genetic variance into its components. For this assessment, tree height from a clonal population of

*Pinus taeda*L. was evaluated with a series of models that account for additive, dominance, and first-order epistatic interactions (additive by additive, dominance by dominance, and additive by dominance) implemented with either pedigree or molecular marker information.

## Materials and Methods

### Data

Field data from a single experimental trial from the CCLONES population (see Baltunis *et al.* 2007 and Resende *et al.* 2012a for details) was used in this study. The response variable total tree height (HT, m) was used. The population was generated by crossing 32 parents in a circular mating design with additional off-diagonal crosses, resulting in 70 full-sib families with an average of 13.5 individuals per family. Each individual was clonally replicated (ramet) and a clonal field trial was established using single-tree plots with eight replicates (one ramet per replicate), in a resolvable alpha-incomplete block design (Williams *et al.* 2002). Four of the replicates were grown under high intensity management while the rest were under an operational-like regime.

A subset of the CCLONES population, composed of 951 individuals from 61 families, was genotyped using the Illumina Infinium platform (Illumina, San Diego (Eckert *et al.* 2010) with 7216 SNPs, each representing a unique pine EST contig. A total of 4853 SNPs were polymorphic and were used for further analyses.

### Relationship matrices

A marker-based additive relationship matrix (**A _{G}**) was constructed following the method described by Yang

*et al.*(2010). The pairwise relationship for individuals

*j*and

*k*was defined bywhere

*m*is the total number of markers,

*w*is an indicator variable representing the number of copies of a given allele, and

*p*is the observed allele frequency of the

_{i}*i*th SNP. To reduce the sampling variance of the entries of

**A**

_{G}*, we expanded the formula proposed by Yang

*et al.*(2010) and adjusted each value of the

**A**

_{G}* matrix by shrinking it toward its expectation. The the adjusted

**A**

_{G}was obtained aswhere

*C*represents all values of

_{jk}**A**that belong to the same class in

_{G}**A*(

_{jk}*e.g.*, full-sib individuals, where

*A*= 0.5). The resulting

_{jk}**A**was used to correct the original pedigree as previously detailed (Muñoz

_{G}*et al.*2013), and it was verified that the estimated genomic coefficients and their standard deviations were within expectations according to Simeone

*et al.*(2011).

In addition, a molecular marker-based dominance relationship matrix (**D _{G}**) was constructed. To build a dominance relationship matrix, we created an incidence matrix (

**S**) for effects due to dominance

**S**= {

*s*}, where

_{ij}*s*was parameterized to be coded 1 if the genotype was heterozygous and 0 if the marker genotype was homozygous for either class. The matrix

_{ij}**S**was further standardized to have mean zero by using:

*s*= 1 − 2

_{ij}*p*if the individual is heterozygous,

_{j}q_{j}*s*= 0 if the individual has missing data, and

_{ij}*s*= 0 − 2

_{ij}*p*otherwise.

_{j}q_{j}Using the above, we expanded the theory for the **A _{G}** matrix to construct

**D**aswhere the denominator is the sum of the variances of

_{G}*s*under Hardy–Weinberg equilibrium, and the other terms were previously defined. This extension to construct

_{ij}**D**was also used by Su

_{G}*et al.*(2012). Another parameterization has been proposed for the dominance genomic relationship matrix (Vitezica

*et al.*2013), which generates a different partition of the genetic variance. We also evaluated this parameterization and included the results in the supporting material.

Pedigree-based relationship matrices for additive (**A**) and dominance (**D**) effects were computed using standard methods (Lynch and Walsh 1998; Mrode 2005). Following existing theory (Cockerham 1954; Kempthorne 1954; Henderson 1985; Gianola and de los Campos 2008), the covariance matrices due to first degree epistatic terms were computed using Hadamard products (*i.e.*, cell-by-cell product denoted as #) of the following form: (i) additive-by-additive interactions (**A#A** or **A _{G}#A_{G}**); (ii) dominance-by-dominance interactions (

**D#D**or

**D**); and (iii) additive-by-dominance interactions (

_{G}#D_{G}**A#D**or

**A**) for pedigree and marker-based methods, respectively.

_{G}#D_{G}### Genetic analyses

All analyses were carried out in the software ASReml v3.0 (Gilmour *et al.* 2009), which fits mixed models with complex datasets using sparse matrix methods. ASReml is equipped with the residual maximum likelihood (REML) for variance component estimation using the average information algorithm (Gilmour *et al.* 1995).

Five models were fit using the pedigree-based matrices (models 1–5) and five using the marker-based matrices (models 6–10). These models range from a simple additive model to a full model including additive, dominance, and epistatic effects. The full model (*i.e.*, model 5 or 10) is described below:where **y** is the phenotypic HT response, **β** is a vector of fixed effects (*i.e.*, silvicultural treatment and replicate), **i** ∼ *N*(**0**, **I**σ^{2}_{i}) is a vector of the random incomplete block effects within replication, ** a** ∼

*N*(

**0**,

**C**σ

_{1}^{2}

_{a}) is a vector of random additive effects of individuals and

**C**is a relationship matrix due to additive effects either from pedigree (

_{1}**A**) or markers (

**A**),

_{G}**t**∼

_{1}*N*(

**0**,

**C**⊗

_{1}**I**σ

^{2}

_{t1}) is a vector of random additive by silviculture-type interactions,

**d**∼

*N*(

**0**,

**C**σ

_{2}^{2}

_{d}) is a vector of random individual dominance effects and

**C**is a relationship matrix due to dominance effects that was computed either from pedigree (

_{2}**D**) or markers (

**D**),

_{G}**t**∼

_{2}*N*(

**0**,

**C**⊗

_{2}**I**σ

^{2}

_{t2}) is a vector of random dominance by silviculture-type interactions, ∼

*N*(

**0**,

**C**σ

_{1}#C_{1}^{2}

_{iaa}) is either a vector of random additive-by-additive interaction, a vector of random dominance-by-dominance interactions ∼

*N*(

**0**,

**C**σ

_{2}#C_{2}^{2}

_{idd}), or a vector of random additive-by-dominance interactions ∼

*N*(

**0**,

**C**σ

_{1}#C_{2}^{2}

_{iad}), and

**e**∼

*N*(

**0**,

**I**σ

^{2}

_{e}) is a vector of random residual effects. Above, matrices

**X**and

**Z**–

_{1}**Z**are incidence matrices for fixed and random effects, respectively, and

_{6,}**I**denotes an identity matrix, and ⊗ and # represent the Kronecker and Hadamard (cell by cell) product, respectively.

Under the above model, the narrow-sense heritability can be estimated as the dominance to total variance ratio as the epistatic to total variance ratio as and the broad-sense heritability as The is the estimated additive variance, is the estimated dominance variance, and , , and are the total phenotypic, epistatic, and total genetic variance, respectively, that changed accordingly to the model being fit (Table 1).

### Model comparisons

Models were compared using the Akaike information criterion (AIC) (Akaike 1974). Precision of variance components estimates, and their dependency, was assessed using the asymptotic variance–covariance matrix of estimates of variance parameters (**V**). The asymptotic sampling correlation matrix of estimates (**F**) was computed as where **L** is a diagonal matrix containing the diagonal elements of **V**. Inspection of the off-diagonal elements of the **F** matrix allows assessing sampling correlation of variance estimates. To have an overall assessment of dependency between the estimates, eigenvalues of **F** were examined. The standard error of the prediction (SEP) was estimated for each model as the square root of the prediction error variance (PEV), which is obtained by extracting the elements of the diagonal of the generalized inverse of the coefficient matrix from the linear mixed model equations (left hand side), and scaled by the error variance. In short form, the PEVs correspond to the (Mrode, 2005, p. 51).

Predictive ability and stability of the models in estimating breeding and genetic values were evaluated. The predictive ability of a model’s breeding value was defined as the correlation between the estimated breeding value and the phenotypic average of all the ramets (clones). These values were calculated when all the data were used without cross-validation (BV-all). The predictive ability of a model’s total genetic value (sum of BV-all, dominance effect, and epistatic effect) was defined as the correlation between the predicted total genetic value and the phenotype average of all the clones using all the data without cross-validation (GV-all). Prediction models were assessed under cross-validation (Kohavi 1995) to obtain predicted breeding value (BV-cv) and predicted total genetic value (GV-cv) with a random sub-sampling partitioning, fixed for all models. The stability of the predictive models were evaluated as the correlation between the BV-all and BV-cv, and between GV-all and GV-cv, and was defined as a measure of the dependency of the predictive breeding value on the phenotype. The mean square error (MSE) was calculated between BV-all and BV-cv within each model using standard methods. Finally, the capacity of the model to predict ranking position of the top 10% of the individuals, simulating a selection scenario, was evaluated as the correlation between the ranking position using the BV-all and the ranking position using the BV-cv.

## Results

The genetic parameters and goodness-of-fit statistics, estimated for each model, are summarized in Table 2. Both P_A and M_A models had narrow-sense heritability (*h*^{2}) >0.30. After including the dominance effect in the pedigree-based model (P_AD), *h*^{2} decreased by ∼26% and the dominance ratio (*d*^{2}) estimate was small (0.06) and nonsignificant (2 × SE(*d*^{2}) > 0.06). When the dominance effect was included with the molecular marker-based model (M_AD), the *h*^{2} decreased 47%, to 0.20, and *d*^{2} increased to 0.12. With the M_AD model, the dominance variance represents 60% of the additive value and 39% of the total genetic variation. We further extended these models to include the additive-by-additive, dominance-by-dominance, and additive-by-dominance first-order epistatic interaction factors in three separate models. In pedigree-based models, P_A#A, P_D#D, and P_A#D, the estimations of variance components for additive and dominance varied only slightly from those of the P_AD model (Supporting Information, Table S1). Moreover, epistasis estimates were zero in all three models. When the additive-by-additive, dominance-by-dominance, and additive-by-dominance interactions were added (models M_A#A, M_D#D, and M_A#D), the narrow-sense heritability dropped by >30% and the dominance ratio by 80%, compared to the M_AD model. The epistatic ratio (*i*^{2}) was estimated at 0.15, 0.12, and 0.14 for the M_A#A, M_D#D, and M_A#D models, respectively (Table 2). The alternative parameterization for the dominance genomic relationship matrix proposed by Vitezica *et al.* (2013) showed similar results regarding the partition of additive and nonadditive effects (Table S2).

Goodness-of-fit statistics show that inclusion of nonadditive effects improved slightly the model fit for pedigree-based models and substantially for marker-based models (Table 2). The marker-based models M_A#D and M_D#D yielded the best fit of the data; however, fitting differences among the more complex models were small. Thus, the dependency of the random component estimates was evaluated to further differentiate the best model.

We studied the sampling correlation among the variance component estimates, to assess which of the nine models shows less dependency and thus partitioned the genetic variance better (Table S3). Figure 1 shows the cumulative proportion of variance explained by high order eigenvalues of the sampling variance–covariance matrix of estimates derived from models including additive-plus-dominance, additive-by-additive, dominance-by-dominance, and additive-by-dominance epistatic interactions, for pedigree- and marker-based models. As reference, the distribution of eigenvalues for a perfect orthogonal correlation matrix, representing the ideal model (all of the eigenvalues equal to 1) is included. In all cases, the marker-based cumulative distributions are closer to the orthogonal distribution, suggesting less dependency between estimates of variance components. Indeed, the sampling correlation between estimates of variance components due to additive and dominance effects decreases in absolute value from 0.90 with the P_AD to 0.70 with the M_AD model (Table S2). In general, all the marker-based models that include epistasis outperform their pedigree-based counterpart (Figure 1, B–D). Models M_D#D and M_A#D showed the smallest sampling correlations between additive and dominance/epistasis, with absolute correlation values <0.45 (Table S3).

The standard error of the predictions (SEP) of BV and dominance value (DV) were compared for the pedigree and markers models including additive by additive, dominance by dominance, and additive by dominance (Figure 2). Values <45° reference line indicate that marker-based models have smaller SEPs. The SEPs for BVs from the marker-based models were smaller than the pedigree-based models in 99.2% of the cases (Figure 2, A–C). In the case of the SEPs of DVs, a clear advantage was observed for marker-based models (*y*-axis) over pedigree-based models (*x*-axis), with SEP on average 52% lower for the marker-based models (Figure 2, D–F).

The predictive ability of breeding value and genetic value for the pedigree-based and marker-based models are shown in Table 3. The highest predictive ability for BV was obtained with the pedigree additive model (P_A). A slight decrease in the BV prediction ability was observed when nonadditive effects were included in the pedigree-based model (0.86), and a much larger decrease was observed for the marker-based models (0.76). All models evaluated reached similar GV predictive ability.

Predictive stability can be viewed as a measure of how much the prediction of the breeding value and genetic value using all the data (BV-all and GV-all) depend on the individual phenotype (Table 3). Predictions based on models with markers are more stable than those derived from pedigree models (3% increase when comparing M_A to P_A). In the pedigree-based models, inclusion of nonadditive effects increased the stability to predict BV by 13, 14, and 14% for P_A#A, P_D#D, and P_A#D, respectively, while inclusion of nonadditive effects in the more complex marker-based models, increased the BV prediction stability by >29% when compared with the M_A and by >33% when compared to P_A. The mean square error (MSE) decreased by ∼50% from the additive models (P_A) to the more complex pedigree-based models. The addition of nonadditive effects to the marker-based models decreased the MSE even further by >68% and up to 94% decrease in the case of model P_A#A (Table 3).

In a breeding program, it is important to predict the trend and magnitude of the complete set of individuals in the population; however, it is often more important to predict the best performing individuals (potential selections). Here we ranked all individuals based on BV-all and BV-cv and evaluated the ranking correlation of the top 10%, emulating the selection of the top 10% of genotypes (Table 3). When the pedigree-based matrix was replaced by the marker-based matrix in the additive models (P_A and M_A), the capacity to predict the top 10% remained the same. However, this capacity increased substantially for the more complex marker-based models where the predictive stability of the top 10% of genotypes increased 82–170% (Table 3).

## Discussion

Here we assessed the use of marker- and pedigree-based models to separate additive from nonadditive variances for height, in a structured population of loblolly pine. We showed that the two approaches are dramatically distinct in their capacity to properly partition the genetic variance into its various components, with marker-based models being significantly more effective in accounting for nonadditive variances. In the pedigree-based models, inclusion of nonadditive effects decreased the estimated narrow-sense heritability by 26%. This result is expected because depending on the distribution of allele frequencies, a sizable proportion of variance due to nonadditive effects can be manifest as additive variance (Lu *et al.* 1999; Zuk *et al.* 2012). In marker (pedigree) models 71% (57%) of the decrease in additive variance was captured by the dominance variance, suggesting that indeed, dominance is making a substantial contribution to the estimated additive variance obtained when dominance is ignored. This phenomena has been postulated theoretically (*e.g.*, Falconer and Mackay 1996, p. 126) and observed in multiple studies (Wei and Van Der Werf 1993; Winkelman and Peterson 1994; Rodriguez-Almeida *et al.* 1995; Pante *et al.* 2002). In addition, when pedigree-based models included nonadditive effects, the conclusions were not different from what has been commonly observed that nonadditive effects represent a small fraction of the total genetic variation (Isik *et al.* 2003; Costa E Silva *et al.* 2004; Baltunis *et al.* 2007; Araujo *et al.* 2012). In contrast, marker-based models with additive and nonadditive effects yield a substantially different variance partitioning than their counterparts using the pedigree models. The additive variance decreased as dominance was included in the model and it further decreased when dominance and epistasis were considered. These models indicate that, for this population and trait, nonadditive effects are as important as additive effects, and dramatically larger than predicted by the pedigree-based matrices. These changes in the magnitude of variance components have already been observed when the relationship matrix derived from markers is used instead of the pedigree-based relationship matrix, in the context of additive genomic selection models (Lee *et al.* 2010).

The value for AIC varied modestly for the best models, with no clear advantage of one model relative to the others. This is not surprising; if additive effects capture part of the effects due to dominance and epistasis, the additive model should not suffer much if these components are omitted. However, these models varied considerably in the partitioning of the genetic variance components, thus changing not only the inference but also the potential decisions taken in the breeding strategy.

We also assessed the dependency of the random effects estimates to discriminate the best model, given the small differences for AIC in the different models. The level of confounding between components was very different in the pedigree- and marker-based models. The most unambiguous dissection of the genetic variance occurs when estimates of variance components are uncorrelated, *i.e.*, the sampling correlation among the model effects will be closer to zero and all eigenvalues of this correlation matrix close to one (Hill 2010). In the models that included additive and dominance, and additive, dominance, and epistatic effects, the correlation matrices indicated that those derived from molecular markers partitioned the genetic effects more precisely, although the partition is still not fully orthogonal. The parameterizations of these paired models were identical, except for the origin of the relationship matrices (pedigree or marker based). The limited capacity of pedigree-based models to partition these components is not surprising, as all relationship matrices are derived from the pedigree additive relationship matrix (Mrode 2005) and, therefore, are strongly correlated (Visscher 2009). The models M_A#A, M_D#D, and M_A#D had the lowest correlation between additive and nonadditive, showing a partition substantially better than that of pedigree-based models (Table S3). These results support the finding that pedigree-based models are inadequate in separating the additive from nonadditive effects, as their results are comparable to those of additive models (Hill *et al.* 2008). On the other hand, the use of the matrix derived from markers has already been related to a better capacity to separate random effects in a model (Lee *et al.* 2010). Thus, we conclude that the use of the marker-based relationship matrices increase substantially the capacity to separate additive and nonadditive genetic effects.

Assessment of prediction accuracy further support the conjecture that in pedigree-based models, additive components can capture a large proportion of the variance due to interaction terms (Hill *et al.* 2008). Consequently, there is limited gain in this scenario, by including nonadditive effects. On the other hand, for marker-based models, the ability to predict the mean phenotype with the BV decreased when nonadditive effects were included in the model, and the maximum predictability (0.89) was only reached when additive and nonadditive values were considered together (GV). This indicates that pedigree-based models potentially overestimate the additive effects, which is likely to be due to an inflated additive variance estimate that also represents some of the nonadditive components. Inflation from epistasis, for example, falls apart as recombination breaks down favorable combinations of alleles. This is a problem for breeding programs because overestimates of BV inflates genetic gains, but the portion due to nonadditive effects is transient and cannot be captured if controlled sexual reproduction is used. Additionally, the genetic architecture of the trait will be predicted to be simpler than it actually is.

In breeding programs the true breeding value is never known. Thus, the prediction models including all the available data (BV-all) is usually used as the best BV estimation. We evaluated the stability of BV estimates by comparing the results obtained with all data, with the results from cross-validation for pedigree- and marker-based models. This is a measure of the influence of an individual’s phenotype on the predicted breeding value. We observed that models with nonadditive relationship matrices are more stable and produce estimates of breeding values in independent sets that are more similar to the BV-all. The inclusion of nonadditive relationship matrices yields models that predict BVs more stably than additive pedigree-based (animal model) and marker-based models (traditional G-BLUP). In addition, in this cross-validation scheme, the MSE of the model M_A#A decreased >15- and 8-fold when compared with both additive models and full pedigree-based models, respectively. These results indicate that, for this trait, a considerable increase in the stability cannot be reached simply by replacing the **A** matrix by the **A _{G}** matrix but also needs to incorporate nonadditive effects in the model.

Overall, our study supports the hypothesis that additive effects can capture a large proportion of the genetic variance from dominance and epistasis. This is in part due to the fact that, in breeding populations, additive and nonadditive genetic components are not typically independent. However, we also show that with relationship matrices derived from markers, the genetic variances were partitioned more precisely than using only pedigree information. Moreover, our estimates suggest that in this population, for tree height, the additive and nonadditive components of the genetic variance are similar in magnitude. While further research is needed in other species, traits, and populations, we show that variance estimates can be inadequately estimated if only pedigree information is used. This study improves our current understanding of the genetic control and architecture of a quantitative trait and should be considered when developing effective breeding strategies.

## Acknowledgments

The authors thank members of the Forest Biology Research Cooperative at the University of Florida for their support in establishing, maintaining, and measuring the field trial used in this study. The work was supported by the National Science Foundation Plant Genome Research Program (award no. 0501763), the Foundational Program (award no. 2013-67013-21159), the Department of Energy (award no. 2013-67009-21200), and the Plant Breeding and Education Program (award no. 2010-85117-20569) from the US Department of Agriculture, National Institute of Food and Agriculture.

## Footnotes

Supporting information is available online at http://www.genetics.org/lookup/suppl/doi:10.1534/genetics.114.171322/-/DC1.

*Communicating editor: N. Yi*

- Received July 1, 2014.
- Accepted October 4, 2014.

- Copyright © 2014 by the Genetics Society of America