## Abstract

There is recent evidence from laboratory experiments and analysis of livestock populations that not only the phenotype itself, but also its environmental variance, is under genetic control. Little is known about the relationships between the environmental variance of one trait and mean levels of other traits, however. A genetic covariance between these is expected to lead to nonlinearity between them, for example between birth weight and survival of piglets, where animals of extreme weights have lower survival. The objectives were to derive this nonlinear relationship analytically using multiple regression and apply it to data on piglet birth weight and survival. This study provides a framework to study such nonlinear relationships caused by genetic covariance of environmental variance of one trait and the mean of the other. It is shown that positions of phenotypic and genetic optima may differ and that genetic relationships are likely to be more curvilinear than phenotypic relationships, dependent mainly on the environmental correlation between these traits. Genetic correlations may change if the population means change relative to the optimal phenotypes. Data of piglet birth weight and survival show that the presence of nonlinearity can be partly explained by the genetic covariance between environmental variance of birth weight and survival. The framework developed can be used to assess effects of artificial and natural selection on means and variances of traits and the statistical method presented can be used to estimate trade-offs between environmental variance of one trait and mean levels of others.

IN the classic model of quantitative genetics, the phenotype is assumed to be the sum of an effectively infinite number of genes and environmental effects; *i.e.*, (Falconer and Mackay 1996; Lynch and Walsh 1998). The phenotypic variance is the sum of the genetic () and environmental () variance. It is assumed that the environmental variance is homogeneous across genotypes, but recently there is evidence from laboratory experiments and analysis of livestock populations that not only the phenotype itself, but also its , is under genetic control (Hill and Mulder 2010). This might be expressed as genetic heterogeneity in the within-individual variance of a trait with repeat measurements, such as weight of individual piglets in a litter or total litter weight across parities, or as a difference in within-family variance for traits such as juvenile body weight expressed once.

The median genetic coefficient of variation for ( where is additive genetic variance in ) is ∼0.3 based on a review of 14 studies with estimates of genetic variance in primarily on livestock populations (Hill and Mulder 2010). This indicates that could be increased or decreased by 30% if changed by one genetic standard deviation. These analyses were mainly based on genetic analysis of pedigreed livestock populations; but with 300 isofemale lines of *Drosophila melanogaster*, thereby avoiding possible confounding by within-line genetic heterogeneity, Mackay and Lyman (2005) showed a significant genetic variance in CV and . In a selection experiment in rabbits, selection on estimated breeding value for of birth weight successfully changed the within-litter variability of birth weight (Garreau *et al.* 2008). There are more recent data on genetic control of at the level of individual loci from genome-wide association studies (see review by Geiler-Samerotte *et al.* 2013). Those in experimental species are very clear cut: for example, Jimenez-Gomez *et al.* (2011) and Nelson *et al.* (2013) located QTL for phenotypic variability in gene expression in *Arabidopsis* and yeast, respectively, and Shen *et al.* (2012) identified regions controlling the phenotypic variability of traits in *Arabidopsis*. In vertebrates, the FTO genotype was shown to be associated with phenotypic variability in body mass index in humans, with the two homozygotes differing 7% in standard deviation (SD) (Yang *et al.* 2012). In laying hens, a QTL region on chromosome 4 was identified that explained 30% and 16% of the genetic variance in mean and within-individual SD, respectively, of egg weight (Wolc *et al.* 2012); and in dairy cattle the phenotypic variability of somatic cell score, an indicator for mastitis, was shown to be determined by many loci, while that with the largest effect explained ∼3% of the genetic variance in (Mulder *et al.* 2013a). It has been argued that findings such as those of Yang *et al.* (2012) can also be explained in terms of a scale effect (Sun *et al.* 2013), but the extreme transformations required are not biologically plausible (Shen and Rönnegård 2013).

In agriculture, there is interest in reducing phenotypic variance to improve uniformity of animals and animal products. For instance, penalties are imposed by slaughterhouses for meat animals delivered with excessive variability because uniform products are demanded by retailers and packing efficiency of products is higher (Hennessy 2005). It has been shown both in theory (SanCristobal-Gaudy *et al.* 1998; Sorensen and Waagepetersen 2003; Mulder *et al.* 2007, 2008) and from experiment (Garreau *et al.* 2008) that uniformity can be increased by selection.

Before implementing selection to reduce in livestock breeding programs it is necessary to know what correlated changes in means of the target trait and other traits of interest would be expected. A particular concern is that animals with small would not have the capacity to react to unpredictable environments, *i.e.*, have reduced plasticity. There is some evidence that plasticity and are positively correlated, *i.e.*, that genotypes with higher have higher plasticity (Mulder *et al.* 2013b; Tonsor *et al.* 2013). Whether plastic genotypes with high or stable genotypes with small are better capable of handling unpredictable environments is under debate. In livestock little is known about trade-offs between and means of other traits, except that uniformity of birth weight in litter-bearing animals does increase the survival of the offspring, *e.g.*, in pigs (Kapell *et al.* 2011) and rabbits (Garreau *et al.* 2008).

Genetic variation in of a trait has been shown analytically to lead to a nonlinear relationship between the breeding value of and the phenotypic deviation from the mean (Mulder *et al.* 2007). Therefore, if there is a genetic correlation between the mean of one trait and of another, which we term an M-V genetic correlation between traits, the phenotypic relationship between them is expected to be nonlinear. Nonlinear parent–offspring regression and nonlinear regression of one trait on another within half-sib groups have been employed to explore nonlinearity between traits by Sölkner and Fuerst-Waltl (1996) and Fuerst-Waltl *et al.* (1997, 1998), who assumed this was caused by nonlinearity between additive genetic effects of different traits. They also showed that phenotypic relationships were often more linear than the genetic relationships due to dilution with environmental covariance. Nonlinear relationships might also be expected when one of the traits, for example fitness, has an optimum value (Lande and Arnold 1983; Falconer and Mackay 1996; Kingsolver *et al.* 2001, 2012). Such nonlinear relationships are often observed or hypothesized in natural populations, typically as “fitness profiles” between trait values and fitness (*e.g.*, Falconer and Mackay 1996) that show whether phenotypes on one side of the distribution (directional), in the middle (stabilizing), or at extremes (disruptive) have highest fitness. If there is an optimum phenotype, a maximum would imply a negative covariance between and fitness; *i.e.*, families with small variance would have selective advantage if there is stabilizing selection, and correspondingly families with a large variance would have an advantage if there is disruptive or strong directional selection (Hill and Zhang 2004; Mulder *et al.* 2007). Examples of nonlinear trait relationships include that between birth weight and survival of infants in humans in historic data when medical care was less optimized (Karn and Penrose 1951 in Schluter 1988), survival and size in house sparrows (Schluter and Smith 1986; Schluter 1988), and between litter size and birth weight in piglets (Varona and Sorensen 2014). These nonlinear relationships might be due in part to a nonnormality of at least one of the traits such as survival (binomial). To our knowledge no studies have yet been undertaken of the degree to which nonlinearity between traits is caused by an M-V genetic correlation between them. It is possible that low may yield better fitness in some environments (a conservative bet-hedging strategy) or high in others (a coin-flipping strategy) (Olofsson *et al.* 2009). Therefore, while insight into the nonlinear relationships between traits should increase understanding of the existence and if so of the position of phenotypic and genetic optima, no framework is available to study the role of M-V genetic correlations between traits and the consequent effects of selection.

The objectives this article are to (1) derive mathematically nonlinear relationships between traits when one exhibits genetic variance in , (2) predict responses to selection in mean and variance of both traits, (3) predict genetic and phenotypic optima, (4) predict changes in genetic parameters consequent on selection shifting the population mean toward the fitness optimum, and (5) investigate the relationship between piglet birth weight and survival in pigs as a practical application of the theory. We assumed multivariate normality for deriving the expressions, whereas survival is binomially distributed. Therefore, we extended the predictions toward assuming that survival has an underlying normal distribution and predictions can be transformed to the observed binomial scale.

## Quantitative Genetic Model

Consider two traits for which the phenotypic deviations from the mean are the sum of additive genetic and environmental effects, with no dominance or epistasis (Falconer and Mackay 1996). The first trait, with phenotypic value , is also subject to additive genetic differences in its environmental variance, so-called genetic heterogeneity of environmental variance (SanCristobal-Gaudy *et al.* 1998), whereas the second trait (phenotypic value ) has homogeneous environmental variance. Here we assume the exponential model for environmental variance (SanCristobal-Gaudy *et al.* 1998; Hill and Mulder 2010), because of its statistical tractability, as the environmental variance cannot be negative and the model appears linear on the log scale. Different models to model genetic differences in environmental variance appear similar when genetic variance in is small. Further discussion and comparison of models for genetic analysis of the environmental variance are given elsewhere (Hill and Mulder 2010). The genetic model is (1) (2)where is the population mean, is the breeding value, and is the environmental deviation for trait *i*. The environmental variance for trait 2 is assumed constant, whereas is the expected environmental variance for trait 1 when but varies among individuals according to their breeding value The standard normal deviate *χ* is the standardized environmental deviation for trait 1. The breeding values and are assumed to be the sum of effects of an infinite number of additive loci and are multivariate normally distributed,where **A** is the additive genetic or numerator relationship matrix; and are the additive genetic variances in and and and are the corresponding additive genetic covariances. The environmental effects and are bivariate normally distributed,where is the environmental variance of trait 2 and is the environmental correlation.

## Methodological Framework for Nonlinear Relationships

### Multiple-regression framework

We used a multiple-regression framework to predict the nonlinear relationship between traits caused by M-V genetic correlations between them and to predict responses in mean and variance due to selection, following the methodology of Mulder *et al.* (2007). In Mulder *et al.* (2007), breeding values and selection responses for mean and variance were predicted using linear, quadratic, and cubic regression on the phenotypic deviation of an individual phenotype. This methodology formally requires multivariate normality, which holds only approximately because genetic variance in for causes small deviations from normality (Mulder *et al.* 2007). We also used selection index theory (Hazel 1943), which is essentially an application of multiple regression assuming that fixed effects are known without error.

### Monte Carlo simulation

To check the accuracy of derived relationships and predicted responses to selection, we simulated 1000 replicates each with 1 million animals according to the genetic models in Equations 1 and 2 and calculated the expected value of trait 2 given trait 1 by calculating the average given within 601 successive intervals of 0.01 between −3 and 3 Selection responses to directional mass selection on and were calculated as the mean and of all selected animals with in each replicate and averaged over replicates, where *x* is the truncation point.

### Deriving the phenotypic relationship between P_{2} and P_{1}

The phenotypic relationship between and was derived as the multiple regression of on and because genetic variance in causes genetic variance in (Mulder *et al.* 2007). The regression can be written as (3)where (4) and The elements of **P** and **c** were derived using the higher-order moments of the normal distribution (Stuart and Ord 1994) and using the laws of total (co)variance and standard variance–covariance rules (see *Appendix A* for full derivations), (5) (6) (7) (8) (9)where Note that Equation 3 is about the relationship between single observations of and In Equation 7, the two rightmost terms are related to the skewness (Equation 6) and kurtosis of the distribution, respectively. All elements of **P** and **c** (equal to Equations 5–9) were checked using Monte Carlo simulation. The multiple-regression prediction of from shows very good agreement with the Monte Carlo simulation (Figure 1 and Figure 2), with *R*^{2} > 0.995 for all cases evaluated. *Appendix C* shows a natural extension to multiple random effects, such as permanent and common environmental effects.

### Deriving response to mass selection

#### Deterministic predictions:

Response to selection (*R*) is predicted as the mean deviation of offspring from the population mean regressed on the selection differential of the parents which is equal to the regression of the breeding value (*A*) on the selection differential (*S*) (Falconer and Mackay 1996). We showed previously that response to selection in and could be predicted for different kinds of mass selection, *e.g.*, directional, stabilizing, or disruptive (Mulder *et al.* 2007). Here we extend this framework to the correlated responses in (1) from mass selection on and (2) and from mass selection on In both cases, the general equation to predict selection responses in in terms of selection differentials is (10)where **P** is the variance–covariance matrix among selection differentials in , because selection differentials are random variables under the quantitative genetic model assumed; **G** is the covariance matrix between and ; and **B** is a matrix of regression coefficients. First, if mass selection on is performed, then the selection differentials on and are (11)If it is assumed that (12)where is the transpose of **P** is the same as in Equation 3, and *i.e*., the covariance matrix between *i* is the selection intensity when selecting the best *p* animals, and *x* is the truncation point corresponding to *p* based on a normal distribution (Mulder *et al.* 2007).

Second, if mass selection is on Equation 10 becomes (13)where and the covariance vector between and

#### Evaluation of predictions:

Genetic selection differentials in and with directional mass selection on either (Equation 12) or (Equation 13) were shown by Monte Carlo simulation to be very accurately predicted, with differences only at the third decimal (Table 1). Correlated responses in both and are expected following mass selection on *P*_{1}, even if genetic correlations between and or are zero, provided that the genetic correlation between and is nonzero. This is because, with strong directional selection (*e.g.*, *P* ≥ 20%), animals with higher (*i.e.*, environmental variance in ) have a higher probability of selection (Hill and Zhang 2004; Mulder *et al.* 2007). In other words, with directional selection genotypes with a higher variance are more likely to be parents of the next generation than genotypes with a low variance.

### Response to selection on estimated breeding value

In animal breeding, selection is often on predicted breeding values rather than on raw phenotypes. For instance, if the breeding goal is to increase trait 2, then the predicted breeding value using information of and is (14)where **P** is given in Equation 3, and The expected response to selection would be equal to that using the generalized form of the breeders equation (Hazel 1943): (15)An alternative to using squared phenotypic deviations is to use squared residuals of corrected for its leverage (*l*), which is the diagonal element of the “hat matrix” for each observation (Hoaglin and Welsh 1978) as used in the double-hierarchical generalized linear model (DHGLM) (Rönnegård *et al.* 2010): (16)In the absence of any fixed effects, Equation 16 reduces to (17)where and where (see derivation in *Appendix B*). Use of either or would yield identical accuracy of prediction and of response to selection in this case where only their own phenotypes provide information and there are no fixed effects; otherwise squared residuals corrected for leverages are expected to give higher accuracy and response than using squared phenotypic deviations.

### Determining whether the parabola have a maximum or a minimum

The multiple-regression Equation 3 is quadratic and therefore has a parabolic shape. Such quadratic relationships may indicate a biological optimum, for instance birth weight and survival in humans prior to modern medical intervention (Karn and Penrose 1951 in Schluter 1988). Expressions for and (from Equation 3) are given by (18) (19)where (Equation 8) and (Equation 9) and **P** is shown below Equation 4. Whether there is a maximum or a minimum depends on the sign of the quadratic regression coefficient (), *i.e.*, If it has a minimum and if it has a maximum. The sign of the regression coefficient depends on multiple parameters (Equations 18 and 19). As Figure 3 shows, is linear in because it is a linear function of (Equation 19) and is the genetic covariance between and When and other correlations are zero, *i.e.*, when the straight-line relationship between and passes through (0, 0). If other correlations are nonzero, shifts to a point dependent mainly on with other parameters having less influence on In summary, a moderate to strong negative M-V genetic correlation between traits () yields a parabola with a maximum, *e.g.*, animals with phenotypes close to the optimum have maximum fitness, a fitness optimum; while a moderate to strong positive genetic correlation yields one with a minimum; *e.g.*, animals with phenotypes largely deviating from the minimum have the highest fitness.

Similarly, Equation 14 shows whether the genetic parabola has a maximum or a minimum. The regression coefficients now are (20) (21)where and are the elements of (just below Equation 14). Equations 20 and 21 differ from Equations 18 and 19 only in and If then and and consequently and

### Can phenotypic and genetic parabolas have different shapes?

The interest here is in whether there are conditions when the phenotypic parabola has a maximum and the genetic parabola a minimum or vice versa. Because affects only the phenotypic parabola, it determines the relative shapes of the parabolas, and so it is helpful to calculate the magnitude of when and switches to a different sign than that of The most obvious situation is when the environmental correlation and the genetic correlation have opposite signs, but only when Calculations show that, for the parabolas to be of different shape, has to be strong, in which case the regression is almost linear. So in practice, parabolas of opposite sign are likely to occur only rarely, if at all.

### Determining the phenotypic and genetic stationary points

The maximum (minimum) phenotype can be found at the stationary point of (). The stationary point is mathematically the point where the first derivative is zero, *e.g.*, a minimum or maximum. The stationary point of the quadratic relationship of as a function of can be calculated as (22)If we assume for simplicity that when then The population mean is at a stationary point of zero when A positive (negative) stationary point implies that the current population mean is lower (higher) than it is. In general, = 0 if

The genetic stationary point can be determined similarly: (23)The genetic and phenotypic stationary points differ unless They are shown in Figure 4 as functions of (Figure 4A) and (Figure 4B), and although they differ, their shapes do not. Both are near zero when is close to −1 or 1, but large in absolute terms when (Figure 4A), *i.e.*, a hyperbolic form, because largely determines the curvature of the parabolas; and both increase linearly with (Figure 4B). These phenotypic and genetic stationary points can be interpreted as phenotypic and genetic optima in the case of stabilizing selection.

### Effects of selection on genetic parameters

Both natural and artificial selection may bring the population mean to a fitness optimum. As shown in Figure 4, there is a relationship between the distance between the stationary point and the current population mean and the genetic correlations, such that the genetic correlations may change as selection shifts the population toward the stationary point. The formulas underlying Figure 4 can also be expressed to show how genetic correlations depend on the distance between the current population mean and the stationary point, *e.g.*, a fitness optimum (Ω), assuming that the parabola has fixed shape. The regression equation can be written as (24)where *d* is a constant. If we assume and then From Equation 22, substituting and equating to the expression for in Equation 18 assuming (25)The change in genetic correlation can be predicted as the population mean moves to the stationary point. For instance, if all other parameters stay equal, (26) (27)Equation 27 shows that and **Ω** are linearly related.

Similarly, for the genetic correlation between (28) (29)showing that has a hyperbolic relationship with Ω (Figure 4A).

### Example: index selection with changing genetic correlation

If index selection is applied to improve trait 2 using the phenotypic deviation of trait 1 and its square, *i.e.*, Equations 14 and 15, the mean of trait 1 will tend to move to the optimum value (Figure 5). If only changes according to Equation 27, the genetic correlation will become smaller as the population mean approaches the optimum. The response in trait 1 is nearly linear up to generation seven and becomes almost zero after generation eight, whereas the response in trait 2 is curvilinear. The response in trait 2 in the first seven generations is due mainly to bringing the population mean for trait 1 to the optimum and subsequently is due to reducing the phenotypic variance in trait 1.

### Deriving the phenotypic relationship between P_{2} and P_{1} when P_{2} is a binary trait

Until now, we assumed multivariate normality but nonlinear relationships between traits occur when one of the traits is not normally distributed, such as survival or presence/absence of disease. Therefore we extend the prediction of on (Equation 3) when is a 0/1 binomial trait. We assume that is normally distributed on the underlying/liability scale ( (Gianola 1982; Foulley 1992) and therefore, can be predicted using Equation 3. On the observed scale (), the proportion of animals that survive (dead = 0; alive =1) is predicted as (30)where *x* is the truncation point of the normal distribution corresponding to the average survival in the population (), represents the predicted truncation point given and is the standard normal density function. Using Equation 30 together with Equation 3 to predict from shows very good agreement with results from Monte Carlo simulation (Figure 6), with *R*^{2} > 0.995 for all cases evaluated. (Fortran scripts for all theoretical predictions and Monte Carlo simulations are provided in Supporting Information, File S3).

## Application to Birth Weight and Stillbirth in Pigs

The theoretical framework developed above shows that an M-V genetic correlation between traits leads to a nonlinear relationship between their phenotypes. The subsequent analyses demonstrated that the genetic and phenotypic optima could differ in position, but that selection could bring the population mean to the genetic optimum. As a practical example, we investigate such a nonlinear relationship between birth weight and stillbirth in pigs. We used 32,450 birth weight records from 2129 litters from 813 sows. Piglets were crossbred whereas the sows were purebred Landrace pigs from TOPIGS. Piglet survival at birth was recorded for each piglet as one (alive) or zero (dead). More details on the data are given elsewhere (Sell-Kubiak *et al.* 2015). Data are provided in Supporting Information, File S2.

### Statistical analysis

We analyzed piglet birth weight, its environmental variance (), and piglet survival at birth as three traits with a DHGLM in ASReml4 (Rönnegård *et al.* 2010; Felleki *et al.* 2012; Gilmour *et al.* 2014). For piglet survival we assumed a linear model with homogeneous residual variance because combining a logistic model for piglet survival with DHGLM for birth weight is not yet feasible and indeed outside the scope of this article. Fixed effects fitted were herd–year–season, parity of the sow, and sex of the piglet for all three traits. The random effects included were maternal genetic effects, litter effects, and residuals for all three traits, with each random effect trivariate normally distributed. The model is (31)where and are the vectors of observations for piglet birth weight, the response variable for of birth weight, and piglet survival; and **X**, **Z**, and **W** are the incidence matrices relating fixed effects random breeding values, and random litter effects to observations. The vectors and are maternal breeding values for birth weight, its and survival, and are the corresponding vectors for random litter effectswhere **I** is the identity matrix; and are the litter variances; and and are the corresponding covariances of and Although sows had repeated observations, at the risk of biasing estimates of genetic covariance estimates, permanent sow effects were excluded from the model because their inclusion yielded estimates at the lower boundaries. Subsequently, these estimates of parameters were used to predict the regression of piglet survival on birth weight and were compared to the averages of survival rate for each of 20 ranked bins with equal numbers of piglet birth weight records. Because the statistical model included litter effects, the extension in *Appendix C* was used to predict the regression equation when the residual covariances between and birth weight and survival are ignored.

## Results

The parameter estimates (Table 2 and Table 3) indicate that the genetic correlation between mean piglet birth weight and survival is negative, as is that between of birth weight and survival, the M-V genetic correlation between traits. This implies that lower birth weight or lower is genetically associated with a higher survival rate and, importantly, that uniform litters have higher mean survival rates. The multiple-regression equation based on the estimated variance components describes the nonlinear relationship between individual survival rate and individual birth weight quite well (Figure 7). Because a linear model for survival is used, the regression equation goes beyond a survival rate of 1.0. Accounting for the binomial nature of survival, using Equation 30, solves the problem of predicting a survival rate beyond 1.0, but the goodness of fit was slightly worse (*R*^{2} = 0.78) than for the regression assuming multivariate normality (*R*^{2} = 0.79). The relationship between individual birth weight and individual survival rate is, however, almost linear because the residual covariance between birth weight and survival substantially dilutes the nonlinear genetic relationship. Similarly Sölkner and Fuerst-Waltl (1996) found that environmental or residual correlations can greatly reduce the nonlinearity of the genetic relationship such that the phenotypic relationship is almost linear. From a biological point of view, a nonlinear relationship between piglet birth weight and survival rate at litter level can arise because extremely small piglets decrease survival rate, as also do very large piglets but to a lesser extent, because the very large piglets induce early farrowing.

## Discussion

### Methodological framework and quantitative genetic aspects of nonlinear relationships

This study shows that a genetic covariance between the environmental variance of one trait and the mean of another trait, *i.e.*, an M-V genetic correlation between them, can lead to nonlinear relationships between these traits and presented a framework based on multiple regression to investigate the consequences of selection in the presence of such an M-V genetic correlation. Although all relationships between breeding values are assumed to be linear, the phenotypic relationship between traits becomes curvilinear because the breeding value for environmental variance for the first trait is associated with the squared phenotype and its covariance to the mean of trait 2. Most quantitative genetic machinery is based on multivariate normality and thus linear relationships (Bulmer 1980), but we show that it can be extended to incorporate nonlinear relationships using methods of Mulder *et al.* (2007) for predicting breeding values and selection responses when there is genetic variation in . Mulder *et al.* (2007) used the additive model for genetic heterogeneity of , but here the exponential model was used because of its statistical tractability and implementation in the DHGLM methodology to estimate genetic variance in . The additive and exponential models are similar, assuming that the genetic variance in is small and parameters can be converted from one model to the other (Mulder *et al.* 2007; Hill and Mulder 2010). Here the focus is on the principles rather than the details and therefore the choice of model does not have a major impact on the nonlinear relationships caused by M-V genetic correlations between traits.

We extended the DHGLM methodology to estimate genetic and environmental variances and the correlations between piglet birth weight, its , and survival. The current algorithm can be run in ASREML4 (code provided in Supporting Information, File S1). It could, for example, also be used to estimate the trade-offs between and other traits and thereby help further unravel the biological mechanisms and causes of genetic differences in . Genetic correlations may change due to selection, as predicted in this study, even under infinitesimal model assumptions, and therefore reestimation of genetic parameters seems important if there are curvilinear relationships between traits. Large family sizes are required to get accurate estimates for genetic variance in (Hill 2004; Hill and Mulder 2010), and consequently very large data sets are required for estimating M-V genetic correlations between traits with sufficient precision (inferred from Robertson 1959).

The regression in Figure 1 shows substantial nonlinearity, but in our examples the coefficient of determination () would not be greatly reduced if a linear regression was applied even though the true relationship is quadratic because of an M-V genetic correlation between traits. The linear regression would generally explain ≥90% of the variation, except when the phenotypic correlation between the traits is close to zero (*i.e.*, −0.15 < phenotypic correlation < 0.15) (Figure 8). Therefore, a linear regression between traits is sufficient when the population is far from the optimum, but if there is a nonzero M-V genetic correlation between them and the population mean is close to the optimum, quadratic regression should be used.

A related question is how much of the nonlinearity observed is due to an M-V genetic correlation between the traits. In the pig example, the quadratic relationship improved only from 0.787 to 0.793, whereas the best-fitting quadratic regression would explain 0.794. Assuming that the true relationship is equal to the best-fitting quadratic regression, it can be concluded that the genetic covariance between of birth weight and survival is the main driver for the quadratic relationship, and no other factors need to be invoked. The environmental correlations seem to have a dominant influence on diluting the nonlinearity, which accords with findings of Sölkner and Fuerst-Waltl (1996) who investigated the effect of nonlinear relationships between traits due only to nonlinearity of their breeding values for trait means. If the environmental relationship between traits is approximately linear, the phenotypic relationship will also be approximately linear, especially when the heritabilities are small.

In this study, we investigated the contribution of M-V genetic correlations between traits to nonlinearity between them. Nonnormality of at least one of the traits can cause a nonlinear relationship between them (Varona and Sorensen 2014). Survival is binomially distributed, although we assumed multivariate normality of birth weight and survival, but estimates of additive genetic correlations on the linear scale are expected to be equal to those on the underlying scale (Vinson *et al.* 1976; Gianola 1982) and therefore not biased by the current procedure. Furthermore, we showed that the prediction assuming multivariate normality fitted the observed survival rates better than when assuming that survival is binomially distributed. Nevertheless, Equation 30 is a useful extension of the framework to predict phenotypes of traits that are not normally distributed, but have an underlying liability scale that is normally distributed. More research is needed to explore the effects of nonnormality on estimation of such M-V correlations.

The focus in the present study was on understanding the nonlinear relationship between traits caused by M-V correlations between traits and its consequences for artificial and natural selection. We focused on mass selection, but the same framework could be used to predict responses to selection when using information of relatives, *e.g.*, sibs or offspring as shown by Mulder *et al.* (2007, 2008). With progeny testing, accuracy will be greatly improved, especially for the breeding value for . In Mulder *et al.* (2008) we showed that when profit has an optimum value, artificial selection will first bring the population mean to the optimum and subsequently selection will be to reduce the variance, similar to that in Figure 5. It should be noted that current calculations as well as those in Mulder *et al.* (2007, 2008) do not take into account reduced genetic variance due to gametic phase disequilibrium (Bulmer 1971).

The framework provides estimates of positions and heights of phenotypic and genetic optima, which are functions of genetic and nongenetic (co)variances. As standard errors, particularly on genetic (co)variances, are moderate to high, those on optima are also likely to be large, especially when the M-V genetic correlation between traits is close to zero (Figure 5). The calculated location of optima gives insight into where the current population mean is compared to the optimum. The piglet example shows that phenotypic and genetic optima may be very different: the current population mean is far from the phenotypic optimum (∼13 phenotypic SD), because higher birth weight still yields higher survival on average based on the regression, although observed survival rate flattens at ∼1.8–2.0 kg. The high prediction for the phenotypic optimum is likely due to neglecting the binomial nature of survival. In contrast, the genetic optimum is ∼6 phenotypic SD below the current population mean, with the discrepancy between the phenotypic and the genetic optimum due to the opposite signs of the genetic *vs.* litter and environmental correlations between birth weight and survival. Yoshimura and Shields (1995) indicate that environmental uncertainty is the driver between differences in optima. If genetic/nongenetic (co)variances differ among environments, the optima, *i.e.*, fitness profiles, would also differ among environments (*e.g.*, Dewitt and Yoshimura 1998).

### Phenotypic variability and fitness in natural populations

Nonlinear relationships are often observed or hypothesized in natural populations, typically as “fitness profiles” between trait values and fitness (Lande and Arnold 1983; Falconer and Mackay 1996; Kingsolver *et al.* 2001, 2012). If the optimum phenotype is intermediate, there would be a negative covariance between and fitness, such that families with small variance would contribute more survivors, and correspondingly families with a large variance would have an advantage with disruptive or strong directional selection (Hill and Zhang 2004; Mulder *et al.* 2007). The strength of natural selection (*i.e.*, the selection gradient) can be determined by regressing relative fitness on phenotypic value if it is directional or on the squared deviation of phenotypic value from the mean if it is stabilizing or disruptive (Lande and Arnold 1983). Thus if fitness is taken as trait 2, the regression coefficients in Equation 3 indicate the strength of selection. For instance, the strength of stabilizing selection is obtained as (Falconer and Mackay 1996), where is the regression of fitness on the squared deviation of the trait under selection, *i.e.*, trait 1 in this study. Applying this to our study and assuming *i.e.*, the population mean is at the optimum, the strength of stabilizing selection (expressed as the “variance” of the fitness profile standardized by on trait 1, so small values imply strong selection) would be (Falconer and Mackay 1996). The estimate of for the example of piglet survival and piglet birth weight is 435

Based on early estimates of the strength of stabilizing selection, Falconer and Mackay (1996) suggested that strong selection would be = 10 and weak selection = 100, with “typical” estimates of the order of 20–25 (Falconer and Mackay 1996; Zhang and Hill 2005). Kingsolver *et al.* (2001, 2012), however, reviewed selection gradients estimated in a range of natural populations and found that in only a few studies negative quadratic regression coefficients () were actually obtained, indicating that strong stabilizing selection is rarely detected. The value obtained here for the piglet data implies very weak stabilizing selection. Indeed, as fitness typically has a low heritability (<0.1) and () in most recent estimates using DHGLM or MCMC (Felleki *et al.* 2012; Mulder *et al.* 2013a; Rönnegård *et al.* 2013), the stabilizing selection driven by the M-V genetic correlation between and fitness is likely to be weak (examples in Table 4).

### Implications and applications in agriculture

In agriculture, a frequent objective is to increase uniformity of animals and plants for some traits because these may have an optimum with respect to profit, for instance for slaughter traits in livestock because slaughterhouses prefer low variability in carcass weight and meat quality to optimize processing and to provide homogeneous products to retailers (Hennessy 2005). Some conformation traits such as udder depth or front teat placement may have a nonlinear relationship with respect to fitness traits, such as longevity (Larroque and Ducrocq 2001), or with milk production (Fuerst-Waltl *et al.* 1998). In contrast, there is also concern that animals with small would not have the capacity to react to unpredictable environments, *i.e.*, lack robustness, so it is likely to be worthwhile to explore the relationship between and fitness traits in different environments. While very little is known about trade-offs between and means of traits, the current study provides insight in showing that a negative genetic covariance between and fitness is advantageous for piglet survival and birth weight. Furthermore, selection on can increase accuracy of selection provided the population is at the optimum and the genetic correlation between trait means is close to zero. Therefore, inclusion of in index selection is mainly of interest for traits that have a direct economic value for uniformity or where has a strong genetic correlation with an economically important trait that has a low heritability and is not strongly correlated to other economic traits.

In conclusion, this study provides a framework to study nonlinear relationships between traits caused by M-V genetic correlations between them and to assess the effects of artificial and natural selection on their means and variances. It is shown that phenotypic and genetic optima may differ; that genetic relationships are often more curvilinear than phenotypic relationships, especially for environmentally correlated traits; and that genetic correlations may change when population means change relative to the optimal phenotypes. The analysis of piglet birth weight and survival shows that the nonlinearity can be partly explained by such an M-V genetic correlation between of birth weight and survival.

## Acknowledgments

We thank Ian White, University of Edinburgh, for helping with some of the derivations. TOPIGS Research Center IPG is acknowledged for providing the data. This study was supported by Breed4Food, a public–private partnership in the domain of animal breeding and genomics.

## Appendix A

### Derivation of Elements in the Multiple-Regression Equation of *P*_{2} on *P*_{1} and *P*_{1}^{2}

The elements of the **P** matrix and the **c** vector were derived as follows for the genetic model in Equation 1 in which and where the roman E denotes expectation.

The elements of **P** arewhereand comes from the coefficient of in The elements of **c** are

## Appendix B

### Derivation of Elements of *P* and *g* When Using Squared Residuals Corrected for Leverages

Instead of using directly, squared residuals of corrected for its leverage (*l*) can be used to predict breeding values for the second trait (see Equation 17 in main text). The leverage is the diagonal element of the hat matrix for each observation (Hoaglin and Welsh 1978) as used in the DHGLM (Rönnegård *et al.* 2010).

Assuming that only information of the animal itself is used for breeding value estimation and in the absence of fixed effects, and hence the elements of in Equation 17 areand similarly the elements of areIf is put into a scaling matrix the derived elements in *Appendix A* can be used together with the scaling matrix (see main text below Equation 17). It can be shown that the accuracy of the estimated breeding value is the same as with using squared phenotypic deviations instead of squared residuals corrected for leverages:

## Appendix C

### Generalization of Equations Involved in Multiple Regression with Multiple Random Effects

Assume the statistical model for trait 1 (= **y**), its and trait 2 contains fixed effects and a number of random effects () in addition to the residual,and is predicted from and The (co)variances in the **P** matrix are where () is the variance for random effect *i* for trait 1 ( of trait 1) and is the covariance for random effect *i* between trait 1 and its :The covariances in **c** are where is the covariance for random effect *i* for trait 1 and trait 2 and is the covariance for random effect *i* between trait 1 and its

## Footnotes

Supporting information is available online at http://www.genetics.org/lookup/suppl/doi:10.1534/genetics.114.173070/-/DC1.

*Communicating editor: J. B. Wolf*

- Received November 26, 2014.
- Accepted January 23, 2015.

- Copyright © 2015 by the Genetics Society of America