## Abstract

Data from uterine capacity in rabbits (litter size) were analyzed to determine whether the environmental variance was partly genetically determined. The fit of a classical homogeneous variance mixed linear (HOM) model and that of a genetically structured heterogeneous variance mixed linear (HET) model were compared. Various methods to assess the quality of fit favor the HET model. The posterior mean (95% posterior interval) of the additive genetic variance affecting the environmental variance was 0.16 (0.10; 0.25) and the corresponding number for the coefficient of correlation between genes affecting mean and variance was −0.74 (−0.90;−0.52). It is argued that stronger support for the HET model than that derived from statistical analysis of data would be provided by a successful selection experiment designed to modify the environmental variance. A simple selection criterion is suggested (average squared deviation from the mean of repeated records within individuals) and its predicted response and variance under the HET model are derived. This is used to determine the appropriate size and length of a selection experiment designed to change the environmental variance. Results from the analytical expressions are compared with those obtained using simulation. There is good agreement provided selection intensity is not intense.

THE classical model of quantitative genetics assumes that genotypes affect the mean of a trait but that the environmental variance (variance of phenotype, given genotype) is the same for all genotypes. An extension postulates that both mean and variability differ between genotypes (San Cristobal-Gaudy *et al*. 1998). The extended model has interesting implications in animal and plant improvement (*e.g*., Hill and Zhang 2004; Mulder *et al*. 2007) since it offers the possibility to decrease variation by selection leading to more homogeneous products. In evolutionary biology, a central problem is to understand the forces that maintain phenotypic variation. With the exception of recent work (Zhang and Hill 2005), most of the models assume that environmental variance is constant and explain the level of phenotypic variation by invoking a balance between the gain of genetic variance by mutation and its loss by different forms of selection and drift.

Early evidence for a genetic component affecting environmental variation stems from comparison of levels of variation between inbred lines and the F_{1} cross between them, with inbreds showing in general larger variance (reviewed in Falconer and Mackay 1996). More recent evidence has come from fitting the model to data on litter size in pigs (Sorensen and Waagepetersen 2003), adult weight in snails (Ros *et al*. 2004), body weight in poultry (Rowe *et al*. 2006), slaughter weight in pigs (Ibáñez *et al*. 2007), and litter size and weight at birth in mice (Gutierrez *et al*. 2006). Stronger, more direct support, not derived from fitting the genetically structured heterogeneous variance model, but from analyses of experiments with isogenic chromosome substitution lines of Drosophila, was provided by Mackay and Lyman (2005). Here, homozygote inbred lines that differed in chromosome 2 or 3 were created, and variation between individuals in abdominal and sternopleural bristle number was computed. Difference in within-line variance, between lines, was confirmed. Since individuals within a line are effectively replicates of the same genotype, difference in within-line variance, between lines, provides evidence for the presence of genes located in chromosomes 2 and 3 affecting environmental variance.

With the exception of experimental organisms such as Drosophila and some plant or fish species where replicated individuals of the same genotype (clones) can be produced and variation between individuals composing the clone measured directly, support for the presence of genes controlling environmental variation can be found, fitting the model to data and studying the quality of the fit using modern computational tools. Stronger support would entail showing that the environmental variance responds to selection pressure in an appropriately designed experiment. The latter requires to define an observable that properly reflects environmental variation and to determine the expected change of this observable due to selection and its variance in conceptual replications. This knowledge is needed to design an adequate experiment.

There are two objectives in this work. The first is to provide new results in favor of the existence of genetic variation at the level of environmental variance. Litter size in repeated parities in rabbits is taken as an example. Two models are fitted; one assumes homogeneity of environmental variance and the other postulates a genetically structured variance heterogeneity. The models are compared contrasting the quality of their fit, using posterior predictive model checking (Gelman *et al*. 1996), using cross-validation (Gelfand 1996), and using the deviance information criterion, an index that encapsulates the fit of a model and its complexity (Spiegelhalter *et al*. 2002). The second objective is to derive expressions to predict response to selection for environmental variance, and the variance of the response, with the purpose of studying a number of issues concerning the design and size of experiments to detect this response.

## STATISTICAL ANALYSIS OF LITTER SIZE DATA

#### Data:

The data originate from a selection experiment for uterine capacity in rabbits (litter size in unilateral ovariectomized does; technique described in Santacreu *et al*. 1990) spanning 10 generations. Uterine capacity is referred to as litter size hereinafter. Details of the selection experiment need not concern us here and can be found in Argente *et al*. (1997). From the point of view of the validity of inferences using selected data, it is important to emphasize that all the data used to make selection decisions have been included in the analyses reported in this work. Therefore the conditions for ignorability of selection under a Bayesian (or likelihood) analysis are met (Rubin 1976; Little and Rubin 1987).

Animals were derived from a synthetic population of the experimental farm at the Universidad Politécnica de Valencia that had undergone several generations of random mating before the start of the experiment. Reproduction was organized in discrete generations and mating of close relatives was avoided to reduce inbreeding. Females were first mated at 18 weeks of age and thereafter 10 days after parturition, producing in total up to four parities.

The total number of records was 2996. The number of animals in the pedigree was 1161; 85 of these composed the base population.

#### Models fitted and implementation:

Data were analyzed with two models. The first model assumes homogeneity of environmental variance; the second assumes that the environmental variance is partly genetically determined. Both assume that the conditional distribution of the sampling model for the data is Gaussian, despite the fact that the trait in question is in the form of counts. Sorensen and Waagepetersen (2003) investigated the consequences of this assumption by discretizing data that had been simulated under the normal model. The normal model was fitted to the discretized data and the posterior distribution of the parameters agreed well with the values used to simulate the data.

*Model 1* is the classical repeatability additive genetic model. It assumes that the sampling model of the data vector *y* = (*y _{i}*), given location parameters

*b*,

*a*, and

*p*and given the residual variance , is the normal process(1)where

*b*contains year–season effects with 30 levels (each level included 3 months, from spring 1991 until summer 1998) and parity order effects with 4 levels (first, second, third, and fourth or higher parities). Vectors

*a*and

*p*contain additive genetic values (1161 levels) and permanent effects (929 levels), respectively, and σ is the residual variance. The known incidence matrices are

*X*,

*Z*, and

*W*and

*I*is the identity matrix.

Vectors *p* and *a* were assumed to be *a priori* independently and normally distributed; that is,(2)(3)where *A* is the known additive genetic relationship matrix. The vector *b* was assigned an unbounded uniform prior distribution and the variance components , , and , scaled inverted chi-square distributions.

Under this model, the phenotypic variance is the variance of the conditional distribution of *y _{i}* given

*b*and the variance components,(4)and the heritability is(5)

This model assumes homogeneity of environmental variation. It was fitted using a Gibbs sampling algorithm, as described, for example, in Sorensen and Gianola (2002).

*Model 2* postulates that the environmental variance is heterogeneous and partly under genetic control. It assumes that conditionally on vectors of location and dispersion parameters, the vector of phenotypes is Gaussian,(6)where diag(()) is the diagonal matrix with diagonal entries ,andThe vectors *b* and *b** contain effects associated with year–season and parity order and *X*, *Z*, and *W* are known incidence matrices. Vectors *p* and *p** contain permanent environmental effects and are assumed to be independently distributed with normal structures(7)(8)Vectors (*a ^{T}*,

*a*

^{*T}) contain normally distributed additive genetic effects;

*i.e*.,(9)whereAbove, ρ is the coefficient of genetic correlation between

*a*and

*a**, and are additive genetic variances associated with the distribution of . As discussed in Sorensen and Waagepetersen (2003), this model generates a stochastic relationship between mean and variance when |ρ| < 1, a deterministic relationship when |ρ| = 1, and absence of relationship when .

Under this model, the phenotypic variance is the variance of the conditional distribution of *y _{i}* given

*b*,

*b** and the variance components(10)where (

*Xb**)

_{i}is the

*i*th entry of

*Xb**. The heritability is defined as(11)There is one heritability for each combination of environmental effects (

*Xb**)

_{i}affecting the environmental variance. More details can be found in Sorensen and Waagepetersen (2003) and in Ros

*et al*. (2004).

The third central moment under model 2 is(12)(Ros *et al*. 2004). The conditional distribution of *y _{i}* has therefore negative, zero, or positive coefficient of skewness(13)depending on the value of ρ. There is one coefficient of skewness for each combination of environmental effects (

*Xb**)

_{i}affecting the environmental variance.

Details of the *a priori* distributions and the Markov chain Monte Carlo (MCMC) implementation to fit model 2 are described in Sorensen and Waagepetersen (2003). Briefly, *a priori*, *b* and *b** were assigned normal distributions with zero mean vector and diagonal covariance matrix with very large diagonal elements. The variance parameters , , , and were assigned scaled inverted chi-square distributions and ρ was assigned a uniform prior bounded between −1 and 1. The implementation was based on the MCMC algorithm proposed by Sorensen and Waagepetersen (2003). Vector *b* was sampled using a Gibbs update, vectors (*a ^{T}*,

*a**

^{T}) and (

*p*,

^{T}*p**

^{T}) were reparameterized with the intention of reducing their posterior correlation and subsequently sampled using the Metropolis–Hasting algorithm with a Langevin–Hastings proposal, and the log-variance components and the correlation coefficient were sampled using Metropolis–Hastings with random-walk proposals.

## MODEL CHECKING AND MODEL COMPARISON

The three approaches described below to question the validity of the models address different questions. The deviance information criterion (DIC) provides a comparison of the global quality of two or more models, accounting for model complexity. Cross-validation based on conditional predictive ordinates (CPOs) provides a more detailed inspection, disclosing which specific data points are better fitted by the models. In addition, the set of CPOs contains the same information about model performance as the Bayes factor (Besag 1974) (when the latter exists), and in this way it also provides a measure of the models' overall quality. Finally, a graphical assessment of variance heterogeneity is presented using two approaches. The first does not involve fitting parametric models and is based on regressing the average sampling variance of records within individuals, on mean phenotypic values. The second one uses posterior predictive model checking and is designed to study the ability of a particular model to capture specific putative features of the data. In so doing, it suggests ways in which the model may be expanded to account for scientifically relevant aspects of the data.

#### An informal visualization of mean–variance relationship:

The 929 females with records were sorted according to their mean litter size (across parities) and divided into 11 groups of ∼85 individuals. Mean litter size and average variance between records within individuals were computed for each group. To visually explore a possible association between mean and variance, the average group variances were plotted against the group averages sorted in increasing order.

#### Posterior predictive model checking:

A technique for checking the fit of a model to observed data *y* is to draw simulated values *y*_{rep} from the posterior predictive distributions of replicated data and compare *y*_{rep} with the observed data (Rubin 1984; Gelman *et al*. 1995). Any systematic differences between the observed and the simulated data indicate potential failings of the model. More specifically, the idea is to define a so-called discrepancy measure that depends on the data and perhaps also on θ, an unknown parameter of the model under scrutiny, the null model, say. This measure *T* is specifically designed to test a particular feature of the data *y* that may be of scientific relevance. Replicated data are then simulated from the posterior predictive distribution, given the null model, from which is constructed and compared with . Differences between the *T*'s may be due to sampling or due to the inability of the null model to account for the feature of the observed data disclosed by the discrepancy measure *T*.

In this study we are concerned with studying heterogeneity of environmental variation due to year–season effects, due to parity, and finally due to additive genetic effects. This was accomplished using the discrepancy measures proposed by Sorensen and Waagepetersen (2003). Thus, effects of year–season and parity on variance heterogeneity were investigated using the discrepancy measure(14)where *j* is an index for the two covariates, year–season and parity, *l* = 1, … , *n _{j}* is an index for the

*n*levels of the

_{j}*j*th covariate, and

*L*=

_{ij}*l*if the

*i*th record belongs to the

*l*th level of the

*j*th covariate. The vector θ

_{1}contains the parameters of model 1,

*m*is the number of the records with level

_{jl}*l*for the

*j*th covariate, μ

_{i}is the

*i*th element in

**Xb**+

**Wp**+

**Za**, and is the squared standardized residual associated with record

*i*. Since the expected value of (14) is zero (see Sorensen and Waagepetersen 2003), large or small values of

*T*

_{j}_{,l}indicate possible variance heterogeneity due to the

*j*th covariate.

A possible association between environmental variation and additive genetic values affecting mean litter size was studied using the discrepancy measure(15)where is the number of observations with *a _{i}* ∈

*I*and are subintervals of the real line with −∞ =

_{j}*t*

_{1}< … <

*t*

_{7}= ∞ whose length was chosen to accommodate a similar number of observations in each (∼425). Thus,

*T*measures the average environmental variation in each group, where the groups are obtained by choosing the observations according to the size of their additive genetic values. The seven subintervals are ordered from the smallest group of additive genetic values (subinterval 1) to the largest (subinterval 7). A trend in

_{j}*T*plotted against the seven subintervals would be indicative of an association between environmental variation and additive genetic values affecting mean litter size.

_{j}Since random effects and other parameters involved in the construction of *T* are unknown, one uses the idea of posterior predictive model checking (Rubin 1984; Gelman *et al*. 1996) and considers the posterior predictive distribution of *T _{j}*(

*y*, θ

_{1}) −

*T*(

_{j}*y*

_{rep}, θ

_{1}).

#### Cross-validation:

Gelfand *et al*. (1992) propose using a cross-validation (leave-one-out) approach based on posterior predictive distributions as a means of checking the fit of the model and in model choice. Let *y _{i}* denote datum

*i*, and let

*y*

_{−i}be equal to the data vector

*y*with datum

*y*deleted. That is,The posterior predictive density of

_{i}*y*conditional on

_{i}*y*

_{−i}and on model

*M*is(16)Very often, . With

_{r}*n*data points, there are

*n*posterior predictive densities (Equation 16). Note that the observed

*y*is not included to determine (16). The density evaluated at the observed datum

_{i}*y*is also known as the conditional predictive ordinate ,(17)If the model holds,

_{i}*y*may be viewed as a random draw from whose density is given by (17). The CPO

_{i}_{i}'s can be plotted

*vs. i*as an outlier diagnostic, since data having low CPO

_{i}'s are poorly fitted by the model. Such a plot, for different models, discloses which model does better and which points are poorly fitted under the different models. A Monte Carlo estimate of (17) is(18)(Gelfand

*et al*. 1996). In the above expressions, is the

*j*th MCMC draw from and

*m*is the number of draws (length of chain). An attractive feature of (18) is that it does not require implementing a new Bayesian analysis for each

*y*

_{−i}.

#### Deviance information criterion:

Spiegelhalter *et al*. (2002) have introduced the DIC as a means of comparing models. The DIC uses the posterior expectation of the log-likelihood as a measure of model fit. For a particular model *M*, the DIC is defined as(19)where(20)is the posterior expectation of the so-called deviance . The term in the right-hand side of (19) is the deviance evaluated at the posterior mean of the parameter vector θ_{M}. The term measures the quality of fit of a model, whereas is related to the “effective” number of parameters (Spiegelhalter *et al*. 2002). Expression (19) is the result of combining both terms. Models having a smaller DIC should be favored as this indicates a better fit and a lower degree of model complexity.

DIC is very easily calculated using the MCMC output. The first term in the right-hand side of (19) is estimated using twice the average of the simulated values of , and the second term is estimated as the deviance evaluated at the average of the MCMC simulated values of θ_{M}.

## RESULTS OF THE STATISTICAL ANALYSIS

Results corresponding to models 1 and 2 are based on 1,000,000 samples drawn using the appropriate MCMC algorithm. To give an idea of the accuracy of Monte Carlo computations we report below confidence intervals for various Monte Carlo estimates of posterior means derived from model 2.

#### Variance components and heritability:

Table 1 shows Monte Carlo estimates of posterior means and of 95% posterior intervals for variance components derived from models 1 and 2. The additive variance is a little higher and the permanent environmental variance a little lower in the case of model 2. The posterior mean of the correlation coefficient is −0.74; the Monte Carlo estimate of the 95% posterior interval indicates that the support of the posterior distribution is shifted a long way from zero (see also Figure 5B).

Monte Carlo estimates of features of posterior distributions are subject to Monte Carlo sampling error. Estimates of sampling error yield the following 95% confidence intervals for estimates of posterior means under model 2: σ, σ, ρ, , . These intervals show that the length of the chain (sample size) used to estimate the posterior means results in adequate accuracy.

Under model 1, the posterior mean (and 95% posterior interval) of the environmental variance is 4.37 . Under model 2, the smallest posterior mean of the environmental variance and 95% posterior interval, corresponding to year–season 30 and parity 1, is 2.83 . The corresponding largest number, for year–season 15 and parity 3 is 6.99 .

The posterior mean and the 95% posterior interval of heritability of litter size under model 1 are 0.09 and , respectively. Under model 2, there is one heritability for each combination of environmental effects (*Xb**)_{i} (see Equation 11) affecting the residual variance. The average heritability over all combinations of environmental effects is 0.13 with a minimum of 0.09 (year–season effect 15 and parity order 3) and a maximum of 0.19 (year–season effect 30 and parity order 1). The 95% posterior intervals for these quantities are and , respectively.

Marginal posterior distributions of , , , and are shown in Figure 1. The superimposed lines represent densities of the prior ν*S*χ^{2}-scaled inverted chi-square distributions of the variances. The posterior distributions in the four leftmost graphs (Figure 1A) were obtained using priors with parameters ν = 4 and *S* = 0.45. The four rightmost posterior distributions (Figure 1B) were obtained using priors with parameters ν = 4 and *S* = 0.05. The two sets of prior distributions have modal values of 0.300 and 0.033. The graphs indicate that posterior inferences of , , , and derived from model 2 are affected little by the choice of prior.

#### Deviance information criterion:

The Monte Carlo estimates of the DICs for models 1 and 2 are 7810 and 7719, respectively. Based on 10 replicates, the respective Monte Carlo standard deviations are 0.23 and 0.27, respectively. Smaller values of the DIC indicate a “better” model. Thus, the DIC favors model 2.

#### Cross-validation:

Figure 2A shows the difference in CPOs between model 2 and model 1, where the CPOs are sorted from the smallest to the largest for the 2996 records. For approximately two-thirds of the data there is very little difference in the CPOs for both models. However, for the remaining one-third of the data, model 2 shows a better fit. Figure 2B shows which points are best fitted by model 2. The data are ordered from the smallest to the largest value of litter size. There is wide overlap for both models, with the exception of observations in the center of the distribution, where model 2 results in a better fit than model 1.

#### Graphical assessment of genetic variance heterogeneity:

Results from the exploratory analysis based on posterior predictive model checking are shown in Figures 3 and 4. To test for environmental variance heterogeneity due to year–season, data were simulated under the homogeneous variance model (model 1) and the discrepancy measure *T*_{ys,l}(*y*_{rep}, θ_{1}) was computed [see expression (14)]. The difference between the discrepancy measure evaluated using the observed data, *T*_{ys,l}(*y*, θ_{1}), and the one evaluated using the simulated data is shown in Figure 3A. A similar exercise was carried out to study environmental variance heterogeneity due to parity order (Figure 3B). Due to posterior uncertainty of the discrepancy statistic, Figure 3A) does not convey a clear message. It is difficult to conclude from this exploratory analysis whether there is evidence in the data for heterogeneity of variance due to year–season. On the other hand, the picture emerging from Figure 3B is that parity orders 1 and 2 seem to be associated with smaller environmental variance than parity orders 3 and 4.

A graphical display of the relationship between the variance between records within individuals and litter size, sorted in increasing order, using a simple plot that does not involve fitting parametric models (see *An informal visualization of mean–variance relationship*), is shown in Figure 4A. A linear regression was fitted and the estimate is −0.23 (standard error 0.07), indicating that as litter size increases, the variation among records within an individual decreases.

Figure 4B is in the same spirit as Figure 4A but is based on residuals computed from model 1. This enables filtering out variation due to fixed effects and genetic values affecting the mean and hence a more precise look at the possible genetic variance heterogeneity. The boxplots in Figure 4B obtained from MCMC samples of the posterior predictive distributions of *T _{j}* −

*T*

_{j}^{rep}exhibit a decreasing trend that in accordance with the Figure 4A is indicative of a negative association between environmental variation and additive genetic values affecting litter size, as postulated by model 2. This is in good agreement with the posterior distribution of the correlation coefficient in Figure 5B (see also Table 1), whose support is in the negative real line, clearly shifted away from the value ρ = 0. More details on this approach to posterior predictive model checking are provided in Sorensen and Waagepetersen (2003).

Figure 5A shows a histogram of estimated residuals from model 1. The distribution is skewed to the left, in agreement with the coefficient of skewness (12) derived under model 2.

In conclusion, this model-fit study using different approaches favors the genetically structured environmental variance model 2 relative to the homogeneous environmental variance model 1.

## SELECTION FOR ENVIRONMENTAL VARIANCE

The results of a study of two interrelated issues are presented in this section. First, an expression is derived to predict response to selection for environmental variance using the average sums of squares between records within individuals as a criterion of selection. This requires an experiment where repeated measurements are taken for each individual. We believe that a selection experiment should be designed in such a way that inferences could be drawn using simple approaches and particularly without explicitly invoking a model for variance heterogeneity to interpret the results. The criterion for selection proposed above can in principle fulfill this goal.

The second problem that is studied involves the size of the experiment required to detect a given change in environmental variance. The response to selection and the variance of the criterion of selection across conceptual replications of the experiment are derived under a relatively simple version of the heterogeneous variance model. This is used to calculate the power to detect response. The results are compared with those obtained using simulated data.

#### Response to selection for environmental variance:

An expression for the expected response to selection is derived, where the selection criterion or index is the variance between repeated records within individuals. We consider a scenario that mimics selection for reduced variation in litter size in repeated parities (such as in pigs, mice, or rabbits) and records are therefore available on females only. A hierarchical mating structure is assumed, with *N*_{m} males, *N*_{f} females (*N*_{f}/*N*_{m} females mated to each male). Litter size is considered a trait of the female not influenced by the father of the litter. The index is computed for each female, and the males and females of the next generation are chosen among the unscored offspring from the best-scoring females of the *N*_{f} scored.

As in Sorensen and Waagepetersen (2003), consider the simplified version(21)of the genetically structured heterogeneous variance model, where *b* and *b** are common to all records. Assume a simple scenario where there are available *n* records , *j* = 1, 2, … , *n* per individual, on a large number of unrelated individuals. The goal is to reduce the variance among repeated records within individuals, and the criterion of selection is the empirical index *I* defined as the variance of records within individuals(22)where . When selection with average intensity between sexes *i* operates on the index (22), calculations detailed in the appendix show that the expected genetic mean (at the level of the environmental variance) among the selected individuals is(23)There are two sources of inaccuracy contributing to (23). The first is associated with the linear approximation to truncation selection based on (A3) (see appendix), which is expected to deteriorate with increasing selection intensity. Second, since the marginal distribution of the index (22) is not normal, use of the standard values of *i* based on normality introduces another source of error. The accuracy of (23) is studied below using simulated data.

How does the expected change in the unobservable *a** based on (23) translate into an expected change in the observable (22)? Consider the second-order Taylor expansion(24)In the absence of selection, taking expectation with respect to the distribution of *a**,(25)since and . With selection, ignoring changes in , and the expectation of (24) can be approximated as(26)The expected change in the index (22) due to the expected change in *a** or expected response to selection in the index, *R*_{I}, is, approximately,(27)where is given by (23). The estimated response to selection in the index is simply the difference between the two average indexes and is labeled . Formula (27) can be interpreted as an expected difference either involving a selected line and the base population or involving a selected and a control line. It holds provided that there are common nongenetic effects *b* and *b** affecting all records. The expected difference in the average index (22) (over individuals, within lines) given by (27) includes the unknown nongenetic parameter *b** acting on the environmental variance, but does not include the nongenetic parameter *b* acting on the mean. In the classical homogeneous variance model, the expected difference in phenotypic means between a selected and a control line is free from nongenetic generation effects provided that these affect the selected and control lines in the same manner. The dependence of the expected response to selection on the unknown parameter *b** need not concern us here. As shown below, expressions for determining the size of the experiment are free from *b**.

A more realistic model would allow for possibly different nongenetic effects affecting records at the level of the mean and environmental variance. For example, for a particular individual the vector of *n* records could be assumed to be the result of the sampling process(28)where are nongenetic effects peculiar to each of the *n* time periods when a record is registered; *X* is an incidence matrix (here, the identity matrix); 1 is a column vector of 1's; *a*, a scalar, is the additive effect of the individual affecting the mean; *D* is a diagonal matrix with the element corresponding to record *j* equal to , *j* = 1, 2, … , *n*; and *a**, a scalar, is the additive effect of the individual affecting the environmental variance. Then, writing (22) as , with , where *I* here is the identity matrix, and using the standard expression for the expectation of a quadratic form, it is easy to show that the expected value of the index under the model is(29)In this particular case, the first term in the right-hand side reduces to . If there is a common nongenetic parameter affecting records at the level of the mean, *X* = 1 (a column vector of ones) and the first term in (29) vanishes. If there is only one common nongenetic parameter affecting records at the level of the environmental variance, the second term reduces to (A1).

Under model (28), provided records are appropriately matched so that the first term in (29) is equal in selected and unselected individuals, the expected change in the index (22) due to the expected change in *a** is, approximately,(30)

#### Variance of the mean index *Ī*:

In the following derivations it is assumed that model (21) holds and that parents are randomly chosen and mated to create the *m* offspring of the next generation. Let *I _{i}* denote the index (22) for individual

*i*and let denote the average of the

*I*'s over the

*m*individuals. To simplify notation, in this section we let and . The variance of the average of the

*I*'s over the

*m*individuals is(31)where is the variance of the marginal, with respect to , distribution of the selection criterion

*I*, given by (A5), and is the marginal covariance between indexes of two individuals

*i*and

*j*. In a finite population the covariance term increases with time since individuals become more related. In the appendix it is shown that at generation

*t*, the variance of the average of the

*I*'s over the

*m*individuals is approximately(32)where

*N*is the effective population size. The second term in (32) can be thought of as a drift term that causes the variance of the index to increase with time. The validity of the exact result (A8) (see appendix) and of approximation (32) is checked below using simulated data.

In the absence of genetic variation for environmental variance,(33)

The coefficient of variation of the average index, ignoring selection, is(34)independent of *b* and *b**. Under random mating, the coefficient of variation increases with increasing length of the experiment. For example, with , *m* = 160 individuals measured, *n* = 3 records per individual, and an effective population size *N* = 116, , 11.4, and 13.3% for *t* = 0, 4, and 8, respectively. For an effective population size *N* = 75, the corresponding numbers are 9, 12.5, and 15.2%. In the absence of genetic variation for environmental variance, , independent of *t*. For *m* = 160 and *n* = 3, . Since the conditional mean and standard deviation of the index given by (A2) are equal, changes due to selection cancel in the ratio and (34) is expected to provide a good approximation under selection.

The difference between and (34) or between (33) and (32) could provide support for the genetically structured environmental variance model. Given the model, the variance of the index and its coefficient of variation should increase with time, whereas they should remain unchanged if .

#### Change in variance of the mean index due to scale:

Formula (32) does not account for the change of variance due to the change in the mean of the index under selection [the strong scale effect referred to in (A2)]. When selection is for reduced variance, (32) will overestimate the variance of the index and the opposite is expected to hold when selection is for larger variation. It has not been possible to obtain an exact analytical description of the evolution of the variance of the index under selection. We have instead explored a heuristic formula that captures the increase in variance due to genetic drift and the decline (increase) due to selection for small (large) index values. The starting point is the conditional variance of the index given by (A2). Expanding in a Taylor series as in (26) and taking expectations as before yieldsSubstituting in (32) leads to(35)a heuristic expression for the variance of the mean index under selection [in the absence of selection, ; due to the expansion, the exponential term is not equal to the corresponding exponential term in (32)]. Above, , where is given by (23). The behavior of (35) is investigated below using simulated data.

#### Design considerations:

Under the homogeneous variance model, the design of a classical selection experiment to change the mean of a trait typically includes a selection and a control line or two divergently selected lines. The estimated response to selection and the estimated divergence are the difference in phenotypic means between the selected and control lines and the difference in phenotypic means between the divergently selected lines, respectively. One reason for using the difference between the line means is that, provided a number of assumptions hold, environmental effects on the trait cancel out. The estimator of response based on the difference between line means is free from the stronger parametric assumptions that characterize estimators based on the infinitesimal model. One requires that genetic means equal phenotypic means; the design should take care of the influence of nongenetic effects.

Here we are concerned with the design of an experiment where the selection criterion is the average sum of squares (Equation 22) for each individual. The purpose of the experiment is to test whether the environmental variance of the trait is partly under genetic control without invoking a genetically structured environmental variance model in the analysis. The estimator of response to selection in the index (22) is the difference between the average index in the selected line and the average index, either among individuals from the base population (unselected) or among contemporaneous individuals in a control line.

The need for diverting facilities in a control line depends on what one believes is the correct model to account for effects that, if ignored, would distort inferences. If a simple model can be postulated with only one unknown nongenetic parameter at the level of the mean and one at the level of the environmental variance, the estimator of response defined as the difference in mean index values between the selected line and the base population has expectation given by (27). In this case there is no need to diverge facilities in a control line. The estimator is approximately normally distributed (invoking the central limit theorem) but it has the disadvantage that its expected value depends on the single nongenetic effect acting at the level of the environmental variance, which must be estimated in some manner. An alternative estimator involving the ratio of the average indexes of the selected line and the base population could be considered. The (approximate) expectation does not depend on the nongenetic effects but on the variance . Further, the ratio of two (approximately) distributed random variables is not normally distributed; therefore sample size calculations based on normality may be inaccurate. A Monte Carlo approach such as the bootstrap could be used to carry out sample size calculations.

If the degree of experimental control is such that the simple model (21) cannot be justified and model (28) must be invoked to account for nongenetic effects peculiar to each generation, then, as shown in (29), the expectation of the average sum of squares involves these nongenetic effects at the level of the mean and the variance. A control line is then necessary to remove the nongenetic effects acting at the level of the mean, but it will do so if records are carefully matched, so that the first term in (29) cancels in the difference involving the average indexes of the selected and the control line. If this is the case, the estimator of response (defined as the difference in the average indexes of the selected and control lines) has approximate expectation given by (30).

#### Sample size calculations:

Two criteria are used to determine the size of the population required for artificial selection. The first one is the probability of detecting a response of a given magnitude; this involves classical power calculations. The second criterion, due to Nicholas (1980), is the probability of achieving an estimated response greater than a proportion φ of the expected response *R*_{I}.

#### Sample size based on power calculations:

After *t* generations of selection, an approximation for the expected response to selection of the index using (23) and (27) is given by(36)Let the estimator of response be the difference in the average indexes in the selected line at generation *t* and the base population of unrelated individuals. Assume that *m* individuals are measured in each group and that each individual has *n* records. Then the standard deviation of the estimator of response isThe term is the contribution of the covariance among the indexes of related individuals in the selected line, which is assumed absent in the base population.

The standard calculation of the size of the sample required to detect a difference between two groups equal to is(37)which for given values of *t*, *i*, , *n*, *N*, *z*_{α}, and *z*_{β} can be solved for *m*. In this expression, *z*_{α} is the cutoff point of the standard normal distribution corresponding to a probability of a type I error equal to α and *z*_{β} is the cutoff point of the standard normal distribution corresponding to a probability of a type II error equal to β (the power of the experiment is equal to 1 − β). This expression does not depend on *b**.

#### Sample size based on Nicholas' criterion:

The second criterion, proposed by Nicholas (1980), is the size of population *m* required to obtain, with probability δ, an estimated response that is greater than a proportion φ of the expected response . Nicholas (1980) shows that(38)where *Z* is the standard normal deviate. As in (37), for given values of *t*, *i*, , *n*, *N*, δ, and φ the required value of *m* satisfies(39)where *z*_{δ} is the cutoff point of the standard normal distribution corresponding to δ. As in (37), this expression is proportional to the inverse coefficient of variation of response, which does not depend on *b**.

## RESULTS

#### Variance of the index and response to selection:

Table 2 shows the behavior of the variance of the index (A8), which is an exact result, and of the approximate variance (32), in a random mating situation. Data were simulated according to model (21) with *b** = 2.14, = 1.28, , and ρ = −0.75. The values of *b** and of are taken from García and Baselga (2002), typically representative for litter size in rabbits or pigs (Sorensen and Waagepetersen 2003), which is the type of trait we wish to mimic. Each generation, *N*_{m} = 40 males and *N*_{f} = 160 females are randomly mated, *N*_{f}/*N*_{m} = 4 females mated to each male. Among the 160 females (litters), 1 female from each of the 4 mated to each male is randomly chosen, so that 20 females and 20 males contribute offspring to the next generation. From each of the 20 females, 8 female offspring and 2 male offspring per female are chosen as parents of the next generation. There is no variance in the number of males or females contributed by male and female parents, giving an effective population size of 80. The number of records measured in each of the 160 females is *n* = 3. The index (22) is calculated for each female and the process is repeated during eight generations. There is good agreement between the empirical variance of the index obtained from 1000 replicated simulations and the variance computed from the exact formula (A8). The approximate formula (32) understates the variance in later generations. The coefficient of variation of the index increases from a value of 11.3% in generation 2 to 15.2% in generation 8 (simulation results, not shown). The corresponding values predicted by formula (34) are 10.9 and 14.9%, respectively.

The validity of the approximations to predict changes of various quantities under selection is illustrated in Table 3. Columns 2–4 refer to predictions based on formulas (23), (32), and (34). Columns 5–7 show the corresponding results obtained from 500 replications of the experiment. A similar data structure as in the random mating situation is simulated. The number of males is *N*_{m} = 40, and the total number of females is *N*_{f} = 160 with *N*_{f}/*N*_{m} = 4 females mated to each male. The index (22) was computed for each of the 160 females and the 80 best-scoring females were selected. From each of these 80 females, two daughters were chosen as mothers to produce the next generation, and one son was chosen as father from each of 40 females randomly selected among the group of 80. The effective population size calculated from the expression in Hill (1972) is 116. This value agrees well with that derived from the simulated data based on the rate of inbreeding, despite the fact that Hill's formula assumes no selection. The expected response in the unobservable additive genetic values affecting environmental variance (column 2) tends to overpredict the simulated results. Part of this can be attributed to the selection intensity *i* based on normality used in (23). For a proportion selected of 50%, a ratio of the standardized mean among selected individuals obtained from a chi-square distribution with *n* − 1 = 2 d.f. (as an approximation to the distribution of the selection criterion *I*) and from a normal distribution is ∼0.87 (not shown). The ratio of the simulated to the predicted response is between 0.90 and 0.92. The overprediction of formula (23) increases at higher selection intensities.

Contrary to the situation under random mating, the variance of the mean of the index computed from (32) overestimates the variance obtained from the simulation. Formula (32) predicts an increase due to genetic drift, but the variance falls instead. This is due to the strong scale effect between the mean and the variance of the mean index, not accounted for in the predictions. The opposite holds when selection is for higher variation (not shown). The predictions of the variance of the mean of the index corresponding to generations 2, 4, 6, and 8, computed from (35) are 0.828, 0.836, 0.815, and 0.774, which better reflect the simulated results (see Table 3). When 40 of 160 litters are selected, the results for generations 2, 4, 6, and 8 are as follows (predicted; simulated): , , , and .

There is good agreement between the evolution of the coefficient of variation of the mean index predicted using (34) and the simulated results (Table 3, columns 4 and 7). The agreement holds also with more intense selection (contrary to the case with the prediction of changes of the other quantities, which deteriorate with higher selection pressure). For example, with 40 females selected of 160 scored, the values of the coefficient of variation of the mean index in generations 2, 4, 6, and 8 are (predicted; simulated): , , , and . Under selection formula (34) for the coefficient of variation, ignoring selection provides a good approximation.

#### Sample size calculations:

We consider the hierarchical mating design described above, also with either 40 or 80 females contributing as parents of 160 scored. We envisage a situation where a fixed number of facilities per generation is available and the length of the experiment can be varied. Table 4 shows the power to detect response to selection computed using simulation and using approximation (37). Power numbers agree well at the low selection intensity but are overpredicted at the higher selection intensity. This is due to the overprediction of selection response. As shown in Table 4, to detect response to selection with 80% power and a probability of type I error of 5%, when 40 (80) of 160 females are selected, requires an experiment spanning over 8 (11) generations. As a point of reference, from formulas in Hill (1980), when 40 (80) females are phenotypically selected from 160 scored, in the absence of a control line, detecting a response at the level of the mean with 80% power and a probability of type I error of 5% requires an experiment spanning 4 (6) generations.

Second, consider the length of the experiment required on the basis of the Nicholas criterion. If selection is for reduced variation, with the same values of parameters, to obtain with probability 80% a response in the index that is negative (as desired), when 40 (80) of 160 females are selected, requires an experiment spanning over 2 (3) generations. To obtain a response that is 50% of the expected response with 80% probability, when 40 (80) of 160 females are selected, requires an experiment spanning over 4 (6) generations. To obtain a response that is 80% of the expected response with 80% probability, when 40 (80) of 160 females are selected, requires an experiment spanning over 12 (18) generations.

## DISCUSSION

#### Statistical analysis:

In this work two models were compared using various criteria and the results of this exercise favor model 2 (the genetically heterogeneous environmental variance model). Several lines of evidence support this conclusion. First, the graphical assessments displayed in Figure 4 are suggestive of a negative correlation between environmental variation of litter size and genetic values for litter size. The posterior mean of the correlation coefficient ρ was −0.74 and the support was shifted a long way from the value of zero. Further, the Monte Carlo estimate of the 95% posterior interval of the additive genetic variance associated with the environmental variance was and the support is comfortably away from extremely small values in the vicinity of zero. The modal value of one of the priors used was 0.033, and the Monte Carlo estimate of the modal value of the posterior distribution was 0.139, indicating that the data caused the distribution to shift farther away from zero. The prior with modal value 0.300 resulted in a posterior mode of 0.141 (see Figure 1). Finally, model 2 is also supported by the deviance information criterion.

A negative correlation between genes affecting mean and variance could in principle facilitate improving litter size and simultaneously reducing its variation in repeated parities. The model postulates that genotypes that are expected to produce large litters are less likely to show departures from their expectation than genetically less prolific individuals. A discussion on possible mechanisms involved in the regulation of environmental variation is in Falconer and Mackay (1996).

The rabbit data span 10 generations. The raw coefficient of variation of the average index was computed each generation to search for the trend predicted by (34). However, the picture is too erratic and no clear trend could be detected.

One could speculate on what other unaccounted factors could misleadingly lead to a better relative fit for model 2, even in the absence of . Variance heterogeneity could be caused by unaccounted major genes. To investigate whether a gene of large effect might be operating, a segregation analysis was performed, fitting a Bayesian–MCMC model similar to model 1 with the addition of one putative major locus. This mixed inheritance model is described in Sorensen and Gianola (2002, pp. 672–679), where full details can be found. The parameters associated with the major locus are the initial gene frequency, *q*, the additive and dominance deviations, the additive and dominance genetic variances associated with the major locus , and their sum . The posterior means for these quantities and 95% posterior intervals are shown in Table 5. Given the model, the results show that litter size is affected by a recessive gene whose frequency at the start of the experiment is low and whose contribution in terms of and relative to the total phenotypic variance for litter size is 0.020 and 0.038, respectively. Relative to the total genetic variance (sum of the polygenic component and the total major locus genetic variance), these numbers are 0.16 and 0.30. The recessive homozygote genotype decreases litter size by 0.81 kits, and the heterozygote genotype increases it by 0.73 kits (based on estimated posterior means in Table 5). The 95% posterior intervals indicate that there is very large posterior uncertainty associated with these inferences. The Monte Carlo estimate of the DIC for this model is 7825. The estimated DIC corresponding to model 1 is 7810, which reveals that the data do not provide support for the presence of a major locus. This is consistent with the amount of posterior uncertainty (see the 95% posterior intervals in Table 5).

The segregation analysis was fitted as an extension of model 1 only. The amount of variation contributed by the major locus (total genetic variance) is at a theoretical maximum in those groups of individuals where the gene frequency is 0.70, and it does not contribute when the gene is fixed or lost. Using the estimates of posterior means from Table 5, the maximum possible contribution is 0.60. The range of within-group variation in Figure 4A is from 2.4 to 3.9. More precisely, the largest and smallest values for the environmental variance estimated fitting model 2 are 2.83 and 6.99, respectively. Further, we computed the gene frequency of the recessive allele in each of the groups in Figure 4A, and it fluctuated around 0.30 with no detectable association with within-individual variation across groups. We conclude that the variance heterogeneity postulated by model 2 cannot be explained by an unaccounted effect of the major locus only. Under the mixed-inheritance model, our results are in very good agreement with those reported by Argente *et al*. (2003) who used a similar data set. However, Argente *et al*. (2003) fitted a mixed-inheritance model only and based their conclusions on this single model. The analysis reported here based on the mixed-inheritance model agrees with theirs, but the DIC favors model 1.

#### Selection for environmental variation:

Attempts to show the existence of genetic heterogeneity of environmental variance analyzing field or experimental data that have not been specifically collected to address this issue provide circumstantial support for the model. One can never rule out artifacts, such as confounding between genetic and environmental effects, the scale of measurement, or the existence of a hidden mixture due to disease, say, that would favor the genetically heterogeneous environmental variance model relative to the other models. For example, if the distribution of the data is skewed in either direction, the genetically heterogeneous environmental model would fit relatively better than the standard model, yielding a coefficient of correlation between the additive genetic values affecting mean and variance whose sign and magnitude would depend on the direction and degree of skewness (Ros *et al*. 2004). Another source of spurious results could be due to the wrong choice of functional relationship between mean and variance postulated by the model (Ros *et al*. 2004). Some but certainly not all of these limitations can be overcome if the evidence comes not from fitting the model, but rather from using simple functions of the data whose pattern would be consistent with the presence of genetic effects on the variance. Figure 4A is an example. A more sophisticated approach in this spirit is in Rowe *et al.* (2006), who found evidence for genetic variance in environmental variance, analyzing the variance of the variance of large half-sib families for juvenile growth in broiler chickens.

On the other hand, a selection experiment showing that the environmental variance responds to selection would provide strong evidence. Such an experiment poses challenges in terms of number of individuals, design, criterion of selection, and the analysis and evaluation of response. Here we attempted to address some of these issues, with special emphasis on the size of the experiment.

The predictions of the evolution of the mean of genetic values affecting the variance and of the mean index deteriorate as selection intensity becomes more intense. We have also studied selection at the level of ln *I*, which, using the delta method, can be shown to be asymptotically normally distributed with mean *b** and variance equal to . The mean and variance of ln *I* | *b**, *a** are *b** + *a** and 2/(*n* − 1), respectively, mutually independent. Standard normal theory can then be used, which leads to simple algebra. For example, the expected response in *a** and in ln *I* when selection is based on ln *I* is , approximately the same as (23) if is small and *n* large (selecting on *I* or on ln *I* should lead to the same choice of parents), but different from (27). The variance of the mean ln *I* equivalent to (32) is . There is an appealing simplicity in these expressions that are independent of *b**. Unfortunately, with typical values of *n* ∼ 3 or 4 the approximation based on the delta method does not perform well, especially when several generations of selection are involved.

An alternative model of genetically structured environmental variance was postulated by Hill and Zhang (2004) and Mulder *et al*. (2007, 2008). In their model, the phenotype is assumed to be conditionally normally distributed (given genotype), with the same mean as in (21), and with conditional variance equal to , rather than as in (21). In the case of our model, as mentioned in Sorensen and Waagepetersen (2003),where is , independent of *a*. Thus our model postulates a stochastic relationship between the mean and the log-variance. When the components of the argument of the exponential of the right-hand side are small,so the mean–variance relationship is approximately linear. The Mulder *et al*. (2007, 2008) model assumes a linear stochastic relationship between the mean and the variance. There is thus little difference in this feature between the models when is small. Other functional relationships could be conceived, with no *a priori* mechanistic justification for the choice. A rationale for this choice could be based on a posterior analysis. In both of these models, the marginal distribution of phenotype is unknown. The mean of the marginal distribution of phenotype is the same in both models but the marginal variance under the model of Mulder *et al.* (2007, 2008) is , as in the homogeneous variance model. The focus in Mulder *et al*. (2007, 2008) is to derive expressions to predict changes in after one cycle of selection on the basis of either phenotype or an index, but not on the evaluation of these changes. Under their model, the conditional expectation of the index, given *a**, isand the expected change in the index (22) due to the expected change in *a**, equivalent to expression (27), is , not dependent on *b** [this expectation is different from (23)]. However, it can be shown that the term , which features in the determination of size of experiment, depends on . We have not found that the types of computations carried out in this study are simpler under this alternative model.

This work has centered exclusively on describing a simple approach to reduce environmental variation by selection and to measure this reduction, as a means of validating the genetically heterogeneous environmental variance model. No attention was given to the correlated change in mean, a subject studied in detail in the work of Mulder *et al.* (2007, 2008). In a typical selection program in cases where a near optimum for the mean has been approached, one may be interested in increasing relative selection pressure on the variance (Mulder *et al.* 2008). This leaves a proportion of the variance to manipulate environmental variation by selection. Clearly, in such a situation a relatively larger experiment would be needed to detect selection response in variance.

## APPENDIX

#### Expected response to selection:

An expression is derived for the expected response to selection (average values of additive genetic values affecting the environmental variance) when selection operates on the index (22).

Given the assumption of conditional normality of the data as specified in (21), the distribution of . Therefore,(A1)which is equal to the variance of the conditional distribution of a datum. From properties of the chi-square distribution, the conditional variance of the index is(A2)The conditional mean of the index and its conditional variance are functionally related; this generates a strong scale effect.

As in Gavrilets and Hastings (1994), assume that directional selection can be approximated using the linear fitness function(A3)where *s* is a small quantity that defines the strength and direction of selection and that can be made arbitrarily small so that remains positive. Letand define, for an individual with index *I*, the fitness of its genotype as

If selection operates on *I*, the mean value of *a** in the selected group, or response to selection on *a**, is(A4)where and , the marginal variance of the distribution *y* | *b*. San Cristobal-Gaudy *et al*. (1998) show that the selection intensity *i* (defined as the standardized deviation of the mean *I* among selected individuals from the population mean) can be expressed asand that, unconditionally with respect to (*a*, *a**),(A5)where . Replacing these expressions in (A4) yieldswhere *i* is the selection intensity averaged over sexes.

#### Covariance between indexes in related individuals:

To compute the covariance between the indexes of two related individuals *i* and *j*, we usesince the first term is equal to zero, and given *a _{i}**,

*I*is conditionally independent of

_{i}*a*and of . Then,(A6)The first expectation in the second line iswhere

*r*is the coefficient of relationship between individuals

_{ij}*i*and

*j*(

*i.e*., or for noninbred full or half sibs). Using this result and , (A6) reduces to(A7)

This covariance vanishes if individuals are genetically unrelated (*r _{ij}* = 0) and/or if . From (A5) and (A7) the sampling variance of the mean of the selection criterion (31) is(A8)where , the square of the environmental variance. This expression ignores changes of over generations but does account for an increase of the variance of the index with time via the term

*r*(in a finite population, individuals become more genetically correlated in later generations). It is therefore appropriate to write to emphasize the dependence of the genetic relationship with time

_{ij}*t*measured in generations. The second term can be simplified, writingwhere ,

*i*≠

*j*. Further, since the average genetic relationship is equal to twice the inbreeding coefficient in the next generation (assuming the same average relationship between individuals of the same or of different sexes), we can write , where

*F*is the average inbreeding coefficient in generation

_{t}*t*. A further simplification is possible provided

*t*< < 2

*N*; then , where

*N*is the effective population size. Finally the variance of the average index at generation

*t*is approximately

## Footnotes

Communicating editor: C. Haley

- Received May 18, 2008.
- Accepted September 19, 2008.

- Copyright © 2008 by the Genetics Society of America