# Multilevel Selection 2: Estimating the Genetic Parameters Determining Inheritance and Response to Selection

- Piter Bijma
^{*}, - William M. Muir
^{†},^{1}, - Esther D. Ellen
^{*}, - Jason B. Wolf
^{‡}and - Johan A. M. Van Arendonk
^{*}

^{*}Animal Breeding and Genetics Group, Wageningen University, 6709PG Wageningen, The Netherlands,^{†}Department of Animal Science, Purdue University, West Lafayette, Indiana 47907-1151 and^{‡}Faculty of Life Sciences, University of Manchester, Manchester, M12 9PT, United Kingdom

- 1
*Corresponding author:*Department of Animal Science, 1151 Lilly Hall, Purdue University, W. Lafayette, IN 47907-1151. E-mail: bmuir{at}purdue.edu

## Abstract

Interactions among individuals are universal, both in animals and in plants and in natural as well as domestic populations. Understanding the consequences of these interactions for the evolution of populations by either natural or artificial selection requires knowledge of the heritable components underlying them. Here we present statistical methodology to estimate the genetic parameters determining response to multilevel selection of traits affected by interactions among individuals in general populations. We apply these methods to obtain estimates of genetic parameters for survival days in a population of layer chickens with high mortality due to pecking behavior. We find that heritable variation is threefold greater than that obtained from classical analyses, meaning that two-thirds of the full heritable variation is hidden to classical analysis due to social interactions. As a consequence, predicted responses to multilevel selection applied to this population are threefold greater than classical predictions. This work, combined with the quantitative genetic theory for response to multilevel selection presented in an accompanying article in this issue, enables the design of selection programs to effectively reduce competitive interactions in livestock and plants and the prediction of the effects of social interactions on evolution in natural populations undergoing multilevel selection.

INTERACTIONS among individuals are universal, both in animals and in plants and in natural populations as well as in domestic populations. These interactions severely affect the direction and magnitude of responses to either natural or artificial selection (Dawkins 1982) and thus have important implications both for domestic breeding (Griffing 1967; Muir 1996) and for the outcome of evolution (Griffing 1981a; Ridley 1995; Goodnight and Stevens 1997; Keller 1999; Clutton-Brock 2002).

With respect to agriculture, understanding how to reduce competitive interactions through artificial breeding is critical for global food security (Denison *et al*. 2003) and required to improve animal well-being in species currently used in agriculture, such as swine and poultry (Muir 2003). Animal and plant breeders have been concerned about interactions among individuals, but have lacked direction as to methods to address these issues, and have been unsure as to the magnitude of the problem or even whether it exists at all. Griffing (1976a) observed that group selection could be used to ensure positive response to selection, but also noted that group selection was not very efficient. Determining the impact of interactions for artificial selection programs requires first of all knowledge of the genetic parameters underlying the interactions. Such knowledge would enable optimization of breeding programs to maximize total genetic improvement of complex traits, but is currently lacking.

With respect to the study of evolution of interactions, one can distinguish two approaches: the adaptionist approach (*e.g*., Hamilton 1964a; Williams 1966; Maynard-Smith 1974; Wilson and Sober 1989; West *et al*. 2002) and the genetic approach (Williams and Williams 1957; Griffing 1967, 1981a; Wright 1968, 1977; Wade 1979; Moore and Boake 1994; Goodnight and Stevens 1997; Roff 1997; Goodnight 2005). The adaptionist approach focuses on explaining existing phenotypic adaptations, whereas the genetic approach focuses on modeling ongoing evolution by natural selection. Adaptionists treat the current population as an end point and seek to explain current phenotypes by considering whether they are evolutionary stable (Maynard-Smith 1974). Hence, they implicitly assume that current trait values are optimal, meaning that traits have no heritable covariance with fitness. Geneticists, in contrast, take the opposite approach by assuming that all traits are heritable. With respect to behavioral traits, the adaptionist approach has so far had the most impact (Moore and Boake 1994; Goodnight 2005). In many cases, however, it is unknown whether or not traits in natural populations have a heritable covariance with fitness, which is particularly true for complex traits related to behavior.

In general, response to selection depends on the heritable variation in the traits of interest. Classical quantitative genetic theory states that response to selection equals the product of heritability and the selection differential, *R* = *h*^{2}*S*, a result known as the “breeder's equation” (Lynch and Walsh 1998). Griffing (1967, 1968a,b, 1969, 1976a,b, 1981a,b) and later developments of others (Moore *et al*. 1997; Wolf *et al*. 1998), however, showed that interactions among individuals involve additional heritable components. The accompanying article (Bijma *et al*. 2007, this issue) shows that interactions among individuals may create substantial additional heritable variation. Selection among individuals does not capture this additional variation, but selection acting on higher levels of organization, such as a group, captures the full heritable variation. Thus, knowledge of the amount of heritable variation hidden due to social interactions is critical to understand and predict consequences of multilevel selection.

At present, there is very little information on the amount of heritable variation hidden due to social interactions. The large responses observed in selection experiments applying selection among groups, however, provide indirect evidence for substantial hidden variation (Craig and Muir 1996; Muir 1996; see Goodnight and Stevens 1997 for a review). It is clear from those experiments (see Table 1 in Goodnight and Stevens 1997), and from results presented herein, that multilevel selection could result in the next major advance in both domestic plant and animal breeding. Full utilization of heritable interactions and multilevel selection in artificial breeding, and a better understanding of the impacts of interactions for evolution by natural selection, however, requires methodology to estimate the relevant genetic parameters in general populations.

Here we present statistical methodology to estimate the genetic parameters of traits affected by interactions, for interactions among any number of individuals in general nondesigned populations with any degree of relatedness among individuals. Next, we present estimates for the amount and composition of the total heritable variation for survival days in a population of layer chickens (*Gallus gallus*) and investigate the prospects of using multilevel selection and relatedness to improve survival in that population, by utilizing the total heritable variation.

## BACKGROUND

This section briefly summarizes the quantitative genetic theory of multilevel selection as presented in Bijma *et al*. (2007).

#### Model:

We consider a population in which interactions occur within “groups” of *n* individuals each and affect phenotypic values of individuals. In classical quantitative genetic theory, the phenotypic trait value of an individual, *P*, is the sum of a heritable component, *A*, referred to as “breeding value” and a residual component, *E*, referred to as “environment”: *P* = *A* + *E*. With interactions among *n* individuals, however, the observed phenotypic trait value of an individual is due to two unobserved phenotypic effects: a direct effect originating from the focal individual and the sum of the associative effects originating from of each of its *n* − 1 group members (Griffing 1967). Associative effects may be interpreted as heritable environmental effects provided by associates to the focal individual (Wolf 2003). Both phenotypic direct and associative effects are the sum of a heritable component (*A*) and a nonheritable component (*E*), so that the observed phenotypic trait value of an individual is given by(1)(Griffing 1967), in which *i* denotes the focal individual and *j* one of its associates (*i.e*., group members). The first two terms in Equation 1 represent effects due to the focal individual; the last two terms represent the effects due to its *n* − 1 associates. The effect is the direct breeding value (DBV) and *E*_{D,i} is the direct nonheritable value of individual *i*, whereas is the associative breeding value (SBV) and *E*_{S,j} the associative nonheritable value of associate *j*. The DBV and the SBV represent the heritable components underlying the observed phenotypes.

Note that nonheritable associative effects, indicated by in Equation 1, may contribute to the observed phenotype. This is because the full associative effect of an individual is a phenotypic effect, which may contain both heritable and nonheritable components. The genes of an individual do not *directly* affect the associates; the associates experience a phenotype. We emphasize those nonheritable associative effects, because they modify the statistical model required to obtain unbiased estimates of genetic parameters (see below).

#### Heritable variation:

Because each individual interacts with *n* − 1 associates, the total heritable impact of an individual on the population mean, referred to as its total breeding value (TBV), equals the sum of its DBV and *n* − 1 times its SBV: TBV_{i} = DBV_{i} + (*n* − 1)SBV_{i}. Response to selection equals the change per generation of the TBV. Hence, the TBV is a generalization of the classical breeding value, to account for interactions. The total heritable variation available for response to selection, therefore, equals the variance of the TBV,(2)(Bijma *et al*. 2007), in which is the variance of DBV, is the variance of SBV, and is the covariance between DBV and SBV. The sign of the covariance between DBV and SBV is a measure of competition *vs*. cooperation. Negative may be interpreted as “heritable competition,” in the sense that individuals with positive breeding values for their own phenotype (DBV) have on average a negative heritable impact on the phenotypes of their associates (SBV). Conversely, positive may be interpreted as “heritable cooperation.”

Classical quantitative genetic methodology to estimate additive genetic variances of traits rests on similarity of phenotypes' relatives. Associative breeding values of individuals, however, are not expressed in their own phenotypes, but in the phenotypes of their associates. Thus only the variance of DBV surfaces in classical analyses; the remaining part of the full heritable variance, , is hidden. Depending on group size and the heritability of interactions, this hidden heritable variance may be substantially larger than the variance of DBVs. In those cases, classical methods will substantially underestimate the total heritable variation. With specific family structures, the SBV may also be confounded with the DBV (see discussion and Wolf 2003), which would make the result of classical analyses an inaccurate estimate of even the direct genetic variance. The total heritable variation (Equation 2) does not depend on relatedness among associates. This is because all components of the TBV of an individual refer to its own genes. (See Bijma *et al*. 2007 for the effect of population structure.)

#### Response to selection:

In classical quantitative genetic theory, response to selection is in the same direction as selection pressure; heritability is merely a scaling factor affecting the magnitude of response, not its direction. With interactions among individuals, however, not only the magnitude, but also the direction of response depend on the genetic parameters (Griffing 1967; Kirkpatrick and Lande 1989; Moore *et al*. 1997). To illustrate this phenomenon, we consider response to selection among unrelated individuals, which follows from substituting *r* = *g* = 0 into Equation 5 of Bijma *et al*. (2007), giving , in which is the selection gradient. (See also Griffing 1976a.) This result shows that the direction of response depends on the covariance between DBV and SBV (). Strong competition, *i.e*., large and negative , will yield response in the direction opposite to the direction of selection. With interactions, therefore, the genetic parameters play a key role in the outcome of selection; they are no longer merely a scaling factor.

## METHODS

Powerful methods to estimate genetic parameters have been developed in the field of animal breeding (Henderson 1950; Lynch and Walsh 1998). Those methods have become known as “animal models” and have recently been introduced into other fields in biology (Kruuk *et al*. 2003). In the following, we extend the animal model to account for interactions. Particular emphasis is given on the proper inclusion of nonheritable associative effects.

In its basic form, *i.e*., in the absence of interactions, an animal model is given by(3)in which **y** is a vector of observed phenotypes; **b** is a vector of so-called “fixed” effects, with incidence matrix **X** linking observations on fixed effects; **a** is a vector of breeding values, with incidence matrix **Z** linking phenotypic values of individuals to their breeding values; and **e** is a vector of residuals. Fixed effects in **b** account for systematic nongenetic differences among observations, such as age or treatment effects. Covariance structures of model terms are: , in which **A** is a matrix of coefficients of relatedness between individuals and the variance of breeding values (which is to be estimated), and , in which **I** is an identity matrix (*I _{jj}* = 1 and

*I*= 0) and the residual variance. The

_{jk}**Za**term in the model, together with its variance structure, accounts for the genetic covariances between all observations.

To account for interactions in the animal model, we investigated the structure of the covariances between observations that arise from Equation 1 and formulated a statistical model that yields the same covariance structure. (The derivation is in the next section.) The resulting statistical model is summarized as(4)in which **a**_{D} is a vector of DBVs, with incidence matrix **Z**_{D} linking phenotypic values of individuals to their DBV; **a**_{S} is a vector of SBVs, with incidence matrix **Z**_{S} linking phenotypic values of individuals to the SBVs of their associates; and **e** is a vector of residuals. The covariance structure of genetic terms is , whereand indicates the Kronecker product of matrices. (Note that the element *y _{i}* of

**y**in Equation 4 corresponds to

*P*in Equation 1.) This fully describes the genetic components of the covariance between observations.

_{i}The covariance structure of the residual term of the model, **e**, can be derived using Equation 1 and is given by(5)(see next section for derivation), in which and , which is the correlation between residuals of group members. Together, Equations 4 and 5 fully describe the covariance structure that arises from Equation 1.

#### Derivations:

This section describes the derivations of Equations 4 and 5. Consider two individuals, *i* with group members denoted by *j* and *k* with group members denoted by *l*. The covariance between phenotypes of *i* and *k* follows from substituting Equation 1 for both *P _{i}* and

*P*, giving , which consists of a genetic and an environmental component. Analogous to the basic animal model, we account for the genetic components by including direct and associative breeding values in the model, together with their covariance structure. This yields Equation 4. For example, in Equation 4, the

_{k}*i*th element of

**y**corresponds to

*P*, the

_{i}*i*th element of

**Z**

_{D}

**a**

_{D}corresponds to

*A*

_{D,i}, the

*i*th element of

**Z**

_{S}

**a**

_{S}corresponds to , and the

*i*th element of

**e**corresponds to . As usual in an animal model, we account for covariances between breeding values in

**a**

_{D}and

**a**

_{S}by incorporating the relatedness matrix

**A**in the statistical analyses. (See below and the mixed-model equations in Muir 2005.)

What remains to be done is the covariance structure of the vector of residuals **e**. The covariance between residuals in **e** originates from the environmental component of the covariance between observations and can be partitioned into . As usual in quantitative genetics, we assume that *E*-terms of distinct individuals are independent. (Environmental effects common to multiple individuals are accounted for by the fixed effects **Xb**.) Thus a nonzero residual covariance occurs only when it involves the same individual, which happens either when *i* and *k* are the same individual or when *i* and *k* are group members. When *i* and *k* are the same individual, , which is the residual variance. The first term is due to the direct effect of the individual itself, and the second term is due to its *n* − 1 group members. When *i* and *k* are group members, their residuals are correlated for two reasons. First, for groups larger than two, *i* and *k* will have *n* − 2 group members in common, and thus they have *n* − 2 identical associative effects in their phenotype, giving . Second, the direct effect of *i* is expressed in its own phenotype, whereas the associative effect of *i* is expressed in the phenotype of *k*, and vice versa for *k*. As with any two traits of a single individual, the direct and associative effect originating from a single individual may be correlated Walsh (Lynch and Walsh 1998), giving . Thus the full covariance between residuals of group members equals . The represents the nonheritable relationship between the direct and associative effect of an individual and is a nonheritable measure of competition *vs*. cooperation (analogous to , see above). A negative may, for example, arise from variation in the juvenile environment experienced by individuals. Individuals having experienced good (bad) environments may be more (less) competitive, in which case they will have higher (lower) phenotypic values at the expense (to the advantage) of their associates. In Equation 5, we accounted for the nongenetic covariance between group members by fitting the correlation ρ between records of group members.

#### Experimental design for application:

The use of Equations 4 and 5 does not require a specific experimental design; groups may be composed of any combination of related and/or unrelated individuals and may vary in size. To calculate the relationship matrix **A**, information on at least one generation of family relationships will be required, so that one can distinguish full sibs, half sibs, and unrelated individuals. Information on more generations of pedigree, which provides information on more distant relationships, will add to the accuracy of results, but is not required for application. With groups of equal size, there will be no information in the data to separate and , because different combinations of those two parameters may yield an identical residual correlation. The inability to estimate and with equal group size is no problem, because the residual correlation itself, which included these components as part of a ratio, is estimable and sufficient in the analysis to avoid bias in the estimates of the genetic parameters. There are population structures from which the parameters of interest cannot be estimated. (See discussion on the work of Wolf 2003).

We used Monte Carlo simulations to validate our statistical methodology (appendix). Table 1 shows that estimated genetic parameters obtained from simulated data are unbiased; true values of genetic parameters are within ±3 standard errors of mean estimated values in all cases. These results confirm that Equations 4 and 5 correspond to the covariance structure that arises from Equation 1 and that there is no need for a specific structure of the family relationships within and between groups.

## APPLICATION

#### Material:

The methodology presented above was applied to analyze mortality in a commercial population of layer chickens. Field data on this population were provided by Hendrix Genetics and consisted of observations on survival days of a single generation of 3800 hens bred from 36 sires and 287 dams, which had been mated at random. Each sire had been mated to ∼8 dams, and each dam contributed on average 13.2 female offspring. Five generations of pedigree were included in the calculation of the matrix of relatedness (**A**).

Mortality in commercial chicken populations is largely due to cannibalistic pecking behavior (*e.g*., Craig and Muir 1996). The direct effect of an individual is, therefore, related to its ability to survive by avoiding being pecked, whereas its associative effect measures its effect on survival of its cage members, which is closely related to its own pecking behavior. In commercial egg production, individuals are frequently beak trimmed, to reduce mortality due to pecking behavior. In this data set, however, individuals had not been beak trimmed, which will be obligatory in the European Union from 2011 onward.

Data consisted of four age groups of approximately equal numbers of hens, differing by 2 weeks of age each. At an age of 20 weeks, individuals were allocated randomly to 950 standard commercial battery cages, four individuals per cage. Due to chance, some cages contained full or half sibs, but most cages contained unrelated individuals only. Facilities consisted of eight rows of cages. Each row was 40 cages long and consisted of 3 cages on top of each other, giving 120 cages per row. The eight rows were grouped into four double rows; both rows of a double row were put with the backs against each other. Each age group was allocated to a single double row. Back fences of cages consisted of latticework allowing limited contact between “backdoor neighbors,” *i.e*., between the individuals of two corresponding cages, each in one of the two rows of a double row. Adjacent cages in the same row were separated by a closed fence fully preventing contact. Individuals were kept 60 weeks and culled at an age of 80 weeks. Survival (0/1) was recorded daily on all individuals. For each individual, survival days were defined as age at death in days. Individuals surviving until 80 weeks of age had 80 × 7 = 560 survival days. Fifty-four percent of individuals survived until 80 weeks of age. Mean survival time was 454 days, with a standard deviation of 122 days. Inspection of dead hens showed that the vast majority of chickens had died due to being pecked. This observation was supported by previous data on beak-trimmed individuals of the same line kept in the same facilities, which showed substantially lower mortality (<10%). Thus the absence of beak trimming substantially amplified the consequences of social interactions.

Data were analyzed with Equations 4 and 5 as a model, using restricted maximum likelihood (REML) (Patterson and Thompson 1971) as implemented in the ASReml software package (Gilmour *et al*. 2002). The model included a fixed effect for each combination of row (1 through 8) and level (top, middle, and bottom) of the cage and for the number of surviving birds in the back cage, to account for a possible associative effect due to the backdoor neighbors. As usual with an animal model, this analysis directly yields estimates of the parameters of interest and utilizes the full data.

#### Results:

When using a conventional model without associative effects, estimated heritability for survival days in the chicken line was 6.7%, which is in the range common for survival traits in livestock (Table 2; Beaumont *et al*. 1997; Carlen *et al*. 2005; Gitterle *et al*. 2005). The total heritable variance estimated from the full model (Equations 2, 4, and 5), however, was 2742 days^{2}, which corresponds to 20% of the phenotypic variance. This result indicates a “heritability” threefold greater than that obtained using classical methods, *i.e*., 20% *vs*. 6.7%. Two-thirds of the heritable variation, therefore, is due to social interactions among individuals and is hidden from classical analyses.

At first glance, one might expect direct and associative effects for mortality due to pecking behavior to be negatively correlated, because dominant individuals may kill others and survive themselves. However, we found a small but nonsignificant positive correlation of +0.28 (*P* = 0.016), suggesting that individuals benefit from not harming others. In this population, there seems to be mutual benefit of behaving peacefully and a mutual cost of behaving aggressively. A positive covariance between DBV and SBV increases the total heritable variation (Equation 2) and therefore increases the potential of the population to respond to selection.

Figure 1 shows predicted responses to multilevel selection applied to the chicken population, using equations presented in Bijma *et al*. (2007). (Details and an example of calculation are in the appendix.) In Figure 1, the strength of multilevel selection is indicated by a factor *g*, where *g* = 0 represents selection among individuals, *g* = 1 represents selection among groups, and values of *g* between 0 and 1 represent a mix of selection among individuals and groups (Bijma *et al*. 2007). Classical theory suggests a response of only 7.8 days of survival. Predictions accounting for heritable interactions, however, yield substantially higher responses. Selection among individuals applied to a population composed of groups of unrelated individuals yields an expected response of ∼11 days of survival (*g* = 0, *r* = 0). Mild multilevel selection applied to a population composed of groups of full sibs more than doubles the predicted response (*g* ≥ 0.3, ). The maximum response that can be obtained equals 23 days (*g* = 0.6–0.8 and ), which is nearly threefold greater than suggested by classical theory. (Clones do not occur in reality.) The lines in Figure 1 are curved, which indicates that a degree of between-group selection (*g*) that maximizes response exists, which was also observed by Griffing. This value of *g* may be applied in artificial breeding programs, where the aim is to maximize response to selection. Note that results in Figure 1 are specific to the chicken population and to the environment in which it was kept, such as the density of housing; other populations may have other genetic parameters and other relationships among relatedness, multilevel selection, and response. Results in Figure 1 should, therefore, not be taken to conclude that relatedness has a bigger impact than multilevel selection in general.

## DISCUSSION

We have presented methodology to estimate the genetic parameters of traits affected by interactions among individuals of any degree of relatedness in general populations. The Monte Carlo simulations confirm that our methods yield unbiased results. In an accompanying article (Bijma *et al*. 2007, this issue), we provide the quantitative genetic theory for the consequences of multilevel selection. That article shows that, in theory, heritable interactions may have substantial consequences, such as the existence of considerable yet hidden heritable variation. This article provides the methods to estimate the genetic parameters required for applying the theory. Analysis of the chicken population revealed a threefold greater heritable variation and potential response to selection than estimated by classical methods. These findings demonstrate the relevance of our theoretical work for real populations.

Together, both articles provide testable predictions of response to multilevel selection in general populations. These predictions may be used to optimize artificial breeding programs for heritable traits that have nevertheless been refractory to selection among individuals (Teichertcoddington and Smitherman 1988; Vangen 1993; Kruuk *et al*. 2001; Muir 2005). Even for traits not refractory to selection, responses obtained in current selection programs are most likely less than what can be achieved by accounting for interactions. For example, results for the chicken population show that optimal multilevel selection will yield a response twofold greater than that of individual selection (23 *vs*. 11 days, Figure 1). The additional response to selection is expected to be even greater for situations where the covariance between direct and associative effects is negative, *e.g*., in situations with competition for limited resources. In the field of evolutionary biology, our methods may be used to quantify the potential of social traits to evolve in response to natural selection acting on multiple levels and also to investigate whether social traits in current populations show a heritable covariance with fitness, in which case they would respond to natural selection, indicating that current trait values deviate from evolutionary stable state trait values.

In stark contrast to classical theory, the total heritable variance () can *in principle* exceed the observed phenotypic variance. Consider, for example, groups consisting of unrelated individuals. In that case, the variance among observed phenotypes equals (Equation 1; appendix), whereas the total heritable variance equals (Equation 2). Comparing both expressions shows that is most likely to exceed the phenotypic variance when groups are large, associative effects are largely heritable, and the covariance between DBV and SBV is positive. For example, with *n* = 10, , heritable variances of direct and associative variances equal to 50% of the corresponding phenotypic variances, and , the phenotypic variance equals 1 + 11 × 0.2 = 3.2, whereas the heritable variance equals 0.5 + 9^{2} × 0.1 = 8.6. This example shows that interactions can in principle cause the heritable variance to exceed the phenotypic variance. Whether this occurs in real populations has not yet been investigated to our knowledge.

Previous work on the genetic background and evolution of social effects (Hamilton 1963, 1964a,b; Wolf *et al*. 1998; Wolf 2003) has focused predominantly on pairs of individuals, whereas the current approach applies to groups of any number of individuals. For certain types of interactions, such as maternal effects, it is obvious to consider only two individuals. Results for two individuals, however, do not generalize to larger groups. In larger groups, the variance of SBVs becomes the dominant term, both for response to selection and for the total heritable variation (Equation 2) (Bijma *et al*. 2007). The total heritable variation contains a term (Equation 2), which increases rapidly with group size, indicating that populations consisting of large groups may contain substantially more heritable variation than populations consisting of small groups.

The impact of group size on the heritable variation and on response to selection depends critically on the relationship between the variance of SBVs and group size. Depending on the nature of the trait, the variance of SBVs may or may not decrease with group size, and consequently the total heritable variance may either increase rapidly with group size or maintain a stable value (Bijma *et al*. 2007). Knowledge of the relationship between group size and the heritable variance would provide insight into the contribution of group augmentation to the evolution of social behaviors (Clutton-Brock 2002). For populations consisting of groups of different sizes, the methodology presented here can be extended to examine that relationship (not shown).

In comparison to other methods, Wolf (2003) presented an experimental design and a statistical method to estimate direct and associative genetic (co)variances in populations consisting of groups of two individuals and presented results for body length in *Drosophila melanogaster*. Results indicated an associative genetic variance of approximately half the direct genetic variance and a strong negative genetic correlation of −0.85 between direct and associative effects. In the experimental setup, flies were kept in pairs of individuals that were full sibs, half sibs, or unrelated. With unrelated individuals, the same two half-sib families were always combined in a pair. Dams were nested within sires; each sire was mated to exactly two dams and each dam to a single sire only. Data were analyzed using a sire and dam-within-sire model (referred to as “sire–dam model” in the following), and residuals were treated as independent, both within and between tubes (*i.e*., groups). Direct and associative genetic (co)variances were estimated using an ANOVA method, in which estimated sire and dam variances were equated to their expectations. Further details are in Wolf (2003). We investigated this experimental design using Monte Carlo simulation of the population structure of Wolf (2003). Simulated populations were analyzed using either Equations 4 and 5 or a sire–dam model. To our surprise, results from Equations 4 and 5 indicated that the associative genetic variance could not be estimated with this experimental design, whereas estimates from a sire–dam model were severely biased. To determine for certain whether or not the associative genetic variance can be estimated in this experimental design, we examined in detail the expectations of the sire and dam variances presented in Wolf (2003), which revealed that they are incomplete (appendix). Investigation of the complete expectations confirmed that the associative genetic variance cannot be estimated using the experimental design proposed by Wolf (2003). Moreover, it revealed that a sire–dam model is unsuitable for estimating direct and associative genetic (co)variances, because the true expectations of sire and dam “variances” can take negative values (because they are, in reality, covariances that, unlike variances, are not restricted to nonnegative values). Indeed, the expected sire variance would take a negative value when estimates presented in Wolf (2003) would be the true values (appendix). Results presented in Wolf (2003), therefore, may need to be reconsidered.

Furthermore, the assumption of independent residuals within tube may have caused overestimation of the genetic variances presented in Wolf (2003). To illustrate the potential consequences of assuming independent residuals, we reanalyzed the data on the chicken line by fitting both direct and associative genetic effects according to Equation 4, but replacing Equation 5 by . A likelihood-ratio test (Wald 1943; Lynch and Walsh 1998) strongly favored correlated residuals (*P* < 0.001). Results (not shown) indicated severely affected estimates of the genetic parameters, which increased by 37% for the variance of DBV, by 175% for the variance of SBV, and by 360% for the covariance between DBV and SBV, yielding a total heritable variation 2.6-fold greater than the original result (7294 instead of the 2742 days^{2} in Table 2). These results demonstrate that even a very small residual correlation (, Table 2) may substantially bias the estimated genetic parameters and illustrate the need to allow for nonheritable associative effects in the statistical analysis.

Muir (2005) presented similar methodology to estimate direct and associative genetic (co)variances in general populations. With respect to genetic components, our model (Equation 4) is the same as that of Muir (2005). However, Muir (2005) assumed that . This would imply that , which is necessarily positive. That assumption is likely to be violated, because an environmental effect that increases its own direct effect will most likely have a negative impact on others (except for altruistic effects). Here we presented the general solution, which obeys the basic quantitative genetic assumptions that (i) phenotypic values are the sum of both genetic and environmental values and (ii) phenotypic covariances are the sum of both genetic and environmental covariances.

The above considerations show that the choice of proper models and statistical methods is critical to the outcome and biological interpretation of results. Furthermore, the analysis of the experimental design of Wolf (2003) shows that for certain population structures the genetic parameters of interest cannot be estimated. That result is not specific to our statistical methodology, but holds in general. The experimental design used in Wolf (2003) yields a covariance structure between observations that is independent of the relative magnitudes of direct and associative genetic variances (appendix). To estimate genetic parameters of traits affected by interactions, we suggest the use of animal models and random composition of groups. Animal models directly include the animal component, instead of underlying components as sire and dam, and are a powerful tool for data analyses in both natural and domestic populations (Kruuk 2004). Application of animal models does not require designed populations and avoids the complexity of deriving expectations of sire and dam variances and the risk of obtaining estimates outside the parameter space, which may occur in ANOVA-based methods.

## APPENDIX

#### Monte Carlo simulation used for Table 1:

A population of five discrete generations was simulated. First, a base generation of 250 sires and 500 dams was generated. DBVs of base individuals were sampled from and nongenetic direct effects from . Associative effects of individuals were simulated conditional on their direct effects; the SBV of the *i*th individual of the base generation was sampled as , in which *r _{A}* is the correlation between DBV and SBV, and

*z*is a random variable taken from a standard normal distribution, . Nongenetic indirect effects were sampled as , in which

*r*is the nongenetic correlation between direct and associative effects. Phenotypes were created according to Equation 1 and parents for the next generation were taken at random from the base generation.

_{E}Second, four additional generations were generated. DBVs of individuals in those generations were sampled as and SBVs as , in which MS denotes Mendelian sampling components of breeding values, which were sampled as and . Nongenetic effects were sampled in the same way as in the base generation. Phenotypic values from the fifth generation were used to estimate variance components, and all five generations were included in the calculation of the relatedness matrix (**A**). Fifty replicates were simulated for each set of genetic parameters and estimates were averaged over replicates.

#### Example of calculation of response to selection for the chicken line:

Estimated genetic parameters for the chicken line were , , , , and ρ = 0.09. In the data, groups consisted of four unrelated individuals, so that the phenotypic variance equals (Table 2).

In the following, we give an example of calculation of response to selection for a population consisting of groups composed of four full sibs (*n* = 4, *r* = ), with substantial selection pressure given to group performance (*g* = 0.8), and for an intensity of selection of 1 unit (ι = 1). The general expression for response to selection is , in which *C* denotes the selection criterion, , and *g* the relative emphasis given to group performance in the selection criterion (Bijma *et al*. 2007). Thus, calculation of response to selection requires calculating the variance of the total breeding values, , the covariance between phenotypic values and TBVs of individuals, , and the selection gradient, . We start with the selection gradient and subsequently calculate and .

The denominator of the selection gradient, the standard deviation of the selection criterion, , depends on the phenotypic variance, , and on the covariance between phenotypes of group members, . The general expression for the phenotypic variance follows from Equation 1, giving . The covariance between phenotypes of associates also follows from Equation 1, . Next, it follows from that . Thus the selection gradient equals .

The variance of the total breeding value equals (Equation 2). The covariance between phenotypic values and TBVs of individuals equals (Bijma *et al*. 2007). Finally, response to selection follows from .

#### Expected sire and dam covariances in the design of Wolf (2003):

In the experimental setup of Wolf (2003), flies were kept in pairs of two individuals that were full sibs, half sibs, or unrelated. With unrelated individuals, the same two half-sib families were always combined in a pair. Dams were nested within sires; each sire was mated to exactly two dams and each dam to a single sire only.

Investigation of the expectations of sire and dam variances presented in Wolf (2003) shows that they are incomplete. Consider, for example, the case where tube members are half sibs. In this case, the sire covariance between two individuals in different groups (tubes) equals , instead of as listed in Table 1 of Wolf (2003), *i.e*., a difference of . This extra originates from the full-sib relationship that exists between a focal individual and the tube member of a half sib of the focal individual, which is not accounted for in Wolf (2003). When each sire is mated to exactly two dams, the group member of a half sib of the focal individual must be a full sib of the focal individual (in the case where tube members are half sibs). Note that is not a variance, but a covariance. For example, if estimates presented in Wolf (2003) would have been the true values, so that , , and , then the expected sire variance would equal −2666, *i.e*., a negative value. Hence, a sire plus dam-within-sire model is inappropriate, because the expected sire variance may be negative for combinations of direct and indirect genetic variances that are biologically possible.

To investigate whether direct and associative genetic variances can be estimated from the data structure used in Wolf (2003), we derive general expressions for expectations of sire and dam covariances. We distinguish between (i) covariances between individuals in the same group and (ii) covariances between individuals in different groups. The expected covariance between two individuals, *i* and *j*, in the same group is(A1)and in different groups is(A2)in which is relatedness between *i* and *j*, is relatedness between *i* and the group member of *j*, is relatedness between *j* and the group member of *i*, and is relatedness between the group members of *i* and *j*. Table A1 shows expected covariances between full sibs, half sibs, and unrelated individuals for each of the three tube compositions used in Wolf (2003). The expectations of sire and dam covariances given in Table 3 differ from those presented in Wolf (2003), primarily because covariances due to group members of full and half sibs are incomplete in Wolf (2003).

It follows from Equations A1 and A2 that and can be separated only when . Thus the information in the data to separate and depends on whether and how frequently occurs. It follows from Table A1 that occurs only in the situation where groups consist of two unrelated individuals and only when relatedness between the two group members of both individuals differs from that between both individuals considered. (see Table A1 footnote). Whether or not that situation occurs in the data depends on details of the experimental implementation that are not described in Wolf (2003). Thus, the data structure used in Wolf (2003) provides either very little or no information at all to separate the direct and the associative heritable variance.

To validate this theoretical result we performed Monte Carlo simulation of the data structure used in Wolf (2003), where, in the case of groups consisting of unrelated individuals, group members of full sibs were full sibs as well, and group members of half sibs were half sibs as well. Hence, theoretically this data structure should not contain any information to separate and . Statistical analyses of the simulated data using Equations 4 and 5 as a model showed that the matrix of second derivatives of the likelihood (Fisher's information matrix) was singular, indicating that there are an infinite number of combinations of parameters that yield the identical maximum likelihood. Indeed identical maximum likelihoods were observed for multiple combinations of and , which confirms the theoretical result derived above.

## Acknowledgments

We thank Henk Bovenhuis, Bart Ducro, Charles Goodnight, Han Mulder, Jeroen Visscher, Patrick Wissink, and Addie Vereijken for stimulating discussions during the development of this work and Hendrix Genetics for providing the data on laying hens. Part of this work was funded by Senter Novem, The Netherlands.

## Footnotes

Communicating editor: J. B. Walsh

- Received June 27, 2006.
- Accepted October 26, 2006.

- Copyright © 2007 by the Genetics Society of America