## Abstract

The **G** matrix measures the components of phenotypic variation that are genetically heritable. The structure of **G**, that is, its principal components and their associated variances, determines, in part, the direction and speed of multivariate trait evolution. In this article we present a framework and results that give the structure of **G** under the assumption of neutrality. We suggest that a neutral expectation of the structure of **G** is important because it gives a null expectation for the structure of **G** from which the unique consequences of selection can be determined. We demonstrate how the processes of mutation, recombination, and drift shape the structure of **G**. Furthermore, we demonstrate how shared common ancestry between segregating alleles shapes the structure of **G**. Our results show that shared common ancestry, which manifests itself in the form of a gene genealogy, causes the structure of **G** to be nonuniform in that the variances associated with the principal components of **G** decline at an approximately exponential rate. Furthermore we show that the extent of the nonuniformity in the structure of **G** is enhanced with declines in mutation rates, recombination rates, and numbers of loci and is dependent on the pattern and modality of mutation.

PHENOTYPIC evolution is channeled through patterns of heritable variation. This is exemplified by the breeder's equation, which for a single quantitative trait states that the evolutionary response (*R*) to selection (*S*) of the trait is proportional to its heritability, *h*^{2} (see, *e.g.*, Falconer and MacKay 1996; Lynch and Walsh 1998). Likewise, the simultaneous evolutionary responses of several potentially correlated quantitative characters are predicted by the multivariate extension of the breeder's equation, where is a vector describing the responses of each character to selection, **β**—the selection gradient vector—describes the pattern of directional selection applied to the traits, and **G** is the additive-genetic covariance matrix describing the heritable patterns of variation and covariation of the traits (Lande 1979).

Quantitative geneticists and evolutionary biologists alike are particularly interested in the principal components of the **G** matrix, since these indicate independent paths along which evolution can occur, and the variance associated with each principal component, since these modulate the speed of evolution along the direction of each principal component. For instance, the “evolutionary line of least resistance” for adaptive radiation (Schluter 1996) is defined by the principal component of **G** with the largest variance, *i.e.*, with the greatest amount of heritable variation. At the other extreme, principal components of **G** with the smallest variances represent evolutionary avenues with the greatest degrees of genetic resistance (Kirkpatrick and Lofsvold 1992; Hine and Blows 2006).

Since the **G** matrix itself can evolve, it is natural to consider the influences of evolutionary forces (such as selection, mutation, random genetic drift, and recombination) on the patterns of variation associated with the principal components of **G**. For example, principal components with the smallest eigenvalues might reflect the strongest patterns of multivariate selection (*e.g.*, Blows *et al.* 2004; Hine *et al.* 2004). Indeed, this is expected on general theoretical grounds (Lande 1980; Lande and Arnold 1983; Phillips and Arnold 1989) and has been verified by computer simulations (Jones *et al.* 2003).

A second force that causes the **G** matrix to evolve is random genetic drift. Phillips *et al*. (2001) showed that severe inbreeding led to an average reduction in total variation of the **G** matrix that was proportional to the theoretical expectation 1 *− F*, where *F* is the inbreeding coefficient. Furthermore, the principal components of the **G** matrices differed (in an apparently random way) among replicated inbred lines, although, on average, the **G** matrices across replicated inbred lines did not differ from those of the outbred control population. Subsequent work by Whitlock *et al*. (2002) on the same set of inbred lines showed that the **G** matrix in some lines continued to change over 20 generations of random mating, but there was no evidence that the **G** matrices were returning to the original state in the outbred control line. On the basis of work by Avery and Hill (1977) and Zeng and Cockerham (1991), Whitlock *et al*. (2002) hypothesized that the subsequent changes in **G** matrices in the generations following inbreeding were caused, in part, by the decay of linkage disequilibria that were generated during the inbreeding episode. Besides the force of recombination, theoretical work by Lande (1980) shows that mutational pressure can also cause the **G** matrix to evolve.

At equilibrium the proximate causes of the structure of a **G** matrix are linkage disequilibria and the pleiotropic and nonpleiotropic effects of segregating mutations. The ultimate forces that contribute to the structure of the **G** matrix are recombination, mutation, genetic drift, and selection (*e.g.*, Phillips and McGuigan 2006). To date, no theoretical work presents a neutral expectation for the structure of the **G** matrix when the only forces shaping its evolution are mutation, recombination, and drift, although Lynch and Hill (1986) presented a neutral model of the evolution of genetic variance for a single character. In the case of multiple characters Lynch (1994) and Phillips and McGuigan (2006) postulate that **G** will evolve to be proportional to **M**, the mutational variance–covariance matrix.

The focus of this article is the question of how mutation, recombination, and random genetic drift determine the eigenstructure of the **G** matrix at equilibrium, where the eigenstructure of the **G** matrix is measured according to its eigenvectors and eigenvalues. The eigenvectors of **G** are its principal components and the eigenvalues of **G** are the variances associated with each eigenvector. Naïvely, for the case of free recombination among loci, one might expect that the eigenvalues of **G** would be proportional to those of **M**, with drift-induced linkage disequilibrium and stochastic sampling from **M** causing nondirectional variation around this expectation. Accordingly, if the eigenvalues of **M** were all similar in size, then on average so would the eigenvalues of **G**. We explore the validity of this intuitively reasonable expectation.

One of the main insights of this article is that this intuitively reasonable expectation is not valid. A primary reason is that segregating alleles in a population are related to one another due to shared common ancestry or, to put it another way, alleles are related genealogically through shared common descent via reproduction. Patterns of shared common ancestry can be described for a sample from a population using Kingman's (1982) coalescent and multilocus extensions such as Griffiths and Marjoram's (1997) ancestral-recombination graph. In this article we use the coalescent and the ancestral-recombination graph as a framework to investigate neutral evolution of **G**.

The neutral expectations developed here aid the interpretation and analysis of recent empirical work that showed significant additive genetic variance for 20 wing-shape characters in *Drosophila melanogaster* and an approximately exponential decline in the ordered eigenvalues of the **G** matrix for these characters (Mezey and Houle 2005). The exponential decline in ordered eigenvalues may be interpreted as signatures of either variable selection across characters and/or variable mutational heritabilities across characters. Our analyses show that, under the neutral model, the ordered eigenvalues of **G** will evolve toward a regular pattern. Surprisingly, even if the ordered eigenvalues of **M** are perfectly uniform and selection is absent, the expected eigenvalues of **G** in a sample from a population are predicted to decline at an approximately exponential rate.

## METHODS

#### The model:

We consider the evolution of a multivariate quantitative trait affected by multiple loci in a diploid population of size *N* with Wright–Fisher random mating. (For nonideal populations, *N* = *N*_{e}, the variance-effective population size.) Evolution is neutral as we assume the trait experiences no selection. Alleles affecting the trait act additively within and across loci (*i.e.*, no dominance, epistasis, or genotype–environment interactions). For example, if and are multivariate effects of maternally and paternally inherited alleles at locus *i* (*e.g.*, the effects of each allele on *m* traits), then the genotypic value **g** of a typical individual is where is the number of loci. The phenotype is where **e** is an independent, mean-zero random variable. The **G** matrix is the *m* × *m* covariance matrix of additive values, **g**. We hereafter ignore **e** because, in the absence of selection, it plays no role in the evolution of **G**. It is important to note, however, that **e** can affect estimates of **G**, which may affect predictions of the eigenstructure of any particular **G**, especially for small samples from a population. We do not address this issue in this article and leave it for further study.

Under neutrality and additivity, three main processes influence **G**-matrix evolution: spontaneous mutation, recombination, and random genetic drift via stochastic reproduction. In this article we focus on the recombination–mutation–drift equilibrium characteristics of **G**. For spontaneous mutation we examined two infinite-allele mutation models: the continuum-of-alleles model (Crow and Kimura 1964; Kimura 1965) and the house-of-cards model (Kingman 1977, 1978). The continuum-of-alleles model assumes that each mutation generates a new multivariate effect that is added to the current effect of an allele at a locus, where a new effect is sampled from a mutational distribution. In our simulations we used a zero-mean multivariate normal distribution with covariance matrix **A** for the distribution of mutational effects. The diagonal elements of **A** are the variances in mutational effects and the off diagonals are the covariances of mutational effects on different traits. The house-of-cards model (Kingman 1978) assumes that each new mutation completely replaces the current allelic effect at a mutated locus.

#### Coalescent approaches:

Because the alleles contributing to **g** are neutral and every mutation is novel, the alleles at individual loci should follow the distribution expected at equilibrium for the infinite-alleles neutral model (see, *e.g.*, Ewens 2004). Rather than sample from this distribution directly, we used a different but equivalent approach for *n < N*, where *n* is the sample size—the coalescent. The coalescent is useful in that it retains information about the underlying genealogical history of alleles. Thus it can be used to model how variation in allelic values is determined by patterns of shared common ancestry.

##### Coalescent with independent loci:

Figure 1 illustrates the approach that we used to generate the genotypic values **g** of individuals in a sample of size *n* based on the neutral coalescent (Kingman 1982). Loci were treated independently. For each locus we generated a tree with 2*n* tips and placed mutations on the tree using the standard neutral coalescent approach. (We employed the algorithm in the Appendix of Hudson 1991.) When 1 unit of coalescent time is scaled to be 2*N* generations, coalescent events occur at a rate per coalescent time unit using the exponential approximation of a geometric process, where *k* is the number of lineages present at a particular time back in history. Once a gene tree is drawn, branch segments in the tree have times associated with them as a result of the coalescent process. Mutations were randomly placed along branch segments such that the number of mutations along a branch segment of the coalescent tree was Poisson distributed with mean 2*N* where is the haploid per locus mutation rate and *t _{i}* is the length (in coalescent time units) of branch segment

*i*. With each mutation, we then associated a vector of effects drawn from a multivariate normal distribution with mean

**0**and covariance

**A**. A population was generated by randomly assigning two alleles from each locus (without replacement) to each of the

*n*individuals. Note that this coalescent approach to generating individual genotypic values imposes the strict rule that loci are completely independent;

*i.e.*, there are no transient linkage disequilibria.

The overall process is illustrated in Figure 1 for a sample of *n =* 2 individuals, each with two loci. Mutations in the coalescent tree are marked with hatch marks and the associated vector of effects is represented by For locus A, alleles *a1* and *a2* are unmutated relative to the ancestral allele and, accordingly, have the ancestral value Allele *a3* is differentiated from the ancestral allele by one mutation and has the value = The value of allele *a4* depends on the mutation model. For the continuum-of-alleles model, the value of *a4* is but for the house-of-cards model, the value of *a4* is since the later mutational effect completely replaces the earlier effect The allelic effects at locus B are determined similarly.

Next, the alleles are randomly assigned two at a time, without replacement, to an individual. In Figure 1, alleles *a1* and *a4* at locus A and alleles *b3* and *b4* at locus B were assigned to the first individual. On the basis of the alleles sampled and the mutations in the gene tree, the genotypic value of the first individual is **g** = for the continuum-of-alleles model and **g** = assuming the house-of-cards.

##### Ancestral recombination graph:

As stated at the outset, linkage disequilibria are also a proximate cause of additive-genetic covariances. Strictly applying independence among loci may therefore eliminate a factor promoting genetic covariances even when there is free recombination among loci. To allow for physical linkage among loci and for the possibility of linkage disequilibria we used the ancestral recombination graph coalescent approach (Griffiths 1981, 1991; Griffiths and Marjoram 1996, 1997; see Nordborg 2001, pp. 195–200, for a useful introduction).

Figure 2 illustrates how a sample of two individuals each consisting of two loci is generated using the ancestral recombination graph. Instead of having two independent loci, the loci are linked. In Figure 2, the rectangles at the top each represent a haplotype in the current generation with the open section corresponding to locus A and the solid section corresponding to locus B. In an ancestral recombination graph, two processes occur when forming a gene tree, namely recombination and coalescent events. Scaling time to be 2*N* generations, coalescent events occur as before at rate where *k* is the number of lineages at a given time. Recombination events occur at the rate 4*Nrk*/2, where *r* is the per haplotype recombination rate per generation. The total rate of events is and, given an event, the probability it is a coalescent is Although the rate of recombination is typically much faster than the rate of coalescence, a common ancestor is guaranteed (Griffiths and Marjoram 1997).

Proceeding backward in time from the present, a recombination event is represented by the splitting of a single haplotype lineage into two lineages. For a two-locus model, one locus is retained in one ancestral lineage, whereas the other locus is retained in the second ancestral lineage. In Figure 2, shading corresponds to a locus that is recombined out in the descendant lineage following a recombination event. Often recombination events occur in an ancestral lineage that consists of one locus that is present in the current generation and one that is subsequently recombined out and not present in the current generation. With respect to the coalescence of alleles in the sample, these recombination events are relevant only with respect to incrementing time backward in the gene tree, but do not add additional information. Consequently, the splitting of these types of lineages is not recorded in the gene tree, although the time of the event is.

Once an ancestral recombination graph is drawn, there are times associated with each branch segment (in units of 2*N* generations) and numbers of mutations are modeled to occur along each branch segment according to a Poisson distribution with mean 2*N* where *t _{i}* is the time associated with branch segment

*i*. The per locus mutation rate is assumed to be equal across loci, such that for loci and given a mutational event, the probability the mutation occurs at a particular locus is In Figure 2, a hatch mark represents a mutation, the box next to it represents the locus at which the mutation occurred, and is the vector of effects of mutation

*i*. To get an allele's value, simply trace its history back in the gene tree, being careful to follow the correct ancestral lineage at a recombination event.

Because loci can be physically linked, individual genotypes are formed by randomly sampling haplotypes as opposed to alleles at independent loci. With a sample consisting of two individuals there are four haplotypes, each individual consisting of two haplotypes. In the example (Figure 2), individual 1 consists of haplotypes 1 and 4. Following the history of the allele *a1* back in time, only the mutation with effect occurs before reaching the common ancestor of the population so the additive effect of *a1* is Note that the mutation with effect occurred in an ancestral allele not present in the current generation. As with the independent-loci coalescent approach described above, an allele's additive value is the sum of the effects of mutations in a lineage for the continuum-of-alleles mutation model whereas, for the house-of-cards model, the additive effect is equal to the most recent mutation for the allele.

Given the relatively large sample sizes examined in this study combined with a large number of loci and a moderate to fast recombination rate, there were too many possible ancestral haplotypes for computers with 1 GB of memory to simulate the ancestral recombination graph algorithm; hence, there are significant computational limitations when considering recombination while utilizing the coalescent method.

#### Parameters:

Parameters underlying the results presented in this article are cast in terms of the number of loci, and **M** (see below for a definition of **M**). The expression gives the expected number of new mutations that enter a population each generation at each locus. In this article, ranges from 0.1 to 10.0. The expression gives the expected number of recombination events per generation during the meiotic phase of the life cycle for an entire population per adjacent pair of loci and it ranges from 0.0 to 1000.0. Numbers of loci vary from 1 to 10 in the results, and it is assumed that each locus has the same value of

In this article, mutations at a locus act either pleiotropically or nonpleiotropically. When mutations act pleiotropically, a single mutation potentially affects all characters and the covariance matrix of their effects is given by **A**. For simplicity, we concentrate on diagonal covariance matrices such that where **I** is the identity matrix. When mutations at a locus act nonpleiotropically, the mutations affect only a single character and the variance of their effects is 0.05.

Throughout this article it is assumed that mutation rates and mutational effects are uniform across characters. Thus, the expected eigenvalues of the mutational covariance matrix, **M**, are uniform, where and *U* is the zygotic mutation rate. We make these simplifying assumptions so that any nonuniformity in the eigenvalues of **G** arises via factors other than nonuniformity in the eigenvalues of the mutational covariance matrix. In reality, characters of organisms will likely have nonuniform mutational variance and this may be an important factor in determining the ordered eigenvalues of **G**. But, here we are particularly interested in demonstrating that genealogical history *per se* is important in shaping the ordered eigenvalues of **G**. Furthermore, we do not explore cases in which there is a nonorthogonal transformation between the underlying mutational process and the measured characters of an organism. If the transformation is not orthogonal then this will generally affect the eigenvalues of **G.**

## RESULTS

All cases we examined were consistent with the prediction that, at equilibrium, the average **G** matrix is equal to under neutrality (Phillips and McGuigan 2006). However, our results revealed two new, consistent features of neutral evolution of **G** matrices at recombination–mutation–drift balance. First, the eigenvalues of equilibrium **G** matrices are expected to be highly nonuniform—declining in an approximately exponential manner—even when the mutation covariance matrix **M** has perfectly uniform eigenvalues (*i.e.*, when **M** has a single repeated eigenvalue). Second, we found that genealogical ancestry of alleles in a finite population is particularly important in determining the intensity of decay of the ranked eigenvalues of **G** matrices under neutrality. In the absence of selection, two forces, namely recombination and random genetic drift, generate genealogies of alleles, and the genotypic values of alleles in a particular genealogy are determined by the pattern of mutation. In what follows we show in turn how mutation, drift, and recombination shape the ordered eigenvalues of **G** matrices.

#### Genealogical effect on G:

Figure 3 demonstrates how genealogical ancestry can generate nonuniformity in the eigenvalues of **G** assuming allelic variation at a single locus. In Figure 3a, the top left corner shows a genealogy of a sample of six alleles. On the genealogy are 10 mutations that give rise to five allelic types out of the six sampled alleles. The bivariate effects of these 10 mutations are plotted. The principal components of mutational variation are of similar magnitude, suggesting that variation is evenly distributed across the two dimensions. Figure 3b shows the values of the alleles at the tips of the tree, plotted as solid dots, using the continuum-of-alleles model. What is immediately apparent in this example is that allelic variation, instead of being evenly spread across two dimensions (as in Figure 3a), is primarily concentrated in one dimension. The reason for this is that allelic values are dependent due to their shared common ancestry whereas the mutational effects are independent.

Figure 4 extends the principle illustrated in Figure 3 by replicating the genealogical history of a single locus many times to determine if, on average, there is a greater rate of decline in the eigenvalues associated with breeding values *vs.* mutational variation. Furthermore, a larger sample is taken from each population and there are more trait dimensions. The boxes in Figure 4 give the average ordered eigenvalues of mutational variation for a single locus (as was done in Figure 3a, except replicated 500 times). Note that variation is nonuniform across the principal components even though mutation is sampled from a distribution with uniform variance across its principal components. The nonuniformity is the result of stochasticity, in that there are a finite number of mutations segregating at a locus and by chance the effects of these mutations are nonuniformly distributed across characters. The triangles in Figure 4 suggest that the eigenvalues associated with the breeding values of individuals decline at a greater rate than the eigenvalues associated with mutational variation. Two processes within the genealogy appear to bring about the enhanced decline in eigenvalues. One is that genealogy builds up statistical dependence among allelic values due to the pattern of shared common ancestry (branching). The second is that along a single branch multiple mutations may occur, which reduces the number of independently acting mutations. Figure 4 suggests that the point illustrated in Figure 3, namely that genealogy potentially enhances the nonuniformity in the eigenvalues associated with variation in breeding values, is general because, on average, there is a faster decline in eigenvalues associated with variation in breeding values than in mutational effects. On occasion, genealogical ancestry may have no effect on eigenvalues, but, on average, Figure 4 suggests that it enhances the nonuniformity in their values.

#### Linkage:

Linkage causes the genealogies of alleles at different loci to be correlated. Our results show that tighter linkage between two loci tends to enhance the nonuniformity in the expected eigenvalues of **G** (Figure 5). When the loci are completely linked (2*Nr* = 0) there is greater nonuniformity in eigenvalues than with a higher rate of recombination (2*Nr* = 1000).

#### Other factors influencing eigenstructure:

Besides finite mutation, genealogy, and linkage, our results show that several other factors contribute to the pattern of eigenvalues of **G**.

##### Number of loci:

Alleles at each locus vary in how they contribute to patterns of additive genetic variance. As more loci are added, the relative contribution of each to the variability in **G** is reduced. Accordingly, there is more nonuniformity in the eigenvalues of **G** when there are few *vs.* many loci (Figure 6).

##### Mutation rate and effective population size:

The rate of decline in eigenvalues is greater with smaller values of (Figure 7). In fact, there is a limit to the expected number of principal components that will have nonzero variance associated with it (Table 1). In Table 1, the **M** matrix has 25 dimensions with equal variances across dimensions and no covariance, yet beyond 12 principal components for a sample size of 75 and 15 principal components for a sample size of 300, principal components are expected to have zero variance associated with them. As the mutation rate or effective population size increases, the number of principal components with nonzero variance will increase (data not shown). Figure 7 illustrates another important point, namely that the decline in eigenvalues is not log linear for high-dimensional cases.

##### Mutational matrices with variable eigenvalues or nonzero correlations:

Until now, we have discussed only cases in which the expected mutational variance–covariance matrix has perfectly uniform eigenvalues with completely uncorrelated mutational effects on different traits. This is clearly a hypothetical extreme. In reality, we expect that the eigenvalues of **M** itself will vary and that mutational effects on different traits will be correlated (*e.g.*, see Ajie *et al.* 2005; Estes *et al.* 2005). Results suggest that nonuniformity in the eigenvalues of **M** increases nonuniformity in the eigenvalues of **G** (data not shown).

We also found that introducing nonuniformity in the eigenvalues of **M** had an important influence on the eigen*vectors* of **G**. Simulations for two traits show that, when **M** has uniform eigenvalues, the leading eigenvector of **G** is uniformly distributed between 0° and 360°. In contrast, when the eigenvalues of **M** differ, the leading eigenvector of **G** follows a unimodal, symmetric distribution approximately centered at the leading eigenvector of **M** (results not shown). Thus, uneven eigenvalues of **M** affect the entire expected eigenstructure (*i.e.*, eigenvalues and eigenvectors) of **G** under neutrality.

As for nonzero mutational correlations, their impacts on the eigenstructure of **G** are channeled through their effects on the eigenstructure of **M**. For example, consider a pair of traits each with mutation variance *V*_{m} and mutational correlation ρ (as in Jones *et al.* 2003). That is, which has eigenvalues The expected eigenvalues of **G** given this **M** are exactly the same as those produced by the mutational covariance matrix which has the same eigenvalues as **M** but with uncorrelated mutational effects (results not shown). This suggests that the eigenvalues of **M**—and not the specific patterns of mutational variances and covariances—determine the expected eigenvalue distribution of **G** at equilibrium. Note, however, that the distributions of expected eigenvectors of **G** would differ for the two mutational matrices since the eigenvectors of **M** and differ.

##### Mutational history:

We examined two models of mutation: the continuum-of-alleles model (Crow and Kimura 1964; Kimura 1965) and the house-of-cards model (Kingman 1977, 1978). The two models represent opposing extremes of how to model the effects of mutation. With respect to a genealogy of alleles, under the house-of-cards model the genealogy is important only in shaping allele frequencies, whereas in the continuum-of-alleles model the genealogy shapes both allele frequencies as well as the genotypic values of alleles. Thus, by comparing results from house-of-cards and continuum-of-alleles models, it is possible to disentangle, in part, the effects of variable allele frequency from the historical sequence of mutations on the eigenstructure of **G**.

Both the house-of-cards and the continuum-of-alleles models yield, on average, nonuniform eigenvalues at equilibrium (Figure 8). Therefore, it appears that variable allele frequencies contribute to the nonuniformity of eigenvalues of **G** and the mutational history of alleles further increases this nonuniformity.

##### Pleiotropic vs. nonpleiotropic mutation:

All of the results discussed thus far have assumed universal pleiotropy. For example, if five characters are modeled then each random mutation could potentially affect all five characters. Under universal pleiotropy variation at each character is determined by the same set of gene trees. At the other extreme, when mutations at a locus are not pleiotropic, each character has a unique set of gene trees underlying its variation, provided that there is some recombination between loci. Figure 9 shows that pleiotropic mutation enhances nonuniformity in eigenvalues of **G** (particularly for lower mutation rates), presumably because the allelic ancestries that underlie variation of different characters overlap. If mutations act nonpleiotropically, then, to the extent that loci freely recombine, the allelic ancestries underlying each character's variation will be independent.

##### Whole-population properties of **G**:

The results have focused on the properties of **G** based on a sample of size *n* from a population of size *N*, where *n < N*. An important question is the extent to which the nonuniformity in **G** is the result of taking a sample from a population. In the appendix, we describe results for the eigenstructure of **G** for an entire population based on individual-based simulations. The results suggest that the eigenvalues of **G** decline at an approximately exponential rate and that the decline is slightly less than the decline based on taking a subsample from the population to estimate the properties of **G**.

## DISCUSSION

Our results are consistent with the equilibrium expectation of **G**, predicted by Phillips and McGuigan (2006) under neutral evolution. However, our results demonstrate that the expected eigenvalues of **G** at equilibrium and the eigenvalues of *E*(**G**) are highly discordant. For instance, when **M** has uniform eigenvalues associated with it, then *E*(**G**) has uniform eigenvalues, but the expected eigenvalues of **G** are nonuniform. The practical consequence of this is that although *E*(**G**) may imply that variation in breeding values is evenly distributed across characters, in fact due to stochasticity, variation in breeding values is never evenly distributed. Some characters by chance will have less genetic variation than others and furthermore, due to stochasticity and the pleiotropic nature of mutation and/or linkage, breeding values will typically covary among characters even if the expectation is independence. Our results suggest that the nonuniformity in the eigenvalues of **G** is not trivial because the nonuniformity is approximately exponential. This approximate exponential decline in eigenvalues may lead to potentially strong constraints on the freedom of a multivariate phenotype to respond to selection.

A key factor shaping the nonuniformity in the eigenvalues of **G** is genealogy. Genealogy causes the effects of alleles at a locus to be dependent. The principle that genealogy causes dependencies is not new and has been a key basis of analyses in phylogenetics (*e.g.*, Piazza and Cavalli-Sforza 1974 and Felsenstein 1985). That genealogy is important in shaping the eigenstructure of **G** has been until now unrecognized. Shared common descent enhances the nonuniformity in the eigenvalues of **G** due to dependencies generated by branching patterns within the genealogy. In addition, multiple mutations may occur within a single branch of a genealogy, and this too can enhance the nonuniformity in the eigenvalues of **G** because there are fewer independently acting mutations. The genealogical ancestry of alleles is intrinsic to biology. The only way to partially overcome the covariance in allelic values that is induced by genealogy is for the value of an allele to be decoupled from its ancestry. The house-of-cards mutation model is one where the value of an allele is decoupled from its ancestry and our results show that this causes the decline in eigenvalues to be less than if the value of an allele is coupled to its ancestry (such as in the continuum-of-alleles model). Tighter linkage enhances nonuniformity in the eigenvalues of **G** because alleles at different loci do not segregate independently, thus decreasing the number of independent gene genealogies contributing to the structure of **G**.

In general, then, the nonuniformity in the eigenvalues of **G** is exaggerated with more dependency among allelic effects. We found that these dependencies can be affected by a variety of other factors as well. For instance, when there are fewer loci or the per locus mutation rate is lower the observed nonuniformity in the eigenvalues of **G** is increased. The degree of pleiotropy of new mutations affects the dependency of allelic effects. Under universal pleiotropy all characters are influenced by the same set of gene trees and this overlap increases the observed nonuniformity in the eigenvalues of **G**. In contrast, in the absence of pleiotropy, each character has a unique set of gene trees (as long as there is some recombination), and this independence decreases correlations between genotypic values and reduces nonuniformity in the expected eigenvalues of **G**.

What leads to the discrepancy between the expected eigenvalues of **G** and the eigenvalues of E(**G**)? Mathematically, the discordance traces to the fact that eigenvalues are nonlinear functions of the elements of **G**, and hence the expectations of the eigenvalues of **G** will not generally be equal to the eigenvalues of the expectation of **G**. This can be seen more directly in the following two-trait case. Suppose the expected **G** matrix is diagonal with the repeated eigenvalue where A realized **G** under neutral evolution can be viewed as a perturbation of *E*(**G**), say, where the 's are random variables with zero means and variances much smaller than If we imagine that the perturbations are drawn, say, from a zero-mean normal distribution with variance then it is not hard to show that the expected eigenvalues of **G** are not uniform (*i.e.*, not repeated). Indeed, the expected leading eigenvalue of **G** exceeds by at least whereas the smaller eigenvalue of **G** is expected to fall below by at least the same amount. This suggests that nonuniformity among the expected eigenvalues of **G** is driven by the variability of perturbations from *E*(**G**). This variability arises from the assorted sources of stochasticity inherent to evolution in finite populations.

That the eigenvalues of random covariance matrices (such as a **G** matrix) are nonuniform is not a new result (*e.g.*, see a review by Edelman and Rao 2005 and Wagner's 1984 random pleiotropy model). Additionally, similar patterns found in this article for the average of ranked eigenvalues are observed if it is assumed that the variances for characters are independently drawn from a chi square for a given degrees of freedom. In particular, a deviation from a purely exponential decline occurs as was observed in Figure 7. Broadly, stochasticity causes the nonuniformity in eigenvalues, but there are biologically unique processes that contribute to this stochasticity. This article suggests that genealogical ancestry is one important process.

Our neutral theoretical predictions for the expected structure of **G** at recombination–mutation–drift balance form an important set of null expectations for empirical studies of multivariate evolution. In particular, our finding that, under neutrality, the eigenvalues of **G** tend to be unequal and decay approximately exponentially from largest to smallest is consistent with empirical estimates of **G** (*e.g.*, Mezey and Houle 2005). However, our results also show that the rate of decay is sensitive in predictable ways to numerous factors including population size, the number and patterns of linkage among loci, and the rate of and modality of mutation. If one includes selection as well (as in Jones *et al.* 2003), current theoretical considerations suggest that inferring underlying process from an equilibrium estimate of **G** may be difficult because different underlying processes have similar effects on the eigenvalues associated with **G**. For instance, it may be difficult to disentangle mutation rates from numbers of loci and the pattern of recombination among loci. Single-locus expectations and statistics may be developed, but this ignores the likely possibility that more than one locus contributes variation to **G**. Rather, to infer processes that influence the evolution of **G** may require direct approaches (see Estes *et al.* 2004) or possibly the study of transient changes in **G** (Phillips *et al.* 2001; Whitlock *et al.* 2002).

## APPENDIX

#### Whole-population predictions for G:

To investigate neutral evolution of **G** for an entire population we used forward time, individual-based simulations. Our approach is similar to that of Jones *et al*. (2003) except for the absence of selection and our use of Wright–Fisher mating (*i.e*., sampling parents with replacement). The individual-based simulations were founded on the same model assumptions as the two coalescent-based methods. Events in the individual-based simulations were ordered mutation → recombination → random mating within generations. At the beginning of a replicate simulation run, each individual was initialized with a genotypic value of zero by assigning allelic values of zero to all loci. Population sizes of 140 and 1366 were simulated. An inspection of simulations suggests that populations reach quasi-recombination–mutation–drift balance by 12,000 generations for populations consisting of 140 and 1366 individuals given our mutation rate and rates of recombination. The basis for this inference was that the average **G** matrix from simulations was not significantly different from the expected **G** matrix. For each replicate run, we calculated the **G** matrix at generation 12,000 using the genotypic values of all individuals in the population. We ran 20 replicates for a particular parameter set when using 50 loci and 200 replicates when using 2 and 5 loci.

Some of the parameters used in both the individual-based simulations were similar to those utilized by Jones *et al*. (2003). For example, the per haploid locus mutation rate utilized was 0.0002 and all traits were uncorrelated with an expected per character variance in the effects of mutations of 0.05.

#### Results:

Figure A1 illustrates that when the whole population is modeled, the eigenvalues of **G** decline at an approximately exponential rate and that the rate of decline is greater for smaller populations. Figure A2 suggests that nonindependent segregation of alleles among loci enhances the nonuniformity in the eigenvalues of **G** when the whole population is modeled. Overall these whole-population simulations combined with the coalescent simulations suggest that the nonuniformity in the eigenvalues of **G** is a property of both an entire population and subsamples from the population. Furthermore, a comparison of individual-based and coalescent results in Figures A1 and A2 suggests that estimating the eigenstructure of **G** on the basis of a subsample from a population (as is done with the coalescent) results in slightly enhanced nonuniformity in the eigenvalues of **G** relative to the true value of the whole population.

## Acknowledgments

Comments by Ruth Shaw, Joe Felsenstein, and an anonymous reviewer improved the presentation and content of this article. We thank the anonymous reviewer for suggesting that we examine the case in which the variance for each character is independent and chi-square distributed. We thank P. Phillips and K. McGuigan for allowing us to cite their then unpublished work and the National Science Foundation (NSF) (grants DEB 0337582, EF 0328594, and EF 0531870 to RG) for financial support. The Initiative for Bioinformatics and Evolutionary Studies program at the University of Idaho kindly provided computer resources and they are supported by NSF grant EPS 0080935 and National Institutes of Health grants P20 RR16454 and P20 RR16448 from the Center for Biomedical Research Excellence and Idea Networks of Biomedical Research Excellence programs of the National Center for Research Resources.

## Footnotes

↵1

*Present address:*Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14583.Communicating editor: M. Nordborg

- Received December 13, 2006.
- Accepted February 19, 2007.

- Copyright © 2007 by the Genetics Society of America