Comparing G Matrices: Are Common Principal Components Informative?
Jason G. Mezey and David Houle
Department of Biological Science, Florida State University, Tallahassee, Florida 32306-1100
Corresponding author: Jason G. Mezey, Florida State University, Tallahassee, Florida 32306-1100. E-mail: mezey{at}bio.fsu.edu
Communicating editor: Z-B. ZENG
| ABSTRACT |
|---|
Common principal components (CPC) analysis is a technique for assessing whether variance-covariance matrices from different populations have similar structure. One potential application is to compare additive genetic variance-covariance matrices, G. In this article, the conditions under which G matrices are expected to have common PCs are derived for a two-locus, two-allele model and the model of constrained pleiotropy. The theory demonstrates that whether G matrices are expected to have common PCs is largely determined by whether pleiotropic effects have a modular organization. If two (or more) populations have modules and these modules have the same direction, the G matrices have a common PC, regardless of allele frequencies. In the absence of modules, common PCs exist only for very restricted combinations of allele frequencies. Together, these two results imply that, when populations are evolving, common PCs are expected only when the populations have modules in common. These results have two implications: (1) In general, G matrices will not have common PCs, and (2) when they do, these PCs indicate common modular organization. The interpretation of common PCs identified for estimates of G matrices is discussed in light of these results.
COMPARISON of additive genetic variance-covariance matrices (the G matrices) of different populations is an important goal in evolutionary quantitative genetics.
Although the utility of comparing G matrices seems clear, which methods will furnish informative conclusions is not.
Of the many multivariate statistical techniques proposed for comparison of G matrices (reviewed in ![]()
The hierarchy of CPC models provides a valuable descriptive summary of matrix structure. However, the biological meaning of the results is unclear.
| PLEIOTROPIC MODULARITY |
|---|
Modularity can be defined both in terms of pleiotropic effects of alleles segregating in a population and in terms of the pleiotropic effects that may be introduced into the population by mutation.
| TWO-LOCUS, TWO-ALLELE MODEL |
|---|
The goal of discussing this simple model is to provide an intuitive illustration of the relationship between modules and common PCs that also applies to the more general model of constrained pleiotropy. The additive effects of allele $k$ at locus $j$ on the two traits are collected in the vector

$$\boldsymbol{\alpha}_{jk} = [\alpha_{jk.1}, \alpha_{jk.2}], \tag{1}$$

where the $i$th element of the vector, $\alpha_{jk.i}$, is the additive effect of allele $k$ associated with trait $i$. The average effects of an allelic substitution at locus $j$ are collected in the vector

$$\boldsymbol{\alpha}_{j} = [\alpha_{j.1}, \alpha_{j.2}]. \tag{2}$$

The relationship between these two vectors is $p_{jk}\alpha_{j.i} = \alpha_{jk.i}$, where $p_{jk}$ is the frequency of allele $k$ at locus $j$. Because there are two alleles at each locus and no nonadditive effects, each $\boldsymbol{\alpha}_{j}$ is constant, and the direction of each $\boldsymbol{\alpha}_{jk}$ is constant, where $\boldsymbol{\alpha}_{j1} \propto -\boldsymbol{\alpha}_{j2}$. An allelic substitution at locus $j$ has a pleiotropic effect in this model if both of the elements of the allelic vectors are nonzero: $\alpha_{j.i} \neq 0$, $\alpha_{jk.i} \neq 0$. In this model, forward and backward mutations occur at each locus $j$ at the same rate $\mu_j$, where a mutation changes an allele's identity to the other possible allelic state. The structure of the G matrix depends on the $\boldsymbol{\alpha}_{j}$ (the $\boldsymbol{\alpha}_{jk}$) vectors and on the allele frequencies at the two loci (Appendix A). Note that, in the following, we assume that G can be estimated without error. We return to sampling issues in the DISCUSSION.
In this two-locus, two-allele model, existence of a module depends on the orientation of the allelic vectors associated with each locus. If the allelic vectors at one locus are orthogonal to the allelic vectors at the other locus, two perfect modules are present, because the pleiotropic effects can be divided into two groups that do not have overlapping effects. As an example, consider the case diagrammed in Fig 1A.1, where the allelic vectors of the loci are orthogonal to one another. Rotating the trait axes to the direction of the allelic vectors associated with each locus produces two new traits, f1 and f2, where the effects of allelic substitutions at each of the two loci are limited entirely to one of the two traits. Both of these new traits, f1 and f2, therefore define perfect modules. Fig 1B.1 diagrams a case without perfect modules. Because the allelic vectors are not orthogonal, two modules cannot be defined by a rotation of the axes. However the axes are rotated, allelic substitutions at both loci have effects on both new traits. Note that a modular organization is possible in Fig 1B.1 if a nonorthogonal rotation is used, but such transformations do not result in modules in an evolutionary sense; i.e., directional selection cannot be applied to such a "module" without resulting in a correlated response. Such nonorthogonal modules will be the subject of another article (J. G. MEZEY and D. HOULE, unpublished results). We confine the discussion here to modules that can be defined by rotations of the trait axes.
In this two-locus, two-allele model, the existence of modules places a major constraint on the possible orientations of G matrix PCs. When modules exist, the PCs of the G matrix have the direction of the modules, regardless of allele frequencies and changes in allele frequencies (Appendix A). To visualize this relationship between PCs and modules, again consider Fig 1. Fig 1A.2 diagrams the G matrix and PCs associated with two populations, both of which have the modules diagrammed in Fig 1A.1. The populations have different allele frequencies at the two loci, and as a result, the G matrices of the populations differ. Although the PCs of G are associated with different eigenvalues, the PCs have the same direction as the modules in both populations. Further, all variation attributable to the allelic substitutions defining an individual module is described by a single PC and its associated eigenvalue. Contrast this case with that diagrammed in Fig 1B.2, which diagrams G and the PCs for two populations where allelic vectors are described by Fig 1B.1. In this case, the different allele frequencies correspond to different structures of G and PCs that have different directions. In such cases, there is no simple relationship between PCs and the allelic effects associated with each locus.
For an individual population, the directions of G matrix PCs are always the same regardless of allele frequencies only if perfect modules exist (Appendix A). Therefore, if distinct populations have such modules in common (the modules have the same direction), the G matrices of the populations will always have common PCs, regardless of allele frequencies. Note that this relationship depends entirely on the direction of the modules and not the specific allelic effects defining the modules. Populations with different allelic effects at the two loci always have common PCs as long as both populations have modules in the same direction. In contrast, if the populations being compared have no modules, only a restricted subset of allele frequencies results in common PCs (Appendix A), even if the allelic vectors are the same in the populations being compared.
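The contrast between the modular and nonmodular cases can be checked numerically. The following is a minimal sketch, assuming the additive two-locus form of G used in Appendix A (each locus contributes its heterozygote frequency times the outer product of its allelic vector). The allelic vectors, the heterozygote frequencies, and the helper names `g_matrix` and `pc_angles` are hypothetical choices for illustration and do not come from the article.

```python
import numpy as np

def g_matrix(alphas, hets):
    """Assumed additive form: G = sum_j H_j * outer(alpha_j, alpha_j)."""
    return sum(h * np.outer(a, a) for a, h in zip(alphas, hets))

def pc_angles(G):
    """Directions of the two PCs of a 2 x 2 G, in degrees on [0, 180), sorted."""
    _, vecs = np.linalg.eigh(G)  # columns of vecs are the eigenvectors (PCs)
    angles = np.degrees(np.arctan2(vecs[1], vecs[0])) % 180.0
    return np.sort(angles)

# Hypothetical allelic vectors: orthogonal (two perfect modules) vs. not orthogonal.
modular = [np.array([1.0, 1.0]), np.array([0.5, -0.5])]
nonmodular = [np.array([1.0, 0.2]), np.array([0.3, 1.0])]

# Two populations that differ only in heterozygote frequencies (H1, H2).
hets_a = (0.5, 0.1)
hets_b = (0.2, 0.4)

mod_a = pc_angles(g_matrix(modular, hets_a))
mod_b = pc_angles(g_matrix(modular, hets_b))
non_a = pc_angles(g_matrix(nonmodular, hets_a))
non_b = pc_angles(g_matrix(nonmodular, hets_b))

print("modular PCs:   ", mod_a, mod_b)   # same directions in both populations
print("nonmodular PCs:", non_a, non_b)   # directions depend on the frequencies
```

In the modular case the PC directions coincide with the module directions in both populations, whereas in the nonmodular case they shift as the heterozygote frequencies change.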
The cases diagrammed in Fig 2 illustrate these concepts. Fig 2A diagrams two populations (A and B) that have modules in common. For these populations, Fig 2A.1 diagrams in gray the allele (heterozygote) frequencies in population A that result in common PCs, given fixed allele frequencies in population B. Fig 2A.2 provides the equivalent diagram for population B given fixed allele frequencies in population A. Note that every possible allele frequency results in common PCs, regardless of the allele frequency in the other population. Contrast this situation with the case diagrammed in Fig 2B.1, where the populations have the same allelic effect vectors, but no modules are present. For these populations, the only allele frequencies for which common PCs occur are described by the dashed and dotted lines in Fig 2B.1 and 2B.2, respectively. The allele frequencies that do not fall on these lines result in no common PCs. Therefore, only a very constrained set of allele frequencies results in common PCs when no modules are present.
The implication of these results is that, when comparing evolving populations, we should not expect common PCs unless there are common modules. Without modules, the allele frequencies required for common PCs are so constrained that they are unlikely to occur given the stochastic effects of mutation and genetic drift. As an example, consider a case in which populations A and B have the same allelic vectors but no modules exist (as in Fig 2B.1). As demonstrated in Appendix A, common PCs occur in these two populations when the following constraint is satisfied,

$$\frac{H_{1.A}}{H_{2.A}} = \frac{H_{1.B}}{H_{2.B}}, \tag{3}$$
where Hj.P = 2pjkpjl is the heterozygote frequency at locus j in population P and pjk and pjl are the frequencies of the alleles k and l. Note that, even if this constraint is satisfied at some point, any change in allele frequencies at the loci in one of the populations must be exactly balanced by a specific change in allele frequencies in the other population to preserve the ratios in (3). Any other changes in allele frequencies in the other population result in no common PCs. Stochastic changes are therefore not expected to preserve the necessary ratios. Of course, situations can be constructed in which the probability of common PCs is high even in the absence of modules. For example, infinite populations with the same allelic-effect vectors that have reached the same mutation-selection equilibrium would be such a case, but barring such extreme conditions, we should generally expect common PCs only when populations have common modules.
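The ratio constraint can be illustrated with a small numerical check. This is a sketch under the same assumed additive form of G as in Appendix A; the allelic vectors, the frequencies, and the helper `shares_pcs` are hypothetical. The helper tests for common PCs by asking whether the eigenvectors of one matrix also diagonalize the other, which presumes the first matrix has distinct eigenvalues.

```python
import numpy as np

def g_matrix(alphas, hets):
    """Assumed additive form: G = sum_j H_j * outer(alpha_j, alpha_j)."""
    return sum(h * np.outer(a, a) for a, h in zip(alphas, hets))

def shares_pcs(G_a, G_b, tol=1e-10):
    """True if the eigenvectors of G_a also diagonalize G_b.

    Assumes G_a has distinct eigenvalues, so its PCs are unique.
    """
    _, Q_a = np.linalg.eigh(G_a)
    rotated = Q_a.T @ G_b @ Q_a
    off_diagonal = rotated - np.diag(np.diag(rotated))
    return bool(np.max(np.abs(off_diagonal)) < tol)

# Hypothetical nonorthogonal allelic vectors shared by both populations (no modules).
alphas = [np.array([1.0, 0.2]), np.array([0.3, 1.0])]

G_a = g_matrix(alphas, (0.4, 0.2))             # H1/H2 = 2 in population A
G_b_same_ratio = g_matrix(alphas, (0.2, 0.1))  # H1/H2 = 2 in population B
G_b_other = g_matrix(alphas, (0.2, 0.3))       # ratio differs from population A

print(shares_pcs(G_a, G_b_same_ratio))  # ratios match
print(shares_pcs(G_a, G_b_other))       # ratios differ
```

Only the population whose heterozygote-frequency ratio matches that of population A passes the check, which is the sense in which drift at either locus is expected to break common PCs.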
Fig 3 provides a summary of the four possible cases that can arise when two populations are compared for the two-locus, two-allele model: (1) The populations have modules in common (Fig 3A), (2) both populations have modules but the directions are different (Fig 3B), (3) one population has a module and the other does not (Fig 3C), and (4) neither population has modules (Fig 3D). Only the case in Fig 3A will always have common PCs. For the cases in Fig 3B–D, the vast majority of allele frequencies will result in no common PCs, and we should not expect to find common PCs when the populations are evolving.
Note that, if the allelic vectors in the two populations approximate a perfectly modular case (the angle between them is almost but not quite 90°), only very constrained allele frequencies result in common PCs, as in Fig 3D. This result may seem strange. The reason for it is that the PCs in each G matrix must have exactly the same direction for common PCs to exist. In the absence of perfect modules, the vast majority of allele frequencies result in slight differences in the directions of the PCs in the populations and therefore in no common PCs. This is not to say that we would be able to determine that the PCs are different in such a case when analyzing estimates of the G matrices. The effects of sample size will tend to obscure such subtle differences, so cases that approximate perfect modules will be indistinguishable from perfect modules in practice. We return to this issue of how sample size affects the expectation of finding common PCs in the DISCUSSION.
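How small the difference in PC direction can be in a nearly modular case is easy to see numerically. The sketch below again assumes the additive two-locus form of G from Appendix A; the 88° separation, the heterozygote frequencies, and the helper names are hypothetical values chosen only to illustrate the point.

```python
import numpy as np

def g_matrix(alphas, hets):
    """Assumed additive form: G = sum_j H_j * outer(alpha_j, alpha_j)."""
    return sum(h * np.outer(a, a) for a, h in zip(alphas, hets))

def leading_pc(G):
    """Eigenvector of G associated with the largest eigenvalue."""
    vals, vecs = np.linalg.eigh(G)
    return vecs[:, np.argmax(vals)]

def axis_angle(u, v):
    """Angle in degrees between two axes in the plane (vector signs ignored)."""
    cross = u[0] * v[1] - u[1] * v[0]
    return float(np.degrees(np.arctan2(abs(cross), abs(u @ v))))

def pc_divergence(separation_deg, hets_a, hets_b):
    """Angle between the leading PCs of two populations that share unit-length
    allelic vectors separated by separation_deg degrees."""
    t = np.radians(separation_deg)
    alphas = [np.array([1.0, 0.0]), np.array([np.cos(t), np.sin(t)])]
    return axis_angle(leading_pc(g_matrix(alphas, hets_a)),
                      leading_pc(g_matrix(alphas, hets_b)))

hets_a, hets_b = (0.5, 0.1), (0.3, 0.2)
exact = pc_divergence(90.0, hets_a, hets_b)  # perfect modules
near = pc_divergence(88.0, hets_a, hets_b)   # almost, but not quite, modular

print(f"perfect modules: {exact:.6f} degrees between leading PCs")
print(f"88-degree case:  {near:.3f} degrees between leading PCs")
```

With perfect modules the leading PCs coincide, while the nearly modular case yields a difference of a few degrees: strictly not a common PC, yet plausibly too small to detect in estimated matrices.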
| MODEL OF CONSTRAINED PLEIOTROPY |
|---|
Constraints on pleiotropic effects are the key to whether common PCs are expected. The model of constrained pleiotropy (![]()
In a quantitative genetic formulation, the model of constrained pleiotropy makes the assumption that the absolute values of the additive-effect vectors of any alleles k and l at a locus j are proportional:

$$|\boldsymbol{\alpha}_{jk}| \propto |\boldsymbol{\alpha}_{jl}|. \tag{4}$$
Any number of alleles may be segregating at each locus, and mutations may introduce new alleles at a locus, although the effects of all alleles conform to the constraint of Equation 4. The structure of the G matrix depends on the effects and frequencies of the alleles segregating in a population. As in the two-locus, two-allele model, we consider the exact structure of G and assume no disequilibrium (gametic-phase or otherwise), no maternal effects, no sex linkage, no genotype-environment covariance or genotype-environment interactions, and random mating among diploid individuals.
In the model of constrained pleiotropy, modules exist if, for the N loci that may result in genetic variation in n traits, a subset of M loci (M < N) can be defined where the allelic vectors at each of these M loci are orthogonal to the allelic vectors at each of the other N - M loci. In this case, for each $\boldsymbol{\alpha}_{jk(M)}$ that may occur at the M loci and each $\boldsymbol{\alpha}_{jk(N-M)}$ that may occur at the remaining N - M loci,

$$\boldsymbol{\alpha}_{jk(M)}^{\mathrm{T}}\,\boldsymbol{\alpha}_{jk(N-M)} = 0. \tag{5}$$
Each subset M defines a module because new traits can be defined by a rotation of the n trait axes where genetic variation in the new traits affected by the M loci is independent of genetic variation in the rest of the new traits. Note that variation associated with a module M need not fall along a single vector. Modules may therefore be multidimensional, but the relationship of such higher-dimensional modules to the PCs of the G matrix is more complicated. In this article, we restrict the discussion to modules in which the allelic vectors at all of the M loci defining a module have the same direction; i.e., $|\boldsymbol{\alpha}_{jk(M)}| \propto |\boldsymbol{\alpha}_{mk(M)}|$ for all loci j, m in M. We consider higher-dimensional modules in another article (J. G. MEZEY and D. HOULE, unpublished results).
Just as in the two-locus, two-allele model, when a one-dimensional module exists in a population, a PC with the same direction as the module will exist regardless of allele frequencies (Appendix B). Therefore, when populations have a module in common, they will always have a common PC with the same direction as the modules, regardless of allele frequencies in the populations. Also as in the two-locus, two-allele model, if the populations do not have a one-dimensional module in common, very restricted allele frequencies are required for common PCs to exist (Appendix B). In the model of constrained pleiotropy, x common modules can exist, $0 < x \le n$, when n traits are considered. The same reasoning applies to such cases: If populations have x modules in common, $0 < x \le n$, at least x (excluding n - 1) common PCs will exist, and very restricted allele frequencies in the two populations will result in more than x common PCs (Appendix B).
Fig 4 illustrates these concepts. It diagrams three different possibilities that may arise when two populations (A and B) are compared when n = 3. In Fig 4A, the two populations have three modules in common. The G matrices of these populations will always have three common PCs; i.e., all PCs will be common PCs. Note that even if both A and B had three modules but the modules had different directions in the two populations, only very restricted allele frequencies would result in common PCs (Appendix B). In Fig 4B, the two populations have a single one-dimensional module in common. In this case, the G matrices will always have one PC in common, although the PC may be associated with different eigenvalues in the two populations. For there to be more than a single common PC in the case of Fig 4B, very restricted allele frequencies are required in the two populations. In Fig 4C, neither population has any modules. Again, only very restricted combinations of allele frequencies would yield common PCs.
In summary, when comparing evolving populations with x common one-dimensional modules, we expect to find exactly x common PCs. The stochastic effects of mutation and genetic drift are very likely to result in allele frequencies where the other PCs differ in their orientations (Appendix B).
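A three-trait example makes the "exactly x common PCs" expectation concrete. The sketch below assumes that, under constrained pleiotropy, each locus contributes a nonnegative weight times the outer product of a fixed unit direction to G; the weights stand in for the combined effect of allele frequencies and effect sizes. The module direction, the other directions, the weights, and the helper names are all hypothetical.

```python
import numpy as np

# Hypothetical setup: n = 3 traits, one shared one-dimensional module along m.
m = np.ones(3) / np.sqrt(3.0)
u1 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2.0)  # orthogonal to m
u2 = np.array([1.0, 1.0, -2.0]) / np.sqrt(6.0)  # orthogonal to m and u1

def in_plane(deg):
    """Unit vector orthogonal to m, at the given angle from u1."""
    t = np.radians(deg)
    return np.cos(t) * u1 + np.sin(t) * u2

def g_matrix(directions, weights):
    """Assumed form: each locus adds weight * outer(direction, direction)."""
    return sum(w * np.outer(d, d) for d, w in zip(directions, weights))

def is_pc(G, v, tol=1e-10):
    """True if the unit vector v is an eigenvector of G."""
    Gv = G @ v
    return bool(np.linalg.norm(Gv - (v @ Gv) * v) < tol)

# Population A: two module loci (along m) and two loci outside the module.
G_a = g_matrix([m, m, in_plane(0.0), in_plane(60.0)], [0.4, 0.1, 0.3, 0.2])
# Population B: same module direction, but different other loci and weights.
G_b = g_matrix([m, in_plane(90.0), in_plane(20.0)], [0.2, 0.25, 0.15])

_, Q_a = np.linalg.eigh(G_a)
n_common = sum(is_pc(G_b, Q_a[:, k]) for k in range(3))

print("module is a PC of A and of B:", is_pc(G_a, m), is_pc(G_b, m))
print("number of PCs of A that are also PCs of B:", n_common)
```

The module direction is a PC of both matrices even though the populations differ in number of loci and in weights, while the remaining PCs, which lie in the plane orthogonal to the module, do not coincide.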
| DISCUSSION |
|---|
The goal of the theory developed in this article is to assess whether the CPC model that is the basis of the CPC analysis can be informative for comparing G matrices beyond a descriptive summary of matrix similarity. When assessed solely from this perspective, the results are quite positive. Because of the close relationship between common PCs and modular structure, when common PCs do exist they have a biological interpretation: Common PCs indicate the existence of common modules. The intuition that common PCs have a biologically meaningful interpretation is therefore well founded.
The modular structure that is sufficient to create common PCs is quite restrictive. It requires that the genetic effects of some set of loci be orthogonal to those of all other segregating loci. This requirement is equivalent to the requirement that some rotation of the axes in phenotype space that produces traits that are independent of all other traits exists. Given the general assumption that pleiotropy is ubiquitous, which we share, the existence of such extreme modules seems somewhat unlikely. Thus, we expect that the form of modular structure and therefore common PCs is unusual. This is not to say that cases approximating modular organizations are expected to be so rare that the possibility of their existence should be discounted. As discussed by a number of authors (![]()
How are we to reconcile these results with those of studies that have applied CPC analysis to G matrices and reported many common PCs? For example, ![]()
If the intuition that common modules should be rare is correct, the most likely explanation is that the power to detect differences in the direction of matrix PCs is low for the sample sizes commonly used in estimates of G. This explanation seems particularly likely given the results of ![]()
One reason for the inability of CPC analysis to distinguish distinct PCs when sample sizes are low may be the way that position in the Flury hierarchy is assessed.
The sensitivity of CPC results to sample size means that, in practice, we cannot necessarily interpret common PCs as a demonstration of common modules. However, CPC analysis could be a useful tool for indicating which sets of traits are likely to have a modular organization, particularly if methods for assessing confidence in the existence of common PCs could be developed. We would not expect to have high confidence in a common PC among G matrices unless the populations have approximately modular organizations in common. The existence of modules would always have to be confirmed by independent means, because even without common modular organization, the allele frequencies required to produce a true common PC among G matrices could have occurred by chance.
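The effect of sample size on estimated PC directions can be illustrated with a deliberately simplified simulation. The sketch below is not CPC analysis and does not estimate G from a breeding design; it only draws multivariate normal samples from a known 2 x 2 covariance matrix and measures how far the leading PC of the sample covariance matrix falls from the true one. The covariance matrix, the sample sizes, the number of replicates, and the helper names are hypothetical.

```python
import numpy as np

def leading_pc(C):
    """Eigenvector of C associated with the largest eigenvalue."""
    vals, vecs = np.linalg.eigh(C)
    return vecs[:, np.argmax(vals)]

def axis_angle(u, v):
    """Angle in degrees between two axes in the plane (vector signs ignored)."""
    cross = u[0] * v[1] - u[1] * v[0]
    return float(np.degrees(np.arctan2(abs(cross), abs(u @ v))))

def mean_pc_error(true_cov, n, reps, rng):
    """Mean angular error (degrees) of the leading PC estimated from n draws."""
    truth = leading_pc(true_cov)
    errors = []
    for _ in range(reps):
        sample = rng.multivariate_normal(np.zeros(2), true_cov, size=n)
        estimate = leading_pc(np.cov(sample, rowvar=False))
        errors.append(axis_angle(truth, estimate))
    return float(np.mean(errors))

rng = np.random.default_rng(1)

# Hypothetical true matrix: leading PC at 30 degrees, eigenvalues 1.0 and 0.5.
t = np.radians(30.0)
Q = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
true_cov = Q @ np.diag([1.0, 0.5]) @ Q.T

err_small = mean_pc_error(true_cov, n=50, reps=200, rng=rng)
err_large = mean_pc_error(true_cov, n=5000, reps=200, rng=rng)

print(f"mean error with n = 50:   {err_small:.2f} degrees")
print(f"mean error with n = 5000: {err_large:.2f} degrees")
```

When the sampling error in PC direction is comparable to or larger than the true difference between two populations, a comparison of estimates cannot be expected to separate them, and estimates of G typically carry more error than the direct sampling assumed here.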
In the context of identifying which sets of traits may have a modular organization, the reordering option available in the CPC analysis software of ![]()
The possibility that CPC analysis could be developed for the detection of modules is a particularly exciting prospect because modules have clearly defined genetic and evolutionary properties. For example, from a genetics perspective, modules represent a specific constraint on how variation at the gene level is related to variation in the phenotype.
In conclusion, our results suggest (1) that common PCs are unlikely without modular organization and (2) that there is a biological interpretation of common PCs and a possible role for common PCs in the identification of modular organization. In both cases, interpretation of common PCs will be stymied until a systematic study of the sensitivity of CPC analysis to sample size is performed. If this problem could be addressed, CPC analysis of G matrices could provide biologically useful insight beyond a summary of matrix structure. In this role, CPC analysis could be particularly useful for addressing questions that require a relatively complete picture of genetic architecture: Do modules correspond to functional architectures (![]()
| ACKNOWLEDGMENTS |
|---|
We thank Kyle Galivan, Thomas F. Hansen, Frances C. James, Eric Klassen, Joseph Travis, Zhao-Bang Zeng, and two anonymous reviewers for their comments on this manuscript. This work was supported by National Science Foundation grant no. 0129219.
Manuscript received November 4, 2002; Accepted for publication April 28, 2003.
| APPENDIX A |
|---|
TWO-LOCUS, TWO-ALLELE MODEL
It is assumed that the entirety of the genetic variation in n = 2 traits is determined by alleles segregating at N = 2 loci where only two alleles are possible at each locus. Forward and backward mutations occur at locus j at the same rate, µj. We assume no dominance, epistasis, disequilibrium (linkage or otherwise), maternal effects, sex linkage, genotype-environment covariance, or genotype-environment interactions. We assume random mating among diploid individuals.
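Here $\alpha_{jk.i}$ is the additive effect of allele k at locus j associated with trait i, $p_{jk}$ is the frequency of allele k, and $\alpha_{j.i}$ is the average effect of an allelic substitution at locus j on trait i such that $p_{jk}\alpha_{j.i} = \alpha_{jk.i}$. With $H_j$ denoting the heterozygote frequency at locus j ($0 \le H_j \le 0.5$), the G matrix can be written as

$$\mathbf{G} = \begin{bmatrix} H_1\alpha_{1.1}^2 + H_2\alpha_{2.1}^2 & H_1\alpha_{1.1}\alpha_{1.2} + H_2\alpha_{2.1}\alpha_{2.2} \\ H_1\alpha_{1.1}\alpha_{1.2} + H_2\alpha_{2.1}\alpha_{2.2} & H_1\alpha_{1.2}^2 + H_2\alpha_{2.2}^2 \end{bmatrix}. \tag{A1}$$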
Note that, under the assumption of no nonadditive effects, the $\alpha_{j.i}$ are constant, so the structure of G is a function of the allele frequencies, which may change as a result of mutation, selection, or genetic drift. A population is defined as having two modules if the vectors describing the average effect of an allelic substitution are orthogonal: $\boldsymbol{\alpha}_1^{\mathrm{T}}\boldsymbol{\alpha}_2 = 0$, where $\boldsymbol{\alpha}_j = [\alpha_{j.1}, \alpha_{j.2}]$. In such a case, each $\boldsymbol{\alpha}_j$ defines a module (see text). The existence of modules can also be written as $\boldsymbol{\alpha}_{1k}^{\mathrm{T}}\boldsymbol{\alpha}_{2l} = 0$ for all alleles k and l at the two loci, where $\boldsymbol{\alpha}_{jk} = [\alpha_{jk.1}, \alpha_{jk.2}]$. If these conditions do not apply, no modules exist. Below, we assume that G has no multiplicity of eigenvalues and is of full rank unless noted. This latter assumption requires that all $p_{jk} > 0$ and that $\boldsymbol{\alpha}_1$ and $\boldsymbol{\alpha}_2$ have different directions.
Result A1:
If populations have modules with the same direction, the G matrices have common PCs for all allele frequencies.
The matrix G is a real, 2 x 2, symmetric matrix. An orthonormal matrix Q and a diagonal matrix $\boldsymbol{\Lambda}$ therefore exist, such that

$$\mathbf{G} = \mathbf{Q}\boldsymbol{\Lambda}\mathbf{Q}^{\mathrm{T}}, \tag{A2}$$

where each column vector q of Q is a PC, an eigenvector, of G ($\mathbf{q}_1^{\mathrm{T}}\mathbf{q}_2 = 0$ and $\mathbf{q}_j^{\mathrm{T}}\mathbf{q}_j = 1$) and each diagonal element of $\boldsymbol{\Lambda}$ ($\lambda_1$ and $\lambda_2$) is an eigenvalue. In the absence of a multiplicity of eigenvalues, the spectral decomposition of G exists and is unique. In this case, no other matrix Q, defined up to the multiplication of columns by -1 and column permutation, produces a diagonalization of G.
If two modules exist ($\boldsymbol{\alpha}_1 \perp \boldsymbol{\alpha}_2$), the matrix $\tilde{\mathbf{A}}$ can be defined as

$$\tilde{\mathbf{A}} = \left[\frac{\boldsymbol{\alpha}_1}{\|\boldsymbol{\alpha}_1\|}, \frac{\boldsymbol{\alpha}_2}{\|\boldsymbol{\alpha}_2\|}\right], \tag{A3}$$

where $\|\boldsymbol{\alpha}_j\| = \sqrt{\boldsymbol{\alpha}_j^{\mathrm{T}}\boldsymbol{\alpha}_j}$. Note that $\tilde{\mathbf{A}}$ is an orthonormal matrix with column vectors that have the same direction as $\boldsymbol{\alpha}_1$ and $\boldsymbol{\alpha}_2$. Also define the diagonal matrix D with diagonal elements $d_j = H_j\|\boldsymbol{\alpha}_j\|^2$. The matrix G can be written as

$$\mathbf{G} = \tilde{\mathbf{A}}\mathbf{D}\tilde{\mathbf{A}}^{\mathrm{T}}. \tag{A4}$$

This relation holds with the same orthonormal matrix $\tilde{\mathbf{A}}$ no matter what the allele frequencies in the population. Because this expression is a diagonalization of G, the uniqueness of the spectral decomposition implies that $\mathbf{Q} = \tilde{\mathbf{A}}$ (up to column permutation and multiplication of columns by -1). Therefore, if $\boldsymbol{\alpha}_1 \perp \boldsymbol{\alpha}_2$, the matrix G has PCs with the same direction as the modules (i.e., the same direction as $\boldsymbol{\alpha}_1$ and $\boldsymbol{\alpha}_2$) regardless of allele frequencies. Also, each eigenvalue $\lambda_j$ is a function of the allele frequency at a single locus: $\lambda_j = H_j\|\boldsymbol{\alpha}_j\|^2$. Therefore, in a population with modules, each PC accounts for the entirety of the variation attributable to a single module, which in this case is defined by allelic variation at a single locus. Because the G of a population with modules has the same PCs regardless of allele frequencies, if other populations have modules with the same direction, the G matrices of the populations will have two common PCs (i.e., both PCs will have the same direction) although the eigenvalues associated with the PCs in the two populations may differ.
Note that in the special case where an allele at one locus goes to fixation in one of the populations, the same argument can be used to demonstrate that there will still be two common PCs if the $\boldsymbol{\alpha}_j$ at the other locus has the same direction as a module in the other population. In such a case, a zero eigenvalue will be associated with one of the PCs in the population with the fixed allele. Similarly, if both $\boldsymbol{\alpha}_j$ in one population have the same direction and that direction is the same as that of a module in the other population, common PCs will still exist. These results also hold in the case where one or several of the G matrices have a multiplicity of eigenvalues. The reason is that the common PC model framework handles such cases where the eigenvector matrix is not unique by choosing the direction of the PCs to correspond to the PCs of other matrices (if possible). Also note that, if populations have modules with different directions, they will never have common PCs, unless allele frequencies are such that a multiplicity of eigenvalues exists. Because the approach used in Result A2 can be used to demonstrate that a multiplicity of eigenvalues in the G matrix occurs only for a very restricted set of the possible allele frequencies, common PCs are not expected when modules are not in common.
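Result A1 can be verified numerically for a particular case. The sketch below assumes the additive form of G in which each locus contributes its heterozygote frequency times the outer product of its allelic vector; the two orthogonal vectors, the heterozygote frequencies, and the variable names (`A_tilde`, `D`) are hypothetical.

```python
import numpy as np

# Hypothetical orthogonal allelic vectors: 2 * (-0.5) + 1 * 1 = 0.
alpha = [np.array([2.0, 1.0]), np.array([-0.5, 1.0])]
H = np.array([0.32, 0.18])  # hypothetical heterozygote frequencies

# Assumed additive form of G.
G = sum(h * np.outer(a, a) for a, h in zip(alpha, H))

# Columns are the allelic vectors scaled to unit length.
A_tilde = np.column_stack([a / np.linalg.norm(a) for a in alpha])
# Diagonal elements d_j = H_j * ||alpha_j||^2.
D = np.diag([h * (a @ a) for a, h in zip(alpha, H)])

reconstructed = A_tilde @ D @ A_tilde.T
eigenvalues = np.linalg.eigvalsh(G)  # returned in ascending order

print("G equals A_tilde D A_tilde^T:", np.allclose(G, reconstructed))
print("eigenvalues of G:", eigenvalues)
print("H_j * ||alpha_j||^2:", np.sort(np.diag(D)))
```

Each eigenvalue matches the contribution of a single locus, so changing one heterozygote frequency rescales one eigenvalue and leaves the PC directions untouched.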
Result A2:
If populations A and B have no modules, given heterozygote frequencies in population A, a line intersecting the region bounded by the square of possible heterozygote frequencies in population B ($0 \le H_{j.B} \le 0.5$) describes the frequencies that result in common PCs in $\mathbf{G}_A$ and $\mathbf{G}_B$.
An intuitive interpretation of this result is that the number of heterozygote (allele) frequencies for which GA and GB have common PCs is far smaller than the number of heterozygote (allele) frequencies for which the PCs are different. For example, given heterozygote frequencies in population A (H1.A and H2.A) for every heterozygote frequency H1.B at the first locus in population B, a single frequency H2.B at the second locus produces common PCs. All other frequencies at the second locus will result in different PCs.
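Assume that there are no modules in population B, such that $\boldsymbol{\alpha}_{1.B}^{\mathrm{T}}\boldsymbol{\alpha}_{2.B} \neq 0$. Fix $H_{1.A}$ and $H_{2.A}$ between 0 and 0.5 and assume that alleles are segregating at both loci in population B. Define the PC matrices of $\mathbf{G}_A$ and $\mathbf{G}_B$ as $\mathbf{Q}_A$ and $\mathbf{Q}_B$. By the spectral theorem, if population B has the same PCs as population A, then $\mathbf{Q}_B = \mathbf{Q}_A$ (up to column permutation and multiplication of columns by -1), and we can write

$$\mathbf{G}_B = \mathbf{Q}_A\boldsymbol{\Lambda}_B\mathbf{Q}_A^{\mathrm{T}}. \tag{A5}$$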
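From (A1), we can rewrite $\mathbf{G}_B$ as

$$\mathbf{G}_B = H_{1.B}\,\boldsymbol{\alpha}_{1.B}\boldsymbol{\alpha}_{1.B}^{\mathrm{T}} + H_{2.B}\,\boldsymbol{\alpha}_{2.B}\boldsymbol{\alpha}_{2.B}^{\mathrm{T}}, \tag{A6}$$

where the $\boldsymbol{\alpha}_{j.B}$ are treated as column vectors. Therefore, for population B to have the same PCs as population A, the following relation must be satisfied:

$$\mathbf{Q}_A^{\mathrm{T}}\left(H_{1.B}\,\boldsymbol{\alpha}_{1.B}\boldsymbol{\alpha}_{1.B}^{\mathrm{T}} + H_{2.B}\,\boldsymbol{\alpha}_{2.B}\boldsymbol{\alpha}_{2.B}^{\mathrm{T}}\right)\mathbf{Q}_A = \boldsymbol{\Lambda}_B. \tag{A7}$$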
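The off-diagonal elements of the matrix on the left side of (A7) are the same and are equal to the zero elements in the matrix on the right side:

$$\sum_{j=1}^{2} H_{j.B}\,(q_{11.A}\alpha_{j.1.B} + q_{12.A}\alpha_{j.2.B})(q_{21.A}\alpha_{j.1.B} + q_{22.A}\alpha_{j.2.B}) = 0, \tag{A8}$$

where $\alpha_{j.i.B}$ is the average effect of an allelic substitution at locus j on trait i in population B and $q_{jk.A}$ is element k of column j of matrix $\mathbf{Q}_A$. Note that if we assume population B has no modules, $\boldsymbol{\alpha}_{1.B}^{\mathrm{T}}\boldsymbol{\alpha}_{2.B} \neq 0$, and because the columns of $\mathbf{Q}_A$ are orthogonal, if the term in (A8) associated with $H_{1.B}$ is zero, the other term associated with $H_{2.B}$ is nonzero. In such a case (A8) can be satisfied only if $H_{2.B} = 0$ (and vice versa). Because we are currently concerned with common PCs and assume alleles are segregating at all loci, it is the case that

$$(q_{11.A}\alpha_{j.1.B} + q_{12.A}\alpha_{j.2.B})(q_{21.A}\alpha_{j.1.B} + q_{22.A}\alpha_{j.2.B}) \neq 0, \qquad j = 1, 2. \tag{A9}$$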
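Given (A8) and (A9), for population B to have the same PCs as population A, the heterozygote frequencies at the two loci in population B must satisfy the following relationship:

$$\frac{H_{1.B}}{H_{2.B}} = -\frac{(q_{11.A}\alpha_{2.1.B} + q_{12.A}\alpha_{2.2.B})(q_{21.A}\alpha_{2.1.B} + q_{22.A}\alpha_{2.2.B})}{(q_{11.A}\alpha_{1.1.B} + q_{12.A}\alpha_{1.2.B})(q_{21.A}\alpha_{1.1.B} + q_{22.A}\alpha_{1.2.B})}. \tag{A10}$$

Note that a single set of PCs is associated with each pair of heterozygote frequencies, so if (A10) holds for $H_{1.B}$ and $H_{2.B}$, the equations defined by the diagonal elements of (A7) are satisfied by the eigenvalues of $\mathbf{G}_B$ corresponding to $H_{1.B}$ and $H_{2.B}$:

$$\lambda_{k.B} = \sum_{j=1}^{2} H_{j.B}\,(q_{k1.A}\alpha_{j.1.B} + q_{k2.A}\alpha_{j.2.B})^2, \qquad k = 1, 2. \tag{A11}$$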
Therefore, only when the heterozygote frequencies in population B satisfy (A10) are the PCs of $\mathbf{G}_B$ in the same direction as the PCs of $\mathbf{G}_A$ (i.e., two PCs are in common). These heterozygote frequencies can be visualized as falling on a one-dimensional "plane" that cuts through the region bounded by the square of possible heterozygote frequencies in population B where $0 \le H_{j.B} \le 0.5$. The ratio of the number of heterozygote frequencies for which common PCs occur to all possible heterozygote frequencies is small. Therefore, the vast majority of the possible allele frequencies in population B result in different PCs in the two populations and similarly for population A when heterozygote frequencies in population B are held constant. Under the special case in which the same alleles are segregating in both populations ($\boldsymbol{\alpha}_{1.A} = \boldsymbol{\alpha}_{1.B}$ and $\boldsymbol{\alpha}_{2.A} = \boldsymbol{\alpha}_{2.B}$), common PCs occur only when

$$\frac{H_{1.A}}{H_{2.A}} = \frac{H_{1.B}}{H_{2.B}}, \tag{A12}$$

which happens when $\mathbf{G}_A \propto \mathbf{G}_B$.
The constraint of (A10) makes common PCs unexpected among the G matrices of populations A and B if there are no modules. The reason is that, even if this constraint is satisfied at some point, any change in allele frequencies at one locus must be exactly balanced by a change at the other locus that preserves the ratios in (A10). The stochastic changes in allele frequencies due to mutation and genetic drift are therefore not expected to preserve the necessary ratios.
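The line of heterozygote frequencies described in Result A2 can be traced numerically. The sketch below assumes the additive form of G used in this appendix and solves the off-diagonal condition for the frequency at the second locus in population B, given the frequency at the first. The allelic vectors in the two populations, the frequencies, and the helper names are hypothetical.

```python
import numpy as np

def g_matrix(alphas, hets):
    """Assumed additive form: G = sum_j H_j * outer(alpha_j, alpha_j)."""
    return sum(h * np.outer(a, a) for a, h in zip(alphas, hets))

def shares_pcs(G_a, G_b, tol=1e-10):
    """True if the eigenvectors of G_a also diagonalize G_b
    (assumes G_a has distinct eigenvalues)."""
    _, Q = np.linalg.eigh(G_a)
    rotated = Q.T @ G_b @ Q
    return bool(np.max(np.abs(rotated - np.diag(np.diag(rotated)))) < tol)

# Hypothetical populations without modules and with different allelic vectors.
alphas_a = [np.array([1.0, 0.2]), np.array([0.3, 1.0])]
alphas_b = [np.array([1.0, 0.1]), np.array([0.5, 1.0])]

G_a = g_matrix(alphas_a, (0.4, 0.2))
_, Q_a = np.linalg.eigh(G_a)

# c[j] multiplies the heterozygote frequency of locus j in the off-diagonal
# element of Q_a^T G_b Q_a: c[j] = (q_1 . alpha_j) * (q_2 . alpha_j).
c = [(Q_a[:, 0] @ a) * (Q_a[:, 1] @ a) for a in alphas_b]

H1_b = 0.3
H2_b = -H1_b * c[0] / c[1]  # the single value at locus 2 giving common PCs

on_line = shares_pcs(G_a, g_matrix(alphas_b, (H1_b, H2_b)))
off_line = shares_pcs(G_a, g_matrix(alphas_b, (H1_b, H2_b + 0.1)))

print(f"H2.B required for common PCs when H1.B = {H1_b}: {H2_b:.4f}")
print("common PCs on the line:", on_line, "| off the line:", off_line)
```

For a given frequency at the first locus, only one frequency at the second locus yields common PCs, and any other value leaves a nonzero off-diagonal element.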
Note that, although two populations are considered in this section, the reasoning can also be extended to multiple populations. Also, the same reasoning can be used to demonstrate that, in the case where an allele at one locus goes to fixation in one of the populations or where both $\boldsymbol{\alpha}_j$ in one population have the same direction, the allele frequencies required for common PCs are highly constrained in the same fashion. The case of multiplicity of eigenvalues does not occur in the special case of the two-locus, two-allele model when the $\boldsymbol{\alpha}_j$ in a population are not orthogonal.
| APPENDIX B |
|---|
THE MODEL OF CONSTRAINED PLEIOTROPY
Appendix B extends the framework outlined in Result A1 and Result A2 to the model of constrained pleiotropy, in which the allelic vectors at each locus j satisfy $|\boldsymbol{\alpha}_{jk}| \propto |\boldsymbol{\alpha}_{jl}|$. Mutations are assumed to occur at each locus j at a rate $\mu_j$. Here, n traits are being considered in all populations being compared, although the populations may have different numbers of loci. We assume random mating among diploid individuals in a population. We also assume no dominance, epistasis, disequilibrium (linkage or otherwise), maternal effects, sex linkage, genotype-environment covariance, or genotype-environment interactions.
The additive effect of allele k at some locus j for n traits is

$$\boldsymbol{\alpha}_{jk} = \sum_{l=1}^{J_j} p_{jl}\,\mathbf{g}_{kl} - \boldsymbol{\mu}_g, \tag{B1}$$

where $p_{jl}$ is the frequency of allele l, $J_j$ is the number of alleles at locus j, each entry of $\mathbf{g}_{kl} = [g_{kl.1}, \ldots, g_{kl.n}]$ is the mean phenotype of trait i given alleles k, l at locus j, and each entry of $\boldsymbol{\mu}_g = [\mu_{g.1}, \ldots, \mu_{g.n}]$ is the mean genotypic value of trait i.
![]() |
(B2) |
![]() |
(B3) |
Under the assumption of constrained pleiotropy, the genotypic values associated with a locus are proportional, and this condition requires that $\mathbf{g}_{kl} \propto \mathbf{g}_{qr}$ for all alleles k, l, q, r at locus j. We can therefore write
![]() |
(B4) |
where νj = [νj.1, ... , νj.n] is the unit-length vector in the direction of the pleiotropic effect associated with locus j, and the coefficients multiplying it are scalars. In the model of constrained pleiotropy, each νj is constant. Note that under the assumed conditions the G matrix can be written
![]() |
(B5) |
Setting
and
we can write G as
![]() |
(B6) |
Below, we assume that G has no multiplicity of eigenvalues and is of full rank unless noted.
Modules exist in a population if a subset of M loci (M < N) exists in which the allelic vectors at each of these M loci are orthogonal to the allelic vectors at each of the other N - M loci. This means that, for each αjk(M) that may occur at the M loci and each αjk(N-M) that may occur at the remaining N - M loci,
$$\boldsymbol{\alpha}_{jk(M)}^{\mathsf{T}}\,\boldsymbol{\alpha}_{jk(N-M)} = 0. \tag{B7}$$
If no subset of M loci satisfies this relationship, a module does not exist. Note that in the following, we are concerned only with modules that are one-dimensional, where |αjk(M)| ∝ |αmk(M)| for all loci j, m in the subset of M loci that defines a module.
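The definition of a module can be checked mechanically. The following sketch uses hypothetical helper names (`is_module`, `is_one_dimensional`) and made-up allelic vectors, with one vector per locus for simplicity. It tests the orthogonality condition between a candidate subset of loci and the remaining loci, and then tests whether the vectors within the subset share a single direction.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def is_module(vectors, subset, tol=1e-12):
    """True if every vector indexed by `subset` is orthogonal to every vector outside it."""
    subset = set(subset)
    if not subset or len(subset) >= len(vectors):
        return False  # a module needs 0 < M < N loci
    inside = [vectors[i] for i in subset]
    outside = [v for i, v in enumerate(vectors) if i not in subset]
    return all(abs(dot(u, v)) <= tol for u in inside for v in outside)

def is_one_dimensional(vectors, subset, tol=1e-12):
    """True if all vectors in `subset` are collinear (equality in Cauchy-Schwarz)."""
    vs = [vectors[i] for i in subset]
    ref = vs[0]
    return all(abs(dot(ref, v) ** 2 - dot(ref, ref) * dot(v, v)) <= tol for v in vs)

# Hypothetical allelic vectors for five loci affecting three traits.
vectors = [
    [1.0, 1.0, 0.0],
    [-2.0, -2.0, 0.0],
    [1.0, -1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, -1.0, 2.0],
]

print(is_module(vectors, {0, 1}), is_one_dimensional(vectors, {0, 1}))
print(is_module(vectors, {2}))
```

In this example loci 0 and 1 form a one-dimensional module, whereas locus 2 alone does not, because it is not orthogonal to locus 4.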
Result B1:
For each pair of modules that populations have in common, the G matrices have a common PC with the same direction as the module, regardless of allele frequencies or effects of mutations in the populations.
In a population in which M < N loci define a module, the G matrix can be written as
![]() |
(B8) |
where the first summation is over the M loci defining the module and the second is over the remaining N - M loci. Each summation term is itself a matrix:
![]() |
(B9) |
Because we have assumed that G is of full rank, at least two alleles are segregating at one or more of the M loci. If so, regardless of the specific set of alleles and the frequency of alleles in the population, the allelic vectors defining the module span a one-dimensional subspace of the n-dimensional trait space, and the rest of the allelic vectors span an (n - 1)-dimensional subspace that is orthogonal to it. Correspondingly, the matrix GM is of rank 1 and GN-M is of rank n - 1. The spectral decomposition of GN-M produces a zero eigenvalue:
![]() |
(B10) |
Because the allelic vectors of the N - M loci span this (n - 1)-dimensional subspace, the eigenvector corresponding to the zero eigenvalue is orthogonal to it and has the same direction as the module, regardless of allele frequencies or the effects of mutation in the population. Applying QN-M to the matrix GM produces
![]() |
(B11) |
where λM corresponds to the eigenvector with the same direction as the module. The orthonormal matrix QN-M therefore diagonalizes G,
![]() |
(B12) |
and is unique up to column permutation and multiplication of columns by -1, by the spectral theorem. Therefore a PC of G that has the same direction as the module always exists, and from (B11), this PC also accounts for the entirety of the variation segregating at the loci defining the module. Because the population has a PC that always corresponds to the module regardless of allele frequencies, if several populations have a module with the same direction, these populations will always have a common PC with the same direction as the module. The argument also holds for any number of modules (up to n), so a common PC occurs in the G matrices corresponding to each pair of modules that the populations have in common. Note that the same argument can be used to demonstrate that populations with modules in common will have common PCs corresponding to the modules even when the G matrix is not of full rank. Similarly, as explained for the two-locus, two-allele model, common PCs will occur when common modules do, even if G has a multiplicity of eigenvalues.
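Result B1 can be checked numerically under simple assumptions. The sketch below assumes that each locus contributes a rank-one term, a nonnegative weight times the outer product of its unit direction vector, which is consistent with constrained pleiotropy but is written here as an illustration rather than as the article's equations. The direction vectors, the random weights standing in for allele-frequency-dependent variances, and the function names are all hypothetical.

```python
import random

def g_from_loci(weights, directions):
    """Return sum_j w_j * v_j v_j^T for unit direction vectors v_j (assumed form of G)."""
    n = len(directions[0])
    g = [[0.0] * n for _ in range(n)]
    for w, v in zip(weights, directions):
        for r in range(n):
            for c in range(n):
                g[r][c] += w * v[r] * v[c]
    return g

def matvec(g, v):
    return [sum(row[c] * v[c] for c in range(len(v))) for row in g]

s = 2.0 ** -0.5
module = [s, s, 0.0]  # hypothetical module direction
directions = [
    module, module,                                  # two loci forming the module
    [s, -s, 0.0], [0.0, 0.0, 1.0], [0.5, -0.5, s],   # loci orthogonal to the module
]

rng = random.Random(0)
residuals = []
for _ in range(2):  # two populations with unrelated locus weights
    weights = [rng.uniform(0.1, 1.0) for _ in directions]
    g = g_from_loci(weights, directions)
    expected = weights[0] + weights[1]  # variance contributed by the module loci
    gm = matvec(g, module)
    residuals.append(max(abs(gm[i] - expected * module[i]) for i in range(3)))

print(residuals)
```

In both simulated populations the module direction is an eigenvector of G, and its eigenvalue equals the summed weights of the module loci, whatever weights are drawn.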
Result B2:
For populations A and B with no common modules, given allele frequencies in population A, the allele frequencies that result in common PCs in GA and GB are described by n overlapping quadratic (NB
B - n + 1)-dimension planes intersecting the NB
B-dimension region describing the possible allele frequencies at each of the NB loci in population B.
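Λ(x) indicates a matrix with elements λij, where element λxx is a positive value, all other elements in column x and row x are zero, and all other elements may or may not be equal to zero. For example, Λ(1) is an instance of a matrix with the following form,
$$\boldsymbol{\Lambda}(1) = \begin{bmatrix} \lambda_{11} & 0 & \cdots & 0 \\ 0 & \lambda_{22} & \cdots & \lambda_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & \lambda_{n2} & \cdots & \lambda_{nn} \end{bmatrix} \tag{B13}$$
where each λij outside the first row and column may or may not be equal to zero. In this notation, if populations A and B have PC x in common, the following relation holds: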
![]() |
(B14) |
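Read operationally, and assuming the relation requires zero off-diagonal elements in row and column x after rotating GB by the eigenvectors of GA, the condition is equivalent to asking whether the eigenvector of GA for PC x is also an eigenvector of GB. The sketch below tests this directly. The function name `shares_pc`, the tolerance, and the example matrices are hypothetical.

```python
def matvec(g, v):
    return [sum(row[c] * v[c] for c in range(len(v))) for row in g]

def shares_pc(g, q, tol=1e-9):
    """True if q is an eigenvector of g with a positive eigenvalue."""
    gq = matvec(g, q)
    qq = sum(x * x for x in q)
    lam = sum(a * b for a, b in zip(gq, q)) / qq  # Rayleigh quotient
    residual = max(abs(a - lam * b) for a, b in zip(gq, q))
    return lam > 0.0 and residual <= tol

s = 2.0 ** -0.5
q = [s, s, 0.0]  # hypothetical eigenvector of G_A for PC x

g_b = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.0]]  # has q as an eigenvector
g_c = [[2.0, 1.0, 0.0], [1.0, 3.0, 0.0], [0.0, 0.0, 1.0]]  # does not

common_b = shares_pc(g_b, q)
common_c = shares_pc(g_c, q)
print(common_b, common_c)
```

Testing one eigenvector at a time avoids a full eigendecomposition of GB, which is sufficient when only a single candidate PC is of interest.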
We consider the GA and GB associated with populations A and B at a given point in time. The populations segregate for NA and NB loci, respectively. By the spectral theorem and (B6), for population B to have PC x in common with population A, the following relation must be satisfied:
![]() |
(B15) |
Under the definition of Lj above, the off-diagonal elements of column (or row) x of the matrix on the left side of this relation define a system of n - 1 equations of the form
![]() |
(B16) |
for 0 < i ≤ n, i ≠ x. Note that each
k is a linear function of the allele frequencies at locus j, excluding allele k. The highest-order terms of (B16) are therefore quadratic. The allele frequencies in population B that satisfy the



































