## Abstract

Modeling and detecting nonallelic (epistatic) effects at multiple quantitative trait loci (QTL) often assume that the study population is in zygotic equilibrium (*i.e.*, genotypic frequencies at different loci are products of corresponding single-locus genotypic frequencies). However, zygotic associations can arise from physical linkages between different loci or from many evolutionary and demographic processes even for unlinked loci. We describe a new model that partitions the two-locus genotypic values in a zygotic disequilibrium population into equilibrium and residual portions. The residual portion is of course due to the presence of zygotic associations. The equilibrium portion has eight components including epistatic effects that can be defined under three commonly used equilibrium models, Cockerham's model, F_{2}-metric, and F_{∞}-metric models. We evaluate our model along with these equilibrium models theoretically and empirically. While all the equilibrium models require zygotic equilibrium, Cockerham's model is the most general, allowing for Hardy-Weinberg disequilibrium and arbitrary gene frequencies at individual loci whereas F_{2}-metric and F_{∞}-metric models require gene frequencies of one-half in a Hardy-Weinberg equilibrium population. In an F_{2} population with two unlinked loci, Cockerham's model is reduced to the F_{2}-metric model and thus both have a desirable property of orthogonality among the genic effects; the genic effects under the F_{∞}-metric model are not orthogonal but they can be easily translated into those under the F_{2}-metric model through a simple relation. Our model is reduced to these equilibrium models in the absence of zygotic associations. The results from our empirical analysis suggest that the residual genetic variance arising from zygotic associations can be substantial and may be an important source of bias in QTL mapping studies.

WITH increasing availability of fine-scale and highly dense genetic maps for different organisms, it is now possible to simultaneously identify and map several quantitative trait loci (QTL) controlling a trait of economic and/or adaptive significance in plant and animal species. Unlike earlier efforts to map individual QTL “one QTL at a time” (see Lynch and Walsh 1998 for review), this new approach requires that gene action models specify both allelic (additive and dominance) effects at individual loci and nonallelic interactions (epistatic effects) between loci. However, these genic effects must be defined in a “reference” population because they depend not only on genotypic values but also on genotypic frequencies of the reference population. The most obvious choice of reference population has been an “ideal” random mating population where Hardy-Weinberg and linkage equilibria are assumed but the gene frequencies may be arbitrary at different loci (Kempthorne 1957; Crow and Kimura 1970; Lynch and Walsh 1998). Cockerham (1954) considered a more general population where departures from Hardy-Weinberg equilibrium at individual loci such as inbreeding could be accommodated but the genotypic frequencies between loci would be uncorrelated. On the basis of this reference population, Cockerham (1954) then developed a set of orthogonal scales (contrasts) so that the two-locus genotypic values could be partitioned into independent components due to additive, dominance, and epistatic effects. Cockerham's model has been a standard for conventional quantitative genetic analysis and recently for modeling multiple QTL (Kao and Zeng 2002).

For populations derived from a cross between two inbred lines such as F_{2} and subsequent populations, two simpler models, F_{2}-metric and F_{∞}-metric, have often been used. While translation of the genic effects defined under the F_{2}-metric and F_{∞}-metric models can be easily done (Van der Veen 1959), each model has its own characteristics. The F_{2}-metric model is actually a special case of Cockerham's model where the gene frequencies are one-half and the genotypic frequencies are those expected under Hardy-Weinberg equilibrium at each of two uncorrelated loci. Thus the F_{2}-metric model possesses the same property of orthogonality among the genic effects as Cockerham's model and has been widely used (*e.g.*, Hayman and Mather 1955; Kempthorne 1957; Cockerham and Zeng 1996; Goodnight 2000; Kao and Zeng 2002). On the other hand, the genic effects defined under the F_{∞}-metric model are not orthogonal, but this model has been equally popular (*e.g*., Crow and Kimura 1970; Mather and Jinks 1982; Yang and Baker 1990; Haley and Knott 1992; Carlborg *et al*. 2000; Yi and Xu 2002) probably because of its easy and clear genetic interpretation as noted by Van der Veen (1959) and Mather and Jinks (1982).

The key assumption with all these gene action models is that the genotypic frequencies at different loci are products of the appropriate single-locus genotypic frequencies; in other words, the population is in zygotic equilibrium. However, zygotic associations can arise from physical linkages between different loci or from many evolutionary and demographic processes even for unlinked loci (see Yang 2002 for review). In the presence of zygotic associations, a large number of genic disequilibria including Hardy-Weinberg and linkage disequilibria are required for a complete characterization of nonrandom associations at different loci (Cockerham and Weir 1973; Weir 1996). These disequilibria have been considered in some general gene action models (*e.g.*, Gallais 1974; Weir and Cockerham 1977; Kao and Zeng 2002) but a prohibitively large number of genic effects arising from such models have prevented them from being estimated in experimental quantitative genetics and QTL mapping studies. In this article, we first develop a new model that partitions the two-locus genotypic values in a zygotic disequilibrium population into equilibrium and residual portions. Such partitioning is made possible by the decomposition of the two-locus genotypic frequencies into the expected frequencies and deviations (zygotic associations) as given in Yang (2000). The residual portion is of course due to the presence of zygotic associations. We also examine the partitioning of the equilibrium portion into different genic effects at the two loci using Cockerham's model and the F_{2}-metric and F_{∞}-metric models.

## COCKERHAM'S MODEL WITH ZYGOTIC ASSOCIATIONS

### Two-locus genotypic frequencies and zygotic associations:

For two loci, each with two alleles, *A* and *a* at locus *A* and *B* and *b* at locus *B*, there are 9 possible genotypes (10 if the coupling and repulsion double heterozygotes are distinguishable). Following Yang (2000), we write frequencies of these genotypes as $P^{uv}_{yz}$, which result from union of gametes *uy* and *vz* with *u*, *v* = *A* or *a*, and *y*, *z* = *B* or *b* (Table 1). The genotypic frequencies at individual loci are the marginal totals of the appropriate two-locus genotypic frequencies. For example, the frequency of genotype *AA* is $P_{AA} = P^{AA}_{BB} + P^{AA}_{Bb} + P^{AA}_{bb}$. With the genotypic frequencies at locus *A*, the frequency of allele *A* is $p_A = P_{AA} + \tfrac{1}{2}P_{Aa}$ and that of allele *a* is *p*_{a} = 1 − *p*_{A}.

Departures from Hardy-Weinberg equilibrium at locus *A* are

$$D_A = P_{AA} - p_A^2 \tag{1a}$$

and those at locus *B* are

$$D_B = P_{BB} - p_B^2. \tag{1b}$$

In a Hardy-Weinberg equilibrium population, *D*_{A} = *D*_{B} = 0. In inbred populations, *D*_{A} and *D*_{B} depend on the level of inbreeding as measured by the inbreeding coefficients at loci *A* (*f*_{A}) and *B* (*f*_{B}). For example, the Hardy-Weinberg disequilibrium at locus *A* is *D*_{A} = *f*_{A}*p*_{A}*p*_{a}.

Yang (2000) defined a zygotic association as a deviation of the two-locus genotypic frequency from the product of single-locus frequencies,

$$\omega^{uv}_{yz} = P^{uv}_{yz} - P_{uv}P_{yz}, \tag{2}$$

which has a range bounded by the single-locus genotypic frequencies. Because these zygotic associations are constrained by the single-locus genotypic frequencies, only 4 of the 9 zygotic associations can be defined freely and the remaining 5 are expressed entirely in terms of the four defined zygotic associations (Table 1).

For compact and clear presentation of subsequent developments, we describe the relations of two-locus genotypic frequencies with their expectations and zygotic associations (Table 1) in matrix form,

$$\mathbf{P} = \mathbf{\Psi} + \mathbf{\Omega}, \tag{3}$$

where **P**, **Ψ**, and **Ω** are 9 × 9 diagonal matrices holding the observed two-locus genotypic frequencies, their expectations under zygotic equilibrium, and the zygotic associations, respectively, with diag{ } signaling a diagonal matrix.
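The quantities defined in this subsection can be sketched numerically. The following is a minimal illustration, assuming a 3 × 3 table of two-locus genotypic frequencies (rows *AA*, *Aa*, *aa*; columns *BB*, *Bb*, *bb*); the function name `summarize` and the counts are hypothetical and are not data from the paper.

```python
from fractions import Fraction as F

def summarize(table):
    """table[i][j]: two-locus genotypic frequency (row i at locus A, column j at locus B)."""
    row = [sum(r) for r in table]                        # single-locus frequencies at A
    col = [sum(r[j] for r in table) for j in range(3)]   # single-locus frequencies at B
    p_A = row[0] + row[1] / 2                            # frequency of allele A
    p_B = col[0] + col[1] / 2                            # frequency of allele B
    D_A = row[0] - p_A ** 2                              # Hardy-Weinberg disequilibrium at A
    D_B = col[0] - p_B ** 2                              # Hardy-Weinberg disequilibrium at B
    # Zygotic association: observed frequency minus product of single-locus frequencies.
    omega = [[table[i][j] - row[i] * col[j] for j in range(3)] for i in range(3)]
    return row, col, p_A, p_B, D_A, D_B, omega

# Hypothetical counts out of 16 (illustration only).
table = [[F(n, 16) for n in r] for r in ([2, 1, 1], [1, 6, 1], [1, 1, 2])]
row, col, p_A, p_B, D_A, D_B, omega = summarize(table)
```

Because every row and column of `omega` sums to zero, only four of its nine entries are free, which mirrors the constraint described for Table 1.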

### Cockerham's model:

In the absence of zygotic associations (**Ω** = **0** and **P** = **Ψ**), Cockerham (1954) developed a regression model to partition the two-locus genotypic value into eight orthogonal scales,

$$\mathbf{G} = \mu\mathbf{1} + \sum_{t=1}^{8}\beta_t\mathbf{w}_t, \tag{4}$$

where **G** is the vector of genotypic values corresponding to genotypes [*AABB AABb AAbb AaBB AaBb Aabb aaBB aaBb aabb*], **1** is a vector of ones, μ = (**G′ Ψ 1**) is the population mean, **w**_{t} is the *t*th column vector of matrix **W**, and β_{t} is the coefficient of partial regression of **G** on **w**_{t},

$$\beta_t = \frac{\mathbf{G}'\mathbf{\Psi}\mathbf{w}_t}{\mathbf{w}_t'\mathbf{\Psi}\mathbf{w}_t}. \tag{5}$$

It should be noted that the orthogonal scales given in the **W** matrix are slight modifications of those by Cockerham (1954) to be consistent with more recent uses of these scales (Cockerham and Zeng 1996; Kao and Zeng 2002). Like the usual orthogonal contrasts, the orthogonal scales satisfy the two basic requirements: (i) $\mathbf{1}'\mathbf{\Psi}\mathbf{w}_t = 0$ and (ii) $\mathbf{w}_t'\mathbf{\Psi}\mathbf{w}_{t'} = 0$ for *t* ≠ *t*′. In addition, these eight orthogonal scales correspond to the additive and dominance effects at locus *A* (**w**_{1} and **w**_{2}), additive and dominance effects at locus *B* (**w**_{3} and **w**_{4}), and four epistatic effects between the two loci, **w**_{5} (= **w**_{1} × **w**_{3}), **w**_{6} (= **w**_{1} × **w**_{4}), **w**_{7} (= **w**_{2} × **w**_{3}), and **w**_{8} (= **w**_{2} × **w**_{4}).

The orthogonal scales in the **W** matrix can be used to partition the total genetic variance in a zygotic equilibrium population into eight independent components,

$$\sigma^2_G = \sum_{t=1}^{8}\sigma^2_t, \tag{6a}$$

where

$$\sigma^2_t = \beta_t^2\,(\mathbf{w}_t'\mathbf{\Psi}\mathbf{w}_t). \tag{6b}$$

Cockerham (1954) gave detailed expressions for σ^{2}_{t}'s in an inbred population where Hardy-Weinberg disequilibrium is measured in terms of inbreeding coefficient. He also noted that in the absence of inbreeding (*f*_{A} = *f*_{B} = 0 and *D*_{A} = *D*_{B} = 0), these variance components became those in a Hardy-Weinberg equilibrium population (Kempthorne 1957; Crow and Kimura 1970; Lynch and Walsh 1998). Because the different scales are orthogonal (*i.e.*, $\mathbf{w}_t'\mathbf{\Psi}\mathbf{w}_{t'} = 0$ for *t* ≠ *t*′), there are no covariances between them.

When *p*_{A} = *p*_{B} = ^{1}/_{2} and *D*_{A} = *D*_{B} = 0 (*i.e.*, the gene and genotypic frequencies are those in an F_{2} population), Cockerham's model is reduced to the F_{2}-metric model. In this particular case where the F_{2} population is in Hardy-Weinberg and zygotic equilibria [**Ω** = **0** and **P** = **Ψ** = (^{1}/_{16})diag{1 2 1 2 4 2 1 2 1}], the expressions for partial regression coefficients in Cockerham's model (β_{t}'s in Equation 5) are greatly simplified and are identified with specific genic effects (Equation 7; Cockerham and Zeng 1996; Cheverud 2000; Kao and Zeng 2002), which are written in terms of the average values of genotypes *uv* (*AA*, *Aa*, or *aa*) at locus *A* and *yz* (*BB*, *Bb*, or *bb*) at locus *B*. This set of genic effects along with the mean (μ) can be expressed in terms of another set under the F_{∞}-metric model through a simple translation as given in Van der Veen (1959). A more detailed comparison of the F_{2}-metric model with the F_{∞}-metric model is made later in F_{2}-metric *vs.* F_{∞}-metric models.
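The F_{2} special case can be checked numerically. The sketch below assumes the commonly used F_{2}-metric coding (additive code 1, 0, −1 and dominance code −1/2, 1/2, −1/2, with epistatic scales formed as products); the paper's own **W** matrix is not reproduced here, so this coding, the helper names, and the genotypic values are assumptions for illustration.

```python
from fractions import Fraction as F
from itertools import product

PSI = [F(n, 16) for n in (1, 2, 1, 2, 4, 2, 1, 2, 1)]   # diagonal of Psi in an F2
X = (1, 0, -1)                                          # additive code: AA, Aa, aa
Z = (F(-1, 2), F(1, 2), F(-1, 2))                       # dominance code (assumed)

def scales():
    """Columns w_1..w_8; genotype order AABB, AABb, AAbb, AaBB, ..., aabb."""
    rows = []
    for i, j in product(range(3), repeat=2):
        xa, za, xb, zb = X[i], Z[i], X[j], Z[j]
        rows.append([xa, za, xb, zb, xa * xb, xa * zb, za * xb, za * zb])
    return [list(col) for col in zip(*rows)]

def wdot(u, v):
    """Frequency-weighted inner product u' Psi v."""
    return sum(p * a * b for p, a, b in zip(PSI, u, v))

def regression_coefficients(g):
    """beta_t = G' Psi w_t / (w_t' Psi w_t) for each of the eight scales."""
    return [wdot(g, w) / wdot(w, w) for w in scales()]

W = scales()
ones = [1] * 9
# Hypothetical purely additive genotypic values: G = 10 + 2*x_A + 1*x_B.
G = [10 + 2 * X[i] + X[j] for i, j in product(range(3), repeat=2)]
beta = regression_coefficients(G)
```

For these additive values the regression recovers an additive effect of 2 at locus *A*, 1 at locus *B*, and zero for the dominance and epistatic scales, and the two orthogonality requirements hold exactly under the F_{2} frequencies.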

### Departure from Cockerham's model:

In the presence of zygotic associations (**Ω** ≠ **0** and **P** = **Ψ** + **Ω**), however, a residual term (ε) needs to be added to Cockerham's regression model, and Equation 4 becomes Equation 8, in which μ′ is the mean of the zygotic disequilibrium population, μ′ = **G′P1** = μ + **G′ Ω 1**. Thus, the total genetic variance in the zygotic disequilibrium population is the sum of two parts (Equation 9), σ^{2}_{G} and σ^{2}_{ε}, where σ^{2}_{G} is the total genetic variance under Cockerham's model as given in Equation 6a and σ^{2}_{ε} is the residual variance arising from zygotic associations. When the means and genetic variances in equilibrium and nonequilibrium populations are known, σ^{2}_{ε} can be obtained from Equation 9 by difference (Equation 10a). Alternatively, it can be calculated directly from genotypic values and zygotic associations (Equation 10b). Using the relations among zygotic associations given in Table 1, this expression can be expanded further (Equation 10c). It should be noted that while σ^{2}_{ε} is called the residual “variance,” it is not necessarily positive because the zygotic associations in **Ω**, being deviations, can be either positive or negative.
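The two routes to the residual variance can be sketched as follows. This is an illustration under stated assumptions: variances are taken as frequency-weighted second moments about the mean, the direct route is written as **G′ΩG** − (μ′^{2} − μ^{2}), which follows algebraically from those definitions and may differ in layout from Equation 10b, and the frequency table and genotypic values are hypothetical.

```python
from fractions import Fraction as F

def variance_partition(table, values):
    """table, values: 3x3 lists (rows AA, Aa, aa; columns BB, Bb, bb)."""
    cells = [(i, j) for i in range(3) for j in range(3)]
    row = [sum(r) for r in table]
    col = [sum(r[j] for r in table) for j in range(3)]
    P = [table[i][j] for i, j in cells]            # observed two-locus frequencies
    Psi = [row[i] * col[j] for i, j in cells]      # expected under zygotic equilibrium
    G = [values[i][j] for i, j in cells]

    def mean(w):
        return sum(wi * g for wi, g in zip(w, G))

    def variance(w):
        return sum(wi * g * g for wi, g in zip(w, G)) - mean(w) ** 2

    mu, mu_prime = mean(Psi), mean(P)
    total, equilibrium = variance(P), variance(Psi)
    # Direct route from zygotic associations (Omega = P - Psi).
    direct = sum((p - s) * g * g for p, s, g in zip(P, Psi, G)) - (mu_prime ** 2 - mu ** 2)
    return total, equilibrium, total - equilibrium, direct

# Hypothetical frequencies (counts out of 16) and genotypic values.
table = [[F(n, 16) for n in r] for r in ([2, 1, 1], [1, 6, 1], [1, 1, 2])]
values = [[4, 3, 2], [3, 2, 1], [2, 1, 0]]
total, equilibrium, residual, direct = variance_partition(table, values)
```

In this example the total variance is 5/4, the equilibrium portion is 1, and both routes give a residual of 1/4; with a table that is exactly the product of its margins the residual is zero.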

### Application:

Doebley *et al*. (1995) identified two QTL, UMC107 (designated as locus *A*) and BV302 (designated as locus *B*), controlling differences in plant and inflorescence architecture between cultivated maize (*Zea mays* ssp. *mays*) and teosinte (*Z. mays* ssp. *parviglumis*) in the BC_{3}F_{2} population (Teosinte-M1L × Teosinte-M3L) derived from an original cross of Reventador maize × *parviglumis* teosinte. A total of nine morphological and inflorescence traits were measured. Two of those traits, average length of vegetative internodes in the primary lateral branch (LBIL) and percentage of cupules lacking the pedicellate spikelet (PEDS), have been reanalyzed to examine the magnitude of epistasis between loci *A* and *B* (Lynch and Walsh 1998; Goodnight 2000; Kao and Zeng 2002). However, all these analyses have assumed *p*_{A} = *p*_{B} = ^{1}/_{2} and *D*_{A} = *D*_{B} = 0 as in an F_{2} population with two uncorrelated loci. We test whether this assumption holds for Doebley *et al*.'s data. We then use the actual genotypic frequencies as given in Table 7 of Kao and Zeng (2002) and the trait values in Table 8 of Doebley *et al*. (1995) to illustrate the application of our new model developed above.

The observed and expected genotypic frequencies and their differences (disequilibria) at loci UMC107 and BV302 are given in Table 2. A sample of 183 individuals from the BC_{3}F_{2} population was recorded in Doebley *et al*. (1995) but as Kao and Zeng (2002) pointed out, 21 individuals had a missing value for one or more traits measured. Thus, a sample size of 161 individuals is used for our analysis. The expected values are derived assuming Cockerham's reference population (CRP) and the F_{2} population. For a single locus, CRP may not be in Hardy-Weinberg equilibrium with arbitrary gene frequencies whereas the F_{2} population is in Hardy-Weinberg equilibrium with the gene frequencies being constrained to the fixed value of one-half. Thus, the deviations of observed from expected single-locus genotypic frequencies in CRP are simply Hardy-Weinberg disequilibria but such deviations in the F_{2} population are Hardy-Weinberg disequilibria plus the differences between expected values in the two reference populations. For example, for genotype *AA* at locus *A*, the difference between observed and expected genotypic frequencies in CRP is 0.2050 − 0.2228 = −0.0178 but the corresponding difference under the F_{2} population is 0.2050 − 0.2500 = −0.0450, which equals −0.0178 + (0.2228 − 0.2500). The expected two-locus genotypic frequencies in CRP are the products of observed single-locus genotypic frequencies. For example, the expected frequency of genotype *AABb* is (0.2050 × 0.5280) = 0.1082. In contrast, the expected two-locus genotypic frequencies in the F_{2} population are the products of expected single-locus genotypic frequencies. According to the chi-square tests, neither Hardy-Weinberg disequilibria at individual loci nor zygotic associations between the two loci are significant. The observed genotypic frequencies fit well to those expected in CRP and the F_{2} population.
Because of small numbers of individuals for some genotypes (*e.g.*, *AAbb*), we also carry out Monte Carlo exact tests (Weir 1996). The “exact” *P* values are very similar to the asymptotic *P* values from the chi-square tests. Thus, the assumption of F_{2} gene and genotypic frequencies used for analyzing Doebley *et al*.'s data (Lynch and Walsh 1998; Goodnight 2000; Kao and Zeng 2002) is probably justified.
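The worked figures for genotype *AA* and *AABb* quoted above can be reproduced with a few lines of arithmetic. The variable names are illustrative; the four input frequencies are the ones stated in the text.

```python
# Frequencies quoted in the text for genotype AA at locus A (UMC107).
obs_AA, exp_AA_crp, exp_AA_f2 = 0.2050, 0.2228, 0.2500

dev_crp = obs_AA - exp_AA_crp     # Hardy-Weinberg disequilibrium relative to CRP
dev_f2 = obs_AA - exp_AA_f2       # deviation from the F2 expectation
shift = exp_AA_crp - exp_AA_f2    # difference between the two reference populations

# Expected two-locus frequency of AABb in CRP: product of observed
# single-locus frequencies of AA and Bb.
obs_Bb = 0.5280
exp_AABb_crp = obs_AA * obs_Bb

print(round(dev_crp, 4), round(dev_f2, 4), round(exp_AABb_crp, 4))
```

The F_{2} deviation is the CRP deviation plus `shift`, which is the decomposition described in the paragraph.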

Despite insignificant departures from Hardy-Weinberg and zygotic equilibria, we retain these disequilibria for illustrating the use of our new model. The mean values of nine morphological and inflorescence traits for each of the nine genotypic classes as given in Doebley *et al*.'s (1995) Table 8 are used for our analysis. We partition the genotypic values into eight genic effects defined under Cockerham's model and the F_{2}-metric model plus a residual effect (Table 3). The estimated genic effects under the F_{2}-metric model for LBIL are identical to those in Lynch and Walsh (1998)(pp. 88–90) and in Kao and Zeng's (2002) Table 9 (under the “Cockerham” column), barring rounding errors. In most cases, the values under Cockerham's model and the F_{2}-metric model are similar as would be expected from insignificant Hardy-Weinberg and zygotic disequilibria (*cf.* Table 2). However, there are a few exceptions, particularly with the residual effects for LBIL and percentage of staminate spikelets in primary lateral inflorescence (STAM), where the two models differ most. These differences are likely attributable to the deviations of gene frequencies from one-half under Cockerham's model.

Table 4 presents the partitioning of total genetic variance using (9) for each of the nine traits in the presence of zygotic associations. Two different partitions are obtained, depending on whether the equilibrium portion of total variance is calculated using Cockerham's model or the F_{2}-metric model; for LBIL, for example, the two partitions differ (Table 4). In other words, since the two models are based on different reference populations, they have different population means and genetic variances under zygotic equilibrium. Like the patterns of estimated genic effects in Table 3, contributions (percentages) due to individual variance components of σ^{2}_{G} under Cockerham's model *vs.* the F_{2}-metric model are similar in most cases; the residual variances are similar as well, with the two notable exceptions of LBIL and STAM. Further inspection of the contributions due to individual variance components of σ^{2}_{G} under both models shows that additive effects are the predominant components for all traits except PEDS, whereas dominance and epistatic effects are minor. Epistasis is most important for PEDS, with its effects accounting for nearly 40% of the total equilibrium variance under both models.

It is evident from Table 4 that the residual portion of the total genetic variance is generally small compared to the equilibrium portion. This is of course due largely to insignificant zygotic disequilibrium between the two interacting loci, UMC107 and BV302 (Table 2). To gauge the impact of zygotic disequilibrium on the magnitude of σ^{2}_{ε}, we let the Teosinte-M1L × Teosinte-M3L population be at the maximum level of zygotic disequilibrium that is obtainable if double homozygotes and heterozygotes are equally frequent in the absence of single heterozygotes at both loci (Yang 2003). One such possibility is that the required **P** matrix would be estimated by (1/161)diag{20 0 20 0 81 0 20 0 20}. With this particular genotypic distribution, the magnitudes of σ^{2}_{ε} are larger for all nine traits and for some traits are as large as or even greater than those of σ^{2}_{G} (results not presented). For example, for LBIL under the F_{2}-metric model, σ^{2}_{ε} is larger in magnitude with the maximum zygotic disequilibrium than with the insignificant zygotic disequilibrium shown in Table 4. Evidently, the magnitude of zygotic association affects the partitioning of the total genetic variance. Further research is needed to examine detailed relationships between nonextreme values of zygotic disequilibrium and the residual variance.

## F_{2}-METRIC *VS.* F_{∞}-METRIC MODELS

### Conceptual distinction:

As noted above, Cockerham's model is reduced to the F_{2}-metric model for F_{2} and other populations derived from a cross between two inbred lines (*cf.* Equation 6). However, since the 1950s, the F_{2}-metric model has coexisted with another popular model of gene action, the F_{∞}-metric model. In essence, genic effects under Cockerham's model or the F_{2}-metric model are deviations from the mean of a noninbred equilibrium population whereas genic effects under the F_{∞}-metric model are contrasts among genotypes without reference to any population. For example, the additive effect at locus *A* under Cockerham's model or the F_{2}-metric model (β_{1}) is a weighted average of two differences in average genotypic value, corresponding to the comparisons of genotypes *AA* − *Aa* and *Aa* − *aa*, with each difference representing an effect of replacing *a* by *A*. The dominance effect (β_{2}) would be present, *i.e.*, β_{2} ≠ 0, unless the two differences are exactly the same. On the other hand, the two genic effects at locus *A* under the F_{∞}-metric model are one-half the difference between the two homozygotes and the deviation of the heterozygote from the homozygote mean, respectively (*e.g.*, Mather and Jinks 1982). The F_{∞}-metric model is also referred to as the homozygote-based model (Wright 1987).
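The single-locus contrast between the two definitions can be sketched as follows. The weights `p` and `1 - p` on the two differences assume Hardy-Weinberg proportions at the locus (the textbook regression of genotypic value on allele count); Cockerham's general model with Hardy-Weinberg disequilibrium would use different weights. The function name and the genotypic values are hypothetical.

```python
def single_locus_effects(g_AA, g_Aa, g_aa, p=0.5):
    """Return (weighted additive effect, F-infinity additive, F-infinity dominance)."""
    q = 1.0 - p
    upper = g_AA - g_Aa                      # replacing a by A in a heterozygote
    lower = g_Aa - g_aa                      # replacing a by A in an aa homozygote
    weighted_additive = p * upper + q * lower
    a_inf = (g_AA - g_aa) / 2.0              # half the difference between homozygotes
    d_inf = g_Aa - (g_AA + g_aa) / 2.0       # heterozygote minus homozygote mean
    return weighted_additive, a_inf, d_inf

no_dominance = single_locus_effects(4.0, 2.0, 0.0, p=0.3)
with_dominance = single_locus_effects(4.0, 3.0, 0.0, p=0.5)
```

When the two differences are equal, the dominance contrast is zero and the weighted additive effect does not depend on `p`; at `p = 0.5` the weighted additive effect coincides with half the homozygote difference.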

Van der Veen (1959) gave three reasons why the F_{∞}-metric model would be preferred. First, nonzero coefficients of the genic effects defined under the F_{∞}-metric model are uniquely associated with homozygosity and heterozygosity at different loci. Second, the expressions for describing heterosis and other genetic phenomena are simpler and more surveyable under the F_{∞}-metric model. Finally the F_{∞}-metric model leads to simpler and symmetrical conditions and descriptions of relationships identified for classical F_{2} segregation ratios with epistasis. In contrast, Kao and Zeng (2002) recently argued against the use of the F_{∞}-metric model particularly in QTL mapping studies because there is possible bias in estimating allelic effects in the presence of nonallelic (epistatic) effects due to the fact that the genic effects defined under the F_{∞}-metric model in an F_{2} population are not orthogonal. In what follows, we clarify the relationships and differences between the two models through theoretical analysis and numerical evaluation.

### Parameter relations:

With two uncorrelated loci in an F_{2} population (*p*_{A} = *p*_{B} = ^{1}/_{2} and *D*_{A} = *D*_{B} = 0), the values of nine possible genotypes can be expressed as

$$\mathbf{G} = \mathbf{M}_{\delta}\mathbf{E}_{\delta}, \tag{11}$$

where δ indicates whether the genic effects are defined under the F_{2}-metric model (δ = 0) or F_{∞}-metric model (δ = 1); **E**_{δ} = [μ_{(δ)} *a*_{A(δ)} *d*_{A(δ)} *a*_{B(δ)} *d*_{B(δ)} *aa*_{(δ)} *ad*_{(δ)} *da*_{(δ)} *dd*_{(δ)}]′ contains the parameters defined in (6) for δ = 0; and **M**_{δ} is the corresponding 9 × 9 coefficient matrix (Equation 12). Obviously, **G** = **M**_{0}**E**_{0} and **G** = **M**_{1}**E**_{1} describe the partitioning of genotypic values under F_{2}-metric and F_{∞}-metric models, respectively. Since the F_{2}-metric model is a special case of Cockerham's model, columns 2–9 in the **M**_{0} matrix are identical to the **W** matrix when *p*_{A} = *p*_{B} = ^{1}/_{2} and *D*_{A} = *D*_{B} = 0. The translation of parameters from F_{2}-metric into F_{∞}-metric can be made through the relation **E**_{0} = **TE**_{1} (Van der Veen 1959), where **T** is the translation matrix given in Equation 13.

Thus, the same genotypic values can be partitioned into the mean and eight genic effects defined under either the F_{2}-metric or the F_{∞}-metric model so long as appropriate translations are identified,

$$\mathbf{G} = \mathbf{M}_0\mathbf{E}_0 = \mathbf{M}_0\mathbf{T}\mathbf{E}_1 = \mathbf{M}_1\mathbf{E}_1. \tag{14}$$

In addition, it is quite obvious from (14) that **M**_{0} = **M**_{1}**T**^{−1} and **M**_{1} = **M**_{0}**T**.
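The translation between the two parameter sets can be sketched in code. The coding below is an assumption rather than a transcription of Equations 12 and 13: additive code 1, 0, −1 in both metrics, and dominance code −1/2, 1/2, −1/2 for F_{2}-metric versus 0, 1, 0 for F_{∞}-metric. The function names are hypothetical, and the input effects are arbitrary values chosen to satisfy the F_{∞}-metric relationships that the text quotes for model 16.

```python
from fractions import Fraction as F
from itertools import product

X = (1, 0, -1)  # additive code for AA/BB, Aa/Bb, aa/bb (assumed)

def design_matrix(delta):
    """Assumed M_delta; rows follow AABB, AABb, AAbb, AaBB, ..., aabb."""
    hom = F(delta - 1, 2)                  # dominance code of homozygotes: -1/2 or 0
    z = (hom, hom + 1, hom)                # heterozygote code: 1/2 or 1
    rows = []
    for i, j in product(range(3), repeat=2):
        xa, za, xb, zb = X[i], z[i], X[j], z[j]
        rows.append([1, xa, za, xb, zb, xa * xb, xa * zb, za * xb, za * zb])
    return rows

def genotypic_values(effects, delta):
    """G = M_delta E_delta."""
    return [sum(m * e for m, e in zip(row, effects)) for row in design_matrix(delta)]

def to_f2_metric(e1):
    """E_0 = T E_1, written out term by term for the assumed coding."""
    mu, a_a, d_a, a_b, d_b, aa, ad, da, dd = e1
    h = F(1, 2)
    return [mu + h * (d_a + d_b) + h * h * dd,
            a_a + h * ad, d_a + h * dd,
            a_b + h * da, d_b + h * dd,
            aa, ad, da, dd]

# Order: mu, a_A, d_A, a_B, d_B, aa, ad, da, dd (F-infinity metric).
e1 = [0, 3, 3, 1, 1, -1, -1, -1, -1]
e0 = to_f2_metric(e1)
```

Both parameter sets reproduce the same nine genotypic values, and the translated effects satisfy the F_{2}-metric relationships quoted for model 16 (2*a*_{A(0)} = 10*a*_{B(0)} = 2*d*_{A(0)} = 10*d*_{B(0)} = −5*aa*_{(0)}).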

With F_{2} gene and genotypic frequencies [*i.e.*, **Ω** = **0** and **P** = **Ψ** = (^{1}/_{16})diag{1 2 1 2 4 2 1 2 1}], the population mean is given by Equation 15. The total equilibrium genetic variance can be partitioned under the F_{2}-metric model (Equation 16a) and under the F_{∞}-metric model (Equation 16b). It is evident from comparing (16a) and (16b) that the two models differ in partitioning the total genetic variance σ^{2}_{G} in an F_{2} population. Because, with F_{2} gene and genotypic frequencies, the genic effects under the F_{∞}-metric model are not orthogonal, the additive and dominance variances include covariances between allelic and nonallelic effects plus a portion of appropriate epistatic variances. It is also evident that the eight independent components of σ^{2}_{G} as identified in Cockerham's model have simpler forms under the F_{2}-metric and F_{∞}-metric models (Equation 17).
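A numerical sketch of the F_{2}-metric variance partition is given below. It assumes the standard F_{2}-metric coding (additive 1, 0, −1; dominance −1/2, 1/2, −1/2), for which the frequency-weighted squared lengths of the eight scales are 1/2, 1/4, 1/2, 1/4, 1/4, 1/8, 1/8, and 1/16; these weights are derived from that assumed coding and are not copied from Equation 16a. The effect values are hypothetical and merely satisfy the F_{2}-metric relationships quoted in the text for model 16.

```python
from fractions import Fraction as F
from itertools import product

PSI = [F(n, 16) for n in (1, 2, 1, 2, 4, 2, 1, 2, 1)]   # F2 genotypic frequencies
X = (1, 0, -1)
Z = (F(-1, 2), F(1, 2), F(-1, 2))
# w_t' Psi w_t for a_A, d_A, a_B, d_B, aa, ad, da, dd under the assumed coding.
WEIGHTS = [F(1, 2), F(1, 4), F(1, 2), F(1, 4), F(1, 4), F(1, 8), F(1, 8), F(1, 16)]

def components(effects):
    """Variance component of each genic effect: weight times squared effect."""
    return [w * e * e for w, e in zip(WEIGHTS, effects)]

def total_variance(mu, effects):
    """Variance of the genotypic values rebuilt from the effects."""
    g = []
    for i, j in product(range(3), repeat=2):
        xa, za, xb, zb = X[i], Z[i], X[j], Z[j]
        terms = (xa, za, xb, zb, xa * xb, xa * zb, za * xb, za * zb)
        g.append(mu + sum(c * e for c, e in zip(terms, effects)))
    mean = sum(p * v for p, v in zip(PSI, g))
    return sum(p * v * v for p, v in zip(PSI, g)) - mean ** 2

# Order: a_A, d_A, a_B, d_B, aa, ad, da, dd (F2-metric, hypothetical values).
effects = [F(5, 2), F(5, 2), F(1, 2), F(1, 2), -1, -1, -1, -1]
parts = components(effects)
epistatic_share = sum(parts[4:]) / sum(parts)
```

Because the scales are orthogonal under F_{2} frequencies, the eight components add up exactly to the total variance of the reconstructed genotypic values; no such additivity is expected if F_{∞}-metric effects are plugged into the same weights.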

### Numerical evaluation:

We evaluate 20 genetic models exhibiting different segregating ratios for two independent loci in F_{2} populations (Table 5). These models are chosen to represent varying levels of additive, dominance, and epistatic effects under the F_{2}-metric and F_{∞}-metric models. Yang and Baker (1990) also examined these genetic models, but only with the F_{∞}-metric model. The first 10 models involve no epistasis and they are variants of the Mendelian F_{2} ratio of 9:3:3:1 for two independently segregating loci (model 7). These models vary from those of pure additive genic effects (models 1 and 6) to those of strong dominance (*e.g.*, models 9 and 10). Models 11–19 are those with classical epistatic ratios that can be found in F_{2} populations (Strickberger 1976). Model 12 is the partial complementary model, a modification of model 11 with 0 ≤ *w* ≤ 1. Model 20 is hypothetical with the relation of *a*_{A(0)} = *a*_{B(0)} = *d*_{A(0)} = *d*_{B(0)} = *aa*_{(0)} = *ad*_{(0)} = *da*_{(0)} = *dd*_{(0)} under the F_{2}-metric model. The multiplier *x* in each of the genotypic values in Table 5 can be any nonzero integer. For convenience, we take *x* = 1 and *w* = 0.5 (for model 12) for subsequent calculations.

We partition each genotypic value into the mean and eight genic effects under either F_{2}-metric or F_{∞}-metric, thereby allowing for identification of relationships among the genic effects for each of the 20 genetic models (Table 6). The relationships identified under F_{2}-metric and F_{∞}-metric are the same for nonepistatic models (models 1–10) but different for epistatic models (models 11–20). For example, the genic relationships for model 4 are 2*a*_{A(δ)} = 2*a*_{B(δ)} = *d*_{A(δ)} = *d*_{B(δ)}, *aa*_{(δ)} = *ad*_{(δ)} = *da*_{(δ)} = *dd*_{(δ)} = 0 under either F_{2}-metric (δ = 0) or F_{∞}-metric (δ = 1), whereas the genic relationships for model 16 are 2*a*_{A(0)} = 10*a*_{B(0)} = 2*d*_{A(0)} = 10*d*_{B(0)} = −5*aa*_{(0)} = −5*ad*_{(0)} = −5*da*_{(0)} = −5*dd*_{(0)} under F_{2}-metric (δ = 0) but *a*_{A(1)} = 3*a*_{B(1)} = *d*_{A(1)} = 3*d*_{B(1)} = −3*aa*_{(1)} = −3*ad*_{(1)} = −3*da*_{(1)} = −3*dd*_{(1)} under F_{∞}-metric (δ = 1). It is evident from Equation 13 that while the conditions of *a*_{A(0)} = *a*_{A(1)} and *a*_{B(0)} = *a*_{B(1)} always hold regardless of whether or not epistasis is present, those of *d*_{A(0)} = *d*_{A(1)} and *d*_{B(0)} = *d*_{B(1)} are true only in the absence of epistasis. It should be emphasized that because these genic relationships are derived from the coded genotypic values with *x* being set to 1 (*cf.* Table 5), they will be multiplied by a constant if *x* is set to be any other nonzero integer.

The contributions due to individual genic effects are calculated using (17) (Table 7). The nonepistatic models (models 1–10) differ only in the sizes of additive genic effects and levels of dominance at each of the two loci. The 10 epistatic models (models 11–20) show varying levels of epistasis as evident from inspecting percentages of the contributions of individual epistatic effects to the total variance. Model 13 (duplicate epistasis) and model 19 (partial dominant epistasis) are the most epistatic of the models considered. Epistatic effects in model 19 account for almost 75% of the total variance whereas those in model 13 account for ∼61% of the total variance, although the constitutions of the four epistatic variance components in both F_{2}-metric and F_{∞}-metric are quite different.

## DISCUSSION

The models of gene actions including epistasis between different loci developed for conventional quantitative genetics and recent QTL mapping studies do not have the same level of applicability because the genic effects in these models are defined with reference to different types of populations. These models range from the ones for general experimental and natural populations where gene frequencies are arbitrary and Hardy-Weinberg disequilibrium is often present (*e.g.*, Cockerham's model by Cockerham 1954) to the ones for special populations derived from a cross between two inbred lines with gene frequencies of one-half (*e.g.*, F_{2}-metric model by Hayman and Mather 1955 or F_{∞}-metric model by Mather and Jinks 1982). The key assumption involved in all the models is that the study population is in zygotic equilibrium (*i.e.*, genotypic frequencies at different loci are products of corresponding single-locus genotypic frequencies). However, zygotic associations can arise from physical linkages between different loci or from many evolutionary and demographic processes even for unlinked loci (Yang 2002). Zygotic associations are considered in some general gene action models (*e.g.*, Gallais 1974; Weir and Cockerham 1977) but a prohibitively large number of genic effects arising from such models have prevented them from being estimated in experimental quantitative genetics and QTL mapping studies. This article develops a new model that partitions the two-locus genotypic values in a zygotic disequilibrium population into equilibrium and residual portions. The equilibrium portion has eight genic effects that can be defined under those equilibrium models (Cockerham's model and the F_{2}-metric and F_{∞}-metric models). The residual portion is of course due to the presence of zygotic associations and can be a substantial part of the total genetic variance (Table 4). 
Thus, bias will occur when an equilibrium gene action model is used in modeling and detecting epistatic QTL for a zygotic disequilibrium population.

Kao and Zeng (2002) also considered associations of genotypic frequencies between different loci in modeling epistatic effects but our model differs from theirs in two ways. First, our model separates the genotypic values and genetic variance into a portion arising from zero interlocus association (Cockerham's model) and the other portion arising from the presence of interlocus associations; in contrast, Kao and Zeng's model incorporated interlocus associations into Cockerham's model (F_{2}-metric model) by defining genetic and statistical parameters for the same genic effects. Second, Kao and Zeng's model considered linkage disequilibrium in a Hardy-Weinberg equilibrium F_{2} population, which would be only one of several genic disequilibria required to completely characterize zygotic associations; the residual variance in our model essentially measures the relative importance of genetic variance due to zygotic associations (Equation 10c and see Table 4 for examples). While our model does not expound the relationship between those genic disequilibria and zygotic associations, it can be easily achieved using Cockerham and Weir's (1973) disequilibrium functions (Yang 2002). For example, the zygotic association for genotype *AABB* can be expressed as a function of genic disequilibria, where each genic disequilibrium (*D*) is the deviation of a frequency from that based on random association of genes and accounting for any lower-order disequilibria. For example, linkage disequilibrium is the deviation of frequency of gamete *AB* from the product of frequencies of allele *A* at locus *A* and allele *B* at locus *B*, *D*_{AB} = *P*_{AB} − *p*_{A}*p*_{B}. When zygotes result from random union of gametes as assumed in Kao and Zeng (2002), all nongametic disequilibria including Hardy-Weinberg disequilibrium disappear (*e.g.*, *D*_{A} = 0). In this case, the zygotic association for genotype *AABB* reduces to 2*p*_{A}*p*_{B}*D*_{AB} + *D*_{AB}^{2}, which agrees with the frequency of the same genotype given in Table 6 of Kao and Zeng (2002).
The same argument is true for the remaining eight genotypes.
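
The reduction under random union of gametes can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and it assumes the zygotic association for *AABB* is the deviation of the two-locus genotypic frequency from the product of the single-locus genotypic frequencies, with linkage disequilibrium defined as the gamete *AB* frequency minus the product of the allele frequencies.

```python
def zygotic_association_aabb(p_a, p_b, d_ab):
    """Zygotic association for AABB under random union of gametes.

    p_a, p_b : frequencies of alleles A and B
    d_ab     : linkage disequilibrium, P(gamete AB) - p_a * p_b
    """
    gamete_ab = p_a * p_b + d_ab          # frequency of gamete AB
    freq_aabb = gamete_ab ** 2            # random union of two AB gametes
    freq_aa = p_a ** 2                    # Hardy-Weinberg proportion at locus A
    freq_bb = p_b ** 2                    # Hardy-Weinberg proportion at locus B
    return freq_aabb - freq_aa * freq_bb  # deviation from zygotic equilibrium


def f2_linkage_disequilibrium(r):
    """Linkage disequilibrium among gametes of a coupling-phase F1 (AB/ab)
    with recombination fraction r; both allele frequencies are one-half."""
    return (1 - 2 * r) / 4


r = 0.1
d = f2_linkage_disequilibrium(r)
za = zygotic_association_aabb(0.5, 0.5, d)
closed_form = 2 * 0.5 * 0.5 * d + d ** 2  # 2 * p_a * p_b * D + D^2
f2_freq_aabb = 0.5 ** 2 * 0.5 ** 2 + za   # equilibrium part plus association
```

Under these assumptions the association equals 2 *p*_{A} *p*_{B} *D* + *D*^2, and adding it to the equilibrium frequency of 1/16 recovers the standard F_{2} frequency (1 - *r*)^2/4 for *AABB*. With unlinked loci (*r* = 0.5) the association vanishes.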

Our theoretical and empirical (Table 4) analyses clearly show substantial differences between Cockerham's model and the F_{2}-metric model. Kao and Zeng (2002) considered the F_{2} population where gene frequencies are one-half and genotypic frequencies are those expected under Hardy-Weinberg equilibrium at each of the two loci. In this case, Cockerham's model is the same as the F_{2}-metric. In all other cases, however, the two models would be different. In analyzing the PEDS data of Doebley *et al*. (1995), Goodnight (2000) used Cockerham's model to calculate eight variance components of the total genetic variance using three gene frequencies at both UMC107 and BV302: *p*_{A} = *p*_{B} = 0.25, 0.5, and 0.75, but no Hardy-Weinberg disequilibrium was considered at either locus. His case of *p*_{A} = *p*_{B} = 0.5 corresponds to the F_{2}-metric as evident from our analysis of the same trait (Table 4). Nevertheless, it is evident from his Table 4 that changes in gene frequencies greatly affect the distributions of variance components due to the genic effects, which is certainly consistent with our finding.

We distinguish the genic effects defined under the F_{2}-metric model (δ = 0) from those under the F_{∞}-metric model (δ = 1). This distinction should help to reduce the confusion arising from the use of the same notation for both models. It also enables us to clearly identify the differences in parameter relations between the two sets of parameters for different genetic models describing F_{2} segregation patterns (Table 6). Until recently, there was a lack of appreciation that different gene action models would lead to different partitions of genic effects and subsequently different variance components due to the genic effects. Because the F_{∞}-metric model has clear interpretation advantages as noted by Van der Veen (1959) and Mather and Jinks (1982), it has often been used for modeling multiple QTL (*e.g*., Haley and Knott 1992; Carlborg *et al*. 2000; Yi and Xu 2002). However, because the genic effects defined under the F_{∞}-metric model with the F_{2} gene and genotypic frequencies are not orthogonal, bias will occur in estimating allelic effects when nonallelic (epistatic) effects are present; in contrast, the genic effects under the F_{2}-metric model are orthogonal and thus unbiased estimates of allelic effects can be obtained regardless of whether epistasis is present (Cheverud 2000; Kao and Zeng 2002). Since the translation between the F_{2}-metric and F_{∞}-metric models can be readily done through the simple relation (13), we recommend that QTL mapping studies first estimate the genic effects under the F_{2}-metric model and then convert the estimates into those under the F_{∞}-metric model to capture its interpretation advantages.
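
The orthogonality contrast between the two metrics can be illustrated with a small numerical check. This is a sketch under stated assumptions, not the article's derivation: it uses the conventional codings for genotypes (*AA*, *Aa*, *aa*), namely additive (1, 0, -1) for both metrics, dominance (-1/2, 1/2, -1/2) for the F_{2}-metric and (0, 1, 0) for the F_{∞}-metric, takes epistatic scales as products of single-locus scales, and assumes two unlinked loci with F_{2} genotypic frequencies. All function and column names are hypothetical.

```python
from itertools import product

# Single-locus F2 genotypic frequencies for (AA, Aa, aa).
F2_FREQ = (0.25, 0.5, 0.25)

ADDITIVE = (1.0, 0.0, -1.0)        # additive coding, shared by both metrics
DOMINANCE_F2 = (-0.5, 0.5, -0.5)   # assumed F2-metric dominance coding
DOMINANCE_FINF = (0.0, 1.0, 0.0)   # assumed F-infinity-metric dominance coding

NAMES = ["a_A", "d_A", "a_B", "d_B", "aa_AB", "ad_AB", "da_AB", "dd_AB"]


def effect_columns(dominance):
    """Build the eight effect columns over the nine two-locus genotypes,
    together with the genotypic frequencies used as weights."""
    cols = {name: [] for name in NAMES}
    weights = []
    for i, j in product(range(3), repeat=2):
        xa, za = ADDITIVE[i], dominance[i]
        xb, zb = ADDITIVE[j], dominance[j]
        # Unlinked loci in zygotic equilibrium: frequencies multiply.
        weights.append(F2_FREQ[i] * F2_FREQ[j])
        values = (xa, za, xb, zb, xa * xb, xa * zb, za * xb, za * zb)
        for name, value in zip(NAMES, values):
            cols[name].append(value)
    return cols, weights


def weighted_cov(u, v, w):
    """Frequency-weighted covariance between two effect columns."""
    mean_u = sum(wi * ui for wi, ui in zip(w, u))
    mean_v = sum(wi * vi for wi, vi in zip(w, v))
    return sum(wi * (ui - mean_u) * (vi - mean_v) for wi, ui, vi in zip(w, u, v))


def nonorthogonal_pairs(dominance, tol=1e-12):
    """List pairs of effect columns whose weighted covariance is nonzero."""
    cols, w = effect_columns(dominance)
    return [(m, n) for k, m in enumerate(NAMES) for n in NAMES[k + 1:]
            if abs(weighted_cov(cols[m], cols[n], w)) > tol]


f2_pairs = nonorthogonal_pairs(DOMINANCE_F2)
finf_pairs = nonorthogonal_pairs(DOMINANCE_FINF)
```

With these codings, `f2_pairs` is empty, whereas `finf_pairs` contains pairs such as the additive effect at locus *A* with the additive × dominance effect. This is the kind of correlation between allelic and epistatic scales that produces the bias described above.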

Our comparative assessment of different gene action models suggests the need for caution when reading the standard labels of “additive genetic variance,” “dominance variance,” etc., in the literature. As shown in this article and elsewhere (*e.g.*, Cockerham 1963; Cockerham and Tachida 1988), genic effects and their variances change with the population of reference. For example, in the CRP, the additive variance at locus *A* can be obtained from (6b) and (7). If the CRP is an inbred population (IP), this additive variance takes a form that depends on *f*_{A}, and it reduces to the more familiar form in a random mating population (*f*_{A} = 0). Clearly, the so-called additive variance is “contaminated” with the dominance effect unless the gene frequencies are one-half, as obtained under the F_{2}-metric model (Equation 17), or the dominance effect is absent. On the other hand, as shown in Equation 17, the additive variance under the F_{∞}-metric model is contaminated with the epistatic effect. With zygotic associations (ZA) as in most natural populations, the additive variance takes an even more complicated form, whose amount can be quantified in terms of various genic disequilibria. For example, part of the dominance and additive × additive variances within finite populations undergoing bottleneck, drift, or population subdivision has been found to behave as additive variance (Cockerham and Tachida 1988; Goodnight 1988; Whitlock *et al*. 1993). Such nonadditive genetic variances contribute to immediate and permanent response to selection in finite populations in an intricate fashion. These discussions serve to emphasize that the additive variance is not free from the influence of nonadditive genic effects and thus does not have a simple and “clean” meaning unless it arises from the F_{2}-metric model. Fortunately, most of the current QTL mapping activities, particularly with plant and laboratory species, have focused on the use of planned mating designs corresponding to the F_{2}-metric model (Lynch and Walsh 1998).
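
The dependence of the additive variance on the dominance effect can be illustrated numerically for a single locus. The sketch below assumes the conventional parameterization, which is not necessarily the notation of Equations 6b and 7: genotypic values (*a*, *d*, -*a*) for (*AA*, *Aa*, *aa*), genotypic frequencies governed by an allele frequency `p` and a Hardy-Weinberg deviation `f`, and additive variance defined as the variance explained by the least-squares regression of genotypic value on allele count. Function names are hypothetical.

```python
def genotype_freqs(p, f=0.0):
    """Frequencies of (AA, Aa, aa) for allele frequency p and
    Hardy-Weinberg deviation (inbreeding coefficient) f."""
    q = 1.0 - p
    return (p * p + f * p * q, 2 * p * q * (1 - f), q * q + f * p * q)


def additive_variance(p, a, d, f=0.0):
    """Variance explained by regressing genotypic value on allele count.

    Requires 0 < p < 1 so that the allele count has nonzero variance.
    """
    freqs = genotype_freqs(p, f)
    counts = (2, 1, 0)      # number of A alleles in AA, Aa, aa
    values = (a, d, -a)     # genotypic values as deviations from the midpoint
    mean_x = sum(w * x for w, x in zip(freqs, counts))
    mean_g = sum(w * g for w, g in zip(freqs, values))
    var_x = sum(w * (x - mean_x) ** 2 for w, x in zip(freqs, counts))
    cov_gx = sum(w * (x - mean_x) * (g - mean_g)
                 for w, x, g in zip(freqs, counts, values))
    return cov_gx ** 2 / var_x


# At p = 0.5 the dominance effect d does not enter the additive variance.
va_half_no_dom = additive_variance(0.5, a=1.0, d=0.0)
va_half_dom = additive_variance(0.5, a=1.0, d=0.8)

# Away from p = 0.5, the same d changes the additive variance.
va_quarter_no_dom = additive_variance(0.25, a=1.0, d=0.0)
va_quarter_dom = additive_variance(0.25, a=1.0, d=0.8)

# With f > 0 the regression result matches the closed form
# 2pq(1 + f)[a + d(q - p)(1 - f)/(1 + f)]^2 implied by this parameterization.
va_inbred = additive_variance(0.25, a=1.0, d=0.8, f=0.3)
```

Under these assumptions, setting `f=0` gives the familiar random-mating form 2*pq*[*a* + (*q* - *p*)*d*]^2, and the dominance term drops out only when *p* = *q* = 1/2 or *d* = 0.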

The extension of our two-locus model for nonequilibrium populations to three or more loci is straightforward. For example, for three uncorrelated loci each with two alleles, the frequencies of 27 possible genotypes are the appropriate products of three single-locus genotypic frequencies. These three-locus genotypic frequencies are required for partitioning the three-locus genotypic values into 26 genic effects as expected under Cockerham's model or the F_{2}-metric model: 3 additive, 3 dominance, 12 two-locus epistatic, and 8 three-locus epistatic effects. In the presence of two-locus and/or three-locus zygotic associations, an additional residual genic effect arises from nonzero deviations of the observed from expected three-locus genotypic frequencies. Such deviations are actually functions of 12 two-locus zygotic associations and 8 three-locus zygotic associations. It remains to be investigated what the exact forms of these functions are. Perhaps the functions of multilocus heterozygosity developed by Yang (2002) may be modified to accommodate individual genotypes as required here. Our model can also be extended to the case of multiple alleles per locus. For example, with three alleles at each of two independent loci, there are 6 possible genotypes at each locus and 36 genotypes at the two loci. The genotypic values are partitioned into the mean and 35 genic effects under Cockerham's model or the F_{2}-metric model: 10 (5 at each locus) allelic effects and 25 nonallelic (epistatic) effects. With zygotic associations between the two loci, the residual genic effect is due to the presence of 25 zygotic associations that can be freely defined.
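
The counts quoted above follow from a degrees-of-freedom argument: a locus with *k* alleles has *k*(*k* + 1)/2 genotypes and therefore one fewer independent contrasts, and effects involving several loci are counted by multiplying the per-locus contrasts. A minimal sketch of this bookkeeping, with a hypothetical function name, is:

```python
from itertools import combinations
from math import prod


def count_genic_effects(alleles_per_locus):
    """Count multilocus genotypes and genic effects by number of loci involved.

    alleles_per_locus : sequence giving the number of alleles at each locus
    Returns (number of genotypes, {n_loci: number of effects involving n_loci}).
    """
    # k alleles give k(k + 1)/2 unordered single-locus genotypes.
    genotypes = [k * (k + 1) // 2 for k in alleles_per_locus]
    # Each locus contributes (genotypes - 1) independent contrasts.
    contrasts = [g - 1 for g in genotypes]
    effects = {}
    for order in range(1, len(contrasts) + 1):
        # Effects among a subset of loci: product of their per-locus contrasts.
        effects[order] = sum(prod(subset)
                             for subset in combinations(contrasts, order))
    return prod(genotypes), effects


three_biallelic = count_genic_effects([2, 2, 2])
two_triallelic = count_genic_effects([3, 3])
```

For three biallelic loci this gives 27 genotypes with 6 single-locus effects (3 additive and 3 dominance), 12 two-locus effects, and 8 three-locus effects, for a total of 26. For two triallelic loci it gives 36 genotypes with 10 allelic and 25 epistatic effects, for a total of 35.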

## Acknowledgments

I am grateful to Marjorie Asmussen for her help and regret her recent and sudden death while this manuscript was still under review. I thank Yun-Xin Fu for stepping in to complete the review and two reviewers for valuable comments. This research was supported in part by the Natural Sciences and Engineering Research Council of Canada grant OGP0183983.

## Footnotes

Communicating editor: Y.-X. Fu

- Received July 15, 2003.
- Accepted April 9, 2004.

- Genetics Society of America