Nonrandom associations between genes at different loci are often assessed in population genetic and evolutionary studies because such associations provide the basis for inferences about past demographic and genetic events, such as population history and the evolutionary forces acting on the loci. Current intensive interest in association studies largely stems from the prospect of exploiting the relation between the extent of association and the recombination fraction for fine-scale mapping of quantitative trait loci (QTL) controlling complex diseases in humans (Ardlie et al. 2002) or quantitative traits of economic or adaptive importance in animals and plants (Farnir et al. 2002). In either case, the focus has been on the use of gametic association, commonly called linkage disequilibrium (LD). Several statistical measures have been proposed to characterize LD (see Hedrick 1987 for review), but the use of these measures is often limited to a pair of alleles at two loci. With the increasing availability of multiallelic systems such as microsatellites, pairwise LD measures may be too numerous to be readily manageable and interpretable in initial genome-wide studies. More importantly, unless a stringent significance level is imposed, the large number of pairwise tests required may produce spurious associations at the commonly used significance levels of 5 and 1% (Karlin and Piazza 1981).
Recently, Sabatti and Risch (2002) suggested the use of haplotype homozygosity as a possible measure of LD to circumvent the problem of measuring multilocus associations involving multiple alleles and loci. When zygotes result from the random union of gametes (i.e., Hardy-Weinberg equilibrium), as assumed in Sabatti and Risch (2002), LD can be estimated from observed homozygosities and heterozygosities. The advantage of this approach is that the homozygosities and heterozygosities are defined independently of the number of alleles per locus, thereby allowing one to measure LD between highly polymorphic markers. In the presence of Hardy-Weinberg disequilibrium, as is often the case in natural populations, however, LD is only one of several genic disequilibria that are required for a complete characterization of nonrandom associations at different loci (Cockerham and Weir 1973). In a similar but independent development, Yang (2000, 2002) advocated a direct characterization and test of zygotic associations at multiple loci regardless of whether or not the population is in Hardy-Weinberg equilibrium. The purposes of this letter are (i) to elucidate the relationship between the two approaches of Sabatti and Risch (2002) and of Yang (2000, 2002) and (ii) to point out possible bias in calculating LD if other nonzero genic disequilibria are ignored.
For simplicity, only the case of two loci (say j and l) is considered, each with multiple alleles (j_{1}, j_{2},..., j_{r}; l_{1}, l_{2},..., l_{s}). Frequencies of zygotes at loci j and l from the union of gametes, j_{u} l_{y} and j_{v} l_{z} (u, v = 1, 2,..., r and y, z = 1, 2,..., s), are written as
When zygotes result from random union of gametes, all nongametic disequilibria including Hardy-Weinberg disequilibrium disappear (e.g.,
Evidently, since the zygotic association is a composite measure, a direct one-to-one relationship between zygotic and gametic associations is possible only when there are two alleles at each of the two loci and all nongametic disequilibria are absent (i.e., Equation 3). Thus, with knowledge of ω and the allelic frequencies (p's and q's), LD can be calculated by solving the equation 4D^{2} + 2(p_{1} – p_{2})(q_{1} – q_{2})D – ω = 0. In the special case of p_{1} = p_{2} = 0.5 or q_{1} = q_{2} = 0.5,
Numerical calculation is carried out to examine patterns of the solutions for D. Consider first the case where all genic disequilibria except for LD are zero. For a given set of gene frequencies, LD falls in the range of
Because ω is a summary statistic at the zygote level, it may represent a loss of haplotype information such as gametic disequilibrium. In other words, zero zygotic association (ω = 0) does not preclude the existence of certain nonzero gametic disequilibria (D ≠ 0), as is evident from Equation 3. Thus, with ω = 0, the nontrivial solution for LD as derived from Equation 4b, D = –(p_{1} – p_{2})(q_{1} – q_{2})/2, is not zero unless the allele frequencies are symmetric at one of the loci (p_{1} = p_{2} = 0.5 or q_{1} = q_{2} = 0.5). For example, if p_{1} = q_{1} = 0.3, the nontrivial solution for LD is D = –0.08, but the zygotic frequencies are f(00) = 0.3364, f(01) = f(10) = 0.2436, and f(11) = 0.1764, leading to ω = (0.3364)(0.1764) – (0.2436)^{2} = 0.
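The numerical example above can be checked with a short Python sketch. It assumes two alleles per locus, random union of gametes, and that the labels 0 and 1 in f(00), f(01), f(10), and f(11) denote homozygous and heterozygous status at loci j and l, respectively; that coding is inferred from the quoted frequencies and is not stated in the surrounding text. The function names are illustrative only.

```python
import math


def gamete_freqs(p1, q1, D):
    """Gamete frequencies for two biallelic loci with gametic disequilibrium D."""
    p2, q2 = 1.0 - p1, 1.0 - q1
    return {
        (1, 1): p1 * q1 + D,
        (1, 2): p1 * q2 - D,
        (2, 1): p2 * q1 - D,
        (2, 2): p2 * q2 + D,
    }


def zygotic_freqs(p1, q1, D):
    """Zygotic frequencies under random union of gametes.

    Keys are (status at j, status at l), with 0 = homozygous and
    1 = heterozygous (an assumed coding, see the lead-in).
    """
    g = gamete_freqs(p1, q1, D)
    f = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.0}
    for (ju, ly), x in g.items():
        for (jv, lz), y in g.items():
            f[(int(ju != jv), int(ly != lz))] += x * y
    return f


def omega(f):
    """Zygotic association as used in the example: f(00)f(11) - f(01)f(10)."""
    return f[(0, 0)] * f[(1, 1)] - f[(0, 1)] * f[(1, 0)]


def solve_D(p1, q1, w):
    """Both roots of 4D^2 + 2(p1 - p2)(q1 - q2)D - w = 0, smaller root first."""
    p2, q2 = 1.0 - p1, 1.0 - q1
    b = 2.0 * (p1 - p2) * (q1 - q2)
    disc = b * b + 16.0 * w
    if disc < 0.0:
        raise ValueError("no real solution for D at this value of omega")
    r = math.sqrt(disc)
    return ((-b - r) / 8.0, (-b + r) / 8.0)


# Example from the text: p1 = q1 = 0.3 with the nontrivial solution D = -0.08.
f = zygotic_freqs(0.3, 0.3, -0.08)
print({k: round(v, 4) for k, v in f.items()})
print("omega:", round(omega(f), 12))
print("roots for omega = 0:", solve_D(0.3, 0.3, 0.0))
```

Under these assumptions the quadratic has two roots for a given ω, which is why a single zygotic association cannot by itself distinguish D = 0 from D = –0.08 in this example.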
In the presence of all genic disequilibria, the relationship between zygotic and gametic associations becomes far less clear (cf. Equation 1). Table 2 presents five selected examples of solutions for LD (D_{1} and D_{2}) from zygotic associations (ω). For each of five gene frequencies that are equal at the two loci (i.e., p_{1} = q_{1} = 0.1, 0.2, 0.3, 0.4, and 0.5), minimum and maximum values of Hardy-Weinberg disequilibria (HWD), nonallelic digenic disequilibria including both gametic (D) and nongametic disequilibria (D′), trigenic disequilibria (TRID), and quadrigenic disequilibria (QD) are determined just as LD is determined for Table 1. As with LD, the strength of each genic disequilibrium is represented by five levels (maximum negative, half-maximum negative, zero, half-maximum positive, and maximum positive). Thus, a total of 3125 (5 × 5 × 5 × 5 × 5) combinations are examined. Frequencies of 10 genotypes are calculated using Cockerham and Weir's (1973) disequilibrium functions involving these genic disequilibria, and the 4 zygotic frequencies are simply the appropriate sums of the 10 genotypic frequencies. In the first example, all nonallelic genic disequilibria (D = D′, TRID, and QD) are zero, the zygotic association is zero (ω = 0) as expected, and the first solution (D_{1} = 0) corresponds to the absence of gametic disequilibrium (D = 0). However, because one or more nonallelic genic disequilibria are present in each of the remaining four examples, there is no correspondence between either of the two solutions (D_{1} or D_{2}) and D. In the third and fifth examples, there is no LD (D = 0), but because of nonzero TRID and/or QD, neither solution is zero. In particular, the fifth example represents a well-known scenario where nonzero quadrigenic disequilibrium between two unlinked loci is present in a population undergoing mixed selfing and random mating, with s being the proportion of selfing (e.g., Weir and Cockerham 1973). For the case of two alleles at each of the two loci, the zygotic association is
While the selected examples in Table 2 are somewhat arbitrary, the point is clear: there is little correspondence between gametic and zygotic associations when other types of genic disequilibria are present. Sabatti and Risch (2002, p. 1718) also noted that “unfortunately, the relation between homozygosity and recombination fraction is not always direct...” although they considered only the haplotype homozygosity and heterozygosity in a Hardy-Weinberg equilibrium population. The value of zygote-based measures may lie in (i) their ability to quickly detect suspected “hot spots” of associations in genome-wide scans (Sabatti and Risch 2002) and (ii) the comparative assessment of gametic vs. zygotic associations to infer the adaptive significance of genotypes at different loci (Yang 2002). For genome scanning, the primary purpose of the zygotic association analysis, just like that of the LD analysis, is to detect markers that are tightly linked to QTL. In such detection, spurious associations (false positives) between markers and QTL may occur in two ways. First, strong associations between unlinked loci may arise from many evolutionary factors (see below for a discussion). Genetic designs and statistical tests are now available to avoid these kinds of false-positive findings (Gibson and Muse 2002). Second, the huge number of comparisons that are required to scan the genome for association will inevitably produce abundant false positives unless a significance level that is much more stringent than 5% or 1% is imposed (Karlin and Piazza 1981).
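The scale of the second problem can be illustrated with a small Python sketch. The marker count of 1000 is hypothetical, tests are counted per pair of loci rather than per pair of alleles, and the Bonferroni adjustment is only one common way to set a more stringent level; it is an assumption here and is not the procedure named in the works cited above.

```python
def n_pairwise_tests(n_loci):
    """Number of distinct locus pairs among n_loci markers."""
    return n_loci * (n_loci - 1) // 2


def expected_false_positives(n_tests, alpha):
    """Expected number of rejections when every null hypothesis is true."""
    return n_tests * alpha


def bonferroni_alpha(n_tests, family_alpha=0.05):
    """Per-test level that bounds the family-wise error rate by family_alpha."""
    return family_alpha / n_tests


n_loci = 1000  # hypothetical marker count, chosen for illustration only
m = n_pairwise_tests(n_loci)
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {m} tests, "
          f"expected false positives = {expected_false_positives(m, alpha):.0f}")
print(f"Bonferroni per-test level: {bonferroni_alpha(m):.2e}")
```

With several alleles per locus the number of allele-pair tests would be larger still, so the counts printed here are best read as a lower bound under the stated assumptions.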
Most current LD studies, whether on evolution or on QTL mapping, focus on patterns of LD as predicted by simple demographic models of population expansions or contractions, but often acknowledge the impact of other factors such as natural selection, random drift, admixture or gene flow, and inbreeding (e.g., Pritchard and Przeworski 2001; Ardlie et al. 2002). In essence, these factors cause departure from Hardy-Weinberg equilibrium, thereby producing zygotic association even in a population at gametic equilibrium (cf. Yang 2000, Table 2, case 4). Thus, if these factors are present but ignored, LD will be over- or underemphasized in evolution or QTL-mapping studies.
Acknowledgments
This research was partially supported by the Natural Sciences and Engineering Research Council of Canada grant OGP0183983.
Footnotes

Communicating editor: M. A. Asmussen
 Received July 3, 2002.
 Accepted February 5, 2003.
 Copyright © 2003 by the Genetics Society of America