## Abstract

Unlike gametic linkage disequilibrium, which is defined for a random-mating population, zygotic disequilibrium describes the nonrandom association between different loci in a nonequilibrium population that deviates from Hardy–Weinberg equilibrium. Zygotic disequilibrium simultaneously specifies five different types of disequilibria: (1) Hardy–Weinberg disequilibria at each locus, (2) gametic disequilibrium (involving two alleles in the same gamete, each from a different locus), (3) nongametic disequilibrium (involving two alleles in different gametes, each from a different locus), (4) trigenic disequilibrium (involving a zygote at one locus and an allele at the other), and (5) quadrigenic disequilibrium (involving two zygotes, each from a different locus). However, because the phase of the double heterozygote is uncertain, gametic and nongametic disequilibria need to be combined into a composite digenic disequilibrium, which, together with the quadrigenic disequilibrium, further defines a composite quadrigenic disequilibrium. To investigate the extent and distribution of zygotic disequilibrium across the canine genome, a total of 148 dogs were genotyped at 247 microsatellite markers located on 39 pairs of chromosomes for an outbred multigenerational pedigree, initiated with a limited number of unrelated founders. A major portion of zygotic disequilibrium was contributed by the composite digenic and quadrigenic disequilibrium, whose values and numbers of significant marker pairs are both greater than those of trigenic disequilibrium. All types of disequilibrium are extensive in the canine genome, although their values tend to decrease with increasing map distance, with a steeper slope for trigenic disequilibrium than for the other types of disequilibrium. Considerable variation in the pattern of disequilibrium reduction was observed among different chromosomes. The results from this study provide guidance for determining the number of markers needed for whole-genome association studies.

The extent and distribution of nonrandom associations between genes at different loci, *i.e.*, linkage disequilibria, throughout the genome have often been used as a criterion to infer past demographic and genetic events of a population, such as population history and the evolutionary forces governing the loci. Because of its relation with the recombination fraction, the extent of association has provided a foundation for fine-scale mapping of quantitative trait loci (QTL) that control complex diseases in humans (Ardlie *et al.* 2002) or economic and adaptive traits in livestock (Farnir *et al.* 2000; McRae *et al.* 2002) and plants (Remington *et al.* 2001). Emerging as an important model system for human health research, canines have recently attracted renewed interest in unraveling the mysteries of mammalian genomes using linkage disequilibrium (LD) analysis (Hyun *et al.* 2003; Lou *et al.* 2003; Sutter and Ostrander 2004; Sutter *et al.* 2004; Lindblad-Toh *et al.* 2005). In a canine mapping study aimed at detecting QTL affecting canine hip dysplasia in a multihierarchic outbred pedigree, we analyzed how the extent of pairwise linkage disequilibrium changes over genetic distance with a set of 240 microsatellite markers genotyped across the entire canine genome (Lou *et al.* 2003).

As in many comparable studies, the measure of the extent of linkage disequilibrium between different loci in our canine genetic study was based on multilocus disequilibrium at the gametic level (Weir 1996). Although such a gametic disequilibrium analysis is mathematically simple, it relies upon the fundamental assumption that the population under study is at Hardy–Weinberg equilibrium (HWE), in which individuals are assumed to mate randomly to produce the next generation. In such an HWE population, the nonrandom associations of alleles at different loci occur only within gametes rather than between gametes. The random-mating assumption may be violated in the canine pedigree used for our earlier study because, although multiple founders were used, the offspring are related to each other to varying degrees.

For a nonequilibrium population at Hardy–Weinberg disequilibrium (HWD), zygotic disequilibria, which can characterize nonrandom associations at both gametic and zygotic levels (Weir 1996), may be more relevant. Earlier studies have documented possible genetic and evolutionary causes for zygotic associations in a nonequilibrium population (Haldane 1949; Bennett and Binet 1956; Charlesworth 1991; Barton and Gale 1993). In this article, we revisit our outbred canine pedigree by estimating the extent of zygotic disequilibria throughout the canine genome. Although zygotic disequilibria have been theoretically developed in the literature (see Weir 1996 for an excellent description), these measures have not yet, to the best of our knowledge, been applied to an extensive study of genome structure in a case study. Recently, Yang (2000, 2002) proposed a multilocus zygotic measure for association study in a nonequilibrium population. Yang's two articles present the most thoughtful survey on zygotic disequilibrium analysis. The incorporation of zygotic disequilibrium analysis into genomic research is a necessary first step toward the formulation of an optimal strategy for characterizing genome structure and organization.

## ESTIMATION OF ZYGOTIC DISEQUILIBRIUM

#### Genotype, allele, gamete, and nongamete frequencies:

Suppose that there is a natural or experimental population in which there are two codominant markers, **A** with two alleles *A* and *a* and **B** with two alleles *B* and *b*. Let *p*_{A} and *p*_{a} (*p*_{A} + *p*_{a} = 1) as well as *p*_{B} and *p*_{b} (*p*_{B} + *p*_{b} = 1) be the corresponding allele frequencies. At each of the two loci, four different formations of zygotic genotypes lead to three distinguishable genotypes, *i.e.*, *AA*, *Aa*, and *aa* for marker **A** and *BB*, *Bb*, and *bb* for marker **B**. The two markers form 10 genotypic configurations, but only 9 can be genetically distinguished from each other. This is because the genotypic configurations *AB*/*ab* and *Ab*/*aB* have the same genotype *AaBb*. Let *P*, subscripted and superscripted by the genotype notation, be the genotypic configuration frequencies that are individually tabulated in Table 1. One-marker genotype frequencies can be estimated from the two-marker genotypic configuration frequencies by Equation 1 for marker **A** and Equation 2 for marker **B**, and the allele frequencies can be estimated from the one-marker genotype frequencies by Equation 3.

The two markers form four gametes, *AB*, *Ab*, *aB*, and *ab*, whose frequencies can be estimated from genotypic configuration frequencies by Equation 4. Similarly, the frequencies of nonallelic pairs from different gametes can be estimated by Equation 5. The frequencies of triple alleles from different markers are estimated by Equation 6.
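As a reading aid, the relations referred to as Equations 1 and 3–5 can be written out for marker **A** and the allele pair (*A*, *B*). This is a sketch under an assumed notation, not a transcription of the original displays: $P^{AB}_{ab}$ is taken to denote the frequency of the configuration formed by gametes *AB* and *ab*, and $p_{A/B}$ the frequency of *A* and *B* carried on different gametes. The expressions for marker **B** and for the other allele pairs follow by symmetry.

```latex
\begin{align*}
P_{AA} &= P^{AB}_{AB} + P^{AB}_{Ab} + P^{Ab}_{Ab}, \\
P_{Aa} &= P^{AB}_{aB} + P^{AB}_{ab} + P^{Ab}_{aB} + P^{Ab}_{ab}, \\
P_{aa} &= P^{aB}_{aB} + P^{aB}_{ab} + P^{ab}_{ab}, \\
p_A &= P_{AA} + \tfrac{1}{2} P_{Aa}, \\
p_{AB} &= P^{AB}_{AB} + \tfrac{1}{2}\left(P^{AB}_{Ab} + P^{AB}_{aB} + P^{AB}_{ab}\right), \\
p_{A/B} &= P^{AB}_{AB} + \tfrac{1}{2}\left(P^{AB}_{Ab} + P^{AB}_{aB} + P^{Ab}_{aB}\right).
\end{align*}
```

The gametic and nongametic frequencies differ only in which double-heterozygote configuration they draw on: $P^{AB}_{ab}$ contributes to $p_{AB}$, whereas $P^{Ab}_{aB}$ contributes to $p_{A/B}$.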

#### Complete disequilibrium parameters:

The zygotic disequilibrium is defined as the deviation of two-locus genotype frequencies from products of single-locus genotype frequencies and, thus, is composed of all nonallelic genic disequilibria at the two loci (Weir 1996). Assume that the population considered above is at HWD. This population thus lacks the desirable properties of an equilibrium population, such as independence of different alleles at the same locus (Lynch and Walsh 1998). HWD concerns the association between two alleles at the same locus but on different gametes, whereas (gametic) linkage disequilibrium concerns two alleles on the same gamete but at different loci. Zygotic disequilibrium adds a third kind of association, *i.e.*, that between two alleles on different gametes and at different loci.

Since the population is not in HWE, the two alleles at each marker are not independent, with the coefficients of Hardy–Weinberg disequilibrium, *D*_{A} for marker **A** and *D*_{B} for marker **B**, defined by Equations 7 and 8, respectively. The coefficient of digenic gametic linkage disequilibrium between the two markers, *D*_{ab}, is defined by Equation 9.

For the nonequilibrium population, digenic linkage disequilibrium that occurs between nonalleles on different gametes, *D*_{a/b}, is defined by Equation 10.
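For orientation, the single-locus and digenic coefficients take the following form in the standard treatment of Weir (1996), which is presumably what Equations 7–10 express. The symbols $P_{AA}$, $P_{BB}$, $p_{AB}$, and $p_{A/B}$ (homozygote, gametic, and nongametic frequencies) are notation assumed here for illustration.

```latex
\begin{align*}
D_A &= P_{AA} - p_A^2, & D_B &= P_{BB} - p_B^2, \\
D_{ab} &= p_{AB} - p_A p_B, & D_{a/b} &= p_{A/B} - p_A p_B.
\end{align*}
```

Each coefficient is an observed frequency minus its expectation under independence, so all four vanish in a random-mating population at linkage equilibrium.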

The trigenic disequilibrium between two alleles from marker **A** and one allele from marker **B**, *D*_{Ab}, is defined by Equation 11. The trigenic disequilibrium between one allele from marker **A** and two alleles from marker **B**, *D*_{aB}, is defined by Equation 12.

With genotypic configuration frequencies, allele frequencies, HWD, gametic and nongametic disequilibria, and trigenic disequilibria, we can estimate the quadrigenic disequilibrium (*D*_{AB}) between two alleles from marker **A** and two alleles from marker **B** using the formulas given in Table 2 (see Weir 1996). Note that we use lower- and uppercase letters to denote gametic and zygotic disequilibria, respectively. From Table 2, we can see that each of the genotypic configuration frequencies can be expressed in terms of the allele frequencies (*p*_{A}, *p*_{a} and *p*_{B}, *p*_{b}), HWD coefficients (*D*_{A} and *D*_{B}), and gametic (*D*_{ab}) and nongametic disequilibria of different orders (*D*_{a/b}, *D*_{Ab}, *D*_{aB}, and *D*_{AB}).

#### Composite zygotic disequilibria:

The 10 genotypic configurations have nine independent frequencies, which are specified by two allele frequencies (one for each marker) and the seven disequilibrium parameters defined above. But since the two configurations of the double heterozygote cannot be separated in practice, it is not possible to estimate all of these frequencies and disequilibrium parameters. To solve this problem, Weir (1996) suggested a set of composite disequilibrium coefficients. These include the digenic disequilibrium measured by the sum of the gametic and nongametic coefficients, *i.e.*, Δ_{ab} = *D*_{ab} + *D*_{a/b} (13). As shown by Equations 9 and 10, Δ_{ab} includes the sum of the gamete (*p*_{AB}) and nongamete (*p*_{A/B}) frequencies. On the basis of the definitions of these two frequencies (Equations 4 and 5), Δ_{ab} requires only the sum of the two configuration frequencies of the double heterozygote (*AB*/*ab* and *Ab*/*aB*). Thus, Δ_{ab} can be estimated directly from observable genotype frequencies. Weir (1996) also defined a composite quadrigenic disequilibrium, Δ_{AB} (Equation 14), which can likewise be measured from genotype frequencies.
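The point that the composite digenic coefficient needs only observable genotype counts can be illustrated with a short sketch. It assumes the standard single-locus and digenic definitions (frequency minus product of allele frequencies), under which the sum of the gametic and nongametic frequencies reduces to 2*P*_{AABB} + *P*_{AABb} + *P*_{AaBB} + *P*_{AaBb}/2. The function name and the 3 × 3 `counts` layout are hypothetical choices made for this example.

```python
def composite_digenic(counts):
    """Composite digenic disequilibrium from a 3x3 table of genotype counts.

    counts[u][v] is the number of individuals carrying u copies of allele A
    and v copies of allele B (u, v in {0, 1, 2}).
    """
    n = sum(sum(row) for row in counts)
    # Allele frequencies: every individual carries two alleles per marker.
    p_A = sum(u * counts[u][v] for u in range(3) for v in range(3)) / (2 * n)
    p_B = sum(v * counts[u][v] for u in range(3) for v in range(3)) / (2 * n)
    # Gametic plus nongametic frequency of (A, B); the double heterozygote
    # enters only as a pooled class, so its phase is never needed.
    joint = (2 * counts[2][2] + counts[2][1] + counts[1][2]
             + 0.5 * counts[1][1]) / n
    return joint - 2 * p_A * p_B
```

For a sample in exact Hardy–Weinberg and linkage equilibrium the function returns zero, and for a sample made up only of *AABB* and *aabb* individuals in equal numbers it returns 0.5.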

The two composite digenic and quadrigenic disequilibria can make it possible to estimate the parameters on the basis of observable genotype frequencies rather than unobservable configuration frequencies. Table 3 tabulates the compositions of the composite quadrigenic disequilibrium in terms of genotype and allele frequencies and the coefficients of disequilibria with lower orders (see also Weir and Cockerham 1989).

#### Estimates and tests:

Two markers **A** and **B** are observed for a population of size *n* with nine genotypes listed in Table 1. Let *u* and *v* denote the marker genotypes, *u* = 2 for *AA*, 1 for *Aa*, and 0 for *aa* and *v* = 2 for *BB*, 1 for *Bb*, and 0 for *bb*. The multinomial log-likelihood of the genotype frequencies given marker observations is given by Equation 15, which yields the MLEs of the genotype frequencies (Equation 16). On the basis of the estimated genotype frequencies, the allele frequencies for the two markers (*p*_{A} and *p*_{B}), the HWD coefficients (*D*_{A} and *D*_{B}), the composite digenic disequilibrium (Δ_{ab}), two trigenic disequilibria (*D*_{Ab} and *D*_{aB}), and the composite quadrigenic disequilibrium (Δ_{AB}) can be estimated.
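A minimal sketch of the first estimation steps follows. It assumes that the multinomial MLE of each genotype frequency is the observed proportion and that the HWD coefficient has the standard form (homozygote frequency minus squared allele frequency). The function name, the dictionary keys, and the `counts[u][v]` layout are illustrative choices, not notation from the original text.

```python
def estimate_single_locus(counts):
    """MLEs of genotype frequencies, allele frequencies, and HWD coefficients.

    counts[u][v] is the number of individuals with u copies of allele A and
    v copies of allele B (u, v in {0, 1, 2}).
    """
    n = sum(sum(row) for row in counts)
    # Multinomial MLE: observed proportion of each two-marker genotype.
    freqs = [[c / n for c in row] for row in counts]
    # One-marker genotype frequencies are marginal sums of the 3x3 table.
    P_AA, P_Aa = sum(freqs[2]), sum(freqs[1])
    P_BB = sum(row[2] for row in freqs)
    P_Bb = sum(row[1] for row in freqs)
    p_A = P_AA + 0.5 * P_Aa
    p_B = P_BB + 0.5 * P_Bb
    return {
        "freqs": freqs,
        "p_A": p_A,
        "p_B": p_B,
        "D_A": P_AA - p_A ** 2,  # assumed standard HWD definition
        "D_B": P_BB - p_B ** 2,
    }
```

The higher-order coefficients would be obtained from the same frequency table in the same plug-in fashion.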

Each of these disequilibria should be tested for its significance. The hypotheses for testing HWD are formulated as H_{0}: *D*_{A} = 0 *vs.* H_{1}: *D*_{A} ≠ 0 (17) and H_{0}: *D*_{B} = 0 *vs.* H_{1}: *D*_{B} ≠ 0 (18) for the two markers, respectively. The hypotheses for testing each of the zygotic disequilibria between the two markers are H_{0}: Δ_{ab} = 0 (19), H_{0}: *D*_{Ab} = 0 (20), H_{0}: *D*_{aB} = 0 (21), and H_{0}: Δ_{AB} = 0 (22), each against the alternative that the coefficient is nonzero. For each of hypotheses 17–22, we calculate the likelihoods under H_{0} and H_{1}, from which the log-likelihood ratio (LR) is calculated. The LR test statistic asymptotically follows a χ^{2}-distribution with 1 d.f.

The likelihoods for testing HWD on the basis of hypotheses (17) and (18) can be calculated from marginal totals of one-marker genotype frequencies and observations separately for markers **A** and **B**, respectively. For these two hypotheses, allele frequencies under H_{0} can be estimated with a closed form and, thus, no EM algorithm is needed for computation. However, for the tests of hypotheses (19–22), parameter estimation under H_{0} needs the implementation of numerical algorithms, like the Newton–Raphson method, because the number of unknown parameters to be estimated is less than the number of genotype frequencies. It is also possible to test whether all the disequilibrium coefficients are together equal to zero. The parameters that need to be estimated under H_{0}: Δ_{ab} = *D*_{Ab} = *D*_{aB} = Δ_{AB} = 0, include allele frequencies and HWD coefficients that can be estimated with a closed form. The LR value for this hypothesis should asymptotically follow the χ^{2}-distribution with 4 d.f.
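The closed-form case can be sketched for the HWD test at a single marker. This is an illustration under stated assumptions rather than the authors' implementation: H_{1} uses the observed genotype proportions, H_{0} uses Hardy–Weinberg proportions built from the allele frequency estimated by gene counting, and the 1-d.f. χ^{2} tail probability is computed with the identity P(χ^{2}_{1} > x) = erfc(√(x/2)). The function name is hypothetical.

```python
import math


def hwd_lr_test(n_AA, n_Aa, n_aa):
    """Likelihood-ratio test of H0: D_A = 0 from one-marker genotype counts."""
    counts = (n_AA, n_Aa, n_aa)
    n = sum(counts)
    p = (2 * n_AA + n_Aa) / (2 * n)  # closed-form allele frequency under H0
    q = 1.0 - p
    probs_h0 = (p * p, 2 * p * q, q * q)
    probs_h1 = tuple(c / n for c in counts)

    def loglik(probs):
        # Empty genotype classes contribute nothing to the log-likelihood.
        return sum(c * math.log(pr) for c, pr in zip(counts, probs) if c > 0)

    lr = max(2.0 * (loglik(probs_h1) - loglik(probs_h0)), 0.0)
    p_value = math.erfc(math.sqrt(lr / 2.0))  # chi-square tail, 1 d.f.
    return lr, p_value
```

Counts in exact Hardy–Weinberg proportions, such as (25, 50, 25), give a statistic of zero, whereas a sample with no heterozygotes, such as (50, 0, 50), gives a very large statistic.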

Alternatively, each of hypotheses 17–22 can be tested by calculating the test statistic $X^2 = \hat{D}^2 / \widehat{\mathrm{Var}}(\hat{D})$, where $\hat{D}$ denotes the estimate of the disequilibrium coefficient and $\widehat{\mathrm{Var}}(\hat{D})$ is the sampling variance of the estimate, calculated by formulas given in Weir (1996). This test statistic is asymptotically χ^{2}-distributed with 1 d.f.
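For the HWD coefficient of a biallelic marker, this kind of statistic can be sketched as below. The variance used here, *p*^{2}(1 − *p*)^{2}/*n* under H_{0}, is the usual large-sample result and makes the statistic identical to the goodness-of-fit χ^{2} for Hardy–Weinberg proportions. It is an assumption of this example; the variance formulas for the higher-order coefficients are those of Weir (1996) and are not reproduced. The function name is hypothetical.

```python
def hwd_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic for H0: D_A = 0 at a polymorphic biallelic marker."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    d_hat = n_AA / n - p * p  # estimated HWD coefficient
    # Assumed large-sample variance of d_hat under H0; it is zero for a
    # monomorphic marker (p = 0 or 1), which this sketch does not handle.
    var_h0 = (p * (1.0 - p)) ** 2 / n
    return d_hat ** 2 / var_h0
```

With counts (50, 0, 50) the estimate is 0.25 and the statistic equals the sample size, 100.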

#### Bounds and normalization:

To make zygotic disequilibria comparable between different studies, the estimates of disequilibria should be normalized. Lewontin (1964) proposed a standardized approach by expressing linkage disequilibrium as a proportion of its most extreme possible value. The resulting measure thus lies between 0 (for linkage equilibrium) and ±1 (for complete linkage disequilibrium). A similar idea was used by Weir and Cockerham (1989) to derive bounds for trigenic and quadrigenic disequilibria for zygotic nonequilibrium analysis. More recently, Zaykin (2004) and Hamilton and Cole (2004) independently proposed algebraically equivalent bounds for a composite measure of gametic linkage disequilibrium. The bound for the composite zygotic disequilibrium has not been provided thus far. In the appendix, we provide bounds and normalized measures for all six disequilibria, *D*_{A}, *D*_{B}, Δ_{ab}, *D*_{Ab}, *D*_{aB}, and Δ_{AB}, for zygotic disequilibrium analysis. These bounds for the first five disequilibria are consistent with those published in Weir and Cockerham (1989), Zaykin (2004), and Hamilton and Cole (2004).
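The normalization idea can be illustrated for the simplest coefficient, the HWD coefficient at one marker. The sketch assumes the standard definition *D*_{A} = *P*_{AA} − *p*_{A}^{2}, from which non-negativity of the three genotype frequencies gives the bounds max(−*p*_{A}^{2}, −*p*_{a}^{2}) ≤ *D*_{A} ≤ *p*_{A}*p*_{a}. Dividing by the bound on the relevant side follows Lewontin's approach; it is not necessarily the exact normalized measure defined in the appendix. The function name is hypothetical.

```python
def normalized_hwd(p_A, D_A):
    """Scale an HWD coefficient to [-1, 1] by its most extreme attainable value."""
    p_a = 1.0 - p_A
    upper = p_A * p_a                  # reached when there are no heterozygotes
    lower_abs = min(p_A ** 2, p_a ** 2)  # |lower bound|, one homozygote absent
    if D_A >= 0:
        return D_A / upper if upper > 0 else 0.0
    return D_A / lower_abs if lower_abs > 0 else 0.0
```

A value of 1 then corresponds to a complete absence of heterozygotes and a value of −1 to the complete absence of the rarer homozygote.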

## MATERIALS

A canine pedigree was developed to map QTL responsible for canine hip dysplasia (CHD) using molecular markers. Seven founding greyhounds and six founding Labrador retrievers were intercrossed, followed by backcrossing the F_{1}'s to the greyhounds and Labrador retrievers and intercrossing the F_{1}'s. A series of subsequent intercrosses among the progeny at different generation levels led to a complex network pedigree structure (Figure 1), which maximized phenotypic ranges in CHD-related quantitative traits and the chance to detect substantial linkage disequilibria (Todhunter *et al.* 1999, 2003a,b; Bliss *et al.* 2002). A total of 148 dogs from this structured pedigree were chosen for genetic analyses. This set of samples would not be appropriate for traditional gametic linkage disequilibrium analysis because the population is not randomly mating. Lou *et al.* (2003) estimated gametic linkage disequilibria for this pedigree on the premise that the pedigree was originally derived from multiple unrelated founders. Although the resulting conclusions are consistent with the evolutionary history of dogs, Lou *et al.*'s analysis can be improved by estimating and testing the chromosomal distribution of zygotic disequilibria, as is done in this study.

For the sampled dogs from the structured pedigree, 247 microsatellite markers distributed on 38 pairs of autosomes and 1 pair of sex chromosomes were genotyped to construct a linkage map for the canine genome, which displays good coverage of each chromosome (Mellersh *et al.* 1997, 2000; Breen *et al.* 2001; Richman *et al.* 2001). The recombination fractions between different markers were estimated for segregating families and converted to genetic distances in centimorgans on the basis of a map function. The average genetic distances between two adjacent markers on each chromosome are listed in Table 4 (Breen *et al.* 2001).
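The text does not state which map function was used. As an illustration only, the two most common choices, Haldane's and Kosambi's, convert a recombination fraction *r* (0 ≤ *r* < 0.5) to centimorgans as sketched below; the function names are hypothetical.

```python
import math


def haldane_cM(r):
    """Haldane map distance in cM (assumes no crossover interference)."""
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cM(r):
    """Kosambi map distance in cM (allows for some crossover interference)."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))
```

For small *r* both functions give approximately 100*r* cM, and for larger *r* the Haldane distance exceeds the Kosambi distance.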

## RESULTS

The microsatellite markers genotyped display high heterozygosity in the dog pedigree, with the number of alleles at a marker ranging from 2 to 11 (Todhunter *et al.* 2003b). The multiple alleles of each microsatellite marker are collapsed into two categories, the most frequent allele *vs.* all remaining alleles pooled. Thus, the simple biallelic model developed above can be used directly to analyze the extent and distribution of zygotic disequilibria throughout the canine genome.
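The collapsing step can be sketched as follows. The input format (a list of allele pairs per individual) and the function name are assumptions of this example; missing genotypes and ties for the most frequent allele are not handled.

```python
from collections import Counter


def collapse_to_biallelic(genotypes):
    """Recode multiallelic genotypes as copies (0, 1, or 2) of the major allele.

    genotypes is a list of (allele_1, allele_2) pairs, one per individual.
    """
    allele_counts = Counter(a for pair in genotypes for a in pair)
    major, _ = allele_counts.most_common(1)[0]
    # All alleles other than the most frequent one are pooled into one class.
    codes = [sum(1 for a in pair if a == major) for pair in genotypes]
    return major, codes
```

For example, with microsatellite allele sizes `[(150, 150), (150, 154), (154, 158), (150, 158)]`, allele 150 is the most frequent and the individuals are recoded as 2, 1, 0, and 1 copies, which matches the *u* and *v* genotype coding used for the biallelic model.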

The zygotic disequilibria that describe the association between two different markers in a nonequilibrium population, such as the canine pedigree used in this study, were estimated and tested for each pair of markers located on the same chromosome. The zygotic associations were partitioned into Hardy–Weinberg disequilibria at each locus (*D*_{A} and *D*_{B}), composite digenic disequilibrium involving two alleles, each from a different locus (Δ_{ab}), trigenic disequilibria involving a zygote at one locus and an allele at the other (*D*_{Ab} or *D*_{aB}), and composite quadrigenic disequilibrium involving two zygotes, each from a different locus (Δ_{AB}). All these disequilibrium coefficients were normalized using the procedure described in the appendix, and all comparisons are based on the normalized coefficients.

Overall, 28% of the markers genotyped were observed to deviate from HWE, although this proportion showed considerable interchromosomal variation, ranging from 0 (chromosomes 26, 29, 34, 36, and 38) to 100% (sex chromosome) (Table 4). Of the four types of dilocus disequilibria, Δ_{ab} has the greatest impact on zygotic associations because its estimates are generally much larger than those of the other disequilibrium types. Furthermore, this disequilibrium, as well as the composite quadrigenic disequilibrium, has larger normalized values than the other types (Figure 2). Overall, the largest percentage of marker pairs is significant for Δ_{ab} (61%), followed by trigenic disequilibria *D*_{Ab} (23%) and *D*_{aB} (19%) and composite quadrigenic disequilibrium Δ_{AB} (22%). The percentages of marker pairs that exhibit significant associations vary among different chromosomes (Table 4).

Figure 2 illustrates the patterns of the relationship between zygotic disequilibria, Δ_{ab}, *D*_{Ab}, *D*_{aB}, and Δ_{AB}, and genetic distances, all exhibiting a trend of decay with increased map distance. All the types of zygotic disequilibria occur more frequently between pairs of markers separated by <40 cM than between those separated by >40 cM. As compared with *D*_{Ab} and *D*_{aB}, Δ_{ab} and Δ_{AB} tend to extend within a broader region of the canine genome. Both Δ_{ab} and Δ_{AB} decay with map distance, to a greater extent for the former than for the latter.

Each of the four types of zygotic association was plotted against the map distance separately for individual chromosomes (Figures 3–6). Although the data are sparse, a general trend can be observed for the extent of zygotic disequilibria; *i.e.*, whereas the distributions of *D*_{Ab} and *D*_{aB} follow a similar pattern among different chromosomes, there is substantial interchromosomal variation in the extent and distribution of Δ_{ab} and Δ_{AB} over the canine genome.

## MONTE CARLO SIMULATION

To the best of our knowledge, this is the first study of the distribution of zygotic disequilibrium across the genome in a nonequilibrium population. Because most current linkage disequilibrium analyses are based on gametic associations without a test for zygotic disequilibria, we performed a reciprocal simulation study to examine the influence of such analyses on the power of the disequilibrium test in a nonequilibrium population. In this reciprocal design, data are simulated under the zygotic and gametic disequilibrium models in turn, and each simulated data set is analyzed separately by both models.

#### Simulated data by the zygotic model:

Table 5 lists four simulation designs, in each of which all types of associations occur for an assumed nonequilibrium population. The four designs differ in the allocation pattern of zygotic associations. In designs 1 and 2, a large composite digenic disequilibrium is contributed mainly by gametic or nongametic disequilibrium, respectively. Designs 3 and 4 are constructed to have a large trigenic and a large quadrigenic disequilibrium, respectively. The sample size is 150, mimicking the canine example used above. The simulated data are analyzed by both the gametic and the zygotic disequilibrium models. The simulation under each design is repeated 200 times to calculate the precision of parameter estimation and the statistical power of disequilibrium detection. The results from this simulation study (Table 6) are summarized as follows:

1. The zygotic disequilibrium model provides reasonable estimates of every type of disequilibrium and shows great power to detect disequilibria for a nonequilibrium population under simulation.

2. As expected, the gametic linkage disequilibrium model can estimate only gametic linkage disequilibrium, but when applied to a nonequilibrium population, its estimate of this parameter is largely biased. In fact, the gametic model tends to estimate the composite of gametic and nongametic disequilibria when both exist, but its estimation precision is very poor. If the composite digenic disequilibrium is mainly due to the nongametic disequilibrium (design 2), the gametic disequilibrium model cannot be used, given its large estimation error.

3. The gametic disequilibrium model can accurately estimate allele frequencies, but cannot provide precise estimation of these parameters.

The second and third findings indicate that gametic disequilibrium analysis should never be used for a nonequilibrium population and that the test for zygotic disequilibrium is always crucial before gametic disequilibrium analysis is used.
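The replication scheme described above (samples of 150 individuals, 200 replicates per design) can be sketched as below. The design values of Table 5 are not reproduced here, so `freqs` stands for any hypothetical 3 × 3 table of two-marker genotype frequencies, and `estimator` for any function that maps a table of counts to an estimate. All names are illustrative.

```python
import random


def simulate_counts(freqs, n, rng):
    """Draw one multinomial sample of n two-marker genotypes.

    freqs[u][v] is the assumed frequency of the genotype with u copies of
    allele A and v copies of allele B; the result is a 3x3 table of counts.
    """
    cells = [(u, v) for u in range(3) for v in range(3)]
    weights = [freqs[u][v] for u, v in cells]
    counts = [[0] * 3 for _ in range(3)]
    for u, v in rng.choices(cells, weights=weights, k=n):
        counts[u][v] += 1
    return counts


def replicate(freqs, estimator, n=150, reps=200, seed=1):
    """Apply an estimator to repeated simulated samples of size n."""
    rng = random.Random(seed)
    return [estimator(simulate_counts(freqs, n, rng)) for _ in range(reps)]
```

The mean and standard deviation of the returned estimates would summarize estimation precision, and the proportion of replicates in which a test rejects H_{0} would estimate power.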

#### Simulated data by gametic model:

As a follow-up, we simulated the data for an equilibrium population by a gametic linkage disequilibrium model. The simulated data were analyzed by both the zygotic and the gametic models (Table 7). It can be seen that the zygotic model estimates the coefficient of linkage disequilibrium as precisely as the gametic model. The result from this simulation indicates that the zygotic model can reliably estimate the degree of linkage disequilibrium for an equilibrium population. In conjunction with the results from the simulation by the zygotic disequilibrium model, it is concluded that the zygotic model is more general than the gametic model.

## DISCUSSION

The characterization of the architecture of linkage disequilibrium in the genome is an area of explosive recent growth (Farnir *et al.* 2000; Remington *et al.* 2001; Ardlie *et al.* 2002; Hyun *et al.* 2003; Lou *et al.* 2003; Sutter and Ostrander 2004; Sutter *et al.* 2004; Lindblad-Toh *et al.* 2005) because the positional cloning of genes underlying common complex diseases relies on the identification of linkage disequilibrium between genetic markers and disease. Traditional linkage disequilibrium is defined as the nonrandom association between alleles at different loci in gametes or haplotypes. The estimation of such gametic linkage disequilibrium between different loci requires the assumption that the population under consideration is randomly mating, following HWE. However, for many nonequilibrium populations that are founded by a small number of ancestors and/or are frequently under evolutionary pressure, such as mutation, genetic drift, and population admixture and structure, or under artificial selection (Lynch and Walsh 1998), HWE may be violated and, therefore, a new analysis that relaxes the random-mating assumption should be formulated. Weir (1996) introduced the concept of zygotic association or zygotic disequilibrium that can characterize the disequilibria between different loci in a nonequilibrium population. Recently, Yang (2000, 2002) proposed a multilocus statistic to examine zygotic associations in nonequilibrium populations. Different disequilibria due to a single locus or multiple loci can be summarized in such a statistic.

In a multigenerational canine pedigree constructed by several founders (Todhunter *et al.* 1999), individual dogs are related to each other and, thus, sampled dogs from this pedigree violate the HWE assumption due to inbreeding. For this reason, zygotic disequilibrium should be more appropriate for this related pedigree to investigate the extent and distribution of associations throughout the canine genome. We found extensive linkage disequilibria in a broad region of chromosomes (≥40 cM), as compared with the human genome, even for the most isolated human populations (Hall *et al.* 2002; Varilo *et al.* 2003; Tenesa *et al.* 2004). This finding seems to be comparable with those of earlier linkage disequilibrium studies of purebred dogs (Hyun *et al.* 2003; Sutter *et al.* 2004). The extent of linkage disequilibrium across the chromosomes was also investigated for the same data set by the gametic linkage disequilibrium model (Lou *et al.* 2003). Although the results of the two models are broadly in agreement, the linkage disequilibrium detected by the zygotic model seems to be distributed more extensively over the genome than that detected previously by the gametic model. Given the finding from the simulation, the gametic model tends to estimate a combined gametic and nongametic linkage disequilibrium, *i.e.*, composite digenic disequilibrium, and, therefore, to provide a biased estimate of gametic linkage disequilibrium especially when a large nongametic linkage disequilibrium exists. The extensive distribution of linkage disequilibrium in the canine genome detected by the zygotic model suggests that a relatively small number of markers will be required for whole-genome association mapping in dogs. However, an optimal number of markers should be determined separately for individual chromosomes, because the extent of linkage disequilibrium shows substantial interchromosomal variation. 
Historically, different degrees of selection pressure may have operated on different chromosomes, causing interchromosomal differentiation in the extent of linkage disequilibrium (Sutter and Ostrander 2004; Ostrander and Wayne 2005; Parker and Ostrander 2005).

The most significant contribution of this article may lie in the first systematic use of a zygotic disequilibrium analysis to characterize the extent of disequilibrium for a nonequilibrium population of canines, although the conclusions obtained from our analysis may apply only to the specific canine pedigree used, in which individual dogs are related to different extents. The simulation analyses suggest that the concept of zygotic disequilibrium can be readily applied to other population genetic studies. They also indicate that the popular gametic linkage disequilibrium analysis, when employed to understand the genetic structure of a population at HWD, should be used with caution because its results can be misleading. The zygotic disequilibrium model, which does not rely on the assumption of random mating, has great power to detect various types of disequilibrium of different orders. Therefore, the zygotic disequilibrium model can be regarded as encompassing the gametic disequilibrium model in practical population genetic studies.

In this study, the zygotic disequilibrium model, mostly modified from Weir (1996), was formulated for biallelic markers, although the data from the canine genetic project are multiallelic microsatellites. Given the modest sample size used, it is more reasonable to collapse multiple alleles into two allelic classes than to directly use the multiallelic zygotic model, because collapsing reduces the number of parameters to be estimated. Also, with the development of high-throughput technologies for single-nucleotide polymorphism (SNP) markers, the biallelic model will be useful for analyzing the genetic architecture of zygotic disequilibria over the entire genome for any nonequilibrium or isolated population, including humans and agriculturally important species. However, when the sample size is sufficiently large, the multiallelic model, in which the number of disequilibrium parameters increases rapidly with the number of alleles, will be more informative than the biallelic model based on the collapsing of alleles. Technically, it is straightforward, although tedious, to model zygotic disequilibria with multiallelic markers. For example, consider two triallelic markers that each form six distinguishable genotypes. A total of 35 independent genotype frequencies for these two markers contain four allele frequencies, six HWD coefficients, four composite digenic disequilibria, 12 trigenic disequilibria, and nine composite quadrigenic disequilibria. Also, our zygotic model can be readily extended to handle three biallelic markers at the same time, as seen in Yang (2000, 2002). With these extensions and modifications, the zygotic disequilibrium analysis will provide a routine tool for obtaining an overall picture of disequilibria across the genome. The results obtained from the zygotic disequilibrium model, like those for canine genetics in this study, will have important implications for the gene mapping of complex traits.
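The parameter count for the triallelic example can be checked with a small sketch. The general formulas for two markers with *k* alleles each are an extrapolation made for this example, not taken from the text: they assume one coefficient per independent allele or heterozygote class at each locus, and they reproduce the counts quoted for *k* = 3 as well as the eight parameters of the biallelic model. The function name and dictionary keys are hypothetical.

```python
def zygotic_parameter_counts(k):
    """Count parameters for two markers with k alleles each (assumed scheme)."""
    het_classes = k * (k - 1) // 2  # heterozygote classes at one locus
    counts = {
        "allele_frequencies": 2 * (k - 1),
        "hwd": 2 * het_classes,
        "composite_digenic": (k - 1) ** 2,
        "trigenic": 2 * het_classes * (k - 1),
        "composite_quadrigenic": het_classes ** 2,
    }
    counts["total"] = sum(counts.values())
    return counts
```

Under this scheme the total always equals the number of independent two-marker genotype frequencies, (*k*(*k* + 1)/2)^{2} − 1, which is 35 for *k* = 3 and grows as a polynomial of degree four in *k*.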

## APPENDIX

In what follows, we derive the ranges of the disequilibrium parameters for a nonequilibrium population and define the normalized zygotic disequilibria in the same way as for gametic LD (Lewontin 1964, 1988). On the basis of Equations 7 and 8, the ranges of the HWD coefficients are expressed as [equation not recovered] for marker **A** and [equation not recovered] for marker **B**.

For the composite digenic disequilibrium, the range is derived, on the basis of Equations 9, 10, and 13, as [equation not recovered], where *A* = 2*p*_{A}*p*_{b}, *B* = 2*p*_{a}*p*_{b}, *C* = *p*_{A}^{2}*p*_{b} + *p*_{a}^{2}*p*_{B} + *p*_{A}*p*_{a}, *E* = 2*p*_{A}*p*_{B}, and *F* = 2*p*_{a}*p*_{b} [remaining definitions not recovered]. The normalized Δ_{ab} is defined as [equation not recovered].

On the basis of Equations 11 and 12, the two trigenic disequilibria have ranges expressed, respectively, as [equation not recovered], where *A* = 2*p*_{A}*p*_{a}*p*_{b}, *H* = *p*_{A}*p*_{B}, and *I* = *p*_{a}*p*_{B} [remaining definitions not recovered], and [equation not recovered], where *A*′ = 2*p*_{a}*p*_{B}*p*_{b} [remaining definitions not recovered]. The normalized *D*_{Ab} and *D*_{aB} are defined, respectively, as [equation not recovered] and [equation not recovered].

On the basis of Table 3, the range of the composite quadrigenic disequilibrium is expressed as [equation not recovered; definitions of its constants not recovered]. The normalized Δ_{AB} is defined as [equation not recovered].

## Acknowledgments

We thank Dmitri Zaykin and an anonymous reviewer for clarifying the concept of zygotic association and providing other constructive comments. The preparation of this manuscript was supported by a grant from the Morris Animal Foundation, National Institutes of Health (NIH) AR36554, the Consolidated Research Grant Program, the Cornell Advanced Technology Biotechnology Program, Nestle Purina, Marshfield Medical Research Foundation (Marshfield, WI), and Cornell University College of Veterinary Medicine unrestricted alumni funds, NIH R01 NS041670 and National Science Foundation 0540745.

## Footnotes

Communicating editor: M. K. Uyenoyama

- Received April 28, 2006.
- Accepted July 1, 2006.

- Copyright © 2006 by the Genetics Society of America