Abstract
A Bayesian method for determining if there are large departures from independence between pairs of alleles at a locus, HardyWeinberg equilibrium (HWE), is presented. We endorse the view that a population will never be exactly in HWE and that there will be occasions when there is a need for an alternative to the usual hypothesistesting setting. Bayesian methods provide such an alternative, and our approach differs from previous Bayesian treatments in using the disequilibrium and inbreeding coefficient parameterizations. These are easily interpretable but may be less mathematically tractable than other parameterizations. We examined the posterior distributions of our parameters for evidence that departures from HWE were large. For either parameterization, when a conjugate prior was used, the prior probability for small departures was itself small, i.e., the prior was weighted against small departures from independence. We could avoid this uneven weighting by using a step prior which gave equal weighting to both small and large departures from HWE. In most cases, the Bayesian methodology makes it clear that there are not enough data to draw a conclusion.
ASSESSING independence of alleles within loci, HardyWeinberg equilibrium (HWE), has long been of interest to population geneticists in a variety of contexts. In evolutionary theory, nonindependence is a starting point for determining what forces may be acting to affect variation or changes in allele frequencies (Allardet al. 1972; Clegget al. 1972); in forensic science, independence is assumed by the product rule for estimating genotypic frequencies (National Research Council 1996). Studies of departures from independence at single loci go back at least as far as Hardy (1908) and Weinberg (1908).
There are many tests for independence of alleles within loci using classic methods: for reviews see Emigh (1980), Hernández and Weir (1989), and Weir (1996). Bayesian approaches to HWE have been given by Pereira and Rogatko (1984), Lindley (1988), Chow and Fong (1992), and Ayres and Balding (1998). We were motivated to consider a Bayesian approach for several reasons. First, this approach allows us to determine the degree of departure from HWE. The classic approach addresses this question indirectly by testing the null hypothesis of zero departure. Second, in describing departures from HWE, we are not directly interested in the allele frequencies; however, they are necessary to define some measure of disequilibrium, which is of primary interest. Bayesian methods provide a convenient way to deal with parameters like allele frequencies, called nuisance parameters, by accounting for their different possible values. Third, it is sometimes the case that we have a set of data in which we observe zero, or very few, individuals with certain genotypes. Classic methods may have trouble accommodating these situations and Bayesian methods offer an alternative to exact tests (e.g., Zaykinet al. 1995).
This article presents a Bayesian approach to HWE based on the disequilibrium parameter and on the withinpopulation inbreeding coefficient. These parameterizations have the advantage of being easy to interpret biologically. The disadvantage is that for some questions they may be less tractable mathematically.
The traditional approach, to which we have contributed previously and that we still employ, is to test the hypothesis of no departure from HWE. We recognize that failure to reject the hypothesis is not equivalent to declaring HWE, and power calculations allow us to quantify likely bounds on the strength of any departure. An attractive alternative is based on the recognition that real populations are not in HWE and that data, when coupled with prior knowledge, lead to probability distributions for the levels of departure. Bayesian methods replace traditional point estimates and tests of hypotheses about HWE departures with these posterior distributions. We use the posterior distributions to find the probabilities that disequilibrium or inbreeding parameters lie inside or outside intervals of interest. Discussions of Bayesian statistics may be found in Bernardo and Smith (1994), Berger (1985), Lee (1989), O'Hagan (1994), and Gelman et al. (1995).
BAYESIAN APPROACH
In this section, we outline a Bayesian approach, and compare it to the classic hypothesistesting framework in which the HWE problem is usually cast. In both Bayesian and classic methods, the parameters of interest, ϕ, are considered fixed and unknown. In addition, the conditional distribution of the data given particular values of the parameters, π(Xϕ), is the same. This distribution has the same form as the likelihood,
Prior distributions may be constructed based on prior information, mathematical convenience, or ignorance about the parameters (Bernardo 1979; Kass and Wasserman 1993; Gelmanet al. 1995). Different prior distributions imply different levels of uncertainty about the parameters, not only when the parameters are considered together (joint prior distribution), but also when the parameters are considered separately (marginal prior distribution), or when one of the parameters is considered conditional on values of the other parameters (conditional prior distribution). It is useful to consider what each distribution implies about the parameters and to decide which one(s) to emphasize when constructing the joint prior distribution. This concept is illustrated below in our choice of prior distributions. There are several ways to report features of the posterior distribution, and thus, there are several ways to compare results from Bayesian and classic approaches.
The interpretation of a Bayesian result is completely different from the classic result to which it is being compared. With that caveat, one could report the posterior probabilities that the parameter lies inside an interval of interest. This probability statement is not interpreted as a long term frequency, but as an uncertainty. The most natural classic result to which to compare the posterior probability is a P value, the probability, under the null hypothesis, of obtaining data that are as extreme or more extreme than the observed result. Another way to summarize the posterior distribution is to specify an interval in which most (95%, for example) of the posterior distribution lies. One way to do this is to calculate a highest posterior density region. The most appropriate comparison to a 95% highest posterior density is a 95% confidence interval. In this article, we calculate the posterior probability that the parameter of interest lies within an interval suggested by the National Research Council (1996). Therefore, the classic result to which we compare our Bayesian result is a P value.
In a Bayesian framework there are no preset critical values. Of course, this statement is also true in a classic setting—the traditional critical values of 0.05 or 0.01 in a hypothesistesting framework are not intrinsically part of the method. They are determined by convention or by the experience of the scientist. If specific cutoff values are desired, one could extend the analysis to a decision theory framework (Berger 1985) by factoring in costs associated with making different decisions.
HARDYWEINBERG EQUILIBRIUM
In a large, randomly mating population in which the evolutionary forces such as selection, migration, and mutation are not acting, allele and genotypic frequencies do not change. Furthermore, they are related in a simple way. At a locus with two alleles, A, B, with frequencies p_{A}, 1 − p_{A}, and genotypes AA, AB, BB with genotypic frequencies P_{AA}, P_{AB}, P_{BB}, the relationship is P_{AA} = p^{2}_{A}, P_{AB} = 2p_{A}(1 − p_{A}), and P_{BB} = (1 − p_{A})^{2}, and the locus is said to exhibit HWE. One generation of random mating is sufficient to produce HWE.
Classic tests for HWE include goodnessoffit, exact, likelihood ratio, and ztests (Haldane 1954; Emigh 1980; Hernández and Weir 1989; Maiste 1993). Simulations show that one of the most powerful tests for HWE is the exact test, particularly when the number of alleles is large (Maiste 1993).
There are various ways of describing departures from HWE. One way is to use a disequilibrium parameter D_{A} (Hernández and Weir 1989):
Another parameterization uses f_{A}, the inbreeding coefficient within populations (Weir 1996),
Two Bayesian studies (Pereira and Rogatko 1984; Lindle 1988) that address departures from HWE at loci with two alleles used a third parameterization
Ayres and Balding (1998) were interested in determining departures from HWE and in evaluating the inbreeding model. They used the f_{A} parameterization and relied upon Markov Chain Monte Carlo methods to sample from the posterior distribution of f_{A}. They treated loci with multiple alleles.
Chow and Fong (1992) were interested in comparing estimators for genotype frequencies under HWE. They compared Bayes' estimators, using different priors, to two classic estimators, the uniformly minimum varianceunbiased estimator and the maximum likelihood estimator. The estimators were compared on the basis of risk; risk functions provide means for evaluating a decision, and require a loss function, which quantifies the consequences of making a decision given the true state of nature (O'Hagan 1994). The estimator with the lowest risk was considered the best estimator. The authors showed that, under a standard loss function, Bayesian estimators corresponding to two of the priors were better than classic estimators if the allele frequency was not near zero or one.
METHODS
Let n_{1}, n_{2}, and n_{3} be the observed numbers of AA, AB, and BB genotypes from a simple random sample. The counts have a multinomial distribution with independent parameters P_{AA}, P_{AB} and sample size n = n_{1} + n_{2} + n_{3}:
We can work directly with the posterior distribution of f_{A} to calculate the probability that f_{A} lies in the region of interest. In contrast, we must work with the joint posterior distribution of D_{A}, p_{A} to find the corresponding probability for D_{A}. In each case, however, we use posterior distributions to determine the posterior probabilities of regions 6 or 7. We take high values of these probabilities as evidence that departures from HWE are no larger than f_{A} = 0.03. If the probabilities are we low, view this as evidence that the departures are not within the NRC range. Probabilities that are neither high nor low will be regarded as indicating that the data provide an insufficient basis for drawing a conclusion.
We examine three prior distributions for each parameterization: one based on mathematical convenience, and two based on the question of interest; i.e., the posterior probabilities of regions 6 or 7.
We will examine first the D_{A} parameterization and then the f_{A} parameterization.
D_{A} parameterization: The first of the three priors we examine is the mathematically convenient Dirichlet prior, which we use because it is conjugate to the multinomial. Use of a conjugate prior results in a posterior that has the same form as the prior. We apply the Dirichlet prior to the genotypic proportions, P_{AA}, and P_{AB}, and then transform to D_{A}, p_{A}. We write the Dirichlet parameters as γ_{i}(i = 1,2,3):
The third prior distribution was constructed so that the joint prior distribution of D_{A}, p_{A} was uniform over each of the three ranges listed in Table 1, with the corresponding weights. We refer to this prior as prior C (Figure 3, top).
f_{A} parameterization: The development of the three priors for this parameterization paralleled that for the D_{A} parameterization. The Dirichlet prior distribution on the genotype frequencies, P_{AA} and P_{AB}, was transformed to f_{A} and p_{A}:
Combining the joint prior distribution with the likelihood, the joint posterior distribution of the genotype proportions, given the data, is Dirichlet (γ_{i} + n_{i}), i = 1,2,3. After the transformation, the joint posterior density is
The second and third priors were constructed so that the prior probability of region 6 was 0.5. For the second prior, prior B, the prior for p_{A} was Beta(α, β), with α = β = 1, as for the D_{A}, p_{A} parameterization, and the prior for f_{A} given p_{A} was uniform over each of the three pieces corresponding to the ranges in Table 1. The joint prior distribution π(f_{A}, p_{A}) is π(p_{A})π(f_{A}p_{A}), and is shown in Figure 5 (top).
The third prior, prior C, was constructed so that the joint prior distribution of f_{A}, p_{A} was uniform over the ranges, with corresponding weights, listed in Table 1. The joint prior distribution of f_{A}, p_{A} is shown in Figure 6 (top).
Under either prior A or prior B, the posterior probabilities of regions 6 and 7 were the same. In the case of prior A, it is because the Dirichlet prior is placed on the genotypes, and not directly on D_{A}, p_{A} or f_{A}, p_{A}. In the case of the B prior, it is because the relationship between D_{A} (or f_{A}) conditional on p_{A} is linear. Despite the identity of the posterior probabilities for specific regions, we think it useful to present both parameterizations: questions may be more conveniently phrased in terms of one of the two parameterizations, depending upon the context. In addition, it is possible to construct priors, such as the C prior, in which the posterior distributions will not be identical, because the priors were constructed by focusing not on the conditional distributions of D_{A} (or f_{A}) given p_{A}, but on the joint distributions of D_{A} (or f_{A}) and p_{A}.
For both parameterizations, we evaluated the posterior distributions by using a midpoint rule with 100,000 points (Stoer and Bulirsch 1980).
RESULTS
It is critical to understand that the prior and posterior distributions reflect uncertainty about the fixed but unknown parameter values. It is not the case that the parameters themselves have a distribution. Any statements regarding prior and posterior distributions or probabilities should be interpreted in this context.
The three different prior distributions describe uncertainty about the parameters D_{A} (or f_{A}) and p_{A} in different ways. Each prior distribution implies information about the uncertainty of D_{A} (or f_{A}) and p_{A} together (joint distribution) or D_{A} (or f_{A}) and p_{A} separately (marginal distribution), or D_{A} (or f_{A}) conditional on a fixed value of p_{A} (conditional distribution). The first prior was chosen for convenience. It places a Dirichlet(1,1,1), i.e., a uniform prior distribution, on the genotypes, which translates to a different prior on parameters of interest (Figures 1 and 4, top). In the D_{A} parameterization, the joint distribution is also uniform over all possible pairs of D_{A} and p_{A} values. This means that if we consider uncertainty about D_{A} and p_{A} together, each possible pair of D_{A}, p_{A} values is equally uncertain. In the case of the f_{A} parameterization, a uniform prior on the genotypes does not translate into a uniform prior on f_{A} and p_{A} (Figure 4, top). Instead, pairs of f_{A}, p_{A} with central values have a higher probability than pairs with extreme values. The marginal prior distributions of D_{A} and p_{A} are symmetric with modes of 0 and 0.5, respectively; the marginal prior distribution of f_{A} implies that, under this prior, large negative values are less certain than small negative values, and that positive values of f_{A} are (approximately) as certain as the smallest negative value. Finally, the prior probability that D_{A} is inside the NRC range is only 0.04, compared to a prior probability of 0.50 when either prior B or C is used. The low prior probability of 0.04 reflects the fact that the first prior is more diffuse, and so the prior probability of any small interval, including the one in which we are interested, is small. This low prior probability makes it difficult to interpret the posterior probability (of regions 6 or 7) because it is weighted against being in the NRC range.
Prior B (Figures 2 and 5, top) and prior C (Figures 3 and 6, top) were constructed so that the probability that D_{A} (or f_{A}) is in the NRC range is equal to the probability that it is outside the NRC range. In addition, under this prior, the probability that pairs of values D_{A}, p_{A} (or f_{A}, p_{A}) are inside the NRC range is equal to the probability that they are outside the NRC range. However, the second prior emphasizes uncertainty about the marginal distribution of p_{A} and the conditional distribution, D_{A}p_{A} (or f_{A}p_{A}). The marginal distribution of p_{A}, which is uniform, implies that each value of p_{A} is equally uncertain. The conditional distribution of D_{A}p_{A} (or f_{A}p_{A}) implies that, for each fixed value of p_{A}, values of D_{A} (or f_{A}) in the NRC range are equally uncertain, values of D_{A} (or f_{A}) below the NRC range are equally uncertain, and values of D_{A} (or f_{A}) above the NRC range are equally uncertain. The third prior emphasizes the joint distribution; it implies that all pairs of D_{A} (or f_{A}) and p_{A} within the NRC range are equally uncertain, all pairs below the NRC range are equally uncertain, and all pairs above the NRC range are equally uncertain.
We applied the above methods to the published genotypic data shown in Table 2. The “FBI” data are given in Budowle et al. (1995) and the “Cellmark” data are provided by Weir et al. (B. S. Weir, J. Storey, R. Cotton, C. Basten, J. Buckleton and J. Morris, unpublished results). Three twoallele loci were used: D7S8 (Gyllensten and Erlich 1988), LDLR (Yamamotoet al. 1984), and GYPA (Siebert and Fukuda 1987).
For locus LDLR in the FBI Caucasian sample, Figures 1, 2 and 3 (bottom) show the posterior distribution of D_{A}, p_{A} given the data, using the three different priors. In Figure 1, since the joint prior on D_{A}, p_{A} was uniform, the posterior represents the transformed likelihood, standardized to integrate to 1. As expected, the joint posterior was in three sections when the prior B or C was used. In these cases (Figures 2 and 3, bottom), the volume in the middle section represents the probability that D_{A} lies in the NRC range. Figures 4, 5 and 6 (bottom) show the analogous posteriors for f_{A} and p_{A}.
Table 3 shows the posterior probabilities that D_{A} or f_{A} are inside the NRC range as well as the P values for an exact test of HWE. As expected, except for very small differences because of the numerical integration, the posterior probabilities were the same for either parameterization when the Dirichlet or prior B was used; thus, we reported only one number in these cases (columns P_{A}, P_{B}). There is a dramatic difference between the posterior probabilities when the mathematically convenient prior was used versus when priors B or C were used. This reflects the very different prior probabilities for the parameters lying within the NRC range—0.04 for the prior A and 0.50 for priors B and C. In no cases were the posterior probabilities for the parameters lying within the NRC range very high, although in a few cases, they approached high values. In one case (LDLR locus, Cellmark African American), the posterior probability approached a low value, which would lend support to the view that the parameters lie outside the NRC range.
DISCUSSION
Bayesian methods are not new to genetics. Animal breeders have used Bayesian methods in assessing response to selection, estimating breeding values and/or variance components (Gianola and Fernando 1986; Sorensen et al. 1994; Wanget al. 1994) and locating markers linked to quantitative trait loci (Hoeschele and VanRaden 1993a,b; Uimariet al. 1996). In population genetics, Bayesian approaches have been used to to estimate allele frequencies (Gunel and Wearden 1995), classify genotypes (Alexanderet al. 1995), and distinguish disomic from tetrasomic inheritance (Olson 1997). Lange (1995) used empirical Bayes methods to estimate allele frequencies in different subpopulations. Bayesian methods have been used in linkage analysis in recombinant inbred lines (Silver and Buckler 1986), pine trees (Geburek and von Wuehlisch 1989) and humans, the latter in the context of risk assessment (Rogatko 1995a,b). In molecular evolution, Bayesian methods for inferring phylogenetic trees have been proposed (Mau and Newton 1997; Yang and Rannala 1997). Our goal was to develop a Bayesian approach to HWE in a framework familiar to population geneticists. Our method differs from those of previous authors Pereira and Rogatko 1984; Lindley 1988; Ayres and Balding 1998) by using the disequilibrium and inbreeding parameterizations, and in defining levels of disequilibrium that are of interest in one context.
The posterior probabilities for prior A and priors B or C were quite different, indicating that the prior had a large effect with the sample sizes for the data we considered. Using prior A, the prior probabilities of the regions 6, 7 were only 0.04, while with priors B or C, the prior probability was 0.50. Therefore, the larger posterior probabilities for the latter priors are not surprising. Because the uniform prior was more diffuse than priors B or C, it weighted against departures from HWE that included only a small part of the parameter space, including the small departures from HWE in which we were interested. It is therefore difficult to draw conclusions from the analysis when the prior A is used. Because priors B and C were chosen to reflect our ignorance of whether or not the NRC recommendation is conservative, the resulting analyses are easier to interpret.
For most of the loci, there were not enough data in Table 1 to draw conclusions. For the LDLR locus in the Hispanic data base, the NRC recommendation may be conservative. In one case (LDLR locus in the Cellmark African American data base), the NRC recommendation may not be conservative. This was the same locus for which we would reject the HWE hypothesis using a classical test. These results do not preclude the use of these loci for forensic calculations because genotype proportions can be substituted for HWE proportions. Note that, under the Bayesian analysis, power calculations are not needed to distinguish the case in which there are not enough data from the case in which the departures are within the NRC range. In addition, we are asking directly the question of interest, i.e., what is the probability of the parameter being in the NRC range, instead of asking the question indirectly via a hypothesis test.
We have presented both the D_{A}, p_{A} and f_{A}, p_{A} parameterizations, because some researchers may find it more helpful to phrase their question in the context of one parameterization or the other. In our case, the NRC recommendation is phrased in terms of the f_{A}, p_{A} parameterization. Although the lower bound of f_{A} depends on the nuisance parameter p_{A}, because of the way the question is posed, we can work directly with the posterior distribution of f_{A}, although we find it more helpful to present graphs for the joint rather than the marginal posterior distribution. In contrast, for the D_{A}, p_{A} parameterization, we cannot work directly with the marginal distribution of D_{A}, and instead must work with the joint posterior distribution of D_{A}, p_{A}. Because we have only two parameters, the methods used to evaluate the joint posterior are not difficult to implement. However, with multiple alleles, the problem could be significant, in which case Markov Chain Monte Carlo methods would be useful (Ayres and Balding 1998).
In summary, we have proposed a Bayesian method for examining departures from HWE based on the disequilibrium and inbreeding parameterizations. The priors based on the conjugate Dirichlet distribution were heavily weighted against small departures from HWE. It is difficult to interpret the posterior probabilities in these cases. We need to find other priors that will more accurately reflect prior information or the question of interest. The priors may be formulated as step priors and depend on the question of interest. In the present context, we based our question on recommendations by the NRC and asked whether there was evidence that the inbreeding (or disequilibrium) parameter lay inside the NRC range. In many of the samples we examined there were not enough data to draw a conclusion about that question.
Acknowledgments
We appreciate the advice of Drs. Ian Evett and John Monahan. The manuscript was improved by comments from Drs. James Curran, Jeff Leips, Chris Triggs, and two anonymous reviewers. Thanks are due to Dmitri Zaykin for use of his program to calculate probabilities for the exact test. This work was supported in part by U.S. Public Health Service grants P01 GM43544 and T32 GM08443.
Footnotes

Communicating editor: A. G. Clark
 Received November 18, 1997.
 Accepted April 27, 1998.
 Copyright © 1998 by the Genetics Society of America