A Bayesian method for determining if there are large departures from independence between pairs of alleles at a locus, Hardy-Weinberg equilibrium (HWE), is presented. We endorse the view that a population will never be exactly in HWE and that there will be occasions when there is a need for an alternative to the usual hypothesis-testing setting. Bayesian methods provide such an alternative, and our approach differs from previous Bayesian treatments in using the disequilibrium and inbreeding coefficient parameterizations. These are easily interpretable but may be less mathematically tractable than other parameterizations. We examined the posterior distributions of our parameters for evidence that departures from HWE were large. For either parameterization, when a conjugate prior was used, the prior probability for small departures was itself small, i.e., the prior was weighted against small departures from independence. We could avoid this uneven weighting by using a step prior which gave equal weighting to both small and large departures from HWE. In most cases, the Bayesian methodology makes it clear that there are not enough data to draw a conclusion.
ASSESSING independence of alleles within loci, Hardy-Weinberg equilibrium (HWE), has long been of interest to population geneticists in a variety of contexts. In evolutionary theory, nonindependence is a starting point for determining what forces may be acting to affect variation or changes in allele frequencies (Allardet al. 1972; Clegget al. 1972); in forensic science, independence is assumed by the product rule for estimating genotypic frequencies (National Research Council 1996). Studies of departures from independence at single loci go back at least as far as Hardy (1908) and Weinberg (1908).
There are many tests for independence of alleles within loci using classic methods: for reviews see Emigh (1980), Hernández and Weir (1989), and Weir (1996). Bayesian approaches to HWE have been given by Pereira and Rogatko (1984), Lindley (1988), Chow and Fong (1992), and Ayres and Balding (1998). We were motivated to consider a Bayesian approach for several reasons. First, this approach allows us to determine the degree of departure from HWE. The classic approach addresses this question indirectly by testing the null hypothesis of zero departure. Second, in describing departures from HWE, we are not directly interested in the allele frequencies; however, they are necessary to define some measure of disequilibrium, which is of primary interest. Bayesian methods provide a convenient way to deal with parameters like allele frequencies, called nuisance parameters, by accounting for their different possible values. Third, it is sometimes the case that we have a set of data in which we observe zero, or very few, individuals with certain genotypes. Classic methods may have trouble accommodating these situations and Bayesian methods offer an alternative to exact tests (e.g., Zaykinet al. 1995).
This article presents a Bayesian approach to HWE based on the disequilibrium parameter and on the within-population inbreeding coefficient. These parameterizations have the advantage of being easy to interpret biologically. The disadvantage is that for some questions they may be less tractable mathematically.
The traditional approach, to which we have contributed previously and that we still employ, is to test the hypothesis of no departure from HWE. We recognize that failure to reject the hypothesis is not equivalent to declaring HWE, and power calculations allow us to quantify likely bounds on the strength of any departure. An attractive alternative is based on the recognition that real populations are not in HWE and that data, when coupled with prior knowledge, lead to probability distributions for the levels of departure. Bayesian methods replace traditional point estimates and tests of hypotheses about HWE departures with these posterior distributions. We use the posterior distributions to find the probabilities that disequilibrium or inbreeding parameters lie inside or outside intervals of interest. Discussions of Bayesian statistics may be found in Bernardo and Smith (1994), Berger (1985), Lee (1989), O'Hagan (1994), and Gelman et al. (1995).
In this section, we outline a Bayesian approach, and compare it to the classic hypothesis-testing framework in which the HWE problem is usually cast. In both Bayesian and classic methods, the parameters of interest, ϕ, are considered fixed and unknown. In addition, the conditional distribution of the data given particular values of the parameters, π(X|ϕ), is the same. This distribution has the same form as the likelihood, . In the HW testing framework, the data are counts, and are assumed framework, the data are counts, and to have a multinomial distribution. Bayesian statistics differ from classic statistics in the mode of inference. In the classic approach, we make statements about values of the parameters from the distribution of the data given the parameters. For example, a P value is a long run probability statement about the data under the null hypothesis. In a Bayesian setting, we make statements about probability distributions on the parameters. All the inference is made according to the posterior distribution on the parameters given the data, π(ϕ X). To make such statements, it is necessary to formally|express knowledge (or uncertainty) about the parameters using a probability distribution. This distribution, π(ϕ), is called the prior distribution. If the prior uncertainty is large, the prior distribution is diffuse. It is critical to understand that by using a prior (and posterior) distribution, we are not stating that we think the parameters themselves have a distribution; the parameters are considered unknown and fixed. Instead, the prior distribution on the parameter represents our uncertainty about the fixed value of the parameter. The posterior distribution of the parameters represents our uncertainty, updated by the data, about the parameter values. Using Bayes' rule, the posterior distribution of the parameters is proportional to the product of the prior distribution and the conditional distribution of the data: (1) To make the two sides equal, the right-hand side of the equation is divided by π(X).
Prior distributions may be constructed based on prior information, mathematical convenience, or ignorance about the parameters (Bernardo 1979; Kass and Wasserman 1993; Gelmanet al. 1995). Different prior distributions imply different levels of uncertainty about the parameters, not only when the parameters are considered together (joint prior distribution), but also when the parameters are considered separately (marginal prior distribution), or when one of the parameters is considered conditional on values of the other parameters (conditional prior distribution). It is useful to consider what each distribution implies about the parameters and to decide which one(s) to emphasize when constructing the joint prior distribution. This concept is illustrated below in our choice of prior distributions. There are several ways to report features of the posterior distribution, and thus, there are several ways to compare results from Bayesian and classic approaches.
The interpretation of a Bayesian result is completely different from the classic result to which it is being compared. With that caveat, one could report the posterior probabilities that the parameter lies inside an interval of interest. This probability statement is not interpreted as a long term frequency, but as an uncertainty. The most natural classic result to which to compare the posterior probability is a P value, the probability, under the null hypothesis, of obtaining data that are as extreme or more extreme than the observed result. Another way to summarize the posterior distribution is to specify an interval in which most (95%, for example) of the posterior distribution lies. One way to do this is to calculate a highest posterior density region. The most appropriate comparison to a 95% highest posterior density is a 95% confidence interval. In this article, we calculate the posterior probability that the parameter of interest lies within an interval suggested by the National Research Council (1996). Therefore, the classic result to which we compare our Bayesian result is a P value.
In a Bayesian framework there are no preset critical values. Of course, this statement is also true in a classic setting—the traditional critical values of 0.05 or 0.01 in a hypothesis-testing framework are not intrinsically part of the method. They are determined by convention or by the experience of the scientist. If specific cutoff values are desired, one could extend the analysis to a decision theory framework (Berger 1985) by factoring in costs associated with making different decisions.
In a large, randomly mating population in which the evolutionary forces such as selection, migration, and mutation are not acting, allele and genotypic frequencies do not change. Furthermore, they are related in a simple way. At a locus with two alleles, A, B, with frequencies pA, 1 − pA, and genotypes AA, AB, BB with genotypic frequencies PAA, PAB, PBB, the relationship is PAA = p2A, PAB = 2pA(1 − pA), and PBB = (1 − pA)2, and the locus is said to exhibit HWE. One generation of random mating is sufficient to produce HWE.
Classic tests for HWE include goodness-of-fit, exact, likelihood ratio, and z-tests (Haldane 1954; Emigh 1980; Hernández and Weir 1989; Maiste 1993). Simulations show that one of the most powerful tests for HWE is the exact test, particularly when the number of alleles is large (Maiste 1993).
There are various ways of describing departures from HWE. One way is to use a disequilibrium parameter DA (Hernández and Weir 1989): (2) where the bounds on DA are functions of the allele frequencies, Note that a negative disequilibrium coefficient corresponds to a deficiency of homozygotes relative to HWE, while a positive DA corresponds to an excess of homozygotes. With m alleles, there are the m(m − 1)/2 independent disequilibrium coefficients.
Another parameterization uses fA, the inbreeding coefficient within populations (Weir 1996), (3) and the bounds on fA are Only the lower bound of fA depends on the allele frequencies. A negative value of fA corresponds to a deficiency of homozygotes, while a positive value reflects an excess of homozygotes. Despite its name, the fA parameterization may be used to describe departures from HWE due to any cause, not only departures due to inbreeding.
Two Bayesian studies (Pereira and Rogatko 1984; Lindle 1988) that address departures from HWE at loci with two alleles used a third parameterization (4) Pereira and Rogatko (1984) framed their analysis in terms of Bayesian analogs to classic estimation, hypothesis testing, and confidence intervals. They used the mathematically convenient Dirichlet prior on genotype frequencies and estimated θ. A Bayes factor, which can be thought of as a likelihood ratio, and Bayesian confidence intervals provided the bases for hypothesis testing. Lindley (1988) considered various priors, both uniform and nonuniform, on two new parameters His formulation has the advantage that α and β have bounds that do not depend on the other, making it straightforward to determine the marginal posterior distribution for each variable separately. The posterior distribution of α and Bayes' factor for α = 0 were used for estimation and to draw conclusions regarding HWE.
Ayres and Balding (1998) were interested in determining departures from HWE and in evaluating the inbreeding model. They used the fA parameterization and relied upon Markov Chain Monte Carlo methods to sample from the posterior distribution of fA. They treated loci with multiple alleles.
Chow and Fong (1992) were interested in comparing estimators for genotype frequencies under HWE. They compared Bayes' estimators, using different priors, to two classic estimators, the uniformly minimum variance-unbiased estimator and the maximum likelihood estimator. The estimators were compared on the basis of risk; risk functions provide means for evaluating a decision, and require a loss function, which quantifies the consequences of making a decision given the true state of nature (O'Hagan 1994). The estimator with the lowest risk was considered the best estimator. The authors showed that, under a standard loss function, Bayesian estimators corresponding to two of the priors were better than classic estimators if the allele frequency was not near zero or one.
Let n1, n2, and n3 be the observed numbers of AA, AB, and BB genotypes from a simple random sample. The counts have a multinomial distribution with independent parameters PAA, PAB and sample size n = n1 + n2 + n3: (5) We assume that the population is never exactly in HWE and ask whether departures from HWE are large enough to be of importance in the relevant context. In this article, a relevant context is forensic science in which the genotype proportions are estimated as products of allele proportions. The United States National Research Council (NRC) report (National Research Council 1996) suggested that fA = 0.03 might be the upper limit of departure from HWE in human populations. Strictly, the report assumes that departures make fA positive, but we will interpret the suggestion to mean |fA| ≤ 0.03. The lower bound must be modified to allow for the dependence on allele frequencies: (6) Expression 6 represents a region of the parameter space, and means that the lower bound of fA is −0.03 unless pA lies outside the interval [0.03, 0.97]. Since DA = pA(1 − pA)fA, the bounds of the disequilibrium parameterization become (7)
We can work directly with the posterior distribution of fA to calculate the probability that fA lies in the region of interest. In contrast, we must work with the joint posterior distribution of DA, pA to find the corresponding probability for DA. In each case, however, we use posterior distributions to determine the posterior probabilities of regions 6 or 7. We take high values of these probabilities as evidence that departures from HWE are no larger than |fA| = 0.03. If the probabilities are we low, view this as evidence that the departures are not within the NRC range. Probabilities that are neither high nor low will be regarded as indicating that the data provide an insufficient basis for drawing a conclusion.
We examine three prior distributions for each parameterization: one based on mathematical convenience, and two based on the question of interest; i.e., the posterior probabilities of regions 6 or 7.
We will examine first the DA parameterization and then the fA parameterization.
DA parameterization: The first of the three priors we examine is the mathematically convenient Dirichlet prior, which we use because it is conjugate to the multinomial. Use of a conjugate prior results in a posterior that has the same form as the prior. We apply the Dirichlet prior to the genotypic proportions, PAA, and PAB, and then transform to DA, pA. We write the Dirichlet parameters as γi(i = 1,2,3): (8) where . The corresponding joint prior distribution of DA and pA is (9) Combining the prior density with the likelihood, the posterior density of the genotype proportions given the data is Dirichlet (γi + ni), i = 1,2,3. After transformation to pA, DA the joint posterior density becomes (10) In the absence of prior information and without regard to the question of interest, we choose to set the γi's equal to 1, i.e., a uniform prior for the genotype proportions. In this case, the joint prior of DA and pA is uniform over the parameter space (Figure 1, top). We refer to this prior as prior A. The second and third prior distributions were based on the question about departures from HWE. We wanted the prior probability that DA lies inside the NRC range to be 0.5, reflecting our ignorance about whether or not departures from HWE are within the NRC range. For the second prior, we let the prior distribution for pA, π(pA), be Beta(α, β), which has been shown by Wright (1951) to be appropriate under certain conditions. We set α = β = 1, the uniform prior. The prior distribution for DA given pA, π(DA|pA) was a step function uniform over each of 3 ranges for DA|pA, with the corresponding weights 0.25, 0.50, and 0.25 (Table 1). We refer to this prior as prior B. The joint prior distribution π(DA, pA) is π(pA)π(DA|pA) and is shown in Figure 2 (top).
The third prior distribution was constructed so that the joint prior distribution of DA, pA was uniform over each of the three ranges listed in Table 1, with the corresponding weights. We refer to this prior as prior C (Figure 3, top).
fA parameterization: The development of the three priors for this parameterization paralleled that for the DA parameterization. The Dirichlet prior distribution on the genotype frequencies, PAA and PAB, was transformed to fA and pA: (11) If the γi's are set equal to 1, the resulting prior distribution, in contrast to that for the DA parameterization, is not uniform over the parameter space (Figure 4, top).
Combining the joint prior distribution with the likelihood, the joint posterior distribution of the genotype proportions, given the data, is Dirichlet (γi + ni), i = 1,2,3. After the transformation, the joint posterior density is (12)
The second and third priors were constructed so that the prior probability of region 6 was 0.5. For the second prior, prior B, the prior for pA was Beta(α, β), with α = β = 1, as for the DA, pA parameterization, and the prior for fA given pA was uniform over each of the three pieces corresponding to the ranges in Table 1. The joint prior distribution π(fA, pA) is π(pA)π(fA|pA), and is shown in Figure 5 (top).
The third prior, prior C, was constructed so that the joint prior distribution of fA, pA was uniform over the ranges, with corresponding weights, listed in Table 1. The joint prior distribution of fA, pA is shown in Figure 6 (top).
Under either prior A or prior B, the posterior probabilities of regions 6 and 7 were the same. In the case of prior A, it is because the Dirichlet prior is placed on the genotypes, and not directly on DA, pA or fA, pA. In the case of the B prior, it is because the relationship between DA (or fA) conditional on pA is linear. Despite the identity of the posterior probabilities for specific regions, we think it useful to present both parameterizations: questions may be more conveniently phrased in terms of one of the two parameterizations, depending upon the context. In addition, it is possible to construct priors, such as the C prior, in which the posterior distributions will not be identical, because the priors were constructed by focusing not on the conditional distributions of DA (or fA) given pA, but on the joint distributions of DA (or fA) and pA.
For both parameterizations, we evaluated the posterior distributions by using a midpoint rule with 100,000 points (Stoer and Bulirsch 1980).
It is critical to understand that the prior and posterior distributions reflect uncertainty about the fixed but unknown parameter values. It is not the case that the parameters themselves have a distribution. Any statements regarding prior and posterior distributions or probabilities should be interpreted in this context.
The three different prior distributions describe uncertainty about the parameters DA (or fA) and pA in different ways. Each prior distribution implies information about the uncertainty of DA (or fA) and pA together (joint distribution) or DA (or fA) and pA separately (marginal distribution), or DA (or fA) conditional on a fixed value of pA (conditional distribution). The first prior was chosen for convenience. It places a Dirichlet(1,1,1), i.e., a uniform prior distribution, on the genotypes, which translates to a different prior on parameters of interest (Figures 1 and 4, top). In the DA parameterization, the joint distribution is also uniform over all possible pairs of DA and pA values. This means that if we consider uncertainty about DA and pA together, each possible pair of DA, pA values is equally uncertain. In the case of the fA parameterization, a uniform prior on the genotypes does not translate into a uniform prior on fA and pA (Figure 4, top). Instead, pairs of fA, pA with central values have a higher probability than pairs with extreme values. The marginal prior distributions of DA and pA are symmetric with modes of 0 and 0.5, respectively; the marginal prior distribution of fA implies that, under this prior, large negative values are less certain than small negative values, and that positive values of fA are (approximately) as certain as the smallest negative value. Finally, the prior probability that DA is inside the NRC range is only 0.04, compared to a prior probability of 0.50 when either prior B or C is used. The low prior probability of 0.04 reflects the fact that the first prior is more diffuse, and so the prior probability of any small interval, including the one in which we are interested, is small. This low prior probability makes it difficult to interpret the posterior probability (of regions 6 or 7) because it is weighted against being in the NRC range.
Prior B (Figures 2 and 5, top) and prior C (Figures 3 and 6, top) were constructed so that the probability that DA (or fA) is in the NRC range is equal to the probability that it is outside the NRC range. In addition, under this prior, the probability that pairs of values DA, pA (or fA, pA) are inside the NRC range is equal to the probability that they are outside the NRC range. However, the second prior emphasizes uncertainty about the marginal distribution of pA and the conditional distribution, DApA (or fA|pA). The marginal distribution of pA, which is uniform, implies that each value of pA is equally uncertain. The conditional distribution of DA|pA (or fA|pA) implies that, for each fixed value of pA, values of DA (or fA) in the NRC range are equally uncertain, values of DA (or fA) below the NRC range are equally uncertain, and values of DA (or fA) above the NRC range are equally uncertain. The third prior emphasizes the joint distribution; it implies that all pairs of DA (or fA) and pA within the NRC range are equally uncertain, all pairs below the NRC range are equally uncertain, and all pairs above the NRC range are equally uncertain.
We applied the above methods to the published genotypic data shown in Table 2. The “FBI” data are given in Budowle et al. (1995) and the “Cellmark” data are provided by Weir et al. (B. S. Weir, J. Storey, R. Cotton, C. Basten, J. Buckleton and J. Morris, unpublished results). Three two-allele loci were used: D7S8 (Gyllensten and Erlich 1988), LDLR (Yamamotoet al. 1984), and GYPA (Siebert and Fukuda 1987).
For locus LDLR in the FBI Caucasian sample, Figures 1, 2 and 3 (bottom) show the posterior distribution of DA, pA given the data, using the three different priors. In Figure 1, since the joint prior on DA, pA was uniform, the posterior represents the transformed likelihood, standardized to integrate to 1. As expected, the joint posterior was in three sections when the prior B or C was used. In these cases (Figures 2 and 3, bottom), the volume in the middle section represents the probability that DA lies in the NRC range. Figures 4, 5 and 6 (bottom) show the analogous posteriors for fA and pA.
Table 3 shows the posterior probabilities that DA or fA are inside the NRC range as well as the P values for an exact test of HWE. As expected, except for very small differences because of the numerical integration, the posterior probabilities were the same for either parameterization when the Dirichlet or prior B was used; thus, we reported only one number in these cases (columns PA, PB). There is a dramatic difference between the posterior probabilities when the mathematically convenient prior was used versus when priors B or C were used. This reflects the very different prior probabilities for the parameters lying within the NRC range—0.04 for the prior A and 0.50 for priors B and C. In no cases were the posterior probabilities for the parameters lying within the NRC range very high, although in a few cases, they approached high values. In one case (LDLR locus, Cellmark African American), the posterior probability approached a low value, which would lend support to the view that the parameters lie outside the NRC range.
Bayesian methods are not new to genetics. Animal breeders have used Bayesian methods in assessing response to selection, estimating breeding values and/or variance components (Gianola and Fernando 1986; Sorensen et al. 1994; Wanget al. 1994) and locating markers linked to quantitative trait loci (Hoeschele and VanRaden 1993a,b; Uimariet al. 1996). In population genetics, Bayesian approaches have been used to to estimate allele frequencies (Gunel and Wearden 1995), classify genotypes (Alexanderet al. 1995), and distinguish disomic from tetrasomic inheritance (Olson 1997). Lange (1995) used empirical Bayes methods to estimate allele frequencies in different subpopulations. Bayesian methods have been used in linkage analysis in recombinant inbred lines (Silver and Buckler 1986), pine trees (Geburek and von Wuehlisch 1989) and humans, the latter in the context of risk assessment (Rogatko 1995a,b). In molecular evolution, Bayesian methods for inferring phylogenetic trees have been proposed (Mau and Newton 1997; Yang and Rannala 1997). Our goal was to develop a Bayesian approach to HWE in a framework familiar to population geneticists. Our method differs from those of previous authors Pereira and Rogatko 1984; Lindley 1988; Ayres and Balding 1998) by using the disequilibrium and inbreeding parameterizations, and in defining levels of disequilibrium that are of interest in one context.
The posterior probabilities for prior A and priors B or C were quite different, indicating that the prior had a large effect with the sample sizes for the data we considered. Using prior A, the prior probabilities of the regions 6, 7 were only 0.04, while with priors B or C, the prior probability was 0.50. Therefore, the larger posterior probabilities for the latter priors are not surprising. Because the uniform prior was more diffuse than priors B or C, it weighted against departures from HWE that included only a small part of the parameter space, including the small departures from HWE in which we were interested. It is therefore difficult to draw conclusions from the analysis when the prior A is used. Because priors B and C were chosen to reflect our ignorance of whether or not the NRC recommendation is conservative, the resulting analyses are easier to interpret.
For most of the loci, there were not enough data in Table 1 to draw conclusions. For the LDLR locus in the Hispanic data base, the NRC recommendation may be conservative. In one case (LDLR locus in the Cellmark African American data base), the NRC recommendation may not be conservative. This was the same locus for which we would reject the HWE hypothesis using a classical test. These results do not preclude the use of these loci for forensic calculations because genotype proportions can be substituted for HWE proportions. Note that, under the Bayesian analysis, power calculations are not needed to distinguish the case in which there are not enough data from the case in which the departures are within the NRC range. In addition, we are asking directly the question of interest, i.e., what is the probability of the parameter being in the NRC range, instead of asking the question indirectly via a hypothesis test.
We have presented both the DA, pA and fA, pA parameterizations, because some researchers may find it more helpful to phrase their question in the context of one parameterization or the other. In our case, the NRC recommendation is phrased in terms of the fA, pA parameterization. Although the lower bound of fA depends on the nuisance parameter pA, because of the way the question is posed, we can work directly with the posterior distribution of fA, although we find it more helpful to present graphs for the joint rather than the marginal posterior distribution. In contrast, for the DA, pA parameterization, we cannot work directly with the marginal distribution of DA, and instead must work with the joint posterior distribution of DA, pA. Because we have only two parameters, the methods used to evaluate the joint posterior are not difficult to implement. However, with multiple alleles, the problem could be significant, in which case Markov Chain Monte Carlo methods would be useful (Ayres and Balding 1998).
In summary, we have proposed a Bayesian method for examining departures from HWE based on the disequilibrium and inbreeding parameterizations. The priors based on the conjugate Dirichlet distribution were heavily weighted against small departures from HWE. It is difficult to interpret the posterior probabilities in these cases. We need to find other priors that will more accurately reflect prior information or the question of interest. The priors may be formulated as step priors and depend on the question of interest. In the present context, we based our question on recommendations by the NRC and asked whether there was evidence that the inbreeding (or disequilibrium) parameter lay inside the NRC range. In many of the samples we examined there were not enough data to draw a conclusion about that question.
We appreciate the advice of Drs. Ian Evett and John Monahan. The manuscript was improved by comments from Drs. James Curran, Jeff Leips, Chris Triggs, and two anonymous reviewers. Thanks are due to Dmitri Zaykin for use of his program to calculate probabilities for the exact test. This work was supported in part by U.S. Public Health Service grants P01 GM43544 and T32 GM08443.
Communicating editor: A. G. Clark
- Received November 18, 1997.
- Accepted April 27, 1998.
- Copyright © 1998 by the Genetics Society of America