Population Admixture: Detection by Hardy-Weinberg Test and Its Quantitative Effects on Linkage-Disequilibrium Methods for Localizing Genes Underlying Complex Traits
Hong-Wen Deng (a, b, c), Wei-Min Chen (a, b), and Robert R. Recker (a)
a Osteoporosis Research Center, Creighton University, Omaha, Nebraska 68131
b Department of Biomedical Sciences, Creighton University, Omaha, Nebraska 68131
c Laboratory of Molecular and Statistical Genetics, Hunan Normal University, ChangSha, Hunan 410081, People's Republic of China
Corresponding author: Hong-Wen Deng, Osteoporosis Research Ctr., Creighton University, 601 N. 30th St., Omaha, NE 68131. E-mail: deng{at}creighton.edu
Communicating editor: M. A. ASMUSSEN
| ABSTRACT |
|---|
In association studies searching for genes underlying complex traits, the results are often inconsistent, and population admixture has been recognized qualitatively as one major potential cause. Hardy-Weinberg equilibrium (HWE) is often employed to test for population admixture; however, its power is generally unknown. Through analytical and simulation approaches, we quantify the power of the HWE test for population admixture and the effects of population admixture on increasing the type I error rate of association studies under various scenarios of population differentiation and admixture. We found that (1) the power of the HWE test for detecting population admixture is usually small; (2) population admixture seriously elevates type I error rate for detecting genes underlying complex traits, the extent of which depends on the degrees of population differentiation and admixture; (3) HWE testing for population admixture should be performed with random samples or only with controls at the candidate genes, or the test can be performed for combined samples of cases and controls at marker loci that are not linked to the disease; (4) testing HWE for population admixture generally reduces false positive association findings of genes underlying complex traits but the effect is small; and (5) with population admixture, a linkage disequilibrium method that employs cases only is more robust and yields many fewer false positive findings than conventional case-control analyses. Therefore, unless random samples are carefully selected from one homogeneous population, admixture is always a legitimate concern for positive findings in association studies except for the analyses that deliberately control population admixture.
COMPLEX traits refer to diseases and quantitative traits with complex and multiple genetic and environmental determinants. Association studies that depend on linkage disequilibrium between markers and genes underlying complex traits have helped to decipher some of the genetic basis of variation in quantitative traits and of differential susceptibility to complex diseases.
One of the most important causes that may underlie the inconsistent results from association studies is population admixture.
Family-based analyses such as the transmission disequilibrium test (TDT) are designed to control for population admixture.
It is well known that population admixture can lead to deviations of genotype frequencies from those expected on the basis of the Hardy-Weinberg (HW) law.
In this article, through analytical and/or computer simulation approaches, we first quantify the power of the HWE test under various degrees of population differentiation (as reflected by different population allele and disease frequencies) and various degrees of population admixture (as reflected by the different proportions in which populations admix). Second, we quantify the effects of various degrees of population differentiation and population admixture on the outcome of association studies. Two types of analyses for complex diseases are investigated: a conventional one that employs cases and controls and a recently developed one that employs cases only.
| THEORY AND METHODS |
|---|
In this section, we first present our theoretical investigation and then outline our simulation methods. For simplicity, we focus our investigation on association studies of complex diseases in a population (P) formed by admixture of two differentiated large subpopulations (P1 and P2). In the P1 and P2 populations, HWE holds at a marker locus whose alleles can be classified into two classes, M and m. The frequency of M is $f_1$ in P1 and $f_2$ in P2. The disease prevalences are, respectively, $\phi_1$ in population P1 and $\phi_2$ in P2. The disease and the marker locus are not associated by any cause in the P1 and P2 populations. A proportion $k$ of individuals in population P come from population P1; the rest $(1 - k)$ come from P2. The frequencies of the M allele ($f$) and the disease ($\phi$) in population P are then, respectively, $f = kf_1 + (1 - k)f_2$ and $\phi = k\phi_1 + (1 - k)\phi_2$.
The power of HWE test for population admixture at marker loci:
To focus our investigation on the power of the HWE test for population admixture, we assume that in population P, HW disequilibrium is entirely due to the population admixture. The HW disequilibrium can be measured by the deviations of genotype frequencies from those expected under the HWE:

$$D_{MM} = P_{MM} - f^2 = k(1 - k)(\Delta f)^2 \tag{1a}$$

$$D_{Mm} = P_{Mm} - 2f(1 - f) = -2k(1 - k)(\Delta f)^2 \tag{1b}$$

$$D_{mm} = P_{mm} - (1 - f)^2 = k(1 - k)(\Delta f)^2 \tag{1c}$$

where $P_{MM}$, $P_{Mm}$, and $P_{mm}$ are the genotype frequencies in population P and $\Delta f = f_2 - f_1$. $k(1 - k)$ can serve as a measure for the degree of population admixture. The larger the $k(1 - k)$, the larger the degree of population admixture. $k(1 - k)$ is maximized when $k = 0.5$. The $\chi^2$-test is often employed to test for HWE:

$$\chi^2_{HW} = N\left[\frac{(\hat{P}_{MM} - \hat{f}^2)^2}{\hat{f}^2} + \frac{(\hat{P}_{Mm} - 2\hat{f}(1 - \hat{f}))^2}{2\hat{f}(1 - \hat{f})} + \frac{(\hat{P}_{mm} - (1 - \hat{f})^2)^2}{(1 - \hat{f})^2}\right] \tag{2}$$

which has 1 d.f. $N$ is the sample size, and the hat (^) indicates estimated frequencies of genotypes (MM, Mm, or mm) or alleles (M or m) from the sample. Under the alternative hypothesis that there is population admixture, the $\chi^2_{HW}$ statistic follows a noncentral $\chi^2$-distribution with 1 d.f. and the noncentrality parameter

$$\lambda_{HW} = N\left[\frac{D_{MM}^2}{f^2} + \frac{D_{Mm}^2}{2f(1 - f)} + \frac{D_{mm}^2}{(1 - f)^2}\right] \tag{3a}$$

where the D's are defined in Equations 1a-1c. Substituting the D's from Equations 1a-1c into Equation 3a, we have

$$\lambda_{HW} = N\left[\frac{k(1 - k)(\Delta f)^2}{f(1 - f)}\right]^2 \tag{3b}$$

With $\lambda_{HW}$ given in Equation 3b, we can compute the power of the $\chi^2_{HW}$-test under various degrees of population differentiation $\Delta f$ and various degrees of population admixture $k$ (Appendix A). Some numerical results, substantiated in later computer simulations, are given in Tables 1-4.
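As a minimal numerical sketch of the quantities in Equations 1a-1c and 3a, the HW disequilibrium coefficients and the noncentrality parameter can be computed directly from the allele frequencies in P1 and P2, the admixture proportion k, and the sample size. The function names and the input values are our own illustrative choices, not part of the original study.

```python
def admixed_genotype_freqs(f1, f2, k):
    """Genotype frequencies (MM, Mm, mm) in P: a k : (1 - k) mix of two HWE populations."""
    p_MM = k * f1**2 + (1 - k) * f2**2
    p_Mm = 2 * k * f1 * (1 - f1) + 2 * (1 - k) * f2 * (1 - f2)
    p_mm = k * (1 - f1)**2 + (1 - k) * (1 - f2)**2
    return p_MM, p_Mm, p_mm


def hw_disequilibrium(f1, f2, k):
    """Deviations of the genotype frequencies in P from their HW expectations."""
    f = k * f1 + (1 - k) * f2  # allele M frequency in the admixed population
    p_MM, p_Mm, p_mm = admixed_genotype_freqs(f1, f2, k)
    return p_MM - f**2, p_Mm - 2 * f * (1 - f), p_mm - (1 - f)**2


def lambda_hw(f1, f2, k, n):
    """Noncentrality parameter of the HWE chi-square test for a sample of size n."""
    f = k * f1 + (1 - k) * f2
    d_MM, d_Mm, d_mm = hw_disequilibrium(f1, f2, k)
    return n * (d_MM**2 / f**2 + d_Mm**2 / (2 * f * (1 - f)) + d_mm**2 / (1 - f)**2)


if __name__ == "__main__":
    # Illustrative values only: f1 = 0.1, f2 = 0.3, equal admixture, N = 400.
    print(hw_disequilibrium(0.1, 0.3, 0.5))
    print(lambda_hw(0.1, 0.3, 0.5, 400))
```

With these inputs each homozygote is in excess by k(1 - k) times the squared allele frequency difference, and the heterozygote is deficient by twice that amount.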
Differentiation between populations can be measured by various indices in population genetics. In terms of the classical measure $G_{ST}$ (Appendix B), the noncentrality parameter of the HWE test can be written as

$$\lambda_{HW} = N G_{ST}^2 \tag{4}$$

where $N$ is the sample size for the $\chi^2_{HW}$-test. Equation 4 establishes a direct relationship between the power of the HWE test for population admixture and a classical measure ($G_{ST}$) of the degree of population differentiation. Evidently, the larger the population subdivision as reflected by $G_{ST}$, the higher the power for a sample to detect population subdivision by the HW test.
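The relationship between the noncentrality parameter and GST can be checked numerically. The sketch below uses the heterozygosity definitions given in Appendix B; the function names and example values are hypothetical and serve only as an illustration.

```python
def g_st(f1, f2, k):
    """G_ST for a population admixed from P1 (share k) and P2 (share 1 - k)."""
    f = k * f1 + (1 - k) * f2
    h_t = 1 - f**2 - (1 - f)**2                                 # total heterozygosity
    h_s = 2 * k * f1 * (1 - f1) + 2 * (1 - k) * f2 * (1 - f2)   # weighted subpopulation mean
    return (h_t - h_s) / h_t


def lambda_hw_from_gst(f1, f2, k, n):
    """Noncentrality parameter of the HWE test expressed through G_ST."""
    return n * g_st(f1, f2, k) ** 2


if __name__ == "__main__":
    # Illustrative values only.
    print(g_st(0.1, 0.3, 0.5))
    print(lambda_hw_from_gst(0.1, 0.3, 0.5, 400))
```

Because GST enters as a square, halving the differentiation between subpopulations cuts the noncentrality parameter by a factor of four at a fixed sample size.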
The effects of admixture of differentiated populations on the outcome of association studies:
To focus on quantifying the effects of admixture on association studies, we assume that the marker locus does not underlie the disease susceptibility in populations P1 and P2 and any association between the marker locus and the disease in population P will be entirely due to the admixture of the two differentiated populations.
Two types of tests are investigated, both based on the premise that the marker locus is a disease gene per se or that it is in linkage disequilibrium with a disease gene. The first one is the $\chi^2$-test employed in the conventional case-control studies:

(5)

where $\hat{p}_{M|D}$ is the allele M frequency in cases (D) and $\hat{p}_{M|C}$ is the allele M frequency in controls (C). $\hat{p}_{m|D}$ and $\hat{p}_{m|C}$ are similarly defined for allele m. This $\chi^2$-test has 1 d.f. $N$ is the sample size, and the hat (^) indicates an estimated value from the sample. With population admixture, $\chi^2_{CC}$ approximately follows a noncentral $\chi^2$-distribution, as corroborated later in our computer simulations. In the admixed population, $p_{M|D}$, $p_{M|C}$, $p_{m|D}$, and $p_{m|C}$ can be derived in terms of $k$, $f_1$, $f_2$, $\phi_1$, $\phi_2$, $f$, and $\phi$ (Appendix C). With these, we can obtain the noncentrality parameter of the $\chi^2_{CC}$-statistic:

(6a)

If none of the terms $k$, $(1 - k)$, $\Delta f$, or $\Delta\phi$ (the difference in disease prevalence between P1 and P2) is equal to 0, i.e., if there is population admixture for two populations differentiated in both allele and disease frequencies, then

(6b)

It can be seen qualitatively that the larger the term $k(1 - k)\,\Delta f\,\Delta\phi$, the larger the $\lambda_{CC}$. The magnitude of the noncentrality parameter $\lambda_{CC}$ determines the power to detect association between marker alleles and the disease due to admixture of the two differentiated populations P1 and P2. $\lambda_{CC}$ may help us understand intuitively the effects of population admixture and population differentiation on case-control analyses of association studies.
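To make the dependence on k(1 - k) and on the differences in allele frequency and prevalence concrete, the sketch below derives the expected allele M frequencies in cases and controls from first principles and plugs them into the usual 2 x 2 allele-count chi-square. This is our own derivation under the stated model, with hypothetical function names and explicit numbers of cases and controls; it is not claimed to reproduce Equations 5, 6a, or 6b term by term.

```python
def allele_freq_cases_controls(f1, f2, phi1, phi2, k):
    """Expected allele M frequency in cases and in controls of the admixed population."""
    phi = k * phi1 + (1 - k) * phi2  # disease prevalence in P
    p_case = (k * phi1 * f1 + (1 - k) * phi2 * f2) / phi
    p_ctrl = (k * (1 - phi1) * f1 + (1 - k) * (1 - phi2) * f2) / (1 - phi)
    return p_case, p_ctrl


def lambda_cc(f1, f2, phi1, phi2, k, n_cases, n_controls):
    """Approximate noncentrality of the allele-count chi-square (assumed 2 x 2 form)."""
    p_case, p_ctrl = allele_freq_cases_controls(f1, f2, phi1, phi2, k)
    a_case, a_ctrl = 2 * n_cases, 2 * n_controls  # two alleles per individual
    p_bar = (a_case * p_case + a_ctrl * p_ctrl) / (a_case + a_ctrl)
    return (p_case - p_ctrl) ** 2 / (p_bar * (1 - p_bar) * (1 / a_case + 1 / a_ctrl))


if __name__ == "__main__":
    # Illustrative values only: the marker is unrelated to disease within P1 and P2.
    print(allele_freq_cases_controls(0.1, 0.3, 0.1, 0.3, 0.5))
    print(lambda_cc(0.1, 0.3, 0.1, 0.3, 0.5, n_cases=200, n_controls=200))
```

Under this model the case-control difference in allele frequency equals k(1 - k) times the product of the two differences, divided by the prevalence times one minus the prevalence, so it vanishes whenever the two populations share either the allele frequency or the prevalence.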
The null hypothesis to be tested in association studies for disease genes is that the marker alleles are not causally associated with the disease; i.e., the marker locus and a disease gene are not linked. For this null hypothesis, the power of the $\chi^2_{CC}$-test in the admixed population under the condition of no causal relationship between the marker alleles and the disease is, in fact, the type I error rate, which reflects both population admixture and the statistical sampling error (the prespecified significance level $\alpha$ of the $\chi^2_{CC}$-test). With a given noncentrality parameter, the power under a specific set of parameters is computed in the same way as detailed in the previous section. The dependency of this type I error rate on various parameters of population differentiation and admixture is depicted in Fig 1. The difference between this type I error rate and $\alpha$ is the inflation due solely to the admixture of differentiated populations.
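The second analysis investigated is a recently developed test for HW disequilibrium ($\chi^2_{HW-D}$) in cases (D) for finding the causal association between a marker locus and a disease. The test statistic is

$$\chi^2_{HW-D} = N_D\left[\frac{(\hat{P}_{MM|D} - \hat{p}_{M|D}^2)^2}{\hat{p}_{M|D}^2} + \frac{(\hat{P}_{Mm|D} - 2\hat{p}_{M|D}\hat{p}_{m|D})^2}{2\hat{p}_{M|D}\hat{p}_{m|D}} + \frac{(\hat{P}_{mm|D} - \hat{p}_{m|D}^2)^2}{\hat{p}_{m|D}^2}\right] \tag{7}$$

which has 1 d.f. $N_D$ is the sample size of cases, and $\hat{P}_{\cdot|D}$ and $\hat{p}_{\cdot|D}$ are the estimated genotype and allele frequencies in cases. With the derivations in Appendix C, we can obtain the noncentrality parameter of the $\chi^2_{HW-D}$ statistic in the presence of population admixture:

$$\lambda_{HW-D} = N_D\left[\frac{D_{MM|D}^2}{p_{M|D}^2} + \frac{D_{Mm|D}^2}{2p_{M|D}p_{m|D}} + \frac{D_{mm|D}^2}{p_{m|D}^2}\right] \tag{8}$$

where the $D_{\cdot|D}$ are the deviations of the genotype frequencies in cases from their HW expectations (Appendix C).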
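Under the model described above, cases are themselves a mixture of P1 and P2, with P1 contributing a share equal to k times its prevalence divided by the overall prevalence. The sketch below uses that observation to compute the HW disequilibrium in cases and the resulting noncentrality parameter. It is our own derivation with hypothetical names (`phi1`, `phi2` denote the disease prevalences in P1 and P2) and may differ in form from the expression printed as Equation 8.

```python
def lambda_hw_cases(f1, f2, phi1, phi2, k, n_cases):
    """Noncentrality of the HWE chi-square applied to cases only, under admixture."""
    phi = k * phi1 + (1 - k) * phi2
    w = k * phi1 / phi                  # share of cases that come from P1
    p = w * f1 + (1 - w) * f2           # allele M frequency among cases
    d = w * (1 - w) * (f2 - f1) ** 2    # excess of MM homozygotes among cases
    return n_cases * (d / (p * (1 - p))) ** 2


if __name__ == "__main__":
    # Illustrative values only.
    print(lambda_hw_cases(0.1, 0.3, 0.1, 0.3, 0.5, n_cases=400))
```

When the two prevalences are equal, the share of cases from P1 is simply k and the function returns the same value as the HWE test in a random sample of the same size.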
It can be seen, by calculus, that $\lambda_{HW-D}$ attains a maximum, $\lambda_{HW-D(\max)}$, with respect to the disease frequencies. $\lambda_{HW-D(\max)}$ is useful in that it may allow us to characterize the maximum type I error in association studies with the $\chi^2_{HW-D}$-test, irrespective of the population differentiation in the disease frequencies in populations P1 and P2.
The dependency of the type I error rate of the $\chi^2_{HW-D}$-test (for the null hypothesis of no causal association between the disease and the marker) on various parameters of population differentiation and admixture is computed, on the basis of $\lambda_{HW-D}$ for the $\chi^2_{HW-D}$-statistic, in a way similar to that described in Appendix A. The results are depicted in Fig 2.
From Equations 3a, 3b, and 8, we can obtain a relationship between the noncentrality parameter for the test of HWE in random population samples for detecting admixture and that for the test in cases only for detecting linkage disequilibrium between a marker locus and disease genes, assuming that both tests employ the same sample sizes.
Choice of population samples and marker loci for the HWE test:
Ideally, random samples (for any locus) or marker loci not associated with the disease (for any sample) should be employed to detect population admixture ($\chi^2_{HW}$-test, Equation 2). For association studies (cases and controls for the $\chi^2_{CC}$-test and cases only for the $\chi^2_{HW-D}$-test), it is not entirely clear which of the samples at hand should be employed for the HWE test to detect population admixture, and the question has not been formally investigated. Clearly, we cannot use the cases only, as the HWE test in cases is a test for linkage disequilibrium when the whole population is randomly mating.
To investigate these questions, we performed two types of simulations. The first type investigates the power to detect HW disequilibrium with cases and/or controls in large randomly mating populations, when the marker locus is at or closely linked to a disease susceptibility locus. For the null hypothesis of no population admixture, this power is in fact the rate of false positive findings (type I error rate) for HW disequilibrium that is due to nonrandom choices of samples (cases and/or controls) and the statistical sampling error (a prespecified significance level, $\alpha$). In each replicate, 200 controls and 200 cases are sampled, and the $\chi^2_{HW}$-test (Equation 2) is applied to the 200 controls and to the combined sample of 200 controls and 200 cases. The population is sampled and tested 5000 times, and the proportion of significant tests is the power to detect HW disequilibrium when there is no population admixture. Thus this proportion provides an estimate of the type I error rate for the null hypothesis of no population admixture. The simulation is repeated 100 times, and the mean and standard deviation of this estimate are computed and depicted in Fig 3, a and b.
The second type of simulation compares the power to detect population admixture with controls only and that with both cases and controls in an admixed population. A population P admixed from P1 (with $f_1 = 0.1$, $\phi_1 = 0.1$) and P2 (with $f_2 = 0.3$, $\phi_2 = 0.3$) is simulated with various $k$ in Fig 3C. In Fig 3D, a population P ($f = 0.2$) is simulated from admixture ($k = 0.5$) of two populations P1 and P2 whose allele M frequencies differ by $\Delta f$. In the P1 and P2 populations, the disease and the marker are not associated by any means. A total of 200 controls and 200 cases are sampled from the P population. Then the $\chi^2_{HW}$-test (Equation 2) is applied to the 200 controls and to the combined sample of 200 controls and 200 cases. The population is sampled and tested 50,000 times, and the proportion of significant tests is the power to detect HW disequilibrium due to population admixture.
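The second type of simulation can be sketched in a few lines. The code below is an illustration under our own assumptions, not the authors' program: individuals are drawn directly from the mixture of P1 and P2 implied by their case or control status, the replicate count is kept far below 50,000 for speed, and all function names and the seed are hypothetical.

```python
import random

CRIT_1DF_05 = 3.841458820694124  # 0.95 quantile of the central chi-square, 1 d.f.


def sample_genotypes(n, w1, f1, f2, rng):
    """Genotype counts [MM, Mm, mm] for n individuals, each from P1 with prob. w1, else P2."""
    counts = [0, 0, 0]
    for _ in range(n):
        f = f1 if rng.random() < w1 else f2
        n_M = (rng.random() < f) + (rng.random() < f)  # number of M alleles (HWE within source)
        counts[2 - n_M] += 1
    return counts


def chi2_hw(counts):
    """Pearson chi-square for HWE from genotype counts [MM, Mm, mm]."""
    n = sum(counts)
    p = (2 * counts[0] + counts[1]) / (2 * n)
    if p in (0.0, 1.0):
        return 0.0  # monomorphic sample: no test possible
    expected = (n * p * p, 2 * n * p * (1 - p), n * (1 - p) ** 2)
    return sum((o - e) ** 2 / e for o, e in zip(counts, expected))


def hwe_power(f1, f2, phi1, phi2, k, n_cases=200, n_controls=200, reps=2000, seed=1):
    """Estimated power of the HWE test with controls only and with cases plus controls."""
    rng = random.Random(seed)
    phi = k * phi1 + (1 - k) * phi2
    w_case = k * phi1 / phi              # Pr(from P1 | case)
    w_ctrl = k * (1 - phi1) / (1 - phi)  # Pr(from P1 | control)
    sig_ctrl = sig_both = 0
    for _ in range(reps):
        ctrl = sample_genotypes(n_controls, w_ctrl, f1, f2, rng)
        case = sample_genotypes(n_cases, w_case, f1, f2, rng)
        both = [a + b for a, b in zip(ctrl, case)]
        sig_ctrl += chi2_hw(ctrl) > CRIT_1DF_05
        sig_both += chi2_hw(both) > CRIT_1DF_05
    return sig_ctrl / reps, sig_both / reps


if __name__ == "__main__":
    print(hwe_power(0.1, 0.3, 0.1, 0.3, 0.5))
```

Drawing the source population first and the genotype second keeps the marker independent of disease within P1 and P2, which is the condition the simulation is meant to impose.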
Testing HWE for population admixture in reducing false positive findings in association studies:
In the first two sections, through the analytical approach, we study separately the power of the HWE test for population admixture and the effects of admixture of differentiated populations on elevating the type I error rate in association studies. In this section, through computer simulations, we investigate the effect of testing HWE at candidate genes for population admixture in association studies, a common practice.
A population (P) admixed from two differentiated populations (P1 and P2) is simulated. A proportion $k$ of individuals of the population P comes from P1 and the rest from P2. In the P1 and P2 populations, HWE holds at the marker locus, whose alleles can be classified into two classes, M and m. The frequency of M is $f_1$ in P1 and $f_2$ in P2. The disease prevalences are, respectively, $\phi_1$ in population P1 and $\phi_2$ in P2. The disease and the marker locus are independent in the P1 and P2 populations. For a specific parameter set, a sample of 3N individuals is simulated from the P population, with 2N cases and N controls. The HWE test (Equation 2) is performed to detect population admixture with the N controls only (see RESULTS, Choice of population samples and marker loci for HWE test). If the test is significant, further testing of association between the marker and the disease is not pursued, to avoid confounding of association results due to admixture. If the test is not significant, i.e., if the test fails to reveal population admixture, tests of association that employ N random cases and N controls (Equation 5) and those that employ 2N cases (Equation 7) are conducted. This sampling scheme ensures that the test of HWE is performed on the same sample of controls for the test employing cases and controls (Equation 5) and the test based on cases only (Equation 7). It also ensures that the two tests of association have the same sample size of 2N, so that the comparison of false positive findings of these two tests is not confounded by different sample sizes. N = 200 in our investigations. Note that this sampling design is used for the purpose of simulations only and is in no way intended as a design for collecting data in practice.
The proportion of the simulations with significant associations and a nonsignificant HWE test is the type I error rate of the association study approach that is aided by the HWE test to guard against confounding from population admixture. This type I error rate includes both the specified type I error rate in the statistical testing ($\alpha$) and the inflation due to population admixture that the HWE test failed to reveal. For comparison, the type I error rates of association studies with and without (computed as indicated in the second section) the aid of the HWE test for population admixture are contrasted in Fig 1 and Fig 2. In simulations, we corroborated that the results for the power or type I error based upon the analytical approach in the first two sections are accurate (to avoid repetition, the results are not shown).
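The gated procedure just described (test HWE in N controls, and count an association as a finding only when that test is nonsignificant) can be sketched as follows. This is an illustration under our own assumptions: the case-control test is taken to be the standard 2 x 2 chi-square on allele counts, the replicate count is small, and every function name is hypothetical.

```python
import random

CRIT = 3.841458820694124  # 0.95 quantile of the central chi-square, 1 d.f.


def sample_genotypes(n, w1, f1, f2, rng):
    """Genotype counts [MM, Mm, mm]; each individual comes from P1 with prob. w1, else P2."""
    counts = [0, 0, 0]
    for _ in range(n):
        f = f1 if rng.random() < w1 else f2
        counts[2 - ((rng.random() < f) + (rng.random() < f))] += 1
    return counts


def chi2_hw(counts):
    """Pearson chi-square for HWE from genotype counts [MM, Mm, mm]."""
    n = sum(counts)
    p = (2 * counts[0] + counts[1]) / (2 * n)
    if p in (0.0, 1.0):
        return 0.0
    expected = (n * p * p, 2 * n * p * (1 - p), n * (1 - p) ** 2)
    return sum((o - e) ** 2 / e for o, e in zip(counts, expected))


def chi2_cc(case, ctrl):
    """2 x 2 chi-square on allele counts (M, m) in cases versus controls."""
    a, b = 2 * case[0] + case[1], 2 * case[2] + case[1]
    c, d = 2 * ctrl[0] + ctrl[1], 2 * ctrl[2] + ctrl[1]
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return (a + b + c + d) * (a * d - b * c) ** 2 / denom if denom else 0.0


def gated_type_one_error(f1, f2, phi1, phi2, k, n=200, reps=1000, seed=1):
    """False positive rates of both association tests, with and without the HWE gate."""
    rng = random.Random(seed)
    phi = k * phi1 + (1 - k) * phi2
    w_case, w_ctrl = k * phi1 / phi, k * (1 - phi1) / (1 - phi)
    tally = {"cc": 0, "cc_gated": 0, "hwd": 0, "hwd_gated": 0}
    for _ in range(reps):
        ctrl = sample_genotypes(n, w_ctrl, f1, f2, rng)
        case_a = sample_genotypes(n, w_case, f1, f2, rng)
        case_b = sample_genotypes(n, w_case, f1, f2, rng)
        all_cases = [x + y for x, y in zip(case_a, case_b)]  # 2N cases for the cases-only test
        gate_open = chi2_hw(ctrl) <= CRIT                    # HWE test in controls not significant
        cc_sig = chi2_cc(case_a, ctrl) > CRIT                # N cases versus N controls
        hwd_sig = chi2_hw(all_cases) > CRIT
        tally["cc"] += cc_sig
        tally["cc_gated"] += cc_sig and gate_open
        tally["hwd"] += hwd_sig
        tally["hwd_gated"] += hwd_sig and gate_open
    return {name: count / reps for name, count in tally.items()}


if __name__ == "__main__":
    print(gated_type_one_error(0.1, 0.3, 0.1, 0.3, 0.5))
```

Because a gated finding requires both a significant association and an open gate, the gated rate can never exceed the ungated one; how much lower it is depends on how often the HWE test in controls detects the admixture.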
| RESULTS |
|---|
The sample size required and the power of the HWE test for population admixture (Tables 1-4):
It can be seen from Table 1 and Table 2 that the sample size ($n$) required to detect population admixture by the HWE test is generally quite large, except when the degree of population admixture is large (i.e., $k$ close to 0.5) and the differentiation of populations P1 and P2 is large (i.e., when $\Delta f$ is large). Generally speaking, when $\Delta f = 0.2$ (i.e., the frequencies of the allele M differ by 0.2 in the populations P1 and P2), the $n$ required is >2000 even with the largest degree of population admixture ($k = 0.5$), and $n$ is >20,000 if the degree of population admixture is small ($k = 0.1$). These sample sizes well exceed those feasible and typically employed in association studies. When $\Delta f$ gets larger and $k$ gets closer to 0.5, $n$ gets smaller. Generally speaking, only when $\Delta f > 0.4$ and $k > 0.2$ can the population admixture be detected with the sample sizes typically employed in association studies (<1000).
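The order of magnitude of these sample sizes can be reproduced with a short calculation. The sketch below assumes a target power of 80% at a significance level of 0.05 and allele frequencies of 0.1 and 0.3 in P1 and P2; these settings are our assumptions for illustration, since the surviving text does not state the exact settings behind the tables. The power function uses the exact 1 d.f. identity for the noncentral chi-square rather than any approximation method.

```python
import math
from statistics import NormalDist

_Z = NormalDist()


def power_1df(lam, alpha=0.05):
    """Power of a 1 d.f. chi-square test with noncentrality parameter lam."""
    c = _Z.inv_cdf(1 - alpha / 2)  # square root of the chi-square critical value
    s = math.sqrt(lam)
    return (1 - _Z.cdf(c - s)) + _Z.cdf(-c - s)


def required_lambda(power=0.8, alpha=0.05):
    """Smallest noncentrality parameter reaching the target power (bisection)."""
    lo, hi = 0.0, 200.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if power_1df(mid, alpha) < power:
            lo = mid
        else:
            hi = mid
    return hi


def required_n(f1, f2, k, power=0.8, alpha=0.05):
    """Sample size at which the HWE test reaches the target power under admixture."""
    f = k * f1 + (1 - k) * f2
    per_individual = (k * (1 - k) * (f2 - f1) ** 2 / (f * (1 - f))) ** 2  # lambda_HW / N
    return math.ceil(required_lambda(power, alpha) / per_individual)


if __name__ == "__main__":
    # Illustrative values only: allele frequencies differ by 0.2.
    print(required_n(0.1, 0.3, 0.5))
    print(required_n(0.1, 0.3, 0.1))
```

Under these assumptions the required sample size exceeds 2000 for equal admixture and exceeds 20,000 when only a tenth of the population comes from P1, in line with the ranges quoted above.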
For sample sizes of 200 and 400, which are typically feasible, Table 3 and Table 4 list the power to detect population admixture via the HWE test under various degrees of population differentiation and admixture. It can be seen that the power depends on both $k$ and $\Delta f$. Generally speaking, if $\Delta f < 0.2$, there is little power to detect population admixture via the HWE test regardless of $k$. Only when $\Delta f$ is quite large (>0.4) and $k > 0.2$ is the power relatively high. When $\Delta f = 0.8$, the power is almost always 100%. However, $\Delta f > 0.4$ is probably rather rare in natural populations, especially in humans for candidate genes.
The effects of admixture of differentiated populations on the outcome of association studies (Fig 1 and Fig 2):
It can be seen (Fig 1) that when $\Delta f$ and $\Delta\phi$ increase, the false findings of association studies (the type I error rate) increase rapidly for the $\chi^2_{CC}$-test that employs both cases and controls. When $\Delta f = 0$, irrespective of the magnitude of $\Delta\phi$, the type I error rate remains at the prespecified level $\alpha$ (0.05). Data not shown indicate that when $\Delta\phi = 0$, the type I error rate remains relatively stable at $\alpha$ (0.05) for the case-control analyses (Equation 5), regardless of the magnitude of $\Delta f$. This is consistent with the analytical prediction based on the noncentrality parameter of the test statistic (Equation 6a). These results are consistent with qualitative analyses of the noncentrality parameters of these tests (Equation 6a and Equation 8).
Noticeable (Fig 2) is the fact that, over a range of $\Delta f$, $\Delta\phi$, and $k$, the type I error rate remains fairly stable and close to the specified significance level $\alpha$ (0.05) for the $\chi^2_{HW-D}$-test (Equation 7) that employs cases only. Generally speaking, for the same sample sizes employed, the $\chi^2_{HW-D}$-test has a much smaller type I error rate (Fig 1 and Fig 2) than the $\chi^2_{CC}$-test under the same parameters.
Choice of population samples and marker loci for HWE test (Fig 3):
In large randomly mating populations (Fig 3, a and b), if the marker locus is in linkage disequilibrium with the disease locus due to linkage, testing HWE with both the cases and controls (selected for population association studies) will result in false positive findings of population admixture at a rate higher than the specified statistical type I error rate $\alpha$ (0.05). This rate increases dramatically with increasing levels of linkage disequilibrium. When the marker locus is not linked to a disease locus, the rate remains at the specified $\alpha$ of 0.05. However, if HWE is tested only in controls, the rate remains close to 0.05 whether the locus is linked to a disease locus or not.
In an admixed population P (Fig 3C and Fig 3D), if the marker locus is not linked to a disease locus, combining cases and controls for the HW test will generally have higher power to detect population admixture than testing in the controls alone, owing to the larger sample size. Testing HWE with controls only has power similar to that of testing with random samples of the same size. Testing HWE with the combined sample of cases and controls has similar or slightly greater power to detect population admixture than the test employing random samples of the same size. This is probably due to the elevated level of HW disequilibrium in cases that results from population admixture. Although the marker is not linked to a disease locus in subpopulations P1 and P2, linkage disequilibrium between the marker and the disease is created upon admixture of P1 and P2, which differ in disease and marker frequencies. Such linkage disequilibrium leads to the elevated level of HW disequilibrium in cases.
Testing HWE for population admixture in association studies (Fig 1 and Fig 2):
By contrasting the type I error rates for the association studies that do and do not employ the HWE test for population admixture, it can be seen easily (Fig 1 and Fig 2) that those employing the HWE test will suffer reduced levels of type I error. However, the reduction achieved by accepting only those significant associations found in samples with a nonsignificant HWE test is generally small. Therefore, the utility of testing HWE in reducing false positive findings due to population admixture is generally limited. This is consistent with earlier results on the limited power of the HWE test for population admixture.
| DISCUSSION |
|---|
With random population samples, extensive association studies have been conducted to search for genes underlying complex traits through linkage disequilibrium of these genes with markers. It is well known that population admixture may confound such studies.
Through analytical and computer simulation approaches, we quantified the power of the HW test for population admixture and the effects of population admixture on increasing the false positive findings (type I error rate) in association studies under various scenarios of population admixture and population differentiation. We found that (1) the power of the HWE test for detecting population admixture is usually small, even with large samples, unless the degrees of population admixture and population differentiation are rather large; (2) population admixture seriously elevates the type I error rate for detecting genes underlying complex traits, the extent depending on the degrees of population admixture and population differentiation; (3) HWE testing for population admixture should be performed with random samples, or only with controls at candidate genes, or the test may be performed for combined samples of cases and controls at marker loci that are not linked to the diseases under study; (4) testing HWE for population admixture generally reduces false positive findings of genes underlying complex traits but the effect is generally small due to the limited power to detect population admixture by the HWE test; and (5) compared with the conventional case-control analyses ($\chi^2_{CC}$-test, Equation 5) in association studies for complex diseases, the $\chi^2_{HW-D}$-test (Equation 7) that employs only cases is more robust and yields a much smaller type I error rate.
In this study, we focus on studying the type I error rate in a common practice, namely testing HWE for population admixture before testing for association. Although such a practice reduces the type I error rate in detecting disease genes in the presence of population admixture, it is also intuitive that, in the absence of population admixture, such a practice will decrease the power to detect disease genes. This is simply because spurious population admixture will be detected by HWE tests entirely as a result of sampling error (at a rate specified by the significance level $\alpha$ of the test) in the absence of population admixture. Such spurious findings of population admixture may erroneously halt the testing for disease loci at the candidate genes. Recently, testing a series of unlinked marker loci with $\chi^2$-tests in cases and controls was proposed as a means to effectively reduce the type I error rate in searching for disease genes in association studies. Although combining a series of unlinked marker loci may increase the power to detect population admixture in the HWE test, the increase in power requires that each marker locus included in the analyses be differentiated in subpopulations, a valuable piece of information that is generally unknown for markers in most admixed populations. Including markers that are not differentiated among subpopulations will generally decrease the power to detect population admixture. Most importantly, it is the differentiation of disease frequencies and allele frequencies at candidate genes in subpopulations of an admixed population that affects the type I error rate in disease gene testing at candidate genes. This can be easily seen analytically via the noncentrality parameters of association test statistics (Equations 6a, 6b, and 8). Various loci across the human genome may be differentiated to various degrees in subpopulations of an admixed population. If the candidate genes are not differentiated, but the unlinked marker loci selected are differentiated in subpopulations and population admixture is detected by HWE tests at these unlinked marker loci, we will suffer a substantial loss of power by declining to test candidate loci for disease genes. On the other hand, if the marker loci are not differentiated but the candidate genes are differentiated in subpopulations of an admixed population, we will still suffer an inflated type I error rate due to admixture of subpopulations differentiated at the candidate genes to be tested. The above problem may also undermine the usefulness of related approaches that rely on unlinked marker loci.
HWE is a fundamental topic in population genetics. Issues related to HWE have been subjected to extensive studies and have various applications in many research areas. Examples are the propositions of various tests of HWE. Admixture of two populations is known to create linkage disequilibrium between two loci of magnitude $k(1 - k)\,\Delta f\,\Delta p$, where $k$ and $\Delta f$ are defined earlier and $\Delta p$ is the difference of the allele frequency of the second locus. The two loci are assumed to be in linkage equilibrium in the P1 and P2 populations. In this study, we assume that a locus and a disease are not associated in the P1 and P2 populations. The association in the P population is entirely due to the "disequilibrium" between the marker locus and the disease created by admixture. The degree of such disequilibria may be measured as $D' = k(1 - k)\,\Delta f\,\Delta\phi$. It is noted that the power to detect an association between the marker and the disease created by admixture critically depends on $D'$, as reflected by Equation 6a and Equation 6b for the $\chi^2_{CC}$-test. However, the disequilibrium due to population admixture between a marker locus and the disease may not have the same effects on different association studies, as is demonstrated by our simulation results for the two tests examined ($\chi^2_{CC}$- and $\chi^2_{HW-D}$-tests). This is also apparent from the noncentrality parameters found for the two test statistics (Equation 6a, Equation 6b, and Equation 8). Unlike that of the $\chi^2_{CC}$-test, the power of the $\chi^2_{HW-D}$-test does not have a direct relationship with this disequilibrium.
Population association studies that depend on linkage and strong linkage disequilibrium between marker loci and loci underlying complex traits have been conducted extensively and have helped in deciphering some genetic bases of complex traits.
Finally, it should be pointed out that although we examine the detection of HW disequilibrium due to population admixture in the context of localizing genes underlying complex diseases, some issues investigated here should be of general interest in genetics. For example, it is noted here for the first time that the degree of population differentiation as measured by $G_{ST}$ has a direct relationship with the noncentrality parameter (and thus the power) of the test to detect HW disequilibrium (Equation 4). In addition, it is a general practice in population and evolutionary genetics to test for HW disequilibrium as a means to substantiate the assumptions underlying HW equilibrium (such as the absence of population admixture, inbreeding, and assortative mating). Nonsignificant results are generally interpreted as an indication of random mating in the study populations.
| ACKNOWLEDGMENTS |
|---|
We are grateful to Professor Asmussen and the two anonymous reviewers for providing careful comments that helped to improve the manuscript. This study was partially supported by grants from the National Institutes of Health, the Health Future Foundation, and HuNan Normal University and by a graduate student tuition waiver to Wei-Min Chen from Creighton University.
Manuscript received January 4, 2000; Accepted for publication October 30, 2000.
| APPENDIX A |
|---|
COMPUTATION OF THE STATISTICAL POWER BASED ON THE NONCENTRALITY PARAMETER OF THE $\chi^2_{HW}$-STATISTIC
The power of the $\chi^2_{HW}$-test under various degrees of population differentiation $\Delta f$ and various degrees of population admixture $k$ can be computed as

$$\text{power} = \int_{\chi^2_\alpha}^{\infty} f(x, 1, \lambda_{HW})\,dx \tag{A1}$$

where $f(x, 1, \lambda_{HW})$ is the p.d.f. of a noncentral $\chi^2$-distribution with 1 d.f. and the noncentrality parameter $\lambda_{HW}$ defined in Equation 3b. $\chi^2_\alpha$ is the critical value (quantile of order $1 - \alpha$) so that for the central $\chi^2$-distribution [with p.d.f. $f(x, 1, 0)$], which holds under the null hypothesis of HWE, the following relationship holds:

$$\int_{\chi^2_\alpha}^{\infty} f(x, 1, 0)\,dx = \alpha$$

Thus $\alpha$ specifies the type I error (or significance level) of the HWE test for population admixture. For a specified $\alpha$, $\chi^2_\alpha$ can be found in most statistics books. Therefore, with a certain sample size $N$, the power of the test for HWE under various $\Delta f$ and $k$ can be computed by Equations 3a, 3b, and A1. In the power computations, PEARSON's (1959) method is used to approximate noncentral $\chi^2$-distributions, which requires the c.d.f. of central $\chi^2$-distributions, itself obtained by approximation. For a specified power, the sample size $N$ required to detect population admixture with the HWE test can be simply obtained by Equations 3a, 3b, and A1 with the aid of the tables for noncentral $\chi^2$-distributions.
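As a sketch of the power integral in Equation A1, the code below evaluates it in two ways for 1 d.f.: through an exact identity (a noncentral chi-square variable with 1 d.f. is the square of a unit-variance normal variable whose mean is the square root of the noncentrality parameter) and through direct numerical integration of the density. Neither route is the approximation method named in the text; both are our own substitutes for illustration, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

_Z = NormalDist()


def ncx2_pdf_1df(x, lam):
    """p.d.f. of the noncentral chi-square with 1 d.f. and noncentrality lam, for x > 0."""
    r, s = math.sqrt(x), math.sqrt(lam)
    return (_Z.pdf(r - s) + _Z.pdf(r + s)) / (2 * r)


def power_closed_form(lam, alpha=0.05):
    """Upper-tail probability beyond the central critical value, via the normal c.d.f."""
    c = _Z.inv_cdf(1 - alpha / 2)  # square root of the chi-square critical value
    s = math.sqrt(lam)
    return (1 - _Z.cdf(c - s)) + _Z.cdf(-c - s)


def power_by_integration(lam, alpha=0.05, steps=20000):
    """Midpoint-rule evaluation of the integral from the critical value upward."""
    crit = _Z.inv_cdf(1 - alpha / 2) ** 2
    upper = (math.sqrt(lam) + 10.0) ** 2  # the tail beyond this point is negligible
    h = (upper - crit) / steps
    return h * sum(ncx2_pdf_1df(crit + (i + 0.5) * h, lam) for i in range(steps))


if __name__ == "__main__":
    lam = 1.5625  # illustrative noncentrality parameter
    print(power_closed_form(lam), power_by_integration(lam))
```

With a noncentrality parameter of zero the closed form returns the significance level itself, which is the relationship that defines the critical value.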
| APPENDIX B |
|---|
THE RELATIONSHIP OF
HW WITH GST
$G_{ST} = (H_T - \bar{H}_S)/H_T$, where $H_T$ is the heterozygosity if all the isolated populations were converted into a single randomly mating population. $\bar{H}_S$ measures the average heterozygosity of isolated subpopulations. In a population P admixed of populations P1 and P2 with a proportion k from P1 and (1 - k) from P2, for a locus with two alleles M and m with frequencies of M being $f_1$ in P1 and $f_2$ in P2 and the frequency of M in P is f, $H_T = 1 - f^2 - (1 - f)^2$, where f is defined in the text. If the average heterozygosity of P1 and P2 is computed by weighting the heterozygosity in P1 and P2, respectively, by their relative contributions to population P, $\bar{H}_S = 2kf_1(1 - f_1) + 2(1 - k)f_2(1 - f_2)$, then

$$G_{ST} = \frac{k(1 - k)(f_1 - f_2)^2}{f(1 - f)}$$
By the above equation and Equation 3a, we have

$$\lambda_{HW} = N G_{ST}^2$$

where N is the sample size for the $\chi^2_{HW}$-test.
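The quantities in this appendix are easy to check numerically. The sketch below computes $G_{ST}$ both from its definition (via $H_T$ and the weighted average subpopulation heterozygosity) and from the closed form $k(1-k)(f_1-f_2)^2/[f(1-f)]$, taking $f = kf_1 + (1-k)f_2$. The function names are hypothetical. The last function is an assumption rather than something confirmed by this section: it uses the standard 1-d.f. form in which the noncentrality parameter equals N times the squared fixation index, which may differ from the exact form of Equation 3a.

```python
def admixed_allele_frequency(k, f1, f2):
    """Frequency of allele M in the admixed population P."""
    return k * f1 + (1.0 - k) * f2


def gst_from_heterozygosity(k, f1, f2):
    """G_ST = (H_T - H_S) / H_T, computed directly from the definitions."""
    f = admixed_allele_frequency(k, f1, f2)
    h_total = 1.0 - f ** 2 - (1.0 - f) ** 2
    h_subpop = 2.0 * k * f1 * (1.0 - f1) + 2.0 * (1.0 - k) * f2 * (1.0 - f2)
    if h_total == 0:
        raise ValueError("locus is monomorphic in the admixed population")
    return (h_total - h_subpop) / h_total


def gst_closed_form(k, f1, f2):
    """Closed form k(1-k)(f1-f2)^2 / [f(1-f)] for two admixed populations."""
    f = admixed_allele_frequency(k, f1, f2)
    if f in (0.0, 1.0):
        raise ValueError("locus is monomorphic in the admixed population")
    return k * (1.0 - k) * (f1 - f2) ** 2 / (f * (1.0 - f))


def noncentrality_from_gst(sample_size, gst):
    """ASSUMPTION: lambda_HW = N * G_ST^2 (standard 1-d.f. HWE test form)."""
    return sample_size * gst ** 2
```

For example, with k = 0.5, $f_1$ = 0.2 and $f_2$ = 0.8, both routes give $G_{ST}$ = 0.36, and under the stated assumption a sample of 100 individuals would have a noncentrality parameter of 12.96.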
| APPENDIX C |
|---|
FREQUENCIES OF MARKER ALLELES AND GENOTYPES IN CASES AND CONTROLS IN AN ADMIXED POPULATION
Assume that the marker locus is not causally associated with the disease and assume that the marker genotypes (or alleles) and the disease are not associated in populations P1 and P2; the association between the marker and the disease in population P is then due entirely to the admixture. In population P, the expected frequency of the allele M in cases is
$$\Pr(M \mid D) = \Pr(M \mid P_1, D)\Pr(P_1 \mid D) + \Pr(M \mid P_2, D)\Pr(P_2 \mid D) \tag{C1}$$
where Pr(M|P1, D) = Pr(M|P1) due to the independence of the marker allele and disease within populations of P1 and P2,

Similarly,

In population P, the expected frequency of genotype MM in cases is
$$\Pr(MM \mid D) = \Pr(MM \mid P_1, D)\Pr(P_1 \mid D) + \Pr(MM \mid P_2, D)\Pr(P_2 \mid D) \tag{C2}$$
Therefore, from Equations C1 and C2, and after some algebraic simplification, we have

Similarly, we have

| LITERATURE CITED |
|---|
ALLISON, D. B., 1997 Transmission-disequilibrium tests for quantitative traits. Am. J. Hum. Genet. 60:676-690[Medline].
BLUM, K., E. P. NOBLE, P. J. SHERIDAN, A. MONTGOMERY, and T. RITCHIE et al., 1990 Allelic association of human dopamine D2 receptor gene in alcoholism. J. Am. Med. Assoc. 263:2055-2060
BLUM, K., E. P. NOBLE, P. J. SHERIDAN, O. FINLEY, and A. MONTGOMERY et al., 1991 Association of the A1 allele of the D2 Dopamine receptor gene with severe alcoholism. Alcohol 8:409-416[Medline].
BOERWINKLE, E., R. CHAKRABORTY, and C. F. SING, 1986 The use of measured genotype information in the analysis of quantitative phenotypes in man. Ann. Hum. Genet. 50:181-194[Medline].
BOERWINKLE, E., S. VISVIKIS, D. WELSH, J. STEINMETZ, and S. M. HANASH et al., 1987 The use of measured genotype information in the analysis of quantitative phenotypes in man. II. The role of the apolipoprotein E polymorphisms in determining levels, variability, and covariability of cholesterol, betalipoprotein, and triglycerides in a sample of unrelated individuals. Am. J. Med. Genet. 27:567-582[Medline].
BRISCOE, D., J. C. STEPHENS, and S. J. O'BRIEN, 1994 Linkage disequilibrium in admixed populations: applications in gene mapping. J. Hered. 85:59-63
CHAGNON, Y. C., L. PERUSSE, and C. BOUCHARD, 1998 The human obesity gene map: the 1997 update. Obes. Res. 6:76-92.
CHAKRABORTY, R. and P. SMOUSE, 1988 Recombination in haplotypes leads to biased estimates of admixture proportions in human populations. Proc. Natl. Acad. Sci. USA 85:3071-3074
CROW, J. F., 1983 Basic Concepts in Population, Quantitative, and Evolutionary Genetics. Freeman, New York.
CROW, J., and M. KIMURA, 1970 An Introduction to Population Genetics Theory. Harper & Row, New York.
DENG, H.-W. and W. M. CHEN, 2000 Re: "Biased tests of association: comparison of allele frequencies when departing from Hardy-Weinberg proportions." Am. J. Epidemiol. 151:335-357
DENG, H.-W. and M. LYNCH, 1996 Change of genetic architecture in response to sex. Genetics 143:203-212[Abstract].
DENG, H.-W., J. LI, J. L. LI, M. JOHNSON, and G. GONG et al., 1999 Association of VDR and estrogen receptor genotypes with bone mass in postmenopausal caucasian women: different conclusions with different analyses and the implications. Osteoporos. Int. 9:499-507[Medline].
DENG, H.-W., W. M. CHEN, and R. R. RECKER, 2000 QTL fine mapping in extreme samples from populations. Am. J. Hum. Genet. 66:1027-1045[Medline].
DEVLIN, B. and K. ROEDER, 1999 Genomic control for association studies. Biometrics 55:997-1004[Medline].
EGUCHI, S. and M. MATSUURA, 1990 Testing the Hardy-Weinberg equilibrium in the HLA system. Biometrics 46:415-426[Medline].
EISMAN, J. A., 1995 Vitamin D receptor gene alleles and osteoporosis: an affirmative view. J. Bone Miner. Res. 10:1289-1293[Medline].
FEDER, J. N., A. GNIRKE, W. THOMAS, and Z. TSUCHIHASHI et al., 1996 A novel MHC class I-like gene is mutated in patients with hereditary haemochromatosis. Nat. Genet. 13:399-408[Medline].
GELERNTER, J., D. GOLDMAN, and N. RISCH, 1993 The A1 allele at the D2 dopamine receptor gene and alcoholism. J. Am. Med. Assoc. 269:1673-1677
GONG, G., S. STERN, S. C. CHENG, F. FONG, and N. MORDESON et al., 1999 On the association of bone mass density and Vitamin-D genotype polymorphisms. Osteoporos. Int. 9:55-64[Medline].
GUO, S. W. and E. A. THOMPSON, 1992 Performing the exact test of Hardy-Weinberg proportion for multiple alleles. Biometrics 48:361-372[Medline].
HARTL, D. L., and A. G. CLARK, 1989 Principles of Population Genetics. Sinauer Associates, Sunderland, MA.
HEBERT, P. D. N., 1987 Genetics of Daphnia. Mem. Ist. Ital. Idrobiol. 45:439-460.
HERNANDEZ, J. L. and B. S. WEIR, 1989 A disequilibrium coefficient approach to Hardy-Weinberg testing. Biometrics 45:53-70[Medline].
HOLDEN, C., 1994 A cautionary genetic tale: the sobering story of D2. Science 264:1696-1697
JOHNSON, N. L., S. KOTZ, and N. BALAKRISHNAN, 1994 Continuous Univariate Distributions I. John Wiley & Sons, New York.
JOHNSON, N. L., S. KOTZ, and N. BALAKRISHNAN, 1995 Continuous Univariate Distributions II. John Wiley & Sons, New York.
LANDER, E. S. and N. J. SCHORK, 1994 Genetic dissection of complex traits. Science 265:2037-2048
LOUIS, E. J. and E. R. DEMPSTER, 1987 An exact test for Hardy-Weinberg and multiple alleles. Biometrics 43:805-811[Medline].
LING, R. F., 1978 A study of the accuracy of some approximations for t, $\chi^2$, and F tail probabilities. J. Am. Stat. Assoc. 73:274-283.
LYNCH, M., and K. SPITZE, 1994 Evolutionary genetics of Daphnia, pp. 109-128 in Ecological Genetics, edited by L. REAL. Princeton University Press, Princeton, NJ.
MORRISON, N. A., J. C. QI, A. TOKITA, P. J. KELLY, and L. CROFTS et al., 1994 Prediction of bone density from vitamin D receptor alleles. Nature 367:284-287[Medline].
NAM, J. M., 1997 Testing a genetic equilibrium across strata. Ann. Hum. Genet. 61:163-170[Medline].
NEI, M., 1975 Molecular Population Genetics and Evolution. North-Holland/American Elsevier, Amsterdam.
NIELSEN, D. M., M. G. EHM, and B. S. WEIR, 1999 Detecting marker-disease association by testing for Hardy-Weinberg disequilibrium at a marker locus. Am. J. Hum. Genet. 63:1531-1540.
OTT, J., 1991 Analysis of Human Genetic Linkage. Johns Hopkins University Press, Baltimore.
PAGE, G. P. and C. I. AMOS, 1999 Comparison of linkage-disequilibrium methods for localization of genes influencing quantitative traits in humans. Am. J. Hum. Genet. 64:1194-1205[Medline].
PATO, C. N., F. MACCIARDI, M. T. PATO, M. VERGA, and J. L. KENNEDY, 1993 Review of the putative association of dopamine D2 receptor and alcoholism: a meta-analysis. Am. J. Med. Genet. 48:78-82[Medline].
PEACOCK, M., 1995 Vitamin D receptor gene alleles and osteoporosis: a contrasting view. J. Bone Miner. Res. 10:1294-1297[Medline].
PEARSON, E. S., 1959 Note on an approximation to the distribution of noncentral $\chi^2$. Biometrika 46:364.
PRITCHARD, J. K. and N. A. ROSENBERG, 1999 Use of unlinked genetic markers to detect population stratification in association studies. Am. J. Hum. Genet. 65:220-228[Medline].
PRITCHARD, J. K., M. STEPHENS, and P. DONNELLY, 2000a Inference of population admixture using multilocus genotype data. Genetics 155:945-959
PRITCHARD, J. K., M. STEPHENS, N. A. ROSENBERG, and P. DONNELLY, 2000b Association mapping in structured populations. Am. J. Hum. Genet. 67:170-181[Medline].
RISCH, N. and J. TENG, 1998 The relative power of family-based and case-control designs for linkage disequilibrium studies of complex human diseases. I. DNA pooling. Genome Res. 8:1273-1288
SCHAID, D. J. and S. J. JACOBSEN, 1999 Biased tests of association: comparisons of allele frequencies when departing from Hardy-Weinberg proportions. Am. J. Epidemiol. 149:706-711
SHOEMAKER, J., I. PAINTER, and B. S. WEIR, 1998 A Bayesian characterization of Hardy-Weinberg disequilibrium. Genetics 149:2079-2088
SPIELMAN, R. S., R. E. MCGINNIS, and W. J. EWENS, 1993 Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am. J. Hum. Genet. 52:506-516[Medline].
TIRET, L. and F. CAMBIEN, 1995 Letter: Departure from Hardy-Weinberg equilibrium should be systematically tested in studies of association between genetic markers and disease. Circulation 92:3364-3365.
WEIR, B. S., 1996 Genetic Data Analysis II. Sinauer, Sunderland, MA.
WILSON, E. B. and M. M. HILFERTY, 1931 The distribution of chi-square. Proc. Natl. Acad. Sci. USA 17:684-688
XIONG, M. M., J. KRUSHKAL, and E. BOERWINKLE, 1998 TDT statistics for mapping quantitative trait loci. Ann. Hum. Genet. 62:431-452[Medline].