Abstract
The probability of multilocus genotype counts conditional on allelic counts and on allelic independence provides a test statistic for independence within and between loci. As the number of loci increases and each sampled genotype becomes unique, the conditional probability becomes a function of total heterozygosity. In that case, it does not address between-locus dependence directly but only indirectly through detection of the Wahlund effect. Moreover, the test will reject the hypothesis of allelic independence only for small values of heterozygosity. Low heterozygosity is expected for population subdivision but not for population admixture. The test may therefore be inappropriate for admixed populations. If individuals with parents in two different populations are always considered to belong to one of the populations, then heterozygosity is increased in that population and the exact test should not be used for sparse data sets from that population. If such a case is suspected, then alternative testing strategies are suggested.
In forensic science, multilocus genotype frequencies are often estimated as products of allele frequencies. Although this is expected to be appropriate for large random-mating populations, especially for unlinked loci, it is customary to check for evidence of allelic dependencies before invoking the product rule. Exact tests have been shown to have satisfactory power, at least in comparison to alternative testing strategies (Maiste and Weir 1995), and it was shown by Zaykin et al. (1995) that power appears to increase with the number of loci for populations with substructure. We show that a similar increase in power may not hold for admixed populations. This is a consequence of increased heterozygosity as opposed to the decrease expected in populations with substructure (Walsh and Buckleton 1988).
STATISTICAL TESTING PROCEDURES
Exact test: Suppose that the lth of L loci has alleles A_{li}, where the range of i is left arbitrary but is understood to depend on the locus index, l. Then the product rule expresses the frequency of the L-locus genotype, A_{1i}A_{1j}A_{2i}A_{2j}...A_{Li}A_{Lj}, as the product of the frequencies of all 2L constituent alleles, along with a factor of 2 for each locus that is heterozygous. If this genotype is regarded as being the gth of all possible L-locus genotypes, the product-rule frequency is
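In the notation of this paragraph, the product-rule frequency can be written as below. This is a reconstruction from the verbal description rather than a quotation: P_{g} (the frequency of the gth genotype), h_{g} (the number of heterozygous loci in that genotype), and p_{li} (the frequency of allele A_{li}) are symbols assumed here, and the label (1) follows the later references to Equation 1.

```latex
P_g \;=\; 2^{h_g} \prod_{l=1}^{L} p_{li}\, p_{lj} \qquad (1)
```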
With random sampling, the probability of a sample of size n having n_{g} copies of the gth genotype (n = Σ_{g} n_{g}) is
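Random sampling of n genotypes corresponds to a multinomial distribution, so the probability described here can be sketched as follows, with P_{g} assumed to denote the frequency of the gth genotype.

```latex
\Pr(\{n_g\}) \;=\; \frac{n!}{\prod_g n_g!} \prod_g P_g^{\,n_g}
```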
The conditional probability P_{c} of the genotype counts given the allelic counts, if Equation 1 holds, is therefore
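Dividing the multinomial probability of the genotype counts by the product over loci of the multinomial probabilities of the allele counts gives the form below. It is a reconstruction consistent with the surrounding text, assuming n_{li} denotes the sample count of allele A_{li} (so that the n_{li} sum to 2n at each locus) and h_{g} denotes the number of heterozygous loci in genotype g.

```latex
P_c \;=\; \frac{n!\; 2^{h} \prod_{l=1}^{L} \prod_i n_{li}!}
               {\left[(2n)!\right]^{L} \prod_g n_g!},
\qquad h = \sum_g n_g h_g
```

Only the factors 2^{h} and the product of the n_{g}! can change when alleles are permuted within loci, because n, L, and the allele counts n_{li} are fixed.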
The quantity h = Σ_{g}n_{g}h_{g} is the total number of heterozygous loci in the sample and lies between 0 and nL. The unknown allele probabilities have canceled out and, if the hypothesis of independence is false, small values of P_{c} will be observed. To carry out an exact test, all possible arrays of genotype counts with the same allelic counts as the observed data are examined. The significance level for the test is the sum over all arrays of the values of the conditional probability that are as small as or smaller than the value of P_{c} for the data (Guo and Thompson 1992). In practice, this exact test is performed by repeatedly permuting the alleles at each locus separately to form new multilocus genotypes and noting the proportion of permuted data sets with a P_{c} value as small as or smaller than that for the original data. Small values of P_{c} can result from values of h that are either smaller or larger than that expected under allelic independence (e.g., Table 3.1 of Weir 1996), so the test is two sided in terms of heterozygosity.
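The permutation procedure described above can be sketched in Python as follows. This is a minimal illustration, not the authors' program: the array layout (individuals by loci by two alleles), the function names, and the default of 2500 permutations are assumptions made here. Only the part of ln(P_{c}) that varies under permutation is computed, namely h ln 2 minus the sum of ln(n_{g}!).

```python
import math
from collections import Counter

import numpy as np


def log_pc_kernel(genos):
    """Permutation-variant part of ln(P_c): h*ln(2) - sum_g ln(n_g!).

    genos: integer array of shape (n, L, 2) holding allele labels.
    """
    g = np.sort(genos, axis=2)                 # unordered allele pair at each locus
    h = int(np.sum(g[:, :, 0] != g[:, :, 1]))  # total number of heterozygous loci
    counts = Counter(row.tobytes() for row in g)  # multilocus genotype counts n_g
    return h * math.log(2.0) - sum(math.lgamma(c + 1) for c in counts.values())


def exact_test_pvalue(genos, n_perm=2500, seed=0):
    """Proportion of permuted data sets with P_c as small as or smaller than observed."""
    rng = np.random.default_rng(seed)
    n, L, _ = genos.shape
    observed = log_pc_kernel(genos)
    hits = 0
    for _ in range(n_perm):
        perm = np.empty_like(genos)
        for l in range(L):
            # Shuffle the 2n alleles at this locus independently of other loci.
            alleles = genos[:, l, :].reshape(-1)
            perm[:, l, :] = rng.permutation(alleles).reshape(n, 2)
        if log_pc_kernel(perm) <= observed + 1e-9:
            hits += 1
    return hits / n_perm
```

For a hypothetical one-locus sample made up only of homozygotes for two equally frequent alleles, the returned proportion is expected to be very small, reflecting the extreme heterozygote deficit.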
As the number of alleles at a locus and the number of loci increase, it becomes more and more likely that each multilocus genotype in a sample will occur once only. The product over g of the factorials n_{g}! therefore tends to one, and this is unlikely to be changed by permuting alleles. Permutation leaves the allele counts n_{li} unchanged but can change the number of heterozygotes in the sample at each locus. In other words, the probability P_{c} for an array of genotype counts becomes proportional to 2^{h}. Evidently the exact test tends toward a test for heterozygosity, but in the sense that only arrays with small values of h can lead to rejection of the hypothesis implied by Equation 1. The test is now a one-sided test for total heterozygosity. The number of heterozygotes, h, has additive contributions from each locus and so has no between-locus component, although its distribution and hence its variance are affected by between-locus dependencies. It retains an indirect ability to detect some between-locus dependencies by its ability to detect the Wahlund effect as shown below.
Goodness-of-fit tests: The original aim of testing for independence over all alleles with the exact test is lost when sampled genotypes are unique as the conditional probability does not involve any direct information on the relationships between the loci observed in the data. Would a goodness-of-fit test for independence over genotypes do any better? When every genotype in a sample is unique, n_{g} = 1, such a chi-square goodness-of-fit statistic becomes
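A short worked derivation, offered as a sketch under the usual Pearson form of the statistic rather than as the authors' exact expression: with P_{g} assumed to be the product-rule frequency of genotype g and the sum in the first expression taken over all possible genotypes, the statistic reduces to a sum over the observed genotypes only.

```latex
X^2 \;=\; \sum_g \frac{(n_g - nP_g)^2}{nP_g}
    \;=\; \sum_g \frac{n_g^2}{nP_g} - n
    \;=\; \sum_{g:\,n_g = 1} \frac{1}{nP_g} \;-\; n
```

The middle step uses the facts that the n_{g} sum to n and the P_{g} sum to one. In practice the P_{g} would be estimated from sample allele frequencies.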
Heterozygosity test: We have shown that for sparse data sets, with each sampled genotype being unique, allelic independence between loci is not addressed by the exact test. That test can be regarded instead as supporting the hypothesis that the total heterozygosity h has the value expected under independence rather than the alternative that h is less than expected. We write the test as T_{h}. Alternatively, a one-sided test can be made against the alternative that h is greater than expected. When this distinction is needed the tests will be denoted by T^{–}_{h} and T^{+}_{h}, respectively.
At each individual locus, the hypothesis that heterozygosity has the value expected under allelic independence (Hardy-Weinberg equilibrium) can also be addressed by a comparison of observed and expected heterozygosities, resulting in a chi-square statistic with 1 d.f. This test is two-sided as both large and small values of heterozygosity can lead to rejection. Under the hypothesis of independent loci the single-locus statistics can be added together. Alternatively, allelic permutation can be carried out separately for each locus and empirical significance levels generated for tests based on h. In this case the empirical significance levels of each of the tests would have to be adjusted because of the multiple comparisons.
Evett and Weir (1998) noted that total heterozygosity tests are not tests of the Hardy-Weinberg hypothesis. It is possible that individual homozygote frequencies could depart considerably from expectation in a way that the sum of homozygote frequencies would still be near its expected value. Alternatives such as the Wahlund effect would produce a situation where all the homozygote proportions increase relative to expectation so that this type of effect should be detected by this test.
Variance-of-heterozygosity test: Sums of single-locus heterozygosities do not contain information about between-locus dependencies, but there is information in the variance of the single-locus heterozygosities. Brown et al. (1980) and Chakraborty (1984) pointed out that the variance over genotypes of the number of heterozygous loci has a value that depends on between-locus associations. Those authors considered random-mating populations where linkage disequilibrium is the only dependency, and Yang (2000) extended this work to allow for nonrandom mating populations and Hardy-Weinberg disequilibrium. The variance-of-heterozygosity test can be used to test for the presence of linkage disequilibrium among loci.
For a sample of n genotypes the variance-of-heterozygosity test statistic, V, is given by
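As an illustration of the quantity described by Brown et al. (1980), the variance over individuals of the number of heterozygous loci can be computed as in the sketch below. The normalization (sample variance with divisor n - 1), the function name, and the array layout are assumptions made here; the exact definition of V used in the paper may differ.

```python
import numpy as np


def heterozygosity_variance(genos):
    """Sample variance, over individuals, of the number of heterozygous loci.

    genos: integer array of shape (n, L, 2) holding allele labels.
    The divisor n - 1 is an assumption, not taken from the source.
    """
    g = np.asarray(genos)
    het_per_individual = np.sum(g[:, :, 0] != g[:, :, 1], axis=1)
    return float(np.var(het_per_individual, ddof=1))
```

An empirical significance level could then be obtained by recomputing this value on data sets formed by permuting alleles separately at each locus, in the same way as for the other tests.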
GENETIC MODELS
Structured populations: An idealized model of population structure either has all current populations descending from a reference population (Falconer 1960) in a star phylogeny or has a series of bifurcations of populations over time so that there is a tree of populations. In either case, it may be supposed that a large ideal population consists of a series of subpopulations in which allelic frequencies are different because of genetic drift. Consider such a population in which a proportion α_{k} of individuals belong to the kth subpopulation. The frequency of allele A_{li} is p_{lik} in the kth subpopulation and is p_{li} = Σ_{k}α_{k}p_{lik} in the whole population.
Even if there is allelic independence within subpopulations, the Wahlund effect causes a dependence to exist at the population level whenever the allele frequencies vary among subpopulations. One way to quantify this effect is by the difference between actual and expected heterozygosities in the whole population,
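For a single locus l, and assuming Hardy-Weinberg proportions within each subpopulation, the difference can be worked out as below. H_{l} and H_{l}^{exp} are symbols introduced here for the actual and expected heterozygosities in the whole population; this is a sketch consistent with the definitions above, not necessarily the form printed in the paper.

```latex
H_l - H_l^{\mathrm{exp}}
  \;=\; \Bigl(1 - \sum_i \sum_k \alpha_k p_{lik}^2\Bigr)
        - \Bigl(1 - \sum_i p_{li}^2\Bigr)
  \;=\; -\sum_i \sum_k \alpha_k \,(p_{lik} - p_{li})^2 \;\le\; 0
```

The difference is minus the sum over alleles of the variance of allele frequencies among subpopulations, so heterozygosity is reduced whenever those frequencies vary.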
The Wahlund effect also produces between-locus dependencies. Linkage disequilibrium in the whole population, or the difference between the joint frequency of pairs of alleles at different loci and the product of their separate frequencies, is given by
If the whole population now mates at random, allele frequencies remain at p_{li}. Single-locus genotype frequencies become products of these allele frequencies. The population heterozygosity equals the value expected under within-locus allelic independence, so goodness-of-fit tests for heterozygosity, or exact tests for allelic association in sparse data sets, are not expected to give significant results. Linkage disequilibria will decay at a rate depending on the recombination fractions between loci and can be detected by tests at each pair of loci or by the test of Brown et al. (1980) over all loci.
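Assuming linkage equilibrium within each subpopulation, the disequilibrium between allele A_{li} at locus l and allele A_{l′j} at locus l′ can be sketched as a covariance of allele frequencies over subpopulations. The symbol D_{li,l′j} is introduced here for illustration.

```latex
D_{li,\,l'j}
  \;=\; \sum_k \alpha_k \, p_{lik}\, p_{l'jk} \;-\; p_{li}\, p_{l'j}
  \;=\; \sum_k \alpha_k \,(p_{lik} - p_{li})(p_{l'jk} - p_{l'j})
```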
Admixed populations: A model for human populations that may be more appropriate for recent history has previously distinct populations admixing. Such admixture also creates dependencies among allele frequencies, but in a way different from that of the Wahlund effect. The population structure model assumed that subpopulations remained distinct and provides relationships between genotype and allele frequencies in the whole population. The admixture model assumes the modification of some subpopulations by the influx of alleles from other subpopulations.
A simple example supposes the parental generation to be composed of two random-mating populations, indexed by k = 1, 2, in which the frequencies of alleles A_{li} at locus l are p_{lik}. In the next generation, a proportion m_{kk} of the individuals in the admixed population have both parents in population k, and a proportion 2m_{kk′} have one parent in each of populations k and k′. The offspring genotype proportions in the admixed population are, therefore,
It is convenient to introduce the quantities π_{k}, k = 1, 2 as the probabilities of a random allele in the admixed population having come from parental population k, so π_{1} = m_{11} + m_{12} and π_{2} = m_{12} + m_{22}. We can define an assortative-mating parameter M, which measures the tendency for within-population mating,
Equation 3 shows that a preference for within-population matings, M > 0, will result in fewer heterozygotes than expected. The exact test for allelic independence, acting as a one-sided test for heterozygosity, should therefore detect such assortative mating. However, if M < 0 the exact test will not perform well.
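Under the stated model, with m_{11} + 2m_{12} + m_{22} = 1, the single-locus proportions follow by weighting each mating type by its frequency. The expressions below are a reconstruction from that description; the locus subscript l is kept and P_{lii}, P_{lij} are symbols assumed here for homozygote and heterozygote proportions.

```latex
P_{lii} \;=\; m_{11}\, p_{li1}^2 + 2 m_{12}\, p_{li1} p_{li2} + m_{22}\, p_{li2}^2 ,
\qquad
P_{lij} \;=\; 2 m_{11}\, p_{li1} p_{lj1}
          + 2 m_{12}\,(p_{li1} p_{lj2} + p_{lj1} p_{li2})
          + 2 m_{22}\, p_{li2} p_{lj2} \quad (i \ne j)
```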
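The link between assortative mating and heterozygote deficit can be checked numerically. The sketch below assumes the definition M = m_{11} - π_{1}^{2} (which also equals m_{22} - π_{2}^{2} and π_{1}π_{2} - m_{12}); this choice is an assumption made for illustration and may not match the paper's own definition. Under it, the homozygote excess for an allele works out to M(p_{1} - p_{2})^{2}, where p_{1} and p_{2} are that allele's frequencies in the two parental populations.

```python
def admixed_homozygote_excess(m11, m12, m22, p1, p2):
    """Return (direct homozygote excess, M * (p1 - p2)**2) for one allele.

    m11, m22: proportions with both parents in population 1 or 2.
    m12: half the proportion with one parent in each population.
    p1, p2: frequencies of the allele in the two parental populations.
    M = m11 - pi1**2 is an assumed definition, not taken from the source.
    """
    if abs(m11 + 2 * m12 + m22 - 1.0) > 1e-9:
        raise ValueError("mating proportions must satisfy m11 + 2*m12 + m22 = 1")
    pi1 = m11 + m12                      # probability an allele came from population 1
    pi2 = m12 + m22                      # probability an allele came from population 2
    p = pi1 * p1 + pi2 * p2              # allele frequency in the admixed population
    homozygote = m11 * p1**2 + 2 * m12 * p1 * p2 + m22 * p2**2
    big_m = m11 - pi1**2                 # assumed assortative-mating parameter
    return homozygote - p**2, big_m * (p1 - p2) ** 2
```

With random mating between the two populations (m_{11} = m_{12} = m_{22} = 0.25) the assumed M is zero and no homozygote excess arises, in line with the statement that single-locus heterozygosities then equal their expected values.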
Population dominance: A quite different situation arises when there is some “dominance” in population assignment. If individuals with either one or two parents in population k are assigned to that population, then there will be an excess of heterozygotes in that population. For the admixed population considered in the last section, suppose that individuals with both parents from population 1 are considered to belong to population 1 but individuals with at least one parent in population 2 are considered to belong to population 2. This may be the situation in New Zealand where population 1 represents Caucasians and population 2 represents Maoris.
Among the population 2 members of the admixed population, homozygote and allele frequencies for A_{i} at locus l are
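Under this assignment rule, population 2 consists of the fraction 2m_{12} + m_{22} = 1 - m_{11} of individuals with at least one parent in population 2. Renormalizing the admixture proportions by that fraction gives the sketch below. It is a reconstruction under the model of the previous section, with the superscript (2) introduced here to mark quantities among population 2 members.

```latex
P_{lii}^{(2)} \;=\; \frac{2 m_{12}\, p_{li1} p_{li2} + m_{22}\, p_{li2}^2}{2 m_{12} + m_{22}},
\qquad
p_{li}^{(2)} \;=\; \frac{m_{12}\, p_{li1} + (m_{12} + m_{22})\, p_{li2}}{2 m_{12} + m_{22}}
```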
NUMERICAL RESULTS
Structured populations: Simulations of the drift process were performed for 10 populations of size N = 1000 and for L = 1, 2, 3, 4, and 10 loci each with 5 or 10 equiprobable alleles per locus. The power of the exact test was found for samples of n = 200 individuals from the whole population after t = 0, 20, 60, 103, and 213 generations—corresponding to population structure parameter, θ = 1 – (1 – 1/2N)^{t}, with values of 0.00, 0.01, 0.03, 0.05, and 0.10. The power was also found for test T^{–}_{h} on the basis of values of the total heterozygosity h. For each set of simulated data, the exact and heterozygosity tests led to rejection if the P_{c} or h values were among the smallest 5% of the values found from 2500 sets of data formed by permuting alleles separately at each locus. Powers were calculated as the proportions of rejections from 500 simulated data sets. The standard errors for the estimated powers can be calculated assuming sampling from the binomial distribution. Since 500 replications were used, the standard errors for the estimated powers are <0.0224. The results are shown in Table 1.
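The correspondence between generations and θ, and the bound on the standard error, can be verified with a few lines of Python. The function names are hypothetical; the formulas are the ones quoted in the paragraph above, with the binomial standard error maximized at a power of 0.5.

```python
import math


def theta(t, N=1000):
    """Population structure parameter after t generations of drift."""
    return 1.0 - (1.0 - 1.0 / (2 * N)) ** t


def max_binomial_se(replicates):
    """Largest possible standard error of an estimated power (attained at 0.5)."""
    return math.sqrt(0.25 / replicates)


thetas = [round(theta(t), 2) for t in (0, 20, 60, 103, 213)]
se_bound = max_binomial_se(500)
```

Here `thetas` reproduces the values 0.00, 0.01, 0.03, 0.05, and 0.10 after rounding to two decimals, and `se_bound` is about 0.02236, consistent with the stated bound of 0.0224.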
The power for both tests increases with θ, with the number of loci, and with the number of alleles per locus, as found previously by Zaykin et al. (1995). However, as might be expected, the power was somewhat greater for the heterozygosity test as the data became sparser.
The relationship between the two tests is shown graphically in Figure 1, as plots of ln(P_{c}) against h. For sparse data, the relationship between these two statistics becomes linear, reflecting the dependence of P_{c} only on h among permuted data sets. Even for three loci and samples of size 200, the data are sufficiently sparse that the exact test does not detect between-locus dependencies.
Admixed populations: Two of the simulated populations described in the previous section were allowed to contribute equally to an admixed population, m_{11} = m_{12} = m_{22} = 0.25. One generation of random mating was simulated and the exact test, the total heterozygosity tests, and the variance of heterozygosity test were performed using a sample of n = 200 genotypes. The powers were calculated as in the previous section and results are shown in Table 2. The simulations were repeated using samples of size 500 and the results for θ = 0.10 are shown in Table 3.
As expected, the heterozygosity test has power equal to the significance level since all single-locus heterozygosities are equal to their expected values. There is linkage disequilibrium, however, so the variance-of-heterozygosity test has power that increases with θ. For the exact test, power increases with θ in a manner similar to the variance-of-heterozygosity test until the number of loci becomes so large (greater than two) that data sparseness reduces the test to one of heterozygosity.
Admixed populations with population dominance were also simulated by setting m_{11} = 0 and m_{12} = m_{22} = 0.33. The exact test, the total heterozygosity tests, and the variance of heterozygosity test were performed using samples of size 200. The powers calculated from 500 replications are shown in Table 3.
There is a discontinuity in the results of the exact test for admixture under random mating between one and two loci shown in Table 3. For a randomly mating admixed population there will be little or no within-locus association, but substantial between-locus association. Thus the exact test for a single locus has power 0.05. However, the exact test can detect between-locus association for two or more loci if the sample size is sufficiently large. A sample size of n = 500 will detect the between-locus association for two loci with a power of 0.35. As the number of loci increases for a fixed sample size the genotype array becomes increasingly sparse and the power is seen to drop back to 0.05.
The powers of the heterozygosity test T^{–}_{h} are less than the significance level while the powers of the T^{+}_{h} tests increase with θ, since the heterozygosity tests are one-sided. The exact test is similar to the heterozygosity test T^{–}_{h} except for one-locus tests. This is because when only one locus was used in the test, the genotype arrays were not sparse and the exact test is in effect a two-sided test.
The variance-of-heterozygosity test statistic is affected by both within-locus and between-locus associations. In the case of population dominance, within-locus and between-locus associations have opposite effects on V. When the number of loci used is small, V is affected mostly by the within-locus association. As the number of loci increases, the number of pairwise between-locus associations increases and these balance out the effects of within-locus associations. As a result, the empirical power first increases and then decreases as the number of loci increases.
DISCUSSION
Care is needed in applying tests for allelic independence to check on the validity of the product rule in Equation 1. For small numbers of loci, when multilocus genotypes can occur several times in a sample, the exact test is good for associations both within and between loci.
As the number of loci increases, however, the exact test becomes a test of total heterozygosity, but it offers no information on between-locus associations. The numerical work of Zaykin et al. (1995) for association generated by population structure showed increased power for increased numbers of loci, but this reflected only the decreased heterozygosity. The increase in between-locus association did not affect the exact test when each sampled genotype was unique.
For populations undergoing random mating following amalgamation of a set of divergent subpopulations, i.e., admixture, there is little point in performing a test for overall heterozygosity or in performing the exact test for sparse data sets. There are no within-locus associations, and the between-locus associations will not contribute to these test statistics. This random-mating situation is the one envisaged by Brown et al. (1980), so their among-locus variance of heterozygosity appears to be an appropriate test statistic. Any test that detects less heterozygosity than expected will not be appropriate with population dominance and increases in heterozygosity.
Acknowledgments
Very helpful comments were made by the reviewers. This work was supported in part by a New Zealand Institute of Environmental and Scientific Research grant to B. Law and U.S. National Institutes of Health grant GM 45344 to North Carolina State University.
Footnotes

Communicating editor: A. H. D. Brown
 Received November 14, 2002.
 Accepted January 14, 2003.
 Copyright © 2003 by the Genetics Society of America