New Methods Employing Multilocus Genotypes to Select or Exclude Populations as Origins of Individuals
Jean-Marie Cornuet^a, Sylvain Piry^a, Gordon Luikart^b, Arnaud Estoup^a, and Michel Solignac^c
a INRA-CBGP, 34090 Montpellier, France
b Laboratoire de Biologie des Populations d'Altitude, CNRS UMR 5553, Université Joseph Fourier, BP 53, 38014 Grenoble, France
c Laboratoire Population, Génétique et Evolution, CNRS, 91198 Gif-sur-Yvette, France
Corresponding author: Jean-Marie Cornuet, INRA-CBGP, 488 rue Croix de Lavit, 34090 Montpellier, France. E-mail: cornuet{at}ensam.inra.fr
Communicating editor: G. B. GOLDING
| ABSTRACT |
|---|
A new method for assigning individuals of unknown origin to populations, based on the genetic distance between individuals and populations, was compared to two existing methods based on the likelihood of multilocus genotypes. The distribution of the assignment criterion (genetic distance or genotype likelihood) for individuals of a given population was used to define the probability that an individual belongs to the population. Using this definition, it becomes possible to exclude a population as the origin of an individual, a useful extension of the currently available assignment methods. Using simulated data based on the coalescent process, the different methods were evaluated, varying the time of divergence of populations, the mutation model, the sample size, and the number of loci. Likelihood-based methods (especially the Bayesian method) always performed better than distance methods. Other things being equal, genetic markers were always more efficient when evolving under the infinite allele model than under the stepwise mutation model, even for equal values of the differentiation parameter Fst. Using the Bayesian method, a 100% correct assignment rate can be achieved by scoring ca. 10 microsatellite loci (H ≈ 0.6) on 30-50 individuals from each of 10 populations when the Fst is near 0.1.
INTRASPECIFIC or juxtaspecific taxonomy has long been based on genetic studies using moderately variable markers [...]
Following the suggestion of [...]
In contrast, the question of assigning an individual to a population was specifically addressed in several other studies with purposes such as classifying individual fish [...]
In the few examples cited above, a rather large variety of methods have been used, which can be grouped into two categories. The first category includes "general methods," i.e., methods that can be applied to almost any kind of data. Two methods of this category, the discriminant analysis and the method of neural networks, have been used and their performances compared by [...]
A common limitation to all existing assignment methods is that they always designate a single population as the probable source of the individual being assigned. They only answer the question: Among these particular populations, which is the most likely to be the individual's population of origin? If the population of origin of the individual is not represented in the set of reference populations, the methods will still designate a (wrong) population of origin. In other words, the existing genetic methods do not provide any clear indication of the confidence we can put in the designated population. In some contexts, it can be more important to exclude a given population than to designate a most likely one. The questions of the confidence in the choice or the exclusion of a population would be solved if we had a measure of the probability that the individual belongs to a population. We might then exclude a population because the probability that the individual belongs to it is lower than a given threshold.
In this article, we propose a new assignment method based on genetic distances between an individual and a population that does not assume Hardy-Weinberg or linkage equilibrium. The performance of this method is evaluated for different genetic distances and is compared to the assignment methods of [...]
| METHODS |
|---|
We first present three methods for assigning individuals to populations. To each assignment method corresponds an "exclusion" method, the principle of which, being common, is only discussed once. All methods are based on the knowledge of multilocus genotypes of representative samples taken from the candidate populations and of the individual(s) to be assigned.
Assignment methods:
The frequency method:
This method, first presented by [...]
In addition to the two assumptions of Hardy-Weinberg equilibrium and independence of loci, there is a third unstated assumption that the allelic frequencies deduced from the population samples are close to their exact values. A particular case, noted by [...]
The Bayesian method:
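This second method, similar to the previous one, is largely inspired by [...] The likelihood of a genotype at locus j in population i is given by Equation 1 when the genotype carries two copies of allele k and by Equation 2 when it carries two different alleles k and k':

$$\frac{(n_{ijk} + 1/K_j + 1)(n_{ijk} + 1/K_j)}{(n_{ij} + 2)(n_{ij} + 1)} \qquad (1)$$

$$\frac{2(n_{ijk} + 1/K_j)(n_{ijk'} + 1/K_j)}{(n_{ij} + 2)(n_{ij} + 1)} \qquad (2)$$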
where $n_{ijk}$ is the number of alleles k sampled at locus j in population i (not counting the individual to be assigned), $n_{ij}$ is the number of gene copies sampled at locus j in population i ($n_{ij} = \sum_{k=1}^{K_j} n_{ijk}$), and $K_j$ is the total number of alleles observed in the whole collection of populations at locus j.
This method is performed in the same way as the frequency method, simply replacing the formulas for computing the likelihood of a genotype with Equations 1 and 2 above. Note that the difficulty raised by null frequencies in the previous method disappears here because of the coefficient $1/K_j$, which results from the computations.
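The Bayesian criterion can be sketched in a few lines of code. This is a minimal illustration, not the GeneClass implementation. It assumes that Equation 1 is the homozygote likelihood (n_ijk + 1/K_j + 1)(n_ijk + 1/K_j) / [(n_ij + 2)(n_ij + 1)] and Equation 2 the heterozygote likelihood 2(n_ijk + 1/K_j)(n_ijk' + 1/K_j) / [(n_ij + 2)(n_ij + 1)], the form implied by the 1/K_j coefficient and by the allele-drawing probabilities given under "Exclusion methods". All function and variable names are hypothetical.

```python
from math import log


def locus_likelihood(counts, genotype, n_alleles_total):
    """Likelihood of one diploid genotype at one locus in one population.

    counts: dict mapping allele -> copies sampled in the population (n_ijk)
    genotype: tuple of the individual's two alleles at this locus
    n_alleles_total: K_j, alleles observed at this locus over all populations
    """
    n_ij = sum(counts.values())
    inv_k = 1.0 / n_alleles_total
    first, second = genotype
    denom = (n_ij + 2) * (n_ij + 1)
    if first == second:  # assumed form of Equation 1 (homozygote)
        n_k = counts.get(first, 0)
        return (n_k + inv_k + 1) * (n_k + inv_k) / denom
    # assumed form of Equation 2 (heterozygote)
    return 2 * (counts.get(first, 0) + inv_k) * (counts.get(second, 0) + inv_k) / denom


def log_likelihood(pop_counts, individual, k_per_locus):
    """Sum of per-locus log-likelihoods, treating loci as independent."""
    return sum(
        log(locus_likelihood(counts, genotype, k))
        for counts, genotype, k in zip(pop_counts, individual, k_per_locus)
    )


def assign(populations, individual, k_per_locus):
    """Name of the reference population with the highest likelihood."""
    return max(
        populations,
        key=lambda name: log_likelihood(populations[name], individual, k_per_locus),
    )
```

Because 1/K_j is added to every allele count, an allele that is absent from a population sample still receives a nonzero likelihood, which is why null frequencies are no longer a problem.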
The distance method:
Whereas the two previous methods are based on the probability of observing a given genotype in the various reference populations, the distance method assigns the individual to the "closest" population. Because data here are genotypes, the distance will be a genetic distance. Numerous genetic distances have been defined [...]
Apart from providing a different basis for conducting assignment methods, the distance method can be adapted to different categories of genetic markers. For instance, some distances have been defined especially for microsatellites such as the (δµ)² of Goldstein et al. (1995) [...]
Exclusion methods:
The above three methods have two characteristics in common: (i) they are based on a criterion relating the individual to each population (e.g., a genetic distance between the individual and the population), the best candidate population being the one with the highest/lowest value of the criterion; and (ii) they always designate a population to which the individual can be assigned, because there is always a most likely or a closest population in any reference set. However, the set of reference populations may not include the true population of origin of the individual. Therefore, a measure of confidence that the individual truly belongs to a given population is needed. This can be achieved by comparing the value of the criterion of the individual (relative to the given population) with values of the criterion for individuals that belong to the population. More precisely, we need to locate the criterion value of the individual within the distribution of values for individuals of the population. If the individual's criterion is well outside the distribution, it seems logical to consider that the individual does not belong to the population. Furthermore, the proportion of the distribution with values "worse" (higher for a distance criterion or lower for a probability/likelihood criterion) than the tested individual's value can be considered as a measure of the probability that this individual belongs to the population. For instance, suppose that the distance between the tested individual and the population is 0.9 and that 97% of the distribution of distances between population individuals and the population is <0.9. We consider that the probability that the tested individual belongs to the population is only 3%. Note that it is possible to use an exclusion method as an assignment method (the individual is assigned to the population for which its probability of belonging is the highest).
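The probability of belonging described above reduces to a tail proportion. The following is a minimal sketch, assuming the criterion values for individuals of the population are already available as a list; the function names and the `higher_is_worse` flag are illustrative choices, not part of the original description.

```python
def probability_of_belonging(individual_value, reference_values, higher_is_worse=True):
    """Share of the reference distribution that is 'worse' than the individual.

    higher_is_worse=True suits a distance criterion; use False for a
    likelihood criterion, where lower values are worse.
    """
    if higher_is_worse:
        worse = sum(1 for value in reference_values if value > individual_value)
    else:
        worse = sum(1 for value in reference_values if value < individual_value)
    return worse / len(reference_values)


def is_excluded(probability, threshold=0.01):
    """Exclude the population when the probability falls below the threshold."""
    return probability < threshold


# Toy distribution matching the example in the text: 97% of distances are < 0.9.
distances = [0.5] * 97 + [1.2] * 3
p = probability_of_belonging(0.9, distances)
print(p)               # 0.03
print(is_excluded(p))  # False with a 0.01 threshold
```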
The question is how to compute the distribution of the chosen criterion in each population. Taking only the individuals sampled in each population cannot provide the appropriate distribution because the number of genotype combinations becomes large very quickly even with a moderate number of loci and alleles (e.g., five loci with three alleles each can make 7776 possible diploid genotypes). One way to generate this distribution without examining all genotype combinations (weighted by their probability of occurrence) is to simulate multilocus genotypes by randomly taking alleles according to their frequencies in the population. However, only the frequencies in population samples are known. The simplest option is to use the population sample frequencies, and this is what was done here. One can also follow [...] and replace the sample frequencies by $(n_{ijk} + 1/K_j)/(n_{ij} + 1)$ when drawing the first of the two alleles (at locus j in population i) and by $(n_{ijk} + 1/K_j + 1)/(n_{ij} + 2)$ or $(n_{ijk'} + 1/K_j)/(n_{ij} + 2)$ when drawing the second allele, according to whether the first allele drawn was allele k or another allele, respectively (formulas 23, 24, and 25 in [...]
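The genotype count quoted above and the simulation by random allele drawing can be sketched as follows. This is an illustration under simple assumptions: alleles are drawn independently from the sample frequencies and loci are treated as independent. The helper names are hypothetical.

```python
import random


def n_multilocus_genotypes(alleles_per_locus):
    """Number of distinct unordered diploid multilocus genotypes."""
    total = 1
    for n_alleles in alleles_per_locus:
        # n homozygotes plus n(n-1)/2 heterozygotes = n(n+1)/2 per locus
        total *= n_alleles * (n_alleles + 1) // 2
    return total


def simulate_genotype(pop_counts, rng):
    """Draw one multilocus genotype from per-locus sample allele counts."""
    genotype = []
    for counts in pop_counts:
        alleles = list(counts)
        weights = [counts[allele] for allele in alleles]
        # two independent draws, weighted by the sample allele counts
        genotype.append(tuple(rng.choices(alleles, weights=weights, k=2)))
    return genotype


print(n_multilocus_genotypes([3] * 5))  # 7776

rng = random.Random(1)  # seeded only to make the sketch reproducible
sample_counts = [{"A": 5, "B": 3, "C": 2}] * 5
simulated = [simulate_genotype(sample_counts, rng) for _ in range(1000)]
```

The criterion (likelihood or distance) computed on each simulated genotype then gives the reference distribution against which a tested individual is located.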
Simulation procedures:
To compare the performances of the different methods, we generated samples from 10 populations by simulating the coalescent process of genes and then making diploid genotypes by pairing gene copies at random within a population. This allows the generation of population samples while controlling various factors such as mutation rates, effective population sizes, time of divergence, sample sizes, and mutation model of markers. To simulate the coalescent process of genes in more than one population, we followed the method of [...]
Data files were simulated with 10 populations diverging simultaneously from a common ancestral population. Each of the 11 populations (10 observed + 1 ancestral) was modeled to have an effective population size of 1000 diploid individuals. The mutation rate was considered equal to 0.0005 (an average value for microsatellites, reviewed in [...]
In some other analyses, the performance of assignment/exclusion methods was assessed as a function of the Fst parameter, a widely used measure of interpopulation genetic differentiation [...] estimator of [...]
| RESULTS |
|---|
Comparison of assignment methods:
In the first analysis, we considered all possible combinations of mutation models, sample sizes, numbers of loci, and times of divergence (i.e., 2 x 3 x 3 x 3 = 54 combinations). The corresponding data files were analyzed with four assignment methods when loci evolved under the IAM and five methods when they evolved under the SMM. The five methods included the frequency method, the Bayesian method, and three different distance methods based on the shared allele distance, the Cavalli-Sforza and Edwards chord distance, and the (δµ)² of Goldstein et al. (1995) [...]
The performance of a given method was measured as the average proportion of individuals correctly assigned to their population across 50 data files, each including from 100 to 900 individuals. All individuals were tested using the "leave one out" procedure [...] per population in our simulations.
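For an individual that is itself part of a reference sample, leaving it out amounts to removing its own two gene copies from the allele counts of its population before the criterion is computed. A minimal single-locus sketch, with hypothetical names:

```python
def counts_without(counts, genotype):
    """Allele counts at one locus with the tested individual's two copies removed."""
    reduced = dict(counts)  # copy, so the reference sample itself is untouched
    for allele in genotype:
        if reduced.get(allele, 0) < 1:
            raise ValueError(f"allele {allele!r} is not present in the sample counts")
        reduced[allele] -= 1
    return reduced


sample_counts = {"A": 3, "B": 1}
print(counts_without(sample_counts, ("A", "B")))  # {'A': 2, 'B': 0}
```

With small samples this removal can change the estimated frequencies noticeably, since one individual contributes two of only a few gene copies.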
The results are summarized in Figure 2. For 10 populations, we expect ~10% of individuals to be correctly assigned to their population of origin because this is the base line that corresponds to a random assignment. Note that for populations that diverged recently (20 generations ago in our conditions, resulting in Fst ≈ 0.01), curves are close to the base line. When populations have sufficiently diverged, scores can reach 100% even with as few as 10 loci and 10 individuals sampled per population. Loci evolving under the IAM yield markedly better assignment scores than those evolving under the SMM. Increasing the number of loci improves the performance of any method whenever the score is not already at its maximum. The same is true for the sample size.
For loci evolving under the IAM, the Bayesian method provides the best scores and is followed, in order, by the frequency method, the chord distance method, and the DAS distance method. The largest difference between the scores (Bayesian method and DAS method) amounts to 30.7% (when considering 90 individuals/population, 5 loci, and 200 generations of divergence). For loci evolving under the SMM, [...]
In a second analysis, we analyzed the relationship between the proportion of individuals correctly assigned and the genetic differentiation among populations measured by the Fst coefficient. We simulated 50 data files, each containing 10 populations represented by a sample of 30 diploid individuals scored at 10 loci (per population). The time of divergence varied among data files according to a geometric progression allowing a range of Fst values between 0 and 0.35. Figure 3 summarizes the relationships between both quantities (percentage of correctly assigned individuals and Fst) using the different assignment methods and with loci evolving under the IAM or SMM, respectively. To keep the figure readable with all nine possible combinations, point values were replaced by logistic regressions that closely fit the data [R > 0.995 in any combination except for Goldstein et al.'s (δµ)² distance method where R = 0.93].
The relative performance of the methods is the same in Figure 3 as in Figure 2: for any value of Fst, the best score is obtained with the Bayesian method, followed by the frequency method and the distance methods always in the same order (chord then DAS). For SMM loci, Goldstein et al.'s (δµ)² distance method is far below the other methods. Surprisingly, for any given Fst value and whatever the assignment method, scores are always better when loci evolve under the IAM compared to the SMM. This suggests that the differentiation measured by Fst is not sufficient to predict the score of an assignment method, which is also sensitive to the way in which loci evolve. Figure 3 also shows that a perfect assignment can be obtained with Fst values as moderate as 0.1 (in the case where all populations have diverged at the same time and are represented by at least 30 individuals scored at 10 loci evolving under the IAM). There are some ranges of Fst values for which the choice of the assignment method can be critical. For instance, in the conditions of Figure 3, ~85% of individuals will be correctly assigned on average with the Bayesian method whereas <50% will be so with the DAS distance method, when the Fst is ~0.05 and all 10 scored loci evolve under the IAM.
Comparison of exclusion methods:
For assignment methods, a simple parameter such as the proportion of correctly assigned individuals provides a good idea of their performance. For exclusion methods, we need more parameters. First, note that the output of an exclusion method can have different forms whereas the output of an assignment method is a single population (which can simply be the right or a wrong origin for the individual). Also note that exclusion methods give (i) the probability that the individual belongs to a certain population (or to any reference population in a data base) and (ii) the list of the populations for which the probability of belonging is at least equal to a given threshold. In the list, one possible outcome is that all populations are excluded, i.e., the probability of belonging is below the threshold for all populations. A second possible outcome is that only the correct population is listed as the potential origin. A third possible case is that just one population, but not the right one, is listed. A fourth case is that more than one population is listed, including the correct one. The last (fifth) case is that more than one population is listed, not including the correct one. The five possible outcomes can be classified into two overlapping groups, named here A and E. Type A errors are cases where the correct population is absent from the list (first, third, and fifth cases). Type E errors are cases where one or more erroneous population(s) appear(s) in the list (third, fourth, and fifth cases). Depending on the situation, one may want to preferentially minimize either type A or type E errors.
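The mapping from the five outcomes to the two error types can be written compactly. This is a minimal sketch with hypothetical function names; the list of retained populations is built exactly as described above, from the probabilities of belonging and a threshold.

```python
def listed_populations(probabilities, threshold=0.01):
    """Populations whose probability of belonging is at least the threshold."""
    return {pop for pop, p in probabilities.items() if p >= threshold}


def classify_outcome(listed, true_population):
    """Return (type_a, type_e) for one tested individual.

    type_a: the correct population is absent from the list
    type_e: at least one erroneous population appears in the list
    """
    type_a = true_population not in listed
    type_e = any(pop != true_population for pop in listed)
    return type_a, type_e


probs = {"pop1": 0.40, "pop2": 0.02, "pop3": 0.0005}
listed = listed_populations(probs, threshold=0.01)
print(sorted(listed))                    # ['pop1', 'pop2']
print(classify_outcome(listed, "pop1"))  # (False, True): the fourth case
```

An empty list (first case) gives a type A error only, while a list without the correct population (third and fifth cases) gives both error types at once.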
Figure 4, A and B, provides the frequency of both types of errors over a subset of combinations of the parameters used in Figure 2. In Figure 4, we dropped the shortest time of divergence (20 generations) for which assignment scores are very low, keeping only 200 and 2000 generations as divergence times. The threshold for excluding a population was arbitrarily fixed to 0.01 (Figure 4A) and 0.001 (Figure 4B). Each individual was tested using the leave one out procedure as above. The distribution of the criterion (genotype likelihood or genetic distance) was established by simulating 1000 individuals. Because this computation is very time-consuming, errors were estimated only over 10 simulated data files for each data point on the figures. This may appear to be a low number (five times less than in the assignment methods), but this still represents a total of 1000, 3000, and 9000 individuals because each file contains 10 populations and each one is represented by 10, 30, or 90 individuals, respectively.
Considering first the type A errors, results are almost identical for both times of divergence. Thus, the type A error rate does not seem to be influenced by the amount of divergence among populations. While this error rate is always very low for SMM loci, it can reach very high levels for IAM loci. With the IAM, it is very sensitive to the sample sizes but less sensitive to the number of scored loci. At first sight, it may be surprising that, when sample sizes are small, the error increases with the number of loci (Figure 4, A and B; 10 individuals/population). One possible explanation is that, discarding the tested individual (leave one out procedure), frequency estimates are more biased with small samples and combining the information from more loci increases the overall bias in the exclusion criterion and hence raises the type A error. This seems consistent with the observation that increasing the sample size is very efficient in reducing the type A error. Moreover, the likelihood-based methods, which have the best assignment scores, are more efficient in excluding populations but will also more often exclude the correct one because of the aforementioned bias, and hence have the largest type A error rates. Note that large type A error rates correspond most generally to cases where the list of possible populations is empty. The comparison of Figure 4, A and B, shows that lowering the threshold of exclusion also decreases this kind of error, as expected. In summary, to lower type A errors, one can (i) lower the threshold, (ii) employ SMM loci rather than IAM loci, (iii) use a distance method, and (iv) increase the sample sizes. The latter condition, alone, is sufficient to get negligible type A errors.
Type E errors are much influenced by the time of divergence and the mutation model of loci. After 2000 generations of divergence for IAM loci, this type of error becomes negligible even with small sample sizes and a small number of loci. At 200 generations for SMM loci, errors are maximal (>75%). In the other two combinations, (2000 generations and SMM loci) and (200 generations and IAM loci), errors vary more or less widely with the method, but lowest errors are logically obtained with the best method (Bayesian). The difference of errors between the DAS distance method and the Bayesian method can be as high as 85%. The errors logically decrease when the number of individuals and/or the number of loci increase(s) and when the threshold increases.
As expected, type A and type E errors do not respond in the same direction when parameter (e.g., time of divergence, sample size, number of loci, mutation model) values change. However, at least when populations have diverged for enough time, it is possible to jointly minimize both types of errors by sampling at least 50 individuals per population, scoring at least 10 loci, choosing IAM-like loci if possible, and using the Bayesian method. With SMM-like loci, such as microsatellites, a similar result requires more individuals and/or more loci (e.g., 70-90 individuals per population and 15-20 loci).
To get a better idea of the influence of the time of divergence of populations on type E errors, an analysis was performed with varying values of Fst (Figure 5). As in Figure 3, simulations were performed with 10 populations, 10 loci, and 30 diploid individuals per population. Relationships between type E errors and Fst were approximated through logistic regressions (R > 0.995 for all methods except the one based on Goldstein et al.'s distance for which R = 0.93). There appear to be large differences among methods and between the two mutation models, with the same relative performance as in the assignment methods.
Maximizing assignment scores:
When populations are not highly differentiated, e.g., when Fst < ~0.1, the performances of assignment methods always improve with larger population samples and larger numbers of loci. But an important practical question is whether it is more efficient to increase the former or the latter. If the total number of analyses (e.g., PCR analyses) is limited due to economic constraints, what would be the most efficient combination of sample size and number of loci? Figure 6 provides a tentative answer for an assignment method scenario in which, e.g., 240 analyses (individual x locus) can be conducted per population, using the Bayesian method. The figure shows that the most efficient combination varies with the degree of population divergence. With a very low Fst (0.01, curve 1 in Figure 6), the best combination is 8 loci and 30 individuals scored per population. With increasing Fst's, one should reduce the sample size and increase the number of loci. For instance, when Fst is near 0.025 (curve 2), the optimal number of loci is in the range of 15-20 (12-16 individuals sampled per population) and when Fst ≥ 0.05 (curves 3, 4, and 5), it is in the range of 20-30 loci (with as few as 8-12 individuals sampled per population). However, whenever Fst is large (e.g., curve 5, Fst = 0.225), many combinations from (10 loci x 24 individuals per population) to (48 loci x only 5 individuals per population) equally provide a 100% correct assignment.
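The candidate designs for a fixed budget are easy to enumerate. The sketch below lists the combinations of loci and individuals whose product is exactly the budget per population; it only enumerates designs and says nothing about their assignment scores, which in the article come from simulation. The function name and the `min_individuals` parameter are hypothetical.

```python
def budget_combinations(total_analyses, min_individuals=1):
    """(loci, individuals) pairs with loci x individuals == total_analyses."""
    combos = []
    for n_loci in range(1, total_analyses + 1):
        if total_analyses % n_loci == 0:
            n_individuals = total_analyses // n_loci
            if n_individuals >= min_individuals:
                combos.append((n_loci, n_individuals))
    return combos


for n_loci, n_individuals in budget_combinations(240, min_individuals=5):
    print(n_loci, "loci x", n_individuals, "individuals")
```

The designs quoted in the text (8 x 30, 10 x 24, 15 x 16, 20 x 12, 30 x 8, and 48 x 5) all appear in this list.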
| DISCUSSION |
|---|
A first conclusion is that differences, sometimes quite large, exist among the "genetic" assignment/exclusion methods. The distance-based methods performed less well than the likelihood-based methods, among which the most efficient was the Bayesian method in all cases studied. Among the genetic distances, we chose to study only three. Additional classical distances proposed by Nei or Nei et al. (standard, minimum, and DA; [...]
In all our simulations, we considered a single value for the mutation rate and the effective population size, resulting in a rather constant gene diversity close to 0.67 [= M/(1 + M) with M = 4Neµ = 4 x 1000 x 0.0005 = 2] for IAM markers and close to 0.55 [= 1 - (1 + 2M)^(-0.5)] for SMM markers. This somewhat arbitrary choice is justified by the usefulness of microsatellites for conducting assignment methods [...]
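The two gene diversity values can be checked numerically. The sketch assumes the standard equilibrium expectations, M/(1 + M) under the IAM and 1 - (1 + 2M)^(-0.5) under the SMM, with M = 4Neµ; the function names are hypothetical.

```python
def gene_diversity_iam(m):
    """Expected gene diversity under the infinite allele model."""
    return m / (1.0 + m)


def gene_diversity_smm(m):
    """Expected gene diversity under the stepwise mutation model."""
    return 1.0 - (1.0 + 2.0 * m) ** -0.5


effective_size = 1000   # diploid individuals, as in the simulations
mutation_rate = 0.0005
m = 4 * effective_size * mutation_rate  # M = 4 * Ne * mu = 2

print(round(gene_diversity_iam(m), 2))  # 0.67
print(round(gene_diversity_smm(m), 2))  # 0.55
```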
When populations with identical Ne's have diverged for a given number of generations, the level of differentiation is higher for loci evolving under the IAM than for those evolving under the SMM (for a given mutation rate) because homoplasy is absent under the IAM and present under the SMM. Because homoplasy reduces differences among taxa, it is logical that, other things being equal, IAM markers provide better assignment scores than SMM markers. However, the large difference in the performance of the assignment/exclusion methods between the two types of loci for the same Fst value was rather unexpected. A possible explanation is that assignment methods are sensitive to the distributions of allele frequencies, which are different according to the mutation model of the locus. For instance, with equal mutation rates and effective population sizes, IAM loci have higher heterozygosities than SMM loci (e.g., 0.66 vs. 0.55 on average in our conditions; cf. previous paragraph). If, as already observed, assignment scores increase with the variability of the markers (measured here by the heterozygosity) then more variable IAM loci will provide better assignment scores. Note that microsatellite markers are considered to follow a rather SMM-like model of evolution with possible size constraints that can increase homoplasy (reviewed in [...]
However, even with the imperfections mentioned above, the knowledge of the Fst value for a set of populations should provide a useful prediction of the performance of assignment methods. The range of Fst's for which the methods perform well with reasonable sample sizes and numbers of loci (Fst ≥ 0.05; Figure 5) is within the range found among many natural populations (e.g., mountain sheep, [...] 0.6) on 30-50 individuals from each of 10 populations when the Fst is ~0.1. Good assignment scores can also be obtained for lower values of Fst, but will require larger samples of individuals and loci. However, achieving 100% accuracy (i.e., zero error rates) when using the exclusion methods will require >20 loci and 50 individuals, especially when the threshold for excluding a population is very low (e.g., P < 0.001) and when Fst [...] 0.1 among 10 populations (Figure 4). Such low thresholds will provide a very high certainty of correct exclusion and will be necessary for some forensics applications (e.g., convicting poachers). Fewer loci will be sufficient for applications requiring less stringent exclusion thresholds. More research is needed to quantify the power of the exclusion and assignment methods when the number of populations is different from 10.
Assignment/exclusion methods can be conducted in two different ways according to whether the individuals to be assigned are those of the reference set or not. The two ways generally differ in objectives, but there are no conceptual differences in the assignment procedure if choosing the leave one out option. In the Introduction, we gave some examples of studies performed with assignment methods. But a larger scope of applications is likely to develop. In population genetics, many statistics (e.g., gene diversity, fixation indices, genetic distances) are computed from allele frequencies that are estimated from population samples. A basic and seldom-tested assumption is that population samples do not include "abnormal" individuals. Assignment methods applied to individuals from a reference population using the leave one out procedure can help detect such abnormalities in samples. These abnormal data can result from errors in individual records. They can also correspond to immigrants or their descendants, in which case the question is whether immigration is artificial or natural. When immigration is natural and sufficiently low, assignment tests can be used to estimate dispersal rates in natural populations via direct methods (i.e., computing the proportion of individuals that are identified as immigrants) while simultaneously estimating dispersal rates indirectly (e.g., via estimating Nm from Fst in the island model of migration). The importance of simultaneously using both the direct and the indirect methods has been thoroughly discussed by several authors [...]
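As an illustration of the two kinds of estimates mentioned above, the sketch below computes a direct immigrant proportion and an indirect Nm. It assumes the classical island-model approximation Fst ≈ 1/(4Nm + 1), which is not derived in this article, and the function names are hypothetical.

```python
def direct_immigrant_rate(n_identified_immigrants, n_sampled):
    """Direct estimate: proportion of sampled individuals identified as immigrants."""
    return n_identified_immigrants / n_sampled


def nm_from_fst(fst):
    """Indirect estimate of Nm, assuming Fst = 1 / (4 * Nm + 1) (island model)."""
    if not 0.0 < fst < 1.0:
        raise ValueError("fst must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


print(nm_from_fst(0.1))              # 2.25 migrants per generation
print(direct_immigrant_rate(3, 60))  # 0.05
```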
Another potential use of assignment methods in population genetic studies is in measuring population differentiation. The estimation of the level of differentiation is classically performed with the Fst parameter. Figure 3 clearly illustrates the fact that when differentiation is low (e.g., 0 < Fst < 0.05), the proportion of correctly assigned individuals varies between 1/p (with p populations) and 1.0. Could the assignment score be used as a differentiation index, especially useful in the case of low differentiation? There seem to be several obstacles to the direct use of this parameter. First, we showed a large influence of the sample sizes, the number of loci, and the mutation model of markers. Second, we suspect an effect of the level of variability of markers. Third, the range of levels of differentiation for which the proportion of correctly assigned individuals would be useful seems quite narrow.
In addition to pure scientific objectives, assignment/exclusion methods may have various practical applications. Identifying immigrants or their descendants would be useful in conservation biology for detecting both (unwanted) introgression of foreign genes and (desirable) reproduction by transplanted or natural immigrants that can help maintain a population's genetic variation and evolutionary potential. Assignment methods can also be useful in crop pest management to identify the origin of a newly introduced pest or in forensics science to identify the origin of illegally killed animals or illegally obtained plant and animal parts, and thereby help prosecute poachers and minimize poaching.
| ACKNOWLEDGMENTS |
|---|
This research was supported by a grant from the French Bureau des Ressources Génétiques. We thank D. Paetkau and two anonymous referees for critically reading the manuscript and providing helpful suggestions. A computer program (GeneClass), written in Delphi v4 professional (for Windows 95), performs all the computations required to apply the assignment/exclusion methods described in this article. The executable version is available at http://www.ensam.inra.fr/URLB.
Manuscript received February 8, 1999; Accepted for publication August 16, 1999.
| LITERATURE CITED |
|---|
AYALA, F. J., 1975 Genetic differentiation during the speciation process. Evol. Biol. 8:1-78.
BOWCOCK, A. M., A. RUIZ-LINARES, J. TOMFOHRDE, E. MINCH, and J. R. KIDD et al., 1994 High resolution of human evolutionary trees with polymorphic microsatellites. Nature 368:455-457.
CAVALLI-SFORZA, L. L. and A. W. F. EDWARDS, 1967 Phylogenetic analysis: models and estimation procedures. Am. J. Hum. Genet. 19:233-257.
CHAKRABORTY, R., and L. JIN, 1993 A unified approach to study hypervariable polymorphisms: statistical considerations of determining relatedness and population distances, pp. 153-175 in DNA Fingerprinting: State of the Science, edited by S. D. J. PENA, R. CHAKRABORTY, J. T. EPPLEN and A. J. JEFFREYS. Birkhauser Verlag, Basel, Switzerland.
CORNUET, J. M., S. AULAGNIER, S. LEK, P. FRANCK, and M. SOLIGNAC, 1996 Classifying individuals among infra-specific taxa using microsatellite data and neural networks. C. R. Acad. Sci. Paris, Life Sci. 319:1167-1177.
EFRON, B., 1983 Estimating the error rate of a prediction rule: improvement on cross-validation. J. Am. Stat. Assoc. 78:316-330.
ESTOUP, A., and B. ANGERS, 1998 Microsatellites and minisatellites for molecular ecology: theoretical and empirical considerations, pp. 5586 in Advances in Molecular Ecology, edited by G. CARVALHO. IOS Press, Amsterdam.
ESTOUP, A., and J.-M. CORNUET, 1999 Microsatellite evolution: inferences from population data, pp. 5065 in Microsatellites: Evolution and Applications, edited by D. B. GOLDSTEIN and C. SCHLÖTTERER. Oxford University Press, Oxford.
ESTOUP, A., L. GARNERY, M. SOLIGNAC, and J. M. CORNUET, 1995 Microsatellite variation in honey bee (Apis mellifera L.) populations: hierarchical genetic structure and test of the infinite allele and stepwise mutation models. Genetics 140:679-695.
ESTOUP, A., F. ROUSSET, Y. MICHALAKIS, J. M. CORNUET, and M. ADRIAMANGA et al., 1998 Microgeographic differentiation in brown trout (Salmo trutta): a comparison of microsatellite and allozyme loci. Mol. Ecol. 7:339-353.
FAVRE, L., F. BALLOUX, J. GOUDET, and N. PERRIN, 1997 Female-biased dispersal in the monogamous mammal Crocidura russula: evidence from field data and microsatellite patterns. Proc. R. Soc. Lond. Ser. B 264:127-132.
FORBES, S. H. and J. T. HOGG, 1999 Assessing population structure at high levels of differentiation: microsatellite comparisons of bighorn sheep and large carnivores. Anim. Conserv. 2:223-233.
GOLDSTEIN, D. B., A. RUIZ LINARES, L. L. CAVALLI-SFORZA, and M. W. FELDMAN, 1995 Genetic absolute dating based on microsatellites and origin of modern humans. Proc. Natl. Acad. Sci. USA 92:6723-6727.
JARNE, P. and P. LAGODA, 1996 Microsatellites, from molecules to populations and back. Trends Ecol. Evol. 11:424-429.
KIMURA, M. and J. F. CROW, 1964 The number of alleles that can be maintained in a finite population. Genetics 49:725-738.
LUGON-MOULIN, N., H. BRÜNNER, A. WYTTENBACH, J. HAUSSER, and J. GOUDET, 1999 Hierarchical analyses in a hybrid zone of Sorex araneus (Insectivora, Soricidae). Mol. Ecol. 8:419-431.
NEI, M., 1975 Molecular Population Genetics and Evolution. North-Holland, Amsterdam.
NEI, M., 1987 Molecular Evolutionary Genetics. Columbia University Press, New York.
NEI, M. and D. GRAUR, 1984 Extent of protein polymorphism and the neutral mutation theory. Evol. Biol. 17:73-118.
NEIGEL, J. E., 1997 A comparison of alternative strategies for estimating gene flow from genetic markers. Annu. Rev. Ecol. Syst. 28:105-128.
NEVO, E., 1978 Genetic variation in natural populations: patterns and theory. Theor. Popul. Biol. 13:121-177.
OHTA, T. and M. KIMURA, 1973 A model of mutation appropriate to estimate the number of electrophoretically detectable alleles in a finite population. Genet. Res. 22:201-204.
PAETKAU, D., W. CALVERT, I. STIRLING, and C. STROBECK, 1995 Microsatellite analysis of population structure in Canadian polar bears. Mol. Ecol. 4:347-354.
PAETKAU, D., L. P. WAITS, P. L. CLARKSON, L. CRAIGHEAD, and C. STROBECK, 1997 An empirical evaluation of genetic distance statistics using microsatellite data from bear (Ursidae) populations. Genetics 147:1943-1957.
RANNALA, B. and J. L. MOUNTAIN, 1997 Detecting immigration by using multilocus genotypes. Proc. Natl. Acad. Sci. USA 94:9197-9201.
ROY, M. S., E. GEFFEN, D. SMITH, E. A. OSTRANDER, and R. K. WAYNE, 1994 Patterns of differentiation and hybridization in North American wolflike canids, revealed by analysis of microsatellite loci. Mol. Biol. Evol. 11:553-570.
SHRIVER, M. D., L. JIN, R. E. FERREL, and R. DEKA, 1997 Microsatellite data support an early population expansion in Africa. Genome Res. 7:586-591.
SIMONSEN, K. L., G. A. CHURCHILL, and C. F. AQUADRO, 1995 Properties of statistical tests of neutrality for DNA polymorphism data. Genetics 141:413-429.
SLATKIN, M. E., 1987 Gene flow and the geographic structure of natural populations. Science 236:787-792.
STATSOFT FRANCE, 1997 STATISTICA pour Windows, version 5.0. Charenton-le-Pont, France.
TAYLOR, E. B., T. D. BEACHAM, and M. KAERIYAMA, 1994 Population structure and identification of North Pacific Ocean chum salmon (Oncorhynchus keta) revealed by an analysis of minisatellite DNA variation. Can. J. Fish. Aquat. Sci. 51:1430-1442.
VRANA, R. K. and W. WHEELER, 1992 Individual organisms as terminal entities: laying the species problem to rest. Cladistics 8:67-72.
WASER, P. and C. STROBECK, 1998 Genetic signatures of interpopulation dispersal. Trends Ecol. Evol. 13:43-44.
WEIR, B. S. and C. C. COCKERHAM, 1984 Estimating F-statistics for the analysis of population structure. Evolution 38:1358-1370.
WRIGHT, S., 1951 The genetical structure of populations. Ann. Eugen. 15:323-354.














