Larger Genetic Differences Within Africans Than Between Africans and Eurasians
Ning Yu (a), Feng-Chi Chen (a,b), Satoshi Ota (a), Lynn B. Jorde (c), Pekka Pamilo (d), Laszlo Patthy (e), Michele Ramsay (f), Trefor Jenkins (f), Song-Kun Shyue (g), and Wen-Hsiung Li (a)
a Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637,
b Department of Life Science, National Tsing Hua University, Hsinchu, 300 Taiwan,
c Department of Human Genetics, University of Utah, Salt Lake City, Utah 84112,
d Department of Biology, University of Oulu, 90014 Oulu, Finland,
e Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, H-1518 Budapest, Hungary,
f Department of Human Genetics, South African Institute for Medical Research and University of the Witwatersrand, Johannesburg, 2050 South Africa
g Institute of Biomedical Sciences, Academia Sinica, Taipei, 115 Taiwan
Corresponding author: Wen-Hsiung Li, University of Chicago, 1101 E. 57th St., Chicago, IL 60637. E-mail: whli@uchicago.edu
Communicating editor: Y.-X. FU
| ABSTRACT |
|---|
The worldwide pattern of single nucleotide polymorphism (SNP) variation is of great interest to human geneticists, population geneticists, and evolutionists, but remains incompletely understood. We studied the pattern in noncoding regions, because they are less affected by natural selection than are coding regions. Thus, it can reflect better the history of human evolution and can serve as a baseline for understanding the maintenance of SNPs in human populations. We sequenced 50 noncoding DNA segments each ~500 bp long in 10 Africans, 10 Europeans, and 10 Asians. An analysis of the data suggests that the sampling scheme is adequate for our purpose. The average nucleotide diversity (π) for the 50 segments is only 0.061% ± 0.010% among Asians and 0.064% ± 0.011% among Europeans but almost twice as high (0.115% ± 0.016%) among Africans. The African diversity estimate is even higher than that between Africans and Eurasians (0.096% ± 0.012%). From available data for noncoding autosomal regions (total length = 47,038 bp) and X-linked regions (47,421 bp), we estimated the π-values for autosomal regions to be 0.105, 0.070, 0.069, and 0.097% for Africans, Asians, Europeans, and between Africans and Eurasians, and the corresponding values for X-linked regions to be 0.088, 0.042, 0.053, and 0.082%. Thus, Africans differ from one another slightly more than from Eurasians, and the genetic diversity in Eurasians is largely a subset of that in Africans, supporting the out of Africa model of human evolution. Clearly, one must specify the geographic origins of the individuals sampled when studying π or SNP density.
There has been much interest in single nucleotide polymorphisms (SNPs) in human populations because such data are useful for studying human evolution and the mechanism of maintenance of genetic variability in human populations and for identifying genes associated with complex disease.
| MATERIALS AND METHODS |
|---|
DNA samples:
The 10 Africans used were 1 Biaka Pygmy, 1 Mbuti Pygmy, 1 Ghanaian, 1 Kikuyu, 1 !Kung, 1 Luo, 2 Nigerians (Yoruba and Rivers), 1 South African Bantu speaker, and 1 Zulu (also a South African Bantu speaker); the 10 Europeans were 1 Finnish, 1 French, 1 German, 1 Hungarian, 1 Italian, 1 Portuguese, 1 Russian, 1 Spanish, 1 Swedish, and 1 Ukrainian; and the 10 Asians were 1 Cambodian, 2 Chinese (North and South), 1 Han Taiwanese, 2 Indians (Punjab and Bengal), 1 Japanese, 1 Mongolian, 1 Vietnamese, and 1 Yakut. As every segment studied is autosomal, the number of sequences studied for each segment is 60 (20 for each continent studied).
Selection of DNA segments:
Fifty noncoding, nonrepetitive genomic segments (each ~1 kb), which covered almost all autosomes, were selected randomly with reference to the Genome Channel (http://genome.ornl.gov/GCat/species.shtml).
PCR amplification and DNA sequencing:
Touchdown PCR (DON et al. 1991) was used for PCR amplification.
ABI DNA Sequence Analysis 3.0 was used for lane tracking and base calling. The data were then proofread manually, and heterozygous sites were detected as double peaks. The forward and reverse sequences were assembled automatically for each individual using SeqMan in DNASTAR. The assembled files were carefully checked by eye. Fluorescent traces for each variant site were rechecked in all individuals. All singletons (variants that appear only once in the total sample) were verified by PCR reamplification and resequencing of the PCR products in both directions.
Data analysis:
The sequences were aligned by SeqMan in DNASTAR or by the DAMBE package (XIA 2000).
| RESULTS AND DISCUSSION |
|---|
Distribution of SNPs:
A total of 146 SNPs were found in the total sample; 53 of them were observed only once (i.e., singletons) and 22 only twice (doubletons). The number of variant sites found in the African sample was 118, of which 68 (36 singletons, 15 doubletons, and 17 others) were not found in the Eurasian sequences (i.e., they were unique). In contrast, in the Eurasian sample only 78 variant sites were found and only 28 of them (17 singletons, 4 doubletons, and 7 others) were unique, though the combined sample size was twice the African sample size. Thus, beyond the 50 variants already observed in the African sample, the combined Eurasian sample contains in addition only 17 singletons and 11 nonsingleton variants. The high frequencies of singletons in the African and Eurasian samples are similar to those observed in other studies. Under the standard neutral model with population mutation parameter θ, the expected number of mutations of size i in a random sample of n sequences is θ/i (FU 1995).
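As a rough illustration of the θ/i expectation, the sketch below computes the share of variant sites expected to be singletons in a sample of 60 sequences and sets it beside the observed share (53 of 146). It assumes the standard neutral model with constant population size and treats the rarer variant at each site as the derived one. The function name `expected_sfs_shares` is hypothetical and is not taken from the study.

```python
from fractions import Fraction

def expected_sfs_shares(n):
    """Expected share of variant sites of size i (i = 1..n-1) when E[xi_i] = theta / i.

    theta cancels out, so only the weights 1/i are needed.
    """
    weights = [Fraction(1, i) for i in range(1, n)]
    total = sum(weights)  # harmonic number 1 + 1/2 + ... + 1/(n-1)
    return [w / total for w in weights]

n = 60  # 30 individuals x 2 autosomal sequences per segment
shares = expected_sfs_shares(n)
expected_singleton_share = float(shares[0])
observed_singleton_share = 53 / 146  # counts reported for the total sample

print(f"expected {expected_singleton_share:.3f}, observed {observed_singleton_share:.3f}")
```

Under these assumptions the expected singleton share is close to 0.21, whereas the observed share is about 0.36, which is one simple way to express the excess of singletons noted in the text.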
Nucleotide diversity:
Nucleotide diversity (π) is defined as the number of nucleotide differences per site between two randomly chosen sequences in a population. The π-value fluctuates greatly among the segments studied (Table 1). The range of π is from 0 (5 segments) to 0.27% in the total sample, from 0 (5 segments) to 0.58% in the African sample, from 0 (19 segments) to 0.27% in the Asian sample, and from 0 (18 segments) to 0.29% in the European sample. Such large fluctuations are not surprising because the nucleotide diversity in a short DNA region is subject to strong stochastic effects. In addition, variation in π may also arise from different mutation rates among different segments, although we found no correlation between π and the divergence between human and ape sequences. The average π-values are only 0.061% ± 0.010% among Asians and 0.064% ± 0.011% among Europeans, but almost twice as high among Africans (0.115% ± 0.016%); π is 0.088% ± 0.011% for the total sample. The average π-value within Africans is actually somewhat higher than that between Africans and Eurasians (0.096% ± 0.012%). In other words, Africans differ on average more among themselves than from Eurasians.
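The definition of π can be made concrete with a minimal sketch that averages the per-site differences over all pairs of aligned sequences. The four toy sequences are invented for illustration and the function name `nucleotide_diversity` is hypothetical; this is not the software used in the study.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Average number of differences per site over all pairs of aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)

# Toy alignment (invented): two identical sequences and two carrying one variant each.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGAAC", "ACCTACGTAC"]
pi = nucleotide_diversity(toy)
print(f"pi = {pi:.3f}")  # 6 differences over 6 pairs of 10 sites -> 0.100
```

Multiplying the result by 100 gives the percentage scale used for the π-values reported here.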
Adequacy of sampling scheme and sample size:
We now consider the adequacy of our sampling schedule. The fact that the individuals used were chosen to cover various geographic areas and ethnic backgrounds in each of the three continents studied may lead to an overestimate of π, whereas the inclusion of the two sequences within each individual may lead to an underestimate of π. The two tendencies should be reflected in between-individual π-values (π_b) and in within-individual π-values (π_w), respectively, and the average π_b and π_w can be taken as an upper and a lower bound of the true π-value. To simplify the analysis, we concatenate the segments in an individual in a random manner into two continuous sequences. For the African sequences the distribution of π_b-values, which ranges from 0.059 to 0.187%, is only somewhat wider than that of the 10 π_w-values, which ranges from 0.059 to 0.152%. Therefore, the average π_b (0.115%) is only slightly higher than the average π_w (0.108%), implying that the geographic locations of the individuals sampled have little effect on the average π-value. For the Asian sample there are two very low π_w-values (0.023% for the North Chinese and the Bengal individuals) and the average π_w-value (0.051%) is substantially lower than the average π_b (0.063%). The average π_w without the two outliers becomes 0.058%, which is similar to the between-individual average π_b (0.063%). So, our sampling schedule in Asia may cause at most only a minor overestimate of the true average π-value. For the European sample, the average π_w-value is only 0.049% and, after excluding the two lowest values (0.027% for the Ukrainian and 0.031% for the Russian), it becomes 0.054%, which is still not close to the between-individual average of 0.066%. This comparison suggests that our sampling schedule may have inflated somewhat the average π-value for the Europeans. However, as the final estimate of 0.063% should have been compensated to some extent by the π_w-values, which tend to be low, it should not differ much from the true value.
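The within- and between-individual comparison can be sketched as follows. The sketch assumes that π_w for an individual is the per-site difference between its two sequences and that π_b for a pair of individuals is the mean over the four cross-individual sequence comparisons; the text does not spell out the exact computation, so both definitions, the helper names, and the toy data are assumptions.

```python
from itertools import combinations
from statistics import mean

def per_site_diff(s, t):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(a != b for a, b in zip(s, t)) / len(s)

def within_individual(ind):
    """pi_w: difference between the two sequences carried by one individual."""
    return per_site_diff(ind[0], ind[1])

def between_individuals(ind1, ind2):
    """pi_b: mean difference over the four cross-individual sequence pairs."""
    return mean(per_site_diff(s, t) for s in ind1 for t in ind2)

# Toy diploid individuals (invented): each tuple holds two aligned sequences.
individuals = [
    ("ACGTACGTAC", "ACGTACGTAC"),
    ("ACGTACGAAC", "ACGTACGTAC"),
    ("ACCTACGTAC", "ACGTACGAAC"),
]
pi_w = mean(within_individual(ind) for ind in individuals)
pi_b = mean(between_individuals(x, y) for x, y in combinations(individuals, 2))
print(f"average pi_w = {pi_w:.4f}, average pi_b = {pi_b:.4f}")
```

With 10 individuals there are 10 within-individual comparisons and 180 between-individual sequence pairs among the 190 pairs of 20 sequences, so under these definitions the overall π is dominated by π_b.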
Next let us consider whether the sample size of 10 individuals from each continent studied is sufficiently large for our purpose. To answer this question, let us consider subsamples of the original samples; we consider concatenated sequences. Figure 1 shows the distribution of the average π-values when each subsample in a continent contains only 6 individuals from the original sample of 10; there are 210 such possible subsamples. It is seen that the three distributions are rather concentrated. For example, in each distribution none of the average π-values deviate significantly from the mean π-value of the total sample and the proportions of average π-values in subsamples that deviate more than 10% from the mean are 18.1, 7.6, and 20.4% for the African, Asian, and European samples, respectively. This analysis suggests that even a sample of 6 independent individuals would usually give an estimate of π in a continent reasonably close to the true value.
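The subsampling check can be sketched by enumerating every choice of 6 individuals out of 10 and recomputing the average π for each. The sequences below are simulated purely for illustration (random changes on a random ancestor, which is not a population-genetic model), so the printed proportion has no bearing on the reported 18.1, 7.6, and 20.4%. The names `mutate` and `average_pi` are hypothetical.

```python
import random
from itertools import combinations

def mutate(seq, rate, rng):
    """Copy seq, changing each site to a different base with probability `rate`."""
    return [rng.choice([b for b in "ACGT" if b != x]) if rng.random() < rate else x
            for x in seq]

def average_pi(seq_ids, dist):
    """Mean per-site difference over all pairs of the given sequence indices."""
    pairs = list(combinations(seq_ids, 2))
    return sum(dist[a][b] for a, b in pairs) / len(pairs)

rng = random.Random(1)  # fixed seed so the sketch is repeatable
length = 2000
ancestor = [rng.choice("ACGT") for _ in range(length)]
# Sequences 2k and 2k+1 stand in for the two concatenated sequences of individual k.
seqs = [mutate(ancestor, 0.001, rng) for _ in range(20)]

# Pairwise per-site differences, computed once and reused for every subsample.
dist = [[sum(a != b for a, b in zip(s, t)) / length for t in seqs] for s in seqs]

full_pi = average_pi(range(20), dist)
subsample_pis = [
    average_pi([s for ind in combo for s in (2 * ind, 2 * ind + 1)], dist)
    for combo in combinations(range(10), 6)
]
n_off = sum(abs(p - full_pi) / full_pi > 0.10 for p in subsample_pis)
share_off = n_off / len(subsample_pis)
print(len(subsample_pis), f"{share_off:.3f}")
```

The enumeration yields 210 subsamples, matching the count given in the text; with real data, `seqs` would hold the concatenated sequences of one continental sample.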
Worldwide pattern of SNP variation and implications for human evolution:
A more general pattern of worldwide nucleotide diversity is shown in Table 2, which includes the present data and data from the literature and our unpublished studies. For the autosomal regions included (47,038 bp) the π-values are 0.105% for Africans, 0.070% for Asians, 0.069% for Europeans, and 0.097% for between Africans and Eurasians. For X-linked regions (47,421 bp) the corresponding values are 0.088, 0.042, 0.053, and 0.082%. As in many previous studies, [...] π-values and are considerably closer to each other than to Africans.
The genome-wide nucleotide diversity has been estimated to be 0.075% for the National Institutes of Health (NIH) diversity panel (INTERNATIONAL SNP MAP WORKING GROUP 2001).
Interestingly, both autosomal and X-linked sequence data show higher DNA variation within Africans than between Africans and Eurasians (Table 2), contrary to the general observation of lower within-population than between-population differences in population genetics. This finding implies that Africans differ on average more among themselves than from Eurasians. Thus, with the exception of many minor unique variants, the nucleotide diversity in Eurasians is essentially a subset of that in Africans, as suggested by the observation that both Y-linked and autosomal haplotypes found outside of Africa were often a subset of the collection of haplotypes found in Africa.
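A toy example may help show how within-population diversity can exceed between-population diversity when one population carries a subset of the other's haplotypes. The sequences and the labels `pop_a` and `pop_b` are invented for illustration and carry no empirical weight; the between-population value is taken here as the mean per-site difference over all cross-population sequence pairs, which is an assumption about the computation.

```python
from itertools import combinations, product

def per_site_diff(s, t):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(a != b for a, b in zip(s, t)) / len(s)

def within(pop):
    """Mean per-site difference over all pairs of sequences within one population."""
    pairs = list(combinations(pop, 2))
    return sum(per_site_diff(s, t) for s, t in pairs) / len(pairs)

def between(pop1, pop2):
    """Mean per-site difference over all cross-population sequence pairs."""
    pairs = list(product(pop1, pop2))
    return sum(per_site_diff(s, t) for s, t in pairs) / len(pairs)

# pop_a holds four distinct haplotypes; pop_b holds only two of them (a subset).
pop_a = ["AAAAAAAAAA", "CCAAAAAAAA", "AACCAAAAAA", "CCCCAAAAAA"]
pop_b = ["AAAAAAAAAA", "AAAAAAAAAA", "CCAAAAAAAA", "AAAAAAAAAA"]

within_a, within_b = within(pop_a), within(pop_b)
between_ab = between(pop_a, pop_b)
print(f"within A {within_a:.3f}, between {between_ab:.3f}, within B {within_b:.3f}")
```

In this toy case the ordering is within A > between > within B, the same qualitative ordering as the African, African-Eurasian, and Eurasian values in the text.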
| ACKNOWLEDGMENTS |
|---|
This work was supported by National Institutes of Health grants GM55759, HD38287, GM30998, and GM59290.
Manuscript received August 30, 2001; Accepted for publication February 19, 2002.
| LITERATURE CITED |
|---|
ARMOUR, J. A. L., T. ANTTINEN, C. A. MAY, E. E. VEGA, and A. SAJANTILA et al., 1996 Minisatellite diversity supports a recent African origin for modern humans. Nat. Genet. 13:154-160.
CARGILL, M., D. ALTSHULER, J. IRELAND, P. SKLAR, and K. ARDLIE et al., 1999 Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nat. Genet. 22:231-238.
CHEN, F.-C. and W.-H. LI, 2001 Genomic divergences between human and other hominoids and the effective population size of the common ancestor of human and chimpanzee. Am. J. Hum. Genet. 68:444-456.
DON, R. H., P. T. COX, B. J. WAINWRIGHT, K. BAKER, and J. S. MATTICK, 1991 Touchdown PCR to circumvent spurious priming during gene amplification. Nucleic Acids Res. 19:4008.
FU, Y. X., 1995 Statistical properties of segregating sites. Theor. Popul. Biol. 48:172-197.
HALUSHKA, M. K., J. B. FAN, K. BENTLEY, L. HSIE, and N. SHEN et al., 1999 Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis. Nat. Genet. 22:239-247.
HAMMER, M. F., A. B. SPURDLE, T. KARAFET, M. R. BONNER, and E. T. WOOD et al., 1997 The geographic distribution of Y chromosome variation. Genetics 145:787-805.
HARDING, R. M., S. M. FULLERTON, R. C. GRIFFITHS, J. BOND, and M. J. COX et al., 1997 Archaic African and Asian lineages in the genetic ancestry of modern humans. Am. J. Hum. Genet. 60:772-789.
HARRIS, E. E. and J. HEY, 1999 X chromosome evidence for ancient human histories. Proc. Natl. Acad. Sci. USA 96:3320-3324.
INGMAN, M., H. KAESSMANN, S. PÄÄBO, and U. GYLLENSTEN, 2000 Mitochondrial genome variation and the origin of modern humans. Nature 408:708-713.
INTERNATIONAL SNP MAP WORKING GROUP, 2001 A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature 409:928-933.
JARUZELSKA, J., E. ZIETKIEWICZ, M. BATZER, D. E. C. COLE, and J-P. MOISAN et al., 1999 Spatial and temporal distribution of the neutral polymorphisms in the last ZFX intron: analysis of the haplotype structure and genealogy. Genetics 152:1091-1101.
JORDE, L. B., W. S. WATKINS, M. J. BAMSHAD, M. E. DIXON, and C. E. RICKER et al., 2000 The distribution of human genetic diversity: a comparison of mitochondrial, autosomal, and Y chromosome data. Am. J. Hum. Genet. 66:979-988.
KAESSMANN, H., F. HEISSIG, A. VON HAESELER, and S. PÄÄBO, 1999 DNA sequence variation in a non-coding region of low recombination on the human X chromosome. Nat. Genet. 22:78-81.
NACHMAN, M. W. and S. L. CROWELL, 2000 Contrasting evolutionary histories of two introns of Duchenne muscular dystrophy gene, Dmd, in humans. Genetics 155:1855-1864.
NICKERSON, D. A., S. L. TAYLOR, K. M. WEISS, A. G. CLARK, and R. G. HUTCHINSON et al., 1998 DNA sequence diversity in a 9.7-kb region of the human lipoprotein lipase gene. Nat. Genet. 19:233-240.
RIEDER, M. J., S. L. TAYLOR, A. G. CLARK, and D. A. NICKERSON, 1999 Sequence variation in the human angiotensin converting enzyme. Nat. Genet. 22:59-62.
ROZAS, J. and R. ROZAS, 1999 DnaSP version 3.0: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174-175.
STONEKING, M., J. J. FONTIUS, S. L. CLIFFORD, H. SOODYALL, and S. S. ARCOT et al., 1997 Alu insertion polymorphisms and human evolution: evidence for a larger population size in Africa. Genome Res. 7:1061-1071.
TISHKOFF, S. A., E. DIETZSCH, W. SPEED, A. J. PAKSTIS, and J. R. KIDD et al., 1996 Global patterns of linkage disequilibrium at the CD4 locus and modern human origins. Science 271:1380-1387.
TISHKOFF, S. A., A. J. PAKSTIS, M. STONEKING, J. R. KIDD, and G. DESTRO-BISOL et al., 2000 Short tandem-repeat Polymorphism/Alu haplotype variation at the PLAT locus: implications for modern human origins. Am. J. Hum. Genet. 67:901-925.
UNDERHILL, P. A., P. SHEN, A. A. LIN, L. JIN, and G. PASSARINO et al., 2000 Y chromosome sequence variation and the history of human populations. Nat. Genet. 26:358-361.
VENTER, J. C., M. D. ADAMS, E. W. MYERS, P. W. LI, and R. J. MURAL et al., 2001 The sequence of the human genome. Science 291:1304-1351.
VIGILANT, L., M. STONEKING, H. HARPENDING, K. HAWKES, and A. C. WILSON, 1991 African populations and the evolution of human mitochondrial DNA. Science 253:1503-1507.
XIA, X., 2000 Data Analysis in Molecular Biology and Evolution. Kluwer Academic Publishers, Boston.
YU, N., Z. ZHAO, Y.-X. FU, N. SAMBUUGHIN, and M. RAMSAY et al., 2001 Global patterns of human DNA sequence variation in a 10-kb region on chromosome 1. Mol. Biol. Evol. 18:214-222.
ZHAO, Z., L. JIN, Y.-X. FU, M. RAMSAY, and T. JENKINS et al., 2000 Worldwide DNA sequence variation in a 10-kilobase noncoding region on human chromosome 22. Proc. Natl. Acad. Sci. USA 97:11354-11358.
ZIETKIEWICZ, E., V. YOTOVA, M. JARNIK, M. KORAB-LASKOWSKA, and K. K. KIDD et al., 1998 Nuclear DNA diversity in worldwide distributed human populations. Gene 205:161-171.