Genetics, Vol. 174, 491-497, September 2006, Copyright © 2006
doi:10.1534/genetics.105.052225
* DNA Link, Seoul, 120-110, Korea,
Bio Lab, *** Future Technology Group, Samsung Advanced Institute of Technology, Yongin-si, Gyeonggi-do 449-901, Korea,
Wellcome Trust Center for Human Genetics, University of Oxford, Oxford OX3 7BN, United Kingdom,
National Genome Information Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon-si, 305-333, Korea, ** Department of Dermatology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, 135-230, Korea, 
Department of Biochemistry and Molecular Biology, University of Ulsan College of Medicine, Seoul, 138-736, Korea, 
National Genome Research Institute, Korean National Institute of Health, Seoul 122-701, Korea and 
Department of Biostatistics, University of Washington, Seattle, Washington 98195-7232
2 Corresponding author: Bio Lab, Future Technology Group, Samsung Advanced Institute of Technology, Mt. 14, Nongseo-ri, Giheung-eup, Yongin-si, Gyeonggi-do, 449-901, Korea.
E-mail: jungjoo.hwang{at}samsung.com
| ABSTRACT |
|---|
Linkage disequilibrium (LD) patterns are known to vary across the human genome and across populations (GABRIEL et al. 2002; PHILLIPS et al. 2003; KE et al. 2004a). Two important questions arise from these observations: how well LD patterns are conserved between a HapMap population and a population to which the HapMap is applied, and how robust the HapMap is for that population. These two questions are related and can often be assessed by choosing tagging SNPs (JOHNSON et al. 2001; GIBBS et al. 2003; STRAM 2004) in one population sample and applying them to another population (KE et al. 2004b; AHMADI et al. 2005; MUELLER et al. 2005). In a recent study involving four gene regions in several European populations (MUELLER et al. 2005), tagging SNPs selected from the HapMap CEU data performed very efficiently in local European samples in two of the four regions because of high conservation of LD. In the other two regions, however, CEU-derived tagging SNPs were of restricted applicability because LD varied significantly among populations (MUELLER et al. 2005).
For Asian populations, two HapMaps are available: CHB (Han Chinese in Beijing) and JPT (Japanese in Tokyo). We are interested in the cross-population robustness of the HapMap in general and, in particular, in the applicability of the JPT and CHB maps to the Korean population. We are also interested in the potential advantages of having two related Asian HapMaps. To this end, the Korean government, industries, and academic institutions launched the Korean HapMap project in 2003, which involved the fine mapping of 7 of the 10 HapMap Encyclopedia of DNA Elements (ENCODE) regions.
| MATERIALS AND METHODS |
|---|
Genotyping:
Multiplex SNP analyses were performed for ENr213, ENr232, and ENr321 with the GenomeLab SNPstream genotyping platform (Beckman Coulter) and its accompanying SNPstream software suite as described by DENOMME and VAN OENE (2005). For ENr131, ENr112, ENr113, and ENm013, Sequenom's MassARRAY system was used under standard conditions as described previously (BANSAL et al. 2002), except that the amount of genomic DNA was decreased (2.5 ng) for multiplexed assays. All genotype calls were confirmed by an operator.
The average genotyping error rate was estimated at 0.1% by routinely running duplicated sample wells in the plates. The genotyping results of this study are available for download from http://www.ngic.re.kr:8080/khapmap/.
LD analysis:
All analyses in this study were based on the HapMap ENCODE genotypes downloaded in February 2005, unless otherwise stated. Only markers that were polymorphic in the Chinese and Japanese (http://www.hapmap.org/) and Korean samples were used ("shared data sets" hereafter). This marker selection allowed direct comparison of LD structure between population samples and was used for all other analyses except tagging. For the sliding-window analysis, average pairwise r² was calculated for SNP pairs separated by 10-50 kb within 50-kb sliding windows (5-kb increment between windows). Haplotype blocks were defined using Haploview (BARRETT et al. 2005) with the block definition of GABRIEL et al. (2002). Haplotypes and their frequencies in block regions were estimated using snphap (http://www-gene.cimr.cam.uk/clayton/software/).
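The sliding-window calculation can be illustrated with a minimal sketch. It assumes phased haplotypes coded as 0/1 alleles, which is a simplification, since r² for unphased genotype data would first require haplotype frequency estimation. The function names `pairwise_r2` and `sliding_window_r2` are hypothetical and are not part of Haploview or any other tool used in this study.

```python
import math
from itertools import combinations


def pairwise_r2(hap_a, hap_b):
    """r^2 between two biallelic SNPs from phased 0/1 alleles (assumed input)."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else float("nan")


def sliding_window_r2(positions, haplotypes, window=50_000, step=5_000,
                      min_sep=10_000, max_sep=50_000):
    """Mean pairwise r^2 per window, using SNP pairs min_sep..max_sep bp apart.

    positions: sorted base-pair coordinates, one per SNP.
    haplotypes: one list of 0/1 alleles per SNP, same order as positions.
    Returns a list of (window_start, mean_r2), with None for empty windows.
    """
    results = []
    start = positions[0]
    while start <= positions[-1]:
        end = start + window
        idx = [i for i, pos in enumerate(positions) if start <= pos < end]
        values = []
        for i, j in combinations(idx, 2):
            sep = positions[j] - positions[i]
            if min_sep <= sep <= max_sep:
                r2 = pairwise_r2(haplotypes[i], haplotypes[j])
                if not math.isnan(r2):  # skip monomorphic SNPs
                    values.append(r2)
        results.append((start, sum(values) / len(values) if values else None))
        start += step
    return results
```

The defaults mirror the window size, increment, and SNP spacing stated above; all other choices, such as skipping monomorphic SNPs and returning None for windows without eligible pairs, are assumptions of this sketch.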
Fst analysis:
Fst was calculated according to Wright's F-statistic (WRIGHT 1951). Because Fst estimates for SNPs in LD are correlated (WEIR et al. 2005), we selected SNPs whose pairwise r² values were all <0.20 in the CHB + JPT combined set.
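A minimal sketch of these two steps follows. The exact estimator and SNP selection procedure are not specified here, so the sketch assumes one common heterozygosity-based form of Wright's Fst for two equally weighted populations, Fst = (HT - HS)/HT, and a simple greedy filter for the r² threshold. The names `wright_fst` and `prune_by_ld` are hypothetical.

```python
def wright_fst(p1, p2):
    """Fst for one biallelic SNP from allele frequencies in two populations.

    Assumes equal population weights: Fst = (H_T - H_S) / H_T, where H_T is
    the expected heterozygosity at the mean allele frequency and H_S is the
    mean within-population expected heterozygosity.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # SNP monomorphic in both populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_t - h_s) / h_t


def prune_by_ld(r2, threshold=0.20):
    """Greedily keep SNPs whose r^2 with every kept SNP is below threshold.

    r2: symmetric matrix (list of lists) of pairwise r^2 values.
    Returns the indices of the retained SNPs.
    """
    kept = []
    for i in range(len(r2)):
        if all(r2[i][j] < threshold for j in kept):
            kept.append(i)
    return kept


def mean_fst(freq_pairs):
    """Average per-SNP Fst over (p1, p2) allele-frequency pairs."""
    values = [wright_fst(p1, p2) for p1, p2 in freq_pairs]
    return sum(values) / len(values)
```

For example, allele frequencies of 0.4 and 0.6 in two populations give an Fst of 0.04 under this formulation. The greedy filter depends on SNP order, so a different order may retain a different set of SNPs.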
Recombination analysis:
PHASE 2.1 (LI and STEPHENS 2003; CRAWFORD et al. 2004) was used to estimate recombination rates from the shared data sets described above.
Tagging analysis:
Tagging SNPs were selected from the full-density CHB and JPT maps (as well as the combined CHB + JPT map) using Tagger (DE BAKKER et al. 2005; http://www.broad.mit.edu/mpg/tagger/) in aggressive tagging mode (pairwise r² or haplotype r² ≥ 0.80, minor allele frequency cutoff = 5%, and other settings at default values). Tagging efficiency was defined as n/nh, where nh is the number of tagging SNPs selected to cover the region and n is the total number of markers genotyped (KE et al. 2004b). The performance of those tagging SNPs was then evaluated using Tagger with the same settings by applying the tags to the 90 Korean (KR) samples. A marker was considered "captured" if its pairwise r² or haplotype r² with the tags was ≥ 0.80.
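The select-then-evaluate design can be sketched as follows. This is not Tagger's algorithm: it is a simplified greedy stand-in that uses pairwise r² only and ignores the multimarker (haplotype) tests of aggressive mode. All function names are hypothetical, and the inputs are assumed to be symmetric pairwise r² matrices computed separately in the reference and target samples.

```python
def captures(r2, i, j, threshold):
    """True if SNP i captures SNP j (every SNP captures itself)."""
    return i == j or r2[i][j] >= threshold


def greedy_pairwise_tags(r2, threshold=0.80):
    """Pick tags greedily in the reference sample until all SNPs are captured."""
    uncaptured = set(range(len(r2)))
    tags = []
    while uncaptured:
        # Choose the SNP capturing the most uncaptured SNPs (lowest index on ties).
        best = max(
            (i for i in range(len(r2)) if i not in tags),
            key=lambda i: (sum(captures(r2, i, j, threshold) for j in uncaptured), -i),
        )
        tags.append(best)
        uncaptured -= {j for j in uncaptured if captures(r2, best, j, threshold)}
    return sorted(tags)


def tagging_efficiency(n_markers, n_tags):
    """n / nh: markers genotyped per tagging SNP selected."""
    return n_markers / n_tags


def fraction_captured(r2_target, tags, threshold=0.80):
    """Fraction of SNPs in the target sample captured by the given tags."""
    n = len(r2_target)
    hits = sum(any(captures(r2_target, t, j, threshold) for t in tags) for j in range(n))
    return hits / n
```

In use, tags would be chosen from a reference matrix (for example CHB, JPT, or CHB + JPT) and then passed with the Korean-sample matrix to `fraction_captured`. Because haplotype-based capture is omitted, this sketch would tend to select more tags than an aggressive multimarker approach.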
| RESULTS |
|---|
Population difference and LD conservation:
The three ENCODE regions differed in average LD at a broad scale, with ENr131 being the lowest and ENr213 the highest (Figure 1a). LD patterns in all three regions were generally similar among the Korean, Japanese, and Chinese samples, although some differences were apparent in ENr213 and ENr321. Average Fst values for ENr131, ENr213, and ENr321 were 0.0060, 0.0044, and 0.0062, respectively, between Korean and Japanese and 0.0064, 0.0075, and 0.0095, respectively, between Korean and Chinese. Fst was thus low throughout and lower for the Korean-Japanese comparison in all three regions, indicating that the Korean population is very close to the Japanese and Chinese populations in general and, perhaps, closer to Japanese than to Chinese. This observation was reflected in the sliding-window analysis, which showed more departures of the Korean samples from the Chinese samples in ENr213 and ENr321 than in ENr131 (Figure 1a). The close relatedness of the haplotypes of the Korean, Japanese, and Chinese populations was further confirmed by phylogenetic analysis (supplemental Figure 2 at http://www.genetics.org/supplemental/) (AKEY et al. 2002).
Recombination rate:
The pattern of estimated recombination rate in ENr131 was very simple compared to the other two regions and remarkably similar among KR, JPT, and CHB (Figure 1e). In contrast, recombination rates in ENr213 and ENr321 were more varied across each region as well as among populations (Figure 1e). This was consistent with the broad-scale view of LD and is the primary reason why block structure was most similar in ENr131 and less concordant among the three populations in ENr213 and ENr321, even though the latter two regions had higher average LD (Figure 1). In ENr213, the amplitude of recombination rate variation was smaller than in ENr131 and ENr321, in agreement with the fact that this region had the highest LD (Figure 1).
Transferability of tagging SNPs:
We selected tagging SNPs from CHB, JPT, or CHB + JPT samples and applied them to the Korean population. At the same marker densities, tagging efficiency was very similar among the three populations in ENr131 and ENr321, with some difference in ENr213 (Table 2). When tagging SNPs were selected from the full-density sets of CHB, JPT, and CHB + JPT, higher tagging efficiencies were obtained, as expected (4.5, 5.8, and 5.2 in ENr131; 7.0, 7.0, and 7.7 in ENr213; 5.8, 5.9, and 6.1 in ENr321). In all three regions, at least 80% of SNPs in the Korean samples could be captured by tagging SNPs selected from the CHB, JPT, and CHB + JPT HapMap samples (Figure 2). In ENr131 and ENr213, tagging SNPs selected from JPT captured SNPs in the Korean samples more effectively than those selected from CHB. Some additional benefit was also observed when tagging SNPs were selected from CHB + JPT rather than from JPT alone (e.g., ENr213 and ENr321). Since higher tagging efficiency was observed for JPT and CHB + JPT than for CHB, the advantage of using JPT (particularly in ENr131) or CHB + JPT as the reference for the Korean population was more apparent. The difference between JPT and CHB + JPT seems to correspond well to the local LD features of a region. In ENr131, little variation in LD was observed between populations as well as within the Korean population (Figure 1a). As a result, JPT alone seems to be sufficient to capture the variation in the Korean samples. In ENr321 and especially in ENr213, by contrast, more variation was observed between as well as within populations, and the combined CHB + JPT set seems to perform better than JPT alone.
| DISCUSSION |
|---|
The human genome is known to be delimited by recombination into hotspot and coldspot regions (GABRIEL et al. 2002; PHILLIPS et al. 2003; CRAWFORD et al. 2004; MCVEAN et al. 2004). Coldspot regions usually correspond to haplotype blocks, whereas hotspots typically occur where haplotype blocks are expected to break down (GABRIEL et al. 2002; PHILLIPS et al. 2003). The recombination rate pattern was very simple in ENr131, which showed low LD, whereas recombination rates in ENr213 and ENr321 were more varied. In general, the patterns of LD and recombination are highly conserved among the Korean, Chinese, and Japanese samples. Having two related East Asian population HapMaps enables us to examine a region in close detail via comparison. If the Japanese and Chinese maps are highly concordant, as in ENr131, a Korean sample would likely share similar patterns of LD and recombination (Figure 1). On the other hand, if differences are observed between the Japanese and Chinese maps, as in ENr213 and ENr321, a Korean sample might be expected to reveal greater variability within and between populations. This understanding could assist in interpreting disease associations in a particular region.
A more practical use of human genome variation maps is perhaps to design tagging SNPs for related populations in regional or genomewide association studies (JOHNSON et al. 2001; GIBBS et al. 2003; STRAM 2004). The present results also show that tagging SNPs selected from the Chinese and particularly from the Japanese samples are highly transferable to our Korean samples. Even in regions where differences were observed among the three groups, tagging SNPs from the Japanese performed at least as effectively as those from Korean samples. These observations suggest that the Japanese and Chinese HapMaps will be robust for the Korean population and serve as an important resource for the association and population studies of the Koreans and possibly other Asian populations.
| LITERATURE CITED |
|---|
AHMADI, K. R., M. E. WEALE, Z. Y. XUE, N. SORANZO, D. P. YARNALL et al., 2005 A single-nucleotide polymorphism tagging set for human drug metabolism and transport. Nat. Genet. 37(1): 84-89.
AKEY, J. M., G. ZHANG, K. ZHANG, L. JIN and M. D. SHRIVER, 2002 Interrogating a high-density SNP map for signatures of natural selection. Genome Res. 12(12): 1805-1814.
BANSAL, A., D. VAN DEN BOOM, S. KAMMERER, C. HONISCH, G. ADAM et al., 2002 Association testing by DNA pooling: an effective initial screen. Proc. Natl. Acad. Sci. USA 99: 16871-16874.
BARRETT, J. C., B. FRY, J. MALLER and M. J. DALY, 2005 Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21(2): 263-265.
CRAWFORD, D. C., T. BHANGALE, N. LI, G. HELLENTHAL, M. J. RIEDER et al., 2004 Evidence for substantial fine-scale variation in recombination rates across the human genome. Nat. Genet. 36(7): 700-706.
DALY, M. J., J. D. RIOUX, S. F. SCHAFFNER, T. J. HUDSON and E. S. LANDER, 2001 High-resolution haplotype structure in the human genome. Nat. Genet. 29(2): 229-232.
DE BAKKER, P. I., R. YELENSKY, I. PE'ER, S. B. GABRIEL, M. J. DALY et al., 2005 Efficiency and power in genetic association studies. Nat. Genet. 37(11): 1217-1223.
DENOMME, G. A., and M. VAN OENE, 2005 High-throughput multiplex single-nucleotide polymorphism analysis for red cell and platelet antigen genotypes. Transfusion 45(5): 660-666.
DUDBRIDGE, F., and B. P. KOELEMAN, 2004 Efficient computation of significance levels for multiple associations in large studies of correlated data, including genomewide association studies. Am. J. Hum. Genet. 75(3): 424-435.
EVANS, D. M., L. R. CARDON and A. P. MORRIS, 2004 Genotype prediction using a dense map of SNPs. Genet. Epidemiol. 27(4): 375-384.
GABRIEL, S. B., S. F. SCHAFFNER, H. NGUYEN, J. M. MOORE, J. ROY et al., 2002 The structure of haplotype blocks in the human genome. Science 296: 2225-2229.
GIBBS, R. A., J. W. BELMONT, P. HARDENBOL, T. D. WILLIS, F. YU et al., 2003 The International HapMap Project. Nature 426: 789-796.
INTERNATIONAL HAPMAP CONSORTIUM, 2005 A haplotype map of the human genome. Nature 437: 1299-1320.
JOHNSON, G. C., L. ESPOSITO, B. J. BARRATT, A. N. SMITH, J. HEWARD et al., 2001 Haplotype tagging for the identification of common disease genes. Nat. Genet. 29(2): 233-237.
KE, X., S. HUNT, W. TAPPER, R. LAWRENCE, G. STAVRIDES et al., 2004a The impact of SNP density on fine-scale patterns of linkage disequilibrium. Hum. Mol. Genet. 13(6): 577-588.
KE, X., C. DURRANT, A. P. MORRIS, S. HUNT, D. R. BENTLEY et al., 2004b Efficiency and consistency of haplotype tagging of dense SNP maps in multiple samples. Hum. Mol. Genet. 13(21): 2557-2565.
LI, N., and M. STEPHENS, 2003 A new multilocus model for linkage disequilibrium, with application to exploring variations in recombination rate. Genetics 165: 2213-2233.
MCVEAN, G. A., S. R. MYERS, S. HUNT, P. DELOUKAS, D. R. BENTLEY et al., 2004 The fine-scale structure of recombination rate variation in the human genome. Science 304(5670): 581-584.
MORTON, N. E., 2005 Linkage disequilibrium maps and association mapping. J. Clin. Invest. 115(6): 1425-1430.
MUELLER, J. C., E. LOHMUSSAAR, R. MAGI, M. REMM, T. BETTECKEN et al., 2005 Linkage disequilibrium patterns and tagSNP transferability among European populations. Am. J. Hum. Genet. 76(3): 387-398.
NIELSEN, R., M. J. HUBISZ and A. G. CLARK, 2004 Reconstituting the frequency spectrum of ascertained single-nucleotide polymorphism data. Genetics 168: 2373-2382.
PHILLIPS, M. S., R. LAWRENCE, R. SACHIDANANDAM, A. P. MORRIS, D. J. BALDING et al., 2003 Chromosome-wide distribution of haplotype blocks and the role of recombination hot spots. Nat. Genet. 33(3): 382-387.
PITTMAN, A. M., A. J. MYERS, P. ABOU-SLEIMAN, H. C. FUNG, M. KALEEM et al., 2005 Linkage disequilibrium fine-mapping and haplotype association analysis of the tau gene in progressive supranuclear palsy and corticobasal degeneration. J. Med. Genet. 42(11): 837-846.
STRAM, D. O., 2004 Tag SNP selection for association studies. Genet. Epidemiol. 27(4): 365-374.
TERWILLIGER, J. D., F. HAGHIGHI, T. S. HIEKKALINNA and H. H. H. GORING, 2002 A biased assessment of the use of SNPs in human complex traits. Curr. Opin. Genet. Dev. 12(6): 726-734.
THOMAS, D., R. XIE and M. GEBREGZIABHER, 2004 Two-stage sampling designs for gene association studies. Genet. Epidemiol. 27(4): 401-414.
WEIR, B. S., L. R. CARDON, A. D. ANDERSON, D. M. NIELSEN and W. G. HILL, 2005 Measures of human population structure show heterogeneity among genome regions. Genome Res. 15: 1468-1476.
WRIGHT, S., 1951 The genetical structure of populations. Ann. Eugen. 15: 323-354.
Communicating editor: R. W. DOERGE