Genetics, Vol. 167, 423-437, May 2004, Copyright © 2004

Nucleotide Variation at Msn and Alas2, Two Genes Flanking the Centromere of the X Chromosome in Humans

Michael W. Nachmana, Susan L. D'Agostinoa, Christopher R. Tillquistb, Zahra Mobasherb, and Michael F. Hammera,b
a Department of Ecology and Evolutionary Biology, Division of Biotechnology, University of Arizona, Tucson, Arizona 85721
b Genomic Analysis and Technology Core, Division of Biotechnology, University of Arizona, Tucson, Arizona 85721

Corresponding author: Michael W. Nachman, Biosciences West Bldg., University of Arizona, Tucson, AZ 85721., nachman{at}u.arizona.edu (E-mail)

Communicating editor: M. A. F. NOOR


*  ABSTRACT

The centromeric region of the X chromosome in humans experiences low rates of recombination over a considerable physical distance. In such a region, the effects of selection may extend to linked sites that are far away. To investigate the effects of this recombinational environment on patterns of nucleotide variability, we sequenced 4584 bp at Msn and 4697 bp at Alas2, two genes situated on either side of the X chromosome centromere, in a worldwide sample of 41 men, as well as in one common chimpanzee and one orangutan. To investigate patterns of linkage disequilibrium (LD) across the centromere, we also genotyped several informative sites from each gene in 120 men from sub-Saharan Africa. By studying X-linked loci in males, we were able to recover haplotypes and study long-range patterns of LD directly. Overall patterns of variability were remarkably similar at these two loci. Both loci exhibited (i) very low levels of nucleotide diversity (among the lowest seen in the human genome); (ii) a strong skew in the distribution of allele frequencies, with an excess of both very-low and very-high-frequency derived alleles in non-African populations; (iii) much less variation in the non-African than in the African samples; (iv) very high levels of population differentiation; and (v) complete LD among all sites within loci. We also observed significant LD between Msn and Alas2 in Africa, despite the fact that they are separated by ~10 Mb. These observations are difficult to reconcile with a simple demographic model but may be consistent with positive and/or purifying selection acting on loci within this large region of low recombination.


THE amount and distribution of genetic variation in human populations is a central issue in population genetics. With the completion of the human genome sequence (LANDER et al. 2001; VENTER et al. 2001), a major goal now is to identify variation among individuals and among populations. A detailed description of this variation will provide the necessary background for the design of efficient association studies to uncover genes involved in complex diseases. Studies of human molecular variation also shed light on the relative importance of different population genetic processes (e.g., mutation, drift, selection, and recombination) and thus provide clues to the mechanism of evolutionary change at the molecular level. Finally, patterns of nucleotide variation across the genome help reveal human evolutionary history, including relationships among major ethnic groups, patterns of migration and range expansions, and changes in population size.

Considerable work over the past decade has documented DNA sequence variation in humans. Early studies focused primarily on mitochondrial DNA (VIGILANT et al. 1991) and the Y chromosome (HAMMER 1995; WHITFIELD et al. 1995; UNDERHILL et al. 2000), while more recent single-locus studies have focused on the X chromosome (e.g., NACHMAN et al. 1998; HARRIS and HEY 1999; KAESSMANN et al. 1999; NACHMAN and CROWELL 2000A; GILAD et al. 2002; SAUNDERS et al. 2002; VERRELLI et al. 2002; YU et al. 2002) and on the autosomes (e.g., HARDING et al. 1997; CLARK et al. 1998; RIEDER et al. 1999; FULLERTON et al. 2000; HAMBLIN and DI RIENZO 2000; HARDING et al. 2000; ZHAO et al. 2000; ALONSO and ARMOUR 2001; BAMSHAD et al. 2002; ENARD et al. 2002; TOOMAJIAN and KREITMAN 2002; WOODING et al. 2002). One of the clear results to emerge from this body of work is the substantial heterogeneity among genes in overall patterns of variation, including differences in the level of nucleotide diversity, the amount of linkage disequilibrium, and the frequency distribution of alleles. For example, LI and SADLER (1991) suggested 13 years ago that the average level of nucleotide heterozygosity is quite low in humans (π = 0.1%), and subsequent work has largely confirmed this result (PRZEWORSKI et al. 2000). However, it is clear that this average value masks substantial variation in levels of heterozygosity among different genes; some loci exhibit almost no variation (e.g., Xq13.3, π = 0.036%; KAESSMANN et al. 1999) while others exhibit variation more than four times higher than average (e.g., 16p13.3, π = 0.46%; ALONSO and ARMOUR 2001). 
A portion of these differences may be accounted for by the differences in effective population size associated with the autosomes, X chromosome, Y chromosome, or mitochondrial DNA; however, heterozygosity still varies by more than one order of magnitude once effective population size differences are taken into account (NACHMAN 2001). Similarly, some regions of the genome exhibit nonrandom associations [i.e., linkage disequilibrium (LD)] between single-nucleotide polymorphisms (SNPs) over hundreds of kilobases (REICH et al. 2001; SABETI et al. 2002; SAUNDERS et al. 2002), while other regions exhibit no associations over distances of <1 kb. Likewise, the distribution of allele frequencies differs significantly among loci (HEY 1997). For example, mitochondrial loci and many nuclear loci harbor an excess (over neutral predictions) of low-frequency alleles (e.g., HEY 1997; KAESSMANN et al. 1999; NACHMAN and CROWELL 2000A; STEPHENS et al. 2001; PTAK and PRZEWORSKI 2002), while some nuclear genes show the opposite pattern, with an excess of intermediate-frequency alleles (HARDING et al. 1997; HARRIS and HEY 1999; BAMSHAD et al. 2002).

These and other patterns of DNA sequence variation are context dependent in at least two important ways. First, the distribution of genetic variation is a property of populations and, as such, is expected to vary among populations with different histories. For example, REICH et al. (2001) report higher levels of LD in non-African compared with African populations. There is also considerable evidence suggesting that, in general, non-African populations harbor less genetic variation than African populations (e.g., VIGILANT et al. 1991). For many loci, African populations harbor more rare alleles than non-African populations (WALL and PRZEWORSKI 2000), although for some loci, the opposite pattern is seen (NACHMAN and CROWELL 2000A). Attempts to identify population-specific patterns were hampered initially by the lack of a common sampling scheme for the loci under comparison. More recently, however, several impressive studies have sampled multiple loci in a common set of individuals (e.g., FRISSE et al. 2001; PATIL et al. 2001; STEPHENS et al. 2001; YU et al. 2002; KITANO et al. 2003; SEATTLE SNPs 2003). More studies of this sort will help disentangle locus-specific effects, such as selection, from population-specific effects or patterns that are a consequence of a particular sampling strategy.

A second way in which context is important is in the genomic position of genes. Different regions of the genome differ in many important attributes, including gene density, local rate of recombination, mutation rate, and base composition. With the human genome sequence in hand, we can now begin to quantify some of these parameters more precisely and ask how they influence patterns of genetic variation. For example, nucleotide heterozygosity is positively correlated with recombination rate and negatively correlated with gene density (PAYSEUR and NACHMAN 2002), a result that is expected under different models invoking the joint effects of selection and linkage (MAYNARD SMITH and HAIGH 1974; CHARLESWORTH et al. 1993; GALTIER et al. 2000). These models in turn make different predictions about the frequency distribution of alleles. Likewise, the mutation rate in mammalian genomes is likely to vary as a function of base composition. For example, mutation rates at CpG sites are ~10 times higher than the average as a consequence of deamination of 5-methylcytosine (COOPER and KRAWCZAK 1993; SOMMER and KETTERLING 1996; NACHMAN and CROWELL 2000B). Mutation rate appears to vary in a nonlinear way with overall GC content, and mutation rate (as reflected in interspecific divergence) is also positively correlated with both recombination rate and SNP density (LERCHER and HURST 2002; WATERSTON et al. 2002; HARDISON et al. 2003; HELLMANN et al. 2003).

To understand the determinants of nucleotide variation in humans, we have initiated a long-term study of DNA sequence polymorphism in different regions of the human genome in a common set of samples (NACHMAN and CROWELL 2000A; SAUNDERS et al. 2002; HAMMER et al. 2003). Our sample includes 41 individuals, with 10 each from Africa, Europe, and the Americas, and 11 from Asia. Much of our effort is focused on the X chromosome, which, because of its hemizygosity in males, allows us to study long-range patterns of linkage disequilibrium. Here, we report on two genes located on either side of the X centromere, Msn and Alas2 (Fig 1). Both genes are situated in regions of very low recombination and low gene density. Patterns of variation are remarkably similar at these two loci: we found overall low levels of nucleotide heterozygosity, an excess of rare alleles, considerably less variation in the non-African samples than in the African sample, and no evidence for intragenic recombination.



 
Figure 1. Map of the human X chromosome. Genes for which published polymorphism data are available are shown above the chromosome. The centromeric region flanked by Msn and Alas2 is shown below the chromosome, with all known genes in this 10-Mb region. Open boxes immediately above Alas2 and Msn indicate the regions that have been sequenced in this study.


*  SUBJECTS AND METHODS

Samples:
Forty-one men were chosen for the initial sequencing of Msn and Alas2 (see below), including 10 from Africa, 10 from Europe, 11 from Asia (including 1 from Melanesia), and 10 from the Americas. Human genomic DNAs were isolated from lymphoblastoid cell lines established by the Y CHROMOSOME CONSORTIUM (2002) at the New York Blood Center from blood donated by volunteers who gave informed consent. To measure LD in a larger sample, we also resequenced ~750 bp from each gene to capture several informative sites in 110 men from sub-Saharan Africa (29 South African Bantu speakers, 1 Biaka, 13 Cameroonians, 21 Gambians, 32 Khoisan, 1 Mbuti, and 13 Tanzanians). All sampling protocols were according to procedures approved by the New York Blood Center and University of Arizona Human Subjects Committees. A single male common chimpanzee (Pan troglodytes) and a single male orangutan (Pongo pygmaeus) were also surveyed from DNAs provided by O. A. Ryder.

PCR amplification and sequencing of Msn and Alas2:
A map of the centromeric region of the human X chromosome is shown in Fig 1. Msn (moesin, membrane-organizing extension spike protein) and Alas2 (aminolevulinate, delta-, synthase 2) are separated by ~10 Mb of DNA in the assembled sequence of the human genome (LANDER et al. 2001). The exact distance between these loci is uncertain because of incomplete sequence assembly across the centromere of the X chromosome. Both Msn and Alas2 lie in genomic regions experiencing low rates of recombination (<1 cM/Mb; PAYSEUR and NACHMAN 2000; YU et al. 2001; KONG et al. 2002). DNA was PCR amplified in 25-µl volumes with 40 cycles of 94° 1 min, 55° 1 min, and 72° 2 min. Amplification primers were designed from published sequence for Msn exon 2 and intron 2 (GenBank accession no. Z98946) and for Alas2 from introns 8 and 10 (GenBank accession no. AF068624). Products were cycle sequenced on both strands and run on an ABI 377 automated sequencer. For Msn, a total of 4584 bp was sequenced in our worldwide sample of 41 individuals, entirely from intron 2 (the first base in our sequence corresponds to the first base of intron 2 in GenBank accession Z98946). For Alas2, portions of introns 8 and 10, and all of exon 9, intron 9, and exon 10, were sequenced in our worldwide sample of 41 individuals; of the Alas2 sequence, 4697 bp represent introns. To investigate LD in our sample of 120 Africans, we resequenced nucleotides 664–1413 of Msn (750 bp) and 2895–3682 of Alas2 (789 bp). Sequences have been submitted to GenBank under accession nos. 
AY530963, AY530964, AY530965, AY530966, AY530967, AY530968, AY530969, AY530970, AY530971, AY530972, AY530973, AY530974, AY530975, AY530976, AY530977, AY530978, AY530979, AY530980, AY530981, AY530982, AY530983, AY530984, AY530985, AY530986, AY530987, AY530988, AY530989, AY530990, AY530991, AY530992, AY530993, AY530994, AY530995, AY530996, AY530997, AY530998, AY530999, AY531000, AY531001, AY531002, AY531003, AY531004, AY531005 (Msn) and AY532068, AY532069, AY532070, AY532071, AY532072, AY532073, AY532074, AY532075, AY532076, AY532077, AY532078, AY532079, AY532080, AY532081, AY532082, AY532083, AY532084, AY532085, AY532086, AY532087, AY532088, AY532089, AY532090, AY532091, AY532092, AY532093, AY532094, AY532095, AY532096, AY532097, AY532098, AY532099, AY532100, AY532101, AY532102, AY532103, AY532104, AY532105, AY532106, AY532107, AY532108, AY532109 and AY532641 (Alas2).

Data analysis:
Sequences were aligned by eye, and the numbers and frequencies of all polymorphisms were counted. Two measures of nucleotide variability were calculated: π (NEI and LI 1979) and θ (WATTERSON 1975). Nucleotide diversity, π, is based on the average number of nucleotide differences between two sequences randomly drawn from a sample, and θ is based on the proportion of segregating sites in a sample. Under neutral equilibrium conditions, both π and θ estimate the parameter 3Neµ for X-linked loci, where Ne is the effective population size and µ is the neutral mutation rate. Departures from a neutral steady-state frequency distribution of polymorphisms were evaluated using three approaches (TAJIMA 1989; FU and LI 1993; FAY and WU 2000). TAJIMA's (1989) test compares the average number of nucleotide differences between sequences (π) with the proportion of polymorphic sites (θ) in a sample; FU and LI's (1993) test is based on the number of singletons in a sample; and FAY and WU's (2000) test is based on the number of high-frequency-derived polymorphic nucleotides in a sample. Both TAJIMA's (1989) and FU and LI's (1993) tests may reject the null model because of selection or because of demographic processes (such as a population bottleneck); however, FAY and WU's (2000) test is unlikely to reject the null model except in cases where selection is operating (but see also PRZEWORSKI 2002). Linkage disequilibrium (D') was calculated for a set of independent pairwise comparisons between nonunique polymorphic sites (LEWONTIN 1964, 1995), and the significance of D' was assessed using Fisher's exact tests (FET). Ratios of polymorphism within humans to divergence between human and chimpanzee were compared with expectations under a neutral model using the Hudson-Kreitman-Aguadé (HKA) test (HUDSON et al. 1987). 
Polymorphism was based on variation segregating among the 41 human alleles and divergence was based on a single randomly chosen human allele and a single chimpanzee allele. The program Genetree v 9.0 (BAHLO and GRIFFITHS 2000) was used to infer the root of the tree and to estimate the time to the most recent common ancestor (TMRCA). Analysis of molecular variance (AMOVA) was used to infer population structure (EXCOFFIER et al. 1992).
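
The summary statistics described above can be illustrated with a minimal Python sketch. It is not the authors' software; the function names and the four-sequence toy alignment are hypothetical, and the formulas are the standard ones from TAJIMA (1989). The sketch computes the number of segregating sites, the mean number of pairwise differences, and Tajima's D.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Count alignment columns that contain more than one nucleotide."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differences between two sequences in the sample."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def tajimas_d(seqs):
    """Tajima's D: scaled difference between pairwise diversity and S/a1."""
    n = len(seqs)
    s = segregating_sites(seqs)
    k = mean_pairwise_differences(seqs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment (not data from this study): three singletons.
toy = ["AAAA", "AAAT", "AATA", "ATAA"]
length = len(toy[0])
a1_toy = sum(1 / i for i in range(1, len(toy)))
pi_per_site = mean_pairwise_differences(toy) / length
theta_per_site = segregating_sites(toy) / (a1_toy * length)
print(pi_per_site, theta_per_site, tajimas_d(toy))
```

In this toy sample every polymorphism is a singleton, so π falls below θ and D is negative, the same direction of skew as an excess of rare alleles.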


*  RESULTS

Levels of polymorphism and divergence:
A total of ~9.6 kb was sequenced from Msn and Alas2 in a sample of 41 globally dispersed humans (Fig 1). Because all of the sequence from Msn is from introns, we have excluded the short exon sequences from Alas2 in all the analyses that follow. This increases the likelihood that all comparisons are among genomic regions experiencing similar levels of selective constraint. Thus, most analyses and discussion refer only to the 9281 bp of intron sequences (Table 1). Polymorphic sites for Msn and Alas2 introns are shown in Table 2. Numbers of segregating sites, nucleotide diversity, measures of the distribution of allele frequencies, and levels of divergence are summarized in Table 1 for the complete data set of 41 individuals. Nine segregating sites were observed in Msn, while 7 segregating sites and two single-base insertion-deletion polymorphisms were observed in Alas2. Msn also had a variable poly(A) tract ranging in length from 17 to 25 bp. Nucleotide diversity was low at both Msn (π = 0.00035) and Alas2 (π = 0.00015), as was Watterson's θ (0.00046 and 0.00035, respectively).
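
As a rough check on the reported values of Watterson's θ, the per-site estimate can be recomputed from the counts given in the text. This is a sketch under stated assumptions: only the SNP segregating sites are counted (9 for Msn, 7 for Alas2), the Msn intron length is taken as 9281 bp minus the 4697 bp of Alas2 intron sequence, and the function name is hypothetical.

```python
def watterson_theta_per_site(num_segregating, sample_size, length_bp):
    """Watterson's estimator: S divided by (a1 * L), with a1 = sum of 1/i."""
    a1 = sum(1 / i for i in range(1, sample_size))
    return num_segregating / (a1 * length_bp)


ALAS2_INTRON_BP = 4697
MSN_INTRON_BP = 9281 - ALAS2_INTRON_BP  # assumption: remainder of the 9281 bp

theta_msn = watterson_theta_per_site(9, 41, MSN_INTRON_BP)
theta_alas2 = watterson_theta_per_site(7, 41, ALAS2_INTRON_BP)
print(round(theta_msn, 5), round(theta_alas2, 5))
```

Under these assumptions the two estimates round to 0.00046 and 0.00035, matching the values stated above.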


 

 
Table 1. Nucleotide polymorphism and divergence at Msn and Alas2


 

 
Table 2. Individual samples and polymorphic sites at MSN and ALAS2

Divergence between humans and chimpanzee for both Msn (0.0092) and Alas2 (0.0055) was comparable to results from previous studies of X-linked introns (average divergence for seven loci = 0.0072; NACHMAN et al. 1998), suggesting that the neutral mutation rate at these loci is similar to genomic average values (~2 × 10⁻⁸/site/generation; NACHMAN and CROWELL 2000B). Nonetheless, the divergence at Msn was >50% higher than the divergence at Alas2. Similarly, levels of polymorphism at Msn were higher than levels of polymorphism at Alas2. The fact that Msn is more variable than Alas2 both within and between species is consistent with a higher mutation rate (or lower level of constraint, or both) for Msn compared to Alas2.

We compared levels of polymorphism and divergence at Msn and Alas2 to levels of polymorphism and divergence at two other X-linked loci using the HKA test (HUDSON et al. 1987) to test the neutral prediction that these ratios should be the same. Polymorphism data include SNPs in humans, and divergence is based on a randomly chosen allele from humans and a randomly chosen allele from chimpanzees. DmdI44 (NACHMAN and CROWELL 2000A) and Pdha1 (HARRIS and HEY 1999) were chosen as reference loci in these comparisons because both loci reside in genomic regions with moderate to high rates of recombination and thus should be relatively free of the effects of selection at linked sites. DmdI44 was surveyed in the same set of individuals used in the present study. In the total sample, Msn and Alas2 showed marginally significantly lower variation than expected (0.05 < P < 0.10) relative to both DmdI44 and Pdha1 (Table 3). The combined data from Msn and Alas2 showed significantly lower variation than expected relative to both DmdI44 and Pdha1 (P = 0.03; Table 3). In the African sample alone, Msn showed significantly lower variation than expected while Alas2 did not; in the non-African sample, both Msn and Alas2 showed low variation relative to DmdI44 but not relative to Pdha1 (Table 3). This is consistent with the known low level of variation at Pdha1 in non-African populations (HARRIS and HEY 1999).


 

 
Table 3. HKA test results comparing Msn and Alas2 to DmdI44 and Pdha1

Frequency distribution of polymorphisms:
The frequency distribution of all polymorphisms is plotted in Fig 2. The ancestral state of each polymorphic site was inferred by comparison with the chimpanzee and orangutan sequences, and the frequency of the derived state is shown for each polymorphism. The distribution is characterized by a large number of both low-frequency- and high-frequency-derived polymorphisms. We compared the observed distribution with the distribution expected under the standard neutral model using Tajima's D, Fu and Li's D, and Fay and Wu's H tests (Table 1 and Table 4). In the total sample, all three of these tests take on negative values for Msn alone, Alas2 alone, and for the combined data (Msn + Alas2), and most of these values are either significant or marginally significant (Table 1). When different geographic regions are considered separately, all three test statistics are ~0 in Africa, consistent with a neutral model, but are strongly negative in non-African populations (Table 4). Thus, much of the deviation observed in the total sample appears to be due to deviations largely in the non-African populations. The direction of the deviation is consistent with positive selection, a population expansion, background selection (depending on the strength of selection), and/or some form of population structure (see DISCUSSION).
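
The sensitivity of Fay and Wu's H to high-frequency-derived alleles can be sketched from a derived-allele frequency spectrum. This is a minimal illustration using the standard estimators from FAY and WU (2000); the function name and the example counts are hypothetical and are not taken from Table 1 or Table 4.

```python
def fay_wu_h(derived_counts, n):
    """Fay and Wu's H = theta_pi - theta_H.

    derived_counts: number of sampled chromosomes carrying the derived
    allele at each polymorphic site; n: number of chromosomes sampled.
    """
    denom = n * (n - 1)
    theta_pi = sum(2 * i * (n - i) for i in derived_counts) / denom
    theta_h = sum(2 * i * i for i in derived_counts) / denom
    return theta_pi - theta_h


# Hypothetical spectra for a sample of 10 chromosomes.
print(fay_wu_h([9], 10))  # one derived allele at high frequency
print(fay_wu_h([1], 10))  # one derived singleton
```

A derived allele carried by 9 of 10 chromosomes gives a negative H, whereas a derived singleton gives a slightly positive H, which is why an excess of high-frequency-derived variants drives H below zero.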



 
Figure 2. The frequency distribution of polymorphisms at Msn and Alas2 among 41 humans. Both SNPs and indels are included. The frequency of the derived state is shown, with polarity assessed in comparison to the chimpanzee and orangutan.


 

 
Table 4. Amount and distribution of polymorphisms at Msn and Alas2 by geographic region

Linkage disequilibrium:
Complete linkage disequilibrium was observed among all sites within Msn and among all sites within Alas2. For example, when all pairwise comparisons were made among nonsingleton segregating sites, none of the comparisons between pairs of sites in Msn (N = 10) or Alas2 (N = 6) contained all four gametic types (i.e., D' = 1 in all cases). LD was also observed between Msn and Alas2; none of the 20 comparisons between pairs of sites across Msn and Alas2 contained all four gametic types. We tested the significance of LD by comparing pairs of sites in order along the chromosome; this provides a set of statistically independent comparisons for tests of significance (LEWONTIN 1995). We excluded singletons and doubletons and compared the following seven sites in order: M1046, M1312, M1414, M4325, A2144, A3203, and A3416. Significant linkage disequilibrium was observed in each of the six sequential comparisons involving these sites (FET, P < 0.01 for each, after Bonferroni correction for multiple tests). To see if this degree of LD was driven by the differences between African and non-African samples (see Geographic variation), we conducted the same analysis on the African sample alone. Despite the small sample size (n = 10), significant linkage disequilibrium was observed in each of the sequential comparisons involving polymorphic sites in Africa alone (FET, P < 0.05 for each). This analysis could not be performed on the non-African sample alone because of the absence of intermediate-frequency polymorphisms.
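
The link between the four-gamete criterion and D' = 1 can be shown with a short sketch. It assumes phased two-site haplotypes coded as two-character strings ("0" and "1" for the two alleles at each site), which is how male X-linked data can be read directly; the function names and haplotype counts are hypothetical.

```python
def has_all_four_gametes(haplotypes):
    """True if all four two-site haplotypes (00, 01, 10, 11) are present."""
    return len(set(haplotypes)) == 4


def d_prime(haplotypes):
    """Lewontin's |D'| for two biallelic sites, both assumed polymorphic."""
    n = len(haplotypes)
    p_a = sum(h[0] == "1" for h in haplotypes) / n
    p_b = sum(h[1] == "1" for h in haplotypes) / n
    p_ab = sum(h == "11" for h in haplotypes) / n
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return abs(d / d_max)


# Hypothetical samples of 10 chromosomes.
no_recombinants = ["00"] * 6 + ["11"] * 4
with_recombinants = ["00"] * 5 + ["11"] * 3 + ["01", "10"]
print(has_all_four_gametes(no_recombinants), d_prime(no_recombinants))
print(has_all_four_gametes(with_recombinants), d_prime(with_recombinants))
```

Whenever at least one of the four gametic types is missing, |D'| equals 1; once all four are present, |D'| drops below 1.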

We were surprised to find LD between Msn and Alas2 in Africa because these genes are separated by ~10 Mb. For example, in a study of 19 genomic regions using an African sample from Nigeria, REICH et al. (2001) found that LD typically decayed to half of its maximum value within 5 kb. Our African sample included only 10 individuals. To explore the possibility that recombinant haplotypes are present in Africa and to better document LD between Msn and Alas2, we genotyped an additional 110 African individuals for four informative sites: Msn 1046, Msn 1312, Alas2 3203, and Alas2 3416. To genotype these SNPs, we PCR amplified and sequenced ~750 bp from each gene. Table 5 shows all of the polymorphisms among these 120 Africans (the original 10 plus 110 new individuals). Three observations are noteworthy. First, D' = 1 between sites within each gene in the total sample of 120 Africans. Second, in comparisons between Msn and Alas2, we observed all four gametic types at appreciable frequencies, suggesting that the absence of some haplotypes in the smaller set of 10 individuals (Table 2) was simply a consequence of the small sample size. Third, despite the presence of these new haplotypes in the larger sample, we observed significant LD between sites at Msn and sites at Alas2 (Msn 1046–Alas2 3203, FET P = 0.01, D' = 0.58; Msn 1046–Alas2 3416, FET P = 0.004, D' = 0.86). The difference in D' between our sample of 10 (D' = 1) and our sample of 120 (D' = 0.58) in comparisons between Msn 1046 and Alas2 3203 highlights the importance of using large samples to make inferences concerning LD.
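
A Fisher's exact test on a 2 × 2 table of haplotype counts can be sketched with the standard library alone. The implementation below is a generic two-sided test (summing the probabilities of all tables no more likely than the observed one) and is not the authors' code; the counts are hypothetical and do not come from Table 2 or Table 5.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-9)
    )


# Hypothetical counts of the four two-site haplotypes in 10 chromosomes:
# rows = allele at site 1, columns = allele at site 2.
p_complete = fisher_exact_two_sided(6, 0, 0, 4)
print(p_complete)
```

With these hypothetical counts, complete association between two intermediate-frequency sites gives P = 1/210 (about 0.005) even with only 10 chromosomes, which illustrates how a small sample can still yield a significant test when no recombinant haplotypes are observed.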


 

 
Table 5. Haplotypes defined by variant sites in subregions of Msn and Alas2 and their frequencies among 120 African individuals

The LD in this data set can also be seen in the phylogenetic analysis. Using parsimony, the 18 human polymorphisms in Table 2 were mapped onto a single shortest tree of length 19 (Msn site 245 includes two mutations resulting in three segregating nucleotides; Fig 3A). In this tree, there are two equally parsimonious placements of the root. A haplotype network of the 120 Africans based on subregions of Msn and Alas2 is shown in Fig 3B. The reticulation in Fig 3B is consistent with the recombinant haplotypes in Table 5. In this network, there is a single most parsimonious placement of the root between haplotypes G and H. Two alternative evolutionary hypotheses for the evolution of the six African haplogroups (C–H) are shown in Fig 4; both hypotheses involve four mutations and one recombination event.



 
Figure 3. (A) Haplotype network based on 9281 bp combined from Msn and Alas2 in a worldwide sample of 41 individuals. Mutations are indicated on branches and correspond to the numbers in Table 2. The size of each circle is proportional to the frequency of the haplotype in the total sample. (B) Haplotype network for 120 African individuals based on nucleotide sites Msn 664–1413 (750 bp) and Alas2 2895–3682 (789 bp).



 
Figure 4. Alternative evolutionary hypotheses for the origin of the six major African haplogroups. Both hypotheses require four mutations and one recombination event.

Geographic variation:
The geographic distribution of nucleotide variation at Msn and Alas2 is shown in Table 4. For both genes, nucleotide diversity is substantially lower in the non-African than in the African samples. The distribution of haplotypes in the combined Msn and Alas2 data (N = 41, Table 2 and Fig 3A) is quite different in the African and non-African samples. In Africa, haplotypes are present in only one or two individuals; in the non-African sample, a single very common haplotype (A1) is shared among 25 of 31 individuals (Fig 3A). This haplotype, represented by the consensus sequence in Table 2, is present in each of the non-African continents surveyed. The tree in Fig 3A is largely split into an African clade and a non-African clade. One exception to this pattern is a single Native American [Y Chromosome Consortium (YCC) 27] with a Msn haplotype otherwise found only in Africa. This same individual also contained a polymorphism at DmdI7 otherwise found only in Africa (NACHMAN and CROWELL 2000A). Both observations are consistent with recent admixture between the Poarch Creek and African Americans. Another individual (YCC 26) from the United Kingdom showed two polymorphisms (Msn 1414 C and Msn 4325 T) otherwise found only in Africa. In our sample of 120 Africans, 115 individuals had a "C" at site Msn 1414 while only 5 individuals had a "T" at this site. This is largely consistent with the division of the tree in Fig 3A into a mostly African clade and a mostly non-African clade. It is interesting to note that, as recently observed for another X-linked locus (NACHMAN and CROWELL 2000A), Native Americans have more diversity than Asians at Msn. This is still true even when the Poarch Creek chromosome is removed from the analysis.

The differentiation between Africans and non-Africans was reflected in patterns of within- and between-group variation in an AMOVA. When the four population samples (i.e., from each continent) were clustered in a single group, ΦST was 0.45 (Table 6). When populations were divided into Africans and non-Africans, the ΦST value increased to 0.63. Interestingly, 100% of this between-group variation was partitioned between Africans and non-Africans (e.g., ΦCT = 0.65 and ΦSC = –0.06; Table 6). The difference between Africans and non-Africans at Msn + Alas2 is greater than the level of differentiation observed for other loci on the X or Y chromosomes sampled in these same individuals (Table 6).
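
The three Φ statistics reported for the two-group analysis are tied together by the standard hierarchical identity (1 − ΦST) = (1 − ΦSC)(1 − ΦCT). The short sketch below, with a hypothetical function name, checks that the values quoted in the text satisfy it.

```python
def phi_st_from_components(phi_ct, phi_sc):
    """Combine among-group (CT) and within-group (SC) Phi into Phi_ST."""
    return 1 - (1 - phi_sc) * (1 - phi_ct)


phi_st = phi_st_from_components(phi_ct=0.65, phi_sc=-0.06)
print(round(phi_st, 2))
```

The result rounds to 0.63, in agreement with the ΦST reported for the African versus non-African grouping; the slightly negative ΦSC indicates no detectable structure among populations within groups.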


 

 
Table 6. AMOVA showing population differentiation for a set of loci sampled in the same individuals

The network in Fig 3A contains three major haplogroups (C–E) in sub-Saharan Africa. Haplogroup C was found in a single Pygmy, haplogroup D (D1–D5) was found in South African Bantu speakers and Pygmies (as well as in the Poarch Creek; see above), and haplogroup E (E1–E3) was found in Khoisan and in Pygmies. Thus, Pygmies exhibit the highest level of diversity in this small sample of sub-Saharan Africans.

Comparisons with the chimpanzee sequence place the root of the tree in Fig 3A either on the branch between haplogroups C and D or on the branch between haplogroups C and E; these alternative roots are equally parsimonious in our sample of 41 individuals. The larger sample of 120 African individuals places the root between haplogroups G and H (Fig 3B). In both cases, African samples occur on both sides of the deepest node in the tree. Thus, Africa is the most likely location of the ancestral Msn-Alas2 sequence. We used two approaches to estimate the time to the most recent common ancestral Msn-Alas2 sequence. First, we calculated the average number of differences between the root of the tree in Fig 3A and each haplotype and multiplied this by two. This reflects the average distance between haplotypes through the root of the tree. We assumed a human-chimpanzee divergence of 6 million years and calculated the ratio of the average distance between haplotypes through the root of the human tree to the human-chimpanzee Msn-Alas2 divergence. The average number of mutations between sequences across the base of the human gene tree was 6.2, while divergence between the human consensus and the chimpanzee sequences was 73. This leads to an estimated time of human sequence divergence of 510,000 years. This time estimate is slightly larger than the one generated from the TMRCA obtained from maximum-likelihood simulations (BAHLO and GRIFFITHS 2000). The maximum-likelihood estimate for the TMRCA was 1.2 × 1.5N generations (assuming a panmictic population of constant size). Substituting a value of N = 10,000 (HAMMER 1995) and assuming 20 years/generation, the TMRCA was 360,900 years ± 98,200 years.
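
The arithmetic behind the two time estimates can be retraced in a few lines. This is a sketch of the calculation as described in the text, not the authors' script; the variable names are hypothetical. The first estimate scales the 6-million-year human-chimpanzee divergence by the ratio of within-human to between-species distances; the second converts the coalescent estimate from units of 1.5N generations into years.

```python
HUMAN_CHIMP_SPLIT_YEARS = 6_000_000

# Approach 1: ratio of distances through the root (6.2) to divergence (73).
within_human_distance = 6.2
human_chimp_distance = 73
tmrca_ratio_years = (
    within_human_distance / human_chimp_distance * HUMAN_CHIMP_SPLIT_YEARS
)

# Approach 2: maximum-likelihood TMRCA of 1.2 x 1.5N generations.
effective_size = 10_000
years_per_generation = 20
tmrca_ml_years = 1.2 * 1.5 * effective_size * years_per_generation

print(round(tmrca_ratio_years), round(tmrca_ml_years))
```

The first calculation gives roughly 509,600 years, which rounds to the 510,000 years stated above. The second gives 360,000 years with the rounded coefficient of 1.2; the small gap from the reported 360,900 years presumably reflects rounding of that coefficient in the text, which is an assumption here.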


*  DISCUSSION

We investigated the levels and patterns of nucleotide variation in noncoding sequences from two genes mapping on either side of the X chromosome centromere in a worldwide sample of 41 humans. These genes are located ~10 Mb apart and both lie in genomic regions with low rates of recombination and low gene density. We were interested in exploring the effects of this "genomic context" on patterns of variation at these genes. Two major results emerge from this study. First, we found significant linkage disequilibrium across the X chromosome centromere. Second, levels and patterns of variation at both genes show significant departures from a standard neutral model of evolution. We discuss each of these in turn.

Linkage disequilibrium across the centromeric region of the X chromosome:
We observed significant LD within Msn, within Alas2, as well as significant LD between these genes. LD was seen in our worldwide sample of 41 individuals (containing only 10 Africans), and it was also seen in our sample of 120 African individuals.

It is instructive to compare the LD seen in this study to the LD observed at Dmd and G6pd, two other X-linked loci that have been surveyed in these same 41 individuals. For example, at DmdI44, recombinants (i.e., all four gametic types) are seen between nucleotides separated by <200 bp. Dmd lies in a genomic region with high rates of recombination (>2 cM/Mb), and this likely contributes to the difference in LD seen between Dmd and Msn + Alas2. G6pd lies in Xq28 in a region of moderate recombination (~1–2 cM/Mb). Several mutations at G6pd are known to confer resistance to malaria and thus are under selection in regions of the world where malaria is common. In Africa, the G6pd A– allele is in linkage disequilibrium with mutations at L1cam, a locus situated ~500 kb from G6pd, and this long-range LD is likely caused by selection acting on the G6pd A– allele (SABETI et al. 2002; SAUNDERS et al. 2002). The distance over which significant LD is seen between Msn and Alas2 (~10 Mb) is ~20 times greater than the distance over which LD is seen near G6pd (~500 kb), a locus known to have unusually high LD as a consequence of selection. Because these estimates come from the same sample of individuals, the differences in LD among loci are unlikely to be due to population-level effects or to sampling strategy.

It is also useful to compare the LD seen in this study to the LD observed in other samples. REICH et al. (2001) studied the decay of LD throughout the genome in both African and non-African populations. By measuring D' between SNPs at regular intervals, they were able to measure the "half-length" of LD (the distance at which the average D' drops below 0.5) for 19 genomic regions spanning a range of recombination rates. The average half-length of LD was 60 kb in a population of European ancestry and was <5 kb in a Yoruban population from Nigeria. This supported earlier suggestions that LD was generally lower in African than in non-African populations (e.g., TISHKOFF et al. 1996). In contrast, we observed D' > 0.5 among 120 Africans between SNPs separated by ~10 Mb.
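
For readers unfamiliar with D', the following minimal sketch shows how the statistic is computed for two biallelic sites when haplotypes are observed directly, as they are for X-linked loci in males. It uses the standard definition (D divided by its maximum possible magnitude given the allele frequencies); the function name and the haplotype counts are hypothetical and are not taken from the data in this study.

```python
def d_prime(n_AB, n_Ab, n_aB, n_ab):
    """Return |D'| for two biallelic sites from the four haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / total          # frequency of allele A at site 1
    p_B = (n_AB + n_aB) / total          # frequency of allele B at site 2
    d = n_AB / total - p_A * p_B         # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    if d_max == 0:                       # at least one site is monomorphic
        return 0.0
    return abs(d) / d_max


# Hypothetical counts for 120 chromosomes.
complete_ld = d_prime(n_AB=50, n_Ab=0, n_aB=0, n_ab=70)    # only two haplotypes
no_ld = d_prime(n_AB=30, n_Ab=30, n_aB=30, n_ab=30)        # all four, equally common
print(complete_ld, no_ld)
```

Whenever at least one of the four possible haplotypes is absent, |D'| equals 1, which is why the presence of all four gametic types is commonly taken as evidence of recombination between two sites.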

What is the cause of the high LD near the centromere of the X chromosome? Because this amount of LD is not seen at other loci sampled in the same individuals, nor in other samples from the same general geographic regions, it is unlikely to be due to population-level effects such as a bottleneck or admixture. Two other factors may contribute to the observed LD. The first is simply that the centromeric region of the X chromosome experiences low rates of recombination. Estimates of recombination rate in this region are on the order of 0.1–0.6 cM/Mb (YU et al. 2001; KONG et al. 2002; PAYSEUR and NACHMAN 2000), suggesting that Msn and Alas2 are separated by 1–6 cM. While the overall low level of recombination likely contributes to the observed patterns, LD over 1–6 cM is still highly unusual in the human genome (HUTTLEY et al. 1999) and is otherwise unknown in African populations, except in cases where selection is known to be acting (SABETI et al. 2002; SAUNDERS et al. 2002). The second factor that may contribute to LD in our data is natural selection. Either positive or purifying selection at linked sites could increase the level of LD, as discussed below.
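
The conversion from recombination rate to genetic distance used above is a simple product, sketched here with hypothetical helper names. The per-base conversion assumes the usual convention that 1 cM corresponds to a recombination fraction of 0.01 over short distances.

```python
def genetic_distance_cm(physical_mb, rate_cm_per_mb):
    """Genetic distance (cM) spanned by a physical distance at a uniform rate."""
    return physical_mb * rate_cm_per_mb


def rate_per_bp(rate_cm_per_mb):
    """Convert cM/Mb to recombination events per base pair per generation."""
    return rate_cm_per_mb * 0.01 / 1_000_000


PHYSICAL_DISTANCE_MB = 10            # approximate Msn-Alas2 separation from the text
low = genetic_distance_cm(PHYSICAL_DISTANCE_MB, 0.1)
high = genetic_distance_cm(PHYSICAL_DISTANCE_MB, 0.6)
print(f"Msn-Alas2 genetic distance: {low:.0f}-{high:.0f} cM")
print(f"0.1 cM/Mb corresponds to {rate_per_bp(0.1):.1e} per bp")
```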

Rejection of the standard neutral model:
Several observations suggest that patterns of variation at both Msn and Alas2 are not consistent with the standard neutral model. First, there are low levels of variability at both loci despite typical levels of divergence. In the total data set (Msn + Alas2, N = 41), the HKA test rejects the null model in comparison with two other X-linked loci, DmdI44 and Pdha1, chosen because they reside in genomic regions with above-average rates of recombination and thus are likely to be free of the effects of selection at linked sites (Table 3). The inference of significantly lower variation at Msn + Alas2 comes with two caveats. One is that we have performed multiple HKA tests but have not corrected the significance level for multiple comparisons; thus the statistical significance of the reduction should be interpreted cautiously. The other is that the significance of the HKA test, and the subsequent inference of selection on particular loci, depends on the choice of reference loci. For example, compared with other loci showing little variation (such as Xq13.3, KAESSMANN et al. 1999; or F9, HARRIS and HEY 2001), neither Msn nor Alas2 rejects the null model in an HKA test (data not shown). Table 7 lists X-linked loci in humans for which polymorphism data are available from large samples. Msn and Alas2 are among the least variable loci surveyed on the X chromosome. Considering Watterson's θ, Alas2 is the least variable locus, and Msn is the third least variable locus. Most other genes that show low variability, such as F9 (HARRIS and HEY 2001), Mao-a (GILAD et al. 2002), and G6pd (SAUNDERS et al. 2002), are believed to be influenced by selection. Thus, even though the P-values associated with the HKA tests in Table 3 are not strongly significant, Msn and Alas2 are clearly among the least variable genes in the human genome (NACHMAN 2001).


 
Table 7. Nucleotide variability at X-linked genes in humans

In addition to the reduction in variability at Msn and Alas2, we observed a significant skew in the distribution of allele frequencies, particularly in non-African populations. This is seen in the negative values for Tajima's D, Fu and Li's D, and Fay and Wu's H statistics (Table 1 and Table 4). These observations are certainly consistent with selection, but may also be consistent with some demographic explanations. For example, a population expansion is expected to lead to negative values of Tajima's D and Fu and Li's D and may help account for the values in Table 4. Fay and Wu's H, which is based on the frequency of derived polymorphic nucleotides, is not expected to reject the null model under a simple population expansion (FAY and WU 2000). However, some forms of population structure may lead to significantly negative values of this test statistic even in the absence of selection (WAKELEY and ALIACAR 2001; PRZEWORSKI 2002). For example, PRZEWORSKI (2002) showed that Fay and Wu's H test can reject the null model under a symmetric two-island model with moderate migration even if individuals are sampled from only one of the populations. In our combined Msn + Alas2 data, Fay and Wu's H is significantly negative in the non-African but not in the African sample (Table 4). This result is entirely driven by three polymorphisms at Msn: 1312, 1414, and 4325. At each of these sites, the ancestral nucleotide is common or fixed in Africa and is represented in the non-African sample in only two individuals (YCC 26 and YCC 27), one of whom may be partially of African origin (see above). Thus it is possible that the significant Fay and Wu's H test derives in part from admixture of African haplotypes in non-African populations. A similar effect of admixture may contribute to the negative values of Tajima's D in non-African populations.
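
To make the behavior of these frequency-spectrum statistics concrete, the sketch below computes Tajima's D and the unnormalized Fay and Wu's H from the number of chromosomes carrying the derived allele at each segregating site. It follows the standard published formulas and is not the software used for Table 1 or Table 4; the function names and the example site-frequency counts are hypothetical.

```python
from math import sqrt


def nucleotide_diversity(derived_counts, n):
    """Mean pairwise differences (pi), summed over segregating sites."""
    return sum(2 * k * (n - k) for k in derived_counts) / (n * (n - 1))


def tajimas_d(derived_counts, n):
    """Tajima's D from derived-allele counts in a sample of n chromosomes."""
    s = len(derived_counts)                       # number of segregating sites
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    pi = nucleotide_diversity(derived_counts, n)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


def fay_wu_h(derived_counts, n):
    """Unnormalized Fay and Wu's H: pi minus theta_H."""
    theta_h = sum(2 * k * k for k in derived_counts) / (n * (n - 1))
    return nucleotide_diversity(derived_counts, n) - theta_h


# Hypothetical spectra for 10 chromosomes and 5 segregating sites.
rare_only = [1, 1, 1, 1, 1]          # excess of low-frequency derived alleles
high_derived = [9, 9, 9, 9, 9]       # excess of high-frequency derived alleles
print(tajimas_d(rare_only, 10), fay_wu_h(rare_only, 10))
print(tajimas_d(high_derived, 10), fay_wu_h(high_derived, 10))
```

In this toy example both spectra give the same negative Tajima's D, because D does not use the ancestral state, whereas H is negative only when derived alleles are at high frequency. This is why H requires an outgroup, such as the chimpanzee sequence, to polarize each polymorphism.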

A third unexpected observation is the long-range LD seen between Msn and Alas2. As discussed above, it is not easy to account for this by any simple model of population structure, since LD is seen in the total data set and in the African sample alone. It is also not easy to account for this by the reduced recombination rate in this genomic region, since the total genetic distance between Msn and Alas2 is 1–6 cM.

Finally, we observed a very high level of population structure in our data, mostly driven by differences between African and non-African samples. Two sites show a nearly fixed difference between Africa and the rest of the world (Msn 1414 and Msn 4325), and the resulting ΦST for the combined data is 0.45, a value greater than that for other loci surveyed in these individuals (Table 6). AKEY et al. (2002) estimated FST for 26,530 SNP markers throughout the genome sampled in 42 East Asians, 42 African Americans, and 42 European Americans. The mean FST for these data was 0.123, and ~6% of the markers had FST ≥ 0.40. Thus Msn + Alas2 show more population structure than most loci in the genome. However, autosomal loci, which constitute most of the data in AKEY et al. (2002), are typically expected to show less population structure than X-linked loci because of differences in effective population size.

The standard neutral model is based on a population of constant size at mutation-drift equilibrium and in principle may be rejected because of selection, population processes, or both. At Msn and Alas2 we observe low variability, a skew in the frequency spectrum, high LD, and high ΦST. Can we distinguish selection from demography as the cause of these patterns? One standard approach for distinguishing population-level processes (or artifacts of sampling) from locus-specific effects is to compare multiple loci, ideally sampled in the same set of individuals. Viewed in the context of other X-linked loci (Table 6 and Table 7), Msn and Alas2 are unusual in many but not all respects. Msn and Alas2 show less variability than most loci and greater LD than virtually all loci. They show a strong skew in the frequency distribution with an excess of rare variants, but this is also seen at a handful of other loci. Likewise, they show considerable population structure, but this too is seen at some other loci. It appears difficult to reconcile a single demographic model with this combination of results. For example, while a population expansion out of Africa coupled with subsequent migration might explain the negative values for Tajima's D, Fu and Li's D, and Fay and Wu's H, it does not account for the unusually high levels of LD both in the total sample and in Africa, nor does it account for the significantly lower variability at Msn and Alas2 but not in other genes sampled in these same individuals.

On the other hand, many of our observations for Msn and Alas2 are consistent with the action of selection at linked sites near the centromere of the X chromosome. The combination of low variability, a skew in the frequency spectrum, high LD, and high ΦST could be explained by background selection, positive directional selection, or some combination of these processes.

Background selection (CHARLESWORTH et al. 1993) has the effect of reducing the effective population size of the chromosomal region in question. This reduced population size will lead to reduced levels of variation, increased levels of LD, and increased levels of differentiation among populations. In addition, when selection is weak, background selection can lead to a skew in the distribution of allele frequencies with an excess of rare alleles (CHARLESWORTH et al. 1993). The overall low levels of variability at Msn and Alas2 and the difference in the number of haplotypes seen in African and non-African populations might be consistent with background selection coupled with a founder effect out of Africa.

Likewise, genetic hitchhiking (MAYNARD SMITH and HAIGH 1974; KAPLAN et al. 1989) can lead to reduced levels of variation and a skew in the distribution of allele frequencies with an excess of rare variants (BRAVERMAN et al. 1995). Positive selection can lead to increased population differentiation either as a consequence of local adaptation or through the fixation of the same beneficial allele in different subpopulations with low levels of gene flow (SLATKIN and WIEHE 1998). Positive selection can increase levels of LD in several ways. For example, if a selective sweep is partial or is geographically restricted, LD may be generated among linked sites on the selected chromosome (e.g., STEPHAN et al. 1998; SABETI et al. 2002; SAUNDERS et al. 2002). Presumably, complete selective sweeps can also generate LD near the target of selection simply as a consequence of a localized reduction in population size, although this has not been studied in detail (PRITCHARD and PRZEWORSKI 2001). It is important to bear in mind that effects of selection at linked sites are not expected to extend far unless selection is strong. For example, the probability of a linked neutral site escaping hitchhiking is high when the selected and neutral sites are separated by more than (0.1)s/c bp, where s is the selection coefficient and c is the recombination rate per base (KAPLAN et al. 1989). Thus, if c = 10^-9 for the centromeric region of the X chromosome, and s = 0.1, sites >10 Mb away are likely to have recombined off of a selected chromosome. If positive selection is responsible for the observed patterns, the exact nature of this selection is not clear from our data. Significant rejections of the neutral model are seen for both African and non-African subsets of the data, although the patterns of variation in these two subsets of the data clearly differ.
One straightforward explanation for the observation of multiple haplogroups in Africa and a single lineage out of Africa is a partial selective sweep of the common haplogroup (A) in non-African populations. Such a sweep could have occurred concomitant with or following the movement of anatomically modern humans out of Africa. However, selection could also be responsible for the significant reduction of variation in Africa (Table 3) as well as the significant LD in Africa, since these features are not seen at other loci.
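
The rule of thumb quoted above for the physical reach of a selective sweep, (0.1)s/c bp, can be evaluated directly. The sketch below uses the values given in the text (s = 0.1 and c = 1e-9 per base); the function name and the weaker selection coefficient in the second call are illustrative assumptions.

```python
def hitchhiking_reach_bp(s, c):
    """Approximate distance (bp) beyond which a linked neutral site is likely
    to escape hitchhiking, using the (0.1)s/c rule of thumb."""
    return 0.1 * s / c


reach_strong = hitchhiking_reach_bp(s=0.1, c=1e-9)    # values quoted in the text
reach_weak = hitchhiking_reach_bp(s=0.01, c=1e-9)     # hypothetical weaker selection
print(f"s = 0.1:  {reach_strong / 1e6:.0f} Mb")
print(f"s = 0.01: {reach_weak / 1e6:.0f} Mb")
```

Because the reach scales linearly with s and inversely with c, a single sweep spanning the ~10 Mb between Msn and Alas2 requires strong selection even at the low recombination rate assumed here.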

Two recent genomic scans for selection, in each case based on different data and approaches, suggest that positive selection has acted recently near the X chromosome centromere (AKEY et al. 2002; PAYSEUR et al. 2002). AKEY et al. (2002) estimated FST for >26,000 SNPs to identify genomic regions with unusually high levels of population differentiation that might be indicative of selection acting differently in Asia, Africa, or Europe. PAYSEUR et al. (2002) compared observed and expected distributions of allele frequencies for >5000 microsatellites in a population of European origin to identify genomic regions with an excess of rare alleles that might be indicative of directional selection. Both studies identified the centromeric region of the X chromosome as containing multiple markers showing the signature of positive selection.

The observations presented here are difficult to reconcile with a simple demographic model. However, numerous aspects of our data seem consistent with both background selection and hitchhiking models, and we emphasize that both processes may be important. In principle, it might be possible to distinguish between them by surveying microsatellite variation in this region of the X chromosome. Background selection predicts a reduction in levels of variability, even for markers with high mutation rates such as microsatellites, while genetic hitchhiking only predicts reduced variation at microsatellites for very recent selective sweeps (SLATKIN 1995; WIEHE 1998; PAYSEUR and NACHMAN 2000).


*  ACKNOWLEDGMENTS

We thank the members of the Nachman and Hammer labs for useful discussions and comments on the manuscript. We also thank M. Noor and two anonymous reviewers for comments. This work was supported by the National Science Foundation.

Manuscript received December 25, 2003; Accepted for publication January 29, 2004.


*  LITERATURE CITED

AKEY, J. M., G. ZHANG, K. ZHANG, L. JIN, and M. D. SHRIVER, 2002 Interrogating a high-density SNP map for signatures of natural selection. Genome Res. 12:1805-1814.

ALONSO, S. and J. A. ARMOUR, 2001 A highly variable segment of human subterminal 16p reveals a history of population growth for modern humans outside Africa. Proc. Natl. Acad. Sci. USA 98:864-869.

BAHLO, M. and R. C. GRIFFITHS, 2000 Inference from gene trees in a subdivided population. Theor. Popul. Biol. 57:79-95.

BAMSHAD, M. J., S. MUMMIDI, E. GONZALEZ, S. S. AHUJA, and D. M. DUNN et al., 2002 A strong signature of balancing selection in the 5' cis-regulatory region of CCR5. Proc. Natl. Acad. Sci. USA 99:10539-10544.

BRAVERMAN, J. M., R. R. HUDSON, N. L. KAPLAN, C. H. LANGLEY, and W. STEPHAN, 1995 The hitchhiking effect on the site frequency spectrum of DNA polymorphisms. Genetics 140:783-796.

CHARLESWORTH, B., M. T. MORGAN, and D. CHARLESWORTH, 1993 The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289-1303.

CLARK, A. G., K. M. WEISS, D. A. NICKERSON, S. L. TAYLOR, and A. BUCHANAN et al., 1998 Haplotype structure and population genetic inferences from nucleotide-sequence variation in human lipoprotein lipase. Am. J. Hum. Genet. 63:595-612.

COOPER, D. N., and M. KRAWCZAK, 1993 Human Gene Mutation. Bios Scientific, Oxford.

ENARD, W., M. PRZEWORSKI, S. E. FISHER, C. S. LAI, and V. WIEBE et al., 2002 Molecular evolution of FOXP2, a gene involved in speech and language. Nature 418:869-872.

EXCOFFIER, L., P. E. SMOUSE, and J. M. QUATTRO, 1992 Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131:479-491.

FAY, J. C. and C.-I WU, 2000 Hitchhiking under positive Darwinian selection. Genetics 155:1405-1413.

FRISSE, L., R. R. HUDSON, A. BARTOSZEWICZ, J. D. WALL, and J. DONFACK et al., 2001 Gene conversion and different population histories may explain the contrast between polymorphism and linkage disequilibrium levels. Am. J. Hum. Genet. 69:831-843.

FU, Y. X. and W. H. LI, 1993 Statistical tests of neutrality of mutations. Genetics 133:693-709.

FULLERTON, S. M., A. G. CLARK, K. M. WEISS, D. A. NICKERSON, and S. L. TAYLOR et al., 2000 Apolipoprotein E variation at the sequence haplotype level: implications for the origin and maintenance of a major human polymorphism. Am. J. Hum. Genet. 67:881-900.

GALTIER, N., F. DEPAULIS, and N. H. BARTON, 2000 Detecting bottlenecks and selective sweeps from DNA sequence polymorphism. Genetics 155:981-987.

GILAD, Y., S. ROSENBERG, M. PRZEWORSKI, D. LANCET, and K. SKORECKI, 2002 Evidence for positive selection and population structure at the human MAO-A gene. Proc. Natl. Acad. Sci. USA 99:862-867.

HAMBLIN, M. T. and A. DI RIENZO, 2000 Detection of the signature of natural selection in humans: evidence from the Duffy blood group locus. Am. J. Hum. Genet. 66:1669-1679.

HAMMER, M. F., 1995 A recent common ancestry for human Y chromosomes. Nature 378:376-378.

HAMMER, M. F., F. BLACKMER, D. GARRIGAN, M. W. NACHMAN, and J. A. WILDER, 2003 Human population structure and its effects on sampling Y chromosome sequence variation. Genetics 164:1495-1509.

HARDING, R. M., S. M. FULLERTON, R. C. GRIFFITHS, J. BOND, and M. J. COX et al., 1997 Archaic African and Asian lineages in the genetic ancestry of modern humans. Am. J. Hum. Genet. 60:772-789.

HARDING, R. M., E. HEALY, A. J. RAY, N. S. ELLIS, and N. FLANAGAN et al., 2000 Evidence for variable selective pressures at MC1R. Am. J. Hum. Genet. 66:1351-1361.

HARDISON, R. C., K. M. ROSKIN, S. YANG, M. DIEKHANS, and W. J. KENT et al., 2003 Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution. Genome Res. 13:13-26.

HARRIS, E. E. and J. HEY, 1999 X chromosome evidence for ancient human histories. Proc. Natl. Acad. Sci. USA 96:3320-3324.

HARRIS, E. E. and J. HEY, 2001 Human populations show reduced DNA sequence variation at the Factor IX locus. Curr. Biol. 11:774-778.

HELLMANN, I., I. EBERSBERGER, S. E. PTAK, S. PAABO, and M. PRZEWORSKI, 2003 A neutral explanation for the correlation of diversity with recombination rates in humans. Am. J. Hum. Genet. 72:1527-1535.

HEY, J., 1997 Mitochondrial and nuclear genes present conflicting portraits of human origins. Mol. Biol. Evol. 14:166-172.

HUDSON, R. R., M. KREITMAN, and M. AGUADE, 1987 A test of neutral molecular evolution based on nucleotide data. Genetics 116:153-159.

HUTTLEY, G. A., M. W. SMITH, M. CARRINGTON, and S. J. O'BRIEN, 1999 A scan for linkage disequilibrium across the human genome. Genetics 152:1711-1722.

KAESSMANN, H., F. HEISSIG, A. VON HAESELER, and S. PA