Genetics, Vol. 156, 1285-1298, November 2000, Copyright © 2000

Microsatellite Variation and Recombination Rate in the Human Genome

Bret A. Payseur and Michael W. Nachman
Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona 85721

Corresponding author: Bret A. Payseur, Department of Ecology and Evolutionary Biology, Biosciences West Bldg., University of Arizona, Tucson, AZ 85721. E-mail: payseur@u.arizona.edu

Communicating editor: A. G. CLARK


*  ABSTRACT

Background (purifying) selection on deleterious mutations is expected to remove linked neutral mutations from a population, resulting in a positive correlation between recombination rate and levels of neutral genetic variation, even for markers with high mutation rates. We tested this prediction of the background selection model by comparing recombination rate and levels of microsatellite polymorphism in humans. Published data for 28 unrelated Europeans were used to estimate microsatellite polymorphism (number of alleles, heterozygosity, and variance in allele size) for loci throughout the genome. Recombination rates were estimated from comparisons of genetic and physical maps. First, we analyzed 61 loci from chromosome 22, using the complete sequence of this chromosome to provide exact physical locations. These 61 microsatellites showed no correlation between levels of variation and recombination rate. We then used radiation-hybrid and cytogenetic maps to calculate recombination rates throughout the genome. Recombination rates varied by more than one order of magnitude, and most chromosomes showed significant suppression of recombination near the centromere. Genome-wide analyses provided no evidence for a strong positive correlation between recombination rate and polymorphism, although analyses of loci with at least 20 repeats suggested a weak positive correlation. Comparisons of microsatellites in lowest-recombination and highest-recombination regions also revealed no difference in levels of polymorphism. Together, these results indicate that background selection is not a major determinant of microsatellite variation in humans.


THEORETICAL studies suggest that the joint effects of selection and linkage may lead to broadscale patterns of genetic variation in different genomic regions. Background (purifying) selection on deleterious mutations may reduce levels of linked neutral variation, particularly in genomic regions of reduced recombination (CHARLESWORTH et al. 1993; CHARLESWORTH 1994; HUDSON and KAPLAN 1995). This process reflects a mutation-selection equilibrium in which heterozygosity is reduced as a function of the deleterious mutation rate, the average selection coefficient and dominance factor for harmful mutations, and the rate of recombination for a given region. Fixation of beneficial mutations may also reduce levels of linked neutral variation as a result of genetic hitchhiking (MAYNARD SMITH and HAIGH 1974; KAPLAN et al. 1989; STEPHAN 1995).

A key distinction between the effects of harmful and beneficial mutations on linked neutral variation is that background selection is an equilibrium process while genetic hitchhiking is not (SLATKIN 1995; WIEHE 1998). Consequently, background selection predicts a positive correlation between recombination rate and levels of neutral polymorphism regardless of the neutral mutation rate. Thus, this association is predicted both for polymorphisms at single nucleotide sites (where mutation rates may be on the order of 10^-8; DRAKE et al. 1998) and for polymorphisms at microsatellite loci (where mutation rates may be on the order of 10^-4; BANCHS et al. 1994). In contrast, genetic hitchhiking will reduce variability at linked neutral sites sporadically, and the recovery to steady-state heterozygosity will depend in part on the neutral mutation rate. If selective sweeps are common and the mutation rate is low (as for single nucleotide polymorphisms), a positive correlation is expected between recombination rate and polymorphism. However, if sweeps are rare or if mutation rates are very high (as for microsatellites), a positive correlation is not expected (WIEHE 1998).

Empirical data from several sources show that nucleotide polymorphism is reduced in genomic regions experiencing low rates of recombination. The best evidence for this comes from Drosophila melanogaster. Nucleotide polymorphism is significantly reduced at the tip of the X chromosome (AGUADE et al. 1989; BEGUN and AQUADRO 1991) and on the small fourth chromosome (BERRY et al. 1991), both of which experience low rates of recombination. More generally, there is an overall positive correlation between recombination rate and nucleotide diversity throughout the D. melanogaster genome (BEGUN and AQUADRO 1992; AQUADRO et al. 1994; MORIYAMA and POWELL 1996). Reduced nucleotide variability in low-recombination regions has also been observed in D. ananassae (STEPHAN and LANGLEY 1989), D. simulans (BEGUN and AQUADRO 1991; BERRY et al. 1991), D. mauritiana (HILTON et al. 1994), and D. sechellia (HILTON et al. 1994). There is weaker evidence for a positive association between recombination rate and nucleotide diversity in mice (NACHMAN 1997), humans (NACHMAN et al. 1998; PRZEWORSKI et al. 2000), sea beets (KRAFT et al. 1998), tomatoes (STEPHAN and LANGLEY 1998), and goatgrasses (DVORAK et al. 1998).

The effect of recombination rate on microsatellite variation has been assessed empirically in D. melanogaster, where SCHUG et al. (1998) documented a strong, positive association. However, microsatellite mutation rates in Drosophila are quite low (6 × 10^-6; SCHUG et al. 1997) and thus the observed correlation may be consistent with either genetic hitchhiking or background selection (SCHUG et al. 1998).

Here, we assess the relationship between microsatellite variability and recombination rate in humans. Microsatellite mutation rates in humans are known to be high (e.g., 10^-4; BANCHS et al. 1994) and the wealth of genetic and physical mapping data makes it possible to estimate recombination rates in different genomic regions. DIB et al. (1996) have constructed a genetic map of the human genome based on 5264 (CA)n microsatellite markers genotyped in eight CEPH families comprising 134 individuals. The sex-averaged length of the genetic map is 3699 cM and the average interval size is 1.6 cM. Two different physical maps of the human genome have been assembled from radiation-hybrid (RH) cell panels (GYAPAY et al. 1996; STEWART et al. 1997). RH panels consist of somatic cell hybrids, with each cell line containing a random set of fragments of irradiated human genomic DNA in a hamster background. Markers that are physically close to one another are expected to have relatively few chromosomal breaks between them and thus co-occur in hybrid cell lines more often than markers that are far apart. Cell lines are genotyped for numerous markers and the relative positions in centirays (cR) of markers are inferred from the degree of co-occurrence. RH maps provide probabilistic statements about the physical locations of markers and rely on the assumption that radiation-induced break points are randomly distributed. However, there is some evidence that radiosensitivity is influenced by local chromatin composition and that the distribution of radiation-induced breakage events may not be Poisson (TEAGUE et al. 1996). COLLINS et al. (1996) have also assembled physical maps of the human genome, with marker positions determined largely from in situ hybridization of probes to high-resolution, G-banded metaphase chromosomes. Finally, complete sequences of portions of the human genome (e.g., DUNHAM et al. 1999) now provide the exact physical location of many markers.

In this article, we compare recombination rates and levels of microsatellite polymorphism in different regions of the human genome. We find no evidence for a strong association between recombination rate and microsatellite polymorphism, suggesting that background selection is not a primary determinant of microsatellite variability in humans.


*  MATERIALS AND METHODS

Recombination rates:
Recombination rates can be estimated from comparison of genetic and physical maps. Error in estimates of recombination rate may derive from one or both of these sources. The ultimate physical map of the human genome will eventually come from the complete sequence of the whole genome; at the time of this writing the sequences of two human chromosomes are available (DUNHAM et al. 1999; HATTORI et al. 2000). Chromosome 21 shows little regional variation in recombination rate (HATTORI et al. 2000) while chromosome 22 has several regions with considerably elevated recombination rates (DUNHAM et al. 1999).

Because chromosome 22 has been completely sequenced and because it displays significant variation in recombination rate, we first analyzed patterns of microsatellite variation on this chromosome in light of its recombinational landscape (as described in detail below). The estimates of recombination rate for chromosome 22 are among the best for the human genome since they derive from physical distances measured directly in base pairs. Even in this situation, however, recombination rates may be inaccurate because of imprecision in the genetic map (which is based on pedigrees rather than crosses and thus is constructed from fewer meioses). Moreover, the relatively low density of markers that have been integrated on both the genetic map and complete sequence implies that small-scale variation in recombination rate may go undetected.

In the second part of this article, we extend our analysis to include the whole genome. Since the complete sequence is not yet available we rely on physical maps based on cytogenetic data (COLLINS et al. 1996; http://cedar.genetics.soton.ac.uk/public_html/ldb.html, subsequently referred to as "Morton's map") or radiation hybrid panels (GYAPAY et al. 1996; STEWART et al. 1997; DELOUKAS et al. 1998; http://www.ncbi.nlm.nih.gov/genemap). There are currently two large-scale RH maps of the human genome, the Genebridge4 (GB4) map (GYAPAY et al. 1996) and the Stanford G3 map (STEWART et al. 1997). Although the G3 panel contains a slightly higher level of resolution, many more markers have been placed on the GB4 map. To assess the degree of concordance between different physical maps we have compared the positions of co-occurring microsatellites on the GB4 map, the G3 map, Morton's map, and the complete sequence of chromosome 22 (Table 1). In all cases, these different physical maps were highly significantly correlated (Kendall's correlation analyses, P < 10^-4 for each comparison). This suggests that these physical maps, which were constructed independently, are relatively accurate.


 

 
Table 1. Comparison of the physical position of co-occurring microsatellite markers on the GB4, G3, and Morton physical maps and the complete sequence of chromosome 22

Recombination rates were calculated separately using four different physical maps: the complete sequence of chromosome 22, the GB4 radiation hybrid map, the G3 radiation hybrid map, and Morton's map. Recombination rates were calculated two different ways. In the first approach, we used a sliding window encompassing five markers on either side of the locus of interest. When the locus of interest was situated at or near the edge of a chromosome, windows were constructed to include five markers proximal to the locus of interest and zero to five markers distal to the locus of interest. For example, for a microsatellite situated at the end of a chromosome, only six markers were included in the window. Thus, recombination rates estimated using the sliding-window method at the edges of chromosomes are based on fewer data and may be somewhat biased in these regions (NACHMAN and CHURCHILL 1996). For each sliding window, a linear function was fit to the points representing genetic (DIB et al. 1996) and physical map position, and the slope of this line was taken as the estimate of recombination rate. We also used the sliding-window approach with third- and fifth-order polynomial curve fitting and estimated recombination rate as the derivative of the function at the locus of interest. Using polynomial rather than linear functions had little effect on recombination rate estimates and so we report only the estimates from linear functions. In the second approach (subsequently referred to as the "whole-chromosome" method), we fit a third-order polynomial to the genetic and physical positions of all the markers on a chromosome. The derivative of this function, evaluated for each marker, was taken as the estimate of recombination rate. Estimating recombination rate by simultaneously using all markers on a given chromosome has a smoothing effect (KLIMAN and HEY 1993), and probably obscures much regional variation, which is better captured by the sliding-window method. Conversely, estimates under the sliding-window approach are more strongly impacted by errors in the genetic and physical mapping locations of individual markers.
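The sliding-window estimate with a linear fit can be illustrated with a short sketch. This is not the authors' code: the function names (`window_bounds`, `linear_slope`, `sliding_window_rates`) and the input layout (two parallel lists sorted by physical position) are assumptions made for illustration, and the whole-chromosome polynomial fit is not covered.

```python
def window_bounds(i, n, flank=5):
    """Index range [lo, hi) of the window centred on marker i.

    The window holds up to `flank` markers on each side of marker i and is
    truncated at the chromosome ends, so a terminal marker gets 6 markers.
    """
    return max(0, i - flank), min(n, i + flank + 1)


def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs.

    Assumes xs are not all identical (otherwise the slope is undefined).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def sliding_window_rates(physical_mb, genetic_cm, flank=5):
    """Recombination rate (cM/Mb) at each marker as the slope of genetic
    position on physical position within that marker's window.

    Both lists must describe the same markers, sorted by physical position.
    """
    n = len(physical_mb)
    rates = []
    for i in range(n):
        lo, hi = window_bounds(i, n, flank)
        rates.append(linear_slope(physical_mb[lo:hi], genetic_cm[lo:hi]))
    return rates
```

If marker order differs between the genetic and physical maps within a window, the fitted slope can come out negative, so a caller would need to decide how to treat such windows.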

All analyses were completed using both the sliding-window and the whole-chromosome estimates of recombination rates. Although sliding-window and whole-chromosome recombination rates differed in some cases, none of our conclusions were affected by these differences; in most cases we report results from the sliding-window analyses only. All analyses were also completed using estimates of recombination rate derived from each of the four physical maps. While estimates differed in some cases, none of our conclusions were affected by these differences and we report results only from the complete sequence of chromosome 22, the GB4 map, and Morton's map. Results from the G3 map were similar and are available upon request from the authors. Recombination rates calculated on the basis of the GB4 map were converted from centimorgans per centiray to centimorgans per megabase to facilitate comparison to other estimates of human recombination rates using the conversion factors for individual chromosomes given by HUDSON et al. (1995).

Microsatellite variation:
Data on microsatellite variation were obtained from DIB et al. (1996; http://www.genethon.fr). All microsatellites consist of (CA)n dinucleotide repeats, genotyped in unrelated European individuals (56 autosomes and 108 X chromosomes). DIB et al. (1996) used an initial screen for polymorphism to isolate markers: only loci that contained at least three distinct alleles among 8 chromosomes were included. This screen may have created an ascertainment bias for our analyses by excluding loci that are monomorphic or nearly monomorphic (see RESULTS).

The number of alleles and observed heterozygosity for each locus were taken from DIB et al. (1996). The variance in allele size for each marker was calculated as

$$V = \sum_{i=1}^{n} f_i \, (x_i - \bar{x})^2$$

(SOKAL and ROHLF 1995), where $f_i$ is the frequency of the ith allele, $x_i$ is the number of repeats at the ith allele, $\bar{x}$ is the average number of repeats weighted by frequency, and n is the number of alleles. For a subset of markers placed on the physical maps, microsatellite length was determined by counting the longest string of CA dinucleotide repeats from the published sequence (DIB et al. 1996). Each locus was scored as a perfect (i.e., uninterrupted) repeat or an imperfect repeat.
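As a small worked illustration, a frequency-weighted variance in allele size can be computed as follows. The function name `allele_size_variance` and the normalisation step are assumptions for this sketch; the calculation is the sum over alleles of frequency times squared deviation from the frequency-weighted mean repeat number, which is one reading of the definitions given above.

```python
def allele_size_variance(repeat_counts, frequencies):
    """Frequency-weighted variance in repeat number at one locus.

    repeat_counts[i] is the number of repeats of allele i and
    frequencies[i] is its frequency. Frequencies are rescaled to sum to 1
    so that raw allele counts can be passed as well (an assumption of
    this sketch, not something stated in the text).
    """
    total = sum(frequencies)
    freqs = [f / total for f in frequencies]
    mean_repeats = sum(f * x for f, x in zip(freqs, repeat_counts))
    return sum(f * (x - mean_repeats) ** 2 for f, x in zip(freqs, repeat_counts))
```

For a hypothetical locus with alleles of 18, 20, and 22 repeats at frequencies 0.25, 0.5, and 0.25, the weighted mean is 20 repeats and the variance is 0.25 × 4 + 0.25 × 4 = 2.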

Because background selection is an equilibrium model, its predictions are only strictly valid for randomly mating populations. One of the CEPH families from which data have been drawn for this study descends from an admixed population (BEGOVICH et al. 1992), potentially violating the assumption of random mating. However, inclusion of this family is unlikely to have biased our results for two reasons. First, the contribution of this family to the sample is only four chromosomes. Second, linkage disequilibrium induced by nonrandom mating would cause an overestimation of degrees of freedom for statistical analyses, increasing the likelihood of rejecting the null hypothesis of no correlation between recombination rate and microsatellite variation. Because we find no strong statistical evidence for such an association, this bias is conservative relative to our conclusions.

Statistical analyses:
Recombination rate, heterozygosity, number of alleles, and variance in allele size for microsatellites were nonnormally distributed (Shapiro-Wilk's test, P < 0.01 for each distribution). Nonparametric Kendall's correlation analyses were used to characterize the relationships between measures of microsatellite polymorphism and recombination rate. Analyses were conducted for chromosome 22 first (on the basis of recombination rates estimated from physical positions in the complete sequence) and then at the genomic level for the entire dataset and separately for each chromosome (on the basis of recombination rates estimated from physical positions on the GB4 map and on the Morton map). Because we performed multiple tests, we adjusted the statistical significance level for all analyses. There were 12 tests performed per chromosome, and we used a Bonferroni correction (SOKAL and ROHLF 1995) by adjusting the significance level to α = 0.05/12 ≈ 0.004.
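The rank correlation and the multiple-test threshold can be sketched in a few lines. Two assumptions are made here and are not taken from the text: the tie-corrected tau-b variant of Kendall's coefficient is used, and the nominal significance level before correction is 0.05. The function names are illustrative, and no P value is computed for tau.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b for paired observations, by direct pair counting.

    O(n^2) in the number of loci, which is adequate for a sketch.
    """
    concordant = discordant = only_x_tied = only_y_tied = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: excluded from both terms
            if dx == 0:
                only_x_tied += 1
            elif dy == 0:
                only_y_tied += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + only_x_tied) * (untied + only_y_tied))
    if denom == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denom


def bonferroni_alpha(nominal_alpha, n_tests):
    """Per-test significance level after a Bonferroni correction."""
    return nominal_alpha / n_tests


def passes_bonferroni(p_value, nominal_alpha=0.05, n_tests=12):
    """True if p_value is below the Bonferroni-adjusted level."""
    return p_value < bonferroni_alpha(nominal_alpha, n_tests)
```

With 12 tests and a nominal level of 0.05, the adjusted threshold is about 0.0042, so a P value such as 0.007 would be nominally significant but would not pass the correction.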

Heterogeneity in mutation rates might obscure a relationship between microsatellite variation and recombination rate (SCHUG et al. 1998). In humans, evidence has accumulated for a positive association between microsatellite mutation rate and the number of repeats (WEBER 1990; BRINKMANN et al. 1998; DI RIENZO et al. 1998). In an attempt to control for the effects of mutation rate variation, we completed additional analyses with restrictions on allele size (20 or more repeats) and kind (perfect repeats).

Finally, we used Mann-Whitney U-tests to compare levels of microsatellite variation in high vs. low recombination regions, including markers in the upper and lower 10% of the recombination rate distribution.
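The comparison of extreme recombination classes can be sketched as follows. The helper names are hypothetical, and the two-sided P value uses a plain normal approximation without tie or continuity corrections, which is a simplification chosen for this sketch and not necessarily the procedure used for the reported tests.

```python
import math


def extreme_groups(rates, values, fraction=0.10):
    """Return (low, high): values at loci in the lowest and highest
    `fraction` of the recombination rate distribution."""
    order = sorted(range(len(rates)), key=lambda i: rates[i])
    k = max(1, int(len(rates) * fraction))
    low = [values[i] for i in order[:k]]
    high = [values[i] for i in order[-k:]]
    return low, high


def mann_whitney_u(a, b):
    """U statistic for sample a: the number of (a, b) pairs in which the
    a value is larger, with tied pairs counted as one half."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def normal_approx_p(u, n_a, n_b):
    """Two-sided P value for U under the large-sample normal approximation
    (no tie correction, no continuity correction)."""
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2.0))
```

A typical call would pass the per-locus recombination rates and one polymorphism measure (for example, variance in allele size) to `extreme_groups`, then compare the two returned lists with `mann_whitney_u`.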


*  RESULTS

Chromosome 22:
Mean recombination rates, variance in allele size, heterozygosity, number of alleles, and number of repeats for the 61 microsatellite loci on chromosome 22 are given in Table 2. Sliding-window estimates of recombination rate varied by approximately one order of magnitude, from 0.47 to 4.57 cM/Mb. Higher recombination rates were found primarily in several distinct regions in the middle of the chromosome, as previously reported (DUNHAM et al. 1999). Variance in allele size ranged from 0.90 to 63.36. There was no correlation between recombination rate and variance in allele size for the 61 loci on chromosome 22 (Kendall's correlation analyses; τ = -0.118; P = 0.181); a scatterplot of these data is shown in Fig 1. Similar results were obtained using heterozygosity (τ = -0.108; P = 0.218) and number of alleles (τ = -0.134; P = 0.126) as measures of microsatellite polymorphism. Furthermore, when correlation analyses of microsatellite variation and recombination rate were restricted to loci with at least 20 repeats, no association emerged (P > 0.05 using each measure of polymorphism). These results stand in contrast to the strong association between microsatellite variation and recombination rate observed in Drosophila (R^2 = 0.55; SCHUG et al. 1998), where only 18 loci were surveyed.



 
Figure 1. Scatterplot of variance in allele size vs. recombination rate for 61 microsatellite loci on chromosome 22. Recombination rates were calculated from the genetic (DIB et al. 1996) and physical positions of loci in the complete sequence of chromosome 22 (DUNHAM et al. 1999). There is no correlation between variance in allele size and recombination rate (Kendall's τ = -0.118; P = 0.181).


 

 
Table 2. Recombination rate and microsatellite variability for 61 loci on chromosome 22

Among the 61 microsatellites considered above, 25 have been placed on Morton's map (and only a few have been placed on the radiation hybrid maps). A scatterplot comparing recombination rates estimated from the complete sequence of chromosome 22 and those estimated from Morton's map is shown in Fig 2. There is a positive correlation between these estimates (correlation coefficient R = 0.578, P = 0.003; nonparametric Kendall's τ = 0.180, P = 0.207), suggesting that recombination rates estimated from these different underlying physical maps are consistent with each other. It should be pointed out that this is a relatively weak test since it only includes 25 loci and each sliding window is based on 11 markers (5 on either side of the locus of interest).



 
Figure 2. Scatterplot of recombination rates estimated using Morton's map vs. recombination rates estimated using the sequence of chromosome 22 (correlation coefficient R = 0.578, P = 0.003; nonparametric Kendall's τ = 0.180, P = 0.207).

Genome-wide recombination rates:
Complete tables listing the genetic positions, physical positions, and estimated recombination rates for microsatellite markers placed on the GB4 map and Morton's map are given at http://eebweb.arizona.edu/nachman/publications/data/microsats.html. Substantial variation in recombination rate was observed both within and among chromosomes. For the GB4 map, the mean recombination rate was 1.55 cM/Mb (sliding-window method) and 1.46 cM/Mb (whole-chromosome method), with low values <0.5 cM/Mb and high values >6 cM/Mb (Table 3). For Morton's map, the mean recombination rate was 1.72 cM/Mb (sliding-window method) and 1.37 cM/Mb (whole-chromosome method), with low values <0.5 cM/Mb and high values >20 cM/Mb (Table 4). The highest recombination rates on Morton's map (i.e., >10 cM/Mb) derive from two regions, one on the p arm of chromosome 5 and one on the p arm of chromosome 7. Neither region is well represented on the GB4 map, and this may account for the absence of recombination rates >10 cM/Mb on this map. Alternatively, these exceptionally high rates may reflect errors in the physical position of markers on Morton's map. Several windows on Morton's map resulted in negative values for recombination rates; these values derive from inconsistencies in marker order on the Genethon (DIB et al. 1996) and Morton maps (COLLINS et al. 1996).


 

 
Table 3. Recombination rate and microsatellite variability for 1635 loci on the GB4 radiation hybrid map


 

 
Table 4. Recombination rate and microsatellite variability for 3180 loci on Morton's map (COLLINS et al. 1996)

Estimates of recombination rate from the GB4 and Morton maps were highly correlated (Kendall's τ = 0.215; P < 0.0001), and several general patterns emerged from both maps. First, scatterplots of genetic vs. physical map position for the markers on each chromosome typically reveal sigmoidal curves (GB4 scatterplots are shown in Fig 3; Morton's map scatterplots are not shown but are similar). This pattern is seen for most metacentric chromosomes and is consistent with a reduction in recombination rate near centromeres, as previously documented (e.g., NAGARAJA et al. 1997). Second, several metacentric chromosomes displayed high levels of recombination near one or both telomeres. For example, the highest recombination rates calculated from the GB4 map are for loci at the q telomere of chromosome 2. Third, in general, the acrocentric chromosomes (13, 14, 15, 21, and 22) revealed less variation in rates of recombination than did the metacentric chromosomes.





 
Figure 3. Scatterplots of genetic vs. physical map position for microsatellites on the GB4 map. Sigmoidal curves for many of the chromosomes are indicative of lower levels of recombination near centromeres of metacentric chromosomes.

The mean recombination rates estimated using the sliding-window method were close to the mean values obtained from the whole-chromosome method (Table 3 and Table 4). However, the variance in recombination rates was larger for the sliding-window estimates than for the whole-chromosome estimates. Recombination rate estimates from the two approaches were highly correlated (Morton's map: τ = 0.26, P < 0.0001; GB4 map: τ = 0.59, P < 0.0001). In subsequent analyses, we report results using the sliding-window approach, but similar results are obtained using whole-chromosome estimates of recombination rate.

Genome-wide microsatellite variation:
Variation in number of alleles, heterozygosity, and variance in allele size for the microsatellite loci is given in Table 3 and Table 4. Substantial variation in levels of microsatellite polymorphism was observed for loci on both maps. For microsatellite loci on the GB4 map the mean variance in allele size was 13.95 and ranged from a minimum of 0.37 to a maximum of 303.21. For loci on the Morton map, the mean variance in allele size was 13.67 and ranged from a minimum of 0.19 to a maximum of 474.29. Under a stepwise mutation model, variance in allele size provides an estimate of the neutral mutation parameter, 2(Ne - 1)µ, where Ne is the effective population size and µ is the neutral mutation rate (MORAN 1975). Assuming an average mutation rate of 10^-4–10^-5 (BANCHS et al. 1994), the mean variance in allele size for loci on the GB4 map (13.95) suggests an effective population size of 0.7–7.0 × 10^5, an estimate in rough agreement with others based on nucleotide polymorphisms from a variety of independent loci (~10^4; e.g., HAMMER 1995). The mean heterozygosity on the GB4 map was 70% and ranged from 21 to 92% (Table 3). Effective population size can also be calculated from heterozygosity (H) under a stepwise mutation model, where $H = 1 - [1 + 8N_e\mu]^{-1/2}$ (OHTA and KIMURA 1973). With mutation rates of 10^-4–10^-5, the observed heterozygosity of 0.70 suggests Ne = 0.13–1.3 × 10^5. These calculations of Ne assume equilibrium conditions and constant mutation rates among loci, and, as such, are intended to provide only rough estimates. On the GB4 map, the mean number of alleles per locus was 7.68 and the average number of repeats was 19.48, illustrating the fact that humans tend to have longer microsatellites (with potentially higher mutation rates) than D. melanogaster (SCHUG et al. 1997). Despite some ascertainment bias present in the original choice of markers used to construct the genetic map, the results in Table 3 and Table 4 indicate that there is still substantial variation in all measures of polymorphism. Moreover, estimates of Ne from these data are not much higher than estimates of Ne from other genetic data.
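The two rough estimates of effective population size can be reproduced with a few lines of arithmetic. The sketch below assumes the variance relationship V = 2(Ne - 1)µ stated above and the stepwise-mutation heterozygosity relationship H = 1 - 1/sqrt(1 + 8Neµ) attributed to OHTA and KIMURA (1973); the function names are illustrative. Both relationships are simply solved for Ne.

```python
def ne_from_variance(variance, mu):
    """Solve V = 2 * (Ne - 1) * mu for Ne."""
    return variance / (2.0 * mu) + 1.0


def ne_from_heterozygosity(h, mu):
    """Solve H = 1 - 1 / sqrt(1 + 8 * Ne * mu) for Ne."""
    return (1.0 / (1.0 - h) ** 2 - 1.0) / (8.0 * mu)


# Mean values for loci on the GB4 map, with the two assumed mutation rates.
for mu in (1e-4, 1e-5):
    print(mu, ne_from_variance(13.95, mu), ne_from_heterozygosity(0.70, mu))
```

With a variance of 13.95 and heterozygosity of 0.70, these expressions give values consistent with the ranges quoted in the text (roughly 0.7–7.0 × 10^5 from variance and 0.13–1.3 × 10^5 from heterozygosity).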

Comparison of genome-wide recombination rate and microsatellite variation:
Using two genome-wide datasets (loci on Morton's map and loci on the GB4 map), we compared three measures of microsatellite variability (variance in allele size, heterozygosity, and number of alleles) to recombination rate. We also restricted these analyses to loci with perfect repeats, loci with at least 20 repeats, and loci with at least 20 perfect repeats. Results of these analyses are shown in Table 5 and Table 6. Although some correlations exhibited low probabilities (especially for loci on Morton's map), none were significant when corrected for multiple tests. In all cases, the magnitudes of the correlations were small. Scatterplots of microsatellite polymorphism vs. recombination rate are shown in Fig 4. These genome-wide results are entirely consistent with the results for chromosome 22 based on the complete sequence of that chromosome (compare Fig 1 and Fig 4) in revealing no correlation between microsatellite polymorphism and recombination rate.



 
Figure 4. Scatterplots of variance in allele size vs. recombination rate. Variables are log transformed (to the base e) for ease of presentation. (A) All microsatellites on the GB4 map; (B) microsatellites with at least 20 repeats on the GB4 map; (C) all microsatellites on Morton's map; (D) microsatellites with at least 20 repeats on Morton's map.


 

 
Table 5. Correlation analyses of microsatellite variation and recombination rate for 1635 loci on the GB4 radiation hybrid map


 

 
Table 6. Correlation analyses of microsatellite variation and recombination rate for 3180 loci on Morton's map

We tried to control for the effects of variable mutation rates among loci by restricting the analysis to markers with 20 or more repeats. We also performed a multiple regression of log-transformed variance in allele size on log-transformed recombination rate and number of repeats for loci on each map separately. Although the number of repeats was strongly associated with variance in allele size (P < 0.0001 for both maps), its use as a covariate did not reveal an association between variance in allele size and recombination rate (Morton's map, P = 0.180; GB4 map, P = 0.337), although log heterozygosity and log recombination rate were weakly correlated for loci on Morton's map using this approach (P = 0.027; adjusted total R^2 = 0.063).

Microsatellite variation was also compared to recombination rate for each of the chromosomes separately. Only chromosome 4 displayed any evidence of a positive association between microsatellite polymorphism and recombination rate, and this association was weak (Fig 5). In analyses including all loci on chromosome 4 from the GB4 map, recombination rate was not significantly correlated with variance in allele size (τ = 0.140, P = 0.079), but the correlation was nominally significant when only loci with at least 20 repeats were considered (τ = 0.368, P = 0.007; Fig 5). A trend is evident, although this result is not statistically significant under the Bonferroni correction for multiple tests. Support for an association on chromosome 4 using loci on Morton's map was weaker (all loci: τ = 0.094, P = 0.060; loci with at least 20 repeats: τ = 0.257, P = 0.103).



 
Figure 5. Scatterplot of variance in allele size vs. recombination rate for microsatellites on chromosome 4 from the GB4 map.

Finally, using markers throughout the genome, we compared polymorphism at loci experiencing the highest (90th percentile) and lowest (10th percentile) recombination rates using data for each map separately. Most of the low-recombination microsatellites map near centromeres and many, but not all, of the high-recombination microsatellites map near telomeres. Mann-Whitney U-tests reveal no difference in measures of microsatellite variation for high-recombination loci vs. low-recombination loci (P > 0.05 in all tests). These results are seen in comparisons utilizing all loci as well as the subset of loci with 20 or more repeats. The distributions of variance in allele size for the high-recombination loci and for the low-recombination loci are shown in Fig 6. Although a slight difference in variation can be seen in this figure (there are more low-polymorphism loci in regions of low recombination, for example), both highly polymorphic and nearly monomorphic loci can be found in each group.



Figure 6. Distributions of variance in allele size for microsatellites experiencing high (upper 10%) and low (lower 10%) rates of recombination. Data are shown for (A) loci on the GB4 map and (B) loci on Morton's map. A few loci with very high variances in allele size are not shown for ease of visual presentation.


*  DISCUSSION

Chromosome 22:
The complete sequence of chromosome 22 provides the unambiguous physical position of 61 microsatellite markers that have been genetically mapped (DIB et al. 1996). Consequently, recombination rates can be estimated with more certainty for chromosome 22 than for much of the genome. Recombination rates vary by one order of magnitude and levels of microsatellite polymorphism (variance in allele size) vary by a factor of 70 on chromosome 22. Chromosome 22 thus provides a good test of the simple prediction that recombination rates and levels of microsatellite polymorphism should be correlated under background selection. As shown in Fig 1, we find no evidence of such an association. These results stand in contrast to the findings of SCHUG et al. (1998), who reported a strong positive correlation between variance in microsatellite allele size and recombination rate for 18 markers throughout the D. melanogaster genome. Possible explanations for these differing results are discussed below. It should be noted that with the complete sequence of the human genome, comparable analyses will soon be possible for other human chromosomes.

Recombination rates throughout the human genome:
The average rate of recombination across the human genome from comparison of genetic and physical maps is ~1.5 cM/Mb. This average value, however, masks the substantial variation in recombination rate that exists in different genomic regions. There is broad concordance between the sliding-window and whole-chromosome methods of estimating recombination rates and both methods show several consistent patterns, including the suppression of recombination near centromeres of metacentric chromosomes. Relatively little is known about the recombinational landscape in humans or the scale at which recombination rates vary. Several studies have revealed recombinational hotspots (e.g., OUDET et al. 1992; HARDING et al. 1997), suggesting that recombination rates may vary substantially over a scale of several kilobases. If so, then the whole-chromosome approach to quantifying variation in recombination rate is likely to have a smoothing effect that will obscure important differences in recombination rate. However, even the sliding-window approach implemented here will not detect fine-scale variation in rate since the average spacing of markers integrated between the genetic and physical maps is on the order of 1–2 Mb. Moreover, the effective resolution of the RH panels (~1 Mb for the GB4 map) is insufficient to detect fine-scale variation. Furthermore, the distances in centirays given on radiation hybrid maps are probabilistic statements about relative physical positions and do not correspond precisely with distances in base pairs. While the complete sequence of the human genome will soon provide the ultimate physical map, estimates of recombination rates in humans will still depend on the precision of genetic maps that are limited by reliance on pedigrees rather than crosses.
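The smoothing effect described above can be illustrated with a small sketch that contrasts a single region-wide estimate with window-based estimates. This is not the procedure used in the study (which is described in MATERIALS AND METHODS); the window definition, function names, and marker positions below are all hypothetical and chosen only so that the region-wide rate equals 1.5 cM/Mb.

```python
def window_rates(markers, window=3):
    """Recombination rate (cM/Mb) for each run of `window` consecutive markers.

    markers: list of (physical_position_mb, genetic_position_cm) pairs,
    sorted by physical position.
    """
    rates = []
    for start in range(len(markers) - window + 1):
        mb_first, cm_first = markers[start]
        mb_last, cm_last = markers[start + window - 1]
        rates.append((cm_last - cm_first) / (mb_last - mb_first))
    return rates


def whole_region_rate(markers):
    """Single cM/Mb estimate from the first and last markers only."""
    (mb_first, cm_first), (mb_last, cm_last) = markers[0], markers[-1]
    return (cm_last - cm_first) / (mb_last - mb_first)


# HYPOTHETICAL markers: (Mb, cM) with a cold interval in the middle.
toy_markers = [(0.0, 0.0), (2.0, 6.0), (4.0, 7.0), (6.0, 7.5), (8.0, 12.0)]

local_rates = window_rates(toy_markers, window=3)
average_rate = whole_region_rate(toy_markers)
```

In this toy example the window estimates range several-fold around the region-wide value, while variation within any one window remains invisible, which mirrors the resolution limit imposed by marker spacing.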

Recombination rate and microsatellite variation:
Genome-wide analyses are completely consistent with the results from chromosome 22 and do not support the hypothesis of a strong positive correlation between microsatellite polymorphism and recombination rate (Tables 5 and 6, Fig 4). Similarly, comparisons between loci experiencing the highest and lowest levels of recombination are inconsistent with the notion that recombination rate strongly affects levels of microsatellite polymorphism (Fig 6). For example, both groups contain nearly monomorphic loci and both groups also contain highly polymorphic loci. Recombination rate (as estimated here) does not appear to be a major determinant of microsatellite variability in humans. These patterns can be contrasted with one recent study in D. melanogaster, where variation in recombination rate explained 55% of the variation in variance in allele size for a set of 18 microsatellite loci throughout the genome (SCHUG et al. 1998). In that study, none of the high-recombination loci were monomorphic and none of the low-recombination loci were highly polymorphic.

At least four factors may contribute to the substantial scatter in Fig 4 and we consider each of these in turn: (i) imprecision of estimates of recombination rate, (ii) ascertainment bias, (iii) variation in mutation rate, and (iv) locus-specific effects (such as selection).

Four observations suggest that imprecise estimates of genome-wide recombination rates are not obscuring an otherwise strong correlation. First, the estimates of very low levels of recombination near centromeres are likely to be reasonably accurate since they agree well with other studies based on different genetic and physical maps (e.g., WANG et al. 1994; NAGARAJA et al. 1997). In these lowest-recombination regions, a substantial fraction of the loci exhibit moderate to high levels of polymorphism (Fig 4 and Fig 6). Second, the physical positions of markers on the GB4, G3, and Morton maps are in good agreement (see MATERIALS AND METHODS), and recombination rates estimated from these different maps are strongly correlated. Microsatellite polymorphism and recombination rate are not strongly correlated when recombination rates are calculated from any of these maps. Third, the results from the genome-wide analysis are entirely consistent with the results from chromosome 22, where we have better estimates of recombination rate. Fourth, a positive correlation is observed between nucleotide variability and recombination rates estimated from the GB4 map (PREZWORSKI et al. 2000), suggesting that these recombination rates are sufficiently precise to detect a correlation when one exists.

Ascertainment bias in the original choice of loci may also be hiding a stronger association between recombination rate and microsatellite polymorphism. If all of the monomorphic or nearly monomorphic loci excluded by DIB et al. (1996) were in regions of low recombination, their removal might weaken the observed association. Although ascertainment bias may contribute to the low correlation, it is probably an insufficient explanation for our results. A substantial number of nearly monomorphic loci are seen in regions of high recombination (Fig 4 and Fig 6), and the observed variance in allele size ranges over four orders of magnitude (Tables 1–3).

Variability in mutation rates among microsatellite loci is well documented in humans (BRINKMANN et al. 1998; DI RIENZO et al. 1998) and remains a likely explanation for much of the heterogeneity we see in levels of polymorphism. If we assume that all loci experience an effective population size of 10⁴ and are evolving neutrally, then the observed heterogeneity in variance in allele size suggests that mutation rates vary from 10⁻² to 10⁻⁵, numbers that agree with mutation rates measured in pedigrees (reviewed in JARNE and LAGODA 1996). We have tried to correct for variation in underlying mutation rate by restricting analyses to long and perfect repeats and by performing multiple regressions including both recombination rate and repeat length as independent variables. These corrections may be insufficient for several reasons. First, repeat length was measured from one sequenced allele and may not represent the mean repeat length for a locus. For loci that have many alleles of different size, a randomly chosen allele may be substantially shorter or longer than the mean allele size for that locus. Second, repeat length is not a perfect predictor of mutation rate, and it is likely that there is substantial heterogeneity within a given length class. The mechanisms responsible for heterogeneity in microsatellite mutation rates are active topics of debate (DI RIENZO et al. 1998). If these loci mutate through slipped strand mispairing, for example, mutation rates may be influenced by genomic variation in chromatin structure. Thus, despite our efforts to control for variation in mutation rate, substantial differences in mutation rate may exist and be responsible for much of the heterogeneity in levels of polymorphism in Fig 4.
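The link between variance in allele size and mutation rate can be made concrete with a short sketch. It assumes the standard neutral expectation for a strict single-step stepwise mutation model in a diploid population, E[V] = 2Nₑμ; this relation is an assumption introduced here, since the section does not state which expression was used. The variance values below are hypothetical and were picked only to bracket the range of mutation rates quoted in the text.

```python
def implied_mutation_rate(variance_in_allele_size, effective_size):
    """Mutation rate implied by an observed variance in allele size.

    ASSUMPTION: strict single-step stepwise mutation at neutral equilibrium,
    where E[V] = 2 * N_e * mu, so mu = V / (2 * N_e).
    """
    return variance_in_allele_size / (2.0 * effective_size)


N_E = 10**4  # effective population size assumed in the text

# HYPOTHETICAL variances in allele size (in squared repeat units).
example_variances = [0.2, 2.0, 20.0, 200.0]
implied_rates = [implied_mutation_rate(v, N_E) for v in example_variances]

for variance, rate in zip(example_variances, implied_rates):
    print(f"V = {variance:6.1f}  ->  mu = {rate:.0e}")
```

Under this assumed relation, a thousandfold range in variance maps directly onto a thousandfold range in implied mutation rate, so any unmodeled heterogeneity in mutation rate translates one-for-one into scatter in polymorphism.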

A final possibility is that many of the microsatellites are experiencing the effects of selection on closely linked loci. Balancing selection is expected to elevate levels of polymorphism, while directional selection will reduce polymorphism (HUDSON et al. 1987). Selection acting haphazardly on different loci at different times may contribute to heterogeneity in levels of polymorphism. This is a formal hypothesis that cannot be excluded, although we know of no evidence to support this hypothesis at present.

Of the analyses of individual chromosomes, only chromosome 4 showed any hint of a positive correlation between recombination rate and microsatellite polymorphism (Fig 5), although this result is not significant when corrections for multiple tests are employed. Chromosome 4 contains one of the largest regions of very low recombination rate in the human genome (~200 cR, Fig 3). Thus, it is possible that the effects of selection at linked sites may be more pronounced for chromosome 4 than for other chromosomes. A positive correlation is seen between recombination rate and variance in allele size (Fig 5) but not between recombination rate and heterozygosity (not shown). In D. melanogaster, SCHUG et al. (1998) also found that variance in microsatellite allele size was correlated with recombination rate, while heterozygosity was not. Heterozygosity ranges from 0 to 1, but most microsatellite loci exhibit a narrow range of high heterozygosities from 0.60 to 0.90. There is no such restriction on variance in allele size, which can take on a wide range of values. Heterozygosity may be a less sensitive statistic than variance in allele size for detecting changes in population size or deviations from equilibrium conditions.
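The difference in sensitivity between the two statistics can be seen in a small sketch. The two loci below are hypothetical: they share identical allele frequencies, and therefore identical heterozygosity, but differ greatly in the spread of allele sizes. The function names are illustrative, and sample-size corrections are omitted for simplicity.

```python
def heterozygosity(allele_freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), without sample-size correction."""
    return 1.0 - sum(p * p for p in allele_freqs.values())


def variance_in_allele_size(allele_freqs):
    """Frequency-weighted variance of allele size (in repeat units)."""
    mean = sum(size * p for size, p in allele_freqs.items())
    return sum(p * (size - mean) ** 2 for size, p in allele_freqs.items())


# HYPOTHETICAL loci: keys are repeat counts, values are allele frequencies.
compact_locus = {18: 0.25, 19: 0.25, 20: 0.25, 21: 0.25}
spread_locus = {10: 0.25, 16: 0.25, 24: 0.25, 30: 0.25}

h_compact = heterozygosity(compact_locus)
h_spread = heterozygosity(spread_locus)
v_compact = variance_in_allele_size(compact_locus)
v_spread = variance_in_allele_size(spread_locus)
```

Because heterozygosity depends only on allele frequencies, it cannot distinguish these two loci, whereas variance in allele size differs between them by more than an order of magnitude.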

Background selection and genetic hitchhiking:
Background selection predicts a positive correlation between rate of recombination and levels of genetic variation even for markers with high mutation rates such as microsatellites. Our results demonstrate that variation in recombination rate is not strongly associated with variation in microsatellite polymorphism in humans. Hence, background selection does not appear to be a major determinant of levels of microsatellite polymorphism in different genomic regions.

Are there inherent differences between Drosophila and humans that might make background selection less important in humans? The strength of background selection depends on the deleterious mutation rate for the genomic region in question. As a first approximation, we can assume that the deleterious mutation rate for a given region will be a function of the number of genes in that region. Overall, the density of genes per recombinational distance is about five times higher in D. melanogaster than in humans (100 genes/cM in D. melanogaster compared to 20 genes/cM in humans; NACHMAN et al. 1998). For the background selection model, the deleterious mutation rate for a region of reduced recombination is more relevant, however. In D. melanogaster, the largest euchromatic region of low recombination is near the centromere of the third chromosome and corresponds to ~15% of the total amount of euchromatin in the genome (SORSA 1988). The Drosophila genome contains ~14,000 genes (ADAMS et al. 2000). Therefore, the centromeric region of the third chromosome may contain ~2100 genes (assuming genes are randomly distributed). The low recombination region of human chromosome 4 comprises ~25% of the length of this chromosome, or 1.5% of the length of the human genome (MORTON 1991). The human genome contains ~70,000 genes (BIRD 1995) and thus the centromeric region of chromosome 4 may contain roughly 1050 genes, or about half as many as in the centromeric region of chromosome 3 in D. melanogaster. These calculations are very rough and do not take into account possible differences in the genomic deleterious mutation rate between flies and humans (e.g., DRAKE et al. 1998). Nevertheless, these calculations suggest that, a priori, we might expect the effects of selection at linked sites to be weaker in humans than in flies.
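The rough gene-count comparison above is simple arithmetic and can be reproduced directly from the figures quoted in the text. The function and variable names are hypothetical; the only modeling assumption is the one the text already states, namely that genes are distributed uniformly along the genome.

```python
def genes_in_region(total_genes, fraction_of_genome):
    """Expected gene count in a region, assuming genes are uniformly distributed."""
    return total_genes * fraction_of_genome


# Values quoted in the text.
FLY_TOTAL_GENES = 14_000
FLY_REGION_FRACTION = 0.15      # centromeric region of chromosome 3
HUMAN_TOTAL_GENES = 70_000
HUMAN_REGION_FRACTION = 0.015   # low-recombination region of chromosome 4

fly_region_genes = genes_in_region(FLY_TOTAL_GENES, FLY_REGION_FRACTION)
human_region_genes = genes_in_region(HUMAN_TOTAL_GENES, HUMAN_REGION_FRACTION)
human_to_fly_ratio = human_region_genes / fly_region_genes

# Genome-wide gene density per unit of recombination, also from the text.
density_ratio = 100 / 20  # genes/cM in D. melanogaster relative to humans

print(f"fly region: about {fly_region_genes:.0f} genes")
print(f"human region: about {human_region_genes:.0f} genes")
print(f"human/fly ratio: about {human_to_fly_ratio:.2f}")
```

The result is sensitive to the assumed total gene numbers, so a different estimate of the human gene count would change the human-to-fly ratio proportionally.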

Can we use these results to evaluate models of genetic hitchhiking in humans? WIEHE (1998) has shown that with mutation rates typical of human microsatellites, the selective effects of hitchhiking will quickly be obscured by mutation. When population sizes are small and mutation rates large, as in human populations, the traces of hitchhiking on levels of microsatellite variability are unlikely to be detected by statistical means. Thus, while microsatellite data are appropriate for testing background selection models, they are unlikely to provide a useful test of genetic hitchhiking. Therefore, the results presented here do not speak to the relative importance of background selection and genetic hitchhiking in humans, but do show that background selection is not strongly influencing genomic patterns of microsatellite diversity. Discerning whether or not genetic hitchhiking is important in humans will require large-scale surveys of markers with lower mutation rates (such as nucleotide polymorphisms) in regions of high and low recombination.


*  ACKNOWLEDGMENTS

We thank Asher Cutter, Chris Hanus, and Scott Payseur for help with analyses. We thank Andy Clark, Chip Aquadro, and two anonymous reviewers for useful comments. We thank John Collins for providing the chromosome 22 marker positions. Members of the Nachman lab gave constructive suggestions during the course of the project. This work was funded by the National Science Foundation.

Manuscript received June 2, 1999; Accepted for publication July 19, 2000.


*  LITERATURE CITED

ADAMS, M. D., S. E. CELNIKER, R. A. HOLT, C. A. EVANS, and J. D. GOCAYNE et al., 2000 The genome sequence of Drosophila melanogaster. Science 287:2185-2195.

AGUADE, M., N. MIYASHITA, and C. H. LANGLEY, 1989 Reduced variation in the yellow-achaete-scute region in natural populations of Drosophila melanogaster. Genetics 122:607-615.

AQUADRO, C. F., D. J. BEGUN and E. C. KINDAHL, 1994 Selection, recombination, and DNA polymorphism in Drosophila, pp. 46–56 in Non-Neutral Evolution: Theories and Molecular Data, edited by B. GOLDING. Chapman & Hall, New York.

BANCHS, I., A. BOSCH, J. GUIMERA, C. LAZARO, and A. PUIG et al., 1994 New alleles at microsatellite loci in CEPH families mainly arise from somatic mutations in the lymphoblastoid cell lines. Hum. Mut. 3:365-372.

BEGOVICH, A. B., G. R. MCCLURE, V. C. SURAJ, R. C. HELMUTH, and N. FILDES et al., 1992 Polymorphism, recombination, and linkage disequilibrium within the HLA class II region. J. Immunol. 148:249-258.

BEGUN, D. J. and C. F. AQUADRO, 1991 Molecular population genetics of the distal portion of the X chromosome in Drosophila: evidence for genetic hitchhiking of the yellow-achaete region. Genetics 129:1147-1158.

BEGUN, D. J. and C. F. AQUADRO, 1992 Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature 356:519-520.

BERRY, A. J., J. W. AJIOKA, and M. KREITMAN, 1991 Lack of polymorphism on the Drosophila fourth chromosome resulting from selection. Genetics 129:1111-1117.

BIRD, A. P., 1995 Gene number, noise reduction and biological complexity. Trends Genet. 11:94-100.

BRINKMANN, B., M. KLINTSCHAR, F. NEUHUBER, J. HUHNE, and B. ROLF, 1998 Mutation rate in human microsatellites: influence of the structure and length of the tandem repeat. Am. J. Hum. Genet. 62:1408-1415.

CHARLESWORTH, B., 1994 The effect of background selection against deleterious mutations on weakly selected, linked variants. Genet. Res. 63:213-227.

CHARLESWORTH, B., M. T. MORGAN, and D. CHARLESWORTH, 1993 The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289-1303.

COLLINS, A., J. FREZAL, J. TEAGUE, and N. E. MORTON, 1996 A metric map of humans: 23,500 loci in 850 bands. Proc. Natl. Acad. Sci. USA 93:14771-14775.

DELOUKAS, P., G. D. SCHULER, G. GYAPAY, E. M. BEASLEY, and C. SODERLUND et al., 1998 A physical map of 30,000 human genes. Science 282:744-746.

DIB, C., S. FAURE, C. FIZAMES, D. SAMSON, and N. DROUOT et al., 1996 A comprehensive genetic map of the human genome based on 5,264 microsatellites. Nature 380:152-154.

DI RIENZO, A., P. DONNELLY, C. TOOMAJIAN, B. SISK, and A. HILL et al., 1998 Heterogeneity of microsatellite mutations within and between loci, and implications for human demographic histories. Genetics 148:1269-1284.

DRAKE, J. W., B. CHARLESWORTH, D. CHARLESWORTH, and J. F. CROW, 1998 Rates of spontaneous mutation. Genetics 148:1667-1686.

DUNHAM, I., N. SHIMIZU, B. A. ROE, and S. CHISSOE et al., 1999 The DNA sequence of human chromosome 22. Nature 402:489-496.

DVORAK, J., M.-C. LUO, and Z.-L. YANG, 1998 Restriction fragment length polymorphism and divergence in the genomic regions of high and low recombination in self-fertilizing and cross-fertilizing Aegilops species. Genetics 148:423-434.

GYAPAY, G., K. SCHMITT, C. FIZAMES, H. JONES, and N. VEGA-CZARNY et al., 1996 A radiation hybrid map of the human genome. Hum. Mol. Genet. 5:339-346.

HAMMER, M., 1995 A recent common ancestry for human Y chromosomes. Nature 378:376-378.

HARDING, R. M., S. M. FULLERTON, R. C. GRIFFITHS, J. BOND, and M. J. COX et al., 1997 Archaic African and Asian lineages in the genetic ancestry of modern humans. Am. J. Hum. Genet. 60:772-789.

HATTORI, M., A. FUJIYAMA, T. D. TAYLOR, H. WATANABE, and T. YADA et al., 2000 The DNA sequence of human chromosome 21. Nature 405:311-319.

HILTON, H., R. M. KLIMAN, and J. HEY, 1994 Using hitchhiking genes to study adaptation and divergence during speciation within the Drosophila melanogaster species complex. Evolution 48:1900-1913.

HUDSON, R. R. and N. L. KAPLAN, 1995 Deleterious background selection with recombination. Genetics 141:1605-1617.

HUDSON, R. R., M. KREITMAN, and M. AGUADE, 1987 A test of neutral molecular evolution based on nucleotide data. Genetics 116:153-159.

HUDSON, T. J., L. D. STEIN, S. S. GERETY, J. MA, and A. B. CASTLE et al., 1995 An STS-based map of the human genome. Science 270:1945-1953.

JARNE, P. and P. J. L. LAGODA, 1996 Microsatellites, from molecules to populations and back. Trends Ecol. Evol. 11:424-429.

KAPLAN, N. L., R. R. HUDSON, and C. H. LANGLEY, 1989 "The hitch-hiking effect" revisited. Genetics 123:887-899.

KLIMAN, R. M. and J. HEY, 1993 Reduced natural selection associated with low recombination in Drosophila melanogaster. Mol. Biol. Evol. 10:1239-1258.

KRAFT, T., T. SALL, I. MAGNUSSON-RADING, N.-O. NILSSON, and C. HALLDEN, 1998 Positive correlation between recombination rates and levels of genetic variation in natural populations of sea beet (Beta vulgaris subsp. maritima). Genetics 150:1239-1244.

MAYNARD SMITH, J. and J. HAIGH, 1974 The hitch-hiking effect of a favorable gene. Genet. Res. 23:23-35.

MORAN, P. A. P., 1975 Wandering distributions and the electrophoretic profile. Theor. Popul. Biol. 8:318-330.

MORIYAMA, E. N. and J. R. POWELL, 1996 Intraspecific nuclear DNA variation in Drosophila. Mol. Biol. Evol. 13:261-277.

MORTON, N. E., 1991 Parameters of the human genome. Proc. Natl. Acad. Sci. USA 88:7474-7476.

NACHMAN, M. W., 1997 Patterns of DNA variability at X-linked loci in Mus domesticus. Genetics 147:1303-1316.

NACHMAN, M. W. and G. A. CHURCHILL, 1996 Heterogeneity in rates of recombination across the mouse genome. Genetics 142:537-548.

NACHMAN, M. W., V. L. BAUER, S. L. CROWELL, and C. F. AQUADRO, 1998 DNA variability and recombination rates at X-linked loci in humans. Genetics 150:1133-1141.

NAGARAJA, R., S. MACMILLAN, J. KERE, S. JONES, and S. GRIFFIN et al., 1997 X chromosome map at 75-kb STS resolution, revealing extremes of recombination and GC content. Genome Res. 7:210-222.

OHTA, T. and M. KIMURA, 1973 A model of mutation appropriate to estimate the number of electrophoretically detectable alleles in a finite population. Genet. Res. 22:201-204.

OUDET, C., A. HANAUER, P. CLEMENS, T. CASKEY, and J. L. MANDEL, 1992 Two hot spots of recombination in the DMD gene correlate with the deletion prone regions. Hum. Mol. Genet. 1:599-603.

PREZWORSKI, M., R. R. HUDSON, and A. DI RIENZO, 2000 Adjusting the focus on human variation. Trends Genet. 16:296-302.

SCHUG, M. D., T. F. C. MACKAY, and C. F. AQUADRO, 1997 Low mutation rates of microsatellite loci in Drosophila melanogaster. Nat. Genet. 15:99-102.

SCHUG, M. D., C. M. HUTTER, M. A. F. NOOR, and C. F. AQUADRO, 1998 Mutation and evolution of microsatellites in Drosophila melanogaster. Genetica 102(103):359-367.

SLATKIN, M., 1995 Hitchhiking and associative overdominance at a microsatellite locus. Mol. Biol. Evol. 12:473-480.

SOKAL, R. R., and F. J. ROHLF, 1995 Biometry. W. H. Freeman, New York.

SORSA, V., 1988 Chromosome Maps of Drosophila. CRC Press, Boca Raton, FL.

STEPHAN, W., 1995 An improved method for estimating the rate of fixation of favorable mutations based on DNA polymorphism data. Mol. Biol. Evol. 12:959-962.

STEPHAN, W. and C. H. LANGLEY, 1989 Molecular genetic variation in the centromeric region of the X chromosome in three Drosophila ananassae populations. I. Contrasts between the vermillion and forked loci. Genetics 121:89-99.

STEPHAN, W. and C. H. LANGLEY, 1998 DNA polymorphism in Lycopersicon and crossing-over per physical length. Genetics 150:1585-1593.

STEWART, E. A., K. B. MCKUSICK, A. AGGARWAL, E. BAJOREK, and E. BRADY et al., 1997 An STS-based radiation hybrid map of the human genome. Genome Res. 7:422-433.

TEAGUE, J. W., A. COLLINS, and N. E. MORTON, 1996 Studies on locus content mapping. Proc. Natl. Acad. Sci. USA 93:11814-11818.

WANG, L. H., A. COLLINS, S. LAWRENCE, B. J. KEATS, and N. E. MORTON, 1994 Integration of gene maps: chromosome X. Genomics 22:590-604.

WEBER, J. L., 1990 Informativeness of human (dC-dA)n (dG-dT)n polymorphisms. Genomics 7:524-530.

WIEHE, T., 1998 The effect of selective sweeps on the variance of the allele distribution of a linked multiallele locus: hitchhiking of microsatellites. Theor. Popul. Biol. 53:272-283.