Abstract
The relation between the level of genetic variation and the rate of recombination per physical unit was investigated in sea beet (Beta vulgaris subsp. maritima). The rate of recombination per physical unit was estimated indirectly through marker density in an RFLP linkage map of sugar beet. From this map, we also selected RFLP markers covering two of the nine chromosomes in Beta. The markers were used to estimate the level of genetic variation in three populations of sea beet, two from Italy and one from England. Two estimates of genetic variation were employed, one based on the number of alleles in the sample and the other on heterozygosity. A statistically significant positive correlation was found between recombination rate and genetic variation. Several theoretical explanations for this are discussed, background selection being one. A correlation similar to this has been observed previously in Drosophila, one that was higher than what we obtained for Beta. This is consistent with various biological differences between the two species.
GENETIC variation and the level of recombination per physical unit have been shown repeatedly to be positively correlated in natural populations of Drosophila melanogaster (Aguadé et al. 1989; Stephan and Langley 1989; Begun and Aquadro 1992; Aquadro et al. 1994). A strictly neutral explanation of these observations would require that recombination rates correlate positively with mutation rates. However, such a correlation would also lead to a correlation between recombination rates and divergences between sibling species, which has not been detected (Begun and Aquadro 1992). At least three different explanations for the correlation between recombination rates and genetic variation in Drosophila have been proposed. The first, genetic hitchhiking, occurs when variation at a neutral locus is reduced by a selective sweep at a linked locus (Maynard Smith and Haigh 1974; Kaplan et al. 1989; Begun and Aquadro 1992; Stephan et al. 1992; Wiehe and Stephan 1993; Aquadro et al. 1994; Hudson 1994; Braverman et al. 1995). According to the second explanation, background selection, selection against deleterious mutations decreases the genetic variation at linked neutral loci (Charlesworth et al. 1993; Hudson and Kaplan 1995; Charlesworth 1996; Charlesworth and Guttman 1996; Nordborg et al. 1996). The third explanation is that a temporal fluctuation in selection coefficients decreases the genetic variation at linked neutral loci (Gillespie 1994). For all three explanations, the reduction in genetic variation at linked neutral loci is stronger in genomic regions in which recombination is restricted. This could create a correlation between recombination rate per physical unit and level of genetic variation.
To detect a correlation of this sort, one needs estimates both of the recombination rate per physical unit and of the level of genetic variation in several regions of the genome. Although estimates of genetic variation are readily obtained in almost any organism, the rate of recombination per physical unit is much more difficult to estimate. Direct estimation of recombination rates per physical unit is possible for species for which both a genetic and a physical map exist. The distribution of markers on the genetic map alone can be used, however, to estimate recombination rates per physical unit indirectly (Nachman and Churchill 1996). Using this approach, a positive correlation between recombination and variation has been detected in mouse (Nachman 1997). Because that study involved only four loci, however, it is difficult to say whether a general correlation between recombination level and genetic variation in mouse exists.
It is of considerable interest to determine whether a correlation between recombination and genetic variation exists in species other than Drosophila and possibly mouse. Recently, Halldén et al. (1996) developed a high-density RFLP linkage map of sugar beet (B. vulgaris subsp. vulgaris), making it possible to estimate recombination rates per physical unit in the beet genome. In the present study we assume that the RFLP marker order and the distribution of recombination rates across the genome in the sugar beet and its wild relative, the sea beet (B. vulgaris subsp. maritima), are similar. These assumptions are supported by three observations. First, hybrids between sea beet and sugar beet show no decrease in fertility. Second, sugar beet RFLP markers have been shown to produce clear, single-copy hybridization patterns when hybridized to DNA from sea beet (Hjerdin et al. 1994). Third, sugar beet was domesticated from wild sea beet only fairly recently, and introgressions from the sea beet into the sugar beet genome have occurred on many occasions since (Bosemark 1978). In the present study, RFLP clones from sugar beet are used to examine genetic variation in natural populations of sea beet and to investigate whether a correlation exists between regional recombination rates and levels of genetic variation.
MATERIALS AND METHODS
Plant material: Sea beet, B. vulgaris subsp. maritima, is a diploid (2n = 18), outcrossing, and self-incompatible (Larsen 1977) species belonging to the family Chenopodiaceae. Seeds were collected from three natural populations, one from Cornwall in England and two from the coast of northeastern Italy. The two Italian populations were from locations ∼100 km apart. The seeds were grown in a greenhouse, and for each population 11 seedlings, each having different seed parents, were selected for further analysis.
RFLP analysis: Total genomic DNA was isolated and quantified as described in Halldén et al. (1996). Restriction-enzyme digestions, electrophoresis on agarose gels, and Southern blotting were performed according to Halldén et al. (1996). DNA from the plants selected was digested in single digests by EcoRI and EcoRV. Hybridizations were performed using all of the clones that mapped to the first or the third linkage group in one of the two mapping populations utilized in Halldén et al. (1996). This map contained a total of 413 markers and was constructed using two populations, of 222 and 133 F2 individuals, respectively.
Estimation of genetic variation: Two linkage groups from the Halldén et al. (1996) linkage map were selected for the study. Both linkage groups show clear differences between regions in the density of markers. All the RFLP clones used had been shown previously to map to single loci (Halldén et al. 1996). RFLP markers that produced weak bands or multibanded patterns that were not possible to interpret genetically were excluded from the statistical analysis. For the final analysis, 27 and 24 markers remained for linkage groups 1 and 3, respectively.
Genetic hitchhiking and background selection models quantify the reduction in genetic variation as a reduction in θ. This parameter, used frequently in population genetics, is defined as θ = 4Nμ, where N is the effective population size and μ the neutral mutation rate. We used two estimators of θ, one based on heterozygosity and one based on the number of alleles in the sample (θk).
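The formulas for the two estimators are not reproduced in this text. As a hedged illustration only, the sketch below assumes the infinite-alleles model, under which expected heterozygosity is θ/(1 + θ) and the expected number of alleles in a sample of n gene copies is the sum of θ/(θ + i) for i = 0, ..., n − 1 (Ewens sampling theory). The function names and the bisection solver are hypothetical and are not taken from the study.

```python
def gene_diversity(allele_counts):
    """Unbiased expected heterozygosity from allele counts at one locus."""
    n = sum(allele_counts)
    sum_sq = sum((c / n) ** 2 for c in allele_counts)
    return n / (n - 1) * (1.0 - sum_sq)


def theta_from_heterozygosity(allele_counts):
    """Assumed estimator: invert H = theta / (1 + theta)."""
    h = gene_diversity(allele_counts)
    if h >= 1.0:  # every gene copy carries a different allele
        return float("inf")
    return h / (1.0 - h)


def expected_alleles(theta, n):
    """Expected number of alleles among n gene copies; requires theta > 0."""
    return sum(theta / (theta + i) for i in range(n))


def theta_from_allele_number(k, n, tol=1e-10):
    """Assumed estimator: solve expected_alleles(theta, n) = k by bisection."""
    if k <= 1:
        return 0.0
    if k >= n:
        raise ValueError("no finite solution when every copy is a distinct allele")
    lo, hi = 0.0, 1.0
    while expected_alleles(hi, n) < k:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_alleles(mid, n) < k:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

With 11 diploid plants per population, a locus contributes n = 22 gene copies, so a call such as `theta_from_allele_number(3, 22)` would correspond to a marker with three alleles in one population.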
Estimation of recombination rates: Recombination rates per physical unit across the two selected linkage groups were estimated as proposed by Nachman and Churchill (1996). If RFLP markers are uniformly distributed along the chromosome, chromosomal regions with low recombination rates have markers tightly clustered on the genetic map, whereas regions with high recombination rates have longer map distances between markers. Under such conditions, marker density on the genetic map is inversely proportional to regional recombination rates and can be used to estimate local variations in recombination rates along linkage groups. We made use of the RFLP linkage map of Halldén et al. (1996) to estimate marker densities for linkage groups 1 and 3. Marker density for each marker was estimated using a cosine kernel function in a region ±5 cM from the specific marker (Silverman 1986). The inverse of these marker densities is proportional to the recombination rate per physical unit around the marker. These values were scaled to fit the number of map units per Mb for each chromosome. The number of base pairs per chromosome was derived from Bennet and Smith (1976), all the chromosomes in Beta being assumed to be of equal size (Bosemark and Bormotov 1971). Estimates of local recombination rates become less reliable near the ends of the linkage groups. We tried to minimize this effect by using a reflecting boundary as outlined by Silverman (1986).
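The procedure described above can be sketched in a few lines. This is a minimal illustration, not the code used in the study: it assumes the cosine kernel K(u) = (π/4)cos(πu/2) on [−1, 1] with a half-width of 5 cM, implements the reflecting boundary by mirroring marker positions at both ends of the linkage group, and scales the inverse densities so that their mean equals the chromosome-wide cM/Mb. The exact scaling rule is an assumption, and all function and parameter names are hypothetical.

```python
import math


def cosine_kernel(u):
    """Cosine kernel on [-1, 1]; it integrates to 1 over that interval."""
    if abs(u) > 1.0:
        return 0.0
    return (math.pi / 4.0) * math.cos(math.pi * u / 2.0)


def marker_density(positions, x, map_length, half_width=5.0):
    """Kernel estimate of marker density (per cM) at map position x.

    Each marker is mirrored at 0 and at map_length so that density is not
    underestimated near the ends of the linkage group.
    """
    total = 0.0
    for p in positions:
        for q in (p, -p, 2.0 * map_length - p):
            total += cosine_kernel((x - q) / half_width)
    return total / (len(positions) * half_width)


def recombination_rates(positions, map_length, chromosome_mb, half_width=5.0):
    """Inverse marker density at each marker, scaled to cM/Mb.

    Assumed scaling: the mean of the scaled values equals
    map_length / chromosome_mb.
    """
    inverse = [
        1.0 / marker_density(positions, p, map_length, half_width)
        for p in positions
    ]
    target_mean = map_length / chromosome_mb
    scale = target_mean / (sum(inverse) / len(inverse))
    return [scale * v for v in inverse]
```

The density at a marker's own position is never zero, because the marker itself always contributes K(0), so the inverse is always defined. Tightly clustered markers receive low estimated rates and isolated markers receive high ones.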
Correlation analysis: The Spearman rank correlation between estimates of recombination rates and levels of genetic variation was calculated separately for each combination of linkage group, population, and enzyme. We also calculated correlation coefficients for the combined datasets. When the data were combined, several estimates of genetic variation were obtained for each marker, e.g., from different populations and/or restriction enzymes. For datasets that included observations that were not independent, one-sided significance levels of the correlation coefficients were calculated using a resampling method. Keeping the observations of genetic variation fixed for the different markers, the recombination rates were shuffled among markers 2000 times, and the correlation between recombination rate and level of variation was calculated each time. The probability of the observed correlation was then estimated by comparison with the simulated distribution.
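A minimal sketch of such a resampling test is given below, under stated assumptions: each marker carries one recombination rate and one or more estimates of variation, the rates are shuffled among markers while the estimates stay attached to their markers, and the one-sided P value is taken as the fraction of shuffles whose correlation is at least as large as the observed one. The function names, the fixed random seed, and this definition of the P value are illustrative choices, not details confirmed by the study.

```python
import random


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation; both inputs must vary."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def pooled_correlation(rates, variation_by_marker):
    """Pair each marker's rate with every variation estimate for that marker."""
    xs, ys = [], []
    for rate, estimates in zip(rates, variation_by_marker):
        for estimate in estimates:
            xs.append(rate)
            ys.append(estimate)
    return spearman(xs, ys)


def permutation_p_value(rates, variation_by_marker, n_shuffles=2000, seed=0):
    """One-sided P value from shuffling rates among markers."""
    rng = random.Random(seed)
    observed = pooled_correlation(rates, variation_by_marker)
    shuffled = list(rates)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if pooled_correlation(shuffled, variation_by_marker) >= observed:
            hits += 1
    return observed, hits / n_shuffles
```

Shuffling at the level of markers, rather than individual observations, keeps the nonindependent estimates for one marker together, which is the reason a permutation test is preferred here over a tabulated significance level.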
RESULTS
The distribution of recombination rates along the first and third linkage groups of the sugar beet map of Halldén et al. (1996) was estimated from the distribution of markers. In the first linkage group, recombination was clearly suppressed in the middle, increasing further outward and decreasing finally at the very ends. The third linkage group showed a similar pattern, but without any indication of decreased recombination at the ends (Figure 1).
As expected (Donnelly and Tavaré 1995), estimates of θ showed considerable variance, even among adjacent markers (Figure 2). The variance appears mainly due to real differences between loci and populations, because the sampling variances were much lower than the variances among the estimates.
Markers from linkage group 1 were used to analyze the two Italian populations, whereas markers from linkage group 3 were used for both the Italian and the English populations. All DNA samples were digested separately by two different restriction enzymes, EcoRI and EcoRV, allowing estimates of θ to be obtained from four nonindependent datasets in the case of linkage group 1 and from six datasets for linkage group 3. For each of these datasets, Spearman rank correlation coefficients between recombination rates and both
Figure 1. Recombination rate per physical unit (cM/Mb) as a function of map position (cM) for the first (a) and third (b) linkage groups of Halldén et al. (1996). The markers used in this study are represented by solid squares, whereas all the other markers in the map of Halldén et al. (1996) are represented by open squares.
Figure 2. Estimates of θ from one estimator for linkage groups 1 (a) and 3 (b), and from the other estimator for linkage groups 1 (c) and 3 (d), as a function of recombination rate per physical unit (cM/Mb). The open squares represent mean estimates for the two enzymes in the Italian population A, the solid squares the Italian population B, and the triangles the English population.
Table 1. Correlation coefficients between θ and recombination rate per physical unit
In testing for correlations in the combined datasets, due consideration should be given to the fact that the different datasets are dependent. Accordingly, permutation methods were used to establish P values for the combined datasets (Table 1). Linkage group 1 showed a significant positive correlation between recombination and
DISCUSSION
We obtained positive correlations between estimates of recombination rates and θ in wild beet populations. Ten different datasets were investigated. Except for two with correlation coefficients close to zero, all showed positive correlations. When all the datasets were combined, the overall correlation found between recombination rates and θk was highly significant (r = 0.226, P = 0.007). Because the correlation coefficients for most of the datasets were positive, the overall significant result is not due simply to one of the datasets. The results are in agreement with theoretical predictions (Kaplan et al. 1989; Stephan et al. 1992; Charlesworth et al. 1993, 1995; Hudson and Kaplan 1995; Nordborg et al. 1996) and with previous observations in Drosophila (Aguadé et al. 1989; Stephan and Langley 1989; Begun and Aquadro 1992; Aquadro et al. 1994). For all the datasets, recombination rates were more strongly correlated with
The theory of neutral evolution predicts that the expected degree of variation within populations, at a given locus, depends on the population size and the neutral mutation rate (Kimura 1983). Our data could therefore also be explained by higher mutation rates in regions of high recombination rates. However, if such a correlation between recombination rates and mutation rates exists, we would also expect a correlation between recombination rates and divergence between species. In contrast, the background selection and genetic hitchhiking models do not predict any correlation between recombination per physical unit and divergence. We have therefore reanalyzed the data from Hjerdin et al. (1994), which include RFLP data for several different species of the genus Beta. Twelve of the single-copy markers used in that study have been mapped (Halldén et al. 1996). For these 12 markers we calculated the divergence (Equations 5.52-5.55 in Nei 1987) between B. vulgaris subsp. maritima and B. macrocarpa, a close relative of maritima (data not shown). Recombination rates per physical unit for the same set of markers were estimated using the same formula as for the markers used in the present study. The Spearman correlation coefficient between recombination and divergence was slightly negative but nonsignificant (r = -0.02) for this dataset, whereas the correlations between recombination and variation within B. vulgaris subsp. maritima and within B. macrocarpa were both positive but nonsignificant (r = 0.07 and 0.05, respectively). Thus, the significant correlation between recombination rates and levels of genetic variation found in the present study cannot be explained by a correlation between recombination rates and mutation rates. Instead, our data are most easily explained by models such as background selection and genetic hitchhiking.
Begun and Aquadro (1992) and Aquadro et al. (1994) report a much stronger correlation than was found here between recombination rate and level of genetic variation in D. melanogaster. There are several possible explanations, methodological as well as biological, for this difference. First, the recombination rates per physical unit were estimated in different ways. Our estimates are based entirely on genetic data from an RFLP map, whereas in Drosophila the marker location on genetic maps was compared to locations in polytene chromosomes. The use of a physical map should provide a better measure of recombination rates, which could explain some of the difference. Nachman and Churchill (1996), on the other hand, showed that in Drosophila the inverse of marker density is a reliable measure of recombination rates. Also, our estimates of recombination rates per physical unit and of genetic variation are not strictly independent. All markers we used had previously been mapped in a cross between two cultivars of sugar beet and were thus necessarily polymorphic between these two cultivars (Halldén et al. 1996). Accordingly, in genomic regions of lesser genetic variation there should be a greater number of clones that are monomorphic between the two cultivars and thus impossible to map. Therefore, we have probably overestimated both the level of genetic variation and the recombination rate per physical unit in the regions in which recombination is suppressed.
Whereas Begun and Aquadro (1992) and Aquadro et al. (1994) based their investigations on restriction-site information from various gene regions, our study involves random genomic clones. If genes are not uniformly distributed over the chromosomes, some of the loci included in our study could be located in regions of fewer genes and thus be less affected by hitchhiking or background selection than would be expected on the basis of recombination rates in that region. Regions of low recombination rates in the RFLP map of sugar beet probably coincide with the centromeric regions, which to a large degree consist of repetitive DNA and may thus contain fewer genes. However, a similar clustering of markers has been found in several species in which the clusters have been shown to include cDNA markers and isozyme loci, examples being tomato and potato (Tanksley et al. 1992), common bean (Adam-Blondon et al. 1994), rice (Causse et al. 1994), and sugar beet (Pillen et al. 1993). This shows that genes also exist in such clusters. Furthermore, in wheat the reduction in recombination rates has been shown to extend far outside the centromere region (Curtis and Lukaszewski 1991).
Probably much more important than these methodological factors are the biological differences between Beta and Drosophila. A key parameter in determining the effect of background selection and genetic hitchhiking is the rate of mutations to nonneutral variants per map unit (Kaplan et al. 1989; Stephan et al. 1992; Charlesworth et al. 1993). In addition, the relation between the nonneutral mutation rate and the decrease in neutral variation at linked loci is nonlinear, showing a positive and increasing slope (Charlesworth et al. 1993; Charlesworth 1996). Under the assumption that the mutation rate in a region is proportional to the number of genes, the average number of genes per centimorgan should be proportional to the mutation rate per centimorgan. Whereas the genetic map in B. vulgaris is 621 cM (Halldén et al. 1996), the map in Drosophila is 277 cM (FlyBase 1995). Because Drosophila lacks recombination in males, the effective recombination rates are only half of those indicated by the genetic map. Drosophila has been estimated to have 12,000-16,000 genes (Bird 1995). Although there is no estimate of the gene number in Beta, Arabidopsis thaliana has been estimated to harbor ∼21,000 genes (Bevan et al. 1998). If we assume that Beta has about the same number of genes as Arabidopsis, both being dicotyledonous plants, the mutation rate per map unit must be at least twice and perhaps three times as high in Drosophila as in Beta. Thus, the genome-wide effect of background selection and genetic hitchhiking can be expected to be stronger in Drosophila than in Beta.
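The arithmetic behind this comparison can be made explicit. The sketch below uses only the figures quoted above and treats gene number per effective centimorgan as a stand-in for the nonneutral mutation rate per map unit. The Beta gene number is the Arabidopsis-based assumption stated in the text, and the variable and function names are hypothetical.

```python
# Figures quoted in the text.
BETA_MAP_CM = 621.0
DROSOPHILA_MAP_CM = 277.0
DROSOPHILA_GENES = (12_000, 16_000)  # lower and upper estimates
BETA_GENES_ASSUMED = 21_000          # assumption: same as Arabidopsis


def genes_per_effective_cm(genes, map_cm, sex_averaged_factor=1.0):
    """Genes per centimorgan after scaling the map by sex_averaged_factor."""
    return genes / (map_cm * sex_averaged_factor)


beta_density = genes_per_effective_cm(BETA_GENES_ASSUMED, BETA_MAP_CM)

# No recombination in Drosophila males halves the effective map length.
drosophila_density = [
    genes_per_effective_cm(g, DROSOPHILA_MAP_CM, sex_averaged_factor=0.5)
    for g in DROSOPHILA_GENES
]

ratios = [d / beta_density for d in drosophila_density]
print([round(r, 2) for r in ratios])
```

Under these assumptions the ratio of Drosophila to Beta comes out at roughly 2.6 to 3.4, in line with the statement that the rate per map unit is at least twice and perhaps three times as high in Drosophila.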
Still another possibility is that Beta has higher variance in θ among loci. The stochastic nature of genetic drift should result in very different estimates of θ for different loci, regardless of sample size. Any investigation attempting to demonstrate a potential effect of recombination on the degree of variation needs to possess sufficient statistical power to overcome the obscuring effect of the variance among the loci due to genetic drift. The significant correlation between recombination rate and genetic variation found in our study shows the methodology and the sample size to be sufficient for revealing the impact that the level of recombination has on the level of genetic variation in Beta. Still, a higher variance in θ among loci would decrease the correlation coefficient between recombination and θ. Drosophila is thought to consist of very large populations, whereas the sea beet is known to be divided into several subpopulations of smaller size (Letschert 1993; Kraft et al. 1997). Thus, the variance among loci can be expected to be higher in Beta.
Our results show that the theoretical prediction of a positive correlation between recombination and variation can be observed in species other than Drosophila. We also found the strength of the correlation to vary between species. Recently, similar results were obtained for several different species of Aegilops (Dvořák et al. 1998). That study found that the strength of the correlation between recombination and variation varied among species and was stronger for self-fertilizing species. Thus, many different factors, such as numbers of genes per centimorgan, population structure, and reproduction mode, can affect the magnitude of the correlation between recombination and variation. Today, as genetic maps, and sometimes physical ones too, are available both in a number of model organisms and in many crop species, it would be of great interest to examine patterns of variation in a number of species varying in genome size, in breeding system, and in population structure.
Acknowledgments
We thank Magnus Nordborg and Bengt-Olle Bengtsson for helpful suggestions and comments and R. J. Murphy.
Footnotes
- Communicating editor: A. G. Clark
- Received March 5, 1998.
- Accepted July 27, 1998.
- Copyright © 1998 by the Genetics Society of America