Construction of a 10,000-Marker Ultradense Genetic Recombination Map of Potato: Providing a Framework for Accelerated Gene Isolation and a Genomewide Physical Map
Hans van Os, Sandra Andrzejewski, Erin Bakker, Imanol Barrena, Glenn J. Bryan, Bernard Caromel, Bilal Ghareeb, Edwige Isidore, Walter de Jong, Paul van Koert, Véronique Lefebvre, Dan Milbourne, Enrique Ritter, Jeroen N. A. M. Rouppe van der Voort, Françoise Rousselle-Bourgeois, Joke van Vliet, Robbie Waugh, Richard G. F. Visser, Jaap Bakker, Herman J. van Eck

Abstract

An ultradense genetic linkage map with >10,000 AFLP loci was constructed from a heterozygous diploid potato population. To our knowledge, this is the densest meiotic recombination map ever constructed. A fast marker-ordering algorithm was used, based on the minimization of the total number of recombination events within a given marker order in combination with genotyping error-detection software. This resulted in “skeleton bin maps,” which can be viewed as the most parsimonious marker order. The unit of distance is not expressed in centimorgans but in “bins.” A bin is a position on the genetic map with a unique segregation pattern that is separated from adjacent bins by a single recombination event. Putative centromeres were identified by a strong clustering of markers, probably due to cold spots for recombination. Conversely, recombination hot spots resulted in large intervals of up to 15 cM without markers. The current level of marker saturation suggests that marker density is proportional to physical distance and independent of recombination frequency. Most chromatids (92%) recombined once or never, suggesting strong chiasma interference. Absolute chiasma interference within a chromosome arm could not be demonstrated. Two examples of contig construction and map-based cloning have demonstrated that the marker spacing was in accordance with the expected physical distance: approximately one marker per BAC length. Currently, the markers are used for genetic anchoring of a physical map of potato to deliver a sequence-ready minimal tiling path of BAC contigs of specific chromosomal regions for the potato genome sequencing consortium (http://www.potatogenome.net).

GENETIC linkage maps constitute a necessary prerequisite to study the inheritance of both qualitative and quantitative traits and to develop markers for marker-assisted breeding and for map-based gene cloning. Multi-locus molecular marker techniques, such as AFLP (Vos et al. 1995), can be used to generate large numbers of markers in a relatively short time, facilitating the construction of dense genetic linkage maps. High-density genetic linkage maps have already been constructed in crop plant species such as rice (Harushima et al. 1998: 2275 markers), maize (Vuylsteke et al. 1999: 1539 and 1355 markers mapped in two populations), wheat (Boyko et al. 2002: 732 markers), potato and tomato (Tanksley et al. 1992: ∼1000 markers; Haanstra et al. 1999: 1175 markers), pepper (Paran et al. 2004: 2262 markers mapped in six populations), sorghum (Bowers et al. 2003: 2512 markers), cotton (Rong et al. 2004: 3347 markers), and papaya (Ma et al. 2004: 1501 markers). High-density genetic linkage maps with >5000 microsatellite markers have also been constructed in mammals (Murray et al. 1994; Dib et al. 1996; Dietrich et al. 1996).

A genomewide ultradense genetic map results in the global saturation of the genome with marker loci, which, if concentrated on a single mapping population, can be useful for all other mapping applications. Usually, map-based cloning of genes responsible for interesting traits requires local marker saturation around the target gene. This targeted marker saturation is generally achieved with bulked segregant analysis (Michelmore et al. 1991). Ultradense genetic maps avoid this time-consuming and costly step, which has to be carried out in separate experiments for every trait locus targeted. Moreover, expected average between-marker distances that are smaller than the average insert length of a BAC library generally allow chromosome landing (Tanksley et al. 1995). Ultradense genetic maps also facilitate the genetic anchoring of a physical map. If large-insert genomic clones or contigs can be identified directly with markers from an ultradense genetic map, then they can be anchored to their corresponding chromosomal positions. In addition to these applications, the ultradense map will become the reference map that facilitates marker exchange and map alignment within the research community working on any given organism, provided that the marker information contained within the map is transferable to other genotypes or populations. High transferability of AFLP markers between populations has been amply demonstrated by using the AFLP catalog for potato (Rouppe van der Voort et al. 1997a,b) and barley (Qi and Lindhout 1997; Waugh et al. 1997). The transferability of other single-locus marker types, such as RFLPs, STSs, and SSRs, is more obvious and has therefore not been questioned.

The construction of ultradense genetic linkage maps has been confronted with two major problems. First, currently available computer programs for linkage mapping cannot handle data sets of several thousand markers without prohibitively long calculation times. Second, even small frequencies of scoring error result in high rates of ordering ambiguities between markers within short genetic distances. Two recently developed computer programs, referred to as RECORD (van Os et al. 2005a) and SMOOTH (van Os et al. 2005b), have tackled these problems. RECORD employs a marker-ordering algorithm based on minimization of the total number of recombination events in any given marker order (van Os et al. 2005a). SMOOTH is a statistical genotyping error-removal utility that calculates the probability of a data point being a “singleton” on the basis of neighboring marker information. A singleton appears to be the result of a double recombination event, one at either side of a single marker locus (Nilsson et al. 1993). More likely, singletons represent artifacts due to scoring errors or to technical or biological phenomena such as methylation polymorphisms and gene conversion. The observation of singletons depends on their context of flanking markers. Therefore, singletons are removed in an iterative process (singleton removal, reordering of markers, singleton removal, reordering, etc.) in which the statistical threshold of singleton identification is gradually relaxed (van Os et al. 2005b). The loss of a few percent of the data is obviously less damaging to the map than having similar levels of genotyping errors. Using these two computer programs results in a framework of ordered bins in which all recombination events in the population have been identified. A “bin” is a position on the genetic map with a unique segregation pattern and is separated from adjacent bins by a single recombination event.
This ordered set of bins is considered to be a “skeleton bin map” to which all original marker data can be fitted, using a maximum-likelihood method. This approach also provides a quality estimate for each marker that is based on the deviation between the observed marker segregation pattern and the expected segregation pattern as defined by the position of the bin in the skeleton bin map. A bin may contain a number of cosegregating markers and is defined by a segregation pattern. This pattern is called the “bin signature,” and it represents an accurate genetic position on the map within a given population. The unit of distance of the skeleton bin map is expressed in recombination events. In saturated linkage maps, all recombination events are captured. As a consequence, application of the Kosambi mapping function is not necessary to compensate for unnoted double recombination events. A more comprehensive description of the method is provided in Isidore et al. (2003) and van Os et al. (2005a,b) and is outlined in Figure 1.
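The notion of a bin as a set of cosegregating markers can be illustrated with a minimal sketch. It is not the RECORD or SMOOTH implementation; the marker names, the '0'/'1' score encoding, and the function name `group_into_bins` are assumptions made for illustration, and missing data are ignored.

```python
from collections import defaultdict

def group_into_bins(markers):
    """Group markers that share an identical segregation pattern.

    `markers` maps a marker name to a string of offspring scores
    ('0' or '1'), one character per individual. Every distinct
    pattern defines one bin; the pattern itself is the bin signature.
    """
    bins = defaultdict(list)
    for name, pattern in markers.items():
        bins[pattern].append(name)
    return dict(bins)

# Hypothetical markers scored in a toy population of six offspring.
toy_markers = {
    "m1": "001101",
    "m2": "001101",  # cosegregates with m1, so it falls in the same bin
    "m3": "001111",  # differs from m1 in one offspring: a separate bin
}
toy_bins = group_into_bins(toy_markers)
```

In this toy case two bins result, and their signatures differ in a single offspring, i.e., by one recombination event.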

Figure 1.—

Overview of the method used to construct the ultradense map of potato as described by Isidore et al. (2003) and van Os et al. (2005a,b). Specifically, the handling of data and the application of software are indicated.

In this article, we present, to our knowledge, the densest meiotic linkage map yet produced for any species. The ultradense map of potato covers all linkage groups and contains >10,000 markers in total. The nonrandom pattern of marker distribution provides insight into the positions of putative recombination hot spots and centromeric regions. The distribution of recombination events per chromatid provides information on the level of chiasma interference. Given an estimated genome size of 840 Mb (Bennett et al. 1997), and assuming random marker distribution, this level of marker saturation will expedite all map-based cloning efforts in potato, as well as the anchoring of BAC contigs for the construction of a sequence-ready potato physical map.

MATERIALS AND METHODS

Plant material:

A cross between two diploid heterozygous potato clones, SH83-92-488 × RH89-039-16 (hereafter referred to as SH × RH), resulted in an F1 mapping population of 136 individuals. The same mapping population has been used to clone the nematode resistance gene Gpa2 against Globodera pallida (Van der Vossen et al. 2000) and the Phytophthora infestans R-gene R3a (Huang et al. 2004, 2005), and to construct the high-resolution map of the H1 locus for G. rostochiensis resistance (Bakker et al. 2004). Genomic DNA was extracted from frozen leaf tissue according to Van der Beek et al. (1992).

Marker analysis:

AFLP markers (Vos et al. 1995) were generated with templates of three different restriction enzyme combinations—EcoRI/MseI, SacI/MseI, and PstI/MseI—and by applying three selective nucleotides to AFLP primers at the EcoRI, SacI, and MseI side and two selective nucleotides to the primers at the PstI side. A total of 381 primer combinations, listed at http://potatodbase.dpw.wau.nl/UHDdata.html, were used to generate markers. Amplification products were separated by electrophoresis and visualized by autoradiography as described in Isidore et al. (2003).

The autoradiograms were analyzed manually or with the aid of the computer program Cross-Checker (Buntjer 2000b), which is available at http://www.dpw.wur.nl/pv/. The names of the markers indicate the enzymes used, the selective nucleotides, and the size of the fragment; for instance, EAACMCAA_507.0 is an AFLP marker derived from a primer combination with the enzymes EcoRI and MseI, selective nucleotides AAC and CAA, and a mobility that corresponds to a fragment with an estimated size of 507 bp. Fragment mobility estimates were inferred relative to a 10-base ladder (Sequamark, Research Genetics, Huntsville, AL) using reference gels provided by Keygene NV, Wageningen, Netherlands. Assigning linkage groups to the 12 potato chromosomes was done with a set of AFLP markers with known position (Rouppe van der Voort et al. 1997a,b) and other markers, including RFLPs, SSRs, cleaved amplified polymorphic sequence (CAPS), and sequence characterized amplified regions (SCARs).
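The marker naming convention lends itself to a small parser, sketched below. Only the codes E (EcoRI) and M (MseI) are confirmed by the example EAACMCAA_507.0; the one-letter codes S (SacI) and P (PstI), the regular expression, and the function name `parse_marker` are assumptions made for illustration.

```python
import re

# Assumed one-letter enzyme codes. Only E and M are confirmed by the
# example in the text; S and P are hypothetical.
ENZYMES = {"E": "EcoRI", "M": "MseI", "S": "SacI", "P": "PstI"}

# First enzyme code, 2-3 selective nucleotides, M, 3 selective
# nucleotides, underscore, estimated fragment size.
MARKER_RE = re.compile(r"^([ESP])([ACGT]{2,3})M([ACGT]{3})_(\d+(?:\.\d+)?)$")

def parse_marker(name):
    """Split an AFLP marker name into enzymes, selective bases, and size."""
    match = MARKER_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised marker name: {name!r}")
    code, first_bases, second_bases, size = match.groups()
    return {
        "enzymes": (ENZYMES[code], ENZYMES["M"]),
        "selective_nucleotides": (first_bases, second_bases),
        "size_bp": float(size),
    }

example = parse_marker("EAACMCAA_507.0")
```

For the example from the text, the parser returns the enzymes EcoRI and MseI, the selective nucleotides AAC and CAA, and a size of 507 bp.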

Map construction:

The marker data were split into three sets on the basis of their segregation type. Markers that were heterozygous in the maternal parent (SH) and absent in the paternal parent (RH) were scored as <ab × aa>; “paternal” markers heterozygous in RH and absent in SH were scored as <aa × ab>; markers segregating in both parents were denoted as <ab × ab>. The maternal and paternal data sets were divided into 12 linkage groups with the module GROUP, included in JoinMap 2.0 (Stam and Van Ooijen 1995). In total, 65 faulty markers were removed manually to ensure a stable grouping down to a LOD threshold of 6. Faulty markers result from nonallelic bands with identical mobility. When such bands are superimposed on the gel, e.g., when an <ab × aa> and an <aa × ab> band are perceived as one single <ab × ab> marker, the resulting segregation pattern draws two unrelated parental groups into one artifactual group. A preliminary marker order and the linkage phase were calculated with the “quick and dirty” mapping module JMQAD32 from JoinMap 2.0 (Stam and Van Ooijen 1995). This algorithm calculates the marker order by minimizing the sum of adjacent recombination frequencies.

The order of markers in the linkage groups was then recalculated with RECORD (van Os et al. 2005a), which requires data in BC1 format. RECORD makes use of a cost function, which aims to minimize the total number of recombination events within a given marker order. This can be viewed as the most parsimonious marker order. After this second ordering of the markers, the data were displayed in map order as a color-coded “graphical” genotype in Microsoft Excel using a conditional cell formatting formula. Using this display, we could easily mark data points that were in disagreement with the observations at flanking marker loci. These data points could be the result of a double recombination event at either side of a single marker locus, gene conversion, AFLP artifacts, or scoring errors and are collectively called singletons (Nilsson et al. 1993; van Os et al. 2005b). They were reevaluated by visual inspection of the autoradiograms and corrected if necessary.
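The cost function described here can be sketched as follows. This is a toy illustration, not the RECORD program: the exhaustive search over all permutations is an assumption that only works for a handful of markers and stands in for RECORD's actual search procedure, and the marker names and function names are hypothetical.

```python
from itertools import permutations

def count_recombinations(order, scores):
    """Total number of recombination events implied by a marker order.

    `scores[m]` is a string of BC1-style scores ('0' or '1'), one per
    offspring. Two adjacent markers contribute one event for every
    offspring whose score changes between them.
    """
    total = 0
    for left, right in zip(order, order[1:]):
        total += sum(a != b for a, b in zip(scores[left], scores[right]))
    return total

def most_parsimonious_order(scores):
    """Exhaustive search over all orders; feasible for toy data only."""
    return min(permutations(scores),
               key=lambda order: count_recombinations(order, scores))

# Hypothetical markers scored in six offspring.
toy_scores = {
    "A": "000011",
    "B": "000111",
    "C": "001111",
}
best_order = most_parsimonious_order(toy_scores)
```

In the toy data the order A, B, C (or its mirror image) requires two recombination events, whereas placing C between A and B requires three.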

The corrected data were ordered for a third time with RECORD and the remaining singletons were removed with SMOOTH (van Os et al. 2005b) in iterations with RECORD.
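The idea of a singleton can be made concrete with a simplified sketch. SMOOTH weighs information from several flanking markers and applies a statistical threshold that is relaxed over iterations; the rule below, which looks only at the two immediate neighbors, is a deliberate simplification, and the function names and the "-" code for missing data are assumptions.

```python
def flag_singletons(ordered_scores, missing="-"):
    """Return (marker_index, offspring_index) pairs that look like singletons.

    `ordered_scores` is a list of equal-length score strings in map order.
    A data point is flagged when its two immediate neighbours agree with
    each other but disagree with it. Terminal markers are not examined.
    """
    flagged = []
    for i in range(1, len(ordered_scores) - 1):
        triple = zip(ordered_scores[i - 1], ordered_scores[i], ordered_scores[i + 1])
        for j, (prev, cur, nxt) in enumerate(triple):
            if missing in (prev, cur, nxt):
                continue
            if prev == nxt and cur != prev:
                flagged.append((i, j))
    return flagged

def mask_singletons(ordered_scores, flagged, missing="-"):
    """Replace flagged data points with missing values."""
    rows = [list(row) for row in ordered_scores]
    for i, j in flagged:
        rows[i][j] = missing
    return ["".join(row) for row in rows]

# Toy data: four markers in map order, four offspring.
toy_rows = ["0001", "0101", "0001", "0011"]
toy_flags = flag_singletons(toy_rows)
toy_clean = mask_singletons(toy_rows, toy_flags)
```

In an iterative scheme, the masked data would be reordered and inspected again, as described for SMOOTH and RECORD.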

The program ComBin (Buntjer et al. 2000a; available at http://www.dpw.wur.nl/pv/) was used for final inspection. ComBin removes the redundancy due to cosegregating markers and draws connections between nonredundant marker bins without the assumption that a chromosome is a linear structure. Side branches result from singletons, and any alternative connection between pairs of markers (or bins) is allowed as well. When nonlinear structures were visualized by ComBin, further data inspection was performed. When ComBin analysis results in a linear figure, it can be concluded that the linkage group is free from data ambiguities. When all ambiguities identified with ComBin have been replaced with missing values, the cosegregating markers are used to infer bin signatures. A bin signature comprises the consensus segregation pattern of marker loci, which do not recombine and are thus incorporated in the bin. The resulting bins form a skeleton bin map of the potato linkage groups. Subsequently, the bins are filled with marker loci. Please note that marker loci represent the real observed segregation data, including ambiguous data points, whereas the bin signatures represent the least ambiguous consensus segregation obtained so far.
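Deriving a bin signature from cosegregating markers can be sketched as a per-offspring consensus. The majority rule, the handling of ties, and the "-" code for missing values are assumptions made for illustration; the text does not specify how the consensus is computed.

```python
from collections import Counter

def bin_signature(patterns, missing="-"):
    """Consensus segregation pattern of the markers in one bin.

    For every offspring the most frequent non-missing score is taken.
    Positions that are missing in all markers, or tied, stay missing.
    """
    signature = []
    for column in zip(*patterns):
        counts = Counter(score for score in column if score != missing)
        ranked = counts.most_common(2)
        tied = len(ranked) == 2 and ranked[0][1] == ranked[1][1]
        if not ranked or tied:
            signature.append(missing)
        else:
            signature.append(ranked[0][0])
    return "".join(signature)

# Three hypothetical cosegregating markers with missing data points
# and one deviating score in the last offspring.
toy_signature = bin_signature(["01-1", "0111", "0-10"])
```

The consensus fills in missing values from the other markers in the bin, whereas the individual marker loci retain their observed data.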

The mapping of the bridge markers, which are heterozygous in both parents <ab × ab>, is based on the information offered by the skeleton bin map. When the telomeric maternal and paternal bin signatures are superimposed (<ab × aa> 1:1 + <aa × ab> 1:1 = <ab × ab> 3:1), a putative bridge bin signature results. This method of postulation of all putative bridge bin signatures follows the method of the “two-way pseudotestcross” proposed by Grattapaglia and Sederoff (1994) in reverse direction. Depending on linkage phase in coupling or repulsion of the parental markers ({0-} or {1-} and {-0} or {-1}), the postulated bridge bin can take four alternative 3:1 segregation patterns as the bridge bin signature ({00}, {01}, {10}, and {11}). The bridge markers were fit into the putative bridge bins by maximum likelihood. A LOD threshold of 15 (P < 0.001) was used to avoid false-positive assignment of bridge markers to bridge bins. This threshold was determined by a permutation test. After fitting 10,000 random markers into the bins, <0.1% of the markers fitted into the framework map with a LOD score higher than either 4 or 15 for 1:1 and 3:1 segregating markers, respectively. Chromosome orientation follows Dong et al. (2000) with the short arm north and the long arm south, except for the linkage groups homologous to chromosomes VII, XI, and XII, which are in opposite orientation.
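The postulation of bridge bin signatures can be sketched as follows, assuming dominant band scoring in which a band is present when at least one inherited homologue carries it. The '0'/'1' encoding of inherited homologues, the function name, and the mapping of the dictionary keys onto the {00}, {01}, {10}, and {11} notation are assumptions made for illustration.

```python
def bridge_signatures(maternal, paternal):
    """Postulate the four 3:1 patterns for one maternal/paternal bin pair.

    `maternal` and `paternal` are strings of '0'/'1' that record, per
    offspring, which parental homologue was inherited. The key "ij"
    names the maternal (i) and paternal (j) homologues assumed to carry
    the band. A band is scored present ('1') when either inherited
    homologue carries it.
    """
    signatures = {}
    for i in "01":
        for j in "01":
            signatures[i + j] = "".join(
                "1" if (m == i or p == j) else "0"
                for m, p in zip(maternal, paternal)
            )
    return signatures

# Four offspring representing the four possible homologue combinations.
toy_bridge = bridge_signatures("0011", "0101")
```

Each of the four postulated patterns shows the band in three of the four homologue combinations, which corresponds to the expected 3:1 segregation.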

RESULTS

Markers and progeny:

The diploid mapping population SH × RH, comprising 136 individuals, was analyzed with a total of 381 AFLP primer combinations derived from three different enzyme combinations. A total of 10,305 clearly scorable markers was recorded. Additional SSR, CAPS, RFLP, SCAR, and phenotypic marker loci were analyzed for the population, which raised the number of markers to 10,365. This implies a data set of 1.4 million data points. Inspection of the data revealed six individuals that were contaminated (1), duplicated (4), or not related to either of the parents (1). These individuals were omitted from further analyses, resulting in a population of 130 informative individuals.

Mapping:

The total data set was split into maternal, paternal, and biparental data sets (Table 1). Among the total of 10,365 markers, 4187 segregated due to polymorphism in the maternal parent <ab × aa>, 3413 segregated from the paternal parent <aa × ab>, and 2765 markers were heterozygous in both parents. The latter type of markers, being referred to as bridge markers <ab × ab>, were used to align the maternal and paternal maps. Summation of the parental-specific markers and the bridge markers resulted in 6952 maternal loci and 6178 paternal loci.

TABLE 1

Number of markers per enzyme combination per parent

The maternal data set could be split into 12 linkage groups at a LOD threshold of 6. For the paternal data, linkage groups II–XI were obtained at LOD 6, but linkage groups I and XII remained associated up to a LOD threshold of 12. This was due to coincidental correlation between the segregation patterns of loci in these two groups. The 24 linkage groups from SH and RH were aligned with the expected potato chromosome. In addition to the 12 known paternal linkage groups, a small highly skewed unassigned linkage group, which contained only 13 markers with a length of ∼10 cM, was obtained. This group was heterozygous in the paternal clone (RH) and unassigned (U) to any particular chromosome and is therefore referred to as RHU. The linkage group RHU was omitted from further analyses.

The 24 data sets representing the different linkage groups from both parents were subjected to reexamination for putative scoring errors and to statistical identification of singletons using the computer programs RECORD and SMOOTH as described in Isidore et al. (2003). During data inspection, we noted one individual (SH×RH57-J12) in which almost all markers from the paternal parent in chromosome VIII were present. This phenomenon is probably due to nondisjunction of this chromosome in the first meiotic division, resulting in a trisomic state. The resulting systematic errors were replaced with missing values. By following the mapping method as described above and in Isidore et al. (2003), a skeleton bin map was obtained, corresponding with a most parsimonious representation of all marker data.

Skeleton bin map:

Twelve maternal and 12 paternal skeleton maps deduced from bin signatures provide a representation of the recombination events captured in this mapping population. In total, 569 maternal and 549 paternal bin signatures were obtained. Most adjacent bin signatures differ by only one offspring genotype score, which represents the recombination event between the adjacent bins. In other cases, bin signatures differed for two or more offspring genotype scores, suggesting two or more recombinations between adjacent bins. Inclusion of empty bins to accommodate for multiple recombination events between marker loci resulted in a skeleton bin map spanning 977 and 1005 recombination events in the maternal and paternal map, respectively (Table 2, Figure 2). All bins, including the empty bins, are numbered consecutively. With 130 offspring in the mapping population, one bin represents 100/130 cM. Hence, the genetic length of the parental maps is 751 cM for the maternal map and 773 cM for the paternal map.
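The relation between bin signatures, empty bins, and map length can be illustrated with a short sketch. The function names and the "-" code for missing values are assumptions; the conversion of one recombination event to 100/130 cM follows the text.

```python
def events_between(sig_a, sig_b, missing="-"):
    """Number of offspring whose score differs between two bin signatures."""
    return sum(a != b for a, b in zip(sig_a, sig_b) if missing not in (a, b))

def empty_bins_needed(sig_a, sig_b):
    """Empty bins to insert so that adjacent bins are one event apart."""
    return max(events_between(sig_a, sig_b) - 1, 0)

def map_length_cm(n_events, n_offspring):
    """One recombination event among n offspring corresponds to 100/n cM."""
    return 100.0 * n_events / n_offspring

maternal_cm = map_length_cm(977, 130)   # maternal skeleton bin map
paternal_cm = map_length_cm(1005, 130)  # paternal skeleton bin map
```

With 130 offspring, one bin corresponds to roughly 0.77 cM, and the 977 and 1005 recombination events translate to approximately 751.5 and 773.1 cM, in line with the map lengths given above.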

Figure 2.—

The distribution of AFLP markers on the ultradense genetic linkage map of potato. (a) Legend for the marker density based on pseudocolors. (b) Map of the maternal parent SH. (c) Map of the paternal parent RH. The number on the left of the linkage group indicates the cumulative number of recombination events counted from the top. The number of markers in each bin is represented by shading indicated in a. Putative centromere positions are indicated with “I” alongside the chromosome.

TABLE 2

Overview of the number of markers, bins, and recombination events per parent and per linkage group

Fitting of original marker data into the skeleton bin map:

The original scoring data (after the manual verification of singletons) were fitted into the bins of the skeleton bin map by maximum likelihood. Subsequently, the marker content of every bin was examined. Application of SMOOTH to remove singletons may have resulted in unmerited removal of correct data, thus causing a reduction in the effective population size. Visual inspection of the original scoring data, specifically near the position of the recombination events, allowed for the correct repositioning of markers into adjacent empty bins. In this way, the unjust removal of putative singletons by SMOOTH was reversed. After these final improvements to the skeleton bin map, the marker data had to be fitted into the bins again.

Not all markers, however, were allocated to map positions. From the maternal markers, 46 markers did not reach the threshold of LOD 4, another 22 markers were manually deleted from the most skewed bin SH05B044, and 1 marker fitted into two linkage groups with equal likelihood. From the paternal markers, 45 markers did not reach the threshold of LOD 4, 46 markers were removed from bin RH12B049, and 1 marker could not be fitted into one bin unambiguously.

Of the bridge markers, 15 were linked to nonhomologous maternal and paternal linkage groups and 525 markers did not reach the stringent threshold of LOD 15. The final numbers of markers within the bins of the skeleton map are listed in Table 2 and can be retrieved via http://potatodbase.dpw.wau.nl/UHDdata.html. Figure 2 shows the skeleton bin map.

Data quality:

All mapped markers shown in the online database have been provided with a quality label. This label is based on the deviation between the observed data of this marker and the expected segregation pattern as recorded in the bin signature. Because the dimension of genetic distances due to recombination events is independent of the dimension of distance due to singletons, this deviation can be considered as a distance perpendicular to the map. Hence, listing the number of singletons per marker is useful as a quality measure, which represents the goodness of fit of the marker in a given bin. Singletons did not occur randomly among the markers. Many markers (34%) were without singletons and the 10% of the markers with the poorest data quality account for over half the total amount of the 33,489 singletons observed in the data set (van Os et al. 2005b).

Segregation distortion:

In the maternal map, segregation distortion was observed for all markers of the maternal linkage group V. Moderate segregation distortion (44:86) started at one telomeric end, increased to a highly distorted ratio of 26:104 (χ2 = 46.8; P < 0.0001) at bin 45 (SH05B045), and declined to 50:80 at the other telomeric end.

In the paternal map, the linkage groups I and XII showed segregation distortion. The skewed interval on chromosome I ranged from bin RH01B001 to RH01B042, with bin RH01B021 showing the highest segregation distortion (35:95; χ2 = 27.7; P < 0.0001). The first two bins of the short arm of chromosome XII did not show significant skewness (54:76), but skewness increased toward the other end. The telomeric bin RH12B051 showed the strongest segregation distortion: 21:109 (χ2 = 59.6; P < 0.0001).
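The χ2 values reported for segregation distortion can be reproduced with a goodness-of-fit test against the expected 1:1 ratio. The sketch below assumes one degree of freedom and no continuity correction; the function name is hypothetical.

```python
import math

def chi_square_1to1(n_a, n_b):
    """Chi-square test of an observed two-class ratio against 1:1.

    Returns the test statistic and the P value for one degree of
    freedom, computed from the complementary error function.
    """
    expected = (n_a + n_b) / 2.0
    chi2 = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

chi2_sh05, p_sh05 = chi_square_1to1(26, 104)   # bin SH05B045
chi2_rh01, p_rh01 = chi_square_1to1(35, 95)    # bin RH01B021
chi2_rh12, p_rh12 = chi_square_1to1(21, 109)   # bin RH12B051
chi2_mild, p_mild = chi_square_1to1(54, 76)    # first bins of RH12
```

The three strongly skewed ratios give χ2 values of about 46.8, 27.7, and 59.6, whereas the ratio 54:76 stays just below the 5% significance level.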

Markers in the proximity of the highly skewed bins RH01B021 and RH12B051 showed correlated segregation patterns. This required an elevated LOD threshold to separate markers in the two linkage groups RH01 and RH12. Correlated segregation patterns between loci from different linkage groups are a violation of Mendel's law of independent assortment of allele pairs. Possibly interacting allele pairs with strong effects on pollen or embryo viability, germination, or tuber formation are located on the paternal chromosomes I and XII.

Map saturation and marker distribution:

Figure 2 provides a bird's eye view of the length and saturation of the linkage groups. In Figure 2, shading, rather than a listing of 10,000 marker names, indicates over- and undersaturated regions. The similarity between the maternal and paternal maps is striking with respect to map length and the positions of strong clustering of markers. The lack of clusters on chromosomes III and X is likewise congruent between the maternal and paternal maps. The largest cluster is observed on chromosome I, where the bins SH01B32 and RH01B13 contain 539 and 373 marker loci, respectively. Taking the most densely populated bin as the putative position of the centromere, the putative centromeric bins (and number of markers in parentheses) are SH01B32 (539), RH01B13 (373), SH02B04 (72), RH02B01 (47), SH04B31 (212), RH04B35 (155), SH05B44 (113), RH05B46 (174), SH06B05 (52), RH06B17 (97), SH07B70 (95), RH07B68 (80), SH08B13 (43), RH08B22 (16), SH09B21 (114), RH09B31 (27), SH12B56 (199), and RH12B49 (100).

Despite the saturation of the map, gaps are observed. The largest gap is on chromosome VIII, spanning 14 recombinations in the maternal parent and 20 recombinations in the paternal parent. These gaps are probably due to recombination hot spots, but could also indicate fixation (homozygosity) of the potato genome in this region.

Distribution of crossover events and chiasma interference:

The distribution of marker alleles observed in the offspring genotypes allows a reconstruction of the number of recombination events in the chromatids transmitted from the parents. Analysis of the distribution of recombination events per chromatid showed that the vast majority of the 3120 (= 130 × 24) chromatids either were without recombination (44%) or showed a single recombination event (48%). The precise numbers of chromatids are 1379 (44%), 1505 (48%), 228 (7.3%), and 7 (0.22%) chromatids showing zero, one, two, and three recombination events, respectively. Evidently, the vast majority (93%) of the meiotic bivalents in the parents experienced only one chiasma, resulting in two recombinant and two nonrecombinant chromatids. No significant differences in recombination frequencies were observed between the female and male meiosis.
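Counting recombination events per chromatid from ordered bin signatures can be sketched as follows. The '0'/'1' encoding and the function name are assumptions made for illustration.

```python
from collections import Counter

def events_per_chromatid(signatures):
    """Number of recombination events in each transmitted chromatid.

    `signatures` is a list of bin signatures in map order for one
    parental linkage group (strings of '0'/'1', one character per
    offspring). Every change of score along the map counts as one event.
    """
    return [
        sum(a != b for a, b in zip(column, column[1:]))
        for column in zip(*signatures)
    ]

# Toy linkage group: four bins in map order, four offspring.
toy_events = events_per_chromatid(["0000", "0001", "0011", "0010"])
toy_distribution = Counter(toy_events)

# Consistency check on the reported classes with one, two, and three
# recombination events per chromatid.
reported_events = 1 * 1505 + 2 * 228 + 3 * 7
```

The reported classes account for 1505 + 2 × 228 + 3 × 7 = 1982 recombination events, which equals the sum of the 977 maternal and 1005 paternal events in the skeleton bin maps.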

Knowing the genetic position of the 1982 recombination events captured in this mapping population (977 in SH + 1005 in RH), we can investigate chiasma interference. The 1982 recombination events are distributed over 3120 (= 130 × 24) chromatids. The expected distribution of recombination events follows a Poisson distribution with λ = 1982/3120 = 0.635, resulting in expected amounts of 1652, 1050, 334, 71, 11, and 1 chromatids with 0, 1, 2, 3, 4, and 5 recombination events, respectively. When comparing the observed and expected distributions, strong chiasma interference is evident. The single crossover chromatids are strongly overrepresented and the zero and multiple crossover chromatids are strongly underrepresented.
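The expected Poisson counts can be recomputed with a few lines. The function name is hypothetical; the inputs are the numbers of recombination events and chromatids given in the text.

```python
import math

def poisson_expected(n_events, n_chromatids, max_k=5):
    """Expected number of chromatids with 0..max_k recombination events
    if events were distributed at random (Poisson) over chromatids."""
    lam = n_events / n_chromatids
    return [
        n_chromatids * math.exp(-lam) * lam ** k / math.factorial(k)
        for k in range(max_k + 1)
    ]

expected = poisson_expected(1982, 3120)
observed = [1379, 1505, 228, 7]  # chromatids with 0, 1, 2, and 3 events
```

The expected values agree with the figures quoted above to within rounding. Comparison with the observed counts shows the excess of single-crossover chromatids and the deficit of chromatids with zero or multiple crossovers.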

The 228 chromatids revealing two recombination events are intriguing, because these may reveal further details on chiasma interference relative to the position of the centromere. Therefore we want to test whether or not chiasma interference is limited to chromosome arms and if the centromeres play a role in the process of chiasma interference. In other words, would a first chiasma more strongly inhibit the formation of a second chiasma on the same chromosome arm and hardly interfere with the formation of a chiasma on the other chromosome arm? The 228 chromatids with two recombination events were analyzed by counting the number of recombination events per chromosome arm, taking the most densely populated bin as the putative position of the centromere. Chromosomes III, X, and XI, which are without clear centromeres, were omitted from analysis, leaving 174 cases with two recombination events from the remaining chromosomes. The assumption of equal arm lengths would result in an expected ratio of 1:1 between cases with one recombination per arm and cases with both recombinations on one arm.

The 174 chromatids with two recombination events comprised 125 cases with one recombination event at either side of the putative centromere and 49 cases with two recombination events in one arm. These 49 cases, however, were mainly observed in the long arms of typically acrocentric or telocentric chromosomes (90%). The short arms of the acrocentric chromosomes and the metacentric chromosome V contributed only four cases of double recombination events within an arm (10%). We therefore conclude that the overrepresentation of cases with one recombination per arm is more likely a reflection of the difference in arm length than strong evidence for a maximum of one chiasma per arm. Five of the seven chromatids that displayed three recombination events belonged to chromosomes III and X, which lack a clear centromeric marker cluster.

Distribution of AFLP markers derived from different restriction enzyme combinations:

Markers have been generated from AFLP templates on the basis of three different enzyme combinations: EcoRI/MseI, SacI/MseI, and PstI/MseI. The genomic position of the markers is determined by the position of the six-cutter restriction site, whereas MseI only “trims” fragment length to a size range optimal for polyacrylamide gel electrophoresis. Hence, for each enzyme combination, the marker distribution on the genetic maps reflects the distribution of the six-cutter restriction sites. The effect of the selective nucleotides is considered negligible in view of the many primer combinations tested. The consequences of the AFLP enzyme combination on the position of the markers can be examined by using three different approaches. First, we test for underrepresentation of methylation-sensitive PstI markers in the putative centromeric cluster. Second, we compare the average distance between marker loci as a measure for marker clustering per enzyme combination. Third, we examine the effect of the number of C + G residues in the enzyme recognition site.

PstI markers in particular should have a nonrandom distribution, reflecting the methylation status of the genomic DNA. AFLP template from PstI-digested DNA should represent only hypomethylated gene-rich regions of the genome. The complexity of the PstI/MseI AFLP template is approximately fourfold lower compared to the EcoRI/MseI or SacI/MseI template, because equally complex AFLP fingerprints were obtained with only two selective nucleotides added to the core PstI primer (+2/+3 primer combinations). In contrast, EcoRI and SacI markers were generated with +3/+3 primer combinations. When comparing the fraction of PstI markers in the putative centromeric clusters relative to the fraction of PstI markers at other regions of the genome, the maternal and paternal linkage groups III, X, and XI, which lack a clear putative centromeric marker cluster, were excluded. A total of 2508 markers, one-third of the total number of mapped 1:1 segregating markers, were counted in 18 putative centromeric bins. These 18 bins contained only 209 (8.3%) PstI markers, whereas among all 7439 mapped 1:1 segregating markers, 1446 (19.2%) are PstI markers. This observation provides clear evidence for an underrepresentation of PstI markers in the putative centromeric marker clusters.
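The underrepresentation of PstI markers in the centromeric bins can be examined with a contingency table built from the counts given above. The Pearson χ2 test is an assumption added for illustration; the text does not state which test, if any, was applied. The function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Counts taken from the text.
centromeric_total, centromeric_pst = 2508, 209
mapped_total, mapped_pst = 7439, 1446

# Derived counts for the bins outside the putative centromeric clusters.
other_total = mapped_total - centromeric_total
other_pst = mapped_pst - centromeric_pst

fraction_centromeric = centromeric_pst / centromeric_total
fraction_other = other_pst / other_total

chi2_pst = chi_square_2x2(
    centromeric_pst, centromeric_total - centromeric_pst,
    other_pst, other_total - other_pst,
)
```

Under these assumptions, roughly 8% of the markers in the putative centromeric bins are PstI markers, against roughly a quarter of the markers in the remaining bins, and the χ2 statistic is far above any conventional significance threshold.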

Finally, the effect of four C + G residues in the recognition site of SacI and of two C + G residues in EcoRI is examined. Euchromatic regions differ from centromeric heterochromatic regions. Plant genomes have a strong underrepresentation of C + G residues (33–36% in dicots; Karlin and Mrázek 1997). Specifically, the repetitive DNA in the centromeric heterochromatin is more A + T-rich, and the gene-rich euchromatic regions are less biased. This could also affect the distribution of EcoRI vs. SacI markers. However, the representation of SacI markers (569) in the putative centromeric marker clusters (569/2508 = 22.7%) is not significantly different from the ratio observed for EcoRI markers. Therefore we conclude that SacI and EcoRI markers cluster equally in the putative centromeric bins.

An alternative way to study marker distribution is based on the distances between neighboring markers. To compensate for the unequal number of markers per linkage group and per enzyme combination, a random subset of 1024 markers was drawn for each of the EcoRI, PstI, and SacI enzyme combinations. As a control, a fourth subset was composed of 1024 randomly drawn bins, including empty bins. All 1000 intermediate distances within a subset were counted. The results shown in Figure 3 indicate that all markers, including PstI markers, are strongly clustered, as compared to the control. The degree of marker clustering, however, differs among the AFLP enzyme combinations: PstI markers show the lowest degree of clustering.

Figure 3.—

Frequency distribution of distances (recombination events) between neighboring marker loci to represent the marker clustering of EcoRI, SacI, and PstI AFLP markers as compared to random genetic sites. Equally sized subsets of 1024 markers were randomly chosen from each enzyme combination. Each distance between two neighboring markers from the subset was calculated, resulting in 1000 distances per enzyme combination. Within each enzyme combination, the frequency of each distance was calculated. The degree of clustering is reflected in the number of distances with the value of 0. The degree of clustering for random genetic positions was visualized by calculating the frequency of intermediate distances between 1024 randomly chosen bins.

The frequency of distances between neighboring markers larger than five recombination events is also higher than expected on the basis of a random distribution. This can be explained by the presence of stretches of empty bins caused by recombination hot spots or local fixation (lack of heterozygosity).
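The neighbor-distance comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name, the `(linkage_group, bin_index)` representation, and the toy data are all hypothetical. With one sorted list per linkage group, a subset of n markers spread over g groups yields n − g distances; with 24 parental linkage groups, 1024 markers would give 1000 distances, which matches the counts quoted above, although the text does not state this explicitly.

```python
import random
from collections import Counter, defaultdict


def neighbor_distance_counts(markers, subset_size, rng):
    """Draw a random subset of (linkage_group, bin_index) markers and count
    the distances, in bins, between neighboring markers within each group."""
    subset = rng.sample(markers, subset_size)
    by_group = defaultdict(list)
    for group, bin_index in subset:
        by_group[group].append(bin_index)
    counts = Counter()
    for positions in by_group.values():
        positions.sort()
        for left, right in zip(positions, positions[1:]):
            counts[right - left] += 1
    return counts


rng = random.Random(0)
groups = ["G1", "G2"]   # toy linkage groups
n_bins = 50             # toy number of bins per group

# Toy "clustered" markers: half of them share one bin per group,
# mimicking a centromeric cluster; the rest fall at random.
clustered = [(g, 25) for g in groups for _ in range(100)]
clustered += [(g, rng.randrange(n_bins)) for g in groups for _ in range(100)]

# Control: every bin counted once, whether or not it holds markers.
control = [(g, b) for g in groups for b in range(n_bins)]

clustered_counts = neighbor_distance_counts(clustered, 60, rng)
control_counts = neighbor_distance_counts(control, 60, rng)

print("zero distances, clustered markers:", clustered_counts[0])
print("zero distances, random bins:      ", control_counts[0])
```

The count of zero distances serves as the clustering measure, as in the figure legend; larger keys in the counter correspond to gaps between neighboring markers.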

In conclusion, we state that the occurrence of recombination is not random. This is explained by the occurrence of crossover interference, of local hot spots for recombination, and of cold spots for recombination, assuming a physically random distribution of EcoRI, SacI, and PstI recognition sites in the nucleotide sequence of the potato genome.

DISCUSSION

The saturation of the potato genome with marker loci:

With 381 AFLP primer combinations and a mapping population of 130 individuals, >10,000 markers were generated, of which 93% could be accurately assigned to a genetic position. Earlier maps of potato were already available (Bonierbale et al. 1988; Gebhardt et al. 1989; Tanksley et al. 1992; Jacobs et al. 1995), but these comprised only 100 to 500 markers. Even when compared with recently published high-density maps of papaya (Ma et al. 2004: 1501 loci), cotton (Rong et al. 2004: 3347 loci), and sorghum (Bowers et al. 2003: 2512 loci), this potato map is the densest map based on meiotic recombination yet obtained in any species.

Obviously, the level of DNA polymorphism in a species such as potato or papaya largely determines the data-collection effort that such a map requires. With respect to data analysis, however, it was noted that the available mapping software could not cope with such large data quantities. Linkage groups with >1000 markers cannot be handled with current software such as JoinMap, and even small numbers of errors caused severe marker-ordering problems. Therefore, new approaches were devised, resulting in software (RECORD; van Os et al. 2005a) that could produce accurate marker orders in a relatively short time. The necessity to remove scoring errors was recognized, and their removal was performed with SMOOTH (van Os et al. 2005b). The combination of these two programs made it possible to construct a reliable and robust framework map. The framework map consists of bins, which are positions on the genetic map with a unique segregation pattern that are separated by recombination events. Thanks to the high density of the markers, it was possible to determine the position of most of the recombination events on the map. Since a direct translation from bin to centimorgans can be made, a consecutive numbering of the bins is sufficient for indicating the positions of the genetic markers.
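The ordering criterion and the bin concept can be illustrated with a small sketch. It is not the RECORD program: it only scores a given marker order by its total number of recombination events and numbers the bins consecutively. The function names, the two-letter genotype coding, and the toy markers are hypothetical.

```python
def recombination_events(pattern_a: str, pattern_b: str) -> int:
    """Number of individuals whose genotype differs between two markers."""
    return sum(x != y for x, y in zip(pattern_a, pattern_b))


def total_recombinations(order) -> int:
    """Total recombination events implied by an ordered list of
    (marker_name, segregation_pattern) pairs."""
    patterns = [pattern for _, pattern in order]
    return sum(recombination_events(a, b) for a, b in zip(patterns, patterns[1:]))


def assign_bins(order) -> dict:
    """Number bins along the order; each recombination event advances the
    bin index by one, so skipped indices correspond to empty bins."""
    bins = {}
    index = 0
    previous = None
    for name, pattern in order:
        if previous is not None:
            index += recombination_events(previous, pattern)
        bins.setdefault(index, []).append(name)
        previous = pattern
    return bins


# Toy data: six individuals, each scored 'a' or 'b' at five markers.
markers = {
    "m1": "aaabbb",
    "m2": "aaabbb",
    "m3": "aabbbb",
    "m4": "abbbbb",
    "m5": "bbbbba",
}

good_order = [(name, markers[name]) for name in ["m1", "m2", "m3", "m4", "m5"]]
poor_order = [(name, markers[name]) for name in ["m1", "m3", "m2", "m4", "m5"]]

print(total_recombinations(good_order))  # fewer events: the more parsimonious order
print(total_recombinations(poor_order))
print(assign_bins(good_order))
```

In this toy example m1 and m2 share a bin, and the two recombination events between m4 and m5 leave one bin index unused, which corresponds to an empty bin in the sense used here.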

Exploitation of the ultradense map:

Position information of AFLP markers of this study has been exploited to identify linkage groups in other mapping studies. For example, Bradshaw et al. (2004), Bryan et al. (2004), and Caromel et al. (2005) could identify linkage groups by the presence of comigrating AFLPs shared with this diploid reference map. Recently, the mapping of the R10 and R11 genes for resistance to late blight (P. infestans) by Bradshaw et al. (2006) was entirely based on the mobility-based nomenclature of AFLP markers from this study (http://potatodbase.dpw.wau.nl/UHDdata.html). This clearly demonstrates the transferability of the AFLP markers from this study to other mapping efforts, as was outlined by Rouppe van der Voort et al. (1997a,b). They concluded that the locus specificity of AFLP markers resides in the unique mobility of a fragment in an AFLP fingerprint (comigration = colocalization = DNA sequence homology). Alternatively, AFLP markers are easily converted into simple single-locus PCR markers according to procedures outlined by Brugmans et al. (2003).

The utility of this study also depends on the level of marker saturation of the potato genome. In view of the average insert size of a BAC library and the estimated genome size of 840 Mb (Tempelaar et al. 1985; Bennett et al. 1997), the current number of marker loci should be sufficient for gene-cloning efforts via BAC landing, since the average distance between markers is estimated to be ∼84 kb. Proof of concept was recently obtained by the cloning of the late blight resistance gene R3a (Huang et al. 2004, 2005) and the construction of a BAC contig comprising the wart disease resistance gene Sen1-4 (Brugmans et al. 2006). Both studies demonstrated that marker spacing was in accordance with the expected physical distance.
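The spacing estimate follows from dividing the genome size by the number of loci. The sketch below repeats that arithmetic and adds a rough Poisson estimate of how often a clone would carry at least one marker. The insert size is a hypothetical value chosen for illustration, and the assumption that markers are spread uniformly over the physical genome is a simplification.

```python
import math

GENOME_SIZE_KB = 840 * 1000   # 840 Mb, as quoted in the text
N_MARKERS = 10_000            # approximate number of marker loci

spacing_kb = GENOME_SIZE_KB / N_MARKERS


def prob_clone_has_marker(insert_kb: float) -> float:
    """P(at least one marker on a clone), assuming markers are distributed
    uniformly at random along the physical genome (Poisson model)."""
    return 1.0 - math.exp(-insert_kb / spacing_kb)


HYPOTHETICAL_INSERT_KB = 100  # illustrative value only, not taken from the paper

print(f"average marker spacing: {spacing_kb:.0f} kb")
print(f"P(marker on a {HYPOTHETICAL_INSERT_KB} kb clone): "
      f"{prob_clone_has_marker(HYPOTHETICAL_INSERT_KB):.2f}")
```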

Nevertheless, the genetic structure of the R3a and Sen1-4 loci also showed remarkable differences. The R3a gene was mapped relative to two bins (1.5 cM), collectively containing 27 AFLP markers (marker dense). A 1748-offspring high-resolution map resulted in 35 sub-bins of 0.06 cM. However, the recombination events were unevenly distributed, leaving 10 AFLP and two CAPS markers cosegregating with resistance and stretches of sub-bins without markers. In contrast, the Sen1-4 locus was roughly mapped relative to six bins (4.5 cM) with only 9 AFLP markers (marker poor). These 9 AFLPs landed on overlapping BAC clones, resulting in a single ∼1-Mb contig. These two examples suggest that marker-poor and empty bins indicate a favorably low megabase-to-centimorgan ratio, whereas marker-rich bins indicate a high megabase-to-centimorgan ratio. Therefore, undersaturated regions on the map do not necessarily present a problem for map-based cloning efforts.
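The bin and sub-bin sizes quoted here follow from the population size: one recombination event among N offspring corresponds to a recombination fraction of 1/N, or roughly 100/N cM over such short distances. A minimal sketch, with a hypothetical function name:

```python
def bin_size_cm(n_offspring: int) -> float:
    """Approximate map distance, in cM, of a single recombination event
    observed among n_offspring (recombination fraction taken as distance)."""
    return 100.0 / n_offspring


print(round(bin_size_cm(130), 2))    # mapping population of 130 individuals
print(round(bin_size_cm(1748), 3))   # high-resolution population of 1748 offspring
```

These values are in line with the bin size of about 0.8 cM and the sub-bin size of 0.06 cM given in the text.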

Empty bins and oversaturated bins may indicate alternating recombination hot spots and cold spots on the genome. Consecutive empty bins could also indicate a local absence of marker polymorphism due to fixation of one allele. On both SH08 and RH08 a long stretch of up to 19 empty bins could represent an example of either. At this moment, almost half of the bins in the framework map remain empty (44%). A random (Poisson) distribution of 10,365 markers over 1982 bins would result in an expected number of only 13 empty bins, which is in sharp contrast to the observed nonrandom placement of markers in bins. Eventually, the construction of a genetically anchored physical map should provide more insight into the reason for empty bins.
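The Poisson expectation can be reproduced in outline as follows. This is a sketch under the simplest reading of the text, in which all markers fall independently and with equal probability into any bin; the exact inputs and rounding behind the published figure are not given, so the value returned here should only be compared with it in order of magnitude.

```python
import math


def expected_empty_bins(n_markers: int, n_bins: int) -> float:
    """Expected number of empty bins when n_markers fall at random over
    n_bins; the count per bin is approximated as Poisson(n_markers / n_bins)."""
    mean_per_bin = n_markers / n_bins
    return n_bins * math.exp(-mean_per_bin)


n_markers, n_bins = 10_365, 1982      # figures quoted in the text
observed_empty = 0.44 * n_bins        # 44% of bins are reported empty

print(f"expected empty bins under Poisson: {expected_empty_bins(n_markers, n_bins):.1f}")
print(f"observed empty bins (44%):         {observed_empty:.0f}")
```

Under these assumptions the expectation is on the order of ten bins, against several hundred observed, which is the contrast the paragraph draws.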

Marker distribution:

Three different enzyme combinations were chosen to generate markers. In a pilot study, a maternal genetic map was produced with 19 EcoRI/MseI primer combinations. That pilot study showed that, with this single enzyme combination, a considerable portion of the genome remained unpopulated with markers. Therefore, it was decided to generate AFLP markers from DNA template prepared with three different restriction enzymes: EcoRI, SacI, and PstI. Nevertheless, both the EcoRI markers, whose distribution is directed by an A + T-rich recognition site, and the SacI markers, with a C + G-rich recognition site, showed strong clustering, which involved approximately one-third of all the markers. For mapping purposes, a more dispersed genetic distribution is preferred, but for applications such as the genetic anchoring of a physical map, this is probably not a drawback. For linkage mapping of trait loci, PstI markers are recommended, because these are biased toward nonmethylated regions. There is, however, a drawback with PstI markers: in almost every fingerprint, several bands that were absent in both parents were observed in the progeny. These putative methylation polymorphisms will increase the number of singletons. Furthermore, PstI markers should be used with caution for BAC landing, because extra bands will appear owing to the absence of methylation in bacteria.

The highly similar distribution of EcoRI and SacI markers demonstrates that the effect of clustering due to unequal levels of recombination outweighed the effect of differences in A + T composition in euchromatic vs. heterochromatic regions. Why are these clusters so sharply confined to a single bin position? This seems to contradict multiple publications on AFLP maps, where clustering is obvious but extended over a wider region. In our view, the interaction between (1) the mapping algorithm and (2) the quality of the data set explains the presence of these sharp marker clusters. First, it was demonstrated that singletons have little effect on the performance of the mapping algorithm of RECORD, but methods that use the distance between marker pairs cannot avoid inflation of map length (van Os et al. 2005a). Second, the rigorous removal of singletons will reduce the distance between closely linked markers. Usually, distances between markers are the sum of the distances caused by recombination events and the distances caused by singletons. Modest numbers of singletons (1–2%) overshadow the effect of suppressed recombination and will flatten the marker cluster.
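The way singletons flatten a marker cluster can be made concrete with a toy example. The sketch below is not the SMOOTH program; it uses a deliberately naive rule (a data point that differs from two identical flanking points is treated as a singleton), and the function names and genotype strings are hypothetical.

```python
def apparent_recombinations(phase: str) -> int:
    """Number of phase switches along one individual's ordered marker scores."""
    return sum(x != y for x, y in zip(phase, phase[1:]))


def remove_singletons(phase: str) -> str:
    """Naive cleaning: replace any data point that differs from both of its
    neighbors when those neighbors agree with each other."""
    cleaned = list(phase)
    for i in range(1, len(phase) - 1):
        if phase[i - 1] == phase[i + 1] != phase[i]:
            cleaned[i] = phase[i - 1]
    return "".join(cleaned)


true_phase = "aaaaabbbbb"     # one genuine recombination event
scored_phase = "aabaabbbbb"   # same individual with one scoring error

print(apparent_recombinations(true_phase))
print(apparent_recombinations(scored_phase))
print(apparent_recombinations(remove_singletons(scored_phase)))
```

Each singleton adds two apparent recombination events, so within a cluster where true recombination is close to zero even a small error rate can dominate the distances computed between marker pairs.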

Centromeric suppression of recombination is the obvious explanation for marker clustering. First, the clear congruence of the maternal and paternal homologous linkage groups excludes other adventitious heterochromatic regions as the cause of marker clustering. Second, the relative position of the clusters coincides with expectations based on cytological observations (Tanksley et al. 1992; Dong et al. 2000). For example, chromosome II of tomato and potato is telocentric, the short arm being reduced to the nucleolar organizer. Chromosome VI of tomato and potato is known for its very small short arm, and chromosome V is metacentric. In view of the sharp demarcation of the marker-dense clusters, we conclude that the centromeric position is mapped with an accuracy of one bin: 0.8 cM. The centromeric positions of the paternal linkage groups have been confirmed using half-tetrad analysis in a 4x × 2x mapping population (Mendiburu and Peloquin 1979; T.-H. Park, R. Hutten and H. van Eck, unpublished results).

Analysis of meiotic recombination and chiasma interference:

Recently, Hillers and Villeneuve (2003) investigated the control mechanisms of meiotic crossing over in Caenorhabditis elegans, which averages only one crossover per chromosome pair per meiosis. A tendency was revealed to restrict the number of crossovers, irrespective of the physical length. Pairs of fusion chromosomes composed of two or even three whole chromosomes underwent only a single crossover in the majority of meioses. This observation parallels the work of Gerats et al. (1985), who describe a relationship between the length of the deletion in the short arm of Petunia chromosome VI and the recombination frequency between markers in the long arm. The recombination frequency increased with an increasing length of the deletion. Both cases in C. elegans and Petunia demonstrate that the occurrence of a preset number of recombination events is highly regulated and even two recombination events are considered “a crowd” (Van Veen and Hawley 2003). In this study, marker saturation allowed the detection of every recombination event. The fraction of chromatids with more than two recombination events was only 1.6% or 49 cases. Because such chromatids nevertheless occur, in our view there is no reason to assume absolute chiasma interference.
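Tallying chromatids by their number of recombination events is straightforward once the bins are ordered. The sketch below uses hypothetical toy chromatids and a hypothetical function name. The denominator of 130 × 24 chromatids (130 offspring, 12 maternal plus 12 paternal linkage groups) is an assumption used to check that 49 cases correspond to about 1.6%.

```python
from collections import Counter


def crossover_tally(chromatids) -> Counter:
    """Count chromatids by the number of phase switches along the ordered bins."""
    return Counter(
        sum(x != y for x, y in zip(phase, phase[1:])) for phase in chromatids
    )


# Toy chromatids: one string of parental phases ('a' or 'b') per chromatid.
toy_chromatids = ["aaaaaa", "aaabbb", "bbbbbb", "aabbaa", "abbbab"]
tally = crossover_tally(toy_chromatids)
print(dict(tally))

# Arithmetic check on the quoted fraction, under the stated assumption.
n_chromatids = 130 * 24
fraction = 49 / n_chromatids
print(f"{fraction:.1%}")
```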

In this study, singletons have not been interpreted as indicative of double recombination events. Most likely, they are caused by inaccurate scoring, but some data points can also be caused by gene conversions, mutations, and other biological phenomena (van Os et al. 2005b). Rong et al. (2004) have chosen an alternative interpretation in a similar situation. They have concluded that negative crossover interference could explain the unexpectedly abundant double recombinants.

Toward a sequence-ready physical map:

Currently, a physical map of the potato genome is being constructed from the paternal clone RH using EcoRI + 0/MseI + 0 fingerprints of individual BAC clones (De Boer et al. 2004). The anchoring of several thousand BAC contigs to this genetic map will be achieved by application of AFLP on 0.4 genome equivalent pools of BACs. AFLP loci that have been mapped are easily recognized in fingerprints of 0.4 genome equivalent BAC pools. Deconvolution of the pooling design allows the identification of the BAC clones and the contig, which carries the mapped AFLP locus. A genetically anchored physical map will culminate in a sequence-ready minimal tiling path of BAC contigs of specific chromosomal regions. Within the International Solanaceae Genome Project for comparative genome studies (http://sgn.cornell.edu/solanaceae-project/), as well as within the potato genome sequencing consortium (http://www.potatogenome.net), this ultradense linkage map and the anticipated genetically anchored physical map will have a valuable role.

Acknowledgments

This work was carried out under the European Union FAIR (Agriculture and Fisheries) program grant FAIR5-PL97-3565.

Footnotes

  • We commemorate our coauthor Françoise Rousselle-Bourgeois, who passed away on October 31, 2003.

  • 1 Present address: Department of Plant Breeding, Cornell University, Ithaca, NY 14853.

  • 2 Present address: Keygene N.V., 6700 AE Wageningen, The Netherlands.

  • 3 Present address: Plant Biotechnology Centre, TEAGASC, Carlow, Ireland.

  • Communicating editor: D. Weigel

  • Received January 16, 2006.
  • Accepted March 30, 2006.

