We have leveraged the reference sequence of a boxer to construct the first complete linkage map for the domestic dog. The new map improves access to the dog's unique biology, from human disease counterparts to fascinating evolutionary adaptations. The map was constructed with ∼3000 microsatellite markers developed from the reference sequence. Familial resources afforded 450 mostly phase-known meioses for map assembly. The genotype data supported a framework map with ∼1500 loci. An additional ∼1500 markers served as map validators, contributing modestly to estimates of recombination rate but supporting the framework content. Data from ∼22,000 SNPs informing on a subset of meioses supported map integrity. The sex-averaged map extended 21 M and revealed marked region- and sex-specific differences in recombination rate. The map will enable empiric coverage estimates and multipoint linkage analysis. Knowledge of the variation in recombination rate will also inform on genomewide patterns of linkage disequilibrium (LD), and thus benefit association, selective sweep, and phylogenetic mapping approaches. The computational and wet-bench strategies can be applied to the reference genome of any nonmodel organism to assemble a de novo linkage map.
GENOMICS has broadened the exploration of natural variation, helping to bring the naturalist perspective back into the fold of modern biology. The dog is an example of a nonmodel organism with much to offer in the way of natural phenotypic diversity. Since Darwin (1883), the domesticated dog has been recognized as a model of mammalian evolution, with striking variation in form and function. Much of the phenotypic diversity in the dog appears to have evolved rapidly, accelerated by the intense pressure of artificial selection and the force of genetic drift through population bottlenecks. Selective breeding over centuries has served as the preliminary experiment in canine genetics: the mutants have been screened and the strains have been established. The architectures of domestic traits are likely enriched for genes of large effect, given that artificial selection can act only on discernible phenotypic differences (Andersson 2001; Neff and Rine 2006). It follows that the diversity in size, shape, and behavior in the dog is genetically tractable. Moreover, this “adaptive” variation can be couched in the developmental context of an extant progenitor, the wolf (Vilà et al. 1997), to illuminate the morphological and behavioral antecedents from which breed-defining traits have been derived. In addition to propagating purposefully bred traits, managed breeding has had unintended consequences. The dog suffers many of the same diseases as man, from Mendelian defects (e.g., deafness) to complex diseases (e.g., cancer susceptibilities) (Ostrander et al. 1993). These health issues follow breed predilection, implying that ancestral mutations have been trapped within the now-closed gene pools, presumably from founder effects at the inception of breed registries.
Variation in the dog, both adaptive and maladaptive, can be understood genetically through several phenotype-driven approaches (reviewed in Karlsson and Lindblad-Toh 2008). Linkage analysis of Mendelian traits and monogenic diseases is particularly powerful owing to large sibships and pedigree ascertainment through breeders. Genomewide association mapping leverages the strengths of population structure and is driven by a breed-based linkage disequilibrium (LD) that is 20- to 100-fold more extensive than that observed in human populations (Sutter et al. 2004; Lindblad-Toh et al. 2005). Less conventional methods for detecting genotype–phenotype correlations, such as in silico association (Grupe et al. 2001; Jones et al. 2008) and selective sweep (Pollinger et al. 2005) mapping, also hold promise in the dog, especially for understanding hallmark traits that have been bred to fixation, thereby presenting a “segregation problem” to geneticists.
Each of these approaches to genetic mapping is more powerfully applied with a thorough understanding of recombination rate. This includes a relevance to LD mapping that may be particularly important in canine genetics. In man, LD tends to extend over a kilobase scale, for which genetic and physical distances are reasonably correlated. In contrast, LD in dog breeds extends to a megabase scale (Karlsson et al. 2007); understanding local variation in recombination rate (cM/Mb) could facilitate interpretations of LD.
Given the strengths of canine genetics, a principal unmet need is a comprehensive description of genomewide variation in recombination rate. The current linkage map is built upon the success of previous efforts (Lingaas et al. 1997; Mellersh et al. 1997; Dolf 1999; Langston et al. 1999; Neff et al. 1999); however, those maps were constructed with <600 markers and lack the coverage, precision, and accuracy needed to optimize linkage analysis and subserve LD mapping (Dukes-McEwan and Jackson 2002). A comprehensive map would enhance genetic analysis in several ways. It would allow assessment of empiric coverage during linkage scans, provide essential information for multipoint and fine-resolution mapping, and serve as a foundation for interpreting LD across the genome. The map can also serve as a comparative resource for continuing to improve our understanding of the basic processes governing meiotic recombination and faithful chromosome segregation.
Toward this end, we have leveraged the high-quality draft sequence (7.6×) of the dog genome (Lindblad-Toh et al. 2005) to assemble a de novo linkage map. The reference sequence afforded an abundance of molecular polymorphisms, which were computationally “cloned” for even spacing. The physical scaffold aided map assembly by providing a first approximation of marker order that was ultimately tested with genetic data. Several measures of map integrity indicated that our estimates of recombination rate were both accurate and reasonably precise. Thus a comprehensive map will join a list of important canine resources, from high-resolution radiation hybrid maps (Vignaux et al. 1999; Breen et al. 2001; Guyon et al. 2003) to the high-quality draft sequence, annotated with 2.4 million SNPs. We discuss the biological implications of this new canine map and provide online resources to facilitate its application.
MATERIALS AND METHODS
Samples and markers:
The familial resources used in this study are summarized in Table 1. The Cornell families (CF) were a genetically admixed research colony inclusive of six breed backgrounds (Neff et al. 1999). The F2 family was developed previously as an experimental intercross of a border collie (sire) and a Newfoundland (dam) (Neff et al. 1999). The extended Silken Windhound family was a privately managed pedigree provided for this study. A panel of 36 purebred dogs was assembled and surveyed to test marker performance across breed populations (supporting information, Table S1). DNA was prepared from blood, tissue, or buccal swab samples using previously described methods (Bell et al. 1981; Oberbauer et al. 2003).
Microsatellite loci were computationally mined from canFam1 and canFam2 (Lindblad-Toh et al. 2005) by exploiting the RepeatMasker track of the University of California Santa Cruz (UCSC) Genome Browser (www.genome.ucsc.edu). Perfect (CA)n microsatellites (n > 12) were chosen for even spacing across the genome (∼750-kb steps). PCR primers were designed with Primer3 using standardized parameters (Table S2). Candidate oligo sequences were screened against the genome sequence using BLAST to ensure unique primer binding sites. Markers with a primer sequence having ≥95% identity to a secondary annealing site were replaced by a suitable adjacent locus. Table S3 lists the autosomal molecular markers developed in this study. An electronic file with the full marker set is also available (File S2).
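The selection logic described above can be sketched in a few lines. This is an illustration only: the study mined repeats from the RepeatMasker track rather than by scanning raw sequence, and the function names `find_ca_repeats` and `pick_evenly_spaced`, the forward-strand-only pattern, and the greedy nearest-candidate rule are assumptions introduced here.

```python
import re

# Perfect (CA)n runs with n > 12, i.e. at least 13 tandem CA units.
# Only the forward-strand pattern is searched in this sketch.
CA_REPEAT = re.compile(r"(?:CA){13,}")

def find_ca_repeats(sequence):
    """Return (start, n_units) for each perfect (CA)n run with n > 12."""
    return [(m.start(), (m.end() - m.start()) // 2)
            for m in CA_REPEAT.finditer(sequence.upper())]

def pick_evenly_spaced(positions, step=750_000):
    """For each multiple of `step`, keep the candidate position nearest to it.

    Hypothetical greedy rule; the original selection procedure is not
    specified beyond the target spacing of about 750 kb.
    """
    if not positions:
        return []
    positions = sorted(positions)
    chosen = []
    target = 0
    while target <= positions[-1]:
        nearest = min(positions, key=lambda p: abs(p - target))
        if nearest not in chosen:
            chosen.append(nearest)
        target += step
    return chosen
```

A run of 12 CA units is rejected by the pattern, whereas a run of 13 or more is reported with its start coordinate and unit count.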
Microsatellite markers were multiplexed in sets of 3–9 loci on the basis of differential dye labels and expected differences in PCR product size. Marker loci were amplified according to a published method that uses dye-specific M13 oligonucleotides to incorporate fluorescent label during the PCR (Oetting et al. 1995). In addition, a leader sequence (GTTTCTT) was appended to the 5′ end of a subset of reverse primers to catalyze the nontemplated enzymatic addition of a nucleotide to the 3′ end of the labeled product strand (Brownstein et al. 1996). Final PCR concentrations for forward (40-mer) and reverse (27-mer) primers were 0.18 and 1.8 μM, respectively. M13 primers for labeling product were included in the PCR at a final concentration of 0.36 μM.
PCRs were performed in 17-μl reaction volumes with 50 ng genomic DNA template or 2 μl buccal swab extract, and final reaction conditions of 2.5 mM MgCl2, 1× buffer, 200 μM each dNTP (GenScript), and 0.15 units of Taq DNA polymerase (ABGene). Reactions were covered with inert Chill-Out (BioRad) to permit hot-start PCR. Thermocycling began with an initial denaturation step of 93° for 3 min and ended with a final extension step of 72° for 20 min. Between these steps, markers were amplified for 7 cycles of 93° for 20 sec, 65° for 30 sec, 72° for 2 min; 5 cycles of 93° for 20 sec, 58° for 30 sec, 72° for 2 min; and 25 cycles of 93° for 20 sec, 55° for 30 sec, 72° for 2 min.
An aliquot of PCR product (0.4 μl) was combined with 0.5 μl of GeneScan 500 LIZ (Applied Biosystems) and 10 μl Hi-Di formamide and denatured for 3 min at 95°. An aliquot (1 μl) was separated by electrophoresis and detected by fluorescence using ABI 3730 capillary instruments. The command line option in GeneMapper 4.0 was used to automatically preprocess ABI 3730 files remotely. Genotypes were scored with allele assignments that were consistent across families. All genotypes were manually curated.
Pedigree members of the F2 intercross family were genotyped with a commercially available Infinium CanineSNP20 BeadChip according to the manufacturer's instructions. The canine BeadChip assays 22,362 unique SNPs distributed across the genome with ∼110 kb spacing. Data were collected using an Illumina BeadStation scanner and dedicated data collection software. Genotypes were generated with BeadStudio.
Fluorescence in situ hybridization:
BAC clones corresponding to genomic regions were obtained from CHORI. BAC-derived DNA was labeled with Spectrum Green or Spectrum Orange using nick translation. Paired sets of chromosome-specific probes (300 ng each) were differentially labeled and cohybridized to canine metaphase spreads prepared from cultured canine lymphocytes. Fluorescence in situ hybridization (FISH) preparations were washed (0.4× SSC, 0.3% NP-40) and counterstained with DAPI. Slides were analyzed and images were captured using a Genus Cytogenetics digital microscopy station.
Microsatellite and SNP data were inspected for Mendelian inheritance using PedCheck (O'Connell and Weeks 1998). When a single non-Mendelian inheritance was detected for a marker within a family, all potentially errant genotypes among the siblings or parent (i.e., PedCheck output) were removed from the data set. When multiple genotypes for a locus were inconsistent with inheritance in a family, all of the genotypes for the marker for that family were removed. If the number of errant genotypes detected by inheritance exceeded 10% for a given locus across families, the locus was discarded.
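The 10% discard rule can be illustrated with a minimal sketch. PedCheck performs a far more complete pedigree-wide analysis; the trio-level check below, the function names, and the representation of a genotype as a pair of allele sizes are simplifying assumptions made here for illustration.

```python
def mendelian_ok(sire, dam, child):
    """True if the child's two alleles can be drawn one from each parent."""
    a, b = child
    return (a in sire and b in dam) or (b in sire and a in dam)

def locus_error_rate(trios):
    """Fraction of (sire, dam, child) genotype trios that violate inheritance."""
    bad = sum(not mendelian_ok(s, d, c) for s, d, c in trios)
    return bad / len(trios)

def discard_locus(trios, threshold=0.10):
    """Discard a locus when its errant fraction exceeds the threshold."""
    return locus_error_rate(trios) > threshold
```

For example, a child typed (150, 150) from parents typed (150, 152) and (154, 156) fails the check, because no 150 allele is available from the dam.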
Map assembly algorithm:
Markers were positioned according to their respective sequence coordinates in the canFam2 reference genome (www.ncbi.nlm.nih.gov). Correct chromosome assignment was assessed by testing all pairwise marker combinations for linkage using the twopoint option of CRIMAP (Green et al. 1990). Markers showing stronger linkage to at least two loci on a different chromosome were reassigned to that chromosome. The all option of CRIMAP was used to scrutinize the order of markers along each chromosome. Every marker was iteratively removed and reassigned solely on the basis of genetic data. Markers that could be placed elsewhere on the chromosome with equivalent statistical support (within one LOD) were deemed either error prone, poorly informative, or incorrectly positioned on the sequence build. These markers were not included in the framework map. The flips option of CRIMAP was used to further test local marker order. Maximum likelihoods were calculated for each permuted order in a three-marker sliding window along the chromosome. The order inferred from physical coordinates was maintained only if it was within one LOD unit of the highest likelihood obtained.
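The one-LOD decision rule applied to the permuted orders can be sketched as follows. The likelihoods themselves come from CRIMAP; this sketch only enumerates the candidate orders for one window and applies the retention rule to log10 likelihoods that are assumed to be supplied. The function names and the dictionary layout are hypothetical.

```python
from itertools import permutations

def window_orders(markers, start, size=3):
    """All marker orders obtained by permuting one window of `size` markers."""
    head = tuple(markers[:start])
    tail = tuple(markers[start + size:])
    window = markers[start:start + size]
    return [head + p + tail for p in permutations(window)]

def keep_physical_order(log10_likelihoods, physical_order, max_drop=1.0):
    """Retain the physical order if it lies within `max_drop` LOD units of the best.

    `log10_likelihoods` maps each tested order (a tuple) to its log10 likelihood.
    """
    best = max(log10_likelihoods.values())
    return best - log10_likelihoods[physical_order] <= max_drop
```

A three-marker window yields six candidate orders, and the physical order is kept unless an alternative is favored by more than one LOD unit.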
The effects of positive interference render tight double recombinants biologically unlikely. Double crossover events (DCO), which usually involve a single marker out of phase with the surrounding markers, are most likely indicative of genotyping errors, mutations, gene conversions, sequence mis-assemblies, or segregating chromosome polymorphisms. The chrompic option of CRIMAP was used to identify multiple crossovers over short genetic distances. The output was graphed with a custom tool that visually emphasized interruptions in the parental origin of phased chromosomes (kodachrompic; A. K. Wong, unpublished data). A double crossover event isolated to a single offspring was interpreted as a genotyping error of that individual. DCOs shared among siblings suggested a genotype error in a parent. The distribution of alleles in the family was scrutinized to identify the most probable errant genotype, which was removed. Terminal markers, which could not be evaluated in the same way, were scored twice by independent readers to improve the reliability of the genotype data.
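The screening step can be illustrated with a short sketch that flags single markers out of phase with both neighbors. It assumes the chrompic output has already been reduced to one string of grandparental-origin labels per offspring, covering informative markers only; that encoding and the function names are assumptions introduced here, and the sketch is not the kodachrompic tool.

```python
def single_marker_switches(phase):
    """Indices where one marker is out of phase with both of its neighbors.

    `phase` holds grandparental-origin labels (e.g. '0'/'1') for consecutive
    informative markers on one transmitted chromosome. Terminal markers are
    not evaluated because they have only one neighbor.
    """
    return [i for i in range(1, len(phase) - 1)
            if phase[i - 1] == phase[i + 1] != phase[i]]

def classify_flags(family_phases):
    """Map each flagged marker index to the likely source of the error.

    A flag seen in one sibling points to that offspring's genotype; a flag
    shared among siblings points to a parental genotype.
    """
    counts = {}
    for phase in family_phases.values():
        for i in single_marker_switches(phase):
            counts[i] = counts.get(i, 0) + 1
    return {i: ("offspring" if n == 1 else "parent")
            for i, n in counts.items()}
```

A phase string such as "1110111" is flagged at index 3, whereas "1110001" is not, because the switch persists across several markers as expected for a genuine crossover.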
Map expansion caused by undetected errors was estimated by iteratively (1) removing a marker, (2) rebuilding the local map, (3) recalculating intermarker recombination fractions, and (4) comparing the original and new recombination fractions. Markers that led to map inflation (>2 cM) were excluded. Sex-averaged, male-specific, and female-specific autosomal genetic distances, as well as the female-specific distances of the X chromosome, were calculated with the fixed option of CRIMAP using the Kosambi map function.
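For reference, the Kosambi map function converts a recombination fraction r into a map distance of 25 ln[(1 + 2r)/(1 − 2r)] cM. The sketch below implements that conversion, its inverse, and the 2-cM inflation test; CRIMAP performs the actual multipoint estimation, and the helper `inflates_map` with its argument names is a hypothetical illustration of the comparison in steps (3) and (4).

```python
import math

def kosambi_cm(r):
    """Kosambi map distance in cM for a recombination fraction 0 <= r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))

def kosambi_r(cm):
    """Inverse Kosambi function: recombination fraction for a distance in cM."""
    return 0.5 * math.tanh(2 * cm / 100.0)

def inflates_map(length_with_cm, length_without_cm, threshold=2.0):
    """True if removing a marker shortens the local map by more than `threshold` cM."""
    return (length_with_cm - length_without_cm) > threshold
```

For small r the function is nearly linear (r = 0.10 corresponds to roughly 10.1 cM), and it diverges as r approaches 0.5.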
Quantifying interindividual variation:
The number of recombination events in parental meioses was inferred from offspring genotypes using the chrompic option in CRIMAP. Recombination events localized to chromosome termini (first and last three markers) were discounted; distal exchanges are challenging to distinguish from genotyping errors. The mean recombination rate in each mother (or father) was estimated as the average number of recombination events in the haploid gametes transmitted to her (or his) progeny. One-way ANOVA was applied to test for significant differences among mothers and fathers. The recombination count in a given meiotic product was used as the response variable, and parental identity was treated as the factor. The significance of resultant F-statistics was empirically determined using the quantile position of the realized statistic along the distribution of 10^4 F-values derived by permutation.
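A permutation test of this kind can be sketched with the standard library alone. The sketch assumes that each parent is represented by a list of per-gamete crossover counts and that parental labels are shuffled across gametes; the add-one correction in the returned P-value and the function names are choices made here, not details reported for the original analysis.

```python
import random

def f_statistic(groups):
    """One-way ANOVA F-statistic for a list of groups of crossover counts."""
    values = [v for g in groups for v in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def permutation_p(groups, n_perm=10_000, seed=0):
    """Empirical P-value: share of label permutations with F >= observed F."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

The sketch requires nonzero within-group variance in every permutation; real crossover counts would normally satisfy this, but a guard would be needed for degenerate data.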
Markers on the framework map were binned into nonoverlapping 5- and 10-Mb windows across the genome. For each window harboring a sufficient number of markers (n ≥ 2), the regional recombination rate was estimated by the slope of a simple linear regression of genetic map position in centimorgan units on physical genome position in megabases. Imposing a harsher criterion of n ≥ 5 markers per window had no significant impact on our findings, but did vastly diminish the number of regions that could be included in the analysis, particularly at the 5-Mb scale. Results reported in the main text derive from the more lax criterion.
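The windowed estimate can be written compactly as a least-squares slope per bin. The following is a minimal sketch for a single chromosome, assuming markers are supplied as (physical position in Mb, genetic position in cM) pairs; the function names and the dictionary keyed by window start are illustrative choices.

```python
def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def window_rates(markers, window_mb=5.0, min_markers=2):
    """Recombination rate (cM/Mb) per nonoverlapping window on one chromosome.

    `markers` is an iterable of (physical_mb, genetic_cm) pairs. Windows with
    fewer than `min_markers` markers are skipped. Returns {window_start_mb: rate}.
    """
    bins = {}
    for mb, cm in markers:
        bins.setdefault(int(mb // window_mb), []).append((mb, cm))
    rates = {}
    for index, points in sorted(bins.items()):
        xs = [p[0] for p in points]
        if len(points) >= min_markers and len(set(xs)) > 1:
            rates[index * window_mb] = slope(xs, [p[1] for p in points])
    return rates
```

Setting `min_markers=5` reproduces the stricter criterion, at the cost of dropping sparsely covered windows.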
The number of repetitive elements (LINEs, SINEs, LTRs, poly-A repeats, poly-C repeats, poly-G repeats, poly-T repeats, simple repeats, AT-rich, CA-rich, satellites, DNA repeats, and low-complexity repeats), base composition measures (GC%, number of CpG dinucleotides, number of CpG islands), incidence of the CCNCCNTNNCCNC putatively recombinogenic motif (Myers et al. 2008), the number of genes (from RefSeq and the N-SCAN gene prediction track implemented in the UCSC Genome Browser), and the distance to the telomere as a fraction of total chromosome length were computed for each window using the most recent genome assembly (canFam2). Spearman rank correlation tests between recombination rate and individual sequence variables were performed in the R environment for statistical computing.
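Two of these computations lend themselves to a brief sketch: counting occurrences of the degenerate motif and computing a Spearman rank correlation. The original analysis was run in R; the Python functions below are an illustrative equivalent, and the decisions to count overlapping matches on both strands and to assign average ranks to ties are assumptions made here.

```python
import re

MOTIF = "CCNCCNTNNCCNC"
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def motif_regex(motif):
    """Lookahead regex so that overlapping occurrences are all counted."""
    return re.compile("(?=" + motif.replace("N", "[ACGT]") + ")")

def count_motif(sequence, motif=MOTIF):
    """Count motif occurrences on both strands of `sequence`."""
    seq = sequence.upper()
    reverse_complement = motif.translate(COMPLEMENT)[::-1]
    return sum(len(motif_regex(m).findall(seq))
               for m in (motif, reverse_complement))

def ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

In use, `count_motif` would be applied to the sequence of each window and the resulting counts passed with the window recombination rates to `spearman_rho`.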
RESULTS
Assembling markers and meioses:
Perfect CA-repeat microsatellites were computationally selected from the reference sequence (Lindblad-Toh et al. 2005) to achieve submegabase and possibly subcentimorgan spacing on a linkage map. Because markers were newly developed and untested, the collection (n = 3349 loci) was preliminarily screened against a panel of 36 purebred dogs (Table S1). Results indicated that the markers were amenable to high-throughput genotyping and were sufficiently informative for map assembly (HET = 0.46 ± 0.16; Table S3 and File S1).
The markers were typed on familial resources comprising 450 mostly phase-known meioses (Table 1). The families represented an efficient meiotic mapping resource, with 281 dogs from three pedigrees totaling 24 sibships and an average of eight offspring per cross. Genotyping was performed in six phases, with each phase including markers to span the genome in ∼4-Mb increments. This approach allowed a more frequent quality control of genotyping, as well as a new map assembly following each phase. Intermediate map builds were posted online during the course of the study (http://www.vgl.ucdavis.edu/dogmap/). Completion of the phases yielded a million genotypes for assembly of a final map. The meiotic contributions of families are summarized in Table 1. Summary statistics are given in Table S3.
Clarifying marker order:
For map assembly, markers were ordered along chromosomes according to their physical coordinates in canFam2. Chromosome assignments and relative order were tested against the genetic data. Several loci were discrepant (n = 17), showing significantly stronger linkage to two or more loci on a different chromosome than the one to which they had been assigned in canFam2. Genetic data were given preference and discrepant markers were reassigned (Table S4). The vast majority of marker positions were concordant for both types of positional information (99.4%), indicating the canFam2 build was of high quality. This implied there was considerable value in using the physical scaffold to guide and clarify marker order. It should be noted that the power to detect macroassembly errors was presumably greater than the power to discern microassembly errors. Fine-scale errors in the sequence build may have been undetected in our analyses.
Estimating genetic distances:
Given an established marker order, genetic distances for intermarker intervals were estimated for male-specific, female-specific, and sex-averaged maps. Sex-averaged distances were tested for map inflation, which can result from cryptic errors producing artifactual crossovers. Each locus was iteratively removed from the map, and relevant intermarker distances were recalculated. Approximately 1% of markers (n = 46) were suspected of having undetected genotyping errors, as evidenced by a significant reduction in map length (>2 cM) upon their iterative removal. These potentially errant loci were excluded from further assembly.
Of autosomal markers genotyped (n = 3549), 1469 qualified as anchor loci on the basis of an ability to “sample” a large number of meioses (i.e., >100 informative meioses) with concomitant statistical support for linkage (i.e., pairwise LOD > 11). These markers formed the basis of a framework map. An additional 1606 markers were considered “map validators” (i.e., a pairwise LOD > 3); these loci contributed modestly to estimates of recombination rate, but supported the mapping content of anchor loci (e.g., by substantiating phase transitions of observed crossovers). Table 2 lists summary statistics for the framework and comprehensive maps.
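The two-tier classification reduces to a pair of threshold tests, sketched below. The thresholds are those stated above; the function name, the argument names, and the "unplaced" label for markers meeting neither criterion are hypothetical.

```python
def classify_marker(informative_meioses, pairwise_lod):
    """Assign a marker to the framework ('anchor') or validator tier."""
    if informative_meioses > 100 and pairwise_lod > 11:
        return "anchor"
    if pairwise_lod > 3:
        return "validator"
    return "unplaced"  # label assumed here for markers meeting neither criterion
```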
A SNP-based map:
In addition to microsatellite markers, ∼22,000 SNPs distributed across the genome (every 104 kb ± 25 kb) were genotyped on the F2 intercross family (Table 1). The quality of the SNP data was high, as reflected by call rate (99.8%), inheritance (0.09% non-Mendelian errors), and informativeness (15,375 polymorphic loci; average MAF ≥ 0.20). The resulting SNP-based map detected most of the same crossovers observed in the microsatellite-based map (718/836). Undetected crossovers were mostly attributable to uncertain phase of diallelic SNPs. The SNP-based map exhibited 64 crossovers at the ends of autosomes that had not been captured by microsatellite markers. This suggested the sex-averaged length of the canine autosomal genome might be somewhat greater than 21 M. Overall, the two maps were in strong agreement, consistent with the microsatellite-based map being of high integrity.
An X map and the pseudoautosomal region:
Markers were analyzed separately for the X chromosome to accommodate female-specific recombination. Of 140 markers analyzed, 82 served as anchor loci and 55 served as validating loci. Markers were ordered according to physical coordinates, and genetic distances were computed from informative maternal meioses. The results suggested a genetic length for the X chromosome of 111 cM (Table 3 and Figure S1).
The vast majority of X-linked loci showed a single allele in males, as expected for hemizygosity. However, five markers exhibited significant heterozygosity in males [average (avg.) HET = 0.80]. None of these markers exhibited fixed alleles, arguing against duplicated regions. The five loci clustered physically (Figure S1), and their allelic variation segregated with autosomal inheritance, consistent with their belonging to the canine pseudoautosomal region (PAR). The genotype data for 565 SNPs along the X chromosome were checked for a similar pattern. Fifty SNPs showed heterozygosity in males (avg. HET = 0.56), and these were from the same physical region as the five microsatellites (Figure S1). This suggested the PAR on the metacentric X localized to the telomeric end (Figure S1). This corresponded to a female-specific length of 7 cM and a male-specific length of 28 cM. Reference sequence data were available from female dogs only (Kirkness et al. 2003; Lindblad-Toh et al. 2005), precluding a characterization of the physical arrangement of the PAR with Y synteny.
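The screen for pseudoautosomal markers amounts to flagging X-linked loci at which males are frequently heterozygous. A minimal sketch follows, assuming male genotypes are stored as allele pairs with missing calls recorded as `None`; the 0.1 cutoff and the function names are illustrative assumptions, not values taken from the analysis.

```python
def male_heterozygosity(genotypes):
    """Fraction of called male genotypes (allele pairs) that are heterozygous."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    return sum(a != b for a, b in called) / len(called)

def candidate_par_markers(x_markers, min_het=0.1):
    """Names of X-linked markers with male heterozygosity above `min_het`.

    `x_markers` maps marker name -> (position_mb, list of male genotypes).
    Results are ordered by physical position so that clustering is visible.
    """
    flagged = [(position, name)
               for name, (position, genotypes) in x_markers.items()
               if male_heterozygosity(genotypes) > min_het]
    return [name for position, name in sorted(flagged)]
```

Markers returned by the screen would then be examined for physical clustering and for autosomal-style segregation, as described above.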
Observed variation in genomewide recombination:
Although the map showed pronounced regional differences in recombination rates, several systematic patterns were evident. Nearly every autosome was characterized by low recombination rate at the centromeric end and high rate at the telomeric end (Figure 1). Two chromosomes (CFA27 and CFA32) exhibited a reversed pattern (i.e., high rates at the centromere and low rates at the telomere; Figure 1). Given the consistency of the recombinational profiles across the other 36 autosomes and the generality of this pattern in mammals (Kong et al. 2002; Shifman et al. 2006), the simplest interpretation was that the orientation of the chromosomes was misspecified in canFam2.
We tested this prediction directly using FISH analysis. Differentially labeled probes were generated from BAC clones, which were anchored to the ends of each autosomal linkage map by markers. The probes were localized by FISH with canine metaphase spreads. The results strongly supported the chromosome orientation inferred from the linkage map rather than the one listed in canFam2 (Figure S3; Table S5).
Recombination rate was inversely associated with the physical length of chromosomes (Figure 2). The smallest chromosome (CFA38) exhibited a threefold greater rate than the largest chromosome (CFA1) (Table 3). These results were generally consistent with the expectation of at least one crossover per chromosome, a nearly universal requirement of meiosis from yeast to man (Kaback et al. 1989; Kong et al. 2002).
The rate of recombination also systematically varied by sex. Female meioses exhibited a 1.2-fold greater average rate than male meioses (Table 2). The influence of sex was not uniformly distributed across the genome (Figure S2), with differences in sex ratio being most striking at autosome ends. Female meioses exhibited a 4-fold greater rate near centromeres, whereas male meioses exhibited a 4-fold greater rate near telomeres. Similar observations of sex differences have been made in human (Broman et al. 1998) and mouse (Paigen et al. 2008).
Interindividual variation in recombination rate was also observed (Figure S4). The female with the highest recombination rate exhibited ∼20% more crossovers per meiosis than the female with the lowest rate, although these differences could be chance variation (one-way ANOVA: F = 1.35; P = 0.16). There was substantially more variation in genomewide rates (nearly twofold) among male dogs, and these differences were statistically significant (F = 1.77; P = 0.042). This finding complements observations of individual differences in recombination in a variety of species, including human (Broman et al. 1998; Kong et al. 2004), Drosophila (Chinnici 1971; Kidwell 1972; Brooks and Marks 1986), Tribolium (Dewees 1975), and laboratory strains of mice (Reeves et al. 1990; Koehler et al. 2002; Paigen et al. 2008). Interindividual variation may therefore be a pervasive characteristic of meiotic recombination in sexually reproducing species.
Sequence correlates of recombination rate:
A sequence-explicit framework motivated investigation of the relationship between sequence variables and regional heterogeneity in recombination rate. Twenty sequence features were tested for rank correlation with recombination rate (Table 4). Nonoverlapping 5- and 10-Mb windows were tested across the genome. The strongest predictor of sex-averaged recombination rates for both window sizes was proximity to the telomere. In addition, several classes of repetitive elements, the number of CpG dinucleotide sites, and the incidence of a recombinogenic sequence motif (Myers et al. 2005, 2008) were also predictive of local recombination rates, similar to what has been observed in other mammals (Kong et al. 2002; Jensen-Seaman et al. 2004; Myers et al. 2005; Shifman et al. 2006).
We examined these sequence correlates in the context of sex-specific influences. This analysis revealed distinct differences in males and females. Several GC-related measures (GC-rich repeats, CpG dinucleotide sites, and CpG islands) were positively correlated with male recombination rates, but weakly or negatively correlated with female recombination rates (Table 4). These findings were intriguing in light of the fact that males disproportionately drive the evolution of GC content in the human genome (Dreszer et al. 2007; Duret and Arndt 2008), which may stem from a male-intensified, biased gene conversion toward G/C alleles during recombinational repair (Marais 2003). The putative recombinogenic motif (CCNCCNTNNCCNC) was also positively correlated with male recombination rates, but uncorrelated with female rates. This sex specificity in the dog was similar to that observed in the house mouse (Shifman et al. 2006), but different from the motif's recombinogenic properties in both human sexes (Myers et al. 2008).
Sex-dependent correlations were also found for several classes of repetitive elements. AT-rich repeats, satellites, low-complexity repeats, poly(T) repeats, poly(A) repeats, and long terminal repeats were positively correlated with female recombination rates, but noncorrelated with male rates. Together, these findings suggest that regional recombination is mediated differently in the two sexes by features of the sequence landscape.
Interpolated maps for SNPs and scanning set loci:
The resolution and integrity of the canine map afforded an opportunity to place additional markers on the map through linear interpolation (Kong et al. 2002). Positions for 2.4 million publicly available SNPs were interpolated against the sex-averaged map. Results have been made available here (File S2) and are also available upon request as a custom track for a publicly available browser (www.genome.ucsc.edu). Microsatellite markers from the latest minimal scanning set (MMS3, 507 markers; Sargan et al. 2007) were similarly positioned by interpolation to afford multipoint analyses (File S3). An electronic file with inferred intermarker genetic distances and recombination fractions has also been made available (File S4).
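Linear interpolation between flanking framework markers can be sketched as follows. The sketch assumes a single chromosome with framework markers sorted by strictly increasing physical coordinate; clamping positions beyond the terminal markers to the terminal map positions is a choice made here, and the function name is hypothetical.

```python
from bisect import bisect_right

def interpolate_cm(position_bp, framework):
    """Genetic position (cM) of a physical coordinate by linear interpolation.

    `framework` is a list of (bp, cM) pairs sorted by strictly increasing bp.
    Coordinates outside the framework are clamped to the terminal markers.
    """
    coordinates = [bp for bp, _ in framework]
    if position_bp <= coordinates[0]:
        return framework[0][1]
    if position_bp >= coordinates[-1]:
        return framework[-1][1]
    i = bisect_right(coordinates, position_bp)
    (x0, y0), (x1, y1) = framework[i - 1], framework[i]
    return y0 + (y1 - y0) * (position_bp - x0) / (x1 - x0)
```

A SNP lying midway between two framework markers is thus assigned the midpoint of their genetic positions, which implicitly assumes a uniform recombination rate within each interval.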
DISCUSSION
We have addressed a principal unmet need in our field by assembling a linkage map that enables navigating the dog genome in a structured and systematic way. The map we have created (i) informs on the basic biology of meiosis and the evolution of recombination rate, (ii) provides important clues on the genomic determinants of regional heterogeneity and sex specificity of recombination rates, (iii) facilitates unbiased genetic access to natural variation and phenotypic diversity in canids, and (iv) lays out an integrated wet bench and bioinformatics approach for developing a de novo map for any species for which a draft sequence becomes available.
Basic biology of meiosis:
Although classical genetics in model organisms has dissected the mechanisms governing meiosis, observations from linkage maps have uniquely informed on natural variation in underlying processes (Dumont and Payseur 2008; reviewed in Coop and Przeworski 2007). The canine genome represents a natural experiment in meiosis, with 38 pairs of unusually short, acrocentric autosomes. Despite the unique challenges such a karyotype might present to the meiotic apparatus, patterns of recombination in the genetic map suggest this karyotype has been accommodated by conventional means. An upregulation of recombination rate among physically smaller chromosomes, for instance, is a conserved feature of meiosis from yeast to man (Kaback et al. 1989; Kong et al. 2002). This nonrandom distribution ensures the obligatory crossover per chromosome arm that helps homologs attain bipolar orientation during meiosis I.
Despite the unique karyotypic features of the dog, the sex-averaged rate of recombination across the genome (0.97 cM/Mb) is within the range of other characterized mammals (0.5–1.1 cM/Mb). Genetic map-based estimates of recombination rate are available for two additional carnivore species, house cat (∼1.1 cM/Mb; Menotti-Raymond et al. 1999) and silver fox (∼0.6 cM/Mb; Kukekova et al. 2007). While large differences in map quality and coverage preclude detailed comparisons, the rate of recombination in dog is clearly not an outlier among carnivores. Although fine-scale recombination rates evolve on short evolutionary timescales (Ptak et al. 2005; Winckler et al. 2005), the dog genetic map adds evidence for more rigid evolutionary constraints on broader scale recombination rates (Myers et al. 2005; Dumont and Payseur 2008).
The elevated recombination rate in female meioses, a salient feature of eutherian maps, was also evident in the dog, although the sex ratio was not as great as it is in man (1.2-fold in dog vs. 1.7-fold in humans) (Broman et al. 1998; Kong et al. 2002). In part, this may be attributable to karyotype—mouse and rat also have mostly acrocentric autosomes and similarly exhibit a muted sex difference relative to human (Shifman et al. 2006). Karyotypic organization accounts for only some of the differences between sexes (Hunt and Hassold 2002). For reasons not yet clear, sex differences are modest in cow (Ihara et al. 2004) and sheep (Maddox et al. 2001), for instance, and are reversed in the two metatherian mammals (marsupials) studied to date (Zenger et al. 2002; Samollow et al. 2004).
As in other placental mammals, sex differences in the dog were not uniformly distributed across the genome. Female meioses exhibited a greater recombination rate near centromeres whereas male meioses showed a greater rate near telomeres. In general, it appears female meioses are at greater risk for nondisjunction in man (Antonarakis 1991), and this risk is exacerbated among chromosomes with distal crossover events (Lamb et al. 1996). Distal crossovers decrease fidelity in yeast as well, suggesting crossover placement fundamentally influences chromosome orientation on the meiotic spindle. Interestingly, segregation in yeast can be rescued with an experimental tether, but only if the tether is located at the centromere (Lacefield and Murray 2007). This implies that physical ties near the centromere (e.g., crossovers) intrinsically promote bipolar orientation. If so, the greater recombination rate near the centromere in females could be compensatory—an adaptation that offsets the greater sensitivity to nondisjunction in oogenesis.
Evolution of patterned recombination:
The canine linkage map may offer an unprecedented opportunity to address the evolution of recombination rate. By generating favorable allelic combinations and breaking down negative linkage disequilibrium, recombination can facilitate the response to selection (Fisher 1930; Felsenstein 1965). This has led to the hypothesis that species under frequent and intense selection evolve toward increased recombination rates (Burt and Bell 1987). Recent work has shown that the purebred domesticated chicken exhibits higher recombination rates than Red Jungle fowl, its wild ancestral counterpart (Groenen et al. 2009). In our case, describing the effects of domestication requires a genetic map assembled for the wolf, Canis lupus. There are logistical constraints to this, but a broad assessment of recombination patterns could be inferred from an analysis of LD in wolf populations, as has been done in man (McVean et al. 2004).
The dog also presents an opportunity to address microevolutionary changes in recombination rate. Mouse strains show systematic differences in recombination (Shiroishi et al. 1991; Paigen et al. 2008), and interindividual variation, from Drosophila to man, has been tied to causative genes (Chinnici 1971; Kidwell 1972; Brooks and Marks 1986; Shiroishi et al. 1991; Koehler et al. 2002; Coop et al. 2008; Kong et al. 2008). Consistent with these observations, we documented significant variation among male dogs. If this variation is heritable, we might also expect differences in recombination rate among breeds. However, our use of admixed pedigrees for map assembly complicated the detection of such differences. The relatively short timescale separating dog breeds and the artificial selection that has shaped breed phenotypes should motivate the construction of breed-specific maps to understand the effects of recent isolation and strong selection on the evolution of recombination rate.
Genomic control of recombination rate:
Sequence correlates provide mechanistic insights into recombination rate variation. Several sequence correlates found in human (Yu et al. 2001; Kong et al. 2002), mouse (Jensen-Seaman et al. 2004), and rat (Jensen-Seaman et al. 2004) were also found in the dog. The number of CpG dinucleotides, for instance, is positively correlated with recombination rate in each species (Kong et al. 2002; Jensen-Seaman et al. 2004), including the dog. Cross-species comparisons have also revealed differences. Although recombination is correlated with LINEs in human, mouse, and rat (Kong et al. 2002; Jensen-Seaman et al. 2004), no such correlation was observed in the dog. Distance from the centromere is the best predictor of recombination in man and dog, whereas sequence features remain the best predictors in mouse and rat (Jensen-Seaman et al. 2004).
Many sequence-based correlations were significant, but most were relatively weak. In this respect, the dog joins a growing list of mammals for which the majority of genomic variation in recombination rate cannot be explained by sequence characteristics (Kong et al. 2002; Jensen-Seaman et al. 2004; Myers et al. 2005; Shifman et al. 2006). Alternative sources of variation, for which there is mounting evidence, are epigenetics and chromatin state (Winckler et al. 2005; Neumann and Jeffreys 2006; Sandovici et al. 2006; Sigurdsson et al. 2009). CpG islands, the principal targets of methylation, are correlated with recombination rate in human, mouse, rat, and dog. Interestingly, CpG association in the dog differed in sign between the sexes, suggesting opposing processes in male and female meioses.
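To make the notion of a sequence correlate concrete, the following sketch computes a Pearson correlation between a per-window sequence feature and per-window recombination rate, separately for two sexes. All values are invented toy numbers, the labels `rate_sex_a` and `rate_sex_b` are deliberately neutral, and Pearson's r is an assumption made for illustration; the statistic used in the study is not specified in this section.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return sxy / math.sqrt(sxx * syy)


# Toy data: CpG count per genomic window and recombination rate (cM/Mb)
# in the same windows, estimated separately from meioses of each sex.
cpg = [10, 20, 30, 40, 50]
rate_sex_a = [0.8, 0.7, 1.2, 0.9, 1.4]
rate_sex_b = [1.4, 0.9, 1.2, 0.7, 0.8]

r_a = pearson_r(cpg, rate_sex_a)
r_b = pearson_r(cpg, rate_sex_b)
variance_explained_a = r_a ** 2  # share of rate variance accounted for by the feature
```

In this toy case the two coefficients have opposite signs, mirroring the kind of sex-specific pattern described above. The squared coefficient shows how a correlation can be statistically significant yet leave much of the variation unexplained.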
The canine genetic map provides important clues to the origins of sex differences in recombination. This dimorphism has long been a puzzle (Morgan 1912; Dunn 1920; Haldane 1922). The most striking result to date is a sequence variant in humans that increases recombination rates in males, but decreases recombination rates in females (Kong et al. 2008). Thus, causative genes and cis-sequence motifs may combine to account for sex-specific differences in recombination rate, possibly mediated by chromatin state (Petkov et al. 2007).
Application to mapping phenotypes:
The de novo map makes known the full genetic landscape of the dog genome and thus allows empirical assessment of the coverage provided by linkage-scanning sets. The genetic positions of markers in commonly used sets were interpolated so that researchers could derive these benefits. Known positions and intermarker distances are also essential for multipoint analyses to maximize the power and resolution of fine mapping and quantitative trait loci (QTL) mapping. The genetic map may also inform on the pattern and extent of LD in the purebred dog. LD is 20- to 100-fold more extensive in breed isolates than in human populations (Sutter et al. 2004; Lindblad-Toh et al. 2005), a consequence of introgression and historical bottlenecks. Recombination is one of several forces that shape LD; knowledge of local recombination rates may therefore be useful for discerning the contributions of other factors (e.g., selection).
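One simple way to interpolate a genetic position for a marker that was not typed in the map is piecewise-linear interpolation between the flanking framework markers, using their physical coordinates. The sketch below assumes that approach; the interpolation method actually used for the posted marker sets is not described in this section, and the function name and coordinates are hypothetical.

```python
from bisect import bisect_right


def interpolate_cm(framework, query_mb):
    """Estimate a genetic position (cM) from a physical position (Mb).

    framework: list of (physical_mb, genetic_cm) pairs for one chromosome,
    sorted by strictly increasing physical position.
    """
    if not framework:
        raise ValueError("framework must contain at least one marker")
    mbs = [mb for mb, _ in framework]
    if not (mbs[0] <= query_mb <= mbs[-1]):
        raise ValueError("query lies outside the framework; no extrapolation")
    i = bisect_right(mbs, query_mb)
    if i == len(mbs):  # query coincides with the last framework marker
        return framework[-1][1]
    (mb0, cm0), (mb1, cm1) = framework[i - 1], framework[i]
    fraction = (query_mb - mb0) / (mb1 - mb0)
    return cm0 + fraction * (cm1 - cm0)


# Hypothetical framework markers: (Mb, cM)
framework = [(1.0, 0.0), (3.0, 3.0), (7.0, 5.0)]
position_cm = interpolate_cm(framework, 5.0)  # halfway between the last two markers
```

Linear interpolation treats the recombination rate as constant within each framework interval, so the estimate is only as fine-grained as the framework itself. Queries outside the mapped span are rejected rather than extrapolated.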
An available reference sequence presented us with an uncommon opportunity to assemble a map efficiently. The genome sequence afforded an abundance of putative markers that could be computationally mined from a physical framework and selected for even spacing and complete coverage. These markers performed well, and their standardized design facilitated wet bench genotyping. Systematic marker ascertainment affords other opportunities, such as studying mutation rates in the context of the “slippery genome” hypothesis of dog domestication (Fondon and Garner 2004). Positional information for markers was valuable in guiding and clarifying locus order. This resulted in a map that was more inclusive of typed markers. These strategies for building a map de novo are applicable to any natural species with a draft sequence. This exemplifies how genomics is continuing to enhance access to natural variation in a broader array of nonmodel organisms.
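The computational mining step can be illustrated with a short regular-expression scan for perfect tandem repeats. This is a sketch under stated assumptions, not the pipeline used for the map: the function names, the default motif length, and the minimum repeat count are all hypothetical, and real marker design would also involve primer selection and spacing criteria that are omitted here.

```python
import re


def is_primitive(unit):
    """True if the motif is not itself a repeat of a shorter motif (e.g. 'ACAC' or 'AA')."""
    n = len(unit)
    return not any(n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n))


def find_microsatellites(seq, unit_len=2, min_repeats=10):
    """Return (start, end, motif, n_repeats) for perfect tandem repeats in seq.

    Coordinates are 0-based and half-open. The thresholds are illustrative.
    """
    if unit_len < 1 or min_repeats < 2:
        raise ValueError("unit_len must be >= 1 and min_repeats >= 2")
    # A motif of unit_len bases followed by at least (min_repeats - 1) more copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if not is_primitive(motif):
            continue  # skip homopolymer runs and disguised shorter motifs
        n_repeats = (match.end() - match.start()) // unit_len
        hits.append((match.start(), match.end(), motif, n_repeats))
    return hits


# Toy sequence containing a (CA)12 repeat flanked by unrelated bases
toy_seq = "GG" + "CA" * 12 + "TT"
hits = find_microsatellites(toy_seq)
```

Because each hit carries its coordinates on the reference sequence, candidates found this way could then be filtered for even spacing along each chromosome, which is the property that the physical framework makes possible.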
Online maps and ancillary data:
We have provided map figures (Figure S5 and Figure S6) and electronic files with marker and map content. We have also posted detailed map builds online with hyperlinks to ancillary data. The genetic maps and associated tabular data are available at: http://www.vgl.ucdavis.edu/dogmap/.
We thank the service group of the Veterinary Genetics Laboratory for exceptional technical support. We thank Cindy Taylor Lawley, Carsten Rosenow, John Steulpnagel, and others at Illumina for help in providing SNP data from the Canine Infinium BeadChip. G. Acland generously provided DNA from the Cornell families reference resource. Q. Shan provided helpful discussions; C. Erdman and J. Akey provided critical comments. We thank the Sacramento Valley Dog Fanciers' Association, the Greater Portland Dachshund Club, the Central Illiana Dachshund Club, the Bulldog Club of Northern California, the Chico Dog Fanciers' Association, and Linda Wroth for their generous support. This work was supported by grants awarded by the Canine Health Foundation of the American Kennel Club (AKC-CHF nos. 570A and 662) and through matching funds from the Center for Companion Animal Health at the University of California, Davis. B.L.D. was supported by a National Science Foundation predoctoral fellowship. B.A.P. was supported by National Institutes of Health (NIH) grant HG004498. K.W.B. was supported by NIH grant GM074244.
Supporting information is available online at http://www.genetics.org/cgi/content/full/genetics.109.106831/DC1.
1 Present address: Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544.
2 Present address: Office of Science Policy and Planning, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892.
3 Present address: Center for Canine Health and Performance, Van Andel Research Institute, Grand Rapids, MI 49503.
Communicating editor: I. Hoeschele
- Received September 8, 2009.
- Accepted November 30, 2009.
- Copyright © 2010 by the Genetics Society of America