Abstract
As part of a study of the genetics of floral adaptation and speciation in the Mimulus guttatus species complex, we constructed a genetic linkage map of an interspecific cross between M. guttatus and M. nasutus. We genotyped an F2 mapping population (N = 526) at 255 AFLP, microsatellite, and gene-based markers and derived a framework map through repeated rounds of ordering and marker elimination. The final framework map consists of 174 marker loci on 14 linkage groups with a total map length of 1780 cM Kosambi. Genome length estimates (2011–2096 cM) indicate that this map provides thorough coverage of the hybrid genome, an important consideration for QTL mapping. Nearly half of the markers in the full data set (49%) and on the framework map (48%) exhibited significant transmission ratio distortion (α = 0.05). We localized a minimum of 11 transmission ratio distorting loci (TRDLs) throughout the genome, 9 of which generate an excess of M. guttatus alleles and a deficit of M. nasutus alleles. This pattern indicates that the transmission ratio distortion results from particular interactions between the heterospecific genomes and suggests that substantial genetic divergence has occurred between these Mimulus species. We discuss possible causes of the unequal representation of parental genomes in the F2 generation.
QUANTITATIVE trait locus (QTL) mapping is a powerful and increasingly accessible tool for characterizing the genetic basis of adaptive divergence and speciation (Tanksley 1993). QTL maps provide both a broad outline of the genetics of evolutionary change and a first step toward the isolation and identification of the particular genes involved in phenotypic differentiation (Paterson et al. 1991; Mackay 2001). QTL studies of crop plants have revealed the recruitment of genes of major effect during domestication (Dorweiler et al. 1993), substantial epistasis among loci affecting selected traits (Doebley et al. 1995), and a shared genetic basis for parallel phenotypic evolution (e.g., Paterson et al. 1995). A few ground-breaking studies have examined the genetic architecture of phenotypic divergence in wild plants and animals (e.g., Bradshaw et al. 1995, 1998; Voss and Shaffer 1997; Kim and Rieseberg 1999; Zeng et al. 2000), but we still know little about the nature of the genes underlying adaptive evolution and speciation in natural systems.
A genetic linkage map is necessary for the identification of QTL involved in species differences and can also provide insight into patterns of genomic divergence (Rieseberg et al. 1995, 2000; Whitkus 1998). With the advent of molecular markers based on polymerase chain reaction (PCR) techniques, the construction of linkage maps has become increasingly feasible even in wild populations. However, because the estimation of linkage relationships depends on the observation of rare recombination events between pairs of loci, both a large pool of markers and a large segregating population are necessary for full genome coverage and the accurate estimation of locus order and map length (Buetow 1991; Collins et al. 1996; Ehm et al. 1996; Remington et al. 1998; Ott 1999). Because markers inevitably vary in informativeness and reliability, researchers with large data sets have also found it useful to restrict their analyses to “framework” maps consisting only of those markers whose order meets strict statistical support thresholds and other criteria (Keats et al. 1991). Although the construction of a reliable high-coverage map is particularly important when the ultimate goal of linkage mapping is the detection, introgression, and detailed study of QTL underlying traits of interest, this framework approach has generally not been taken in mapping studies of wild species.
Here we construct and analyze a framework linkage map based on an F2 hybrid cross between Mimulus guttatus and M. nasutus, the most widespread members of the yellow monkeyflower species complex (Scrophulariaceae). This species complex exemplifies a common pattern in flowering plant evolution (the recent derivation of highly inbreeding taxa from predominantly outcrossing lineages) and has become a model system for studies of the quantitative and population genetics of mating system variation (e.g., MacNair and Cumbes 1989; Ritland and Ritland 1989; Fenster and Ritland 1994a; Fenster et al. 1995; Carr et al. 1997; Dudash et al. 1997; Willis 1999a,b). M. guttatus (outcrossing) and M. nasutus (selfing) differ dramatically in floral morphology and other characters related to mating system (Vickery 1978; Ritland and Ritland 1989). The two taxa are cross-compatible at the level of F1 and F2 seed production (Vickery 1964; Kiang and Hamrick 1978) and are often assigned to a single species, but are postzygotically isolated by partial hybrid infertility (Fishman and Willis 2001).
The linkage map presented here provides a framework for studying the genetic architecture of an evolutionarily important adaptation and also allows the direct examination of interactions between genomes in the early stages of speciation. To construct a map useful for comparative genomic studies within and between Mimulus species, we developed and mapped highly variable microsatellites and gene-based markers, as well as more abundant AFLPs. The resulting map reveals localized and directional segregation distortion, which has important implications for speciation and hybridization.
MATERIALS AND METHODS
Study system: The genus Mimulus (Scrophulariaceae) comprises about 150 species, grouped into about a dozen sections with their center of diversity in western North America (Pennell 1951; Vickery 1978). The yellow monkeyflowers of the M. guttatus species complex (section Simiolus) are the most polytypic members of the genus. Extensive morphological variation and the potential for hybridization have complicated taxonomic assignments within Simiolus, and the members of the M. guttatus complex have been both grouped into a few highly variable species (Thompson 1993) and divided among as many as 20 distinct species (e.g., Pennell 1951). M. guttatus (2n = 28), the most common species in the complex, is bee pollinated and predominantly outcrossing (Ritland and Ritland 1989; Willis 1993b; Sweigart et al. 1999), but self-fertilization appears to have evolved at least several times within the species complex (Pennell 1951; Vickery 1978; Fenster and Ritland 1994b). The most widespread of the selfing taxa is M. nasutus (2n = 28), which produces cleistogamous or nearly cleistogamous flowers. M. nasutus is generally thought to be derived from an M. guttatus-like ancestor, but phylogenetic relationships among members of the complex have not been resolved (Fenster and Ritland 1994b).
The distributions of M. guttatus and M. nasutus overlap broadly from British Columbia to northern Mexico. Allopatric populations are more common, but the two species often cooccur in seasonally wet areas such as road cuts and ephemeral streams. At sympatric sites, potential premating barriers to hybridization include differences in microhabitat and flowering time (Kiang and Hamrick 1978), as well as differences in floral morphology (Ritland and Ritland 1989; Dole 1992), pollen production (Ritland and Ritland 1989; Fenster and Carr 1997), and pollen tube growth (Diaz and MacNair 1999) associated with their divergent mating systems. Despite these prezygotic isolating mechanisms, hybrids are frequently observed in the wild (Vickery 1964, 1973; Kiang 1973; Kiang and Hamrick 1978; Ritland 1991; Fenster and Ritland 1992). However, experimental hybridizations indicate that partial postzygotic reproductive isolation has developed between M. nasutus and M. guttatus. Vickery (1964, 1973, 1978) found reduced seed set in some interspecific F1 hybrids and reported mild F2 breakdown. Major chromosomal rearrangements do not appear to differentiate the species (Mukherjee and Vickery 1962), but smaller karyotypic differences may not have been detected. In a companion experiment to this mapping study, we found partial male and female sterility in both F1 and F2 hybrids between inbred lines of the two species (Fishman and Willis 2001). The pattern of hybrid infertility implicated negative epistatic interactions between heterospecific genomes as an important contributor to postzygotic reproductive isolation.
Generation of F2 mapping population: We crossed a single inbred line of M. guttatus with a single inbred M. nasutus genotype. The M. guttatus parental line was derived from an annual, highly outcrossing population from the Oregon Cascades (Iron Mountain; Willis 1993a; Sweigart et al. 1999). This parental line (IM62) was formed by more than five generations of selfing with single seed descent (Willis 1993b) and is near the outcrossed population mean for floral characters and pollen fertility (J. H. Willis and A. Kelly, unpublished data). The M. nasutus parental line was derived from a population in northwestern Oregon (Sherar's Falls) and maintained for several generations in the greenhouse through autonomous self-fertilization. As expected from the cleistogamous floral morphology of M. nasutus, both the Sherar's Falls population and the parental line used in this study (SF5.4) are highly inbred (homozygous at marker loci highly variable in M. guttatus populations; A. Kelly and J. H. Willis, unpublished data). F2 hybrids were generated by crossing the M. nasutus and M. guttatus inbred lines (IM62 as pollen parent) and then self-pollinating a single F1 individual.
In March 1997, we grew the F2 mapping population (initial N = 600) and F1 hybrids and parental lines (N = 100 for each) in a common garden experiment at the University of Oregon Department of Biology greenhouse. Greenhouse and plant culture conditions were similar to those during parental line formation and previous experiments with these populations (Willis 1999a,b). The plants were grown in 2.25-in. pots filled with a soilless potting mix and placed in a fully randomized design. We planted about five seeds per pot and thinned to the centermost individual after most seeds had germinated (14 days), but did not explicitly measure germination rates or subsequent mortality. We measured 16 floral, vegetative, and reproductive characters on adult plants. Biometric and QTL analyses of these phenotypic data are presented elsewhere (e.g., Fishman and Willis 2001).
Tissue collection and DNA extraction: Several corollas from each F2 individual were collected into 1.5-ml Eppendorf tubes, immediately placed on dry ice, and stored at −80°. Genomic DNA was isolated from the corollas using a modified hexadecyl trimethyl-ammonium bromide (CTAB) chloroform extraction protocol (Lin and Ritland 1996; Kelly and Willis 1998) and its concentration quantified with a Hoechst fluorometer. A total of 526 F2 individuals yielded sufficient DNA for genotyping with some or all molecular markers.
Development and analysis of molecular markers: Microsatellites: Microsatellite primers were developed from a Sherar's Falls M. nasutus recombinant DNA library screened for clones with di- and trinucleotide repeats. We designed ~120 primer sets flanking AATn regions, most of which successfully amplified products from M. nasutus genomic DNA (Kelly and Willis 1998). Prior to the mapping project, we identified a subset of microsatellite markers that amplify products from both M. guttatus and M. nasutus (Kelly and Willis 1998), some of which are highly polymorphic within the Iron Mountain population of M. guttatus (Sweigart et al. 1999). We surveyed these previously identified loci and several dozen additional primer sets for polymorphism between the parental IM62 and SF lines and for segregation in a test set of 16 F2 individuals. We identified 27 informative loci, which were then genotyped for the full F2 mapping population. We used the general PCR reaction conditions and thermocycle programs described in Kelly and Willis (1998) for the F2 genotyping, except that the 5′ primers were end-labeled with infrared (IRD) dyes for visualization with a Li-Cor automated sequencing system. The GenBank accession numbers and specific reaction conditions for the microsatellite loci used in this study are given in Table 1.
The PCR products were resolved on 18-cm denaturing polyacrylamide gels run on a Li-Cor 4000L automated sequencer, following the gel preparation and loading protocols of Remington et al. (1998). Prior to gel loading, 5 μl of formamide loading dye was added to each reaction. The samples were denatured by heating to 75°–85° for 2 min and then immediately chilled to 4°. We loaded 1–2 μl of dye/product mixture into each sample lane (48-well comb) and also ran parental genotypes and/or IRD-labeled size standards in the outermost lanes. We used electrophoretic run parameters of 1000 V, 35 mA, 25 W, and 50° plate temperature, scan speed 3, signal filter 3, and 16-bit pixel depth for the collection of TIFF image files.
Fragments polymorphic between the parents and segregating in the F2 population (one locus per primer set) were scored by eye in the TIFF image files using the program RFLPSCAN 3.0 (Scanalytics). The segregating fragments were assigned molecular weights by the program on the basis of molecular weight standards and the previously determined parental allele sizes. The software automatically binned the data across gels and generated fragment presence/absence strings for the two segregating alleles produced by each primer set. The presence/absence data for each locus were then converted into MAPMAKER 3.0 format (Lander et al. 1987) using spreadsheet programs (e.g., + − → A, − + → B, + + → H, where A and B are homozygotes for SF and IM62 alleles, respectively, and H is the heterozygote).
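The recoding of paired presence/absence calls into genotype codes can be sketched in a few lines of Python. This is only an illustration of the mapping given above, not the spreadsheet procedure actually used. It assumes that each allele's calls are stored as a string of "+" and "-" characters (one per F2 individual) with the SF band listed first. The function name `recode_codominant` and the use of "-" as the missing-data symbol are assumptions made for the example.

```python
# Hypothetical helper: recode paired presence/absence calls into A/B/H codes.
# Assumes the first string holds the SF band calls and the second the IM62 band calls.
CODES = {("+", "-"): "A", ("-", "+"): "B", ("+", "+"): "H"}

def recode_codominant(sf_band, im62_band, missing="-"):
    """Return one genotype code per individual; unscorable pairs become `missing`."""
    if len(sf_band) != len(im62_band):
        raise ValueError("band strings must cover the same individuals")
    return "".join(CODES.get(pair, missing) for pair in zip(sf_band, im62_band))

# Four hypothetical F2 individuals scored for the two bands of one locus.
print(recode_codominant("++-+", "-++-"))
```

Individuals lacking both bands cannot be a valid genotype at a codominant locus, so this sketch treats them as missing data rather than guessing.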
Gene-based markers: We used a degenerate PCR primer approach to clone homologs of several floral developmental genes characterized in model species including LEAFY (LFY; Weigel et al. 1992), APETALA3 (AP3; Jack et al. 1992), and CYCLOIDEA (CYC; Luo et al. 1996). To clone Mimulus homologs of LFY and AP3, we designed degenerate forward PCR primers nested within conserved regions of genes (LFY second exon forward primer, 5′-catccgtttatcgtnacggagcc-3′; AP3 first exon, 5′-gctcgagggaagatccagat-3′) and degenerate reverse PCR primers in a second conserved region separated from the first by an intervening intron (LFY, 5′-ttgaatatggtrtcdatatccca-3′; AP3, 5′-cttcytcaagtgctcttgcat-3′). Gene fragments were amplified from genomic DNA of the inbred IM62 and SF parental lines, and the resulting PCR products were cloned using the Invitrogen (San Diego) TOPO TA cloning kit; at least five clones per gene per parent were sequenced. Sequences were aligned with homologous sequences in the database and a neighbor-joining tree was constructed using CLUSTAL X 1.8 (Jeanmougin et al. 1998) to verify putative gene identity. CYC homologs were amplified from IM62 genomic DNA using the GCYCFS and GCYCR primers of Moeller et al. (1999) and PCR fragments were cloned and analyzed as previously for AP3 and LFY.
To map genes, IM62 and SF sequences were aligned and new nondegenerate PCR primers designed to amplify regions that exhibited length polymorphism (5–13 bp) between alleles. Segregating polymorphism and expected sizes of alleles were verified using the Li-Cor protocol for microsatellites, after which the entire F2 mapping population was genotyped and scored.
Amplified fragment length polymorphisms: Templates for amplified fragment length polymorphism (AFLP) reactions were prepared using standard protocols (Vos et al. 1995; Remington et al. 1998) modified for low DNA volume and high throughput. The restriction digest-ligation steps were carried out in 96-well PCR plates using 100 ng of corolla DNA in 10 μl of H2O. The restriction digests with EcoRI (3 hr incubation at 37°) and TaqI (3 hr at 65°) and the ligation of EcoRI and TaqI adapters (Vos et al. 1995) were scaled down to a total volume of 25 μl. The restriction-ligation (RL) product was then diluted 1:3 with H2O for use as a template in the preamplification reactions.
We carried out four different preamplification reactions using standard EcoRI (E) and TaqI (T) primers (Vos et al. 1995) with single selective nucleotides (E + A with T + A, T + G, T + C, and T + T). The preamplification protocols followed Remington et al. (1998) except that 6 μl of the diluted RL product was used as the PCR template in a total reaction volume of 20 μl. The preamplification product was diluted 1:25 for use as a template for the final selective amplification steps. We performed selective amplification reactions using various combinations of three E primers with three selective nucleotides (E + ACG, E + ACC, and E + AGG) and the four T + 1 primers. The reaction mixtures and thermocycle conditions for the selective amplifications followed Remington et al. (1998), except that the reactions were scaled down to a total volume of 15 μl. The E + 3 primers were end-labeled with infrared dyes (IRD700 and IRD800) for visualization with the Li-Cor automated sequencing system. The AFLP products were resolved on 25-cm denaturing polyacrylamide gels run on a Li-Cor sequencer, following the loading protocols and electrophoretic parameters of Remington et al. (1998). Parental genotypes and molecular weight standards were run in the outer lanes on a subset of the gels for each primer set.
Polymorphic fragments were scored by eye on TIFF image files using RFLPSCAN 3.0 (Scanalytics). The molecular weights of polymorphic fragments were first determined on gels with both parental genotypes and a molecular weight standard. Loci polymorphic between the parents but monomorphic or near monomorphic in the F2 population were excluded. The remaining diagnostic fragments were matched by size across gels, scored electronically by the user, and then automatically binned by the program into single presence/absence polymorphisms. Lanes too faint to score for some or all polymorphic bands on a gel were excluded (all loci for that primer set scored as missing data) or rerun. The presence/absence strings for each marker were converted into MAPMAKER 3.0 format using spreadsheet programs. The majority of AFLP polymorphisms were coded as dominant markers (e.g., − → A, + → C, where A is homozygous for an SF null allele and C could be either a dominant IM62 homozygote or a heterozygote). Mendelian segregation of such dominant markers should result in a 1:3 ratio of A to C (or corresponding B to D) genotypes in the F2 generation. Some pairs of polymorphic AFLP fragments on a gel clearly segregated as alternative alleles at a single locus and were coded as codominant. We also examined the chromatographic output from RFLPSCAN to determine whether band intensities at a particular allele size were distributed bimodally (controlling for overall lane intensity). Such bimodality could potentially allow the differentiation of heterozygous and homozygous genotypes in single +/− polymorphisms.
Linkage map construction: The full mapping population consisted of 526 F2 individuals genotyped for the 255 diagnostic markers, with a large number of individuals genotyped at each marker (mean = 477 ± 26 SD). We constructed genetic linkage maps using MAPMAKER 3.0 (Lander et al. 1987; Lincoln and Lander 1992). Several rounds of mapping and marker exclusion contributed to the construction of the final framework linkage map. We initially separated the genotypic data into two overlapping data sets, each consisting of one class of dominant AFLP (SF null allele or IM62 null allele) plus all codominant markers. Because the genotypic information at dominant marker loci is incomplete, mapping the two sets of markers in coupling phase separately provides greater statistical power to group and order markers accurately, but reduced power for later detection and mapping of QTL (Jiang and Zeng 1997). We constructed a final framework linkage map using the complete data matrix, but used the two initial linkage maps as a guide to the most reliable subset of markers.
To construct the two initial maps, we used the GROUP command with the Kosambi mapping function (Kosambi 1944) to organize markers into linkage groups (two-point linkage criteria: minimum LOD 6.0 and maximum distance between markers of 37 cM). We then used the ORDER function with error detection (Lincoln and Lander 1992) to automatically find and map the most likely order for each group. This multipoint ordering used a threshold of LOD 3.0 to find a starting subset of five markers and to place markers in a first round and then tried to place the remaining markers with a LOD threshold of 2.0. We examined the error detection data and table of two-point distances for these preliminary orders to identify potentially unreliable markers. These markers, along with any linked but unplaced markers, were then individually evaluated using the TRY, COMPARE, MAP, and RIPPLE commands to generate and compare alternative orders. This procedure was repeated until we reached a consistent linear ordering of each group using a subset of markers with few potential genotyping errors.
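For readers unfamiliar with the Kosambi mapping function, the following Python sketch shows the standard conversion between recombination fraction and map distance. It is an illustration only and does not reproduce MAPMAKER's internals; the function names `kosambi_cm` and `kosambi_r` are hypothetical.

```python
import math

def kosambi_cm(r):
    """Map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must lie in [0, 0.5)")
    # Kosambi (1944): d (Morgans) = 0.25 * ln((1 + 2r) / (1 - 2r)); multiply by 100 for cM.
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

def kosambi_r(cm):
    """Inverse function: recombination fraction for a map distance in cM."""
    return 0.5 * math.tanh(2.0 * cm / 100.0)

# Recombination fraction corresponding to the 37 cM grouping threshold.
print(round(kosambi_r(37.0), 3))
```

Under this function, the 37 cM maximum distance used for grouping corresponds to a recombination fraction of roughly 0.31.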
We used a similar iterative procedure to construct the final framework map with all F2 genotypes in a single data set. We grouped all 255 markers according to the same criteria used in the preliminary mapping. We also used shared codominant markers to merge pairs of homologous linkage groups from the two coupling-phase maps. With the exception of one pair that could not be merged (each consisted only of dominant markers), these approaches resulted in the same linkage groups. We used the ORDER command with the original parameters to find the most likely order for the integrated groups and then repeated the process of evaluating the reliability of individual markers and comparing alternative orders. Linked but unplaced markers from the initial mapping were also reevaluated. For this final ordering, we made an effort to include alternating markers from the two dominant classes where such substitutions did not substantially decrease the overall likelihood of an ordered group. This process resulted in four groups of markers: (1) framework markers ordered on the final map; (2) accessory markers that are linked to established groups but cluster tightly with other markers and cannot be placed in a single interval with high certainty; (3) “unreliable” markers that appear linked to one or more groups but exhibit nonlinear two-point linkage patterns (increasing map length by >7 cM) and unusually high error rates when ordered; and (4) unlinked markers.
Genome length and map coverage: We calculated the average framework marker spacing (s) by dividing the summed length of all linkage groups by the number of intervals (number of markers minus number of linkage groups). We estimated the genome length L using several different techniques. First, we simply added 2s to the length of each linkage group to account for chromosome ends beyond the terminal markers. Second, we used Method 4 of Chakravarti et al. (1991), which multiplies the length of each linkage group by the factor (m + 1)/(m − 1), where m is the number of framework markers on each group. Because most groups contain additional internal markers not included on the framework map, we also calculated L using Method 4 with m equal to the total number of markers linked to each group. We also estimated map coverage c. The proportion c of the genome within d cM of a marker, assuming random marker distribution, was estimated as $c = 1 - e^{-2dn/L}$, where L is a genome length estimate and n is the number of markers.
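The estimators described in this section are simple enough to express directly. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, and because per-group lengths and marker counts are not listed in the text, the Method 4 function takes them as an input rather than supplying real values.

```python
import math

def mean_spacing(total_cm, n_markers, n_groups):
    """Average interval length s: total map length over the number of intervals."""
    return total_cm / (n_markers - n_groups)

def length_add_2s(total_cm, n_markers, n_groups):
    """First estimate of L: add 2s to each linkage group for the chromosome ends."""
    s = mean_spacing(total_cm, n_markers, n_groups)
    return total_cm + 2.0 * s * n_groups

def length_method4(groups):
    """Method 4 estimate of L; `groups` is an iterable of (length_cm, m) pairs with m >= 2."""
    return sum(length * (m + 1) / (m - 1) for length, m in groups)

def coverage(d_cm, n_markers, genome_cm):
    """Proportion of the genome within d cM of a marker, assuming random marker placement."""
    return 1.0 - math.exp(-2.0 * d_cm * n_markers / genome_cm)

# Framework map totals reported in this paper: 1780 cM, 174 markers, 14 linkage groups.
print(mean_spacing(1780.0, 174, 14))
print(length_add_2s(1780.0, 174, 14))
print(round(coverage(10.0, 174, 2011.0), 3))
```

With the reported framework totals, the first two functions return the average spacing and the first genome length estimate, and the coverage function can be evaluated for any genome length estimate L and window d.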
Transmission ratio distortion: We tested for significant non-Mendelian genotype frequencies at each marker locus in the full data set [χ² with 1 d.f. (dominant markers) or 2 d.f. (codominant markers); Sokal and Rohlf 1995]. For codominant markers, we similarly tested for distortion of allele frequencies. We used two significance thresholds (α = 0.05 and 0.001) to provide less and more conservative estimates of the degree and extent of transmission ratio distortion. However, because the genotypes at individual markers are related by linkage and tests are thus not independent, we do not use Bonferroni or sequential Bonferroni corrections to account for multiple tests. To examine the pattern of transmission bias across the framework map, we also calculated the absolute deviation of the parental homozygote frequency from the Mendelian expectation of 0.25 at each locus on the framework map. This results in a single estimate of transmission bias for each dominant marker and two semi-independent values for each codominant marker. We then identified contiguous genomic regions containing multiple markers distorted in the same direction at α ≤ 0.05.
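The single-locus tests can be illustrated with a short Python sketch that uses only the standard library. The genotype counts below are invented for the example and are not data from this study; the function names are hypothetical. The closed-form tail probabilities used here hold only for 1 and 2 d.f., which are the two cases needed.

```python
import math

def chi_square(observed, expected_ratio):
    """Goodness-of-fit statistic for observed counts against a Mendelian ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value(stat, df):
    """Upper-tail chi-square probability; exact closed forms for df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or 2 are handled in this sketch")

# Hypothetical codominant marker: SF homozygotes, heterozygotes, IM62 homozygotes (1:2:1 expected).
counts = (90, 240, 150)
stat = chi_square(counts, (1, 2, 1))
print(stat, p_value(stat, 2) < 0.05, p_value(stat, 2) < 0.001)
```

A dominant marker would be tested the same way with two genotype classes, an expected ratio of (1, 3), and 1 d.f.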
We used the Bayesian multipoint mapping method described by Vogl and Xu (2000) to estimate the location and effects of transmission ratio distorting loci (TRDLs). This procedure makes use of reversible jump Markov chain Monte Carlo and treats the number of distorting loci and their positions and effects as unknown variables. The method assumes that different TRDLs act independently (i.e., they have multiplicative fitnesses). We analyzed our genotypic data using the program ANITA (kindly supplied by C. Vogl), which modifies the method of Vogl and Xu (2000) for a general (full sib or F2) cross. Each linkage group was analyzed separately by setting a Poisson prior distribution of the number of distorting loci (μ = 1) and running the program for 10,000 iterations.
RESULTS
Generation of informative markers: Microsatellites: To identify microsatellite markers informative for mapping in the interspecific cross, we surveyed 122 primer sets developed from an M. nasutus cDNA library probed for AATn and AGn repeats (Kelly and Willis 1998). Twenty-seven primer sets produced codominant fragment pairs that were polymorphic between the parental lines, reliably scorable, and segregating in the F2 mapping population (Table 1).
Gene-based markers: Using a degenerate primer approach, we found a single putative Mimulus homolog for both LFY and AP3 (Table 1). Length variation in introns within these gene regions allowed us to easily differentiate IM62 and SF alleles on denaturing acrylamide gels. We found two putative copies of CYC in Mimulus, which appear to be recent duplicates (paralogs) postdating the split between the Antirrhinum and Mimulus clades (CYCA and CYCB, Table 1; gene phylogenies not shown). Our finding of two putative copies of CYC is not surprising, given that recent paralogs are also present in Antirrhinum (Luo et al. 1999). Both CYCA and CYCB contained small indels that allowed the electrophoretic differentiation of parental alleles. All four loci amplified reliably and segregated as codominant markers in the F2 mapping population.
Table 1. Names, PCR conditions, and GenBank accession numbers of microsatellite (n = 27) and gene-based markers (n = 4) genotyped in the M. nasutus × M. guttatus F2 mapping population
Amplified fragment length polymorphisms: Of the 12 possible combinations of our three E + 3 and four T + 1 primers, eight selective amplifications produced consistently sharp and intense fragment patterns. These primer combinations were used to genotype the entire F2 population and each produced numerous (20–40) polymorphic and segregating bands (Table 2). Scored fragments ranged from 55 to >550 bp in length and we could resolve single base-pair differences in fragment mobility throughout this range.
We used relatively nonselective TaqI + 1 primers in the final amplifications (as opposed to the standard MseI + 3; Vos et al. 1995; Marques et al. 1998), but bands were well separated without further selective extensions on the TaqI primer. The TaqI recognition sequence (TCGA) includes a CpG dimer, whereas the recognition sequence of MseI restriction enzyme (TTAA) does not. CpG dinucleotides are generally underrepresented in the genomes of dicots and other organisms (Karlin and Burge 1995), but are often clustered in the 5′ regions of genes. Thus, restriction with TaqI will produce fewer and larger fragments than an equivalent MseI reaction (Vos et al. 1995) and may require fewer selective bases to generate readably sparse fragment patterns. Remington et al. (1998) noted a negative association between the CpG content of E + 3/M + 4 primer combinations and the number of polymorphic AFLP fragments in Pinus. In our study, amplifications with the E + ACG primer (one CpG) resulted in particularly strong and well-separated banding patterns, but produced as many polymorphisms as those with the other EcoRI primers (no CpG). Three of the four test amplifications using the E + ACC primer produced consistently faint bands and were not used in the full F2 genotyping. However, because this primer was labeled with only a single infrared dye (IRD800), these problems may reflect the generally faint IRD800 dye signal rather than the selective nucleotide sequence.
Table 2. Number of scored AFLP polymorphisms by primer combination
While scoring the AFLP gels, we identified a substantial number of fragment pairs segregating as alternative alleles at a single locus. Such pairs were generally separated by <10 bp and were characterized by alternative parental genotypes (e.g., IM62 =+ −, SF = − +), by a complete lack of −− genotypes in the F2 population (N > 430), and by ++ genotypes with ~50% band intensity relative to the single-banded genotypes. These cosegregating bands presumably reflect small insertion-deletion events in one of the DNA regions amplified by a particular set of E + 3/T + 1 primers. In total, 54 of 250 scored fragments (21.6%) were converted from dominant presence/absence polymorphisms into codominant markers (Table 2). Because AFLPs do not require taxon-specific primer development and many loci can be screened on a single gel, these 27 codominant AFLP markers were far more efficient to genotype than the equivalent number of polymorphic microsatellite loci. However, such markers can be identified accurately only in large F2 populations and their usefulness may not extend to other types of studies. No strongly bimodal pattern of band intensity (chromatogram peak height) was detected for most other AFLP loci (data not shown), indicating that it would not be possible to convert the remaining single presence/absence polymorphisms into codominant genotypic information.
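The qualitative criteria used to recognize allelic fragment pairs can be written as a simple check. The Python sketch below is a hedged illustration, not the procedure actually followed during gel scoring: the function name, the data layout (booleans for band presence), and the default 10-bp gap are assumptions drawn from the description above, and the band-intensity criterion is deliberately left out because it requires chromatogram data.

```python
def looks_like_allele_pair(size_a, size_b, parents, f2_calls, max_gap_bp=10):
    """Heuristic check that two AFLP fragments behave as alleles at one locus.

    parents:  {"IM62": (a, b), "SF": (a, b)}, True meaning the band is present.
    f2_calls: list of (a, b) boolean pairs, one per scored F2 individual.
    """
    # Criterion 1: fragments differ by fewer than max_gap_bp base pairs.
    close = abs(size_a - size_b) < max_gap_bp
    # Criterion 2: each parent carries exactly one band, and the parents differ.
    im62, sf = parents["IM62"], parents["SF"]
    alternative = im62 != sf and im62[0] != im62[1] and sf[0] != sf[1]
    # Criterion 3: no F2 individual lacks both bands.
    no_double_null = all(a or b for a, b in f2_calls)
    return close and alternative and no_double_null

# Hypothetical fragment pair 4 bp apart, scored in four F2 individuals.
parents = {"IM62": (True, False), "SF": (False, True)}
f2 = [(True, False), (True, True), (False, True), (True, True)]
print(looks_like_allele_pair(152, 156, parents, f2))
```

In practice the absence of double-null genotypes is only convincing in a large F2 sample, which is why such pairs are hard to identify in small populations.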
Construction of framework map: The M. guttatus × M. nasutus framework map consists of 174 marker loci spanning 1780 cM Kosambi on 14 linkage groups (Figure 1). This framework map was constructed through successive grouping, ordering, and evaluation of markers in the partial (dominants in coupling phase + codominants) and full data sets and represents the “best-behaved” subset of the 255 genotyped loci. Only one marker from the full data set was unlinked to any other marker at our criteria for initial grouping (minimum LOD 6, maximum distance 37 cM). The remaining markers grouped into 15 preliminary groups ranging in size from 4 to 40 loci. With a few exceptions, each of these groups corresponded to a pair of coupling-phase linkage groups with shared codominant markers. During the mapping process, we identified 41 markers as unreliable and dropped them before the final ordering steps. Generally, such markers placed in a single interval with high probability (LOD ≫ 2), but their placement resulted in substantial (>7 cM) increases in map length, nonadditive two-point distances within the group, and many apparent genotyping errors. Since we were interested primarily in generating a stable framework map for QTL analyses and introgression line formation, we chose to simply exclude such potentially unreliable markers rather than re-examine the entire genotypic data set for individual scoring errors. In constructing the final framework map, we also did not include 39 accessory markers that met our linkage and reliability criteria when ordered in their most likely position but that could not be placed in a single interval with LOD > 2. To avoid long regions in a single linkage phase, the framework map does include a few such markers in repulsion phase with adjacent markers. However, the majority of the 174 framework markers placed in a single, well-supported location (LOD > 3).
Codominant markers make up 25% of the markers on the framework map and the two classes of dominant AFLPs are represented in fairly even proportions (35 and 40%).
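As a point of reference for the centimorgan distances quoted here, the Kosambi mapping function relates a recombination fraction r to a map distance. The sketch below uses the standard textbook form of that function; the function names are illustrative, MAPMAKER's internal estimation is not reproduced, and reading the 37 cM grouping limit as a Kosambi distance is an assumption.

```python
import math


def kosambi_cm(r: float) -> float:
    """Kosambi map distance (cM) for a recombination fraction r in [0, 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse function: recombination fraction implied by a distance in cM."""
    if d_cm < 0.0:
        raise ValueError("distance must be non-negative")
    return 0.5 * math.tanh(d_cm / 50.0)


# Assumption: the 37 cM grouping limit is treated as a Kosambi distance.
# Under that reading it corresponds to r of roughly 0.31.
r_at_limit = kosambi_r(37.0)
```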
Figure 1. Framework linkage map of M. guttatus × M. nasutus F2 hybrid population. The names of codominant markers (microsatellites, gene-based markers, and codominant AFLPs) are underlined, the names of M. nasutus null AFLPs are in italic, and the names of M. guttatus null AFLPs are in plain text. Microsatellite names consist of AAT (or AG) plus a reference number. AFLP names consist of a primer pair code (see Table 2) plus the fragment length, plus c if codominant.
The 14 framework linkage groups correspond to the haploid chromosome number of these Mimulus species. In preliminary maps and in the final grouping of all markers with strict linkage criteria, one of these groups (LG14) appeared as two smaller groups. However, examination of the two-point linkage data revealed that all of the markers on the smaller of these groups (LG14a: AP3, CA150, BC80, and CC320) were weakly linked (LOD > 1) to several markers on one end of the other group (LG14b) at distances just above the threshold. LG14b consists only of dominant markers of mixed phase, which may have made linkage to the distant LG14a markers more difficult to detect. Nevertheless, LG14b consistently behaved as a single linkage group despite its lack of codominant markers. Linking these two subgroups for the final ordering resulted in a single large group that fit our criteria for additivity of two-point distances.
Map length and genome coverage: Several approaches to calculating genome length L indicate that the framework map provides nearly complete coverage of the M. nasutus/guttatus genome. If we assume a random distribution of markers and simply add twice the average interval length (s = 11.125 cM) to each group to account for chromosome ends extending beyond the terminal markers, we estimate the genome length to be 2092 cM. Using Method 4 of Chakravarti et al. (1991) with only framework markers produces a nearly identical estimate of L (2096 cM), whereas including linked nonframework markers results in a slightly smaller L (2011 cM). The framework map length of 1780 cM represents 85–89% of these estimated genome lengths. Using the formula $c = 1 - e^{-2dn/L}$ (see materials and methods) and the estimate of L from all linked markers (2011 cM), we estimate that 91.5% of the genome is within 10 cM of a linked marker. Using only the 174 loci on the framework map, we estimate that 82.3 and 96.9% of the genome is within 10 and 20 cM, respectively, of a framework marker.
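The end-correction and coverage arithmetic in this paragraph can be retraced with a few lines of Python. This is a minimal sketch with illustrative function names; it does not implement Method 4 of Chakravarti et al. (1991). The framework-only coverage values appear to be reproduced when L = 2011 cM is used, which is an inference from the arithmetic rather than something stated in the text.

```python
import math

MAP_LENGTH_CM = 1780.0   # framework map length
N_FRAMEWORK = 174        # framework markers
N_GROUPS = 14            # linkage groups


def mean_interval(map_length: float, n_markers: int, n_groups: int) -> float:
    """Average interval length s: map length over the number of intervals."""
    return map_length / (n_markers - n_groups)


def end_corrected_length(map_length: float, n_markers: int, n_groups: int) -> float:
    """Add 2s to every linkage group to allow for unsampled chromosome ends."""
    s = mean_interval(map_length, n_markers, n_groups)
    return map_length + 2.0 * s * n_groups


def coverage(d_cm: float, n_markers: int, genome_length: float) -> float:
    """Expected fraction of the genome within d cM of a marker: 1 - exp(-2dn/L)."""
    return 1.0 - math.exp(-2.0 * d_cm * n_markers / genome_length)


s = mean_interval(MAP_LENGTH_CM, N_FRAMEWORK, N_GROUPS)             # 11.125 cM
L_ends = end_corrected_length(MAP_LENGTH_CM, N_FRAMEWORK, N_GROUPS)  # 2091.5 cM

# Assumption: L = 2011 cM (the all-linked-markers estimate) is used here.
c10 = coverage(10.0, N_FRAMEWORK, 2011.0)
c20 = coverage(20.0, N_FRAMEWORK, 2011.0)

# Share of each genome length estimate spanned by the framework map.
share_low = MAP_LENGTH_CM / 2096.0
share_high = MAP_LENGTH_CM / 2011.0
```

Under these inputs the sketch returns coverage of about 0.823 and 0.969 at 10 and 20 cM, and map shares of about 0.85 and 0.89, in line with the figures quoted above.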
The above approaches to genome length estimation assume that the mapping process underestimates genome length. However, observation of chiasmata and other evidence suggests that recombination fractions and total genome length may be inflated by mapping programs such as MAPMAKER (Sybenga 1996). Another approach to estimating genome coverage compares the sum of recombination fractions across the framework map to the expectation given a single crossover per chromosome arm (Sybenga 1996; Whitkus 1998; Ott 1999). On our 14 linkage groups, the expected minimum of 28 crossovers would give a total expected recombination fraction of 1400. The sum of recombination fractions across all intervals on our framework map is 1692, suggesting that we could have overestimated genome length by as much as 20%. However, our identification and removal of significantly placed but unreliable internal markers probably reduced the contribution of scoring errors to map length inflation.
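The comparison with the crossover expectation is simple arithmetic, sketched below. It assumes two arms per chromosome and that one crossover per arm contributes 50 recombination units (percent), which is how the total of 1400 appears to be derived; the variable names are illustrative.

```python
N_GROUPS = 14
ARMS_PER_CHROMOSOME = 2          # assumption: every chromosome has two arms
UNITS_PER_CROSSOVER = 50.0       # one crossover per arm ~ 50% recombination

expected_crossovers = N_GROUPS * ARMS_PER_CHROMOSOME          # 28
expected_total = expected_crossovers * UNITS_PER_CROSSOVER    # 1400
observed_total = 1692.0          # summed recombination across framework intervals

# Relative excess of the observed sum over the one-crossover-per-arm minimum.
inflation = observed_total / expected_total - 1.0
```

The resulting excess of roughly 0.21 is the basis for the statement that genome length could be overestimated by as much as 20%.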
One other line of evidence suggests that the framework map, while not saturated with markers, spans almost the entire genome. Prior to the analyses presented here, we constructed numerous preliminary maps using increasingly large subsets of markers and individuals. New markers added late in this process were invariably linked to a previously established group and the number of unlinked markers stabilized after the first 156 markers were included, suggesting that no new genomic regions were being incorporated. Furthermore, only a single marker remained unlinked in the final data set and is probably not a legitimate marker distantly linked to the framework groups. This codominant marker (microsatellite locus AAT350) exhibited severe transmission ratio distortion and had many missing genotypes—only 1 of 394 genotyped F2 individuals was homozygous for the M. nasutus parental allele. This distorted marker may not provide sufficient information for mapping or may not be a true genetic locus.
Transmission ratio distortion: Nearly one-half (49%) of the 255 markers genotyped in our F2 mapping population deviate from the Mendelian expectation of 3:1 or 1:2:1 genotype ratios (for dominant and codominant markers, respectively) at α = 0.05. Nearly one-third (31%) show significantly distorted genotypic ratios at a more stringent threshold (α = 0.001). The codominant markers have the highest proportion of loci with distorted ratios (66 and 47% at α = 0.05 and 0.001, respectively). Relatively few dominant AFLPs with M. nasutus null alleles show distorted transmission ratios (36 and 22% at α = 0.05 and 0.001, respectively) and dominant AFLPs with M. guttatus null alleles are intermediate. The apparently higher incidence of transmission ratio distortion in codominant markers may reflect both lower standards for inclusion in the mapping data set and increased power to detect distortion with full genotypic information.
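A goodness-of-fit test against the 3:1 or 1:2:1 expectation can be sketched as follows. The use of a Pearson chi-square test here is an assumption about the test applied, the genotype counts are hypothetical and chosen only to sum to N = 526, and the function names are illustrative. Closed-form tail probabilities for 1 and 2 degrees of freedom keep the example within the standard library.

```python
import math


def chi_square(observed, ratio):
    """Pearson chi-square statistic against an expected ratio such as (3, 1)."""
    n = sum(observed)
    total = sum(ratio)
    expected = [n * part / total for part in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value(stat, df):
    """Upper-tail chi-square probability; closed forms exist for 1 and 2 d.f."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only 1 or 2 degrees of freedom are handled here")


# Hypothetical dominant AFLP: band present vs. absent, expected 3:1 (1 d.f.).
p_dominant = p_value(chi_square([420, 106], (3, 1)), df=1)

# Hypothetical codominant marker: GG, NG, NN counts, expected 1:2:1 (2 d.f.).
p_codominant = p_value(chi_square([170, 260, 96], (1, 2, 1)), df=2)
```

With these made-up counts the dominant marker would count as distorted at α = 0.05 but not at α = 0.001, whereas the codominant marker would be distorted at both thresholds.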
The transmission ratio bias we observe in F2 genotypes is highly directional. Of 38 codominant markers distorted at α = 0.05, 23 have an excess of M. guttatus (GG) genotypes and a deficit of M. nasutus (NN) homozygous genotypes, whereas only 9 show the opposite pattern. Very few markers show excesses (n = 3) or deficits (n = 3) of both NN and GG genotypes. For codominant markers with distorted genotype ratios, we also examined the degree and direction of bias in allele frequency. The majority of these markers showed allele frequencies distorted from the expected 1:1 ratio (34 and 23 at α = 0.05 and 0.001, respectively). However, fewer than half (14 of 38) of the distorted codominant markers had genotype ratios significantly different from the expectation given the random union of two gametes with the observed allele frequencies (χ² with 1 d.f., α = 0.05). Over two-thirds of those markers with significantly distorted allele frequencies were biased toward excess M. guttatus alleles (68 and 74% at α = 0.05 and 0.001, respectively). The genotype frequencies of dominant markers show a similar pattern of directional bias, with 77% of distorted loci exhibiting either an excess of GG or a deficit of NN genotypes.
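The two codominant-marker checks described here, allele frequency against 1:1 and genotype ratios against random union of gametes, can be sketched as below. The counts are hypothetical, the function names are illustrative, and treating the 2N allele copies as independent observations in the allele test is an assumption, since the text does not specify how that test was carried out.

```python
import math


def p_one_df(stat: float) -> float:
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def allele_test(gg: int, ng: int, nn: int):
    """Return (frequency of the G allele, P-value against a 1:1 allele ratio)."""
    g_copies = 2 * gg + ng
    n_copies = 2 * nn + ng
    expected = (g_copies + n_copies) / 2.0
    stat = ((g_copies - expected) ** 2 + (n_copies - expected) ** 2) / expected
    return g_copies / (g_copies + n_copies), p_one_df(stat)


def random_union_test(gg: int, ng: int, nn: int) -> float:
    """P-value for genotype counts vs. p^2 : 2pq : q^2 at the observed p.

    Three classes minus one estimated allele frequency leave 1 d.f.
    Assumes both alleles are present, so no expected count is zero.
    """
    total = gg + ng + nn
    p = (2 * gg + ng) / (2.0 * total)
    q = 1.0 - p
    expected = [total * p * p, total * 2.0 * p * q, total * q * q]
    stat = sum((o - e) ** 2 / e for o, e in zip((gg, ng, nn), expected))
    return p_one_df(stat)


# Hypothetical counts (GG, NG, NN) summing to N = 526.
freq_g, p_allele = allele_test(170, 260, 96)
p_union = random_union_test(170, 260, 96)
```

In this made-up case the allele frequency is strongly biased toward the M. guttatus allele, yet the genotype ratios are consistent with random union of gametes at that biased frequency, which is the situation described for more than half of the distorted codominant markers.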
Of course, genotypes at different markers are not independent observations and may be correlated by linkage relationships or other processes. The framework map provides an opportunity to examine the distribution of transmission ratio bias across the hybrid genome. We did not explicitly exclude distorted markers in constructing the framework map and the proportion of distorted markers on the map (48% at α = 0.05) is nearly identical to the proportion that are distorted in the full data set. The degree of transmission ratio distortion varies substantially among linkage groups (Figure 2). We identified nine regions (indicated by horizontal bars in Figure 2) containing multiple distorted markers. These regions occur on eight different linkage groups and most (seven of nine) are defined by markers with an excess of GG genotypes and/or a deficit of NN genotypes. Two linkage groups (LG1 and LG11) consist almost entirely of severely distorted markers and one (LG14) has two distorted regions (one +NN, one +GG) at opposite ends of the group.
We mapped TRDLs by implementing the multipoint Bayesian method developed by Vogl and Xu (2000). Initially, we allowed up to four distorting loci per linkage group. With this maximum, the program identified TRDLs on all but four linkage groups (LG2, LG4, LG9, and LG11). However, on the 10 groups with TRDLs, it consistently found the maximum of four TRDLs. These were generally located in one or two small regions (<10–20 cM, often within a single marker interval). In these cases, adjacent TRDLs had very low (<0.10) frequencies of both homozygote classes and asymmetric and complementary frequencies of the two classes of heterozygotes (NG and GN). These unexpected results may reflect the generalized coding of genotypes in this version of the mapping method. To accommodate both full sib and F2 families, the modified ANITA program estimates the effects of the two types of heterozygotes separately. In an F2 inbred line cross such as our mapping population, however, these two classes are indistinguishable. While the program should rapidly converge on symmetric frequencies in an F2 (C. Vogl, personal communication), the combination of dominant marker data with this flexibility may allow the program to generate biologically unlikely results.
We then reanalyzed each linkage group with a maximum of either two TRDLs (LG5 and LG14, which have markers significantly distorted in both directions) or one TRDL (all others). As in the previous analysis, four groups consistently contained no TRDLs (Figure 2). For groups with a single TRDL, its location was generally within the region containing the four TRDLs previously estimated. The program always found the maximum of two TRDLs on LG14, each corresponding to a region of distortion at either end of the group (Figure 2). Given a maximum of two TRDLs on LG5, the program consistently mapped two tightly linked TRDLs (with asymmetric frequencies of the two heterozygote classes, but mean heterozygote advantage; see above) to the central region of many distorted markers. However, given a maximum of one TRDL, it always located a single TRDL with a large excess of M. nasutus alleles near the end of the linkage group (position at 0.01 cM). These conflicting results are difficult to interpret (and are not presented in Figure 2), but it seems likely that LG5 does contain two TRDLs with opposite effects.
In total, we mapped nine TRDLs to eight linkage groups (Figure 2). Most TRDLs mapped to regions with multiple distorted markers (horizontal bars in Figure 2). However, the mapping method did not locate TRDLs in the highly distorted regions of either LG2 or LG11. Three TRDLs were mapped to regions that did not have obvious blocks of distorted markers (LG3, LG6, and LG13). One of these TRDLs (on LG3) had no effect on allele frequency but instead caused about a 10% excess of heterozygotes. The two other novel TRDLs map to regions of low marker density where visual inspection of genotypic ratios would not suggest their presence. Of the TRDLs with effects on allele frequency, six exhibited an excess of M. guttatus alleles and two exhibited an excess of M. nasutus alleles.
We combined the results of the TRDL mapping with the count of regions with multiple distorted markers to estimate the minimum number of loci causing unequal transmission of parental alleles. The TRDL mapping identified eight loci that substantially altered parental allele frequencies, not counting the putative overdominant locus on LG3. Although not detected by the mapping method of Vogl and Xu (2000), three more TRDLs probably exist within the highly distorted regions of LG2, LG5, and LG11. The conflicting TRDL mapping results for LG5 also suggest that there may be an additional distorting locus near position 0–10 cM on that linkage group. In total, we posit a minimum of 11–12 TRDLs involved in the unequal transmission of parental alleles to the F2 generation. Nine cause an excess of M. guttatus alleles and 2 or 3 cause an excess of M. nasutus alleles. This tally of TRDLs is likely to underestimate the true number of distorting loci because of the assumption of multiplicative fitness (no epistasis), our assignment of a maximum of one to two loci per linkage group, and limited power to detect loci with small effects or TRDLs in regions of sparse marker coverage.
Figure 2. Transmission ratio distortion across the M. guttatus × M. nasutus framework linkage map. The + and × symbols represent the two homozygous parental genotypes [M. nasutus (NN) and M. guttatus (GG), respectively] at marker loci on each of the 14 linkage groups. The vertical position of each symbol shows the magnitude and direction of the deviation of genotype frequencies from the Mendelian expectation (0.25). To show the reciprocality of bias, M. nasutus homozygote (NN) deviations were graphed directly [deviation = f(NN) − 0.25], and the M. guttatus (GG) deviations were graphed as negative [deviation = −(f(GG) − 0.25)]. Thus, values above the zero (dotted) line indicate excesses of NN homozygotes or deficits of GG homozygotes and values below the zero line indicate excesses of GG homozygotes or deficits of NN homozygotes. The horizontal bars indicate regions of two or more contiguous markers showing significant distortion (α = 0.05) in the same direction. The shaded peaks show the posterior frequency distributions of the location of TRDLs as estimated by the Bayesian mapping method of Vogl and Xu (2000). Peaks are labeled with the average frequency of the M. guttatus allele at the most likely TRDL location. Frequencies >0.2 were truncated to allow visualization of all peaks on the same scale. See text for discussion of TRDL mapping results.
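The plotted deviations defined in this caption can be computed from codominant genotype counts as in the short sketch below. The function name and the counts are hypothetical and serve only to illustrate the sign convention.

```python
def plotted_deviations(gg: int, ng: int, nn: int):
    """Return (NN deviation, GG deviation) using the caption's sign convention.

    NN is plotted as f(NN) - 0.25; GG is plotted as -(f(GG) - 0.25), so an
    excess of GG and a deficit of NN both fall below the zero line.
    """
    total = gg + ng + nn
    nn_dev = nn / total - 0.25
    gg_dev = -(gg / total - 0.25)
    return nn_dev, gg_dev


# Hypothetical marker with excess GG and deficient NN genotypes (N = 526).
nn_dev, gg_dev = plotted_deviations(170, 260, 96)
```

For this made-up marker both values are negative, so both symbols would sit below the dotted zero line.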
DISCUSSION
Map construction: Linkage maps provide a genetic framework for identifying quantitative trait loci and analyzing genome structure and are a powerful tool for the study of adaptation and speciation. Here, we present a framework linkage map of a cross between M. nasutus and M. guttatus, a pair of closely related species with widely divergent floral morphologies and mating systems. To generate a heterospecific linkage map suitable for QTL analyses and the eventual marker-assisted introgression of particular genomic regions, we took a thorough approach to framework map construction. We genotyped an F2 mapping population (N = 526) at 197 dominant AFLPs, 27 codominant AFLPs, 27 microsatellite loci, and four codominant gene-based markers. The large size of the mapping population allowed the confident identification of perfectly cosegregating pairs of AFLP bands, which provided nearly half of all codominant markers in the genotypic data set. The proportion of AFLP polymorphisms segregating as codominant pairs (21.6%) is similar to but slightly higher than the proportion observed in other plants (e.g., Paglia and Morgante 1998; Bai et al. 1999). These data illustrate the value and efficiency of AFLP markers for genetic linkage mapping in wild plant systems, particularly in combination with less abundant, but more informative, microsatellite and gene-based markers.
Mapping studies using dominant AFLP or RAPD markers to map interspecific crosses have generally used backcross mapping populations (e.g., Lin and Ritland 1996) or constructed separate maps with the two sets of coupling-phase markers (e.g., Bradshaw et al. 1995). However, the large number of recombinant F2 genotypes in our mapping population allowed the detection and estimation of linkage between dominant markers in repulsion phase. This statistical power, along with the relatively high proportion of codominant markers in the data set (>20%), allowed us to construct a single integrated linkage map rather than assign linkage group homology on the basis of shared codominant markers. The orders and intermarker distances of markers linked in coupling phase on the final integrated map were similar to those on preliminary single phase maps (data not shown), indicating that our framework map is a robust representation of linkage relationships among markers. Although linkage maps consisting entirely of codominant markers are ideal, maps alternating codominant markers with dominant markers linked in repulsion phase are nearly as accurate for the localization and characterization of QTL (Jiang and Zeng 1997).
Some errors in genotyping are inevitable in large data sets (particularly with AFLPs; Remington et al. 1998) and can lead to well-supported but incorrect marker placement (Buetow 1991; Ehm et al. 1996) and the overestimation of recombination fractions (Collins et al. 1996). We attempted to minimize the contribution of genotyping error to map order and length by starting with a large initial number of markers and dropping apparently unreliable markers from the map rather than re-examining individual genotypes. We used the two-point LOD table and error detection functions in MAPMAKER 3.0 (Lander et al. 1987; Lincoln and Lander 1992) to identify and manually prune “bad” markers from the map. Markers were dropped if their placement inflated the distance between flanking markers or generated an unusually high number of apparent recombination events in adjacent intervals. This approach also excludes markers that are inherently difficult to genotype, an important consideration for future QTL mapping and marker-assisted introgression projects.
Map length and genome coverage: Several lines of evidence suggest that the framework linkage map provides thorough coverage of the M. guttatus × M. nasutus genome. The 14 linkage groups, which range in size from 64 to 173 cM and each contain at least nine framework markers, presumably correspond to the 14 chromosomes found in the M. guttatus species complex. The only marker unlinked to these groups at our strict linkage threshold does not appear to be located in a distantly linked and unmapped portion of the genome. Estimates of genome length based on the map length (1780 cM) and the distribution of framework markers suggest that the map encompasses at least 85% of the genome and that 97% of the genome is within 20 cM of a framework marker. Because the entire F2 genome can be scanned for QTL, high coverage makes the framework map extremely useful for the detection and characterization of genomic regions associated with phenotypic divergence. In addition, our framework map includes highly variable microsatellite loci and gene-based markers, which are found on all but three linkage groups. These markers, which should also be informative in other Mimulus species, will facilitate comparisons with other genetic maps in the genus (Bradshaw et al. 1995, 1998; Lin and Ritland 1996). The coverage of our map contrasts with that of the only previous linkage map of the M. guttatus species complex, which consisted of 81 markers on 15 linkage groups and covered only 58% of the estimated genome length (Lin and Ritland 1996). However, the estimate of genome length from our study (2011–2096 cM Kosambi) is similar to Lin and Ritland's (1996) estimate for a backcross between M. guttatus and M. platycalyx (2474 cM Haldane).
Although the framework map provides good coverage of the hybrid genome, the map and genome length estimates should be viewed as qualitative rather than absolute. Multiple factors may contribute to the over- or underestimation of recombination fractions (r) and centimorgan distances. Compared to the number of recombination events expected on 14 chromosomes (Sybenga 1996), the framework map may overestimate genome length by as much as 20%. This is in contrast to studies of other interspecific F2 crosses, which often find suppressed recombination (Chetelat et al. 2000) or a deficit of detected crossovers (Whitkus 1998). We attempted to minimize map inflation due to genotyping error by excluding bad markers and using the error detection function in MAPMAKER 3.0 (which estimates r assuming a low fixed percentage of genotyping errors; Lincoln and Lander 1992), but some erroneous apparent crossovers may nonetheless lengthen the map. Recombination fractions for dominant markers linked in repulsion are particularly vulnerable to misestimation due to chance or error, since relatively few individuals have informative genotypes.
The systematic transmission ratio distortion we observed in some regions of the map (Figure 2) may also either artificially inflate intermarker distances or result in the tight clustering of markers (Bailey 1949; Lorieux et al. 1995a,b; Liu 1998). A pair of linked markers distorted in the same direction will have an apparent excess of nonrecombinant homozygotes relative to an otherwise similar pair of undistorted markers. Because MAPMAKER 3.0 does not use the actual genotypic ratios at each locus in estimating recombination fractions, the distances between markers distorted in parallel will be somewhat underestimated. This effect is visible in the apparent clustering of markers in the contiguous distorted regions of the framework map (Figure 2). Conversely, for markers distorted in opposite directions, the program will underestimate the frequency of double parental homozygotes and overestimate r. Lorieux et al. (1995a,b) and Quillet et al. (1995) have developed unbiased estimators of r for pairs of loci distorted due to either independent or shared biological causes, but to our knowledge these approaches have not been integrated into any standard mapping programs. Given the large genotypic data set and the high proportion of distorted markers, we chose not to calculate corrected estimates of r for all locus pairs, which involves constructing the linkage groups with manual three-point ordering rather than automated multipoint ordering (Quillet et al. 1995).
The effects of distortion on interval length should not diminish the utility of the framework map for QTL mapping, except that permutation thresholds for QTL detection will need to be calculated separately for the contiguous distorted regions (Doerge and Churchill 1996; R. Doerge, personal communication). Although transmission ratio bias may affect the distances between linked markers, it probably has not generated false linkage between framework markers that are actually on different chromosomes, as can occur (Cloutier et al. 1997). For example, the sets of distorted markers on LG1 and LG11 have similarly biased genotypic ratios (<18% NN and/or >37% GG), but all of these markers were unambiguously assigned to the two separate groups. Because markers with spurious two-point linkages are likely to lengthen the map when placed with multipoint mapping, we may have excluded most such markers during the framework mapping process. Distortion in general also does not appear to have been the major factor in the exclusion or nonplacement of markers; the framework markers show distortion in the same proportion (>45%) and to the same degree as the full marker data set.
Implications and possible sources of transmission ratio distortion: The large number of marker loci exhibiting distorted genotypic frequencies in our F2 mapping population provides insight into genomic differentiation between the parental species and also has important implications for natural and experimental introgression. Unequal transmission of alleles at nuclear loci is commonly observed in wide intraspecific and interspecific crosses (Zamir and Tadmor 1986; Jenczewski et al. 1997; Bradshaw et al. 1998; Whitkus 1998). The proportion of distorted markers in our F2 population (49% at α = 0.05) is at the high end of the range reported for crosses between plant species (reviewed by Jenczewski et al. 1997). The distorted markers are concentrated in particular regions of the linkage map and these regions are largely unidirectional in bias (most show an excess of M. guttatus and/or a deficit of M. nasutus homozygous genotypes; Figure 2). This pattern suggests that biological mechanisms, rather than chance or error, underlie most of the observed transmission ratio distortion. The grouping of distorted markers and the results of the TRDL analysis suggest that particular distorting loci (at least 11–12) may be responsible for the high incidence of transmission ratio distortion in the genotypic data set. Whatever the underlying processes, this pattern may circumscribe the genetic composition of advanced generation hybrids in the lab or wild by favoring the rapid fixation of M. guttatus alleles in some genomic regions and retarding introgression in others (Rieseberg et al. 1995, 2000).
In this study, transmission ratio distortion appears to result from interactions between the heterospecific genomes rather than from inbreeding depression or unconditional selection against parental genotypes at single loci. Inbreeding depression can be a major source of transmission ratio distortion in linkage mapping populations; indeed, some mapping projects are explicitly designed to identify the loci causing inbreeding depression (e.g., Remington and O'Malley 2000). However, both the design of our experiment and the pattern of distortion suggest that inbreeding depression is unlikely to contribute to the observed biases in genotypic ratios in our F2 mapping population. For inbreeding depression to substantially distort genotypic frequencies in an F2 population, a lethal or semilethal recessive allele heterozygous in one parent must be transmitted to the F1 and, upon selfing and segregation, cause differential zygote mortality and a deficit of carrier parent homozygotes in the F2 population. Because both parental lines in this study were highly inbred and normally fit, it is unlikely that such major deleterious alleles are being newly revealed in the F2 mapping population. Furthermore, although the M. guttatus inbred line could conceivably still carry hidden deleterious alleles (some surveyed marker loci remained heterozygous after five generations of selfing; L. Fishman, A. Kelly, E. Morgan and J. Willis, unpublished data), their segregation would not produce the predominant bias against M. nasutus genotypes that we observe. The segregation of deleterious alleles fixed in the highly inbred M. nasutus parent could explain the observed pattern. However, the high fitness of this line (and its source population) rules this explanation out: the selection coefficients necessary to generate the observed degree of transmission ratio distortion at multiple loci would cause the M. nasutus population to have an average fitness at least two or three orders of magnitude lower than the M. guttatus parent.
Heterospecific interactions, broadly construed, may bias the genotype frequencies of F2 hybrids at several stages. The non-Mendelian genotypic ratios we observed in the F2 mapping population could result from (a) events during meiosis or early development of gametophytes that distort allele frequencies in viable F1 gametes; (b) the differential fertilization success of viable F1 gametes; (c) the differential survival of F2 zygotes with different multilocus genotypes; or (d) all of the above.
Several lines of evidence (see below) favor selection among gametes as an important contributor to the biased genotypic ratios in this cross, but we must consider both gametic and zygotic mechanisms as potential causes of transmission ratio distortion. Several different mechanisms could generate unequal representation of parental alleles in the gametes of F1 hybrids. Autosomal meiotic drive, in which a killer allele eliminates gametes carrying alternative alleles, can cause severe segregation distortion. Such drive loci have been well characterized in Drosophila, mouse, and Neurospora (reviewed in Lyttle 1991). However, meiotic drive is an unlikely explanation for the widespread and generally moderate bias in genotype frequencies in our F2 hybrids, because such a large number of independently segregating drive loci would render the heterozygous F1 generation almost completely sterile rather than fairly fertile (Fishman and Willis 2001). Transmission ratio distortion may also be caused by the inviability of some or all recombinant gametes. Differential fitness of recombinant gamete genotypes could be caused by epistatic interactions between alleles at different loci (haploid expression of Dobzhansky-Muller incompatibilities; Dobzhansky 1951) or by aneuploidy of recombinant gametes. At least two mechanisms could result in such aneuploidy: recombination within chromosomal rearrangements (Grant 1971) or ectopic exchange promoted by differences in genome size (Jenczewski et al. 1997). All of these mechanisms should affect the proportion of viable gametes in F1 hybrids as well as allele frequencies in the viable F1 gametes. However, our data on the distribution of pollen and ovule inviability in F1 and F2 Mimulus hybrids do not support any haploid model of hybrid sterility and transmission ratio distortion (Fishman and Willis 2001). These mechanisms would produce a much lower incidence of gamete sterility in the F2 generation than the F1 because of the regeneration of the parental diploid genotypes and the additional removal of recombinant gametes by selection. We found the opposite pattern: lowest male and female fertility and many fully sterile individuals in the F2 generation (Fishman and Willis 2001).
Recombinant gametes that are equally viable may nonetheless vary in their fertilization success under competitive conditions. Intra- and interspecific variation in pollen tube growth rates is frequently observed in plants and can be an important prezygotic barrier to hybridization if conspecific pollen outperforms heterospecific pollen (reviewed in Rieseberg and Carney 1998). The divergent mating systems and floral morphologies of our Mimulus species provide the conditions for such differential fertilization success. Reciprocal crosses and mixed pollinations demonstrate that M. guttatus pollen tubes outgrow M. nasutus pollen on long styles, although pollen-pistil interactions also appear to be involved (Kiang and Hamrick 1978; Diaz and MacNair 1999). F1 pollen grains with M. guttatus alleles at pollen tube growth QTL may similarly outcompete those with M. nasutus alleles on F1 styles (which are equal to M. guttatus styles and twice as long as M. nasutus styles, on average; L. Fishman, A. Kelly and J. H. Willis, unpublished data) and generate a unidirectional transmission bias at linked markers. Several such pollen performance QTL, which cause differential transmission of parental alleles through pollen but do not affect F1 gamete viability or F2 fitness, have been mapped in rice and other grain crops (Harushima et al. 1996; Faris et al. 1998).
Finally, transmission ratio distortion could be caused by the inviability of zygotes with particular diploid hybrid genotypes. We know that epistatic interactions between parental genomes are at least partly responsible for high levels of pollen and ovule infertility in the F2 mapping population (Fishman and Willis 2001) and lethal Dobzhansky-Muller incompatibilities could certainly act at earlier life stages. In M. guttatus, two synthetic lethal systems have previously been identified that partially isolate a copper-adapted population from others throughout Central California (MacNair and Christie 1983; Christie and MacNair 1984, 1987). This suggests that negative interactions among alleles at different loci may develop very rapidly in Mimulus and could contribute substantially to the barriers between incipient species. However, because this mechanism selects against unmatched genotypes at pairs of interacting loci rather than favoring alleles from a single parent, it should not necessarily generate a unidirectional bias in genotype frequencies across multiple genomic regions. Interactions between nuclear loci and the maternally inherited cytoplasmic genotype could generate consistent asymmetry, but the direction of allelic bias we observe (deficits of M. nasutus, excesses of M. guttatus) is not predicted by the cytoplasmic genotype (M. nasutus) of our F1 and F2 hybrids.
Clearly, no single genetic mechanism can account for the patterns of transmission ratio distortion and sterility we observed in F1 and F2 hybrids between M. nasutus and M. guttatus. Our data on these postmating barriers to hybridization and introgression provide new insight into genetic divergence in this rapidly evolving system, but raise further questions about the underlying mechanisms. We are currently conducting crossing experiments designed to distinguish the effects of differential gamete viability, pollen competition, and zygote mortality on the biased transmission of parental alleles in hybrids. An ongoing project using nearly isogenic introgression lines to study the genetic basis of hybrid sterility should reveal which mechanisms, if any, contribute to both transmission ratio distortion and F2 hybrid breakdown. The framework linkage map we present here will be an invaluable guide for these projects, for QTL analyses of the phenotypic differences associated with mating system evolution, and for further studies of adaptation and speciation in Mimulus.
Acknowledgments
We are very grateful to J. Aagaard for the development of the gene-based markers used in this study and to C. Vogl for modifying and sharing a TRDL mapping program for F2 designs. Thanks also to members of the Willis Lab, J. Kelly, L. Moyle, and D. Remington, for helpful discussions of this material and to two anonymous reviewers for comments on earlier versions of this manuscript. S. Belcher, the greenhouse staff of the University of Oregon, and many undergraduate students helped with the crossing, care, and measurement of the experimental plants. This work was supported by grants from the National Science Foundation.
Footnotes
- Communicating editor: J. A. Birchler
- Received May 25, 2001.
- Accepted September 5, 2001.
- Copyright © 2001 by the Genetics Society of America