Mammalian genomes can vary substantially in haploid chromosome number even within a small taxon (e.g., 3–40 among deer alone); in contrast, teleost fish genomes are stable (24–25 in 58% of teleosts), but we do not yet understand the mechanisms that account for differences in karyotype stability. Among perciform teleosts, platyfish (Xiphophorus maculatus) and medaka (Oryzias latipes) both have 24 chromosome pairs, but threespine stickleback (Gasterosteus aculeatus) and green pufferfish (Tetraodon nigroviridis) have just 21 pairs. To understand the evolution of teleost genomes, we made a platyfish meiotic map containing 16,114 mapped markers scored on 267 backcross fish. We tiled genomic contigs along the map to create chromosome-length genome assemblies. Genome-wide comparisons of conserved synteny showed that platyfish and medaka karyotypes remained remarkably similar with few interchromosomal translocations but with numerous intrachromosomal rearrangements (transpositions and inversions) since their lineages diverged ∼120 million years ago. Comparative genomics with platyfish shows how reduced chromosome numbers in stickleback and green pufferfish arose by fusion of pairs of ancestral chromosomes after their lineages diverged from platyfish ∼195 million years ago. Zebrafish and human genomes provide outgroups to root observed changes. These studies identify likely genome assembly errors, characterize chromosome fusion events, distinguish lineage-independent chromosome fusions, show that the teleost genome duplication does not appear to have accelerated the rate of translocations, and reveal the stability of syntenies and gene orders in teleost chromosomes over hundreds of millions of years.
A century after Alfred Henry Sturtevant published the first genetic map and described the logic underlying map construction (Sturtevant 1913), meiotic maps remain critical for genomic analysis because they can provide chromosome-length contiguity to the thousands of scaffolds that today’s reference genome sequences typically generate (Lewin et al. 2009). The long-range contiguity that extensive genetic maps can supply is essential for understanding chromosome rearrangements, such as the inversions Sturtevant described in GENETICS (Sturtevant and Beadle 1936), and is important for learning how genomes evolve over time.
Chromosome numbers in teleost fish are remarkably conserved compared to mammals: 58% of teleost species (334 of 580 species examined) have 24 or 25 haploid chromosomes with a small secondary peak at 50 chromosomes (25/580 species, 4.3%), most of which likely represent genome duplication events (Naruse et al. 2004). In contrast, the distribution of chromosome numbers among mammals shows four small peaks at 19 (4.7% of mammals), 21 (8.5%), 22 (8.1%), and 24 (8.8%) haploid chromosomes, with the remaining 70% distributed between three chromosomes and >50 chromosomes (Naruse et al. 2004). Even within single mammalian taxa, chromosome numbers often vary immensely; for example, among Cervidae (deer) the haploid number varies between 3 and 40 and among Rodentia (rodents) the haploid number varies between 5 and 51 (Scherthan 2012).
Among teleosts with a sequenced genome, zebrafish (Danio rerio) has 25 haploid chromosomes like most teleosts, but platyfish (Xiphophorus maculatus) and medaka (Oryzias latipes) have 24, cod (Gadus morhua) 23, fugu pufferfish (Takifugu rubripes) 22, and both stickleback (Gasterosteus aculeatus) and green pufferfish (Tetraodon nigroviridis) have 21 (Figure 1). The construction of an extensive meiotic map for the platyfish (X. maculatus) aligned to scaffolds from the newly sequenced genome (Schartl et al. 2013) allowed us to evaluate, in unprecedented detail, the roles of translocations and inversions in teleost karyotype evolution and to distinguish among hypotheses for the evolutionary origin of karyotypes with different chromosome numbers.
We arrayed platyfish genome scaffolds against the largest genetic map of any vertebrate except the newly published zebrafish map (Howe et al. 2013). We constructed the meiotic map from a backcross-mapping panel generated by crossing platyfish (X. maculatus) to swordtail (X. hellerii) using swordtail as the recurrent parent. The platyfish genetic map contains 16,114 polymorphic markers scored on 267 individuals localized along the 24 platyfish chromosomes (Walter et al. 2004; Schartl et al. 2013). Because the platyfish genetic map used single nucleotide polymorphisms (SNPs) present in sequence tags adjacent to restriction enzyme cut sites (RAD-tags) (Miller et al. 2007; Baird et al. 2008; Amores et al. 2011), the sequences of mapped markers can be associated readily with contigs in the genome sequence assembly (Schartl et al. 2013). The alignment of the genetic map to the assembly resulted in the assignment of 90% of the total genome sequence present in genomic contigs to a chromosomal position. The green pufferfish T. nigroviridis genome, in contrast, has just 65% of the genome sequence aligned to chromosomes (Jaillon et al. 2004). The combination of the extensive meiotic map and platyfish genome sequence scaffolds provides an unprecedented opportunity to study karyotype evolution at the full chromosome level during teleost diversification (Figure 1).
Materials and Methods
Crossing a single X. maculatus Jp 163 A male (WLC no. 1325) to a single X. hellerii (strain Rio Lancetilla, Db-, WLC no. 1337) female produced numerous progeny. Two F1 hybrid females were backcrossed to X. hellerii males, which produced a mapping panel of 267 backcross fish. Genomic DNAs from map cross parents and progeny were digested with the restriction enzyme SbfI (New England Biolabs) followed by the ligation of adapters with five nucleotide (nt) barcodes, each of which differed by at least two nucleotides to guard against sequencing error. RAD-tag libraries were made as described (Amores et al. 2011). A 50-ng aliquot of size-selected DNA was PCR amplified for 12 cycles and fragments 200–500 bp long were gel purified and sequenced using 80-nt single-end reads on an Illumina Genome Analyzer IIx sequencer. Equal quantities of barcoded DNA from the parents were loaded into a half sequencing lane each and DNAs from 16 progeny were loaded onto each sequencing lane. Low quality reads and ambiguous barcodes were discarded. We used Stacks software (Catchen et al. 2011) to sort retained reads into loci and to genotype individuals by implementing a likelihood-based SNP calling algorithm to distinguish SNPs from sequencing errors (Hohenlohe et al. 2010). The parameters for Stacks provided to denovo_map.pl were: a minimum of three reads for a stack (-m 3), up to four differences when merging stacks into loci (-M 4), and up to four fixed differences when merging loci from various individuals into the catalog. Stacks exported data into JoinMap 4.0 (Wageningen, The Netherlands) for linkage analysis using at least 120 of 267 individuals for each marker.
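The barcode design constraint described above (every pair of barcodes differing by at least two nucleotides) can be checked with a pairwise Hamming distance. The following is a minimal sketch only; the barcode sequences are invented for illustration and are not the barcodes used in this study, and the function names are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance over all pairs in a barcode set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


# Hypothetical 5-nt barcodes (illustrative only, not from the study).
example_barcodes = ["AACCA", "CCGGT", "GGTTC", "TTAAG"]

# A set passes the design rule if no two barcodes are closer than 2 nt,
# so a single sequencing error cannot turn one barcode into another.
barcode_set_ok = min_pairwise_distance(example_barcodes) >= 2
```

With a minimum distance of two, a read carrying one miscalled barcode base matches no valid barcode and can be discarded as ambiguous rather than assigned to the wrong individual.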
The sequence for each of the initial set of 18,119 polymorphic RAD-tags was mapped onto sequenced genome contigs and annotated by the Synteny Database (Catchen et al. 2009). In addition, the set of sequences surrounding microsatellites from a previous genetic map (Walter et al. 2004) were identified in genomic contigs and each contig that contained both a mapped microsatellite and a mapped RAD-tag was associated to the RAD-tag map, thus providing anchor markers for numbering platyfish chromosomes in accord with previous assignments. Because JoinMap could not handle datasets as large as ours, we partitioned markers into two overlapping subsets, with all sets containing common anchor markers. Markers were initially grouped in JoinMap 4.0 using the “independence LOD” parameter under “population grouping” with a minimum LOD value of 15. After this initial grouping to individual linkage groups (LGs), markers were divided into three groups of eight different LGs each. Subsequent grouping and ordering of markers was performed at a minimum LOD of 30.0. Marker ordering was performed using the maximum likelihood (ML) algorithm in JoinMap 4.0 with default parameters. Suspicious double recombinants were identified using the “genotype probabilities” feature in JoinMap 4.0 and visual inspection of the colorized graphical genotypes. Double recombinants were reevaluated by visually inspecting reads in Stacks and followed by manual correction of genotypes when necessary. For example, if a suspected homozygote causing a putative double crossover had a small number of reads, the genotype would be eliminated because it might represent a heterozygote that by chance had not had a read for the alternative allele; likewise, if a suspected heterozygote caused a double crossover and had many reads, but only one read for the second allele, it would also be eliminated because that single read might be a sequencing error. 
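The two elimination rules in the example above can be written as a small decision function. This is a sketch under stated assumptions: in the study the reevaluation was done by visual inspection of reads, the depth thresholds `low_depth` and `high_depth` are hypothetical values chosen for illustration, and the function name is not part of Stacks or JoinMap.

```python
def should_drop(call, reads_allele1, reads_allele2, causes_double_crossover,
                low_depth=5, high_depth=20):
    """Return True if a genotype call should be eliminated.

    call: "hom" or "het".
    low_depth / high_depth: hypothetical read-depth cutoffs (assumptions).
    """
    # Only calls that create a suspicious double crossover are reviewed.
    if not causes_double_crossover:
        return False
    total = reads_allele1 + reads_allele2
    minor = min(reads_allele1, reads_allele2)
    # A shallow homozygote may be a heterozygote whose second allele
    # was missed by chance.
    if call == "hom" and total < low_depth:
        return True
    # A deep heterozygote supported by a single read for the second
    # allele may reflect a sequencing error.
    if call == "het" and total >= high_depth and minor == 1:
        return True
    return False
```

For instance, `should_drop("hom", 3, 0, True)` flags a three-read homozygote, whereas a homozygote with 30 reads in the same context is retained.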
After genotype correction, the complete dataset was loaded again into JoinMap 4.0 to obtain a new order of markers and the process of identification of double recombinants was performed again until suspicious double recombinants and double recombination events were minimized. The expected recombination count feature in JoinMap 4.0 was used to identify individuals with a higher than expected number of recombination events and visual inspection of marker order was performed. When needed, marker order was manually optimized after visual inspection of the colorized graphical genotypes in JoinMap 4.0. If moving a marker or group of markers reduced the total number of recombination events, the marker was manually moved to a new position. Subsets of markers were used to compare the order of markers along a linkage group using both regression mapping and maximum likelihood algorithms in JoinMap 4.0. Results showed no significant differences (Supporting Information, Figure S1). Segregation distortion was determined by calculating the χ2 values for each marker using JoinMap 4.0. Relative segregation distortion along each linkage group was determined by plotting the χ2 values against position along each linkage group using markers with no missing genotypes. Significance thresholds of P = 0.01 and P = 0.005 were plotted as horizontal lines on the graphs (Figure S2). The map is available at http://genome.uoregon.edu/xma/.
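The criterion used for manual optimization (a move is accepted if it lowers the total number of recombination events) can be illustrated by counting genotype switches between adjacent markers across individuals. This is a minimal sketch, not the JoinMap procedure; the genotype coding ('a' for homozygote, 'h' for heterozygote, '-' for missing) and the toy data are assumptions for illustration.

```python
def count_recombinations(genotypes, order):
    """Total genotype switches between consecutive markers, summed over fish.

    genotypes: dict mapping marker name -> string of per-individual calls,
               using 'a' (homozygous), 'h' (heterozygous), '-' (missing).
    order: list of marker names in the proposed map order.
    """
    n_individuals = len(next(iter(genotypes.values())))
    total = 0
    for i in range(n_individuals):
        previous = None
        for marker in order:
            call = genotypes[marker][i]
            if call == "-":
                continue  # missing calls are skipped, not counted
            if previous is not None and call != previous:
                total += 1
            previous = call
    return total


# Toy data: three markers scored on four backcross individuals.
toy = {"m1": "aaah", "m2": "ahhh", "m3": "aahh"}
events_original = count_recombinations(toy, ["m1", "m2", "m3"])
events_moved = count_recombinations(toy, ["m1", "m3", "m2"])
```

In this toy case the order m1, m3, m2 requires fewer recombination events than m1, m2, m3, so under the stated criterion the marker would be moved.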
Assigning genome contigs to the genetic map
Platyfish genome contigs were aligned along the genetic map by performing BLAST comparisons of the nucleotide sequences of mapped RAD-tag markers to genome contigs using the Synteny Database. Some small contigs contained a single mapped RAD-tag marker that ordered the contig along the chromosome, but because a single anchor point leaves orientation ambiguous, orientation was randomly selected. This protocol would not affect the number of interchromosomal translocations identified, nor would it alter the number of intrachromosomal transpositions discovered, but some inversions in the alignment would not actually be present in the fish’s genome. When more than one adjacent mapped RAD-tag marker hit the same contig, that contig was ordered on the chromosome in its proper orientation. Sometimes, two genetic markers from independent map positions aligned to the same scaffold, causing map order to disagree with genome assembly order. This type of discrepancy could be caused by either (1) erroneous genome assembly or (2) incorrect detection of crossovers in the genetic map. The latter is unlikely because nearby double crossovers are biologically rare. In these cases, the genetic map was hand checked to confirm map orders and, if verified, the scaffold was broken into fragments so that assembly order agreed with genetic map order. To break the scaffold, the nearest contig boundaries to the genetic markers within the scaffold were identified (because scaffolds are composed of contigs alternating with gaps) and the scaffold was broken at the contig–gap boundary.
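The orientation logic can be sketched as a comparison of marker order along the contig with marker order along the map. This is an illustration under assumptions: the function name and input layout are hypothetical, and the sketch reports single-marker contigs (and contigs whose markers share one map position) as ambiguous rather than choosing an orientation at random as was done here.

```python
def orient_contig(markers):
    """Infer contig orientation from its mapped markers.

    markers: list of (bp_position_on_contig, cM_position_on_map) tuples.
    Returns "+" if map position increases along the contig, "-" if it
    decreases, and None if orientation cannot be determined.
    """
    if len(markers) < 2:
        return None  # a single anchor point leaves orientation ambiguous
    ordered = sorted(markers)  # sort by base-pair position on the contig
    first_cm, last_cm = ordered[0][1], ordered[-1][1]
    if last_cm > first_cm:
        return "+"
    if last_cm < first_cm:
        return "-"
    return None  # markers cosegregate, so they carry no orientation signal
```

A contig with markers at 100 bp (10.0 cM) and 5000 bp (10.4 cM) would be placed in the forward orientation, and the same contig with the map positions swapped would be reversed.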
After aligning the genome to the genetic map, we first assigned each nucleotide on each chromosome to a numerical position and then aligned each of our platyfish RNA-seq transcript contigs (Shen et al. 2011; Garcia et al. 2012) to its genome contig and recorded the maximal start/end coordinates for each gene. Orthologs of each transcriptome contig were called by algorithms of the Synteny Database as described (Catchen et al. 2009).
For analyzing conserved syntenies, we employed the Synteny Database using parameters as described (Catchen et al. 2009) and datasets from Ensembl version 71 including zebrafish zv9. To construct dot plots, the Synteny Database identifies orthologs and paralogs for each gene along a specific platyfish chromosome by reciprocal best BLAST analysis and plots positive results on chromosomes of the same or other species. When the “scale plot to actual chromosome lengths” option is not used, the mark appears directly above the index gene on the index chromosome; in this mode, dot plots display gene orders according to the index chromosome and identify chromosomal regions of orthology or paralogy in the subject genome.
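The core idea of a reciprocal best hit can be sketched in a few lines. This is only the bare concept and should not be read as the Synteny Database pipeline, which is described in Catchen et al. (2009); the gene names and bit scores below are hypothetical.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict mapping (query, subject) -> alignment score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical scores between platyfish (xma*) and medaka (ola*) genes.
xma_vs_ola = {("xmaA", "olaA"): 500, ("xmaA", "olaB"): 200,
              ("xmaB", "olaB"): 300, ("xmaC", "olaB"): 350}
ola_vs_xma = {("olaA", "xmaA"): 480, ("olaB", "xmaC"): 340,
              ("olaB", "xmaB"): 290}
pairs = reciprocal_best_hits(xma_vs_ola, ola_vs_xma)
```

Here `xmaB` is excluded because its best medaka hit, `olaB`, has a better match elsewhere (`xmaC`), which is the property that keeps one-way hits out of the dot plot.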
Fluorescent in situ hybridization
For fluorescent in situ hybridization (FISH) to mitotic chromosomes, we used BAC ID XMAA-37L10 containing egfrb from contig 394, which corresponds to the previous LG24 (Walter et al. 2004) and is located at the bottom of our LG21, and BAC ID XMAA-106I17 from contig 126, which corresponds to the previous LG21 (Walter et al. 2004) and is at the top of our LG21. These BACs were labeled with fluorescein 12-dUTP and tetramethyl-rhodamine-5-dUTP, respectively, by standard nick translation. A hybridization mixture containing 200 ng of each labeled probe and 100-fold excess of sonicated genomic female DNA was denatured and allowed to hybridize overnight to the denatured metaphase chromosomes of X. maculatus (strain JP163A) males, the heterogametic sex. After a stringent wash (0.5× SSC, 70°) and counterstaining with DAPI (4′-6-diamidino-2-phenylindole), chromosomes were analyzed with a Zeiss AxioImager microscope. Metaphase chromosomes showing clear hybridization signals with each probe were separately photographed using a CCD camera and subsequent images from the two probes were merged and processed with FISH View 2.0 software (Applied Spectral Imaging).
A meiotic map for the platyfish
Current high-throughput genome sequencing methodologies produce hundreds or thousands of genomic contigs that can be ordered into chromosome-length units using meiotic maps. To order scaffolds and contigs of the platyfish (X. maculatus) genome, we constructed a meiotic map from a backcross family using genetic polymorphisms present in RAD-tags, short sequences adjacent to infrequent restriction sites (Miller et al. 2007; Baird et al. 2008; Hohenlohe et al. 2010; Amores et al. 2011). Sequencing produced 28 million reads for the parents and 780 million reads for the progeny. After filtering sequences with ambiguous barcodes, low-quality scores and ambiguous RAD sites, we retained 21 million reads for the parents (average coverage >150 times) and 575 million reads for the progeny (average coverage per individual 35 times). To create the catalog of parental RAD-tags, 7,402,369 reads from the male parent went into “ustacks” and 7,350,842 reads were retained (99.3%), while 4,750,882 reads from the female parent went into ustacks and 4,705,641 were retained (99.04%). Nonretained reads contained too many sequencing errors to form their own stack or to merge into any existing loci. Among the 62,600 total markers scored in the parents, Stacks software (Catchen et al. 2011) identified 18,119 polymorphic markers, 16,114 of which formed 24 linkage groups at LOD 35; among vertebrates, only zebrafish (Howe et al. 2013) has a meiotic map with more markers. The ∼2000 polymorphic markers that were not located on the map failed because they had fewer than the required minimum number of genotypes or did not join a linkage group at the minimum LOD value. Figure S3 shows the entire map. The platyfish map contains data from 4,100,799 genotypes, 155,401 missing values (3.6%), and 7,667 manually corrected genotypes (0.18%).
Most markers showed, as expected, an ∼1:1 ratio of homozygotes for X. hellerii alleles (the recurrent parent in this backcross) to heterozygotes, but 4022 markers (about one-quarter of the total number of polymorphic markers) showed significant segregation distortion (P < 0.01), as might be expected in an interspecific cross with some incompatibility among evolved genotypes. For example, some markers in LG21 (linkage group 21, the sex chromosome) showed a ratio of 168:89 homozygous X. hellerii to heterozygous genotypes; in other regions, for example in part of LG4, segregation distortion had the opposite polarity with a ratio of 94:166 for homozygous-to-heterozygous genotypes. Segregation distortion was not random, as might be expected from genotyping errors, but was localized to specific small regions of 13 linkage groups (Figure S2, Table S1). Isolated markers showing segregation distortion due to missing genotypes were always surrounded by markers belonging to the same genomic scaffold; this result shows that missing genotypes did not alter the map location of the genetic marker. Markers showing segregation distortion likely represent either a recessive locus from the X. hellerii recurrent parent, which is deleterious when homozygous (or a synthetic lethal) in a partially platyfish genetic background, or a dominant locus from the X. maculatus parental genome, which is lethal or deleterious when heterozygous in combination with certain X. hellerii genotypes at other loci.
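The two genotype ratios quoted above can be checked against the 1:1 backcross expectation with a χ2 goodness-of-fit test on one degree of freedom. This is a minimal sketch using only the Python standard library; the values in the study were computed in JoinMap 4.0, and the function names here are hypothetical.

```python
import math


def chi_square_1to1(n_hom, n_het):
    """Chi-square statistic for deviation from a 1:1 genotype ratio."""
    expected = (n_hom + n_het) / 2
    return ((n_hom - expected) ** 2 + (n_het - expected) ** 2) / expected


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variable with 1 df.

    For 1 df the statistic is a squared standard normal, so the tail
    probability equals erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2))


chi2_lg21 = chi_square_1to1(168, 89)  # LG21 example from the text
chi2_lg4 = chi_square_1to1(94, 166)   # LG4 example from the text
```

Both examples give statistics of roughly 24.3 and 19.9, far beyond the value of about 6.63 that corresponds to P = 0.01 on one degree of freedom, consistent with their description as significantly distorted.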
The content of platyfish RAD-tag meiotic map linkage groups varied from 947 markers for LG5 to 359 markers for LG24 and linkage group length ranged from 68.5 cM for LG15 to 47.7 cM for LG9 for a total genome length of 1328.3 cM (Table S2). Algorithms for estimating genome length from genetic mapping data showed that the RAD-tag map covers from 98.4 to 99.7% of total genome length (Chakravarti et al. 1991; Fishman et al. 2001).
To associate the new RAD-tag map to the previous microsatellite-based map (Walter et al. 2004), we identified genomic contigs containing published mapped microsatellites and then found mapped RAD-tags contained on these contigs; this procedure assigned linkage groups from the RAD-tag map to previously defined LGs. Results showed that the previously described LG21 and LG24 (Walter et al. 2004) were linked together in the RAD-tag map in what constitutes the new LG21 (Figure 2, A–C). A newly identified LG24 brought the linkage group number to the described number of 24 chromosomes.
To verify the linkage mapping results showing that LG21 and LG24 from the microsatellite map are in fact on the same linkage group, we performed fluorescent in situ hybridization to mitotic chromosomes. Two-color BAC hybridization using clones residing on each of the previous two linkage groups clearly showed specific signals on a single chromosome pair in a male platyfish (Figure 2D). This result confirms the genetic linkage map showing that these BACs, previously assigned to different linkage groups, are indeed cytologically linked on the same chromosome and can recognize the homomorphic XY chromosomes. Though these BACs are syntenic on sex chromosomes, their physical locations are strikingly distinct: the BAC from contig 126 is located directly below the centromere (proximal part of the q arm), whereas the BAC from contig 394 is present distally close to the long arm telomere as predicted by the linkage map.
Of the 1950 original genomic scaffolds, 131 were split within a linkage group. In some cases splitting was due to a single marker out of order in the linkage group compared to the predicted order from the assigned base pair position in the scaffold; all of these cases were due to a missing genotype in the map. A total of 101 scaffolds had markers on more than one linkage group. In all cases where the discrepancy involved more than a single marker, all discrepant markers mapped to the same location on the other linkage group; the mapping of discrepant markers to adjacent locations shows that the map locations are unlikely to be due to genotyping errors. In those cases, we broke the scaffold at the nearest gap with no sequence (a string of Ns) in the genome assembly scaffold and assigned the two parts according to the linkage mapping. For example, scaffold JH556751.1 has 25 markers between positions 33.4 and 36.5 cM in LG2 and 40 markers between positions 67.5 and 68.3 cM in LG10. In another case, two of the 82 markers from scaffold JH556677.1 map to the same location in LG21 (Figure 2), while the remaining 80 markers are located in LG13. In such cases, the sequence assembly is likely to be in error because the probability that adjacent markers in a scaffold localize purely by chance at the same position on a different linkage group than the other markers in the scaffold is extremely low. In all other cases, markers belonging to a given scaffold were contiguous in a linkage group. Note that to score a marker confidently as a heterozygote, just a few reads from the second allele are sufficient, but to score a marker as a homozygote always requires the number of reads to exceed a criterion. Regression and ML mapping algorithms provided nearly identical marker orders (Figure S1).
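Scaffolds that carry markers on more than one linkage group can be found by tallying, for each scaffold, the linkage groups of its mapped markers. The sketch below uses the marker counts quoted above for JH556751.1 and JH556677.1; the third scaffold (`scaffold_X`) is a hypothetical control, and the function name is an assumption for illustration.

```python
from collections import Counter, defaultdict


def scaffolds_spanning_linkage_groups(marker_hits):
    """Return scaffolds whose markers map to more than one linkage group.

    marker_hits: iterable of (scaffold_id, linkage_group), one per marker.
    Result maps scaffold_id -> {linkage_group: marker_count}.
    """
    per_scaffold = defaultdict(Counter)
    for scaffold, linkage_group in marker_hits:
        per_scaffold[scaffold][linkage_group] += 1
    return {scaffold: dict(counts)
            for scaffold, counts in per_scaffold.items()
            if len(counts) > 1}


hits = (
    [("JH556751.1", "LG2")] * 25 + [("JH556751.1", "LG10")] * 40
    + [("JH556677.1", "LG21")] * 2 + [("JH556677.1", "LG13")] * 80
    + [("scaffold_X", "LG5")] * 12  # hypothetical well-behaved scaffold
)
candidates = scaffolds_spanning_linkage_groups(hits)
```

The per-linkage-group counts matter for interpretation: a scaffold flagged by a single stray marker is a weaker candidate for misassembly than one in which several adjacent markers agree on the second linkage group.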
The anchoring of platyfish contigs to the extensive platyfish genetic map allowed for the long-range comparison of gene orders across species. Analysis of conserved syntenies using the Synteny Database (Catchen et al. 2009) revealed evolutionary relationships of platyfish chromosomes. Figure S3 displays the full map.
Conserved syntenies of ohnologous platyfish chromosomes arising in the teleost genome duplication
A dot plot of a platyfish chromosome’s paralogs distributed across the entire platyfish genome shows that most paralogs are located on a single other platyfish chromosome. For example, most of the paralogs of genes on Xma1 (X. maculatus chromosome 1) are located on Xma20 (Figure 3A) and, reciprocally, most paralogs of genes on Xma20 are located on Xma1 (Figure 3B). Dot plots showed that 8 of the 24 platyfish chromosomes have a similar one-to-one paralogous relationship with another platyfish chromosome, including Xma3/Xma13, Xma7/Xma24, and Xma8/Xma12. We conclude that these ohnologous (or homeologous) chromosomes derived from whole chromosome duplication events, with the simplest explanation being that they arose in a whole genome duplication event, the teleost genome duplication (TGD). Because chromosome paralogy contents for these four chromosome pairs are reciprocal, no translocations disrupted these eight chromosomes in the 350 million years since the TGD.
In some cases, only one member of a pair of ohnologous chromosomes suffered a translocation after the TGD. Three chromosomes (Xma17, Xma18, and Xma21) have paralogs prominently only on one other chromosome, but the second partner chromosome has paralogs on more than one chromosome; for example, paralogs of genes on Xma17 lie predominantly on Xma2, but many paralogs of genes on Xma2 lie on Xma4. In a third pattern, 13 platyfish chromosomes (Xma2, -4, -5, -6, -9, -10, -11, -14–16, -19, -22, and -23) have TGD paralogous segments on multiple chromosomes. For example, the first 3-Mb segment of Xma16 is paralogous to Xma10, the next 7 Mb is paralogous to Xma5, and the last 14 Mb is paralogous again to Xma10 (Figure 3C). Reciprocally, Xma10 is mostly paralogous to Xma16, but with a portion that is paralogous to Xma22 (Figure 3D), and Xma5 is paralogous to not only Xma16, but also Xma14 and Xma22 (Figure 3E). This result would be expected from a translocation involving one of the two ohnologous chromosomes after the TGD.
Together, these results show that for one-third of the platyfish chromosomes (four pairs), translocations after the teleost genome duplication did not disrupt the ancestral condition and that for three additional chromosome pairs, a translocation occurred in one member of the pair but not the other. Thus, nearly half of the 24 platyfish chromosomes (11/24) appear to have avoided a translocation since the TGD about 350 million years ago.
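The tallies in this summary can be verified directly from the chromosome lists given above. The short check below is a sketch that takes the three categories exactly as listed in the text and confirms that they partition the 24 platyfish chromosomes; the variable names are illustrative.

```python
# Chromosome numbers (Xma) in each category, as listed in the text.
undisrupted_pairs = [(1, 20), (3, 13), (7, 24), (8, 12)]
one_sided = [17, 18, 21]  # unchanged member of a pair whose partner changed
multi_partner = [2, 4, 5, 6, 9, 10, 11, 14, 15, 16, 19, 22, 23]

undisrupted = [chrom for pair in undisrupted_pairs for chrom in pair]

# Every chromosome from 1 to 24 should appear in exactly one category.
all_assigned = sorted(undisrupted + one_sided + multi_partner)
is_partition = all_assigned == list(range(1, 25))

# Chromosomes that appear to have avoided a translocation since the TGD.
no_translocation = len(undisrupted) + len(one_sided)
```

The check gives 8 + 3 = 11 chromosomes without an apparent translocation and 13 with paralogous segments on multiple chromosomes, matching the 11/24 stated above.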
Conserved synteny relationships comparing platyfish and medaka
The closest relative of the platyfish with a sequenced genome is Japanese medaka, O. latipes (see Figure 1). Both species belong to the Atherinomorpha within the Percomorpha and diverged from each other ∼120 million years ago (MYA) (Miya et al. 2003; Steinke et al. 2006). Karyotypes of platyfish and medaka both have 24 chromosomes, 19 (79%) of which show a simple one-to-one orthology relationship; for example, nearly all of the medaka orthologs of genes on platyfish chromosome 9 (Xma9) lie on medaka chromosome 4 (O. latipes chromosome 4, Ola4) (Figure 4A, Figure S3). Reciprocally, nearly all of the platyfish orthologs of genes on Ola4 reside on Xma9 (Figure 4B). (The “ghost” chromosomes, Ola17 in Figure 4A and Xma6 in Figure 4B, represent TGD ohnologs of Ola4 and Xma9, respectively.)
To compare this stasis in teleost chromosomes to mammals, consider that the mouse and human lineages diverged only ∼75 million years ago (Waterston et al. 2002), but among their 20 (mouse) and 23 (human) chromosome pairs, only the X shows a one-to-one relationship. Although the content of each of two human autosomes (Hsa20 and Hsa17) is on a single mouse chromosome (Mmu2 and Mmu11, respectively), the content of no mouse autosome is on a single human chromosome. Figure S4 shows the relationships of several human and mouse chromosomes for comparison to teleosts.
In addition to the 19 chromosomes with a one-to-one relationship between platyfish and medaka, the five remaining platyfish chromosomes are similar to Xma9 in being orthologous almost exclusively to a single medaka chromosome, but in addition having one or two short segments ∼1 Mb long that lie on another medaka chromosome, as for Xma19 at about position 20 Mb (Figure 4C). The short anomalous chromosome segment on medaka chromosome 24 could arise by one of four possible hypotheses: a translocation that occurred after the platyfish/medaka lineage divergence in (1) the platyfish lineage or in (2) the medaka lineage, or an assembly error in (3) the platyfish or (4) medaka reference genome sequence. Resolution of these hypotheses requires an outgroup to define the ancestral condition. Stickleback provides the necessary outgroup genome: Figure 4D shows that the entire length of Xma19 matches a single stickleback chromosome (G. aculeatus linkage group 15, GacXV), rather than two chromosomes as for medaka; thus, platyfish retains the ancestral condition and the gene arrangement in medaka is the derived state, ruling out the hypothesis of a misassembly of the platyfish genome. These findings could be explained by either a translocation in the medaka lineage or a misassembly of the medaka genome. In the Xma19 comparison (Figure 4C), the ghost feature of paralogs on Ola24 represents the finding that Ola22 and Ola24 are TGD ohnologons (see Figure S5M), suggesting that the anomalous segment at 20 Mb might result from misassembly to the incorrect medaka TGD paralogon.
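The outgroup reasoning used here follows a simple parsimony rule: when two sister lineages differ, the one that matches the outgroup is taken to retain the ancestral state. The sketch below encodes that rule only; reducing each genome to a single label such as "single" or "split" is a simplifying assumption, and the function cannot by itself separate a true translocation from an assembly error in the derived lineage.

```python
def derived_lineage(state_a, state_b, state_outgroup):
    """Decide which of two sister lineages carries the derived state.

    Returns "a" or "b" for the lineage that differs from the outgroup,
    None if the two lineages agree, and "unresolved" if neither lineage
    matches the outgroup.
    """
    if state_a == state_b:
        return None
    if state_a == state_outgroup:
        return "b"
    if state_b == state_outgroup:
        return "a"
    return "unresolved"


# Xma19 example: orthologs lie on one chromosome in platyfish and in
# stickleback (outgroup), but on two chromosomes in medaka.
result = derived_lineage(state_a="single",       # platyfish
                         state_b="split",        # medaka
                         state_outgroup="single")  # stickleback
```

For the Xma19 case the rule returns lineage "b" (medaka), in line with the conclusion that the medaka arrangement is derived.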
Besides Xma19, four additional platyfish chromosomes (Xma6, -10, -14, and -18) show a total of seven additional short regions of discrepancy between platyfish and medaka, and five of these seven are also discrepant between platyfish and stickleback (Figure S5)—these five shared anomalies represent segments that are either assembled incorrectly in the platyfish genome, which seems unlikely due to the strong support of the genetic mapping of markers in each of these segments, or are translocations of short regions that occurred in the platyfish lineage after it diverged from the medaka lineage. Figure S5, I–L shows that only one other anomalous segment (on Xma10) represents sequences on the medaka TGD ohnologon. These analyses show that remarkably few translocations disrupted karyotypes since the divergence of medaka and platyfish lineages 120 million years ago, and those that did occur involved exchanges of just a few megabases.
Although platyfish and medaka chromosomes share the same gene content, the order that those genes occupy along the chromosomes diverged over time. Figure 4E compares the location of medaka and platyfish orthologs along Ola4 and Xma9. Numerous transpositions are apparent; for example, chromosome segments labeled a and b in the figure occupy contiguous locations on Xma9 but are discontinuous on Ola4, as expected from a transposition (Figure 4E). Nevertheless, within the two segments, orthologs are generally arrayed in the same order in the two species. In addition to transpositions, numerous inversions occurred as these lineages diverged; for example, the segments of genes labeled b and c in Figure 4E are adjacent in both platyfish and medaka, but segment c is inverted in one species relative to the other. We conclude that transpositions and inversions have been much more common during the phylogenetic divergence of these lineages than have translocations, supporting results from the comparison of zebrafish and medaka (Naruse et al. 2004).
Conserved synteny relationships comparing platyfish and threespine stickleback
After medaka, the next closest species to platyfish with a sequenced genome is threespine stickleback (G. aculeatus), which belongs to the Gasterosteiformes rather than the Atherinomorpha, the group containing platyfish and medaka (see Figure 1). The estimated divergence time for Gasterosteiformes and Atherinomorpha is ∼180 MYA (Miya et al. 2003; Steinke et al. 2006; Near et al. 2012). Platyfish has 24 chromosome pairs, as do about one-quarter of all teleosts, but fourspine stickleback (Apeltes quadracus) has 23 pairs (Figure 1), while ninespine stickleback (Pungitius pungitius) and threespine stickleback both have just 21 pairs (Ross et al. 2009). Because ninespine stickleback (21 pairs) and fourspine stickleback (23 pairs) are more closely related to each other than either is to threespine stickleback (21 pairs) (Urton et al. 2011), it is not clear whether the ancestral condition in stickleback was 21 or 23 chromosome pairs.
Several competing hypotheses might account for the reduced chromosome number in threespine stickleback: (1) Six ancestral chromosomes might have fused in pairs, perhaps in Robertsonian fusions, to make three chromosomes in the threespine stickleback lineage and thus reduce the chromosome number from 24 to 21. (2) Four ancestral chromosomes might have all fused together into a single chromosome in the threespine stickleback lineage. (3) Many chromosomes might have fused in the threespine stickleback lineage to make a karyotype with fewer than 21 chromosomes, followed by subsequent fissions to end up with 21 chromosomes. Other more complicated hypotheses are also possible. Comparisons of platyfish and stickleback genomes can now distinguish these hypotheses to show how threespine obtained its reduced chromosome count.
Analysis of conserved syntenies reveals the origin of different chromosome numbers in platyfish and stickleback. Figure 4F shows that the platyfish orthologs of genes on stickleback linkage group 4 (GacIV), which contains a major quantitative trait locus for armored plates (Colosimo et al. 2005), reside on two platyfish chromosomes, Xma23 and Xma17. Figure 4G shows that nearly all stickleback orthologs of genes on Xma23 are on GacIV, as are nearly all of the stickleback orthologs of genes on Xma17 (Figure 4H). This pattern would occur if the chromosomal ancestors of Xma23 and Xma17 fused in the lineage leading to stickleback. Note that the section of GacIV between ∼26 and 28 Mb is orthologous to a portion of Xma23 but is discontinuous with the rest of the Xma23 orthologs (Figure 4F); this pattern indicates the occurrence of a transposition or inversion after the chromosome fusion event. Two additional stickleback chromosomes show a similar pattern: GacVII, which harbors a major quantitative trait locus for pelvic spine (Shapiro et al. 2004), is a fusion of Xma11 and Xma14, and GacI is a fusion of chromosomes with orthologs on Xma18 and Xma24 (Figure S6). The threespine stickleback sex chromosome GacXIX (Peichel et al. 2004; Urton et al. 2011) was not involved in these fusions and its genetic content lies almost completely on Xma15 (the platyfish sex chromosome is Xma21) (Figure S7). Thus, the reduced stickleback karyotype evolved from the fusion of three pairs of chromosomes present in the last common ancestor of platyfish and stickleback (GacIV = Xma23 + Xma17; GacVII = Xma11 + Xma14; and GacI = Xma18 + Xma24). We conclude that three simple pairwise fusions of six ancestral chromosomes produced the threespine stickleback karyotype. Two of these, GacIV and GacVII, have been shown to be involved in what had been called chromosome fusion/fission events comparing threespine (n = 21) to fourspine stickleback (n = 23) (Urton et al. 2011).
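The pattern that identifies a fusion (orthologs of one stickleback chromosome drawn almost entirely from two platyfish chromosomes) can be expressed as a count-based rule. This is a sketch under assumptions: the ortholog counts are invented for illustration, the 20% cutoff `min_fraction` is a hypothetical threshold, and only the chromosome assignments (GacIV from Xma23 and Xma17; GacXIX from Xma15) come from the text.

```python
def major_sources(ortholog_counts, min_fraction=0.2):
    """Platyfish chromosomes contributing a large share of orthologs.

    ortholog_counts: dict mapping platyfish chromosome -> number of
        orthologs found on one chromosome of the compared species.
    min_fraction: hypothetical cutoff separating major contributors
        from scattered background hits.
    """
    total = sum(ortholog_counts.values())
    if total == 0:
        return []
    return sorted(chrom for chrom, n in ortholog_counts.items()
                  if n / total >= min_fraction)


# Invented counts shaped like the patterns described in the text.
gac_iv = {"Xma23": 520, "Xma17": 480, "Xma5": 12}
gac_xix = {"Xma15": 600, "Xma2": 9}

fusion_candidate = major_sources(gac_iv)    # two major sources
unfused_example = major_sources(gac_xix)    # one major source
```

Two major sources mark a candidate fusion, whereas a single major source marks a chromosome that corresponds one-to-one with a platyfish chromosome. Polarity (fusion versus fission) still has to come from comparison with other lineages, as discussed in the text.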
Platyfish comparative data can distinguish between fission and fusion events because they define the condition at the root of the tree. The platyfish data show that the fourspine arrangement matches more closely the ancestral condition; and thus, for the three fusion events that reduced the threespine karyotype to 21 haploid chromosomes, one fusion event likely occurred before the radiation of the sticklebacks, and two occurred in the threespine stickleback lineage after it diverged from the fourspine and ninespine stickleback lineage.
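The reciprocal pattern described above (one stickleback chromosome whose orthologs lie on two platyfish chromosomes, each of which maps back almost entirely to that one stickleback chromosome) can be sketched as a simple tally. This is a minimal illustration, not the analysis pipeline used in this study: the function names, the 20% threshold, and the ortholog counts in the toy data are all invented for the example.

```python
from collections import Counter, defaultdict


def major_partners(ortholog_pairs, min_fraction=0.2):
    """Map each query chromosome to the target chromosomes that hold
    at least min_fraction of its orthologs (threshold is an assumption)."""
    tallies = defaultdict(Counter)
    for query_chr, target_chr in ortholog_pairs:
        tallies[query_chr][target_chr] += 1
    partners = {}
    for query_chr, counts in tallies.items():
        total = sum(counts.values())
        partners[query_chr] = sorted(
            t for t, n in counts.items() if n / total >= min_fraction
        )
    return partners


def candidate_fusions(pairs_ab, min_fraction=0.2):
    """Chromosomes of species A whose orthologs lie mainly on two
    chromosomes of species B, each of which maps back mainly to A's one."""
    forward = major_partners(pairs_ab, min_fraction)
    reverse = major_partners([(b, a) for a, b in pairs_ab], min_fraction)
    fusions = {}
    for a_chr, b_chrs in forward.items():
        if len(b_chrs) == 2 and all(reverse[b] == [a_chr] for b in b_chrs):
            fusions[a_chr] = tuple(b_chrs)
    return fusions


# Toy ortholog list: chromosome names follow the text, counts are invented.
toy = (
    [("GacIV", "Xma23")] * 40
    + [("GacIV", "Xma17")] * 35
    + [("GacIV", "Xma5")] * 2
    + [("GacXIX", "Xma15")] * 30
    + [("GacXIX", "Xma5")] * 1
)

print(candidate_fusions(toy))
```

A two-to-one pattern of this kind is equally consistent with a fusion in one lineage or a fission in the other; as the text notes, an outgroup comparison is what assigns the direction.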
Conserved synteny relationships comparing platyfish and green pufferfish
Like stickleback, green pufferfish (T. nigroviridis, see Figure 1) has 21 chromosomes in a haploid set. Did the same pairs of ancestral chromosomes fuse in the evolution of the green pufferfish genome as fused in the threespine stickleback lineage? To approach this question, we compared platyfish chromosomes to those of green pufferfish.
Results showed that the three longest green pufferfish chromosomes are fusions of pairs of chromosomes corresponding to Xma9 and Xma23 for Tni1, Xma7 and Xma10 for Tni2, and Xma16 and Xma24 for Tni3 (Figure 5). Thus, although two platyfish chromosomes (Xma23 and Xma24) are involved in fusions for both threespine stickleback and green pufferfish, they are paired with different partners in the two species. Because different pairs of ancestral chromosomes fused in the green pufferfish lineage than in the threespine stickleback lineage, we conclude that chromosome fusion events leading from 24 to 21 chromosomes were independent in these two lineages. Recognizing that the sample size is low, the recurrent involvement of ancestral chromosomes represented by the current Xma23 and Xma24 in chromosome fusions could be merely coincidence; alternatively, two other hypotheses might explain the results: first, Xma23 and Xma24 may be “sticky,” perhaps having repetitive elements or other features in common with other chromosomes that facilitate fusion; or second, Xma24 may be involved in fusions more frequently than other chromosomes because it has the fewest RAD-tag markers (360), 75% as many as the next smallest chromosome (Xma8), and thus likely has the fewest genes in the karyotype and so might disrupt phenotypes less than other chromosomes when involved in chromosome fusions. Xma24 and Ola2 are reciprocally orthologous throughout the length of both chromosomes, giving confidence to the platyfish meiotic map organization. Comparisons with additional lineages that independently experienced chromosome fusions should help distinguish these hypotheses.
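The independence argument reduces to a comparison of two sets of fusion partners. The sketch below encodes the fusions reported here for each lineage and checks which pairs and which individual platyfish chromosomes recur; the variable names are chosen only for this illustration.

```python
# Platyfish orthologous chromosomes joined in each fused chromosome,
# as reported in the text.
stickleback = {
    "GacIV": {"Xma23", "Xma17"},
    "GacVII": {"Xma11", "Xma14"},
    "GacI": {"Xma18", "Xma24"},
}
pufferfish = {
    "Tni1": {"Xma9", "Xma23"},
    "Tni2": {"Xma7", "Xma10"},
    "Tni3": {"Xma16", "Xma24"},
}

# A fusion pair is unordered, so compare frozensets.
stickle_pairs = {frozenset(pair) for pair in stickleback.values()}
puffer_pairs = {frozenset(pair) for pair in pufferfish.values()}
shared_pairs = stickle_pairs & puffer_pairs

# Individual chromosomes that took part in a fusion in both lineages.
shared_chromosomes = (
    set().union(*stickleback.values()) & set().union(*pufferfish.values())
)

print(len(shared_pairs), sorted(shared_chromosomes))
```

No fusion pair is shared, whereas Xma23 and Xma24 each appear once in both lineages, which is the pattern the text interprets as independent fusion events with recurrent participants.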
Figure 5 shows details for Tni1. The first 10 Mb of Tni1 has orthologs on Xma23, while the last 13 Mb contains orthologs of genes on Xma9 (Figure 5A). This situation could have arisen either by the fusion of two ancestral chromosomes in the pufferfish lineage or the fission of ancestral chromosomes in the platyfish lineage. A comparison to threespine stickleback resolves the issue, showing that Tni1 consists of two threespine stickleback chromosomes fused together, GacIV and GacXIII (Figure 5F), the orthologs of Xma23 and Xma9. While all green pufferfish orthologs of genes on Xma9 lie on Tni1 (Figure 5B), green pufferfish orthologs of genes on Xma23 lie on two green pufferfish chromosomes, most on Tni1 (at least 21 Mb), but ∼5 Mb of genes on Xma23 have orthologs that reside on Tni20 (Figure 5C). Tni20 is a short chromosome only ∼3.5 Mb long, with orthologs of ∼2 Mb on Xma23 and ∼1 Mb occupying part of Xma14 (Figure 5D). Curiously, of the 24 Mb on Xma14, only ∼3 Mb has orthologs on any green pufferfish chromosome, and that one is Tni20 (Figure 5E). None of the many other genes on Xma23 have conserved synteny with any other green pufferfish chromosomes, suggesting either substantial loss of Xma23 orthologs in green pufferfish or failure of green pufferfish genome sequence contigs to be associated with this chromosome in the genome assembly. Figure S8 shows conserved synteny dot plots for the other two fused chromosomes, Tni2 and Tni3.
Conserved synteny relationships comparing platyfish and zebrafish
The zebrafish lineage diverged basal to the percomorphs discussed above (see Figure 1); as expected from this history, results showed that syntenies conserved between zebrafish and platyfish are much less extensive than among the percomorphs. Nevertheless, the genetic content of several chromosomes has been preserved over the 300 million years since these lineages diverged, for example Xma16 and D. rerio linkage group 3 (Dre3, Figure 6, A and B). Seven zebrafish/platyfish chromosome pairs, about one-quarter of the total, share this pattern (Dre3/Xma16; Dre4/Xma17; Dre9/Xma7; Dre14/Xma23; Dre16/Xma3; Dre19/Xma13; and Dre23/Xma1). For the Dre4/Xma17 pair, all of the zebrafish orthologs of genes on Xma17 lie on Dre4, but only on the left arm of Dre4; the right arm of Dre4 has few orthologs or paralogs on any platyfish chromosome (Figure 6, C and D). This is significant because the right arm of Dre4 has a major sex-determining locus and is largely heterochromatic (Anderson et al. 2012), suggesting that this chromosome arm evolved substantially in the zebrafish lineage after it diverged from the lineage of percomorph fish.
Other platyfish/zebrafish pairs have more complex patterns of conserved synteny. For example, although nearly all of the orthologs of platyfish chromosome Xma15 reside on Dre20, the platyfish orthologs of genes on Dre20 reside both on Xma15 and on Xma9, with evidence of several transpositions (Figure 6, E and F). Finally, the zebrafish orthologs of genes on many platyfish chromosomes reside on more than one zebrafish chromosome, like Xma14/(Dre7 + Dre23) (Figure 6G) and reciprocally, the platyfish orthologs of zebrafish chromosomes Dre7 and Dre23 lie on more than one platyfish chromosome (Xma14 and Xma4 for Dre7, Figure 6H; Xma1 and Xma14 for Dre23, data not shown). The complex nature of the comparison of platyfish and zebrafish chromosomes makes it difficult to retrace the steps that led to zebrafish having 25 chromosomes and platyfish just 24.
Conserved synteny relationships comparing platyfish and human chromosomes
Because the human karyotype is one of the most conserved among eutherians (Kohn et al. 2006; Kemkemer et al. 2009), it represents a good lobefin outgroup for platyfish (Figure 1). A comparison of platyfish and human chromosomes illustrates several principles. (1) Short portions of platyfish chromosomes, often ∼1–2 Mb long, frequently share conserved syntenies with individual segments of single human chromosomes (Figure 7A). (2) Each platyfish chromosome has orthologs on several human chromosomes; for example, Xma2 has a significant number of orthologs on five human chromosomes, Hsa7, -11, -12, -15, and -16 (Figure 7A). This result shows that a number of translocations, chromosome fusions, and/or chromosome fissions occurred in one or more, likely both, lineages during the 450 million years since the last common ancestor of platyfish and human. (3) Platyfish orthologs on a single human chromosome are generally short and discontinuous; for example, the Xma2 orthologs on Hsa7 occur in at least seven different sections of Xma2 (Figure 7A). This result shows that a substantial number of inversions and transpositions occurred, likely in both lineages, as they diverged. (4) Each segment of each human chromosome is generally co-orthologous to segments on two platyfish chromosomes (Figure 7B). For example, the first 5 Mb of Hsa12 has co-orthologs on both Xma2 and Xma17, as does a majority of Hsa12; but in addition, other platyfish chromosome pairs contain co-orthologs of Hsa12 genes, including Xma1 and Xma20, Xma8 and Xma12, Xma3 and Xma13, and Xma5 and Xma16 (Figure 7B). These duplicated chromosome segments in platyfish, which are apparent for all human chromosomes, were identified as ghost pairs in the platyfish/platyfish comparisons (e.g., Figure 3) and are most readily explained as a legacy of the TGD (Amores et al. 1998; Taylor et al. 2003; Braasch and Postlethwait 2012).
(5) Duplicated segments of platyfish chromosomes generally have the same start and stop points with respect to the human chromosome, for example, the region between 50 and 60 Mb on Hsa12 (Figure 7B). This result—that breakpoints on ohnologous chromosomes tend to be coterminous—would be expected for chromosome rearrangements that occurred prior to the TGD. Because most breakpoints follow this pattern, we conclude that most translocations, apparent when comparing human and platyfish chromosomes, occurred either in the platyfish lineage before the TGD or in the lobefin lineage leading to human. (6) The break points of some duplicated platyfish segments, however, are not coterminous; for example, the segment from 60 to 110 Mb on Hsa12 has duplicated co-orthologs in platyfish, one of which is intact on Xma17, but the other of which is broken into two segments that appear on two different platyfish chromosomes, Xma2 and Xma20 (Figure 7B). This result would be expected if a translocation occurred in the platyfish lineage after the TGD. (7) The comparison of human chromosomes to the genome of zebrafish, whose lineage diverged basal to that of platyfish and other percomorphs (Figure 1), confirms both the duplicated nature of teleost genomes and the specific breakpoints of the major translocations and inversions that are more clearly evident in the analysis of the platyfish genome (Figure 7C). Because many of the chromosome rearrangement borders for Hsa12 are shared between platyfish and zebrafish (compare Figure 7, B and C), these rearrangements must have occurred before the divergence of platyfish and zebrafish lineages. 
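The reasoning in points (5) and (6) can be sketched as a comparison of breakpoint positions between the two ohnologous copies of a human chromosome region. The code is only an illustration under stated assumptions: the function names and the 1-Mb tolerance are hypothetical, the placeholder names `copyA_chr` and `copyB_chr` stand in for chromosomes not named in the text, and the 85-Mb split between Xma2 and Xma20 (including their order) is invented, because the text gives only the 60 to 110 Mb span.

```python
def breakpoints(segments):
    """Positions (Mb on the human chromosome) where consecutive
    segments switch from one platyfish chromosome to another."""
    ordered = sorted(segments)
    return [
        nxt[0]
        for prev, nxt in zip(ordered, ordered[1:])
        if prev[2] != nxt[2]
    ]


def classify_breakpoints(copy_a, copy_b, tolerance_mb=1.0):
    """Split breakpoints into those shared by both ohnologous copies
    (coterminous) and those seen in only one copy."""
    bp_a, bp_b = breakpoints(copy_a), breakpoints(copy_b)

    def near(position, others):
        return any(abs(position - other) <= tolerance_mb for other in others)

    shared = [x for x in bp_a if near(x, bp_b)]
    specific = [x for x in bp_a if not near(x, bp_b)]
    specific += [y for y in bp_b if not near(y, bp_a)]
    return {"shared": shared, "copy_specific": sorted(specific)}


# Toy layout along Hsa12 as (start_mb, end_mb, platyfish_chromosome).
# Placeholder names and the 85-Mb split are invented for illustration.
copy_a = [(50, 60, "copyA_chr"), (60, 110, "Xma17")]
copy_b = [(50, 60, "copyB_chr"), (60, 85, "Xma2"), (85, 110, "Xma20")]

print(classify_breakpoints(copy_a, copy_b))
```

Under the logic of the text, a shared breakpoint is what a rearrangement predating the TGD (or one in the lobefin lineage) would leave behind, while a breakpoint present in only one copy is what a post-TGD translocation in the platyfish lineage would produce.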
(8) Pairs of platyfish chromosomes predicted to be ohnologs in dot plot comparisons to individual human chromosomes share extensive stretches of paralogous genes; for example, dot plots showed that Xma2 and Xma17 are ohnologs of Hsa12 (Figure 7B), and these two platyfish chromosomes show many pairs of paralogs (Figure 7D), despite the presence of multiple inversions and transpositions that occurred after the TGD (Figure 7D). The rather small number of translocations detected comparing Hsa12 to either the platyfish or zebrafish genome (Figure 7, B and C) contrasts with the large numbers of inversions and transpositions that occurred within Xma2 or Xma17 after the TGD (Figure 7D). (9) Finally, comparing the order of orthologous gene pairs along a human chromosome and its platyfish orthologous chromosome (e.g., Hsa12 and Xma17) shows an even greater frequency of crossing lines, hence more rearrangements (Figure 7E).
Degradation of conserved syntenies over time
A comparison of gene orders within conserved syntenies shows how conserved syntenies change over evolutionary time. For the eight platyfish chromosomes whose gene content is shared reciprocally over a single zebrafish chromosome, we utilized the Gene Homology Matrix output of the Synteny Database (Catchen et al. 2009) to compare gene orders along the phylogeny (see Figure 1). The result for platyfish chromosome Xma1 is typical. Medaka had few inversions and transpositions with respect to platyfish (Figure 8A), as expected from their close phylogenetic relationship (∼125 million years separation). Although green pufferfish and stickleback are equally related to platyfish, their joint lineage having separated from the platyfish lineage ∼190 million years ago (Figure 1), green pufferfish showed somewhat more conservation of gene order with platyfish than threespine stickleback showed to platyfish over the eight chromosomes examined (Figure 8, B and C shows Xma1), suggesting fewer intrachromosomal rearrangements in the green pufferfish lineage than the threespine stickleback lineage in the ∼170 million years since pufferfish and stickleback lineages diverged.
In contrast to extensive gene order conservation among percomorphs (divergence time ∼190 million years) (Figure 8, A–C), gene orders were greatly reorganized comparing platyfish to zebrafish (lineage divergence ∼290 million years ago) despite the conservation of syntenies over the entire zebrafish and platyfish chromosome (Figure 8D). This result shows that even in the absence of interchromosomal rearrangements (translocations), intrachromosomal rearrangements (inversions and transpositions) may have accelerated in the zebrafish lineage; alternatively, rearrangements may have been rapid in the 100 million years of evolution between the divergence of the platyfish and zebrafish lineages and the divergence of the platyfish and stickleback/pufferfish lineages. Comparison to the genome sequence of an outgroup teleost with an assembly reaching full chromosome length is necessary to distinguish these two hypotheses; unfortunately, such a resource, which could come from a meiotic map directed assembly of the genome of a teleost diverging basally to zebrafish, does not yet exist.
The human orthologs of platyfish genes from each platyfish chromosome are arrayed on several human chromosomes, for example, four main human chromosomes for Xma1 (Figure 8E). For each human chromosome, conserved gene orders with platyfish (Figure 8, F–I) appear to be only marginally more degraded than the zebrafish/platyfish comparison (Figure 8D). These data support the idea that rearrangements are more frequent in the zebrafish lineage.
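One simple way to put a number on gene-order degradation is the fraction of ortholog pairs whose connecting lines would cross in a two-chromosome comparison plot. The sketch below is an assumption-laden illustration, not the Gene Homology Matrix method of the Synteny Database: the function name and the toy gene orders are invented.

```python
from itertools import combinations


def crossing_fraction(ortholog_positions):
    """Fraction of ortholog pairs whose connecting lines cross.

    ortholog_positions: (position_in_species_a, position_in_species_b)
    tuples for genes on one pair of orthologous chromosomes.
    """
    # Order genes along species A, then look at their order in species B.
    order_in_b = [b for _, b in sorted(ortholog_positions)]
    pairs = list(combinations(order_in_b, 2))
    if not pairs:
        return 0.0
    crossings = sum(1 for first, second in pairs if first > second)
    return crossings / len(pairs)


# Toy gene orders (invented): fully collinear, and one with a small inversion.
collinear = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
one_inversion = [(1, 1), (2, 4), (3, 3), (4, 2), (5, 5)]

print(crossing_fraction(collinear), crossing_fraction(one_inversion))
```

The score measures disorder rather than the number of rearrangement events: a single inversion spanning a whole chromosome would make every pair cross, so the value is best read as a rough relative index across comparisons such as those in Figure 8.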
We made a meiotic recombination map for the platyfish, the second most extensive meiotic map among all vertebrates, to investigate the mechanisms of karyotype evolution among teleost fish after the teleost genome duplication event. We aligned platyfish genomic contigs (Schartl et al. 2013) along the genetic map to obtain a full chromosome-level view of the genome. We compared conserved syntenies at several levels: (1) between platyfish and three percomorph species (medaka, like platyfish, a member of the Atherinomorpha; threespine stickleback, a member of the Gasterosteiformes; and green pufferfish, a member of the Tetraodontiformes); (2) platyfish and zebrafish, which diverged basal to the percomorphs among teleosts; and (3) platyfish and human, a representative lobefin outgroup to the teleosts. These comparisons revealed in unprecedented detail the extent to which the genetic content of percomorph fish chromosomes has been deeply conserved over 190 million years of evolution (Miya et al. 2003; Steinke et al. 2006). This deep conservation is far greater than that of mammalian karyotypes over about half that time interval (Naruse et al. 2004; Ferguson-Smith and Trifonov 2007).
The stability of teleost karyotypes compared to rapid change in mammalian karyotypes raises the question of which mechanisms of percomorph vs. mammalian genome biology and evolution might have militated against the origin of translocations. One possibility could relate to transposable elements. Teleost fish have a greater diversity of transposon families than mammals, but have many fewer copies in each family (Chalopin et al. 2013; Schartl et al. 2013). Although mammalian genomes have few different types of transposable elements, they have many copies of each family that could provide long stretches of almost identical sequences on several different chromosomes (Chalopin et al. 2013). These extended regions of nearly identical sequence could provide substrates for illegitimate recombination in meiosis that can lead to chromosome rearrangements.
In contrast to the interchromosomal stability displayed by percomorph fish, a substantial number of intrachromosomal rearrangements occurred in either (1) stem percomorphs in the 100 million years between the percomorph/zebrafish divergence and the platyfish/stickleback divergence or (2) the zebrafish lineage in the 290 million years since it diverged from the percomorphs. Distinguishing these hypotheses requires access to a chromosome-long genome assembly from a more basally diverging teleost fish, such as a bony-tongue (Osteoglossomorpha) or eel (Elopomorpha); a need exists for the construction of such a resource.
A comparison of human chromosomes to the platyfish genome shows that most translocation break points are conserved between TGD ohnologous chromosomes (Figure 7B). This finding means that most translocations evident from comparing human and teleost chromosomes occurred before the TGD or in the lobefin lineage leading to humans. Furthermore, a comparison of individual human chromosomes to both zebrafish and platyfish chromosomes shows that most translocations occurred before the divergence of platyfish and zebrafish lineages (Figure 7, B and C). Some of these translocations likely occurred in the lobefin lineage leading to humans, others in the rayfin lineage before the TGD, and others after the TGD but before the divergence of zebrafish and percomorph lineages. Distinguishing among these possibilities requires access to chromosome-length genome assemblies of both a basally diverging teleost and a recently diverging rayfin fish that does not share the TGD, such as gar, amia, sturgeon, or bichir.
One might have predicted that translocations would have been especially prevalent in the few million years after the TGD because natural selection could have favored individuals carrying chromosome rearrangements due to their role in returning the early TGD tetraploid fish genomes to a meiotically diploid situation. In addition, chromosome rearrangements occurring differently in different populations might have provided genetic barriers to interbreeding of populations and thus have spurred speciation and the radiation of the teleosts. Initial periods of chromosomal rearrangements shortly after the TGD may have given way to karyotype stability for millions of years.
The platyfish genetic map, containing the second most markers of any vertebrate meiotic map (after zebrafish), when used to organize genomic sequencing scaffolds, reveals in unprecedented detail the origin and evolution of teleost karyotypes. Most striking is the stability of chromosome gene content over hundreds of millions of years, with several entire chromosomes maintained intact between zebrafish and platyfish, whose lineages separated ∼300 million years ago. Results showed how specific chromosome fusion events reduced chromosome number in threespine stickleback and that a similar reduction in the green pufferfish lineage occurred independently even though both lineages utilized fusions of the same two specific ancestral chromosomes. Results reiterate the principle that translocations become fixed in populations much less frequently than inversions and transpositions, and surprisingly, that more translocations occurred before the teleost genome duplication than after that event when using human as an outgroup. Finally, results bring into focus the need for a chromosome-length genome assembly for both a basally diverging teleost and a nonduplicated rayfin fish to better understand karyotypic evolution of teleost fish, the most species-rich group of vertebrates.
We acknowledge the NIH grant R01OD011116 (J.H.P.), the Deutsche Forschungsgemeinschaft grant TR 17/B4 (M.S.), R24OD011120 (R.W.), R24OD011198 (W.W.), and the Alexander von Humboldt Foundation (J.H.P. and M.S.). The University of Oregon Animal Welfare assurance no. is A-3009-01; the IACUC protocol no. is 11-07.
Communicating editor: M. Halpern
- Received July 14, 2013.
- Accepted March 20, 2014.
- Copyright © 2014 by the Genetics Society of America