Abstract
The consequences of hybridization are varied, ranging from the origin of new lineages, introgression of some genes between species, to the extinction of one of the hybridizing species. We generated replicate admixed populations between two pairs of sister species of Drosophila: D. simulans and D. mauritiana; and D. yakuba and D. santomea. Each pair consisted of a continental species and an island endemic. The admixed populations were maintained by random mating in discrete generations for over 20 generations. We assessed morphological, behavioral, and fitness-related traits from each replicate population periodically, and sequenced genomic DNA from the populations at generation 20. For both pairs of species, species-specific traits and their genomes regressed to those of the continental species. A few alleles from the island species persisted, but they tended to be proportionally rare among all sites in the genome and were rarely fixed within the populations. This paucity of alleles from the island species was particularly pronounced on the X-chromosome. These results indicate that nearly all foreign genes were quickly eliminated after hybridization and that selection against the minor species genome might be similar across experimental replicates.
HYBRIDIZATION between species in nature is more common than biologists suspected a few decades ago. At least 10% of animal species can produce progeny when crossed with individuals from a different species (Mallet 2005); the proportion seems to be higher in plants (Stebbins 1950). The fitness outcomes of hybridization and admixture are varied (Taylor and Larson 2019). Research on hybrid zones has revealed the extent of gene exchange in nature and in some cases has identified alleles able to cross species boundaries [reviewed in Moore (2015); Taylor et al. (2015); Gompert et al. (2017)]. Alleles that reside on sex chromosomes, however, are less likely to be transferred from one species to another (Payseur et al. 2004; Macholán et al. 2007; Carneiro et al. 2010, 2014; Garrigan et al. 2012; Turissini and Matute 2017), while mitochondrial DNA (mtDNA) seems to be easily transferred across species boundaries (Bachtrog et al. 2006; Wallis et al. 2017). A question that remains open is what outcome is expected when two species engage not in sporadic gene exchange, but rather form an admixed population carrying many genes from each of the two parental species.
Mass hybridization has three possible outcomes in terms of species persistence. The first is that the two genomes could sort themselves into their initial parental arrangements after hybridization; this will occur in instances where admixed genomes are unfit and penalized by selection (Rosenblum et al. 2012). A second possibility is that genomes can exist as a mosaic, with genes from both parental species persisting in a stable manner and with roughly equivalent contributions from each (Schumer et al. 2016). A third possibility is that after admixture occurs, a few alleles from one of the parental species can remain in a genetic background that evolves back to one largely resembling a single parental species. In this last case, we refer to the species that contributes the majority of alleles in the admixed genome as the “major species,” and the one that contributes the minority as the “minor species.”
These scenarios have important implications for the way we understand genome evolution and the general outcome of hybridization in nature. For example, under a scenario where genomes do not tolerate introgression and behave as coadapted units, we would expect admixed genotypes to be broadly selected against and the genetic composition of hybrid populations to evolve toward that of a single parental species. On the other hand, if the genomes of two species are largely compatible and can be readily mixed (Mallet et al. 2016), potentially providing benefits to admixed individuals, populations of hybrids would be expected to retain ancestry of both species and in some instances even become isolated species themselves (i.e., hybrid speciation, Buerkle et al. 2000; Chapman and Burke 2007; Mallet 2007; Mavárez and Linares 2008; Schumer et al. 2014a; Comeault and Matute 2018). These two outcomes are not mutually exclusive, and in some cases large portions of the genome may be resistant to admixture and introgression while other portions are free to move between species boundaries, either as a result of being selectively neutral or selectively favored (Schumer et al. 2014b; Juric et al. 2016; Muirhead and Presgraves 2016). Both outcomes have been observed. Hybridization can lead to purging of one of the genomes in which the admixed individuals carry only a small proportion of one of the parental species (Garrigan et al. 2012; Turissini and Matute 2017; Schrider et al. 2018), as well as to the existence of stable and balanced mosaic genomes (Rieseberg et al. 2003; Fontaine et al. 2015; Schumer et al. 2016). An aspect that remains largely unknown is whether these outcomes are deterministic in repeated instances of hybridization. Evaluating this hypothesis in nature is challenging because it requires identifying sets of species pairs that show parallel instances of hybridization (e.g., parallel hybrid zones).
Alternatively, one can create fully admixed experimental populations in the lab, where we can control the magnitude and nature of admixture, and directly observe the outcome of hybridization between species. Using this experimental approach, we can follow the evolution of phenotypes and genotypes after hybridization and determine if certain parental traits or alleles are selectively favored or whether they can persist in a fully admixed population. Additionally, if genomes persist as mosaics, this approach may reveal whether independent instances of hybridization lead to the same genetic mosaic in the genomes of admixed individuals. This approach has the advantage of providing primary evidence of the amount of admixture that two genomes can tolerate while also controlling for important features such as the timing of admixture and the relative contribution of the parental species to the population of hybrids. Such an experiment, therefore, has some advantages over studying admixture in natural populations: aspects of demographic history can be controlled (and known) in an experimental context.
Here we report the creation of replicate interspecific admixed populations using two species pairs of Drosophila, followed by measuring the fate of multiple interspecific trait differences in morphology and behavior as well as (via DNA sequencing) the genetic composition of replicate populations of hybrids after 20 generations of independent evolution. Each of the two species pairs was represented by a continentally distributed species and a closely related island endemic. The island parental species have experienced smaller long-term effective population sizes and more “specialized” ecologies than the species distributed across the continent. The first pair of species was Drosophila simulans and D. mauritiana. D. simulans is widespread throughout sub-Saharan Africa and has become an invasive species in much of the world (Begun et al. 2007; Kofler et al. 2015). The species is presumed to have originated either in East Africa or in Madagascar, and populations from these regions have the largest diversity of the whole range (Dean and Ballard 2004; Lachaise and Silvain 2004; Kopp et al. 2006). D. mauritiana, on the other hand, is endemic to the Indian Ocean island of Mauritius (Tsacas and David 1974). These two species are homosequential in chromosome banding pattern: they do not differ in chromosome number or have large-scale rearrangements that would impede recombination (Lemeunier and Ashburner 1976). The pair is thought to have diverged between 500 and 250 KYA [Ks (synonymous divergence) = 0.05] (Nunes et al. 2010; Garrigan et al. 2012). Multiple barriers to gene flow separate the two species, including strong intraspecific mating preferences and sterility of the hybrid males (Lachaise et al. 1986; Price et al. 2001). They also show multiple morphological and physiological differences (Coyne 1989; Laurie et al. 1997; Price et al. 2001). 
Even though no instances of natural hybridization have been reported between these species, there is evidence of extensive gene exchange in the recent past (Garrigan et al. 2012; Brand et al. 2013).
The second pair consists of the mainland African species D. yakuba, a denizen of sub-Saharan grasslands, and its sister species D. santomea. D. yakuba is found mainly in open or semiopen habitats on continental Africa and its adjacent islands. D. santomea is endemic to the highlands of the island of São Tomé in the Gulf of Guinea, 240 km west of Gabon (Lachaise et al. 2000). D. yakuba has chromosome inversions segregating in natural populations (Lemeunier and Ashburner 1976), but we constructed a line that was isochromosomal with D. santomea, so there were no rearrangements to impede recombination (Moehring et al. 2006, see below). D. yakuba and D. santomea are thought to have diverged over 1 MYA (Ks = 0.05) (Turissini et al. 2015; Turissini and Matute 2017). As with the D. simulans/D. mauritiana pair, this species pair shows multiple traits that contribute to reproductive isolation (including hybrid male sterility and mating discrimination), as well as several interspecific differences in morphology (Matute et al. 2009; Matute and Coyne 2010). Further, D. yakuba is involved in two of the few stable hybrid zones known in Drosophila (Llopart et al. 2005; Cooper et al. 2018). The two species have exchanged genes with each other within the last 10,000 generations, including a full mitochondrial replacement from D. yakuba into D. santomea (Llopart 2005; Bachtrog et al. 2006; Beck et al. 2015).
We produced eight replicate admixed populations for each of the two species pairs and then followed the phenotypic and genotypic compositions of individuals from the admixed populations over >20 generations. For each of these populations of hybrids, we assayed phenotypic traits to see if they persisted as hybrid values, admixed values, or if they regressed to one of the parental species trait values (and at what rate). After 20 generations, we tested how parental ancestries, judged by morphology, behavior, and DNA sequences, segregated within each population of hybrids.
We found that in both species pairs, across all experimental replicates, phenotypes rapidly regressed to those of the parental continental species, becoming nearly indistinguishable from that species in morphology, behavior, and fertility. Consistent with this observation, the genomes of the admixed populations also regressed to the continental species with only a few traces of the island species. Our results indicate that after admixture, Drosophila genomes tolerate little introgression, consistent with observations of hybridization and admixture in nature. Moreover, our results show that the evolutionary outcome of hybridization can be highly repeatable and predictable at least in hybridizing species of Drosophila.
Materials and Methods
Strains and crosses
All the fly stocks used in these experiments were described previously (e.g., Price et al. 2001; Coyne et al. 2002, 2004; Moehring et al. 2004, 2006; Llopart et al. 2005; Matute and Coyne 2010). To construct admixed populations, we used only one strain from each of the four species. All parental strains were constructed as isofemale lines (i.e., progeny derived from a single inseminated female). Isofemale lines can retain multiple alleles and are rarely isogenic. This polymorphism might obscure the origin of an allele in an admixed population. For that reason, we surveyed the extant polymorphism in each of the four species by using a panel of lines from previously sequenced isofemale lines (Turissini and Matute 2017; Schrider et al. 2018; Turissini et al. 2018; accession numbers in Supplemental Material, Table S1). This last step was done so we could assay species-specific alleles using fixed, diagnostic markers for the genetic analysis (see below). The lines used for each species pair are listed as follows.
For the D. simulans/D. mauritiana hybrids we crossed the D. simulans “FC” strain to the D. mauritiana “mau SYN” strain. The FC strain (“sim FC”) is an isofemale line collected by JAC in Florida City, Florida in June 1985 and maintained in very large numbers (over 500 individuals per generation). The D. mauritiana synthetic strain (“mau SYN”) was derived from six isofemale lines collected on Mauritius in 1981 and combined in 1983 (Coyne 1989). We looked for large-scale chromosomal inversions that might exist between these species by crossing D. simulans females to D. mauritiana males and karyotyping salivary polytene chromosomes of L3-instar F1 larvae. We extracted the salivary glands of four to six larvae with forceps (Miltex Catalogue number: 17–301; McKesson, Richmond, VA). Salivary glands were mounted on precleaned glass slides, squashed, and stained with orcein following previously described methods to determine whether there are large chromosomal rearrangements between the two species (Tonzetich et al. 1988; Comeault et al. 2016). We did not detect any inversions using this approach. D. simulans has no known segregating inversions (Lemeunier and Ashburner 1976, 1984) so any inversions in D. mauritiana would be detected as heterozygotes in the hybrid larvae.
To make D. yakuba/D. santomea hybrids, we crossed the D. santomea “STO.4” strain (“san STO.4”) to the D. yakuba “Täi18-ISO” strain. The “STO.4” strain is an isofemale line whose foundress was collected in March 1998 in the Obó Natural Reserve on São Tomé at 1300 m altitude (Lachaise et al. 2000). The D. yakuba Täi18-ISO strain was derived from the Täi 18 line, an isofemale strain collected in 1981 in the Täi rainforest on the border between Guinea and the northwest Ivory Coast. Because D. yakuba is polymorphic for inversions (Lemeunier and Ashburner 1976b), and Täi 18 appears to be heterozygous for X-linked inversions (Moehring et al. 2006), we used this strain to create a line that was colinear with D. santomea, which has no segregating inversions. After seven generations of brother–sister mating within individual sublines from the Täi 18 strain, the orcein-staining method described above showed four sublines determined to be isochromosomal with D. santomea, as no inversions were seen in interspecific F1 hybrid larvae (Moehring et al. 2006). The two parental lines thus contained no large-scale interspecific chromosomal rearrangements that would impede recombination in their hybrids. As none of the lines used to produce the admixed populations were derived from brother–sister matings, the parental lines were not highly inbred and are expected to harbor some standing genetic variation.
Making admixed populations
We generated admixed populations using the same approach in both species pairs. Briefly, we first generated ∼200 F1 females from each of the two reciprocal crosses between each pair of species. These F1 females were then backcrossed to 200 pure-species males (100 of each species) to produce backcrossed individuals from the four possible crosses. Backcross offspring were then collected as virgins and used to start admixed populations. For each species pair, we made eight replicate populations and started each by combining offspring from the four backcrosses in each of the eight replicates. Each replicate started with 25 female and 25 male offspring from each of the four backcrosses for a total of 200 flies per replicate. Backcross females are usually fertile, while males are often sterile. Since we used all the possible backcross genotypes, the initial population had equal amounts of autosomal and X-chromosomal genes from each species, as well as equal amounts of mitochondrial DNA and cytoplasm. Bottles were kept in an incubator at 24° and a light/dark cycle of 12 hr of each regime. The eight populations (for both species pairs) were maintained for 24 nonoverlapping generations, slightly longer than a year, with eight randomly selected males and females used to initiate each generation. In parallel, and in the same incubators, we maintained control populations of the pure species at a roughly similar population density to that of the admixed populations. At generation 5, 10, 15, and 20, we collected 50 males and 50 females from each bottle to score a suite of morphological traits (see below). We also scored behavioral traits and fertility for flies collected at generations 20, 21, and 24. Finally, we sequenced and genotyped DNA from pools of flies from the admixed populations at generation 20.
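The claim that the founding population carried equal amounts of each species' autosomes, X chromosomes, and mitochondrial DNA can be checked with a short expected-value calculation. The sketch below is illustrative only and is not part of the original analysis. It assumes that an F1 female transmits, on average, half of each species' genome in every gamete (no selection or segregation distortion), that each replicate starts with equal numbers from the four backcrosses, and that the Y chromosome is not tracked. The function name and the labels "A" and "B" for the two species are hypothetical.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

def founding_ancestry(n_females=25, n_males=25):
    """Expected species-A share of autosomes, X chromosomes, and mtDNA
    in a founding population built from all four backcrosses."""
    auto_a = auto_all = Fraction(0)
    x_a = x_all = Fraction(0)
    mito_a = mito_all = Fraction(0)
    # f1_mito: species of the F1 mother's own mother (1 = A, 0 = B)
    # sire:    species of the pure-species backcross male (1 = A, 0 = B)
    for f1_mito, sire in product((1, 0), repeat=2):
        n = n_females + n_males
        # Autosomes: one set from the F1 mother (expected half A), one from the sire.
        auto_a += n * (HALF + sire)
        auto_all += 2 * n
        # X: daughters carry a maternal and a paternal X; sons only a maternal X.
        x_a += n_females * (HALF + sire) + n_males * HALF
        x_all += 2 * n_females + n_males
        # mtDNA is inherited through the F1 mother.
        mito_a += n * f1_mito
        mito_all += n
    return auto_a / auto_all, x_a / x_all, mito_a / mito_all

print(founding_ancestry())
```

Under these assumptions all three expected proportions equal one half, which matches the balanced starting composition described above.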
Morphological traits: D. simulans and D. mauritiana
These two species differ in five known traits: area of the genital arches, frons width, the number of sex comb teeth, wing area, and number of anal plate bristles. The gene tartan, located in chromosomal arm 3R, partially controls the area of the genital arches (Hagen et al. 2019).
Area of the genital arches:
One of the most distinctive morphological differences between these two species is the shape of the genital arch in males, which can be assessed by its area (Liu et al. 1996; Laurie et al. 1997). D. simulans males have spherical arches with an average area of 11.98 × 10−3 mm2 (SE = 0.614 × 10−3 mm2) while D. mauritiana have much smaller finger-shaped arches with an average area of 3.00 × 10−3 mm2 (SE = 0.048 × 10−3 mm2). We cut the last abdominal segment of males from each of the admixed populations and the pure species. Cut segments were then mounted in Hoyer’s solution (kindly donated by Dr. Daniel Mackay). Genital arches were photographed at 1000 × magnification with a Leica microscope. The area of the genital lobes was calculated on the pictures using ImageJ (Schneider et al. 2012). For each admixed population and pure-species control population, we scored 20 males per population for a total of 480 observations per generation (160 for the admixed populations and 160 for each of the two parental species in control populations). We scored genital area at five intervals, in generations 0, 5, 10, 15, and 19, for a total of 2400 measurements. To quantify heterogeneity among genotypes (each of the two pure species, and the hybrid swarms), we fitted a generalized linear model with a continuously distributed response using the “lme4” library in the R Statistical Package. The full model included genotype (either D. simulans, D. mauritiana, or “admixed population”) and the random effect of replicate. We used Tukey honest significant difference (HSD) tests for post-hoc comparisons.
Frons width:
D. mauritiana has larger eyes than D. simulans; the former species also has a smaller linear width (and thus, area) in the frons (the cuticle between the eyes) than D. simulans (Posnien et al. 2012; Arif et al. 2013). D. simulans has a frons width of 349.65 × 10−3 mm (SE = 2.85 × 10−3 mm); D. mauritiana has a frons width of 331.22 × 10−3 mm (SE = 2.1× 10−3 mm). We scored the width of the frons in each of the two species and in the admixed populations. Flies were decapitated, and the heads mounted on double-sided tape (Scotch-brand tape # 3136) facing upwards (Posnien et al. 2012). We measured the width of the eyes and the width of the cuticle between the eyes (FW) at the height of the orbital bristles just above the antennae (Posnien et al. 2012) using a Leica dissection microscope. For each admixed population and pure species replicate, we scored 20 males for a total of 480 observations per generation. We scored flies at five time points: generation 0, 5, 10, 15, and 19. To quantify heterogeneity among genotypes, we followed an approach identical to the one described above for comparing the genital arches of D. simulans, D. mauritiana, and their hybrids. We used Tukey HSD tests for post-hoc comparisons.
Sex comb tooth number:
Coyne (1985) reported a significant difference in the number of bristles on the sex combs (the clumps of stiff bristles on the first tarsal segment of male forelegs) of D. simulans vs. D. mauritiana, with D. mauritiana strains having an average of 13.99 bristles per comb (SE = 1.121), and D. simulans an average of 9.03 bristles per comb (SE = 1.018). For sex comb preparations, prothoracic legs were dissected at the coxa with Dumont #5 forceps and were mounted in Hoyer’s solution as described above. We counted the number of teeth in the sex combs and measured the length of the tibia. This latter measurement was used as a proxy for body size. For each admixed population and pure species replicate, we scored 20 males per replicate for a total of 480 observations per generation. We scored the character at five time points (generation 0, 5, 10, 15, and 19) for a total of 2400 observations. To quantify heterogeneity among genotypes, we fitted a generalized linear model with Poisson distributed error using the “lme4” library in the R Statistical Package. The full model included genotype (either D. simulans, D. mauritiana, or “admixed population”) and the random effect of the replicate. We used Tukey HSD tests for post-hoc comparisons.
Wing area:
Wings are larger in D. mauritiana (mean = 0.970 mm2, SE = 0.011) than in D. simulans (mean = 0.789 mm2, SE = 0.009). Potential differences cannot be attributed to body size as D. simulans and D. mauritiana do not differ in tibial length (True et al. 1997, Table S2). We measured wing width and length and calculated the area assuming the shape of an ellipse (area = π × length × width). For each admixed population and pure species replicate, we scored 20 males per replicate for a total of 480 observations per generation. We scored five time points: generation 0, 5, 10, 15, and 19, for a total of 2400 observations. To quantify heterogeneity among genotypes, we followed an approach identical to the one described above for genital arches and frons width. We used Tukey HSD tests for post-hoc comparisons.
Number of anal plate bristles:
D. mauritiana females have more anal plate bristles (mean = 48.8, SE = 0.920) than do D. simulans females (mean = 33.8, SE = 0.778). For each admixed population and pure species population, we scored 20 females per population for a total of 480 observations per generation. We scored five time points: generation 0, 5, 10, 15, and 19, for a total of 2400 observations. The anal plate bristles were counted under the dissecting microscope. To detect heterogeneity among genotypes, we used an approach identical to the one described above for sex comb tooth number in the D. simulans/D. mauritiana cross. We used Tukey HSD tests for post-hoc comparisons.
Morphological traits: D. yakuba and D. santomea
These two species differ in three known traits: the nature and degree of abdominal pigmentation, the number of hypandrial bristles, and the number of sex comb teeth. The genetic basis of species differences is partially known for all three traits (Rebeiz et al. 2009; Nagy et al. 2018). The gene sc-ac partially controls the number of hypandrial bristles and number of sex combs (Nagy et al. 2018), and alleles at the tan and yellow loci partly control the interspecific difference in abdominal pigmentation (Rebeiz et al. 2009). All three genes are located on the X chromosome.
Abdominal pigmentation:
D. santomea has yellow abdominal pigmentation in both sexes, while D. yakuba (along with the other seven species of the melanogaster species subgroup) has black pigment in the posterior segments of the abdomen (Lachaise et al. 2000). To estimate the pigmentation on whole flies, we used a visual scale ranging from 0 (unpigmented areas) to 4 (dark and shiny black areas), with intermediate numbers representing intermediate levels of pigmentation (Carbone et al. 2005). Additionally, we measured the proportion of the area of each segment that was pigmented (estimated in 10% increments). To obtain the overall pigmentation score for each fly, we multiplied the percentage of the area of each segment by the pigmentation intensity, and then summed these values across the three segments (A4, A5, and A6; Carbone et al. 2005). The minimum level of pigmentation was 0, and the maximum was 1200. On average, D. yakuba has a pigmentation level of 564.15 (SD= 53.642), while D. santomea has a pigmentation level of 48.74 (SD = 11.510). The scoring was done blindly: that is, the scorer did not know the species identity, admixed population number, or the generation at which the fly was collected. For each admixed population and pure species control population, we scored 20 females and 20 males per population for a total of 960 observations per generation. We scored five time points: generation 0, 5, 10, 15, and 19. To quantify heterogeneity among genotypes, we followed an approach identical to the one described above for the genital arches of the D. simulans/D. mauritiana hybridization. We used Tukey HSD tests for post-hoc comparisons.
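The pigmentation score described above (percentage of pigmented area multiplied by intensity, summed over segments A4, A5, and A6) can be written out as a small function. This is a minimal sketch for illustration and not the authors' scoring script. The function name, the dictionary inputs, and the example fly are hypothetical, and the sketch assumes that intensity is recorded as a number between 0 and 4 and area as a multiple of 10 between 0 and 100.

```python
SEGMENTS = ("A4", "A5", "A6")

def pigmentation_score(percent_area, intensity):
    """Sum of (percent pigmented area x intensity) over segments A4-A6.

    percent_area: dict mapping segment -> pigmented area in percent (0-100, steps of 10)
    intensity:    dict mapping segment -> visual intensity score (0-4)
    """
    total = 0
    for segment in SEGMENTS:
        area = percent_area[segment]
        level = intensity[segment]
        if not 0 <= area <= 100 or area % 10 != 0:
            raise ValueError(f"{segment}: area must be 0-100 in 10% increments")
        if not 0 <= level <= 4:
            raise ValueError(f"{segment}: intensity must be between 0 and 4")
        total += area * level
    return total

# Hypothetical fly: lightly pigmented A4, darker A5, fully dark A6.
example_area = {"A4": 20, "A5": 80, "A6": 100}
example_intensity = {"A4": 1, "A5": 3, "A6": 4}
print(pigmentation_score(example_area, example_intensity))  # 20 + 240 + 400 = 660
```

With three segments, a maximum area of 100, and a maximum intensity of 4, the score is bounded by 0 and 1200, which is the range stated in the text.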
Hypandrial bristles:
D. santomea shows a derived loss of the hypandrial bristles, two sensory structures present in male genitalia in all other species of the melanogaster species subgroup, including D. yakuba (Nagy et al. 2018). We studied whether the admixed populations showed the hypandrial phenotype of D. santomea, D. yakuba, or an intermediate phenotype. We followed the same approach described in Nagy et al. (2018). Male genitalia were cut with a scalpel and the hypandria dissected with Dumont #5 forceps (112525-20; Phymep) in a drop of Ringer’s solution (Turissini et al. 2015). Hypandria were then mounted in Hoyer’s solution and put in a 60° oven for 24 hr. We scored whether hypandria had 0, 1, or 2 bristles. For each admixed population and pure species population, we scored 20 males for a total of 480 observations per generation. We scored five time points: generation 0, 5, 10, 15, and 19.
To quantify heterogeneity among hybrid swarms, we fitted a multinomial regression using the function multinom in the library nnet (Venables and Ripley 2002) where the number of hypandrial bristles was the response of the regression (three possible outcomes: 0, 1, or 2 bristles) and the genotype was the only fixed effect. The significance of the fixed effect was inferred using the function set_sum_contrasts [library car (Fox and Sanford 2011)], and a type III ANOVA [library stats (R-Core-Team 2013)] in R. To do post-hoc comparisons between crosses, we used a Two-Sample Fisher-Pitman Permutation Test (function “oneway_test”, library coin; Hothorn et al. 2006). We adjusted the P-values from these permutation tests to account for multiple comparisons using a Bonferroni correction as implemented in the function p.adjust [library stats (R-Core-Team 2013)].
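The logic of the post-hoc step (a two-sample permutation test followed by a Bonferroni adjustment) can be sketched in a few lines. The code below is an illustration in plain Python and is not the R code used in the study. It enumerates every relabeling exactly, which is only feasible for small samples, whereas the "coin" function cited above offers other ways of obtaining the null distribution. The Bonferroni step multiplies each P-value by the number of comparisons and caps the result at 1. Function names and the toy bristle counts are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Exact two-sided Fisher-Pitman permutation test on integer counts.

    The statistic is the sum of group A; its null distribution comes from
    every way of drawing len(group_a) observations from the pooled data.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    expected = Fraction(sum(pooled) * n_a, len(pooled))
    observed = abs(sum(group_a) - expected)
    extreme = total = 0
    for idx in combinations(range(len(pooled)), n_a):
        total += 1
        if abs(sum(pooled[i] for i in idx) - expected) >= observed:
            extreme += 1
    return Fraction(extreme, total)

def bonferroni(p_values):
    """Multiply each P-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1, p * m) for p in p_values]

# Toy data: hypandrial bristle counts (0, 1, or 2) in two small groups.
p = permutation_p_value([0, 0, 0, 0], [2, 2, 2, 2])
print(p, bonferroni([p, Fraction(1, 2), Fraction(1, 100)]))
```

For samples of 20 males per group, exact enumeration is impractical, so a Monte Carlo or asymptotic approximation would be needed in practice.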
Sex combs:
D. santomea and D. yakuba differ in the mean number of teeth in their sex combs. The average tooth number among D. santomea strains was 8.88 (SD = 0.66), and the average tooth number for D. yakuba strains was 7.13 (SD = 0.52) [measurements at generation 0 and in Coyne et al. (2004)]. There are differences between isofemale lines, but the average difference between species is highly significant (Coyne et al. 2004). Scoring the number of sex comb teeth in the admixed population and pure species followed the same protocol (including sample sizes) described above for D. simulans/D. mauritiana. To determine if there were differences among hybrid swarms, we used the same approach as described for sex combs in D. simulans/D. mauritiana.
Behavioral traits
In no-choice matings, conspecific copulations usually begin earlier than heterospecific copulations (Coyne 1985; Coyne et al. 2004; Matute and Coyne 2010). Similarly, conspecific copulations tend to last longer than heterospecific copulations (Price et al. 2001; Coyne et al. 2002; Chang 2004). We measured copulation latency and duration in the parental species crosses, interspecific crosses between the parental species, and crosses involving the admixed populations using no-choice mating experiments. All flies in this experiment were collected as virgins and housed in single sex vials. On day four after hatching, one female and one male were aspirated into a single vial. All mating trials were started within 1 hr of the beginning of the light cycle to maximize fly activity and female receptivity. No more than 100 vials were set up in parallel to ensure accuracy in recording when copulation began and ended. Flies were then watched constantly for 1 hr. For each of the crosses, we recorded two copulation parameters, copulation latency (the time to copulation initiation) and copulation duration (time from mounting to separation). All tests were conducted at generation 21. We describe the details for the mating experiments for the two admixed populations below.
D. simulans/D. mauritiana:
In no-choice matings, conspecific pairs of D. simulans begin copulating after 8.87 min on average (SD = 2.67), and copulations last 30.97 min on average (SD = 4.75). A similar pattern occurs in D. mauritiana (Cobb et al. 1988). The majority of conspecific matings occur within 1 hr of exposure between the potential mates (Cobb et al. 1988). Heterospecific matings between D. simulans males and D. mauritiana females happen rarely [∼5% of the pairs in a 1-hr timespan (Cobb et al. 1988)]. In the reciprocal cross, D. mauritiana males × D. simulans females, copulations occur close to 70% of the time (Cobb et al. 1988). In both types of heterospecific crosses, copulation latency is longer and copulation duration shorter than in conspecific matings (Moehring et al. 2004; Matute 2014). We measured the copulation latency and duration of matings between hybrid swarm males, collected from four of the eight populations of hybrids, and females from their pure species ancestors, and compared them to matings between the pure-species ancestors (D. simulans FC females with conspecific males, D. simulans FC females with D. mauritiana SYN males). All tests were done at generation 21. Matings were performed in 10 blocks, each containing ∼15 matings of each type.
We compared copulation latency among mating types using a linear mixed model where the response was the behavioral trait, the type of mating was the fixed effect, and the experimental block was a random effect. We used an identical approach to compare the copulation duration among mating types.
D. yakuba/D. santomea:
In no-choice matings, conspecific matings within both D. yakuba and D. santomea readily take place within 20 min and copulations last ∼30 min on average. Similar to the other species pair, over 90% of conspecific matings occur within 1 hr of starting the experiment. Heterospecific matings between D. yakuba males and D. santomea females happen less frequently [<40% of the pairs in a 1-hr timespan (Coyne et al. 2002; Matute 2010)]. In the reciprocal cross, D. santomea males × D. yakuba females, copulations are rare, occurring close to 5% of the time (Coyne et al. 2002b; Matute 2010). In both reciprocal heterospecific matings, copulation latency is longer and duration shorter than that of conspecific matings. All experimental design details were identical to those described above for the D. simulans/D. mauritiana admixed populations.
Rate of regression to the continental species:
All phenotypes in the two sets of admixed populations regressed to the parental mean of the continental species of each pair (D. yakuba or D. simulans). For each scored generation, we calculated an index that showed how similar the mean trait values were to each of the parental species:
$$I = \frac{x - \bar{x}_{\mathrm{island}}}{\bar{x}_{\mathrm{continental}} - \bar{x}_{\mathrm{island}}}$$
where $x$ is the trait value scored in the admixed population, $\bar{x}_{\mathrm{continental}}$ is the mean value for either D. yakuba or D. simulans, and $\bar{x}_{\mathrm{island}}$ is the mean value for either D. santomea or D. mauritiana. In cases where there is no transgressive segregation (i.e., extreme phenotypic values in admixed individuals outside the range of the two parental values), this index ranges from 0 (when the individual has a mean trait value identical to the minor species) to 1 (when the individual has a trait value identical to the major species). Because the index is scaled by the mean values of the parental species, individual values can fall below 0 or above 1. This index allowed us to compare dissimilar traits and their rate of change over time. We calculated the slope of the regression of this index with respect to time (i.e., the number of generations of admixture in the admixed population). We used ANCOVA to find differences in the slope of different traits. We did two ANCOVAs, one for each type of admixed population. We fitted two linear models for each admixed population. The first model was a linear fully factorial model in which the index (as defined immediately above) depended on the phenotype, the generation, and the interaction between these two terms. The second model was a linear model where the index depended on the phenotype and the generation, but no interaction between the two terms. To compare the two models, we used a likelihood ratio test (function lrtest, library lmtest; Zeileis and Hothorn 2002).
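As an illustration of how such an index and its rate of change can be computed, the sketch below assumes the index is the linear rescaling implied by its stated range: 0 at the island (minor) species mean and 1 at the continental (major) species mean. It also fits an ordinary least-squares slope of the index against generation. This is not the authors' code. The function names are hypothetical, and the per-generation index values in the example are invented for demonstration only. The parental means used in the example are the genital arch areas reported earlier (11.98 for D. simulans and 3.00 for D. mauritiana, in units of 10−3 mm2).

```python
def similarity_index(value, major_mean, minor_mean):
    """Rescale a trait value so that the minor-species mean maps to 0
    and the major-species mean maps to 1 (assumed form of the index)."""
    return (value - minor_mean) / (major_mean - minor_mean)

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# A value halfway between the parental means has an index close to 0.5.
print(similarity_index(7.49, major_mean=11.98, minor_mean=3.00))

# Invented index values at the scored generations, for demonstration only.
generations = [0, 5, 10, 15, 19]
index_values = [0.50, 0.70, 0.85, 0.95, 0.99]
print(ols_slope(generations, index_values))
```

Because every trait is rescaled to the same 0 to 1 interval, slopes from traits measured in different units (areas, counts, pigmentation scores) become directly comparable, which is what the ANCOVA described above relies on.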
Male fertility
We scored the motility of sperm in males from four of the eight replicate admixed populations for each of the cross types. We also scored the male offspring produced when males or females from the hybrid swarms were crossed to virgin females or males of both parental species. This last analysis was conducted to see whether individuals from the admixed populations resembled one parental species more than the other, because interspecific crosses always yield completely sterile males. The sperm-motility controls comprised males from the four parental species as well as hybrid males from reciprocal crosses between both pairs of species used to found the admixed populations.
We used sperm motility as an index of male fertility (Coyne 1984, 1989). Males were collected as virgins from stock bottles and aged 4 days at a density of 25 males per 30-ml vial; their testes were then extracted, crushed in Ringer’s solution, and examined under a compound microscope (Leica). As in the studies cited previously, we counted males lacking any motile sperm, including those lacking spermatids, as “sterile,” and those with at least one motile sperm as “fertile.” Tests were done 20 generations after the admixed populations were created. We compared the proportion of sterile males among genotypes using the function prop.test [“stats” R library (R-Core-Team 2013)]. To calculate the Bayesian confidence intervals for male sterility for each type of cross, we used the function binom.cloglog [“binom” library (Dorai-Raj 2016)].
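For readers unfamiliar with the two-sample test of equal proportions, the following Python sketch shows the statistic computed without continuity correction, which for two groups is the Pearson chi-square with 1 d.f. The function name and the counts are hypothetical and are not data from this study; it is an illustration of the test, not a replacement for prop.test.

```python
import math

def two_sample_prop_test(x1, n1, x2, n2):
    """Chi-square test for equality of two proportions,
    without continuity correction (1 degree of freedom).

    x1, x2: number of fertile males; n1, n2: number of males scored.
    Returns (chi2, p_value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    variance = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    chi2 = (p1 - p2) ** 2 / variance
    # For 1 d.f., the upper tail of the chi-square distribution
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Illustrative counts only: 90/100 fertile vs. 80/100 fertile.
chi2, p_value = two_sample_prop_test(90, 100, 80, 100)
```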
Genetic ancestry within hybrid populations
DNA extraction, library preparation, and sequencing:
To estimate ancestry within each admixed population, DNA was extracted from pools of 60 flies (30 females and 30 males) in the 20th generation. DNA was extracted using the QIAamp DNA Micro Kit (Qiagen, Chatsworth, CA). Libraries were prepared and multiplexed at the North Carolina State University Genome Services Laboratory. Approximately 15–20 million paired-end reads (68 bp for the D. mauritiana/D. simulans crosses, and 48 bp for the D. yakuba/D. santomea crosses) were sequenced for each pool using Illumina GAIIx technology at the University of North Carolina High-Throughput Sequencing Facility.
To facilitate analyses to estimate ancestry within each hybrid population, we estimated allele frequencies for groups of isofemale lines sequenced for each of the four parental species we used in our experiment. We extracted DNA and sequenced 13 D. santomea lines, 13 D. mauritiana lines, 29 D. simulans lines, and 56 D. yakuba lines (accession numbers in Table S1). One hundred and five of these genomes have been previously published (Garrigan et al. 2012; Brand et al. 2013; Turissini and Matute 2017; Schrider et al. 2018; Turissini et al. 2018). For the remaining six genomes for D. simulans, we generated genomic DNA libraries using Nextera kits. Libraries were barcoded, pooled, and sequenced on a HiSeq 2000 machine. Pooling was done randomly, and six lines were sequenced per lane. The HiSeq 2000 machine was run with chemistry v3.0 using the 2 × 100 bp paired-end read mode. We verified the quality of the obtained reads using the HiSeq Control Software 2.0.5 in combination with RTA 1.17.20.0 (real-time analysis) and CASAVA-1.8.2. Resulting reads were 100 bp, and the average coverage for each line was 20×. The accession numbers for genomic data collected from each parental line are listed in Table S1.
Alignment and variant calling:
We aligned reads from D. mauritiana/D. simulans admixed populations and pure-species D. simulans and D. mauritiana lines to a published D. simulans genome assembly [“w501” version 2 (Hu et al. 2013)]. For D. yakuba/D. santomea admixed populations and pure-species D. yakuba and D. santomea lines, we mapped reads to an unpublished chromosome-level assembly generated for D. yakuba (“NY73PB”; provided by P. Andolfatto and J. J. Emerson). To assess whether there was any bias in mapping reads to one of the two parental species’ reference genomes, we also mapped reads (for D. mauritiana/D. simulans) to a D. mauritiana assembly that has been anchored to the D. melanogaster genome (Nolte et al. 2013) and (for D. yakuba/D. santomea) to an unpublished D. santomea assembly generated using Pacbio and Illumina reads (“STOCAGO1482”; provided by P. Andolfatto and J. J. Emerson). For each admixed population and each parental line, we mapped reads to each of these reference genomes using bwa-mem (Li and Durbin 2009; Li 2013). Mapped reads were sorted, and duplicates were removed using Picard (http://broadinstitute.github.io/picard; Broad Institute 2016). For each pool of sequences from the admixed populations, we generated allele counts, at each variable site, using samtools’ mpileup (v1.4) and the “mpileup2sync.pl” script distributed with Popoolation2 (Kofler et al. 2011). These allele counts were then used when estimating ancestry at each site (see Ancestry-HMM below).
For each pure-species line, we called variants using GATK (McKenna et al. 2010; DePristo et al. 2011). We realigned data from each line around indels using GATK’s RealignerTargetCreator and IndelRealigner tools (v3.8; McKenna et al. 2010). We then estimated genotypes for each line using GATK’s HaplotypeCaller tool with options “--emitRefConfidence GVCF”, “--minReadsPerAlignmentStart 4”, “--standard_min_confidence_threshold_for_calling 8.0”, and “--minPruning 4”. We then performed joint genotyping using GATK’s GenotypeGVCFs tool for all D. yakuba and D. santomea lines, and all D. simulans and D. mauritiana lines. We filtered SNPs using GATK’s VariantFiltration tool with option “--filterExpression "QD < 2.0 || FS > 60.0 || SOR > 3.0 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0"” and hard-filtered sites genotyped in fewer than 50% of individuals [VCFtools (v0.1.15) option “--max-missing 0.5”]. We then generated allele frequency estimates, from the final set of filtered variants, for each group of pure-species individuals using VCFtools’ “--freq” option. These estimates of allele frequency were used to generate panels of reference SNPs used in the analysis described below.
Ancestry-HMM:
We used the software Ancestry-HMM (Corbett-Detig and Nielsen 2017) to estimate ancestry (i.e., the parental species-of-origin) that still segregated within each of the admixed populations. We generated reference panels for each parental species using allele frequency estimates generated from genotyped pure-species lines. We took this approach because sequence data from the parental lines used to generate the admixed populations are not available, but note that this would be the preferable experimental design. For a site to be included in our analysis, it had to pass all filters applied when genotyping the pure-species lines and have an allele frequency difference between the two pure-species >0.5. We used pure-species allele frequencies for each of these “high-quality” sites, as well as allele counts within the admixed populations as input when running Ancestry-HMM. We assumed a recombination rate of 5 × 10−6/bp between sites (because fine-scale recombination maps currently do not exist for the species we study here; Singh et al. 2005; Fiston-Lavier et al. 2010). The script used to generate the input panel for Ancestry-HMM is available at https://github.com/comeaultresearch/genomics_scripts/. Supplementary tables can be found in figshare: TBD.
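The allele frequency criterion for including a site can be sketched in a few lines of Python. This is an illustration only: the function name, the dictionary layout, and the example sites are hypothetical, and the script actually used to build the input panel is the one in the repository cited above.

```python
def ancestry_informative_sites(freq_major, freq_minor, min_diff=0.5):
    """Keep sites whose allele frequency differs by more than `min_diff`
    between the two pure-species panels.

    freq_major, freq_minor: dicts mapping (chrom, pos) -> frequency of the
    same allele in each species. Sites absent from either panel (for
    example, removed by genotype filters) are skipped.
    """
    kept = {}
    for site, f_major in freq_major.items():
        f_minor = freq_minor.get(site)
        if f_minor is None:
            continue
        if abs(f_major - f_minor) > min_diff:
            kept[site] = (f_major, f_minor)
    return kept

# Made-up frequencies for four sites.
major = {("2L", 100): 0.9, ("2L", 250): 0.6, ("X", 40): 1.0, ("3R", 7): 1.0}
minor = {("2L", 100): 0.1, ("2L", 250): 0.3, ("X", 40): 0.5}

panel = ancestry_informative_sites(major, minor)
```

In this toy example only the first site is retained: the second differs by 0.3, the third differs by exactly 0.5 (the threshold is strict), and the fourth is missing from one panel.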
We ran Ancestry-HMM specifying a single admixture event 20 generations ago where the admixed population was generated with a 50% contribution from each parental species (“-P 0.5”). We also specified expected ancestry proportions within the admixed populations at the time of sequencing of 10% “minor” parent species and 90% “major” parent species (-a option in Ancestry-HMM). We chose these proportions based on the phenotypic results described below. Parameter specifications when running Ancestry-HMM were therefore similar (but not identical) to the experimental design of the admixed populations because the hybrid populations were generated using backcrossed males.
Ancestry-HMM provides posterior probabilities (PPs) for each genotype at each site. For example, a putatively admixed diploid individual would be assigned PPs of being homozygous for parental species one ancestry, homozygous for parental species two ancestry, and heterozygous, at each site. Because we sequenced pools of individuals from our admixed populations, it was not appropriate to assume a diploid genotype. The ideal ploidy should be 120 for the autosomes and 90 for the X-chromosomes, the number of chromosomes in each pool. Our attempts using this ploidy were not successful. We ran Ancestry-HMM assuming a ploidy of eight, the mean per-site coverage for the santomea/yakuba populations. We interpreted the PPs provided by Ancestry-HMM (9 PPs, one for each n = 8 genotype) as estimates of parental allele frequencies segregating in a hybrid population at a given site. Using this approach, an ancestry estimate of 0 (n = 8 genotype 8|0) represents a site fixed for either D. santomea or D. mauritiana ancestry, an ancestry estimate of 1 (n = 8 genotype 0|8) represents a site fixed for either D. yakuba or D. simulans ancestry, and an ancestry estimate of 0.625 (n = 8 genotype 3|5), for example, represents a site where ancestry is segregating within the population, and 62.5% of alleles come from D. yakuba or D. simulans. We calculated mean ancestry across ancestry-informative sites in 5000-bp windows across the genome in each admixed population. For a site to be considered ancestry-informative, the PP of any single genotype had to be >0.33.
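The bookkeeping described above can be sketched in Python. The sketch makes several assumptions that are not stated in the text: posterior probabilities are indexed by the number of major-species copies (0 to 8), a site's ancestry estimate is taken from its single most probable genotype, and windows start at multiples of 5000 bp. All names and values are hypothetical.

```python
from collections import defaultdict

PLOIDY = 8      # ploidy supplied to Ancestry-HMM
WINDOW = 5000   # window size in bp
PP_MIN = 0.33   # minimum posterior probability for an informative site

def pooled_chromosome_counts(n_females, n_males):
    """Chromosome copies in a pool: autosomes are diploid in both sexes,
    females carry two X chromosomes and males one."""
    return 2 * (n_females + n_males), 2 * n_females + n_males

def site_ancestry(pps):
    """pps[k] is the posterior probability of the genotype with k
    major-species copies (k = 0..8). Returns k / 8 for the most probable
    genotype, or None if no genotype has PP > PP_MIN."""
    if len(pps) != PLOIDY + 1:
        raise ValueError("expected one PP per genotype (9 values)")
    best = max(range(PLOIDY + 1), key=lambda k: pps[k])
    if pps[best] <= PP_MIN:
        return None
    return best / PLOIDY

def window_means(sites):
    """sites: iterable of (position, pps) on one chromosome arm.
    Returns {window_start: mean ancestry over informative sites}."""
    sums, counts = defaultdict(float), defaultdict(int)
    for pos, pps in sites:
        ancestry = site_ancestry(pps)
        if ancestry is None:
            continue
        start = (pos // WINDOW) * WINDOW
        sums[start] += ancestry
        counts[start] += 1
    return {start: sums[start] / counts[start] for start in sums}

fixed_major = [0.0] * 8 + [1.0]                       # genotype 0|8
three_five = [0, 0, 0, 0, 0, 1.0, 0, 0, 0]            # genotype 3|5
uninformative = [1 / 9] * 9                           # no PP above 0.33

means = window_means([(100, fixed_major), (4200, three_five),
                      (5100, uninformative), (7500, fixed_major)])
```

With 30 females and 30 males, `pooled_chromosome_counts` returns 120 autosomal and 90 X-linked copies, matching the ideal ploidies given above.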
Data availability
Phenotypic measurements have been deposited in Dryad (https://doi.org/10.5061/dryad.rn8pk0p5s). All raw read data have been deposited in the Short Read Archive. The accession numbers are listed in Table S1. All scripts used to summarize mean ancestry across windows are available at https://github.com/comeaultresearch/genomics_scripts/. Supplemental material available at figshare: https://doi.org/10.25386/genetics.11113676.
Results
Phenotypic characteristics of the admixed populations
D. simulans/D. mauritiana:
Morphological traits:
We scored five morphological traits that differentiate D. simulans and D. mauritiana. There was no significant change of any trait in the populations of the parental species over the 20 generations of the experiment (Figure 1 and Table S3 rows A and B). In all mauritiana/simulans admixed populations, morphological traits reverted to the D. simulans parental line within a few generations. Pairwise comparisons indicate that most traits (four out of five) regressed completely to the mean trait value of D. simulans after 20 generations. The only exception to this pattern was the number of teeth in the sex combs (mean = 9.447, SD = 0.240), which nevertheless was still much closer to the mean value of pure D. simulans (8.993, SD = 0.286) than to that of D. mauritiana (mean = 13.992, SD = 0.363), but significantly different from the mean trait in both pure species (Tables S3 and S4). Admixed populations differed significantly from D. mauritiana in all pairwise comparisons. After 20 generations, the mean trait values of all replicates of the admixed populations were similar to D. simulans in all the five scored traits (Tukey HSD tests, Table S4).
Within 20 generations of formation, all admixed populations between D. simulans and D. mauritiana show phenotypic mean trait values similar to D. simulans and different from D. mauritiana. Each point shows the mean trait value of each of the eight admixed populations at a given generation. All replicates of the parental species are shown as a single trend-line as they showed no change in their mean trait value within the 20 generations of the experiment (solid gray line: D. simulans; dashed gray line: D. mauritiana). (A) Number of teeth in the sex combs. (B) Area of the male genital lobe. (C) Frons width. (D) Number of bristles in the anal plate. (E) Male wing area.
In addition to the final mean trait values, we assessed the rate at which the trait values in the admixed populations regressed to the mean trait values of pure D. simulans. Figure 2 shows the normalized rate of phenotypic evolution throughout the experiments for the five phenotypes. This metric allowed us to compare the rate of evolution of dissimilar phenotypes as time passed after the initial admixture. First, we compared the intercepts of the regressions. This comparison tests whether there are differences among mean trait values at the beginning of the experiment. We found significant differences in the intercepts among traits (LRT, d.f. = 1, ΔL = 1352.0, P < 1 × 10−10), which probably reflect differences among traits in the coefficient of dominance and in the number of alleles involved.
The rate of evolution of mean trait values in the simulans/mauritiana admixed populations differed among traits as the generations passed after admixture. After 20 generations of admixture, all the simulans/mauritiana admixed populations showed mean trait values similar to those observed in D. simulans. Each point shows the normalized mean at a given generation for each of the eight admixed populations. The lines show the best-fitting linear regressions of the normalized value of each trait on the time since admixture. The five different colors show the five traits measured in the simulans/mauritiana admixed populations.
We next assessed whether the rate of regression to the D. simulans mean differed among traits. This analysis showed that the slopes of regression were significantly heterogeneous (LRT, d.f. = 4, ΔL = 381.4, P < 1 × 10−10). These results strongly suggest that the alleles, or linked alleles, involved in these interspecific differences are purged (or retained) differently in these admixed populations. The reversion of all traits to the D. simulans mean in a short time shows that alleles involved in producing D. mauritiana phenotypes did not fare well in the admixed population, but that not all genomic regions have the same propensity to be purged.
Male fertility:
We scored male sterility after 20 generations of admixture. As expected, males from both pure species were overwhelmingly fertile. Of the 650 scored D. mauritiana males, 630 had motile sperm (0.97; 95% confidence intervals: 0.953–0.980). D. simulans showed a similar pattern: of the 598 scored males, 570 were fertile (0.953; 95% confidence intervals: 0.932–0.967). All F1 males produced between these two lines (the two reciprocal directions pooled, n = 1510) were sterile, as has been reported many times (0; 95% confidence intervals: 0.000–0.002).
Next, we scored the fertility of males from the admixed populations. We scored individuals from four of the eight replicate admixed populations. Males from the admixed population were largely fertile: at generation 21, 213 out of 234 scored males (0.91; 95% confidence intervals: 0.866–0.940) were fertile, with no heterogeneity among males from the four replicates (χ2 = 6.56, d.f. = 3, P = 0.08). The overall mean fertility of the admixed populations was similar to but significantly lower than the fertility of parental species (2-sample test for equality of proportions without continuity correction: χ2 > 11.52, d.f. = 1, P < 8.861 × 10−4 for the two possible comparisons).
Then we scored the fertility of the male progeny produced from crosses between individuals of the admixed populations and the two pure species. We found no heterogeneity among admixed populations in terms of their magnitude of postzygotic isolation in crosses to D. simulans (χ2 = 3.75, d.f. = 3, P = 0.29 for crosses between D. simulans females and admixed-population males; χ2 = 3.95, d.f. = 3, P = 0.27 for crosses between admixed-population females and D. simulans males); we thus analyze all the progeny from the admixed populations as a pool. When admixed-population males were crossed to D. simulans females, or when admixed-population females were crossed to D. simulans males, male offspring were almost completely fertile: the proportion of fertile males was 0.97 (n = 315; 95% confidence intervals: 0.942–0.983) for the former cross and 0.96 (n = 145; 95% confidence intervals: 0.910–0.981) for the latter. There was no difference between these reciprocal crosses (2-sample test for equality of proportions without continuity correction: χ2 = 0.275, d.f. = 1, P = 0.600), indicating that regardless of the direction of the cross, individuals from the admixed populations mostly produced fertile male offspring when crossed to D. simulans. The fertility of males from the crosses between D. simulans and the admixed populations was similar to the fertility of pure D. simulans in the two reciprocal directions (2-sample test for equality of proportions without continuity correction: χ2 < 0.1909, d.f. = 1, P > 0.662).
When admixed-population individuals were crossed to D. mauritiana, every male offspring was sterile. The complete sterility of these offspring was observed in the two reciprocal crosses: admixed-population males crossed to D. mauritiana females (n = 126, 95% confidence intervals: 0.000–0.029), and when admixed-population females were crossed to D. mauritiana males (n = 190, 95% confidence intervals: 0.000–0.019). This complete sterility is observed for crosses between D. mauritiana and D. simulans (Coyne 1984). These results suggest that for genes causing sterility of D. simulans/D. mauritiana hybrids, the alleles reverted over 20 generations to the ones present in D. simulans.
Mating behavior:
We studied copulation latency and duration in a subset of the admixed populations (four of the eight replicates). For both traits, males from the four tested replicates showed little variation in mating behavior when mated to D. simulans females. The mean latency of these crosses between males from the admixed populations and D. simulans females was 11.33 min (SD = 6.88); the mean duration was 28.336 min (SD = 9.86). We detected no heterogeneity among these crosses in either copulation latency (one-way ANOVA, F1,132 = 1.493, P = 0.224) or duration (one-way ANOVA, F1,132 = 2.904, P = 0.091). For the analyses that follow, we pooled observations from all admixed populations into a single category. Copulation latency in crosses between D. simulans females and admixed population males was similar to that of crosses between D. simulans females and males (mean = 8.865 min, SE = 2.616), and shorter than the latency in crosses between D. simulans females and D. mauritiana males (mean = 24.353 min, SE = 5.694, Table S5 shows all pairwise comparisons). Copulation duration shows a similar pattern. Matings between D. simulans females and admixed population males showed similar copulation duration to crosses between D. simulans females and males (mean = 30.973 min, SE = 4.752) but were longer than matings between D. simulans females and D. mauritiana males (mean = 9.853 min, SE = 3.886, Table S5 shows all pairwise comparisons). At the genes responsible for this discrimination against D. simulans females, then, the admixture appears to have reverted to D. simulans alleles over the 20 generations of the experiment, similar to what occurred at genes responsible for hybrid sterility and the morphological differences between species. Please note that since we did not assay admixed males and D. mauritiana females, we cannot exclude the possibility that the admixed males are effective in courting both parental species.
D. yakuba/D. santomea:
Morphological traits:
We studied morphological evolution in D. yakuba/D. santomea admixed populations similarly to the procedure in the D. simulans/D. mauritiana admixed populations. In this case, we scored three morphological traits that differentiate D. yakuba and D. santomea. In the control populations of both species, there was no significant change in the average values of any of the three traits over the 20 generations of the experiment (Table S6). We observed minor variation in D. santomea across generations but no directional change in any trait between generation 0 and 20 (Table S6). After 20 generations of admixture, the mean values of all three traits in the admixed populations became similar to those of D. yakuba (Figure 3). All the mean trait values and results from the linear models are shown in Table S6; morphological traits rapidly reverted to the D. yakuba parental line by generation 20. Pairwise comparisons indicate that the admixed populations showed traits similar but not identical to those of D. yakuba but much more different from the mean traits in D. santomea (Table S7).
All admixed populations between D. yakuba and D. santomea show phenotypic mean trait values similar to those of D. yakuba and different from those of D. santomea within 20 generations of admixture. Each point shows the mean trait value of each of the eight admixed populations at a given generation. All replicates of the parental species are shown as a single trend-line as they showed no change in their mean trait value within the 20 generations of the experiment (dashed gray line: D. santomea; solid gray line: D. yakuba). (A) Number of teeth in the sex combs. (B) Number of hypandrial bristles. (C) Abdominal pigmentation score.
We also measured the rate at which these three phenotypes regressed to D. yakuba. Figure 4 shows the rate of phenotypic evolution of the three measured traits throughout the experiment (20 generations). First, we compared the intercepts. We found that there was significant heterogeneity in the intercepts (LRT, d.f. = 1, ΔL = 264.57, P < 1 × 10−10). The traits also differ in their rate of regression to the major species (LRT, d.f. = 2, ΔL = 34.385, P = 3.41 × 10−8). Similar to the observation in the simulans/mauritiana admixed populations, these results indicate that the alleles that are involved in producing D. santomea phenotypes were purged from the admixed populations at different rates.
The rate of evolution of the mean trait value in the yakuba/santomea admixed populations during 20 generations of experimental admixture differed among three phenotypic traits that differentiate between the two parental species. After 20 generations of admixture, all the yakuba/santomea admixed populations showed mean trait values similar to those observed in D. yakuba, as shown by the index value being close to 1 (see text). Each point shows the mean value of each admixed population. The line shows the best linear regression of all observations, not only the means. The three different colors show the three traits: number of hypandrial bristles, number of teeth on the sex combs, and abdominal pigmentation.
Male fertility:
As is the case with the other species pair, pure species were largely fertile. Of 435 D. yakuba SYN males, 425 showed motile sperm (0.968; 95% confidence intervals: 0.947–0.981). Of 456 D. santomea STO.4 males, 429 were fertile (0.941; 95% confidence intervals: 0.915–0.959). F1 hybrid males from both directions of the cross (the two reciprocal directions pooled, n = 1952) were sterile (0; 95% confidence intervals: 0.000–0.002).
We also scored the fertility of males from the admixed populations 20 generations after the experiment started. At generation 21, fertility was not heterogeneous among males from the four assayed replicates (χ2 = 5.81, d.f. = 3, P = 0.12). The majority of the males from these admixed populations were fertile (n = 98, mean fertility = 0.89; 95% confidence intervals: 0.807–0.936). Their fertility was lower than that of pure D. yakuba (2-sample tests for equality of proportions without continuity correction, χ2 = 16.836, d.f. = 1, P = 4.076 × 10−5) but not significantly lower than that of D. santomea males (χ2 = 3.551, d.f. = 1, P = 0.06).
We next scored the fertility of male progeny produced from crosses between individuals of the admixed population and the constituent pure species. When males from the four admixed populations were crossed to D. yakuba females, or when admixed-population females were crossed to D. yakuba males, male offspring were largely fertile: the proportion of fertile males was 0.99 (n = 90; 95% confidence intervals: 0.924–0.998) for the former cross and 0.74 (n = 73; 95% confidence intervals: 0.613–0.816) for the latter. The reciprocal crosses differed significantly in male fertility (χ2 = 6.617, d.f. = 1, P = 0.0101). Admixed-population males crossed to D. santomea females produced only sterile males (n = 124, 95% confidence intervals: 0.000–0.029). Admixed-population females crossed to D. santomea males also produced exclusively sterile males (n = 223, 95% confidence intervals: 0.000–0.016). This complete sterility is similar to the complete sterility between D. yakuba and D. santomea. Thus, the admixed population males behaved, in their sterility relationships, as if they were D. yakuba.
Mating behavior:
Finally, we studied the copulation latency and duration of crosses between admixed-population males and D. yakuba females. We found no variation in the two components of mating behavior among replicates of the yak/san admixed population: there was no heterogeneity across admixed populations in either copulation latency (one-way ANOVA, F1,128 = 1.282, P = 0.260) or copulation duration (one-way ANOVA, F1,128 = 2.731, P = 0.101). We next compared the latency and duration of crosses between the admixed-population males (all populations pooled) with D. yakuba females to matings between pure D. yakuba males and females (i.e., conspecific crosses) as well as with crosses between D. santomea males and D. yakuba females. Copulation latency in crosses between D. yakuba females and admixed-population males (mean = 13.031 min, SD = 7.193) was similar to that observed in crosses between pure D. yakuba females and males (mean = 12.243 min, SD = 5.619), and was shorter than the latency in crosses between D. yakuba females and D. santomea males (mean = 30.542 min, SE = 10.815; Table S8 shows all the pairwise comparisons). Copulation duration shows a similar pattern. Matings between D. yakuba females and admixed population males (mean = 36.231 min, SD = 8.349) showed similar copulation duration to crosses between pure D. yakuba females and males (mean = 34.568 min, SD = 9.194) but were longer than matings between D. yakuba females and D. santomea males (mean = 25.00 min, SE = 11.451; Table S8 shows all the pairwise comparisons). As with the other traits, in these cases, the mating behavior of admixed-population males and females resembled that of pure D. yakuba.
Genetic ancestry within hybrid populations
In both species pairs, and regardless of the reference genome we aligned reads to, ancestry was highly and consistently biased toward one parent species over the other (Figure 5). In the admixed mauritiana/simulans populations, 92.9–99.5% (range across the eight populations) of sites were fixed, or nearly fixed, for D. simulans ancestry (i.e., had a PP > 0.33 for ancestry estimates of 0.875 or 1), 0.5–7.4% of sites still segregated for both parental ancestries (i.e., had a PP > 0.33 for ancestry estimates ranging from 0.25 to 0.75), and 0.07–0.09% of sites were fixed, or nearly fixed, for D. mauritiana ancestry (i.e., had a PP > 0.33 for ancestry estimates of 0.125 or 0; Figure 5A). In the santomea/yakuba populations we found a pattern similar to that in the mauritiana/simulans populations: 96.0–99.2% of sites were fixed, or nearly fixed, for D. yakuba ancestry, 1.1–4.8% of sites still segregated for both parental ancestries, and no sites were found to have a high PP of being fixed, or nearly fixed, for D. santomea ancestry (Figure 5B). Note that in Figure 5 the total proportion varies, and is slightly >1 in each population, because a single site can have PP > 0.33 both for a genotype suggesting segregating ancestry and, for example, for a genotype suggesting fixed “major” parent species’ ancestry.
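The reason the category proportions can sum to more than 1 is easiest to see in a small Python sketch. It assumes posterior probabilities indexed by the number of major-species copies (0 to 8), so that ancestry estimates of 0.875 and 1 correspond to 7 and 8 copies; the helper names and the four example sites are hypothetical.

```python
def classify_site(pps, pp_min=0.33):
    """Return every category supported by a genotype with PP > pp_min.
    pps[k] is the PP of the genotype with k major-species copies (0..8)."""
    categories = set()
    for k, pp in enumerate(pps):
        if pp <= pp_min:
            continue
        if k >= 7:            # ancestry estimate 0.875 or 1
            categories.add("fixed_major")
        elif k <= 1:          # ancestry estimate 0.125 or 0
            categories.add("fixed_minor")
        else:                 # ancestry estimate 0.25 to 0.75
            categories.add("segregating")
    return categories

def category_proportions(sites):
    """Proportion of sites falling in each category; a site supported
    by two categories is counted in both."""
    totals = {"fixed_major": 0, "segregating": 0, "fixed_minor": 0}
    for pps in sites:
        for category in classify_site(pps):
            totals[category] += 1
    return {c: n / len(sites) for c, n in totals.items()}

def pps_from(pp_by_copies):
    """Build a 9-element PP list from a sparse {copies: PP} dict."""
    return [pp_by_copies.get(k, 0.0) for k in range(9)]

sites = [
    pps_from({8: 1.0}),
    pps_from({6: 0.40, 7: 0.45, 8: 0.15}),   # supports two categories
    pps_from({4: 0.90, 5: 0.10}),
    pps_from({7: 0.05, 8: 0.95}),
]
proportions = category_proportions(sites)
```

Here the second site has PP > 0.33 for both a segregating genotype and a nearly fixed one, so the proportions (0.75 fixed for the major species, 0.5 segregating, 0 fixed for the minor species) sum to 1.25.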
Genetic ancestry rapidly and consistently regressed to that of one of the two parental species in all admixed populations. (A) The proportion of sites either fixing for D. simulans ancestry or still segregating for both parental species’ ancestry in each of the eight admixed D. mauritiana/simulans populations. (B) The proportion of sites either fixing for D. yakuba ancestry or still segregating for both parental species’ ancestry in each of the eight admixed D. santomea/yakuba populations. Sites were considered to still be segregating for both parental species’ ancestry if any of the ploidy = 8 genotypes 2 | 6 through 6 | 2 received a posterior probability >1/3. The left bar for each population summarizes results obtained when mapping to either the D. mauritiana (A) or the D. santomea reference genomes (B). Bars to the right, for each population, summarize results obtained when mapping to either the D. simulans (A) or D. yakuba (B) reference genomes.
We also observed a reference bias when estimating ancestry (compare pairs of bars in Figure 5). Specifically, we found a higher proportion of sites with high PP of still segregating for both parental species’ ancestry when sequences were initially mapped to the minor species’ reference genome. For example, when ancestry was estimated following initial mapping to the D. santomea reference genome, we found an increase in the proportion of sites estimated to be segregating for both ancestries of 2.9–17.9% (compare left and right bars in Figure 5B). Despite this bias, we still find unambiguous evidence that the majority of sites (>80%) have fixed for “major” species’ ancestry. Because our overall conclusion (predictable regression to ancestry of one parent species over the other) is not affected by this reference genome bias, we focus on results obtained when mapping to the major-species’ reference genome (i.e., D. yakuba or D. simulans) and note that mapping errors can result in an underestimate of the number of segregating sites from the minor species (i.e., D. santomea or D. mauritiana).
We next summarized ancestry within 5-kb genomic windows and tested whether segregating ancestry was unevenly distributed across chromosomes or chromosomal arms. We found that genomic windows that had evidence of segregating ancestries (i.e., an ancestry estimate < 0.8) were unevenly distributed across chromosomal arms (test of equal proportions: all P < 0.00001). In the mauritiana/simulans populations, chromosomal arms 3L and 3R had the highest proportion of windows with segregating ancestry in four populations each (excluding the small fourth chromosome; Figure 6A and Figure 7A). The proportion of windows with segregating ancestry for these “segregating” regions of the genome was still low and ranged from 0.5 to 20.8% of windows on that chromosomal arm. Interestingly, in two of the mauritiana/simulans populations, almost all of chromosome 4 (96.0% and 98.5% of windows) still segregated for both parental species’ ancestry, and in a third population, 30% of chromosome 4 segregated for both parental species’ ancestry. In the other five populations, there were no windows on chromosome 4 still segregating for both parental species’ ancestry.
Genome-wide distribution of ancestry in all admixed populations. Heatmaps showing ancestry estimates summarized in 5-kb genomic windows for each chromosome or chromosomal arm in the D. simulans (A) and D. yakuba (B) reference genomes. Each row is a different admixed population and colors reflect ancestry ranging from 0 (fixed for “minor” parent ancestry) to 1 (fixed for “major” parent ancestry). The bottom row summarizes the number of populations that showed evidence of a given genomic window still segregating for both parental species’ ancestry (i.e., ancestry estimate < 0.8).
The proportion of genomic windows where both parental species’ ancestry still segregated varied across chromosomes. Each point represents the proportion of 5-kb genomic windows that have evidence for both parental ancestries still segregating after 20 generations following initial hybridization between the parental species. (A) D. simulans/D. mauritiana; (B) D. yakuba/D. santomea.
Segregating ancestry was also unevenly distributed across the genome in the yakuba/santomea populations (Figure 6B and Figure 7B): in seven of the eight admixed populations, chromosomal arm 2R retained the highest proportion of windows that still segregated for ancestry. The proportion of windows on chromosomal arm 2R still segregating for ancestry ranged from 3.8 to 19.3%. In the eighth population, 6.05% of windows on chromosomal arm 2L still segregated for ancestry.
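The per-arm summary described above can be sketched in a few lines of Python. This is an illustration only and not the pipeline used in this study: the input layout (one (arm, ancestry) pair per 5-kb window), the function names, and any values passed in are assumptions; only the < 0.8 criterion and the use of a test of equal proportions come from the text. The statistic computed is the Pearson chi-square for homogeneity of proportions across arms, whose P-value would be read from a chi-square distribution with (number of arms − 1) degrees of freedom.

```python
from collections import defaultdict

SEGREGATING_THRESHOLD = 0.8  # from the text: ancestry estimate < 0.8


def segregating_counts(windows, threshold=SEGREGATING_THRESHOLD):
    """Return {arm: (n_segregating, n_total)} from (arm, ancestry) pairs.

    `windows` is a hypothetical input: one pair per 5-kb genomic window.
    """
    counts = defaultdict(lambda: [0, 0])
    for arm, ancestry in windows:
        counts[arm][1] += 1
        if ancestry < threshold:
            counts[arm][0] += 1
    return {arm: (seg, tot) for arm, (seg, tot) in counts.items()}


def equal_proportions_chi2(counts):
    """Pearson chi-square statistic and degrees of freedom for
    H0: the proportion of segregating windows is the same on every arm."""
    total_seg = sum(seg for seg, _ in counts.values())
    total = sum(tot for _, tot in counts.values())
    pooled = total_seg / total
    if pooled in (0.0, 1.0):
        raise ValueError("statistic undefined when no (or all) windows segregate")
    chi2 = 0.0
    for seg, tot in counts.values():
        expected_seg = tot * pooled
        expected_non = tot * (1.0 - pooled)
        chi2 += (seg - expected_seg) ** 2 / expected_seg
        chi2 += ((tot - seg) - expected_non) ** 2 / expected_non
    return chi2, len(counts) - 1
```

Computing the statistic by hand keeps the sketch free of external dependencies; in practice a statistics library would also supply the P-value.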
We next tested whether genomic regions that retained ancestry of both parental species were shared across the different admixed populations. Our goal was to test whether selection might be acting to maintain mixed ancestry at specific regions of the genome. Under this hypothesis, we predicted that the same regions (i.e., genomic windows) would segregate for both parental ancestries across multiple, independent hybrid populations. In the simulans/mauritiana admixed populations, only 55 genomic windows maintained both ancestral alleles in four or more populations. Thirty of these windows were located within an 11.4-Mb region on chromosomal arm 3L (positions 8,355,000 to 19,850,000; Figure 6A) that contained 1534 genes. Only a single genomic window on each of chromosomal arms 2L and 2R and two genomic windows on each of 3L and 3R were found to still segregate for both parental ancestries in more than four populations. One region that spanned 125 kb on the X (positions 8,990,000 to 9,115,000) contained 15 windows that were fixed (or nearly fixed) for D. mauritiana ancestry. This region contained 13 genes; however, the consistency of the size of this region across all eight populations suggests that this is likely a technical artifact (due to assembly, sequencing, or mapping errors) and does not represent actual fixation of D. mauritiana ancestry across all populations.
In the yakuba/santomea populations, 269 genomic windows retained both ancestral alleles in four or more of the eight admixed populations. One hundred eighty-nine of these windows were shared across four of the eight yakuba/santomea populations, fifty-eight windows in five of the eight populations, eighteen windows in six of the eight populations, three windows in seven of the eight populations, and one window in all eight populations. Nearly all of these windows were located on chromosomal arm 2R (265 of the 269), with only two windows each on 3L and X chromosomes (Figure 6B). The windows on 2R span a 12.59-Mb region starting at position 9,120,000 and ending at 21,710,000 (Figure 6B) that contains 1696 genes.
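The tally above amounts to counting, for each window, the number of populations in which it still segregates, and then binning windows by that count. A minimal Python sketch follows, assuming each population is represented as a set of hypothetical window identifiers, here (arm, start position) tuples, that met the < 0.8 criterion. The data structure and function names are illustrative and are not those of the original analysis.

```python
from collections import Counter


def populations_per_window(segregating_by_population):
    """Map each window to the number of populations in which it segregates."""
    counts = Counter()
    for windows in segregating_by_population:
        counts.update(set(windows))  # a population counts once per window
    return counts


def sharing_histogram(segregating_by_population):
    """Map k -> number of windows segregating in exactly k populations."""
    return Counter(populations_per_window(segregating_by_population).values())


def windows_shared_in_at_least(segregating_by_population, min_populations):
    """Sorted windows that segregate in at least `min_populations` populations."""
    counts = populations_per_window(segregating_by_population)
    return sorted(w for w, k in counts.items() if k >= min_populations)


# Consistency check on the breakdown reported in the text for yakuba/santomea:
# windows shared by exactly 4, 5, 6, 7, and 8 populations.
reported = {4: 189, 5: 58, 6: 18, 7: 3, 8: 1}
assert sum(reported.values()) == 269
```

The final assertion uses only numbers stated in the text: the per-category counts sum to the 269 windows reported as shared by four or more populations.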
Discussion
Interspecific hybridization seems to be common in nature. Understanding the fate of admixed genomes is a question relevant for understanding how species persist in nature. We generated admixed populations of two Drosophila species pairs and followed the changes in their phenotypes and genomes over 20 generations following hybridization. In each of the eight replicates for the two species pairs, mean trait values of morphological, behavioral, and reproductive traits differing between the parental species all regressed to resemble those of the continental (“major”) species. Consistent with these phenotypic observations, we found that genetic composition of the admixed populations regressed almost completely to resemble that of their continental parental species (either D. simulans or D. yakuba). These results have two major implications: (i) selection favoring traits from one species, or interactions between alleles of different ancestry (e.g., deleterious epistatic interactions between traits or alleles), result in the deterministic and rapid regression of hybrids to resemble one of their two parental species; and (ii) sex chromosomes are less likely to harbor admixed ancestry than the autosomes. We discuss the implications of each of these results below.
Selection against minor species alleles is pervasive and consistent across replicates
The eight admixed populations of each species pair show some concordance in the regions that retained mixed ancestry: even though the proportion of minor species ancestry is small in both cross types, the regions that retain it overlap across populations generated from a given interspecific cross. Under completely random segregation, one would expect little to no concordance. Our results suggest, but do not confirm, that the alleles from the minor species that were fixed, or were at high frequency in the admixed populations, were favored by selection.
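One way to make the random-segregation expectation concrete is a binomial calculation. As a rough sketch, assume (hypothetically) that a given window segregates independently in each of the eight populations with the same probability p. Neither independence nor a uniform p is established by the analysis here, and linkage between neighboring windows would violate both, so the result is a heuristic baseline rather than a formal null model. Under those assumptions, the chance that a window segregates in at least four of eight populations is the upper binomial tail:

```python
from math import comb


def prob_shared(p, n_populations=8, min_shared=4):
    """P(a window segregates in >= min_shared of n_populations),
    assuming independent populations with per-population probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(
        comb(n_populations, k) * p**k * (1.0 - p) ** (n_populations - k)
        for k in range(min_shared, n_populations + 1)
    )


# Illustrative values of p only; they are not estimates from the data.
for p in (0.01, 0.05, 0.20):
    print(f"p = {p:.2f}: P(shared in >= 4 of 8) = {prob_shared(p):.2e}")
```

Multiplying this probability by the number of windows on an arm gives the number of shared windows expected by chance under the same assumptions, which can then be compared with the observed count.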
Further, we find that besides the regression of traits and genes to the same parental species, there is concordance of minor species’ ancestry across populations. This can occur either through selection across the whole genome against alleles from the minor species or through strong selection on a handful of traits purging the minor species haplotypes. Notably, the amount of genome remaining from the minor species is not zero, which suggests that even though the genomes from different species of Drosophila cannot be combined in a mosaic, there are regions that can be tolerated and perhaps even favored in the background of the major species.
Since one of the parental genomes all but disappeared from the admixed populations, these results are informative about the role that hybridization may play in extinction. Levin et al. (1996) and Rhymer and Simberloff (1996) proposed that hybridization can lead to the extinction of one of the parental genomes. Theoretical models have predicted that the extinction of one of the species is invariably the outcome unless there is habitat heterogeneity (Wolf et al. 2001; Quilodrán et al. 2015, 2018). Empirically, instances of admixture and extinction by hybridization might not be uncommon. For example, humans outnumbered Neanderthals by ∼10-to-1 during the interbreeding period, and some arguments suggest that Neanderthals did not disappear due to warfare or competition, but due to interbreeding (Harris and Nielsen 2016). Todesco et al. (2016) compiled evidence from 143 studies to assess the outcomes of hybridization in natural systems. In 69 of the studies, hybridization was inferred to be a risk for extinction. These observational studies have the potential to reveal whether hybridization is an important contributor to extinction in nature but are limited because they cannot completely recapitulate the events that led to extinction. Our experimental approach shows that alleles from one of the parental species can be rapidly purged from a population of hybrids, lending support to the idea that frequent hybridization might indeed lead to the extinction of species under certain conditions.
The results reported here departed from our expectations. We expected that after admixture, we would be able to reconstitute the two parental genomes (i.e., some individuals would have a D. yakuba genome, and some would have a D. santomea genome) because hybrid incompatibilities from either species would be equally likely to be purged out of the admixed population. The initial conditions of the experiment involved the four possible types of backcrosses and males from the two species, which would amount to a 50:50 ratio. However, the genome from the island species remained only as a relict in the form of minor species haplotypes.
There are four nonexclusive possibilities that might explain this pattern. First, the minor species in both cases were island endemics. Both D. santomea and D. mauritiana show lower heterozygosity than their continental sister species, which in turn indicates lower effective population size (Leffler et al. 2012). This might also mean that these species are more prone to inbreeding depression due to the accumulation of deleterious (or slightly deleterious) alleles. In these conditions, haplotypes from the major species will be more likely to be fixed because they are more fit (e.g., Juric et al. 2016). This, of course, will depend on the level of linkage disequilibrium between hybrid incompatibilities and potentially adaptive alleles in the genome (Bierne et al. 2002; Comeault 2018; Schumer et al. 2018; Martin et al. 2019). Second, the mainland species may have been selectively favored under our experimental setting, potentially due to having a more generalized “jack of all trades” life history, being broadly adapted to a variety of habitats. Indeed, D. santomea (endemic to humid mountain forests on the island of São Tomé) displays a more specialized niche than D. yakuba in nature. Third, it is likely that the major species is only more fit in the experimental conditions that we used but not necessarily in all conditions (discussed in Stelkens et al. 2014). Finally, it is possible that the genomes of the island endemics harbor more alleles that are sufficient to cause incompatibility than the continental species. In the case of yakuba/santomea, for example, the backcrosses involving D. santomea males are more likely to produce sterile males than crosses involving the same females and D. yakuba males (table 1 in Coyne et al. 2002). This pattern, however, does not seem as clear in the backcross of males from D. simulans and D. mauritiana (table 3 in Zeng and Singh 1993).
Distinguishing between these possibilities will require assessing whether the island species can become the major species in some laboratory conditions.
The location of minor species’ ancestry in the genome
Sex chromosomes harbor a disproportionate number of genes contributing to reproductive isolation when compared to autosomes (Coyne and Orr 2004; Masly and Presgraves 2007; Ellegren 2008; Presgraves 2008; Qvarnström and Bailey 2009; Muirhead and Presgraves 2016). A corollary of this observation is that sex chromosomes should be less permeable to gene exchange than autosomes (i.e., they should have less ancestry from the minor species than the autosomes; Muirhead and Presgraves 2016; Presgraves 2018). This pattern has been confirmed in naturally occurring hybrid zones of mammals (Macholán et al. 2007; Carneiro et al. 2010, 2014), flies (Garrigan et al. 2012; Turissini and Matute 2017), and butterflies (Van Belleghem et al. 2018). We tested whether this hypothesis also held in admixed populations produced synthetically. We found that, after 20 generations of admixture, minor species’ ancestry did not segregate randomly across the genome. Rather, synthetic admixed populations from both species pairs show a pattern similar to what is observed in nature: with almost no exceptions, the X chromosome harbored less genetic material from the minor species than the autosomes. Notably, natural patterns of variation are also consistent with this result, as X chromosomes from both species pairs are less likely to harbor introgressed haplotypes than the autosomes (Garrigan et al. 2012; Turissini and Matute 2017; but see Hartmann et al. 2019).
The location of the minor species’ ancestry differed between the two species pairs. The largest proportion of the introgression (∼40%) in the simulans/mauritiana admixed populations was found in the left arm of chromosome 3. In the case of the yakuba/santomea admixed populations, most of the minor species’ ancestry (30%) was found in the right arm of chromosome 2. This difference might be caused by differences in the genetic basis of reproductive isolation between the two species pairs or differences in the recombination landscape between species pairs. A fine-scale introgression mapping approach revealed 47 alleles sufficient to cause hybrid male sterility in D. simulans/D. mauritiana hybrid males. Chromosome 3R contains at least 13 alleles sufficient to cause male sterility between these two species (Laurie et al. 1997; True et al. 1997); chromosomes 3L and 2R each contain seven of these alleles. Chromosome 2R contains eight male sterility alleles. The X chromosome contains 12 alleles sufficient to cause male sterility. A quantitative trait locus (QTL) analysis for the same phenotype, hybrid male sterility, but in D. yakuba/D. santomea F1 hybrids revealed a QTL of large effect on the X chromosome, and QTL of smaller effect on 2L, 3L, and 3R (figure 1 in Moehring et al. 2006). Chromosomal arms that have previously been implicated in hybrid male sterility seem to be under-represented in the proportion of ancestry from the minor species in the admixed populations. This pattern of heterogeneity across autosomal arms, however, is not consistent with the observations from natural populations. In both simulans/mauritiana (figure 2 in Garrigan et al. 2012) and yakuba/santomea (figure 6 in Turissini and Matute 2017), introgressions are evenly spread across autosomal Muller elements. This discordance suggests that tolerance to alleles from the minor species is not the only factor that determines the fate of an introgressed allele in nature (see Caveats).
A second possibility is that the recombination landscapes differ between the two species pairs. Higher recombination rates will break the linkage between neutral variants and deleterious variants (i.e., incompatibilities), which would allow for the neutral variants to persist longer in the admixed populations. There is extensive variation in map length among species of the melanogaster species subgroup (table 4.4 in Hemmer 2018). D. mauritiana, for example, has a larger recombination map than other Drosophila species (True et al. 1996; Brand et al. 2018). Since introgression tends to collocate with regions of the genome where there is high recombination (Schumer et al. 2018; Martin et al. 2019), if simulans/mauritiana hybrid populations show a higher recombination rate than yakuba/santomea populations, then more ancestry from the minor species would persist after 20 generations of admixture in the former type of hybrid swarm. These two possibilities, differences in the density of hybrid incompatibilities and differences in the recombination landscape, are not mutually exclusive and more work will be required to assess the relative importance of these two possibilities.
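The role of recombination in this argument follows from a standard population-genetic result: in a large, randomly mating population and in the absence of selection, linkage disequilibrium between two loci decays by a factor of (1 − r) each generation, where r is the recombination fraction between them. The Python sketch below applies that textbook relation over the 20 generations of the experiment. The values of r are arbitrary illustrations, not estimates for either species pair, and the model ignores the selection and drift that clearly operate in the admixed populations.

```python
def ld_remaining(r, generations=20):
    """Fraction of the initial linkage disequilibrium left after
    `generations` of random mating: D_t / D_0 = (1 - r) ** t."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    return (1.0 - r) ** generations


# Illustrative recombination fractions between a neutral variant and a
# linked incompatibility; these are assumptions chosen for contrast.
for r in (0.001, 0.01, 0.1):
    print(f"r = {r}: fraction of initial LD remaining = {ld_remaining(r):.3f}")
```

Under this simplification, a neutral variant tightly linked to an incompatibility stays associated with it for most of the experiment, whereas a loosely linked variant is largely decoupled well before generation 20, which is the intuition behind expecting more minor species ancestry to persist where recombination is high.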
Further, genes controlling species differences between D. santomea and D. yakuba in the three traits we measured (number of hypandrial bristles, number of teeth on sex combs, and abdominal pigmentation) reside on the X chromosome (Rebeiz et al. 2009; Nagy et al. 2018). It is worth noting that these traits may also be affected by autosomal loci, but autosomal genes seem to have a minor effect compared to X-linked genes. In the case of D. simulans and D. mauritiana, all major chromosomes harbor alleles involved in interspecific differences (Laurie et al. 1997; True et al. 1997; Zeng et al. 2000). The evolution of the mean trait values of the admixed populations toward the major species mean values is consistent with the regression of the genomes toward the major species.
Caveats
The experiment we describe here has at least two significant caveats. First, all our experimental replicates used the same pair of strains as founder populations. The results observed here might differ if we used genetically different founding strains. For example, if there are polymorphic hybrid incompatibilities (e.g., Corbett-Detig et al. 2013), or there are differences in the rate of recombination within species, the amount of introgression might differ depending on the founder lines. This limitation stems from the fact that we had to use fixed pairs of strains to eliminate chromosome inversion heterozygosity, but similar tests should be done on other strains. If different lines carry different deleterious alleles, then the results will vary depending on the lines used.
A second caveat is that all populations were kept in a single laboratory environment (a constant temperature and in cornmeal medium). This might have a large effect on what alleles are favored after hybridization. As discussed above, D. santomea is commonly associated with figs (Cariou et al. 2001), while D. yakuba tends to be associated with a variety of substrates. Similarly, D. santomea is more readily found at lower temperatures; this species also shows lower fitness than D. yakuba at higher temperatures. Similar differences have not been reported between D. simulans and D. mauritiana, but that does not mean they do not exist. Naturally occurring hybrid zones can be a complementary approach to assess the relative importance of gene exchange in nature.
Future efforts should be able to assess not only the starting and end points of the presented experimental design but also intermediate points. This should reveal how many generations it takes to purge minor species alleles, whether replicates differ in their rate of evolution, and whether the rate of change in genome composition is similar to rates of change in morphological traits.
Conclusions
Hybridization and admixture are common processes in nature. Nonetheless, the outcomes of admixture remain largely unknown. The experiment presented here provides evidence that Drosophila genomes cannot persist as species mosaics. Similar results have been observed in natural hybrid zones between different species of cottonwoods (Martinsen et al. 2001) and experimental hybrid populations of mice (Shorter et al. 2017). Other systems such as sunflowers (Yatabe et al. 2007) and Anopheles mosquitoes (Fontaine et al. 2015) have revealed that their genomes are permeable to introgression (reviewed in Mallet et al. 2016). Similar experiments are needed across other groups to determine whether our results reveal a general pattern. Regardless of the ultimate amount of ancestry that segregates in admixed populations, our experiment demonstrates an approach to understanding the consequences of hybridization in a controlled setting that can be manipulated. Such manipulations will allow us to understand whether the outcomes of hybridization are deterministic and to what extent they are contingent on environmental and demographic factors. This is likely to vary across taxa, but until similar experiments are carried out in other species, the answer will remain unknown.
Acknowledgments
We thank B. Cooper, R. Marquez, and J.M. Coughlan and the members of the Matute lab for helpful scientific discussions and comments. The comments and suggestions from three anonymous reviewers helped us improve this manuscript. We also thank R. Corbett-Detig for his advice on the use of Ancestry-HMM. P. Andolfatto and J.J. Emerson (R01 GM114093 and R01GM123303, respectively) kindly shared the unpublished genomes of D. yakuba and D. santomea. This work was supported by NIH award R01 GM121750 to D.R.M. and R01 GM058260 to J.A.C. The authors declare no conflicts of interest.
Footnotes
Supplemental material available at figshare: https://doi.org/10.25386/genetics.11113676.
Communicating editor: D. Presgraves
- Received June 4, 2019.
- Accepted November 18, 2019.
- Copyright © 2020 by the Genetics Society of America