Abstract
Sexual dimorphism in morphological, physiological, and behavioral traits is pervasive in animals, as is the observation of strong sexual dimorphism in genomewide patterns of gene expression in the few species where this has been studied. Studies of transcriptome divergence show that most interspecific transcriptional divergence is highly sex dependent, an observation consistent with the action of sex-dependent natural selection during species divergence. However, few transcriptome evolution studies have been conducted between recently diverged species (<1 MY). Here, we present analyses of sex-biased transcriptome divergence in sexually mature adults of three recently diverged species of Drosophila: Drosophila pseudoobscura, D. persimilis, and D. pseudoobscura bogotana. Data were collected using a custom designed Agilent oligonucleotide. Expression was detected in 12,507 genes. About 80% of the expressed genes show sex-biased expression in each species. Across species, 21% of the transcriptome shows switches between nonsex bias and sex bias, and just 0.9% of the transcriptome shows reversals of sex-biased expression. Over 80% of the expression divergence between species is due to changes in one sex only. About 15% of the expression divergence between species is due to changes in the same direction in both sexes and just 2% is due to changes in both sexes but in opposite directions. In agreement with previous studies, we observe a high level of sex-dependent transcriptome divergence and strong demasculinization of the two arms of the X chromosome in all species. However, in contrast to previous studies we find that male-biased genes do not have higher levels of expression divergence than non-sex-biased genes, and sex-biased genes show higher levels of expression divergence in the alternate sex, suggesting that sex-biased genes endure stronger selection when expressed in the alternate sex.
IN most animal species there is extensive morphological, physiological, and behavioral differentiation between the sexes. This sex-specific phenotypic differentiation, or sexual dimorphism, is postulated to have evolved as a result of different selective pressures acting on both sexes, in the form of sex-specific natural selection and/or sexual selection (Darwin 1871; Andersson 1994). Between species, reproductive traits or other sex-related traits exhibit the most obvious morphological differences; it is therefore not surprising that reproductive traits are typically used as key characters in species taxonomic descriptions. Moreover, in many species male-specific reproductive traits evolve faster than other traits (Eberhard 1985; Civetta and Singh 1998a). This rapid divergence of sex-related traits between species might be the result of male–male competition leading to diversifying selection or of antagonistic coevolution of male and female sexual traits that in turn may lead to the evolution of reproductive barriers (Parker and Partridge 1998; Rice 1998; Gavrilets 2000).
Observations at the molecular level resemble patterns of intersexual morphological differentiation and divergence. First, there is strong sexual dimorphism in genomewide patterns of gene expression (Jin et al. 2001; Parisi et al. 2003; Ranz et al. 2003; Gibson et al. 2004; Rinn and Snyder 2005; McIntyre et al. 2006; Yang et al. 2006; Ellegren and Parsch 2007; Sturgill et al. 2007; Zhang et al. 2007; Mank et al. 2008; Reinius et al. 2008), a phenomenon termed sex-biased expression. Second, reproductive genes, mainly male reproductive proteins, evolve faster at the sequence level (Civetta and Singh 1998b; Swanson and Vacquier 2002). Third, comparisons of the rate of nucleotide sequence divergence of genes with different patterns of sex-biased expression show that male-biased genes have the greatest level of divergence between species in flies, nematodes, and mammals (Torgerson et al. 2002; Cutter and Ward 2005; Zhang et al. 2004, 2007), and in seven species of Drosophila male-biased genes have higher birth and extinction rates than female-biased or non-sex-biased genes (Zhang et al. 2007). However, in birds genes with female-biased expression in the brain show faster rates of sequence evolution than male-biased and non-sex-biased genes (Mank et al. 2007). Fourth, male-biased genes also show the highest levels of interspecific gene expression divergence in Drosophila and mammals (Meiklejohn et al. 2003; Ranz et al. 2003; Khaitovich et al. 2005; Voolstra et al. 2007). Further, in Drosophila the expression of male-biased genes is significantly more divergent in males than in females (Meiklejohn et al. 2003).
Few studies of sex-specific gene expression divergence have been conducted in closely related species (i.e., species that have diverged <1 MYA). In Drosophila, the most detailed studies of gene expression divergence and sex bias have been conducted in Drosophila melanogaster and D. simulans (Meiklejohn et al. 2003; Ranz et al. 2003), species that diverged >2.5 million years ago. Here we present data on patterns of genomewide gene expression in three closely related members of the D. pseudoobscura species group (D. pseudoobscura, D. persimilis, and D. pseudoobscura bogotana) that have diverged within the last 0.5 million years. D. pseudoobscura bogotana is considered to be a subspecies of D. pseudoobscura (Ayala and Dobzhansky 1974), but, for convenience, in this article we refer to it as a different species. The availability of full genome sequences for D. pseudoobscura and D. persimilis (Richards et al. 2005; Clark et al. 2007) allows us to conduct comparisons of gene expression relative to sequence divergence for all predicted genes and allows us to design an optimal microarray platform in which instances of gene expression differences that may be caused by sequence divergence can be minimized.
D. pseudoobscura and D. persimilis are North American species that are partially sympatric in the western part of North America (from California to British Columbia) (Dobzhansky and Epling 1944), hybridize at low frequency in nature (Dobzhansky 1973; Powell 1983; Machado and Hey 2003), and diverged ∼0.5 million years ago (Schaeffer and Miller 1991; Wang et al. 1997; Machado et al. 2002). These two species have been widely studied in the context of understanding the history and genetic basis of species divergence and speciation (Dobzhansky 1933; Dobzhansky and Epling 1944; Orr 1987; Noor et al. 2001a,b, 2007; Machado et al. 2002, 2007; Machado and Hey 2003; Kulathinal et al. 2009). D. pseudoobscura bogotana is a subspecies of D. pseudoobscura that is found in the vicinity of Bogotá, Colombia, thousands of kilometers away from the closest population of D. pseudoobscura in Central America (Ayala and Dobzhansky 1974). These two species diverged ∼0.15 million years ago (Schaeffer and Miller 1991; Wang et al. 1997). The recent divergence times (0.15–0.5 million years) of species from the D. pseudoobscura group provide a unique opportunity to address questions about patterns of sex-specific gene expression divergence in recently diverged species.
Here we present data on genomewide patterns of sex-biased expression and compare sex-specific patterns of transcriptome divergence between the three species of the D. pseudoobscura group. We use a custom oligo microarray platform designed to minimize problems generated by cross-species hybridization, taking advantage of the annotated genome sequences of D. pseudoobscura (Richards et al. 2005) and D. persimilis (Clark et al. 2007). We show that the evolutionary history of the transcriptome of these three species has been very dynamic. Although several of the observed patterns are consistent with previous results in other species, we observe novel and interesting patterns of transcriptome divergence.
MATERIALS AND METHODS
Drosophila strains and RNA preparation:
At least three isofemale inbred lines were used for each species (D. pseudoobscura, MV2-25, Flagstaff 18, Mather 10, and Mather 32; D. persimilis, MSH 3, Mather G, and MSH 42; and D. pseudoobscura bogotana, Toro 7, Susa 6, and Potosi 3). The MV2-25 (Baylor) and MSH3 lines are the genome sequence strains of D. pseudoobscura and D. persimilis, respectively (Richards et al. 2005; Clark et al. 2007). Information about the other lines can be found in Machado et al. (2002). Virgin flies of each line were collected, separated by sex, and reared on cornmeal/yeast medium and maintained at 20° with a 12/12-h light/dark cycle. Seven-day-old virgin flies were instantly frozen in liquid nitrogen and stored in TRIZOL (Invitrogen, Carlsbad, CA) at −80°. RNA was isolated using TRIZOL from at least 40 individuals. RNA quantity and quality were separately measured on a NanoDrop spectrophotometer (Thermo Scientific) and a Bioanalyzer 2100 (Agilent Technologies). High-quality RNA samples (260/280 > 1.85; 260/230 > 1.7) from all lines of each species were pooled in equal amounts and further purified using the RNeasy kit (QIAGEN, Valencia, CA). RNA from multiple lines was pooled to avoid drawing conclusions about interspecific differences on the basis of data from a single isofemale line and to focus on major differences that are likely consistent across multiple lines.
Microarray design and hybridization:
Custom 4-plex 44K oligonucleotide arrays were designed using the Agilent eArray platform and synthesized by Agilent Technologies (Hughes et al. 2001). The array consists of 45,220 spots including positive and negative controls, with 55- to 60-mer oligonucleotides representing 18,850 unique consensus gene predictions from the D. pseudoobscura genome, using the GLEAN prediction combiner software (Elsik et al. 2007). The oligos were later reannotated on the basis of the annotation of the D. pseudoobscura genome (version 2.2). Oligos were designed to be identical between the sequenced D. pseudoobscura and D. persimilis genome lines [MV2-25 (Baylor) and MSH3] to eliminate or reduce possible artifacts in the estimation of gene expression differences between species due to inefficient cross-species hybridizations. Species-specific probes were designed for 1469 genes for which probes with identical sequences could not be found. An average of 2.17 probes per gene are included in the array. Further, oligos representing 12 mtDNA-encoded genes, 65 putative noncoding RNAs, and 108 regions of the Y chromosome (Carvalho and Clark 2005) were included in the array. A description of the putative noncoding RNAs will be presented elsewhere (Z. Jiang and C. A. Machado, unpublished data). The Y chromosome probes were not included in the analyses presented here.
Fluorescent cRNA was synthesized from 200 ng of total RNA, using Agilent's low-RNA input fluorescent linear amplification kit following manufacturer's protocols (Agilent Technologies). Labeled RNAs were cleaned with RNeasy columns (QIAGEN), and cRNA yields were quantified on a NanoDrop spectrophotometer. Two separate labeling reactions per sample were pooled and hybridized to the arrays following Agilent's protocols. We conducted single-color (Cyanine 3-CTP dye) labeling and hybridizations. Three different replicates were hybridized for each species and sex. Labeling reactions, hybridization, and scanning of the data were done at the Genomics Core of the Arizona Cancer Center (University of Arizona).
Statistical analyses:
Expression values for each probe were corrected for local background intensity. A total of 1072 probes (2.3% of total) in which one of the measurements across all samples was 10 times greater than the average of the other two replicates were considered “bad spots” and were replaced by the average value of the other two replicates of the same sample. Probes with average raw fluorescent intensities <50 across all samples were excluded on the basis of Agilent's positive and negative controls on the array. Ninety percent of the probes with bad spots were eliminated on the basis of that criterion. The use of 50 fluorescent units is a conservative threshold because raw intensities of negative controls average <10. Chip mean normalization and log2 transformation of fluorescence intensity measurements were conducted on HDBStat! (Trivedi et al. 2005). Correlation coefficients between experimental replicates were >0.97. Values from probes of the same gene were averaged before conducting a two-step mixed-model analysis of variance. In the first step signal intensities across slides are normalized using the model yi = μ + Ai + εi, where yi is the log2 transformed transcription level for each gene on slide i, μ is the overall mean intensity for genes across all samples, Ai is the effect of the ith slide, and εi is the stochastic error. The residuals from this model were fitted to gene-specific mixed-model ANOVAs: rijk = μ + Ai + Gj + Sk + (G × S)jk + εijk, where Ai is the effect of the ith slide, Gj is the jth species (D. pseudoobscura, D. persimilis, and D. p. bogotana), Sk is the kth sex (male or female), (G × S)jk is the species-by-sex interaction, and εijk is the stochastic error. The gene-specific models were fitted using Ai as a random effect, with Gj and Sk as fixed effects. The analyses were conducted using PROC MIXED in SAS version 9.0 (SAS Institute, Cary, NC). Pairwise differences among species and sexes were estimated with the Diffs option in PROC MIXED. Correction for multiple statistical tests was conducted using the false discovery rate (FDR) method (Storey and Tibshirani 2003). The FDR method calculates the number of false positives within a set of significant values and then calculates q, a new significance probability. q values were calculated using the QVALUE package (Nee and Lawton 1996), and q < 10−6 was used as the cutoff value for significance. This is a conservative cutoff, as q < 10−6 is close to the appropriate Bonferroni correction for this data set (P = 2.65 × 10−6; 0.05/18,850 genes) but allows us to focus on highly significant differences. Microarray data have been deposited in NCBI's Gene Expression Omnibus (Edgar et al. 2002) and are accessible through GEO series accession no. GSE17192.
We define male- or female-biased genes as those having significantly higher expression levels in males or females, respectively, on the basis of a significant sex-by-species effect in the mixed-model ANOVA [(G × S)jk in the model described above]. There is no widely accepted standard criterion for determining sex bias (Ellegren and Parsch 2007). Some authors use relative expression levels between the two sexes greater than a given cutoff (e.g. >2) to define sex bias (Parisi et al. 2003; Mank et al. 2008), while other authors use statistical approaches with different levels of significance (Ranz et al. 2003; McIntyre et al. 2006; Zhang et al. 2007). Our approach is sensitive due to the level of replication and the ANOVA model used, but it is also conservative because of the corrected level of significance used (q < 10−6).
For comparison between species, genes that show the same sex-biased or non-sex-biased pattern in all species of interest are defined as sex-biased or non-sex-biased. Genes that show sex bias in one species and show sex bias in the alternate sex in the other species are defined as genes showing reversal of sex bias. We estimated expression divergence between species, using the coefficient of variation of relative expression levels as in Ranz et al. (2003). Transcriptome divergence is defined as the proportion of differentially expressed genes between species.
Sequence analyses:
To compare sequence and transcription divergence we estimated sequence divergence (Ks and Ka) for a set of 13,537 orthologous genes from D. pseudoobscura and D. persimilis. The current annotation of the D. pseudoobscura genome (version 2.2) contains 16,071 coding sequences (CDS), and the annotation of the D. persimilis genome (version 1.2) contains 16,879 CDS. Using the 12-species ortholog set from FlyBase (version 2008_07), we extracted 9923 orthologs from both species. An additional 3614 orthologs were extracted from the annotated lists of both species, using reciprocal BLAST (E-value cutoff 10−10). Ortholog CDS were pairwise aligned using ClustalW (Thompson et al. 1994) and the resulting alignments were passed to PAML for estimation of Ks and Ka, using the codeml program with the pairwise distance estimation option (run mode = −2) (Yang 1997). For ∼8% of the genes we obtained unrealistically high estimates of Ks or Ka that almost certainly result from misalignment, misannotation, or frameshifts within a sequence. Those genes were excluded from the analyses. Lists of significant genes were characterized using the Web-based tool FuncAssociate (Berriz et al. 2003), which finds overrepresented classes of genes in a list of genes of interest using Gene Ontology (GO) attributes. FuncAssociate allows correcting for multiple hypothesis testing using Monte Carlo simulations. Adjusted P-values <0.001 were considered significant.
RESULTS
A custom oligo microarray was used to determine global expression patterns from whole bodies of 7-day-old virgin adults from three closely related species of the D. pseudoobscura group. A total of 12,507 genes had levels of expression greater than background signal in all species, accounting for ∼78% of the predicted genes in version 2.2 of the D. pseudoobscura genome annotation (supporting information, Table S1). Below we focus our description of the data on patterns of sex-biased expression and patterns of sex-dependent interspecific differentiation.
General patterns of sex-biased expression:
Under the selected FDR criterion (q < 10−6), ∼80% of the adult transcriptome shows sex-biased expression in the three species (Table 1, Table S1). Although estimates of the percentage of the transcriptome showing sex-biased expression are dependent on the cutoff q values (Table S1), our main conclusions regarding the nature of transcriptome changes are robust to different cutoff values (q < 10−3–q < 10−12). In D. pseudoobscura, 2717 (64.0%) of male-biased genes and 2532 (45.6%) of female-biased genes show twofold or greater differences in expression between the sexes. Thus, under a definition of sex bias based on twofold expression differences between the sexes, 42% of the D. pseudoobscura transcriptome would be sex biased.
Sex-biased expression in three species of the Drosophila pseudoobscura group
In agreement with recent results for D. pseudoobscura using a different microarray platform (Zhang et al. 2007), the three species have a higher proportion of female-biased genes than male-biased genes (∼42–44% vs. ∼34–39%), although our results show a much larger level of sex bias in D. pseudoobscura (78.4% vs. 31.9%). A previous major study in D. melanogaster and D. simulans found that 57% of the genes with conserved patterns of expression are sex biased (Ranz et al. 2003). Other studies in D. melanogaster have also found high levels of sex bias, from two-thirds of the surveyed transcriptome (Jin et al. 2001) to 88.2% of the transcriptome in a more recent study (Ayroles et al. 2009). The disparity in the proportion of sex-biased genes in our data (and other studies) relative to that in Zhang et al. (2007) likely reflects the use of different statistical methods and stringency levels. Despite these differences, 90% of the genes identified as female biased by Zhang et al. are also female biased in our data, and the same is true for 70% of the male-biased genes. The observed discrepancies in the levels of sex bias across studies underscore the need to develop standard classification methods to facilitate conducting comparison studies.
About 78% of the expressed genes have conserved patterns of sex-biased or non-sex-biased expression across the three species (Table 2, Figure S1). Close to 21% of the transcriptome shows switches between nonsex bias and sex bias (Table S2 and Table S3). Switches between non-sex-biased and male-biased expression are more frequent than switches between non-sex-biased and female-biased expression (1604 vs.1017), consistent with the observation that male-biased genes have higher effective birth and extinction rates in Drosophila (Zhang et al. 2007).
Classification of genes by their sex-biased expression in all three species
Sex-bias reversals are interesting because they suggest changes in the direction of sexually antagonistic selection during species divergence (Ellegren and Parsch 2007), and therefore these genes might give us clues to understand the evolutionary forces underlying species divergence. We found an average of 50 genes showing reversals in sex bias in each of the three pairwise comparisons (Figure S2) and a total of 112 genes (0.9%) showing this pattern (Table S8). The low proportion of genes showing sex reversals in these species is consistent with what has been observed in other Drosophila studies (Ranz et al. 2003; Zhang et al. 2007), although in this case the observed switches have occurred over a shorter evolutionary time. Only one gene (CG9538, Ag5r), which is female biased in D. pseudoobscura and male biased in the other two species, has high expression levels (top 25% of the transcriptome). Genes showing sex-bias reversals are randomly distributed among chromosomes (χ2 = 1.3, d.f. = 4, P = 0.414). No genes involved in spermatogenesis or oogenesis show reversals of sex bias.
Gene Ontology analyses show that genes involved in metabolic and energy generating pathways as well as the mitochondrial electron transport chain/respiratory chain are overrepresented among male-biased genes (adjusted P <0.001; Table S4, A). Genes with female-biased expression are involved in diverse functions (Table S4, B). There are no enriched categories for non-sex-biased genes, genes showing sex-biased expression in one or two species only, and genes showing reversals of sex-biased expression.
We observed demasculinization of the X chromosome in the three studied species, in agreement with previous studies (Parisi et al. 2003; Ranz et al. 2003; Sturgill et al. 2007). Male-biased genes are significantly underrepresented on the XL (Muller A) [D. pseudoobscura (observed deficit of male-biased genes relative to expected: 16. 5%), χ2 = 41.6, d.f. = 2, P < 0.0001; D. p. bogotana (16.4%), χ2 = 36.0, P < 0.0001; D. persimilis (17.1%), χ2 = 35.3, P < 0.0001] and XR (Muller D) [D. pseudoobscura (observed deficit of male-biased genes relative to expected: 11.4%), χ2 = 12.98, P = 0.0003; D. p. bogotana (7.5%), χ2 = 6.86, P = 0.0088; D. persimilis (8.0%), χ2 = 7.04, P = 0.0079] arms of the X chromosome. The lower level of underrepresentation observed in XR is consistent with its status as a neo-X chromosome that fused to the ancestral X chromosome 8–12 MYA (Steinemann et al. 1984; Richards et al. 2005), and thus it may still be undergoing demasculinization (Sturgill et al. 2007).
Genes differentially expressed between species:
There are more genes differentially expressed in males than in females in comparisons between D. pseudoobscura and the other two species (Figure 1), suggesting that the male transcriptome has diverged faster than the female transcriptome in these species (Fisher's exact test; P < 0.001). Lists of all the differentially expressed genes between species are presented in Table S5. Differentially expressed genes between females or males of each pairwise species comparison are not randomly distributed among chromosomes. In females, the differentially expressed genes are underrepresented in the fourth chromosome (D. pseudoobscura/D. persimilis, χ2 = 43.8, d.f. = 2, P < 0.0001; D. pseudoobscura/D. p. bogotana, χ2 = 4.1, P = 0.042; D. p. bogotana/D. persimilis, χ2 = 11.2, P = 0.0008). In males, there is a deficit of differentially expressed genes in the XR chromosome in two of the pairwise comparisons (D. pseudoobscura/D. persimilis, χ2 = 3.0, d.f. = 1, P = 0.083; D. pseudoobscura/D. p. bogotana, χ2 = 6.1, P = 0.013; D. p. bogotana/D. persimilis, χ2 = 13.6, P = 0.002). Gene Ontology analyses of genes showing differential expression between species present patterns that are primarily species-pair specific with some gene ontology terms appearing repeatedly in some of the comparisons (Table S6).
Numbers of genes differentially expressed between species by sex under the selected FDR criterion (q < 10−6). Males (M) are shown on the left and females (F) on the right. D.ps, Drosophila pseudoobscura; D.pe, D. persimilis; D.pb, D. pseudoobscura bogotana.
A few hundred genes, 10–20% of the total number of genes differentially expressed between species, are common among the group of genes differentially expressed across all three pairwise species comparisons: 358 genes among females (Table S7, A) and 563 genes among males (Table S7, B). This is a surprising observation given the shared and recent divergence history of these species (Machado et al. 2002; Machado and Hey 2003). In females, these genes are randomly distributed among chromosomes (χ2 = 2.7, d.f. = 4, P = 0.6073), there is a larger proportion of female-biased genes in this common set (χ2 = 120.0, d.f. = 2, P < 0.0001), and gene ontology analyses show an enrichment of genes associated with catalytic activity (GO: 0003824). In males, these genes are underrepresented in the XR chromosome (χ2 = 9.4, d.f. = 4, P = 0.0022), male-biased genes are extremely overrepresented (χ2 = 1897.0, d.f. = 2, P < 0.0001), and gene ontology analyses show enrichment of genes with mitochondrial function and catalytic activity.
Sex-dependent transcriptome evolution between species:
Transcriptome evolution in this species group is highly sex dependent. Over 80% of the genes that have diverged in expression between species have changed either in males or in females, and only 13–18% of the differentially expressed genes have changed in both sexes (Table 3). Interestingly, just 2% of the divergence between species is due to changes in both sexes but in opposite directions. Over 50% of the total transcriptome divergence between D. pseudoobscura and its two close relatives is due to expression differences that are observed in males only (see also Figure 1). That value is only 40% in the comparison between D. p. bogotana and D. persimilis. This is a remarkable change in male-specific divergence, given the short time of divergence between D. pseudoobscura and D. p. bogotana (0.15–0.25 MY).
Classification of genes by their sex-dependent patterns of change in expression between species
To compare our findings with previous results from D. melanogaster and D. simulans we classified patterns of expression divergence in four categories following the categories defined by Ranz et al. (2003) (Table 4; Figure S2). A large fraction of genes show conserved patterns of expression between species (56–58%). Among the genes differentially expressed between the three species pairs, only 7–11% show parallel differences in both sexes maintaining the same type or lack of sex bias. The remaining 89–93% of genes that show a change of expression between species exhibit sex-dependent patterns of evolution by either evolving different levels of sex bias (54–57%) or showing gain, loss, or reversal of sex bias (31–35%). For each species pair comparison, an average of 50 genes show reversals of sex-biased expression. This corresponds to ∼1% of all genes showing expression differences between species, a proportion that is similar to previous observations in more distantly related species (Ranz et al. 2003; Zhang et al. 2007).
Classification of genes by their pattern of evolution and sex bias (following Ranz et al. 2003)
No accelerated gene expression divergence for sex-biased genes:
We compared average levels of gene expression divergence between species for sex-biased and non-sex-biased genes for each sex or both sexes combined. Non-sex-biased genes and male-biased genes have significantly higher levels of expression divergence than female-biased genes in all or in seven of the nine comparisons, respectively (Figure 2). Non-sex-biased genes also have significantly higher levels of divergence than male-biased genes in four of the six comparisons encompassing males or both sexes combined (Figure 2, A and C). In females, the situation reverses, with male-biased genes having either the same or higher expression divergence than non-sex-biased genes (Figure 2B).
Average gene expression divergence between species for male-biased (MB), female-biased (FB), and non-sex-biased (NSB) genes that are differentially expressed between species. (A) Males only; (B) females only; (C) both sexes combined. Expression divergence was calculated using the coefficient of variation of the mean expression value (Ranz et al. 2003). D.ps, Drosophila pseudoobscura; D.pe, D. persimilis; D.pb, D. pseudoobscura bogotana. n, the number of genes compared. Only genes that have the same pattern of bias in the two species are included in each comparison. Error bars, SD. Different letters on top of each bar reflect significant differences using the Wilcoxon rank-sum test (P < 0.01).
Furthermore, we compared the mean expression divergence between males and females of each species pair, focusing on all differentially expressed genes with common patterns of sex-biased or non-sex-biased expression (Figure 3). There are no significant differences in gene expression divergence between males and females for non-sex-biased genes (Figure 3A). On the other hand, male-biased genes have higher expression divergence in females than in males (Figure 3B), and female-biased genes show higher expression divergence in males than in females for two species pairs but no differences in the comparison between D. pseudoobscura and D. p. bogotana (Figure 3C). That is, sex-biased genes show higher average levels of expression divergence in the alternate sex. These observations are not fully consistent with previous results from D. melanogaster and D. simulans (Meiklejohn et al. 2003) (see discussion).
Average expression divergence between males and females for each species pair. (A) Non-sex-biased genes; (B) male-biased genes; (C) female-biased genes. Expression divergence was calculated using the coefficient of variation of the mean expression value (Ranz et al. 2003). D.ps, Drosophila pseudoobscura; D.pe, D. persimilis; D.pb, D. pseudoobscura bogotana. n, the number of genes compared. Error bars, SD. Different letters on top of each bar reflect significant differences under the Wilcoxon rank-sum test (P < 0.0001).
Sequence divergence of sex-biased genes:
The availability of genome sequences from one strain of D. pseudoobscura and D. persimilis (Richards et al. 2005; Clark et al. 2007) allows us to make comparisons of sequence divergence across annotated genes for all categories of sex-biased expression patterns in these two species (Figure 4). For these comparisons we included only genes with the same type of sex bias in both species, excluding genes that have changed patterns of expression. This is an important clarification because in the majority of studies presenting data on sequence divergence for sex-biased genes, the information on patterns of sex bias was available for only one of the species compared. Male-biased genes have a higher mean Ka/Ks ratio than female-biased genes (P = 2.09 × 10−30) and non-sex-biased genes (P = 1.69 × 10−9). Non-sex-biased genes show intermediate average Ka/Ks ratios that are significantly higher than those of female-biased genes (P = 2.16 × 10−4). These observations are not consistent with previous findings for a small set of genes that are only male biased in D. pseudoobscura but not in D. melanogaster (Metta et al. 2006).
Patterns of sequence divergence (Ka/Ks) for male-biased (MB, n = 4296), non-sex-biased (NSB, n = 1280), and female-biased (FB, n = 2985) genes. Ka/Ks values were estimated using a set of 13,537 common orthologs extracted from the D. pseudoobscura and D. persimilis genomes. Letters reflect significant differences between the observed distributions (P < 0.0001; Wilcoxon rank-sum test). Boxes encompass the lower and upper quartiles, with the internal line representing the median and whiskers extending to the 2.5th and 97.5th percentiles.
Interestingly, we see a significant positive correlation between levels of expression divergence and sequence divergence for all genes (ρ = 0.1310, P < 0.0001) and for male-biased genes (ρ = 0.1897, P < 0.0001). This is somewhat surprising given that male-biased genes evolve faster at the sequence level but show no consistent pattern of faster evolution at the expression level in this species group.
DISCUSSION
We used a custom-designed oligonucleotide microarray to study patterns of sex-biased expression and sex-dependent expression divergence in three species of Drosophila that diverged during the last 0.5 MY. The array was designed using the complete genome sequences of D. pseudoobscura and D. persimilis to minimize artifacts in the measure of gene expression divergence that could be the result of sequence divergence between the species. Few detailed studies on sex-dependent patterns of gene expression divergence have been published (Meiklejohn et al. 2003; Ranz et al. 2003) and none on those in closely related species (divergence <1 MY). Our study on members of the D. pseudoobscura species group shows that their transcriptomes are very dynamic and that changes in sex-biased patterns of expression can occur rapidly. Although we used different microarray platforms (oligo vs. cDNA) and different analytical methods (ANOVA vs. Bayesian) from those used in the most detailed previous study (Ranz et al. 2003), our results are fairly consistent with Ranz et al.'s observations in a more divergent species pair (D. melanogaster and D. simulans). What is most remarkable from this comparison is that despite a five- to sixfold difference in divergence time between the species pairs compared, the proportion of the transcriptome showing changes in sex-biased expression patterns and sex-dependent changes in expression is not very different in the two studies. These observations suggest that divergence in sex-biased expression patterns in Drosophila may proceed very rapidly after initial species divergence and then may reach a steady state where gains in sex bias at some genes are balanced by losses at other genes. Data from other recently diverged species and from pairs of species with intermediate levels of divergence will be required to determine if that is a valid interpretation of the dynamics of changes in sex-biased expression.
Although most of our results are consistent with previously held views about the evolution of sex-biased genes, we observed three surprising sets of results among the most common patterns of expression divergence in our data: (1) non-sex-biased genes show the highest average divergence in males (Figure 2A), (2) there is no significant difference between male- and female-biased expression divergence in males (Figure 2A), and (3) male-biased genes have significantly higher divergence than female-biased genes when they are expressed in females (Figure 2B). These are unexpected findings given the previous data from D. melanogaster and D. simulans (Meiklejohn et al. 2003) showing that male-biased genes have higher levels of divergence in both males and females but are significantly more divergent only when they are expressed in males. The patterns we observe suggest that sex-biased genes are experiencing higher selection pressures in the alternate sex, leading to increased average divergences of male-biased genes in females and female-biased genes in males, but showing significant differences only in females. Testing the hypothesis that selection may be responsible for the observed patterns will require collecting intraspecific variation data on gene expression, as previously done by Meiklejohn et al. (2003).
The combined evidence from two different groups of Drosophila (D. melanogaster and D. pseudoobscura) suggests that sex-dependent evolution is likely to be a general pattern of transcriptome divergence. Furthermore, the substantial excess of sex-biased genes among those differentially expressed between species further supports the conclusion that sex-dependent selection may be the prevalent mechanism involved in gene expression divergence (Meiklejohn et al. 2003; Ranz et al. 2003), contradicting the claim that most transcriptome changes are neutral (Khaitovich et al. 2004, 2005). However, intraspecific variation expression data are needed to test this hypothesis rigorously in the future (Meiklejohn et al. 2003; Ranz and Machado 2006).
Sequence and gene expression divergence of sex-biased genes:
We show that male-biased genes show faster rates of protein divergence than genes with female- or nonbiased patterns of expression (Figure 4), in agreement with observations in other animal taxa (Torgerson et al. 2002; Zhang et al. 2004, 2007; Cutter and Ward 2005). These results indirectly support the consensus view that patterns of molecular evolution in male-related genes are influenced by sexual selection in the form of male–male competition or antagonistic male–female coevolution (Civetta and Singh 1998b; Zhang et al. 2004). A previous study in D. pseudoobscura did not find evidence of faster rates of protein evolution in a small set of genes that were male biased in this species and not in D. melanogaster (Metta et al. 2006), suggesting that D. pseudoobscura was under a unique selection regime. However, that study was based on a small number of genes for which the pattern of sex-biased expression was determined using a sequencing method (SAGE) with low coverage. Our results using whole genome data do not support those conclusions.
It is important to note that in most of the previous studies the classification of genes into different categories of sex bias was based on expression patterns from only one species. Given that a large portion of the transcriptome shows gain, loss, and reversals of sex bias, previous analyses have included genes that are quite likely to have changed their sex-biased patterns during divergence and are thus male biased in only one of the species compared. Therefore, although the main conclusion about faster rates of protein evolution in male-biased genes is likely to hold, results from other taxa should be fully confirmed once patterns of sex-biased expression are obtained for all species compared.
Some caution needs to be exerted in interpreting the Ka/Ks results given the recent divergence of the species and the presence of large amounts of shared polymorphisms in collinear regions of the genome (Wang et al. 1997; Machado et al. 2002; Hey and Nielsen 2004; Noor et al. 2007; Kulathinal et al. 2009). Two main issues need to be considered. First, the use of a single genome sequence for each species does not allow us to determine if observed nucleotide differences are true fixed differences or just polymorphisms and thus inflated divergence measures may be obtained. This could tend to decrease Ka/Ks if Ks values are more inflated than Ka, as expected. For the same reason, it is possible that the pattern observed in male-biased genes could be the result of a larger fraction of slightly deleterious nonsynonymous variants segregating in these genes. Second, a recent study of genome divergence between D. pseudoobscura and D. persimilis (Noor et al. 2007) has shown low values of Ks near the tip of each chromosome reflecting the lower polymorphism expected in regions of low recombination. This observation implies that Ks may be artificially high across much of the rest of the chromosomes of these species (reflecting polymorphism rather than divergence), and this would presumably also be true for Ka. As a result, other interpretations for the causes of the observed differences in Ka/Ks across genes with different levels of sex bias are possible. With the genome sequence data available we cannot determine how much any of those factors has influenced the observed patterns. A population genetic study should help determine whether the interpretation that the increased rates of protein divergence in male-biased genes are in fact the result of positive selection as previously shown (Stevison et al. 2004; Zhang et al. 2004).
Finally, the finding of a positive correlation between sequence and gene expression divergence was somewhat surprising given our overall results. Although several studies have reported evidence for a positive correlation between sequence and gene expression divergence (Nuzhdin et al. 2004; Khaitovich et al. 2005; Lemos et al. 2005; Liao and Zhang 2006; Sartor et al. 2006), this is not a general pattern (Jordan et al. 2004; Tirosh and Barkai 2008). At the DNA level, for protein-coding genes, a standard criterion has been developed for studying rates of evolution by comparing mutation rates between synonymous and nonsynonymous sites. However, the lack of similar criteria for measuring evolutionary rates and patterns of expression divergence is problematic and leads to use of multiple methods and to inferences about correlation that vary, depending on the methodology used. More critical is the important assumption made in expression divergence analyses about independence across genes. This independence is violated when a single gene influences variation in the expression of multiple genes, in particular when few regulatory changes of the upstream genes in a pathway might cause changes of thousands of downstream genes. This violation has been observed in artificial selection on odor-guided behavior and mating speed in D. melanogaster (Anholt et al. 2003; Mackay et al. 2005) and has been inferred in cases where “master regulators” of expression seem to be present (Morley et al. 2004). The epistatic and pleiotropic nature of the molecular mechanisms underlying gene expression might lead to uncoupled rates of protein and gene expression evolution. Future eQTL studies should help clarify whether interspecific changes in gene expression have been caused by changes in a handful of master-regulatory genes or have evolved largely independently from each other.
Acknowledgments
We thank Jose Muñoz-Rodriguez, August Woerner, and Susan Miller for technical assistance. Greg Gibson and two anonymous reviewers provided constructive comments. The Genomics Core at the Arizona Cancer Center, University of Arizona, performed the microarray work. This work was supported by National Science Foundation grant DEB-0520535 to C.A.M.
Footnotes
Supporting information is available online at http://www.genetics.org/cgi/content/full/genetics.109.105775/DC1.
Microarray data have been deposited in NCBI's Gene Expression Omnibus and are accessible through GEO series accession no. GSE17192.
Communicating editor: G. Gibson
- Received June 4, 2009.
- Accepted August 23, 2009.
- Copyright © 2009 by the Genetics Society of America