Whole-genome sequencing, particularly in fungi, has progressed at a tremendous rate. More difficult, however, is experimental testing of the inferences about gene function that can be drawn from comparative sequence analysis alone. We present a genome-wide functional characterization of a sequenced but experimentally understudied budding yeast, Saccharomyces bayanus var. uvarum (henceforth referred to as S. bayanus), allowing us to map changes over the 20 million years that separate this organism from S. cerevisiae. We first created a suite of genetic tools to facilitate work in S. bayanus. Next, we measured the gene-expression response of S. bayanus to a diverse set of perturbations optimized using a computational approach to cover a diverse array of functionally relevant biological responses. The resulting data set reveals that gene-expression patterns are largely conserved, but significant changes may exist in regulatory networks such as carbohydrate utilization and meiosis. In addition to regulatory changes, our approach identified gene functions that have diverged. The functions of genes in core pathways are highly conserved, but we observed many changes in which genes are involved in osmotic stress, peroxisome biogenesis, and autophagy. A surprising number of genes specific to S. bayanus respond to oxidative stress, suggesting the organism may have evolved under different selection pressures than S. cerevisiae. This work expands the scope of genome-scale evolutionary studies from sequence-based analysis to rapid experimental characterization and could be adopted for functional mapping in any lineage of interest. Furthermore, our detailed characterization of S. bayanus provides a valuable resource for comparative functional genomics studies in yeast.
ANALYSIS of the genome sequences of related species has provided tremendous insight into the key functional elements of genomes as revealed by patterns of DNA sequence conservation. The Saccharomyces yeasts have been particularly well sampled by sequencing projects over the past decade (reviewed in Dujon 2010), and comparative analyses have revealed a history of gene duplication (Dietrich et al. 2004; Kellis et al. 2004), conservation at DNA binding sites (Cliften et al. 2003; Kellis et al. 2003), and coevolution of binding sites with regulators (Gasch et al. 2004). However, to enable more thorough understanding of the underlying biology, sequence-based studies must be complemented by the experimental study of functional divergence. Within Saccharomyces cerevisiae, comprehensive analysis of gene expression, protein levels, and metabolite levels demonstrates the ability of gene expression rather than raw sequence data to predict phenotype (Guan et al. 2008). In the yeasts, studies of promoter usage (Borneman et al. 2007), transcription factor binding (Doniger et al. 2005), stress sensitivity (Kvitek et al. 2008), transcriptional network changes (Tsong et al. 2006; Tuch et al. 2008), mating (Zill and Rine 2008), replication timing (Muller and Nieduszynski 2012), protein levels (Khan et al. 2012), and nucleosome occupancy (Guan et al. 2011; Tsankov et al. 2010) demonstrate that interesting evolutionary features emerge when processes are compared in detail within these eukaryotes.
Despite this foundational work, no studies have yet attempted to experimentally characterize gene function on a systematic scale in nonmodel newly sequenced species. An ideal study of gene function in a new species would establish precise functions for all species-specific genes and allow a systematic comparison of gene function and regulation for orthologs between species. Such a study can form the groundwork for connecting functional and regulatory differences to the sequence variants that have accumulated over evolutionary time. Conversely, genes with conserved function and regulation can be used to infer DNA sequence changes that are either neutral or that coevolved to maintain the selected characters. Gene-expression analysis fits these requirements, as genes of shared functions are highly correlated in their expression, and, conversely, gene-expression correlations are highly predictive of gene function (Stuart et al. 2003; van Noort et al. 2003; Hibbs et al. 2007; Huttenhower et al. 2007).
Limited comparative analyses of gene expression among different species have already been attempted and show how rapidly networks can evolve (reviewed in Whitehead and Crawford 2006). Comparisons between extremely divergent systems can discover core pathways shared over vast evolutionary distances (Stuart et al. 2003; Bergmann et al. 2004), while focusing on species that are less diverged permits study of more rapidly adapting processes and facilitates identification of the specific sequence changes that might be driving these differences. Furthermore, observing a phenomenon in multiple species provides solid evidence that it is not specific to a laboratory-adapted model organism but is instead an evolutionarily conserved biological response (Hess et al. 2006; Zill and Rine 2008; Airoldi et al. 2009).
To examine the conservation and divergence of gene function, we selected the yeast Saccharomyces bayanus var. uvarum (henceforth referred to as S. bayanus for simplicity) for comparison with S. cerevisiae. The two species diverged ∼20 million years ago and show a level of DNA sequence divergence comparable to that between mouse and human (80% conserved in coding regions and 62% conserved in intergenic regions as compared to S. cerevisiae). We have recently used next-generation sequencing to create a high-quality assembly and gene model prediction of the S. bayanus genome, and we created an extendable genome browser to facilitate its use (Scannell et al. 2011). Importantly, sequence conservation of functional elements is still detectable (for example, noncoding RNAs; Kavanaugh and Dietrich 2009). Like S. cerevisiae, S. bayanus is a species used in winemaking, and recent studies of its genome content and relationship to lager yeasts have clarified taxonomic confusion (Libkind et al. 2011). The phylogenetic proximity and shared natural history with S. cerevisiae also make it possible to select specific experimental conditions for S. bayanus by reference to the vast literature available for S. cerevisiae, one of the most popular model organisms. The two species can make interspecific hybrids, allowing complementation tests with S. cerevisiae alleles. However, with a few exceptions (Serra et al. 2003; Talarek et al. 2004; Jones et al. 2008; Zill and Rine 2008; Gallagher et al. 2009; Zill et al. 2010), little experimental work has been performed in S. bayanus and even less at genome scale (Bullard et al. 2010; Tsankov et al. 2010; Busby et al. 2011; Guan et al. 2011; Muller and Nieduszynski 2012).
We first compared the basic growth characteristics of the two species and developed genetic tools and protocols to facilitate experimental manipulations of S. bayanus. Following this characterization of the species, we then produced a gene-expression compendium of over 300 microarrays in S. bayanus, guided by a machine-learning analysis of the entire S. cerevisiae literature that predicts an optimal set of conditions for expression analysis (Guan et al. 2010), and assembled a set of published expression experiments in S. cerevisiae for comparison. Similar to comparative sequence analysis, comparing the gene-expression responses of different species allows the identification of programs of conserved gene regulation and of alterations in gene-expression response. In comparing the S. bayanus and S. cerevisiae data, we noted a number of examples of divergence in gene expression between the species (Guan et al. 2013). Also, because genes of like function typically have correlated gene expression (Eisen et al. 1998), patterns of coexpression can be used to predict the functional roles of genes (Sharan et al. 2007).
Our analysis of these data sets reveals both regulatory change and evolution of gene function amid overall conservation. Specific examples include expression rewiring in the pathways controlling meiosis and galactose utilization, oxidative stress driving expression of a species-specific network, and evidence for divergence of specific functional groups.
Materials and Methods
The strains used in this study are described in Supporting Information, Table S1. Custom oligonucleotide probes specific for S. bayanus genes were designed and printed using a pin-style arraying robot. S. bayanus cells were grown and exposed to a variety of stimuli and RNA was harvested and labeled by direct incorporation of fluorescent nucleotides into cDNA. Deletion and insertion mutants were produced in diploids by homologous recombination using adaptations of standard methods for S. cerevisiae and haploids were obtained by sporulation and dissection. S. bayanus data and a compendium of S. cerevisiae data were processed for gene function prediction using support vector machines. As there were no existing biological process annotations in S. bayanus, we adopted the annotations from S. cerevisiae for training.
The microarray expression data are available from GEO as GSE16544 and GSE47613. The interactive network view of the expression data and searchable prediction results are available at http://bayanusfunction.princeton.edu. Complete methods information is included as File S1.
Developing S. bayanus into a new model system required an initial characterization of its growth habits and preferences, along with the development of genetic tools to enable the types of studies that are routine in established model systems.
Phenotypic analysis and genetic tools
We began our work in S. bayanus by measuring its growth and physiology. As previously reported (Goncalves et al. 2011; Salvado et al. 2011), in minimal media at 20°, S. bayanus grows faster than S. cerevisiae (Figure 1A). The species grew at nearly equal rates at 25°, and at 30° S. bayanus grew more slowly than S. cerevisiae (Figure 1, B and C). Accordingly, S. bayanus was more sensitive to heat shock than S. cerevisiae; transfer to 40° slowed growth of S. bayanus more than it did in S. cerevisiae (Figure 1, D and E). This heat sensitivity precludes efficient lithium acetate transformation using heat shock at 42°, so we modified our procedure to use a milder 37° heat shock for S. bayanus.
When grown on glucose medium to the point of glucose depletion, S. bayanus underwent a diauxic shift marked by a growth arrest followed by a shift to ethanol consumption and a slower growth rate (Figure 1F), consistent with its natural history and qualitatively similar to the behavior of S. cerevisiae. We also measured the growth inhibition by a variety of transition metals, salts, and oxidants (Figure 1G). The survival of S. bayanus and S. cerevisiae was similar during starvation for the essential nutrients sulfate and phosphate (Figure 1H). Finally, we analyzed our S. bayanus strain for the presence of the 2μ plasmid and observed that it does not carry detectable levels of the plasmid, although a hybrid with S. cerevisiae prepared in our laboratories maintains this DNA element (Figure 1I).
We constructed a Tn7 insertion library (Kumar et al. 2004) to create a collection of S. bayanus mutant strains. We built a Tn7 transposon carrying a ClonNat resistance marker selectable in both bacteria and yeast. The transposed marker carries stop codons in all reading frames near both termini and so is expected to produce truncations when inserted within genes. Our library contained ∼50,000 unique genomic insertions, and we have used it to screen for a variety of phenotypes including auxotrophies, drug resistance, and copper resistance (see below). By transforming the library into MATα strains and using a ClonNat resistance marker, mutants isolated from this Tn7 set can be used directly in complementation assays by mating to S. cerevisiae strains from the widely used MATa deletion set that carries complementary G418 drug resistance. Insertion mutations can also be mapped using microarray or sequencing technologies (see below). We expect that this mutant collection will be a valuable resource for mutation screening in this new species.
Gene-expression data set
Just as lessons learned from early whole-genome sequencing projects led to more efficient sequencing of related genomes in subsequent projects, we can leverage the thousands of microarray experiments performed in the yeast S. cerevisiae to direct efficient expression profiling in a related organism. Given the shared history of these species, we reasoned that experiments with high predictive value of gene function in S. cerevisiae were also likely to be useful in related yeasts. We also assumed that most of these treatments were likely to target similar ranges of functional categories in the two species. With these ideas in mind, we developed a data-driven experiment recommendation system to identify the minimal set of maximally informative experiments for functional characterization of the S. bayanus genome based on the S. cerevisiae gene-expression literature (Guan et al. 2010).
We carried out 304 microarray measurements in 46 experimental manipulations (detailed in Table S2). Because of the many practical similarities with S. cerevisiae, the experiments were effectively prototyped for us by their original S. cerevisiae publications, in many cases needing only minor modification to adapt them for S. bayanus. Our computationally selected treatments perturbed the majority of the genes in the cell: 4828 of the 4840 S. bayanus genes measured by our array show twofold or greater change in at least one treatment.
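The count of responsive genes reported above can be illustrated with a minimal sketch. It assumes that each measurement is stored as a log2 ratio of treated to reference signal, so that a twofold change in either direction corresponds to an absolute value of at least 1. The function name, the data layout, and the toy values are hypothetical and are not taken from our processing pipeline.

```python
import math

def genes_changing(log2_ratios, fold=2.0):
    """Return the genes whose expression changes by at least `fold`
    (up or down) in at least one condition.

    `log2_ratios` maps a gene name to a list of log2(treated / reference)
    values, one per array (an assumed data layout).
    """
    threshold = math.log2(fold)
    return {
        gene
        for gene, values in log2_ratios.items()
        if any(abs(v) >= threshold for v in values)
    }

# Hypothetical toy data: three genes measured on three arrays.
toy = {
    "geneA": [0.2, -1.3, 0.1],  # repressed more than twofold once
    "geneB": [0.4, 0.5, -0.6],  # never reaches a twofold change
    "geneC": [2.1, 0.0, 0.3],   # induced more than fourfold once
}

responsive = genes_changing(toy)
print(sorted(responsive), f"{len(responsive)} of {len(toy)} genes")
```

Raising `fold` gives a stricter definition of a responsive gene; missing values, which real array data contain, are not handled in this sketch.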
Hierarchical clustering of this S. bayanus gene-expression compendium revealed a number of groups of genes coexpressed under a variety of conditions (Figure 2, numerical data in Table S3). Although clustering was performed solely on the S. bayanus data and was not informed by the evolutionary relationships between S. bayanus and S. cerevisiae genes, we noted many groups of S. bayanus genes nevertheless showing expression patterns similar to those in S. cerevisiae. Most strikingly, two large cohorts of genes responded coordinately to multiple stresses, with one group repressed and the other induced. This large-scale response indicates that S. bayanus shows the canonical environmental stress response identified in S. cerevisiae (Gasch et al. 2000) and other yeasts (Gasch 2007). Other treatments elicited gene-expression responses from smaller groups of genes. For instance, a group of genes was strongly upregulated in response to alpha-factor pheromone. This pheromone response declined as cells were released from alpha-factor arrest into the cell cycle. As another example, two other groups of genes were expressed periodically during the cell cycle with different phases of peak gene expression.
As an initial test of whether these expression clusters reflect functional gene groupings in both species, we started with the simplest—and almost certainly incorrect—assumption that all genes in S. bayanus have the same functions as their orthologs in S. cerevisiae. Using these inferred annotations, we calculated the Gene Ontology (GO) term enrichment for correlated clusters, and we observe significant enrichment for genes of like biological process and cellular component among the clusters of genes with coherent expression (Figure 2). Further, the expression patterns in these clusters showing compartment-specific or biological process enrichment are consistent with the expression patterns of genes involved in the same biological process in similar S. cerevisiae experiments. For instance, the cluster of genes activated by mating pheromone was enriched for genes whose S. cerevisiae orthologs have experimentally validated roles in response to pheromone, conjugation, and karyogamy.
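An enrichment calculation of this kind can be sketched as follows. The sketch assumes a one-sided hypergeometric test, which is a common choice for GO term enrichment but is not specified in this section, followed by a Bonferroni correction for the number of terms tested. The function names and the toy counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_genome, n_annotated, n_cluster, n_overlap):
    """P(X >= n_overlap) when n_cluster genes are drawn without replacement
    from n_genome genes, n_annotated of which carry the GO term."""
    upper = min(n_annotated, n_cluster)
    total = comb(n_genome, n_cluster)
    tail = sum(
        comb(n_annotated, k) * comb(n_genome - n_annotated, n_cluster - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P-value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Hypothetical toy example: 20 genes on the array, 5 annotated to a term
# (via their orthologs), and a 4-gene cluster containing 3 annotated genes.
p = hypergeom_enrichment_p(n_genome=20, n_annotated=5, n_cluster=4, n_overlap=3)
p_corrected = bonferroni(p, n_tests=10)  # assuming 10 terms were tested
print(f"raw P = {p:.4f}, corrected P = {p_corrected:.4f}")
```

Because the annotations are transferred from orthologs, the background set in such a test should be restricted to genes that have an ortholog and are present on the array.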
Gene-expression patterns diverge in subtle ways
Although many aspects of gene expression are conserved, we noted a number of instances of gene-expression patterns different from those observed in S. cerevisiae orthologs in response to similar treatments. In S. cerevisiae, the galactose metabolism genes were induced to detectable levels only in the presence of galactose (Gasch et al. 2000). However, in S. bayanus, the orthologs of the galactose structural genes GAL1, GAL10, GAL7, and GAL2 were detectably induced not only when cells were exposed to galactose, but also when cells were switched from glucose to other less-preferred carbon sources including ethanol, raffinose, sucrose, and glycerol (Figure 3A). The derepression of galactose metabolism genes on nonglucose carbon sources has been previously described in detail in S. cerevisiae (Matsumoto et al. 1981; St John and Davis 1981; Yocum et al. 1984), but the magnitude of this increase in gene expression on nonglucose carbon sources is much greater in S. bayanus. We verified this expression difference between S. bayanus and S. cerevisiae using quantitative PCR for GAL1 (Figure S1). This activation of the galactose structural genes by multiple carbon sources suggests that S. bayanus might have evolved in an environment in which galactose becomes available at the same time as other nonglucose carbon sources.
We created a resource that presents a network view comparing gene expression between S. cerevisiae and S. bayanus (http://bayanusfunction.princeton.edu). The gene-expression network around GAL1 showed that GAL1, GAL10, and GAL7 have a correlation of 0.99 in both species under all expression conditions (Figure 3B). However, the correlation of the GAL genes with other genes revealed differences in regulation between species. For instance, the ortholog of the hexose transporter HXT7 had a correlation of 0.98 with the galactose genes in S. bayanus because this and other hexose transporters were upregulated whenever glucose was low. In contrast, in S. cerevisiae the correlation between HXT7 and GAL1 was only 0.19 because HXT7 was upregulated in response to declining glucose concentration while GAL1 was not.
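The comparison of coexpression between species can be illustrated with a small sketch that computes the Pearson correlation between two expression profiles in each species separately. The profiles below are invented toy values chosen only to mimic the qualitative pattern described above (a transporter-like gene that rises whenever glucose is low, and a GAL-like gene that either follows it or responds to galactose alone). They are not measurements from our compendium.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

# Hypothetical log2 expression across four conditions:
# glucose, low glucose, galactose, ethanol.
transporter_like = [0.0, 3.0, 3.0, 3.0]  # induced whenever glucose is low
gal_like_species_1 = [0.0, 3.0, 4.0, 3.0]  # induced on all nonglucose sources
gal_like_species_2 = [0.0, 0.0, 4.0, 0.0]  # induced by galactose only

r_species_1 = pearson(gal_like_species_1, transporter_like)
r_species_2 = pearson(gal_like_species_2, transporter_like)
print(f"species 1: r = {r_species_1:.2f}; species 2: r = {r_species_2:.2f}")
```

A large drop in correlation for an ortholog pair, as in this toy case, flags a candidate regulatory difference, provided the two species were profiled under comparable sets of conditions.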
Transcription factors as a group showed higher than expected divergence in expression between S. bayanus and S. cerevisiae, and the S. bayanus ortholog of IME1 (670.55, which we will refer to as SbayIME1) in particular showed exceptions to the diploid-specific expression observed in S. cerevisiae. In S. cerevisiae, IME1 expression is primarily limited to diploid cells (Kassir et al. 1988), but in haploid MATa S. bayanus, SbayIME1 was induced over 10-fold by alpha-factor pheromone (Figure 3C). As observed in S. cerevisiae, SbayIME1 is required for sporulation (data not shown), and although SbayIME1 was strongly induced by alpha factor we did not observe significant changes in the pheromone response of Sbayime1 mutant cells (Figure S2). Chromatin immunoprecipitation experiments observed twofold higher levels of the pheromone response transcription factor SbaySte12 (570.3) at the SbayIME1 promoter as compared to Ste12 occupancy at the IME1 promoter in S. cerevisiae (Borneman et al. 2007), supporting our observation of differential pheromone activation of SbayIME1 in S. bayanus as compared to ScerIME1. In S. cerevisiae Ime1 is subject to translational regulation (Sherman et al. 1993), and the lack of an effect on transcription in response to pheromone in the Sbayime1 mutant could similarly be explained by post-transcriptional regulation. IME1 has been observed to be under selective pressure in S. cerevisiae (Gerke et al. 2009), and the altered expression here may suggest that it is evolving to take on additional roles.
S. bayanus gene function predictions via machine learning are confirmed by mutational analysis
By comparing gene expression between orthologs under known conditions we were able to find examples of changes in gene expression and use these changes to infer functional differences between species. Such inferences are limited by existing knowledge of the link between expression and biological function and by the availability of directly comparable data sets in both species. These limits can be overcome using computational interpretation of expression data, which accurately predicts gene function over much larger data sets than a human can process (Huttenhower and Troyanskaya 2008). Using a support vector machine (SVM) learning method trained using the GO biological process annotations of S. cerevisiae orthologs, we predicted the functional roles of S. bayanus genes (Table S4).
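The general idea of this prediction step can be sketched with a toy linear SVM. The sketch is a simplified stand-in rather than our implementation: it trains a single binary classifier for one hypothetical GO term by batch subgradient descent on the regularized hinge loss, whereas the software, kernel, and parameters actually used are described in File S1. All gene profiles, labels, and parameter values below are invented for illustration.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear SVM by batch subgradient descent on
    (lam / 2) * ||w||^2 + mean hinge loss. Labels must be +1 or -1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge term
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def decision(w, b, x):
    """Signed score; positive values predict membership in the GO term."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

# Hypothetical expression profiles (three conditions per gene).
# Labels are transferred from the annotation of each gene's ortholog:
# +1 if the ortholog is annotated to the GO term, -1 otherwise.
profiles = [
    [2.0, 2.0, 0.0], [3.0, 2.0, 0.5], [2.5, 3.0, 0.0],  # ortholog in term
    [0.0, 0.0, 2.0], [0.5, 0.0, 3.0], [0.0, 0.5, 2.5],  # ortholog not in term
]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(profiles, labels)

# Score two unannotated genes (for example, species-specific genes).
score_like_term = decision(w, b, [2.5, 2.5, 0.2])
score_unlike_term = decision(w, b, [0.2, 0.1, 2.8])
print(score_like_term > 0, score_unlike_term > 0)
```

In practice one classifier would be trained per GO term and evaluated on held-out genes, so that predictions for a gene never depend on that gene's own transferred label.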
Many gene functions are preserved over vast evolutionary distance, as evidenced by the many examples of mammalian genes that can complement deletion mutations in yeast (reviewed in Osborn and Miller 2007). Accordingly, we found that many genes were predicted to have the same function in S. bayanus and S. cerevisiae even though the SVM does not reference protein sequence homology while making predictions. For example, we predicted a role in oxidative phosphorylation for 643.11, the ortholog of RPM2, the mitochondrial RNAseP required for processing mitochondrial tRNAs from transcripts. Consistent with this prediction, an insertion mutant in SbayRPM2 was respiratory deficient (Figure S3). Similarly, we predicted a role in cell morphogenesis for 678.66, the ortholog of AMN1. A knockout mutant of Sbayamn1 lost daughter cell adhesion (“clumpiness,” Figure S4), as has been observed for the amn1 deletion allele in S. cerevisiae (Yvert et al. 2003). As a third example, we predicted a role in telomeric silencing and protein acetylation for 668.17, the ortholog of the protein acetyltransferase ARD1. In a MATa insertion mutant of Sbayard1, we observed repression of MATa haploid-specific genes, as reported for ard1 mutants (Whiteway et al. 1987) (Figure S5A), and we note that the mutation causes genome-wide expression changes (Figure S5B). For the whole-genome duplicate serine/threonine protein kinases 642.24 (DBF2) and 636.21 (DBF20), we predicted roles in the regulation of mitosis and the regulation of DNA damage checkpoints, similar to the established roles of the S. cerevisiae orthologs in regulating cytokinesis and mitotic exit. As in S. cerevisiae, mutations in these genes are synthetic lethal (data not shown).
The functional predictions also can predict gene functions not yet known in S. cerevisiae. We carried out a screen for Tn7 mutants resistant to copper sulfate and identified a resistant mutant (Figure 4, A and B). Using an array-based method (Gabriel et al. 2006), we mapped the insertion upstream of 610.13, the ortholog of OPT1 (Figure 4C). Deletion analysis of SbayOPT1 and the divergently transcribed neighboring gene SbayPEX2 (610.12) confirmed that mutation of SbayOPT1 was responsible for resistance to copper (Figure 4D). The functional predictions for SbayOPT1 include cation homeostasis, the GO parent term that includes copper ion homeostasis (our functional predictions did not include GO terms with few members). ScerOPT1 (also named HGT1) has been characterized as a high-affinity glutathione transporter induced by sulfur starvation (Bourbouloux et al. 2000; Srikanth et al. 2005). Copper resistance had not been investigated in this mutant, although sensitivity to cadmium had been noted (Serero et al. 2008). The OPT1 mutant in S. cerevisiae also showed increased resistance to copper (Figure 4E). Of note, S. bayanus is more sensitive to copper than the laboratory strain of S. cerevisiae; our screen in the sensitized background of S. bayanus likely provided added sensitivity to detect genes involved in the response to copper (Figure 4E). These results suggest the potential for a relationship between glutathione transport and copper resistance and demonstrate how the predictions of gene function in S. bayanus provide information about conserved gene function in S. cerevisiae.
Different rates of functional divergence characterize different gene groups
Just as genes involved in different biological pathways have been observed to evolve at the sequence level at different rates (Aris-Brosou 2005; Wolf et al. 2006), certain classes of genes may show more rapid functional divergence. We examined our predictions of gene function in both species and identified cases in which a pair of orthologs showed very large changes in predicted function between species (Table 1, full data in Table S5). We observed the smallest number of changes in ribosomal biogenesis and in electron transport, and many core metabolic processes showed few changes, consistent with these genes’ typical conservation at the sequence level. Processes showing the highest amount of change included response to osmotic stress, autophagy, and organelle inheritance. Although it was not immediately obvious why these processes are changing so quickly, these results will help to guide future experiments. We also observed significant change in small GTPase mediated signal transduction and hypothesize that this may reflect the constitutive signaling through the mating pathway caused by a mutation common in laboratory strains of S. cerevisiae (Lang et al. 2009) not present in the S. bayanus strains used here.
Annotations for species-specific genes
Genome sequence analysis allows comparison of gene content in different species, which can suggest the evolutionary pressures that shape specific lineages (Gordon et al. 2009). Similarly, examining the functional roles predicted for genes found in one species but not another can suggest potential functions for these species-unique genes, revealing species-specific adaptations. We examined the expression data of S. bayanus genes that do not have orthologs in S. cerevisiae and found a prominent cluster of 25 genes that includes 13 genes specific to S. bayanus, including 8 with no orthologs in any surveyed yeast (Gordon et al. 2009) (Figure 5A). These genes were induced 16- to 32-fold by peroxide stress, bleach, and MMS but not other stresses or any other conditions tested in our compendium. Peroxide, bleach, and MMS all increase reactive oxygen levels (Winter et al. 2008; Kitanovic et al. 2009), so we propose that this group of genes responds specifically to oxidative stress. Two DNA sequence motifs are enriched in the promoters of the S. bayanus genes in this cluster, and these motifs are very similar (Table S6, P < 7 × 10⁻⁵, Mahony et al. 2007) to motifs established by analysis of sequence conservation among the sensu stricto yeasts (Kellis et al. 2003). Furthermore, one of the motifs is similar to that of S. cerevisiae CAD1 (Harbison et al. 2004), a transcription factor with a role in stress response (Wu et al. 1993). As the CAD1 ortholog in S. bayanus has been annotated as a pseudogene (Scannell et al. 2011), it is likely that some other transcription factor is activating these genes. The stress-responsive gene YAP1 has a similar binding site in S. cerevisiae and is a candidate for the oxidative stress activation we observe. The number of genes specific to S. bayanus annotated to oxidative stress suggests that S. bayanus may encounter a different spectrum of stresses.
Our functional predictions for genes in our oxidative stress cluster included response to toxin (GO:0009636), sulfur metabolic process (GO:0006790), and response to temperature stimulus (GO:0009266) (Figure 5A). Many of these functions have been demonstrated for the 12 genes that have S. cerevisiae orthologs, and 10 of the 12 S. cerevisiae orthologs are induced by hydrogen peroxide (Gasch et al. 2000; Causton et al. 2001). Five of the S. cerevisiae orthologs of this cluster have been assigned the GO biological process of response to toxin (GO enrichment, P < 4.07 × 10⁻⁹, Bonferroni corrected), and two of the S. cerevisiae orthologs in this cluster have roles in sulfur metabolism: GTT2 is a glutathione S-transferase, and YCT1 is a cysteine transporter. The predicted role in toxin response is consistent with the activation by oxidative stress, because in S. cerevisiae, genes assigned to this biological process are induced by the mycotoxin citrinin, which causes oxidative stress (Iwahashi et al. 2007). Also, the sulfur metabolic process includes genes involved in sulfur assimilation, a biochemical process that consumes reducing equivalents. Of the 12 proteins in this cluster that have S. cerevisiae homologs, 5 are proteins of unknown function. These functional predictions from S. bayanus may help to inform functional experiments on the S. cerevisiae orthologs.
Gene duplicates are known to play a prominent role in yeast genome evolution. Among our functional predictions for the S. bayanus genome, we examined the seven genes present in duplicate in S. bayanus but not in S. cerevisiae and noted that our expression data had yielded a prediction of a role in galactose metabolism for one of these genes (Table S4), which had also been previously noted on the basis of comparative homology (Hittinger et al. 2004, 2010; Cliften et al. 2006; Gordon et al. 2009; Scannell et al. 2011). Both duplicates of the ancestral GAL80 gene are retained in S. bayanus, but only GAL80 is present in S. cerevisiae. The S. bayanus GAL80 ortholog 555.11 retains its function as a repressor of galactose genes, as GAL genes were no longer repressed when Sbaygal80 mutant cells were grown in glucose (Figure S6), a derepression known in Scergal80 mutants (Douglas and Hawthorne 1966; Yocum and Johnston 1984). In addition, 670.20, the ohnolog of SbayGAL80, which itself has no ortholog in S. cerevisiae, was predicted to function in galactose metabolism by our SVM. Indeed, we observed activation of 670.20 in response to galactose (Figure 3A), and Gal4 binding sites are present upstream of the gene. The galactose-specific activation of 670.20 differs from the response of the other S. bayanus GAL family genes, which are activated by growth on multiple nonglucose carbon sources. We also noted that 670.20 was derepressed in the Sbaygal80 mutant, as were other GAL genes (Figure S6).
To more directly study the role of 670.20 in galactose metabolism, we measured the gene-expression response of 670.20 mutants to a shift from raffinose to galactose and observed a set of four genes that failed to be activated by galactose in the 670.20 mutant (Figure 5B). These four genes are also members of the oxidative stress cluster shown in Figure 5A. Notably, the genes regulated by the S. bayanus-specific 670.20 are themselves present only in S. bayanus, forming a species-specific network.
Although the genomes of many nonmodel organisms are now sequenced, this flood of data has not been matched by functional experimental data in these species. Much of this can be attributed to the difficulty of working with unfamiliar organisms, but many other species lend themselves to laboratory study for comparative work. For example, the fly species sequenced by the 12 Drosophila species consortium (Consortium et al. 2007) can all be lab reared, as can several sequenced species of nematodes (Cutter et al. 2009). Yeasts are of course another taxon with many lab-amenable species.
Using gene-expression data we functionally annotated all the genes in S. bayanus (Table S4) and demonstrated the accuracy of our predictions using targeted mutational analysis. A sufficiently complex gene-expression data set can be used not only to compare strategies of gene regulation but also to predict biological function (Guan et al. 2008). Identifying regulatory changes across different species provides interesting insight into selection and adaptation. For instance, comparing the protein sequences encoded in bacterial genomes has helped to predict the metabolic capabilities of different lineages (Downs 2006). Our measurements of gene expression under well-characterized conditions directly relevant to defined biological processes illustrate examples of altered gene regulation that suggest functional differences between species.
Conversely, evidence of gene function in other species may be used to generate hypotheses about the functions of the orthologous genes of model systems, many of which still lack annotations (Peña-Castillo and Hughes 2007). Our study demonstrates the potential of computationally predicted annotations for both functional characterization and evolutionary analysis of new species.
The tools we have developed are generic and could easily be applied to other nonmodel species of interest. Application of our comparative approach to other groups of related species, such as Candida yeasts, Drosophila species, worms, or mammals, could extend the evolutionary observations made here. Since our experimental and analytical frameworks are agnostic to species and platform, they should be easily transferable to other systems. This new style of comparative functional genomics will ultimately allow better understanding of conservation and divergence in gene function and regulation and allow rapid adoption of experimental systems beyond the traditional model organisms.
We thank Jessica Buckles and Donna Storton in the Princeton Microarray Facility for assistance with production and processing of arrays, John Matese at the Princeton Microarray Database for support of data processing, John Wiggins for technical support of classroom computing resources, Dannie Durand and Manolis Kellis for helpful conversations, Jasper Rine for information on the bar1 mutation in the S. bayanus strain, Doug Koshland and Yixian Zheng for purchase of microarray oligonucleotides, Zhenjun Hu (developer of VisAnt) for assistance with gene network visualization, Zeiss for donating use of tetrad dissection microscopes, and Molecular Devices for donating use of Genepix software. O.G.T. is supported by the National Science Foundation CAREER award DBI-0546275, and by National Institutes of Health (NIH) R01 grants GM071966 and HG005998. M.J.D. is supported in part by a grant from the National Institute of General Medical Sciences (8 P41 GM103533-17) from the NIH. A.A.C. is supported in part by grants from the Canadian Institutes for Health Research. All authors were supported by the National Institute of General Medical Sciences (NIGMS) Center of Excellence P50 GM071508, and by donations from the A. V. Davis Foundation and Princeton University for funding of QCB301, Experimental Project Laboratory. M.J.D. is a Rita Allen Scholar and a Canadian Institute for Advanced Research Fellow. All authors participated in the design and execution of experiments through the educational activities of the NIGMS Center for Quantitative Biology at Princeton University. A.A.C., Y.G., M.J.D., and O.G.T. analyzed the data and wrote the manuscript.
Communicating editor: M. Johnston
- Received June 5, 2013.
- Accepted July 4, 2013.
- Copyright © 2013 by the Genetics Society of America
Available freely online through the author-supported open access option.