Many scientists complain that the current funding situation is dire. Indeed, there has been an overall decline in research funding from the National Institutes of Health and the National Science Foundation. Within the Drosophila field, some of us question how long this funding crunch will last, as it demotivates principal investigators and, perhaps more importantly, affects the long-term career choices of many young scientists. Yet numerous very interesting biological processes and avenues remain to be investigated in Drosophila, and probing questions can be answered quickly and efficiently in flies to reveal new biological phenomena. Moreover, Drosophila is an excellent model organism for studies that have translational impact for genetic disease and for other medical implications such as vector-borne illnesses. We would like to promote better collaboration between Drosophila geneticists/biologists and human geneticists/bioinformaticians/clinicians, as it would benefit both fields and significantly impact research on human diseases.
- functional genomics
- genetic disease
- whole-exome sequencing
- public health
One of the most current and common discussion topics among biologists is the decline in federal support for research. As shown in Figure 1A, the total number of R01 funded projects, the gold standard for science support in the United States via the National Institutes of Health (NIH), has declined by more than 17% in the past 5 years. This decline in number of R01s is also reflected in a similar decline in total invested dollars. The reduction in the number of R01 grants to support Drosophila research is even greater than the average and hovers at ∼25% for the past 5 years (Figure 1, A and B). Finally, the total support in dollars for each Drosophila R01 has remained steady or declined, unlike the average R01 grant (Figure 1, C and D). Hence, the gap in dollar support between the average NIH R01 and the average R01 in the Drosophila field is now nearing 15%. In summary, we estimate that our model organism has lost more than 30% of its support from NIH in the past 5 years vs. a ∼15% decline in total support for all fields combined. One could argue that this loss of support at NIH is partially compensated by additional support from the National Science Foundation (NSF). Unfortunately, support for Drosophila research based on available data from NSF has decreased similarly (Figure 1, E and F). Although other funding mechanisms may partially compensate for these losses of support, they can at best be considered marginal.
This reduced support is especially surprising as fly biologists have contributed in so many different ways to our understanding of key biological phenomena and have greatly advanced our knowledge of mammalian biology (Rubin and Lewis 2000; Bier 2005; Bellen et al. 2010). On the positive side, these contributions have not gone unnoticed at the major US philanthropy for science, the Howard Hughes Medical Institute (HHMI). HHMI has steadily supported investigators in the fly field for the past 30 years and this support has been unabated. It was even expanded in the past 10 years by selecting Drosophila as a model organism to unravel neural networks at the recently developed Janelia Research Campus. We believe that Drosophila research has indeed numerous assets, and we will argue in this perspective that the fly has a tremendous amount of knowledge and new discoveries to offer that will directly as well as indirectly benefit humanity. We will especially focus on the opportunities afforded by new molecular and sequencing technology and better access to human data. Drosophila provides many opportunities to solve numerous medically relevant problems and should continue to be supported much more broadly to continue to pioneer fundamental discoveries.
Past Contributions of Drosophila Research to Biomedical Research
Although it may seem moot to emphasize the past contributions of fly geneticists, developmental biologists, and neuroscientists, it is worthwhile to very briefly reiterate some of the most important contributions that originated in Drosophila research and their impact on the biomedical community. We refer the readers to several review articles that describe additional examples (Rubin and Lewis 2000; Bier 2005; Spradling et al. 2006; Arias 2008; Bellen et al. 2010).
Genetics and epigenetics
It is difficult to overstate the contribution of discoveries grounded in Drosophila genetics in the first part of the 20th century initiated by Thomas Hunt Morgan and his trainees. Morgan, Sturtevant, Bridges, and Müller established the chromosomal basis of inheritance (Morgan 1915; Sturtevant et al. 1919). In classical studies, Müller showed that X-rays were mutagenic (Müller 1928) and Sturtevant demonstrated linkage (Sturtevant 1917) and showed how unequal crossovers led to duplications and deletions (Sturtevant and Morgan 1923; Sturtevant 1925), a mechanism that underlies numerous human genetic disorders (Stankiewicz and Lupski 2006). Furthermore, both also contributed to the discovery of genes that affect position-effect variegation (Sturtevant 1925; Müller 1930), which turned out to be key conserved players in chromatin modification and epigenetic gene regulation (Kleinjan and van Heyningen 2005). These, and many other observations and experiments performed in the past century propelled Drosophila as the premier research system for genetics.
Development and cancer
In the field of developmental biology, Drosophila has played a leading role that started in the 1930s with the work of Poulson on Notch (Poulson 1937) and Lewis on the homeotic genes (Lewis 1978). Notch mediates cell–cell interactions in diverse contexts, and aberrations in Notch signal transduction can cause numerous cancers and other human diseases (Louvi and Artavanis-Tsakonas 2012; Yamamoto et al. 2014b). The study of Notch spawned a whole field that is currently the topic of numerous biomedical investigations (Bray 2006). The homeotic genes were first shown by Lewis to affect body plan in flies and were later shown to play numerous roles in almost all higher eukaryotic species (Krumlauf 1994). Again, numerous genes that carry homeobox motifs play critical roles in cancer (Shah and Sukumar 2010).
In the late seventies, Nüsslein-Volhard and Wieschaus performed genome-wide forward genetic screens for patterning defects in fly embryos that led to the discovery of numerous players in almost all key developmental pathways, including Wnt, Hedgehog, BMP/TGFβ, and Toll signaling (Nusslein-Volhard and Wieschaus 1980). The contribution of these pathways to our understanding of human development and developmental disease as well as cancer cannot be overstated (Rieder and Larschan 2014).
Neurobiology and neurological disorders
In the field of neurobiology, Drosophila has laid the ground for numerous important discoveries (Bellen et al. 2010). The genetic networks underlying diurnal rhythmicity were initially discovered and characterized in the fly (Konopka and Benzer 1971) and similar players were shown to cause human sleep disorders, narcolepsy, and restless leg syndrome (Sehgal and Mignot 2011). Fly geneticists also discovered the founding member of the transient receptor potential (TRP) channels (Montell et al. 1985; Montell and Rubin 1989). These channels have been shown to play critical roles in vision, pain, heat, and cold perception and the trp founding member promoted the discovery of the vertebrate homologs that are associated with numerous Mendelian diseases (Dai et al. 2010; Nilius and Owsianik 2011).
The discovery of the Shaker (Sh) and ether a go-go (eag) mutants led to the identification of two very important families of potassium channels (Tempel et al. 1987; Warmke et al. 1991; Bruggemann et al. 1993). Sh was the first potassium channel identified and allowed the biochemical purification and molecular characterization of vertebrate potassium channels (Tempel et al. 1988). Cloning and sequencing of eag led to the identification of another family of potassium channels, which were subsequently linked to long QT syndrome (Curran et al. 1995). Potassium channels have now been implicated in numerous human diseases (Jentsch 2000).
The discovery of innate immunity in insects followed by the molecular dissection of how flies respond to bacterial and fungal infections led to seminal discoveries on the role of Toll receptor and its downstream signaling cascade in innate immunity (Lemaitre et al. 1996). Subsequent identification of mammalian Toll-like receptors (TLRs) and mechanistic studies revealed that the basic molecular mechanism of innate immunity is evolutionarily conserved (Medzhitov et al. 1997; Poltorak et al. 1998; Hoshino et al. 1999). In addition to playing pivotal roles in responses against microbial and viral infections, autoimmune diseases, and allergy (Montero Vega and de Andres Martin 2009), TLRs have also been found to play multiple roles in tumorigenesis (Pradere et al. 2014).
A Comparison of Drosophila and Human Genomes
The past track record has shown that studying basic principles of biology is a very successful approach and that no translational justifications were necessary for many decades to support research to establish the basic mechanisms underlying numerous fundamental biological processes. However, opportunities for translation that did not previously exist are now available. In order to consider these approaches, we can look at the fly genome as containing two types of genes: those with sequence homology with human genes and those that have no obvious human homologs. Of course, many of the fly genes that have no obvious human homologs are conserved in other phyla and the number of truly specific Drosophila melanogaster genes is very limited (Zhang et al. 2007).
Essential genes in Drosophila and their human homologs
The homology between fly and human genes can vary widely and many fly genes have more than one human homolog. The fly genome contains approximately 16,000 genes, ∼13,000 of which encode proteins (Adams et al. 2000; Dos Santos et al. 2014). Of the latter, more than 60% have human homologs and can be subdivided based on whether the fly homolog has multiple or single human homologs. We note that more complex evolutionary relationships also exist: multiple fly genes sometimes have a single human homolog, or multiple fly genes are homologous to multiple human genes. Among the ∼8000 fly genes with human homologs, ∼3500 have multiple human homologs while ∼4500 have only one (Eyre et al. 2007; HCOP 2014).
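The subdivision described above can be sketched as a small classification step. This is only an illustration: the function name, the gene identifiers, and the toy homolog table are hypothetical placeholders, not entries from HCOP or FlyBase, and the totals at the end are the approximate figures quoted in the text.

```python
from collections import Counter


def classify_by_homolog_count(homologs):
    """Label each fly gene 'none', 'single', or 'multiple' by its number of human homologs."""
    labels = {}
    for fly_gene, human_genes in homologs.items():
        n = len(set(human_genes))  # ignore duplicate entries for the same human gene
        if n == 0:
            labels[fly_gene] = "none"
        elif n == 1:
            labels[fly_gene] = "single"
        else:
            labels[fly_gene] = "multiple"
    return labels


# Hypothetical placeholder identifiers, not real genes.
toy_table = {
    "flyGeneA": ["HUMAN_A1", "HUMAN_A2"],
    "flyGeneB": ["HUMAN_B1"],
    "flyGeneC": [],
}
labels = classify_by_homolog_count(toy_table)
counts = Counter(labels.values())

# Consistency check on the approximate totals quoted in the text.
protein_coding = 13_000
multiple, single = 3_500, 4_500
conserved = multiple + single          # ~8000 fly genes with human homologs
fraction = conserved / protein_coding  # a little over 60% of protein-coding genes
print(counts, conserved, round(fraction, 3))
```

The sketch deliberately ignores the more complex many-to-one and many-to-many relationships mentioned above, which would need the table to be keyed on both species.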
A further subdivision of the fly genes can be made on the basis of their essential vs. nonessential nature. Although studies have estimated that there are about 5000 essential genes in Drosophila (Benos et al. 2001), we can only find evidence for about 2000 essential genes based on the current FlyBase information (FlyBase 2014; St Pierre et al. 2014). We think this discrepancy is unlikely to reflect an inaccurate estimate, as several independent estimates based on very different datasets came to similar conclusions (Jurgens et al. 1984; Nusslein-Volhard et al. 1984; Wieschaus et al. 1984; Benos et al. 2001; Ashburner et al. 2005). The observation that only ∼2000 essential genes are annotated in FlyBase suggests that the identity and the functions of the majority of essential genes remain to be elucidated, and hence, that a large body of work remains to be done before we will have a functional annotation of most essential fly genes. This task will be even more challenging for the remaining nonessential genes encoding proteins that are often poorly functionally annotated.
Lessons from an X-chromosome screen
We recently created a large collection of X-chromosome lethal mutations in a forward genetic mosaic screen designed to isolate genes that affect many developmental and neurodegenerative processes (Haelterman et al. 2014; Yamamoto et al. 2014a). The data suggest a number of interesting relationships between essential fly genes and their human homologs. More than 90% of the essential genes that were isolated in this screen have human homologs, a significant enrichment as only ∼60% of protein coding fly genes can be associated with obvious human homologs (HCOP 2014). The data indicate that essential genes are more likely to be conserved and that this relationship can be extrapolated from our data to the whole genome. Indeed, 20% of Drosophila genes with obvious homology to human genes are characterized as being essential in flies, while only 4% of the nonconserved genes cause lethality when mutated. This 20% is likely to be too low an estimate as we predict that only 40% of the essential genes have been annotated. If the relationship between lethality and evolutionary conservation that we observe holds true for the rest of the genome that is yet to be explored, we predict that more than 50% of the fly genes that are conserved between fly and human will be essential (Yamamoto et al. 2014a).
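The extrapolation in this paragraph can be reproduced with back-of-the-envelope arithmetic. The sketch below is one reading of the reasoning, not the authors' actual calculation: it assumes the essential genes still to be annotated split between conserved and nonconserved genes in the same proportions as those already known, and it uses the approximate totals quoted in the text (∼2000 annotated and ∼5000 estimated essential genes, ∼8000 conserved genes). All variable names are illustrative.

```python
# Approximate figures quoted in the text.
annotated_essential = 2000   # essential genes with evidence in FlyBase
estimated_essential = 5000   # independent estimates of all essential genes
conserved_genes = 8000       # fly genes with obvious human homologs
observed_fraction = 0.20     # conserved genes currently annotated as essential

# Route 1: scale the observed fraction by how complete the annotation is.
completeness = annotated_essential / estimated_essential  # 0.4, i.e. the "40%" in the text
scaled = observed_fraction / completeness

# Route 2: assume at least 90% of all essential genes are conserved, as in the screen.
conserved_share_of_essential = 0.90
from_screen = conserved_share_of_essential * estimated_essential / conserved_genes

print(f"scaled by annotation completeness: {scaled:.0%}")
print(f"from screen conservation rate:     {from_screen:.0%}")
```

Both routes land at or just above one half, which is consistent with the prediction that more than 50% of conserved fly genes will prove to be essential.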
Another feature that emerged from the X-chromosome screen is that we observed a dramatic enrichment of Mendelian disease genes in the Online Mendelian Inheritance in Man (2014) (OMIM) database for homologs of essential fly genes that have more than a single homolog in humans. These genes are nearly eight times more likely to be linked to human disease than essential fly genes that have a single human homolog (Yamamoto et al. 2014a). This enrichment is still three-fold if one analyzes the total number of diseases caused per human gene in this category. Hence, genes that are essential for viability in Drosophila and have several human homologs are much more likely to cause Mendelian diseases. The simplest hypothesis is that when a single essential fly gene has multiple copies in humans, the homologs are likely to have partially redundant functions and are therefore more likely not to be lethal but to cause human disease upon loss of function (Figure 2). A better functional annotation of the Drosophila genome, including the knowledge of which genes are essential, would therefore already provide very valuable information to better annotate the human genome.
These findings are to some extent supported by studies focusing on vertebrate genomes in which differences have been noted between genes resulting from whole-genome duplication events vs. single gene duplication events (Makino and McLysaght 2010; Dickerson and Robertson 2012; Singh et al. 2012, 2014; McLysaght et al. 2014). Indeed, genes resulting from whole-genome duplications are overrepresented in pathogenic vs. nonpathogenic copy number variants in humans (McLysaght et al. 2014). Therefore, whether an essential fly gene has multiple or single human homologs is important for its potential role in disease.
Needless to say, human homologs of a number of fly genes that are not essential for viability are also linked to Mendelian diseases. For example, null alleles of homologs of genes that cause familial forms of Alzheimer’s disease (Appl; Luo et al. 1992), Parkinson’s disease (parkin; Pesah et al. 2004 and Lrrk; Lee et al. 2007), Duchenne muscular dystrophy (Dystrophin; Christoforou et al. 2008), Torsion dystonia (dTorsin; Wakabayashi-Ito et al. 2011), Usher syndrome (Cad99C; D’Alterio et al. 2005 and Sans; Demontis and Dahmann 2009), and Zellweger syndrome (pex2, pex10, and pex16; Chen et al. 2010; Nakayama et al. 2011) do not cause lethality in Drosophila. Instead, these mutants exhibit behavior defects, subtle morphological phenotypes, shortened lifespan, and/or reduction in fertility. In some cases the phenotypes are very subtle or can only be seen when the animals are under stress, as seen in mutants in genes whose human homologs are mutated in Batten’s disease (cln3, cathD, and Ppt1; Myllykangas et al. 2005; Hickey et al. 2006; Tuxworth et al. 2011). However, it is currently difficult to estimate the proportion of genes in the Drosophila genome that are nonessential but are related to Mendelian disorders (FlyBase 2014; St Pierre et al. 2014).
Exploring Drosophila Genes That Lack Obvious Human Homologs
For nearly 80 years (1915–1995) Drosophila biologists studied phenomena, biological processes, and genes whose relevance for human biology and translational value had been questioned by some, yet numerous discoveries in flies have been shown to be highly relevant to human biology. One may argue that there is no “translational” value in studying Drosophila genes that are not conserved in human. Here we will provide a few examples where studies of apparent “nonconserved genes” made significant contributions to our understanding of human biology as well as promoting human health.
Revealing hidden homologies through experimental approaches
Identification of mutants that suppress or enhance apoptotic cell death in vivo laid out a pathway that regulates apoptosis in Drosophila (Abrams 1999). The key players and pathways leading to apoptosis were first discovered in Caenorhabditis elegans (Ellis and Horvitz 1986), and this pathway was also not known to be conserved when it was discovered (Kornbluth and White 2005). The lack of obvious homologs of some of the key early Drosophila proapoptotic genes head involution defective (hid), reaper, and grim in the mammalian genome was used as evidence for evolutionary diversity. However, following a pioneering study that showed that exogenous expression of Hid in mammalian cells can induce apoptosis (Haining et al. 1999), investigators began to consider that the pathway may be evolutionarily conserved and carried out in-depth investigations. From these studies, mammalian homologs of hid, reaper, and grim were later identified and shown to have only a few amino acids conserved at their N termini (Du et al. 2000; Verhagen et al. 2000). Hence, the study of nonconserved genes has the potential to uncover hidden evolutionary conservation that bioinformatics is not able to pick up.
Exploring concepts that are shared between flies and human
Drosophila research has played a pivotal role in the molecular understanding of how the human nervous system is wired (Leyssen and Hassan 2007). Although many players involved in axon guidance and synapse formation are conserved between flies and human (Dickson 2002), not all conserved players play the same role in vivo. For example, Dscam1 (Down Syndrome Cell Adhesion Molecule 1) is a transmembrane receptor that is required for branching of axons and dendrites in the fly nervous system (Hattori et al. 2008). In contrast to vertebrate DSCAMs and other Dscam paralogs (Dscam2-4) in Drosophila, Dscam1 possesses numerous alternatively spliced exons providing the potential to generate >19,000 different isoforms that contain slightly different extracellular domains (Schmucker et al. 2000). Dscam1 forms homophilic interactions in an isoform-specific manner to produce repulsive signals, allowing neurites expressing specific Dscam1 isoforms to recognize and distinguish different neurites (Wojtowicz et al. 2004, 2007).
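The isoform count follows from simple combinatorics: one alternative exon is chosen independently from each variable cluster. The sketch below uses the cluster sizes commonly cited from Schmucker et al. (2000), which are not given in the text above and are treated here as assumed inputs: 12, 48, and 33 alternatives for the three extracellular clusters and 2 for the transmembrane cluster.

```python
from math import prod

# Alternative exon counts per variable cluster (assumed inputs, see lead-in).
ectodomain_clusters = {"exon 4": 12, "exon 6": 48, "exon 9": 33}
transmembrane_cluster = {"exon 17": 2}

# One exon is picked from each cluster, so the counts multiply.
ectodomains = prod(ectodomain_clusters.values())
total_isoforms = ectodomains * prod(transmembrane_cluster.values())

print(f"distinct extracellular domains: {ectodomains:,}")
print(f"total potential isoforms:       {total_isoforms:,}")
```

Under these inputs the extracellular combinations alone account for the figure of more than 19,000 isoforms with different extracellular domains.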
Although vertebrates have two DSCAM genes that are also implicated in axon guidance and synapse formation (Fuerst et al. 2008; Ly et al. 2008; Yamagata and Sanes 2008), they have many fewer alternative isoforms and do not provide the vast neuronal diversity that Dscam1 provides in Drosophila. Interestingly, however, in mammals a different class of transmembrane receptors called Protocadherins (Pcdhs) has the potential to generate a large number (>12,000) of different isoforms that provide neuronal specificity (Chen and Maniatis 2013). Although the mechanisms of generating diverse isoforms are different between Dscam1 and Pcdhs (alternative RNA splicing vs. alternative promoter usage and formation of heteromultimers), many parallels can be drawn between the two examples. Indeed, the research on Dscam1 and Pcdhs has been influencing each other, generating a synergistic effect to facilitate the molecular understanding of neuronal connectivity (Zipursky and Sanes 2010). Discovery of biological concepts arising from studies of different gene sets and mechanisms is not limited to this example but can also be found in the olfactory system (Fuss and Ray 2009). Therefore, detailed mechanistic study of a biological process in one species often provides a valuable framework to study how a similar process works in another species, even if the genes and mechanisms that are involved are not conserved.
Promoting public health by studying nonconserved genes and processes
Vector-borne diseases are some of the greatest threats to human health. Mosquitos are the deadliest animals on earth as they are the vectors for numerous prevalent infectious diseases including West Nile virus, yellow fever, dengue fever, and malaria (Hill et al. 2005). Drosophila biology, genetics, and technology development play an important role in developing strategies for control of mosquito populations as much of our genetic and molecular knowledge about insects stems from research in flies. For example, one of the first lines of defense against vector-borne diseases is insecticides. Many of these chemicals act on channels, receptors, and enzymes in the insect nervous system, some of which have been well studied in Drosophila (Hemingway et al. 2004; Raymond-Delpech et al. 2005). Recently, molecular mechanisms underlying insecticide resistance have also been identified using Drosophila populations (Mateo et al. 2014) or by generating transgenic fly strains that produce resistance genes from other species such as mosquitos (Riveron et al. 2013). Therefore, studies of insect nervous system genes, regardless of whether they are conserved or not, will provide additional functional insights and allow development of a list of potential targets for new insecticides. Some insecticides act on targets that are critical for the development of insects but are dispensable or absent in mammals, livestock, and wildlife. These include compounds such as juvenile hormone analogs and chitin synthesis inhibitors, which target processes that have been well studied in Drosophila (De Loof 2008; Moussian 2012). Considering that some fraction of nonconserved genes are also essential for viability in Drosophila (Yamamoto et al. 2014a) and that specific nonessential genes are often involved in fitness (Zhang et al. 2007), studies of Drosophila genes that do not have obvious direct human homologs remain important.
Studies on biological phenomena and processes that are specific to insects can also offer new ideas and tools to fight vector-borne diseases. Wolbachia are species of bacteria that infect insects and other species (Serbus et al. 2008). Wolbachia are vertically transmitted and can affect the fitness of the infected animals in different ways. Although the first discovery of Wolbachia was in mosquitos (Hertig and Wolbach 1924), research in Drosophila facilitated the study of these microorganisms and revealed how they affect the reproduction and lifespan of the host (Serbus et al. 2008). More recently, a unique Wolbachia strain from Drosophila has been introduced into Aedes aegypti, a species of mosquito that transmits dengue fever, yellow fever, and chikungunya (McMeniman et al. 2009; Moreira et al. 2009; Walker et al. 2011). This Wolbachia strain, wMel, has the ability to rapidly spread in a population of mosquitos, and can also block the transmission of dengue virus. Indeed, mosquitos infected with wMel have been released at two field locations in Australia and successfully invaded the endogenous population within a few months (Hoffmann et al. 2011). While this study is ongoing to monitor the actual control of dengue fever in the area, it provides an excellent example of a strategy to “translate” studies in Drosophila to improve public health.
Finally, genome manipulation technology in the Drosophila field has provided ways to manipulate the genome of other insect species (Fraser 2012). The molecular methods employed in mosquitos allow for transgenic vectors that will control infection spread (Coutinho-Abreu et al. 2010). For example, transgenic Anopheles mosquitos with greater resistance to the Plasmodium malaria pathogen have been generated (Dong et al. 2011). Hence, technology development in Drosophila will continue to provide useful tools for researchers to study and manipulate disease vector genomes in other insect species.
Human Functional Genomics Through Drosophila Biology
Medical advancement has been greatly aided by the study of rare diseases, and better knowledge of gene function is needed in rare disease studies (IOM 2010). Many Drosophila researchers outside of medical school settings may not be as familiar with the growing need for gene function studies created by the explosion in the use of next-generation sequencing in clinical and human genetics research. Here, we will expand on this need.
Human disease variant discovery outpaces functional exploration of genes
Targeted capture of exons followed by sequencing was initially shown to be a feasible strategy for identifying genes for Mendelian disease in 2009 (Ng et al. 2009). A subsequent flood of gene identification studies ensued (Ng et al. 2010a,b; Bamshad et al. 2011; Gonzaga-Jauregui et al. 2012), and examples of utility for immediate medical treatment in some cases followed (Bainbridge et al. 2011; Mayer et al. 2011; Worthey et al. 2011). Whole-exome sequencing (WES) has subsequently proven to be a successful clinical test for many Mendelian diseases (Yang et al. 2013). In addition, WES became the technological basis for a large effort to solve previously unsolved Mendelian disorders by the Centers for Mendelian Genomics (CMG) (Bamshad et al. 2012; Centers for Mendelian Genomics 2014), and to provide diagnosis for patients presenting to the Undiagnosed Disease Network (UDN), an expanding effort to provide a diagnosis for patients with rare diseases (St Hilaire et al. 2011; Adams et al. 2014; UDN 2014). Although these successes highlight the 50% of cases that can now be solved based on human genetics evidence, the remaining 50% of patients with Mendelian disease remain without a diagnosis (Yang et al. 2013). These remaining 50% are likely more difficult to solve, representing exceedingly rare or variable phenotypes. The greatest limitation is a result of the huge number of rare and personal variants within each human genome (Coventry et al. 2010). These rare and personal alleles cannot be ignored as they may have the most dramatic functional and phenotypic effects (Lupski et al. 2011). Therefore, analysis of personal genomes is full of uncertainty as each individual has hundreds of variants of potential interest. Several statistical tools have been employed to prioritize genes and variants (Liu et al. 2011; Petrovski et al. 2013).
Accordingly, understanding rare Mendelian phenotypes relies on extensive functional studies, and the value of model organisms is on the rise (Karaca et al. 2014; Santos-Cortez et al. 2014; Schaffer et al. 2014).
In addition, complex trait genetics is arguably in greater need of better functional annotation. The success of sequencing in rare disease is less clear for complex traits where many loci are of interest (Dewey et al. 2014). Since the publication of the first successful study in 2005 (Klein et al. 2005), the catalog of published genome-wide association studies (GWAS) has steadily grown (Welter et al. 2014). This catalog of manually curated high-quality, replicated studies represents the most reproducible and significant group of associations between variant SNPs and complex human traits (NHGRI 2014). Nevertheless, this catalog provides only associations between SNPs in genomic regions and complex traits. Determining which gene functions are related to those traits requires biological study. Indeed, for complex traits, model organism research can provide strong links between the genes and the traits of interest (Freeman et al. 2012; den Hoed et al. 2013; MacLeod et al. 2013; Shulman et al. 2014).
The information currently available for many human genes is limited. It might include known links and association with diseases, expression data of its transcripts, known or predicted protein–protein interactions, and homology information (HUGO Gene Nomenclature Committee 2014; Online Mendelian Inheritance in Man 2014). One approach to fill the gap used in cancer studies is to integrate information from the transcriptome, proteome, and interactome to identify a group of genes and proteins that functions together (Brennan et al. 2013). Another is to work toward improved bioinformatics platforms using clustering algorithms or machine learning to find similarities between known and unknown variables for genes (Park et al. 2013). However, for human geneticists and clinicians, the idea that each patient is unique suggests that in reality our understanding of each and every gene will have to be individually improved, ideally with in vivo functional studies. The “-omics” technologies and bioinformatics frameworks will be invaluable to classify and categorize, but ultimately run the risk of “low input, high throughput, no output science” (Brenner 2008). The work of many in the Drosophila field is complementary to these efforts. Many Drosophila researchers perform in-depth analyses of the genes and gene families that are involved in a biological process that they are interested in. Although this strategy often provides high-quality data that can be applied to diverse other biological settings, generating all the reagents necessary for such analysis and characterizing the phenotype are very time consuming. In order to provide the level of detail and attention to mechanisms needed for a more actionable gene annotation that has significant clinical value, one will clearly need a combination of approaches.
How to approach and fill the gap?
In functionally annotating genes, there is no high-throughput substitute for studying gene function in model organisms. First, individual labs proceeding gene by gene provide the most in-depth information about gene function in vivo. Second, discoveries can be made on a larger list of genes by making use, for example, of forward genetics in which a key phenotype or biological process is studied exhaustively in the model. These efforts can be undertaken for a fraction of the cost of similar studies in mice, and this is a key advantage for using Drosophila for functional genomics. Indeed, forward genetic screens have clear advantages and have already successfully identified Mendelian loci (Bayat et al. 2012; Yamamoto et al. 2014a) and associations for complex traits (Neely et al. 2010a,b). Selection-based screens in Drosophila have also facilitated the validation of whole-genome sequencing (WGS) findings, as demonstrated by analyzing hypoxia tolerance loci in a population of Ethiopians by integrating genotype–phenotypic information in Drosophila (Zhou et al. 2011; Udpa et al. 2014).
What was not previously possible was to rapidly translate forward genetics in model organisms to human genetics. The examples noted above for developmental pathways and genes for TRP and potassium channels took years of many scientists and clinicians working independently to eventually tie mutants with dramatic and interesting phenotypes in flies to Mendelian disorders and cancer. After the isolation of the fly mutants, years of work are required to analyze the functions of these genes and perform biochemical and structural analyses. Identification of the homologs as disease genes required a completely independent process of clinical phenotyping, sample collection, linkage analyses, and identification of mutations in disease.
In Figure 3 we propose a general approach to the gap in knowledge. For example, one approach is to start with many mutations identified in a forward genetic screen (Xiong et al. 2012; Yamamoto et al. 2012; Zhang et al. 2013; Charng et al. 2014) and to map these mutants with new tools publicly available in fly (Zhai et al. 2003; Cook et al. 2010b). The causative mutations can now be efficiently identified with WGS (Haelterman et al. 2014) and rescued with genomic BACs (Venken et al. 2009, 2010) to verify the specificity. Then the human homologs can be identified by their sequence homology and studied in thousands of human exomes (Bamshad et al. 2012; Yamamoto et al. 2014a). This approach can be performed even more exhaustively by taking into account the genes in entire biological pathways and processes based on the literature (FlyBase 2014; St Pierre et al. 2014).
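The last step of this approach, going from verified fly hits to human genes worth examining in exome data, can be sketched as a lookup followed by an intersection. Everything below is a hypothetical illustration: the function, the gene identifiers, and the toy inputs are placeholders, and a real analysis would draw the homolog table from a resource such as HCOP and the variant list from a sequenced cohort.

```python
def candidate_human_genes(screen_hits, fly_to_human, cohort_variant_genes):
    """Map fly screen hits to human homologs that carry rare variants in a cohort.

    Returns {fly_gene: sorted list of human homologs found in the cohort set}.
    Fly genes without a homolog, or without an overlapping homolog, are omitted.
    """
    candidates = {}
    for fly_gene in screen_hits:
        overlap = set(fly_to_human.get(fly_gene, ())) & cohort_variant_genes
        if overlap:
            candidates[fly_gene] = sorted(overlap)
    return candidates


# Hypothetical placeholder identifiers, not real genes or patient data.
screen_hits = ["flyGeneA", "flyGeneB", "flyGeneC"]
fly_to_human = {
    "flyGeneA": ["HUMAN_A1", "HUMAN_A2"],
    "flyGeneB": ["HUMAN_B1"],
    # flyGeneC has no obvious human homolog
}
cohort_variant_genes = {"HUMAN_A2", "HUMAN_X9"}

result = candidate_human_genes(screen_hits, fly_to_human, cohort_variant_genes)
print(result)
```

The output is only a list of candidates; the mapping, rescue, and functional steps described above are what establish that a given fly mutation is causative in the first place.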
Two important points about exploring the human correlate from Drosophila screens should be noted. First, the most in-depth analyses may require more collaboration between fly biologists, human geneticists, and clinicians, since communication among people with complementary skills will dramatically increase the efficiency and speed of such research. As a starting point, there are a number of publicly available resources and databases of human primary data that can be mined by fly researchers. In Table 1 we list some of the most user-friendly databases. Use of these databases and other websites can facilitate collaborations with the human geneticists and clinicians who have access to the patients. Second, the human phenotypes of interest in genetic diseases may bear little resemblance to the mutant phenotype of the homologous fly gene (e.g., wing defects in flies and aortic abnormalities in humans, both linked to defective Notch signaling), but forward genetic screens are nonetheless valuable paths to improve functional genomics. By using the concept of phenologs (McGary et al. 2010), one can then generate a list of human diseases that may be caused by defects in genes or pathways of interest and explore their related fly phenotypes. Reverse genetic approaches are also valuable and can add to the knowledge of fly and human phenotypes and gene function (Figure 3). Clearly, while specific strategies may vary, Drosophila studies are an important resource for functional genomics at multiple levels.
Combining state-of-the-art technologies to facilitate functional genomics
To systematically study the role of fly genes with human homologs and to propel functional annotation of conserved genes in vivo, we propose to combine a set of tools developed in Drosophila. Of the 8000 genes that are conserved, we will prioritize the 3500 genes with multiple human homologs, as they are much more likely to contribute to human genetic diseases, and generate publicly available resources that will facilitate the characterization of these genes. We propose to do this in a targeted fashion by using the CRISPR/Cas9 system (Bassett et al. 2013; Gratz et al. 2013; Kondo and Ueda 2013; Yu et al. 2013) to integrate MiMIC (Minos-mediated integration cassette)-like insertions in coding introns of genes. MiMIC is a transposable element that inserts almost at random in the genome (Venken et al. 2011a), and ∼2600 MiMICs inserted in introns are already available through the Drosophila Genome Disruption Project (GDP) (Bellen et al. 2011; Genome Disruption Project 2014). These insertions function as gene traps that terminate the transcript when inserted in the proper orientation, and all of them (including those inserted in the opposite orientation) can be converted to artificial exons that encode a marker such as EGFP with flexible linkers (Venken et al. 2011a). Most of these intronically tagged proteins can be used for numerous assays, such as immunoelectron microscopy and co-immunoprecipitation using reliable commercial antibodies, as well as for live imaging of proteins expressed under the control of endogenous promoters and enhancers. The tagged genes or proteins can also be inactivated in a time- and tissue-specific fashion using RNAi against GFP (iGFPi) (Neumuller et al. 2012) or a ubiquitin–proteasome system-mediated protein degradation system (deGradFP) (Caussinus et al. 2012), placed under the control of UAS and expressed using specific GAL4 drivers in almost any tissue of choice (Brand and Perrimon 1993).
This should permit one to assess the function of thousands of tagged genes, and the effects of their knockdown, in any tissue of interest, an important advantage since most genes are pleiotropic. We anticipate that these libraries of MiMIC-tagged genes will permit elegant, precise functional annotations of conserved genes in vivo and be very cost-effective when compared to many other strategies. Moreover, these strains will be publicly available through the Bloomington Drosophila Stock Center (Cook et al. 2010a). Therefore, in a short time a significant fraction of fly genes will be immediately accessible for detailed phenotypic approaches.
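The prioritization step proposed above, selecting conserved fly genes that have multiple human homologs, could be sketched roughly as follows. This is a minimal illustration in Python and not the pipeline used for the project: the function name, the gene identifiers, and the shape of the input mapping are all hypothetical, and a real homolog table would have to come from an orthology resource.

```python
from typing import Dict, List


def prioritize_by_homolog_count(
    homologs: Dict[str, List[str]], min_homologs: int = 2
) -> List[str]:
    """Return fly genes with at least `min_homologs` distinct human homologs.

    Genes are ordered by decreasing homolog count, then alphabetically.
    """
    # Count distinct homologs so duplicated entries do not inflate a gene's rank.
    counts = {gene: len(set(human)) for gene, human in homologs.items()}
    selected = [gene for gene, n in counts.items() if n >= min_homologs]
    return sorted(selected, key=lambda gene: (-counts[gene], gene))


# Hypothetical placeholder identifiers, for illustration only.
example = {
    "flyGeneA": ["HUMAN_A1", "HUMAN_A2", "HUMAN_A3"],
    "flyGeneB": ["HUMAN_B1"],
    "flyGeneC": ["HUMAN_C1", "HUMAN_C2"],
    "flyGeneD": [],
}

priority = prioritize_by_homolog_count(example)
print(priority)
```

Counting distinct homologs rather than raw list entries is a design choice of this sketch; the threshold of two simply encodes "multiple" and could be adjusted.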
To demonstrate the relevance of the functional annotation of the genes being studied in Drosophila, we recommend that fly researchers attempt to rescue the mutant phenotype with human complementary DNAs (cDNAs) that are expressed under the control of GAL4-, LexA-, or QF-specific drivers (Venken et al. 2011b). Although this strategy is not always productive, we and others find that rescue is at least partially effective for the majority of fly–human gene pairs tested. In our hands, more than 85% of the fly mutations tested can be rescued with ubiquitous expression of a human cDNA. In some cases, when the expression levels of the human cDNA are low, codon optimization can be beneficial. Bioinformatics tools that determine whether the human cDNAs have codons that are rarely used in Drosophila can be used to optimize the codon usage prior to initiating the experiment (Wu et al. 2007). These experiments are relatively easy to perform and provide compelling evidence that the information acquired in flies is relevant to human biology. For some genes, replacing the coding region of a fly gene with the human sequence using knock-in technologies may be a more attractive way to demonstrate functional conservation. Overall, many technological platforms can be employed, and the flexibility of manipulating the Drosophila genome for disease studies is an obvious strength for the field.
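The kind of check mentioned above, asking whether a human cDNA contains codons that are rarely used in Drosophila, can be illustrated with a short Python sketch. It is not the tool cited in the text. The function name and the threshold are hypothetical, and the usage values below are made-up placeholders rather than measured Drosophila codon frequencies; a real analysis would load a published codon usage table.

```python
from typing import Dict, List, Tuple


def find_rare_codons(
    cds: str, usage: Dict[str, float], threshold: float
) -> List[Tuple[int, str]]:
    """Return (codon_index, codon) for every codon whose usage is below `threshold`.

    `usage` maps each codon to a relative usage value in the host organism.
    """
    seq = cds.upper().replace("U", "T")  # accept RNA or DNA input
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    rare = []
    for start in range(0, len(seq), 3):
        codon = seq[start:start + 3]
        if codon not in usage:
            raise KeyError(f"no usage value for codon {codon!r}")
        if usage[codon] < threshold:
            rare.append((start // 3, codon))
    return rare


# Placeholder usage values for illustration only (NOT real Drosophila data).
toy_usage = {"ATG": 1.0, "CTA": 0.05, "CTG": 0.5, "GGA": 0.3, "TAA": 0.4}

# Toy coding sequence: ATG CTA CTG GGA TAA
toy_cds = "ATGCTACTGGGATAA"

flagged = find_rare_codons(toy_cds, toy_usage, threshold=0.1)
print(flagged)
```

Codons flagged in this way would be candidates for synonymous replacement before the rescue construct is made; how to choose the replacement codon is left out of this sketch.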
In summary, parallel and integrative studies of genotype and phenotype in flies and humans have tremendous potential for providing positive feedback that will benefit both fields (Figure 3). By utilizing numerous tools to manipulate the genome and performing diverse experiments in vivo that are not feasible in other model organisms, Drosophila research will continue to provide, in a very cost-effective manner, valuable functional information about thousands of genes and many biological processes that are evolutionarily conserved. In addition, deeper understanding of insect biology using Drosophila will provide us with better tools and strategies to decrease the threat of vector-borne diseases. Technological advances will allow larger collaborations and team efforts, while public resources and databases allow individual researchers to answer questions of a broader scope. Therefore, we argue that there has never been a better time to use fly genetics for biomedical research.
The NIH should increase its support for these efforts because Drosophila is ideally suited as a model organism for the studies required to annotate the genome. In this postgenomic era, the amount of functional information that can be derived for a fraction of the cost of using vertebrate models should allow fly researchers to play an important role. In addition, new technologies in Drosophila and new databases providing access to human genotype–phenotype data make it increasingly possible for Drosophila studies to fill the gap in annotating the function of thousands of genes that are involved in human diseases. The current constraints can be viewed as an opportunity to improve the “fitness” of Drosophila as a model organism for functional genomics, considering the numerous advantages our field has to offer. Researchers can look to technology development, use of public databases, and collaboration as ways to work around funding constraints. There is no reason not to choose Drosophila as the “model organism of choice.”
We apologize to our colleagues for any citation omissions due to length restrictions. We thank Drs. Larry Zipursky, Hamed Jafar-Nejad, Joshua M. Schulman, and the reviewers for valuable feedback on this manuscript. We thank Dr. Karen L. Schulze for critical reading and editing of this manuscript. M.F.W. received support from the National Institute of Neurological Disorders and Stroke (NS076547), and the Simmons Family Foundation. S.Y. is a fellow of the Jan and Dan Duncan Neurological Research Institute at Texas Children’s Hospital. H.J.B. is a Howard Hughes Medical Institute investigator and is supported by funds from the National Institute of General Medical Sciences (2R01GM067858, 1RC4GM096355), the Robert and Renee Belfer Family Foundation, the Huffington Foundation, and Target ALS.
Communicating editor: A. S. Wilkins
- Received October 24, 2014.
- Accepted December 9, 2014.
- Copyright © 2015 by the Genetics Society of America