Abstract
Drosophila has long been a premier model for the development and application of cutting-edge genetic approaches. The CRISPR-Cas system now adds the ability to manipulate the genome with ease and precision, providing a rich toolbox to interrogate relationships between genotype and phenotype, to delineate and visualize how the genome is organized, to illuminate and manipulate RNA, and to pioneer new gene drive technologies. Myriad transformative approaches have already originated from the CRISPR-Cas system, which will likely continue to spark the creation of tools with diverse applications. Here, we provide an overview of how CRISPR-Cas gene editing has revolutionized genetic analysis in Drosophila and highlight key areas for future advances.
Through more than a century of study and extensive development of genetic tools, Drosophila melanogaster has become a premier system for understanding complex biological processes at the molecular, cellular, and organismal levels. Despite this strength, until recently, making precise modifications to the genome was challenging and labor intensive, with a low frequency of success. The first successful genome-editing strategies induced homologous recombination through P-element excision, or the in vivo generation of linear templates using multiple transgenic constructs (Gloor et al. 1991; Banga and Boyd 1992; Rong and Golic 2000; Gong and Golic 2003; Huang et al. 2008, 2009). More recent strategies have relied on the generation of a targeted double-strand break (DSB) in the genome to trigger DNA repair by the cellular repair machinery, a process that can be co-opted to precisely modify genomic sequences. Targeted DSBs were first generated by programmable nucleases, either zinc-finger nucleases (ZFNs; Bibikova et al. 2002) or transcription activator-like effector nucleases (TALENs; Liu et al. 2012). The more recent co-option of highly programmable bacterial adaptive immune systems for generating targeted DSBs has resulted in an unprecedented level of control of the genome of nearly any organism (Harrison et al. 2014). With this simple, two-component, clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system, the researcher need only provide a small targeting RNA, which is easily synthesized, and the bacterial nuclease. Because all of these genome-editing strategies rely upon the cellular DNA-repair machinery, lessons learned from earlier approaches have driven the rapid advance of CRISPR-based approaches.
CRISPR-Cas Systems in Bacterial Immunity
The first CRISPR locus was identified in 1987 based on its highly repetitive sequence (Ishino et al. 1987), but it took nearly 20 years until a definitive link was made between these repeats and a role in adaptive immunity (Barrangou et al. 2007). Impressively, it was only 6 years after this seminal discovery that multiple groups published successful co-option of the system for genome editing (Cong et al. 2013; Jinek et al. 2013; Mali et al. 2013b). In the short time since the original demonstrations of genome editing using the bacterial CRISPR-Cas system, it has rapidly become a nearly universal tool in biological research. The continued development of this system builds on understanding its fundamental role in bacterial and archaeal immunity.
The CRISPR locus is an array of alternating repeats and spacer sequences that essentially provides a chronological history of the viruses and plasmids that have invaded a given bacterial strain (Barrangou et al. 2007). CRISPR-based immunity involves three steps that are all critical for protecting the bacteria from invading viruses: adaptation, expression, and interference (van der Oost et al. 2009; Makarova et al. 2011). In the adaptation step, fragments of foreign DNA are incorporated into the CRISPR locus as new spacers. During expression, the transcription and subsequent processing of the CRISPR locus provides an RNA template for recognition of complementary protospacer sequences in the invading DNA. In the final interference step, the invading DNA is cleaved and inactivated by an RNA-guided Cas effector protein. To allow the immune system to distinguish invader DNA from genomic DNA incorporated in the CRISPR array, stable binding by the effector protein and subsequent DNA cleavage requires a protospacer adjacent motif (PAM) present only in the targeted DNA.
The constantly active arms race between virus and host has resulted in a large amount of variation in CRISPR-Cas systems, which are present in most archaea and about half of all bacteria and currently divided into two classes, six types, and numerous subtypes based on the complement of Cas genes associated with the CRISPR locus (Makarova et al. 2011, 2015; Shmakov et al. 2015). Class 1 systems utilize a multi-protein effector complex to cleave invading DNA. By contrast, class 2 systems require only a single effector protein, and are therefore advantageous for genome engineering when compared to the multi-protein class 1 effector systems. The Cas effector protein most widely used for genome editing is Cas9. More recently, the single effector Cpf1 has been demonstrated to be an effective tool for genome engineering (Zetsche et al. 2015a). Species-specific features within a given type of CRISPR-Cas system have also been used to diversify genome-engineering capabilities. Species-specific variations of particular note for genome-editing purposes include the recognition of different PAM sequences, which expands the genomic sequences available for targeting, and the size of the nuclease, with smaller variants being more easily delivered into cells (Gasiunas et al. 2012; Cong et al. 2013; Esvelt et al. 2013; Hou et al. 2013; Kleinstiver et al. 2015; Ran et al. 2015; Zetsche et al. 2015a). Thus, the rapid expansion in the utility of this system for biological research and the therapeutic promise it holds is built on a foundation of research into fundamental biological properties. The continued studies of the varied mechanisms by which the CRISPR-Cas system provides bacterial immunity will be essential for further developing this powerful tool.
Detailed biochemical studies of the single Cas9 effector from Streptococcus pyogenes defined a simple two-component system for generating targeted DSBs (Jinek et al. 2012). This simplified system requires only the Cas9 nuclease and a single chimeric guide RNA (gRNA) programmed through a short sequence that base pairs with complementary DNA in the targeted genome to direct Cas9 cleavage (Figure 1). Because it only requires that the researcher generate a single small RNA easily tailored to a specific sequence, this system has been rapidly adopted as the mechanism of choice for generating targeted DSBs. Furthermore, the identification of the two nuclease domains required for generating double-strand DNA breaks enabled the subsequent inactivation of one or both domains to generate nickase and nuclease-dead versions of Cas9, respectively, which have been co-opted for further manipulations of the genome (Gasiunas et al. 2012; Jinek et al. 2012; Mali et al. 2013a; Qi et al. 2013; Ran et al. 2013). In this chapter, we will focus on how this continuously evolving technology has been developed for use in Drosophila and highlight specific areas that will be exciting for future expansion.
Figure 1. Adoption of the CRISPR-Cas system for two-component genome editing. (A) A Cas effector protein is targeted to a PAM-containing DNA target by a crRNA and tracrRNA. The invading DNA is subsequently cleaved by the RuvC and HNH nuclease domains, generating a DSB. (B) A two-component system composed of Cas9 and a single chimeric gRNA can cleave genomic DNA containing a PAM sequence. Target specificity is determined by the ∼20 nucleotides at the 5′ end of the gRNA, allowing the researcher to program Cas9 cleavage.
CRISPR-Cas Components
The research community has rapidly adopted the CRISPR-Cas9 system for genome engineering in large part because the system is straightforward and easy to use. Any laboratory with basic molecular biology expertise can readily generate the required reagents (gRNAs, Cas9, and DNA donors), and the delivery of these components is similarly straightforward. Since there are multiple print and web-based resources that provide step-by-step instructions on how to carry out a Cas9-mediated genome-editing experiment in Drosophila (Bassett and Liu 2014; Housden et al. 2014, 2016; Gratz et al. 2015a,b; Housden and Perrimon 2016a,b,c,d; Port and Bullock 2016b; see Box 1), here we provide a general overview of the applications of CRISPR that highlights the resources available to fly researchers, and discuss the results of tool-development studies that may improve the outcomes of genome-editing experiments.
Box 1. Web-based CRISPR resources for researchers working with Drosophila
Addgene: addgene.org/crispr/drosophila
- repository of Cas9, gRNA, and donor plasmids; informational resources
Bloomington Drosophila Stock Center: flystocks.bio.indiana.edu/Browse/misc-browse/CRISPR
- repository of transgenic CRISPR-Cas9 stocks
CRISPR fly design: crisprflydesign.org
- descriptions and links for fly and molecular reagents, protocols, unpublished data
DRSC/TRiP Functional Genomics Resources: fgr.hms.harvard.edu
- descriptions and links for fly and molecular reagents, Cas9-modified cell lines, protocols, a curated selection of publications, and resources in development for the fly community
DGRC: dgrc.bio.indiana.edu
- repository of Cas9, gRNA, and donor plasmids
FlyBase:CRISPR: flybase.org/wiki/FlyBase:CRISPR
- a compendium of resources, including links to target sequence search programs, repositories of fly strains and vectors, and additional links of interest; a curated selection of methods papers and reviews
flyCRISPR: flycrispr.molbio.wisc.edu
- descriptions and links for fly and molecular reagents, protocols, unpublished data
Insect Genetic Technologies Research Coordination Network: igtrcn.org
- highlights of recent genome engineering advances
NIG-FLY FlyCas9: shigen.nig.ac.jp/fly/nigfly/cas9
- repository of Cas9 and transgenic gRNAs fly strains and related molecular reagents; protocols
Online programs for target identification
CCTop: crispr.cos.uni-heidelberg.de
CHOPCHOP: chopchop.cbu.uib.no
CRISPOR: crispor.tefor.net/
CRISPR Optimal Target Finder: tools.flycrispr.molbio.wisc.edu/targetFinder
CRISPRscan: crisprscan.org
E-CRISP: e-crisp.org/E-CRISP
Find CRISPRs: flyrnai.org/crispr2
Cas9 Target Finder: shigen.nig.ac.jp/fly/nigfly/cas9/cas9TargetFinder.jsp
Genome engineering approaches using ZFNs, TALENs, and CRISPR are all based on the premise that inducing double-strand DNA breaks at targeted sites will force the cell to repair the break, opening a window of opportunity for modifying the original sequence during the repair process. Thus, all CRISPR experiments require Cas9 to induce a double-strand DNA break. Cas9 can be injected as DNA, mRNA, or protein, or transgenically expressed. These options have been reviewed extensively elsewhere (Bassett and Liu 2014; Housden et al. 2014, 2016; Gratz et al. 2015a,b; Housden and Perrimon 2016a,b,c,d; Port and Bullock 2016b). We note that all strategies are functional, but that transgenic Cas9 sources yield the highest efficiency of engineering. Several groups have created transgenic lines that express Cas9 under the control of various enhancers and promoters to influence when and where genome editing occurs. Strains that express Cas9 in the germline can be used to generate heritable mutations, although there is evidence that Cas9 is likely active in the soma of many of these reportedly germline-specific transgenic strains, with the possible exception of a nos-Cas9 strain, which may be important to consider in the case of cell-lethal manipulations (Port et al. 2014; Ren et al. 2013; Gratz et al. 2014). It is also important to recognize that maternal deposition of Cas9 can modify the embryonic genome in the presence of gRNA (Lin and Potter 2016b). Thus, even progeny that do not inherit the Cas9 transgene are subject to editing if the mothers express Cas9 in the germline. More recently, researchers have explored using the Gal4-UAS system for expressing Cas9 in somatic tissues in combination with gRNAs to generate tissue-specific knockouts (Port et al. 2014; Xue et al. 2014; Port and Bullock 2016a). On a cautionary note, there is some evidence that Cas9 by itself has toxic effects when expressed at high levels using binary expression systems (Port et al. 2014; Port and Bullock 2016a). To address this, there have been some efforts in flies to limit Cas9 activity by, for example, dialing down UAS-driven expression levels (Port and Bullock 2016a). Others, working mainly in mammalian cell lines, have reported alternate strategies to regulate Cas9 activity using optogenetic and chemical-induction approaches (Hemphill et al. 2015; Nihongaki et al. 2015; Zetsche et al. 2015b).
The second obligate component of all CRISPR experiments is a gRNA for targeting Cas9 to a specific sequence. gRNAs can be targeted to different regions of a gene to obtain a variety of desired outcomes. For example, a gRNA targeting a coding region can be used to create a loss-of-function allele, or a pair of gRNAs flanking a gene can generate a deletion. gRNAs find their target via 18- to 20-nt sequences that base-pair with complementary genomic DNA (Figure 1). In designing a gRNA, the goal is to identify an 18- to 20-nt sequence in the genomic-target region that is both adjacent to a PAM site and unique, or, at a minimum, has limited similarity to other sequences in the genome that might be subject to off-target cleavage. As discussed in more detail below, target sites as close as possible to intended edits are preferred. The frequency at which the PAM sequence occurs in a genome will define the number and position of target sequences. Multiple web-based gRNA target sequence search engines, including several created by fly researchers, can be used to identify high quality target sequences (Box 1). Guidelines for designing the most efficient gRNAs are still under investigation, and it remains to be determined how well guidelines based on data collected from cultured mammalian cells and other organisms apply to designing gRNAs for Drosophila (Farboud and Meyer 2015; Malina et al. 2015; Moreno-Mateos et al. 2015; Doench et al. 2016; Haeussler et al. 2016). Nonetheless, a number of studies suggest that the gRNAs that cleave most effectively have higher than average GC content and low U content [note that poly(T) stretches should be avoided since these may signal transcription termination]. Yet even the best-designed gRNA will be rendered ineffectual by a SNP in the target sequence or PAM. Thus, it is critical to sequence the target region in the genome of the fly strain being engineered rather than rely on the reference genome sequence. 
While gRNAs with multiple predicted off-targets should be avoided in general, when establishing engineered animal lines off-target effects are less of a concern than in cells because it is possible to eliminate these events by outcrossing. Thus, only potential off-target sites positioned close to the target site are of significant concern for Drosophila researchers generating engineered fly lines.
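The design criteria above (a target sequence adjacent to a PAM, higher GC content, no poly(T) stretch) can be illustrated with a minimal Python sketch. It assumes an S. pyogenes Cas9 NGG PAM, a fixed 20-nt target, and a hypothetical cutoff of four consecutive Ts for flagging a possible terminator; the function names are illustrative only. The sketch does not search the genome for off-target sites, which is a task for the dedicated search tools listed in Box 1.

```python
import re

def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_targets(region, protospacer_len=20):
    """List candidate target sites (protospacer followed by NGG) on both strands."""
    region = region.upper()
    # Lookahead keeps the match zero-width so overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    hits = []
    for strand, seq in (("+", region), ("-", reverse_complement(region))):
        for m in pattern.finditer(seq):
            protospacer, pam = m.group(1), m.group(2)
            gc = (protospacer.count("G") + protospacer.count("C")) / protospacer_len
            hits.append({
                "strand": strand,
                "start": m.start(),              # 0-based offset on the scanned strand
                "protospacer": protospacer,
                "pam": pam,
                "gc": gc,                        # fraction of G and C in the protospacer
                "poly_t": "TTTT" in protospacer, # assumed cutoff, not from the text
            })
    return hits

if __name__ == "__main__":
    for hit in find_targets("GACGTCAGCTAGCTAGCATCAGGTA"):
        print(hit)
```

Because a SNP in the target or PAM can abolish cleavage, the input to such a search would ideally be the sequenced target region of the strain being engineered rather than the reference genome.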
gRNAs are commonly supplied as RNAs or plasmids. To generate deletions, or target two or more genes simultaneously, multiple gRNAs can be supplied (Gratz et al. 2014). Researchers working in a variety of systems, including flies, have co-opted ribozymes to liberate multimerized gRNAs that are expressed under the control of a single promoter (Gao and Zhao 2014; Xie et al. 2015; Port and Bullock 2016a). Similarly, work done in flies showed that encoding tRNAs along with the gRNAs enables processing of multiple gRNAs from a single transcript and may enhance cell-specific gene editing outcomes of an individually expressed gRNA (Port and Bullock 2016a). gRNAs can also be transgenically expressed. When designing experiments using integrated gRNAs, the possible generation of a gene drive that can be propagated throughout a population must be considered. As discussed in detail in the section Active Genetics, gene drives are powerful mechanisms to introduce genome edits that propagate by non-Mendelian inheritance. Because gene drives can drastically alter allele frequencies in populations, they pose an ecological risk and are not suitable for standard genome editing experiments. While CRISPR gene drives are only beginning to become subject to institutional biosafety precautions to ensure their containment, a multidisciplinary group of researchers recently came together to draft recommendations for their safe use (Akbari et al. 2015). When properly designed, integrated gRNAs can be useful tools; however, particular attention must be paid when generating lines that transgenically express both Cas9 and a gRNA.
The final component of a CRISPR experiment is a DNA donor template for introducing specific edits or exogenous sequences. Donor templates, which are only required in a subset of CRISPR experiments, are discussed below.
Co-Opting Cellular Repair Events for Genome Editing
Cells employ two major pathways to repair double-strand DNA breaks, nonhomologous end joining (NHEJ) and homology-directed repair (HDR), and both can be co-opted for genome editing (Figure 2). [For an in-depth discussion of DNA repair events and processes, see the FlyBook chapter on DNA repair (Sekelsky 2017)]. Here, we briefly outline the key considerations for CRISPR gene editing. NHEJ and related pathways repair broken DNA ends by ligation. Because NHEJ does not use a homologous DNA template for restoring the original sequence, insertions and deletions (indels) often occur at the breakpoint. Thus, NHEJ-based approaches are well suited for generating loss-of-function alleles by targeting breaks to critical genomic sequences and selecting for disruptive repair events. A major advantage of this approach is that these experiments are straightforward to design and execute, requiring only that the researcher supply Cas9 and a gRNA to direct DNA cleavage to a specific site in the genome. To disrupt protein function, NHEJ experiments are often designed with a single gRNA that targets cleavage within the coding region of a gene followed by selection of flies with frame-shifting indels (Figure 2 and Figure 3). Use of a single gRNA to generate indels by NHEJ can similarly be applied to interrupt noncoding elements. An alternate strategy is to supply two gRNAs to delete the sequences between the two targeted sites. In Drosophila, deletions up to ∼15 kb have been achieved in this way, and larger deletions are likely feasible (Gratz et al. 2014). The most labor-intensive step in NHEJ-based genome editing is the identification of engineered lines of interest, which, in the absence of a recognizable phenotype, requires molecular characterization of candidates. The use of coconversion approaches can reduce the workload. 
In coconversion, a visible marker is targeted along with the gene of interest, and only those flies showing the visible phenotype are analyzed molecularly based on the assumption that these are the flies in which the CRISPR-Cas system was active (Ward 2015; Ge et al. 2016).
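When candidate lines are characterized by sequencing, a first-pass sort of recovered alleles into frame-shifting and in-frame indels can be sketched as below. This is a simplified illustration, not an established pipeline: it assumes the reference and edited sequences cover the same stretch of coding sequence and compares only their lengths, so substitutions are ignored, and an in-frame indel may still disrupt function. The function name is hypothetical.

```python
def classify_indel(ref_allele, edited_allele):
    """Describe the net length change between a reference and an edited allele."""
    net = len(edited_allele) - len(ref_allele)
    if net == 0:
        return "no net length change"
    kind = "insertion" if net > 0 else "deletion"
    # A net change that is not a multiple of 3 shifts the reading frame.
    frame = "in-frame" if net % 3 == 0 else "frameshift"
    return f"{abs(net)}-bp {kind}, {frame}"

if __name__ == "__main__":
    reference = "ATGGCCAAG"
    for allele in ("ATGGCAAG", "ATGAAG", "ATGGCTTCAAG"):
        print(allele, "->", classify_indel(reference, allele))
```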
Figure 2. CRISPR-Cas9 catalyzed genome engineering. Cas9 is guided to specific sequences in the genome by a programmable gRNA where it cleaves both DNA strands to create DSBs. The cell uses two main pathways to repair DSBs, NHEJ and HDR, both of which can be co-opted for genome editing. Templates for HDR can be single- or double-stranded exogenous DNA, the sister chromatid, the homologous chromosome, or even highly related paralogous sequences elsewhere in the genome.
Figure 3. Flowchart of the experimental design options for editing the genome using CRISPR-Cas9. While repair by NHEJ and HDR is observed in most cases, other DNA repair pathways such as NHEJ-related microhomology-mediated end joining (MMEJ) may also repair Cas9-induced DSBs. Different sources of Cas9 and gRNA can be used in combination; for example, a gRNA-expressing plasmid can be injected into transgenic flies expressing Cas9. Transgenic flies that express Cas9 are readily available at stock centers. Cas9 can also be supplied as a protein. Modified with permission from Gratz et al. (2015b).
The second major cellular DNA repair pathway, HDR, employs homologous DNA as a template for DNA synthesis to bridge the gap across a DSB. The fact that this pathway uses a template for repair provides an opportunity for the engineering of specific sequence changes through the introduction of a donor repair template (Figure 2 and Figure 3). HDR has been appropriated to make a wide variety of precise modifications, including the introduction of sequence substitutions, the generation of conditional alleles, and the incorporation of protein tags. Targeting the HDR pathway requires, in addition to Cas9 and an appropriately targeted gRNA, a donor repair template comprising the new sequences flanked by DNA homologous to the sequences adjacent to the cleavage site (commonly called homology arms) for recognition by the broken genomic DNA ends.
Two types of DNA donors have been used to successfully engineer the Drosophila genome: single-stranded DNA (ssDNA) donors, often referred to as ssODNs, and double-stranded DNA (dsDNA) donors (Figure 3). To ensure the incorporation of the desired editing event it is essential to consider the directionality inherent in the DNA repair process when designing the donor template, particularly with ssDNA donors. ssDNA donors are oligonucleotides that require only short homology arms (∼60 nt) for efficient HDR. The major advantage of ssDNA donors is that they can be rapidly and inexpensively synthesized. However, synthesized single-stranded oligonucleotides are size limited, so their use is restricted to small modifications such as minimal base-pair edits or the incorporation of short peptide tags. The size limit of ssDNAs also precludes incorporation of markers to facilitate screening for engineered flies, so molecular or phenotypic screening is required. In contrast, dsDNA donors, which must be supplied as circular plasmids to escape degradation, can incorporate large DNA sequences to facilitate the generation of a great variety of genome modifications. The primary drawback of dsDNA donors is that they are more labor intensive to generate, and require larger homology arms of 0.5–1 kb for efficient editing. Markers for visible screening, such as 3xP3-DsRed, which enables rapid screening for DsRed expression in the eye, are commonly incorporated into dsDNA donors and greatly facilitate identification of engineered lines (Bischof et al. 2007; Gratz et al. 2014). When flanked by LoxP or FRT recombinase sites, markers can be readily removed to minimize alterations to the engineered locus. Alternatively, they can be retained to provide a marked allele. Scarless approaches for removing screenable markers have also been developed (S. Gratz and K. M. O’C.-G., unpublished data, flycrispr.molbio.wisc.edu). 
In this approach, the visible marker is flanked by piggyBac transposon inverted repeat sequences, and either inserted at an endogenous TTAA site or incorporated into a TTAA site in the introduced sequences. With donors designed to recapitulate the duplication of the TTAA site that normally occurs during piggyBac insertion, piggyBac transposase-mediated removal of the visible marker results in restoration of a single TTAA site.
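The layout of an ssDNA donor described above (new sequence flanked by ∼60-nt homology arms) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the arms are symmetric, the cut position is supplied by the user as an index into the genomic sequence, and neither strand choice nor cleavage-blocking mutations are modeled. The function and parameter names are hypothetical.

```python
def build_ssodn(genomic, cut_index, insert, arm_len=60):
    """Assemble left arm + insert + right arm around a cut position.

    genomic   : genomic sequence surrounding the cleavage site
    cut_index : 0-based index of the first base to the right of the cut
    insert    : new sequence to place at the cut (for example, a short tag)
    arm_len   : homology arm length in nucleotides (assumed symmetric)
    """
    if cut_index < arm_len or cut_index + arm_len > len(genomic):
        raise ValueError("not enough flanking sequence for the requested arms")
    left_arm = genomic[cut_index - arm_len:cut_index]
    right_arm = genomic[cut_index:cut_index + arm_len]
    return left_arm + insert + right_arm

if __name__ == "__main__":
    # Toy sequence: 60 A's to the left of the cut and 60 C's to the right.
    toy_genome = "A" * 60 + "C" * 60
    donor = build_ssodn(toy_genome, cut_index=60, insert="GGG")
    print(len(donor), donor)
```

For a dsDNA donor, the same layout applies with `arm_len` in the 0.5–1 kb range and a marker cassette as the insert, although such donors are built as plasmids rather than synthesized directly.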
Whether using dsDNA or ssDNA donors, an often-unavoidable source of scars arises from the need to edit DNA donors to prevent their gRNA-directed cleavage by Cas9 either before or after successful editing. This is generally done by introducing mutations that disrupt the gRNA target and/or PAM sequence. Although sometimes the desired edit itself disrupts the target sequence and will prevent cleavage, the majority of genome-editing experiments result in the introduction of both an experimentally relevant mutation as well as a mutation designed to block further cleavage by Cas9. Because a major goal of genome engineering is to limit the introduction of ancillary changes when modifying the genome, strategies have been developed to eliminate these cleavage-blocking mutations. These approaches necessarily involve a two-step process: in the first step, the experimentally relevant mutation and cleavage-blocking mutation are introduced; in the second step, the cleavage-blocking mutation is reverted to the wild-type sequence. One scar-removal approach developed in cultured mammalian cells takes advantage of a Cas9 nuclease that has been engineered to recognize an altered PAM (Kleinstiver et al. 2015). In the first engineering step, the PAM recognized by wild-type Cas9 is mutated to the PAM recognized by the engineered Cas9. In the second step, the engineered Cas9 is used to target the mutant PAM sequence, which can then be reverted to wild type with an appropriate DNA donor template. A conceptually similar system that relies on a visible marker, rather than two forms of Cas9, has been developed for evolutionary studies in Drosophila species (Lamb et al. 2017). Scar removal may not be necessary for all experiments, but these strategies provide valuable means to eliminate unwanted ancillary mutations when desired.
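Whether a donor still carries an intact gRNA target can be checked before ordering or cloning it. The sketch below is a simplified illustration that assumes an NGG PAM and looks only for an exact copy of the target sequence followed by a PAM on either strand; because sites with mismatches may still be cleaved, a negative result is not a guarantee of resistance. The function names are hypothetical.

```python
import re

def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def has_intact_target(donor, protospacer):
    """True if the donor contains the exact protospacer followed by NGG on either strand."""
    site = re.compile(re.escape(protospacer.upper()) + "[ACGT]GG")
    donor = donor.upper()
    return bool(site.search(donor) or site.search(reverse_complement(donor)))

if __name__ == "__main__":
    target = "GACGTCAGCTAGCTAGCATC"
    unmodified = "TT" + target + "AGG" + "TT"   # target and PAM both intact
    pam_mutant = "TT" + target + "AGC" + "TT"   # PAM disrupted (AGG to AGC)
    print(has_intact_target(unmodified, target))  # intact site present
    print(has_intact_target(pam_mutant, target))  # no exact site with NGG PAM
```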
In designing an HDR experiment, selection of the ideal gRNA target site requires balancing specificity and location. Because partial incorporation of donor sequences can and does occur, cleavage sites closer to the desired edit generally result in more efficient incorporation of sequence changes. Another strategy is to place the selection marker distal to the desired edit such that repair events resulting in marker incorporation will necessarily encompass the desired edit as well. Donor design also influences efficiency, and has been studied in early Drosophila genome engineering studies with ZFNs, and in mammalian cell lines where HDR is less efficient than in the Drosophila germline (Beumer et al. 2013; Richardson et al. 2016). One notable recent study found that ssDNA donors complementary to the nontargeted strand with asymmetric homology arms yielded the highest HDR rates in cells (Richardson et al. 2016). Because these results are explained by the nature of Cas9-DNA interactions, it is likely they will apply to gene editing in Drosophila as well. A growing mechanistic understanding of Cas9 function is likely to lead to additional advances in the rational design of gene editing experiments.
It is important to emphasize that while the experimenter supplies the reagents necessary for targeting a particular DNA repair pathway and selects for desired outcomes, the cell determines how the CRISPR-generated break is repaired. The predominant repair pathway used depends on cell type, developmental stage, and cell cycle. Donor configuration and locus-specific effects also likely influence repair pathway selection in ways that are currently poorly understood. Mutations that block one pathway would be expected to shift repair to the other pathway. Indeed, previous work demonstrated that mutations in lig4, encoding a ligase with a key role in NHEJ, bias repair toward HDR (Beumer et al. 2008; Bozas et al. 2009). However, this does not appear to be the case for CRISPR-mediated experiments in flies, although it may be an effective strategy if combined with knockdown of Mus308, a polymerase that mediates lig4-independent end joining, as recently demonstrated in Drosophila tissue culture (Gratz et al. 2014; Ge et al. 2016; Kunzelmann et al. 2016). The development of strategies to control or bias the repair pathway employed by the cell is an important goal for more precise genome engineering.
Finally, it is critical to recognize that any time chromosomal DNA breaks are induced, unexpected rearrangements can occur during the repair process. Thus, it is necessary to conduct a full molecular characterization of all engineered lines. One common undesired event that should be noted is crossover repair during HDR, which results in the incorporation of the dsDNA donor plasmid backbone (the regions outside of the homology arms). In our experience, crossover repair occurs with sufficient frequency that we now incorporate a visible marker in the backbone of all our donor plasmids for selecting against backbone incorporation (K. M. O’C.-G., unpublished data; see flyCRISPR.molbio.wisc.edu for details and reagents). An advantage of ssDNA donors is that they are not subject to this complicating repair event.
Current and Future Applications of CRISPR-Cas in Drosophila
The adoption of CRISPR-Cas systems for gene editing has led to the rapid expansion of clever genomic manipulations for probing gene function and the development of additional Cas-based tools to interrogate genomes. Many of these approaches are now being combined with existing Drosophila tools and techniques to synergistic effect. Here, we highlight these and other CRISPR-Cas applications of interest to Drosophila researchers.
Modular access to the genome
Functional understanding of a gene and the processes it regulates is advanced by a variety of genetic modifications, ranging from deletion of the gene to incorporation of function-probing point mutations to the insertion of molecular tags. Thus, the desire to repeatedly edit loci of interest has driven the development of modular genome editing approaches. The most commonly employed strategies in flies take advantage of ΦC31 recombinase, which mediates recombination between attP and attB sites. One versatile strategy is the replacement of a targeted gene with a single attP “docking” site for subsequent ΦC31-mediated incorporation of any number of engineered versions of the gene (Huang et al. 2009). Originally developed for use with earlier homologous recombination techniques, this approach has since been coupled with CRISPR for rapid gene replacement and repeated access to the targeted locus (Huang et al. 2009; Gratz et al. 2013, 2014). In a related modular approach, called recombinase-mediated cassette exchange (RMCE), attP-flanked genomic DNA or integrated exogenous DNA can be readily exchanged with compatible attB-flanked cassettes. RMCE has been leveraged in several approaches to obtain modular access to the genome, many of which have been combined with CRISPR. For example, CRIMIC is a large-scale effort to expand the MiMIC collection (for a detailed discussion, see the FlyBook chapter on Gene Tagging). Briefly, MiMICs are attP-flanked cassettes nested in a specialized Minos transposon for integration into the genome, frequently into introns. MiMIC elements have been hopped around the genome and selected for incorporation in useful positions near genes, where RMCE has been used to incorporate cassettes that allow for the knock-down of gene expression and generation of protein traps that report the expression of targeted genes. By combining CRISPR and MiMIC, CRIMIC provides access to genes missed by transposon mobilization and those lacking introns. 
A similar approach allows genetic access to cells expressing targeted genes through the incorporation of modular donor exons containing viral T2A and Gal4 coding sequences. This approach, called Plug and Play, relies on T2A-mediated ribosomal skipping to yield Gal4 expression in the pattern of the host gene (Diao et al. 2015). The CRISPR-Cas9 system has also been used to replace transgenic elements previously incorporated into the fly genome; for example, Homology Assisted CRISPR Knock-in (HACK) was developed to repurpose the large number of well-characterized Gal4 enhancer traps available in Drosophila for the orthogonal QF2 binary expression system (Lin and Potter 2016a).
Additional Cas enzymes expand the reach of CRISPR
Because the need for a defined PAM limits the number of sequences that can be targeted by Cas9 nucleases, efforts have been directed at developing genome-editing systems that can be used, alone or in combination, to target any genomic sequence. The most commonly used S. pyogenes Cas9 requires an NGG PAM sequence, which varies in frequency between organisms, and between genomic regions within species. The number of sequences that can be targeted for genome editing has been expanded in three different ways. First, researchers have isolated Cas9 proteins from additional bacterial species, such as Neisseria meningitidis, that use divergent PAM sequences (Esvelt et al. 2013; Hou et al. 2013). Second, S. pyogenes Cas9 has been engineered to recognize noncanonical PAM sequences (Kleinstiver et al. 2015). Third, a Cas enzyme, Cpf1, from a type V CRISPR-Cas system that relies on the PAM sequence (T)TTN has been isolated from Francisella novicida and used to edit the genomes of eukaryotic cells (Zetsche et al. 2015a; Kleinstiver et al. 2016; Port and Bullock 2016a). Cpf1 also has the advantage of being smaller than Cas9, making it easier to manipulate and transfect into cultured cells. One recent study adapted Cpf1 for use in Drosophila and found that the Cpf1-mediated cleavage frequency was reduced in comparison to Cas9-mediated cleavage (Port and Bullock 2016a). Thus, in Drosophila, Cpf1 may largely be used as an alternative approach that will be particularly useful for targeting A/T rich genomic regions lacking NGG motifs. Studies are underway to characterize new Cas-like proteins, while other efforts will increase the targeting capacity and efficiency of the current repertoire of Cas enzymes through protein engineering.
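The effect of PAM choice on the number of targetable sites can be illustrated by counting PAM motifs in a region of interest. The Python sketch below is a rough illustration only: it counts NGG or TTTN motifs on both strands, supports N as the only wildcard, and does not check that a full-length target sequence is available beside each PAM. The function names and toy sequences are hypothetical.

```python
import re

def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def count_pam_sites(seq, pam):
    """Count occurrences of a PAM motif (N = any base) on both strands."""
    # Lookahead makes matches zero-width so overlapping motifs are counted.
    pattern = "(?=" + pam.upper().replace("N", "[ACGT]") + ")"
    seq = seq.upper()
    return sum(len(re.findall(pattern, strand))
               for strand in (seq, reverse_complement(seq)))

if __name__ == "__main__":
    at_rich = "ATTTATATTTTAAATA"  # toy A/T-rich region
    gc_rich = "CCGGCCGG"          # toy G/C-rich region
    for name, region in (("AT-rich", at_rich), ("GC-rich", gc_rich)):
        print(name,
              "NGG:", count_pam_sites(region, "NGG"),
              "TTTN:", count_pam_sites(region, "TTTN"))
```

In the toy A/T-rich sequence no NGG motif is present on either strand while several TTTN motifs are, which mirrors the situation in which a (T)TTN-dependent enzyme offers targets that an NGG-dependent Cas9 cannot reach.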
Modified Cas9 and gRNAs expand gene editing and enable targeting of proteins to DNA
Multiple approaches take advantage of mutant Cas9 proteins in which one or both nuclease domains have been inactivated. Mutating just one nuclease domain creates a nickase version of Cas9. Created to reduce off-target mutations, nickase Cas9 has been used by itself and in pairs to introduce single-strand breaks and DSBs, respectively, to edit the genome (Mali et al. 2013a; Ran et al. 2013). Nickase Cas9 is seldom used in Drosophila because of the greatly reduced frequency at which desired modifications are recovered in comparison to wild-type Cas9 and the capacity to readily eliminate off-target mutations by outcrossing (Port et al. 2014; Ren et al. 2014). In mammalian cells, nickase Cas9 has also been used as a platform to create a nucleotide “base editor” that harnesses the base-excision repair pathway to generate C:G-to-T:A mutations (Komor et al. 2016). Thus, mutant Cas9 can be co-opted to trigger specific DNA-repair pathways in addition to the DSB and nick repair pathways for genome-editing purposes.
Nuclease-inactive Cas9 (dCas9) has proven to be a readily adaptable platform to probe genome function and structure in a variety of ways (Figure 4). In these approaches, the majority of which were pioneered in cultured cells, dCas9 is fused to an effector protein, such as a transcriptional regulator or fluorescent protein, and guided to specific nucleic acid sequences by a gRNA. This platform has enabled diverse strategies to manipulate and visualize DNA. Nuclease-inactive Cas9 was first developed as a platform to regulate gene expression (Qi et al. 2013), and many of the dCas9-based tools subsequently developed have been used for this purpose in both cultured cells and organisms, including fruit flies. The initial studies of dCas9 discovered that the inactive nuclease itself could disrupt gene expression, likely by steric inhibition of RNA polymerase (Qi et al. 2013). Approaches to inhibit gene expression using dCas9 rapidly expanded to include dCas9 fused to various transcriptional repressors and chromatin modifiers (Dominguez et al. 2016). These dCas9-based tools for inhibiting gene expression are often collectively referred to as “CRISPRi.” In contrast, dCas9-fusion proteins designed to activate endogenous gene expression gained the moniker “CRISPRa.” CRISPRa and CRISPRi have both been adapted for use in Drosophila (Chavez et al. 2015; Lin et al. 2015; Ghosh et al. 2016). Like other genetic tools, CRISPRa and CRISPRi are not without drawbacks: off-target effects are a concern, as Cas9 binding appears to be more promiscuous than Cas9 cleavage, and CRISPRa may be limited in its ability to increase the expression of a gene that is already highly expressed (Lin et al. 2015). Nonetheless, CRISPRa and CRISPRi offer advantageous new means to control gene expression in flies. 
Advantages of CRISPRa and CRISPRi are their potential to target genes that are difficult to manipulate using Gal4-UAS or RNAi, such as large, complex genes with multiple isoforms or genes that produce noncoding RNAs, and the potential to tune expression levels more precisely than possible with binary expression systems. A collection of fly stocks for genome-wide CRISPRa is currently under development (B. Ewen-Campen and N. Perrimon, personal communication). CRISPRa and CRISPRi can also be used to modulate the expression of multiple genes at the same time. While the CRISPRa and CRISPRi tools currently adapted for use in Drosophila have a unidirectional effect on target genes (i.e., the targeted genes are either all turned on or all turned off), researchers have developed approaches in cultured cells to differentially regulate the expression of multiple genes simultaneously (e.g., to enhance the expression of one gene at the same time the expression of another gene is repressed). These approaches rely on the fusion of different transcriptional regulators to distinct dCas9-effector proteins with different PAM specificities (Esvelt et al. 2013). In an effort to gain additional spatial and temporal control over gene regulation, dCas9 has also been developed as an optogenetic tool (Polstein and Gersbach 2015). In this approach, which uses the light-sensitive dimer of the CRY2 and CIB1 plant proteins to bring together dCas9 and its effector, the dCas9-mediated effect on gene expression is regulated by exposure to blue light. This approach has been named light-activated CRISPR-Cas9 effector, or LACE. The dCas9 platform will undoubtedly continue to facilitate the development of new approaches to regulate gene expression with temporal and spatial precision.
Figure 4. Cas9 and the gRNA as platforms for manipulating and visualizing the genome. (A) dCas9 or nickase Cas9 can be fused to a diverse array of proteins to regulate gene expression, visualize genomic loci, and modify local DNA or chromatin. (B) RNA sequence can be added to the minimal gRNA to recruit proteins, fluoresce, or affect RNA-specific functions.
The dCas9 platform has also been developed as a tool to visualize genomic loci by bringing fluorophores to targeted sequences (Chen et al. 2013). Combining dCas9 from different bacterial species has enabled the illumination of multiple loci simultaneously (Ma et al. 2015; Chen et al. 2016). Since the detection of single molecules of fluorescent proteins in cells is challenging, typically multiple dCas9-GFP fusion proteins must be targeted to the locus of interest (Chen et al. 2013). While repetitive sequences, such as telomeres, can be visualized using a single gRNA, multiple gRNAs are necessary to detect nonrepetitive loci (Chen et al. 2013). The challenge of visualizing genomic loci with just a few gRNAs may be overcome by recent advances in molecular tags. For example, the SunTag, which is composed of a multimerized peptide sequence that is recognized by a GFP-tagged nanobody, has been fused to dCas9 to enhance visualization of genomic loci (Tanenbaum et al. 2014). It is likely that the split-GFP strategy, which is also amenable to multimerization, will be similarly useful in visualizing genomic loci (Kamiyama et al. 2016). While these techniques are currently under development in cell culture systems, it is reasonable to expect they will be transferable to Drosophila for dissecting the distribution and dynamics of genomic loci.
Following the successful, broad application of dCas9 as a platform to interrogate genome function and dynamics, gRNAs have recently been engineered as modular platforms to regulate gene expression and visualize genomic loci through the recruitment of transcriptional regulators and fluorescent proteins (Shechner et al. 2015; Zalatan et al. 2015; Cheng et al. 2016; Fu et al. 2016; Ma et al. 2016a; Wang et al. 2016) (Figure 4). By attaching different protein-binding RNA motifs to gRNAs, unique combinations of proteins can be recruited to different nucleic acid targets. For example, it is possible to use this approach to activate and inactivate different target genes at the same time. By attaching MS2 stem loops to one gRNA and PP7 hairpins to another gRNA, different transcriptional regulators can be recruited to individual target genes to obtain specific transcriptional outcomes. This approach has also been applied to visualize up to six loci simultaneously by taking full advantage of the ability to mix and match RNA aptamer and fluorescent protein pairs, using a combinatorial approach to expand the color options beyond red, green, and blue (Ma et al. 2016b). The techniques described here likely represent just the first of many ways in which Cas9- and gRNA-based platforms will be adapted to interrogate the function and localization of genomic sequences.
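The arithmetic behind combinatorial color labeling can be sketched briefly. The Python below assumes, as one plausible reading of the scheme, that each locus is labeled either with a single primary fluorescent color or with a mixture of two; the function name color_codes and the max_mix parameter are hypothetical and introduced only for illustration.

```python
from itertools import combinations


def color_codes(primaries, max_mix=2):
    """List every label made from 1..max_mix distinct primary colors."""
    codes = []
    for k in range(1, max_mix + 1):
        codes.extend(combinations(primaries, k))
    return codes


codes = color_codes(["red", "green", "blue"])
for code in codes:
    print("+".join(code))
print(len(codes))  # 3 single colors + 3 pairwise mixtures = 6
```

Under this assumption, three primary colors yield six distinguishable labels, consistent with the six loci mentioned above.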
CRISPR-based approaches for targeting RNA
There is a keen interest in developing CRISPR-based reagents to target RNA, with the goals of visualizing RNA localization in cells, improving RNA knock-down efficacy, and otherwise manipulating endogenous RNAs (e.g., regulating splicing or translation; reviewed in Nelles et al. 2015). To this end, several approaches have been developed over the past couple of years. First, it was discovered that supplying an exogenous PAM in the form of a DNA oligonucleotide (called a “PAMmer”) was sufficient to redirect the S. pyogenes gRNA and Cas9 complex away from genomic DNA to an RNA target (O’Connell et al. 2014; Sternberg et al. 2014). In the presence of a PAMmer, the targeted RNA was cleaved and the gRNA and Cas9 did not target the corresponding genomic sequence. This approach was recently adapted to visualize cellular RNAs using dCas9 fused to a fluorescent protein (Nelles et al. 2016). However, the PAMmer requires a specific 5′ chemical modification to avoid degradation by RNase H, and because this modification cannot be genetically encoded, the potential use of the PAMmer in vivo is limited. Second, a Cas protein, C2c2, that specifically targets RNA was recently characterized (Abudayyeh et al. 2016). Analogous to Cas nucleases that target DNA, C2c2 can be programmed to target a specific RNA sequence; however, once C2c2 is activated following target cleavage, the enzyme remains active and cleaves nontarget RNAs, ultimately causing cell death. There are intriguing ways in which C2c2 could be harnessed to eliminate specific cell populations within an organism to determine how a subset of cells functions (e.g., neuronal circuit mapping) or to remove unwanted cells (e.g., cancer-causing cells and tumors). However, the C2c2 system cannot currently be applied to selectively remove individual RNA transcripts from cells. Thus, CRISPR-based approaches to target specific RNAs in vivo await development.
Cell Culture Strategies, Screens and Future Directions
While studies in Drosophila have pioneered the use of CRISPR-mediated genome engineering in organisms, technologies for manipulation and screening in tissue culture have been more extensively developed in mammalian systems. Thus, the many strategies established in mammalian cells provide a framework for future use in Drosophila tissue culture and will complement RNAi-based screening platforms. Encouragingly, Cas9-induced DSBs have been successfully harnessed for both NHEJ and HDR in S2 cells, despite the fact that these cells are largely tetraploid (Bassett et al. 2013; Bottcher et al. 2014; Lee et al. 2014; Housden et al. 2015; Kunzelmann et al. 2016). Successful experiments have relied on the U6 promoter driving gRNA expression from either plasmids or PCR products, and have found an optimal gRNA length of 18–19 nucleotides. While transient transfection of constitutively expressed Cas9 resulted in gene editing, the frequency of the desired event was low (Bassett et al. 2013; Bottcher et al. 2014; Housden et al. 2015). Multiple strategies have been employed to overcome this low efficiency: (1) experiments have been performed in cell lines stably expressing Cas9 (Bottcher et al. 2014); (2) cells expressing Cas9 have been selected using puromycin (Bassett et al. 2013); and (3) individual cells have been sorted into conditioned media (Housden et al. 2015). Strategies that rely on selection or prior integration of a Cas9-expression cassette do not require optimization of sorting protocols that allow recovery of single cells, but have the disadvantage of maintaining continuous expression of Cas9, which may confound downstream experiments. Together, these experiments have, for the first time, enabled rapid generation of knockout Drosophila cell lines. 
Importantly, while off-target events can be genetically removed when engineering organisms, they represent a significant concern in cell culture experiments and their likelihood must be strongly considered when designing gRNAs and screening for editing events.
Successful creation of epitope-tagged genes in S2 cells has been achieved using both plasmid and PCR-based donors coupled with selection (Bassett et al. 2013; Bottcher et al. 2014; Kunzelmann et al. 2016). PCR-generated donors require only 60 bp of homology flanking the desired modification on either side, whereas plasmid-based donors optimally contain 1-kb homology arms (Bottcher et al. 2014; Kunzelmann et al. 2016). PCR-generated donors, which unfortunately do not work in vivo, presumably because the linear templates are quickly degraded, are easily constructed by ordering gene-specific primers and using publicly available plasmids as templates. Downstream removal of an FRT-flanked selectable marker using FLP recombinase has enabled successful addition of epitope tags with minimal disruption to the DNA sequence (Bottcher et al. 2014; Kunzelmann et al. 2016). Nonetheless, development of the piggyBac-based method to scarlessly remove the selectable marker will improve the system. Coupling Cas9-mediated genome engineering with RNAi targeting both lig4 and mus-308 to suppress NHEJ has been shown to further increase the efficiency of HDR events (Kunzelmann et al. 2016). Because off-target integration events can occur, and can result in cells that are resistant to selection, confirmation of successful tagging of the desired locus is important.
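The construction of a PCR-generated donor from gene-specific primers can be sketched as follows. This is a minimal illustration, not a published protocol: it assumes each primer carries a 60-nt homology arm at its 5′ end and a 20-nt stretch at its 3′ end that anneals to the cargo template (the 20-nt annealing length, the function names, and the toy sequences are all hypothetical).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def design_donor_primers(genome_seq, insert_pos, cargo, arm_len=60, anneal_len=20):
    """Design primers that add homology arms to a cargo by PCR.

    genome_seq : forward-strand sequence around the target locus
    insert_pos : 0-based index; the cargo is inserted before this position
    cargo      : sequence to be amplified from a template plasmid
    """
    if insert_pos < arm_len or insert_pos + arm_len > len(genome_seq):
        raise ValueError("not enough flanking sequence for the homology arms")
    if len(cargo) < anneal_len:
        raise ValueError("cargo shorter than the annealing region")
    left_arm = genome_seq[insert_pos - arm_len:insert_pos]
    right_arm = genome_seq[insert_pos:insert_pos + arm_len]
    # Forward primer: left arm (5') followed by the start of the cargo (3')
    forward = left_arm + cargo[:anneal_len]
    # Reverse primer: reverse complement of (end of cargo + right arm),
    # so the right arm sits at the 5' end and the 3' end anneals to the cargo
    reverse = reverse_complement(cargo[-anneal_len:] + right_arm)
    donor = left_arm + cargo + right_arm  # expected PCR product, top strand
    return forward, reverse, donor
```

With the defaults, each primer is 80 nt long and the resulting product is the cargo flanked by 60 bp of homology on either side.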
The CRISPR system has provided a powerful platform for screening in mammalian tissue culture, and will likely provide an important addition to RNAi-based screening in Drosophila cells. Many screens in mammalian systems are based on NHEJ-mediated mutation coupled with selection for a specific phenotype, and have been used to identify mutations that confer resistance to chemotherapeutics, bacterial toxins and DNA-damaging agents (Koike-Yusa et al. 2014; Shalem et al. 2014; Wang et al. 2014; Zhou et al. 2014). These screens use pooled libraries consisting of multiple gRNAs targeting each gene and high-throughput sequencing to identify those gRNAs specifically enriched or depleted following selection. Similar screens have recently been made possible in Drosophila cells with the production of a library consisting of >40,000 gRNAs targeting 13,501 genes (Bassett et al. 2015). While these pooled libraries have significant strengths, they require screens to be based on selections. Future generation of arrayed libraries in which individual gRNAs are arranged in a multi-well format will provide a platform similar to the current RNAi libraries, and will expand the kinds of screens that can be performed (Hartenian and Doench 2015). CRISPRi and CRISPRa provide additional screening strategies that have proven powerful in mammalian tissue culture, and lay the groundwork for similar loss-of-function and gain-of-function screens in Drosophila cell lines in the future (Lin et al. 2015; Shalem et al. 2015). As with the variety of strategies being developed to engineer the genome in vivo using CRISPR-based approaches, continued advances will enable rapid and reproducible manipulation of the genome in tissue culture and new screening platforms.
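The logic of identifying enriched or depleted gRNAs in a pooled screen can be sketched with a simple calculation. The Python below is an illustrative analysis only, not the pipeline used in the cited screens: it assumes read counts per gRNA before and after selection, normalizes them to library size with a pseudocount, and summarizes each gene by the median log2 fold change of its gRNAs. All function names and the toy counts are hypothetical.

```python
import math
from collections import defaultdict
from statistics import median


def log2_fold_changes(before, after, pseudocount=1.0):
    """Per-gRNA log2(after/before) on library-size-normalized counts."""
    n = len(before)
    total_before = sum(before.values()) + pseudocount * n
    total_after = sum(after.get(g, 0) for g in before) + pseudocount * n
    lfc = {}
    for grna, count in before.items():
        frac_before = (count + pseudocount) / total_before
        frac_after = (after.get(grna, 0) + pseudocount) / total_after
        lfc[grna] = math.log2(frac_after / frac_before)
    return lfc


def gene_scores(lfc, grna_to_gene):
    """Summarize each gene by the median log2 fold change of its gRNAs."""
    per_gene = defaultdict(list)
    for grna, value in lfc.items():
        per_gene[grna_to_gene[grna]].append(value)
    return {gene: median(values) for gene, values in per_gene.items()}


# Toy data: gRNAs against geneA are enriched after selection
before = {"A_1": 100, "A_2": 100, "B_1": 100, "B_2": 100}
after = {"A_1": 400, "A_2": 400, "B_1": 100, "B_2": 100}
mapping = {"A_1": "geneA", "A_2": "geneA", "B_1": "geneB", "B_2": "geneB"}
print(gene_scores(log2_fold_changes(before, after), mapping))
```

Requiring agreement among multiple gRNAs per gene, here through the median, is one simple way to reduce the influence of a single inefficient or off-target gRNA.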
Active Genetics
Mendelian inheritance is characterized by two salient features: independent chromosome assortment and genetic linkage of loci that reside in proximity on the same chromosome. These two basic elements of standard inheritance are typically etched in the minds of trained geneticists, and, as such, can be compared to dedicated computer hardware. Thus, it may at first present a challenge to such experts to fully appreciate the impact of emerging active genetic technologies based on self-copying elements guided by the CRISPR-Cas9 system that entirely bypass these traditional rules.
In this section, the concept behind active genetic elements will be summarized, emphasizing the transformational potential that this technology offers with regard to methods for targeted transgenesis and facilitating combinatorial genetics. In addition, the use of CRISPR-Cas9 technology to generate active genetic elements that can copy themselves with high efficiency to the homologous chromosome during meiosis will be discussed. It is important to recognize that other non-Mendelian systems of inheritance have been known and exploited for quite some time (e.g., transposons, homing-endonuclease genes, and balanced translocations, to name but a few), and the rapidity of new innovation will likely create myriad modifications and add-ons to the basic strategies outlined below. For the purposes of this chapter, however, active genetics refers to the use of targeted gene-editing systems, such as CRISPR-Cas9, to generate self-propagating transgenic elements.
The mutagenic chain reaction: a potent combination of mutagenesis and genedrive
In early 2015, a proof-of-concept study demonstrated the autocatalytic CRISPR-Cas9 mediated genetic transmission of an element in which genes encoding Cas9 and a gRNA were inserted as a cassette into the genome precisely at the location targeted by the gRNA (Gantz and Bier 2015) (Figure 5A). The linkage of these editing components and their integration into the targeted locus resulted in both high-efficiency somatic mutation and germline transformation. This method was named the mutagenic chain reaction (MCR) in analogy to the polymerase chain reaction (PCR) to emphasize that MCR could result in the doubling of selected DNA sequences in vivo in the same way that PCR does in vitro. Since this first experimental demonstration of MCR transmission, germline propagation of similar constructs has been observed with high efficiency in yeast (DiCarlo et al. 2015) and mosquitoes (Gantz et al. 2015; Hammond et al. 2016) (see below). Since genedrives rely on germline transmission, it is important to consider the repair processes that are active in the germline and drive propagation at non-Mendelian frequencies.
Figure 5. Active genetic systems for genome editing. (A) The MCR results in the autocatalytic copying of an MCR element to the homologous chromosome. (B) CopyCat elements carry one or more gRNAs, which results in the insertion of the CopyCat element at the target site. CopyCat elements can include a cassette of interest (COI), such as Gal4, UAS, or an attP site. CopyCat elements require an unlinked Cas9 to copy. A COI can also be included in the MCR cassette (A) or trans-complementing drive cassette (C). (C) A trans-complementing drive consists of two elements: one element composed of just Cas9, and a second element composed of gRNAs. One gRNA targets the site at which Cas9 is integrated and one targets the site of gRNA integration, facilitating copying of both the gRNA(s) and Cas9. The genedrive potential of each active genetic approach is indicated.
An interesting question to pose to novice students in genetics, one that illustrates the novel behavior of active genetic elements, is how they might use purely genetic criteria to map an ideally behaving MCR element to a chromosome and then relative to known genetic markers. Since MCR elements are transmitted to all offspring, it is not possible to use traditional measures of recombination frequency to assign chromosome location or linkage to known reference markers. In Drosophila, with its rich trove of genetic tools, there is of course a solution: cross the MCR element to a deficiency collection and identify a deletion that results in standard Mendelian inheritance of the element. Indeed, this novel way of moving genetic information in the form of active-genetic elements is likely to be a powerful tool for future studies.
Why might active genetic elements copy so efficiently via the germline?
The ability of an element or genetic allele to be inherited more frequently than the expected Mendelian frequency of 50% is often referred to as meiotic drive, or, more recently, as genedrive. At a very simplistic level, the striking frequency at which active genetic elements are transmitted can be attributed to a strong bias in the germline toward repairing double-strand breaks by HDR vs. the competing error-prone NHEJ pathway (Gantz and Bier 2015; Gantz et al. 2015; Hammond et al. 2016). However, this explanation glosses over several important factors, detailed discussion of which lies beyond the current chapter, but which have been well reviewed by others (Chapman et al. 2012; Anand et al. 2013; Baudat et al. 2013; Keeney et al. 2014; Haber 2015). In the following section, key insights into HDR vs. NHEJ repair as they pertain to efficient genedrive are summarized.
Briefly, as described above, HDR and NHEJ act as mutually antagonistic pathways at several different levels, resulting in the repair process being restricted to one or the other pathway (Figure 2). The repair process used can be influenced by central transcription and DNA-repair determinants. For example, the mammalian Brca1/53BP1 regulators cross inhibit each other in a dose-dependent fashion. Additionally, there are mutual inhibitory mechanisms that regulate the formation of complexes required for initiating the two competing repair pathways (single-stranded DNA resection followed by Rad51-mediated strand invasion vs. Ku70/Ku80 blunt-end capping and bridging) (Chapman et al. 2012). Importantly, the decision of which repair pathway to use is under quite different control in somatic cells vs. the germline. Although data on how DNA breaks will typically be repaired in somatic cells carrying active genetic elements are minimal at this time, much of this repair may occur via NHEJ (see genedrive section for evidence supporting this view in mosquitoes). This question will be important to address in future studies. With regard to HDR, several other processes are layered over the core HDR machinery shared by germline and somatic cells. For example, in somatic cells, HDR is primarily used to fix DNA breaks resulting from errors in DNA replication, and acts predominantly in a postreplicative fashion to repair lesions from one chromosome with the sequence present on the identical sister chromatid. Although repair using the nonidentical chromosome homolog can also occur in somatic cells, this process is generally repressed, most likely to avoid loss of heterozygosity, which can lead to various disorders, including cancer. In contrast, during the first meiotic prophase, where again HDR acts following replication, the bias is to fix DSBs from the homologous chromosome.
During meiosis I, actively induced double-stranded DNA breaks are repaired by invasion of the homologous chromosome to establish crossover events that are essential for tethering these chromosomes together and for subsequent proper chromosomal segregation. The great majority of these meiotic HDR events, however, are resolved without crossover, leading to gene conversion events, and such events are likely responsible for propagating MCRs or other active-genetic alleles.
Another important consideration is that, in the germline, there are additional meiotic checkpoint processes that ensure full suppression of the NHEJ pathway (Joyce et al. 2012). Thus, HDR in the germline has evolved specifically to repair DSBs from the homologous chromosome, and then resolve the majority of these events through gene conversion. These unique aspects of repair in the germline most likely form the basis for why active genetic elements are so efficiently copied to the homologous chromosome in the germline. Although strong genedrive has only been documented thus far in yeast and insects (DiCarlo et al. 2015; Gantz and Bier 2015; Gantz et al. 2015; Hammond et al. 2016), it therefore seems highly probable that the process will be similarly efficient in all organisms that undergo sexual recombination, including vertebrates and plants. This prediction will undoubtedly be validated or falsified in the near future by ongoing efforts to extend active-genetic strategies to other organisms.
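The consequence of efficient germline copying for inheritance across generations can be illustrated with a deliberately simplified model. The Python sketch below assumes an infinite, randomly mating population, no fitness cost, no resistant alleles, and a single conversion efficiency c shared by both sexes; none of these assumptions is drawn from the studies cited above, and the function name drive_trajectory is hypothetical. Under these assumptions a heterozygote transmits the element to a fraction (1 + c)/2 of its gametes, so the allele frequency p changes each generation as p' = p^2 + 2p(1 - p)(1 + c)/2.

```python
def drive_trajectory(p0, conversion, generations):
    """Allele frequency of an active genetic element over time.

    p0          : starting allele frequency (0..1)
    conversion  : fraction of heterozygous germ cells in which the element
                  is copied to the homologous chromosome (0 = Mendelian)
    generations : number of generations to simulate
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        het_transmission = (1.0 + conversion) / 2.0
        # homozygotes always transmit; heterozygotes transmit at the biased rate
        p = p * p + 2.0 * p * (1.0 - p) * het_transmission
        freqs.append(p)
    return freqs


mendelian = drive_trajectory(0.01, 0.0, 10)
driven = drive_trajectory(0.01, 1.0, 10)
print(round(mendelian[-1], 4), round(driven[-1], 4))
```

With c = 0 the frequency stays at its starting value, as expected for Mendelian inheritance, whereas with complete conversion an element starting at 1% approaches fixation within roughly ten generations in this idealized setting.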
Split Cas9:gRNA transgenesis systems for accelerating combinatorial genetics
Use of full MCR-like genedrive elements for experimental purposes will most likely remain limited due to potential ecological dangers posed by the inability to regulate the copying process, and to the extreme care that must be taken in handling such elements responsibly (Akbari et al. 2015). Thus, as described below, implementation of such systems will primarily be restricted to devising and testing genedrive systems in various disease vectors or pest species. MCR elements may also prove of significant value for gaining genetic access to pioneer species for which few tools (e.g., balancer chromosomes or existing recessive alleles) are available apart from genome sequence data. Nonetheless, in many instances, split systems that separate the Cas9 and gRNA(s) may also be adequate for such purposes. The two-component nature of the CRISPR-Cas9 system offers flexibility in the configuration of the components, which should be of general use to the design of active-genetic systems.
One of the largest impacts of active genetics will likely be the use of split Cas9:gRNA transgenesis systems to edit the genome at multiple places simultaneously. We have proposed such a system and named it CopyCat (Gantz and Bier 2016). In the CopyCat system, the Cas9 source is a traditional Mendelian transgene located at a separate genomic locus from the gRNA-containing vector, which is inserted at the gRNA cut site (denoted as <gRNA> wherein the brackets indicate that the gRNA will be copied in the presence of Cas9; Figure 5B). CopyCat elements can be designed to incorporate the same features as transposon or ΦC31-mediated integration, including ΦC31 docking sites, UAS sequences, various dominant-marker genes, and other cassettes of interest (COI). CopyCat elements also offer the potential to accelerate the assembly of complex genotypes, or to generate trans-complementing systems that enable the active copying of both the gRNA and Cas9 (Figure 5C). Following the desired assembly of transgenic elements, the separate source of Cas9 can be segregated away from the active genetic elements returning standard genetic control over the experiment. (Note that the converse is not true: one cannot simply outcross the strain to remove the CopyCat element since all progeny will inherit the element in the presence of Cas9. As a standard safety precaution, we recommend that stocks carrying CopyCat element(s) be housed separately from those carrying Cas9 to avoid the potential release of strains harboring even these weak genedrives.)
CopyCat elements can be designed to carry either one or two gRNAs, and can be used to either insert sequence via a single cut or to replace a genomic interval through two cuts, which would eliminate the intervening genomic sequence and replace it with the “cargo” sequence carried by the CopyCat element. In this way, vectors carrying GAL4 or UAS sequences could be inserted into the genome in a site-directed manner using a gRNA carried on the same vector. Since the insertion of sequence relies on the cell’s endogenous DNA repair machinery, the size of the DNA that can be inserted may be limited, and the sequence inserted should be verified to ensure that no errors were incorporated. Indeed, in many cases it may be possible to retrofit existing frequently used transgenic elements. Determining the site of genomic insertion and inserting a gRNA that cuts very near the vector-insertion site into the original transgene would endow the repurposed transgene with active-genetic properties. It is possible that the imprecise match of the cut and vector insertion sites will reduce copying efficiencies. However, precedent from yeast (Cho et al. 2014) and flies (Do et al. 2014) would suggest that, so long as the cut site is within HDR-mediated resection distance of the vector insertion, copying should take place. Nonetheless, proof of this suggestion will need to be determined empirically.
The ability to insert and/or replace genetic elements at specific genomic sites, and then to combine the products of such modifications without regard to chromosomal location or requirement for balancer chromosomes has the potential to revolutionize the analysis of both coding and cis-regulatory components of genes, and their cooperative interactions in gene regulatory networks. In addition, it should be possible to combine design features accommodating both genomic engineering and active-genetic strategies to enable facile modifications or replacements of large chromosomal intervals. Development of such tools should facilitate the next generation of genetic manipulation of complex loci, allowing control of both simple and complex genetic traits.
Genedrive systems and accessory elements
The idea of using genedrive systems to combat vectors of disease or pest species has a long history dating back to the 1960s (Curtis 1968), and the theory behind such approaches has been well developed (Curtis 1968; Burt 2003; James 2005; Deredec et al. 2008; North et al. 2013; Burt 2014). In-depth discussion of this topic is thus well beyond the scope of the chapter. Nonetheless, recent Cas9-based genedrive experiments in mosquitoes highlight the potential of such systems, and interesting lessons have already been gleaned from the few published cases describing such systems that may have more general implications (Gantz et al. 2015; Hammond et al. 2016).
Lessons learned from mosquito vector suppression vs. modification schemes
Two general schemes have been proposed to combat vector-borne diseases, often referred to as suppression (killing the vector host) or modification (preventing the disease agent from being propagated by the host but leaving the host intact). While proof of principle for both schemes has recently been demonstrated in Anopheline mosquitoes that serve as key vectors of malaria, populations in the wild may rapidly evolve resistance due to a number of features, including natural variation in gRNA-target sequence and sequence-altering NHEJ repair events (Drury et al. 2017; Unckless et al. 2017). In Anopheles stephensi, a genedrive system was devised that carries a sophisticated gene cassette that induces anti-parasite effector genes (single-chain antibodies recognizing invariant epitopes on the Plasmodium parasite fused to a Cecropin killing moiety) after females feed on a blood meal (Isaacs et al. 2012). This large gene cassette (∼17 kb), when transmitted via the male germline, propagates to 99.5% of the progeny, which corresponds to 98–99% efficiency of copying itself to the homologous chromosome. In contrast, in females this same element only converted the homologous chromosome 10–20% of the time. Similarly, in An. gambiae, a simple suppression genedrive element targeting genes required for female fertility was passed with much higher efficiency via males than females (Hammond et al. 2016).
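The relationship between the fraction of progeny inheriting an element and the efficiency of copying to the homologous chromosome can be made explicit. The sketch below assumes a heterozygous parent crossed to wild type: half of the gametes carry the original element, and the remaining half carry it only if conversion occurred, so that transmission = (1 + conversion)/2. The function names are illustrative.

```python
def transmission_from_conversion(conversion):
    """Fraction of progeny inheriting the element from a heterozygous parent."""
    return 0.5 * (1.0 + conversion)


def conversion_from_transmission(transmission):
    """Invert the relationship; Mendelian transmission (0.5) means 0 conversion."""
    return 2.0 * transmission - 1.0


print(conversion_from_transmission(0.995))  # approximately 0.99
print(transmission_from_conversion(0.10))   # approximately 0.55
print(transmission_from_conversion(0.20))   # approximately 0.60
```

By this relationship, transmission to 99.5% of progeny corresponds to conversion of about 99% of homologous chromosomes, whereas 10–20% conversion would yield only 55–60% transmission, a modest departure from the Mendelian 50%.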
An alternative to creating a one-component full genedrive element is a trans-complementing drive configuration (Figure 5C). In this arrangement, the Cas9 provided is a Mendelian source and can be linked to various COIs, such as antipathogen transgenes, or gRNAs targeting host factors. The second element, which also behaves in Mendelian fashion on its own, carries two gRNAs (as well as possible COIs). One of the gRNAs cuts at its own site of chromosomal insertion, and the other gRNA cuts at the site of Cas9 insertion. When two such matched strains are crossed to each other, the result is a self-propagating two-component drive system. The advantages of this system include the ability to test and compare many different Cas9 and gRNA combinations, to carry more and varied cargo than a single drive, and to safely store the component lines (separately housed) as Mendelian rather than full-drive elements.
Why might transmission via the male vs. female germline result in such differing conversion efficiencies in mosquitoes? The likely answer seems to be that when the element is passed via males crossed to wild-type females, the egg contains no Cas9 (which is under the control of the vasa promoter that is active in both male and female germline lineages). In contrast, when a female carrying the genedrive element is crossed to a wild-type male, Cas9 is present throughout the egg (since Cas9 does not become concentrated in pole cells post-translationally as the Vasa protein does). Since the gRNA is expressed under the control of a ubiquitous U6 promoter, in progeny of females carrying the genedrive element, active Cas9/gRNA complexes can form in the egg prior to segregation of the germline. During the preblastoderm stage, cells are somatic in nature, and Cas9-induced DNA breaks can be repaired by either the NHEJ or HDR pathways. If NHEJ is used, often it will result in an indel mutation at the DNA-cut site that precludes any further cutting. When such an allele is created in a cell that subsequently gives rise to a germ cell lineage (i.e., a pole cell), it will no longer be a substrate for HDR-mediated gene conversion, and hence will block copying of the active genetic element to the homologous chromosome. While the requirement for excluding Cas9 activity from embryonic cells prior to germline segregation may vary between organisms, this issue is likely to be important to consider when designing active-genetic elements.
Hitchhiking elements
It is also worth noting that CopyCat elements can be combined with a full MCR drive element, in which case the element will copy along with the full-drive element. When used in this fashion we refer to such elements as Construct Hitchhiking on the Autocatalytic Chain Reaction (or CHACRs) as they chase after the drive (Gantz and Bier 2016). CHACRs could be used to update gene-drive systems. For example, CHACRs could combat the evolution of various forms of resistance to effectors by introducing new protective gene cassettes into a population of vector organisms that already have a Cas9-based gene drive in place. Alternatively, CHACRs could carry gRNAs that cut a nonpreferred allelic variant. As an example of this latter application, which we refer to as CopyCutting, one might include a gRNA that cuts an insecticide-resistant form of a host gene (e.g., encoding the Na+ ion channel or Cyp450 genes) but not the wild-type sensitive form. When such an element drives through a population, if it encounters an insecticide-resistant form, it should cut that chromosome and repair it using the sensitive allele as a template, thus rendering all the progeny insecticide sensitive. Similarly, CopyCutting could be used in agriculture to cut a nonpreferred allele (e.g., a drought-sensitive allele) and replace it with a drought-resistant form. This would be particularly advantageous in polyploid crop species such as wheat. For example, one could cross an individual carrying Cas9 and a CopyCutting gRNA generated in the background of a resistant strain to a standard drought-sensitive cultivar. The result should be conversion of all alloalleles to the resistant form. If two such strains were then crossed to each other, the result would be a combination of both of the desired alleles across all of the alloalleles. The unlinked Cas9 source could also be segregated out at this stage to create a non-GMO organism with desired alleles (at least as defined by US standards).
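The CopyCutting idea can be sketched as a toy allele-conversion routine. The Python example below is a minimal illustration under strong assumptions: every copy of the nonpreferred allele is cut once, repair uses a preferred allele as template with probability `hdr_rate`, and failed repair leaves a cut-resistant indel. The function `copycut`, the allele labels, and the rate are hypothetical and are not drawn from any experiment.

```python
import random


def copycut(alleles, cut_allele, hdr_rate=1.0, rng=None):
    """Return the allele list after one round of cutting and repair.

    alleles: labels for each copy of the locus (e.g., six copies in a
        hexaploid such as wheat).
    cut_allele: the nonpreferred label recognized by the gRNA.
    hdr_rate: probability that a cut copy is repaired from a preferred
        template (illustrative); otherwise an "indel" is recorded.
    """
    rng = rng or random.Random(0)
    # Any copy that is neither the target nor already damaged can
    # serve as a repair template in this simplified model.
    templates = [a for a in alleles if a not in (cut_allele, "indel")]
    result = []
    for allele in alleles:
        if allele != cut_allele:
            result.append(allele)  # not recognized by the gRNA
        elif templates and rng.random() < hdr_rate:
            result.append(rng.choice(templates))  # copied from a template
        else:
            result.append("indel")  # NHEJ outcome, no longer cuttable
    return result


if __name__ == "__main__":
    # One resistant copy introduced into an otherwise sensitive background.
    hexaploid = ["resistant"] + ["sensitive"] * 5
    print(copycut(hexaploid, cut_allele="sensitive"))
```

With perfect repair and at least one preferred template present, every targeted copy is converted, which corresponds to the conversion of all alloalleles described above; lowering `hdr_rate` shows how indels would accumulate instead.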
Outlook/Future Directions
The ability to readily modify the fly genome is significantly changing the course of research in Drosophila. Cas9-mediated genome engineering and active-genetic strategies are likely to have a transformative impact on genetics as we currently conceive of the field. Researchers working with fruit flies rely heavily on transgenic approaches for diverse applications, ranging from analyzing gene regulation to determining protein function and localization. While transgenic approaches will remain a core strength of Drosophila genetics, the CRISPR-Cas system now provides the ability to directly edit endogenous genomic sequences and precisely incorporate exogenous DNA, which will undoubtedly lead to a new generation of experimental approaches. The CRISPR-Cas system also holds the potential to expand the range of forward genetic screens through approaches in which Cas9 is used to introduce mutations in combination with gRNA libraries, or in which inactive Cas9 or a gRNA platform is exploited to recruit mutagenic enzymes, such as cytidine deaminase or mini-singlet oxygen generator, to specific genomic regions (Noma and Jin 2015; Komor et al. 2016). In our experience, one remaining technical hurdle lies in designing an effective gRNA. Since identifying gRNAs that cleave efficiently and precisely in vivo remains a trial-and-error process, we propose that the fly community establish a database of validated gRNA sequences. This would be an invaluable resource for the repeated manipulation of individual genes, and it is also possible that the compiled sequences would reveal new parameters for designing gRNAs that are effective in Drosophila. Along these lines, it is important to note that laboratories in Boston, Heidelberg, and Kyoto are coordinating independent efforts to generate gRNA libraries to disrupt gene function in vivo (M. Boutros, B. Ewen-Campen, S. Kondo, N. Perrimon, F. Port, and J. Zirin, personal communication). 
While CRISPR-based tool development to date has predominantly focused on gene function and visualization, a recent study took advantage of Cas9-induced mutations to generate DNA barcodes that can be used to trace cell lineages in a developing zebrafish embryo (McKenna et al. 2016). This study highlights the potential for gene-editing technologies to improve and expand existing techniques used to analyze cell and developmental processes. Active genetics has the potential to transform genetic manipulation in many species and will certainly impact the implementation of gene-drive systems to combat vector-borne disease in a broad variety of pest species. Based on the mechanism of germline-mediated HDR of DNA breaks, it seems likely that active genetics will be highly efficient in virtually all sexually reproducing species so long as Cas9 expression is restricted to the male germline. If active genetics can be readily adapted to other species, it may also serve as an efficient site-directed form of transgenesis. The success of the initial Cas9-mediated gene-drive experiments suggests that this application of active genetics will likely have a large impact. Nonetheless, such uses remain contingent on regulatory and public approval following continued discussion of these applications. In summary, the broad variety of applications covered here provides only a glimpse into the myriad ways researchers are likely to apply CRISPR-Cas approaches to address fundamental biological questions. As it has for >100 years, Drosophila will undoubtedly continue to be influential in the development and creative application of these new genetic technologies.
Footnotes
Communicating editor: H. Bellen
- Received January 9, 2017.
- Accepted July 8, 2017.
- Copyright © 2018 by the Genetics Society of America
Available freely online through the author-supported open access option.