In principle, clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 allows genetic tags to be inserted at any locus. However, throughput is limited by the laborious construction of repair templates and guide RNA constructs and by the identification of modified strains. We have developed a reagent toolkit and plasmid assembly pipeline, called “SapTrap,” that streamlines the production of targeting vectors for tag insertion, as well as the selection of modified Caenorhabditis elegans strains. SapTrap is a high-efficiency modular plasmid assembly pipeline that produces single plasmid targeting vectors, each of which encodes both a guide RNA transcript and a repair template for a particular tagging event. The plasmid is generated in a single tube by cutting modular components with the restriction enzyme SapI, which are then “trapped” in a fixed order by ligation to generate the targeting vector. A library of donor plasmids supplies a variety of protein tags, a selectable marker, and regulatory sequences that allow cell-specific tagging at either the N or the C termini. All site-specific sequences, such as guide RNA targeting sequences and homology arms, are supplied as annealed synthetic oligonucleotides, eliminating the need for PCR or molecular cloning during plasmid assembly. Each tag includes an embedded Cbr-unc-119 selectable marker that is positioned to allow concurrent expression of both the tag and the marker. We demonstrate that SapTrap targeting vectors direct insertion of 3- to 4-kb tags at six different loci in 10–37% of injected animals. Thus SapTrap vectors introduce the possibility for high-throughput generation of CRISPR/Cas9 genome modifications.
The clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 system has revolutionized genome editing in nearly all model systems, including Caenorhabditis elegans (Frøkjær-Jensen 2013; Doudna and Charpentier 2014; Xu 2015). The Cas9 protein cuts genomic DNA at sites that match an ∼20-nucleotide guide RNA sequence (Gasiunas et al. 2012; Jinek et al. 2012). Error-prone repair of the break can create point mutations, small indels, or large deletions at the cut site (Cho et al. 2013; Cong et al. 2013; Friedland et al. 2013; Jinek et al. 2013; van Schendel et al. 2015). However, the true power of CRISPR/Cas9 for genome editing lies in the insertion of exogenous DNA sequences, such as genetically encoded protein tags (Katic and Grosshans 2013; Lo et al. 2013; Mali et al. 2013; Tzur et al. 2013). To insert an exogenous sequence, one must simply supply a repair template with homology arms that flank the cut site. The cell uses homology-based repair to heal the break and copy the exogenous DNA into the cut site.
Although in theory CRISPR/Cas9 makes it easy to insert exogenous sequences into a genome, practical limitations have prevented high-throughput implementation. Two critical limiting factors are (1) the time and expense required to build both the repair template and guide RNA constructs for each desired insertion and (2) the time required to screen through candidates to identify the rare, heritably modified organisms. In the simplest insertion strategy, a repair template that contains only the exogenous sequence flanked by homology arms is introduced with Cas9 and a guide RNA. Insertion events among the progeny can be identified only by PCR or by screening for the expected phenotype, which creates a large screening burden. Two strategies enrich for the rare animals with edited DNA: insertion of a selectable marker at the target site or co-CRISPR editing of a second site (Arribere et al. 2014; Kim et al. 2014; Dickinson et al. 2015; Norris et al. 2015; Paix et al. 2015; Ward 2015).
In the first strategy, a selectable marker is included with the exogenous sequence in the repair template (Dickinson et al. 2013). Transgenic animals can then be identified directly, but including the marker complicates the design and construction of the repair template. The position of the selectable marker within the repair template is typically chosen on a case-by-case basis to limit interference between the target gene and the selectable marker. However, the unique design and construction of each repair template is time-consuming. Recently, the Goldstein and Calarco groups have embedded selectable markers within a synthetic intron in the exogenous sequence (Dickinson et al. 2015; Norris et al. 2015). This arrangement simplifies construction because the selectable marker no longer needs to be positioned in a unique location within each repair template. However, the selectable markers are oriented in the same direction as the target gene, and transcription of the target gene is terminated in the synthetic intron until the selectable marker is removed in a later step. Thus, selectable marker strategies allow direct selection of insertions but either require complex repair template designs or compromise target gene expression.
In “co-CRISPR” strategies, a second locus is edited simultaneously with the target locus (Arribere et al. 2014; Kim et al. 2014). Simple markerless repair templates can be used at the target locus because animals are selected by the second-site edit. The phenotype produced by the second-site edit identifies worms in which Cas9 was highly active, the progeny of which are enriched for edits at the target locus. However, among selected animals, the fraction of animals edited at the target locus is highly variable, so screening is still required (Arribere et al. 2014; Paix et al. 2015; Ward 2015). Hence, co-CRISPR strategies allow simplified repair template construction and allow target gene function in primary strains but reduce selection efficiency.
Both selectable marker-based strategies and co-CRISPR have been optimized for the insertion of relatively simple sequences, such as translational GFP fusions. Classical transgene methods allow a greater variety of tag functions, including transcriptional reporting, translational fusions, conditional expression, and tissue-specific and even single-cell expression. Strategies to add these functions to native-locus tags can be envisioned, but they will generally involve adding regulatory sequences around the tag in the repair template. Adding these regulatory sequences will in turn complicate repair template design and assembly. Ideally, repair templates would be assembled in a modular fashion to simplify the addition of regulatory sequences to tags.
Here we present SapTrap, a PCR-free high-efficiency modular plasmid assembly method for high-throughput production of CRISPR/Cas9 targeting plasmids for C. elegans. A single-tube SapTrap assembly reaction generates a single plasmid targeting vector that encodes both a guide RNA transcript and a repair template for an individual insertion event. The guide RNA targeting sequence and homology arms are supplied as synthetic, annealed oligonucleotides, and a prebuilt plasmid library supplies the remaining repair template components: fluorescent and nonfluorescent tags, a selectable marker, and optional regulatory sequences. A novel intron-embedded selectable marker strategy obviates the need to position the marker on a gene-by-gene basis and yields concurrent expression of the tagged gene and selectable marker after genomic insertion. Nonetheless, the marker is removable to yield a scarless insertion of the tag. Finally, we provide repair templates capable of tagging a protein in a tissue-specific manner. The SapTrap toolkit reduces the expense and workload necessary to produce vectors for genome editing in the worm and will expand the experimental utility of tags inserted in the genome.
Materials and Methods
All chemicals were purchased from Sigma-Aldrich (St. Louis). All enzymes were purchased from New England Biolabs (Beverly, MA). All synthetic DNAs were purchased from Integrated DNA Technologies.
Single guide RNA and repair template destination vectors:
pMLS134 was derived from Addgene plasmid #46169 [PU6::unc-119_sgRNA, a gift from John Calarco (Friedland et al. 2013)]. Using PCR site-directed mutagenesis, the unc-119 single guide RNA (sgRNA) targeting sequence was replaced with a SapI insertion site (see Supporting Information, Figure S3B), and the SapI site in the plasmid backbone was replaced with an NdeI site. pMLS256 and pMLS257 were derived from pBluescript II (Agilent Technologies). First, using PCR site-directed mutagenesis, the multiple-cloning site of pBluescript II was replaced with a SapI insertion site (see Figure S3A), and the SapI site in the pBluescript backbone was deleted. To make pMLS256, the PU6::sgRNA cassette from pMLS134 was blunt ligated into pMLS257 linearized by PCR amplification with the primers used to delete the backbone SapI site. PCR site-directed mutagenesis was then used to modify the sgRNA SapI insertion site to match the repair template insertion site.
Three-site destination vectors:
To make pMLS234, pMLS235, and pMLS236, first a gfp::syntron-nested-Cbr-unc-119 cassette containing flexible linker sequences was introduced into pDONR-221 by BP reaction (Thermo Fisher). The PU6::sgRNA cassette from pMLS134 was amplified and blunt ligated into the resultant vector linearized using the same primers as for PU6::sgRNA insertion into pMLS256 (above). The resultant vector was amplified in two pieces with primers that introduced additional SapI insertion sites on the 5′ end of GFP and on the 3′ end of GFP. For pMLS234, the 5′ flexible linker was deleted by the SapI insertion site, for pMLS235 the 3′ flexible linker was deleted by the SapI insertion site, and for pMLS236 neither flexible linker was deleted. The final vectors were produced by blunt ligating the appropriate PCR fragments.
Tag/selectable marker donor plasmids:
pMLS252, pMLS254, pMLS271, pMLS291, and pMLS292 were made in two steps. First, the tag sequences (gfp, halotag, snaptag, mcherry, or 2xNLS-mcherry) were PCR amplified with primer-encoded SapI restriction sites and attB1 and attB2 tails. PCR products were inserted into pDONR-221 by BP reaction. The resultant entry vectors were linearized by PCR, and the syntron::Cbr-unc-119 cassette, amplified from pCFJ150 (Frøkjær-Jensen et al. 2008) with primer-appended syntron sequences, was inserted by Gibson assembly (New England Biolabs). pMLS286 was produced by first blunt ligating a PCR-amplified tagRFP sequence with primer-encoded SapI sites into pMLS280 (described below) linearized by digestion with XmaI. The syntron::Cbr-unc-119 cassette was introduced into this entry vector by Gibson assembly, as described above.
N- and C-tagging connector plasmids:
pMLS268, pMLS269, and pMLS279 were constructed by PCR amplifying the insert sequence with primers containing attB1 and attB2 tails and inserting the PCR products into pDONR-221 by BP reaction. pMLS285, pMLS287, pMLS288, pMLS382, and pMLS383 were constructed by blunt ligating the insert sequences into pMLS280. All inserts were generated from ligated oligos; the oligos for the egl-13 NLS sequence were designed to anneal with large 5′ overhangs that were subsequently filled in by incubation with Phusion polymerase (New England Biolabs). To construct pMLS272, -281, and -282, individual fragments of the insert sequence were produced by PCR or annealing oligos. The fragments were then assembled by Gibson assembly and inserted into pDONR-221 by BP reaction.
pMLS280 was made by Gibson assembly. The multiple-cloning site from pBluescript II was amplified with M13F and M13R primers and inserted between the M13F and M13R primer binding sites in pDONR-221. pMLS328 was generated by PCR site-directed mutagenesis of pDD104 [Peft-3::Cre, Addgene plasmid #47551, a gift from Bob Goldstein (Dickinson et al. 2013)] with a primer-encoded egl-13 NLS sequence inserted at the 3′ end of the Cre recombinase sequence. FLP-D5 was modified to include an N-terminal SV40 nuclear localization sequence and a C-terminal egl-13 nuclear localization sequence and harbors an aspartic acid residue at position 5. FLP expression vectors were constructed by multisite LR reactions (Thermo Fisher) into either pCFJ150 (pMLS260, pMLS262) or pCFJ201 (pMLS359, pMLS360) (Frøkjær-Jensen et al. 2012) with promoters initially cloned into [4–1] entry vectors, the FLP construct cloned into a [1–2] entry vector, and the let-858 3′-UTR cloned into a [2–3] entry vector.
Selection of overhang sequences
The 5′ overhangs within the sgRNA insertion site, at the repair template–plasmid junctions and at the junctions of the tag and marker cassette, were designed without selection-based optimization. To select 5′ overhangs for the homology arm–connector junctions with the highest fidelity, we designed oligo pairs to occupy both homology arm slots and both connector module slots in the SapTrap assembly reaction. We screened a panel of oligo pairs encoding four different candidate overhang sequences for each junction. We ran 16 SapTrap assembly reactions containing the pMLS256 destination vector, a GFP and Cbr-unc-119 donor plasmid, and every possible combination of overhang sequences for the connector–homology arm junctions. Each reaction included both N- and C- terminal connector modules. The reaction containing GCG (Ala) as the 5′ homology arm–C-tagging connector overhang and ACG (Thr) as the N-tagging connector–3′ homology arm overhang produced the highest rate of correctly assembled plasmids (11/12, >90%).
SapTrap assembly reactions
SapTrap enzyme mixture:
A 1.25X master SapTrap enzyme mixture (1.25X NEB cutsmart buffer; 1.25 mM ATP, pH 7.6; 6.25 mM DTT; 12.5 units/μl T4 DNA ligase; 0.25 units/μl T4 polynucleotide kinase; and 1.25 units/μl SapI) was prepared on ice and frozen in 2-μl aliquots at −80°. Because SapI precipitates from solution, all solutions containing SapI were vigorously pipetted up and down to resuspend SapI before withdrawing an aliquot.
Oligo pairs (see File S1 for step-by-step protocols) were annealed as follows: complementary oligo pairs were resuspended to 10 μM each in 1× oligo annealing buffer (OAB) (20 mM Tris-Cl, pH 7.5; 50 mM NaCl; and 1 mM MgCl2) and heated to 95°–100° in a heat block. The heat block was then switched off and allowed to cool slowly (1–2 hr) to room temperature. Sets of three annealed oligo pairs comprising the 5′ homology arm, the 3′ homology arm, and the sgRNA targeting sequence were diluted in TE buffer (10 mM Tris-Cl, pH 8.0; 1 mM EDTA, pH 8.0) to 150 nM each pair. Destination and donor plasmids were diluted individually to 50 nM in TE, using the formula (Equation 1).
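The plasmid dilution to 50 nM can be sketched as a standard molarity-to-mass conversion. This is a minimal illustration, not the authors' formula: it assumes an average molar mass of 650 g/mol per base pair of double-stranded DNA, and the function names, the 5,000-bp plasmid, and the 325 ng/μl stock in the usage lines are hypothetical.

```python
# Assumption: average molar mass of one double-stranded DNA base pair.
# 650 g/mol is a common approximation; the constant used in the original
# formula is not shown in this text.
AVG_BP_MASS_G_PER_MOL = 650.0


def ng_per_ul_at_molarity(plasmid_bp: int, target_nM: float = 50.0) -> float:
    """Mass concentration (ng/ul) of a dsDNA plasmid at the given molarity."""
    molar_mass = plasmid_bp * AVG_BP_MASS_G_PER_MOL  # g/mol
    # nM * g/mol gives 1e-9 g/L; 1 g/L equals 1000 ng/ul, so the factor is 1e-6.
    return target_nM * molar_mass * 1e-6


def dilution_volumes(stock_ng_per_ul: float, plasmid_bp: int,
                     final_ul: float, target_nM: float = 50.0):
    """Return (stock_ul, te_ul) needed to reach target_nM in final_ul."""
    target = ng_per_ul_at_molarity(plasmid_bp, target_nM)
    if stock_ng_per_ul < target:
        raise ValueError("stock is more dilute than the target concentration")
    stock_ul = final_ul * target / stock_ng_per_ul
    return stock_ul, final_ul - stock_ul


if __name__ == "__main__":
    # Hypothetical 5,000-bp plasmid miniprep at 325 ng/ul, diluted to 20 ul.
    print(ng_per_ul_at_molarity(5000))
    print(dilution_volumes(325.0, 5000, 20.0))
```

Because the result scales linearly with plasmid length, larger donor plasmids need proportionally more mass per microliter to reach the same 50 nM.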
To assemble combined sgRNA and repair template vectors, equal volumes of each 50-nM plasmid stock (pMLS256 destination, connector donor, and tag and selectable marker donor) and the annealed oligo mixture were premixed. A total of 0.5 μl of this DNA mixture was mixed with 2 μl of SapTrap enzyme mixture, yielding a final reaction containing 2.5 nM of each plasmid and 7.5 nM of each annealed oligo pair. Reactions were incubated at 20°–25° overnight. Then, the T4 DNA ligase was inactivated by a 30-min incubation at 65°. Reactions then received 2.5 μl of 1× Cutsmart buffer (New England Biolabs) + 2 units/μl of a counterselection restriction enzyme and were incubated at 37° for 1 hr. A total of 1–2 μl of the reaction mixture was used to transform chemically competent TOP10 Escherichia coli.
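The final concentrations quoted above follow from two dilution steps, which the short sketch below reproduces. It assumes the premix contains three plasmid stocks plus one annealed-oligo mixture in equal volumes, as described in the text; the function and key names are illustrative only.

```python
def saptrap_final_concentrations(n_plasmids: int = 3,
                                 plasmid_stock_nM: float = 50.0,
                                 oligo_stock_nM: float = 150.0,
                                 dna_ul: float = 0.5,
                                 enzyme_ul: float = 2.0,
                                 enzyme_mix_x: float = 1.25) -> dict:
    """Concentrations in the assembled SapTrap reaction."""
    # Step 1: equal volumes of each plasmid stock plus one annealed-oligo
    # mixture, so every component is diluted by the number of parts.
    parts = n_plasmids + 1
    premix_plasmid = plasmid_stock_nM / parts
    premix_oligo = oligo_stock_nM / parts

    # Step 2: a small volume of DNA premix is added to the enzyme mixture.
    total_ul = dna_ul + enzyme_ul
    return {
        "plasmid_nM": premix_plasmid * dna_ul / total_ul,
        "oligo_pair_nM": premix_oligo * dna_ul / total_ul,
        "buffer_x": enzyme_mix_x * enzyme_ul / total_ul,
    }


if __name__ == "__main__":
    print(saptrap_final_concentrations())
```

The same calculation shows why the enzyme mixture is prepared at 1.25X: adding 0.5 μl of DNA to 2 μl of mixture brings the buffer to 1X.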
Colonies were screened for correctly assembled plasmids by colony PCR with M13F (5′-TGTAAAACGACGGCCAGT) and M13R (5′-CAGGAAACAGCTATGACCATG) primers. Individual bacterial colonies were sampled with a sterile P10 pipette tip, which was dipped into a 5-μl 1× PCR reaction mixture (Phusion polymerase, buffer HF; New England Biolabs). Reactions were cycled 30 times with a 10-sec, 58° annealing step and a 1-min 45-sec, 72° extension step. Plasmid DNA was isolated from colonies that produced a clear band of the correct size (∼3.4–4 kb, depending on insert) in the colony PCR reaction. Properly assembled plasmids were isolated and sequenced using M13F, M13R, and oMLS471 (5′-TCCAAGAACTCGTACAAAAATGCTC) sequencing primers to confirm the 3′ homology arm, the 5′ homology arm, and the sgRNA targeting sequence, respectively.
Worm injections and strain isolation
To insert tags with combined sgRNA and repair template targeting vectors, young adult EG6207 [unc-119(ed3) III] hermaphrodites reared at 15° on HB101 bacteria were micro-injected in the gonad with an injection mixture of 65 ng/μl combined sgRNA and repair template targeting plasmid, 25 ng/μl Addgene plasmid #46168 (Peft-3::cas9-SV40_NLS::tbb-2 3′UTR, a gift from John Calarco) (Friedland et al. 2013), and 10 ng/μl fluorescent co-injection markers. For gfp and nonfluorescent insertions, the co-injection markers were 2 ng/μl pCFJ90 (Pmyo-2::mcherry), 4 ng/μl pCFJ104 (Pmyo-3::mcherry), and 4 ng/μl pGH8 (Prab-3::mcherry) (Frøkjær-Jensen et al. 2008). For tagrfp insertions, the co-injection markers were 2 ng/μl pCFJ420 (Pmyo-2::gfp::h2b) and 8 ng/μl pCFJ421 (Peft-3::gfp::h2b) (Frøkjær-Jensen et al. 2012). All plasmids were purified using a QIAGEN (Valencia, CA) miniprep kit followed by ethanol precipitation. Injected P0 worms were placed two to a plate on 6-cm NGM plates seeded with OP50 bacteria and incubated for 7–10 days at 25°. Plates were inspected under a fluorescence dissecting microscope for the presence of motile [unc-119(+)] animals without co-injection markers. Such animals were singled out and insertions were confirmed by PCR with primers outside the homology arms. Only a single strain was isolated from each plate.
Quantification of insertion frequency:
Insertion frequency was calculated by dividing the number of independent strains containing the desired insertion by the estimated number of P0 animals that were successfully injected. A successful injection was defined as an injection producing array-positive animals in the F2 generation. Each recovery plate was originally seeded with two P0 animals. After 7–10 days, the plates were inspected and scored as either positive for a successful injection event (presence of array-positive F2 animals) or negative for a successful injection event (no array-positive F2 animals) (see Table S3). Assuming the P0 animals were distributed independently of injection quality, the number of successfully injected animals was calculated as

$$u = \sqrt{\frac{N}{N + P}} \tag{2}$$

$$i = 1 - u \tag{3}$$

$$\text{successfully injected P0s} = 2\,(N + P)\,i \tag{4}$$

where u is the fraction of worms that were not successfully injected, i is the fraction of worms that were successfully injected, N is the number of plates with two unsuccessfully injected animals, and P is the number of plates with one or two successfully injected animals.
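The estimate can be sketched in a few lines. The sketch rests on one reading of the independence assumption: a plate is negative only when both of its P0 animals failed, so the fraction of negative plates estimates u squared. The function names and the plate counts in the usage lines are hypothetical, not data from this study.

```python
def successfully_injected_p0s(negative_plates: int, positive_plates: int,
                              worms_per_plate: int = 2) -> float:
    """Estimate how many P0 animals were successfully injected.

    negative_plates (N): plates with no array-positive F2 animals.
    positive_plates (P): plates with array-positive F2 animals.
    """
    total_plates = negative_plates + positive_plates
    if total_plates == 0:
        raise ValueError("no plates were scored")
    # A plate is negative only if every worm on it failed: u**k = N / (N + P).
    u = (negative_plates / total_plates) ** (1.0 / worms_per_plate)
    i = 1.0 - u  # fraction of worms successfully injected
    return worms_per_plate * total_plates * i


def insertion_frequency(independent_insertions: int, negative_plates: int,
                        positive_plates: int) -> float:
    """Independent insertion strains per successfully injected P0."""
    injected = successfully_injected_p0s(negative_plates, positive_plates)
    if injected == 0:
        raise ValueError("no successful injections were estimated")
    return independent_insertions / injected


if __name__ == "__main__":
    # Hypothetical experiment: 4 negative plates, 12 positive plates,
    # 4 independent insertion strains.
    print(successfully_injected_p0s(4, 12))
    print(insertion_frequency(4, 4, 12))
```

Because only one strain is isolated per plate, a plate carrying two successfully injected animals still contributes at most one insertion, so the frequency computed this way is conservative.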
Cre recombinase injections:
To excise the LoxP-flanked Cbr-unc-119 cassette, young adult hermaphrodites were micro-injected in the gonad with an injection mixture of 50 ng/μl pDD104 (Peft-3::Cre) or 50 ng/μl pMLS328 (Peft-3::2xNLS-Cre), 48 ng/μl pBluescript II, and 2 ng/μl pCFJ90 (Pmyo-2::mcherry). A total of 10–20 array-positive F1 animals were placed 5 to a plate on 6-cm NGM plates seeded with OP50 and incubated at room temperature for 3–5 days. Unc-119 F2 animals were cloned. To remove the unc-119(ed3) allele, Cbr-unc-119(−) hermaphrodites were mated with N2 males, and the F1 male progeny were mated to either EG6207 (unc-32 insertions) or N2 (all other insertions) hermaphrodites. F2 progeny from the second cross were cloned to identify strains homozygous for both the tag insertion and unc-119(+). unc-32 insertions were backcrossed to EG6207 to facilitate identification of chromosomal crossovers between unc-32 and unc-119 (5.6 cM apart) on chromosome III.
Young adult hermaphrodites of the Cbr-unc-119(−) gfp::FLP-on::rab-3 [gfp::FLP-on::rab-3 II; unc-119(ed3), III] or snt-1::FLP-on::gfp strain [snt-1::FLP-on::gfp II; unc-119(ed3) III] were micro-injected in the gonad with an injection mix of 5 ng/μl FLP expression vector [unc-119(+)] and either 95 ng/μl pBluescript II or 5 ng/μl pRJH179 (Punc47::snb-1::tagrfp) and 90 ng/μl pBluescript II. unc-119(+) F1 animals were cloned, and strains with stably passaging extrachromosomal arrays were selected for imaging studies.
Individual young adult hermaphrodite worms reared on HB101 (unc-119) or OP50 (all other strains) were placed in a 75-μl drop of M9 buffer on an NGM plate. After a 1-min equilibration period, worms were observed for 1 min and the number of thrashing events was counted.
Worms were anesthetized in 25 mM NaN3 on a 5% agarose pad. Once fully anesthetized, the pads were overlaid with a glass coverslip, sealed, and imaged on a Zeiss (Thornwood, NY) LSM-5 Pascal laser-scanning fluorescence microscope equipped with a 488-nm argon laser and a 543-nm helium-neon laser. Raw image stacks were converted to Z projections and adjusted using ImageJ software.
Strains are available upon request. Plasmids are available from Addgene as a single kit or as individual plasmids (https://www.addgene.org/).
All relevant data are contained within the manuscript and supporting information files.
A combined tag and selectable marker
To streamline the production of repair templates for CRISPR/Cas9-mediated insertion, we first sought to reduce the general complexity of the repair template. We opted to use a selectable marker to facilitate direct identification of modified worms. In this design, a LoxP-flanked C. briggsae unc-119 selectable marker is positioned within a synthetic intron of the tag, such as gfp (Figure 1A). The LoxP sites allow the unc-119 cassette to be excised by expression of CRE recombinase. To allow concurrent expression of both gfp and unc-119, we inserted the unc-119 gene in the opposite orientation relative to gfp, a configuration that mimics naturally occurring intron-nested genes (Kumar 2009). Using this cassette, a complete repair template can be generated simply by adding homology arms to each side of the gfp tag.
To test whether our cassette allows concurrent expression of both the tagged gene and the selectable marker, we used CRISPR/Cas9 to insert the cassette as an N-terminal tag on UNC-17 (vesicular acetylcholine transporter) or as a C-terminal tag on UNC-32 (v-ATPase subunit). Compromising the function of either of these genes causes an uncoordinated phenotype, and complete knockout of either gene is lethal (Alfonso et al. 1993; Pujol et al. 2001). The repair templates were injected into unc-119(ed3) animals and rescued animals were selected in the F2. For both loci, homozygous strains expressed GFP in the expected pattern and were phenotypically wild type (Unc-119+; Figure 1, B and C). We assayed locomotion in each strain by measuring thrashing in liquid media. The primary unc-32::gfp strain showed no locomotion defect compared to the wild type, whereas the primary gfp::unc-17 strain exhibited a moderate thrashing defect. After excising the unc-119(+) selectable marker by germline CRE recombinase expression and outcrossing to remove the unc-119(ed3) mutation, both tagged strains exhibited thrashing behavior indistinguishable from the wild type (Figure 1C). Thus, the locomotion defect of the primary gfp::unc-17 strain is not attributable to the GFP tag impairing UNC-17 protein function; rather, the embedded unc-119 gene interferes with unc-17 expression or vice versa. The locomotion defect of the primary gfp::unc-17 strain was not obvious on solid media and did not complicate selection of insertions from the unc-119(ed3) parent strain. Thus, our tagging cassette allows concurrent expression of both the tagged gene and the selectable marker; however, the function of either gene may be moderately impaired in primary strains for some loci.
Modular assembly of targeting vectors
Next, we sought to streamline the production of vectors for CRISPR/Cas9 repair. Each individual targeting event requires two components: an sgRNA (Jinek et al. 2012) that directs Cas9 to cleave at a specific site and a repair template containing the tag flanked by homology arms of the target (Figure 1A). Typically, these components are generated independently, so that two constructs must be built for each targeting event. In addition, Cas9 must be supplied as yet another plasmid (in our case Peft-3::cas9 is co-injected in all experiments). To reduce the plasmid assembly workload and subsequent plasmid management, we designed a single plasmid to encode both the sgRNA transcript and the repair template (Figure 2A).
To modify a single vector at multiple sites in a single step, we employed Golden Gate assembly (Engler et al. 2009). Golden Gate assembly uses the property of type-IIS restriction enzymes to cut outside of their recognition sequence to drive the ordered assembly of up to 10 different DNA fragments in a single-tube digest-and-ligate reaction. We designed our Golden Gate strategy around the restriction enzyme SapI and named the assembly method “SapTrap.” We chose SapI because its 7-bp recognition sequence is rare and because it produces 3-base overhangs that can be conveniently positioned to coincide with codons in the open reading frame. First, we designed a destination vector (pMLS256) that is opened by SapI digestion at two sites: one site is in a U6 promoter-driven sgRNA expression cassette and accepts the ∼20-bp sgRNA targeting sequence; the second site is flanked by M13 sequencing primer binding sites and accepts the repair template (Figure 2A). We divided the repair template into five separate components to be supplied independently to the assembly reaction: the 5′ and 3′ homology arms, the combined tag and selectable marker cassette, and two optional N- and C-terminal “connectors.” The connector modules fit between the tag and homology arms and encode either peptide linkers or regulatory sequences that control how the tag is expressed in relation to the target gene (Figure 2B, Table S1, and Table S5). SapI generates 3-base 5′ overhangs; we designed unique overhang sequences for each junction between DNA components. Overhang sequences were chosen to maximize assembly fidelity and to encode favorable amino acids (M, G, A, T) at junctions within coding regions of the repair template. The overhang sequences were designed to achieve high fidelity in reactions containing all five repair template DNA components; in most constructs the tag is inserted at the N or the C terminus of the protein and only one connector module is necessary (Figure 2).
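How a type-IIS enzyme defines junction overhangs can be illustrated with a short sketch. It assumes the commonly listed SapI specification, GCTCTTC(1/4), meaning the top strand is cut 1 nt and the bottom strand 4 nt past the recognition site; that offset should be checked against the supplier's documentation. The function names and the test sequence are hypothetical, and the compatibility check is a simplified stand-in for the fidelity screening described in this paper.

```python
SAPI_SITE = "GCTCTTC"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def sapi_overhangs(seq: str):
    """List (site_index, strand, overhang) for every SapI site in seq.

    Assumes cut offsets of 1 nt (top) and 4 nt (bottom) past the site, so
    the 3-base 5' overhang starts one base after the recognition sequence.
    For strand "-", site_index is in reverse-complement coordinates.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        start = s.find(SAPI_SITE)
        while start != -1:
            lo = start + len(SAPI_SITE) + 1
            overhang = s[lo:lo + 3]
            if len(overhang) == 3:  # skip sites too close to the end
                hits.append((start, strand, overhang))
            start = s.find(SAPI_SITE, start + 1)
    return hits


def junction_set_is_unambiguous(junctions) -> bool:
    """True if no overhang repeats or equals another's reverse complement."""
    seen = set()
    for overhang in junctions:
        overhang = overhang.upper()
        if overhang in seen or revcomp(overhang) in seen:
            return False
        seen.add(overhang)
    return True


if __name__ == "__main__":
    print(sapi_overhangs("AAGCTCTTCAGCGTTTT"))
    # GCG and ACG are the connector junction overhangs named in this paper.
    print(junction_set_is_unambiguous(["GCG", "ACG"]))
```

A 3-base overhang can never be its own reverse complement, so an odd-length overhang cannot self-ligate; the check therefore only needs to compare overhangs between junctions.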
Individual DNA components can be supplied to the SapTrap reaction in three ways: as PCR products, donor plasmids, or annealed oligos. PCR products and donor plasmids require SapI digestion to produce 5′ overhangs, whereas annealed oligos are designed to contain the appropriate 5′ overhangs without digestion. Because we have successfully inserted tags using homology arms of just 50–60 bp (see below), we prefer annealed oligos for homology arms and sgRNA targeting sequences. A library of donor plasmids supplies DNA components of the three remaining types: N-tagging connectors, C-tagging connectors, and the tags (which also include the selectable marker) (Figure 2B, Table S1, and Table S5). Donor plasmids of the same type all produce the same 5′ overhangs upon SapI digestion; by selecting different combinations of connector and tag donor plasmids from the library, a large number of distinct repair templates can be built with a single set of homology arms and sgRNA oligos. For high-throughput projects with the goal of introducing a single tag at a large number of loci, the tag and connector donor plasmids can be eliminated from the SapTrap reaction by preassembling these components into a “three-site” destination vector (Figure S1). The three-site vector requires only synthetic oligos for assembly.
Targeting vectors are assembled by incubating the destination vector, annealed oligos, and donor plasmids with SapI, T4 polynucleotide kinase (to phosphorylate annealed oligos), T4 DNA ligase, and ATP (Figure S2). The enzyme mixture can be prepared in bulk and stored at −80° in single reaction aliquots. To assemble a targeting vector, the DNA components are mixed with a thawed aliquot of SapTrap enzyme mixture and incubated overnight at 20°–25°. Background from unreacted destination vector can be eliminated by subsequently heat-inactivating the DNA ligase and digesting the assembly reaction with a counterselection restriction enzyme. Recognition sites for these enzymes exist only in the portion of the destination vector that is removed during targeting vector assembly (Figure S3). Counterselection restriction sites should be absent from homology arms, repair template, and sgRNA coding sequence; eight different counterselection sites are available so that an appropriate enzyme can be chosen for any particular plasmid. Transformation of competent E. coli with a single 2.5-μl SapTrap reaction yields >100 colonies in our experience. The donor plasmids do not contribute background colonies because they contain a kanamycin resistance gene, whereas the assembled targeting vector contains an ampicillin resistance gene. For 24 unique constructs assembled during this study, 49% of all colonies screened contained the correctly assembled plasmid (91/185 colony PCRs performed) with a range of 20–100% correct assemblies for individual reactions. Of 51 plasmids subjected to Sanger sequencing of the oligo-derived sequences, 86% (44/51) had the correct sequence and 14% (7/51) had a single-base deletion or point mutation in one of the oligo-derived sequences.
Short homology arms
In previous studies, homology arms as short as 500 bp for plasmid repair templates and 30 bp for linear repair templates efficiently direct insertion into the C. elegans genome (Paix et al. 2014; Dickinson et al. 2015). To determine the effect of homology arm length on insertion frequency for SapTrap vectors, we targeted the 5′ end of the snb-1 gene with a gfp cassette flanked by homology arms 0, 44, 100, or 400 bp in length (Figure 3A and Table S3). All vectors used the same sgRNA that cleaves 7 bases upstream of the translational start site (see Table S2). As expected, no insertions were recovered using the construct lacking homology arms. However, 44-bp homology arms generated insertions in 21% of injected animals (P0s segregating insertions/successfully injected P0s), and increasing the length of the homology arms to 400 bp resulted in a marginal but insignificant increase in the insertion rate. Short homology arms are advantageous because they can be generated inexpensively by annealing pairs of synthetic oligos. Oligo-derived homology arms eliminate the need to perform PCR during vector construction and facilitate mutation of the PAM site or the sgRNA binding site in the repair template to prevent recutting after repair. For these reasons, we used annealed oligos (“short arms”) for all subsequent plasmid constructions, unless noted otherwise. Currently, 60-base oligos are the longest oligos that can be custom synthesized inexpensively with a low error rate; 3 bp must be dedicated to the SapTrap overhang, and thus homology arms are 57 bp in length. Homology arm length was extended from 44 bp tested at the snb-1 locus to 57 bp because longer homology arms allow more flexibility in picking a guide sequence relative to the insertion site.
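The 60-base budget described above (57 bases of homology plus a 3-base overhang) can be sketched as an oligo design helper. This is an illustration under stated assumptions rather than the authors' design tool: the function name is hypothetical, the left overhang "TGG" in the usage lines is a placeholder, and only GCG is taken from this paper as the 5′ homology arm–connector overhang.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
MAX_OLIGO_LEN = 60  # longest inexpensive oligo length cited in the text


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_annealed_pair(insert: str, left_overhang: str, right_overhang: str):
    """Return (top, bottom) oligos, both written 5'->3'.

    After annealing, the duplex carries a 5' overhang on the top strand at
    the left end and a 5' overhang on the bottom strand at the right end.
    Overhangs are given as read on the top strand.
    """
    insert = insert.upper()
    left_overhang = left_overhang.upper()
    right_overhang = right_overhang.upper()
    if len(left_overhang) != 3 or len(right_overhang) != 3:
        raise ValueError("SapI-compatible overhangs are 3 bases long")
    top = left_overhang + insert
    bottom = revcomp(insert + right_overhang)
    if len(top) > MAX_OLIGO_LEN:
        raise ValueError("oligo exceeds the 60-base limit; shorten the arm")
    return top, bottom


if __name__ == "__main__":
    arm = "ACGT" * 14 + "A"  # hypothetical 57-bp homology arm
    top, bottom = design_annealed_pair(arm, "TGG", "GCG")
    print(len(top), top)
    print(len(bottom), bottom)
```

Mutations that destroy the PAM or the sgRNA binding site can be written directly into the insert string before the pair is designed, which is the flexibility the oligo-based approach provides.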
To validate our SapTrap vectors, we tested insertion at six different genes. We generated short-arm targeting vectors to introduce gfp or tagrfp with intron-nested unc-119 at the 5′ end of rab-3, snb-1, and unc-17 and at the 3′ end of sng-1, snt-1, and unc-32. These vectors generated insertions at all loci except the synaptotagmin locus snt-1, with insertion frequencies ranging from 10% to 37% (Figure 3B and Table S3). All insertions were verified by PCR and fluorescence imaging (Figure S6). We conclude that the short-arm targeting vectors direct efficient insertion at a wide range of genetic loci.
Because we were unable to insert a tag into snt-1 at the original site, we tested a different Cas9 cut site. We identified a suitable cut site 277 bp upstream of the snt-1 stop codon (see Figure S4). To tag snt-1 at the 3′ end using this cut site, the 5′ homology arm needed to include the 277 bases between the upstream cut site and the insertion site, as well as a true 5′ homology arm flanking the 5′ side of the cut site. Because this homology arm is too long to assemble from annealed oligos, we ordered a synthetic 750-bp DNA fragment (gBlocks; IDT) encoding both 5′ and 3′ homology arms and the sgRNA targeting sequence. We flanked each homology arm and the sgRNA targeting sequence with SapI sites so the intact gBlock sequence could be fed into a SapTrap reaction to release all three target site-specific sequences. The 750-bp length gBlock allotment allowed us to encode longer homology arms (165 bp for the 5′ homology arm and 147 bp for the 3′ homology arm) than possible using synthetic oligos. The resulting targeting vectors (snt-1 long) directed insertion of gfp or tagrfp at the 3′ end of snt-1 in 7% of injected P0s, a low but comparable frequency to the short-armed vectors targeting other genes. These results demonstrate that large synthetic double-stranded DNA substrates are an alternative to oligo-based assembly pipelines for SapTrap vectors.
In addition to 40 successfully tagged strains (Figure 3 and Table S3), we isolated 7 strains that lacked an insertion at the targeted locus, but were stably rescued for unc-119. Because these strains did not carry extrachromosomal array markers, they appear to represent off-site insertions of the unc-119(+) repair template and were not counted as targeted insertions in our frequency calculations. We observed an off-site insertion rate of 15% (7 false-positive strains/47 total positive strains). False-positive strains that appear to contain off-site insertions have been reported previously (Dickinson et al. 2015; Katic et al. 2015) and appear to be a general consequence of tag insertion by CRISPR/Cas9.
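The off-site rate quoted above follows directly from the strain counts in the text. A short Python check (the function name is illustrative only):

```python
def percent(count: int, total: int) -> float:
    """Return count/total as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


tagged_strains = 40     # strains with the tag at the targeted locus
off_site_strains = 7    # unc-119 rescued, but no insertion at the target
total_positive = tagged_strains + off_site_strains

off_site_rate = percent(off_site_strains, total_positive)
print(f"{off_site_strains}/{total_positive} = {off_site_rate:.1f}%")
```

The result, 14.9%, rounds to the 15% reported.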
Within SapTrap’s modular design, we included two optional connector modules to add sequences between the tag and each homology arm (Figure 2A). For simple tagging operations, we built connectors that encode glycine-rich flexible linkers (see Table S1 and Table S5). However, the connector modules can include more complex regulatory sequences to support more complex tagging operations. For example, it is often experimentally useful to restrict expression of a tagged protein to a specific subset of tissues or even a single cell. To illustrate the utility of the connectors, we developed a set of connector modules that can confer cell type-specific expression of tags at native loci.
We developed specialized strategies for cell-specific expression of either N- or C-terminal tags. Both strategies are based on the FLP/FRT recombination system (Golic and Lindquist 1989; Hubbard 2014). FLP is a site-specific recombinase from yeast that acts at FLP-recombinase targets (FRTs); if two FRTs are in the same orientation, FLP will excise the sequence between the FRTs. For both the N- and the C-terminal tags, we built connector modules containing tandem FRT sites flanking an “off cassette” that disrupts the attachment of the tag to the protein and blocks expression of the fluorescent protein (Figure 4A). Excision of the intervening sequence by FLP couples the tag to the protein of interest, using an “FLP-on” strategy (Davis et al. 2008).
For N-terminal tags, the off cassette consists of a PEST degron from mouse ornithine decarboxylase (Li et al. 1998) and the intergenic region from the gpd-2/gpd-3 operon (Lee et al. 1992). Before the off cassette is excised, the transcript is trans-spliced into two messages: one containing the tag fused to the PEST degron and a second containing the untagged target gene. The tag::PEST protein is translated but rapidly degraded, while the target gene is translated separately with only the FRT site (12 amino acids) appended to the N terminus. For C-terminal tags, the off cassette consists of the let-858 3′-UTR. This 3′-UTR sequence contains a transcriptional stop motif, terminating transcription before the tag sequence is reached (Davis et al. 2008). In both cases, FLP expression induces recombination between the two FRT sites, excising the off cassette and leaving a single FRT site between the native gene and the tag sequence (Figure 4A). The residual FRT site lies in frame with the target gene and tag sequences and encodes a 12-amino-acid flexible linker sequence when translated (GSSYSLESIGTS).
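The excision logic and the in-frame linker can be checked with a small Python model. This is a sketch under stated assumptions: the FRT sequence below is the commonly cited 34-bp minimal FRT, which is not given in the text; the single flanking bases that pad it to 36 nt are chosen here only to complete the first and last codons; and the off cassette is a placeholder string.

```python
from itertools import product

# Commonly cited 34-bp minimal FRT site (assumption: not stated in the text).
FRT = "GAAGTTCCTATTCTCTAGAAAGTATAGGAACTTC"

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate complete codons of an uppercase DNA string, starting at base 0."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def flp_excise(locus: str, frt: str = FRT) -> str:
    """Model FLP acting on two same-orientation FRT sites.

    The sequence between the first two FRT copies is removed together with
    one FRT, leaving a single FRT. With fewer than two sites, nothing changes.
    """
    first = locus.find(frt)
    if first == -1:
        return locus
    second = locus.find(frt, first + len(frt))
    if second == -1:
        return locus
    return locus[:first] + locus[second:]


# Placeholder flanks: "G" completes the first codon, "C" the last (assumptions).
off_cassette = "TAATAGTGA"  # placeholder, not the real off cassette
before = "G" + FRT + off_cassette + FRT + "C"
after = flp_excise(before)

print(after == "G" + FRT + "C")
print(translate(after))
```

Under these assumptions the 36-nt residual site translates to GSSYSLESIGTS, matching the linker stated above.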
To validate these FLP-dependent tags, we used CRISPR/Cas9 to introduce an FLP-on N-terminal GFP tag at the rab-3 locus and an FLP-on C-terminal GFP tag at the snt-1 locus and assayed GFP expression in the absence and presence of FLP recombinase expression. To promote excision of the off cassette, we used a hyperactive variant of FLP recombinase [FLP-D5 (Nern et al. 2011)] with two nuclear localization signals appended (M. W. Davis, unpublished data). In the absence of FLP expression, neither strain produced detectable levels of GFP in the nervous system (Figure 4B and Figure S5). Upon pan-neuronal expression of FLP, both strains exhibited strong GFP expression throughout the nervous system, as expected since both of these genes are expressed in all neurons (Stefanakis et al. 2015). We conclude that the FLP-on connectors permit recombination-dependent tag expression for both N- and C-terminal tags.
To validate cell specificity of FLP-on GFP induction, we expressed FLP under the control of a variety of more restrictive promoters and assayed the induction of SNT-1::GFP. First, we examined conditional tagging of SNT-1 in GABA neurons. Since SNT-1 is a synaptic vesicle protein, we expressed the synaptic vesicle marker SNB-1::tagRFP in GABA neurons to confirm subcellular localization. Expression of FLP under the unc-47 promoter (McIntire et al. 1997) caused SNT-1::GFP to colocalize with tagRFP at synapses along the ventral nerve cord (Figure 4C), indicating that recombination of the locus was specifically induced in the GABA neurons. In a second experiment, we induced SNT-1::GFP in the acetylcholine neurons by expressing FLP under the control of the unc-17 promoter. In this strain, SNT-1::GFP localized to different synapses from the GABA synapses. Finally, we tested induction of SNT-1::GFP in the serotonin neurons by expressing FLP under the promoter for tph-1. There are only six serotonin neurons in the C. elegans hermaphrodite: the bilateral ADF and NSM neurons in the head (Figure 4D) and the HSN neurons flanking the vulva (Sze et al. 2000). Expression of FLP under the control of the tph-1 promoter caused specific expression of SNT-1::GFP in the presynaptic regions of the ADF and NSM neurons in the head (Figure 4E, HSN not detected). Note that the untagged SNT-1 protein is still expressed in all other neurons; it is simply not tagged by GFP in these cells. In all experiments presented in Figure 4, GFP was induced in 99.6% (261/262 animals) of the worms harboring the FLP-expressing array (Table S4). Thus, these FLP-on constructs enable careful analysis of the subcellular localization of proteins without fluorescence appearing in other cells and without the overexpression that is often inherent in classical transgenes.
SapTrap is a high-efficiency plasmid assembly protocol and component toolkit for inserting genetically encoded tags at native loci in C. elegans. SapTrap produces single plasmid targeting vectors that, when co-injected with a Cas9 expression plasmid, insert genetic tags with high frequency (10–37%). Four design features distinguish the SapTrap method: (1) modular assembly, (2) a library of tags and regulatory sequences, (3) an embedded selectable marker, and (4) short oligo-derived homology arms.
The primary advantage of SapTrap is that it allows high-efficiency modular assembly of a single but complex targeting vector. Each targeting vector contains a guide RNA expression cassette and a repair template. The only site-specific reagents required for assembly of the SapTrap targeting construct are the sgRNA targeting sequence and the 5′ and 3′ homology arms. These sequences are sourced from oligos or other synthetic DNAs, eliminating the need for PCR or molecular cloning during vector assembly. The non-site-specific components are tags, selectable markers, and “connector” modules that are provided as a prebuilt plasmid library. By choosing different combinations of tags and connectors from the library, a variety of functionally distinct tagging vectors can be assembled for a single insertion site. The modular design is particularly useful if target genes need to be coupled to a variety of different tags, such as different fluorescent markers or affinity tags. Finally, assembly is robust and inexpensive: the 2.5-μl reactions produce hundreds of clones, and the non-DNA reagents cost <$1 per reaction.
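The idea that modules are trapped in a fixed order by their overhangs can be sketched in Python. The model assumes each module is described by the 3-base overhang on its left and right ends and that each junction overhang is unique. The overhang sequences and the module list are invented for illustration and cover only the repair-template modules; they are not the SapTrap junctions.

```python
def assembly_order(modules, start, end):
    """Chain modules from overhang `start` to overhang `end`.

    modules maps a name to (left_overhang, right_overhang). A module can
    ligate only to a partner whose overhang matches, so unique overhangs
    force a single order regardless of how the parts are mixed.
    """
    by_left = {}
    for name, (left, _right) in modules.items():
        if left in by_left:
            raise ValueError(f"overhang {left} is used by two modules")
        by_left[left] = name

    order, current = [], start
    while current != end:
        if current not in by_left:
            raise ValueError(f"no unused module begins with overhang {current}")
        name = by_left.pop(current)
        order.append(name)
        current = modules[name][1]

    if by_left:
        raise ValueError(f"modules left out of the chain: {sorted(by_left.values())}")
    return order


# Placeholder overhangs, listed in scrambled order (all values are assumptions).
example_modules = {
    "3' homology arm": ("GGT", "TAC"),
    "tag + selectable marker": ("AAG", "CGC"),
    "5' homology arm": ("TGG", "ACG"),
    "3' connector": ("CGC", "GGT"),
    "5' connector": ("ACG", "AAG"),
}

example_order = assembly_order(example_modules, start="TGG", end="TAC")
print(" -> ".join(example_order))
```

Swapping the tag or a connector for another library plasmid with the same overhangs leaves the order unchanged, which is the property that makes the library combinatorial.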
The library of non-site-specific elements of the repair template is divided into two functional types: the tag and the connector. Tag sequences generally encode proteins and currently include fluorescent tags (GFP, tagRFP) and nonfluorescent tags (Halo, SNAP) (Keppler et al. 2003; Los and Wood 2007) useful for fluorescence imaging or biochemical purification. We envision that the library will grow to include other tags such as degrons that can induce proteolytic degradation of the protein leading to functional knockout (Rakhit et al. 2014), proximity labeling tags for identifying neighboring proteins or interaction partners (Roux et al. 2012; Rhee et al. 2013), and MS2 stem loops for tracking mRNAs in vivo (Bertrand et al. 1998). Connectors are optional DNA modules that fit between the tag sequence and the homology arms and generally control the transcriptional and translational coupling between the target gene and the tag. The connectors include basic flexible peptide linkers for building translational fusions, trans-splicing elements for generating transcriptional reporters, and a transcriptional termination sequence for generating fluorescently marked null mutations (Table S5). To address an important limitation of native-locus tags, we developed a set of N- and C-terminal FLP-on connectors that prevent expression of the tag in the absence of FLP recombinase. By expressing FLP in specific cells, the tag can be coupled to the protein in only those cells. By separating the tags from the connector regulatory sequences, the SapTrap library is kept small and the combinatorial utility of the modules is maximized. 
For example, the single gfp donor plasmid in the current library can be inserted internally or at either the N or the C terminus of any gene product and can be coupled to the protein as a constitutive or conditional translational fusion or as a nuclear-localized or cytoplasmic transcriptional reporter, simply by pairing it with different connectors from the existing library.
Successful CRISPR/Cas9 insertions are identified by co-insertion of a selectable marker. Tags in the SapTrap library contain the C. briggsae unc-119 gene nested in a synthetic intron of the tag, and worms containing SapTrap tag insertions are directly selected by screening for rescue of the injected unc-119(ed3) strain. A disadvantage of co-insertion of a selectable marker is that it could affect expression of the target gene. Co-CRISPR avoids selection markers and enriches for successfully injected strains by monitoring events at a second locus (Arribere et al. 2014; Kim et al. 2014). Targeted events can be identified in co-CRISPR by phenotypic selection or a secondary selection such as expression of GFP. On the other hand, a selectable marker is useful when inserting tags that are undetectable on a fluorescence dissecting microscope, such as Halo and SNAP tags or tags incorporating FLP-on connectors. A feature of the SapTrap constructs is that the unc-119 selectable marker is inserted in the reverse orientation relative to the target gene, and thus the target gene and the selectable marker can be concurrently expressed. For genes with viable mutant phenotypes concurrent expression is not crucial (Dickinson et al. 2015; Norris et al. 2015). However, when tagging essential genes for which loss-of-function cannot be tolerated during strain construction, concurrent expression of the target gene and tag is advantageous. Although it was conceivable that this arrangement would lead to colliding RNA polymerases or silencing from double-stranded RNA production in neurons, we observed successful expression of six different neuronal gene constructs containing the inverted unc-119 selectable marker. However, we did observe moderate interference between the tag and marker genes at one locus. Although weak, the interference suggests that the unc-119 marker should always be removed after strain isolation. 
In our work, CRE-mediated excision of the Cbr-unc-119 marker was efficient; injection of a CRE expression construct generated excisions at all six loci, and nearly all successfully injected animals produced unc-119 progeny due to successful excision. The Goldstein laboratory has recently demonstrated that a heat-shock-inducible CRE recombinase inserted at single copy in the worm genome drives effective germline CRE expression (Dickinson et al. 2015). We hope to incorporate this advance in future implementations of our system, negating the need for a second injection step. Finally, we note that the SapTrap assembly protocol is not limited to our novel selection cassette. By removing the unc-119 marker from the tag donor plasmids, markerless repair templates for co-CRISPR can be built. Alternatively, other recently published syntron-embedded selectable markers (Dickinson et al. 2015; Norris et al. 2015) can be added to tag donor plasmids, combining the full utility of the SapTrap modular assembly toolkit with these alternative selection strategies, each with its own associated strengths.
Short homology arms of just 30–60 bp were pioneered for co-CRISPR applications by the Seydoux group (Paix et al. 2014). We have similarly observed that homology arms of just 57 bp generated insertions at high frequency at most loci tested. An advantage to short homology arms is that they can be generated inexpensively and without PCR by annealing synthetic oligonucleotides. But short arms also mean that the cut site must be very close to the targeted insertion site (in our experiments separated by only 4–20 bp), which limits selection of the guide RNA binding site. The Meyer group recently demonstrated that guide RNAs that bind to sites with a diguanine motif immediately preceding the PAM sequence (called ggNGG guides) are significantly more efficient at generating double-stranded breaks in the worm genome in vivo (Farboud and Meyer 2015). Binding sites containing the ggNGG motif are significantly less common than binding sites containing only the NGG PAM sequence. Within the ranges of the short homology arms we employed, we were unable to locate suitable binding sites conforming to the ggNGG design principle (Table S2). Nonetheless, high insertion rates were achieved at most loci tested, as observed by others using short homology arms (Paix et al. 2014). It is possible that high injection concentrations of the targeting construct overcome inefficient cutting. Either higher levels of sgRNA expression may compensate for suboptimal sgRNAs or higher repair template levels may favor insertion events even when cutting is inefficient. In cases where a guide RNA binding site cannot be located near the desired insertion site, SapTrap can accept longer homology arms from alternative synthetic DNA sources or from PCR products. Alternatively, Cas9 variants with altered PAM specificities may increase the availability of guide RNA binding sites within these narrow windows (Bell et al. 2015; Kleinstiver et al. 2015).
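The search for guide RNA binding sites within a narrow window can be sketched in Python. The sketch assumes a 20-nt target followed by an NGG PAM, places the Cas9 cut 3 bp upstream of the PAM, and flags a site as ggNGG when the last two bases of the target are GG. The function names, the coordinate convention, and the example sequence are illustrative assumptions.

```python
import re
from typing import List, NamedTuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")
# Lookahead so that overlapping candidate sites are all reported.
SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


class GuideSite(NamedTuple):
    strand: str       # "+" or "-" relative to the input sequence
    protospacer: str  # 20-nt target, 5'->3' on its own strand
    pam: str
    cut: int          # input-strand index of the first base after the cut
    is_gg: bool       # True if the target ends in GG (ggNGG site)


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_guide_sites(seq: str) -> List[GuideSite]:
    """List every 20-nt + NGG site on both strands of seq."""
    seq = seq.upper()
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in SITE.finditer(s):
            cut_on_s = m.start() + 17  # 3 bp 5' of the PAM
            cut = cut_on_s if strand == "+" else len(seq) - cut_on_s
            target = m.group(1)
            sites.append(GuideSite(strand, target, m.group(2), cut,
                                   target.endswith("GG")))
    return sites


def sites_near(sites, insertion_site: int, max_distance: int):
    """Keep sites whose cut lies within max_distance bp of the insertion site."""
    return [s for s in sites if abs(s.cut - insertion_site) <= max_distance]


# Invented example sequence containing one ggNGG site and one plain NGG site.
example_seq = "AAAAA" + "ACTACTACTACTACTACTGG" + "TGG" + "AAAAA"
example_sites = find_guide_sites(example_seq)
for site in sites_near(example_sites, insertion_site=25, max_distance=3):
    print(site)
```

Filtering with a `max_distance` of roughly 20 bp mirrors the 4–20 bp separations used here and shows how quickly the pool of ggNGG candidates shrinks for a given insertion site.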
Finally, in addition to the practical advantages of SapTrap for building constructs for modification of individual genes, the SapTrap single vectors will be advantageous for high-throughput applications, for example tagging of hundreds of genes in the genome. It will be simpler to build libraries with a single targeting vector for each gene than to build separate sgRNA and repair template vectors for each targeting event. For high-throughput applications seeking to introduce a specific tag at a large number of loci, the tag can be preassembled in a three-site destination vector (see Figure S1), reducing the complexity of the assembly operation. Coupled with robotic micro-injection (Gilleland et al. 2015), it is conceivable that SapTrap vector libraries could be used for genome-wide projects to determine the expression pattern or generate knockouts of all genes in the C. elegans genome.
We thank Christian Frøkjær-Jensen for suggesting the reverse-oriented Cbr-unc-119 selectable marker, M. Wayne Davis for contributing the 2xNLS-FLP-D5 construct, and Robert Hobson for contributing the Punc-47::snb-1::tagRFP plasmid. We thank Jamie White, Aude Peden, Eric Bend, and Christian Frøkjær-Jensen for additional plasmid reagents. We thank Christian Frøkjær-Jensen, M. Wayne Davis, and members of the Jorgensen laboratory for critical discussions. This work was supported by National Institutes of Health grant R01 2R01GM095817 (to E.M.J.).
Communicating editor: O. Hobert
Supporting information is available online at www.genetics.org/lookup/suppl/doi:10.1534/genetics.115.184275/-/DC1.
- Received October 31, 2015.
- Accepted January 20, 2016.
- Copyright © 2016 by the Genetics Society of America