Engineered nucleases that cleave specific DNA sequences in vivo are valuable reagents for targeted mutagenesis. Here we report a new class of sequence-specific nucleases created by fusing transcription activator-like effectors (TALEs) to the catalytic domain of the FokI endonuclease. Both native and custom TALE-nuclease fusions direct DNA double-strand breaks to specific, targeted sites.
Zinc finger nucleases (ZFNs) and meganucleases cleave specific DNA target sequences in vivo and are powerful tools for genome modification (Carroll 2008; Cathomen and Joung 2008; Galetto et al. 2009). Chromosome breaks created by these engineered nucleases stimulate homologous recombination or gene targeting: in the presence of a template for repairing the double-strand break, specific DNA sequence changes in the template become incorporated into chromosomes at or near the break site. In the absence of a repair template, broken chromosomes are rejoined by nonhomologous end joining, which often introduces short DNA insertions or deletions that create targeted gene knockouts. For both ZFNs and meganucleases, a barrier to their widespread adoption has been the challenge in engineering new DNA binding specificities. While significant progress has been made in recent years (Urnov et al. 2005; Fajardo-Sanchez et al. 2008; Maeder et al. 2008; Kim et al. 2009), generating the necessary reagents to modify new DNA targets is still resource intensive and to some degree empirical.
A novel DNA binding domain was recently described in a family of proteins known as transcription activator-like effectors (TALEs) (Boch et al. 2009; Moscou and Bogdanove 2009). TALEs are produced by plant pathogens in the genus Xanthomonas, which deliver the proteins to plant cells during infection through the type III secretion pathway (Bogdanove et al. 2010). Once inside the plant cell, TALEs enter the nucleus, bind effector-specific DNA sequences, and transcriptionally activate gene expression. Typically, activation of target genes increases plant susceptibility to pathogen colonization, but in some cases, it triggers plant defense. TALE binding to DNA is mediated by a central region of these proteins that contains as many as 30 tandem repeats of a 33- to 35-amino-acid-sequence motif (Figure 1A). The amino acid sequence of each repeat is largely invariant, with the exception of two adjacent amino acids (the repeat variable diresidue or RVD). Repeats with different RVDs recognize different DNA base pairs, and there is a one-to-one correspondence between the RVDs in the repeat domain and the nucleotides in the target DNA sequence, constituting a straightforward cipher (Boch et al. 2009; Moscou and Bogdanove 2009). Using this cipher, targets of new TALEs have been correctly predicted, and functional targets of TALEs composed of randomly assembled repeats have been generated (Boch et al. 2009; Moscou and Bogdanove 2009; Romer et al. 2010). The ability to predict the DNA binding specificity of native or artificial TALEs suggests a variety of applications for these proteins in the targeted modification of genomes. In particular, the DNA recognition domain could direct a fused nuclease to a specific DNA sequence. Further, the apparent modularity of the repeats should enable rapid construction of TALE nucleases (TALENs) with novel specificities to target double-strand breaks at specific locations in the genome.
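The one-to-one cipher described above can be illustrated with a minimal Python sketch. It assumes only the four RVD-to-base assignments that this report uses later for custom repeat design (NI for A, HD for C, NN for G, NG for T); native TALEs contain other RVDs that are not modeled here. The function name and the example repeat array are hypothetical.

```python
# Illustrative RVD-to-base cipher, restricted to the four RVDs named in this
# report for custom repeat design. Other native RVDs are deliberately omitted.
RVD_TO_BASE = {"NI": "A", "HD": "C", "NN": "G", "NG": "T"}


def predict_target(rvds):
    """Return the DNA sequence predicted for an ordered list of RVDs."""
    unknown = [rvd for rvd in rvds if rvd not in RVD_TO_BASE]
    if unknown:
        raise ValueError(f"RVDs not covered by this sketch: {unknown}")
    # One repeat (one RVD) corresponds to one nucleotide of the target.
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)


example_rvds = ["HD", "NG", "NI", "NN", "NG"]  # hypothetical repeat array
print(predict_target(example_rvds))
```

Because each repeat contributes exactly one nucleotide, the predicted target has the same length as the repeat array.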
To test this idea, we modified TALEs by adopting the molecular architecture used for ZFNs. ZFNs function as dimers, with each monomer comprising a DNA binding domain (a zinc finger array) fused to the catalytic domain of the FokI restriction enzyme (Carroll 2008; Cathomen and Joung 2008). Two zinc finger arrays are engineered to recognize target sequences in the genome (each typically 9–12 bp) that are separated by a spacer sequence (typically 5–7 bp). Binding of the zinc finger arrays to the target allows FokI to dimerize and create a double-strand break within the spacer. We reasoned that the zinc finger arrays could be substituted with the DNA recognition domain of TALEs to create TALENs that recognize and cleave DNA targets (Figure 1A). To assess TALEN function, we adapted a yeast assay in which LacZ activity serves as an indicator of DNA cleavage (Townsend et al. 2009). In this assay, a target plasmid and a TALEN expression plasmid are brought together in the same cell by mating. The target plasmid has a lacZ reporter gene with a 125-bp duplication of coding sequence. The duplication flanks the TALEN target site. When a double-strand DNA break occurs at the target site, it is repaired through single-strand annealing between the duplicated sequences, creating a functional lacZ gene whose expression can be measured by standard assays.
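The logic of the reporter can be sketched in a few lines of Python. This is a toy string model, not the actual plasmid: the flanking sequences, the target site, and the 125-bp repeat are random placeholders, and the function names are hypothetical. It shows only that annealing between the duplicated segments removes one copy of the repeat together with the intervening target site.

```python
import random


def make_reporter(upstream, repeat, target_site, downstream):
    """Assemble a disrupted reporter: the repeat is duplicated around the target site."""
    return upstream + repeat + target_site + repeat + downstream


def single_strand_annealing(reporter, repeat, target_site):
    """Collapse repeat-target-repeat to a single repeat, the outcome expected after annealing."""
    disrupted = repeat + target_site + repeat
    start = reporter.find(disrupted)
    if start < 0:
        raise ValueError("reporter does not contain the duplicated region")
    return reporter[:start] + repeat + reporter[start + len(disrupted):]


rng = random.Random(0)  # fixed seed so the placeholder sequences are reproducible


def random_dna(n):
    """Return n random bases; a stand-in for real sequence."""
    return "".join(rng.choice("ACGT") for _ in range(n))


upstream, downstream = random_dna(300), random_dna(300)  # placeholder flanks, not real lacZ
repeat = random_dna(125)      # 125-bp duplicated segment, as in the text
target_site = random_dna(41)  # placeholder for the nuclease target site

reporter = make_reporter(upstream, repeat, target_site, downstream)
repaired = single_strand_annealing(reporter, repeat, target_site)
print(len(reporter) - len(repaired))  # bases lost: one repeat copy plus the target site
```

In this model the repaired sequence equals the uninterrupted reporter, which corresponds to the functional lacZ gene scored in the assay.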
We first used two well-characterized TALEs, namely AvrBs3 from the pepper pathogen Xanthomonas campestris pv. vesicatoria and PthXo1 from the rice pathogen X. oryzae pv. oryzae (Bonas et al. 1989; Yang et al. 2006). Both TALE-encoding genes have a BamHI restriction fragment that encompasses the coding sequence for the repeat domain and 287 aa prior and 231 aa after (Figure 1A). Absent from the BamHI fragment is the TALE transcriptional activation domain, which was replaced with a FokI nuclease monomer. Because the FokI monomers must dimerize to cleave, it was unclear what an appropriate spacer length between the two DNA recognition sites might be. For ZFNs, in which the zinc finger array is separated from FokI by a 4- to 7-amino-acid linker, the typical spacer between the two recognition sites is 5–7 bp. Since 235 amino acids separate the repeat domain from FokI in our TALEN constructs, we selected a 15-bp spacer to separate the two recognition sites, assuming that a larger spacer may be required to enable FokI to dimerize. As a positive control we used a well-characterized zinc finger nuclease with a DNA binding domain derived from the mouse transcription factor Zif268 (Porteus and Baltimore 2003). As negative controls, the TALE domains were fused to a catalytically inactive FokI variant or tested against noncognate DNA targets. Robust activity was observed for both the AvrBs3 and the PthXo1 TALENs (Figure 1B). The activity of the PthXo1 TALEN approximated that of the ZFN positive control.
We next varied the distance between the TALE binding sites (11 length variants between 12 and 30 bp) to identify spacer lengths that enable FokI to dimerize most efficiently (Figure 1C). Both enzymes showed two spacer length optima—one at 15 bp and the other at either 21 bp (AvrBs3) or 24 bp (PthXo1). For PthXo1, activity was observed for all tested spacer lengths 13 bp and longer. Some spacer lengths for AvrBs3 showed no activity, however, suggesting that spacer length is critical for certain TALENs.
The above experiments tested activity of homodimeric TALENs, which bind two identical recognition sequences placed in opposition on either side of the spacer. Since such palindromic sites are unlikely to occur naturally in genomic targets, we tested whether TALENs could function as heterodimers. AvrBs3 and PthXo1 recognition sites were placed on either side of a 15-bp spacer (Figure 1D). The resulting activity of the heterodimeric TALEN approximated an average of the activities observed with the two homodimeric enzymes.
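The layout of homodimeric and heterodimeric targets can be made concrete with a short Python sketch. It assumes that the second recognition site is placed on the opposite strand, so it appears as a reverse complement when the target is written on one strand. The recognition sequences below are hypothetical placeholders, not the AvrBs3 or PthXo1 sites, and the spacer is written in lowercase only to make it visible.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def build_target(left_site, right_site, spacer):
    """Write a two-site target on one strand: left site, spacer, opposed right site."""
    return left_site.upper() + spacer.lower() + reverse_complement(right_site)


site_a = "TACGTAGGCATCA"   # hypothetical recognition site
site_b = "TGGACCTTAGCAA"   # hypothetical second recognition site
spacer = "acgtacgtacgtacg"  # arbitrary 15-bp spacer

homodimer_target = build_target(site_a, site_a, spacer)    # same site on both sides
heterodimer_target = build_target(site_a, site_b, spacer)  # two different sites
print(homodimer_target)
print(heterodimer_target)
```

Changing the length of `spacer` gives the kind of series used to probe the distance dependence of cleavage.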
To realize the potential of TALENs for genome modification, it will be necessary to engineer them to recognize novel chromosomal DNA sequences. Randomly assembled repeat domains have been shown to bind DNA targets predicted by the cipher (Boch et al. 2009); however, custom TALEs designed to recognize new target sequences have not been reported. This is an important distinction, because the former experiments validate the cipher, whereas the latter would demonstrate that DNA binding domains can be engineered with the requisite specificity for in vivo manipulations. To test whether repeat domains can be assembled to target TALENs to arbitrary chromosomal sequences, we chose two genes previously targeted for mutagenesis with ZFNs—ADH1 from Arabidopsis and gridlock from zebrafish (Foley et al. 2009; Zhang et al. 2010). We first searched for 12- to 13-bp sequences in the coding regions, preceded by a 5′ T and with a nucleotide composition similar to that of TALE binding sites identified by Moscou and Bogdanove (2009). In ADH1 and gridlock, such sites occurred on average every 7–9 bp. Four 12-bp sites were selected in ADH1 (at positions 360, 408, 928, and 975 of the chromosomal gene sequence) and one 13-bp site in gridlock (at position 2356 of the chromosomal gene sequence) (Figure 2A). We then constructed TALE repeat domains to recognize these targets, using the most abundant RVDs from native TALEs (NI for A, HD for C, NN for G, and NG for T). Sequences encoding the repeat domain of the native TALE tal1c were replaced with the custom repeat domains, and BamHI fragments from these engineered TALEs were then fused to sequences encoding the catalytic domain of FokI (see supporting information, File S1). The resulting custom TALENs were tested in the yeast assay as homodimeric TALENs; that is, the identical DNA binding site was duplicated in inverse orientation on either side of a spacer. 
Note that heterodimeric TALENs would need to be constructed to direct cleavage at naturally occurring DNA targets. Robust nuclease activity was observed for the ADH1-360-12 and gridlock-2356-13r TALENs (Figure 2B). The ADH1-928-12 TALEN had modest activity that was nonetheless significantly above the negative controls. For each TALEN that gave positive results, nuclease activity was specific to the cognate target. These results indicate that novel, functional TALENs can be created by assembly of customized repeat domains.
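The site selection and repeat design described above can be sketched in Python under simplifying assumptions. The sketch scans a single strand for 12- or 13-bp windows directly preceded by a T and translates each window into RVDs with the four assignments given in the text. It does not apply the nucleotide-composition criterion, which is not specified here in enough detail to encode, and the function names and the input sequence are hypothetical.

```python
# RVD assignments stated in the text for custom repeat domains.
BASE_TO_RVD = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def find_candidate_sites(seq, lengths=(12, 13)):
    """Return (start, site) pairs for windows that directly follow a T.

    start is the 0-based index of the first base of the site; the preceding
    T itself is not part of the site. Only the given strand is scanned.
    """
    seq = seq.upper()
    hits = []
    for i, base in enumerate(seq):
        if base != "T":
            continue
        for n in lengths:
            site = seq[i + 1:i + 1 + n]
            if len(site) == n and set(site) <= set("ACGT"):
                hits.append((i + 1, site))
    return hits


def design_rvds(site):
    """Return the ordered RVDs for a site, one repeat per nucleotide."""
    return [BASE_TO_RVD[base] for base in site.upper()]


example_seq = "GGTACGACCTGAAGCTTACCGGATCA"  # hypothetical sequence, not ADH1 or gridlock
for start, site in find_candidate_sites(example_seq):
    print(start, site, "-".join(design_rvds(site)))
```

A fuller version would also scan the reverse complement and filter candidates by composition before choosing sites.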
The experiments described here suggest that TALENs hold much promise for applications requiring sequence-specific nucleases. In particular, the ability to create TALENs that recognize novel, arbitrarily selected target sequences suggests a potential for engineering TALENs for targeted genome modification. That said, the failure of some custom TALENs suggests that yet unknown rules govern the assembly of functional repeat domains. For example, repeat composition may influence protein stability, or interactions among repeat domains may affect DNA binding activity as has been observed for finger–finger interactions in zinc finger arrays (Elrod-Erickson et al. 1996; Isalan et al. 1997). Alternatively, the spacer lengths we used may have prevented dimerization of FokI, as appeared to be the case for some spacers with AvrBs3. Clearly, it will be important to gain a better understanding of the relationship of spacer length to function for TALENs with different repeat domains. Ascertaining the minimal DNA binding domain might help accomplish this; however, we believe the repeats alone are not sufficient for adequate DNA binding, as TALENs constructed with just the repeat domain did not function in the yeast assay (data not shown). In the short term, we will test whether custom TALENs can be created that recognize and cleave endogenous chromosomal targets, and we will evaluate the efficiency with which custom TALENs create genome modifications by nonhomologous end joining and homologous recombination. Such experiments will be key to assessing the full utility of these reagents for eukaryotic genome engineering.
We thank J. Keith Joung and Joshua Baller for helpful advice and suggestions. This work was supported by the National Science Foundation Program (DBI 0923827 and MCB 0209818 to D.F.V. and DBI 0820831 to A.J.B.) and funds from the University of Minnesota.
Supporting information is available online at http://www.genetics.org/cgi/content/full/genetics.110.120717/DC1.
Communicating editor: J. Boeke
- Received July 8, 2010.
- Accepted July 20, 2010.
- Copyright © 2010 by the Genetics Society of America
Available freely online through the author-supported open access option.