TALENs: a widely applicable technology for targeted genome editing

Author manuscript; available in PMC: 2013 Jul 1.

Published in final edited form as: Nat Rev Mol Cell Biol. 2012 Nov 21;14(1):49–55. doi: 10.1038/nrm3486

Preface

Engineered nucleases enable the targeted alteration of any DNA sequence in a wide range of cell types and organisms. The newly-developed transcription activator-like effector nucleases (TALENs) comprise a non-specific DNA-cleaving nuclease fused to a DNA-binding domain that can be easily engineered so that TALENs can target essentially any sequence. The capability to quickly and efficiently alter any gene sequence using TALENs promises to have profound impacts on biological research and to yield potential therapeutic strategies for genetic diseases.

Introduction

Targeted genome editing is a broadly applicable approach for efficiently modifying essentially any sequence of interest in living cells or organisms. This technology relies on the use of engineered nucleases, that is, artificial proteins composed of a customizable sequence-specific DNA-binding domain fused to a nuclease that cleaves DNA in a non-sequence-specific manner. These targetable nucleases are used to induce double-strand breaks (DSBs) into specific DNA sites, which are then repaired by mechanisms that can be exploited to create sequence alterations at the cleavage site. Nuclease-mediated genome editing enables genetic studies that were previously difficult or impossible to perform. This technology might also provide therapeutic avenues for genetic disorders including monogenic diseases such as sickle cell anemia or cystic fibrosis. Reflecting its broad importance, genome editing with engineered nucleases was named “Method of the Year” for 2011 by Nature Methods.1

The vast majority of targeted genome editing described in the literature (including initial foundational studies) has been performed using zinc finger nucleases (ZFNs) (Box 1). ZFNs have been used to modify endogenous genes in a wide range of organisms and cell types2. Several types of genomic alterations can be introduced with ZFNs including point mutations, deletions, insertions, inversions, duplications, and translocations, thus providing researchers with unprecedented tools to perform genetic manipulations. Furthermore, ZFNs can potentially be used for therapeutic purposes; for example, ZFNs designed to disrupt the expression of the HIV co-receptor CCR5 (chemokine receptor 5) gene3 have entered Phase 2 clinical trials for treatment of HIV/AIDS.

Box 1 | Engineered Zinc Finger Proteins.

Cys2His2 zinc fingers are DNA-binding domains that each recognize approximately three bps of DNA. Alteration of a small number of residues in or near an alpha-helix within this domain can lead to changes in its DNA-binding specificity.83, 84 Engineered zinc fingers can be joined together into more extended arrays capable of recognizing longer DNA sequences. However, to obtain proteins with high affinities and specificities, context-dependent effects between and among fingers in an array should be considered.84–87 Sangamo BioSciences, Inc. has developed a proprietary and highly effective platform for engineering zinc finger arrays36 and customized proteins made by this method can be purchased from Sigma-Aldrich. In addition, various publicly available methods for engineering zinc finger proteins have been described. Modular assembly of individual pre-selected zinc finger domains provides a simple and rapid method for creating customized arrays88 but the success rate of this strategy has been reported to be low89. Additional methods that explicitly account for context-dependent effects among zinc fingers in an array include: Oligomerized Pool Engineering (OPEN)90, Context-Dependent Assembly (CoDA)52, and a bacterial one-hybrid (B1H) selection-based system37. However, OPEN and B1H selections can be challenging to practice for some non-specialist laboratories and OPEN, B1H, and CoDA have limitations in the range of sequences they can be used to target. A large number of zinc finger arrays engineered using these various methods have been fused to a non-specific nuclease domain from the Type IIS FokI restriction enzyme to create zinc finger nucleases (ZFNs)91. The FokI nuclease functions as a dimer and therefore two zinc finger arrays must be designed for each target site. Early ZFNs used wild-type homodimeric FokI domains, which can form unwanted dimers of the same monomeric ZFN.
More recent studies have described obligate heterodimeric FokI domains that reduce the formation of unwanted homodimeric species and therefore have improved specificities.76 Engineered zinc finger arrays have also been fused to transcriptional regulatory domains to create artificial transcription factors capable of altering the expression of endogenous genes.92

Recently, transcription activator-like effector nucleases (TALENs) have rapidly emerged as an alternative to ZFNs for genome editing and introducing targeted DSBs. TALENs are similar to ZFNs and comprise a non-specific FokI nuclease domain fused to a customizable DNA-binding domain. This DNA-binding domain is composed of highly conserved repeats derived from transcription activator-like effectors (TALEs), which are proteins secreted by Xanthomonas bacteria to alter transcription of genes in host plant cells4 (Figure 1a and 1b).

Figure 1. Overview of TALENs and TALE repeat arrays.

a | Schematic diagram of a TALEN. TALE repeats are shown as colored cylinders with a final carboxy-terminal truncated “half” repeat. Letters inside each repeat represent the two hypervariable residues. TALE-derived amino- and carboxy-terminal domains required for DNA-binding activity are shown as longer blue and grey cylinders, respectively. The non-specific nuclease domain from the FokI endonuclease is shown as a larger orange cylinder. b | TALENs bind and cleave as dimers on a target DNA site. Note that the TALE-derived amino- and carboxy-terminal domains flanking the repeats may make some contacts to the DNA. Cleavage by the FokI domains occurs in the “spacer” sequence that lies between the two regions of the DNA bound by the two TALEN monomers. c | Schematic diagram of a TALE-derived DNA-binding domain. The amino acid sequence of a single TALE repeat is expanded below with the two hypervariable residues highlighted in orange and bold text. d | TALE-derived DNA-binding domain aligned with its target DNA sequence. Note the matching of repeat domains to single bases in the target site according to the TALE code. Also note the presence of a 5’ thymine preceding the first base bound by a TALE repeat.

TALENs have generated much interest and excitement because they can be very easily and rapidly designed by researchers using a simple ‘protein-DNA code’ that relates modular DNA-binding TALE repeat domains to individual bases in a target-binding site. Over the last two years, leveraging technologies and methodologies previously developed for the use of ZFNs, several groups have used TALENs to modify endogenous genes in yeast5, fruit fly6, roundworm7, cricket8, zebrafish9–11, frog12, rat13, pig14, cow14, thale cress15, rice16, silkworm17, and human somatic15, 18, 19 and pluripotent stem cells20 (Supplementary information S1) and presumably the technique will continue to extend to additional organisms. Furthermore, a recent large-scale test demonstrated that TALENs have a very high success rate and can be used to target essentially any DNA sequence of interest in human cells18 (Supplementary information S2). Although ZFNs and TALENs have not been directly compared, many studies have shown that TALENs and ZFNs have comparable efficiencies when targeted to the same gene9, 13, 18, 20. Thus, the ease of design, high rates of cleavage activity, and the essentially limitless targeting range of TALENs make them suitable for use by non-specialist researchers.

In this Innovation article, we will first briefly describe the simple modular strategy used to design customized TALE repeat DNA-binding domains and then review recent progress on the use of TALENs to introduce different types of genome alterations in a range of organisms and cell types. In addition, we provide a comparison of the various publicly available methods for constructing TALENs. Finally, we outline important goals for future research that will further enhance the utility of these tools for research and therapeutic applications.

Custom TALE DNA-binding domains

The fundamental building block used to engineer the DNA-binding domain of TALENs is a highly conserved repeat derived from naturally occurring TALEs encoded by Xanthomonas proteobacteria. These TALEs are injected into host plant cells via a Type III secretion system and bind to genomic DNA to alter transcription in these cells, thereby facilitating pathogenic bacterial colonization.4 DNA binding by these TALEs is mediated by arrays of highly conserved 33–35 amino acid repeats flanked by additional TALE-derived domains at the amino- and carboxy-terminal ends of the array (Figure 1c).

Individual TALE repeats in an array each specify a single base of DNA determined by the identities of two hypervariable residues typically found at positions 12 and 13 of the domain (Figure 1c and d). Experimental evidence for this simple recognition code was first provided by Bonas and colleagues in 2009.21 The researchers observed that the number of repeats in an array corresponded to the length of its target site and this insight enabled them to deduce a simple correlation between the hypervariable residues and the base bound by each repeat. Moreover, they found that a thymine is conserved at the position just 5’ to the base bound by the first repeat in the array (Figure 1d). This group provided experimental evidence for the TALE repeat code by constructing the first examples of engineered TALE repeat arrays with novel specificities.21 The TALE repeat code was also confirmed by another group by performing a computational analysis of the binding specificities of naturally occurring TALEs22. Subsequent reports provided additional evidence that engineered TALE repeats with desired specificities can be created using the code.19, 23–25

More recently, co-crystal structures of TALE DNA-binding domains bound to their cognate sites have shown that individual repeats comprise two-helix v-shaped bundles that stack to form a superhelix around the DNA and the hypervariable residues at positions 12 and 13 are positioned in the DNA major-groove. The residues at position 8 and position 12 within the same repeat make a contact with each other that may stabilize the structure of the domain while the residues at position 13 can make base-specific contacts with the DNA.26, 27

Nearly all engineered TALE repeat arrays published to date use four domains with the hypervariable residues NN, NI, HD and NG, for the recognition of G, A, C, and T, respectively. Another repeat with hypervariable residues NK has been reported to be more specific for G than the NN repeat (which can also recognize A)19, 22 but arrays using the NK repeats show less activity than those using NN repeats.10, 28 More recently, a repeat with hypervariable residues NH has been reported to be more specific than the NN repeat, but with only slightly lower activity.28, 29 Additional studies with a greater number of repeat arrays are needed to determine the optimal repeat domain for G recognition. It will also be interesting to explore whether repeats bearing other hypervariable residue combinations will have higher or different specificities for one or more DNA nucleotides.
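The one-repeat-per-base code described above lends itself to a short illustration. The following Python sketch is not taken from any published TALEN design tool; the function and variable names are hypothetical, and it assumes only the four hypervariable residue pairs named in the text (NN, NI, HD, NG) and the conserved 5’ thymine that precedes the first bound base.

```python
# Hypervariable residue pairs for each base, as stated in the text:
# NN -> G, NI -> A, HD -> C, NG -> T.
HV_FOR_BASE = {"G": "NN", "A": "NI", "C": "HD", "T": "NG"}

# Reverse lookup. NN is mapped to G only here, although the text notes
# that NN repeats can also recognize A.
BASE_FOR_HV = {hv: base for base, hv in HV_FOR_BASE.items()}


def design_repeats(site: str) -> list[str]:
    """Return one hypervariable residue pair per bound base, 5' to 3'.

    `site` must start with the conserved 5' thymine. That thymine is not
    bound by a repeat, so it is checked but not encoded.
    """
    site = site.upper()
    if len(site) < 2 or site[0] != "T":
        raise ValueError("site must begin with a 5' T followed by at least one base")
    bound = site[1:]
    unknown = set(bound) - set(HV_FOR_BASE)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [HV_FOR_BASE[base] for base in bound]


def decode_repeats(repeats: list[str]) -> str:
    """Translate a list of hypervariable residue pairs back into DNA bases."""
    return "".join(BASE_FOR_HV[hv] for hv in repeats)


if __name__ == "__main__":
    repeats = design_repeats("TGACCT")
    print(" ".join(repeats))        # NN NI HD HD NG
    print(decode_repeats(repeats))  # GACCT
```

Substituting NK or NH for NN at G positions, as discussed above, would only require changing the entry for G in the lookup table; the sketch makes no claim about which choice gives better activity or specificity.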

Nuclease-mediated alterations

A substantial body of literature demonstrates that normal cellular repair of ZFN-induced DSBs by non-homologous end-joining (NHEJ) or homology-directed repair (HDR) can be exploited to introduce targeted genome alterations in a wide range of organisms and cell types2, 30. NHEJ-mediated repair of a nuclease-induced DSB leads to the efficient introduction of variable length insertion/deletion (indel) mutations that originate at the site of the break (Figure 2a). Thus, NHEJ-mediated repair of DSBs introduced into gene coding sequences will often yield frameshift mutations that can lead to knockout of gene function.
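Why indels in coding sequence often cause frameshifts follows from simple arithmetic: codons are three bases long, so any net length change that is not a multiple of three shifts the downstream reading frame. The Python sketch below illustrates that reasoning with hypothetical function names. It assumes the indel lies entirely within coding sequence and ignores other consequences such as disrupted splice sites or in-frame stop codons.

```python
def classify_indel(reference: str, edited: str) -> str:
    """Classify an edited allele by its net length change relative to the reference.

    Returns "frameshift" when the net change is not a multiple of three,
    "in-frame" when it is a non-zero multiple of three, and
    "no length change" when both alleles have the same length.
    """
    net_change = len(edited) - len(reference)
    if net_change == 0:
        return "no length change"
    # Python's % always returns a non-negative result for a positive
    # modulus, so deletions (negative net change) are handled correctly.
    return "frameshift" if net_change % 3 != 0 else "in-frame"


if __name__ == "__main__":
    reference = "ATGGCCAAGTTT"
    alleles = {
        "1-bp deletion": "ATGGCAAGTTT",
        "3-bp deletion": "ATGAAGTTT",
        "2-bp insertion": "ATGGCCGGAAGTTT",
    }
    for name, edited in alleles.items():
        print(name, "->", classify_indel(reference, edited))
```

Because two of every three possible indel lengths are not multiples of three, a population of variable-length indels would be expected to contain a majority of frameshift alleles, provided lengths are spread roughly evenly (an assumption made here, not a result reported in the text).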

Figure 2. Nuclease-induced genome editing.

a | Single nuclease-induced double-strand breaks (DSBs) in a gene locus can be repaired by either NHEJ (thin black arrow) or HDR (thick black arrows). NHEJ-mediated repair leads to introduction of variable length indel mutations. HDR with double-stranded DNA “donor templates” can lead to the introduction of precise nucleotide substitutions or insertions. b | Introduction of two nuclease-induced DSBs in cis on the same chromosome can lead to deletion of the intervening sequence or inversion (left panel) and introduction of two nuclease-induced DSBs on two different chromosomes can lead to the formation of translocations (right panel).

Alternatively, if a double-stranded DNA “donor template” is supplied, HDR of a nuclease-induced DSB can be used to introduce precise nucleotide substitutions or insertions of up to 7.6 kb at or near the site of the break31 (Figure 2a). Recent work has also shown that oligonucleotides can be used with ZFNs to introduce precise alterations, small insertions, and large deletions32. ZFNs have been used to introduce NHEJ- or HDR-mediated gene alterations in fruit fly33, 34, roundworm7, 35, zebrafish36, 37, rainbow trout38, catfish39, sea urchin40, frog41, pig42, cattle43, cricket8, rabbit44, silkworm45, butterfly46, mouse47, 48, rat49, 50, soybean51, 52, thale cress53, 54, corn55, tobacco56, petunia57, hamster cells58, and human somatic59, 60 and pluripotent stem cells61–63. In most of these organisms and cell types, the high absolute rates of mutagenesis that can be achieved with ZFNs have enabled researchers to screen for mutations without the need for selection.

As summarized in Figure 2b, ZFNs and the I-SceI homing endonuclease have also been used to induce other more complex types of genome alterations in mammalian cells. These include large deletions induced by introduction of two DSBs with subsequent deletion of intervening sequence of up to 15 Mbps in length64, translocations induced by two DSBs introduced on different chromosomes65, 66 and inversions of chromosomal sequence between two DSBs induced on the same chromosome67. Given the requirement to introduce two DSBs, it is not surprising that these more complex alterations are obtained with lower efficiencies compared to alterations that are dependent on a single DSB.
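The outcomes of two DSBs can be pictured as simple string operations on the affected sequences. The Python sketch below uses hypothetical function names and a deliberately idealized model: each break is a blunt cut at an index position, and the ends are rejoined cleanly with no indels at the junctions, which NHEJ-mediated repair does not guarantee in practice.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def delete_between(seq: str, cut1: int, cut2: int) -> str:
    """Join the outer ends of two cuts on one chromosome, dropping the segment between."""
    left, right = sorted((cut1, cut2))
    return seq[:left] + seq[right:]


def invert_between(seq: str, cut1: int, cut2: int) -> str:
    """Reinsert the segment between two cuts in the opposite orientation."""
    left, right = sorted((cut1, cut2))
    return seq[:left] + reverse_complement(seq[left:right]) + seq[right:]


def translocate(chr_a: str, cut_a: int, chr_b: str, cut_b: int) -> tuple[str, str]:
    """Exchange the segments downstream of one cut on each of two chromosomes."""
    return chr_a[:cut_a] + chr_b[cut_b:], chr_b[:cut_b] + chr_a[cut_a:]


if __name__ == "__main__":
    chromosome = "AAAAGGCATTTT"
    print(delete_between(chromosome, 4, 8))  # AAAATTTT
    print(invert_between(chromosome, 4, 8))  # AAAATGCCTTTT
    print(translocate("AAAACCCC", 4, "GGGGTTTT", 4))  # ('AAAATTTT', 'GGGGCCCC')
```

All three outcomes require both cuts to occur and the resulting ends to be joined in a particular arrangement, which is consistent with the lower efficiencies noted above for alterations that depend on two DSBs.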

Although TALENs were first described only two years ago, these nucleases have already been utilized in a large number of applications. TALENs have been used to generate NHEJ-mediated mutations in a wide variety of organisms with generally high efficiencies (summarized in Supplementary information S1). TALENs have also been used to introduce specific insertions in human somatic and pluripotent stem cells using double-stranded donor templates.19, 20

Applications of Genome Editing

As noted above, the rapid development of customized ZFNs has already substantially expanded the scope of genetic research that can be performed in a broad range of different organisms and cell types. The high efficiencies of alterations observed have already inspired efforts to use ZFNs as a potential therapeutic approach for genetic-based diseases. The relative simplicity with which TALENs can be engineered will further spur efforts to explore the research and therapeutic applications of customized nuclease technology. In this section, we review recent progress on the use of TALENs in various organismic and cellular contexts and briefly discuss prospects for their future applications.

Model organisms

As with ZFNs, TALENs have enabled efficient introduction of targeted alterations in a number of model organisms that were previously difficult or impossible to manipulate genetically, such as fruit fly6, roundworm7, zebrafish9–11, frog12, rat13, and pig14. TALENs have also been used to modify endogenous genes in cow14, cricket8, and silkworm17. Most of these studies used a single TALEN pair to generate NHEJ-induced knockout mutations but two of these reports also described the use of two TALEN pairs targeted to the same chromosome to generate deletions and/or inversions of large chromosomal segments14, 17. Another recent study used TALENs together with short single-stranded DNA oligodeoxynucleotide donors to make precise insertions into the zebrafish genome11. The ability to efficiently induce mutations in various organisms should lead to the development of new animal models of human diseases. For example, TALENs have been used to inactivate the LDL receptor gene in pigs, thereby generating a model for familial hypercholesterolemia14.

Plants and Livestock

Unlike model organisms, which are typically selected for their rapid life cycles, many agricultural plant and animal species have long reproductive cycles and therefore can take many years to breed successfully. Nuclease-mediated editing of plants and agricultural animals may greatly decrease the time required to generate new varieties of both relative to traditional breeding strategies. For example, using micropropagation strategies, some plant species can be modified on the single cell level and then grown into mature plants. ZFNs have been used to make targeted indels in soybean51, 52 and to introduce specific mutations and transgene insertions that confer herbicide resistance in tobacco56 and corn55, respectively. To date, TALENs have been used to make knockout mutations in Arabidopsis15 and to introduce resistance to infection by Xanthomonas bacteria in rice by disrupting the target sites of naturally occurring TALEs that contribute to pathogenicity16. Gene-editing nucleases have also been used to make targeted NHEJ-induced indel mutations in pigs and cows14. In addition, large deletions and inversions of sequence greater than 6 kb in length have been induced in pigs by targeting two TALEN pairs to the same chromosome14.

Cell-based disease modeling

Gene-editing nucleases offer the potential to directly assess the impacts of gene disruption and of specific sequence variants on gene function in somatic cell-based models of disease. To date, TALENs have primarily been used to disrupt human genes via introduction of NHEJ-induced indels into coding sequence15, 18, 19, 68–71. In principle, such loss-of-function mutations could be used to create somatic cell-based models of disease. Alternatively, precise insertions have also been introduced into endogenous human genes using HDR with TALENs and a double-stranded homologous donor template plasmid19, 20. Targeted insertions could be used to create endogenous gene fusions to fluorescent proteins or epitope tags to visualize protein expression, distribution, or interactions. Beyond generation of such fusions, HDR-based approaches might also be used to create isogenic human or other mammalian cell lines bearing specific single nucleotide polymorphisms (SNPs) identified by large-scale GWAS, ENCODE, or other sequencing projects, potentially enabling studies to determine the functional significance of these sequence variants.

Therapeutics

In contrast to therapies that treat symptoms of genetic diseases, targeted nucleases offer the potential to correct or disrupt gene products or sequences responsible for causing the disorder. For example, recent studies have shown that ZFN-induced HDR can be used to correct genetic mutations responsible for sickle cell anemia72 or alpha1-antitrypsin disease73 or in the Parkinson’s disease-associated alpha-synuclein gene74 in patient-specific induced pluripotent stem cells that have been reprogrammed from fibroblasts. These reports provide important proofs-of-principle for autologous transplant strategies in which patient-derived cells might be corrected ex vivo and then re-introduced into patients with presumably reduced probability of immune complications. Although these studies were performed with ZFNs, TALEN-induced HDR in human pluripotent stem and somatic cells has also been achieved20. Another potential strategy for therapy is to use nuclease-induced disruption via NHEJ-mediated repair to abolish the activity of a gene. This type of approach is currently being tested as a therapy for AIDS in which ZFNs are used to disrupt expression of the CCR5 gene product required by certain strains of HIV as a co-receptor to infect cells3. The ability to target essentially any DNA sequence with TALENs will undoubtedly motivate exploration of both gene-correction and gene-disruption strategies for the treatment of a wide range of genetic and other diseases.

Platforms for Engineering TALENs

The construction of DNA encoding engineered TALE repeat arrays can be challenging due to the requirement to assemble multiple, nearly identical repeat sequences. Different platforms have been designed to facilitate the assembly of plasmids encoding TALE repeat arrays. These methods are depicted in Supplementary information S3 and can be grouped into three broad categories: standard restriction enzyme and ligation-based cloning, “Golden Gate” cloning, and solid-phase assembly. Detailed descriptions of these methods are provided in Supplementary Information S2.

As summarized in Supplementary information S4, these platforms vary in their throughput, molecular cloning techniques used, numbers of plasmids required (and time required to prepare these DNAs), use of potentially mutagenic PCR, flexibility in the length of arrays that can be constructed, ease with which the required reagents and detailed protocols can be acquired, and availability of author-supported web-based software for practicing the method. Reagent kits for three of these methods are available to academics through the non-profit plasmid distribution service Addgene (http://www.addgene.org/TALEN/). We have established and maintain an active and open newsgroup (currently with over 660 members) for discussion of TALE-related projects (http://groups.google.com/group/talengineering) as well as a “one-stop” comprehensive website with links to protocols, reagents, software, and other information about engineered TALE technology (http://www.TALengineering.org).

The specific architecture of a TALEN is an important factor for users to consider as they choose an assembly method. Various TALEN architectures have been utilized to date and one difference among these is the length and sequence composition of the amino- and carboxy-terminal TALE-derived sequences that flank the TALE repeat array. In the earliest TALENs described in the literature, large segments of naturally occurring TALE sequence were used to join the FokI domain to the carboxy-terminal end of engineered TALE repeat arrays.24 The TALEN framework was then refined by showing that nuclease activities could be greatly enhanced by truncating the length of this carboxy-terminal TALE-derived sequence19, 68. In addition, although early studies used wild-type homodimeric FokI nuclease domains, more recent reports10, 20, 75 have employed various obligate heterodimeric domains originally developed and used with ZFNs76 (Box 1). Therefore, because not all architectures are the same, we suggest that users should carefully consider the reported activity levels and potential specificities of TALENs made on these different frameworks when choosing a method of assembly. We note that as of the writing of this review, the most extensively tested and validated TALEN framework remains that described by Rebar and colleagues7, 9, 13, 18–20, 75, 77 (Supplementary information S1).

Future Directions

Although the development of engineered TALE technology has proceeded at an extremely rapid pace over the past three years, many important questions remain to be addressed if these proteins are to be used routinely for research and therapeutic applications. First, although TALENs and ZFNs can induce specific HDR events, competing mutagenesis by NHEJ can still lead to unwanted mutation of the original and, in some cases, the HDR-altered allele. It will therefore be important to develop generalizable methods that tip the balance away from NHEJ- and toward HDR-mediated repair. For example, recent work78–80 has demonstrated that ZFN-derived nickases that cleave only one strand, instead of both strands, of DNA can shift this balance, although the absolute rates of HDR-mediated repair can be lower than those induced by the ZFNs from which they are derived. Second, developing methods that enable definition of the genome-wide specificities of TALENs will be critical to minimizing off-target NHEJ-mediated mutagenesis. Third, optimization of methods for efficiently delivering TALENs or nucleic acids encoding them into cells will also be an important area for future research. It will be interesting to investigate whether purified native or modified TALEN proteins, like ZFNs81, might be efficiently taken up directly by cells.

Another potential area for future exploration will be creation of fusion proteins harbouring domains other than nucleases. TALE-based activators and repressors have already been described (Box 2). However, one can also envision that engineered TALE repeat arrays might be used to direct functional domains that induce epigenetic alterations (such as covalent histone or DNA modifications) to specific genomic loci to induce stable, heritable alterations in gene expression. TALE repeat arrays fused to a recombinase domain have recently been described82, raising the exciting possibility of enabling targetable site-specific recombination events.

Box 2 | TALE-based Transcription Factors.

As previously done with zinc fingers (Box 1), TALE repeat arrays have been fused to transcriptional regulatory domains to create artificial transcription factors that can activate or repress gene expression. To date, various studies have shown that TALE-based activators and repressors can be used to modulate expression of endogenous genes in plants and human cells, with changes in gene or protein expression in the range of two- to 30-fold,15, 19, 29, 93–98 but the vast majority of these proteins show relatively modest levels of activity. Although the high success rate and robust targeting range of dimeric TALENs has been recently established,18 relatively less is known about these parameters for monomeric engineered TALE-based transcriptional activators or repressors. The epigenetic status (e.g., chromatin or DNA methylation) of the target sites may influence the activities of TALE-based transcription factors, as has been noted previously for zinc finger-based activators99. In addition, it has been suggested that users should apply computationally derived design guidelines (originally proposed for dimeric TALENs15) to monomeric TALE-based transcription factors100 but no published experimental support yet exists for this recommendation. Guidelines both for the selection of potential target sites and for the use of particular TALE repeat domains (harboring NH residues at the hypervariable positions) for recognition of G bases have been proposed.28 Large-scale, systematic studies should prospectively determine whether adherence to these recommendations (or the effects of varying other parameters such as the number of repeats, the length and composition of TALE-derived protein sequences that flank the TALE repeat array, or the nature of the transcriptional regulatory domain used) might influence the activities and specificities of engineered TALE-based transcription factors.

Engineered TALE technology promises to facilitate and enhance genetic manipulations in different organisms and cell types. The simplicity with which TALENs can be designed together with their robust success rates has already spurred much broader adoption of genome editing technology. Although many interesting and challenging questions remain, the accessibility and power of TALENs make this technology an exciting and important subject for future research and development.

Supplementary Material

S1

S2

S3

S4

S4 figure on last page

S5

Acknowledgments

J.K.J. acknowledges support from the National Institutes of Health (NIH) (grants DP1 GM105378, R01 GM088040, and P50 HG005550) and The Jim and Ann Orr MGH Research Scholar Award. J.D.S. was supported by NIH T32CA009216. The authors apologize to colleagues whose studies were not cited due to length constraints.

Biographies

J. Keith Joung is Associate Chief of Pathology, the Jim and Ann Orr MGH Research Scholar, and Associate Pathologist at Massachusetts General Hospital, Boston, USA and Associate Professor of Pathology at Harvard Medical School (HMS), Boston. His laboratory develops genome engineering technologies based on engineered zinc finger and TALE proteins.

Jeffry D. Sander is an Assistant in Pathology at the MGH and an Instructor in Pathology at HMS. His research interests are in developing computational and experimental technologies for genome engineering.

Footnotes

Supplementary information

Supplementary information S1 (table) | Published examples of endogenous genes modified by TALENs

Supplementary information S2 | Supplementary Discussion

Supplementary information S3 (figure) | Platforms for assembling DNA encoding TALE repeat arrays.

Supplementary information S4 (table) | Characteristics and features of various platforms for assembling DNA encoding TALE repeat arrays

Conflict of Interest Statement

J.K.J. has a financial interest in Transposagen Biopharmaceuticals. J.K.J.’s interests were reviewed and are managed by Massachusetts General Hospital and Partners HealthCare in accordance with their conflict of interest policies.

References
