Advances in Targeted Genome Editing

Author manuscript; available in PMC: 2013 Aug 1.

Published in final edited form as: Curr Opin Chem Biol. 2012 Jul 20;16(3-4):268–277. doi: 10.1016/j.cbpa.2012.06.007

Abstract

New technologies have recently emerged that enable targeted editing of genomes in diverse systems. This includes precise manipulation of gene sequences in their natural chromosomal context and addition of transgenes to specific genomic loci. This progress has been facilitated by advances in engineering targeted nucleases with programmable, site-specific DNA-binding domains, including zinc finger proteins and transcription activator-like effectors (TALEs). Recent improvements have enhanced nuclease performance, accelerated nuclease assembly, and lowered the cost of genome editing. These advances are driving new approaches to many areas of biotechnology, including biopharmaceutical production, agriculture, creation of transgenic organisms and cell lines, and studies of genome structure, regulation, and function. Genome editing is also being investigated in preclinical and clinical gene therapies for many diseases.

1. Introduction

Genome editing is the introduction of a predetermined sequence change to the chromosomal DNA of a cellular genome. The instructions for almost all functions of living systems are encoded in the genome. Consequently, the ability to easily and precisely add, remove, or exchange DNA sequences within a cellular genome would theoretically enable routine reprogramming of biological systems for numerous applications relevant to all areas of biotechnology, including medicine, energy, and the environment. The editing of genome sequences in diverse cell types and species has recently become possible through the advent of synthetic nucleases that can be engineered to target almost any site in a complex genome. The enhancement of gene targeting through nuclease-mediated DNA cleavage has been known for over fifteen years, but genome editing has not been widely applied to diverse areas of biotechnology until recently (Figure 1). This rapid growth is the result of the increased availability of public and commercial sources for engineering targeted nucleases (Table 1), as well as significant progress in enhancing and monitoring genomic modifications. Despite the exponential growth in the use of this technology, current methods still do not fulfill the criteria of an ideal gene editing tool: 1) high frequency of desired sequence changes in the target cell population, 2) no off-target mutations, and 3) rapid and efficient assembly of nucleases that target any site in the genome at low cost. Progress in genome editing has been the subject of several comprehensive review articles [1–3]. Therefore, this review emphasizes the most significant advances in genome editing in the last few years and the corresponding adoption of this technology for new applications. We also discuss the current challenges and future directions necessary to establish a genome editing technology that is sufficiently robust, efficient, specific, economical, and readily available for routine use in research and biotechnology.

Figure 1. Milestones in genome editing and accelerated progress in nuclease engineering.

An estimate of the number of articles referring to genome editing is shown for each year with specific reference to major advances. Data for this graph were obtained from the Web of Science by searching for articles referencing “zinc finger nuclease OR tale nuclease”.

Table 1.

Publicly Available Methods for Custom Nuclease Design and Assembly

DNA-Binding Domain | Assembly Method | Description
Zinc Finger Proteins | Sigma-Aldrich CompoZr | Ready-made, pre-validated commercial ZFN constructs for targeted gene knockout or integration into a safe harbor locus [54,64]. Pre-validated ZFN constructs against custom targets available within 10 weeks of purchase.
Zinc Finger Proteins | Oligomerized Pool Engineering (OPEN) [10] | Utilizes a publicly available molecular biology toolkit for end-user selection and validation of custom zinc finger arrays against a desired DNA sequence. Requires molecular biology expertise.
Zinc Finger Proteins | Context-Dependent Assembly (CoDA) [12] | A database of custom zinc finger arrays derived from previously identified OPEN zinc finger arrays that share a common zinc finger motif. End-user can assemble the desired ZFN constructs using standard molecular biology techniques or commercial gene synthesis based on output from a webserver [7].
Zinc Finger Proteins | Modular Assembly (Barbas Kit) [8] | Utilizes a toolkit of off-the-shelf synthetic zinc finger modules for most targetable DNA triplet sequences. End-user assembles modules into custom arrays using standard molecular biology techniques or commercial gene synthesis based on output from a webserver [6].
Zinc Finger Proteins | Modular Assembly (Joung Kit) | Utilizes an extensive toolkit of off-the-shelf synthetic zinc finger modules from the Barbas [8], Sangamo, and ToolGen [5] sets of zinc finger modules. End-user assembles modules into custom arrays using standard molecular biology techniques or commercial gene synthesis based on output from a webserver [7].
Zinc Finger Proteins | Modular Assembly (Wolfe System) [13] | Utilizes a toolkit of off-the-shelf synthetic zinc finger modules based on an engineered library of two-finger modules supplemented with additional one-finger modules. End-user assembles modules into custom arrays using standard molecular biology techniques or commercial gene synthesis based on output from a webserver [13].
TAL Effectors | Cellectis Bioresearch | Pre-validated commercial TALEN constructs for targeted gene knockout, as well as for custom targets, available several weeks following order confirmation.
TAL Effectors | Life Technologies | GeneArt® Precision TALs fused to various effector domains are supplied as Gateway® compatible entry clones for custom targets and available several weeks following order confirmation.
TAL Effectors | FLASH Assembly [20] | High-throughput, automatable assembly of custom TALEs using a publicly available library of preassembled TAL arrays.
TAL Effectors | Modular Assembly (Voytas Kit) [21] | Publicly available plasmid-based TALE monomer library that can generate arrays of 12–31 TALE repeats. Custom arrays are assembled using rapid two-step Golden Gate molecular cloning; webserver is available to aid in target site selection [21].
TAL Effectors | Modular Assembly (Joung Kit) | Publicly available plasmid-based TALE monomer library that can generate arrays of any desired length. Custom arrays are assembled in iterative ligation steps using Golden Gate molecular cloning; webserver is available to aid in target site selection [7].
TAL Effectors | Modular Assembly (Zhang Kit) [18] | PCR-based TALE monomer library that can rapidly generate arrays of 18 TALE repeats using standard molecular biology techniques.

2. Targeted DNA-Binding Proteins

The engineering of enzymes that target specific sequences within complex genomes is a formidable challenge. The most successful approaches to date have been based on modular proteins in which a DNA-binding domain that recognizes the target DNA sequence is fused to an effector domain that catalyzes changes to the structure or function of the target gene. The DNA recognition domain is typically based on the structure of natural DNA-binding proteins, including zinc finger proteins and transcription activator-like effectors. These targeted DNA-binding proteins can be combined with effector domains to create functional enzymes, including synthetic transcription factors, methyltransferases, integrases, nucleases, and recombinases, that modify genes in many cell types and species.

2.1 Zinc Finger Proteins

The Cys2-His2 zinc finger domain is the most common DNA-binding motif in the human proteome and consists of approximately 30 amino acids in a ββα configuration, where the α-helix projects into the major groove of DNA and recognizes 3–4 contiguous nucleotide bases [4] (Figure 2A). The DNA-binding specificity of synthetic zinc finger domains has been extensively engineered through site-directed mutagenesis and rational design or through selection from large combinatorial libraries. Collectively, this work has yielded unique zinc finger domains with specificity for almost all of the 64 possible nucleotide triplets [4]. Significantly, the modular structure of the zinc finger motif permits several domains to be linked in series, allowing for the recognition and targeting of extended sequences in multiples of three nucleotides. As a result of this work, it is now theoretically possible to design synthetic zinc finger proteins to bind practically any target in the genome of any species.

Figure 2. Three-dimensional structure of a zinc finger protein and a TAL effector.

(A) Front and lateral views of a six-finger zinc finger protein that consists of six tandem repeats of C2H2 zinc finger motifs, each consisting of approximately 30 amino acids. A single zinc finger, which recognizes 3 bp of DNA, consists of an α helix and two antiparallel β strands that coordinate a zinc ion through two histidine residues and two cysteine residues. Contacts with DNA are made through interactions with side chains on the α helix. (B) TAL effectors consist of repeats of 34 amino acids that each recognize a single bp of DNA. Each of these units is formed by two nearly identical α helices flanking two variant amino acids, known as the repeat variable di-residue, that determine the binding specificity. The structures have been modeled using MacPyMOL with Protein Data Bank files 1P47 (crystal structure of tandem Zif268 molecules complexed to DNA) and 3UGM (structure of TAL effector PthXo1 bound to its DNA target).

Despite the numerous successful uses of engineered zinc finger proteins for regulating and editing genes in many species and cell types, the full potential of this technology has not yet been fulfilled. This has largely been attributed to the nuances of zinc finger protein engineering and failed attempts to adopt the methods in new laboratories. Although the source of the challenges in zinc finger engineering is still unclear, several new methods have become available in recent years that facilitate the rapid assembly and screening of numerous novel zinc finger proteins in parallel (Table 1). The modular assembly approach uses a single engineered zinc finger domain for each possible three base pair sequence [5,6]. A zinc finger array for any particular target sequence is designed from this library with the assistance of an online web server [6,7] and is then created with standard recombinant DNA methods or by commercial gene synthesis [8,9]. An alternative approach, known as “OPEN”, selects new zinc finger proteins from randomized libraries for each new target site [10]. Although this strategy entails considerably more effort and resources than modular assembly, it has been reported to generate functional zinc finger proteins with a higher frequency of success [10,11]. In 2010, a method labeled “context-dependent assembly” (CoDA), which recombines zinc finger domains that have previously been validated to work together, was suggested to be highly effective by accounting for interactions between zinc fingers while maintaining the simplicity of modular assembly [12]. An archive of optimized two-finger modules has also been described that generates highly successful proteins with a greater targeting range than CoDA [13]. Finally, engineered zinc finger proteins are available commercially for custom targets from Sigma-Aldrich Corporation’s CompoZr Zinc Finger Nuclease platform. This commercial service was created through partnership with Sangamo Biosciences, Inc. (Richmond, CA) and licensing of Sangamo’s proprietary methods for assembling zinc finger proteins.
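The lookup step behind modular assembly can be illustrated with a minimal Python sketch: the target site is split into 3 bp subsites and each subsite is paired with a module from a library. The function names and the library contents are hypothetical placeholders introduced here for illustration; they are not the reagents or the web server logic of any kit in Table 1, and the sketch ignores the context effects between neighboring fingers that CoDA is designed to address.

```python
from typing import Dict, List, Optional, Tuple


def split_into_triplets(target: str) -> List[str]:
    """Split a target site into the 3-bp subsites recognized by single fingers."""
    target = target.upper()
    if set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, T")
    if len(target) % 3 != 0:
        raise ValueError("target length must be a multiple of three")
    return [target[i:i + 3] for i in range(0, len(target), 3)]


def plan_array(target: str, library: Dict[str, str]) -> List[Tuple[str, Optional[str]]]:
    """Pair each triplet with a module name, or None when the library has no module."""
    return [(triplet, library.get(triplet)) for triplet in split_into_triplets(target)]


# Hypothetical module library: the names are placeholders, not real reagents.
EXAMPLE_LIBRARY = {"GAG": "module_GAG", "GCG": "module_GCG", "TGG": "module_TGG"}

if __name__ == "__main__":
    for triplet, module in plan_array("GCGTGGGCG", EXAMPLE_LIBRARY):
        print(triplet, module)
```

A triplet that maps to None marks a subsite for which the chosen library offers no module, which is one reason a candidate target site may have to be shifted or abandoned.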

2.2 TAL Effectors

The discovery of a simple modular DNA recognition code by transcription activator-like effectors (TALEs), reported in 2009, created another option for engineering programmable DNA-binding proteins [14,15]. TALEs are naturally occurring DNA-binding proteins produced by plant pathogenic bacteria to regulate host gene expression. In contrast to zinc finger domains of 30 amino acids that each recognize three base pairs, each TALE repeat consists of 34 amino acids and recognizes only a single base pair [14–16] (Figure 2B). The DNA-binding preference of each repeat is determined by only two hypervariable amino acids at positions 12 and 13, called repeat-variable diresidues [14,15]. Like zinc finger domains, these modular TALE repeats can be linked together to recognize a specific DNA sequence and then fused with transcriptional activation domains or nuclease domains to direct enzyme activity to targeted chromosomal loci [17–20]. Several protocols have recently been described that enable rapid assembly of custom TALE repeat arrays in only a few days using publicly available reagents [18,20,21] (Table 1). Custom engineered TALEs have also become available commercially through Cellectis Bioresearch (Paris, France) and Life Technologies (Grand Island, NY). The rapid progress of TALE engineering relative to the development of synthetic zinc finger proteins has led many to suggest that this protein motif may be more amenable to reengineering, potentially due to a more modular structure [22]. However, TALEs have been much less studied than zinc finger proteins, and continued work is necessary to fully understand the strengths and weaknesses of these different technologies.
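Because each repeat reads one base, designing a TALE array reduces to a per-base lookup. The Python sketch below assumes the commonly used repeat-variable diresidue assignments (NI for A, HD for C, NN for G, NG for T); these assignments are added here as an assumption and are not listed in this review, and NN is known to tolerate A as well as G. The function name is hypothetical.

```python
from typing import List

# Assumed RVD choice for each base (not stated in this review).
# NN is typically used for G, although it can also accept A.
BASE_TO_RVD = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_target(target: str) -> List[str]:
    """Return one repeat-variable diresidue per base of the target, in 5'-to-3' order."""
    target = target.upper()
    unknown = set(target) - set(BASE_TO_RVD)
    if unknown:
        raise ValueError("unsupported bases: " + ", ".join(sorted(unknown)))
    return [BASE_TO_RVD[base] for base in target]


if __name__ == "__main__":
    print(" ".join(rvds_for_target("ACGT")))
```

The one-repeat-per-base relationship means an 18 bp target needs 18 repeats, whereas a zinc finger array for the same site would need six fingers.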

3. Nuclease-Mediated Genome Editing

Although conventional homologous recombination can be used to introduce sequence changes into the genomic DNA of some species and cell types, this process is not efficient enough for most applications in which genome editing would be useful. However, the synthetic DNA-binding proteins described above can be used to engineer nucleases that can be targeted to almost any site in a cellular genome [1–3]. These nucleases create targeted double-strand breaks (DSBs) that stimulate the natural DNA repair machinery to mend the breaks by non-homologous end joining (NHEJ). This repair pathway can be used to disrupt a gene or excise segments of genomes. Alternatively, the DSB also dramatically enhances the rate of homologous recombination at that locus, so a homologous donor template can be delivered to the cells along with the nuclease to target gene addition to that site or make small substitutions to gene sequences. Recent advances in these techniques that have enabled more effective genome editing are described below.

3.1 FokI Endonuclease Domain

Engineered zinc finger proteins or TALEs can be fused to the catalytic domain of a restriction endonuclease to generate zinc finger nucleases (ZFNs) or TALE nucleases (TALENs) that create a DSB at the locus of interest. Because the nuclease acts as a dimer, two DNA-binding proteins must be engineered to target adjacent sequences, separated by a spacer region where the nuclease catalytic domain can dimerize and cleave the target DNA [1–3]. The catalytic domain most commonly used to induce the DSB is derived from the type IIS restriction endonuclease FokI. Several recent advances in the engineering of the FokI domain have been made to increase its activity and specificity. First, a directed evolution strategy was used to identify a hyperactive FokI variant, named Sharkey, that increases cleavage activity in vitro and in vivo [23]. Second, several mutations have been described that prevent unwanted FokI homodimer formation and genotoxic cleavage of off-target sequences [24]. Although FokI heterodimer variants had previously been described to prevent homodimer formation, this was accompanied by lower catalytic activity. These new mutations appear to restore the lost nuclease activity while maintaining the strict requirement for heterodimer formation [24]. Finally, variants of FokI have been described that act as orthogonal obligate heterodimers, such that autonomous pairs of nucleases can be used together without cross-reactivity or homodimer formation [25].
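The paired-site geometry described above can be sketched in Python. The example assumes the usual arrangement in which the two monomers bind opposite strands on either side of the spacer, so the right half-site is reported as its reverse complement; the function names and all lengths are hypothetical parameters chosen for illustration, not recommended design values.

```python
from typing import Dict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def paired_sites(locus: str, start: int, left_len: int,
                 spacer_len: int, right_len: int) -> Dict[str, str]:
    """Split a locus into left half-site, spacer, and right half-site.

    The right half-site is returned as read 5'-to-3' on the opposite strand,
    under the assumption that the second monomer binds that strand.
    """
    locus = locus.upper()
    end = start + left_len + spacer_len + right_len
    if start < 0 or end > len(locus):
        raise ValueError("requested sites extend beyond the locus")
    left_end = start + left_len
    spacer_end = left_end + spacer_len
    return {
        "left": locus[start:left_end],
        "spacer": locus[left_end:spacer_end],
        "right": reverse_complement(locus[spacer_end:end]),
    }


if __name__ == "__main__":
    print(paired_sites("AAACCCGGGTTTACGTAC", 0, 6, 6, 6))
```

Each half-site returned here would then be handed to a zinc finger or TALE design step, and cleavage is expected within the spacer where the two FokI domains meet.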

3.2 Donor Vectors for Homologous Recombination

Although NHEJ-based repair of DSBs is sufficient for gene disruption [26] and the deletion of chromosomal segments [27], the introduction of new sequences to the nuclease target site requires a homologous donor repair template [28–30]. For certain applications, creating this homologous donor with homology arms of >700 base pairs may be complicated or laborious. Two new approaches have simplified this strategy. First, linear donor sequences with as little as 50 base pairs of homology were efficiently integrated into sites of nuclease cleavage in human cells [31]. Second, single-stranded DNA oligonucleotides, in combination with engineered nucleases, were used instead of a donor targeting vector to induce targeted point mutations, deletions or insertions of short sequences [32,33]. Importantly, oligonucleotide-based templates contain the minimum genetic information needed to introduce DNA sequence changes, therefore reducing the risk of off-target effects.
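As a rough illustration of how a short donor is laid out, the Python sketch below flanks an insert with homology arms copied from either side of the intended cut position. The function name, the cut position, and the default arm length of 50 bp (taken from the shortest homology mentioned above) are illustrative assumptions; real donor design also has to consider factors such as re-cleavage of the edited allele, which this sketch ignores.

```python
def build_donor(locus: str, cut_index: int, insert: str, arm_len: int = 50) -> str:
    """Return left homology arm + insert + right homology arm around a cut position.

    cut_index is the 0-based position in the locus immediately after the cut.
    """
    locus = locus.upper()
    if cut_index - arm_len < 0 or cut_index + arm_len > len(locus):
        raise ValueError("locus is too short for the requested homology arms")
    left_arm = locus[cut_index - arm_len:cut_index]
    right_arm = locus[cut_index:cut_index + arm_len]
    return left_arm + insert.upper() + right_arm


if __name__ == "__main__":
    # Synthetic locus used only to make the arm boundaries easy to see.
    example_locus = "A" * 60 + "C" * 60
    donor = build_donor(example_locus, cut_index=60, insert="ggg")
    print(len(donor), donor)
```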

3.3 Enhancing Genome Editing

Current methods for nuclease-mediated genome editing do not allow for directing gene repair exclusively to either the NHEJ or homologous recombination pathway. Consequently, the inclusion of a donor vector with the nuclease treatment results in a cell population containing a mixture of cells modified by both pathways [34]. In order to better monitor this process, a “traffic light” reporter system was created in which NHEJ and homologous recombination events are differentially monitored by flow cytometric analysis of green and red fluorescence [35]. This study also identified factors that bias the balance of the two repair pathways and confirmed findings that creating single-strand breaks with a nickase, in contrast to DSBs with a nuclease, favors homology-directed repair and minimizes NHEJ [36–38]. The use of episomal fluorescent reporters of gene repair was later extended to enrich for cells modified at their endogenous locus by cytometric cell sorting [39]. Other methods that have been used to improve the efficiency of genome editing include the regulation of nuclease activity with small molecules to minimize toxicity [40] and transient hypothermia to increase nuclease expression levels [41].

4. Monitoring Specificity of Genome Editing

The usefulness of genome editing technologies is largely dependent on achieving single-site specificity in the context of large and complex genomes. However, it is challenging to prove that no other sequences across the whole genome are unintentionally modified. This is particularly important given the observed cytotoxicity of many nucleases, presumably due to off-target DNA cleavage. Analysis had previously been limited to predicting potential off-target sites based on in vitro binding profiles [42]. To address these concerns, new methods have been developed for comprehensive mapping of nuclease activity in vitro [43] and in vivo [44,45]. Although the ZFNs analyzed in these studies acted primarily at their intended target site, many previously unknown off-target sites were also revealed. These off-target sites had high sequence homology to the intended ZFN binding site, and therefore these methods will be valuable for designing improved nucleases. Additionally, advances in high-throughput DNA sequencing have enabled direct analysis of genomes with single base pair resolution. For example, a recent study sequenced the complete exome of a ZFN-treated clonal cell population and showed that only a single point mutation was created in this process [46].
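The observation that off-target sites resemble the intended binding site suggests a simple first-pass screen: list every position whose sequence differs from the target by only a few bases. The Python sketch below is a naive illustration of that idea, not the in vitro or in vivo mapping methods cited above; the function names and the mismatch threshold are hypothetical, and only the given strand is scanned.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def near_matches(sequence: str, site: str, max_mismatches: int) -> List[Tuple[int, str, int]]:
    """Return (position, window, mismatches) for windows close to the intended site."""
    sequence, site = sequence.upper(), site.upper()
    hits = []
    for i in range(len(sequence) - len(site) + 1):
        window = sequence[i:i + len(site)]
        mismatches = hamming(window, site)
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


if __name__ == "__main__":
    # Toy sequence holding one perfect site and one site with a single mismatch.
    toy = "TTGCGTGGGCGAAGCGTAGGCGTT"
    for hit in near_matches(toy, "GCGTGGGCG", max_mismatches=1):
        print(hit)
```

A scan of a real genome would also need to cover the reverse strand and both half-sites of a nuclease pair, and sequence similarity alone does not establish that a site is actually cleaved in cells.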

5. Genome Editing in Basic Science and Biotechnology

The advent of genome editing has created a variety of new approaches that are progressively becoming routine methods to interrogate biological systems. The accessibility of commercially and publicly available custom nucleases has facilitated novel studies of protein glycosylation [47], gene destabilization [48], protein localization and dynamics [49,50], chromosomal translocation [51] and DNA repair [52]. Genome editing tools can be used to model human disease [53] or generate human [54] or mouse [55] isogenic cell lines that allow for robust and uniform transgene expression. Traditionally, gene targeting in animal models has been largely limited to mice, but engineered nucleases have enabled targeted gene modifications in rats [56], pigs [57], zebrafish [58,59], frogs [60], rabbits [61], cattle [62], flies [26], and worms [63]. Furthermore, gene targeting in zygotes is possible, independent of embryonic stem cells [64]. Genome editing in plants is providing new opportunities in agricultural biotechnology for the production of food and biofuels [65,66]. Finally, genome editing has been applied to the generation of apoptosis-resistant mammalian cell lines for improved biopharmaceutical production [67].

6. Genome Editing in Gene and Cell Therapy

The field of gene therapy has typically focused on the addition of new genes to cells, leading to a variety of challenges and obstacles. Genome editing has provided several distinct means for addressing the limitations of previous gene therapy approaches. First, transgenes can be added to specific “safe harbor” loci in the genome [68,69], in contrast to conventional gene delivery vectors that integrate randomly into chromosomal DNA. This approach was recently explored as a gene therapy for chronic granulomatous disease [70]. Alternatively, disease-causing mutations can be directly corrected by genome editing, as has been done in studies of X-linked SCID [71], α1-antitrypsin deficiency [46], sickle cell anemia [72,73], hemophilia B [74], and p53-related cancer [75]. Genes may also be disrupted by genome editing to produce therapeutic phenotypes. For example, the HIV co-receptor CCR5 has been disrupted in both T cells [42] and hematopoietic stem cells [76], thus blocking HIV entry. Clinical trials are underway with this approach (NCT00842634, NCT01044654, and NCT01252641), and at the time of this review, data from the first phase 1 clinical trial have demonstrated improvement in several clinical parameters, with the treatment being well tolerated. The HIV co-receptor CXCR4 has also been targeted in similar preclinical studies [77]. Furthermore, gene editing has been used to enhance cellular immunotherapy by disrupting endogenous T cell receptors [78,79] or the glucocorticoid receptor in T cells in a clinical trial for malignant glioma (NCT01082926). Finally, successful genome editing in human embryonic stem cells and induced pluripotent stem cells has provided new avenues for genetic correction or augmentation in regenerative medicine [68,80–82].

7. Conclusions

Genome editing is rapidly progressing towards a golden era of easily accessible, highly specific enzymes that can directly manipulate genomic targets of interest. In the last two years, there has been an explosion in the number and diversity of applications of this technology (Figure 1). Collectively these advances represent a paradigm shift in the way we manipulate and study complex genomes and cellular processes.

Several challenges and opportunities still remain as these technologies move towards widespread adoption. A large-scale study of the in vitro and in vivo DNA-binding properties of TALEs relative to zinc finger proteins would provide insightful information on the differential activity and capacity for reengineering of these scaffolds. There is still much work to be done to further improve both the specificity of engineered nucleases and the methods used to monitor off-target events, and advances in high-throughput sequencing are facilitating these efforts [43,45,46]. The structure and epigenetic state of the genomic target site are likely as important as the engineered DNA-binding proteins [69], and this subject has been largely understudied. The continued development of methods for controlling the mechanisms of DNA repair will enhance the robustness of genome editing and the uniformity of modified alleles [34–38]. Improved methods for efficient nuclease delivery, particularly in vivo [74], will be essential to translating genome editing into gene therapies. Recent evidence that AAV-based homologous donor vectors lead to enhanced homology-directed repair provides a promising path forward in this area [83–86]. Finally, the development of methods for genome editing that do not depend on DNA repair pathways, such as zinc finger recombinases [87–91], may ultimately improve the safety and specificity of genome editing.

Table 2.

Representative Applications of Genome Editing

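Type of modification | Species | Gene | Nuclease Source | Reference(s)
Gene knockout | Human | CCR5 | Sangamo | 19,42,76
Gene knockout | Rat | eGFP, IgM, Rab38 | Sangamo | 56
Gene knockout | Pig | GGTA1 | Sangamo | 57
Gene knockout | Zebrafish | GOL, NTL | Sangamo | 58
Gene knockout | Frog | eGFP, NOG | Sangamo | 60
Gene knockout | Rabbit | IgM | Sangamo | 61
Gene knockout | Hamster | BAK, BAX | Sangamo | 67
Gene knockout | Human | CXCR4 | Sangamo | 77
Gene knockout | Human | TCR | Sangamo | 78,79
Gene knockout | Cattle | β-lactoglobulin | Sigma CompoZr | 62
Gene knockout | Human | COSMC | Sigma CompoZr | 47
Gene knockout | Zebrafish | KDRA | Bacterial 1-hybrid | 59
Gene knockout | Human | VEGF-A, HoxB13, CFTR | OPEN | 10
Gene knockout | Plants, Zebrafish | 20 unique genes | CoDA | 12
Gene knockout | Drosophila | yellow | Modular assembly | 26
Gene knockout | C. elegans | Nw | Modular assembly | 63
Gene addition | Human | H3f3b | Sangamo | 49
Gene addition | Human, Monkey | CLTA, DNM2 | Sangamo | 50
Gene addition | Maize | IPK1, Zp15 | Sangamo | 65
Gene addition | Human | Factor IX | Sangamo | 74
Gene addition | Human | Oct4, AAVS1, PITX3 | Sangamo | 80,82
Gene addition | Human | MALAT1 | Sigma CompoZr | 48
Gene addition | Human | AAVS1 | Sigma CompoZr | 54,68,70
Gene addition | Mouse | ROSA26 | Sigma CompoZr | 64
Gene addition | Mouse | ROSA26 | Modular assembly | 55
Gene addition | Human | VEGF-A | OPEN | 10
Gene addition | Human | PIG-A | OPEN | 81
Genetic substitution | Human | α1-antitrypsin | Sangamo | 46
Genetic substitution | Human | α-synuclein | Sangamo | 53
Genetic substitution | Human | IL2Rγ | Sangamo | 71
Genetic substitution | Human | β-globin | Sigma CompoZr | 72
Genetic substitution | Human | β-globin | OPEN | 73
Genetic substitution | Human | p53 | Yeast 1-hybrid | 75
Genetic substitution | Tobacco | SuR | Modular assembly, OPEN | 66
Genetic substitution | Drosophila | yellow | Modular assembly | 29
Chromosomal modification | Mouse | ROSA26, H3f3b | Sigma CompoZr, Sangamo | 52
Chromosomal modification | Human | IL2Rγ, AAVS1 | Sangamo | 51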

Acknowledgements

This work was supported by The Hartwell Foundation, a Basil O’Connor Starter Scholar Award from the March of Dimes, an NSF Faculty Early Career Development (CAREER) Award (1151035) and an NIH Director’s New Innovator Award (1DP2OD008586).

REFERENCES

*special interest

**outstanding interest