Directed evolution of an enhanced and highly efficient FokI cleavage domain for Zinc Finger Nucleases

Author manuscript; available in PMC: 2011 Jul 2.

Published in final edited form as: J Mol Biol. 2010 May 4;400(1):96–107. doi: 10.1016/j.jmb.2010.04.060

Abstract

Zinc finger nucleases (ZFNs) are powerful tools for gene therapy and genetic engineering. The high specificity and affinity of these chimeric enzymes are based on custom-designed zinc finger proteins (ZFPs). In order to improve the performance of existing ZFN technology, we developed an in vivo evolution-based approach to improve the efficacy of the _Fok_I cleavage domain (FCD). After multiple rounds of cycling mutagenesis and DNA shuffling, a more efficient nuclease variant (Sharkey) was generated. In vivo analyses indicated that Sharkey is >15-fold more active than wild-type FCD on a diverse panel of cleavage sites. Further, a mammalian cell-based assay showed a 3 to 6-fold improvement in targeted mutagenesis for ZFNs containing derivatives of the Sharkey cleavage domain. We also identified mutations that impart sequence specificity to the FCD that might be utilized in future studies to further refine ZFNs through cooperative specificity. In addition, Sharkey was observed to enhance the cleavage profiles of previously published and newly selected heterodimer ZFN architectures. This enhanced and highly efficient cleavage domain will aid in a variety of ZFN applications in medicine and biology.

Keywords: zinc finger nuclease, directed evolution, gene targeting

INTRODUCTION

Zinc finger nucleases are artificial restriction enzymes generated by fusing the nonspecific DNA cleavage domain of the endonuclease _Fok_I with site-specific DNA binding zinc finger proteins (ZFPs) 1. These chimeric enzymes have emerged as powerful tools for genome editing. Due to the flexible nature of zinc finger proteins, ZFNs can be assembled that induce double strand breaks (DSBs) site-specifically into genomic DNA. Targeted genes can be disrupted via mutagenic non-homologous end joining (NHEJ) 2–5 or modified via homologous recombination (HR) if a closely related DNA template is supplied6–9. This method has been applied in many organisms, including plants7, Drosophila2, 8, C. elegans9, zebrafish3, 4, rats10, and various mammalian cells5, 6, 11. ZFNs can also be used in basic molecular research, providing diverse options for molecular cloning12.

The modular structure of C2H2 zinc finger motifs and their independent recognition capabilities make them an ideal framework for developing custom ZFPs with novel sequence specificities13. Each motif recognizes 3 or 4 base pairs via its α-helix, and engineering multiple ZFs in tandem enables the recognition of extended DNA sequences with unparalleled specificity13–15. Multiple methods have been developed towards the goal of generating highly selective ZFPs. In the modular-assembly approach, a strategy that has proven to be very effective for the generation of zinc finger transcription factors, individual ZF modules with pre-selected specificities are assembled using standard recombinant DNA technology15, 16. Modular-assembly is rapid, but constrained by sequence limitations because ZF motifs are not available for each of the 64 DNA triplets 16–24. Alternatively, cell-based selection methods like oligomerized pool engineering (OPEN) have been shown to be effective but require the construction and interrogation of large libraries and have similar sequence constraints25, 26. When ZFPs are coupled with the nonspecific _Fok_I cleavage domain (FCD), their affinity and specificity are major determinants of the activity and toxicity of the resulting ZFN6, 27.

Unlike zinc finger transcription factors, which are usually composed of high affinity six-finger ZFPs and can be designed to recognize a single site within the human genome14, 15, ZFNs are typically composed of lower affinity three or four-finger ZFPs. To a certain extent this is attributable to the nature of ZFN target sites. ZFN target sites are composed of two ZFP binding sites positioned in a tail-to-tail orientation and separated by a 5 to 7-bp spacer sequence28. And while theoretically every sequence can be targeted by custom ZFPs, in practice, not all can be targeted efficiently. Many factors affect the efficiency and precision of a chimeric nuclease, including the specificity and affinity of ZFPs6, 27, the length and composition of the inter-domain linker29, 30, the length of spacer sequence DNA in between ZFP binding sites29, 30 and the interaction between two _Fok_I cleavage domains31, 32. Towards the goal of improving the performance of ZFNs in mammalian cells, we sought to enhance the catalytic capabilities of the _Fok_I cleavage domain through laboratory-directed protein evolution. We developed an in vivo cell-survival and evolution strategy to identify an FCD variant called Sharkey, which is >15-fold more active than wild-type FCD on an array of diverse cleavage sites. When coupled with ZFPs, Sharkey derivatives stimulated a 3 to 6-fold increase in mutagenesis in mammalian cells compared to ZFNs constructed with wild-type _Fok_I cleavage domain. In addition, we demonstrate that the Sharkey framework increases the rate of stimulated mutagenesis for both existing and newly selected heterodimeric ZFN architectures. We believe this novel FCD variant will serve as a useful reagent in a wide variety of ZFN applications.

RESULTS

Selection strategy

In order to selectively enrich for catalytically improved _Fok_I cleavage domain variants, we based our design on a two-plasmid system developed by Chen and Zhao33 for directing activity of the homing endonuclease I-_Sce_I to a novel sequence. This system links DNA cleavage events with cell survival. The reporter plasmid contains the toxic gene ccdB34 under tight control of the _ara_BAD promoter. Downstream of the ccdB gene we engineered one copy of the desired ZFN cleavage site (Fig. 1a). A low copy ZFN expression plasmid was constructed that encodes the ZFN gene under the control of a modified lac promoter, which possesses an additional lac operator sequence for tighter control of ZFN expression (Fig. 1b). Cleavage of the reporter plasmid by a ZFN mutant linearizes the plasmid and results in its degradation by RecBCD nuclease35. The process includes three steps: 1) expression of endonuclease mutants; 2) cleavage of reporter plasmids by these mutants; 3) degradation of linearized plasmids. The time needed for step three is constant, but that for step two is variable. The DNA cleavage rate of a mutant will influence the rate of step two and, therefore, the time required to linearize the plasmid carrying the toxic gene. As a consequence, the survival rate (SR) of a mutant, calculated as the ratio of the number of colonies on an arabinose selection plate to that on a non-selective plate, is positively correlated with its catalytic activity. Mutants with higher catalytic activity linearize all reporter plasmids in an E. coli cell within a shorter time window and will be enriched during evolution (Fig. 1c).
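
The survival rate defined above is a simple ratio of colony counts. A minimal sketch of the calculation follows; the function name and the colony counts are hypothetical and are not data from this study, and the sketch assumes both plates were spread from equal volumes at the same dilution.

```python
def survival_rate(colonies_selective: int, colonies_nonselective: int) -> float:
    """Return SR: colonies on the arabinose plate / colonies on the non-selective plate.

    Assumes both plates received the same volume at the same dilution
    (an assumption of this sketch, not a statement from the text).
    """
    if colonies_nonselective <= 0:
        raise ValueError("the non-selective plate must have at least one colony")
    return colonies_selective / colonies_nonselective


# Hypothetical counts, for illustration only.
sr = survival_rate(colonies_selective=25, colonies_nonselective=100)
print(f"SR = {sr:.0%}")
```

Under this definition, a more active variant clears the reporter plasmid sooner, so more cells survive arabinose induction and the ratio rises.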

Figure 1. Schematic representation of the selection strategy used for isolating novel FCD variants.

(a,b) A two-plasmid approach utilizing a reporter consisting of a single ZFN cleavage site downstream of ccdB and a ZFN expression plasmid under tight control of a modified lac promoter can be used to selectively enrich for catalytically improved _Fok_I cleavage domains. (c) A library of FCD variants can be transformed into the ccdB harboring BW25141 selection strain and enriched following ZFN mediated reporter plasmid cleavage/degradation. Decreasing the recovery time following transformation facilitates the isolation of ZFN variants with enhanced catalytic properties.

Directed evolution of an enhanced _Fok_I cleavage domain

In order to establish proper selective pressure in our system, we first measured the SR of wild-type ZFN. An expression plasmid harboring wild-type _Fok_I cleavage domain fused to ZF domain P3 (P3.wt) (Supplementary Fig. 1) was transformed into the p11LacY-sP3/P3 harboring strain BW25141 33. The sequence of the ZFN site in the reporter plasmid p11LacY-sP3/P3 is shown in Table 1. The spacer sequence of the ZFN cleavage site is the restriction site _Mlu_I. Aliquots of transformants were withdrawn at regular time intervals post transformation and grown on solid media with and without arabinose to calculate SR. Without IPTG induction, the resulting SR curve was linear and ~10% survival was observed after one hour of recovery (data not shown). With IPTG induction, however, the reaction was finished within 1 hour and survival was 80–100%. We therefore used IPTG induction only for the first round of selection.

Table 1.

ZFP binding characteristics and ZFN target sites.

| ZFP | _K_d (nM) | ZFP Binding Site |
| --- | --- | --- |
| P3 | - | GCA GTG GCG |
| E3 | 35 | GGG GCC GGA |
| E4 | 10.6 ± 3.3 | GGG GCC GGA GCC |
| E5 | 4.2 ± 1.9 | GGG GCC GGA GCC GCA |
| E6 (E2C) | 0.85 ± 0.2 | GGG GCC GGA GCC GCA GTG |

| Cleavage Site | DNA sequence (6 bp spacer) |
| --- | --- |
| E3/E3 | TCC GGC CCC (ACGCGT) GGG GCC GGA |
| P3/P3 | CGC CAC TGC (ACGCGT) GCA GTG GCG |
| P3/E6 | CGC CAC TGC (ACGCGT) GGG GCC GGA GCC GCA GTG |
| E4/E4 | GGC TCC GGC CCC (ACGCGT) GGG GCC GGA GCC |
| E5/E5 | TGC GGC TCC GGC CCC (ACGCGT) GGG GCC GGA GCC GCA |
| E6/E6 | CAC TGC GGC TCC GGC CCC (ACGCGT) GGG GCC GGA GCC GCA GTG |
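
Each cleavage site in Table 1 can be derived from the ZFP binding sites: the left half-site is the reverse complement of the left ZFP's binding site, followed by the 6-bp spacer and the right ZFP's binding site as written. The sketch below reproduces that layout; the function and variable names are hypothetical, and only the sequences are taken from Table 1.

```python
# Binding sites copied from Table 1 (spaces removed).
ZFP_SITES = {
    "P3": "GCAGTGGCG",
    "E3": "GGGGCCGGA",
    "E4": "GGGGCCGGAGCC",
    "E5": "GGGGCCGGAGCCGCA",
    "E6": "GGGGCCGGAGCCGCAGTG",
}

MLUI_SPACER = "ACGCGT"  # the 6-bp MluI site used as the spacer in Table 1

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def zfn_target_site(left_zfp: str, right_zfp: str, spacer: str = MLUI_SPACER) -> str:
    """Build a tail-to-tail ZFN site: revcomp(left site) + spacer + right site."""
    return reverse_complement(ZFP_SITES[left_zfp]) + spacer + ZFP_SITES[right_zfp]


for left, right in [("E3", "E3"), ("P3", "P3"), ("P3", "E6"), ("E6", "E6")]:
    print(f"{left}/{right}: {zfn_target_site(left, right)}")
```

Passing a different `spacer` argument gives the corresponding site for any other 6-bp spacer sequence.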

Error-prone PCR was used to introduce diversity into the catalytic domain of _Fok_I prior to fusion to the P3 ZF domain. The mutation rate was determined to be ~4 amino acids per catalytic domain. In addition to cycling mutagenesis, DNA shuffling36 was applied to the full ZFN gene to combine beneficial mutations every three rounds of selection. The ZFN library was cloned into pPDAZ and electroporated into E. coli. ZFN libraries were routinely composed of 10⁷–10⁸ transformants. Following transformation, ZFN containing plasmids were harvested from overnight culture and electroporated into the selection strain containing the reporter plasmid p11LacY-sP3/P3. As the evolution progressed, enrichment was observed (Fig. 2a). We measured SR at rounds 3, 6 and 9 (Fig. 2b). Relative to wild-type _Fok_I, rounds 6 and 9 showed much higher SR at all measured time intervals. A more direct evaluation of ZFN activity was performed in vitro. Supercoiled plasmid pSub-P3, which contained a single copy of ZFN cleavage site, was incubated with cell extracts from rounds 3, 6 and 9 at room temperature. Round 9 cell extracts displayed the highest activity, linearizing all substrate within 10 minutes. In comparison, wild-type _Fok_I linearized <10% of its substrate (Fig. 2c).

Figure 2. Enhancing the _Fok_I cleavage domain by directed evolution.

A library of ZFNs was transformed into selection strain BW25141 and subjected to multiple rounds of evolution. (a) Survival rate (SR), which correlates directly with catalytic activity, was measured at 1 hr for Rounds 3, 6 and 9. (b) SR was observed to increase with recovery time. SR curves were measured for wt (◆), R3 (■), R6 (▲) and R9 (●). (c) The extent of substrate linearization from Rounds 3, 6 and 9 was measured from cellular extracts prepared from overnight cultures. ‘Sub’ indicates supercoiled substrate plasmid pSub-P3. ‘Prod’ indicates linearized substrate plasmid pSub-P3. (d) SR for selection rounds 11–18 measured at 1 hr. Bar 10 indicates FCDR18-28.

Because round 9 displayed a SR of >50% at 1 hr, higher selection stringency was desired for further evolution. One way of increasing stringency is to decrease the level of protein expression. The initiation codon of a transcript has a direct impact on its translation efficiency. Changing the start codon from the most frequently used ATG to GTG can reduce protein translation levels by five-fold37. We tested this strategy on clone FCDR9-3 and found that a change in the start codon reduced the SR of this variant from ~80% to ~8% (data not shown). A secondary library with a GTG initiation codon was then constructed from the 9th round via error prone PCR. The transformed library was composed of ~10⁸ transformants. IPTG was added for the 10th round of selection. Recovery time was initially set at 2.5 hours and shortened by 0.5 hours after every three rounds of selection. Nine more rounds of evolution were executed in the absence of IPTG induction. Following every three rounds of selection, beneficial mutations were propagated by DNA shuffling. SR was measured following each round of selection (Fig. 2d). The selection was concluded when SR was measured to be >30%.

Analysis of selected _Fok_I cleavage domain variants

During sequence analysis of the final round, we noticed a large number of clones contained mutations in the ZF domain and/or the inter-domain linker. Those mutations were primarily in two areas: the second linker of the ZF motif and in the four-amino acid inter-domain linker. These two areas are removed from the catalytic center and mutations in these regions are likely to have altered the affinity and specificity of the ZFN rather than the catalytic domain. For a more direct comparison of FCD mutants, we re-amplified the FCDs from the 18th round of selection, cloned them in the original framework and screened 50 FCD variants for optimal performance in vivo. One of the most active catalytic domains, FCDR18-28 (S418P, F432L, K441E, Q481H, H523Y, N527D and K559Q) was selected for further characterization. With a GTG start codon, this mutant had a 25% SR at 1 hr, slightly lower than the overall 33% SR of the 18th round, but significantly higher than the 8% SR of the best clone from the 9th round. In comparison, wild-type _Fok_I had a SR <1% under the same conditions (data not shown).

To determine whether the presence of the FCDR18-28 domain enhanced the rate of DNA cleavage, we purified P3 nucleases with either the wild-type _Fok_I cleavage domain or FCDR18-28 and performed an in vitro DNA cleavage analysis. Rates of DNA cleavage were determined using a constant concentration of ZFN (12 nM) and increasing substrate concentrations (4, 6, 12, 24, or 36 nM). Linearized plasmid pSub-P3 DNA was used as a substrate. Cleavage of this substrate generates two product DNA molecules of the same size, simplifying the analysis. The progress of each reaction was monitored over time by measuring the initial velocity (Fig. 3a). The rate of cleavage for FCDR18-28 was 4 to 5-fold higher than that of wild-type _Fok_I, demonstrating that FCDR18-28 has enhanced catalytic activity relative to wild-type ZFN. FCDR18-28 was also observed to have a faster turnover rate than wild-type FCD (Fig. 3b).

Figure 3. Selected ZFN variants have enhanced catalytic profiles as demonstrated by in vitro DNA cleavage assays and in vivo activity assays.

(a) In vitro cleavage of target DNA by P3 nuclease with either wild-type _Fok_I cleavage domain (white) or FCDR18-28 (black). Cleavage rates were determined by measuring the initial velocity of DNA cleavage. 12 nM ZFN was added to pre-warmed ZFN Reaction Buffer containing 4, 6, 12, 24, or 36 nM pSub-P3 substrate DNA. Reaction conditions were standardized. (b) In vitro cleavage of target DNA by P3 nuclease with either FCDR18-28 or wild-type _Fok_I cleavage domain. ‘Uncut’ indicates linearized substrate plasmid pSub-P3. ‘Cut’ indicates cleavage products. Cleavage was monitored incrementally over 90 min. (c) Activity analysis of _Fok_I cleavage domain variants containing the selected mutations S418P, K441E, Q481H, N527D and S418P::K441E with the P3 zinc finger domain. ZFN activity was measured against _Mlu_I (ACGCGT) and Nx6 (N: A, T, C or G) spacer sequences and normalized to wild-type FCD. Error bars indicate standard deviation of three replicates. (d) Activity analysis of _Fok_I cleavage domain variants S418P::K441E (Sharkey) and FCDR18-28 (_Sharkey_’) with the P3 zinc finger domain. ZFN activity was measured against _Mlu_I, Nx6, ACGAAT, VF2471 (GAGAGT) and CFTR (TGGTGA) spacer sequences and normalized to wild-type FCD. Error bars indicate standard deviation of three replicates.

Having established that FCDR18-28 exhibits enhanced DNA cleavage and turnover rates in vitro, we next sought to identify the amino acid mutations required for increasing the performance of FCD. We sequenced individual FCD variants from the 18th round of selection and identified 4 mutations (S418P, K441E, Q481H and N527D) present in ≥70% of active mutants, including FCDR18-28 (Supplementary Fig. 2). Site-directed mutagenesis was performed to generate FCD variants containing these point mutations and SR was measured against ZFN target sites containing _Mlu_I (ACGCGT) and Nx6 (N: A, T, C or G) spacer sequences (Fig. 3c). Relative to wild-type _Fok_I, FCD variants containing the single point mutations S418P and K441E exhibited increased rates of cleavage for each spacer sequence. Interestingly, an FCD variant containing the single point mutation Q481H exhibited modest activity on the _Mlu_I core cleavage site and decreased activity on the Nx6 spacer sequence. N527D FCD exhibited considerably reduced levels of activity on both _Mlu_I and Nx6 core sequences. An FCD variant comprised of the beneficial mutations S418P and K441E (FCD418P.441E) cleaved _Mlu_I and Nx6 containing ZFN target sites >15-fold more efficiently than wild-type FCD (Fig. 3c). To probe the generalized cleavage capabilities of FCDR18-28 relative to FCD418P.441E and wild-type _Fok_I, SR was measured against a diverse array of spacer sequences. In addition to _Mlu_I and Nx6, SR was measured against ZFN target sites containing the endogenous Drosophila spacer sequence ACGAAT, the human VEGF-A spacer sequence GAGAGT (VF2471) and the human CFTR spacer sequence TGGTGA (CFTR) 25. Unexpectedly, FCDR18-28 exhibited substantially increased levels of cleavage exclusively on _Mlu_I, displaying only a modest gain in activity against the remaining panel of spacer sequences (Fig. 3d). However, relative to wild-type _Fok_I cleavage domain, FCD418P.441E exhibited >15-fold improvement in activity against each endogenous cleavage site (Fig. 3d). We anticipate the enhanced cleavage capabilities of FCD418P.441E (hereinafter referred to as Sharkey) will have great utility in a wide variety of ZFN based applications.

The complete nucleic acid and amino acid sequence of Sharkey is provided in Supplementary Information (Supplementary Fig. 3).

Sharkey enhances the efficiency of mutagenesis in mammalian cells

Because one major application of ZFNs is in gene therapy, we sought to evaluate the catalytic activity of Sharkey in a mammalian model system. In mammalian cells, the majority of DSBs are repaired by NHEJ, a somewhat error-prone process, resulting in small deletions and insertions at the site of DSBs38. It is expected that an increased frequency of DSBs at a given site will lead to an increased rate of mutagenesis. We constructed an EGFP based reporter system to rapidly gauge the potential for ZFNs to create site-specific DSBs. The recognition site for E2C nuclease (Table 1) was inserted in between amino acids 157 and 158 of the gene encoding EGFP and subsequently disabled with a frameshift. We chose to insert the ZFN recognition site between EGFP residues 157 and 158 because it is a position that has been previously shown to accommodate 20 amino acid peptide insertions39. The resulting nonfunctional transgene was stably integrated at a single location in the genome of HEK 293 cells using the Flp-In system. Certain deletions (e.g., 2, 5, or 8-bp) or insertions (e.g., 1, 4 or 7-bp) caused by NHEJ mediated mutagenesis will restore the frame and consequently EGFP function (Fig. 4a). While this assay only reflects a small portion of the total mutation events, it has the advantage of being robust and high throughput with little background. Moreover, the rate of mutagenesis can also be measured by _Mlu_I cleavage. Because the 6-bp spacer sequence between ZFP binding sites is an _Mlu_I restriction site, any mutations within this spacer will abolish cleavage by _Mlu_I and can be easily evaluated by limited-cycle PCR/restriction digest analysis (Fig. 4d).
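
The two readouts described above can be expressed as simple rules. The example indels in the text (2, 5 or 8-bp deletions; 1, 4 or 7-bp insertions) all correspond to a net length change of +1 modulo 3, so the sketch below assumes that any indel with that net change restores the EGFP reading frame. That generalization, and the function names, are assumptions of this sketch rather than statements from the study.

```python
MLUI_SITE = "ACGCGT"  # spacer sequence between the ZFP binding sites


def restores_reading_frame(net_indel_bp: int) -> bool:
    """Return True if an indel is expected to restore the EGFP frame.

    net_indel_bp is positive for insertions and negative for deletions.
    Assumption: the reporter needs a net +1 (mod 3) shift, as implied by
    the examples given in the text.
    """
    return net_indel_bp % 3 == 1


def resists_mlui(spacer: str) -> bool:
    """Return True if the spacer no longer matches the MluI site."""
    return spacer.upper() != MLUI_SITE


for indel in (-8, -5, -3, -2, -1, 1, 2, 3, 4, 7):
    print(f"{indel:+d} bp -> frame restored: {restores_reading_frame(indel)}")
```

Only about one in three indel lengths satisfies the frame rule, which is consistent with the observation that the EGFP readout reflects a small portion of total mutation events, whereas loss of the _Mlu_I site reports any change within the spacer.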

Figure 4. Sharkey’ increases the rate of mutagenesis in a mammalian model system.

(a) Schematic overview of the reporter system used to evaluate the efficiency of mutagenesis in mammalian cells. The model system consists of a HEK 293 cell line containing a modified and disabled EGFP transgene stably integrated in a single locus. An _Mlu_I restriction site flanked by E2C zinc finger recognition sites was inserted between EGFP residues 157 and 158. Select deletions (e.g., 2, 5 or 8-bp) or insertions (e.g., 1, 4, or 7-bp) result in frame restoration and EGFP expression. (b) Representative flow cytometry data for reporter cells transfected with CMV controlled wild-type _Fok_I and _Sharkey_’ cleavage domains with 3, 4, 5 and 6-finger zinc finger DNA binding domains. Mutagenesis is measured by counting the % of EGFP positive cells. (c) Quantification of EGFP positive reporter cells following transfection with ZFN. Error bars denote standard deviation of three replicates. (d) _Mlu_I restriction digest assay of HEK 293 reporter cells transfected with ZFN. ‘Cut’ indicates the presence of unmodified reporter gene. ‘Uncut’ indicates the presence of ZFN modified reporter gene. The % of modified reporter cells is indicated.

We constructed a series of ZFPs containing different numbers of ZF motifs. ZFP E2C (E6) is a six-finger protein and recognizes an 18-bp sequence, whereas ZFPs E5, E4 and E3 were made by deleting one, two or three fingers from the N-termini of the protein, respectively. The affinity of each of the ZFPs for its target was determined by electrophoretic mobility-shift assays (Table 1). ZFNs composed of these ZFPs target the same location in the mammalian genome, simplifying the analysis and reducing the potential background interference. We first fused these ZFPs to wild-type _Fok_I cleavage domain, under the control of a CMV promoter, and compared their abilities to stimulate mutagenesis in mammalian cells by transient expression experiments. On day 3 post transfection, we analyzed these samples by flow cytometry and found that ZFNs with four or five fingers promoted mutagenesis with the highest efficiency (6.87% ± 0.8% and 7.03% ± 0.96% EGFP positive, respectively). In contrast, the activity of E3.wt was barely above background (0.55% ± 0.10%) and E6.wt (3.96% ± 0.63%) was about half as active as E4.wt or E5.wt (Fig. 4b,c). The low activity of E3.wt was expected, because the affinity of E3 is rather modest (35 nM). E6.wt also showed reduced activity even though E6 exhibits appreciable affinity for its target site (0.85 nM). It is possible that too high an affinity may obstruct downstream processes. _Mlu_I assays were in agreement with these results (Fig. 4d).
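
The comparison among wild-type ZFNs can be made explicit by normalizing each mean EGFP-positive percentage to the most active construct. The sketch below uses only the mean values quoted in the paragraph above and ignores the reported standard deviations; the variable and function names are hypothetical.

```python
# Mean % EGFP-positive cells for wild-type FCD ZFNs, as quoted in the text.
egfp_positive_pct = {
    "E3.wt": 0.55,
    "E4.wt": 6.87,
    "E5.wt": 7.03,
    "E6.wt": 3.96,
}


def relative_activity(values: dict[str, float]) -> dict[str, float]:
    """Normalize each value to the largest value in the mapping."""
    best = max(values.values())
    return {name: pct / best for name, pct in values.items()}


relative = relative_activity(egfp_positive_pct)
for name, fraction in relative.items():
    print(f"{name}: {fraction:.2f} of the most active wild-type ZFN")
```

On these means, E6.wt comes out at a little over half the activity of E4.wt or E5.wt.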

We then substituted the wild-type cleavage domain in these ZFNs with Sharkey’ (previously FCDR18-28), a derivative of the hyperactive Sharkey cleavage domain, and performed an identical mutagenesis assay. Following transfection, Sharkey’ based ZFNs increased EGFP expression 3 to 6-fold relative to wild-type ZFN (Fig. 4b,c). A 2 to 3-fold increase was observed by _Mlu_I digestion, resulting in up to ~64% targeted mutagenesis (Fig. 4d). Additionally, Sharkey’ enhanced the performance of E6 nucleases. This may be due to the higher turnover rate of Sharkey’. To ensure that the improved activity was not achieved at the cost of increased off-target cleavage, we utilized a well-established assay to measure genome-wide DNA cleavage levels. Phosphorylated histone H2AX (γH2AX) appears rapidly after DNA damage and can be used as a DSB indicator40. Using FITC labeled antibody against γH2AX, we quantified the percentage of fixed antibody-stained cells by flow cytometry. We observed no appreciable difference in the levels of γH2AX staining resulting from Sharkey’ or wild-type ZFN expression (Supplementary Fig. 4).

Sharkey is compatible with heterodimeric _Fok_I cleavage domain architectures

The use of ZFNs can often be associated with considerable cytotoxicity, an undesirable result presumably linked to off-target cleavage events11. In recent years structure-based design strategies have successfully been applied towards the generation of improved ZFN architectures with diminished levels of cleavage-competent homodimers31, 32. Through iterative rounds of structure-guided mutagenesis and screening, Miller et al. were able to identify compatible FCDs with two substitutions in each cleavage domain, E490K:I538K and Q486E:I499L31. Relying primarily on in silico approaches, Szczepek et al. generated compatible FCD variants by remodeling the dimerization interface of the native enzyme, simply exchanging salt-bridge participating Arg and Asp residues32. However, while these ZFN architectures are efficient for endogenous gene targeting applications, oftentimes fewer mutations are introduced5, a result perhaps due to reduced binding affinity between specialized cleavage domains31. We sought to address this issue and engineer enhanced heterodimeric ZFNs by 1) integrating the Sharkey framework within the existing ZFN architectures developed by Miller et al. and Szczepek et al. and 2) developing novel Sharkey based architectures via directed evolution.

We began the process of evolving a Sharkey based heterodimeric ZFN framework by introducing a ZFN expression cassette containing an E6.Sharkey’ D483R mutant into the reporter plasmid p11-LacY-sE6/P3. The homodimeric P3/P3 cleavage site within this reporter plasmid was replaced with the heterodimeric E6/P3 cleavage site (Table 1). The Sharkey derivative FCD9-3 (S418P, K448E, H523Y, N527D and R570Q) was used as a starting point for our selections. To limit the library size, we focused on introducing diversity exclusively into the FCD9-3 dimerization interface, holding E6.Sharkey’ constant over the course of our evolutions. Randomizing FCD9-3 positions 483-487 with all 20 amino acids (theoretical library size: 3.4 × 10⁷ variants), we cloned the P3.FCD9-3 library into the ZFN expression plasmid pPDAZ, generating ~10⁸ transformants. Active ZFN variants were enriched as described earlier. Following three rounds of selection, we observed Asp as the consensus residue at position 483. We subsequently generated additional diversity within FCD9-3 at positions 484-487, this time holding D483 constant. All 20 amino acids were represented at each position (theoretical library size: 1 × 10⁶ variants). Following five rounds of selection we observed the SR to be ~27% (Supplementary Fig. 5). Sequence analysis of selected mutants led to the identification of the consensus motif DAMQS (hereinafter referred to as DS) (Supplementary Fig. 1f).
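
The theoretical library sizes quoted above are consistent with counting DNA-level diversity from a 32-codon degenerate scheme (such as NNK) at each randomized position. The codon scheme is not stated in the text, so it is an assumption of this sketch, as are the function and constant names.

```python
def library_size(n_positions: int, variants_per_position: int) -> int:
    """Number of distinct sequences when each position is varied independently."""
    return variants_per_position ** n_positions


NNK_CODONS = 32   # assumed degenerate codon scheme (not stated in the text)
AMINO_ACIDS = 20

# Positions 483-487 (five positions), then 484-487 (four positions).
print(f"5 positions, DNA level:     {library_size(5, NNK_CODONS):.1e}")
print(f"4 positions, DNA level:     {library_size(4, NNK_CODONS):.1e}")
print(f"5 positions, protein level: {library_size(5, AMINO_ACIDS):.1e}")
```

Under this assumption the five-position library has 32⁵ ≈ 3.4 × 10⁷ members and the four-position library 32⁴ ≈ 1 × 10⁶, so the ~10⁸ transformants reported for the first library exceed its theoretical size roughly three-fold.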

We next sought to integrate Sharkey with the existing heterodimeric ZFN architectures. We hypothesized that introducing Sharkey mutations into the FCD variants designed by Miller et al. and Szczepek et al. would lead to increased rates of stimulated mutagenesis in mammalian cells. Towards the goal of validating our hypothesis, we constructed an EGFP reporter cell line containing the heterodimeric P3/E4 ZFN cleavage site (Table 1). This cell line is identical to the previous reporter line described, with the exception of the P3/E4 cleavage site. Sharkey’ mutations were positioned alongside the heterodimeric KK::EL and RR::DD interfaces and fused to either P3 or E4 ZFPs. In addition to these existing architectures, we analyzed the mutagenesis stimulating characteristics of our selected DS domain with wild-type FCD and Sharkey’ residues. Each ZFN was under control of a CMV promoter and mutagenesis was measured by flow cytometry following transient ZFN expression. The pair-wise analysis of these three architectures shows that ZFN heterodimers containing Sharkey’ mutations are >6-fold more efficient at stimulating mutagenesis than wild-type _Fok_I based KK::EL and RR::DD ZFNs (Fig. 5a,b). In addition, Sharkey’ based ZFN heterodimers comprised of our selected DS interface are >2-fold more efficient at stimulating mutagenesis than the RR::DD heterodimers described by Szczepek et al. (Fig. 5a). None of the characterized variants were observed to be cleavage-competent homodimers. These results demonstrate that Sharkey mutations are compatible with existing heterodimer ZFN architectures and may further increase the rate of stimulated mutagenesis.

Figure 5. Sharkey is compatible with alternative ZFN architectures.

Mutagenesis efficiencies for asymmetric ZFN scaffolds comprised of wild-type _Fok_I cleavage domain (black) or Sharkey mutations (white). (a) RR::DD, RR::DS and (b) KK::EL FCD architectures were fused to E4 or P3 ZFPs. Stimulated mutagenesis was measured by counting the % of EGFP positive cells. Error bars indicate standard deviation of three replicates.

DISCUSSION

We have improved the efficiency of ZFNs through use of an evolutionary strategy geared towards optimizing the catalytic activity of the _Fok_I cleavage domain. Numerous factors affect the proficiency of a chimeric nuclease, including the specificity and affinity of ZFPs6, 27, the length and composition of the inter-domain linker29, 30, the length of spacer sequence DNA in between ZFP binding sites29, 30 and the interaction between two _Fok_I cleavage domains31, 32. We hypothesized that by increasing the catalytic activity of FCD through directed protein evolution, we could improve the performance of ZFNs even if they were constructed using lower affinity ZFPs. Using a sensitive in vivo selection methodology in parallel with cycling mutagenesis, in vitro DNA shuffling, and site-directed mutagenesis, we were able to identify an FCD mutant with enhanced catalytic activity relative to the wild-type domain, as demonstrated by bacterial genetic assays, in vitro DNA cleavage assays, and targeted mammalian genome mutagenesis assays (Fig. 2–4).

Sharkey, the most efficient _Fok_I cleavage domain variant we interrogated, consists of two amino acid mutations, S418P and K441E, which were shown to independently enhance the cleavage capabilities of wild-type FCD. Situated within 8 Å of the catalytic center Lys 469 and within 4 Å of the phosphate backbone of bound substrate DNA (Fig. 6), S418P was observed to increase activity >10-fold relative to wild-type FCD. In comparison, K441E increased catalytic performance >5-fold, relative to wild-type FCD (Fig. 3c). The S418P mutation appears at a turn region in the protein and may contribute to the observed increase in catalytic efficiency by fine-tuning the structure of the enzyme. The introduction of Pro at the onset of helix α3 may influence its structural rigidity and enable rapid substrate turnover and recognition. In addition to the advantageous S418P and K441E mutations, both Q481H and N527D were observed in ≥70% of sequenced FCD variants. However, while Q481H lies within 7 Å of the catalytic center (Fig. 6), introduction of either itself or N527D to _Fok_I diminishes activity 2 to 8-fold, relative to the wild-type cleavage domain. Indeed, it appears the accumulation of either of these point mutations over the course of our Sharkey’ evolutions resulted in preferential activity, specifically towards the _Mlu_I core sequence. The ability to refine the substrate specificity of the FCD by identification of mutations that contribute to substrate discrimination may have great utility for various endogenous gene-targeting applications. Saturation mutagenesis of Gln 481 and Asn 527, two amino acids that may play a role in DNA recognition, as well as neighboring amino acids may yield the identification of selective FCD variants capable of discriminating between highly homologous endogenous target sites, thus refining overall specificity in conjunction with the zinc finger domains through cooperative specificity42. The development of ZFNs with preferential FCD activities may further reduce off-target cleavage events and consequently toxicity. Additionally, over the course of these evolutions, we were able to identify mutations outside the cleavage domain, which may contribute to the activity of ZFNs by affecting the affinity and specificity of custom ZFPs. Our results suggest that our adapted selection system may be utilized for optimizing the zinc finger domains of the ZFN as well.

Figure 6. Selected mutations mapped on the crystal structure of the restriction endonuclease _Fok_I.

Mutations identified in ≥70% of sequenced variants are highlighted. Activating mutations S418P and K441E are depicted as red spheres on full-length _Fok_I bound to substrate DNA (PDB ID: 1FOK) 41. The selected mutations Q481H and N527D are shown as blue spheres. The catalytic center amino acids Asp 450, Asp 467 and Lys 469 are depicted as green balls and sticks.

In addition to demonstrating the enhanced cleavage capabilities of Sharkey in a variety of contexts, including a mammalian mutagenesis assay, we showed that Sharkey mutations are compatible with the ZFN architectures developed by Miller et al. and Szczepek et al.31, 32 Moreover, towards the goal of improving the efficiency of existing asymmetric ZFN scaffolds, we successfully adapted our selection system for the directed evolution of novel heterodimeric architectures. Utilizing saturation mutagenesis to target two critical hydrogen bonds within the dimer interface of _Fok_I, we identified a novel _Fok_I interface, DAMQS, that, in conjunction with the D483R ZFN architecture, stimulated mutagenesis more efficiently than the RR::DD ZFN scaffolds developed by Szczepek et al. We suspect the original interface contained sub-optimal hydrogen bonding networks as a result of the in silico-based approach used to generate the asymmetric interface. We believe that targeting neighboring amino acids by saturation mutagenesis enabled us to experimentally survey favorable regions of sequence space inaccessible through computational approaches and to identify increasingly stable ZFN heterodimers with optimal side-chain configurations.

In summary, with increased catalytic capabilities and compatibility with the promising and potentially powerful heterodimeric ZFN scaffolds, we anticipate the Sharkey domain will prove indispensable for a growing list of zinc finger nuclease related applications throughout biology and medicine.

METHODS

Directed evolution of ZFN variants

The expression plasmid pPDAZ was constructed by first removing nucleotides 5-33 from pPROLar.A322 (Clontech Laboratories, Inc) to generate pPROLar.del.ara. This modification abolished arabinose control over protein expression. The zeocin resistance gene was next PCR amplified from pcDNA3.1/Zeo(−) and cloned into pPROLar.del.ara with AatII and SacI to form pPDAZ. ZFNs were cloned into pPDAZ with the restriction sites KpnI and XbaI. The P3.FN recognition sequence was PCR amplified using the overlapping primers P3.FN Fwd and P3.FN Rev and cloned into the reporter plasmid p11-LacY (kindly provided by Dr. H. Zhao) to generate p11-LacY-sP3/P3. Additional DNA beyond the ZFN recognition site was included so as to avoid formation of a target site hairpin during PCR. Primer sequences are reported in Supplementary Figure 6. The reporter plasmid p11-LacY-sP3/P3 was transformed into E. coli BW25141 (kindly provided by Dr. H. Zhao). Electrocompetent cells containing the reporter plasmid p11-LacY-sP3/P3 were prepared using standard protocols.

The _Fok_I cleavage domain was amplified by PCR from the template pcDNA.FCD with the primers FokI Fwd and FokI Rev. The P3 ZFP was amplified by PCR from the template pMal-PBS2.HS1 with the primers ZIF-KpnI Fwd and 3ZF-FokI Rev. Overlap PCR with ZIF-KpnI Fwd and FokI Rev was used to fuse the _Fok_I cleavage domain to the P3 ZFP. P3.wt ZFN was cloned into pPDAZ with KpnI and XbaI to generate pPDAZ-P3.FCD. Libraries of ZFN mutants were generated by error-prone PCR. Amplification of the _Fok_I cleavage domain was performed over 20 cycles with the template pPDAZ-P3.FCD and the primers ZFN Lib Fwd and Lib Restore Rev in the presence of 12.5 μM dPTP and 12.5 μM 8-oxo-dGTP. An error-free copy of the P3 ZFP was PCR amplified with ZIF-KpnI Fwd and ZFN Lib Rev. Subsequent overlap PCR with ZIF-KpnI Fwd and Lib Restore Rev was used to fuse the cleavage domain (with an average of 4 amino acid mutations) to the error-free copy of the P3 ZFP. The resulting ZFN library was cloned into pPDAZ with KpnI and XbaI and electroporated into E. coli. ZFN libraries were routinely composed of 10^7–10^8 members. Following transformation, ZFN-containing plasmids were isolated from overnight culture and electroporated into the selection strain BW25141. Transformed cells were recovered in SOC at 37 °C for 1 hr before plating on solid selection media containing 25 ng/mL zeocin and 10 mM arabinose. Survival rate (SR) was calculated as the ratio of the number of colonies on an arabinose plate to that on an arabinose-free plate. Following 9 rounds of selection, the start codon ATG was replaced with GTG and the recovery time following electroporation was increased to 3 hr. Subsequent rounds of selection saw an incremental decrease in recovery time.
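The survival rate defined above can be written out as a short calculation. The following Python sketch is illustrative only: the function name and the colony counts are hypothetical, and it assumes both plates were spread with the same volume and dilution of the recovered culture.

```python
def survival_rate(colonies_with_arabinose: int, colonies_without_arabinose: int) -> float:
    """Return SR: colonies on the arabinose plate / colonies on the arabinose-free plate.

    Assumes (not stated in the text) that both plates received the same
    volume and dilution of cells, so raw colony counts are comparable.
    """
    if colonies_without_arabinose <= 0:
        raise ValueError("the arabinose-free plate must have at least one colony")
    return colonies_with_arabinose / colonies_without_arabinose


# Hypothetical colony counts, for illustration only.
sr = survival_rate(colonies_with_arabinose=12, colonies_without_arabinose=600)
print(f"SR = {sr:.1%}")  # SR = 2.0%
```

If the two plates were spread at different dilutions, the counts would first need to be scaled to colony-forming units per unit volume before taking the ratio.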

ZFN purification and in vitro DNA cleavage assay

N-terminal His6-tagged P3.wt and P3.Sharkey’ were cloned into the p11-LacY-wtx1 expression vector, replacing the NheI and XbaI flanked ccdB gene, and transformed into E. coli TOP10F’. Single colonies were picked and grown overnight at 37 °C in SB media with 100 μg/mL carbenicillin, an ampicillin analogue, and 1% glucose. The overnight culture was then diluted 1:100 in SB media containing 90 μM ZnCl2 and 100 μg/mL carbenicillin and grown at 37 °C with shaking to an OD600 of 0.4, at which point each culture was incubated at RT. At an OD600 of 0.6, protein expression was induced with 10 mM arabinose. After 5 hr, cells were harvested via centrifugation and sonicated in lysis buffer (50 mM NaH2PO4, 500 mM NaCl, 0.1 mM ZnCl2, 1 mM PMSF, 5 mM DTT, 5 mM imidazole and 10% glycerol, pH 7.9). Proteins were purified using Ni-NTA agarose resin (Qiagen), eluted in 50 mM NaH2PO4, 500 mM NaCl, 0.1 mM ZnCl2, 400 mM imidazole and 10% glycerol, pH 7.9, and further purified by ion exchange chromatography with SP Sepharose Fast Flow resin (Amersham Pharmacia Biotech AB). Protein was concentrated and stored at −80 °C in 20 mM NaH2PO4, 50 mM NaCl, 100 μM ZnCl2, 5 mM DTT and 50% glycerol, pH 7.4 until use.

The in vitro cleavage assay was performed against linearized substrate plasmid pSub-P3. ZFN (12 nM) was added to pre-warmed ZFN Reaction Buffer (20 mM Tris-acetate, 10 mM magnesium acetate, 50 mM potassium acetate, 90 μM ZnCl2 and 5 mM DTT, pH 7.9) containing 4, 6, 12, 24, or 36 nM of substrate DNA. Samples were incubated at 37 °C. Aliquots (5–10 μL) were withdrawn at regular time intervals and quenched with 2 μL stop solution (10 mM EDTA, 39% glycerol, 0.5% SDS, 0.025% bromophenol blue and 0.025% xylene cyanol). ZFN-mediated DNA cleavage was monitored by gel electrophoresis and analyzed with ImageJ.
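One common way to turn gel band intensities into a cleavage readout is to compute the fraction of substrate converted to product at each time point. The Python sketch below illustrates that idea only. The quantification formula, the function names and the intensity values are assumptions for illustration, not a description of the analysis actually performed with ImageJ in this study.

```python
from typing import Sequence


def fraction_cleaved(substrate_intensity: float, product_intensities: Sequence[float]) -> float:
    """Fraction of DNA cleaved, from background-corrected band intensities.

    Assumption: stain signal is proportional to DNA mass, so the summed
    product bands and the remaining substrate band are directly comparable.
    """
    product_total = sum(product_intensities)
    total = substrate_intensity + product_total
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return product_total / total


def product_concentration_nM(fraction: float, initial_substrate_nM: float) -> float:
    """Convert a cleaved fraction into nM of substrate cleaved."""
    return fraction * initial_substrate_nM


# Hypothetical intensities for one lane of a 12 nM substrate reaction.
f = fraction_cleaved(substrate_intensity=50.0, product_intensities=[30.0, 20.0])
print(f)                                  # 0.5
print(product_concentration_nM(f, 12.0))  # 6.0
```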

In vivo ZFN activity assay

P3.FCD variants containing a GTG initiation codon and the mutations S418P, Q481H, N527D and K441E were generated by QuikChange site-directed mutagenesis PCR. Site-directed mutagenesis was confirmed by DNA sequencing. Target sites containing the _Mlu_I (ACGCGT), Nx6 (N: A, T, C or G), ACGAAT, VF2471 (GAGAGT) and CFTR (TGGTGA) core sequences were amplified using the overlapping synthetic oligonucleotides sPBSm-SceI-X Fwd and sPBSm-SceI-X Rev, where X = ZFN cleavage site core sequence. ZFN target sites were subsequently cloned into the reporter plasmid p11-LacY-X and transformed into E. coli BW25141. Individual colonies containing the reporter plasmid p11-LacY-X were cultured overnight and electrocompetent cells for each ZFN cleavage site were prepared. ZFN expression plasmid was transformed into the appropriate BW25141 selection strain. ZFN activity was measured by calculating the SR at the time point at which wild-type ZFN SR was observed to be 0.5–2%. ZFN variant SR was normalized to wild-type ZFN.
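The normalization described above amounts to dividing each variant SR by the wild-type SR measured at the same time point, once the wild-type SR falls in the 0.5–2% window. A minimal Python sketch follows. The function names and the example SR values are hypothetical.

```python
def wild_type_in_window(wild_type_sr: float, low: float = 0.005, high: float = 0.02) -> bool:
    """True if the wild-type SR lies in the 0.5-2% window used to pick the time point."""
    return low <= wild_type_sr <= high


def fold_over_wild_type(variant_sr: float, wild_type_sr: float) -> float:
    """Variant SR normalized to the wild-type SR from the same time point."""
    if wild_type_sr <= 0:
        raise ValueError("wild-type SR must be positive to normalize")
    return variant_sr / wild_type_sr


# Hypothetical survival rates (as fractions), for illustration only.
wt_sr, variant_sr = 0.01, 0.15
if wild_type_in_window(wt_sr):
    print(f"{fold_over_wild_type(variant_sr, wt_sr):.1f}-fold over wild type")  # 15.0-fold over wild type
```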

Directed evolution of Sharkey’ based ZFN heterodimers

To construct the heterodimer reporter plasmid p11-LacY-sE6/P3, an additional expression cassette was PCR amplified from pPROLar.A322 and cloned into the NsiI site of p11-LacY-sE6/P3-ΔXbaI. The mutation D483R was introduced into E6.Sharkey’ by QuikChange site-directed mutagenesis PCR. The resulting fixed ZFN variant was cloned into the reporter plasmid p11-LacY-sE6/P3-ΔXbaI and transformed into the selection strain BW25141. The FCD variant FCD9-3 was used as a starting point for diversity introduction as Sharkey had yet to be identified. Diversity was introduced into FCD9-3 at amino acids 483-487 with the oligonucleotides Rd DEMQR Fwd and Rd DEMQR Rev. The Rev primer contained an NNK codon at each of the corresponding targeted amino acids (where N: A, T, C or G and K: G or T). The NNK codon yields 32 possible codons, which encode all 20 amino acids. The ZFN library was subsequently cloned into pPDAZ with KpnI and XbaI and electroporated into E. coli. The ZFN library was composed of >10^8 transformants. ZFN-containing plasmids were isolated from overnight culture and electroporated into the selection strain BW25141, which contained the E6.FCD9-3 D483R expressing reporter plasmid p11-LacY-sE6/P3-ΔXbaI. Active ZFN variants were enriched as previously described. The secondary FCD9-3 library was generated with the oligonucleotides Rd DEMQR Fwd and Rd EMQR Rev.
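The NNK codon count stated above can be checked by enumeration against the standard genetic code. The Python sketch below is an illustration: it also estimates the theoretical library diversity, assuming all five positions (483-487) are randomized independently. The variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# NNK: N = A, C, G or T at positions 1 and 2; K = G or T at position 3.
nnk_codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]
amino_acids = {CODON_TABLE[c] for c in nnk_codons} - {"*"}
stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]

print(len(nnk_codons))   # 32
print(len(amino_acids))  # 20
print(stop_codons)       # ['TAG']

# Theoretical diversity for five independently randomized positions.
RANDOMIZED_POSITIONS = 5  # residues 483-487
dna_diversity = len(nnk_codons) ** RANDOMIZED_POSITIONS       # 33,554,432
protein_diversity = len(amino_acids) ** RANDOMIZED_POSITIONS  # 3,200,000
print(dna_diversity, protein_diversity)
```

Under these assumptions, a library of >10^8 transformants exceeds the roughly 3.4 × 10^7 possible DNA sequences. The only stop codon that NNK retains is TAG.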

Construction of mammalian cell lines and measurements of mutagenesis

To generate an EGFP reporter gene containing either E6/E6 or E6/P3 target sites, a single SgrAI site was inserted between EGFP residues 157 and 158 using the overlapping primers midEGFP SgrAI Fwd and midEGFP SgrAI Rev. The overlapping oligonucleotides E2C.I-SceI Fwd and E2C.I-SceI Rev were subsequently used to clone into the engineered SgrAI site to generate the E6/E6 EGFP reporter gene. The overlapping oligonucleotides PBS2/E2C.I-SceI Fwd and E2C.I-SceI Rev were used to clone into SgrAI to generate the E6/P3 EGFP reporter gene. Reporter cell lines containing a single CMV promoter and a modified EGFP reporter gene were generated in Flp-In-293 cells using the Flp-In system (Invitrogen).

Wild-type FCD and Sharkey’ were sub-cloned into the ZFN expression plasmid pVAX1 (Invitrogen) using the restriction enzymes PstI and XbaI. E3, E4, E5 and E6 ZFPs were constructed as previously described21, and sub-cloned into the ZFN expression plasmid using the restriction enzymes XmaI and AgeI. To measure the rates of stimulated mutagenesis, reporter cells were seeded onto polylysine-coated 24-well plates at a density of 1.5 × 10^5 cells per well. After 24 hr incubation, reporter cells were co-transfected with 100 ng ZFN expression plasmid and 500 ng pcDNA3.1/Zeo(−) using Lipofectamine 2000 (Invitrogen) under conditions specified by the manufacturer. Similarly, for heterodimer ZFN assays, cells were co-transfected with 100 ng ZFN expression plasmid and 400 ng pcDNA3.1/Zeo(−) carrier DNA. Transfection efficiencies were measured to be between 70% and 80%. Three days post transfection, 30,000 cells were analyzed by flow cytometry (FACScan Dual Laser Flow Cytometer, BD Biosciences) to measure the percentage of EGFP positive cells. Additionally, the rate of mutagenesis was measured by _Mlu_I cleavage. Briefly, 3 days post transfection, genomic DNA was harvested and purified with a QIAamp DNA Mini Kit (Qiagen). The modified EGFP gene was amplified over 30 cycles by PCR (Expand High Fidelity, Roche) with 1 μg template DNA and 10% v/v DMSO with an annealing temperature of 72 °C. The PCR products were digested overnight with _Mlu_I and visualized by gel electrophoresis.

γ-H2AX based cytotoxicity assay

The γ-H2AX cytotoxicity assay is described in Supplementary Information.

Accession Numbers

The nucleic acid sequence of the Sharkey cleavage domain has been deposited in GenBank with accession number HM130522.

Supplementary Material

01

Acknowledgments

This study was supported by U.S. National Institutes of Health grant R01GM065059 and The Skaggs Institute for Chemical Biology.

ABBREVIATIONS

ZFP: zinc finger protein

ZFN: zinc finger nuclease

DSBs: double strand breaks

NHEJ: non-homologous end joining

HR: homologous recombination

OPEN: oligomerized pool engineering

FCD: _Fok_I cleavage domain

SR: survival rate

IPTG: isopropyl β-D-1-thiogalactopyranoside

Footnotes

AUTHOR CONTRIBUTIONS

J.G., T.G. and C.F.B. designed research; J.G. and T.G. performed research; J.G., T.G. and C.F.B. analyzed data; and J.G., T.G. and C.F.B. wrote the paper.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
