Cryptic _lox_P sites in mammalian genomes: genome-wide distribution and relevance for the efficiency of BAC/PAC recombineering techniques

Journal Article

1Molecular Physiology Group, University of Edinburgh Medical School, Edinburgh EH16 4TJ, UK, 2Current address: Department of Cardiovascular Medicine, University of Oxford, Oxford OX3 9DU, UK, 3Endocrine Science Research Group, University of Manchester M13 9PT, UK and 4Endocrinology Unit, University of Edinburgh Medical School, Edinburgh EH16 4TJ, UK

*To whom correspondence should be addressed. Tel: +44-131-242-6722; Fax: +44-131-242-6782; Email: J.Mullins@ed.ac.uk

Revision received: 01 December 2006

Accepted: 05 December 2006

Published: 06 February 2007

S. Semprini, T.J. Troup, N. Kotelevtseva, K. King, J.R.E. Davis, L.J. Mullins, K.E. Chapman, D.R. Dunbar, J.J. Mullins, Cryptic _lox_P sites in mammalian genomes: genome-wide distribution and relevance for the efficiency of BAC/PAC recombineering techniques, Nucleic Acids Research, Volume 35, Issue 5, 1 March 2007, Pages 1402–1410, https://doi.org/10.1093/nar/gkl1108

Abstract

Cre is widely used for DNA tailoring and, in combination with recombineering techniques, to modify BAC/PAC sequences for generating transgenic animals. However, mammalian genomes contain recombinase recognition sites (cryptic _lox_P sites) that can promote illegitimate DNA recombination and damage when cells express the Cre recombinase gene. We have created a new bioinformatic tool, FuzznucComparator, which searches for cryptic _lox_P sites and we have applied it to the analysis of the whole mouse genome. We found that cryptic _lox_P sites occur frequently and are homogeneously distributed in the genome. Given the mammalian nature of BAC/PAC genomic inserts, we hypothesised that the presence of cryptic _lox_P sites may affect the ability to grow and modify BAC and PAC clones in E. coli expressing Cre recombinase. We have observed a defect in bacterial growth when some BACs and PACs were transformed into EL350, a DH10B-derived bacterial strain that expresses Cre recombinase under the control of an arabinose-inducible promoter. In this study, we have demonstrated that Cre recombinase expression is leaky in un-induced EL350 cells and that some BAC/PAC sequences contain cryptic _lox_P sites, which are active and mediate the introduction of single-strand nicks in BAC/PAC genomic inserts.

INTRODUCTION

BAC and PAC clones have become the preferred tool for generating transgenic animals, since they accommodate large genomic DNA fragments, are well characterized, stable and easy to propagate and purify. In recent years, efficient and reliable methods have been developed to modify their sequence in E. coli. These techniques are generally termed recombineering (homologous recombination-mediated genetic engineering) and they are based on homologous recombination between a linear double-stranded DNA cassette or synthetic oligonucleotides and a circular DNA molecule (plasmid, BAC or PAC) (1,2). A variety of methods and strains have been described for homologous recombination, but the most widely used methods derive the homologous recombination machinery from the bacteriophage λ-encoded Red system (3). The DY380 strain of E. coli is derived from DH10B and encodes a defective λ-prophage, in which the recombination apparatus is expressed under the control of the temperature-sensitive λ repressor (cI857) (3,4). Two further strains have been derived from DY380, in which the tetracycline resistance gene (encoded on the bacterial genome) is substituted by either the Flp gene (strain EL250) or the Cre gene (strain EL350) (5). The Flp or Cre genes are expressed under the control of the arabinose-inducible _AraC_-PBAD promoter to allow further manipulation of introduced BAC/PACs.

Cre DNA recombinase catalyses the recombination between two 34-bp _lox_P elements. The outcome (excision or inversion) depends on the relative orientation of the two _lox_P elements (6). The _lox_P element contains a core spacer sequence of 8 bp flanked by two palindromic sequences, each of 13 bp, to which Cre binds (7). Up to 5 mismatches from the consensus in each of the two palindromic sequences can be tolerated by Cre without significantly reducing DNA binding (8,9). The core sequence is the cleavage site; its asymmetry defines the direction of the _lox_P site, and homology in this region between two _lox_P sites is required for efficient recombination (10). When recombination occurs between a mutant _lox_P site, which bears a deletion in the spacer region, and a wild-type _lox_P site, single- and double-strand breaks may be introduced in the DNA (11).

On a statistical basis, the 34-bp consensus _lox_P site is not expected to be present in mammalian genomes, but functional recombinase recognition sites that diverge significantly from the native _lox_P site have been identified in both the human and mouse genomes (12). These cryptic (or pseudo) _lox_P sites can support Cre-mediated recombination at high efficiency when cryptic _lox_P sites with the same spacer region are involved in the recombination (12). Studies have shown that Cre expression in mammalian cells causes recombination events between cryptic _lox_P sites, resulting in gross chromosomal rearrangements in spermatids (13) and, in cultured mammalian cells, growth inhibition accompanied by DNA damage (14). Furthermore, recombination can occur even when the spacer regions display a relatively high level of non-homology (15). Moreover, it is generally acknowledged that some large DNAs grow slowly in Cre-containing E. coli, and it has been reported that modified strains of E. coli are not ideal for receiving PACs and BACs and that some BAC clones cannot be transformed into such strains for reasons that are not fully understood (16).

Our hypothesis is that cryptic _lox_P sites, encoded in the mammalian genomic DNA insert, may act as substrates for Cre recombinase, expression of which may be leaky in some bacterial strains.

Here, we describe the use of a bioinformatics tool to identify such cryptic sites and experiments to test these predictions carried out on BAC/PACs.

MATERIAL AND METHODS

In silico identification of cryptic _lox_P sites

To automate the identification of cryptic _lox_P sites within a given DNA sequence, a bioinformatics workflow was created using Taverna (17). Taverna provides a workbench application that enables the construction and enactment of workflows within a graphical environment. A schematic representation of the workflow is illustrated in Figure 1. This workflow takes a DNA sequence and searches for matches to three different patterns using Fuzznuc, which is freely available as part of the European Molecular Biology Open Software Suite (EMBOSS) (18) and is accessed via a Soaplab web service (19).

Figure 1.

Schematic representation of the workflow created to automate the identification of cryptic _lox_P sites.

Pattern 1 is described by the sequence ATAACTTCGTATA (N)8 TATACGAAGTTAT (12) and it selects for high homology in the 13-bp palindromic regions of the wild-type _lox_P. This pattern is augmented by a mismatch parameter that allows up to 10 mismatches to be tolerated (8,9). Pattern 2 is described by the sequence ATNAC(N)2CNTATA (N)8 TATANG(N)2GTNAT. It selects for conservation of those base pairs in the _lox_P site believed to be contact points for the Cre enzyme (underlined bases in the sequence ATAACTTCGTATA ATGTATGC TATACGAAGTTAT). This pattern is augmented by a mismatch parameter that allows up to 5 mismatches to be tolerated. We hypothesized that the mismatches allowed in this search could be tolerated by the Cre recombinase. Pattern 3 is described by the sequence (N)9TATA (N)8 TATA(N)9. It ensures that the TATA motif surrounding the core 8-bp spacer region is present. This pattern is augmented by a mismatch parameter that ensures no mismatches are tolerated. The fulfilment of these three criteria defines a primary cryptic _lox_P site and provides a wider classification than previously proposed (12).
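The three criteria can be illustrated with a minimal Python sketch. It is not the Fuzznuc implementation: a plain sliding window stands in for the EMBOSS search, only the given strand is scanned, and the names `PATTERNS`, `mismatches` and `find_primary_sites` are hypothetical. The pattern strings and mismatch limits are the ones stated above, with N treated as matching any base.

```python
# Sketch only: a sliding-window stand-in for the three Fuzznuc searches.
# Pattern strings and mismatch limits come from the text; names are hypothetical.
LOXP = "ATAACTTCGTATA" + "ATGTATGC" + "TATACGAAGTTAT"  # wild-type loxP, 34 bp

PATTERNS = {
    # name: (34-bp pattern, maximum mismatches tolerated)
    "pattern1": ("ATAACTTCGTATA" + "N" * 8 + "TATACGAAGTTAT", 10),
    "pattern2": ("ATNACNNCNTATA" + "N" * 8 + "TATANGNNGTNAT", 5),
    "pattern3": ("N" * 9 + "TATA" + "N" * 8 + "TATA" + "N" * 9, 0),
}


def mismatches(window, pattern):
    """Count positions where the window differs from a non-N pattern base."""
    return sum(1 for base, pat in zip(window, pattern) if pat != "N" and base != pat)


def find_primary_sites(sequence):
    """Return (start, site) for every 34-bp window satisfying all three patterns."""
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - 33):
        window = sequence[start:start + 34]
        if all(mismatches(window, pat) <= limit for pat, limit in PATTERNS.values()):
            hits.append((start, window))
    return hits


if __name__ == "__main__":
    print(find_primary_sites("GGG" + LOXP + "CCC"))
```

Raising the limits in `PATTERNS` would correspond to the relaxed search for secondary sites; a fuller version would also scan the reverse strand.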

A web service called FuzznucComparator was developed that compares the output from two Fuzznuc processes and outputs only those sequences present in both. When the result of the comparison contains more than one sequence, the FuzznucComparator tool performs 2 pairwise alignments of the core 8-bp spacer regions. The first alignment is calculated using the sequences in their given orientation. The second complements one sequence prior to making the alignment. The output file format consists of the result of the pairwise comparison (if any) followed by those sequences present in both input files in fuzznuc's seqtable format.
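The two spacer comparisons can be sketched as follows. This is an assumption-laden illustration, not the FuzznucComparator code: it counts identical positions instead of running a true pairwise alignment, it takes the spacer to be bases 14 to 21 of a 34-bp site, and it applies a plain complement (no reversal) for the second comparison, as the description above states. The function names are hypothetical.

```python
# Sketch only: position-by-position identity of two 8-bp spacers, in the given
# orientation and after complementing one of them. Names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def spacer(site):
    """Return the 8-bp core spacer of a 34-bp loxP-like site."""
    return site[13:21]


def identity(a, b):
    """Number of positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b))


def compare_spacers(site_a, site_b):
    """Compare spacers as given, and with the second spacer complemented."""
    a, b = spacer(site_a), spacer(site_b)
    return {
        "given": identity(a, b),
        "complemented": identity(a, b.translate(COMPLEMENT)),
    }


if __name__ == "__main__":
    loxp = "ATAACTTCGTATA" + "ATGTATGC" + "TATACGAAGTTAT"
    variant = loxp[:13] + "ATGTATAA" + loxp[21:]
    print(compare_spacers(loxp, variant))
```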

To isolate those sequences that match all three patterns two comparisons are required. First, a FuzznucComparator process is used to isolate those sequences that match patterns 1 and 2. A fileDivider process splits the output content and outputs only the fuzznuc seqtable section. Second, a FuzznucComparator process compares the output from the fileDivider process with those sequences that match pattern 3. The final step in the workflow is to write those sequences that match all three patterns to file. The Scufl workflow can be downloaded from http://www.bioinf.mvm.ed.ac.uk/projects/workflows/loxp. This workflow can be opened and enacted within the Taverna workbench. A web page interface to this workflow is also freely available at: http://wilkie226.dmed.ed.ac.uk:8080/loxpFinder.
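The two-stage comparison described above amounts to intersecting hit lists. A minimal sketch, assuming each Fuzznuc hit is reduced to a hypothetical `(start, end)` coordinate pair; the fileDivider step is omitted because this sketch produces no alignment section to strip.

```python
# Sketch only: two chained comparisons isolate hits that match all three patterns.
# Hits are represented as hypothetical (start, end) tuples.
def fuzznuc_comparator(hits_a, hits_b):
    """Keep the hits present in both inputs, in the order of the first input."""
    shared = set(hits_a) & set(hits_b)
    return [hit for hit in hits_a if hit in shared]


def primary_sites(hits_p1, hits_p2, hits_p3):
    """Stage 1: patterns 1 and 2. Stage 2: the result against pattern 3."""
    stage1 = fuzznuc_comparator(hits_p1, hits_p2)
    return fuzznuc_comparator(stage1, hits_p3)


if __name__ == "__main__":
    p1 = [(10, 43), (100, 133), (500, 533)]
    p2 = [(100, 133), (500, 533), (900, 933)]
    p3 = [(500, 533), (10, 43)]
    print(primary_sites(p1, p2, p3))
```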

The default mismatch values enable discovery of primary cryptic _lox_P sites. The Taverna workbench and the web page interface enable users to edit the workflow and thus change the number of mismatches tolerated for each pattern. By relaxing the number of allowed mismatches the workflow can find secondary cryptic _lox_P sites. Additional information regarding how to use these resources is available in the online manual.

Plasmid, BAC/PAC vectors, E. coli strains and growth conditions

pROSA26/Tet is a pBluescript-modified plasmid containing a tetracycline resistance gene sub-cloned between two _lox_P consensus sequences (pROSA26, unpublished; personal gift from Yuri Kotelevtsev, University of Edinburgh; original source Igor Samokhvalov, RIKEN Center for Developmental Biology, Kobe, Japan). This plasmid contains no _AraC_-PBAD promoter elements.

PAC111L11 (20) (kindly provided by Craig A. Jones, Buffalo, New York) maps on human chromosome 1 and spans the Ren gene locus. The vector backbone (pCYPAC2) contains a single consensus _lox_P site and encodes kanamycin resistance. BACN10 was isolated by screening a mouse genomic library (129/Ola mouse strain) (Invitrogen Corporation, formerly Research Genetics) using a renin gene probe. It maps on mouse chromosome 1 and spans the Ren1c and Ren1d gene locus. The vector backbone (pBeloBAC11) contains a single _lox_P site and encodes chloramphenicol resistance. ASBAC (kindly provided by Keith Parker, Dallas, Texas) also comes from a pBeloBAC11 library, and maps to chromosome 15, spanning the cyp11b1 and cyp11b2 genes. [BAC end coordinates refer to RP23-23009 clone sequence T7: 205723 bp, SP6: 82689 bp]. The DY380, EL250 and EL350 strains of E. coli have been described (3), and were kindly provided by Neal Copeland (Mouse Cancer Genetics Program, National Cancer Institute-Frederick).

EL350 and EL250 differ from DY380 in that they encode the Cre and Flp recombinase genes respectively under the control of _AraC_-PBAD promoter and they are not tetracycline resistant (5).

One hundred nanograms of PAC or BAC DNA were transformed into EL350 and DY380 bacterial strains by electroporation (Easyjet Plus, Equibio; 1.75 kV, 200 ohms, 2 μF). Following selection on kanamycin/chloramphenicol plates, a 10⁻⁶ dilution of cells was plated on LB agar plates containing either 0.2% arabinose or 0.2% glucose and incubated overnight at 32°C.

In vivo assay for Cre activity in EL350

EL350 cells were transformed with 40 ng of pROSA26/Tet plasmid DNA by electroporation and plated on LB agar containing 50 μg/ml ampicillin or 25 μg/ml tetracycline or both antibiotics. After overnight growth at 32°C, colonies were counted and the ratio of tetracycline-resistant colonies to ampicillin-resistant colonies was calculated as a measure of Cre gene expression in EL350 in the un-induced state. This measure was then compared to the Cre activity after induction of the PBAD promoter with 0.1% arabinose, according to a protocol adapted from Lee et al. (2001) (5), which is available at http://recombineering.ncifcrf.gov/Protocol.asp. Two cell dilutions (10⁻⁴ and 10⁻⁶) were plated on Amp, Amp/Tet and Tet plates.

In vitro assay for BAC/PAC DNA nicking

One microgram of BAC/PAC DNA extracted from DY380 cells was incubated with 1 or 2 μl of Cre enzyme (1000 U/ml, New England Biolabs) overnight. After the incubation, each reaction was phenol-chloroform extracted and ethanol precipitated. An aliquot of DNA was then digested with _Eco_RV or _Hin_dIII restriction enzymes (Promega). DNA, re-suspended in alkaline gel loading buffer (50 mM NaOH, 1 mM EDTA, 2.5% Ficoll and 0.025% bromocresol green), was loaded on a denaturing gel as previously described (21). Following electrophoresis, the gel was soaked in 0.25 N HCl for 6–7 min, rinsed and neutralised in 0.5 N NaOH, 1.5 M NaCl for 30 min, blotted onto a positively charged membrane and hybridized according to standard procedures with specific radiolabelled probes, generated using the Ready-To-Go™ DNA Labelling beads (Amersham Pharmacia Biotech). Probes P1 (for the detection of cryptic _lox_P site 1), P2 and P3 (to detect cryptic _lox_P site 2) were generated by PCR from the PAC111L11 clone, using primers c-_lox_P1F 5′-CTCAGACACTTTGGTGGGTC-3′ and c-_lox_P1R 5′-GACTTTCAGTATGGCTGCCTAAC-3′ for probe 1 (P1); c-_lox_P2F 5′-CAGGAGTTAGAGACCAGC-3′ and c-_lox_P2R 5′-GCTATCTCGGCTCACTG-3′ for probe 2 (P2); and c-_lox_P3F 5′-GAAGGGCTGAGGTTAGGCAG-3′ and c-_lox_P3R 5′-GAACACCTACTGAGCTTGAG-3′ for probe 3 (P3).

RESULTS

Mouse genome-wide distribution of cryptic _lox_P sites

In order to assess the distribution and frequency of cryptic _lox_P sites in the genome, a new bioinformatics tool (FuzznucComparator) was developed to perform a mouse genome-wide search (see materials and methods). Two stringencies were applied. Primary cryptic _lox_P sites were identified as sequences conforming to three patterns, which together define a primary cryptic _lox_P site (homology in the 13-bp palindromic sequences of the _lox_P consensus sequence; conservation of base pairs in the _lox_P site believed to be contact points for the Cre enzyme and presence of the four bases (TATA) flanking the core sequence). Figure 1 shows the workflow for the search of cryptic _lox_P sites. FuzznucComparator also defined secondary cryptic _lox_P sites, using less stringent criteria where the mismatch allowance for the three patterns is increased arbitrarily by the operator. In our hypothesis, some of these secondary sites have the potential to bind Cre and mediate DNA damage, if they are located near primary cryptic _lox_P sites.

The NCBI m34 mouse assembly (http://www.ncbi.nlm.nih.gov/genome/seq/NCBIContigInfo.html) was split into one-megabase regions and submitted to the Fuzznuc and FuzznucComparator search for the three patterns defining a primary cryptic _lox_P site. The output of the search is represented in Figure 2. The overall frequency of primary cryptic _lox_P sites in the mouse genome is 1.2 per megabase. Some chromosomes show a more tightly clustered distribution of cryptic _lox_P sites than others (Figure 2, Chr 3, 13, 14, 15, 16, 18 and X). A few chromosomes (Figure 2, Chr 1, 2, 4, 7, 16) present with hot spots of cryptic _lox_P sites, generally 8 or 9 in a Mb DNA window, but up to 17 in the case of chromosome 1. Chromosome Y has no primary cryptic _lox_P sites.
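The per-megabase counts and the overall frequency can be derived from hit coordinates as in the sketch below. The positions and chromosome length in the usage example are invented for illustration and are not data from the study; the function names are hypothetical.

```python
# Sketch only: bin cryptic loxP hit positions into 1 Mb windows and compute
# the mean frequency per megabase. Example inputs are made up.
from collections import Counter

WINDOW = 1_000_000  # 1 Mb, the window size used for the genome-wide search


def sites_per_window(positions, chrom_length, window=WINDOW):
    """Return the number of hits in each consecutive window of a chromosome."""
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = Counter(pos // window for pos in positions)
    return [counts.get(i, 0) for i in range(n_windows)]


def frequency_per_mb(positions, chrom_length):
    """Mean number of hits per megabase over the whole chromosome."""
    return len(positions) / (chrom_length / 1_000_000)


if __name__ == "__main__":
    hits = [10, 999_999, 1_000_000, 2_500_000]  # hypothetical 0-based starts
    print(sites_per_window(hits, 3_200_000))
    print(frequency_per_mb(hits, 3_200_000))
```

Windows whose count stands well above the genome-wide mean would correspond to the hot spots described above.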

Figure 2.

Mouse genome-wide search for cryptic _lox_P sites. Each graph shows the distribution and the number of cryptic _lox_P sites in 1 Mb regions along the 21 mouse chromosomes. The length of each graph is proportional to the corresponding chromosomal length. A megabase scale is present at the bottom of each graph.

About 10% of the spacer regions in primary cryptic _lox_P sites are not unique, and occur more than once.

In silico identification of cryptic _lox_P in PAC/BAC sequences

Three BAC and PAC clones (BACN10, ASBAC and PAC111L11) were scanned for cryptic _lox_P sites (in addition to the consensus _lox_P site in the backbone of their respective vectors) to see whether their presence caused instability in Cre-expressing host cells. Both PAC111L11 and BACN10, which span the human and mouse renin locus respectively, show the presence of primary cryptic _lox_P sites that match all three patterns. The search returned two potential cryptic _lox_P sequences (c-_lox_P1 and c-_lox_P2) in PAC111L11 with 5 out of 8 bp matching in the spacer region (62.5% identity) (Table 1). A search for secondary cryptic _lox_P sites within 6 kb of each primary site revealed one hit (sc-_lox_P), located 4.3 kb upstream of c-_lox_P2 in PAC111L11. The sc-_lox_P has twelve mismatches to the consensus _lox_P in the 13-bp palindromic arms, 12 out of 18 conserved Cre contact points and 3 mismatches in the TATA sequences flanking the spacer region. When aligned with the complementary strand of the c-_lox_P2 site, 4 out of 8 bp match in the spacer region.

Table 1.

Number of primary cryptic _lox_P sites in three BAC/PAC molecules and their characteristics

| PAC/BAC name | Putative primary cryptic _lox_P sites (search patterns 1 and 2) | Pairs of primary c-_lox_P sites with 6/8 matches in the spacer region | Pairs of primary c-_lox_P sites with 5/8 matches in the spacer region | Pairs that have conservation in the 4 bases flanking the spacer region |
| --- | --- | --- | --- | --- |
| PAC111L11 (H) | 2 | 0 | 1 | 1 |
| ASBAC (M) | 5 | 0 | 1 | 0 |
| BACN10 (M) | 4 | 1 | 1 | 2 |

BACN10 is predicted to contain four primary cryptic _lox_P sites. Two of these are identical at 6 out of 8 bp in the spacer region and the other two match at 5; both pairs of sites show conservation in the four bases flanking the spacer region (Table 1). A search for secondary cryptic _lox_P sites in the regions surrounding the primary sites returned one hit with the same characteristics outlined for the sc-_lox_P site in PAC111L11.

ASBAC contains 5 sequences that match the first two criteria for primary cryptic _lox_P sites but, when considered in pairs, none show conservation in the four bases flanking the core region (Table 1). Since not all three criteria for cryptic _lox_P definition are satisfied, these sites are not predicted to be functional.

Differential growth of PAC/BAC-transformed EL350 E. coli strain compared to DY380 and EL250

To test the hypothesis that cryptic _lox_P sites mediate DNA damage in E. coli strains expressing Cre recombinase, PAC111L11 (depicted in Figure 3) was transformed into DY380 (expressing neither Cre nor Flp), EL250 (expressing Flp) and EL350 (expressing Cre) E. coli (3). Transformed cells were grown in the presence of arabinose (to induce the _AraC_-PBAD promoter) or glucose (to ensure catabolite repression of the promoter) (22–24). Whilst the growth of PAC111L11-transformed DY380 and EL250 cells was similar on plates containing glucose or arabinose (Figure 4B and C), the growth of PAC111L11-transformed EL350 cells differed on arabinose, with a similar number of colonies of much smaller size (Figure 4A). A statistically significant reduction in colony diameter was observed on arabinose plates when comparing PAC111L11-transformed EL350 cells (colony diameter 0.51 ± 0.02 mm) with PAC111L11-transformed DY380 cells (colony diameter 1.69 ± 0.02 mm on arabinose, p < 0.001) or EL250 cells (colony diameter 1.44 ± 0.02 mm on arabinose, p < 0.01). In addition, the EL350-PAC111L11 colonies formed in the presence of glucose were appreciably smaller than those formed by DY380 or EL250 transformants. The effect of glucose on PAC111L11-transformed EL350 colony size is concentration-dependent, with 1% glucose resulting in the largest colonies (Figure 5). Un-transformed DY380 and EL350 cells grow with the same efficiency on agar containing 0.2% arabinose, despite the presence of one primary cryptic _lox_P site in the E. coli bacterial genome (Figure 5, inset).

Figure 3.

(A) Positions of the cryptic _lox_P sites on the PAC111L11 insert. (B) Sequence comparison between the _lox_P consensus site and PAC111L11 cryptic _lox_P sites (primary c-_lox_P and secondary sc-_lox_P). Underlined nucleotides — Cre contact points on _lox_P consensus sequence; bold — conserved nucleotides.

Figure 4.

Growth of PAC111L11-transformed EL350 (A), DY380 (B) and EL250 (C) cells on LB agar supplemented with either 0.2% glucose or 0.2% arabinose. Arrows indicate very small and barely detectable colonies formed on arabinose-containing agar by EL350/PAC111L11.

Figure 5.

Growth of PAC111L11-transformed EL350 on LB agar supplemented with increasing concentrations of glucose. Inset: Un-transformed EL350 and DY380 on agar with 0.2% arabinose.

EL350 cells transformed with BACN10 also gave rise to very small colonies that failed to grow or had delayed growth on arabinose (data not shown), whereas EL350 cells transformed with ASBAC showed no difference in growth rate.

In vivo assay for Cre activity in EL350

Low levels of expression from the PBAD promoter can occur in the absence of the inducer arabinose if the expression at maximum induction is very high. Leaky Cre expression from the _AraC_-PBAD promoter has been described for multi-copy plasmids (25). However, in EL350 cells the _AraC_-PBAD-Cre gene is present in a single copy. To test if Cre was produced from the un-induced _AraC_-PBAD promoter in EL350, a functional test was performed, using a plasmid vector (pROSA26/Tet) encoding a tetracycline resistance gene between two _lox_P consensus sequences (Figure 6A). The plasmid pROSA26/Tet also confers ampicillin resistance to the host. If recombination occurs between the _lox_P sequences, the Tet resistance gene is excised and the E. coli strain can only grow under Amp selection. This was the case when Cre expression in EL350 cells was induced with arabinose. Two cell dilutions (10⁻⁴ and 10⁻⁶) were plated on LB agar plates containing either 50 μg/ml ampicillin or 25 μg/ml tetracycline or both. Although many colonies were obtained on ampicillin, no colonies were observed on Amp/Tet or Tet plates (Figure 6B), even after 24 h.

Following transformation with pROSA26/Tet, Tet-resistant colonies were obtained in the absence of arabinose (when the _AraC_-PBAD promoter is not induced in EL350 cells); however, the number of colonies was reduced approximately twenty-fold (Figure 6B). These data show that about 95% of the colonies that are able to grow on an Amp plate have lost the tetracycline resistance gene and are no longer able to grow under tetracycline selection (p < 0.01). To verify whether all the plasmid molecules inside a single E. coli colony had undergone site-specific recombination or whether a few retained the tetracycline resistance gene, a PCR assay was performed on colonies using primers T3 and T7 (Figure 6C). A mixed plasmid population (excised and non-excised) is recovered from un-induced EL350. This is shown by the presence of two DNA fragments, one of 1.5 Kb (Tet-retaining plasmids) and one of 0.2 Kb (Tet-excised plasmids). After induction of EL350 with arabinose, the tetracycline resistance gene was found to be 100% excised. Plasmids propagated in DY380 do not show any sign of site-specific recombination and maintain the tetracycline resistance gene (Figure 6C, right panel).
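The step from an approximately twenty-fold reduction to the figure of about 95% can be written out as a short worked equation. The symbols are introduced here for illustration only: N_Amp is the number of colonies on ampicillin, N_Tet the number on tetracycline-containing plates, and f_excised the fraction of transformants that have lost the tetracycline resistance gene, following the colony ratio defined in the Methods.

```latex
\[
f_{\text{excised}}
  = 1 - \frac{N_{\text{Tet}}}{N_{\text{Amp}}}
  \approx 1 - \frac{1}{20}
  = 0.95
\]
```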

Figure 6.

(A) pROSA26/Tet plasmid map. (B) Summary of the transformation of pROSA26/Tet plasmid into EL350. (C) PCR analysis of pROSA26/Tet transformed EL350 and DY380 under arabinose-induced and un-induced conditions using T3 and T7 primers. M1: 1 Kb Marker (New England Biolabs). M2: pBluescriptII SK+ plasmid digested with Sau3AI. The 1.5 Kb band indicates the presence of the Tet gene; the 0.2 Kb band is generated after Cre/_lox_P-mediated excision of the Tet gene.

Analysis of nicks in PAC111L11 DNA after incubation with Cre

The ability of Cre recombinase to introduce single- or double-strand nicks at cryptic _lox_P sites was tested by incubation with PAC111L11 DNA. DNA fragments produced by the action of Cre were detected by Southern blot following either _Hin_dIII or _Eco_RV restriction digest for the analysis of c-_lox_P1 nicks or c-_lox_P2 nicks, respectively, using a c-_lox_P1-specific probe (P1) or c-_lox_P2-specific probes (P2 and P3; Figure 7).

Figure 7.

In vitro analysis of the presence of nicks in PAC111L11 insert. (A) Schematic representation of expected fragment size for DNA nicked at cryptic _lox_P1 site. The arrow indicates the location of the c-_lox_P1 site on the _Hin_dIII fragment. P1: probe 1. (B) Schematic representation of predicted fragment sizes for DNA nicked at the c-_lox_P2 and secondary sc-_lox_P sites on the _Eco_RV fragment. Arrows indicate the location of the c-_lox_P2 and sc-_lox_P sites. P2: probe 2; P3: probe 3. (C) Southern analysis of PAC111L11 DNA, following overnight incubation with Cre recombinase, digested with _Hin_dIII and hybridized to probe P1 to detect nicks produced at the c-_lox_P1 site. M: 1 kb DNA ladder (NEB). Plus and minus signs on each lane refer to the presence or absence of Cre recombinase and its relative abundance. (D) and (E) Southern analysis of PAC111L11 DNA, following overnight incubation with Cre recombinase, digested with _Eco_RV and hybridized with probe P2 (D) and P3 (E) to detect nicks produced by cryptic _lox_P site 2 and/or sc-_lox_P. M: 1 kb DNA ladder (NEB). Plus and minus signs on each lane refer to the presence or absence of Cre recombinase and its relative concentration.

PAC111L11 DNA showed no evidence of nicking or cleavage at c-_lox_P1 following overnight incubation with Cre recombinase (Figure 7C). Only one band of 3.5 kb, corresponding to the intact DNA, was detected by probe P1. In contrast, hybridization of probes P2 and P3 to _Eco_RV-digested PAC111L11 DNA detected fragments consistent with cleavage at both the c-_lox_P2 and sc-_lox_P sites (Figure 7D and E). In the absence of Cre, only the expected 6.6 kb fragment produced by _Eco_RV digestion of intact PAC111L11 DNA was detected. At the higher concentration of Cre, additional fragments of 4.3 kb (with probe P2, Figure 7D) and 0.9 kb (with probe P3, Figure 7E) were detected.

These results suggest that the primary (c-_lox_P2) and the secondary (sc-_lox_P) c-_lox_P sites that map in the PAC111L11 insert are able to bind Cre recombinase enzyme and start a recombination event, which is non-productive and results in damage to the PAC DNA molecule.

The absence of the nicked 2.9 kb fragment at the c-_lox_P1 site and the 5.7 kb fragment at the c-_lox_P2 site suggests that these cryptic _lox_P sites are not being paired and defectively recombined with the genuine consensus _lox_P site located in the PAC vector backbone.

DISCUSSION

Although little evidence has been reported, certain BAC/PAC inserts are difficult to propagate in Cre-containing E. coli strains. This could be due to low levels of Cre promoting single-strand DNA breaks or recombination through cryptic _lox_P sites present in the inserts. Computational analysis of the whole mouse genome has revealed an average of 1.2 primary cryptic _lox_P sites per Mb of DNA, with few hotspots present (chromosome Y has none). The new bioinformatic tool, FuzznucComparator, has been made available through a Distributed Annotation System (DAS) resource to enable dynamic access to the data using HTTP requests (i.e. a URL), with the response being returned as XML. This resource is implemented using a Dazzle [DAZZLE] server backed by an LDAS database and can be accessed at: http://wilkie226.dmed.ed.ac.uk:8080/das. Access via the DAS protocol also allows these annotations to be viewed as a track within Ensembl (Ensembl help, http://www.ensembl.org/Homo_sapiens/helpview?se?1;kw?dasconfview, documents how to achieve this).
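As a rough illustration of this kind of access, the sketch below builds a DAS `features` request for a chromosomal segment and parses a DAS-style XML response. It is a minimal sketch, not part of the published resource: the data source name `cryptic_loxp`, the coordinates and the sample XML are hypothetical, the element names follow the general layout of a DAS 1.5 `features` response, and no network request is made because the server may no longer be reachable.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Base URL quoted in the text; the data source name used below is hypothetical.
BASE = "http://wilkie226.dmed.ed.ac.uk:8080/das"


def features_url(source, chrom, start, stop, base=BASE):
    """Build a DAS 'features' request URL for one segment."""
    return f"{base}/{quote(source)}/features?segment={chrom}:{start},{stop}"


def parse_features(xml_text):
    """Extract segment id, feature id, coordinates and strand from DAS-style XML."""
    root = ET.fromstring(xml_text)
    features = []
    for segment in root.iter("SEGMENT"):
        for feature in segment.iter("FEATURE"):
            features.append({
                "segment": segment.get("id"),
                "id": feature.get("id"),
                "start": int(feature.findtext("START").strip()),
                "end": int(feature.findtext("END").strip()),
                "orientation": (feature.findtext("ORIENTATION") or "0").strip(),
            })
    return features


# Hypothetical response, written by hand to mimic the DAS features layout.
SAMPLE_XML = """<DASGFF>
  <GFF version="1.0" href="example">
    <SEGMENT id="1" start="1000000" stop="2000000">
      <FEATURE id="site_0001" label="cryptic loxP">
        <TYPE id="cryptic_loxP">cryptic loxP</TYPE>
        <START>1234567</START>
        <END>1234600</END>
        <ORIENTATION>+</ORIENTATION>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>"""

if __name__ == "__main__":
    print(features_url("cryptic_loxp", "1", 1000000, 2000000))
    for feature in parse_features(SAMPLE_XML):
        print(feature)
```

In practice the XML text would come from fetching the URL; it is supplied inline here so that the parsing step can be read on its own.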

We have tested the power of the programme to predict functional cryptic _lox_P sites in the genomic inserts of BAC and PAC clones, which could potentially affect their growth in Cre recombinase-expressing E. coli strains. We have identified primary and secondary cryptic _lox_P sites in the genomic inserts of three BAC/PAC clones and shown that these are substrates for Cre recombinase in vitro. Furthermore, following transformation into EL350 cells, these clones show a dramatically reduced growth rate in the presence of arabinose, compared to DY380 and EL250. The poor growth is ameliorated by glucose, suggesting a Cre-mediated effect on bacterial growth. We suggest that, because Cre recombinase expression is leaky, EL350 has a growth defect when transformed with PAC111L11.

Despite the higher conservation level of c-_lox_P1 and c-_lox_P2, our data show that c-_lox_P2 and sc-_lox_P are the sites involved in the single-strand nicking of PAC111L11 incubated in vitro with Cre recombinase. c-_lox_P2 and sc-_lox_P are located relatively close to each other, whereas c-_lox_P1 lies 70 kb away from c-_lox_P2. The interaction of Cre recombinase with cryptic _lox_P sites is likely to be weaker than with genuine _lox_P sites. However, the proximity of two cryptic _lox_P sites may serve to increase the local concentration of Cre recombinase, increasing the likelihood of DNA nicking at one or both sites.

All three BAC/PAC molecules we have analysed contain a _lox_P consensus sequence in their respective vector backbones. However, we consider it unlikely that these sites mediate nicking, for the following reasons: (1) they occur in most if not all BAC/PAC vectors, and clearly DNA nicking and poor growth are not a problem with all BAC/PACs; (2) ASBAC also encodes putative cryptic _lox_P sites, yet their presence did not result in slower growth in EL350 cells than in EL250 or DY380 cells. We cannot exclude that in other molecules DNA damage might occur if the conditions (homology in the spacer region and location of the c-_lox_P site close to the end of the insert, and thus proximal to the genuine _lox_P site) are favourable. These data provide a functional definition of active cryptic _lox_P sites; the more relaxed mismatch allowance, in comparison with the criteria previously set (12), shows that a high degree of deviation from the _lox_P consensus sequence is still tolerated by Cre recombinase.
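The notion of a mismatch allowance can be illustrated with a simple scan that reports every 34 bp window differing from the _lox_P consensus at no more than a chosen number of positions, on either strand. This is only a sketch and not the FuzznucComparator implementation: the function names and the `max_mismatches` threshold are assumptions made for illustration, and the scan weights arm and spacer positions equally, which the criteria used in this study may not do.

```python
# Canonical 34 bp loxP site: 13 bp arm + 8 bp spacer + 13 bp inverted arm.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_loxp_like(seq, max_mismatches=10):
    """Return (position, strand, mismatches) for each loxP-like window.

    position is the 0-based start of the window on the given sequence;
    max_mismatches is an illustrative threshold, not a value from the paper.
    """
    seq = seq.upper()
    n = len(LOXP)
    hits = []
    for strand, target in (("+", LOXP), ("-", revcomp(LOXP))):
        for i in range(len(seq) - n + 1):
            window = seq[i:i + n]
            # Hamming distance between the window and the consensus.
            mismatches = sum(a != b for a, b in zip(window, target))
            if mismatches <= max_mismatches:
                hits.append((i, strand, mismatches))
    return hits


if __name__ == "__main__":
    demo = "GGGGG" + LOXP + "CCCCC"
    for hit in find_loxp_like(demo, max_mismatches=2):
        print(hit)
```

Because the two arms are inverted repeats, the reverse complement of the site differs from the forward sequence only in the spacer, so a permissive threshold can report the same location on both strands.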

The demonstration that Cre recombinase expression is leaky in this strain supports the hypothesis that the damaged PAC DNA fails to replicate, depriving the E. coli daughter cells of the selectable marker. These findings are in accordance with data in the literature that show that mammalian genomes contain active recombinase recognition sites (12) and that growth inhibition and DNA damage can be induced by the expression of Cre recombinase in mammalian cells, E. coli and yeast (14,26–28). The leakiness of Cre recombinase in EL350 cells is not surprising. Studies have demonstrated that PBAD promoters are very efficient, but the levels of repression are not always zero, because they are relative to the levels of expression at maximum induction (25).

During cleavage, Cre becomes covalently attached to the DNA through a 3′-phosphate. This type of covalent protein-DNA linkage is very similar to that observed with DNA topoisomerases (11,29–31). In the presence of cryptic _lox_P sites or mutant _lox_P sites (11), Cre protein attempts to carry out recombination, but the reaction is abortive. If the reaction progresses to the stage where nicks are introduced into the DNA, then a damaged DNA molecule is produced, which is covalently linked to a protein making it very difficult to repair by the cell repair machinery (32).

Many lines of transgenic mice have been generated that express Cre recombinase, but there are few reports of adverse effects of this protein in vivo. The potential for DNA damage is demonstrated by the infertility of transgenic mice expressing Cre in spermatids, due to illegitimate Cre-dependent chromosome rearrangements (13). It is possible that somatic mutations remain undetected, due to the high tolerance of mammals for somatic cell death and, in light of this study, a more detailed analysis of the phenotype of transgenic animals bearing the Cre recombinase gene may be informative.

Finally, these results suggest that the presence of cryptic _lox_P sites in BAC/PAC inserts can affect the efficiency of recombineering techniques if the host cells express Cre recombinase, even at leaky levels. This problem is BAC/PAC dependent and does not diminish the usefulness of recombineering techniques. Nevertheless, these data describe a mechanism that explains why recombineering experiments sometimes do not give the expected results, and provide a bioinformatic tool that can guide the planning process. The use of FuzznucComparator to identify cryptic _lox_P sites in BAC/PAC inserts may be helpful in determining which BACs are likely to be most readily manipulated with the Cre/_lox_P system.
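To give a feel for why the problem is BAC/PAC dependent, the genome-wide average of 1.2 primary cryptic _lox_P sites per Mb can be turned into a rough per-clone estimate. The sketch below assumes, purely for illustration, that sites are scattered uniformly and independently (a Poisson model, which ignores the few hotspots noted above) and uses a hypothetical insert size of 150 kb; neither assumption comes from the analysis reported here.

```python
import math

# Genome-wide average reported in the text (primary cryptic loxP sites per Mb).
SITES_PER_MB = 1.2


def expected_sites(insert_kb, density_per_mb=SITES_PER_MB):
    """Expected number of primary cryptic loxP sites in an insert of the given size."""
    return density_per_mb * insert_kb / 1000.0


def prob_at_least_one(insert_kb, density_per_mb=SITES_PER_MB):
    """Probability of one or more sites under the assumed Poisson model."""
    return 1.0 - math.exp(-expected_sites(insert_kb, density_per_mb))


if __name__ == "__main__":
    size_kb = 150  # hypothetical insert size, for illustration only
    print(f"expected sites in {size_kb} kb: {expected_sites(size_kb):.2f}")
    print(f"P(at least one site): {prob_at_least_one(size_kb):.2f}")
```

Under these assumptions a 150 kb insert carries about 0.18 primary sites on average, so most clones of that size would carry none, which is in line with the problem affecting only some clones.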

ACKNOWLEDGEMENTS

We are grateful to David R.F. Leach (Institute of Cell and Molecular Biology, University of Edinburgh), Janice Paterson and David Brownstein for useful discussions during the course of this work and we thank Matt Sharp for critically reading the manuscript.

We acknowledge funding from the Wellcome Trust, including the Functional Genome Initiative and the Cardiovascular Research Initiative.

J.J.M. is a Wellcome Trust PRF and S.S. is a Wellcome Trust Intermediate Research Fellow.

REFERENCES

1. Recombineering: a powerful new tool for mouse functional genomics, Nat. Rev. Genet., 2001, vol. 2 (pg. 769-779)
2. Genetic engineering using homologous recombination, Annu. Rev. Genet., 2002, vol. 36 (pg. 361-388)
3. An efficient recombination system for chromosome engineering in Escherichia coli, Proc. Natl. Acad. Sci. U.S.A., 2000, vol. 97 (pg. 5978-5983)
4. Use of bacteriophage lambda recombination functions to promote gene replacement in Escherichia coli, J. Bacteriol., 1998, vol. 180 (pg. 2063-2071)
5. A highly efficient Escherichia coli-based chromosome engineering system adapted for recombinogenic targeting and subcloning of BAC DNA, Genomics, 2001, vol. 73 (pg. 56-65)
6. Site-specific DNA recombination in mammalian cells by the Cre recombinase of bacteriophage P1, Proc. Natl. Acad. Sci. U.S.A., 1988, vol. 85 (pg. 5166-5170)
7. P1 site-specific recombination: nucleotide sequence of the recombining sites, Proc. Natl. Acad. Sci. U.S.A., 1982, vol. 79 (pg. 3398-3402)
8. Phage P1 Cre-_lox_P site-specific recombination. Effects of DNA supercoiling on catenation and knotting of recombinant products, J. Mol. Biol., 1985, vol. 184 (pg. 211-220)
9. Properties of a mutant Cre protein that alters the topological linkage of recombination products, J. Mol. Biol., 1988, vol. 202 (pg. 59-66)
10. The role of the _lox_P spacer region in P1 site-specific recombination, Nucleic Acids Res., 1986, vol. 14 (pg. 2287-2300)
11. Bacteriophage P1 Cre-_lox_P site-specific recombination. Site-specific DNA topoisomerase activity of the Cre recombination protein, J. Biol. Chem., 1986, vol. 261 (pg. 391-396)
12. Mammalian genomes contain active recombinase recognition sites, Gene, 2000, vol. 244 (pg. 47-54)
13. Illegitimate Cre-dependent chromosome rearrangements in transgenic mouse spermatids, Proc. Natl. Acad. Sci. U.S.A., 2000, vol. 97 (pg. 13702-13707)
14. Growth inhibition and DNA damage induced by Cre recombinase in mammalian cells, Proc. Natl. Acad. Sci. U.S.A., 2001, vol. 98 (pg. 9209-9214)
15. A high-throughput screen identifying sequence and promiscuity characteristics of the _lox_P spacer region in Cre-mediated recombination, BMC Genomics, 2006, vol. 7 (pg. 73)
16. Mini-lambda: a tractable system for chromosome and BAC engineering, Gene, 2003, vol. 315 (pg. 63-69)
17. et al. Taverna: a tool for the composition and enactment of bioinformatics workflows, Bioinformatics, 2004, vol. 20 (pg. 3045-3054)
18. EMBOSS: the European Molecular Biology Open Software Suite, Trends Genet., 2000, vol. 16 (pg. 276-277)
19. Proceedings, UK e-Science, All Hands Meeting 2003, 2003 (pg. 509-513)
20. Highly regulated cell type-restricted expression of human renin in mice containing 140- or 160-kilobase pair P1 phage artificial chromosome transgenes, J. Biol. Chem., 1999, vol. 274 (pg. 35785-35793)
21. Molecular Cloning: A Laboratory Manual, 1989, New York, Cold Spring Harbor
22. The Operon, 1980, New York, Cold Spring Harbor (pg. 389-410)
23. Arabinose-induced binding of AraC protein to araI2 activates the araBAD operon promoter, Proc. Natl. Acad. Sci. U.S.A., 1987, vol. 84 (pg. 8814-8818)
24. Regulation of the araC gene of Escherichia coli: catabolite repression, autoregulation, and effect on araBAD expression, Proc. Natl. Acad. Sci. U.S.A., 1984, vol. 81 (pg. 4120-4124)
25. Tight regulation, modulation, and high-level expression by vectors containing the arabinose PBAD promoter, J. Bacteriol., 1995, vol. 177 (pg. 4121-4130)
26. Bacteriophage P1 site-specific recombination. II. Recombination between _lox_P and the bacterial chromosome, J. Mol. Biol., 1981, vol. 150 (pg. 487-507)
27. Identification of cryptic lox sites in the yeast genome by selection for Cre-mediated chromosome translocations that confer multiple drug resistance, J. Mol. Biol., 1992, vol. 223 (pg. 911-928)
28. Multiplex Cre/lox recombination permits selective site-specific DNA targeting to both a natural and an engineered site in the yeast genome, Nucleic Acids Res., 1996, vol. 24 (pg. 4608-4613)
29. E. coli and M. luteus DNA topoisomerase I can catalyze catenation of decatenation of double-stranded DNA rings, Cell, 1980, vol. 22 (pg. 269-276)
30. Breakage of single-stranded DNA by rat liver nicking-closing enzyme with the formation of a DNA-enzyme complex, Nucleic Acids Res., 1980, vol. 8 (pg. 6129-6142)
31. DNA is linked to the rat liver DNA nicking-closing enzyme by a phosphodiester bond to tyrosine, J. Biol. Chem., 1981, vol. 256 (pg. 4805-4809)
32. Repair of DNA covalently linked to protein, Mol. Cell, 2004, vol. 13 (pg. 307-316)


© 2007 The Author(s).

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
