Molecular cloning methods
All oligonucleotides were purchased from Sigma-Aldrich and are listed in Supplementary Table 4. Except for the ZF-SLiDE (described below), all PCR for cloning was performed with high-fidelity Herculase II Fusion DNA polymerase (Agilent, 600675); the cycling programs are listed in Supplementary Table 5. All restriction enzymes were purchased from New England Biolabs (NEB). An ISOLATE II PCR and Gel Kit (Bioline, Meridian Bioscience, BIO-52060) was used for purification of the PCR products and DNA fragments isolated from agarose gels. T4 DNA Ligase (Thermo Fisher Scientific, EL0011) was used for ligation reactions, and 2 µl of the ligation reaction was directly transformed into electrocompetent XL1-blue E. coli bacteria. In the case of libraries, the ligation reaction was membrane purified (MF-Millipore, GSWP01300), and 4 µl of the ligation was used for transformation. The transformed bacteria were grown overnight in LB medium with addition of antibiotic (30 µg ml−1 chloramphenicol for pEVO, 30 µg ml−1 chloramphenicol and 15 µg ml−1 kanamycin for pEVO containing Entranceposon during the pentapeptide scanning mutagenesis and 100 µg ml−1 ampicillin for pIRES, pEF1a and pCAGGs plasmids) and with addition of ʟ-arabinose, when induction of recombinase or ZF–recombinase from pEVO was of interest. Plasmids were purified from the overnight cultures using a GeneJET Plasmid Miniprep Kit (Thermo Fisher Scientific, K0503). Sequences were verified by Sanger sequencing (Microsynth).
The plasmid vectors used for the tests in bacteria were based on pEVO (described previously in refs. 2,5,7). The target sites were cloned as described previously7. In brief, primers containing the target sites of interest, a BglII restriction site and an overlap with the pEVO plasmid (primers 1–106) were designed. These primers were used to produce a PCR product from a pEVO plasmid, which was subsequently cloned into a BglII-digested pEVO vector using a ColdFusion Cloning Kit (System Biosciences). For the target site libraries, each target site was cloned individually, and the plasmids were mixed together in equal ratios. In pEVO, the recombinase or ZF–recombinase complex was cloned between BsrGI and XbaI or SacI and SbfI restriction sites. Dimer recombinases were cloned between SacI and XhoI (left monomer) and BsrGI and XbaI (right monomer). A Shine–Dalgarno sequence is located in front of each recombinase gene, which, in the case of the dimer, allowed bicistronic expression of both recombinases. In pEVO, expression of a recombinase or ZF–recombinase fusion complex is driven by the arabinose-inducible araBAD promoter. Different ʟ-arabinose (Sigma-Aldrich, A3256) concentrations (from 1 µg ml−1 to 200 µg ml−1) were used to adjust the expression levels of the proteins.
Zif268 and ZFCCR5L genes were assembled by using a polymerase cycling assembly method (primers 107–120 and 121–133 were used), or, for some of the fusions, the sequence of Zif268 was produced by Twist Bioscience. The designed ZFDs were produced by Twist Bioscience.
For the N-terminal and C-terminal fusions, the linker libraries were created by cloning the sequences of different number (2, 4, 6, 8, 10 and 12) of GGS repeats between XhoI and BsrGI restriction sites in the pEVO plasmid. For this, we designed oligonucleotides containing the linker sequence and the sticky ends of the respective restriction sites (primers 144–155). These oligonucleotides were annealed at 95 °C for 10 min, followed by incubation at 23 °C for 20 min, to obtain the double-stranded DNA fragments and were directly used for ligation with the digested pEVO vector. The obtained plasmids with the linkers were mixed together in equal ratios to create a linker library. This library was used for a two-step cloning. In the case of the N-terminal fusion, first, Brec1 was cloned between the BsrGI and XbaI restriction sites, followed by the cloning of Zif268 sequence without stop codon between the SacI and XhoI restriction sites. For the C-terminal fusion, first, Zif268 was cloned between the BsrGI and XbaI restriction sites, followed by the cloning of Brec1 sequence without stop codon between the SacI and XhoI restriction sites. For the insertional fusion library, first, the XhoI and BsrGI restriction sites were introduced between the residues 278 and 279 of Brec1 by overlap extension PCR (primers 162 + 164 and 163 + 165). Then, the linker library was created by doing a PCR with a mix of primers containing different numbers (from 1 to 8) of GGS repeats, overlap with Zif268 and XhoI (for the forward primers) or BsrGI (for the reverse primers) restriction sites (primers 166–181). The digested PCR product of the Zif268 flanked by the linker libraries was subsequently cloned into pEVO_Brec1 vector between residues 278 and 279 via XhoI and BsrGI restriction sites. During the cloning of libraries, a coverage of at least 100,000 clones was reached.
For cloning of single ZF–recombinase fusion complexes, either the same cloning strategy as described for the libraries was employed or the modified pEVO vectors were used. For construction of the pEVO-N-Zif268-(GGS)12 vector, the DNA fragment flanked by BsrGI and XbaI restriction sites was produced by Twist Bioscience, in which the Zif268 is fused with the (GGS)12 linker, and this sequence is followed by BsiWI and XbaI restriction sites that can be used for the in-frame cloning of the recombinase. The Zif268 sequence is flanked by BsrGI and BbvCI restriction sites, which allows exchange of the ZFD in the fusion construct. For construction of the pEVO-(GGS)12-Zif268-C vector, in a similar manner, the DNA fragment flanked by BsrGI and XbaI restriction sites was produced by Twist Bioscience. In this case, BsrGI and SpeI (which has sticky ends compatible with XbaI) restriction sites, which allow cloning of the recombinase, were followed by the (GGS)12 linker sequence and the Zif268 gene, flanked by BbvCI and XbaI restriction sites. The produced fragments were cloned into the pEVO vector via BsrGI and XbaI restriction sites for construction of these plasmids. In the pEVO-N-Zif268-(GGS)12 vector, the recombinases were cloned between BsiWI (which has sticky ends compatible with BsrGI) and XbaI restriction sites. For cloning into the pEVO-(GGS)12-Zif268-C vector, the recombinases were first amplified with a reverse primer that removes the stop codon from their coding sequence and cloned via BsrGI and SpeI restriction sites. For the insertional fusion, by using overlap extension PCR, BbvCI and PspOMI restriction sites were introduced between residues 278 and 279 of the recombinase cloned into pEVO between BsrGI and XbaI restriction sites. The primers with the overhangs containing the BbvCI and PspOMI restriction sites and the (GGS)8 linker sequences were used for amplifying the Zif268 or designed ZFDs, and the digested PCR product was cloned into the recombinase sequence.
The construct RecFlex278–Zif268 was produced by Twist Bioscience.
For transient expression of Brec1 or Brec1–Zif268 in HEK293T cells, these genes were cloned via BsrGI and XbaI restriction sites into the previously described pIRES-NLS-EGFP vector7 (Fig. 2g). For the transient expression of D7 or D7-ZF heterodimer in HEK293T cells, the monomers were cloned via BsrGI and XbaI restriction sites into a mammalian expression vector (pEF1a-mTagBFP-P2A-NLS-RecL or pEF1a-EGFP-P2A-NLS-RecR). In this vector, the recombinase or ZF–recombinase complex was translationally linked with mTagBFP or EGFP using a P2A self-cleaving peptide sequence, and the expression of this construct was driven by EF1a promoter.
The pCAGGs-lox-pA-lox-mCherry reporter plasmid was generated as previously described7,8,9 (Fig. 2g). In brief, the loxBTR, loxBTR-5-zif (A) and loxBTR-5-zif (B) target sites were introduced by PCR with overhang primers (primers 187–192) and were cloned into the pCAGGs vector via SalI and EcoRI restriction sites.
The pLentiR-loxF8-zif (SFFV-loxF8-zif-PURO) reporter lentivirus plasmid was generated from the pLentiCRISPR v.2 lentiviral backbone, a gift from F. Zhang (Harvard University) (Addgene plasmid 52961 (ref. 51); RRID: Addgene_52961)52, as described previously8,9. The loxF8-5-zif (B) target sites were introduced by PCR with overhang primers (primers 193 + 194) and were cloned via EcoRI and AgeI.
Pentapeptide scanning mutagenesis and selection of the active mutated recombinase variants
Pentapeptide scanning mutagenesis was done using a Mutation Generation System Kit (Thermo Fisher Scientific), according to the manufacturer’s instructions. In brief, the M1-KanR Entranceposon was inserted into the pEVO containing the recombinase. To select for the variants where the transposon was inserted within the recombinase sequence, plasmid DNA of the obtained libraries was digested with BsrGI and XbaI restriction enzymes, and a DNA fragment that indicated successful integration of the transposon into the recombinase sequence (around 2 kb) was extracted and subcloned into a fresh pEVO vector. Next, the Entranceposon was removed from the library by NotI restriction digestion, and the mutated library containing five-amino-acid in-frame insertions throughout the recombinase sequence was cloned into the pEVO vectors containing recombinase-respective lox sites and induced by arabinose supplement to the medium. At this step, to confirm randomness of the mutations, single clones were sequenced and analyzed for recombination activity by PCR (described in the subsection ‘PCR for assessing recombination activity’). To select only the mutated recombinase variants that retained recombination activity, 500 ng of the induced library plasmid DNA was digested with NdeI and AvrII, which are located between the two lox sites on the pEVO plasmid. Thereby, the variants that did not excise the DNA sequence between the two lox sites were digested and removed from the pool, whereas the plasmids carrying the active variants remained intact. The digested library was then membrane purified, transformed and grown overnight with arabinose supplement. 
The next day, 500 ng of the active library DNA was again digested with NdeI and AvrII, and 25 ng of the digested DNA was used as a template for high-fidelity PCR (primers 156 + 157) to amplify the active mutated recombinases, which were digested with BsrGI and XbaI and subcloned to a fresh pEVO vector containing the respective lox sites and induced with arabinose. The selection cycle was repeated twice. At the final step, the plasmid DNA of active mutated recombinase libraries was extracted and prepared for long-read PacBio sequencing.
Long-read PacBio sequencing of the active libraries of the mutated Cre, Brec1, D7L and D7R after the pentapeptide scanning mutagenesis was performed as previously described by Schmitt et al.53.
Nanopore sequencing of the active libraries of the mutated Vika, Vika2 and Vika3 recombinases after the pentapeptide scanning mutagenesis was performed in the following way. The plasmid DNA of the obtained libraries of the active mutants of Vika and Vika-like recombinases was extracted, and fragments containing the recombinases and target sites were obtained by digesting with BsrGI and ScaI restriction enzymes and subsequent gel extraction. Preparation of the libraries was performed following the protocol ‘Native barcoding amplicons’ using an SQK-LSK110 and an EXP-NBD104 kit (Oxford Nanopore Technologies). The three libraries were mixed before the preparation in a 1:1:1 ratio. Sequencing was performed on a MinION Mk1B nanopore sequencer with a FLO-MIN106 r9.4.1 flowcell (Oxford Nanopore Technologies).
The high-throughput screen for testing different combinations of linker and spacing lengths to develop the ZF–recombinase fusion architecture was performed as follows. The libraries of Brec1–Zif268 fusions were cloned to the pEVO target site library, transformed into XL1-blue E. coli and grown for 14–16 h in LB supplemented with chloramphenicol, and Brec1–Zif268 expression was induced by ʟ-arabinose (1 µg ml−1 for the N-terminal and C-terminal fusion and 200 µg ml−1 for the insertional fusion). The plasmid DNA was extracted, and fragments containing Brec1–Zif268 fusion complexes and target sites were obtained by digesting with SacI and ScaI restriction enzymes and subsequent gel extraction. Preparation of the libraries was performed following the protocol ‘Native barcoding amplicons’ using the SQK-LSK110 and the EXP-NBD104 kit (Oxford Nanopore Technologies). The three libraries (N-terminal fusion, C-terminal fusion and insertional fusion) were mixed before the preparation in a 1:1:5 ratio. Sequencing was performed on the MinION Mk1B nanopore sequencer with the FLO-MIN106 r9.4.1 flowcell (Oxford Nanopore Technologies).
Deep sequencing analysis
PacBio HiFi DNA sequences after the pentapeptide scanning mutagenesis screen for Cre-type recombinases were aligned to the wild-type DNA reference sequence (Brec1, Cre, D7L or D7R) using Exonerate (version 2.3.0) with the ‘affine:bestfit’ model. From this alignment, the CIGAR values for each read were processed with a custom R script that counts 15-bp insertions for each position (R version 4.1.1 with tidyverse package version 1.3.1; ref. 54).
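The per-position counting of 15-bp insertions can be illustrated with a minimal Python sketch. It is not the custom R script used in the study. It assumes that each alignment has been reduced to a 0-based reference start and a SAM-style CIGAR string (the native Exonerate CIGAR layout differs and would need converting first); the function names and the input layout are hypothetical.

```python
import re
from collections import Counter

# SAM-style CIGAR tokens, e.g. "30M15I70M" (assumed input format).
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")

def insertion_positions(cigar, ref_start=0, ins_len=15):
    """Return 0-based reference positions that are directly followed by an
    insertion of exactly ins_len bases (15 bp = one pentapeptide)."""
    positions = []
    ref_pos = ref_start
    for length, op in CIGAR_TOKEN.findall(cigar):
        length = int(length)
        if op == "I":
            if length == ins_len:
                positions.append(ref_pos)
        elif op in "MDN=X":
            # Only these operations consume reference bases.
            ref_pos += length
    return positions

def count_insertions(alignments, ins_len=15):
    """alignments: iterable of (ref_start, cigar) tuples, one per read."""
    counts = Counter()
    for ref_start, cigar in alignments:
        counts.update(insertion_positions(cigar, ref_start, ins_len))
    return counts

example = [(0, "30M15I70M"), (0, "30M15I70M"), (10, "20M15I50M"), (0, "50M3I50M")]
insertion_counts = count_insertions(example)
```

In this toy input, three reads carry a 15-bp insertion at reference position 30, and the 3-bp insertion in the fourth read is ignored.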
All nanopore sequencing data were basecalled with Guppy (Oxford Nanopore Technologies, version 5.0.7) in high accuracy mode. Only reads with a Phred quality score of 10 or greater were retained for further processing.
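As an illustration of the read-level quality threshold, the sketch below computes a mean Phred score from a FASTQ quality string and applies the cutoff of 10. It assumes that the mean is taken over error probabilities rather than over the raw scores, which is one common convention for nanopore reads; whether the filtering in this study was applied in exactly this way is not stated, and the function names are hypothetical.

```python
import math

def mean_phred(quality_string, offset=33):
    """Mean read quality, averaged in error-probability space
    (assumption: Phred+33 encoded FASTQ quality string)."""
    if not quality_string:
        raise ValueError("empty quality string")
    error_probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    return -10 * math.log10(sum(error_probs) / len(error_probs))

def passes_filter(quality_string, min_q=10.0):
    """Keep reads whose mean Phred quality is at least min_q."""
    return mean_phred(quality_string) >= min_q
```

Averaging in probability space penalizes reads with a few very poor bases more strongly than a plain arithmetic mean of the scores would.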
Sequencing reads from the pentapeptide scanning mutagenesis screen for Vika-type recombinases were aligned in two phases. In the first (demultiplexing) phase, all reads were aligned to sequences of backbones containing Vika, Vika2 and Vika3 and their respective target sites in unrecombined and recombined variants (six reference sequences in total). Subsets of reads unambiguously mapping to each of the references were individually subjected to the second alignment phase, in which each subset was mapped to a library of corresponding recombinase sequences containing pentapeptide insertions at each possible position. The final recombination rate of each protein variant was calculated as the number of reads mapped to that recombinase variant with recombined target sites divided by the number of all reads mapped to that variant with either recombined or unrecombined target sites.
Sequencing reads from the high-throughput screen for testing different combinations of linker and spacing lengths were aligned to all possible sequence combinations (ZF fusion type, linker length, spacing between the loxBTR and zif268 target sites, recombined or non-recombined target sites) using minimap2 (version 2.17; ref. 51) with the ‘secondary’ option set to ‘no’. The alignment file was then filtered with SAMtools (version 1.11; ref. 55) using the view command and the -L option to only include reads that cover the target site and the recombinase by supplying BED files that contain specific coordinates for these regions. Relevant information from this alignment file was then extracted using GNU Awk (version 5.1.1) and processed and visualized in R (version 4.1.1 with tidyverse package version 1.3.1; ref. 54). The recombination rates were calculated by counting the number of recombined and non-recombined reads for each ZF–recombinase complex and target site combination.
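The counting step can be sketched in Python as follows. This is not the Awk and R code used in the study. It assumes that each retained read has already been assigned to one reference combination, and that the rate is the fraction of recombined reads among all reads for that combination (the definition given above for the Vika-type screen); the tuple layout and function name are hypothetical.

```python
from collections import defaultdict

def recombination_rates(read_assignments):
    """read_assignments: iterable of
    (fusion_type, linker_length, spacing, is_recombined) tuples, one per read.
    Returns {(fusion_type, linker_length, spacing): fraction recombined}."""
    counts = defaultdict(lambda: [0, 0])  # [recombined, non-recombined]
    for fusion, linker, spacing, is_recombined in read_assignments:
        counts[(fusion, linker, spacing)][0 if is_recombined else 1] += 1
    return {
        key: recombined / (recombined + non_recombined)
        for key, (recombined, non_recombined) in counts.items()
    }

reads = [
    ("N-terminal", 4, 5, True),
    ("N-terminal", 4, 5, True),
    ("N-terminal", 4, 5, False),
    ("N-terminal", 4, 5, False),
    ("insertional", 8, 5, True),
]
rates = recombination_rates(reads)
```

Combinations with no mapped reads simply do not appear in the result, so a missing key should be read as "no data" rather than as a rate of zero.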
PCR for assessing recombination activity
For a quick clonal analysis, the recombination activity was assessed by a PCR-based assay, described in Lansing et al.7,8. In brief, after the transformation, the recovery was plated on agar with chloramphenicol (15 µg ml−1). Single colonies were picked and grown in 500 µl of LB in the presence of chloramphenicol and ʟ-arabinose in a 96 deep-well plate for 16 h. Then, 1 µl of the grown cell suspension was used for colony three-primer PCR with MyTaq Polymerase (Bioline, BIO-21106) (primers 198–200) (Extended Data Fig. 5b). Primer 198 binds between two lox sites, and primer 199 binds upstream of the lox sites. Therefore, this primer combination generates a PCR product of ∼500 bp, indicating the non-recombined substrate. Primer 200 binds downstream of the second lox site, and a combination with primer 199 will generate a shorter ∼400-bp product, indicating the recombined plasmid. Short elongation time used for this PCR reaction does not allow product generation from the non-recombined template (∼1,140 bp) for this primer combination.
Plasmid-based recombination test in bacteria
To assess recombination activity, the efficiency of excision between the two target sites on the pEVO plasmid was evaluated. For this, the expression of the recombinase or ZF–recombinase complex in the overnight cultures was induced by addition of ʟ-arabinose. Testing of the fused ZF–recombinase complexes and recombinases alone on the same target sites was always performed at the same concentration of ʟ-arabinose, but the induction levels varied between the different experiments, as described here. Testing of the N-terminal and C-terminal fusions for Brec1–Zif268 and Brec1–ZFCCR5L was performed at 1 µg ml−1 and, for RecHTLV–Zif268, at 10 µg ml−1. This low induction level was used to demonstrate the enhanced recombination efficiencies when recombinases were fused to the ZFD on the lox-zif target sites. The insertional fusions were tested at different concentrations, depending on the activity of the non-fused recombinase on the respective lox sites (200 µg ml−1 was used for Brec1-loxBTR, D7R-loxF8R and RecFlex-loxMECP2; 100 µg ml−1 was used for RecHTLV-loxHTLV, D7L-loxF8L, D7-loxF8 and all the off-targets, RecFlex-loxFlex1, RecFlex-loxFlex4, Vika2-vox2, Vika3-vox3 and Vika4-vox4; and 10 µg ml−1 was used for RecFlex-loxFlex2, RecFlex-loxFlex3 and RecFlex-loxFlex5). Next, 500 ng of the plasmid DNA extracted from the induced cultures was digested with BsrGI and XbaI or SacI and SbfI restriction enzymes. Then, 200 ng of the digested DNA and 5 µl (2.5 µg) of GeneRuler DNA Ladder Mix (Thermo Fisher Scientific, SM0331) as a loading control were loaded onto a 0.8% agarose gel stained with RedSafe (Intron Biotechnology, 21141), and the gel was run at 70 V for 90 min. Three bands could be seen on the agarose gel after electrophoresis.
The smallest band of ∼1 kb shows the recombinase (∼1.5 kb for the recombinase fused with a ZFD, ∼2 kb for the recombinase heterodimer (D7) and ∼3 kb for the recombinase heterodimer fused with ZFDs (D7-ZF)) and is used as a control for the presence of the tested recombinase or complex in the digested plasmid pool. The biggest band of ∼5 kb shows the unrecombined pEVO backbone, and a smaller band of ∼4.3 kb shows the recombined substrate. The gel images were taken with an Infinity VX2-3026 transilluminator and InfinityCapt software (Vilber). The intensities of the 5-kb and 4.3-kb bands were measured using Fiji (version 2.0.0.-rc-65/1.52a). The recombination efficiency was quantified by the ratio of the non-recombined and the recombined band intensities.
ZFD design for genomic F8 locus
ZFDs were designed for the sequences upstream and downstream of the loxF8 target site in the human genome. Two three-finger ZFDs (ZFL1 and ZFL2) targeting the DNA sequence 5′-GCAATGAAT-3′ 5 bp upstream of the loxF8 target site (reverse strand) were designed using the publicly available platform of Persikov et al.30. Two three-finger ZFDs (ZFR1 and ZFR2) targeting the DNA sequence 5′-AAGATTGGC-3′ 5 bp downstream of the loxF8 target site (forward strand) and one three-finger ZFD (ZFR3) targeting the DNA sequence 5′-CAAGATTGG-3′ 4 bp downstream of the loxF8 target site (forward strand) were designed using the same publicly available platform30. One four-finger ZFD (ZFR4) targeting the DNA sequence 5′-CAAGATTGGCAG-3′ 4 bp downstream of the loxF8 target site (forward strand) was designed by modular assembly using the list of published ZF modules29. The amino acid sequences of the designed ZFDs are listed in Supplementary Table 1.
Zinc-finger domains designed for the loxF8 flanking sequences were evolved based on the established substrate-linked directed evolution of recombinases2,4,6,7,8. SLiDE links excision activity of the lox sites by a recombinase to the plasmid that encodes its gene. Because the activity of the recombinase was induced by ZFD binding to its target sites next to the lox sites, we used this property for evolving the ZFDs in this system. A schematic of the procedure is depicted in Fig. 4a. We started by creating a library of the ZFDs by performing 50 cycles of error-prone PCR with primers 156 and 157 and MyTaq Polymerase (Bioline, BIO-21106), which lacks a proof-reading activity and, therefore, introduces mutations. The PCR product was digested with BbvCI and PspOMI, and the band of around 400 bp for the ZFL and around 500 bp for the ZFR (both cases included the ZFs and the flanking linkers) was extracted from an agarose gel. This insert containing a ZF library was cloned into the digested pEVO vectors that contained the loxF8L-flank or loxF8R-flank target sites and D7L or D7R recombinase sequences, respectively, with the frameshift insertion between amino acids 278 and 279 that is flanked by BbvCI and PspOMI restriction sites. XL1-blue E. coli was transformed with the pEVO libraries and grown in 100 ml of LB medium with chloramphenicol (30 µg ml−1) and ʟ-arabinose (200 µg ml−1, 10 µg ml−1 or 1 µg ml−1). Then, 10 ml of the culture was used for the plasmid extraction, and 500 ng of plasmid DNA was digested with NdeI and AvrII that are located between the two lox sites on the plasmid. Thereby, the inactive variants, which did not perform excision, were eliminated from the pool. The remaining active variants were amplified using error-prone PCR with primers binding upstream of the Rec-ZF gene and downstream of the target site (primers 195 + 196). 
The PCR product was digested with BbvCI and PspOMI to extract only the ZFD and the flanking linker sequences and was cloned in the pEVO into the intact, wild-type recombinase gene, as described above, thereby starting a new cycle of ZF evolution. Additionally, to prevent the evolving ZFD from gaining a generally relaxed specificity, we performed counter-selection on the loxF8L and loxF8R target sites, which did not have flanking ZF target sequences. For the counter-selection, the digested ZF library fragments were cloned into pEVO containing the D7L or D7R recombinase sequence and the loxF8L or loxF8R sites, respectively. In this case, a high ʟ-arabinose concentration was used (200 µg ml−1). After plasmid DNA extraction, we directly proceeded to error-prone PCR and amplified the inactive variants with the primers binding upstream of the ZF-recombinase gene and between the lox sites (primers 197 + 198). The cycle was repeated with progressively lower ZF–recombinase expression levels (achieved by lowering the concentration of ʟ-arabinose) on the flanked target sites to select for the most improved variants, while expression was always kept high on the lox sites for the counter-selection. Overall, 17 cycles of evolution on the flanked target sites and three cycles of counter-selection evolution on the lox sites were performed for both ZFL and ZFR libraries. Finally, both recombinases fused with ZF libraries were combined, and the dimers inactive on the loxF8 target site (eight cycles) and active on the loxF8-flank sites (three cycles) were selected in a similar way, as described in Hoersten et al.12 and Lansing et al.8. High-fidelity Herculase II Fusion DNA polymerase (Agilent) was used for dimer selection to select the compatible combinations without introducing new mutations in the recombinase sequences.
Sequence analysis of the evolved ZFDs
Active clones from the final ZFL (75 clones) and ZFR (59 clones) libraries were picked and sent for E. coli overnight Sanger sequencing (Microsynth). The obtained sequences were analyzed to determine the mutational changes in the ZFD and linker sequence, by comparing to the respective ZFL1 and ZFR4 sequences. The analysis of the sequencing data was performed in R version 4.1.0 using the dplyr, Sequence tools (https://github.com/ltschmitt/SequenceTools) and ggplot2 packages.
HEK293T (American Type Culture Collection) cells were cultured in DMEM (Gibco, 10564011) with 10% FBS (Gibco, A5256701) and 1% penicillin–streptomycin (10,000 U ml−1, Thermo Fisher Scientific, 15140122) at 37 °C and 5% CO2 in a HERAcell Incubator 240i (Thermo Fisher Scientific). Trypsin-EDTA (Gibco, 25200056) was used for dissociation of the cells for splitting.
Patient-derived F8 hiPSCs were reprogrammed at the Stem Cell Engineering Facility of the Center for Molecular and Cellular Bioengineering (CMCB) at TU Dresden (described in Lansing et al.8). hiPSCs were cultured in StemFit Basic04 Complete Type (AJINOMOTO). The first 24 h after splitting, the medium was supplemented with 10 µM ROCK inhibitor (Y-27632, Tocris, 1254). Accutase (Thermo Fisher Scientific, 00-4555-56) was used for detachment of the cells for splitting. The coating was performed with iMatrix-511 silk laminin (NIPPI) according to the manufacturer’s instructions.
Generation of the HEK293TloxF8-zif cell line
HEK293T cells were transfected with the pLentiR-loxF8-zif-PURO plasmid, lentiviral gag/pol packaging plasmid (psPAX2, Addgene no. 12260) and the envelope plasmid VSV-G (pMD2.G, Addgene no. 12259), using standard polyethylenimine transfection. Forty-eight hours after transfection, viral particles generated in the supernatant were harvested and used to infect fresh HEK293T cells. Seventy-two hours after transduction, cells were exposed to selection with 2 µg ml−1 puromycin for 7 d. gDNA of the surviving cells was isolated and subjected to a reporter-specific PCR. Sequencing of the amplified fragment confirmed integration of the reporter construct in the genome.
Cell culture plasmid recombination assay
To test the activity of Brec1–Zif268 fusion complexes, a plasmid assay in HEK293T cells was performed. In total, 30,000 HEK293T cells per well were seeded in a 96-well plate. The next day, 25 ng of the pIRES expression plasmid and 25 ng of the pCAGGs reporter plasmid were transfected using Lipofectamine 2000 Transfection Reagent (Thermo Fisher Scientific, 11668019). The cells were analyzed with a MACSQuant VYB (Miltenyi Biotec) 48 h after transfection. HEK293T cells were gated for single cells, for the transfected population (GFP+ cells) and, finally, for the transfected cells that successfully performed the recombination of the reporter (mCherry+GFP+ cells). The recombination efficiency was calculated as the percentage of double-positive cells (mCherry+GFP+) divided by the percentage of all GFP+ cells.
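The efficiency calculation from the gated populations amounts to a single ratio, sketched here in Python. It assumes that both percentages refer to the same parent gate (single cells); the function name and the example values are hypothetical and are not measured data.

```python
def recombination_efficiency(pct_double_positive, pct_gfp_positive):
    """Fraction of transfected (GFP+) cells that also express mCherry.
    Both inputs are percentages of the same parent gate (assumption)."""
    if pct_gfp_positive <= 0:
        raise ValueError("no GFP+ cells: efficiency is undefined")
    return pct_double_positive / pct_gfp_positive

# Hypothetical example: 12% mCherry+GFP+ and 40% GFP+ among single cells.
efficiency = recombination_efficiency(12.0, 40.0)
```

Dividing by the GFP+ percentage makes wells with different transfection efficiencies comparable.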
To test the inversion efficiency of the genomic loxF8 locus, HEK293T cells were transfected with pEF1a expression plasmids expressing D7 or D7-ZF. For this, 200,000 HEK293T cells per well were seeded in a 12-well plate. The next day, 400 ng of pEF1a plasmid expressing D7L or D7L-ZFL and 400 ng of pEF1a plasmid expressing D7R or D7R-ZFR were transfected using Lipofectamine 2000 Transfection Reagent (Thermo Fisher Scientific, 11668019). Seventy-two hours after transfection, the cells were analyzed with the MACSQuant VYB and harvested. To determine transfection efficiency, HEK293T cells were gated for single cells and for transfected population (GFP+BFP+ cells). In both experiments, analysis of the flow cytometry data was performed using FlowJo 10 software (BD Biosciences).
In vitro transcription
The DNA templates for in vitro transcription (IVT) were generated by PCR from the pEF1a plasmids with EGFP (primers 201 + 202), D7L, D7L-ZFL or D7L-Zif268 (primers 203 + 204), D7R, D7R-ZFR or D7R-Zif268 (primers 205 + 206). D7L, D7R, D7L-ZFL(G10), D7R-ZFR(G10), D7L-Zif268, D7R-Zif268 and eGFP mRNA were produced using a HiScribe T7 ARCA mRNA Kit (NEB, E2065S) and purified using a Monarch RNA Cleanup Kit (NEB, T2040L), according to the manufacturer’s instructions.
HEK293TloxF8-zif reporter cells and patient-derived F8 hiPSCs were transfected with IVT-produced mRNA using Lipofectamine MessengerMAX Transfection Reagent (Thermo Fisher Scientific, LMRNA015). HEK293TloxF8-zif cells were seeded at a density of 300,000 cells per well in a 12-well format the day before transfection. Then, 300 fmol of mRNA per well (140 ng of D7L-Zif268 and D7R-Zif268 or 140 ng of D7L-ZFL and 150 ng of D7R-ZFR or 70 ng of eGFP mRNA) was used for transfection. F8 hiPSCs were seeded at a density of 600,000 cells per well in a six-well format the day before transfection. For each well, 740 fmol of recombinase mRNA (250 ng of D7L and D7R mRNA or 360 ng of D7L-ZFL mRNA and 380 ng of D7R-ZFR mRNA) and 50 ng of eGFP mRNA were used for transfection. In both cases, cells were analyzed 48 h after transfection by fluorescence microscopy and harvested.
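The relationship between molar and mass amounts of mRNA can be sketched as follows. The sketch uses the widely used approximation for single-stranded RNA, molecular weight ≈ (number of nucleotides × 320.5) + 159 g mol−1. The transcript lengths are not stated in this section, so the 1,000-nt length in the example is hypothetical, as is the function name.

```python
def mrna_mass_ng(amount_fmol, length_nt):
    """Convert an mRNA amount in fmol to ng, using the approximation
    MW(ssRNA) ~ length_nt * 320.5 + 159 g/mol."""
    molecular_weight = length_nt * 320.5 + 159.0  # g/mol
    # 1 fmol * 1 g/mol = 1e-15 g = 1e-6 ng
    return amount_fmol * molecular_weight * 1e-6

# Hypothetical 1,000-nt transcript at 300 fmol.
example_mass = mrna_mass_ng(300, 1000)
```

Because mass scales with transcript length, equimolar amounts of the longer ZF-fused transcripts correspond to larger masses than those of the recombinase-only transcripts.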
Detection of recombination by PCR on gDNA
gDNA from HEK293T cells and F8 hiPSCs transfected with D7 or D7-ZF was isolated using a QIAamp DNA Blood Mini Kit (Qiagen, 51106). The inversion of the 140-kb DNA fragment between the two loxF8 target sites was detected by PCR, as described previously by Lansing et al.8. Primer pairs 207 + 208 and 209 + 210 were used to amplify the WT (‘healthy’) orientation of the 140-kb fragment that can be detected in HEK293T cells or, in case of the inversion event, in F8 hiPSCs. Primer pairs 207 + 210 and 208 + 209 were used to amplify the inverted (‘hemophilic’) orientation of the int1h that is detected in F8 hiPSCs or, in case of the inversion event, in HEK293T cells. Recombination of loxF8-zif genomic reporter was detected by PCR using the primer pair 211 + 212 that amplifies both recombined (644 bp) and unrecombined (1,308 bp) fragments.
Inversion efficiency was quantified using a qPCR-based assay as described previously by Lansing et al.8. In brief, to detect the WT orientation (inversion event in F8 hiPSCs), the primers 209 + 210 were used; to detect the inverted ‘hemophilic’ orientation (inversion event in HEK293T cells), the primers 206 + 207 were used. In both cases, a TaqMan amplicon specific probe was used. Samples of 1%, 5%, 10%, 25%, 50% and 100% inversion were generated by mixing gDNA of WT iPSCs and F8 hiPSCs at appropriate ratios. The Cq values of these mixtures were used to build a standard curve and extrapolate the inversion efficiency of the gDNA samples of interest. The calculated inversion efficiencies from the transfected HEK293T cells were normalized by transfection efficiencies. Because gDNA of male iPSCs (one X chromosome) was used for generation of the standard curve used in the quantification, the calculated inversion efficiencies from the transfected HEK293T cells (female, two X chromosomes) were divided by 2. For quantification in F8 hiPSCs, an average of the triplicate samples transfected with D7 was calculated, and the fold change of each replicate treated with D7-ZF was quantified using the following formula: (D7-ZF inversion − D7 inversion average) / D7 inversion average.
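The quantification steps can be sketched in Python under stated assumptions. The sketch assumes a standard curve that is linear in Cq versus log10(% inversion), and it assumes that normalization means dividing by the fraction of transfected cells; neither detail is specified above. The correction for two X chromosomes and the fold-change formula follow the text. All function names and example numbers are hypothetical.

```python
import math

def fit_standard_curve(percent_inversion, cq_values):
    """Least-squares fit of Cq = slope * log10(percent) + intercept."""
    xs = [math.log10(p) for p in percent_inversion]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def inversion_from_cq(cq, slope, intercept):
    """Percent inversion read off the fitted standard curve."""
    return 10 ** ((cq - intercept) / slope)

def normalize_hek293t(percent_inversion, transfected_fraction):
    """Normalize by transfection efficiency (assumed: division by the
    transfected fraction) and divide by 2 for two X chromosomes."""
    return percent_inversion / transfected_fraction / 2

def fold_change(d7zf_inversion, d7_inversions):
    """(D7-ZF inversion - D7 inversion average) / D7 inversion average."""
    d7_average = sum(d7_inversions) / len(d7_inversions)
    return (d7zf_inversion - d7_average) / d7_average

# Hypothetical standards lying exactly on Cq = 30 - 3.32 * log10(percent).
standards = [1, 5, 10, 25, 50, 100]
standard_cqs = [30 - 3.32 * math.log10(p) for p in standards]
slope, intercept = fit_standard_curve(standards, standard_cqs)
```

With real data the standards will scatter around the fitted line, so it is worth checking the goodness of fit before reading sample values off the curve.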
ChIP-seq and qPCR validation
D7L-ZFL and D7R-ZFR were fused with EGFP and cloned in a modified version of the tetracycline-inducible plentiX vector (described previously in refs. 8,9). These plasmids were used as a template for PCR with the primers 201 + 204, 201 + 206 and 201 + 202. The obtained DNA templates were used for IVT, as described above. Two 10-cm dishes were seeded with 4 million HEK293T cells each. The next day, 6.5 µg of D7L-ZFL-EGFP and 6.5 µg of D7R-ZFR-EGFP or 3 µg of EGFP mRNA were transfected as described above. ChIP was performed as described previously8,9. In short, 24 h after transfection, cells were crosslinked with 1% formaldehyde for 10 min, chromatin was extracted using a truChIP Chromatin Shearing Kit (Covaris) following the manufacturer’s protocol (high cell number) and sheared with a Covaris M220 sonicator. Then, 1% of the sheared chromatin was set aside as an input sample for further qPCR validation, and the rest was used for immunoprecipitation. Sonicated chromatin was immunoprecipitated with a goat GFP antibody (MPI-CBG antibody facility, 1:5,000) and Protein G sepharose beads (Protein G Sepharose 4 Fast Flow, GE Healthcare). Eluates were reverse crosslinked, followed by RNA and protein digestion.
ChIP DNA sequencing was performed at the Novogene facility. The DNA fragments were repaired, A-tailed and ligated to Illumina adapters. The final DNA library was obtained by size selection and PCR amplification. The library was quantified with Qubit and real-time PCR, and its size distribution was checked with a bioanalyzer. Quantified libraries were sequenced on the Illumina platform, aiming for at least 30 million pairs of sequencing reads per sample, with each read being 150 bp long. Additionally, the same set of DNA samples was sequenced at the Deep Sequencing Facility of TU Dresden, using the same sample processing pipeline but a read length of 100 bp.
Reads obtained from both sequencing facilities were pooled and aligned to the human reference genome assembly GRCh38.p13 (ref. 56) using the bwa-mem2 aligner57 and SAMtools58. Reads identified as PCR and optical duplicates were removed using the Picard MarkDuplicates tool, and the final peak calling was performed with Genrich59, using the ENCODE blacklist (version 2)60 to filter out problematic regions. Visualizations of the ChIP-seq pile-up signals were generated with the UCSC Genome Browser61, directly from the read alignment files after the duplicate removal step. All steps involving manipulations and comparisons of genomic intervals were done using BEDTools62.
De novo motif discovery was performed with the MEME-ChIP script from the MEME suite63, which was executed with the following set of arguments: ‘-ccut 0 -seed 0 -meme-mod oops -minw 8 -maxw 30 -meme-nmotifs 10 -meme-minsites 20 -centrimo-local’. For the motif comparison stage, a database of loxP and loxF8 sequences was used, with addition of predicted zinc-finger DNA-binding motifs. To generate an input file, BEDTools62 was used to extract 500-bp-long sequences centered at the peak summits reported by the peak-calling pipeline.
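The interval arithmetic behind the 500-bp windows can be sketched in Python as follows. The extraction itself was done with BEDTools. This sketch only illustrates how a window centered on a summit can be expressed as a BED-style interval, and it assumes 0-based, half-open coordinates with clipping at the chromosome start. The function names are hypothetical.

```python
def summit_window(chrom, summit, width=500):
    """Return a BED-style (chrom, start, end) interval centered on a summit.

    Assumptions: `summit` is a 0-based position, the interval is half-open,
    and the start is clipped at 0 so that windows near a chromosome start
    are shorter than `width`.
    """
    half = width // 2
    start = max(0, summit - half)
    end = summit + half
    return chrom, start, end


def to_bed_line(interval):
    """Format an interval as a tab-separated BED line."""
    chrom, start, end = interval
    return f"{chrom}\t{start}\t{end}"
```

Intervals written in this form could then be passed to a sequence-extraction tool to obtain the FASTA input for motif discovery.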
Twenty-five high-confidence peaks were found for D7-ZF. As a comparison, 84 off-target sites were detected for the recombinases alone using the same cutoff8, indicating that fusion with the ZFDs did not increase the number of binding sites in the genome. Ten out of 25 peaks that were identified by ChIP-seq were additionally tested by qPCR for recombinase binding (primers 213–236). qPCR was performed using SYBR Green Master Mix (Thermo Fisher Scientific, ABsolute qPCR SYBR Green Mix, AB1159A), and the ChIP samples and input samples were compared.
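One common way to compare ChIP and input samples by qPCR is the percent-input calculation, sketched below in Python. The text does not state which calculation was used, so this is an assumption. The sketch further assumes 100% amplification efficiency (a doubling per cycle) and the 1% input fraction described above. The function name is hypothetical.

```python
import math


def percent_input(cq_chip, cq_input, input_fraction=0.01):
    """Express a ChIP signal as a percentage of the input chromatin.

    Assumptions: perfect doubling per cycle, and an input sample that
    represents `input_fraction` of the chromatin used for the ChIP.
    """
    # Shift the input Cq to the value expected for 100% of the chromatin.
    adjusted_input_cq = cq_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_cq - cq_chip)
```

Under these assumptions, a ChIP sample with the same Cq as the 1% input corresponds to 1% of input, and each cycle earlier doubles that value.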
Ten peak sequences were tested in a plasmid-based assay for recombination in bacteria. A sequence of 70 bp around each peak was chosen based on the position of the identified zinc-finger DNA-binding motifs. A 70-bp DNA insert for each peak was generated by PCR (primers 237–256) and cloned into the pEVO vector twice as target sites for excision, as described above. D7-ZF expression was induced with 100 µg ml−1 ʟ-arabinose, and its activity on the cloned sequences was assessed in plasmid-based recombination tests in E. coli, as described above.
Recombinase and CRISPR–Cas9 deletion detection
To investigate potential unintended deletion of the 140-kb fragment at the F8 locus by the D7 and D7-ZF recombinases, we tested them in HEK293T cells and compared the recombinase approach with CRISPR–Cas9. For transfection in a 12-well format, 600 fmol of mRNA per well (200 ng of D7L and D7R, 280 ng of D7L-ZFL and D7R-ZFR, 800 ng of Cas9 (TriLink BioTechnologies, L-7206) or 140 ng of eGFP mRNA) was used. Cas9 mRNA was transfected in combination with 8 pmol of gRNA specific for the inverted repeat (5′-GGUCCCCGGGGUUGUGCCCC-3′), as published by Park et al.32. Genomic DNA was isolated 48 h after transfection and analyzed for inversion and deletion events by PCR. Primer pair 207 + 208 was used to amplify the WT (‘healthy’) orientation of the 140-kb fragment, and primer pair 208 + 209 was used to amplify the inversion event in HEK293T cells. Primer pair 207 + 209 was used to amplify the potential deletion of the 140-kb fragment. The PCR product obtained from Cas9 and gRNA transfected samples was cloned into pMiniT 2.0 plasmid using an NEB PCR Cloning Kit, according to the manufacturer’s recommendations. After transformation into NEB 10-beta E. coli, nine colonies were picked and sequenced using E. coli overnight Sanger sequencing (Microsynth) with the ‘cloning analysis forward primer’, provided in the NEB PCR Cloning Kit.
The position weight matrices (PWMs) for the evolved ZFL and ZFR in the selected D7-ZF clone were obtained using the Interactive PWM Predictor64. Potential genomic off-targets of D7 recombinases8 were then scanned for occurrences of the PWM motifs upstream or downstream of the lox sites using the FIMO tool from the MEME Suite63, with a P value threshold of 0.001. Reported results were filtered to ensure that the coordinates of matches were within an expected distance of 4 bp to 6 bp from the corresponding ‘left’ or ‘right’ half site. The predicted off-targets are listed in Supplementary Table 1.
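The distance filter applied to the motif matches can be sketched in Python as follows. This is an illustration rather than the code used in this study. It assumes that distance is measured edge to edge between a motif match and the neighboring half site, and that both are given as 0-based, half-open (start, end) intervals on the same chromosome. The function names are hypothetical.

```python
def edge_gap(site, match):
    """Number of bases between a half site and a motif match.

    Both arguments are (start, end) tuples in 0-based, half-open coordinates.
    Returns None when the two intervals overlap.
    """
    site_start, site_end = site
    match_start, match_end = match
    if match_end <= site_start:      # match lies upstream of the site
        return site_start - match_end
    if match_start >= site_end:      # match lies downstream of the site
        return match_start - site_end
    return None


def keep_match(site, match, min_gap=4, max_gap=6):
    """Keep a match only if it lies 4 bp to 6 bp from the half site."""
    gap = edge_gap(site, match)
    return gap is not None and min_gap <= gap <= max_gap
```

Applied to each reported match and its corresponding ‘left’ or ‘right’ half site, keep_match retains only the matches at the expected spacing.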
Target site identification
To find a locus in the human genome that can be targeted by RecFlex, a human genome-wide search for loxFlex motif occurrences was performed using FIMO63.
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.