WO2025137069A1 - Compositions and methods for enhanced genome editing using Cas9 fusion proteins

Publication number
WO2025137069A1
Authority
WO
WIPO (PCT)
Prior art keywords
fragment
variant
sequence
cas9
nucleic acid
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2024/060720
Other languages
French (fr)
Inventor
Alexandros POULOPOULOS
Ryan Richardson
Colin Robertson
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Maryland Baltimore
University of Maryland College Park
Original Assignee
University of Maryland Baltimore
University of Maryland College Park
Application filed by University of Maryland Baltimore, University of Maryland College Park

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C12N15/907 Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K31/00 Medicinal preparations containing organic active ingredients
    • A61K31/70 Carbohydrates; Sugars; Derivatives thereof
    • A61K31/7088 Compounds having three or more nucleosides or nucleotides
    • A61K31/7105 Natural ribonucleic acids, i.e. containing only riboses attached to adenine, guanine, cytosine or uracil and having 3'-5' phosphodiester links
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K31/00 Medicinal preparations containing organic active ingredients
    • A61K31/70 Carbohydrates; Sugars; Derivatives thereof
    • A61K31/7088 Compounds having three or more nucleosides or nucleotides
    • A61K31/711 Natural deoxyribonucleic acids, i.e. containing only 2'-deoxyriboses attached to adenine, guanine, cytosine or thymine and having 3'-5' phosphodiester links
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A61K48/0008 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'non-active' part of the composition delivered, e.g. wherein such 'non-active' part is not delivered simultaneously with the 'active' part of the composition
    • A61K48/0016 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'non-active' part of the composition delivered, e.g. wherein such 'non-active' part is not delivered simultaneously with the 'active' part of the composition wherein the nucleic acid is delivered as a 'naked' nucleic acid, i.e. not combined with an entity such as a cationic lipid
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A61K48/005 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K9/00 Medicinal preparations characterised by special physical form
    • A61K9/0012 Galenical forms characterised by the site of application
    • A61K9/0019 Injectable compositions; Intramuscular, intravenous, arterial, subcutaneous administration; Compositions to be administered through the skin in an invasive manner
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K9/00 Medicinal preparations characterised by special physical form
    • A61K9/10 Dispersions; Emulsions
    • A61K9/127 Synthetic bilayered vehicles, e.g. liposomes or liposomes with cholesterol as the only non-phosphatidyl surfactant
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K9/00 Medicinal preparations characterised by special physical form
    • A61K9/48 Preparations in capsules, e.g. of gelatin, of chocolate
    • A61K9/50 Microcapsules having a gas, liquid or semi-solid filling; Solid microparticles or pellets surrounded by a distinct coating layer, e.g. coated microspheres, coated drug crystals
    • A61K9/51 Nanocapsules; Nanoparticles
    • A61K9/5107 Excipients; Inactive ingredients
    • A61K9/5123 Organic compounds, e.g. fats, sugars
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/315 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Streptococcus (G), e.g. Enterococci
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases [RNase]; Deoxyribonucleases [DNase]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y301/00 Hydrolases acting on ester bonds (3.1)
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P39/00 General protective or antinoxious agents
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]

Definitions

  • the field of the invention generally relates to biotechnology and medicine, in particular compositions and methods for genome editing.
  • Genome editing is a powerful technology that allows for the specific and often precise addition or removal of genetic material. Genome editing is initiated by making double stranded DNA breaks in the target cell. These double stranded DNA breaks can be created by several methods including meganucleases, Zinc-Finger Nucleases, TALE-nucleases, and/or the CRISPR/Cas9 restriction modification system. Each of these systems creates a dsDNA break at a user designated genomic location. After the creation of the dsDNA break, the cellular machinery acts quickly to repair this dsDNA break by either the non-homologous end joining (NHEJ) pathway or homologous recombination (HDR).
  • Cas9 targets genomic loci with high specificity. For knockin with double-strand break repair, however, Cas9 often leads to unintended on-target knockout rather than intended edits. This imprecision is a barrier for direct in vivo editing where clonal selection is not feasible.
  • the present inventors disclose a high-throughput workflow to quantify editing outcomes for creating and identifying editing agents with increased performance and their optimal combinations for knockin applications. The inventors have established editing efficiency and precision as generalizable assessment metrics for comparisons of knockin performance across existing and novel agents. Using this platform, the inventors aimed to enhance DSB repair-based editing performance by combinatorial screens of Cas9 variants, DNA donors, and new compound fusions to DNA repair protein domains.
  • Cas9-RC is a high-performance DSB repair-based editing agent with increased editing efficiency and precision.
  • Cas9-RC was tested for its editing performance in vivo in the embryonic mouse brain, where it enhanced fluorescent protein knockin in some cases by in utero electroporation. These improvements showcase the utility of this workflow for continued development and assessment of new precision editing tools for in vivo knockin applications, including expression vectors, host cells and methods of producing the recombinant Cas9 fusion protein.
  • the invention provides a nucleic acid comprising a nucleotide sequence encoding a fusion protein, wherein the fusion protein comprises Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • the invention provides a nucleic acid comprising a nucleotide sequence encoding a fusion protein, wherein the fusion protein comprises Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • Rad18 or a variant or fragment thereof is fused to the amino terminus of Cas9 or a variant or fragment thereof and CtIP or a variant or fragment thereof is fused to the carboxy terminus of Cas9 or a variant or fragment thereof.
  • a Rad18 fragment is fused to the amino terminus of Cas9 or a variant or fragment thereof and an HDR Enhancing N-terminal fragment of CtIP is fused to the carboxy terminus of Cas9 or a variant or fragment thereof.
  • the fusion protein comprises an amino acid sequence of SEQ ID NO:7.
  • the invention provides a nucleic acid sequence that encodes a polypeptide comprising SEQ ID NO:7.
  • the nucleic acid sequence of the fusion protein comprises SEQ ID NO: 8.
  • the Cas9 is a wild-type Cas9 from Streptococcus pyogenes.
  • the fusion protein comprises a fragment of Cas9 that lacks an N-terminal methionine.
  • the Cas9 or fragment or variant thereof comprises a nucleic acid sequence encoding a polypeptide comprising SEQ ID NO: 1.
  • the Cas9 or fragment or variant thereof nucleic acid sequence comprises SEQ ID NO:2 (DNA) or SEQ ID NO:36 (RNA).
  • Cas9 or fragment or variant thereof is a high fidelity variant (Cas9-HF).
  • the fusion protein comprises a fragment of CtIP.
  • the fusion protein comprises the HDR Enhancing N-terminal fragment of CtIP, comprising SEQ ID NO:3 (amino acids 1-296).
  • the CtIP fragment nucleic acid sequence comprises SEQ ID NO:4 (DNA) or SEQ ID NO:38 (RNA).
  • the CtIP, variant or fragment lacks an N-terminal methionine.
  • the fusion protein comprises a variant or fragment of Rad18.
  • the fusion protein comprises a fragment of Rad18, which contains a deletion of a putative DNA-binding domain.
  • this enhanced version of Rad18 (also referred to herein as eRad18) has a deletion in amino acids 242-282 in Rad18 and comprises SEQ ID NO:5.
  • the eRad18 nucleic acid sequence comprises SEQ ID NO:6 (DNA) or SEQ ID NO:40 (RNA).
  • the Rad18, variant or fragment lacks an N-terminal methionine.
  • the nucleic acid is selected from DNA or RNA. In some embodiments, the nucleic acid is RNA. In some embodiments, the RNA is circular RNA (circRNA).
  • the nucleic acid encoding the fusion protein comprises a vector.
  • the vector is a viral vector.
  • the vector is a mammalian expression vector.
  • the invention provides a host cell transformed or transfected with a vector encoding a fusion protein, wherein the fusion protein comprises Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • the invention provides a host cell transformed or transfected with a vector encoding a fusion protein, wherein the fusion protein comprises Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • the invention provides a pharmaceutical composition comprising a nucleic acid encoding the fusion protein.
  • the composition comprises a delivery system, such as lipid nanoparticles or lipid-like nanoparticles.
  • the lipid or lipid-like nanoparticles are ionizable.
  • the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a nucleic acid encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify the genomic sequence in the cell.
  • the one or more additional agents include guide RNA(s) and/or template nucleic acid.
  • the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a nucleic acid encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify the genomic sequence in the cell.
  • the one or more additional agents include guide RNA(s) and/or template nucleic acid.
  • the invention provides a method of treating a disease or condition in a subject by administering to the subject an effective amount of the modified cells.
  • the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a nucleic acid encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify a genomic sequence in the subject.
  • the one or more additional agents include guide RNA(s) and/or template nucleic acid.
  • the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a nucleic acid encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify a genomic sequence in the subject.
  • the one or more additional agents include guide RNA(s) and/or template nucleic acid.
  • the present invention provides compositions and methods for expression and purification of Cas9 fusion proteins and variants thereof in prokaryotic host cells.
  • the invention provides a prokaryotic expression vector encoding a fusion protein, wherein the fusion protein comprises Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • the invention provides a prokaryotic host cell transformed or transfected with a prokaryotic expression vector encoding a fusion protein, wherein the fusion protein comprises Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • the invention provides a prokaryotic expression vector encoding a fusion protein, wherein the fusion protein comprises Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • the invention provides a method for producing the fusion protein in a prokaryotic host cell, comprising culturing the prokaryotic host cell in a growth media under conditions suitable for the expression of the fusion protein and isolating the fusion protein.
  • FIG. 1 BFP-to-GFP editing as a platform to quantify knockin efficiency and precision.
  • A Schematic of BFP-to-GFP conversion assay screen. Genome editing agents are tested by transfecting HEK:BFP cells, a HEK293 knockin cell line that genomically expresses BFP driven by an EF1α promoter. Successful editing knocks in the H62Y mutation into the BFP locus, thus producing GFP.
  • Genotyping of the three collected populations was performed by PCR of the BFP locus from genomic DNA, followed by Sanger sequencing of amplified fragment pools.
  • C Alignment of Sanger sequencing reads from BFP+, dark, and GFP+ cell populations to the reference BFP locus sequence (top), showing wild-type, knockout (indel), and knockin genotypes, respectively.
  • D Editing outcomes (indels, knockin frequency) were quantified by decomposition of Sanger sequencing reads using the ICE algorithm and plotted on relative histograms binned by indel size. GFP+ cells represent true knockins, and over 90% of dark cells contain indels predicted to cause knockout.
  • FIG. 2. Cas9-CtIP fusion and HMEJ donors additively improve knockin precision.
  • A Schema illustrating the combinatorial parameters of editing agents: dsDNA donor template HR or HMEJ variants in combination with Cas9 WT or HiFi variants, alone or fused to CtIP.
  • B Flow cytometry plots of HEK:BFP cells 7 days after transient transfection with gRNA, Cas9, and donor plasmids. GFP fluorescence is shown on the x-axis (log), and BFP fluorescence on the y-axis (log). Quantification gates are indicated on the plots for BFP+ (WT), GFP+ (KI), and dark (KO) cells.
  • Cas9 variants with HR (top row, yellow) or HMEJ (bottom row, purple) donors were measured. Mean values from individual experiments (n > 3) were normalized to those of the Cas9 WT with HR donor condition and presented as the mean ± SEM.
  • C Heatmaps displaying statistical significance of pairwise comparisons of editing agent performance, calculated using one-way ANOVA with Tukey’s multiple comparison test and single pooled variance (* P < 0.05; ** P < 0.01; *** P < 0.001).
  • Cas9-CtIP fusion with HMEJ donor outperforms Cas9 with HR donor in knockin precision by over 30-fold.
  • FIG. 3 Iterative screening of novel Cas9 fusions and compound fusions with DNA-repair domains for increased editing performance.
  • A Schema of the fusion screen. Candidate DNA-repair protein domains are fused N-terminally to either Cas9 alone or Cas9-CtIP and together with HMEJ donor assayed by BFP-to-GFP for knockin efficiency and precision.
  • B Bar graphs showing quantification of relative knockin efficiency, knockout frequency, and knockin precision for Cas9 fusion (light blue) or Cas9-CtIP compound fusion (dark blue) with the listed DNA-repair protein domains. Values from individual experiments (n > 3) were normalized to Cas9 only and presented as the mean ± SEM.
  • FIG. 4 Expression of affinity-tagged Cas9-RC ribonucleoprotein.
  • A Schema of the Cas9-RC ribonucleoprotein N-terminally fused with cleavable tandem GST and MBP affinity tags. The molecular weights of each domain (kDa) and an HRV 3C protease cleavage site (red triangle) are indicated.
  • B Protein electrophoresis gel stained with Coomassie blue showing lysates from bacteria transformed with GST-MBP-Cas9-RC expression construct prior to (left lane) and after inducing expression with IPTG for 3h (right lane).
  • FIG. 5 Cas9-RC enhances knockin efficiency in vivo.
  • A Schema of the Cas9 and Cas9-RC agents compared for in vivo editing using in utero electroporation in the fetal mouse brain at embryonic day (E) 14.5, and analysis in the cerebral cortex at postnatal day (P) 7.
  • B Gene editing donor template DNA targets the endogenous ActB locus to insert mCherry downstream of the β-actin coding sequence separated by a 2A. Knockin efficiency was quantified as the number of mCherry+ (knockin) neurons over the number of GFP+ (electroporated) neurons.
  • C Representative fluorescence images of the cerebral cortex at P7 with electroporated neurons receiving Cas9 or Cas9-RC. Images show plasmid GFP (green) and genomic ActB-2A-mCherry (red) expression. Scale bar 100 µm.
  • D Swarm plots showing quantification of in vivo knockin efficiency for Cas9 vs. Cas9-RC and effect size estimation. Points show means from each brain and are plotted on the left axis for both groups indicating knockin efficiency. The effect size on knockin efficiency of Cas9-RC versus Cas9 is plotted as a distribution on a floating axis on the right indicating standard deviations (S).
  • the effect size estimated by unpaired Cohen's d between Cas9 and Cas9-RC is 2.0 S (black dot), indicating a large effect size.
  • the 95% confidence interval is 0.739 to 3.97 (vertical error bar).
  • the P value is 0.0022.
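The effect size reported for FIG. 5D can be illustrated with a minimal sketch of an unpaired Cohen's d, assuming the conventional pooled-standard-deviation form. The function name `cohens_d` and the per-brain values below are hypothetical and are not data from the application; the confidence interval and P value are not reproduced here.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(control, treatment):
    """Unpaired Cohen's d using the pooled sample standard deviation."""
    n1, n2 = len(control), len(treatment)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(control) + (n2 - 1) * variance(treatment)) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Hypothetical per-brain knockin efficiencies (mCherry+ / GFP+), for illustration only.
cas9 = [0.10, 0.12, 0.14]
cas9_rc = [0.20, 0.22, 0.24]
print(f"Cohen's d = {cohens_d(cas9, cas9_rc):.2f}")
```

A value of d expresses the difference between group means in units of the pooled standard deviation, which is how the floating axis in panel D is described.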
  • FIG. 6 Knockin applications with Cas9-RC.
  • A Primary fibroblasts from wildtype newborn mice were cuvette electroporated (nucleofected, schema on the right) with Cas9-RC plasmid and ActB-2A-mCherry donor DNA. Fibroblasts show two intensities of mCherry fluorescence (red) from the mouse β-Actin locus, indicating mCherry knockin on one (arrowheads) or both (arrows) ActB alleles.
  • Cortical projection neurons (arrows) and astrocytes (arrowheads) are knocked in with Cas9-RC plasmid and ActB-2A-mCherry donor DNA in utero as in Fig.
  • C mScarlet-fusion knockin onto Negr1, which represents a locus with developmentally regulated levels of expression that vary between neurons.
  • Cas9-RC and mScarlet-Negr1 donor DNA were in utero electroporated in E14.5 mouse brain to extracellularly tag the endogenous Negr1 GPI-linked membrane protein with mScarlet.
  • Knockin cells displayed variable levels of mScarlet fluorescence (red) with a wide range of high (arrows) and low (arrowheads) expressing cells.
  • Electroporated knockin positive (arrow) and knockin negative (arrowhead) cells can be seen in the Purkinje cell layer in P14 cerebellum.
  • Cas9-RC has versatile applications and may offer advantages that outweigh the disadvantage of its large size, depending on the target gene and cell type. Color labels and scale bars as indicated.
  • FIG. 7 Map of bacterial expression vector encoding Cas9-RC.
  • FIG. 8 Gating strategy for flow cytometry analysis of editing in HEK:BFP cells. Live cells were first gated by size and granularity using FSC-A vs SSC-A and then singlets were gated using SSC-A vs SSC-H.
  • FIG. 9. Biological replicate data and statistical analysis related to Fig. 3.
  • (a-c) Quantification of flow cytometry data for biological replicates from HEK:BFP cells 7 days after transient transfection indicating (a) KI efficiency (% GFP+), (b) KO frequency (% dark), and (c) KI precision (KI/KO ratio) for Cas9 variants with HMEJ donor.
  • (d-f) Heatmap matrices showing statistical significance calculated using a one-way ANOVA with Tukey’s multiple comparison test and pooled variance (also shown in Fig. 3). Differences between conditions were judged to be significant at P < 0.05 (*), P < 0.01 (**), and P < 0.001 (***).
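The three flow cytometry metrics named above (KI efficiency as % GFP+, KO frequency as % dark, and KI precision as the KI/KO ratio) can be sketched as follows. This is an illustrative sketch only: it assumes percentages are taken over all gated singlet cells (BFP+ plus dark plus GFP+), and the function names and cell counts are hypothetical rather than taken from the application.

```python
def editing_metrics(n_bfp, n_dark, n_gfp):
    """Return KI efficiency (%), KO frequency (%), and KI precision (KI/KO ratio)."""
    total = n_bfp + n_dark + n_gfp
    if total == 0:
        raise ValueError("no gated cells")
    ki_efficiency = 100.0 * n_gfp / total   # % GFP+ cells (knockin)
    ko_frequency = 100.0 * n_dark / total   # % dark cells (knockout)
    # The ratio is unbounded when no knockout cells are observed.
    ki_precision = ki_efficiency / ko_frequency if n_dark else float("inf")
    return {
        "ki_efficiency": ki_efficiency,
        "ko_frequency": ko_frequency,
        "ki_precision": ki_precision,
    }


def normalize_to_reference(metrics, reference):
    """Express each metric relative to a reference condition (e.g. Cas9 only)."""
    return {name: metrics[name] / reference[name] for name in metrics}


# Hypothetical gated cell counts, for illustration only.
reference = editing_metrics(n_bfp=8000, n_dark=1900, n_gfp=100)
candidate = editing_metrics(n_bfp=7000, n_dark=2500, n_gfp=500)
print(normalize_to_reference(candidate, reference))
```

Normalizing to a reference condition mirrors how the figure legends report values relative to Cas9 WT with HR donor or to Cas9 only.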
  • FIG. 10 Expression of affinity-tagged Cas9-RC ribonucleoprotein: (A) Protein electrophoresis gel stained with Coomassie blue showing lysates from bacteria transformed with GST-MBP-Cas9-RC expression construct prior to (left lane) and after inducing expression with IPTG for 3h (right lane). Arrows indicate putative GST-MBP-Cas9-RC (316 kDa) and Cas9-RC (250 kDa) bands. (B) Anti-Cas9 immunoblot of lysate lanes shows Cas9 immunoreactive bands after IPTG induction, including bands with electrophoretic mobilities corresponding to full-length GST-MBP-Cas9-RC and Cas9-RC, as indicated.
  • the present invention generally relates to compositions and methods for modifying a genomic sequence of a cell, including nucleic acids and polypeptides encoding fusion proteins comprising Cas9 or a variant or fragment thereof, Rad52 or Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify a genomic sequence in the subject.
  • the term “about” means plus or minus 10% of the numerical value of the number with which it is being used.
  • “nucleic acid” and “polynucleotide” are used interchangeably and refer to a deoxyribonucleotide or ribonucleotide polymer, in linear or circular conformation, and in either single- or double-stranded form.
  • these terms are not to be construed as limiting with respect to the length of a polymer.
  • the terms can encompass known analogues of natural nucleotides, as well as nucleotides that are modified in the base, sugar and/or phosphate moieties.
  • examples of polynucleotides include coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, short interfering RNA (siRNA), short-hairpin RNA (shRNA), micro-RNA (miRNA), ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers.
  • a polynucleotide may comprise one or more modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. The sequence of nucleotides may be interrupted by nonnucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component.
  • chimeric RNA refers to the polynucleotide sequence comprising the guide sequence, the tracr sequence and the tracr mate sequence.
  • guide sequence refers to the about 20 bp sequence within the guide RNA that specifies the target site and may be used interchangeably with the terms “guide” or “spacer”.
  • tracr mate sequence may also be used interchangeably with the term “direct repeat(s)”.
  • Exemplary CRISPR-Cas systems are provided in U.S. Pat. No. 8,697,359 and US 20140234972, both of which are incorporated herein by reference in their entirety.
  • the terms “polypeptide,” “peptide,” and “protein” are used interchangeably and also apply to amino acid polymers in which one or more amino acids are chemical analogues or modified derivatives of corresponding naturally-occurring amino acids.
  • sequence relates to a nucleotide sequence of any length, which can be DNA or RNA; can be linear, circular or branched and can be either single-stranded or double stranded.
  • identity relates to an exact nucleotide-to-nucleotide or amino acid-to-amino acid correspondence of two polynucleotides or polypeptide sequences, respectively.
  • Two or more sequences can be compared by determining their percent identity. Calculations of homology or sequence identity between two sequences (the terms are used interchangeably herein) are performed as follows. The sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes).
  • the optimal alignment is determined as the best score using the GAP program in the GCG software package with a BLOSUM62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frame shift gap penalty of 5.
  • the amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position.
  • the percentage identity between the two sequences is a function of the number of identical positions shared by the sequences.
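The position-by-position comparison described above can be sketched as a short function operating on two sequences that have already been aligned. This is an illustration under stated assumptions: the alignment step itself (the GAP program with its scoring matrix and penalties) is not performed here, the choice of alignment length as the denominator is an assumption since the text says only that identity is a function of the number of identical positions, and the name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are ignored; a gap opposite
    a residue counts as a non-identical position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


# Hypothetical aligned nucleotide sequences with one gap and one mismatch.
print(round(percent_identity("ACGT-A", "ACCTGA"), 1))
```

Other denominators (for example, the length of the shorter sequence) are also in common use and would give different percentages for gapped alignments.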
  • Sequence similarity between polynucleotides can be determined by hybridization of polynucleotides under conditions that allow formation of stable duplexes between homologous regions, followed by digestion with single-stranded-specific nuclease(s), and size determination of the digested fragments.
  • nucleic acids are written left to right in 5' to 3' orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively. It is to be understood that this invention is not limited to the particular methodology, protocols, and reagents described, as these may vary, depending upon the context they are used by those of skill in the art.
  • recombinant includes reference to a cell or vector that has been modified by the introduction of a heterologous nucleic acid sequence or that the cell is derived from a cell so modified.
  • recombinant cells express genes that are not found in identical form within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, underexpressed, or not expressed at all as a result of deliberate human intervention.
  • wild type is a term of the art understood by skilled persons and means the typical form of an organism, strain, gene or characteristic as it occurs in nature as distinguished from mutant or variant forms.
  • variant should be taken to mean the exhibition of qualities that have a pattern that deviates from what occurs in nature.
  • nucleic acid molecules or polypeptides mean that the nucleic acid molecule or the polypeptide is at least substantially free from at least one other component with which they are naturally associated in nature and as found in nature.
  • “Complementarity” refers to the ability of a nucleic acid to form hydrogen bond(s) with another nucleic acid sequence by either traditional Watson-Crick base pairing or other non-traditional types.
  • a percent complementarity indicates the percentage of residues in a nucleic acid molecule which can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleic acid sequence (e.g., 5, 6, 7, 8, 9, 10 out of 10 being 50%, 60%, 70%, 80%, 90%, and 100% complementary).
  • Perfectly complementary means that all the contiguous residues of a nucleic acid sequence will hydrogen bond with the same number of contiguous residues in a second nucleic acid sequence.
  • “Substantially complementary” as used herein refers to a degree of complementarity that is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100% over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, or more nucleotides, or refers to two nucleic acids that hybridize under stringent conditions.
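The percent complementarity and "substantially complementary" definitions above can be illustrated with a short sketch. It rests on stated assumptions: only Watson-Crick pairs are counted, both strands are written 5' to 3' and pair antiparallel without gaps, and the function names and default thresholds (60% over at least 8 nucleotides, the lowest values listed in the text) are chosen for illustration. Hybridization under stringent conditions, the alternative criterion, is not modeled.

```python
# Watson-Crick pairs only; non-traditional pairing (e.g., G-U wobble) is
# deliberately left out of this sketch.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"),
                ("A", "U"), ("U", "A")}


def percent_complementarity(seq: str, target: str) -> float:
    """Percentage of residues in `seq` that can pair with `target`.

    Assumptions: both strands are written 5' to 3', have equal length,
    and pair in antiparallel orientation without gaps.
    """
    if not seq or len(seq) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    # Reverse the target so that position i of seq faces its antiparallel partner.
    pairs = zip(seq.upper(), reversed(target.upper()))
    paired = sum(1 for a, b in pairs if (a, b) in WATSON_CRICK)
    return 100.0 * paired / len(seq)


def is_substantially_complementary(seq: str, target: str,
                                   min_percent: float = 60.0,
                                   min_length: int = 8) -> bool:
    """True if the region is long enough and meets the percent threshold."""
    return (len(seq) >= min_length
            and percent_complementarity(seq, target) >= min_percent)


print(percent_complementarity("ACGTACGTAC", "GTACGTACGT"))  # 100.0
print(percent_complementarity("ACGTACGTAC", "GTACGTACGA"))  # 90.0
```

Nine of ten pairing residues gives 90%, matching the "9 out of 10" case in the text.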
  • stringent conditions for hybridization refer to conditions under which a nucleic acid having complementarity to a target sequence predominantly hybridizes with the target sequence, and substantially does not hybridize to non-target sequences. Stringent conditions are generally sequence-dependent, and vary depending on a number of factors. In general, the longer the sequence, the higher the temperature at which the sequence specifically hybridizes to its target sequence. Non-limiting examples of stringent conditions are described in detail in Tijssen (1993), Laboratory Techniques In Biochemistry And Molecular Biology-Hybridization With Nucleic Acid Probes Part 1, Second Chapter “Overview of principles of hybridization and the strategy of nucleic acid probe assay”, Elsevier, N.Y.
  • Hybridization refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues.
  • the hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner.
  • the complex may comprise two strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these.
  • a hybridization reaction may constitute a step in a more extensive process, such as the initiation of PCR, or the cleavage of a polynucleotide by an enzyme.
  • a sequence capable of hybridizing with a given sequence is referred to as the “complement” of the given sequence.
  • expression refers to the process by which a polynucleotide is transcribed from a DNA template (such as into an mRNA or other RNA transcript) and/or the process by which a transcribed mRNA is subsequently translated into peptides, polypeptides, or proteins.
  • Transcripts and encoded polypeptides may be collectively referred to as “gene product.” If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell.
  • subject refers to a vertebrate, preferably a mammal, more preferably a human.
  • Mammals include, but are not limited to, murines, simians, humans, farm animals, sport animals, and pets. Tissues, cells and their progeny of a biological entity obtained in vivo or cultured in vitro are also encompassed.
  • therapeutic agent refers to a molecule or compound that confers some beneficial effect upon administration to a subject.
  • the beneficial effect includes enablement of diagnostic determinations; amelioration of a disease, symptom, disorder, or pathological condition; reducing or preventing the onset of a disease, symptom, disorder or condition; and generally counteracting a disease, symptom, disorder or pathological condition.
  • “treatment,” “treating,” “palliating,” and “ameliorating” are used interchangeably. These terms refer to an approach for obtaining beneficial or desired results including but not limited to a therapeutic benefit and/or a prophylactic benefit.
  • therapeutic benefit is meant any therapeutically relevant improvement in or effect on one or more diseases, conditions, or symptoms under treatment.
  • the compositions may be administered to a subject at risk of developing a particular disease, condition, or symptom, or to a subject reporting one or more of the physiological symptoms of a disease, even though the disease, condition, or symptom may not have yet been manifested.
  • an effective amount refers to the amount of an agent that is sufficient to effect beneficial or desired results.
  • the therapeutically effective amount may vary depending upon one or more of: the subject and disease condition being treated, the weight and age of the subject, the severity of the disease condition, the manner of administration and the like, which can readily be determined by one of ordinary skill in the art.
  • the term also applies to a dose that will provide an image for detection by any one of the imaging methods described herein.
  • the specific dose may vary depending on one or more of: the particular agent chosen, the dosing regimen to be followed, whether it is administered in combination with other compounds, timing of administration, the tissue to be imaged, and the physical delivery system in which it is carried.
  • the present invention relates to compositions and methods for using nucleic acids and/or polypeptides encoding a Cas9 fusion protein to enhance genomic editing.
  • the compositions are useful in genomic or nucleic acid modification in vitro, ex vivo, and in vivo for a variety of research, screening, and therapeutic applications.
  • the invention provides a nucleic acid encoding a fusion protein comprising a Cas9 polypeptide or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • the invention provides a nucleic acid encoding a fusion protein comprising a Cas9 polypeptide or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • the nucleic acid is DNA. In some embodiments, the nucleic acid is RNA. In some embodiments, the nucleic acid is a circular RNA molecule.
  • the invention provides a polypeptide encoding a fusion protein comprising a Cas9 polypeptide or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof. In some embodiments, the invention provides a polypeptide encoding a fusion protein comprising a Cas9 polypeptide or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • the nucleic acid or polypeptide encodes a fusion protein that comprises Cas9 or a variant or fragment thereof.
  • the Cas9 or variant or fragment thereof is not particularly limited.
  • Cas molecules of a variety of species can be used in nucleic acids, polypeptides, vectors and methods described herein.
  • the Cas9 is from Staphylococcus aureus. In some embodiments, the Cas9 is from S. pyogenes, S. thermophilus, or Neisseria meningitidis. Additional Cas9 species include: Acidovorax avenae, Actinobacillus pleuropneumoniae, Actinobacillus succinogenes, Actinobacillus suis, Actinomyces sp., Alicycliphilus denitrificans, Aminomonas paucivorans, Bacillus cereus, Bacillus smithii, Bacillus thuringiensis, Bacteroides sp., Blastopirellula marina, Bradyrhizobium sp., Brevibacillus laterosporus, Campylobacter coli, Campylobacter jejuni, Campylobacter lari, Candidatus Puniceispirillum, Clostridium cellulolyticum, Clostridium perfringens, Corynebacterium accolens, Corynebacterium diphtheriae, Corynebacterium matruchotii, Dinoroseobacter shibae, Eubacterium dolichum, gamma proteobacterium, Gluconacetobacter diazotrophicus, Haemophilus parainfluenzae
  • Cas9 molecule refers to a molecule that can interact with a gRNA molecule and, in concert with the gRNA molecule, localize (e.g., target or home) to a site which comprises a target domain and PAM sequence.
  • the Cas9 molecule is capable of cleaving a target nucleic acid molecule.
  • the ability of a Cas9 molecule to interact with and cleave a target nucleic acid is PAM sequence dependent.
  • a PAM sequence is a sequence in the target nucleic acid.
  • cleavage of the target nucleic acid occurs upstream from the PAM sequence.
  • Cas9 molecules from different bacterial species can recognize different sequence motifs (e.g., PAM sequences).
  • a Cas9 molecule of S. pyogenes recognizes the sequence motif NGG and directs cleavage of a target nucleic acid sequence 1 to 10, e.g., 3 to 5, base pairs upstream from that sequence.
  • a Cas9 molecule of N. meningitidis recognizes the sequence motif NNNNGATT and directs cleavage of a target nucleic acid sequence 1 to 10, e.g., 3 to 5, base pairs upstream from that sequence. See, e.g., Hou et al., PNAS Early Edition 2013, 1-6.
  • the ability of a Cas9 molecule to recognize a PAM sequence can be determined, e.g., using a transformation assay described in Jinek et al., Science 2012, 337:816.
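The relationship between a PAM motif and the upstream cleavage position described above can be sketched in silico. The sketch assumes a cut 3 base pairs upstream of the PAM (the low end of the "3 to 5" example range) and scans only the strand it is given. The function name `find_pam_sites` and its parameters are hypothetical, and the example sequences are made up.

```python
import re


def find_pam_sites(seq: str, pam_regex: str = "[ACGT]GG", cut_offset: int = 3):
    """Return (pam_start, cut_index) tuples for each PAM found on `seq`.

    `pam_start` is the 0-based index of the first PAM base. `cut_index` marks
    a cut between bases cut_index - 1 and cut_index, i.e., `cut_offset` bases
    upstream (5') of the PAM. Only the given strand is scanned; the reverse
    complement would need a second pass.
    """
    seq = seq.upper()
    sites = []
    # The lookahead lets overlapping PAM occurrences all be reported.
    for match in re.finditer(f"(?=({pam_regex}))", seq):
        pam_start = match.start()
        cut_index = pam_start - cut_offset
        if cut_index >= 0:
            sites.append((pam_start, cut_index))
    return sites


# NGG motif (S. pyogenes example in the text); hypothetical 15-nt sequence.
print(find_pam_sites("ATGCATGCATAGGTT"))  # [(10, 7)]

# NNNNGATT motif (N. meningitidis example in the text).
print(find_pam_sites("ACGTACGTGATTAC", pam_regex="[ACGT]{4}GATT"))  # [(4, 1)]
```

Passing the motif as a regular expression keeps the function independent of any one Cas9 species.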
  • Cas9 molecules include Cas9 molecules of a cluster 1 bacterial family, cluster 2 bacterial family, cluster 3 bacterial family, cluster 4 bacterial family, cluster 5 bacterial family, cluster 6 bacterial family, a cluster 7 bacterial family, a cluster 8 bacterial family, a cluster 9 bacterial family, a cluster 10 bacterial family, a cluster 11 bacterial family, a cluster 12 bacterial family, a cluster 13 bacterial family, a cluster 14 bacterial family, a cluster 15 bacterial family, a cluster 16 bacterial family, a cluster 17 bacterial family, a cluster 18 bacterial family, a cluster 19 bacterial family, a cluster 20 bacterial family, a cluster 21 bacterial family, a cluster 22 bacterial family, a cluster 23 bacterial family, a cluster 24 bacterial family, a cluster 25 bacterial family, a cluster 26 bacterial family, a
  • Exemplary naturally occurring Cas9 molecules include a Cas9 molecule of a cluster 1 bacterial family.
  • Examples include a Cas9 molecule of: S. pyogenes (e.g., strain SF370, MGAS10270, MGAS10750, MGAS2096, MGAS315, MGAS5005, MGAS6180, MGAS9429, NZ131 and SSI-1), S. thermophilus (e.g., strain LMD-9), S. pseudoporcinus (e.g., strain SPIN 20026), S. mutans (e.g., strain UA159, NN2025), S. macacae (e.g., strain NCTC11558), S. gallolyticus (e.g., strain UCN34, ATCC BAA-2069), S. equinus (e.g., strain ATCC 9812, MGCS 124), S. dysgalactiae (e.g., strain GGS 124), S. bovis (e.g., strain ATCC 7003378), S. anginosus (e.g., strain F0211), S. agalactiae (e.g., strain NEM316, A909), Listeria monocytogenes (e.g., strain F6854), Listeria innocua L.
  • Additional exemplary Cas9 molecules are a Cas9 molecule of Neisseria meningitidis (Hou et al. PNAS Early Edition 2013, 1-6) and an S. aureus Cas9 molecule.
  • the polynucleotide may include the coding sequence for the full-length polypeptide or a fragment thereof, by itself; or the coding sequence for the full-length polypeptide or fragment in reading frame with other coding sequences, such as those encoding a leader or secretory sequence, a pre-, pro-, or prepro-protein sequence, a nuclear localization signal, or other fusion peptide portions.
  • the polynucleotide may also contain non-coding 5' and 3' sequences, such as transcribed, non-translated sequences, signals, ribosome binding sites and sequences that stabilize mRNA.
  • the nucleic acid sequence of Cas9, or variant or fragment thereof contains a nucleotide sequence that is highly identical, at least 90% identical, with a nucleotide sequence encoding Cas9 polypeptide.
  • the nucleic acid sequence of Cas9 comprises a nucleotide sequence that is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical with the encoding nucleotide sequence set forth in SEQ ID NOS:2, 36, 14 or 37.
  • the molecule is a wild-type Cas9.
  • the Cas9 is a wild-type Cas9 from Streptococcus pyogenes and comprises SEQ ID NO: 1.
  • the expression vector encodes a variant of the Cas9 protein referred to herein as a high fidelity Cas9.
  • the high fidelity Cas9 comprises an amino acid sequence comprising SEQ ID NO: 13.
  • the Cas9 comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to SEQ ID NOS: 1 or 13.
  • the nucleotide sequence encoding Cas9 or a biologically active fragment or derivative thereof includes nucleic acid molecules comprising a polynucleotide having a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to a nucleotide sequence encoding Cas9 having the amino acid sequence in SEQ ID NO: 1 or 13.
  • the Cas9 portion of the fusion protein is encoded by a nucleic acid sequence at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to SEQ ID NOS:2, 36, 14 or 37, which lacks the codon for the N-terminal methionine.
  • SEQ ID NO:13 is encoded by SEQ ID NO: 14 or SEQ ID NO:37.
  • the nucleic acid or polypeptide encodes a biologically active fragment of Cas9 protein.
  • a Cas9 molecule comprises an amino acid sequence having at least 90%, 95%, 96%, 97%, 98%, or 99% identity with SEQ ID NOS:1 or 13 (either full length or lacking the N-terminal methionine) or a naturally occurring Cas9 molecule sequence, e.g., a Cas9 molecule from a species listed herein or described in Chylinski et al., RNA Biology 2013, 10:5, 727-737; Hou et al. PNAS Early Edition 2013, 1-6.
  • the Cas9 polypeptide comprises an amino acid sequence that differs from a sequence of SEQ ID NOS:1 or 13 by as many as 1, but no more than 2, 3, 4, or 5 residues.
  • the nucleic acid or fusion protein encodes a Cas9 fragment.
  • a fragment is a polypeptide having an amino acid sequence that is entirely the same as part, but not all, of the amino acid sequence of a Cas9 polypeptide or variant. Fragments may be continuous or discontinuous. In some embodiments, the fragment may constitute about 1000 contiguous amino acids identified in SEQ ID NOS:1 or 13. In some embodiments, the fragment is about 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1360, 1361, 1362, 1363, 1364, 1365, or 1367 contiguous amino acids identified in SEQ ID NOS:1 or 13. In some embodiments, the fragment comprises SEQ ID NO:1 but lacks the N-terminal methionine residue (i.e., comprises amino acids 2-1368 of SEQ ID NO:1).
  • the fragments include, for example, truncation polypeptides having the amino acid sequence of Cas9 polypeptides, except for deletion of a continuous series of residues that includes the amino terminus, or a continuous series of residues that includes the carboxyl terminus or deletion of two continuous series of residues, one including the amino terminus and one including the carboxyl terminus.
  • Naturally occurring Cas9 molecules possess a number of properties, including: nickase activity, nuclease activity (e.g., endonuclease and/or exonuclease activity); helicase activity; the ability to associate functionally with a gRNA molecule; and the ability to target (or localize to) a site on a nucleic acid (e.g., PAM recognition and specificity).
  • a Cas9 molecule can include all or a subset of these properties.
  • Cas9 molecules have the ability to interact with a gRNA molecule and, in concert with the gRNA molecule, localize to a site in a nucleic acid.
  • Other activities, e.g., PAM specificity, cleavage activity, or helicase activity, can vary more widely in Cas9 molecules.
  • Cas9 molecules with desired properties can be made in a number of ways, e.g., by alteration of a parental, naturally occurring Cas9 molecule to provide an altered Cas9 molecule having a desired property.
  • One or more mutations or differences relative to a parental Cas9 molecule can be introduced. Such mutations and differences can comprise substitutions (e.g., conservative substitutions or substitutions of non-essential amino acids); insertions; or deletions.
  • a Cas9 molecule can comprise one or more mutations or differences, e.g., at least 1, 2, 3, 4, 5, 10, 15, 20, 30, 40 or 50 mutations but less than 200, 100, or 80 mutations relative to a reference Cas9 molecule.
  • Candidate Cas9 molecules can be evaluated by art-known methods or as described herein. For example, exemplary methods for evaluating the endonuclease activity of Cas9 molecule are described, e.g., in Jinek et al., Science 2012; 337(6096):816-821.
  • the nucleic acids or fusion proteins comprise a nucleic acid sequence encoding CtIP or a variant or fragment thereof.
  • CtIP is a DNA repair protein. See, e.g., Huertas et al., J. Biol. Chem. 284, 9558-9565 (2009) and Charpentier, which are incorporated by reference herein.
  • human CtIP is an 897 amino acid protein identified in NCBI Accession No. Q99708.
  • CtIP has an amino acid sequence of SEQ ID NO: 10.
  • the CtIP, variant or fragment thereof is from a mammal, such as human, mouse, rat, or the like.
  • the nucleic acid or fusion protein encodes a fragment of CtIP.
  • a fragment is a polypeptide having an amino acid sequence that is entirely the same as part, but not all, of the amino acid sequence of a CtIP polypeptide or variant. Fragments may be continuous or discontinuous.
  • the nucleic acid or fusion protein encodes a CtIP fragment comprising an amino acid sequence of SEQ ID NO:3, corresponding to amino acids 1-296 of CtIP.
  • SEQ ID NO:3 is encoded by a nucleotide sequence comprising SEQ ID NOS:4 or 38.
  • the fragment may constitute about 150 contiguous amino acids identified in SEQ ID NOS:3 or 10. In some embodiments, the fragment is about 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, or 290 contiguous amino acids or more identified in SEQ ID NOS:3 or 10.
  • the fragments include, for example, truncation polypeptides having the amino acid sequence of CtIP polypeptides, except for deletion of a continuous series of residues that includes the amino terminus, or a continuous series of residues that includes the carboxyl terminus or deletion of two continuous series of residues, one including the amino terminus and one including the carboxyl terminus.
  • the fragment is a homology-directed repair enhancing (HDR Enhancing) N-terminal fragment of CtIP.
  • the HDR Enhancing N-terminal fragment of CtIP comprises an amino acid sequence at least 90% identical to SEQ ID NOS:3 or 10 (amino acids 1-296).
  • the HDR Enhancing N-terminal fragment of CtIP comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to SEQ ID NO:3 or amino acids 1-296 of SEQ ID NO:10.
  • the homology-directed repair enhancing (HDR Enhancing) N-terminal fragment of CtIP comprises one or more mutations or differences, e.g., at least 1, 2, 3, 4, 5, 10, 15, 20, 30, 40 or 50 mutations but less than 200, 100, or 80 mutations relative to SEQ ID NO:3 or amino acids 1-296 of SEQ ID NO:10.
  • the homology-directed repair enhancing (HDR Enhancing) N-terminal fragment of CtIP comprises an amino acid sequence that differs from a sequence of SEQ ID NO:3 or amino acids 1-296 of SEQ ID NO:10 by as many as 1, but no more than 2, 3, 4, or 5 residues.
  • the nucleic acid sequence of CtIP, a variant or fragment thereof contains a nucleotide sequence that is highly identical, at least 90% identical, with a nucleotide sequence encoding a CtIP polypeptide or a variant or fragment thereof.
  • the nucleic acid sequence of CtIP, a variant or fragment thereof comprises a nucleotide sequence that is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical with the encoding nucleotide sequence set forth in SEQ ID NOS:4 or 38.
  • the nucleic acid or fusion protein comprises Rad18 or a variant or fragment thereof or Rad52 or a variant or fragment thereof.
  • Rad18 is an E3 ubiquitin-protein ligase involved in postreplication repair of UV-damaged DNA. See Nambiar et al., Nat. Commun. 10, 3395 (2019), which is incorporated by reference herein.
  • human Rad18 is a 495 amino acid protein identified in NCBI Accession No. AAF86618.
  • Rad18 has an amino acid sequence of SEQ ID NO:11.
  • Rad52 or a variant or fragment thereof can be used in place of the Rad18 component in the fusion protein.
  • the Rad52 or a variant or fragment thereof has an amino acid sequence of SEQ ID NO: 15.
  • SEQ ID NO: 15 is encoded by a nucleotide sequence comprising SEQ ID NOS: 16 or 39.
  • the Rad18 or Rad52, variant or fragment thereof is from a mammal, such as human, mouse, rat, or the like.
  • the nucleic acid or fusion protein encodes a Rad18 fragment.
  • a fragment is a polypeptide having an amino acid sequence that is entirely the same as part, but not all, of the amino acid sequence of a Rad18 polypeptide or variant. Fragments may be continuous or discontinuous.
  • the fragment of Rad18 comprises a deletion of a putative DNA-binding domain, and optionally lacks the N-terminal methionine.
  • the fragment of Rad18 (also referred to herein as eRad18) has a deletion in amino acids 242-282 in Rad18 and comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO:5 or SEQ ID NO:43.
  • the fragment may constitute about 150 contiguous amino acids identified in SEQ ID NOS:5 or 11. In some embodiments, the fragment is about 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, or 450 contiguous amino acids or more identified in SEQ ID NO:5 or SEQ ID NO:43.
  • the fragments include, for example, truncation polypeptides having the amino acid sequence of Rad18 polypeptides, except for deletion of a continuous series of residues that includes the amino terminus, or a continuous series of residues that includes the carboxyl terminus or deletion of two continuous series of residues, one including the amino terminus and one including the carboxyl terminus.
  • the Rad18 variant or fragment comprises one or more mutations or differences, e.g., at least 1, 2, 3, 4, 5, 10, 15, 20, 30, 40 or 50 mutations but less than 200, 100, or 80 mutations relative to SEQ ID NO:5 or SEQ ID NO:43.
  • the Rad18 variant or fragment comprises an amino acid sequence that differs from a sequence of SEQ ID NO:5 or SEQ ID NO:43 by as many as 1, but no more than 2, 3, 4, or 5 residues.
  • the nucleic acid sequence of Rad18, a variant or fragment thereof contains a nucleotide sequence that is highly identical, at least 90% identical, with a nucleotide sequence encoding a Rad18 polypeptide or a variant or fragment thereof.
  • the nucleic acid sequence of Rad18, a variant or fragment thereof comprises a nucleotide sequence that is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical with the encoding nucleotide sequence set forth in SEQ ID NOS:6 or 40.
  • the Rad18 or a variant or fragment thereof is fused to the amino terminus of the Cas9 or a variant or fragment thereof and the CtIP or a variant or fragment thereof is fused to the carboxy terminus of Cas9.
  • the Rad52 or a variant or fragment thereof is fused to the amino terminus of the Cas9 or a variant or fragment thereof and the CtIP or a variant or fragment thereof is fused to the carboxy terminus of Cas9.
  • the fusion protein comprises one or more linker sequences between Rad18 or a variant or fragment thereof, Cas9 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • the linker sequence comprises a nuclear localization sequence.
  • the linker sequence comprises GAAPKKKRKVGIHGVPAA (SEQ ID NO:21) and/or KRPAATKKAGQAKKKKEFGSGGAAS (SEQ ID NO:22).
  • GAAPKKKRKVGIHGVPAA (SEQ ID NO:21) is a linker between the Rad52 or eRad18 portion and the Cas9 portion of the fusion protein.
  • KRPAATKKAGQAKKKKEFGSGGAAS (SEQ ID NO:22) is a linker between the Cas9 portion and the CtIP portion of the fusion protein.
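The domain order and linker placement described above (Rad18 or Rad52 at the amino terminus, Cas9 in the middle, CtIP at the carboxy terminus) can be sketched as a simple string assembly. The two linker sequences are taken from the text. The function name `assemble_fusion`, its parameters, and the short domain strings in the usage example are hypothetical placeholders, not the sequences of SEQ ID NOS:1, 3, 5 or 15.

```python
LINKER_N = "GAAPKKKRKVGIHGVPAA"          # SEQ ID NO:21, between Rad52/eRad18 and Cas9
LINKER_C = "KRPAATKKAGQAKKKKEFGSGGAAS"   # SEQ ID NO:22, between Cas9 and CtIP


def assemble_fusion(repair_domain: str, cas9: str, ctip: str,
                    add_start_met: bool = False) -> str:
    """Concatenate domains N- to C-terminus with the two linkers.

    `repair_domain` stands for the Rad18 (eRad18) or Rad52 portion.
    `add_start_met` prepends a methionine at position 1, mirroring the
    optional N-terminal methionine mentioned for the fusion sequences.
    """
    fusion = repair_domain + LINKER_N + cas9 + LINKER_C + ctip
    if add_start_met:
        fusion = "M" + fusion
    return fusion


# Placeholder domains for illustration only; real domains are far longer.
fusion = assemble_fusion("RRRR", "CCCC", "TTTT", add_start_met=True)
print(fusion)
print(len(fusion))
```

Keeping the linkers as named constants makes it easy to see where each one sits in the final sequence.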
  • the fusion protein (eRad18-Cas9-CtIP) has an amino acid sequence that is at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to an amino acid sequence comprising SEQ ID NO:17.
  • an N-terminal methionine is added to the amino acid sequence of SEQ ID NO:17 at the 1 position.
  • the nucleic acid encoding a fusion protein has a nucleotide sequence that is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to a nucleotide sequence comprising SEQ ID NOS: 18 or 41.
  • an N-terminal start methionine codon (AUG/ATG) is added to the sequence of SEQ ID NOS:18 or 41.
  • the fusion protein (Rad52-Cas9-CtIP) has an amino acid sequence that is at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to an amino acid sequence comprising SEQ ID NO:19.
  • an N-terminal methionine is added to the amino acid sequence of SEQ ID NO: 19 at the 1 position.
  • the nucleic acid encoding a fusion protein has a nucleotide sequence that is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to a nucleotide sequence comprising SEQ ID NOS:20 or 42.
  • an N-terminal start methionine codon (AUG/ATG) is added to the sequence of SEQ ID NOS:20 or 42.
  • the fusion protein may comprise any additional protein sequence, and optionally a linker sequence between any two domains.
  • protein domains that may be included in the fusion protein include, without limitation, epitope tags, reporter gene sequences, and nucleic acid repair proteins described herein.
  • the fusion protein can include any sequence encoding a protein or a fragment of a protein that binds DNA molecules or binds other cellular molecules, including but not limited to maltose binding protein (MBP), S-tag, Lex A DNA binding domain (DBD) fusions, GAL4 DNA binding domain fusions, and herpes simplex virus (HSV) VP16 protein fusions.
  • the fusion protein comprises one or more affinity or purification tags.
  • the affinity or purification tag comprises glutathione S-transferase.
  • the affinity or purification tag comprises a maltose binding protein (MBP) or a variant or fragment thereof.
  • the glutathione S-transferase comprises SEQ ID NO:23.
  • the glutathione S-transferase is encoded by SEQ ID NO:24.
  • the maltose binding protein or a variant or fragment thereof comprises SEQ ID NO:25.
  • the maltose binding protein or a variant or fragment thereof is encoded by SEQ ID NO:26.
  • the affinity or purification tag comprises glutathione S-transferase and a maltose binding protein or a variant or fragment thereof.
  • the fusion protein comprises an N-terminal GST and MBP tandem affinity tag.
  • the expression vector comprises a linker sequence following the affinity/purification tag and prior to any cleavage site.
  • the linker is an N10 linker sequence (SEQ ID NO:27), and in some embodiments, the N10 linker is encoded by SEQ ID NO:28.
  • the affinity/purification tag and optional linker sequence is followed by a protease recognition site.
  • the recognition site is an HRV 3C protease recognition site.
  • the HRV 3C protease recognition site comprises the amino acid sequence LEVLFQGP (SEQ ID NO: 12), where cleavage occurs between the Q and G residues.
  • SEQ ID NO: 12 is encoded by SEQ ID NO:29.
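The tag removal step implied above, cleavage at the HRV 3C recognition site between the Q and G residues of LEVLFQGP, can be illustrated in silico. The function name `cleave_hrv3c` and the tag and payload strings in the usage example are hypothetical. The sketch models only the position of the cut, not protease kinetics or site accessibility.

```python
HRV3C_SITE = "LEVLFQGP"  # SEQ ID NO:12; cleavage occurs between Q and G


def cleave_hrv3c(protein: str) -> list:
    """Split `protein` at every HRV 3C site, cutting between Q and G.

    The fragment upstream of each cut ends in ...LEVLFQ and the fragment
    downstream starts with GP..., so two residues of the site remain on the
    released protein.
    """
    fragments = []
    start = 0
    idx = protein.find(HRV3C_SITE)
    while idx != -1:
        # Index of the G residue; the cut falls immediately before it.
        cut = idx + HRV3C_SITE.index("QG") + 1
        fragments.append(protein[start:cut])
        start = cut
        idx = protein.find(HRV3C_SITE, cut)
    fragments.append(protein[start:])
    return fragments


# Hypothetical tag ("HHHHHH") and payload ("MKVAAS") for illustration only.
print(cleave_hrv3c("HHHHHH" + HRV3C_SITE + "MKVAAS"))
# ['HHHHHHLEVLFQ', 'GPMKVAAS']
```

A protein without the site is returned unchanged as a single fragment.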
  • the fusion protein comprises a nuclear localization signal (NLS).
  • the NLS is from SV40 and comprises SEQ ID NO:30, which in some embodiments is encoded by SEQ ID NO:31.
  • the NLS is followed by a fragment of Rad18 that has a deletion of a putative DNA-binding domain and lacks the N-terminal methionine, encoded by SEQ ID NO:5; wild-type Cas9 from Streptococcus pyogenes comprising amino acids 2-1368 of SEQ ID NO:1; and a homology-directed repair enhancing (HDR Enhancing) N-terminal fragment of CtIP encoded by SEQ ID NO:3.
  • the invention provides a vector encoding nucleic acids and fusion proteins herein.
  • the vector comprises a nucleic acid encoding a fusion protein comprising a Cas9 polypeptide or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • the vector comprises a nucleic acid encoding a fusion protein comprising a Cas9 polypeptide or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • expression of the fusion protein in a mammalian cell can be achieved by transient transfection of a vector encoding the fusion protein into the cell.
  • expression of the fusion protein in a mammalian cell can be achieved by integration of the polynucleotides containing the fusion protein into the nuclear genome of the mammalian cell.
  • a variety of vectors and systems for the delivery and integration of polynucleotides encoding exogenous proteins into the nuclear DNA of a mammalian cell have been developed. Examples of expression vectors are disclosed in, e.g., WO 1994/011026, which is incorporated herein by reference.
  • expression vectors for use in the compositions and methods described herein contain a polynucleotide sequence of the fusion protein, as well as, e.g., additional sequence elements used for the expression of these agents and/or the integration of these polynucleotide sequences into the genome of a mammalian cell.
  • Certain vectors that can be used for the expression of the fusion protein include plasmids that contain regulatory sequences, such as promoter and enhancer regions, which direct gene transcription.
  • Other useful vectors for expression of the fusion protein contain polynucleotide sequences that enhance the rate of translation of these genes or improve the stability or nuclear export of the mRNA that results from gene transcription.
  • sequence elements include, e.g., 5’ and 3’ untranslated regions and a polyadenylation signal site in order to direct efficient transcription of the gene carried on the expression vector.
  • the expression vectors suitable for use with the compositions and methods described herein may also contain a polynucleotide encoding a marker for selection of cells that contain such a vector. Examples of a suitable marker include genes that encode resistance to antibiotics, such as ampicillin, chloramphenicol, kanamycin, or nourseothricin.
  • for therapeutic application in the treatment of conditions described herein, or for modifying the genome, nucleic acids or vectors can be directed to the interior of the cell and, in particular, to specific cell types.
  • Nucleic acids and vectors can be introduced into a cell by a variety of methods, including transformation, transfection, transduction, direct uptake, projectile bombardment, and by encapsulation of the vector or nucleic acid in a liposome or nanoparticle, such as a lipid nanoparticle.
  • suitable methods of transfecting or transforming cells include calcium phosphate precipitation, electroporation, microinjection, infection, lipofection and direct uptake. Such methods are described in more detail, for example, in Green, et al.
  • the fusion protein can also be introduced into a mammalian cell by targeting vectors.
  • vectors can be targeted to the phospholipids on the extracellular surface of the cell membrane by linking the vector molecule to a VSV-G protein, a viral protein with affinity for all cell membrane phospholipids.
  • Recognition and binding of the polynucleotide encoding the fusion protein by mammalian RNA polymerase is important for gene expression. As such, one may include sequence elements within the polynucleotide that exhibit a high affinity for transcription factors that recruit RNA polymerase and promote the assembly of the transcription complex at the transcription initiation site.
  • sequence elements include, e.g., a mammalian promoter, the sequence of which can be recognized and bound by specific transcription initiation factors and ultimately RNA polymerase.
  • Polynucleotides suitable for use in the compositions and methods described herein also include those that encode the fusion protein downstream of a mammalian promoter. Promoters that are useful for expression in mammalian cells include ubiquitous promoters such as the CAG promoter or the cytomegalovirus (CMV) promoter. Cell type and tissue specific promoters can also be utilized.
  • promoters derived from viral genomes can also be used for the stable expression of these agents in mammalian cells.
  • functional viral promoters that can be used to promote mammalian expression of these agents include adenovirus late promoter, vaccinia virus 7.5K promoter, SV40 promoter, tk promoter of HSV, mouse mammary tumor virus (MMTV) promoter, LTR promoter of HIV, promoter of Moloney virus, Epstein-Barr virus (EBV) promoter, and the Rous sarcoma virus (RSV) promoter.
  • the fusion protein is delivered by a viral vector.
  • a “viral vector” is a virus that can be used to deliver genetic material into target cells. This can be done either in vivo or in vitro. In general, viral vectors are either inherently safe or are modified to present a low handling risk and have low toxicity with respect to the targeted cells.
  • a “retrovirus” is a virus of the family Retroviridae that uses a reverse transcriptase enzyme to produce DNA from its RNA genome and then inserts that DNA copy into the DNA of a host cell. Retroviruses are known in the art to be useful in gene delivery systems.
  • a “lentivirus” is a type of retrovirus; they are known as slow retroviruses.
  • An “adenovirus” is a virus of the family Adenoviridae that lacks an outer lipid bilayer and includes a double stranded DNA genome. Adenoviruses are well established in the art as viral vectors for gene therapy, and delivering genes coding proteins of interest to particular locations, as to selected cell types, is possible.
  • An “adeno-associated virus” is of the genus Dependoparvovirus, which is of the family Parvoviridae. These are nonenveloped viruses having a single-stranded DNA genome. Adeno-associated viruses are well known in the art as attractive candidates for use as viral vectors for gene therapy. Unlike adenoviruses, they have the advantage that they do not cause disease.
  • nucleic acids of the compositions and methods described herein are incorporated into recombinant AAV (rAAV) vectors and/or virions in order to facilitate their introduction into a cell.
  • rAAV vectors useful in the compositions and methods described herein are recombinant nucleic acid constructs that include (1) a heterologous sequence to be expressed (e.g., a polynucleotide encoding the fusion protein) and (2) viral sequences that facilitate stability and expression of the heterologous genes.
  • the viral sequences may include those sequences of AAV that are required in cis for replication and packaging (e.g., functional ITRs) of the DNA into a virion.
  • Such rAAV vectors may also contain marker or reporter genes.
  • useful rAAV vectors have one or more of the AAV wild-type genes deleted in whole or in part but retain functional flanking ITR sequences.
  • the AAV ITRs may be of any serotype suitable for a particular application.
  • the ITRs can be AAV2 ITRs. Methods for using rAAV vectors are described, for example, in Tal et al., J. Biomed. Sci. 7:279 (2000), and Monahan and Samulski, Gene Delivery 7:24 (2000), the disclosures of each of which are incorporated herein by reference as they pertain to AAV vectors for gene delivery.
  • the nucleic acids and vectors described herein can be incorporated into a rAAV virion in order to facilitate introduction of the nucleic acid or vector into a cell.
  • the capsid proteins of AAV compose the exterior, non-nucleic acid portion of the virion and are encoded by the AAV cap gene.
  • the cap gene encodes three viral coat proteins, VP1, VP2 and VP3, which are required for virion assembly.
  • the construction of rAAV virions has been described, for instance, in U.S. Pat. Nos. 5,173,414; 5,139,941; 5,863,541; 5,869,305; 6,057,152; and 6,376,237; as well as in Rabinowitz et al., J. Virol.
  • rAAV virions useful in conjunction with the compositions and methods described herein include those derived from a variety of AAV serotypes including, without limitation, AAV1, AAV2, AAV2quad(Y-F), AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, rh10, rh39, rh43, rh74, Anc80, Anc80L65, DJ/8, DJ/9, 7m8, PHP.B, PHP.eb, and PHP.S.
  • pseudotyped rAAV vectors include AAV vectors of a given serotype (e.g., AAV9) pseudotyped with a capsid gene derived from a serotype other than the given serotype (e.g., AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, etc.).
  • Techniques involving the construction and use of pseudotyped rAAV virions are known in the art and are described, for instance, in Duan et al., J. Virol. 75:7662 (2001); Halbert et al., J. Virol. 74:1524 (2000); Zolotukhin et al., Methods, 28:158 (2002); and Auricchio et al., Hum. Molec. Genet. 10:3075 (2001).
  • AAV virions that have mutations within the virion capsid may be used to infect particular cell types more effectively than non-mutated capsid virions.
  • suitable AAV mutants may have ligand insertion mutations for the facilitation of targeting AAV to specific cell types.
  • the construction and characterization of AAV capsid mutants including insertion mutants, alanine screening mutants, and epitope tag mutants is described in Wu et al., J. Virol. 74:8635 (2000).
  • Other rAAV virions that can be used in methods described herein include those capsid hybrids that are generated by molecular breeding of viruses as well as by exon shuffling. See, e.g., Soong et al., Nat. Genet., 25:436 (2000) and Kolman and Stemmer, Nat. Biotechnol. 19:423 (2001).
  • the vector is a yeast expression vector.
  • examples of vectors for expression in the yeast Saccharomyces cerevisiae include pYepSec1 (Baldari, et al., 1987. EMBO J 6: 229-234), pMFa (Kurjan and Herskowitz, 1982. Cell 30: 933-943), pJRY88 (Schultz et al., 1987. Gene 54: 113-123), pYES2 (Invitrogen Corporation, San Diego, Calif.), and picZ (Invitrogen Corp., San Diego, Calif.).
  • the vector drives protein expression in insect cells using baculovirus expression vectors.
  • Baculovirus vectors available for expression of proteins in cultured insect cells include the pAc series (Smith, et al., 1983. Mol. Cell. Biol. 3: 2156-2165) and the pVL series (Luckow and Summers, 1989. Virology 170: 31-39).
  • the vector is a prokaryotic expression vector comprising a promoter sequence operably linked to a nucleotide sequence encoding a fusion protein, wherein the fusion protein comprises i) Cas9 or a variant or fragment thereof; ii) Rad18 or a variant or fragment thereof; and iii) CtIP or a variant or fragment thereof.
  • the vector is a prokaryotic expression vector comprising a promoter sequence operably linked to a nucleotide sequence encoding a fusion protein, wherein the fusion protein comprises i) Cas9 or a variant or fragment thereof; ii) Rad52 or a variant or fragment thereof; and iii) CtIP or a variant or fragment thereof.
  • Appropriate prokaryotic vectors are typically equipped with a selectable marker-encoding nucleic acid sequence, insertion sites, and suitable control elements, such as termination sequences.
  • the vectors comprise regulatory sequences, including, for example, control elements (i.e., promoter and terminator elements or 5' and/or 3' untranslated regions), effective for expression of the coding sequence in host cells (and/or in a vector or host cell environment in which a modified protein coding sequence is not normally expressed), operably linked to the coding sequence.
  • the expression vector is derived from the pGEX-6P-1 commercial vector from Addgene.
  • the prokaryotic expression vector comprises a plasmid.
  • the term “plasmid” refers to a circular double-stranded (ds) DNA construct used as a cloning vector, which forms an extrachromosomal self-replicating genetic element in some eukaryotes or prokaryotes, or integrates into the host chromosome.
  • the expression vector is a bacterial expression vector plasmid. In some embodiments, the expression vector is capable of expressing the fusion protein in Escherichia coli.
  • expression refers to the process by which a polypeptide is produced based on the nucleic acid sequence of a gene.
  • the process includes both transcription and translation.
  • expression of the fusion protein refers to transcription and translation of the fusion protein to be expressed, the products of which can include precursor RNA, mRNA, polypeptide, post-translation processed polypeptide, and derivatives thereof.
  • the terms “vector” and “cloning vector” refer to nucleic acid constructs designed to transfer nucleic acid sequences into cells.
  • expression vector refers to nucleic acid constructs generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a target cell.
  • the vector comprises a recombinant expression cassette, and can be incorporated into a plasmid, chromosome, virus, or nucleic acid fragment.
  • the recombinant expression cassette portion of an expression vector includes, among other sequences, a nucleic acid sequence to be transcribed and a promoter.
  • the expression cassette comprises a promoter sequence operably linked to a nucleotide sequence encoding a fusion protein.
  • a "promoter sequence” refers to a DNA sequence which is recognized by a cell for expression purposes.
  • Exemplary promoters include both constitutive promoters and inducible promoters. Such promoters are well known to those of skill in the art. Those skilled in the art are also aware that a natural promoter can be modified by replacement, substitution, addition or elimination of one or more nucleotides without changing its function. The practice of the present invention encompasses and is not constrained by such alterations to the promoter.
  • it is operably linked to a DNA sequence encoding the fusion polypeptide. Such linkage comprises positioning of the promoter with respect to the translation initiation codon of the DNA sequence encoding the fusion DNA sequence.
  • the promoter is an inducible promoter.
  • the promoter is induced by a change in temperature, e.g., an increase in temperature from 37 degrees Celsius to 42 degrees Celsius.
  • the promoter is induced by an agent, such as a small molecule such as IPTG.
  • the inducible promoter is a lacI or lacZ promoter.
  • the lacI gene may also be present in the system.
  • the lacI gene (usually a constitutively expressed gene) encodes the Lac repressor protein (LacI), which binds to the lac operator of the lac family promoter. Therefore, in some embodiments, when the lac family promoter is utilized, the lacI gene can also be included and expressed in the expression system.
  • the expression vector comprises a lac operator.
  • the lac operator comprises SEQ ID NO:32.
  • the lac operator is derived from the pGEX-6P-1 commercial vector (Addgene). An operator sequence located at the 5' end serves as a binding site for a repressor protein that blocks RNA polymerase.
  • the expression vector comprises a tac promoter.
  • the tac promoter comprises SEQ ID NO:33.
  • the tac promoter is derived from the pGEX-6P-1 commercial vector (Addgene).
  • a nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic acid sequence.
  • DNA encoding a secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide;
  • a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or
  • a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation.
  • Operably linked DNA sequences are usually contiguous, and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers do not have to be contiguous. Linking is typically accomplished by ligation at convenient restriction sites. If such sites do not exist, in some embodiments, synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice.
  • the expression constructs of the invention encode a recombinant fusion protein fused to a secretory leader capable of transporting the recombinant fusion protein to the cytoplasm of cells.
  • the expression construct encodes a recombinant fusion protein fused to a secretory leader capable of transporting the recombinant fusion protein to the periplasm.
  • the secretory leader is cleaved from the recombinant fusion protein.
  • other regulatory elements that can be included in the expression construct include, but are not limited to, transcription enhancer sequences, translation enhancer sequences, other promoters, activators, translation start and stop signals, transcription terminators, cistron regulators, polycistronic regulators, and tag sequences, such as nucleotide sequence “tags” and “tag” polypeptide coding sequences that facilitate identification, separation, purification, and/or isolation of the polypeptide.
  • the expression construct comprises, in addition to the protein coding sequence, one or more of the following regulatory elements operably linked thereto: promoter, ribosome binding site (RBS), transcription terminator, and translation initiation and termination signals.
  • Useful RBSs can also be obtained from any of the species useful as host cells in, for example, the expression systems of US Patent Application Publication Nos. 2008/0269070 and 2010/0137162. Many specific and various consensus RBSs are known. See Frishman et al., Gene 234 (2): 257-65 (8 Jul. 1999); and B. E. Suzek et al., Bioinformatics 17 (12): 1123-30 (December 2001).
  • a secretory signal or leader coding sequence is fused to the N-terminus of the sequence encoding the recombinant fusion protein.
  • the use of a secretory signal sequence can increase the production of recombinant proteins in bacteria.
  • By utilizing a secretory leader it is possible to increase the yield of properly folded proteins by secreting proteins from the intracellular environment.
  • proteins secreted from the cytoplasm may end up in the periplasmic space, attached to the outer membrane, or in the extracellular culture medium. These methods also avoid the formation of inclusion bodies.
  • the recombinant fusion protein is targeted to the periplasm or the extracellular space of a host cell.
  • the expression vector further comprises a nucleotide sequence encoding a secretory signal polypeptide operably linked to a nucleotide sequence encoding the recombinant fusion protein.
  • the expression vector further comprises a transcription termination signal downstream of a nucleotide sequence encoding the fusion protein.
  • the term “terminator sequence” refers to a DNA sequence which is recognized by the expression host to terminate transcription. It is operably linked to the 3' end of the fusion DNA encoding the fusion polypeptide to be expressed.
  • the termination region is obtained from the same gene as the promoter sequence, while in other embodiments it is obtained from another gene. The selection of suitable transcription termination signals is well-known to those of skill in the art.
  • the expression vector comprises a rrnB T1 terminator comprising SEQ ID NO:34.
  • the vector comprises a T7Te terminator comprising SEQ ID NO: 35.
  • the vector comprises a rrnB T1 terminator sequence followed by a T7Te terminator sequence.
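The ordering constraints described above (promoter upstream of the coding sequence, terminators downstream) can be sketched as a small layout check. This is an illustrative sketch only: the `Element` class, the `check_cassette` function, and the example element order are hypothetical and are not taken from the vector of FIG. 7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    kind: str  # one of: "promoter", "operator", "rbs", "cds", "terminator"


def check_cassette(elements):
    """Return a list of layout problems; an empty list means the 5'->3' order is consistent."""
    kinds = [e.kind for e in elements]
    problems = []
    if "promoter" not in kinds:
        problems.append("no promoter")
    if "cds" not in kinds:
        problems.append("no coding sequence")
    if problems:
        return problems
    cds_at = kinds.index("cds")
    if kinds.index("promoter") > cds_at:
        problems.append("promoter is not upstream of the coding sequence")
    if "rbs" in kinds and kinds.index("rbs") > cds_at:
        problems.append("ribosome binding site is not upstream of the coding sequence")
    terminators = [i for i, k in enumerate(kinds) if k == "terminator"]
    if not terminators:
        problems.append("no terminator")
    elif min(terminators) < cds_at:
        problems.append("terminator is upstream of the coding sequence")
    return problems


# Hypothetical 5'->3' layout using the element names mentioned in the text.
cassette = [
    Element("tac promoter", "promoter"),
    Element("lac operator", "operator"),
    Element("RBS", "rbs"),
    Element("GST-MBP-eRAD18-Cas9-CtIP", "cds"),
    Element("rrnB T1", "terminator"),
    Element("T7Te", "terminator"),
]
print(check_cassette(cassette))
```

A check of this kind only verifies relative order; it says nothing about spacing, reading frame, or sequence content.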
  • the expression vector comprises a selectable marker encoding nucleic acid sequence.
  • the term “selectable marker-encoding nucleotide sequence” refers to a nucleotide sequence which is capable of expression in prokaryotic cells and where expression of the selectable marker confers to cells containing the expressed gene the ability to grow in the presence of a corresponding selective condition.
  • the choice of selectable marker will depend on the host cell. Appropriate markers for different bacterial hosts are well known in the art. Typical selectable marker genes encode proteins that (a) confer resistance to antibiotics or other toxins (e.g., ampicillin, methotrexate, tetracycline, neomycin, mycophenolic acid, puromycin, zeomycin, or hygromycin); or (b) complement an auxotrophic mutation or a naturally occurring nutritional deficiency in the host strain. In some embodiments, the selectable marker gene encodes a gene capable of conferring antibiotic resistance. In some embodiments, the selectable marker gene encodes a gene capable of conferring resistance to ampicillin.
  • the expression vector encodes a fusion protein (GST-MBP-eRAD18-Cas9-CtIP) comprising an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO:7.
  • the fusion protein (GST-MBP-eRAD18-Cas9-CtIP) is encoded by a nucleic acid sequence at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO:8.
  • the prokaryotic expression vector comprises a nucleic acid sequence at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO:9.
  • the vector map is shown in FIG. 7.
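The percent-identity thresholds recited above presuppose some way of aligning a candidate sequence against a reference. The following is a minimal sketch, assuming Needleman-Wunsch global alignment with simple match, mismatch, and gap scores, and assuming identity is reported over the alignment length. The source does not specify either choice, and the function names and scoring values are illustrative.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length (one convention)."""
    ga, gb = global_align(a, b)
    if not ga:
        return 0.0
    same = sum(1 for x, y in zip(ga, gb) if x == y and x != "-")
    return 100.0 * same / len(ga)


def meets_threshold(candidate, reference, threshold=90.0):
    """True if the candidate is at least `threshold` percent identical to the reference."""
    return percent_identity(candidate, reference) >= threshold
```

Other conventions divide by the length of the shorter or the reference sequence, so a reported identity value is only meaningful together with the alignment method and denominator used.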
  • the present invention provides host cells which have been transduced, transformed or transfected with a vector as described herein.
  • the host cells are prokaryotic.
  • the culture conditions such as temperature, pH and the like, are those previously used for the parental host cell prior to transduction, transformation or transfection and are apparent to those skilled in the art.
  • the nucleotide sequence encoding a fusion protein is operably linked to a promoter sequence functional in the host cell.
  • a bacterial culture is transformed with an expression vector having a promoter or biologically active promoter fragment or one or more (e.g., a series) enhancers which functions in the host cell, operably linked to a nucleic acid sequence encoding the fusion protein, such that the fusion protein is expressed in the cell.
  • hosts include bacterial cells, such as streptococci, staphylococci, Escherichia coli, Streptomyces and Bacillus subtilis cells.
  • the host cell is Escherichia coli.
  • the invention provides a CRISPR/Cas9 system for modifying a nucleic acid sequence in cells or in cells of a subject comprising: a) a sgRNA molecule comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome; and b) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
  • the invention provides a CRISPR/Cas9 system for modifying a nucleic acid sequence in cells or in cells of a subject comprising: a) a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome; and b) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
  • the invention provides a CRISPR/Cas9 system for modifying a nucleic acid sequence in cells or in cells of a subject, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof; and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
  • the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
  • the invention provides a CRISPR/Cas9 system for modifying a nucleic acid sequence in cells or in cells of a subject, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof; and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
  • the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
  • a CRISPR/CAS9 system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system) to effect the modification of a nucleic acid sequence.
  • the CRISPR/CAS9 system includes one or more nucleic acids or polypeptides encoding fusion proteins and one or more sgRNAs as described herein.
  • the CRISPR/CAS9 system includes a nucleic acid template encoding a sequence of interest for purposes of editing a nucleic acid sequence.
  • target sequence refers to a sequence to which a guide RNA sequence is designed to have complementarity, where hybridization between a target sequence and a guide RNA sequence promotes the formation of a CRISPR complex. Full complementarity is not necessarily required, provided there is sufficient complementarity to cause hybridization and promote formation of a CRISPR complex.
  • a target sequence may comprise any polynucleotide, such as DNA or RNA polynucleotides.
  • a target sequence is located in the nucleus or cytoplasm of a cell.
  • the target sequence may be within an organelle of a eukaryotic cell, for example, mitochondrion or chloroplast.
  • a sequence or template that may be used for recombination into the targeted locus comprising the target sequences is referred to as an “editing template,” “editing polynucleotide,” “editing sequence,” or as a “nucleic acid template encoding a sequence of interest” herein.
  • an exogenous template polynucleotide may be referred to as a “nucleic acid template encoding a sequence of interest.”
  • the recombination is homologous recombination.
  • a coding sequence encoding the fusion protein is codon optimized for expression in particular cells, such as eukaryotic cells.
  • the eukaryotic cells may be those of or derived from a particular organism, such as a mammal, including but not limited to human, mouse, rat, rabbit, dog, or non-human primate.
  • codon optimization refers to a process of modifying a nucleic acid sequence for enhanced expression in the host cells of interest by replacing at least one codon (e.g. about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more codons) of the native sequence with codons that are more frequently or most frequently used in the genes of that host cell while maintaining the native amino acid sequence.
  • Codon bias refers to differences in codon usage between organisms.
  • the predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example, at the “Codon Usage Database”, and these tables can be adapted in a number of ways.
  • Computer algorithms for codon optimizing a particular sequence for expression in a particular host cell are also available, such as Gene Forge (Aptagen; Jacobus, Pa.).
  • one or more codons (e.g. 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more, or all codons) in a sequence encoding a CRISPR enzyme correspond to the most frequently used codon for a particular amino acid.
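The codon replacement described above can be illustrated with a minimal sketch that swaps every codon for the most frequently used synonymous codon while preserving the encoded amino acids. The genetic-code entries below are a small subset of the standard code. The usage frequencies in `HYPOTHETICAL_USAGE` are made-up placeholder values for illustration, not data from any codon usage table, and the function names are likewise assumptions.

```python
# Subset of the standard genetic code (DNA codons -> one-letter amino acids, * = stop).
GENETIC_CODE = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "ATG": "M",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

# HYPOTHETICAL relative codon frequencies for an imaginary host (placeholders only).
HYPOTHETICAL_USAGE = {
    "GCT": 0.26, "GCC": 0.40, "GCA": 0.23, "GCG": 0.11,
    "AAA": 0.43, "AAG": 0.57,
    "CTT": 0.13, "CTC": 0.20, "CTA": 0.07, "CTG": 0.40, "TTA": 0.08, "TTG": 0.12,
    "ATG": 1.00,
    "TAA": 0.30, "TAG": 0.20, "TGA": 0.50,
}


def translate(cds):
    """Translate an in-frame DNA coding sequence using GENETIC_CODE."""
    return "".join(GENETIC_CODE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds, usage=HYPOTHETICAL_USAGE):
    """Replace each codon with the most frequent synonymous codon in `usage`."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    best = {}
    for codon, aa in GENETIC_CODE.items():
        if aa not in best or usage[codon] > usage[best[aa]]:
            best[aa] = codon
    return "".join(best[GENETIC_CODE[cds[i:i + 3]]] for i in range(0, len(cds), 3))


native = "ATGGCGAAATTATAA"
optimized = codon_optimize(native)
print(optimized, translate(native) == translate(optimized))
```

Always choosing the single most frequent codon is the simplest possible strategy; practical tools also weigh factors such as GC content and repeats, which this sketch ignores.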
  • the sgRNA sequence is any polynucleotide sequence having sufficient complementarity with a target polynucleotide sequence to hybridize with the target sequence and direct sequence- specific binding of a CRISPR complex to the target sequence.
  • the degree of complementarity between a sgRNA sequence and its corresponding target sequence when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more.
  • Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g. the Burrows Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies), ELAND (Illumina, San Diego, Calif.), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net).
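As a simplified illustration of the degree of complementarity discussed above, the sketch below scores an sgRNA targeting sequence against an equal-length DNA strand with no gaps. It is deliberately simpler than the alignment algorithms named in the text: it assumes the two sequences are already positioned against each other, counts only Watson-Crick pairs (no G-U wobble), and uses a hypothetical function name.

```python
# RNA base in the guide -> DNA base it pairs with on the target strand.
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(guide_rna, target_dna):
    """Percent of guide bases that pair with the target strand.

    Both sequences are written 5'->3'. Because hybridization is antiparallel,
    the target strand is reversed before the position-by-position comparison.
    """
    if not guide_rna or len(guide_rna) != len(target_dna):
        raise ValueError("sequences must be non-empty and of equal length")
    target_3_to_5 = target_dna.upper()[::-1]
    paired = sum(
        1 for g, t in zip(guide_rna.upper(), target_3_to_5)
        if RNA_TO_DNA_PAIR.get(g) == t
    )
    return 100.0 * paired / len(guide_rna)


print(percent_complementarity("GACGUA", "TACGTC"))  # fully complementary pair
```

A guide could then be screened against a cutoff such as the 50% to 99% values listed above by comparing the returned percentage with the chosen threshold.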
  • a guide sequence is about or more than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length. In some embodiments, a guide sequence is less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length. The ability of a guide sequence to direct sequence-specific binding of a CRISPR complex to a target sequence may be assessed by any suitable assay.
  • the components of a CRISPR system sufficient to form a CRISPR complex may be provided to a host cell having the corresponding target sequence, such as by transfection with vectors encoding the components of the CRISPR sequence, followed by an assessment of preferential cleavage within the target sequence, such as by Surveyor assay as described herein.
  • cleavage of a target polynucleotide sequence may be evaluated in a test tube by providing the target sequence, components of a CRISPR complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing binding or rate of cleavage at the target sequence between the test and control guide sequence reactions.
  • Other assays are possible, and will occur to those skilled in the art.
  • a sgRNA sequence may be selected to target any target sequence.
  • the target sequence is a sequence within a genome of a cell.
  • Exemplary target sequences include those that are unique in the target genome.
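The idea of selecting target sequences that are unique in the target genome can be sketched as a brute-force scan that keeps only windows occurring exactly once across both strands. This is a toy illustration under stated assumptions: the function names are hypothetical, the default window length of 20 is one of the guide lengths listed above, no PAM requirement is modeled, and the quadratic scan is only workable for short sequences.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def _count_overlapping(haystack, needle):
    """Count occurrences of needle in haystack, allowing overlaps."""
    count, start = 0, haystack.find(needle)
    while start != -1:
        count += 1
        start = haystack.find(needle, start + 1)
    return count


def count_on_both_strands(genome, site):
    """Occurrences of `site` on the given strand plus the opposite strand."""
    total = _count_overlapping(genome, site)
    rc = reverse_complement(site)
    if rc != site:  # avoid double-counting palindromic sites
        total += _count_overlapping(genome, rc)
    return total


def unique_targets(genome, length=20):
    """Return (position, sequence) for every window that occurs exactly once."""
    genome = genome.upper()
    hits = []
    for i in range(len(genome) - length + 1):
        site = genome[i:i + length]
        if count_on_both_strands(genome, site) == 1:
            hits.append((i, site))
    return hits


# Toy example with a short window so the repeat is visible.
for position, site in unique_targets("GATTACAGATTAG", length=5):
    print(position, site)
```

For real genomes, an indexed aligner such as those based on the Burrows-Wheeler Transform would replace the naive string search used here.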
  • a sgRNA sequence is selected to reduce the degree of secondary structure within the guide sequence.
  • Secondary structure may be determined by any suitable polynucleotide folding algorithm. Some programs are based on calculating the minimal Gibbs free energy. An example of one such algorithm is mFold, as described by Zuker and Stiegler (Nucleic Acids Res. 9 (1981), 133-148). Another example folding algorithm is the online webserver RNAfold, developed at Institute for Theoretical Chemistry at the University of Vienna, using the centroid structure prediction algorithm (see e.g. A. R. Gruber et al., 2008, Cell 106(1): 23-24; and PA Carr and GM Church, 2009, Nature Biotechnology 27(12): 1151-62).
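To make the notion of secondary structure within a guide sequence concrete, the sketch below counts the maximum number of nested base pairs a sequence can form using Nussinov-style dynamic programming. This is a much simpler stand-in for the free-energy minimization performed by mFold or RNAfold, not a reimplementation of either. The function name, the inclusion of G-U wobble pairs, and the minimum loop size of 3 are assumptions made for illustration.

```python
_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(rna, min_loop=3):
    """Maximum number of nested base pairs, with at least `min_loop` unpaired
    bases enclosed by every pair (Nussinov-style recursion)."""
    rna = rna.upper()
    n = len(rna)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: base j is left unpaired
            for k in range(i, j - min_loop):  # case: base j pairs with base k
                if (rna[k], rna[j]) in _PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


# A lower count suggests less potential for internal structure in the guide.
print(max_base_pairs("GGGAAAUCC"), max_base_pairs("AAAAAAAAA"))
```

Pair counting treats all pairs as equal, whereas free-energy models weight stacking and loop penalties, so the two approaches can rank candidate guides differently.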
  • the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify the genomic sequence in the cell.
  • the one or more additional agents include guide RNA(s) and/or template nucleic acid.
  • the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify the genomic sequence in the cell.
  • the one or more additional agents include guide RNA(s) and/or template nucleic acid.
  • the fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 (or Rad52) or a variant or fragment thereof, and CtIP or a variant or fragment thereof, with optimized (HMEJ) donor DNA can improve knockin performance (both KI precision and efficiency) up to 40-fold (e.g., in cultured human cells) compared to conventional Cas9 knockin (with homologous recombination donor DNA).
  • the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a CRISPR/Cas9 system, wherein the system comprises a) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof; and b) a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
  • the cell is further administered an effective amount of a nucleic acid template encoding a sequence of interest (e.g., on a vector).
  • the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a CRISPR/Cas9 system, wherein the system comprises a) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof; and b) a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
  • the cell is further administered an effective amount of a nucleic acid template encoding a sequence of interest (e.g., on a vector).
  • the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a CRISPR/Cas9 system, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
  • the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
  • the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a CRISPR/Cas9 system, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
  • the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
  • the invention provides a method of treating a disease or condition in a subject by administering to the subject an effective amount of the cells having a modified genomic sequence.
  • the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify a genomic sequence in cells of the subject.
  • the one or more additional agents include guide RNA(s) and/or template nucleic acid.
  • the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify genomic sequence in the subject.
  • the one or more additional agents include guide RNA(s) and/or template nucleic acid.
  • the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a CRISPR/Cas9 system, wherein the system comprises a) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof; and b) a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
  • the subject is further administered an effective amount of a nucleic acid template encoding a sequence of interest (e.g., on a vector).
  • the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a CRISPR/Cas9 system, wherein the system comprises a) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof; and b) a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
  • the subject is further administered an effective amount of a nucleic acid template encoding a sequence of interest (e.g., on a vector).
  • the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a CRISPR/Cas9 system, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
  • the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
  • the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a CRISPR/Cas9 system, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
  • the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
  • the invention provides a method of modifying a genomic sequence of a cell comprising: introducing a system described herein into a cell.
  • the introducing results in disruption, deletion, or insertion of a target nucleic acid (e.g., gene) in the cell.
  • gene editing results in an increase or decrease in expression of an endogenous or exogenous gene in the cell.
  • the cell is a eukaryotic cell (e.g., a mammalian cell, such as a human cell).
  • the cell is in vitro, ex vivo, or in vivo.
  • the method treats a disease or condition in a subject.
  • the genomic editing comprises HDR or NHEJ.
  • the invention provides methods comprising delivering one or more polynucleotides, such as one or more vectors, nucleic acids, or systems, one or more transcripts thereof, and/or one or more proteins transcribed therefrom, to a cell or to a subject.
  • the invention further provides cells produced by such methods, and organisms (such as animals, plants, or fungi) comprising or produced from such cells.
  • a CRISPR/Cas9 system is delivered to a cell.
  • Viral and non-viral based gene transfer methods can be used to introduce nucleic acids in mammalian cells or target tissues. Such methods can be used to administer nucleic acids encoding components of a CRISPR/Cas9 system to cells in culture, or in a host organism.
  • Non-viral vector delivery systems include DNA plasmids, RNA (including circular RNA), naked nucleic acid, and nucleic acid complexed with a delivery vehicle, such as a liposome, or lipid nanoparticle.
  • Viral vector delivery systems include DNA and RNA viruses, which have either episomal or integrated genomes after delivery to the cell.
  • Methods of non-viral delivery of nucleic acids include lipofection, nucleofection, microinjection, biolistics, virosomes, liposomes, immunoliposomes, polycations or lipid:nucleic acid conjugates, naked DNA, lipid nanoparticles, lipid-like nanoparticles, artificial virions, and agent-enhanced uptake of DNA.
  • Lipofection is described in, e.g., U.S. Pat. Nos. 5,049,386, 4,946,787, and 4,897,355, and lipofection reagents are sold commercially (e.g., Transfectam™ and Lipofectin™).
  • Cationic and neutral lipids that are suitable for efficient receptor-recognition lipofection of polynucleotides include those of Felgner, WO 91/17424; WO 91/16024. Delivery can be to cells (e.g., in vitro or ex vivo administration) or target tissues (e.g., in vivo administration).
  • the preparation of lipid:nucleic acid complexes, including targeted liposomes such as immunolipid complexes, is well known to one of skill in the art (see, e.g., Crystal, Science 270:404-410 (1995); Blaese et al., Cancer Gene Ther. 2:291-297 (1995); Behr et al., Bioconjugate Chem. 5:382-389 (1994); Remy et al., Bioconjugate Chem. 5:647-654 (1994); Gao et al., Gene Therapy 2:710-722 (1995); Ahmad et al., Cancer Res. 52:4817-4820 (1992); U.S. Pat. Nos. 4,186,183, 4,217,344, 4,235,871, 4,261,975, 4,485,054, 4,501,728, 4,774,085, 4,837,028, and 4,946,787).
  • RNA or DNA viral based systems for the delivery of nucleic acids take advantage of highly evolved processes for targeting a virus to specific cells in the body and trafficking the viral payload to the nucleus.
  • Viral vectors can be administered directly to patients (in vivo) or they can be used to treat cells in vitro, and the modified cells may optionally be administered to patients (ex vivo).
  • Conventional viral based systems could include retroviral, lentivirus, adenoviral, adeno-associated and herpes simplex virus vectors for gene transfer. Integration in the host genome is possible with the retrovirus, lentivirus, and adeno-associated virus gene transfer methods, often resulting in long term expression of the inserted transgene. Additionally, high transduction efficiencies have been observed in many different cell types and target tissues.
  • Lentiviral vectors are retroviral vectors that are able to transduce or infect non-dividing cells and typically produce high viral titers. Selection of a retroviral gene transfer system would therefore depend on the target tissue. Retroviral vectors are comprised of cis-acting long terminal repeats with packaging capacity for up to 6-10 kb of foreign sequence. The minimum cis-acting LTRs are sufficient for replication and packaging of the vectors, which are then used to integrate the therapeutic gene into the target cell to provide permanent transgene expression.
  • Widely used retroviral vectors include those based upon murine leukemia virus (MuLV), gibbon ape leukemia virus (GaLV), simian immunodeficiency virus (SIV), human immunodeficiency virus (HIV), and combinations thereof (see, e.g., Buchscher et al., J. Virol. 66:2731-2739 (1992); Johann et al., J. Virol. 66:1635-1640 (1992); Sommerfelt et al., Virol. 176:58-59 (1990); Wilson et al., J. Virol. 63:2374-2378 (1989); Miller et al., J. Virol.
  • adenoviral based systems may be used.
  • Adenoviral based vectors are capable of very high transduction efficiency in many cell types and do not require cell division. With such vectors, high titer and levels of expression have been obtained. This vector can be produced in large quantities in a relatively simple system.
  • Adeno-associated virus (“AAV”) vectors may also be used to transduce cells with target nucleic acids, e.g., in the in vitro production of nucleic acids and peptides, and for in vivo and ex vivo gene therapy procedures (see, e.g., West et al., Virology 160:38-47 (1987); U.S. Pat. No. 4,797,368; WO 93/24641; Kotin, Human Gene Therapy 5:793-801 (1994); Muzyczka, J. Clin. Invest. 94:1351 (1994)). Construction of recombinant AAV vectors is described in a number of publications, including U.S. Pat. No.
  • Packaging cells are typically used to form virus particles that are capable of infecting a host cell. Such cells include 293 cells, which package adenovirus, and ψ2 cells or PA317 cells, which package retrovirus.
  • Viral vectors used in gene therapy are usually generated by producing a cell line that packages a nucleic acid vector into a viral particle. The vectors typically contain the minimal viral sequences required for packaging and subsequent integration into a host, other viral sequences being replaced by an expression cassette for the polynucleotide(s) to be expressed. The missing viral functions are typically supplied in trans by the packaging cell line.
  • AAV vectors used in gene therapy typically only possess ITR sequences from the AAV genome which are required for packaging and integration into the host genome.
  • Viral DNA is packaged in a cell line, which contains a helper plasmid encoding the other AAV genes, namely rep and cap, but lacking ITR sequences.
  • the cell line may also be infected with adenovirus as a helper.
  • the helper virus promotes replication of the AAV vector and expression of AAV genes from the helper plasmid.
  • the helper plasmid is not packaged in significant amounts due to a lack of ITR sequences. Contamination with adenovirus can be reduced by, e.g., heat treatment to which adenovirus is more sensitive than AAV. Additional methods for the delivery of nucleic acids to cells are known to those skilled in the art. See, for example, US20030087817, incorporated herein by reference.
  • a host cell is transiently or non-transiently transfected with one or more vectors described herein.
  • a cell is transfected as it naturally occurs in a subject.
  • a cell that is transfected is taken from a subject.
  • the cell is derived from cells taken from a subject, such as a cell line. A wide variety of cell lines for tissue culture are known in the art.
  • cell lines include, but are not limited to, C8161, CCRF-CEM, MOLT, mIMCD-3, NHDF, HeLa-S3, Huh1, Huh4, Huh7, HUVEC, HASMC, HEKn, HEKa, MiaPaCell, Panc1, PC-3, TF1, CTLL-2, C1R, Rat6, CV1, RPTE, A10, T24, J82, A375, ARH-77, Calu1, SW480, SW620, SKOV3, SK-UT, CaCo2, P388D1, SEM-K2, WEHI-231, HB56, TIB55, Jurkat, J45.01, LRMB, Bcl-1, BC-3, IC21, DLD2, Raw264.7, NRK, NRK-52E, MRC5, MEF, Hep G2, HeLa B, HeLa T4, COS, COS-1, COS-6, COS-M6A, BS-C-1 monkey kidney epithelial
  • a cell transfected with one or more vectors described herein is used to establish a new cell line comprising one or more vector-derived sequences.
  • a cell transiently transfected with the components of a CRISPR system as described herein (such as by transient transfection of one or more vectors, or transfection with RNA), and modified through the activity of a CRISPR complex, is used to establish a new cell line comprising cells containing the modification but lacking any other exogenous sequence.
  • cells transiently or non-transiently transfected with one or more vectors described herein, or cell lines derived from such cells are used in assessing one or more test compounds.
  • one or more vectors described herein are used to produce a non-human transgenic animal or transgenic plant.
  • the transgenic animal is a mammal, such as a mouse, rat, or rabbit.
  • the organism or subject is a plant.
  • the organism or subject or plant is algae.
  • Methods for producing transgenic plants and animals are known in the art, and generally begin with a method of cell transfection, such as described herein.
  • Transgenic animals are also provided, as are transgenic plants, especially crops and algae. The transgenic animal or plant may be useful in applications outside of providing a disease model.
  • transgenic plants especially pulses and tubers, and animals, especially mammals such as livestock (cows, sheep, goats and pigs), but also poultry and edible insects, are preferred.
  • Transgenic algae or other plants such as rape may be particularly useful in the production of vegetable oils or biofuels such as alcohols (especially methanol and ethanol), for instance. These may be engineered to express or overexpress high levels of oil or alcohols for use in the oil or biofuel industries.
  • Certain embodiments provide a method (e.g., gene editing method), comprising: introducing a system described herein into a cell.
  • the introducing results in disruption, deletion, or insertion of a target nucleic acid (e.g., gene) in the cell.
  • the gene editing results in an increase or decrease in expression of an endogenous or exogenous gene in the cell.
  • the cell is a eukaryotic cell (e.g., a mammalian (e.g., human) cell). In some embodiments, the cell is in vitro, ex vivo, or in vivo. In some embodiments, the method treats a disease or condition in a subject.
  • the invention provides for methods of modifying a target polynucleotide in a eukaryotic cell, which may be in vivo, ex vivo or in vitro.
  • the method comprises sampling a cell or population of cells from a human or non-human animal or plant (including micro-algae), and modifying the cell or cells. Culturing may occur at any stage ex vivo. The cell or cells may even be re-introduced into the non-human animal or plant (including micro-algae).
  • the target polynucleotide in the cells or cells of a subject in the methods described herein can be any polynucleotide endogenous or exogenous to the cell.
  • the target polynucleotide can be a polynucleotide residing in the nucleus of a eukaryotic cell.
  • the target polynucleotide can be a sequence encoding a gene product (e.g., a protein) or a non-coding sequence (e.g., a regulatory polynucleotide or junk DNA).
  • a PAM (protospacer adjacent motif)
  • PAMs are typically 2-5 base pair sequences adjacent to the protospacer (that is, the target sequence). Persons skilled in the art will be able to identify PAM sequences for use with a given CRISPR enzyme.
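As an illustration of the PAM description above, the following is a minimal sketch of scanning one strand of a DNA sequence for candidate protospacers. It rests on assumptions not stated in this section: the PAM is taken to be the 3-nucleotide motif NGG located immediately 3' of a 20-nucleotide protospacer (the motif commonly associated with Streptococcus pyogenes Cas9), and the function name `find_ngg_protospacers` is hypothetical. Other CRISPR enzymes recognize other PAMs, and the reverse strand is not scanned here.

```python
import re


def find_ngg_protospacers(seq, protospacer_len=20):
    """Return (start, protospacer, pam) tuples for each NGG PAM on the given strand.

    Assumption: the PAM lies immediately 3' of the protospacer.
    """
    seq = seq.upper()
    hits = []
    # A zero-width lookahead lets overlapping PAM candidates all be reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        start = pam_start - protospacer_len
        # Skip PAMs that sit too close to the 5' end to have a full protospacer.
        if start >= 0:
            hits.append((start, seq[start:pam_start], match.group(1)))
    return hits


if __name__ == "__main__":
    example = "ACGTACGTACGTACGTACGTAGG"
    for start, protospacer, pam in find_ngg_protospacers(example):
        print(start, protospacer, pam)
```

A candidate returned this way would still need to be checked for off-target matches and for compatibility with the chosen Cas9 variant before being used as a guide sequence.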
  • target polynucleotides include a sequence associated with a signaling biochemical pathway, e.g., a signaling biochemical pathway-associated gene or polynucleotide.
  • target polynucleotides include a disease associated gene or polynucleotide.
  • a “disease-associated” gene or polynucleotide refers to any gene or polynucleotide which is yielding transcription or translation products at an abnormal level or in an abnormal form in cells derived from disease-affected tissues compared with tissues or cells of a non-disease control.
  • a disease-associated gene also refers to a gene possessing mutation(s) or genetic variation that is directly responsible or is in linkage disequilibrium with a gene(s) that is responsible for the etiology of a disease.
  • the transcribed or translated products may be known or unknown, and may be at a normal or abnormal level.
  • various methods may be employed for delivering a nucleic acid or expression vector into cells in vitro.
  • Methods of introducing nucleic acids into cells for expression of heterologous nucleic acid sequences are also known to the ordinarily skilled artisan, including, but not limited to, electroporation; protoplast fusion with intact cells; transduction; high velocity bombardment with DNA-coated microprojectiles; infection with modified viral (e.g., phage) nucleic acids; chemically-mediated transformation, competence, etc.
  • the host cells can be cultured in suitable nutrient media.
  • the media can be modified as appropriate for any activating promoters, selecting transformants, and/or amplifying expression of the fusion protein by modifying culture conditions, such as temperature, pH and the like.
  • the invention provides a method for producing the fusion protein in a prokaryotic host cell, comprising culturing the prokaryotic host cell comprising the expression vector in a growth media under conditions suitable for the expression of the fusion protein and isolating the fusion protein.
  • the fusion protein is produced and isolated as described in Example 2.
  • the terms “isolated” and “purified” refer to a nucleic acid or polypeptide that is removed from at least one component with which it is associated.
  • the isolated protein is substantially free of other cellular components.
  • the term “substantially free” encompasses preparations of the desired fusion polypeptide having less than about 20% (by dry weight) other proteins (i.e., contaminating protein), less than about 10% other proteins, less than about 5% other proteins, or less than about 1% other proteins.
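The tiers in the “substantially free” definition above can be expressed as a small check. This is a sketch only: the function names and the idea of reporting the tightest tier met are illustrative assumptions, masses are taken to be dry weights in the same unit, and the word “about” in the definition is not modeled (the cut-offs are treated as exact).

```python
# Contaminating-protein cut-offs named in the text, tightest first
# (fractions of total protein by dry weight).
THRESHOLDS = (0.01, 0.05, 0.10, 0.20)


def contaminant_fraction(target_mass, total_protein_mass):
    """Fraction of total protein (dry weight) that is not the fusion polypeptide."""
    if total_protein_mass <= 0 or not 0 <= target_mass <= total_protein_mass:
        raise ValueError("masses must satisfy 0 <= target <= total and total > 0")
    return 1.0 - target_mass / total_protein_mass


def tightest_threshold_met(target_mass, total_protein_mass):
    """Return the smallest cut-off the preparation falls under, or None if none."""
    fraction = contaminant_fraction(target_mass, total_protein_mass)
    for threshold in THRESHOLDS:
        if fraction < threshold:
            return threshold
    return None


if __name__ == "__main__":
    # Hypothetical preparation: 9.6 mg fusion protein in 10.0 mg total protein.
    print(tightest_threshold_met(9.6, 10.0))
```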
  • the term "substantially pure" when applied to the fusion proteins or fragments thereof of the present invention means that the proteins are essentially free of other substances to an extent practical and appropriate for their intended use.
  • the proteins are sufficiently pure and are sufficiently free from other biological constituents of the host cells so as to be useful in, for example, protein sequencing, and/or producing pharmaceutical preparations.
  • a culture of the prokaryotic host cells that harbors the expression vector is used to inoculate a growth media.
  • the growth media is not limiting, provided it is suitable for promoting growth of the host cells.
  • the growth media comprises Luria broth.
  • the culture of prokaryotic host cells used to inoculate the growth media is an overnight culture. In some embodiments, the host cells are cultured at 37°C.
  • the prokaryotic host cells are incubated in the growth media for a period of time to achieve a certain density prior to inducing expression of the fusion protein.
  • the prokaryotic host cells are incubated in the growth media until the optical density at 600 nanometers reaches a value between about 0.6 to about 0.8, at which point, expression of the fusion protein is induced by addition of an agent or a change in culture conditions.
  • the inducing agent is IPTG.
  • the change in conditions is a change in temperature. In some embodiments, the change in temperature is an increase in temperature, e.g., to about 42 degrees Celsius.
  • the growth media is cooled following incubation of the cells in the growth media and prior to or subsequent to inducing expression of the fusion protein by the addition of an agent, such as IPTG.
  • IPTG is added to a final concentration in the growth media of about 0.25 mM to about 1.0 mM. In some embodiments, the concentration of IPTG in the growth media is about 0.5 mM.
  • the growth media is incubated at a temperature of between about 14-24 degrees Celsius, wherein the fusion protein is expressed in the prokaryotic host cells in the presence of the agent.
  • the incubation time for cells in the media following induction of expression of the fusion protein is not necessarily limiting, provided a sufficient quantity of the fusion protein is produced.
  • the cells are incubated for about 12 to about 24 hours to allow expression of the fusion protein.
  • the cells are incubated in the growth media for about 18 hours at about 16 degrees Celsius to allow for expression of the fusion protein.
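For the IPTG concentrations quoted above, the volume of stock solution to add follows from the dilution relation C_stock × V_stock = C_final × (V_culture + V_stock). The sketch below solves that relation for V_stock. The 1 M (1000 mM) stock concentration, the 1 L culture volume, and the function name `stock_volume_ml` are assumptions chosen for illustration and do not come from the text.

```python
def stock_volume_ml(culture_ml, final_mM, stock_mM=1000.0):
    """Volume of stock (mL) to add so the culture reaches final_mM.

    Solves stock_mM * v = final_mM * (culture_ml + v) for v, which accounts
    for the small volume increase caused by adding the stock itself.
    """
    if culture_ml <= 0:
        raise ValueError("culture volume must be positive")
    if not 0 < final_mM < stock_mM:
        raise ValueError("final concentration must be between 0 and the stock concentration")
    return final_mM * culture_ml / (stock_mM - final_mM)


if __name__ == "__main__":
    # Range and midpoint given in the text: about 0.25, 0.5, and 1.0 mM.
    for target in (0.25, 0.5, 1.0):
        volume = stock_volume_ml(1000.0, target)
        print(f"{target} mM final: add {volume:.3f} mL of stock")
```

With a stock this concentrated the correction for added volume is negligible, but the same function remains valid for more dilute stocks where it is not.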
  • the prokaryotic cells are lysed following incubation to make a lysate.
  • the cells are pelleted prior to lysis.
  • the cells can be pelleted by centrifugation.
  • the cells are lysed by sonication.
  • the cells are lysed by addition of a solution that promotes lysis.
  • cellular debris is removed from the lysate, e.g., by centrifugation and/or filtration.
  • the fusion protein can be isolated/purified from the lysate.
  • the proteins are precipitated from the lysate prior to purification.
  • the lysate is subjected to chromatography to isolate and purify the fusion protein, e.g., by passing it through a column or other apparatus or composition that is able to capture the fusion protein.
  • the lysate is passed through a chromatography column that comprises an agent that binds to the fusion protein, e.g., glutathione in the case of GST-tagged proteins or Ni²⁺ or Co²⁺ in the case of 6x-His tagged proteins.
  • the agent can be immobilized on beads or a resin to aid in the purification.
  • a protease is added to cleave the fusion protein.
  • the protease recognizes a HRV 3C protease recognition site (e.g., SEQ ID NO: 12).
  • the protease is a human rhinovirus (HRV) type 14 3C protease.
  • the human rhinovirus (HRV) type 14 3C protease is fused to GST.
  • the protease is added while the fusion protein is bound to an agent on the column, and the fusion protein is subsequently eluted from the column following cleavage by the protease.
  • the fusion protein eluate is filtered.
  • the fusion protein is further concentrated.
  • glycerol is added (e.g., 20%) to the fusion protein and the fusion protein can be stored for later use under appropriate storage conditions.
  • the invention provides pharmaceutical compositions comprising effective amounts of the nucleic acids, polypeptides, vectors encoding the polypeptides, or the CRISPR/Cas9 system herein in combination with a pharmaceutically acceptable excipient.
  • the composition comprises a delivery system, such as liposomes, lipid nanoparticles or lipid-like nanoparticles for delivery of nucleic acids.
  • the lipid or lipid-like nanoparticles are ionizable.
  • a “pharmaceutically acceptable excipient” is a material that acts in concert with an active ingredient of a medication to impart desirable qualities to a drug intended to be introduced into the body of a subject.
  • the desirable qualities could include enhancing long term stability, acting as a diluent for an active ingredient that must be administered in small amounts, enhancement of therapeutic qualities of an active ingredient, facilitating absorption of an active ingredient into the body, adjusting viscosity, enhancing solubility of an active ingredient, or modifying macroscopic properties of a drug such as flowability or adhesion.
  • compositions can comprise but are not limited to diluents, binders, pH stabilizing agents, disintegrants, surfactants, glidants, dyes, flavoring agents, preservatives, sorbents, sweeteners and lubricants. These materials can take many different forms. See, e.g., Nema, et al., Excipients and their use in injectable products, PDA J. Pharm. Sci. & Tech. 1997, 51(4): 166-171.
  • nucleic acids, vectors (e.g., AAV vectors) and CRISPR/Cas9 system described herein may be incorporated into a vehicle for administration into a patient, such as a human patient suffering from any of the conditions described herein.
  • Pharmaceutical compositions containing nucleic acids such as RNA or vectors, such as viral vectors, that contain a polynucleotide encoding the fusion protein can be prepared using methods known in the art.
  • such compositions can be prepared using, e.g., physiologically acceptable carriers, excipients or stabilizers (Remington’s Pharmaceutical Sciences 16th edition, Osol, A. Ed. (1980); incorporated herein by reference), and in a desired form, e.g., in the form of lyophilized formulations or aqueous solutions.
  • nucleic acids or vectors may be prepared in water suitably mixed with one or more excipients, carriers, or diluents.
  • Dispersions may also be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations may contain a preservative to prevent the growth of microorganisms.
  • the pharmaceutical forms suitable for injectable use include sterile aqueous solutions or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions (described in US 5,466,468, the disclosure of which is incorporated herein by reference).
  • the formulation may be sterile and may be fluid to the extent that easy syringability exists. Formulations may be stable under the conditions of manufacture and storage and may be preserved against the contaminating action of microorganisms, such as bacteria and fungi.
  • the carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (e.g., glycerol, propylene glycol, and liquid polyethylene glycol, and the like), suitable mixtures thereof, and/or vegetable oils.
  • Proper fluidity may be maintained, for example, by the use of a coating, such as lecithin
  • the prevention of the action of microorganisms can be brought about by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like.
  • isotonic agents, for example, sugars or sodium chloride, may be included.
  • Prolonged absorption of the injectable compositions can be brought about by the use in the compositions of agents delaying absorption, for example, aluminum monostearate and gelatin.
  • a solution containing a pharmaceutical composition described herein may be suitably buffered, if necessary, and the liquid diluent first rendered isotonic with sufficient saline or glucose.
  • aqueous solutions are especially suitable for intravenous, intramuscular, subcutaneous, and intraperitoneal administration.
  • sterile aqueous media that can be employed will be known to those of skill in the art in light of the present disclosure.
  • compositions described herein may be administered to a subject by a variety of routes, such as local administration, ocular, retinal, intravenous, parenteral, intradermal, transdermal, intramuscular, intranasal, subcutaneous, percutaneous, intratracheal, intraperitoneal, intraarterial, intravascular, inhalation, perfusion, lavage, and oral administration.
  • Treatment may include administration of a composition containing the nucleic acids or vectors (e.g., AAV vectors) described herein in various unit doses.
  • Each unit dose will ordinarily contain a predetermined quantity of the therapeutic composition.
  • the quantity to be administered, and the particular route of administration and formulation, are within the skill of those in the clinical arts.
  • a unit dose need not be administered as a single injection but may include continuous infusion over a set period of time. Dosing may be performed using a syringe pump to control infusion rate in order to minimize damage to the tissue administered.
  • the nucleic acids, fusion proteins, or vectors described herein are provided in the form of a kit or system that optionally comprises one or more guide RNAs (e.g., sgRNAs), and/or a template nucleic acid encoding a sequence of interest.
  • a kit comprises one or more reagents for use in a process utilizing one or more of the elements described herein.
  • Reagents may be provided in any suitable container.
  • a kit may provide one or more reaction or storage buffers.
  • Reagents may be provided in a form that is usable in a particular assay, or in a form that requires addition of one or more other components before use (e.g. in concentrate or lyophilized form).
  • a buffer can be any buffer, including but not limited to a sodium carbonate buffer, a sodium bicarbonate buffer, a borate buffer, a Tris buffer, a MOPS buffer, a HEPES buffer, and combinations thereof.
  • the buffer is alkaline.
  • the buffer has a pH from about 7 to about 10.
  • the kit comprises one or more oligonucleotides corresponding to a guide sequence for insertion into a vector so as to operably link the guide sequence and a regulatory element.
  • the kit comprises a homologous recombination template polynucleotide.
  • a prokaryotic expression vector comprising a promoter sequence operably linked to a nucleotide sequence encoding a fusion protein, wherein the fusion protein comprises i) Cas9 or a variant or fragment thereof; ii) Rad18 or a variant or fragment thereof; and iii) CtIP or a variant or fragment thereof.
  • HDR Enhancing N-terminal fragment of CtIP comprises an amino acid sequence at least 90% identical to SEQ ID NO:3 (amino acids 1-296).
  • a nucleic acid comprising a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
  • nucleic acid of any of paragraphs 56-62, wherein the fusion protein comprises an amino acid sequence at least about 90% identical to SEQ ID NO: 17.
  • nucleic acid of any of paragraphs 79-87, wherein the fusion protein is encoded by a nucleotide sequence comprising SEQ ID NO:20 or SEQ ID NO:42.
  • nucleic acid of any of paragraphs 79-87, wherein the fusion protein is encoded by a nucleotide sequence at least about 60% identical to SEQ ID NO:20 or SEQ ID NO:42.
  • Cas9 targets genomic loci with high specificity. For knockin with double-strand break repair, however, Cas9 often leads to unintended on-target knockout rather than intended edits. This imprecision is a barrier for direct in vivo editing where clonal selection is not feasible.
  • This example demonstrates a high-throughput workflow to comparatively assess on-target efficiency and precision of editing outcomes. Using this workflow, we screened combinations of donor DNA and Cas9 variants, as well as fusions to DNA repair proteins. This yielded novel high-performance double-strand break repair editing agents and combinatorial optimizations that produced orders-of-magnitude increases in knockin precision and increased knockin performance in vitro and in vivo in the developing mouse brain. Continued comparative assessment of editing efficiency and precision with this framework will further the development of high-performance editing agents for in vivo knockin and future genome therapeutics.
  • the genotype of the BFP+ population matched that of the WT BFP sequence.
  • the dark population exhibited a complex mixture of sequence results in the vicinity of the BFP gRNA cleavage site, representing unintended on-target edits due to imprecise repair.
  • Sequencing decomposition using the ICE algorithm on amplicons from the dark sorted population revealed a predominance of deleterious indels (79% frameshift vs. 11% in-frame), in line with a loss of fluorescence due to knockout (Brinkman et al., Nucleic Acids Res., (2014), 42:e168).
  • the GFP+ population exhibited a single genotype containing the desired H26Y point mutation knockin from the donor template (Fig. 1C-D).
  • the sequencing results matched the fluorescence surrogates, thus validating this platform for high-throughput ratiometric screening of knockin agents.
  • knockin donors were provided on plasmids with the knockin sequence flanked by ~800 bp homology arms, the length of which did not significantly affect results within a range of 500 to 1500 bp (data not shown).
  • the HMEJ donor differed from the HR donor by the insertion of gRNA Binding Sites (GRBS) flanking the homology arms, which are cleaved by Cas9 to create linear dsDNA donors in cells. The orientation of the GRBSs did not have significant effects on editing performance.
  • Cas9-HF variants were generally not significantly different from Cas9 WT, although Cas9-HF-CtIP showed a 1.6-fold improvement in knockin efficiency specifically with the HR donor (Fig. 2D). Knockout rates did not significantly differ among the Cas9 variants (Fig. 2C), and thus, the KI/KO ratios (knockin precision) mirror the differences seen in knockin efficiency (Fig. 2D).
  • Double fusions on Cas9 improve editing performance
  • dn53BP1-, TIP60-, and RNF169-fused Cas9 did not significantly differ from the Cas9-only control. Rad52 and eRad18 fusions, however, showed 18% and 28% reductions in knockout frequency, respectively (Fig. 3B-D).
  • compound fusion of each of the five DNA repair proteins with CtIP led to significant reductions in the knockout rate, with eRad18, Rad52, and TIP60 showing the most pronounced decreases (45%, 38% and 38%, respectively).
  • eRad18 led to significant improvements in overall knockin precision.
  • To test the knockin efficiency of Cas9-RC in vivo, we used in utero plasmid electroporation in the embryonic mouse brain (Fig. 5A) targeting integration of a 2A mCherry cassette at the 3’ end of the endogenous β-Actin (ActB) locus (Fig. 5B) (Saito et al., Dev. Biol., (2001), 240:237-246; Mikuni et al., Cell, (2016), 165:1803-1817).
  • a combination of four plasmids containing Cas9 or Cas9-RC, HMEJ donor with the 2A mCherry knockin, ActB gRNA, and a GFP transfection marker were electroporated into embryonic day 14.5 (E14.5) wild-type mice targeting progenitors of projection neurons of sensorimotor cortex.
  • P7 postnatal day 7
  • electroporation of Cas9-RC led to an increase in mCherry+ knockin neurons compared to Cas9 (Fig. 5C).
  • HMEJ donor and gRNA constructs to fuse the fluorescent protein mScarlet onto the N-terminus of the GPI-linked membrane protein Neuronal growth regulator 1 (Negr1), a protein with variable expression in the mouse brain (Miyata et al., Neuroscience, (2003), 117:645-658). Knockin efficiency was significantly lower than for actin knockin. Lower efficiencies may result from the fact that, unlike actin, not all knockin cells will express Negr1, or because many cells express it at levels below our detection threshold. When comparing Cas9-RC to Cas9, there was no significant difference in knockin efficiency (Fig. 6C). This suggests that increases in Cas9-RC performance in vivo may be locus-specific.
  • Cas9-RC with tritrode electroporation and supercoiled constructs to target Purkinje cells in the mouse embryonic cerebellum, as a case to examine a difficult to transduce cell type that we were not able to knock in with Cas9.
  • HMEJ donor and gRNA constructs to knock in the fluorescent protein mGreenLantern downstream of the Pvalb locus, expressed by parvalbumin+ Purkinje cells. While efficiency was low, we consistently detected sparse knockin parvalbumin+ Purkinje cells (Fig. 6D).
  • Negr1 did not show increases in efficiency of Cas9-RC over Cas9, possibly due to the increased size of Cas9-RC resulting in the targeting of fewer cells.
  • Knockin on a highly expressed cell-type-specific locus allowed us to demonstrate the use of Cas9-RC on Purkinje cells, a highly differentiated cell type.
  • optimal agents can be selected based on the experimental need. For example, with ex vivo editing, one might prioritize efficiency if there are facile methods for post hoc selection of properly edited cells, whereas precision may be prioritized for contexts where edited cells cannot be selected, such as with in vivo editing.
  • Cas9-RC high-performance DSB repair editor
  • When paired with HMEJ donor templates, Cas9-RC outperformed other knockin agents by over 30-fold in human cells and showed potential for 3-fold increases in the mouse brain, albeit not at all loci tested.
  • Cas9-RC enables high performance for large genomic edits, such as fluorescent protein knockin.
  • This complements parallel developments in base editors and prime editing, which offer high performance but are limited to smaller edits.
  • a diverse toolkit of precision editors will be useful to broaden the scope of in vivo editing applications. As presented in our study, having standardized platforms for quantitative comparison of new tools and novel combinations will further support efforts towards precision in vivo editing for both basic research and the development of future human therapeutics.
  • Cas9-RC is a new DSB repair genome editor demonstrating enhanced knockin performance in vitro and in vivo.
  • Mammalian expression plasmids and knockin donor template plasmids were constructed with a combination of standard cloning techniques.
  • oligos Integrated DNA Technologies
  • GGA Golden Gate Assembly
  • Cas9 expression constructs were assembled by a modified mMoClo system (Duportet et al., Nucleic Acids Res., (2014), 42:13440-13451).
  • CD1 mouse pups were euthanized via decapitation at postnatal day 0.
  • the skin was sterilized and removed from the pup’s back using sterile surgical tools. Skins were placed dermis-side down on cold 0.25% Trypsin with EDTA (Invitrogen) and incubated at 4°C overnight.
  • the epidermis was separated from the dermis in a sterile hood. Dermis was minced with a razor blade and triturated in warm 10% FBS 1x GlutaMAX DMEM using a glass pipette 10-20 times to separate individual cells. The suspension was then transferred to a 50 mL conical tube and centrifuged at 150 g.
  • the cell pellet was resuspended in 10% FBS 1x GlutaMAX DMEM and filtered through a 100 µm cell strainer (BD Biosciences). Cells were counted using a hemocytometer and cell viability was estimated using Trypan Blue (Sigma). Approximately 4-5x10^6 cells were used for each electroporation. Cells were centrifuged at 150 g and resuspended in 100 µL AMAXA Nucleofection solution (Lonza) at the proper concentration and combined with 1-3 µg of desired DNA mixture in a cuvette. The cuvette was electroporated with the AMAXA Biosystems Nucleofector II (Lonza) using the manufacturer’s settings for Mouse Embryonic Fibroblasts.
  • the solution was immediately transferred to 12-well glass bottom plates (#1.5H; Cellvis) that were pretreated with poly-L-lysine (Sigma Aldrich) diluted 1:12 in sterile PBS the night before, containing prewarmed sterile-filtered Dulbecco’s Modified Eagle Medium (DMEM; ThermoFisher) supplemented with 10% Fetal Bovine Serum (FBS; Gibco) and 1x GlutaMAX (Gibco) at the desired density and incubated at 37°C/5% CO2. Half volume fresh medium was exchanged the next day.
  • Electroporations of plasmid DNA were performed in utero on embryonic day 14.5 (E14.5) to target cortical layer II/III, as previously described (Saito et al., Dev. Biol., (2001), 240:237-246; Poulopoulos et al., Nature, (2019), 565:356-360).
  • the triple electrode in utero electroporation approach was utilized (dal Maschio et al., Nat. Commun., (2012), 3:960; Szczurkowska, J. et al., Nat. Protoc., (2016), 11:399-412).
  • DNA solutions were prepared to 4 µg/µL total DNA, with 1 µg/µL of each of the relevant plasmids (donor, guide, Cas9, and fluorescent protein). Dams were deeply anesthetized with isoflurane under a vaporizer with thermal support (Patterson Scientific Link7 & Heat Therapy Pump HTP-1500). The abdominal area was prepared for surgery with hair removal, surgical scrub, 70% ethanol, and 10% Betadine solution. A midline incision was made to expose the uterine horns.
  • DNA solution was injected into one lateral brain ventricle for cerebral cortex electroporation at E14.5, or in the 4th ventricle for cerebellar Purkinje cell electroporation at E11.5.
  • 4 x 50 ms square pulses of 35 V were applied to target the nascent sensorimotor areas of the cortical plate.
  • 6 x 50 ms square pulses were applied at 35 V at E14.5 and at 25 V at E11.5.
  • 4-6 pups were electroporated per dam.
  • Uterine horns were placed back inside the abdominal cavity, and monofilament nylon sutures (AngioTech) were used to close muscle and skin incisions.
  • electroporated mouse pups were non-invasively screened for unilateral cortical or cerebellar fluorescence using a fluorescence stereoscope (Leica MZ10 F with X-Cite FIRE LED light source) and returned to their dam until postnatal day 7 (P7) or P14.
  • Tissue was prepared by intracardial perfusion with PBS and 4% paraformaldehyde. Brains were cut to 80 µm coronal sections on a vibrating microtome (Leica VT1000). Sections were immunolabeled in blocking solution consisting of 5% bovine serum albumin and 0.2% Triton X-100 in PBS for 30 minutes, then incubated overnight at 4°C with primary antibodies diluted in blocking solution. Sections were washed in PBS, then incubated for 3-4 h at room temperature with secondary antibodies diluted 1:400-1:1000 in blocking solution. Following PBS washes, sections were mounted on slides with Fluoromount-G Mounting Medium with DAPI (ThermoFisher Scientific).
  • Fluorescence images were acquired using a Nikon Ti2-E inverted microscope fitted with an automated registered linear motor stage (HLD117, Prior Scientific), a Spectra-X 7 channel LED light engine (Lumencor), and standard filter sets for DAPI, FITC, TRITC, and Cy5. Images were stitched and analyzed with NIS-Elements (Nikon) using an automated script to identify and count electroporated cells in brain sections. Knockin-positive neurons were counted manually using ImageJ (NIH) and independently by at least two blinded investigators. Five 80 µm sections, centered at the middle of the anteroposterior axis of the electroporation field and taken every other section, were analyzed per brain, with counts aggregated across sections from the same brain.
  • Example 2 Identifying protein domains that add their enhancements to editing performance even when fused to Cas9 enables the production of a single fusion protein for use in ribonucleoprotein (RNP) editing applications, without the need for expression or delivery of these as additional co-factors.
  • IPTG-specific Cas9 immunoreactive bands corresponding to the predicted electrophoretic mobilities of full-length GST-MBP-Cas9-RC (316 kDa) and Cas9-RC (250 kDa) were abundant in lysates, indicating that Cas9-RC can be bacterially expressed for RNP applications.

Abstract

The present invention relates to nucleic acids comprising a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, vectors encoding the same, pharmaceutical compositions thereof and methods of modifying a genomic sequence in cells by administering the compositions.

Description

COMPOSITIONS AND METHODS FOR ENHANCED GENOME EDITING USING CAS9 FUSION PROTEINS
CROSS REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of U.S. Application No.: 63/611,473, filed December 18, 2023 and U.S. Application No.: 63/694,112, filed September 12, 2024, the contents of which are hereby incorporated by reference in their entireties.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT
This invention was made with government support under Grant Number MH122398 awarded by the National Institutes of Health. The government has certain rights in the invention.
INCORPORATION-BY-REFERENCE OF MATERIAL SUBMITTED ELECTRONICALLY
Incorporated by reference in its entirety herein is a computer-readable sequence listing submitted concurrently herewith and identified as follows: One 123,489 Byte XML file named “sequence_listing.xml,” created on December 18, 2024.
FIELD OF THE INVENTION
The field of the invention generally relates to biotechnology and medicine, in particular compositions and methods for genome editing.
BACKGROUND
Genome editing is a powerful technology that allows for the specific and often precise addition or removal of genetic material. Genome editing is initiated by making double-stranded DNA (dsDNA) breaks in the target cell. These breaks can be created by several methods, including meganucleases, Zinc-Finger Nucleases, TALE-nucleases, and/or the CRISPR/Cas9 restriction modification system. Each of these systems creates a dsDNA break at a user-designated genomic location. After the creation of the dsDNA break, the cellular machinery acts quickly to repair it by either the non-homologous end joining (NHEJ) pathway or homology-directed repair (HDR). While the NHEJ pathway efficiently repairs this break, repair is frequently imperfect, resulting in insertions and deletions. If these insertions and deletions created by NHEJ repair occur within open reading frames, the most common result is a frame-shift mutation. This frame shift often results in the inactivation of that particular gene. Repair of the dsDNA break by the HDR pathway not only can result in precise repair but also allows for the introduction of experimentally designed genomic elements. The correction of many diseases, that is, successful gene therapy, can be achieved by forcing the cell to correct the dsDNA break using HDR. Unfortunately for gene therapy researchers, clinicians, and patients, most human cells strongly prefer to correct dsDNA breaks via the error-prone NHEJ pathway as opposed to the more precise HDR pathway. Using endogenous cellular machinery, 95% of dsDNA breaks are repaired using NHEJ, while only 5% of dsDNA breaks are repaired using HDR. This statistic represents the best-case scenario; many cell types lack HDR machinery altogether, resulting in no repair using the precise HDR pathway. For precise gene therapy to be successful, a cell’s ability to use the HDR pathway must be improved.
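The link between NHEJ indels and frame-shift mutations described above follows from the three-nucleotide codon length: a net insertion or deletion that is not a multiple of three shifts the downstream reading frame. The following is a minimal illustrative sketch of that rule only; the function name and the example indel sizes are hypothetical and are not taken from the application.

```python
def classify_indel(net_length_change: int) -> str:
    """Classify an indel within an open reading frame by its net length change (bp).

    Positive values are insertions, negative values are deletions.
    """
    if net_length_change == 0:
        return "no net change"
    # Codons are 3 nt long, so only multiples of 3 preserve the reading frame.
    return "in-frame" if net_length_change % 3 == 0 else "frameshift"


# Hypothetical indel sizes, for illustration only.
indels = [-1, +1, -3, -7, +6, -12]
labels = [classify_indel(n) for n in indels]
for size, label in zip(indels, labels):
    print(f"{size:+d} bp: {label}")
```

Note that an in-frame indel can still disrupt protein function; the sketch only captures the reading-frame arithmetic.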
What is needed are improved compositions and methods that improve the efficiency and precision of genome editing.
SUMMARY
It is to be understood that both the foregoing general description of the embodiments and the following detailed description are exemplary, and thus do not restrict the scope of the embodiments.
Cas9 targets genomic loci with high specificity. For knockin with double-strand break repair, however, Cas9 often leads to unintended on-target knockout rather than intended edits. This imprecision is a barrier for direct in vivo editing where clonal selection is not feasible. The present inventors disclose a high-throughput workflow to quantify editing outcomes for creating and identifying editing agents with increased performance and their optimal combinations for knockin applications. The inventors have established editing efficiency and precision as generalizable assessment metrics for comparisons of knockin performance across existing and novel agents. Using this platform, the inventors aimed to enhance DSB repair-based editing performance by combinatorial screens of Cas9 variants, DNA donors, and new compound fusions to DNA repair protein domains. This workflow yielded Cas9-RC, a high-performance DSB repair-based editing agent with increased editing efficiency and precision. Cas9-RC was tested for its editing performance in vivo in the embryonic mouse brain, where it enhanced fluorescent protein knockin in some cases by in utero electroporation. These improvements showcase the utility of this workflow for continued development and assessment of new precision editing tools for in vivo knockin applications, including expression vectors, host cells and methods of producing the recombinant Cas9 fusion protein.
In one aspect, the invention provides a nucleic acid comprising a nucleotide sequence encoding a fusion protein, wherein the fusion protein comprises Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
In another aspect, the invention provides a nucleic acid comprising a nucleotide sequence encoding a fusion protein, wherein the fusion protein comprises Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
In some embodiments, Rad18 or a variant or fragment thereof is fused to the amino terminus of Cas9 or a variant or fragment thereof and CtIP or a variant or fragment thereof is fused to the carboxy terminus of Cas9 or a variant or fragment thereof.
In some embodiments, a Rad18 fragment is fused to the amino terminus of Cas9 or a variant or fragment thereof and an HDR Enhancing N-terminal fragment of CtIP is fused to the carboxy terminus of Cas9 or a variant or fragment thereof.
In some embodiments, the fusion protein comprises an amino acid sequence of SEQ ID NO:7.
In some embodiments, the invention provides a nucleic acid sequence that encodes a polypeptide comprising SEQ ID NO:7. In some embodiments, the nucleic acid sequence of the fusion protein comprises SEQ ID NO: 8.
In some embodiments, the Cas9 is a wild-type Cas9 from Streptococcus pyogenes. In some embodiments, the fusion protein comprises a fragment of Cas9 that lacks an N-terminal methionine. In some embodiments, the Cas9 or fragment or variant thereof comprises a nucleic acid sequence encoding a polypeptide comprising SEQ ID NO: 1. In some embodiments, the Cas9 or fragment or variant thereof nucleic acid sequence comprises SEQ ID NO:2 (DNA) or SEQ ID NO:36 (RNA). In some embodiments, Cas9 or fragment or variant thereof is a high fidelity variant (Cas9-HF). In some embodiments, the fusion protein comprises a fragment of CtIP. In some embodiments, the fusion protein comprises the HDR Enhancing N-terminal fragment of CtIP, comprising SEQ ID NO:3 (amino acids 1-296). In some embodiments, the CtIP fragment nucleic acid sequence comprises SEQ ID NO:4 (DNA) or SEQ ID NO:38 (RNA). In some embodiments, the CtIP, variant or fragment lacks an N-terminal methionine.
In some embodiments, the fusion protein comprises a variant or fragment of Rad18. In some embodiments, the fusion protein comprises a fragment of Rad18, which contains a deletion of a putative DNA-binding domain. In some embodiments, this enhanced version of Rad18 (also referred to herein as eRad18) has a deletion in amino acids 242-282 in Rad18 and comprises SEQ ID NO:5. In some embodiments, the eRad18 nucleic acid sequence comprises SEQ ID NO:6 (DNA) or SEQ ID NO:40 (RNA). In some embodiments, the Rad18, variant or fragment lacks an N-terminal methionine.
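As a purely illustrative sketch of the sequence operations named above (removing residues 242-282 and omitting an N-terminal methionine), the following uses a placeholder string rather than the actual Rad18 sequence or any SEQ ID NO from the sequence listing. The function names are hypothetical, and the residue numbering is assumed to be 1-based and inclusive.

```python
def delete_residues(seq: str, start: int, end: int) -> str:
    """Remove residues start..end (1-based, inclusive) from a protein sequence."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("deletion range falls outside the sequence")
    return seq[:start - 1] + seq[end:]


def drop_initial_met(seq: str) -> str:
    """Return the sequence without a leading methionine, if one is present."""
    return seq[1:] if seq.startswith("M") else seq


# Placeholder 300-residue sequence (NOT Rad18): positions 242-282 are marked "X".
toy = "M" + "A" * 240 + "X" * 41 + "A" * 18
trimmed = delete_residues(toy, 242, 282)   # removes the 41 marked residues
fusion_ready = drop_initial_met(trimmed)   # e.g. for an internal position in a fusion
print(len(toy), len(trimmed), len(fusion_ready))
```

Marking the deleted window with a distinct character makes it easy to confirm that the off-by-one handling matches the intended 1-based coordinates.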
In some embodiments, the nucleic acid is selected from DNA or RNA. In some embodiments, the nucleic acid is RNA. In some embodiments, the RNA is circular RNA (circRNA).
In some embodiments, the nucleic acid encoding the fusion protein comprises a vector. In some embodiments, the vector is a viral vector. In some embodiments, the vector is a mammalian expression vector.
In another aspect, the invention provides a host cell transformed or transfected with a vector encoding a fusion protein, wherein the fusion protein comprises Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
In another aspect, the invention provides a host cell transformed or transfected with a vector encoding a fusion protein, wherein the fusion protein comprises Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
In some embodiments, the invention provides a pharmaceutical composition comprising a nucleic acid encoding the fusion protein. In some embodiments, the composition comprises a delivery system, such as lipid nanoparticles or lipid-like nanoparticles. In some embodiments, the lipid or lipid-like nanoparticles are ionizable.
In some embodiments, the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a nucleic acid encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify the genomic sequence in the cell. In some embodiments, the one or more additional agents include guide RNA(s) and/or template nucleic acid.
In some embodiments, the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a nucleic acid encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify the genomic sequence in the cell. In some embodiments, the one or more additional agents include guide RNA(s) and/or template nucleic acid.
In some embodiments, the invention provides a method of treating a disease or condition in a subject by administering to the subject an effective amount of the modified cells.
In some embodiments, the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a nucleic acid encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify a genomic sequence in the subject. In some embodiments, the one or more additional agents include guide RNA(s) and/or template nucleic acid.
In some embodiments, the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a nucleic acid encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify genomic sequence in the subject. In some embodiments, the one or more additional agents include guide RNA(s) and/or template nucleic acid.
In some embodiments, the present invention provides compositions and methods for expression and purification of Cas9 fusion proteins and variants thereof in prokaryotic host cells.
In some embodiments, the invention provides a prokaryotic expression vector encoding a fusion protein, wherein the fusion protein comprises Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
In another aspect, the invention provides a prokaryotic host cell transformed or transfected with a prokaryotic expression vector encoding a fusion protein, wherein the fusion protein comprises Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
In some embodiments, the invention provides a prokaryotic expression vector encoding a fusion protein, wherein the fusion protein comprises Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
In another aspect, the invention provides a method for producing the fusion protein in a prokaryotic host cell, comprising culturing the prokaryotic host cell in a growth media under conditions suitable for the expression of the fusion protein and isolating the fusion protein.
Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating specific embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
BRIEF DESCRIPTION OF THE DRAWINGS
The skilled artisan will understand that the drawings, described below, are for illustration purposes only. The drawings are not intended to limit the scope of the present teachings in any way.
FIG. 1. BFP-to-GFP editing as a platform to quantify knockin efficiency and precision. (A) Schematic of BFP-to-GFP conversion assay screen. Genome editing agents are tested by transfecting HEK:BFP cells, a HEK293 knockin cell line that genomically expresses BFP driven by an EF1α promoter. Successful editing knocks in the H62Y mutation into the BFP locus, thus producing GFP. Precisely edited cells will change from blue to green, while imprecise edits on the BFP locus will largely result in indel knockouts and disrupt fluorescence, turning those cells from blue to dark. Knockin efficiency is calculated as the proportion of GFP+ cells over total cells, and knockin precision is calculated as the ratio of GFP+ cells to dark cells. (B) Schematic outlining sorting and sequencing of cells treated with agents to edit the BFP locus. Fluorescence plot shows FACS isolation of distinct cell populations following treatment for H62Y editing. GFP fluorescence is shown on the x-axis (log), and BFP fluorescence on the y-axis (log). Blue, green, and dark collection gates are indicated. Genotyping of the three collected populations was performed by PCR of the BFP locus from genomic DNA, followed by Sanger sequencing of amplified fragment pools. (C) Alignment of Sanger sequencing reads from BFP+, dark, and GFP+ cell populations to the reference BFP locus sequence (top), showing wild-type, knockout (indel), and knockin genotypes, respectively. (D) Editing outcomes (indels, knockin frequency) were quantified by decomposition of Sanger sequencing reads using the ICE algorithm and plotted on relative histograms binned by indel size. GFP+ cells represent true knockins, and over 90% of dark cells contain indels predicted to cause knockout.
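The two readouts defined in FIG. 1(A) can be written out as simple ratios of gated cell counts. The sketch below is illustrative only: the function name and the counts are hypothetical, and it assumes that "total" means the sum of the BFP+, GFP+, and dark gates, which the legend does not state explicitly.

```python
def knockin_metrics(bfp_pos: int, gfp_pos: int, dark: int) -> tuple[float, float]:
    """Return (knockin efficiency, knockin precision) from gated cell counts.

    efficiency = GFP+ / total cells (assumed here: BFP+ + GFP+ + dark)
    precision  = GFP+ / dark cells  (knockin-to-knockout ratio)
    """
    total = bfp_pos + gfp_pos + dark
    if total == 0:
        raise ValueError("no cells counted")
    efficiency = gfp_pos / total
    # With no dark (knockout) cells the ratio is unbounded.
    precision = gfp_pos / dark if dark else float("inf")
    return efficiency, precision


# Hypothetical counts, for illustration only.
eff, prec = knockin_metrics(bfp_pos=7000, gfp_pos=500, dark=2500)
print(f"efficiency = {eff:.1%}, precision (KI/KO) = {prec:.2f}")
```

Because precision is a ratio against the knockout population rather than against all cells, an agent can raise precision either by increasing knockin or by suppressing indel formation.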
FIG. 2. Cas9-CtIP fusion and HMEJ donors additively improve knockin precision. (A) Schema illustrating the combinatorial parameters of editing agents: dsDNA donor template HR or HMEJ variants in combination with Cas9 WT or HiFi variants, alone or fused to CtIP. (B) Flow cytometry plots of HEK:BFP cells 7 days after transient transfection with gRNA, Cas9, and donor plasmids. GFP fluorescence is shown on the x-axis (log), and BFP fluorescence on the y-axis (log). Quantification gates are indicated on the plots for BFP+ (WT), GFP+ (KI), and dark (KO) cells. (D) Quantification of flow data representing normalized knockin efficiency (% GFP+), knockout frequency (% dark), and knockin precision. Cas9 variants with HR (top row, yellow) or HMEJ (bottom row, purple) donors were measured. Mean values from individual experiments (n > 3) were normalized to those of the Cas9 WT with HR donor condition and presented as the mean ± SEM. (C) Heatmaps displaying statistical significance of pairwise comparisons of editing agent performance, calculated using one-way ANOVA with Tukey’s multiple comparison test and single pooled variance (* P<0.05; ** P<0.01; *** P < 0.001). Cas9-CtIP fusion with HMEJ donor outperforms Cas9 with HR donor in knockin precision by over 30-fold.
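The normalization and mean ± SEM summary mentioned in this legend can be sketched as follows. This is one plausible reading, not the inventors' analysis code: it assumes each experiment's value is divided by the reference-condition value from the same experiment, and the function name and numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def normalized_mean_sem(condition: list[float], reference: list[float]) -> tuple[float, float]:
    """Normalize per-experiment values to a paired reference, then return (mean, SEM)."""
    if len(condition) != len(reference) or len(condition) < 2:
        raise ValueError("need at least two paired experiments")
    ratios = [c / r for c, r in zip(condition, reference)]
    # SEM = sample standard deviation / sqrt(number of experiments)
    return mean(ratios), stdev(ratios) / sqrt(len(ratios))


# Hypothetical per-experiment values, for illustration only.
m, sem = normalized_mean_sem(condition=[2.0, 3.0, 4.0], reference=[1.0, 1.0, 2.0])
print(f"{m:.2f} ± {sem:.2f} (fold over reference)")
```

Under this scheme the reference condition itself has a normalized value of 1 in every experiment, so fold-changes are read directly from the plotted means.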
FIG. 3. Iterative screening of novel Cas9 fusions and compound fusions with DNA-repair domains for increased editing performance. (A) Schema of the fusion screen. Candidate DNA-repair protein domains are fused N-terminally to either Cas9 alone or Cas9-CtIP and together with HMEJ donor assayed by BFP-to-GFP for knockin efficiency and precision. (B) Bar graphs showing quantification of relative knockin efficiency, knockout frequency, and knockin precision for Cas9 fusion (light blue) or Cas9-CtIP compound fusion (dark blue) with the listed DNA-repair protein domains. Values from individual experiments (n > 3) were normalized to Cas9 only and presented as the mean ± SEM. (C) 2D editing performance plot comparing relative knockin efficiency and precision of Cas9 fusions (light blue) or Cas9-CtIP compound fusions (dark blue). Points with yellow dashed lines projecting to the axes correspond to Cas9 alone (normalization reference) and Cas9-CtIP. (D) Heatmaps showing statistical significance calculated using one-way ANOVA with Tukey’s multiple comparison test and single pooled variance (* P<0.05; ** P<0.01; *** P < 0.001). Data of replicate experiments are shown in Fig. 9. (E) Schema illustrating comparative performance outcomes: novel fusion eRad18-Cas9 has equivalent performance to Cas9-CtIP. Novel compound fusion eRad18-Cas9-CtIP, henceforth named Cas9-RC, outperforms single fusions.
FIG. 4. Expression of affinity-tagged Cas9-RC ribonucleoprotein. (A) Schema of the Cas9-RC ribonucleoprotein N-terminally fused with cleavable tandem GST and MBP affinity tags. The molecular weights of each domain (kDa) and an HRV 3C protease cleavage site (red triangle) are indicated. (B) Protein electrophoresis gel stained with Coomassie blue showing lysates from bacteria transformed with GST-MBP-Cas9-RC expression construct prior to (left lane) and after inducing expression with IPTG for 3 h (right lane). Arrows indicate putative GST-MBP-Cas9RC (316 kDa) and Cas9-RC (250 kDa) bands. (C) Anti-Cas9 immunoblot of lysate lanes as in (B) shows Cas9 immunoreactive bands after IPTG induction, including bands with electrophoretic mobilities corresponding to full-length GST-MBP-Cas9RC and Cas9-RC, as indicated.
FIG. 5. Cas9-RC enhances knockin efficiency in vivo. (A) Schema of the Cas9 and Cas9-RC agents compared for in vivo editing using in utero electroporation in the fetal mouse brain at embryonic day (E) 14.5, and analysis in the cerebral cortex at postnatal day (P) 7. (B) Gene editing donor template DNA targets the endogenous ActB locus to insert mCherry downstream of the β-actin coding sequence separated by a 2A. Knockin efficiency was quantified as the number of mCherry+ (knockin) neurons over the number of GFP+ (electroporated) neurons. (C) Representative fluorescence images of the cerebral cortex at P7 with electroporated neurons receiving Cas9 or Cas9-RC. Images show plasmid GFP (green) and genomic ActB-2A-mCherry (red) expression. Scale bar 100 μm. (D) Swarm plots showing quantification of in vivo knockin efficiency for Cas9 vs. Cas9-RC and effect size estimation. Points show means from each brain and are plotted on the left axis for both groups indicating knockin efficiency. The effect size on knockin efficiency of Cas9-RC versus Cas9 is plotted as a distribution on a floating axis on the right indicating standard deviations (S). The effect size estimated by unpaired Cohen's d between Cas9 and Cas9-RC is 2.0 S (black dot), indicating a large effect size. The 95% confidence interval is 0.739 to 3.97 (vertical error bar). The P value is 0.0022.
FIG. 6. Knockin applications with Cas9-RC. (A) Primary fibroblasts from wildtype newborn mice were cuvette electroporated (nucleofected, schema on the right) with Cas9-RC plasmid and ActB-2A-mCherry donor DNA. Fibroblasts show two intensities of mCherry fluorescence (red) from the mouse β-Actin locus, indicating mCherry knockin on one (arrowheads) or both (arrows) ActB alleles. (B) Cortical projection neurons (arrows) and astrocytes (arrowheads) are knocked in with Cas9-RC plasmid and ActB-2A-mCherry donor DNA in utero as in Fig. 4, but with high-yield tritrode in utero electroporation. Increasing the efficiency of electroporation is a significant practical determinant of knockin due to the large size of Cas9-RC. (C) mScarlet-fusion knockin onto Negr1, which represents a locus with developmentally regulated levels of expression that vary between neurons. Cas9-RC and mScarlet-Negr1 donor DNA were in utero electroporated in E14.5 mouse brain to extracellularly tag the endogenous Negr1 GPI-linked membrane protein with mScarlet. Knockin cells displayed variable levels of mScarlet fluorescence (red) with a wide range of high (arrows) and low (arrowheads) expressing cells. Quantification (shown on the right) of mScarlet-positive cells over GFP electroporated cells (green) showed no significant difference when using Cas9-RC over Cas9. (D) mGreenLantern knockin in cerebellar Purkinje cells, which represent a low efficiency cell target. Cas9-RC (red) and Pvalb-2A-mGreenLantern donor DNA were delivered with high-yield tritrode electroporation into the fourth brain ventricle in E11.5 embryos (schema on the right) to knock in mGreenLantern (green) onto the Pvalb locus. Endogenous PVALB protein was co-labeled (grey). Electroporated knockin positive (arrow) and knockin negative (arrowhead) cells can be seen in the Purkinje cell layer in P14 cerebellum. Cas9-RC has versatile applications and may offer advantages that outweigh the disadvantage of its large size, depending on the target gene and cell type. Color labels and scale bars as indicated.
FIG. 7. Map of bacterial expression vector encoding Cas9-RC.
FIG. 8. Gating strategy for flow cytometry analysis of editing in HEK:BFP cells. Live cells were first gated by size and granularity using FSC-A vs SSC-A and then singlets were gated using SSC-A vs SSC-H.
FIG. 9. Biological replicate data and statistical analysis related to Fig. 3. (a-c) Quantification of flow cytometry data for biological replicates from HEK:BFP cells 7 days after transient transfection indicating (a) KI efficiency (% GFP+), (b) KO frequency (% dark), and (c) KI precision (KI/KO ratio) for Cas9 variants with HMEJ donor. (d-f) Heatmap matrices showing statistical significance calculated using a one-way ANOVA with Tukey’s multiple comparison test and pooled variance (also shown in Fig. 3). Differences between conditions were judged to be significant at P < 0.05 (*), P < 0.01 (**), and P < 0.001 (***).
FIG. 10. Expression of affinity-tagged Cas9-RC ribonucleoprotein. (A) Protein electrophoresis gel stained with Coomassie blue showing lysates from bacteria transformed with GST-MBP-Cas9-RC expression construct prior to (left lane) and after inducing expression with IPTG for 3 h (right lane). Arrows indicate putative GST-MBP-Cas9RC (316 kDa) and Cas9-RC (250 kDa) bands. (B) Anti-Cas9 immunoblot of lysate lanes shows Cas9 immunoreactive bands after IPTG induction, including bands with electrophoretic mobilities corresponding to full-length GST-MBP-Cas9RC and Cas9-RC, as indicated. (C) Schema of the affinity-tagged Cas9-RC ribonucleoprotein, N-terminally fused with cleavable tandem GST and MBP affinity tags. The molecular weights of each domain (kDa) and an HRV 3C protease cleavage site (red triangle) are indicated.
DETAILED DESCRIPTION OF THE INVENTION
The present invention generally relates to compositions and methods for modifying a genomic sequence of a cell, including nucleic acids and polypeptides encoding fusion proteins comprising Cas9 or a variant or fragment thereof, Rad52 or Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify genomic sequence in the subject.
Reference will now be made in detail to the presently preferred embodiments of the invention which, together with the drawings and the following examples, serve to explain the principles of the invention. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, and it is understood that other embodiments may be utilized, and that structural, biological, and chemical changes may be made without departing from the spirit and scope of the present invention. Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art.
The practice of the present invention employs techniques of molecular biology (including recombinant techniques), microbiology, cell biology, and biochemistry which are within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Sambrook et al. Molecular Cloning: A Laboratory Manual, 2nd edition (1989); Current Protocols in Molecular Biology (F. M. Ausubel et al. eds. (1987)); the series Methods in Enzymology (Academic Press, Inc.); PCR: A Practical Approach (M. MacPherson et al. IRL Press at Oxford University Press (1991)); PCR 2: A Practical Approach (M. I. MacPherson, B. D. Hames and G. R. Taylor eds. (1995)); Antibodies, A Laboratory Manual (Harlow and Lane eds. (1988)); Using Antibodies, A Laboratory Manual (Harlow and Lane eds. (1999)); and Animal Cell Culture (R. I. Freshney ed. (1987)).
Definitions of common terms in molecular biology may be found, for example, in Benjamin Lewin, Genes VII, published by Oxford University Press, 2000 (ISBN 019879276X); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Publishers, 1994 (ISBN 0632021829); and Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by Wiley, John & Sons, Inc., 1995 (ISBN 0471186341).
For the purpose of interpreting this specification, the following definitions will apply and whenever appropriate, terms used in the singular will also include the plural and vice versa. In the event that any definition set forth below conflicts with the usage of that word in any other document, including any document incorporated herein by reference, the definition set forth below shall always control for purposes of interpreting this specification and its associated claims unless a contrary meaning is clearly intended (for example in the document where the term is originally used). The use of "or" means "and/or" unless stated otherwise. As used in the specification and claims, the singular forms "a," "an" and "the" include plural references unless the context clearly dictates otherwise. For example, the term "a cell" includes a plurality of cells, including mixtures thereof. The terms “comprise,” “comprises,” “comprising,” “include,” “includes,” and “including” are interchangeable and not intended to be limiting. Furthermore, where the description of one or more embodiments uses the term “comprising,” those skilled in the art would understand that, in some specific instances, the embodiment or embodiments can be alternatively described using the language “consisting essentially of” and/or “consisting of.”
As used herein, the term "about" means plus or minus 10% of the numerical value of the number with which it is being used.
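By way of non-limiting illustration only, the definition of "about" can be expressed as a short Python sketch. The function names `about_range` and `is_about` are hypothetical and are not part of this specification; treating the bounds as inclusive is an assumption made for the example.

```python
def about_range(value, tolerance=0.10):
    """Return the (low, high) bounds covered by "about value".

    The default tolerance of 0.10 mirrors the plus-or-minus 10% definition.
    """
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


def is_about(x, value, tolerance=0.10):
    """Return True if x lies within plus or minus 10% of value.

    Inclusive bounds are an assumption of this sketch.
    """
    low, high = about_range(value, tolerance)
    return low <= x <= high
```

For instance, under this sketch "about 20" spans 18 to 22, so a 21-nucleotide sequence would be "about 20" nucleotides long while a 23-nucleotide sequence would not.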
The terms "nucleic acid" and "polynucleotide" are used interchangeably and refer to a deoxyribonucleotide or ribonucleotide polymer, in linear or circular conformation, and in either single- or double-stranded form. For the purposes of the present disclosure, these terms are not to be construed as limiting with respect to the length of a polymer. The terms can encompass known analogues of natural nucleotides, as well as nucleotides that are modified in the base, sugar and/or phosphate moieties. The following are non-limiting examples of polynucleotides: coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, short interfering RNA (siRNA), short-hairpin RNA (shRNA), micro-RNA (miRNA), ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A polynucleotide may comprise one or more modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component.
In aspects of the invention the terms “chimeric RNA”, “chimeric guide RNA”, “guide RNA”, “single guide RNA” and “synthetic guide RNA” are used interchangeably and refer to the polynucleotide sequence comprising the guide sequence, the tracr sequence and the tracr mate sequence. The term “guide sequence” refers to the about 20 bp sequence within the guide RNA that specifies the target site and may be used interchangeably with the terms “guide” or “spacer”. The term “tracr mate sequence” may also be used interchangeably with the term “direct repeat(s)”. Exemplary CRISPR-Cas systems are provided in U.S. Pat. No. 8,697,359 and US 20140234972, both of which are incorporated herein by reference in their entirety.
The terms "polypeptide," "peptide" and "protein" are used interchangeably to refer to a polymer of amino acid residues. The term also applies to amino acid polymers in which one or more amino acids are chemical analogues or modified derivatives of corresponding naturally-occurring amino acids. The term "sequence" relates to a nucleotide sequence of any length, which can be DNA or RNA; can be linear, circular or branched and can be either single-stranded or double stranded.
The term "identity" relates to an exact nucleotide-to-nucleotide or amino acid-to-amino acid correspondence of two polynucleotides or polypeptide sequences, respectively. Two or more sequences (polynucleotide or amino acid) can be compared by determining their percent identity. Calculations of homology or sequence identity between two sequences (the terms are used interchangeably herein) are performed as follows. The sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). The optimal alignment is determined as the best score using the GAP program in the GCG software package with a BLOSUM62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frame shift gap penalty of 5. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The percentage identity between the two sequences is a function of the number of identical positions shared by the sequences.
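By way of non-limiting illustration only, the position-by-position comparison described above can be sketched in Python for two sequences that have already been aligned. This sketch does not reproduce the GAP program or its scoring parameters; the function name `percent_identity`, the use of `-` as the gap character, and the choice of the number of aligned columns as the denominator are all assumptions made for the example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumptions of this sketch: gaps are written as '-', columns that are
    gaps in both sequences are ignored, and the denominator is the number
    of remaining aligned columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    identical = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # nothing to compare in this column
        compared += 1
        if a == b:
            identical += 1  # same residue or nucleotide at this position

    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * identical / compared
```

Under these assumptions, the aligned pair `ACGT-A` and `ACGTTA` shares five identical positions out of six aligned columns. Other conventions for the denominator (for example, the length of the shorter sequence) would give different values.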
"Sequence similarity" between polynucleotides can be determined by hybridization of polynucleotides under conditions that allow formation of stable duplexes between homologous regions, followed by digestion with single- stranded- specific nuclease(s), and size determination of the digested fragments.
Numeric ranges are inclusive of the numbers defining the range. Unless otherwise indicated, nucleic acids are written left to right in 5' to 3’ orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively. It is to be understood that this invention is not limited to the particular methodology, protocols, and reagents described, as these may vary, depending upon the context they are used by those of skill in the art.
As used herein, "recombinant" includes reference to a cell or vector that has been modified by the introduction of a heterologous nucleic acid sequence or that the cell is derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found in identical form within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, underexpressed or not expressed at all as a result of deliberate human intervention.
As used herein the term “wild type” is a term of the art understood by skilled persons and means the typical form of an organism, strain, gene or characteristic as it occurs in nature as distinguished from mutant or variant forms.
As used herein the term “variant” should be taken to mean the exhibition of qualities that have a pattern that deviates from what occurs in nature.
The terms “non-naturally occurring,” “modified” or “engineered” are used interchangeably and indicate the involvement of the hand of man. The terms, when referring to nucleic acid molecules or polypeptides mean that the nucleic acid molecule or the polypeptide is at least substantially free from at least one other component with which they are naturally associated in nature and as found in nature.
“Complementarity” refers to the ability of a nucleic acid to form hydrogen bond(s) with another nucleic acid sequence by either traditional Watson-Crick or other non-traditional types. A percent complementarity indicates the percentage of residues in a nucleic acid molecule which can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleic acid sequence (e.g., 5, 6, 7, 8, 9, 10 out of 10 being 50%, 60%, 70%, 80%, 90%, and 100% complementary). “Perfectly complementary” means that all the contiguous residues of a nucleic acid sequence will hydrogen bond with the same number of contiguous residues in a second nucleic acid sequence. “Substantially complementary” as used herein refers to a degree of complementarity that is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100% over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, or more nucleotides, or refers to two nucleic acids that hybridize under stringent conditions.
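By way of non-limiting illustration only, the percent complementarity calculation (e.g., 8 out of 10 residues being 80% complementary) can be sketched in Python. The function name `percent_complementarity` is hypothetical. The sketch assumes two equal-length strands both written 5' to 3', pairs them in antiparallel orientation, and counts only Watson-Crick pairs, although the definition above also contemplates non-traditional pairing.

```python
# Watson-Crick pairs for DNA and RNA (an assumption of this sketch;
# wobble and other non-traditional pairs are not counted).
_WC_PAIRS = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
}


def percent_complementarity(strand, other):
    """Percent of residues in `strand` that pair with `other`.

    Both strands are given 5'->3' and must have equal length; position i
    of `strand` is paired with position (n - 1 - i) of `other`.
    """
    strand, other = strand.upper(), other.upper()
    if len(strand) != len(other):
        raise ValueError("strands must have equal length")
    if not strand:
        raise ValueError("strands must not be empty")

    paired = sum(
        (a, b) in _WC_PAIRS for a, b in zip(strand, reversed(other))
    )
    return 100.0 * paired / len(strand)
```

For example, under this sketch a 10-nucleotide strand and its exact reverse complement are 100% complementary, and introducing two mismatches lowers the value to 80%.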
As used herein, “stringent conditions” for hybridization refer to conditions under which a nucleic acid having complementarity to a target sequence predominantly hybridizes with the target sequence, and substantially does not hybridize to non-target sequences. Stringent conditions are generally sequence-dependent, and vary depending on a number of factors. In general, the longer the sequence, the higher the temperature at which the sequence specifically hybridizes to its target sequence. Non-limiting examples of stringent conditions are described in detail in Tijssen (1993), Laboratory Techniques In Biochemistry And Molecular Biology-Hybridization With Nucleic Acid Probes Part 1, Second Chapter “Overview of principles of hybridization and the strategy of nucleic acid probe assay”, Elsevier, N.Y.
“Hybridization” refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence specific manner. The complex may comprise two strands forming a duplex structure, three or more strands forming a multi stranded complex, a single self-hybridizing strand, or any combination of these. A hybridization reaction may constitute a step in a more extensive process, such as the initiation of PCR, or the cleavage of a polynucleotide by an enzyme. A sequence capable of hybridizing with a given sequence is referred to as the “complement” of the given sequence.
As used herein, “expression” refers to the process by which a polynucleotide is transcribed from a DNA template (such as into an mRNA or other RNA transcript) and/or the process by which a transcribed mRNA is subsequently translated into peptides, polypeptides, or proteins. Transcripts and encoded polypeptides may be collectively referred to as “gene product.” If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell.
The terms “subject,” “individual,” and “patient” are used interchangeably herein to refer to a vertebrate, preferably a mammal, more preferably a human. Mammals include, but are not limited to, murines, simians, humans, farm animals, sport animals, and pets. Tissues, cells and their progeny of a biological entity obtained in vivo or cultured in vitro are also encompassed.
The terms “therapeutic agent,” “therapeutic capable agent” or “treatment agent” are used interchangeably and refer to a molecule or compound that confers some beneficial effect upon administration to a subject. The beneficial effect includes enablement of diagnostic determinations; amelioration of a disease, symptom, disorder, or pathological condition; reducing or preventing the onset of a disease, symptom, disorder or condition; and generally counteracting a disease, symptom, disorder or pathological condition.
As used herein, “treatment” or “treating,” or “palliating” or “ameliorating” are used interchangeably. These terms refer to an approach for obtaining beneficial or desired results including but not limited to a therapeutic benefit and/or a prophylactic benefit. By therapeutic benefit is meant any therapeutically relevant improvement in or effect on one or more diseases, conditions, or symptoms under treatment. For prophylactic benefit, the compositions may be administered to a subject at risk of developing a particular disease, condition, or symptom, or to a subject reporting one or more of the physiological symptoms of a disease, even though the disease, condition, or symptom may not have yet been manifested.
The term “effective amount” or “therapeutically effective amount” refers to the amount of an agent that is sufficient to effect beneficial or desired results. The therapeutically effective amount may vary depending upon one or more of: the subject and disease condition being treated, the weight and age of the subject, the severity of the disease condition, the manner of administration and the like, which can readily be determined by one of ordinary skill in the art. The term also applies to a dose that will provide an image for detection by any one of the imaging methods described herein. The specific dose may vary depending on one or more of: the particular agent chosen, the dosing regimen to be followed, whether it is administered in combination with other compounds, timing of administration, the tissue to be imaged, and the physical delivery system in which it is carried.
In particular, in some embodiments, the present invention relates to compositions and methods for using nucleic acids and/or polypeptides encoding a Cas9 fusion protein to enhance genomic editing. The compositions are useful in genomic or nucleic acid modification in vitro, ex vivo, and in vivo for a variety of research, screening, and therapeutic applications.
Nucleic acids and Polypeptides
In some embodiments, the invention provides a nucleic acid encoding a fusion protein comprising a Cas9 polypeptide or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
In some embodiments, the invention provides a nucleic acid encoding a fusion protein comprising a Cas9 polypeptide or a variant or fragment thereof, a nucleic acid encoding Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
In some embodiments, the nucleic acid is DNA. In some embodiments, the nucleic acid is RNA. In some embodiments, the nucleic acid is a circular RNA molecule.
In some embodiments, the invention provides a polypeptide encoding a fusion protein comprising a Cas9 polypeptide or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof. In some embodiments, the invention provides a polypeptide encoding a fusion protein comprising a Cas9 polypeptide or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
Cas9
As provided herein, the nucleic acid or polypeptide encodes a fusion protein that comprises Cas9 or a variant or fragment thereof. The Cas9 or variant or fragment thereof is not particularly limiting. Cas molecules of a variety of species can be used in nucleic acids, polypeptides, vectors and methods described herein.
In some embodiments, the Cas9 is from Staphylococcus aureus. In some embodiments, the Cas9 is from S. pyogenes, S. thermophilus, or Neisseria meningitidis. Additional Cas9 species include: Acidovorax avenae, Actinobacillus pleuropneumoniae, Actinobacillus succinogenes, Actinobacillus suis, Actinomyces sp., cycliphilus denitrificans, Aminomonas paucivorans, Bacillus cereus, Bacillus smithii, Bacillus thuringiensis, Bacteroides sp., Blastopirellula marina, Bradyrhizobium sp., Brevibacillus laterosporus, Campylobacter coli, Campylobacter jejuni, Campylobacter lari, Candidatus Puniceispirillum, Clostridium cellulolyticum, Clostridium perfringens, Corynebacterium accolens, Corynebacterium diphtheria, Corynebacterium matruchotii, Dinoroseobacter shibae, Eubacterium dolichum, gamma proteobacterium, Gluconacetobacter diazotrophicus, Haemophilus parainfluenzae, Haemophilus sputorum, Helicobacter canadensis, Helicobacter cinaedi, Helicobacter mustelae, Ilyobacter polytropus, Kingella kingae, Lactobacillus crispatus, Listeria ivanovii, Listeria monocytogenes, Listeriaceae bacterium, Methylocystis sp., Methylosinus trichosporium, Mobiluncus mulieris, Neisseria bacilliformis, Neisseria cinerea, Neisseria flavescens, Neisseria lactamica, Neisseria sp., Neisseria wadsworthii, Nitrosomonas sp., Parvibaculum lavamentivorans, Pasteurella multocida, Phascolarctobacterium succinatutens, Ralstonia syzygii, Rhodopseudomonas palustris, Rhodovulum sp., Simonsiella muelleri, Sphingomonas sp., Sporolactobacillus vineae, Staphylococcus lugdunensis, Streptococcus sp., Subdoligranulum sp., Tistrella mobilis, Treponema sp., or Verminephrobacter
A Cas9 molecule, as that term is used herein, refers to a molecule that can interact with a gRNA molecule and, in concert with the gRNA molecule, localize (e.g., target or home) to a site which comprises a target domain and PAM sequence.
The Cas9 molecule is capable of cleaving a target nucleic acid molecule. The ability of a Cas9 molecule to interact with and cleave a target nucleic acid is PAM sequence dependent. A PAM sequence is a sequence in the target nucleic acid. In an embodiment, cleavage of the target nucleic acid occurs upstream from the PAM sequence. Cas9 molecules from different bacterial species can recognize different sequence motifs (e.g., PAM sequences). In some embodiments, a Cas9 molecule of S. pyogenes recognizes the sequence motif NGG and directs cleavage of a target nucleic acid sequence 1 to 10, e.g., 3 to 5, base pairs upstream from that sequence. See, e.g., Mali et al., Science 2013; 339(6121): 823-826. In some embodiments, a Cas9 molecule of S. thermophilus recognizes the sequence motif NGGNG and NNAGAAW (W=A or T) and directs cleavage of a core target nucleic acid sequence 1 to 10, e.g., 3 to 5, base pairs upstream from these sequences. See, e.g., Horvath et al., Science 2010; 327(5962):167-170, and Deveau et al., J Bacteriol 2008; 190(4): 1390-1400. In some embodiments, a Cas9 molecule of S. mutans recognizes the sequence motif NGG or NAAR (R=A or G) and directs cleavage of a core target nucleic acid sequence 1 to 10, e.g., 3 to 5, base pairs upstream from this sequence. See, e.g., Deveau et al., J Bacteriol 2008; 190(4): 1390-1400. In some embodiments, a Cas9 molecule of S. aureus recognizes the sequence motif NNGRRT (R=A or G) and directs cleavage of a target nucleic acid sequence 1 to 10, e.g., 3 to 5, base pairs upstream from that sequence. In some embodiments, a Cas9 molecule of N. meningitidis recognizes the sequence motif NNNNGATT and directs cleavage of a target nucleic acid sequence 1 to 10, e.g., 3 to 5, base pairs upstream from that sequence. See, e.g., Hou et al., PNAS Early Edition 2013, 1-6. The ability of a Cas9 molecule to recognize a PAM sequence can be determined, e.g., using a transformation assay described in Jinek et al., Science 2012, 337:816.
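By way of non-limiting illustration only, the PAM sequence motifs recited above can be located in a DNA sequence with a short Python sketch that expands the degenerate codes N, R, and W into regular expressions. The names `PAM_MOTIFS`, `pam_regex`, and `find_pam_sites` are hypothetical; the sketch searches only the strand supplied, does not predict cleavage positions, and is not a substitute for the experimental assays cited above.

```python
import re

# Degenerate nucleotide codes used by the motifs recited above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]",  # any nucleotide
    "R": "[AG]",    # A or G
    "W": "[AT]",    # A or T
}

# PAM motifs as recited in the text, keyed by species.
PAM_MOTIFS = {
    "S. pyogenes": ["NGG"],
    "S. thermophilus": ["NGGNG", "NNAGAAW"],
    "S. mutans": ["NGG", "NAAR"],
    "S. aureus": ["NNGRRT"],
    "N. meningitidis": ["NNNNGATT"],
}


def pam_regex(motif):
    """Compile a motif into a regex; the lookahead allows overlapping hits."""
    pattern = "".join(IUPAC[code] for code in motif)
    return re.compile("(?=(" + pattern + "))")


def find_pam_sites(sequence, species):
    """Return sorted (start_index, motif, matched_text) tuples.

    start_index is 0-based on the strand given (read 5'->3').
    """
    sequence = sequence.upper()
    hits = []
    for motif in PAM_MOTIFS[species]:
        for match in pam_regex(motif).finditer(sequence):
            hits.append((match.start(), motif, match.group(1)))
    return sorted(hits)
```

In this sketch, calling `find_pam_sites("AAATGGCCC", "S. pyogenes")` reports a single NGG match (`TGG`) beginning at index 3. A fuller tool would also scan the reverse complement strand.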
Exemplary naturally occurring Cas9 molecules are described in Chylinski et al., RNA Biology 2013; 10:5, 727-737, which is incorporated herein by reference. Such Cas9 molecules include Cas9 molecules of a cluster 1 bacterial family, a cluster 2 bacterial family, a cluster 3 bacterial family, a cluster 4 bacterial family, a cluster 5 bacterial family, a cluster 6 bacterial family, a cluster 7 bacterial family, a cluster 8 bacterial family, a cluster 9 bacterial family, a cluster 10 bacterial family, a cluster 11 bacterial family, a cluster 12 bacterial family, a cluster 13 bacterial family, a cluster 14 bacterial family, a cluster 15 bacterial family, a cluster 16 bacterial family, a cluster 17 bacterial family, a cluster 18 bacterial family, a cluster 19 bacterial family, a cluster 20 bacterial family, a cluster 21 bacterial family, a cluster 22 bacterial family, a cluster 23 bacterial family, a cluster 24 bacterial family, a cluster 25 bacterial family, a cluster 26 bacterial family, a cluster 27 bacterial family, a cluster 28 bacterial family, a cluster 29 bacterial family, a cluster 30 bacterial family, a cluster 31 bacterial family, a cluster 32 bacterial family, a cluster 33 bacterial family, a cluster 34 bacterial family, a cluster 35 bacterial family, a cluster 36 bacterial family, a cluster 37 bacterial family, a cluster 38 bacterial family, a cluster 39 bacterial family, a cluster 40 bacterial family, a cluster 41 bacterial family, a cluster 42 bacterial family, a cluster 43 bacterial family, a cluster 44 bacterial family, a cluster 45 bacterial family, a cluster 46 bacterial family, a cluster 47 bacterial family, a cluster 48 bacterial family, a cluster 49 bacterial family, a cluster 50 bacterial family, a cluster 51 bacterial family, a cluster 52 bacterial family, a cluster 53 bacterial family, a cluster 54 bacterial family, a cluster 55 bacterial family, a cluster 56 bacterial family, a cluster 57 bacterial family, a cluster 58 bacterial family, a cluster 59 bacterial family, a cluster 60 bacterial family, a cluster 61 bacterial family, a cluster 62 bacterial family, a cluster 63 bacterial family, a cluster 64 bacterial family, a cluster 65 bacterial family, a cluster 66 bacterial family, a cluster 67 bacterial family, a cluster 68 bacterial family, a cluster 69 bacterial family, a cluster 70 bacterial family, a cluster 71 bacterial family, a cluster 72 bacterial family, a cluster 73 bacterial family, a cluster 74 bacterial family, a cluster 75 bacterial family, a cluster 76 bacterial family, a cluster 77 bacterial family, or a cluster 78 bacterial family.
Exemplary naturally occurring Cas9 molecules include a Cas9 molecule of a cluster 1 bacterial family. Examples include a Cas9 molecule of: S. pyogenes (e.g., strain SF370, MGAS10270, MGAS10750, MGAS2096, MGAS315, MGAS5005, MGAS6180, MGAS9429, NZ131 and SSI-1), S. thermophilus (e.g., strain LMD-9), S. pseudoporcinus (e.g., strain SPIN 20026), S. mutans (e.g., strain UA159, NN2025), S. macacae (e.g., strain NCTC11558), S. gallolyticus (e.g., strain UCN34, ATCC BAA-2069), S. equinus (e.g., strain ATCC 9812, MGCS 124), S. dysgalactiae (e.g., strain GGS 124), S. bovis (e.g., strain ATCC 700338), S. anginosus (e.g., strain F0211), S. agalactiae (e.g., strain NEM316, A909), Listeria monocytogenes (e.g., strain F6854), Listeria innocua (L. innocua, e.g., strain Clip11262), Enterococcus italicus (e.g., strain DSM 15952), or Enterococcus faecium (e.g., strain 1,231,408). Additional exemplary Cas9 molecules are a Cas9 molecule of Neisseria meningitidis (Hou et al. PNAS Early Edition 2013, 1-6) and an S. aureus Cas9 molecule.
When a Cas9 polynucleotide is used for the production of Cas9 polypeptide fusion protein, the polynucleotide may include the coding sequence for the full-length polypeptide or a fragment thereof, by itself; the coding sequence for the full-length polypeptide or fragment in reading frame with other coding sequences, such as those encoding a leader or secretory sequence, a pre-, or pro or prepro-protein sequence, nuclear localization signal or other fusion peptide portions. The polynucleotide may also contain non-coding 5' and 3' sequences, such as transcribed, non-translated sequences, signals, ribosome binding sites and sequences that stabilize mRNA.
In some embodiments, the nucleic acid sequence of Cas9, or variant or fragment thereof, contains a nucleotide sequence that is highly identical, at least 90% identical, with a nucleotide sequence encoding Cas9 polypeptide. In some embodiments, the nucleic acid sequence of Cas9 comprises a nucleotide sequence that is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical with the encoding nucleotide sequence set forth in SEQ ID NOS:2, 36, 14 or 37.
In some embodiments, the molecule is a wild-type Cas9. In some embodiments, the Cas9 is a wild-type Cas9 from Streptococcus pyogenes and comprises SEQ ID NO: 1.
In some embodiments, the expression vector encodes a variant of the Cas9 protein referred to herein as a high fidelity Cas9. In some embodiments, the high fidelity Cas9 comprises an amino acid sequence comprising SEQ ID NO: 13.
In some embodiments, the Cas9 comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to SEQ ID NOS: 1 or 13.
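The percent-identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and that identity is counted over all aligned columns; this section does not specify an alignment method or denominator convention, so both are assumptions, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumption: inputs are pre-aligned to equal length and '-' marks a gap.
    Gap columns count toward the denominator but never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must be non-empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True when the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned ten-residue toy sequences that differ at one position are 90% identical and would satisfy an "at least 90%" threshold under this convention.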
In some embodiments, the nucleotide sequence encoding Cas9 or a biologically active fragment or derivative thereof includes nucleic acid molecules comprising a polynucleotide having a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to a nucleotide sequence encoding Cas9 having the amino acid sequence in SEQ ID NO: 1 or 13.
In some embodiments, the Cas9 portion of the fusion protein is encoded by a nucleic acid sequence at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to SEQ ID NOS:2, 36, 14 or 37, which lacks the codon for the N-terminal methionine.
In some embodiments, SEQ ID NO:13 is encoded by SEQ ID NO: 14 or SEQ ID NO:37.
In some embodiments, the nucleic acid or polypeptide encodes a biologically active fragment of Cas9 protein.
In some embodiments, a Cas9 molecule comprises an amino acid sequence having at least 90%, 95%, 96%, 97%, 98%, or 99% identity with SEQ ID NOS:1 or 13 (either full length or lacking the N-terminal methionine) or a naturally occurring Cas9 molecule sequence, e.g., a Cas9 molecule from a species listed herein or described in Chylinski et al., RNA Biology 2013, 10:5, 727-737; Hou et al. PNAS Early Edition 2013, 1-6.
In some embodiments, the Cas9 polypeptide comprises an amino acid sequence that differs from a sequence of SEQ ID NOS:1 or 13 by as many as 1, but no more than 2, 3, 4, or 5 residues.
In some embodiments, the nucleic acid or fusion protein encodes a Cas9 fragment. A fragment is a polypeptide having an amino acid sequence that entirely is the same as part but not all of the amino acid sequence of one of a Cas9 polypeptide or variant. Fragments may be continuous or discontinuous. In some embodiments, the fragment may constitute from about 1000 contiguous amino acids identified in SEQ ID NOS:1 or 13. In some embodiments, the fragment is about 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1360, 1361, 1362, 1363, 1364, 1365, or 1367 contiguous amino acids identified in SEQ ID NOS:1 or 13. In some embodiments, the fragment comprises SEQ ID NO:1 but lacks the N-terminal methionine residue (i.e., comprises amino acids 2-1368 of SEQ ID NO: 1).
In some embodiments the fragments include, for example, truncation polypeptides having the amino acid sequence of Cas9 polypeptides, except for deletion of a continuous series of residues that includes the amino terminus, or a continuous series of residues that includes the carboxyl terminus or deletion of two continuous series of residues, one including the amino terminus and one including the carboxyl terminus.
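The truncation fragments described above can be sketched as simple string slicing. This is an illustrative sketch only: the function names are hypothetical, the example input is a toy placeholder rather than any SEQ ID NO, and no claim is made that a given truncation retains activity.

```python
def truncate(seq: str, n_term: int = 0, c_term: int = 0) -> str:
    """Delete `n_term` residues from the amino terminus and `c_term`
    residues from the carboxyl terminus of a polypeptide sequence."""
    if n_term < 0 or c_term < 0:
        raise ValueError("deletion lengths must be non-negative")
    if n_term + c_term >= len(seq):
        raise ValueError("truncation would remove the entire sequence")
    return seq[n_term:len(seq) - c_term]


def drop_initiator_met(seq: str) -> str:
    """Return the sequence without its N-terminal methionine, if present
    (e.g. residues 2-1368 of a 1368-residue polypeptide)."""
    return seq[1:] if seq.startswith("M") else seq
```

Applied to a 1368-residue sequence that begins with methionine, `drop_initiator_met` returns the 1367-residue fragment corresponding to amino acids 2-1368.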
Naturally occurring Cas9 molecules possess a number of properties, including: nickase activity, nuclease activity (e.g., endonuclease and/or exonuclease activity); helicase activity; the ability to associate functionally with a gRNA molecule; and the ability to target (or localize to) a site on a nucleic acid (e.g., PAM recognition and specificity). In some embodiments, a Cas9 molecule can include all or a subset of these properties. In typical embodiments, Cas9 molecules have the ability to interact with a gRNA molecule and, in concert with the gRNA molecule, localize to a site in a nucleic acid. Other activities, e.g., PAM specificity, cleavage activity, or helicase activity can vary more widely in Cas9 molecules.
Cas9 molecules with desired properties can be made in a number of ways, e.g., by alteration of a parental, naturally occurring Cas9 molecule to provide an altered Cas9 molecule having a desired property. One or more mutations or differences relative to a parental Cas9 molecule can be introduced. Such mutations and differences can comprise substitutions (e.g., conservative substitutions or substitutions of non-essential amino acids); insertions; or deletions. In some embodiments, a Cas9 molecule can comprise one or more mutations or differences, e.g., at least 1, 2, 3, 4, 5, 10, 15, 20, 30, 40 or 50 mutations but less than 200, 100, or 80 mutations relative to a reference Cas9 molecule.
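As a rough illustration of the "at least N but less than M mutations" language, the sketch below counts differing columns between a reference and a candidate. It assumes the sequences are pre-aligned to equal length with "-" for gaps, so that each substitution, inserted residue, or deleted residue occupies one column; the function names and default bounds are hypothetical choices for the example.

```python
def count_differences(aligned_ref: str, aligned_candidate: str) -> int:
    """Count aligned columns where the candidate differs from the reference.

    Assumption: inputs are pre-aligned to equal length; a gap ('-') opposite
    a residue counts as one difference (an insertion or a deletion).
    """
    if len(aligned_ref) != len(aligned_candidate):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for r, c in zip(aligned_ref, aligned_candidate) if r != c)


def within_mutation_bounds(
    aligned_ref: str,
    aligned_candidate: str,
    at_least: int = 1,
    less_than: int = 80,
) -> bool:
    """True when at_least <= number of differences < less_than."""
    n = count_differences(aligned_ref, aligned_candidate)
    return at_least <= n < less_than
```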
Candidate Cas9 molecules can be evaluated by art-known methods or as described herein. For example, exemplary methods for evaluating the endonuclease activity of Cas9 molecule are described, e.g., in Jinek et al., Science 2012; 337(6096):816-821.
CtIP
As provided herein, the nucleic acids or fusion proteins comprise a nucleic acid sequence encoding CtIP or a variant or fragment thereof. CtIP is a DNA repair protein. See, e.g., Huertas et al., J. Biol. Chem. 284, 9558-9565 (2009) and Charpentier, which are incorporated by reference herein.
In some embodiments, human CtIP is an 897 amino acid protein identified in NCBI Accession No. Q99708. In some embodiments, CtIP has an amino acid sequence of SEQ ID NO: 10.
In some embodiments, the CtIP, variant or fragment thereof is from a mammal, such as human, mouse, rat, or the like.
In some embodiments, the nucleic acid or fusion protein encodes a fragment of CtIP. A fragment is a polypeptide having an amino acid sequence that entirely is the same as part but not all of the amino acid sequence of one of a CtIP polypeptide or variant. Fragments may be continuous or discontinuous.
In some embodiments, the nucleic acid or fusion protein encodes a CtIP fragment comprising an amino acid sequence of SEQ ID NO:3, corresponding to amino acids 1-296 of CtIP. In some embodiments, SEQ ID NO:3 is encoded by a nucleotide sequence comprising SEQ ID NOS:4 or 38.
In some embodiments, the fragment may constitute about 150 contiguous amino acids identified in SEQ ID NOS:3 or 10. In some embodiments, the fragment is about 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, or 290 contiguous amino acids or more identified in SEQ ID NOS:3 or 10.
In some embodiments the fragments include, for example, truncation polypeptides having the amino acid sequence of CtIP polypeptides, except for deletion of a continuous series of residues that includes the amino terminus, or a continuous series of residues that includes the carboxyl terminus or deletion of two continuous series of residues, one including the amino terminus and one including the carboxyl terminus.
In some embodiments, the fragment is a homology-directed repair enhancing (HDR Enhancing) N-terminal fragment of CtIP. In some embodiments, the HDR Enhancing N-terminal fragment of CtIP comprises an amino acid sequence at least 90% identical to SEQ ID NOS:3 or 10 (amino acids 1-296). In some embodiments, the HDR Enhancing N-terminal fragment of CtIP comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to SEQ ID NO:3 or amino acids 1-296 of SEQ ID NO: 10.
In some embodiments, the homology-directed repair enhancing (HDR Enhancing) N-terminal fragment of CtIP comprises one or more mutations or differences, e.g., at least 1, 2, 3, 4, 5, 10, 15, 20, 30, 40 or 50 mutations but less than 200, 100, or 80 mutations relative to SEQ ID NOS:3 or amino acids 1-296 of SEQ ID NO: 10. In some embodiments, the homology-directed repair enhancing (HDR Enhancing) N-terminal fragment of CtIP comprises an amino acid sequence that differs from a sequence of SEQ ID NOS:3 or amino acids 1-296 of SEQ ID NO: 10 by as many as 1, but no more than 2, 3, 4, or 5 residues.
In some embodiments, the nucleic acid sequence of CtIP, or a variant or fragment thereof, contains a nucleotide sequence that is highly identical, at least 90% identical, with a nucleotide sequence encoding the CtIP polypeptide or a variant or fragment thereof. In some embodiments, the nucleic acid sequence of CtIP, or a variant or fragment thereof, comprises a nucleotide sequence that is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical with the encoding nucleotide sequence set forth in SEQ ID NOS:4 or 38.
Rad18/Rad52
As provided herein, the nucleic acid or fusion protein comprises Rad18 or a variant or fragment thereof or Rad52 or a variant or fragment thereof.
Rad18 is an E3 ubiquitin-protein ligase involved in postreplication repair of UV-damaged DNA. See Nambiar et al., Nat. Commun. 10, 3395 (2019), which is incorporated by reference herein.
In some embodiments, human Rad18 is a 495 amino acid protein identified in NCBI Accession No. AAF86618. In some embodiments, Rad18 has an amino acid sequence of SEQ ID NO:11.
In some embodiments, Rad52 or a variant or fragment thereof can be used in place of the Rad18 component in the fusion protein. In some embodiments, the Rad52 or a variant or fragment thereof has an amino acid sequence of SEQ ID NO: 15.
In some embodiments, SEQ ID NO: 15 is encoded by a nucleotide sequence comprising SEQ ID NOS: 16 or 39.
In some embodiments, the Rad18 or Rad52, variant or fragment thereof is from a mammal, such as human, mouse, rat, or the like.
In some embodiments, the nucleic acid or fusion protein encodes a Rad18 fragment. A fragment is a polypeptide having an amino acid sequence that entirely is the same as part but not all of the amino acid sequence of one of a Rad18 polypeptide or variant. Fragments may be continuous or discontinuous. In some embodiments, the fragment of Rad18 comprises a deletion of a putative DNA-binding domain, and optionally lacks the N-terminal methionine. In some embodiments, the fragment of Rad18 (also referred to herein as eRad18) has a deletion in amino acids 242-282 in Rad18 and comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO:5 or SEQ ID NO:43.
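The internal deletion described for the Rad18 fragment can be sketched as removal of a 1-based, inclusive residue range. This sketch assumes that "a deletion in amino acids 242-282" means all 41 residues in that range are removed; the function name is hypothetical, and the input used in any check should be treated as a synthetic placeholder rather than SEQ ID NO:11.

```python
def delete_residues(seq: str, start: int, end: int, drop_initial_met: bool = False) -> str:
    """Remove residues start..end (1-based, inclusive) from a polypeptide.

    Coordinates refer to the full-length input. If `drop_initial_met` is set,
    a leading methionine is removed after the internal deletion is applied.
    """
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("deletion range must lie within the sequence")
    result = seq[:start - 1] + seq[end:]
    if drop_initial_met and result.startswith("M"):
        result = result[1:]
    return result
```

For a 495-residue input, deleting residues 242-282 leaves 454 residues, or 453 if the N-terminal methionine is also removed.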
In some embodiments, the fragment may constitute about 150 contiguous amino acids identified in SEQ ID NOS:5 or 11. In some embodiments, the fragment is about 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, or 450 contiguous amino acids or more identified in SEQ ID NO:5 or SEQ ID NO:43.
In some embodiments the fragments include, for example, truncation polypeptides having the amino acid sequence of Rad18 polypeptides, except for deletion of a continuous series of residues that includes the amino terminus, or a continuous series of residues that includes the carboxyl terminus or deletion of two continuous series of residues, one including the amino terminus and one including the carboxyl terminus.
In some embodiments, the Rad18 variant or fragment comprises one or more mutations or differences, e.g., at least 1, 2, 3, 4, 5, 10, 15, 20, 30, 40 or 50 mutations but less than 200, 100, or 80 mutations relative to SEQ ID NO:5 or SEQ ID NO:43. In some embodiments, the Rad18 variant or fragment comprises an amino acid sequence that differs from a sequence of SEQ ID NO:5 or SEQ ID NO:43 by as many as 1, but no more than 2, 3, 4, or 5 residues.
In some embodiments, the nucleic acid sequence of Rad18, or a variant or fragment thereof, contains a nucleotide sequence that is highly identical, at least 90% identical, with a nucleotide sequence encoding the Rad18 polypeptide or a variant or fragment thereof. In some embodiments, the nucleic acid sequence of Rad18, or a variant or fragment thereof, comprises a nucleotide sequence that is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical with the encoding nucleotide sequence set forth in SEQ ID NOS:6 or 40.
In some embodiments, the Rad18 or a variant or fragment thereof is fused to the amino terminus of the Cas9 or a variant or fragment thereof and the CtIP or a variant or fragment thereof is fused to the carboxy terminus of Cas9. In some embodiments, the Rad52 or a variant or fragment thereof is fused to the amino terminus of the Cas9 or a variant or fragment thereof and the CtIP or a variant or fragment thereof is fused to the carboxy terminus of Cas9.
In some embodiments, the fusion protein comprises one or more linker sequences between Rad18 or a variant or fragment thereof, Cas9 or a variant or fragment thereof, and CtIP or a variant or fragment thereof. In some embodiments, the linker sequence comprises a nuclear localization sequence. In some embodiments, the linker sequence comprises GAAPKKKRKVGIHGVPAA (SEQ ID NO:21) and/or KRPAATKKAGQAKKKKEFGSGGAAS (SEQ ID NO:22). In some embodiments, GAAPKKKRKVGIHGVPAA (SEQ ID NO:21) is a linker between the Rad52 or eRad18 portion and the Cas9 portion of the fusion protein. In some embodiments, KRPAATKKAGQAKKKKEFGSGGAAS (SEQ ID NO:22) is a linker between the Cas9 portion and the CtIP portion of the fusion protein.
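The domain order and linker placement described above can be sketched as string concatenation. The two linker strings are quoted from the text (SEQ ID NO:21 and SEQ ID NO:22); the domain arguments are placeholders to be supplied by the reader, and the function names are hypothetical. This is a bookkeeping illustration, not a reproduction of SEQ ID NO:17 or SEQ ID NO:19.

```python
# Linker sequences quoted in the text (SEQ ID NO:21 and SEQ ID NO:22).
LINKER_N = "GAAPKKKRKVGIHGVPAA"
LINKER_C = "KRPAATKKAGQAKKKKEFGSGGAAS"


def fusion_parts(n_terminal_part: str, cas9: str, ctip: str) -> list[tuple[str, str]]:
    """Ordered (name, sequence) parts: N-terminal partner, linker, Cas9, linker, CtIP."""
    return [
        ("n_terminal_part", n_terminal_part),  # e.g. an eRad18 or Rad52 portion
        ("linker_n", LINKER_N),
        ("cas9", cas9),
        ("linker_c", LINKER_C),
        ("ctip", ctip),
    ]


def assemble_fusion(n_terminal_part: str, cas9: str, ctip: str) -> str:
    """Concatenate the parts from amino terminus to carboxy terminus."""
    return "".join(seq for _, seq in fusion_parts(n_terminal_part, cas9, ctip))


def domain_boundaries(parts: list[tuple[str, str]]) -> dict[str, tuple[int, int]]:
    """1-based inclusive (start, end) coordinates of each part in the fusion."""
    bounds = {}
    pos = 1
    for name, seq in parts:
        bounds[name] = (pos, pos + len(seq) - 1)
        pos += len(seq)
    return bounds
```

Tracking the boundaries alongside the assembled string makes it easier to check that each linker sits between the intended portions.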
In some embodiments, the fusion protein (eRad18-Cas9-CtIP) has an amino acid sequence that is at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to an amino acid sequence comprising SEQ ID NO: 17. In some embodiments, an N-terminal methionine is added to the amino acid sequence of SEQ ID NO:17 at the 1 position.
In some embodiments, the nucleic acid encoding a fusion protein has a nucleotide sequence that is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to a nucleotide sequence comprising SEQ ID NOS: 18 or 41. In some embodiments, an N-terminal start methionine codon (AUG/ATG) is added to the sequence of SEQ ID NOS: 18 or 41. In some embodiments, the fusion protein (RAD52-Cas9-CtIP) has an amino acid sequence that is at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to an amino acid sequence comprising SEQ ID NO:19. In some embodiments, an N-terminal methionine is added to the amino acid sequence of SEQ ID NO: 19 at the 1 position.
In some embodiments, the nucleic acid encoding a fusion protein (RAD52-Cas9-CtIP) has a nucleotide sequence that is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to a nucleotide sequence comprising SEQ ID NOS:20 or 42. In some embodiments, an N-terminal start methionine codon (AUG/ATG) is added to the sequence of SEQ ID NOS:20 or 42.
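Adding a start methionine codon to a coding sequence that lacks one can be sketched as follows. The function name is hypothetical, the frame check is an added safeguard rather than something the text requires, and the example inputs are toy sequences, not any SEQ ID NO.

```python
def add_start_codon(coding_sequence: str, rna: bool = False) -> str:
    """Prepend a start methionine codon (ATG for DNA, AUG for RNA).

    Assumption: the input is an in-frame coding sequence written 5' to 3'
    whose first codon is not already the intended initiator.
    """
    cds = coding_sequence.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return ("AUG" if rna else "ATG") + cds
```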
In some embodiments, the fusion protein may comprise any additional protein sequence, and optionally a linker sequence between any two domains. Examples of protein domains that may be included in the fusion protein include, without limitation, epitope tags, reporter gene sequences, and nucleic acid repair proteins described herein. The fusion protein can include any sequence encoding a protein or a fragment of a protein that binds DNA molecules or binds other cellular molecules, including but not limited to maltose binding protein (MBP), S-tag, LexA DNA binding domain (DBD) fusions, GAL4 DNA binding domain fusions, and herpes simplex virus (HSV) VP16 protein fusions.
In some embodiments, the fusion protein comprises one or more affinity or purification tags. In some embodiments, the affinity or purification tag comprises glutathione S-transferase. In some embodiments, the affinity or purification tag comprises a maltose binding protein (MBP) or a variant or fragment thereof. In some embodiments, the glutathione S-transferase comprises SEQ ID NO:23.
In some embodiments, the glutathione S-transferase is encoded by SEQ ID NO:24.
In some embodiments, the maltose binding protein or a variant or fragment thereof comprises SEQ ID NO:25.
In some embodiments, the maltose binding protein or a variant or fragment thereof is encoded by SEQ ID NO:26.
In some embodiments, the affinity or purification tag comprises glutathione S-transferase and a maltose binding protein or a variant or fragment thereof.
In some embodiments, the fusion protein comprises an N-terminal GST and MBP tandem affinity tag. In some embodiments, the expression vector comprises a linker sequence following the affinity/purification tag and prior to any cleavage site. In some embodiments, the linker is an N10 linker sequence (SEQ ID NO:27), and in some embodiments, the N10 linker is encoded by SEQ ID NO:28.
In some embodiments, the affinity/purification tag and optional linker sequence is followed by a protease recognition site. In some embodiments, the recognition site is an HRV 3C protease recognition site. In some embodiments, the HRV 3C protease recognition site comprises the amino acid sequence LEVLFQGP (SEQ ID NO: 12), where cleavage occurs between the Q and G residues. In some embodiments, SEQ ID NO: 12 is encoded by SEQ ID NO:29.
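The cleavage position stated above (between the Q and G residues of LEVLFQGP) can be illustrated with a short sketch that splits a tagged protein at the first occurrence of the recognition site. The function name and the example input are hypothetical; real protease specificity and efficiency are not modeled.

```python
HRV3C_SITE = "LEVLFQGP"                  # SEQ ID NO:12 as quoted in the text
CUT_OFFSET = HRV3C_SITE.index("G")       # cleavage occurs between Q and G


def cleave_hrv3c(protein: str) -> tuple[str, str]:
    """Split at the first HRV 3C site.

    Returns (tag_fragment, released_protein): the tag fragment ends in
    ...LEVLFQ and the released protein begins with GP.
    """
    site_start = protein.find(HRV3C_SITE)
    if site_start == -1:
        raise ValueError("no HRV 3C recognition site found")
    cut = site_start + CUT_OFFSET
    return protein[:cut], protein[cut:]
```

Under this sketch, the released protein retains the two residues GP from the recognition site at its amino terminus.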
In some embodiments, the fusion protein comprises a nuclear localization signal (NLS). In some embodiments, the NLS is from SV40 and comprises SEQ ID NO:30, which in some embodiments is encoded by SEQ ID NO:31. In some embodiments, the NLS is followed by a fragment of Rad18 that has a deletion of a putative DNA-binding domain and lacking the N-terminal methionine encoded by SEQ ID NO:5, wild-type Cas9 from Streptococcus pyogenes comprising amino acids 2-1368 of SEQ ID NO: 1, and a homology-directed repair enhancing (HDR Enhancing) N-terminal fragment of CtIP encoded by SEQ ID NO:3.
Vectors
In some embodiments, the invention provides a vector encoding nucleic acids and fusion proteins herein. In some embodiments, the vector comprises a nucleic acid encoding a fusion protein comprising a Cas9 polypeptide or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
In some embodiments, the vector comprises a nucleic acid encoding a fusion protein comprising a Cas9 polypeptide or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
The vector is not limiting, and can include, e.g., a viral vector, mammalian expression vector, plasmid, yeast expression vector, bacterial expression vector, or baculovirus expression vector. In some embodiments, the vector encodes one or more additional nucleic acids, such as, for example, template and/or guide RNAs.
In some embodiments, expression of the fusion protein in a mammalian cell can be achieved by transient transfection of a vector encoding the fusion protein into the cell. In some embodiments, expression of the fusion protein in a mammalian cell can be achieved by integration of the polynucleotides containing the fusion protein into the nuclear genome of the mammalian cell. A variety of vectors and systems for the delivery and integration of polynucleotides encoding exogenous proteins into the nuclear DNA of a mammalian cell have been developed. Examples of expression vectors are disclosed in, e.g., WO 1994/011026 and are incorporated herein by reference. In some embodiments, expression vectors for use in the compositions and methods described herein contain a polynucleotide sequence of the fusion protein, as well as, e.g., additional sequence elements used for the expression of these agents and/or the integration of these polynucleotide sequences into the genome of a mammalian cell. Certain vectors that can be used for the expression of the fusion protein include plasmids that contain regulatory sequences, such as promoter and enhancer regions, which direct gene transcription. Other useful vectors for expression of the fusion protein contain polynucleotide sequences that enhance the rate of translation of these genes or improve the stability or nuclear export of the mRNA that results from gene transcription. These sequence elements include, e.g., 5’ and 3’ untranslated regions and a polyadenylation signal site in order to direct efficient transcription of the gene carried on the expression vector. The expression vectors suitable for use with the compositions and methods described herein may also contain a polynucleotide encoding a marker for selection of cells that contain such a vector. Examples of a suitable marker include genes that encode resistance to antibiotics, such as ampicillin, chloramphenicol, kanamycin, or nourseothricin.
In order to utilize nucleic acids or vectors for therapeutic application in the treatment of conditions described herein or modifying the genome, they can be directed to the interior of the cell, and, in particular, to specific cell types.
Nucleic acids and vectors can be introduced into a cell by a variety of methods, including transformation, transfection, transduction, direct uptake, projectile bombardment, and by encapsulation of the vector or nucleic acid in a liposome or nanoparticle, such as a lipid nanoparticle. Examples of suitable methods of transfecting or transforming cells include calcium phosphate precipitation, electroporation, microinjection, infection, lipofection and direct uptake. Such methods are described in more detail, for example, in Green, et al., Molecular Cloning: A Laboratory Manual, Fourth Edition (Cold Spring Harbor University Press, New York 2014); and Ausubel, et al., Current Protocols in Molecular Biology (John Wiley & Sons, New York 2015), the disclosures of each of which are incorporated herein by reference.
In some embodiments, the fusion protein can also be introduced into a mammalian cell by targeting vectors. For example, vectors can be targeted to the phospholipids on the extracellular surface of the cell membrane by linking the vector molecule to a VSV-G protein, a viral protein with affinity for all cell membrane phospholipids. Such a construct can be produced using methods well known to those of skill in the field.
Recognition and binding of the polynucleotide encoding the fusion protein by mammalian RNA polymerase is important for gene expression. As such, one may include sequence elements within the polynucleotide that exhibit a high affinity for transcription factors that recruit RNA polymerase and promote the assembly of the transcription complex at the transcription initiation site.
Such sequence elements include, e.g., a mammalian promoter, the sequence of which can be recognized and bound by specific transcription initiation factors and ultimately RNA polymerase.
Polynucleotides suitable for use in the compositions and methods described herein also include those that encode the fusion protein downstream of a mammalian promoter. Promoters that are useful for expression in mammalian cells include ubiquitous promoters such as the CAG promoter or the cytomegalovirus (CMV) promoter. Cell type and tissue specific promoters can also be utilized.
Alternatively, promoters derived from viral genomes can also be used for the stable expression of these agents in mammalian cells. Examples of functional viral promoters that can be used to promote mammalian expression of these agents include adenovirus late promoter, vaccinia virus 7.5K promoter, SV40 promoter, tk promoter of HSV, mouse mammary tumor virus (MMTV) promoter, LTR promoter of HIV, promoter of Moloney virus, Epstein-Barr virus (EBV) promoter, and the Rous sarcoma virus (RSV) promoter.
In some embodiments, the fusion protein is delivered by a viral vector. A “viral vector” is a virus that can be used to deliver genetic material into target cells. This can be done either in vivo or in vitro. In general, viral vectors are either inherently safe or are modified to present a low handling risk and have low toxicity with respect to the targeted cells. A “retrovirus” is a virus of the family Retroviridae that inserts a copy of its RNA genome into the DNA of a host cell, then uses a reverse transcriptase enzyme to produce DNA from its RNA genome. Retroviruses are known in the art to be useful in gene delivery systems. A “lentivirus” is a type of retrovirus; they are known as slow retroviruses. They are associated with severe immunodeficiency and death in humans but can be useful as viral vectors in gene therapy. An “adenovirus” is a virus of the family Adenoviridae that lacks an outer lipid bilayer and includes a double stranded DNA genome. Adenoviruses are well established in the art as viral vectors for gene therapy, and delivering genes coding proteins of interest to particular locations, as to selected cell types, is possible. An “adeno-associated virus” is of the genus Dependoparvovirus, which is of the family Parvoviridae. These are nonenveloped viruses having a single-stranded DNA genome. Adeno-associated viruses are well known in the art as attractive candidates for use as viral vectors for gene therapy. Unlike adenoviruses, they have the advantage that they do not cause disease.
In some embodiments, nucleic acids of the compositions and methods described herein are incorporated into recombinant AAV (rAAV) vectors and/or virions in order to facilitate their introduction into a cell. rAAV vectors useful in the compositions and methods described herein are recombinant nucleic acid constructs that include (1) a heterologous sequence to be expressed (e.g., a polynucleotide encoding the fusion protein) and (2) viral sequences that facilitate stability and expression of the heterologous genes. The viral sequences may include those sequences of AAV that are required in cis for replication and packaging (e.g., functional ITRs) of the DNA into a virion. Such rAAV vectors may also contain marker or reporter genes. In some embodiments, useful rAAV vectors have one or more of the AAV wild-type genes deleted in whole or in part but retain functional flanking ITR sequences. The AAV ITRs may be of any serotype suitable for a particular application. In some embodiments, the ITRs can be AAV2 ITRs. Methods for using rAAV vectors are described, for example, in Tal et al., J. Biomed. Sci. 7:279 (2000), and Monahan and Samulski, Gene Delivery 7:24 (2000), the disclosures of each of which are incorporated herein by reference as they pertain to AAV vectors for gene delivery.
The nucleic acids and vectors described herein can be incorporated into a rAAV virion in order to facilitate introduction of the nucleic acid or vector into a cell. The capsid proteins of AAV compose the exterior, non-nucleic acid portion of the virion and are encoded by the AAV cap gene. The cap gene encodes three viral coat proteins, VP1, VP2 and VP3, which are required for virion assembly. The construction of rAAV virions has been described, for instance, in U.S. Pat. Nos. 5,173,414; 5,139,941; 5,863,541; 5,869,305; 6,057,152; and 6,376,237; as well as in Rabinowitz et al., J. Virol. 76:791 (2002) and Bowles et al., J. Virol. 77:423 (2003), the disclosures of each of which are incorporated herein by reference as they pertain to AAV vectors for gene delivery. rAAV virions useful in conjunction with the compositions and methods described herein include those derived from a variety of AAV serotypes including, without limitation, AAV1, AAV2, AAV2quad(Y-F), AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, rh10, rh39, rh43, rh74, Anc80, Anc80L65, DJ/8, DJ/9, 7m8, PHP.B, PHP.eb, and PHP.S.
Also useful in conjunction with the compositions and methods described herein are pseudotyped rAAV vectors. Pseudotyped vectors include AAV vectors of a given serotype (e.g., AAV9) pseudotyped with a capsid gene derived from a serotype other than the given serotype (e.g., AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, etc.). Techniques involving the construction and use of pseudotyped rAAV virions are known in the art and are described, for instance, in Duan et al., J. Virol. 75:7662 (2001); Halbert et al., J. Virol. 74:1524 (2000); Zolotukhin et al., Methods, 28:158 (2002); and Auricchio et al., Hum. Molec. Genet. 10:3075 (2001).
AAV virions that have mutations within the virion capsid may be used to infect particular cell types more effectively than non-mutated capsid virions. For example, suitable AAV mutants may have ligand insertion mutations for the facilitation of targeting AAV to specific cell types. The construction and characterization of AAV capsid mutants including insertion mutants, alanine screening mutants, and epitope tag mutants is described in Wu et al., J. Virol. 74:8635 (2000). Other rAAV virions that can be used in methods described herein include those capsid hybrids that are generated by molecular breeding of viruses as well as by exon shuffling. See, e.g., Soong et al., Nat. Genet., 25:436 (2000) and Kolman and Stemmer, Nat. Biotechnol. 19:423 (2001).
In some embodiments, the vector is a yeast expression vector. Examples of vectors for expression in yeast Saccharomyces cerevisiae include pYepSec1 (Baldari, et al., 1987. EMBO J 6: 229-234), pMFa (Kurjan and Herskowitz, 1982. Cell 30: 933-943), pJRY88 (Schultz et al., 1987. Gene 54: 113-123), pYES2 (Invitrogen Corporation, San Diego, Calif), and picZ (InVitrogen Corp, San Diego, Calif).
In some embodiments, the vector drives protein expression in insect cells using baculovirus expression vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., SF9 cells) include the pAc series (Smith, et al., 1983. Mol. Cell. Biol. 3: 2156-2165) and the pVL series (Luckow and Summers, 1989. Virology 170: 31-39).
In another embodiment, the vector is a prokaryotic expression vector comprising a promoter sequence operably linked to a nucleotide sequence encoding a fusion protein, wherein the fusion protein comprises i) Cas9 or a variant or fragment thereof; ii) Rad18 or a variant or fragment thereof; and iii) CtIP or a variant or fragment thereof.
In some embodiments, the vector is a prokaryotic expression vector comprising a promoter sequence operably linked to a nucleotide sequence encoding a fusion protein, wherein the fusion protein comprises i) Cas9 or a variant or fragment thereof; ii) Rad52 or a variant or fragment thereof; and iii) CtIP or a variant or fragment thereof.
Appropriate prokaryotic vectors are typically equipped with a selectable marker-encoding nucleic acid sequence, insertion sites, and suitable control elements, such as termination sequences. In some embodiments, the vectors comprise regulatory sequences, including, for example, control elements (i.e., promoter and terminator elements or 5' and/or 3' untranslated regions), effective for expression of the coding sequence in host cells (and/or in a vector or host cell environment in which a modified protein coding sequence is not normally expressed), operably linked to the coding sequence.
Large numbers of suitable vectors and promoters are known to those of skill in the art, and many are commercially available. In some embodiments, the expression vector is derived from the pGEX-6P-1 commercial vector from Addgene. In some embodiments, the prokaryotic expression vector comprises a plasmid. As used herein, the term "plasmid" refers to a circular double-stranded (ds) DNA construct used as a cloning vector, and which forms an extrachromosomal self-replicating genetic element in some eukaryotes or prokaryotes, or integrates into the host chromosome.
In some embodiments, the expression vector is a bacterial expression vector plasmid. In some embodiments, the expression vector is capable of expressing the fusion protein in Escherichia coli.
As used herein, the term "expression" refers to the process by which a polypeptide is produced based on the nucleic acid sequence of a gene. The process includes both transcription and translation. Thus, expression of the fusion protein refers to transcription and translation of the fusion protein to be expressed, the products of which can include precursor RNA, mRNA, polypeptide, post-translation processed polypeptide, and derivatives thereof.
As used herein, the terms "vector" and "cloning vector" refer to nucleic acid constructs designed to transfer nucleic acid sequences into cells. As used herein, the term "expression vector" refers to nucleic acid constructs generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a target cell. Typically the vector comprises a recombinant expression cassette, and can be incorporated into a plasmid, chromosome, virus, or nucleic acid fragment. Typically, the recombinant expression cassette portion of an expression vector includes, among other sequences, a nucleic acid sequence to be transcribed and a promoter. In this embodiment, the expression cassette comprises a promoter sequence operably linked to a nucleotide sequence encoding a fusion protein.
As used herein, a "promoter sequence" refers to a DNA sequence which is recognized by a cell for expression purposes. Exemplary promoters include both constitutive promoters and inducible promoters. Such promoters are well known to those of skill in the art. Those skilled in the art are also aware that a natural promoter can be modified by replacement, substitution, addition or elimination of one or more nucleotides without changing its function. The practice of the present invention encompasses and is not constrained by such alterations to the promoter. In some embodiments, it is operably linked to a DNA sequence encoding the fusion polypeptide. Such linkage comprises positioning of the promoter with respect to the translation initiation codon of the DNA sequence encoding the fusion DNA sequence.
In some embodiments, the promoter is an inducible promoter. In some embodiments, the promoter is induced by a change in temperature, e.g., an increase in temperature from 37 degrees Celsius to 42 degrees Celsius. In some embodiments, the promoter is induced by an agent, such as a small molecule such as IPTG. In some embodiments, the inducible promoter is a lacI or lacZ promoter.
In some embodiments where a lac family promoter is utilized, the lacI gene may also be present in the system. The lacI gene (usually a constitutively expressed gene) encodes the Lac repressor protein (LacI), which binds to the lac operator of the lac family promoter. Therefore, in some embodiments, when the lac family promoter is utilized, the lacI gene can also be included and expressed in the expression system.
In some embodiments, the expression vector comprises a lac operator. In some embodiments, the lac operator comprises SEQ ID NO:32. In some embodiments, the lac operator is derived from the pGEX-6P-1 commercial vector (Addgene). An operator sequence located at the 5' end serves as a binding site for a repressor protein that blocks RNA polymerase.
In some embodiments, the expression vector comprises a tac promoter. In some embodiments, the tac promoter comprises SEQ ID NO:33. In some embodiments, the tac promoter is derived from the pGEX-6P-1 commercial vector (Addgene).
As used herein, a nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic acid sequence. For example, DNA encoding a secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation. Operably linked DNA sequences are usually contiguous, and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers do not have to be contiguous. Linking is typically accomplished by ligation at convenient restriction sites. If such sites do not exist, in some embodiments, synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice.
In some embodiments, other regulatory elements can be present in the expression construct encoding the recombinant fusion protein. In embodiments, the solubilizable recombinant fusion protein is present in either the cytoplasm or periplasm of the cell during production. In some embodiments, the expression constructs of the invention encode a recombinant fusion protein fused to a secretory leader capable of transporting the recombinant fusion protein to the cytoplasm of cells. In some embodiments, the expression construct encodes a recombinant fusion protein fused to a secretory leader capable of transporting the recombinant fusion protein to the periplasm. In some embodiments, the secretory leader is cleaved from the recombinant fusion protein.
Other regulatory elements include, but are not limited to, transcription enhancer sequences, translation enhancer sequences, other promoters, activators, translation start and stop signals, transcription terminators, cistron regulators, and polycistronic regulators. Further included are tag sequences, such as nucleotide sequence "tags" and "tag" polypeptide coding sequences, that facilitate identification, separation, purification, and/or isolation of the polypeptide. In some embodiments, the expression construct comprises, in addition to the protein coding sequence, any of the following regulatory elements operably linked thereto: a promoter, a ribosome binding site (RBS), a transcription terminator, and translation initiation and termination signals. Useful RBSs can also be obtained from any of the species useful as host cells in, for example, the expression systems of US Patent Application Publication Nos. 2008/0269070 and 2010/0137162. Many specific and various consensus RBSs are known. See Frishman et al., Gene 234 (2): 257-65 (8 Jul. 1999); and B. E. Suzek et al., Bioinformatics 17 (12): 1123-30 (December 2001).
Methods, vectors, and translation and transcriptional elements, as well as other elements useful in the present invention, are well known in the art and are described, for example, in Gilroy's US Pat. No. 5,055,294 and Gilroy et al. 5,128,130, Rammler et al., US Pat. No. 5,281,532, Barnes et al., US Pat. Nos. 4,695,455 and 4,861,595; Gray et al., US Pat. No. 4,755,465; and in Wilcox US Pat. No. 5,169,760, and in many of the other publications incorporated herein by reference.
In some embodiments, a secretory signal or leader coding sequence is fused to the N-terminus of the sequence encoding the recombinant fusion protein. The use of a secretory signal sequence can increase the production of recombinant proteins in bacteria. By utilizing a secretory leader, it is possible to increase the yield of properly folded proteins by secreting proteins from the intracellular environment. In Gram-negative bacteria, proteins secreted from the cytoplasm may end up in the periplasmic space, attached to the outer membrane, or in the extracellular culture medium. These methods also avoid the formation of inclusion bodies. Secretion of proteins into the periplasmic space promotes proper disulfide bond formation (Bardwell et al., 1994, Phosphate Microorg, Chapter 45, 270-5, and Manoil, 2000, Methods in Enzymol. 326: 35-47). Other benefits of secreting recombinant proteins include more efficient isolation of proteins, a higher yield of active protein, reduced inclusion body formation, reduced toxicity to host cells, and a larger fraction of the recombinant protein in soluble form. Because the protein of interest may be excreted into the medium, secretion can also potentially favor continuous culture, rather than batch culture, for protein production.
In some embodiments, the recombinant fusion protein is targeted to the periplasm or extracellular space of a host cell. In embodiments, the expression vector further comprises a nucleotide sequence encoding a secretory signal polypeptide operably linked to a nucleotide sequence encoding the recombinant fusion protein.
In some embodiments, the expression vector further comprises a transcription termination signal downstream of a nucleotide sequence encoding the fusion protein. As used herein, "terminator sequence" refers to a DNA sequence which is recognized by the expression host to terminate transcription. It is operably linked to the 3' end of the fusion DNA encoding the fusion polypeptide to be expressed. In some embodiments, the termination region is obtained from the same gene as the promoter sequence, while in other embodiments it is obtained from another gene. The selection of suitable transcription termination signals is well-known to those of skill in the art. In some embodiments, the expression vector comprises an rrnB T1 terminator comprising SEQ ID NO:34. In some embodiments, the vector comprises a T7Te terminator comprising SEQ ID NO:35. In some embodiments, the vector comprises an rrnB T1 terminator sequence followed by a T7Te terminator sequence.
In some embodiments, the expression vector comprises a selectable marker-encoding nucleic acid sequence. As used herein, the term "selectable marker-encoding nucleotide sequence" refers to a nucleotide sequence which is capable of expression in prokaryotic cells and where expression of the selectable marker confers to cells containing the expressed gene the ability to grow in the presence of a corresponding selective condition.
The choice of the proper selectable marker will depend on the host cell. Appropriate markers for different bacterial hosts are well known in the art. Typical selectable marker genes encode proteins that (a) confer resistance to antibiotics or other toxins (e.g., ampicillin, methotrexate, tetracycline, neomycin, mycophenolic acid, puromycin, zeomycin, or hygromycin); or (b) complement an auxotrophic mutation or a naturally occurring nutritional deficiency in the host strain. In some embodiments, the selectable marker gene encodes a gene capable of conferring antibiotic resistance. In some embodiments, the selectable marker gene encodes a gene capable of conferring resistance to ampicillin.
In some embodiments, the expression vector encodes a fusion protein (GST- MBP-eRAD18-Cas9-CtIP) comprising an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO:7.
In some embodiments, the fusion protein (GST-MBP-eRAD18-Cas9-CtIP) is encoded by a nucleic acid sequence at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO:8.
In some embodiments, the prokaryotic expression vector comprises a nucleic acid sequence at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO:9. The vector map is shown in FIG. 7.
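The percent-identity thresholds recited above can be illustrated with a short calculation. The following is a minimal sketch only, assuming a simple Needleman-Wunsch global alignment with unit scores and assuming that identity is reported as matched positions divided by alignment length; other conventions for the denominator exist, and the specification does not prescribe one. The function names and the example sequences are hypothetical and are not SEQ ID NOs of this application.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length (assumed convention)."""
    x, y = needleman_wunsch(a.upper(), b.upper())
    matches = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * matches / len(x)


def meets_threshold(candidate, reference, threshold=90.0):
    """True if the candidate is at least `threshold` percent identical to the reference."""
    return percent_identity(candidate, reference) >= threshold
```

For example, under these assumptions a ten-residue candidate that differs from a reference at one position is 90% identical and would satisfy an "at least 90%" threshold but not an "at least 95%" threshold.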
In some embodiments, the nucleotide sequence encoding the fusion protein is codon-optimized for expression in a prokaryotic host cell.
Host cells
In another embodiment, the present invention provides host cells which have been transduced, transformed or transfected with a vector as described herein. In some embodiments, the host cells are prokaryotic. The culture conditions, such as temperature, pH and the like, are those previously used for the parental host cell prior to transduction, transformation or transfection and are apparent to those skilled in the art.
In some embodiments, the nucleotide sequence encoding a fusion protein is operably linked to a promoter sequence functional in the host cell. In one embodiment, a bacterial culture is transformed with an expression vector having a promoter or biologically active promoter fragment or one or more (e.g., a series) enhancers which functions in the host cell, operably linked to a nucleic acid sequence encoding the fusion protein, such that the fusion protein is expressed in the cell.
Representative examples of appropriate hosts include bacterial cells, such as streptococci, staphylococci, Escherichia coli, Streptomyces and Bacillus subtilis cells. In some embodiments, the host cell is Escherichia coli.
CRISPR/CAS9 System
In some embodiments, the invention provides a CRISPR/Cas9 system for modifying a nucleic acid sequence in cells or in cells of a subject comprising: a) a sgRNA molecule comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome; and b) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof. In some embodiments, the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
In some embodiments, the invention provides a CRISPR/Cas9 system for modifying a nucleic acid sequence in cells or in cells of a subject comprising: a) a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome; and b) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof. In some embodiments, the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
In another embodiment, the invention provides a CRISPR/Cas9 system for modifying a nucleic acid sequence in cells or in cells of a subject, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome. In some embodiments, the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
In another embodiment, the invention provides a CRISPR/Cas9 system for modifying a nucleic acid sequence in cells or in cells of a subject, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome. In some embodiments, the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
In general, a CRISPR/CAS9 system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system) to effect the modification of a nucleic acid sequence. The CRISPR/CAS9 system includes one or more nucleic acids or polypeptides encoding fusion proteins and one or more sgRNAs as described herein. In some embodiments, the CRISPR/CAS9 system includes a nucleic acid template encoding a sequence of interest for purposes of editing a nucleic acid sequence.
In the context of formation of a CRISPR complex, “target sequence” refers to a sequence to which a guide RNA sequence is designed to have complementarity, where hybridization between a target sequence and a guide RNA sequence promotes the formation of a CRISPR complex. Full complementarity is not necessarily required, provided there is sufficient complementarity to cause hybridization and promote formation of a CRISPR complex. A target sequence may comprise any polynucleotide, such as DNA or RNA polynucleotides. In some embodiments, a target sequence is located in the nucleus or cytoplasm of a cell. In some embodiments, the target sequence may be within an organelle of a eukaryotic cell, for example, mitochondrion or chloroplast. A sequence or template that may be used for recombination into the targeted locus comprising the target sequences is referred to as an “editing template,” “editing polynucleotide,” “editing sequence,” or as a “nucleic acid template encoding a sequence of interest” herein. In aspects of the invention, an exogenous template polynucleotide may be referred to as a “nucleic acid template encoding a sequence of interest.” In some embodiments, the recombination is homologous recombination.
In some embodiments, a coding sequence encoding the fusion protein is codon optimized for expression in particular cells, such as eukaryotic cells. The eukaryotic cells may be those of or derived from a particular organism, such as a mammal, including but not limited to human, mouse, rat, rabbit, dog, or non-human primate. In general, codon optimization refers to a process of modifying a nucleic acid sequence for enhanced expression in the host cells of interest by replacing at least one codon (e.g. about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more codons) of the native sequence with codons that are more frequently or most frequently used in the genes of that host cell while maintaining the native amino acid sequence. Various species exhibit particular bias for certain codons of a particular amino acid. Codon bias (differences in codon usage between organisms) often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to be dependent on, among other things, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example, at the “Codon Usage Database”, and these tables can be adapted in a number of ways. See Nakamura, Y., et al. “Codon usage tabulated from the international DNA sequence databases: status for the year 2000” Nucl. Acids Res. 28:292 (2000). Computer algorithms for codon optimizing a particular sequence for expression in a particular host cell, such as Gene Forge (Aptagen; Jacobus, Pa.), are also available. In some embodiments, one or more codons (e.g. 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more, or all codons) in a sequence encoding a CRISPR enzyme correspond to the most frequently used codon for a particular amino acid.
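The codon replacement described above can be sketched in a few lines. The following is an illustrative example only, assuming the simplest "most frequent codon" strategy and the standard genetic code; it is not the algorithm of any particular commercial tool. The usage values in `HYPOTHETICAL_USAGE` are placeholders chosen for illustration and are not taken from the Codon Usage Database; in practice a complete table for the chosen host would be supplied.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds):
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def preferred_codons(usage):
    """Map each amino acid to its most frequently used codon.

    `usage` maps codon -> relative frequency in the host; codons missing
    from the table are treated as frequency 0.
    """
    best = {}
    for codon, aa in CODON_TO_AA.items():
        freq = usage.get(codon, 0.0)
        if aa not in best or freq > best[aa][1]:
            best[aa] = (codon, freq)
    return {aa: codon for aa, (codon, _) in best.items()}


def codon_optimize(cds, usage):
    """Recode `cds` with host-preferred codons, keeping the amino acid sequence."""
    best = preferred_codons(usage)
    optimized = "".join(best[aa] for aa in translate(cds))
    # Sanity check: the encoded protein must be unchanged.
    assert translate(optimized) == translate(cds)
    return optimized


# Hypothetical, incomplete usage table for illustration only.
HYPOTHETICAL_USAGE = {"CTG": 50.0, "GCG": 35.0, "AAA": 75.0}
```

Replacing every codon with the single most frequent synonym is only one way to adapt a usage table; schemes that match the overall codon distribution of the host, or that avoid unwanted sequence motifs, are also used.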
In general, the sgRNA sequence is any polynucleotide sequence having sufficient complementarity with a target polynucleotide sequence to hybridize with the target sequence and direct sequence-specific binding of a CRISPR complex to the target sequence. In some embodiments, the degree of complementarity between a sgRNA sequence and its corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more. Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g. the Burrows Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies), ELAND (Illumina, San Diego, Calif.), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net). In some embodiments, a guide sequence is about or more than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length. In some embodiments, a guide sequence is less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length. The ability of a guide sequence to direct sequence-specific binding of a CRISPR complex to a target sequence may be assessed by any suitable assay. For example, the components of a CRISPR system sufficient to form a CRISPR complex, including the guide sequence to be tested, may be provided to a host cell having the corresponding target sequence, such as by transfection with vectors encoding the components of the CRISPR sequence, followed by an assessment of preferential cleavage within the target sequence, such as by Surveyor assay as described herein.
Similarly, cleavage of a target polynucleotide sequence may be evaluated in a test tube by providing the target sequence, components of a CRISPR complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing binding or rate of cleavage at the target sequence between the test and control guide sequence reactions. Other assays are possible, and will occur to those skilled in the art.
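The degree of complementarity between a guide sequence and its target can be illustrated computationally. The following is a minimal sketch, assuming an ungapped, antiparallel comparison of a guide RNA against an equal-length target DNA strand and counting only Watson-Crick pairs; it does not perform the optimal alignment contemplated above and ignores G:U wobble pairing. The function name and the example sequences are hypothetical.

```python
# Watson-Crick pairs written as (RNA base in guide, DNA base in target strand).
WATSON_CRICK = {("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(guide_rna, target_strand):
    """Percent of guide positions that base-pair with the target strand.

    Both sequences are given 5'->3'. The target strand is reversed so that
    it is read 3'->5', antiparallel to the guide. No gaps are allowed in
    this simplified comparison.
    """
    guide = guide_rna.upper()
    target = target_strand.upper()[::-1]
    if len(guide) != len(target):
        raise ValueError("this ungapped sketch requires equal-length sequences")
    paired = sum((g, t) in WATSON_CRICK for g, t in zip(guide, target))
    return 100.0 * paired / len(guide)


# Hypothetical 20-nt guide and a fully complementary target strand.
guide = "GACGUUACGGAUCCAUGCAA"
perfect_target = "TTGCATGGATCCGTAACGTC"
# Same target with two substitutions at its 5' end.
mismatched_target = "AGGCATGGATCCGTAACGTC"
```

Under these assumptions the first target pairs at every position, while the second pairs at 18 of 20 positions and would satisfy a 90% threshold but not a 95% threshold.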
A sgRNA sequence may be selected to target any target sequence. In some embodiments, the target sequence is a sequence within a genome of a cell. Exemplary target sequences include those that are unique in the target genome.
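Whether a candidate target sequence is unique in a genome can be checked by counting its occurrences. The following is a simplified sketch, assuming exact string matching on both strands of a genome held in memory; practical off-target analysis typically also tolerates mismatches and considers PAM context, which this example does not attempt. The function names and the toy genome are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def _count_overlapping(haystack, needle):
    """Count occurrences of needle in haystack, allowing overlaps."""
    count, start = 0, haystack.find(needle)
    while start != -1:
        count += 1
        start = haystack.find(needle, start + 1)
    return count


def count_occurrences(genome, target):
    """Count exact matches of target on either strand of genome."""
    genome, target = genome.upper(), target.upper()
    rc = reverse_complement(target)
    total = _count_overlapping(genome, target)
    if rc != target:
        # A match of the reverse complement on the given strand is a match
        # of the target on the opposite strand. Palindromic targets are
        # counted once per site.
        total += _count_overlapping(genome, rc)
    return total


def is_unique(genome, target):
    """True if target occurs exactly once across both strands."""
    return count_occurrences(genome, target) == 1


# Toy genome for illustration only.
toy_genome = "TTGACCATGGCAAAAATGCCATGGTCAA"
```

In this toy example, "ACCATGGCAA" occurs once and would be considered unique, whereas "GACCATGG" occurs once on each strand and would not.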
In some embodiments, a sgRNA sequence is selected to reduce the degree of secondary structure within the guide sequence. Secondary structure may be determined by any suitable polynucleotide folding algorithm. Some programs are based on calculating the minimal Gibbs free energy. An example of one such algorithm is mFold, as described by Zuker and Stiegler (Nucleic Acids Res. 9 (1981), 133-148). Another example folding algorithm is the online webserver RNAfold, developed at Institute for Theoretical Chemistry at the University of Vienna, using the centroid structure prediction algorithm (see e.g. A. R. Gruber et al., 2008, Cell 106(1): 23-24; and PA Carr and GM Church, 2009, Nature Biotechnology 27(12): 1151-62).
Methods
In some embodiments, the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify the genomic sequence in the cell. In some embodiments, the one or more additional agents include guide RNA(s) and/or template nucleic acid.
In some embodiments, the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify the genomic sequence in the cell. In some embodiments, the one or more additional agents include guide RNA(s) and/or template nucleic acid.
In some embodiments, the fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 (or Rad52) or a variant or fragment thereof, and CtIP or a variant or fragment thereof, with optimized (HMEJ) donor DNA can improve knockin performance (both KI precision and efficiency) up to 40-fold (e.g., in cultured human cells) compared to conventional Cas9 knockin (with homologous recombination donor DNA).
In some embodiments, the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a CRISPR/Cas9 system, wherein the system comprises a) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof; and b) a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome. In some embodiments, the cell is further administered an effective amount of a nucleic acid template encoding a sequence of interest (e.g., on a vector).
In some embodiments, the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a CRISPR/Cas9 system, wherein the system comprises a) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof; and b) a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome. In some embodiments, the cell is further administered an effective amount of a nucleic acid template encoding a sequence of interest (e.g., on a vector).
In some embodiments, the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a CRISPR/Cas9 system, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome. In some embodiments, the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
In some embodiments, the invention provides a method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of a CRISPR/Cas9 system, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome. In some embodiments, the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
In some embodiments, the invention provides a method of treating a disease or condition in a subject by administering to the subject an effective amount of the cells having a modified genomic sequence.
In some embodiments, the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify a genomic sequence in cells of the subject. In some embodiments, the one or more additional agents include guide RNA(s) and/or template nucleic acid.
In some embodiments, the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify genomic sequence in the subject. In some embodiments, the one or more additional agents include guide RNA(s) and/or template nucleic acid.
In some embodiments, the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a CRISPR/Cas9 system, wherein the system comprises a) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof; and b) a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome. In some embodiments, the subject is further administered an effective amount of a nucleic acid template encoding a sequence of interest (e.g., on a vector).
In some embodiments, the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a CRISPR/Cas9 system, wherein the system comprises a) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof; and b) a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome. In some embodiments, the subject is further administered an effective amount of a nucleic acid template encoding a sequence of interest (e.g., on a vector).
In some embodiments, the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a CRISPR/Cas9 system, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome. In some embodiments, the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
In some embodiments, the invention provides a method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a CRISPR/Cas9 system, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome. In some embodiments, the system further comprises a nucleic acid template encoding a sequence of interest (e.g., on a vector).
In some embodiments, the invention provides a method of modifying a genomic sequence of a cell comprising: introducing a system described herein into a cell. In some embodiments, the introducing results in disruption, deletion, or insertion of a target nucleic acid (e.g., gene) in the cell. In some embodiments, gene editing results in an increase or decrease in expression of an endogenous or exogenous gene in the cell. In some embodiments, the cell is a eukaryotic cell (e.g., a mammalian cell, such as a human cell). In some embodiments, the cell is in vitro, ex vivo, or in vivo. In some embodiments, the method treats a disease or condition in a subject. In some embodiments, the genomic editing comprises HDR or NHEJ.
In some embodiments, the invention provides methods comprising delivering one or more polynucleotides, such as one or more vectors, nucleic acids, or systems, one or more transcripts thereof, and/or one or more proteins transcribed therefrom, to a cell or to a subject. In some aspects, the invention further provides cells produced by such methods, and organisms (such as animals, plants, or fungi) comprising or produced from such cells.
In some embodiments, a CRISPR/Cas9 system is delivered to a cell. Viral and non-viral based gene transfer methods can be used to introduce nucleic acids in mammalian cells or target tissues. Such methods can be used to administer nucleic acids encoding components of a CRISPR/Cas9 system to cells in culture, or in a host organism. Non-viral vector delivery systems include DNA plasmids, RNA (including circular RNA), naked nucleic acid, and nucleic acid complexed with a delivery vehicle, such as a liposome, or lipid nanoparticle. Viral vector delivery systems include DNA and RNA viruses, which have either episomal or integrated genomes after delivery to the cell. For a review of gene therapy procedures, see Anderson, Science 256:808-813 (1992); Nabel & Felgner, TIBTECH 11:211-217 (1993); Mitani & Caskey, TIBTECH 11:162-166 (1993); Dillon, TIBTECH 11:167-175 (1993); Miller, Nature 357:455-460 (1992); Van Brunt, Biotechnology 6(10):1149-1154 (1988); Vigne, Restorative Neurology and Neuroscience 8:35-36 (1995); Kremer & Perricaudet, British Medical Bulletin 51(1):31-44 (1995); Haddada et al., in Current Topics in Microbiology and Immunology Doerfler and Bohm (eds) (1995); and Yu et al., Gene Therapy 1:13-26 (1994).
Methods of non-viral delivery of nucleic acids include lipofection, nucleofection, microinjection, biolistics, virosomes, liposomes, immunoliposomes, polycations or lipid:nucleic acid conjugates, naked DNA, lipid nanoparticles, lipid-like nanoparticles, artificial virions, and agent-enhanced uptake of DNA. Lipofection is described in, e.g., U.S. Pat. Nos. 5,049,386, 4,946,787, and 4,897,355, and lipofection reagents are sold commercially (e.g., Transfectam™ and Lipofectin™). Cationic and neutral lipids that are suitable for efficient receptor-recognition lipofection of polynucleotides include those of Felgner, WO 91/17424; WO 91/16024. Delivery can be to cells (e.g. in vitro or ex vivo administration) or target tissues (e.g. in vivo administration).
The preparation of lipid:nucleic acid complexes, including targeted liposomes such as immunolipid complexes, is well known to one of skill in the art (see, e.g., Crystal, Science 270:404-410 (1995); Blaese et al., Cancer Gene Ther. 2:291-297 (1995); Behr et al., Bioconjugate Chem. 5:382-389 (1994); Remy et al., Bioconjugate Chem. 5:647-654 (1994); Gao et al., Gene Therapy 2:710-722 (1995); Ahmad et al., Cancer Res. 52:4817-4820 (1992); U.S. Pat. Nos. 4,186,183, 4,217,344, 4,235,871, 4,261,975, 4,485,054, 4,501,728, 4,774,085, 4,837,028, and 4,946,787).
The use of RNA or DNA viral based systems for the delivery of nucleic acids takes advantage of highly evolved processes for targeting a virus to specific cells in the body and trafficking the viral payload to the nucleus. Viral vectors can be administered directly to patients (in vivo) or they can be used to treat cells in vitro, and the modified cells may optionally be administered to patients (ex vivo). Conventional viral based systems could include retroviral, lentiviral, adenoviral, adeno-associated virus, and herpes simplex virus vectors for gene transfer. Integration in the host genome is possible with the retrovirus, lentivirus, and adeno-associated virus gene transfer methods, often resulting in long term expression of the inserted transgene. Additionally, high transduction efficiencies have been observed in many different cell types and target tissues.
The tropism of a retrovirus can be altered by incorporating foreign envelope proteins, expanding the potential population of target cells. Lentiviral vectors are retroviral vectors that are able to transduce or infect non-dividing cells and typically produce high viral titers. Selection of a retroviral gene transfer system would therefore depend on the target tissue. Retroviral vectors are comprised of cis-acting long terminal repeats with packaging capacity for up to 6-10 kb of foreign sequence. The minimum cis-acting LTRs are sufficient for replication and packaging of the vectors, which are then used to integrate the therapeutic gene into the target cell to provide permanent transgene expression. Widely used retroviral vectors include those based upon murine leukemia virus (MuLV), gibbon ape leukemia virus (GaLV), simian immunodeficiency virus (SIV), human immunodeficiency virus (HIV), and combinations thereof (see, e.g., Buchscher et al., J. Virol. 66:2731-2739 (1992); Johann et al., J. Virol. 66:1635-1640 (1992); Sommerfelt et al., Virol. 176:58-59 (1990); Wilson et al., J. Virol. 63:2374-2378 (1989); Miller et al., J. Virol. 65:2220-2224 (1991); PCT/US94/05700). In applications where transient expression is preferred, adenoviral based systems may be used. Adenoviral based vectors are capable of very high transduction efficiency in many cell types and do not require cell division. With such vectors, high titer and levels of expression have been obtained. This vector can be produced in large quantities in a relatively simple system. Adeno-associated virus (“AAV”) vectors may also be used to transduce cells with target nucleic acids, e.g., in the in vitro production of nucleic acids and peptides, and for in vivo and ex vivo gene therapy procedures (see, e.g., West et al., Virology 160:38-47 (1987); U.S. Pat. No. 4,797,368; WO 93/24641; Kotin, Human Gene Therapy 5:793-801 (1994); Muzyczka, J. Clin. Invest. 94:1351 (1994)).
Construction of recombinant AAV vectors is described in a number of publications, including U.S. Pat. No. 5,173,414; Tratschin et al., Mol. Cell. Biol. 5:3251-3260 (1985); Tratschin et al., Mol. Cell. Biol. 4:2072-2081 (1984); Hermonat & Muzyczka, PNAS 81:6466-6470 (1984); and Samulski et al., J. Virol. 63:3822-3828 (1989).
Packaging cells are typically used to form virus particles that are capable of infecting a host cell. Such cells include 293 cells, which package adenovirus, and ψ2 cells or PA317 cells, which package retrovirus. Viral vectors used in gene therapy are usually generated by producing a cell line that packages a nucleic acid vector into a viral particle. The vectors typically contain the minimal viral sequences required for packaging and subsequent integration into a host, other viral sequences being replaced by an expression cassette for the polynucleotide(s) to be expressed. The missing viral functions are typically supplied in trans by the packaging cell line. For example, AAV vectors used in gene therapy typically only possess ITR sequences from the AAV genome which are required for packaging and integration into the host genome. Viral DNA is packaged in a cell line, which contains a helper plasmid encoding the other AAV genes, namely rep and cap, but lacking ITR sequences. The cell line may also be infected with adenovirus as a helper. The helper virus promotes replication of the AAV vector and expression of AAV genes from the helper plasmid. The helper plasmid is not packaged in significant amounts due to a lack of ITR sequences. Contamination with adenovirus can be reduced by, e.g., heat treatment to which adenovirus is more sensitive than AAV. Additional methods for the delivery of nucleic acids to cells are known to those skilled in the art. See, for example, US20030087817, incorporated herein by reference.
In some embodiments, a host cell is transiently or non-transiently transfected with one or more vectors described herein. In some embodiments, a cell is transfected as it naturally occurs in a subject. In some embodiments, a cell that is transfected is taken from a subject. In some embodiments, the cell is derived from cells taken from a subject, such as a cell line. A wide variety of cell lines for tissue culture are known in the art. Examples of cell lines include, but are not limited to, C8161, CCRF-CEM, MOLT, mIMCD-3, NHDF, HeLa-S3, Huh1, Huh4, Huh7, HUVEC, HASMC, HEKn, HEKa, MiaPaCell, Panc1, PC-3, TF1, CTLL-2, C1R, Rat6, CV1, RPTE, A10, T24, J82, A375, ARH-77, Calu1, SW480, SW620, SKOV3, SK-UT, CaCo2, P388D1, SEM-K2, WEHI-231, HB56, TIB55, Jurkat, J45.01, LRMB, Bcl-1, BC-3, IC21, DLD2, Raw264.7, NRK, NRK-52E, MRC5, MEF, Hep G2, HeLa B, HeLa T4, COS, COS-1, COS-6, COS-M6A, BS-C-1 monkey kidney epithelial, BALB/3T3 mouse embryo fibroblast, 3T3 Swiss, 3T3-L1, 132-d5 human fetal fibroblasts, 10.1 mouse fibroblasts, 293-T, 3T3, 721, 9L, A2780, A2780ADR, A2780cis, A172, A20, A253, A431, A-549, ALC, B16, B35, BCP-1 cells, BEAS-2B, bEnd.3, BHK-21, BR 293, BxPC3, C3H-10T1/2, C6/36, Cal-27, CHO, CHO-7, CHO-IR, CHO-K1, CHO-K2, CHO-T, CHO Dhfr-/-, COR-L23, COR-L23/CPR, COR-L23/5010, COR-L23/R23, COS-7, COV-434, CML T1, CMT, CT26, D17, DH82, DU145, DuCaP, EL4, EM2, EM3, EMT6/AR1, EMT6/AR10.0, FM3, H1299, H69, HB54, HB55, HCA2, HEK-293, HeLa, Hepa1c1c7, HL-60, HMEC, HT-29, Jurkat, JY cells, K562 cells, Ku812, KCL22, KG1, KYO1, LNCap, Ma-Mel 1-48, MC-38, MCF-7, MCF-10A, MDA-MB-231, MDA-MB-468, MDA-MB-435, MDCK II, MDCK II, MOR/0.2R, MONOMAC 6, MTD-1A, MyEnd, NCI-H69/CPR, NCI-H69/LX10, NCI-H69/LX20, NCI-H69/LX4, NIH-3T3, NALM-1, NW-145, OPCN/OPCT cell lines, Peer, PNT-1A/PNT 2, RenCa, RIN-5F, RMA/RMAS, Saos-2 cells, Sf-9, SkBr3, T2, T-47D, T84, THP1 cell line, U373, U87, U937, VCaP, Vero cells, WM39, WT-49, X63, YAC-1, YAR, and transgenic varieties thereof.
Cell lines are available from a variety of sources known to those with skill in the art (see, e.g., the American Type Culture Collection (ATCC) (Manassas, Va.)). In some embodiments, a cell transfected with one or more vectors described herein is used to establish a new cell line comprising one or more vector-derived sequences. In some embodiments, a cell transiently transfected with the components of a CRISPR system as described herein (such as by transient transfection of one or more vectors, or transfection with RNA), and modified through the activity of a CRISPR complex, is used to establish a new cell line comprising cells containing the modification but lacking any other exogenous sequence. In some embodiments, cells transiently or non-transiently transfected with one or more vectors described herein, or cell lines derived from such cells, are used in assessing one or more test compounds.
In some embodiments, one or more vectors described herein are used to produce a non-human transgenic animal or transgenic plant. In some embodiments, the transgenic animal is a mammal, such as a mouse, rat, or rabbit. In certain embodiments, the organism or subject is a plant. In certain embodiments, the organism or subject or plant is algae. Methods for producing transgenic plants and animals are known in the art, and generally begin with a method of cell transfection, such as described herein. Transgenic animals are also provided, as are transgenic plants, especially crops and algae. The transgenic animal or plant may be useful in applications outside of providing a disease model. These may include food or feed production through expression of, for instance, higher protein, carbohydrate, nutrient, or vitamin levels than would normally be seen in the wildtype. In this regard, transgenic plants, especially pulses and tubers, and animals, especially mammals such as livestock (cows, sheep, goats and pigs), but also poultry and edible insects, are preferred.
Transgenic algae or other plants such as rape may be particularly useful in the production of vegetable oils or biofuels such as alcohols (especially methanol and ethanol), for instance. These may be engineered to express or overexpress high levels of oil or alcohols for use in the oil or biofuel industries. Certain embodiments provide a method (e.g., gene editing method), comprising: introducing a system described herein into a cell. In some embodiments, the introducing results in disruption, deletion, or insertion of a target nucleic acid (e.g., gene) in the cell. In some embodiments, the gene editing results in an increase or decrease in expression of an endogenous or exogenous gene in the cell. In some embodiments, the cell is a eukaryotic cell (e.g., a mammalian (e.g., human) cell). In some embodiments, the cell is in vitro, ex vivo, or in vivo. In some embodiments, the method treats a disease or condition in a subject.
For example, in one aspect, the invention provides for methods of modifying a target polynucleotide in a eukaryotic cell, which may be in vivo, ex vivo or in vitro. In some embodiments, the method comprises sampling a cell or population of cells from a human or non-human animal or plant (including micro-algae), and modifying the cell or cells. Culturing may occur at any stage ex vivo. The cell or cells may even be re-introduced into the non-human animal or plant (including micro-algae).
The target polynucleotide in the cell or cells of a subject in the methods described herein can be any polynucleotide endogenous or exogenous to the cell. For example, the target polynucleotide can be a polynucleotide residing in the nucleus of a eukaryotic cell. The target polynucleotide can be a sequence coding a gene product (e.g., a protein) or a non-coding sequence (e.g., a regulatory polynucleotide or junk DNA). Without wishing to be bound by theory, it is believed that the target sequence should be associated with a PAM (protospacer adjacent motif); that is, a short sequence recognized by the CRISPR complex. The precise sequence and length requirements for the PAM differ depending on the CRISPR enzyme used, but PAMs are typically 2-5 base pair sequences adjacent to the protospacer (that is, the target sequence). Persons skilled in the art will be able to identify PAM sequences for use with a given CRISPR enzyme.
Examples of target polynucleotides include a sequence associated with a signaling biochemical pathway, e.g., a signaling biochemical pathway-associated gene or polynucleotide. Examples of target polynucleotides include a disease-associated gene or polynucleotide. A “disease-associated” gene or polynucleotide refers to any gene or polynucleotide which is yielding transcription or translation products at an abnormal level or in an abnormal form in cells derived from disease-affected tissues compared with tissues or cells of a non-disease control. It may be a gene that becomes expressed at an abnormally high level; it may be a gene that becomes expressed at an abnormally low level, where the altered expression correlates with the occurrence and/or progression of the disease. A disease-associated gene also refers to a gene possessing mutation(s) or genetic variation that is directly responsible or is in linkage disequilibrium with a gene(s) that is responsible for the etiology of a disease. The transcribed or translated products may be known or unknown, and may be at a normal or abnormal level.
Various methods may be employed for delivering a nucleic acid or expression vector into cells in vitro. Methods of introducing nucleic acids into cells for expression of heterologous nucleic acid sequences are also known to the ordinarily skilled artisan, including, but not limited to, electroporation; protoplast fusion with intact cells; transduction; high velocity bombardment with DNA-coated microprojectiles; infection with modified viral (e.g., phage) nucleic acids; chemically-mediated transformation, competence, etc.
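As a minimal sketch of how candidate target sequences next to a PAM might be located, the Python below scans one strand of a DNA string for a 3-nucleotide NGG motif and reports the protospacer immediately 5' of it. The NGG motif and the 20-nucleotide protospacer length are assumptions commonly associated with Streptococcus pyogenes Cas9; they are not values stated in this paragraph. The function name is hypothetical, and the reverse strand is not searched.

```python
import re


def find_ngg_targets(seq, protospacer_len=20):
    """Return (start, protospacer, pam) tuples for NGG PAMs on the given strand.

    Assumes an NGG PAM located 3' of the protospacer (an assumption typical
    of S. pyogenes Cas9, not taken from the text). Only the supplied strand
    is scanned.
    """
    seq = seq.upper()
    hits = []
    # A zero-width lookahead lets overlapping PAM candidates all be reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        start = pam_start - protospacer_len
        if start >= 0:  # skip PAMs too close to the 5' end for a full protospacer
            hits.append((start, seq[start:pam_start], match.group(1)))
    return hits


example = "A" * 20 + "TGG"
print(find_ngg_targets(example))
```

For a different CRISPR enzyme, the regular expression and the protospacer length would need to be changed to match that enzyme's PAM requirements.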
Following introduction of the expression vector or nucleic acid into host cells, the host cells can be cultured in suitable nutrient media. In some embodiments, the media can be modified as appropriate for any activating promoters, selecting transformants, and/or amplifying expression of the fusion protein by modifying culture conditions, such as temperature, pH and the like.
Accordingly, in another embodiment, the invention provides a method for producing the fusion protein in a prokaryotic host cell, comprising culturing the prokaryotic host cell comprising the expression vector in a growth media under conditions suitable for the expression of the fusion protein and isolating the fusion protein. In some embodiments, the fusion protein is produced and isolated as described in Example 2.
As used herein, the terms "isolated" and "purified" refer to a nucleic acid or polypeptide that is removed from at least one component with which it is associated.
In some embodiments, the isolated protein is substantially free of other cellular components. As used herein, the term "substantially free" encompasses preparations of the desired fusion polypeptide having less than about 20% (by dry weight) other proteins (i.e., contaminating protein), less than about 10% other proteins, less than about 5% other proteins, or less than about 1% other proteins.
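The tiers in the preceding paragraph can be illustrated with a short Python sketch. It computes the percentage of contaminating protein from two hypothetical dry-weight measurements and reports the strictest of the listed limits (20%, 10%, 5%, 1%) that a preparation satisfies. The function names and input quantities are illustrative assumptions, and the word "about" in the text is treated here as an exact cutoff for simplicity.

```python
def contaminant_percent(target_mg, total_mg):
    """Percent (by dry weight) of protein that is not the desired fusion protein."""
    if total_mg <= 0 or not 0 <= target_mg <= total_mg:
        raise ValueError("expected 0 <= target_mg <= total_mg and total_mg > 0")
    return 100.0 * (total_mg - target_mg) / total_mg


def strictest_tier_met(percent_other, tiers=(1.0, 5.0, 10.0, 20.0)):
    """Return the smallest listed limit that percent_other falls below, else None."""
    for limit in sorted(tiers):
        if percent_other < limit:
            return limit
    return None


# Hypothetical preparation: 95 mg fusion protein in 100 mg total protein.
pct = contaminant_percent(95.0, 100.0)
print(pct, strictest_tier_met(pct))
```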
As used herein, the term "substantially pure" when applied to the fusion proteins or fragments thereof of the present invention means that the proteins are essentially free of other substances to an extent practical and appropriate for their intended use. In particular, the proteins are sufficiently pure and are sufficiently free from other biological constituents of the host cells so as to be useful in, for example, protein sequencing, and/or producing pharmaceutical preparations.
Methods for growing and culturing prokaryotic host cells are known in the art. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition (1989). In some embodiments, a culture of the prokaryotic host cells that harbors the expression vector is used to inoculate a growth media. The growth media is not limiting, provided it is suitable for promoting growth of the host cells. In some embodiments, the growth media comprises Luria broth. In some embodiments, the culture of prokaryotic host cells used to inoculate the growth media is an overnight culture. In some embodiments, the host cells are cultured at 37°C.
In some embodiments, the prokaryotic host cells are incubated in the growth media for a period of time to achieve a certain density prior to inducing expression of the fusion protein.
In some embodiments, the prokaryotic host cells are incubated in the growth media until the optical density at 600 nanometers reaches a value between about 0.6 and about 0.8, at which point expression of the fusion protein is induced by addition of an agent or a change in culture conditions. In some embodiments, the inducing agent is IPTG. In some embodiments, the change in conditions is a change in temperature. In some embodiments, the change in temperature is an increase in temperature, e.g., to about 42 degrees Celsius.
In some embodiments, the growth media is cooled following incubation of the cells in the growth media and prior to or subsequent to inducing expression of the fusion protein by the addition of an agent, such as IPTG. In some embodiments, IPTG is added to a final concentration in the growth media of about 0.25 mM to about 1.0 mM. In some embodiments, the concentration of IPTG in the growth media is about 0.5 mM. In some embodiments, the growth media is incubated at a temperature of between about 14-24 degrees Celsius, wherein the fusion protein is expressed in the prokaryotic host cells in the presence of the agent.
The incubation time for cells in the media following induction of expression of the fusion protein is not necessarily limiting, provided a sufficient quantity of the fusion protein is produced. In some embodiments, the cells are incubated for about 12 to about 24 hours to allow expression of the fusion protein. In some embodiments, the cells are incubated in the growth media for about 18 hours at about 16 degrees Celsius to allow for expression of the fusion protein.
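The final IPTG concentrations given above imply a simple dilution calculation, sketched below in Python. The 1 M stock concentration and 1 L culture volume are hypothetical values chosen only for illustration, and the function name is an assumption. The formula follows from mass balance, C_stock × V_stock = C_final × (V_culture + V_stock), so it accounts for the small volume the stock itself adds.

```python
def stock_volume_ml(final_mM, culture_ml, stock_mM):
    """Volume of stock (mL) to add so the mixture reaches final_mM.

    Derived from C_stock * V_stock = C_final * (V_culture + V_stock).
    """
    if culture_ml <= 0 or not 0 < final_mM < stock_mM:
        raise ValueError("need culture_ml > 0 and 0 < final_mM < stock_mM")
    return final_mM * culture_ml / (stock_mM - final_mM)


# Hypothetical example: 1 L culture, 1 M (1000 mM) IPTG stock, 0.5 mM target.
volume = stock_volume_ml(final_mM=0.5, culture_ml=1000.0, stock_mM=1000.0)
print(f"add about {volume:.3f} mL of stock")
```

When the stock is far more concentrated than the target, the result is close to the familiar approximation C_final × V_culture / C_stock.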
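The IPTG-based induction parameters described in the preceding paragraphs can be collected into a small range check, sketched below in Python. The class, field, and function names are hypothetical. The ranges are the "about" values from the text, treated as hard bounds for simplicity, and the sketch does not cover the alternative embodiment in which expression is induced by a temperature increase.

```python
from dataclasses import dataclass


@dataclass
class InductionPlan:
    od600_at_induction: float   # optical density at 600 nm when inducing
    iptg_mM: float              # final IPTG concentration in the media
    expression_temp_c: float    # incubation temperature after induction
    expression_hours: float     # incubation time after induction


# Ranges taken from the description; "about" is treated as exact here.
RANGES = {
    "od600_at_induction": (0.6, 0.8),
    "iptg_mM": (0.25, 1.0),
    "expression_temp_c": (14.0, 24.0),
    "expression_hours": (12.0, 24.0),
}


def out_of_range(plan):
    """Return the names of fields that fall outside the described ranges."""
    problems = []
    for field, (low, high) in RANGES.items():
        value = getattr(plan, field)
        if not low <= value <= high:
            problems.append(field)
    return problems


# The specific embodiment in the text: 0.5 mM IPTG, 18 hours at 16 degrees Celsius.
print(out_of_range(InductionPlan(0.7, 0.5, 16.0, 18.0)))
```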
In some embodiments, the prokaryotic cells are lysed following incubation to make a lysate. In some embodiments, the cells are pelleted prior to lysis. The cells can be pelleted by centrifugation. In some embodiments, the cells are lysed by sonication. In some embodiments, the cells are lysed by addition of a solution that promotes lysis. In some embodiments, cellular debris is removed from the lysate, e.g., by centrifugation and/or filtration.
The fusion protein can be isolated/purified from the lysate. In some embodiments, the proteins are precipitated from the lysate prior to purification. In some embodiments, the lysate is subjected to chromatography to isolate and purify the fusion protein, e.g., by passing it through a column or other apparatus or composition that is able to capture the fusion protein. In some embodiments, the lysate is passed through a chromatography column that comprises an agent that binds to the fusion protein, e.g., glutathione in the case of GST-tagged proteins or Ni2+ or Co2+ in the case of 6x-His tagged proteins. The agent can be immobilized on beads or a resin to aid in the purification.
In some embodiments, a protease is added to cleave the fusion protein. In some embodiments, the protease recognizes an HRV 3C protease recognition site (e.g., SEQ ID NO: 12). In some embodiments, the protease is a human rhinovirus (HRV) type 14 3C protease. In some embodiments, the human rhinovirus (HRV) type 14 3C protease is fused to GST. In some embodiments, the protease is added while the fusion protein is bound to an agent on the column, and the fusion protein is subsequently eluted from the column following cleavage by the protease. In some embodiments, the fusion protein eluate is filtered. In some embodiments, the fusion protein is further concentrated. In some embodiments, glycerol is added (e.g., 20%) to the fusion protein and the fusion protein can be stored for later use under appropriate storage conditions.
Pharmaceutical compositions
In some embodiments, the invention provides pharmaceutical compositions comprising effective amounts of the nucleic acids, polypeptides, vectors encoding the polypeptides, or the CRISPR/Cas9 system herein in combination with a pharmaceutically acceptable excipient. In some embodiments, the composition comprises a delivery system, such as liposomes, lipid nanoparticles or lipid-like nanoparticles for delivery of nucleic acids. In some embodiments, the lipid or lipid-like nanoparticles are ionizable.
A “pharmaceutically acceptable excipient” is a material that acts in concert with an active ingredient of a medication to impart desirable qualities to a drug intended to be introduced into the body of a subject. The desirable qualities could include enhancing long term stability, acting as a diluent for an active ingredient that must be administered in small amounts, enhancement of therapeutic qualities of an active ingredient, facilitating absorption of an active ingredient into the body, adjusting viscosity, enhancing solubility of an active ingredient, or modifying macroscopic properties of a drug such as flowability or adhesion. Pharmaceutically acceptable excipients can comprise but are not limited to diluents, binders, pH stabilizing agents, disintegrants, surfactants, glidants, dyes, flavoring agents, preservatives, sorbents, sweeteners and lubricants. These materials can take many different forms. See, e.g., Nema et al., Excipients and their use in injectable products, PDA J. Pharm. Sci. & Tech. 1997, 51(4):166-171.
The nucleic acids, vectors (e.g., AAV vectors) and CRISPR/Cas9 system described herein may be incorporated into a vehicle for administration into a patient, such as a human patient suffering from any of the conditions described herein. Pharmaceutical compositions containing nucleic acids such as RNA or vectors, such as viral vectors, that contain a polynucleotide encoding the fusion protein can be prepared using methods known in the art. For example, such compositions can be prepared using, e.g., physiologically acceptable carriers, excipients or stabilizers (Remington’s Pharmaceutical Sciences 16th edition, Osol, A. Ed. (1980); incorporated herein by reference), and in a desired form, e.g., in the form of lyophilized formulations or aqueous solutions.
Mixtures of the nucleic acids or vectors (e.g., AAV vectors) described herein may be prepared in water suitably mixed with one or more excipients, carriers, or diluents. Dispersions may also be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations may contain a preservative to prevent the growth of microorganisms. The pharmaceutical forms suitable for injectable use include sterile aqueous solutions or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions (described in US 5,466,468, the disclosure of which is incorporated herein by reference). In any case the formulation may be sterile and may be fluid to the extent that easy syringability exists. Formulations may be stable under the conditions of manufacture and storage and may be preserved against the contaminating action of microorganisms, such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (e.g., glycerol, propylene glycol, and liquid polyethylene glycol, and the like), suitable mixtures thereof, and/or vegetable oils. Proper fluidity may be maintained, for example, by the use of a coating, such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. The prevention of the action of microorganisms can be brought about by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars or sodium chloride. Prolonged absorption of the injectable compositions can be brought about by the use in the compositions of agents delaying absorption, for example, aluminum monostearate and gelatin.
For example, a solution containing a pharmaceutical composition described herein may be suitably buffered, if necessary, and the liquid diluent first rendered isotonic with sufficient saline or glucose. These particular aqueous solutions are especially suitable for intravenous, intramuscular, subcutaneous, and intraperitoneal administration. In this connection, sterile aqueous media that can be employed will be known to those of skill in the art in light of the present disclosure. The compositions described herein may be administered to a subject by a variety of routes, such as local administration, ocular, retinal, intravenous, parenteral, intradermal, transdermal, intramuscular, intranasal, subcutaneous, percutaneous, intratracheal, intraperitoneal, intraarterial, intravascular, inhalation, perfusion, lavage, and oral administration. The most suitable route for administration in any given case will depend on the particular composition administered, the patient, pharmaceutical formulation methods, administration methods (e.g., administration time and administration route), the patient's age, body weight, sex, severity of the disease being treated, the patient’s diet, and the patient’s excretion rate. Compositions may be administered once, or more than once (e.g., once annually, twice annually, three times annually, bimonthly, monthly, or bi-weekly).
Treatment may include administration of a composition containing the nucleic acids or vectors (e.g., AAV vectors) described herein in various unit doses. Each unit dose will ordinarily contain a predetermined quantity of the therapeutic composition. The quantity to be administered, and the particular route of administration and formulation, are within the skill of those in the clinical arts. A unit dose need not be administered as a single injection but may include continuous infusion over a set period of time. Dosing may be performed using a syringe pump to control infusion rate in order to minimize damage to the tissue administered.
Kits
In some embodiments, the nucleic acids, fusion proteins, or vectors described herein are provided in the form of a kit or system that optionally comprises one or more guide RNAs (e.g., sgRNAs), and/or a template nucleic acid encoding a sequence of interest.
In some embodiments, a kit comprises one or more reagents for use in a process utilizing one or more of the elements described herein. Reagents may be provided in any suitable container. For example, a kit may provide one or more reaction or storage buffers. Reagents may be provided in a form that is usable in a particular assay, or in a form that requires addition of one or more other components before use (e.g. in concentrate or lyophilized form). A buffer can be any buffer, including but not limited to a sodium carbonate buffer, a sodium bicarbonate buffer, a borate buffer, a Tris buffer, a MOPS buffer, a HEPES buffer, and combinations thereof. In some embodiments, the buffer is alkaline. In some embodiments, the buffer has a pH from about 7 to about 10. In some embodiments, the kit comprises one or more oligonucleotides corresponding to a guide sequence for insertion into a vector so as to operably link the guide sequence and a regulatory element. In some embodiments, the kit comprises a homologous recombination template polynucleotide.
Sample embodiments
This section describes exemplary compositions and methods of the invention, presented without limitation, as a series of paragraphs, some or all of which may be alphanumerically designated for clarity and efficiency. Each of these paragraphs can be combined with one or more other paragraphs, and/or with disclosure from elsewhere in this application, including the materials incorporated by reference, in any suitable manner. Some of the paragraphs below expressly refer to and further limit other paragraphs, providing without limitation examples of some of the suitable combinations.
1. A prokaryotic expression vector comprising a promoter sequence operably linked to a nucleotide sequence encoding a fusion protein, wherein the fusion protein comprises i) Cas9 or a variant or fragment thereof; ii) Rad18 or a variant or fragment thereof; and iii) CtIP or a variant or fragment thereof.
2. The expression vector of paragraph 1, wherein the vector is a plasmid.
3. The expression vector of paragraph 1 or 2, wherein the expression vector is a bacterial expression vector.
4. The expression vector of any of paragraphs 1-3, wherein the expression vector is capable of expressing the fusion protein in Escherichia coli.
5. The expression vector of any of paragraphs 1-4, further comprising a ribosome binding site (RBS) positioned upstream of the nucleotide sequence encoding the fusion protein.
6. The expression vector of any of paragraphs 1-5, further comprising a selectable marker gene.
7. The expression vector of paragraph 6, wherein the marker gene encodes a gene capable of conferring antibiotic resistance.
8. The expression vector of paragraph 7, wherein the marker gene encodes a gene capable of conferring resistance to ampicillin.
9. The expression vector of any of paragraphs 1-8, wherein the promoter is an inducible promoter.
10. The expression vector of paragraph 9, wherein the inducible promoter is a lacI promoter inducible by IPTG.
11. The expression vector of any of paragraphs 1-10, further comprising a transcription termination signal downstream of a nucleotide sequence encoding the fusion protein.
12. The expression vector of any of paragraphs 1-11, wherein the Cas9 is a wildtype Cas9 from Streptococcus pyogenes.
13. The expression vector of any of paragraphs 1-12, wherein the Cas9 comprises an amino acid sequence at least 90% identical to SEQ ID NO: 1.
14. The expression vector of any of paragraphs 1-13, wherein the Cas9 comprises amino acids 2-1368 of SEQ ID NO:1.
15. The expression vector of any of paragraphs 1-14, wherein Cas9 is encoded by a nucleic acid sequence at least 80% identical to SEQ ID NO:2.
16. The expression vector of any of paragraphs 1-15, wherein the fusion protein comprises a fragment of CtIP.
17. The expression vector of any of paragraphs 1-16, wherein the fusion protein comprises an HDR Enhancing N-terminal fragment of CtIP.
18. The expression vector of any of paragraphs 1-17, wherein the HDR Enhancing N-terminal fragment of CtIP comprises an amino acid sequence at least 90% identical to SEQ ID NO:3 (amino acids 1-296).
19. The expression vector of any of paragraphs 1-18, wherein the CtIP fragment is encoded by a nucleic acid sequence at least 80% identical to SEQ ID NO:4.
20. The expression vector of any of paragraphs 1-19, wherein the fusion protein comprises a fragment of Rad18.
21. The expression vector of any of paragraphs 1-20, wherein the fragment of Rad18 comprises a deletion of a putative DNA-binding domain.
22. The expression vector of any of paragraphs 1-21, wherein the fragment of Rad18 (also referred to herein as eRad18) has a deletion in amino acids 242-282 in Rad18 and is encoded by an amino acid sequence at least 90% identical to SEQ ID NO:5 or SEQ ID NO:43.
23. The expression vector of any of paragraphs 1-22, wherein the fragment of Rad18 is encoded by a nucleic acid sequence at least 80% identical to SEQ ID NO:6.
24. The expression vector of any of paragraphs 1-23, wherein the Rad18 or a variant or fragment thereof is fused to the amino terminus of the Cas9 or a variant or fragment thereof and the CtIP or a variant or fragment thereof is fused to the carboxy terminus of Cas9.
25. The expression vector of any of paragraphs 1-24, wherein the fusion protein comprises an amino acid sequence at least 95% identical to SEQ ID NO:7.
26. The expression vector of any of paragraphs 1-25, wherein the fusion protein is encoded by a nucleic acid sequence at least 80% identical to SEQ ID NO:8.
27. The expression vector of any of paragraphs 1-26, wherein the fusion protein comprises one or more affinity or purification tags.
28. The expression vector of any of paragraphs 1-27, wherein the affinity or purification tag comprises glutathione S-transferase.
29. The expression vector of any of paragraphs 1-28, wherein the affinity or purification tag comprises a maltose binding protein or a variant or fragment thereof.
30. The expression vector of any of paragraphs 1-29, wherein the affinity or purification tag comprises glutathione S-transferase and a maltose binding protein or a variant or fragment thereof.
31. The expression vector of any of paragraphs 1-30, wherein the expression vector comprises a nucleic acid sequence at least 80% identical to SEQ ID NO:9.
32. The expression vector of any of paragraphs 1-31, wherein the nucleotide sequence encoding the fusion protein is codon-optimized for expression in a prokaryotic host cell.
33. A prokaryotic host cell transformed or transfected with the expression vector of any of paragraphs 1-32.
34. The host cell of paragraph 33, wherein the host cell is Escherichia coli.
35. A method for producing the fusion protein in a prokaryotic host cell of paragraph 33 or paragraph 34, comprising culturing the prokaryotic host cell in a growth media under conditions suitable for the expression of the fusion protein and isolating the fusion protein.
36. The method of paragraph 35, wherein a culture of the prokaryotic host cells is used to inoculate the growth media.
37. The method of paragraph 35 or 36, wherein the growth media comprises Luria broth.
38. The method of paragraph 36 or 37, wherein the culture of prokaryotic host cells is an overnight culture.
39. The method of any of paragraphs 35-38, wherein the prokaryotic host cells are incubated in the growth media for a period of time prior to inducing expression of the fusion protein.
40. The method of any of paragraphs 35-39, wherein the prokaryotic host cells are incubated in the growth media until the optical density at 600 nanometers reaches a value between about 0.6 to about 0.8.
41. The method of any of paragraphs 35-40, wherein the growth media is cooled following incubation in the growth media.
42. The method of any of paragraphs 35-41, wherein expression of the fusion protein is induced by addition of an agent to the growth media or incubation of the prokaryotic host cells at a different temperature.
43. The method of paragraph 42, wherein the agent is IPTG.
44. The method of paragraph 43, wherein IPTG is added to a final concentration in the growth media of about 0.5 mM.
45. The method of any of paragraphs 39-44, wherein the growth media is incubated at a temperature of between about 14-24 degrees Celsius, wherein the fusion protein is expressed in the prokaryotic host cells in the presence of the agent.
46. The method of any of paragraphs 39-45, wherein the growth media is incubated for about 12 to about 24 hours to allow expression of the fusion protein.
47. The method of any of paragraphs 39-46, wherein the growth media is incubated for about 18 hours at about 16 degrees Celsius.
48. The method of any of paragraphs 35-47, wherein the prokaryotic cells are lysed to make a lysate.
49. The method of paragraph 48, wherein the cells are pelleted prior to lysis.
50. The method of paragraph 49, wherein the cells are pelleted by centrifugation.
51. The method of paragraph 48, wherein the cells are lysed by sonication.
52. The method of any of paragraphs 48-51, wherein cellular debris is removed from the lysate by centrifugation and/or filtration.
53. The method of any of paragraphs 48-52, further comprising isolating the fusion protein from the lysate.
54. The method of paragraph 53, wherein the fusion protein is purified by chromatography.
55. The method of paragraph 54, wherein the fusion protein is purified by affinity chromatography.
56. A nucleic acid comprising a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
57. The nucleic acid of paragraph 56, wherein the Cas9 polypeptide or a variant or fragment thereof comprises an amino acid sequence of SEQ ID NO:13.
58. The nucleic acid of any of paragraphs 56 or 57, wherein the Cas9 polypeptide or a variant or fragment thereof is encoded by a sequence comprising SEQ ID NO:14 or SEQ ID NO:37.
59. The nucleic acid of any of paragraphs 56-58, wherein the Rad18 polypeptide or a variant or fragment thereof comprises an amino acid sequence of SEQ ID NO:5.
60. The nucleic acid of any of paragraphs 56-59, wherein the Rad18 polypeptide or a variant or fragment thereof is encoded by a sequence comprising SEQ ID NO:6 or SEQ ID NO:40.
61. The nucleic acid of any of paragraphs 56-60, wherein the CtIP polypeptide or a variant or fragment thereof comprises an amino acid sequence of SEQ ID NO:3.
62. The nucleic acid of any of paragraphs 56-61, wherein the CtIP polypeptide or a variant or fragment thereof is encoded by a sequence comprising SEQ ID NO:4 or SEQ ID NO:38.
63. The nucleic acid of any of paragraphs 56-62, wherein the fusion protein comprises an amino acid sequence of SEQ ID NO:17.
64. The nucleic acid of any of paragraphs 56-62, wherein the fusion protein comprises an amino acid sequence at least about 90% identical to SEQ ID NO:17.
65. The nucleic acid of any of paragraphs 56-64, wherein the fusion protein is encoded by a nucleotide sequence comprising SEQ ID NO:18 or SEQ ID NO:41.
66. The nucleic acid of any of paragraphs 56-64, wherein the fusion protein is encoded by a nucleotide sequence at least about 60% identical to SEQ ID NO:18 or SEQ ID NO:41.
67. A vector encoding the nucleic acid of any of paragraphs 56-66.
68. The vector of paragraph 67, wherein the vector is a viral vector.
69. The vector of paragraph 67 or 68, wherein the vector encodes a sgRNA.
70. The vector of any of paragraphs 67-69, wherein the vector encodes a nucleic acid template encoding a sequence of interest.
71. A pharmaceutical composition comprising the nucleic acid of any of paragraphs 56-66, or the vector of any of paragraphs 67-70.
72. The pharmaceutical composition of paragraph 71, wherein the composition further comprises a sgRNA.
73. The pharmaceutical composition of paragraph 70 or 71, wherein the composition further comprises a template nucleic acid encoding a sequence of interest.
74. The pharmaceutical composition of any of paragraphs 71-73, wherein the composition comprises lipid nanoparticles.
75. A method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of the pharmaceutical composition of any of paragraphs 71-74.
76. A method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of the nucleic acid of any of paragraphs 56-66 or the vector of paragraphs 67-70, and optionally one or more additional agents to modify the genomic sequence in the cell.
77. A method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a pharmaceutical composition of any of paragraphs 71-74.
78. A method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of the nucleic acid of any of paragraphs 56-66 or the vector of paragraphs 67-70, and optionally one or more additional agents to modify a genomic sequence in a cell of the subject.
79. A nucleic acid comprising a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
80. The nucleic acid of paragraph 79, wherein the Cas9 polypeptide or a variant or fragment thereof comprises an amino acid sequence of SEQ ID NO: 13.
81. The nucleic acid of any of paragraphs 79 or 80, wherein the Cas9 polypeptide or a variant or fragment thereof is encoded by a sequence comprising SEQ ID NO: 14 or SEQ ID NO:37.
82. The nucleic acid of any of paragraphs 79-81, wherein the Rad52 polypeptide or a variant or fragment thereof comprises an amino acid sequence of SEQ ID NO:15.
83. The nucleic acid of any of paragraphs 79-82, wherein the Rad52 polypeptide or a variant or fragment thereof is encoded by a sequence comprising SEQ ID NO: 16 or SEQ ID NO:39.
84. The nucleic acid of any of paragraphs 79-83, wherein the CtIP polypeptide or a variant or fragment thereof comprises an amino acid sequence of SEQ ID NO:3.
85. The nucleic acid of any of paragraphs 79-84, wherein the CtIP polypeptide or a variant or fragment thereof is encoded by a sequence comprising SEQ ID NO:4 or SEQ ID NO:38.
86. The nucleic acid of any of paragraphs 79-85, wherein the fusion protein comprises an amino acid sequence of SEQ ID NO: 19.
87. The nucleic acid of any of paragraphs 79-85, wherein the fusion protein comprises an amino acid sequence at least about 90% identical to SEQ ID NO: 19.
88. The nucleic acid of any of paragraphs 79-87, wherein the fusion protein is encoded by a nucleotide sequence comprising SEQ ID NO:20 or SEQ ID NO:42.
89. The nucleic acid of any of paragraphs 79-87, wherein the fusion protein is encoded by a nucleotide sequence at least about 60% identical to SEQ ID NO:20 or SEQ ID NO:42.
90. A vector encoding the nucleic acid of any of paragraphs 79-89.
91. The vector of paragraph 90, wherein the vector is a viral vector.
92. The vector of paragraph 90 or 91, wherein the vector encodes a sgRNA.
93. The vector of any of paragraphs 90-92, wherein the vector encodes a nucleic acid template encoding a sequence of interest.
94. A pharmaceutical composition comprising the nucleic acid of any of paragraphs 79-89, or the vector of any of paragraphs 90-93.
95. The pharmaceutical composition of paragraph 94, wherein the composition further comprises a sgRNA.
96. The pharmaceutical composition of paragraph 94 or 95, wherein the composition further comprises a template nucleic acid encoding a sequence of interest.
97. The pharmaceutical composition of any of paragraphs 94-96, wherein the composition comprises lipid nanoparticles.
98. A method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of the pharmaceutical composition of any of paragraphs 94-97.
99. A method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of the nucleic acid of any of paragraphs 79-89 or the vector of paragraphs 90-93, and optionally one or more additional agents to modify the genomic sequence in the cell.
100. A method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a pharmaceutical composition of any of paragraphs 94-97.
101. A method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of the nucleic acid of any of paragraphs 79-89 or the vector of paragraphs 90-93, and optionally one or more additional agents to modify a genomic sequence in a cell of the subject.
All publications herein are incorporated by reference to the same extent as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference.
EXAMPLES
Example 1. Combinatorial fusions of DNA repair proteins enhance precision and efficiency of Crispr/Cas9-mediated knockin
Cas9 targets genomic loci with high specificity. For knockin with double-strand break repair, however, Cas9 often leads to unintended on-target knockout rather than intended edits. This imprecision is a barrier for direct in vivo editing where clonal selection is not feasible. This example demonstrates a high-throughput workflow to comparatively assess on-target efficiency and precision of editing outcomes. Using this workflow, we screened combinations of donor DNA and Cas9 variants, as well as fusions to DNA repair proteins. This yielded novel high-performance double-strand break repair editing agents and combinatorial optimizations that gave orders-of-magnitude increases in knockin precision and increased knockin performance in vitro and in vivo in the developing mouse brain. Continued comparative assessment of editing efficiency and precision with this framework will further the development of high-performance editing agents for in vivo knockin and future genome therapeutics.
Results
Quantifying editing efficiency and precision through BFP-to-GFP conversion
To quantify and compare S. pyogenes Cas9 and donor variant combinations on knockin performance, we used a BFP-to-GFP conversion assay previously developed using an engineered HEK293 cell line (HEK:BFP) genomically expressing BFP (Richardson et al., Nat. Genet., (2018), 50:1132-1139). Here, we developed donor templates in a variety of formats to all target the BFP sequence and convert it to GFP via introduction of the point mutation H26Y. Fluorescence was used as a surrogate for editing outcomes following transfection of these cells with Cas9 or Cas9 variants, gRNA targeting the BFP locus, and H26Y donor DNA formats. Precise H26Y knockin would cause fluorescence to shift from BFP to GFP, while on-target indels by error-prone repair would result in unintended knockout and loss of fluorescence (Fig. 1A). By quantifying the proportions of BFP+, GFP+, and dark (BFP−/GFP−) cells, we can ratiometrically assess knockin efficiency (% of GFP+ to BFP+ cells) and knockin precision (ratio of % GFP+ to % dark cells) across editing agents.
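The ratiometric readouts described above can be sketched as a short calculation. This is an illustrative sketch only: the function name and the example cell counts are hypothetical, and it assumes that efficiency is taken as the percentage of GFP+ cells among all counted cells (one reading of the definition above), while precision is the ratio of % GFP+ to % dark cells as stated.

```python
def knockin_metrics(bfp_pos, gfp_pos, dark):
    """Compute ratiometric knockin readouts from cell counts.

    bfp_pos: unedited (BFP+) cells
    gfp_pos: precise knockin (GFP+) cells
    dark:    knockout (BFP-/GFP-) cells
    """
    total = bfp_pos + gfp_pos + dark
    if total == 0:
        raise ValueError("no cells counted")
    pct_gfp = 100.0 * gfp_pos / total
    pct_dark = 100.0 * dark / total
    # Assumption: efficiency is reported as % GFP+ of all counted cells.
    efficiency = pct_gfp
    # Precision: ratio of % GFP+ to % dark cells (KI/KO ratio).
    precision = pct_gfp / pct_dark if dark > 0 else float("inf")
    return {
        "efficiency_pct": efficiency,
        "knockout_pct": pct_dark,
        "precision": precision,
    }


# Hypothetical counts, for illustration only.
example = knockin_metrics(bfp_pos=700, gfp_pos=100, dark=200)
print(example)
```

Because precision is a ratio of two percentages taken from the same sample, it does not depend on the total number of cells counted.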
To validate our workflow and confirm that fluorescence readouts correspond to predicted genotypes in HEK:BFP cells, we used FACS to sort and collect the three phenotypic cell populations (BFP+, GFP+, and dark) that emerge following treatment with BFP/H26Y editing agents. Genomic DNA was extracted from sorted cell populations and used as template in PCR with primers flanking the region targeted by the BFP gRNA within the BFP locus, thereby amplifying all alleles regardless of on-target editing outcome. Pooled PCR amplicons from each sorted cell population were analyzed by Sanger sequencing (Fig. 1B).
As expected, the genotype of the BFP+ population matched that of the WT BFP sequence. The dark population exhibited a complex mixture of sequence results in the vicinity of the BFP gRNA cleavage site, representing unintended on-target edits due to imprecise repair. Sequencing decomposition using the ICE algorithm on amplicons from the dark sorted population revealed a predominance of deleterious indels (79% frameshift vs. 11% in-frame), in line with a loss of fluorescence due to knockout (Brinkman et al., Nucleic Acids Res., (2014), 42:e168). Finally, the GFP+ population exhibited a single genotype containing the desired H26Y point mutation knockin from the donor template (Fig. 1C-D). The sequencing results matched the fluorescence surrogates, thus validating this platform for high-throughput ratiometric screening of knockin agents.
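The frameshift versus in-frame distinction above follows from the net length of each indel: a net change that is not a multiple of three shifts the reading frame. The sketch below illustrates that rule only. The helper names and the indel distribution are hypothetical and are not output from ICE or any other decomposition tool.

```python
def classify_indel(size_bp):
    """Classify a net indel size (positive = insertion, negative = deletion)."""
    if size_bp == 0:
        return "unedited"
    # A net length change divisible by 3 preserves the reading frame.
    return "in-frame" if size_bp % 3 == 0 else "frameshift"


def summarize_indels(contributions):
    """Sum percent contributions per class.

    contributions: dict mapping net indel size (bp) -> percent of alleles.
    """
    summary = {"frameshift": 0.0, "in-frame": 0.0, "unedited": 0.0}
    for size_bp, pct in contributions.items():
        summary[classify_indel(size_bp)] += pct
    return summary


# Hypothetical allele distribution, for illustration only.
print(summarize_indels({-1: 30.0, 2: 20.0, -3: 10.0, 0: 40.0}))
```

In-frame indels can still abolish fluorescence if they remove or alter essential residues, so this classification is only a first approximation of functional outcome.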
Combinatorial screening of editing agents for enhanced performance
We sought to identify elements of DSB repair knockin agents that in combination offer improved efficiency and precision of editing. We investigated a matrix of combinations of three factors, which have individually been shown to enhance knockin performance: i) wildtype S. pyogenes Cas9 (‘WT’ Cas9) versus a High-Fidelity (‘HF’) Cas9 sequence variant (Cas9-HF) (Kleinstiver et al., Nature, (2016), 529:490-495; Idoko-Akoh et al., Sci. Rep., (2018), 8:15126); ii) Cas9 variants alone versus fusions with the "HDR Enhancing" (HE) N-terminal fragment (1-296) of the DNA repair protein CtIP (Huertas et al., J. Biol. Chem., (2009), 284:9558-9565; Charpentier et al., Nat. Commun, (2018), 9:1133); iii) circular DNA donor (‘HR’) predicted to favor homologous recombination vs. in situ linearized DNA donor (‘HMEJ’) predicted to favor homology-mediated end joining (Fig. 2A) (Shibata et al., EMBO J., (2011), 30:1079-1092; Yao, X. et al., Cell Res., (2017), 27:801-814). Both knockin donors were provided on plasmids with the knockin sequence flanked by ~800 bp homology arms, the length of which did not significantly affect results within a range of 500 to 1500 bp (data not shown). The HMEJ donor differed from the HR donor by the insertion of gRNA Binding Sites (GRBS) flanking the homology arms, which are cleaved by Cas9 to create linear dsDNA donors in cells. The orientation of the GRBSs did not have significant effects on editing performance.
Transfection of HEK:BFP cells with Cas9 variant, donor template, and the BFP gRNA led to the emergence of both knockin (GFP+) and knockout (dark) populations for all combinations of donors and Cas9 variants, while transfection with off-target control gRNA resulted in no GFP+ cells (Fig. 2B). Among Cas9 variants tested, Cas9WT-CtIP performed with a roughly 2-fold improvement in knockin efficiency compared to Cas9WT, regardless of dsDNA donor variant, consistent with previous reports (Charpentier et al., Nat. Commun, (2018), 9:1133; Nakade et al., Nat. Commun., (2018), 9:3270; Tran, N.-T. et al. Front. Genet., (2019), 10:365). Interestingly, CtIP fusion did not significantly improve efficiency and precision for single strand oligodeoxynucleotide DNA (ssODN) donors. Cas9HF variants were generally not significantly different from Cas9WT, although Cas9HF-CtIP showed a 1.6-fold improvement in knockin efficiency specifically with the HR donor (Fig. 2D). Knockout rates did not significantly differ among the Cas9 variants (Fig. 2C), and thus, the KI/KO ratios (knockin precision) mirror the differences seen in knockin efficiency (Fig. 2D).
In contrast to the Cas9 variants, donor architecture impacted knockin efficiency as well as knockout rates. Across all four Cas9 variants, the HMEJ donor showed a 9- to 13-fold increase in knockin efficiency compared to the corresponding HR combination. Of the combinations of Cas9 and donors tested, Cas9WT-CtIP in conjunction with the HMEJ donor resulted in the highest rate of knockin, 24-fold higher than Cas9WT with the HR donor. Regardless of Cas9 variant, the HMEJ donor showed a 22-30% reduction in gene disruption, which, together with the improved knockin efficiency, enhanced the precision by about 15-fold relative to the HR donor template permutations. Interestingly, these data suggest the independent and additive contributions of both CtIP fusion (~2-fold) and the in situ cleaved HMEJ donors (~11-fold) to knockin efficiency.
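As a rough arithmetic check on the independence noted above, fold changes from independent factors would be expected to multiply. The sketch below applies that assumption (our reading, not a stated model from the study) to the approximate values reported here: ~2-fold for CtIP fusion, ~11-fold for the HMEJ donor, and 24-fold observed for the combination. The function name is hypothetical.

```python
import math


def combined_fold(*folds):
    """Expected combined fold change if each factor acts independently."""
    return math.prod(folds)


ctip_fold = 2.0   # approximate CtIP fusion effect reported above
hmej_fold = 11.0  # approximate HMEJ donor effect reported above
observed = 24.0   # Cas9WT-CtIP + HMEJ vs. Cas9WT + HR, as reported above

expected = combined_fold(ctip_fold, hmej_fold)
print(f"expected ~{expected:.0f}-fold, observed {observed:.0f}-fold, "
      f"observed/expected = {observed / expected:.2f}")
```

An observed-to-expected ratio near 1 is consistent with the two factors contributing independently. Given the rounding of the input values, the comparison is only approximate.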
Taken together, these results demonstrate that combinatorial optimization of both Cas9 and donor DNA can significantly shift the balance away from error-prone repair, with the combination of Cas9WT-CtIP (henceforth “Cas9-CtIP”) and HMEJ donors giving the highest efficiency and precision in HEK cells.
Double fusions on Cas9 improve editing performance
Several groups have independently demonstrated that modulation of DNA repair pathways is an effective way to improve knockin efficiency and precision (Charpentier et al., Nat. Commun, (2018), 9:1133; Tran, N.-T. et al. Front. Genet., (2019), 10:365; Jayavaradhan et al., Nat. Commun., (2019), 10:2866). To build on the results of the 3-factor screening highlighting WT Cas9, CtIP fusion, and HMEJ donor as the best performing combination, we iterated the BFP-to-GFP screening platform with constant HMEJ donor and evaluated the impact of five candidate DNA repair protein domains (dn53BP1, TIP60, RNF169, Rad52, and eRad18) on editing efficacy and precision when fused N-terminally to Cas9 or Cas9-CtIP (Fig. 3A) (Jayavaradhan et al., Nat. Commun., (2019), 10:2866; Tang et al. Nat. Struct. Mol. Biol., (2013), 20:317-325; An et al. Proc. Natl. Acad. Sci. USA, (2018), 115:E8286-E8295; Tang et al. Nat. Struct. Mol. Biol., (2013), 20:317-325; Nambiar et al., Nat. Commun., (2019), 10:3395; Tran, N.-T. et al. Front. Genet., (2019), 10:365; Shao, S. et al., Int. J. Biochem. Cell Biol., (2017), 92:43-52; Paulsen, et al. Nat. Biomed. Eng., (2017), 1:878-888).
In the absence of CtIP, only eRad18 fusion to Cas9 significantly improved knockin efficacy, increasing it by 1.8-fold above Cas9 alone, similar to Cas9-CtIP. With the compound fusions, while addition of dn53BP1, TIP60, or RNF169 to Cas9-CtIP appeared to abrogate the effect of CtIP on knockin efficacy, fusion of Rad52 or eRad18 did not have a detrimental impact on efficiency (Fig. 3B-D).
Regarding unintended on-target knockout, dn53BP1-, TIP60-, and RNF169-fused Cas9 did not significantly differ from the Cas9-only control. Rad52 and eRad18 fusion, however, showed 18% and 28% reductions in knockout frequency, respectively (Fig. 3B-D). Interestingly, compound fusion of each of the five DNA repair proteins with CtIP led to significant reductions in the knockout rate, with eRad18, Rad52, and TIP60 showing the most pronounced decreases (45%, 38% and 38%, respectively). Despite these reductions in knockout rates, only Rad52 and eRad18 led to significant improvements in overall knockin precision. Without CtIP, eRad18 demonstrated a 2.5-fold increase in the knockin-to-knockout ratio, while combination of either Rad52 or eRad18 with CtIP led to a 3.1-fold increase in precision relative to Cas9 alone (Fig. 3B-D). These results show that specific combinations of DNA repair domains can function together, fused to flank Cas9, to improve both the efficiency and precision of editing.
We elected to move forward with the smallest of these compound fusions, Cas9WT flanked by eRad18 and CtIP, which we call “Cas9-RC”. In addition to C-terminal fusion of truncated human CtIP, Cas9-RC harbors an N-terminal fusion of an “enhanced” variant of human Rad18 protein, eRad18, which contains the N-terminal 242 amino acid residues of a putative DNA-binding domain shown to enhance homology-dependent repair (HDR) when independently co-expressed with Cas9 (Nambiar et al., Nat. Commun., (2019), 10:3395). The total length of Cas9-RC is 2,172 residues.
Identifying protein domains that add their enhancements to editing performance even when fused to Cas9 enables the production of a single fusion protein for use in ribonucleoprotein (RNP) editing applications, without the need for expression or delivery of these as additional co-factors. To assess the feasibility of producing Cas9-RC for RNP applications, we fused N-terminal GST and MBP tandem affinity tags to Cas9-RC separated by an HRV 3C protease cleavage site to facilitate protein purification (Fig. 4A), subcloned into an IPTG-driven bacterial expression vector. Induction of transformed bacteria with IPTG yielded new bands in bacterial lysates, including at high molecular weights (Fig. 4B). Induced bands displayed Cas9 immunoreactivity at molecular weights ranging from over 300 kDa to 50 kDa (Fig. 4C), indicating that exogenous Cas9-RC expression in bacteria results in various translated or cleavage products. Importantly, IPTG-specific Cas9 immunoreactive bands corresponding to the predicted electrophoretic mobilities of full-length GST-MBP-Cas9-RC (316 kDa) and Cas9-RC (250 kDa) were abundant in lysates, indicating that Cas9-RC can be bacterially expressed for RNP applications.
Double fusion Cas9-RC increases knockin in vivo
The combinatorial screening and iterative optimizations of Cas9 agents yielded Cas9-RC that showed increases in knockin performance in cultured cell lines. Aiming to develop precision knockin agents for direct editing in vivo, we sought to assess the knockin efficacy of Cas9-RC with an HMEJ donor in a mouse model.
To test the knockin efficiency of Cas9-RC in vivo, we used in utero plasmid electroporation in the embryonic mouse brain (Fig. 5A) targeting integration of a 2A mCherry cassette at the 3’ end of the endogenous β-Actin (ActB) locus (Fig. 5B) (Saito et al., Dev. Biol., (2001), 240:237-246; Mikuni et al., Cell, (2016), 165:1803-1817). A combination of four plasmids containing Cas9 or Cas9-RC, HMEJ donor with the 2A mCherry knockin, ActB gRNA, and a GFP transfection marker were electroporated into embryonic day 14.5 (E14.5) wild-type mice targeting progenitors of projection neurons of sensorimotor cortex. When knockin efficiency was assessed in neurons at postnatal day 7 (P7), electroporation of Cas9-RC led to an increase in mCherry+ knockin neurons compared to Cas9 (Fig. 5C).
In vivo knockin efficiency was calculated by comparing the number of mCherry+ knockin neurons to the number of GFP+ electroporated neurons (Fig. 5B). Electroporation of Cas9 resulted in an in vivo knockin efficiency of 2.4% (± 0.44%), while Cas9-RC yielded a 3.7-fold increase in performance averaging 9% (± 1.7%) in vivo knockin efficiency (Fig. 5C-D). These results demonstrate that Cas9-RC outperforms existing DSB repair-based editing agents with over three-fold increases in knockin performance both in vitro and in vivo.
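The in vivo calculation described above can be sketched as follows. The function names and the cell counts are hypothetical. Only the two reported mean efficiencies (2.4% and 9%) are taken from the text.

```python
def knockin_efficiency_pct(mcherry_pos, gfp_pos):
    """Percent of electroporated (GFP+) neurons that are knockin (mCherry+)."""
    if gfp_pos <= 0:
        raise ValueError("at least one GFP+ cell is required")
    return 100.0 * mcherry_pos / gfp_pos


def fold_change(test_pct, control_pct):
    """Fold change of a test efficiency over a control efficiency."""
    return test_pct / control_pct


# Hypothetical counts from a single field, for illustration only.
print(knockin_efficiency_pct(mcherry_pos=9, gfp_pos=100))

# Reported mean efficiencies: Cas9 2.4%, Cas9-RC 9%.
print(round(fold_change(9.0, 2.4), 2))
```

With the rounded means above, the ratio comes to about 3.75, in line with the reported 3.7-fold increase.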
Fluorescent protein knockin applications with Cas9-RC
We went on to use Cas9-RC and HMEJ donors in knockin applications of three different fluorescent proteins onto three different loci in three different cell types. We extracted primary fibroblasts from P0 mice and electroporated them ex vivo with the Cas9-RC, ActB 2A mCherry donor, and ActB gRNA constructs from Fig. 4 using cuvette electroporation (nucleofection). Cultured cells expressed mCherry fluorescence with two distinct intensities, characteristic of cells expressing mCherry from one or both ActB alleles (Fig. 6A). Biallelic knockin cells offer experimental advantages because of increased fluorescent signal, and because biallelic expression indicates that both alleles remain functional after editing. Due to the high incidence of indels with conventional HDR-based editing, cells that display monoallelic knockin are likely to have an indel on the other targeted allele, potentially causing unintended haploinsufficiency phenotypes. The high knockin-to-knockout ratio of Cas9-RC increases the chances of obtaining biallelic knockin cells with expression from both alleles.
One of the limitations of Cas9-RC efficiency in vivo is the large size of the expression construct, which drastically reduces electroporation efficiency. We repeated Cas9-RC in utero electroporations with the Cas9-RC ActB 2A mCherry knockin constructs in the brain using a triple electrode (tritrode) and highly supercoiled plasmid DNA, both of which increased the efficiency of electroporation (dal Maschio et al., Nat. Commun., (2012), 3:960; Szczurkowska, J. et al., Nat. Protoc., (2016), 11:399-412). We observed considerable increases in the number of knockin cells in vivo using the tritrode and supercoiled method, including both neurons and astrocytes (Fig. 6B). Our experience with Cas9-RC indicates that steps to increase electroporation of large plasmids are a practical way to significantly increase knockin efficiency.
We constructed HMEJ donor and gRNA constructs to fuse the fluorescent protein mScarlet onto the N-terminus of the GPI-linked membrane protein Neuronal growth regulator 1 (Negr1), a protein with variable expression in the mouse brain (Miyata et al., Neuroscience, (2003), 117:645-658). Knockin efficiency was significantly lower than actin knockin. Lower efficiencies may result from the fact that, unlike actin, not all knockin cells will express Negr1, or because many cells express it at levels below our detection threshold. When comparing Cas9-RC efficiency to Cas9, there was no significant difference in knockin efficiency (Fig. 6C). This may indicate that increases of Cas9-RC performance in vivo may be locus-specific. Alternatively, it may indicate that the lower electroporation efficiency of Cas9-RC in vivo compared to the smaller Cas9 may negate the positive effects of Cas9-RC in loci with low knockin yield. This result indicates that Cas9-RC use over Cas9 should be determined for each locus individually.
Finally, we used Cas9-RC with tritrode electroporation and supercoiled constructs to target Purkinje cells in the mouse embryonic cerebellum, as a case to examine a difficult-to-transduce cell type that we were not able to knock in with Cas9. We constructed HMEJ donor and gRNA constructs to knock in the fluorescent protein mGreenLantern downstream of the Pvalb locus, expressed by parvalbumin+ Purkinje cells. While efficiency was low, we consistently detected sparse knockin parvalbumin+ Purkinje cells (Fig. 6D). These results suggest that Cas9-RC may have advantages for knockin of cells with low knockin rates.
In this study we sought to develop high-performance knockin tools by exploring combinations of DNA donor templates, variants of Cas9, and fusion of DNA repair protein domains. We identified novel Cas9 fusions and donor combinations that resulted in knockin with significantly improved metrics of editing efficiency and precision, in vitro and in vivo.
Our work establishes a standardized pipeline to optimize tools for precision genome editing. The BFP-to-GFP screening platform in HEK cells provides a high-throughput quantitative readout of the efficiency of correctly edited cells, while also reporting on the frequency of incorrectly edited cells (Richardson et al., Nat. Genet., (2018), 50:1132-1139). By simultaneously evaluating knockin and knockout rates, we identified combinations that optimized both efficiency (overall knockin rate) and precision (knockin rate vs. knockout rate). Knockin of a fluorescence cassette into the highly expressed β-Actin locus similarly enabled direct quantification of efficiency for large inserts at an endogenous gene locus in vivo. Quantification at the variably expressed locus Negr1 did not show increases in efficiency of Cas9-RC over Cas9, possibly due to the increased size of Cas9-RC resulting in the targeting of fewer cells. Knockin on a highly expressed cell-type-specific locus (parvalbumin) allowed us to demonstrate the use of Cas9-RC on Purkinje cells, a highly differentiated cell type.
We propose editing efficiency and precision as generalizable performance metrics for comparing genome editing agents. The ratiometric nature of these values makes them versatile enough to be applied in a variety of biological systems regardless of whether the experimental outputs are sequences (e.g. Fig. 1C), surrogate cellular markers (e.g. Fig. 2), or even function. These metrics incorporate all elements of the editing agent, including formulation (e.g. plasmid, RNP), editing modality (e.g. DSB repair, base editing, Prime editing), and delivery system (e.g. lipids, viral, nanoparticles), all of which may affect both outcomes. Efficiency and precision can be used as readouts when assessing individual components for holistic performance optimization, as we have done here. They can additionally be useful as common metrics to compare performance across distinct editing modalities, e.g., DSB repair vs Prime editing. By simultaneously assessing efficiency and precision across a variety of knockin tools, optimal agents can be selected based on the experimental need. For example, with ex vivo editing, one might prioritize efficiency if there are facile methods for post hoc selection of properly edited cells, whereas precision may be prioritized for contexts where edited cells cannot be selected, such as with in vivo editing.
Using this dual metric performance assessment, our study developed the high-performance DSB repair editor Cas9-RC. When paired with HMEJ donor templates, Cas9-RC outperformed other knockin agents by over 30-fold in human cells and showed potential for 3-fold increases in the mouse brain, albeit not at all loci tested. Importantly, Cas9-RC enables high performance for large genomic edits, such as fluorescent protein knockin. This complements parallel developments in base editors and Prime editing, which offer high performance but are limited to smaller edits. Ultimately, a diverse toolkit of precision editors will be useful to broaden the scope of in vivo editing applications. As presented in our study, having standardized platforms for quantitative comparison of new tools and novel combinations will further support efforts towards precision in vivo editing for both basic research and the development of future human therapeutics.
Fusion of Cas9 to DNA repair protein domains can produce synergistic enhancements of knockin performance. Iterative high-throughput screening based on fluorescent protein conversion is an effective platform to assess knockin efficiency and precision for developing new editing agents. Cas9-RC is a new DSB repair genome editor demonstrating enhanced knockin performance in vitro and in vivo.
Methods
Plasmid design and construction
Mammalian expression plasmids and knockin donor template plasmids were constructed with a combination of standard cloning techniques. For gRNA expression constructs, oligos (Integrated DNA Technologies) were annealed and cloned into a custom hU6 backbone using Golden Gate Assembly (GGA) (Engler et al., PLoS One 3, (2008), e3647). Cas9 expression constructs were assembled by a modified mMoClo system (Duportet et al., Nucleic Acids Res., (2014), 42:13440-13451). Briefly, individual parts were cloned, BsaI adapters added, and internal sites removed via PCR using KAPA HiFi HotStart DNA Polymerase with 2X Master Mix (Roche), or synthesized (Integrated DNA Technologies). Parts were subsequently assembled into expression constructs using NEB Golden Gate Assembly Kit (BsaI-HFv2) according to manufacturer’s recommendations. Homology arms for donor templates were PCR amplified from CD1 mouse genomic DNA with adapters for GGA. gRNA binding site (GRBS) parts were generated by oligo annealing. GRBS and homology arms were assembled with knockin sequences via GGA.
Bacterial protein expression
The IPTG-driven bacterial expression plasmid pGEX_MBP_eRad18-spCas9(wt)-cTip was transformed into Rosetta (DE3) competent cells (Sigma-Aldrich) and cultured in LB supplemented with ampicillin (50 µg/mL) and chloramphenicol (30 µg/mL) until OD600 reached 0.6. Protein expression was induced by adding IPTG to a final concentration of 1 mM and culturing at 37°C for 3 hours. Samples were collected before and after induction, pelleted by centrifugation, and boiled in 1x Laemmli buffer with DTT (50 mM) for 10 minutes.
Cell-line culture and transfection
HEK:BFP cells were developed in the Corn Lab and were the kind gift of Chris Richardson (Richardson et al., Nat. Genet., (2018), 50:1132-1139). Cells were maintained at 37°C and 5% CO2 in DMEM media plus GlutaMax (ThermoFisher Scientific) supplemented with 10% (v/v) fetal bovine serum (FBS). Typically, 20,000-22,500 cells/cm2 were seeded onto 24-well plates the day before transfection. Cells were transiently transfected at 70-80% confluence using Polyethylenimine, Linear, MW 25000 (‘PEI’, Polysciences) resuspended to 1 mg/mL in H2O at a 3:1 (v/w) PEI:DNA ratio with 250 ng DNA per plasmid (750 ng total DNA) diluted in Opti-MEM (ThermoFisher Scientific) and added dropwise to cells.
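The seeding and transfection quantities above can be worked through with a short calculation. The sketch below assumes that the 3:1 (v/w) ratio means 3 µL of the 1 mg/mL PEI solution per 1 µg of DNA, and that one well of a 24-well plate has a growth area of roughly 1.9 cm2. Both are assumptions (the well area in particular varies by plate manufacturer), and the function names are hypothetical.

```python
def pei_volume_ul(dna_ng: float, ratio_ul_per_ug: float = 3.0) -> float:
    """Volume of PEI solution (uL) for a DNA mass given in ng.

    Assumes the v/w ratio is expressed as uL of PEI solution per ug of DNA.
    """
    return ratio_ul_per_ug * dna_ng / 1000.0


def cells_per_well(density_per_cm2: float, well_area_cm2: float = 1.9) -> float:
    """Number of cells to seed per well for a target areal density.

    The default well area is an assumed typical value for a 24-well plate.
    """
    return density_per_cm2 * well_area_cm2


if __name__ == "__main__":
    total_dna_ng = 3 * 250  # three plasmids at 250 ng each, as stated above
    print(f"PEI solution: {pei_volume_ul(total_dna_ng):.2f} uL per well")
    low, high = cells_per_well(20_000), cells_per_well(22_500)
    print(f"Seed about {low:.0f} to {high:.0f} cells per well")
```

Under these assumptions, 750 ng of total DNA corresponds to 2.25 µL of PEI solution per well.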
SDS-PAGE and immunoblot
Samples were loaded in 4-15% Mini-PROTEAN TGX precast protein gels (Bio-Rad Laboratories). Electrophoresis was performed in a Mini-PROTEAN Tetra Vertical Electrophoresis Cell (Bio-Rad Laboratories). Proteins were transferred to Immuno-Blot PVDF Membranes (Bio-Rad Laboratories) using a Trans-Blot Turbo system (Bio-Rad Laboratories) with manufacturer-defined settings. The blot was blocked in 5% skim milk, then incubated with a primary antibody solution of TBST (0.1% Tween-20 in tris-buffered saline), 5% skim milk, and 1:1000 Cas9 (S. pyogenes) (D8Y4K) Rabbit mAb #65832 (Cell Signaling), followed by a secondary antibody solution containing 1:10000 IRDye 680RD Donkey anti-Rabbit IgG (LI-COR Biosciences) in phosphate buffered saline (PBS). Blots were imaged using a LI-COR Odyssey CLx imager.
Flow cytometry
Cells were trypsinized, pelleted, and resuspended in Dulbecco’s PBS containing 0.1% FBS. At least 20,000 live cells (typically 80,000+) were analyzed using an LSRII cell analyzer with HTS (BD Biosciences). BFP and mTagBFP were measured with a 407 nm laser and a 450/50 emission filter. GFP and mNeonGreen were measured with a 488 nm laser, a 505 LP mirror, and a 530/30 emission filter. mCherry was measured with a 561 nm laser, a 600 LP mirror, and a 615/25 emission filter. Data were analyzed with FlowJo v10.6.2 (FlowJo LLC). Live cells were gated by size and granularity using FSC-A vs SSC-A. Singlets were gated using SSC-A vs SSC-H (see FIG. 8). At least 3 biological replicates were run with internal technical duplicates or triplicates.
Sequence analysis of knockin products
Transiently transfected HEK:BFP cells were sorted using a FACS Aria II sorter (BD Biosciences). Genomic DNA was extracted from sorted cells using the Genomic DNA Clean & Concentrator kit (Zymo Research). PCR fragments were amplified using KAPA HiFi HotStart DNA Polymerase with 2X Master Mix (Roche), gel extracted with Zymoclean Gel DNA Recovery kit (Zymo Research), and submitted for Sanger sequencing (Genewiz). Alignment of sequencing results was performed using Benchling (https://benchling.com). Analysis of editing outcomes by decomposition of Sanger sequencing data was performed using the ICE Analysis tool v2 (Synthego) as previously described (Hsiau et al., BioRxiv, (2018), doi: 10.1101/251082).
Animals
All animal experimental protocols were approved by the University of Maryland Baltimore Institutional Animal Care and Use Committee and complied with all relevant ethical regulations regarding animal research. Experiments were performed on outbred strain CD1 mouse pups (Charles River Laboratories). Analyses are thought to include animals of both sexes at approximately equal proportions, as no sex determination was attempted. No statistical methods were used to predetermine sample size.
Primary cell culture and cuvette electroporation
CD1 mouse pups were euthanized via decapitation at postnatal day 0. The skin was sterilized and removed from the pup’s back using sterile surgical tools. Skins were placed dermis-side down on cold 0.25% Trypsin with EDTA (Invitrogen) and incubated at 4°C overnight. The epidermis was separated from the dermis in a sterile hood. Dermis was minced with a razor blade and triturated in warm 10% FBS 1x GlutaMAX DMEM using a glass pipette 10-20 times to separate individual cells. The suspension was then transferred to a 50 mL conical tube and centrifuged at 150 x g. The cell pellet was resuspended in 10% FBS 1x GlutaMAX DMEM and filtered through a 100 µm cell strainer (BD Biosciences). Cells were counted using a hemocytometer and cell viability was estimated using Trypan Blue (Sigma). Approximately 4-5 x 10^6 cells were used for each electroporation. Cells were centrifuged at 150 x g and resuspended in 100 µL AMAXA Nucleofection solution (Lonza) at the proper concentration and combined with 1-3 µg of desired DNA mixture in a cuvette. The cuvette was electroporated with the AMAXA biosystems Nucleofector II (Lonza) using the manufacturer’s settings for Mouse Embryonic Fibroblasts. After electroporation, the solution was immediately transferred to 12-well glass bottom plates (#1.5H; Cellvis) that had been pretreated the night before with poly-L-lysine (Sigma Aldrich) diluted 1:12 in sterile PBS and that contained prewarmed sterile-filtered Dulbecco’s Modified Eagle Medium (DMEM; ThermoFisher) supplemented with 10% Fetal Bovine Serum (FBS; Gibco) and 1x GlutaMAX (Gibco). Cells were plated at the desired density and incubated at 37°C/5% CO2. Half the volume of medium was exchanged for fresh medium the next day.
In utero electroporation
Electroporations of plasmid DNA were performed in utero on embryonic day 14.5 (E14.5) to target cortical layer II/III, as previously described (Saito et al., Dev. Biol., (2001), 240:237-246; Poulopoulos et al., Nature, (2019), 565:356-360). To increase transfection efficiency in the cortex and to target the cerebellar Purkinje cell progenitor zone at E11.5, the triple electrode in utero electroporation approach was utilized (dal Maschio et al., Nat. Commun., (2012), 3:960; Szczurkowska, J. et al., Nat. Protoc., (2016), 11:399-412). Briefly, DNA solutions were prepared to 4 µg/µL total DNA, with 1 µg/µL of each of the relevant plasmids (donor, guide, Cas9, and fluorescent protein). Dams were deeply anesthetized with isoflurane under a vaporizer with thermal support (Patterson Scientific Link7 & Heat Therapy Pump HTP-1500). The abdominal area was prepared for surgery with hair removal, surgical scrub, 70% ethanol, and 10% Betadine solution. A midline incision was made to expose the uterine horns. Using pulled (Narishige PC-100) and beveled (Narishige EG-45) glass micropipettes connected to a pneumatic aspirator, DNA solution was injected into one lateral brain ventricle for cerebral cortex electroporation at E14.5, or into the 4th ventricle for cerebellar Purkinje cell electroporation at E11.5. Four 50 ms square pulses of 35 V (NEPA21 electro-kinetic platinum tweezertrodes connected to a BTX ECM-830 electroporator) were applied to target the nascent sensorimotor areas of the cortical plate. When using the triple electrode, six 50 ms square pulses were applied, at 35 V at E14.5 and at 25 V at E11.5. Typically, 4-6 pups were electroporated per dam. Uterine horns were placed back inside the abdominal cavity, and monofilament nylon sutures (AngioTech) were used to close muscle and skin incisions.
After term birth, electroporated mouse pups were non-invasively screened for unilateral cortical or cerebellar fluorescence using a fluorescence stereoscope (Leica MZ10 F with X-Cite FIRE LED light source) and returned to their dam until postnatal day 7 (P7) or P14. When possible, to minimize inter-dam variation, control and experimental electroporations were performed in littermate pups from the same dam.
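The plasmid mixture described for in utero electroporation (each plasmid at 1 µg/µL, 4 µg/µL total across four plasmids) can be planned from stock concentrations with a simple dilution calculation. The sketch below is illustrative only: the stock concentrations, the final volume, and the function name are hypothetical and are not given in this disclosure.

```python
from typing import Dict


def mix_volumes(stocks_ug_per_ul: Dict[str, float],
                final_volume_ul: float,
                target_ug_per_ul: float = 1.0) -> Dict[str, float]:
    """Volume of each plasmid stock (uL) needed so that every plasmid ends up
    at the same target concentration in the final mixture.

    Raises ValueError if the stocks are too dilute to fit in the final volume.
    """
    volumes = {
        name: target_ug_per_ul * final_volume_ul / stock
        for name, stock in stocks_ug_per_ul.items()
    }
    used = sum(volumes.values())
    if used > final_volume_ul + 1e-9:
        raise ValueError("stocks too dilute for the requested final volume")
    volumes["diluent"] = final_volume_ul - used
    return volumes


if __name__ == "__main__":
    # Hypothetical stock concentrations in ug/uL.
    stocks = {"donor": 5.0, "guide": 5.0, "cas9": 5.0, "fluorescent_protein": 5.0}
    for name, vol in mix_volumes(stocks, final_volume_ul=10.0).items():
        print(f"{name}: {vol:.2f} uL")
```

The check on the summed volume matters in practice: reaching 4 µg/µL total requires stocks whose combined volume does not exceed the final volume, so the stocks must be concentrated well above 1 µg/µL each.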
Histology and immunolabeling
Tissue was prepared by intracardial perfusion with PBS and 4% paraformaldehyde. Brains were cut to 80 µm coronal sections on a vibrating microtome (Leica VT1000). Sections were immunolabeled in blocking solution consisting of 5% bovine serum albumin and 0.2% Triton X-100 in PBS for 30 minutes, then incubated overnight at 4°C with primary antibodies diluted in blocking solution. Sections were washed in PBS, then incubated for 3-4 h at room temperature with secondary antibodies diluted 1:400-1:1000 in blocking solution. Following PBS washes, sections were mounted on slides with Fluoromount-G Mounting Medium with DAPI (ThermoFisher Scientific).
Microscopy and image analysis
Fluorescence images were acquired using a Nikon Ti2-E inverted microscope fitted with an automated registered linear motor stage (HLD117, Prior Scientific), a Spectra-X 7 channel LED light engine (Lumencor), and standard filter sets for DAPI, FITC, TRITC, and Cy5. Images were stitched and analyzed with NIS-Elements (Nikon) using an automated script to identify and count electroporated cells in brain sections. Knockin-positive neurons were counted manually using ImageJ (NIH) and independently by at least two blinded investigators. Five 80 µm sections, centered at the middle of the anteroposterior axis of the electroporation field and taken every other section, were analyzed per brain, with counts aggregated across sections from the same brain.
Statistical analysis
All statistical values are presented as mean ± SEM. For experiments containing more than two conditions or groups, statistical significance was calculated using a one-way ANOVA with Tukey’s multiple comparison test, with a single pooled variance. For experiments containing two conditions or groups, statistical significance was calculated using a two-tailed Student’s t-test. Effect size was calculated using Cohen's d and expressed in pooled standard deviations (Ho et al., Nat. Methods, (2019), 16:565-566). Differences between conditions were judged to be significant at P < 0.05 (*), P < 0.01 (**), and P < 0.001 (***).
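The two-group summary statistics named above follow standard textbook formulas, sketched here with the Python standard library only. The function names and example values are hypothetical, and this is not the analysis code used in the study. Obtaining a P value from the t statistic requires the t distribution, which the standard library does not provide; a statistics package would be needed for that step.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def sem(values: Sequence[float]) -> float:
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / math.sqrt(len(values))


def pooled_variance(a: Sequence[float], b: Sequence[float]) -> float:
    """Pooled sample variance of two groups."""
    na, nb = len(a), len(b)
    return ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)


def cohens_d(a: Sequence[float], b: Sequence[float]) -> float:
    """Effect size in units of the pooled standard deviation."""
    return (mean(a) - mean(b)) / math.sqrt(pooled_variance(a, b))


def student_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    se = math.sqrt(pooled_variance(a, b) * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, na + nb - 2


if __name__ == "__main__":
    # Made-up values, for illustration only.
    treated, control = [2.0, 4.0, 6.0], [1.0, 2.0, 3.0]
    t, df = student_t(treated, control)
    print(f"treated: {mean(treated):.2f} +/- {sem(treated):.2f} (mean +/- SEM)")
    print(f"Cohen's d = {cohens_d(treated, control):.3f}, t = {t:.3f}, df = {df}")
```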
Example 2. Identifying protein domains that add their enhancements to editing performance even when fused to Cas9 enables the production of a single fusion protein for use in ribonucleoprotein (RNP) editing applications, without the need for expression or delivery of these as additional co-factors.
To assess the feasibility of producing Cas9-RC for RNP applications, we fused N-terminal GST and MBP tandem affinity tags to Cas9-RC separated by an HRV 3C protease cleavage site to facilitate protein purification (Fig. 10A), subcloned into an IPTG-driven bacterial expression vector. Induction of transformed bacteria with IPTG yielded new bands in bacterial lysates, including at high molecular weights (Fig. 10B). Induced bands displayed Cas9 immunoreactivity at molecular weights ranging from over 300 kDa to 50 kDa (Fig. 10C), indicating that exogenous Cas9-RC expression in bacteria results in various translated or cleavage products. Importantly, IPTG-specific Cas9 immunoreactive bands corresponding to the predicted electrophoretic mobilities of full-length GST-MBP-Cas9-RC (316 kDa) and Cas9-RC (250 kDa) were abundant in lysates, indicating that Cas9-RC can be bacterially expressed for RNP applications.
Methods
Cell Growth and Protein Expression
Day 0: Set up an overnight culture of the desired bacteria
~10-50 mL
Day 1: Inoculate 400 mL LB + antibiotic using the overnight culture
Measure OD600 and monitor
When OD600 reaches 0.6-0.8
Collect sample for western blot
Move to ice to cool media
When cooled, induce expression with IPTG (final concentration 0.5 mM)
Move to cold room on stirring hot plate with temperature probe
Allow protein expression for ~18 hours at 16°C
Day 2: Collect sample for western blot, then pellet cells by centrifugation
Can either advance to purification or freeze pellets at -20°C or -80°C
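The induction step above can be checked with the usual C1·V1 = C2·V2 dilution relation. The sketch below assumes a 1 M IPTG stock, which is a common laboratory choice but is not specified in this protocol; the function name is hypothetical, and the small volume added by the stock itself is neglected.

```python
def stock_volume_ml(final_conc_mm: float,
                    culture_volume_ml: float,
                    stock_conc_mm: float) -> float:
    """Volume of stock (mL) to add so the culture reaches the final
    concentration, neglecting the volume contributed by the stock."""
    if stock_conc_mm <= final_conc_mm:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc_mm * culture_volume_ml / stock_conc_mm


if __name__ == "__main__":
    # 400 mL culture and 0.5 mM final IPTG are from the protocol above;
    # the 1 M (1000 mM) stock concentration is an assumption.
    volume_ml = stock_volume_ml(0.5, 400.0, 1000.0)
    print(f"Add {volume_ml * 1000:.0f} uL of IPTG stock")
```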
Cell Lysis and Protein Purification
NOTE: Always keep these samples and buffers on ice!
Resuspend Cell pellets in 10 mL total lx Resuspension Buffer
Lyse cells by sonication in cold room
Collect western sample (205 uL, add 75 uL 4x Laemmli buffer and 20 uL 1.5 mM DTT)
Spin Lysate 10 min in centrifuge at 4°C
Pass the Supernatant through a 45 um filter to remove debris
Collect western samples of the Supernatant and Pellet
Bring total volume of Supernatant up to 50 mL using Binding Buffer
Load 500 uL - 1 mL of GST Sepharose suspension on a gravity chromatography column
Equilibrate the column with 20 mL Binding Buffer
Add the supernatant to the equilibrated column
Note: an IV drip can be used to control flow rate onto column - slower flow is better because of the relatively slow binding kinetics of GST
Collect and Keep the Flowthrough
Take sample of the flowthrough for western
Wash the column with 30-50 mL Binding buffer
Add 40uL PreScission Protease to 960 uL Binding buffer - this is the cleavage buffer
Add cap to column to prevent flow, add the cleavage buffer to this
Incubate in cold room with this cleavage buffer minimum 4 hours or overnight
Collect Eluate + 1 more mL binding buffer
Make western sample of this
Run SDS PAGE with the collected samples
Concentrate and Store Protein
Equilibrate a centrifugal filter (100 kDa cutoff) by loading binding buffer and centrifuging to pass it through
Load Eluate onto centrifugal filter
Wash eluate by passing full volume through column 2-3x
Exchange buffers by adding 2-3 volumes Cas9 Storage buffer
Concentrate to desired volume
Add (autoclaved) glycerol to final concentration of 20%
Aliquot and store protein
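The volume of glycerol needed to reach a 20% final concentration depends on the volume of concentrated protein and on the concentration of the glycerol stock. A minimal sketch, assuming volumes are additive and concentrations are v/v fractions (the function name and example volume are hypothetical):

```python
def glycerol_to_add(sample_volume: float,
                    final_fraction: float = 0.20,
                    stock_fraction: float = 1.0) -> float:
    """Volume of glycerol stock to add to a glycerol-free sample.

    Solves stock_fraction * x / (sample_volume + x) = final_fraction for x,
    assuming additive volumes. Units of the result match sample_volume.
    """
    if not 0 < final_fraction < stock_fraction <= 1:
        raise ValueError("need 0 < final_fraction < stock_fraction <= 1")
    return sample_volume * final_fraction / (stock_fraction - final_fraction)


if __name__ == "__main__":
    # Hypothetical 400 uL of concentrated protein.
    print(f"100% glycerol: add {glycerol_to_add(400.0):.0f} uL")
    print(f"50% glycerol:  add {glycerol_to_add(400.0, stock_fraction=0.5):.0f} uL")
```

Note that adding glycerol dilutes the protein by the same factor, so the concentration step before it may need to overshoot the desired final protein concentration.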
Regeneration and Storage of GST Sepharose 4B Beads
The beads are currently bound to the GST-MBP tag and PreScission protease
Elute the bound proteins by passing 5-10 bed volumes of Elution Buffer (50 mM Tris-HCl, 10 mM Glutathione)
Wash the bed with 2-3 bed volumes of alternating High pH Buffer and Low pH Buffer - repeat for 3 cycles
High pH Buffer - 0.1 M Tris-HCl, 0.5 M NaCl, pH 8.5
Low pH Buffer - 0.1 M Sodium Acetate, 0.5 M NaCl, pH 4.5
Optional: If binding capacity is decreasing, remove precipitated or denatured proteins by washing with 2 bed volumes 6 M guanidine hydrochloride followed immediately by 5 bed volumes 1x PBS
Wash 2x with 10 bed volumes PBS
Wash 2x with 10 bed volumes 20% ethanol
Suspend beads in ethanol, store at 4°C
Throughout this disclosure, various publications, patents and published patent specifications are referenced by an identifying citation. The disclosures of these publications, patents and published patent specifications are hereby incorporated by reference into the present disclosure to more fully describe the state of the art to which this invention pertains.
While the present teachings are described in conjunction with various embodiments, it is not intended that the present teachings be limited to such embodiments. On the contrary, the present teachings encompass various alternatives, modifications, and equivalents, as will be appreciated by those of skill in the art.

Claims

WHAT IS CLAIMED IS:
1. A method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify a genomic sequence in cells of the subject.
2. A method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify genomic sequence in the subject.
3. A method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a CRISPR/Cas9 system, wherein the system comprises a) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof; and b) a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
4. A method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a CRISPR/Cas9 system, wherein the system comprises a) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof; and b) a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
5. A method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a CRISPR/Cas9 system, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
6. A method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a CRISPR/Cas9 system, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
7. The method of any of claims 1-6, wherein the method comprises administering to the subject a nucleic acid template encoding a sequence of interest.
8. A method of treating a disease or condition in a subject by administering to the subject an effective amount of cells having a modified genomic sequence, wherein the cells have been modified by administering an effective amount of a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify a genomic sequence in the cells.
9. A method of treating a disease or condition in a subject by administering to the subject an effective amount of cells having a modified genomic sequence, wherein the cells have been modified by administering an effective amount of a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof, and one or more additional agents to modify genomic sequence in the cells.
10. A method of treating a disease or condition in a subject by administering to the subject an effective amount of cells having a modified genomic sequence, wherein the cells have been modified by administering an effective amount of a CRISPR/Cas9 system, wherein the system comprises a) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof; and b) a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
11. A method of treating a disease or condition in a subject by administering to the subject an effective amount of cells having a modified genomic sequence, wherein the cells have been modified by administering an effective amount of a CRISPR/Cas9 system, wherein the system comprises a) a nucleic acid or vector encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof; and b) a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
12. A method of treating a disease or condition in a subject by administering to the subject an effective amount of cells having a modified genomic sequence, wherein the cells have been modified by administering an effective amount of a CRISPR/Cas9 system, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
13. A method of treating a disease or condition in a subject by administering to the subject an effective amount of cells having a modified genomic sequence, wherein the cells have been modified by administering an effective amount of a CRISPR/Cas9 system, wherein the system comprises a vector comprising a) a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof and b) a sequence encoding a sgRNA comprising a targeting domain which is complementary with a target domain sequence of a gene or region of interest in a genome.
14. A nucleic acid comprising a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad18 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
15. The nucleic acid of claim 14, wherein the nucleic acid is DNA or RNA.
16. The nucleic acid of claim 14 or 15, wherein the Cas9 polypeptide or a variant or fragment thereof comprises an amino acid sequence of SEQ ID NO:13.
17. The nucleic acid of any of claims 14-16, wherein the Cas9 polypeptide or a variant or fragment thereof is encoded by a sequence comprising SEQ ID NO:14 or SEQ ID NO:37.
18. The nucleic acid of any of claims 14-17, wherein the Rad18 polypeptide or a variant or fragment thereof comprises an amino acid sequence of SEQ ID NO:5.
19. The nucleic acid of any of claims 14-18, wherein the Rad18 polypeptide or a variant or fragment thereof is encoded by a sequence comprising SEQ ID NO:6 or SEQ ID NO:40.
20. The nucleic acid of any of claims 14-19, wherein the CtIP polypeptide or a variant or fragment thereof comprises an amino acid sequence of SEQ ID NO:3.
21. The nucleic acid of any of claims 14-20, wherein the CtIP polypeptide or a variant or fragment thereof is encoded by a sequence comprising SEQ ID NO:4 or SEQ ID NO:38.
22. The nucleic acid of any of claims 14-21, wherein the fusion protein comprises an amino acid sequence of SEQ ID NO: 17.
23. The nucleic acid of any of claims 14-21, wherein the fusion protein comprises an amino acid sequence at least about 90% identical to SEQ ID NO: 17.
24. The nucleic acid of any of claims 14-23, wherein the fusion protein is encoded by a nucleotide sequence comprising SEQ ID NO: 18 or SEQ ID NO:41.
25. The nucleic acid of any of claims 14-24, wherein the fusion protein is encoded by a nucleotide sequence at least about 60% identical to SEQ ID NO: 18 or SEQ ID NO:41.
26. A vector encoding the nucleic acid of any of claims 14-25.
27. The vector of claim 26, wherein the vector is a viral vector.
28. The vector of claim 26 or 27, wherein the vector encodes a sgRNA.
29. The vector of any of claims 26-28, wherein the vector encodes a nucleic acid template encoding a sequence of interest.
30. A pharmaceutical composition comprising the nucleic acid of any of claims 14-25, or the vector of any of claims 26-29.
31. The pharmaceutical composition of claim 30, wherein the composition further comprises a sgRNA.
32. The pharmaceutical composition of claim 30 or 31, wherein the composition further comprises a template nucleic acid encoding a sequence of interest.
33. The pharmaceutical composition of any of claims 30-32, wherein the composition comprises lipid nanoparticles.
34. A nucleic acid comprising a sequence encoding a fusion protein comprising Cas9 or a variant or fragment thereof, Rad52 or a variant or fragment thereof, and CtIP or a variant or fragment thereof.
35. The nucleic acid of claim 34, wherein the Cas9 polypeptide or a variant or fragment thereof comprises an amino acid sequence of SEQ ID NO:13.
36. The nucleic acid of any of claims 34 or 35, wherein the Cas9 polypeptide or a variant or fragment thereof is encoded by a sequence comprising SEQ ID NO: 14 or SEQ ID NO:37.
37. The nucleic acid of any of claims 34-36, wherein the Rad52 polypeptide or a variant or fragment thereof comprises an amino acid sequence of SEQ ID NO: 15.
38. The nucleic acid of any of claims 34-37, wherein the Rad52 polypeptide or a variant or fragment thereof is encoded by a sequence comprising SEQ ID NO: 16 or SEQ ID NO:39.
39. The nucleic acid of any of claims 34-38, wherein the CtIP polypeptide or a variant or fragment thereof comprises an amino acid sequence of SEQ ID NO:3.
40. The nucleic acid of any of claims 34-39, wherein the CtIP polypeptide or a variant or fragment thereof is encoded by a sequence comprising SEQ ID NO:4 or SEQ ID NO:38.
41. The nucleic acid of any of claims 34-40, wherein the fusion protein comprises an amino acid sequence of SEQ ID NO: 19.
42. The nucleic acid of any of claims 34-40, wherein the fusion protein comprises an amino acid sequence at least about 90% identical to SEQ ID NO:19.
43. The nucleic acid of any of claims 34-42, wherein the fusion protein is encoded by a nucleotide sequence comprising SEQ ID NO:20 or SEQ ID NO:42.
44. The nucleic acid of any of claims 34-42, wherein the fusion protein is encoded by a nucleotide sequence at least about 60% identical to SEQ ID NO:20 or SEQ ID NO:42.
45. A vector encoding the nucleic acid of any of claims 34-44.
46. The vector of claim 45, wherein the vector is a viral vector.
47. The vector of claim 45 or 46, wherein the vector encodes a sgRNA.
48. The vector of any of claims 45-47, wherein the vector encodes a nucleic acid template encoding a sequence of interest.
49. A pharmaceutical composition comprising the nucleic acid of any of claims 34-44, or the vector of any of claims 45-48.
50. The pharmaceutical composition of claim 49, wherein the composition further comprises a sgRNA.
51. The pharmaceutical composition of claim 49 or 50 wherein the composition further comprises a template nucleic acid encoding a sequence of interest.
52. The pharmaceutical composition of any of claims 49-51, wherein the composition comprises lipid nanoparticles.
53. A method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of the pharmaceutical composition of any of claims 30-33 or 49-52.
54. A method of modifying a genomic sequence of a cell, comprising administering to the cell an effective amount of the nucleic acid of any of claims 14-25 or 34-44 or the vector of any of claims 26-29 or 45-48, and optionally one or more additional agents to modify the genomic sequence in the cell.
55. A method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of a pharmaceutical composition of any of claims 30-33 or 49-52.
56. A method of treating a disease or condition in a subject, comprising administering to the subject an effective amount of the nucleic acid of any of claims 14-25 or 24-34 or the vector of any of claims 26-29 or 35-38, and optionally one or more additional agents to modify a genomic sequence in a cell of the subject.
PCT/US2024/060720 2023-12-18 2024-12-18 Compositions and methods for enhanced genome editing using cas9 fusion proteins Pending WO2025137069A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202363611473P 2023-12-18 2023-12-18
US63/611,473 2023-12-18
US202463694112P 2024-09-12 2024-09-12
US63/694,112 2024-09-12

Publications (1)

Publication Number Publication Date
WO2025137069A1 true WO2025137069A1 (en) 2025-06-26

Family

ID=96137946

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/060720 Pending WO2025137069A1 (en) 2023-12-18 2024-12-18 Compositions and methods for enhanced genome editing using cas9 fusion proteins

Country Status (1)

Country Link
WO (1) WO2025137069A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170073695A1 (en) * 2014-12-31 2017-03-16 Synthetic Genomics, Inc. Compositions and methods for high efficiency in vivo genome editing
US20210403922A1 (en) * 2019-01-07 2021-12-30 Crisp-Hr Therapeutics, Inc. Non-toxic cas9 enzyme and application thereof
US20230203178A1 (en) * 2020-07-06 2023-06-29 Sichuan Kelun-Biotech Biopharmaceutical Co., Ltd. Chimeric antigen receptor car or car construct targeting bcma and cd19 and application thereof
US20230357798A1 (en) * 2020-10-12 2023-11-09 The Board Of Trustees Of The Leland Stanford Junior University Gene correction for x-cgd in hematopoietic stem and progenitor cells

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
RICHARDSON RYAN R., STEYERT MARILYN, KHIM SAOVLEAK N., CRUTCHER GARRETT W., BRANDENBURG CHERYL, ROBERTSON COLIN D., ROMANOWSKI AND: "Enhancing Precision and Efficiency of Cas9-Mediated Knockin Through Combinatorial Fusions of DNA Repair Proteins", THE CRISPR JOURNAL, vol. 6, no. 5, 1 October 2023 (2023-10-01), US, pages 447 - 461, XP093331874, ISSN: 2573-1599, DOI: 10.1089/crispr.2023.0036 *

Similar Documents

Publication Publication Date Title
US11649444B1 (en) CRISPR-CAS12i systems
CN113151215B (en) Engineered Cas12i nuclease, effector protein thereof and application thereof
US10975392B2 (en) Site-directed CRISPR/recombinase compositions and methods of integrating transgenes
AU2020336969B2 (en) Compositions and methods for non-toxic conditioning
WO2021222318A1 (en) Targeted base editing of the USH2A gene
US20190323038A1 (en) Bidirectional targeting for genome editing
AU2019284926A1 (en) Engineered cascade components and cascade complexes
US20240376456A1 (en) RNA-guided nucleases and active fragments and variants thereof and methods of use
WO2023078314A1 (en) Novel CRISPR-Cas12i systems and uses thereof
AU2021285983A1 (en) Novel OMNI-59, 61, 67, 76, 79, 80, 81, and 82 CRISPR nucleases
WO2023230613A1 (en) Improved mitochondrial base editors and methods for editing mitochondrial DNA
WO2022170216A2 (en) OMNI 90-99, 101, 104-110, 114, 116, 118-123, 125, 126, 128, 129, and 131-138 CRISPR nucleases
WO2025137069A1 (en) Compositions and methods for enhanced genome editing using Cas9 fusion proteins
US20250101394A1 (en) Novel CRISPR-Cas12i systems and uses thereof
WO2024033901A1 (en) RNA-guided nucleases and active fragments and variants thereof and methods of use
CA3238939A1 (en) Mutant myocilin disease model and uses thereof
US20230081547A1 (en) Non-human animals comprising a humanized KLKB1 locus and methods of use
TW202536173A (en) RNA-guided nucleases and active fragments and variants thereof and methods of use
WO2025022367A2 (en) RNA-guided nucleases and active fragments and variants thereof and methods of use
WO2023235725A2 (en) CRISPR-based therapeutics for C9orf72 repeat expansion disease
JP2025514304A (en) Identifying tissue-specific extragenic safe harbors for gene therapy
WO2025119363A1 (en) Cas protein, CRISPR-Cas system containing Cas protein, and use of Cas protein
WO2024127370A1 (en) Guide RNAs that target TRAC gene and methods of use
HK1243455B (en) Site-directed CRISPR/recombinase compositions and methods of integrating transgenes

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 24908808

Country of ref document: EP

Kind code of ref document: A1