EP4532704A2 - Novel nucleic acid-editing proteins - Google Patents
Novel nucleic acid-editing proteins
- Publication number
- EP4532704A2 EP23729980.5A
- Authority
- EP
- European Patent Office
- Prior art keywords
- domain
- protein
- sequence
- fusion protein
- editing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases [RNase]; Deoxyribonucleases [DNase]
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P11/00—Drugs for disorders of the respiratory system
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/78—Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y305/00—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
- C12Y305/04—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
- C12Y305/04005—Cytidine deaminase (3.5.4.5)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
Definitions
- the present invention relates to novel nucleic acid-editing proteins and base-editing systems comprising such.
- Targeted introduction of a specific modification into genomic DNA is a promising approach for the study of gene function and has the potential to provide new therapies for human genetic diseases.
- An ideal nucleic acid editing technology would provide high efficiency of introducing the desired modification, have a minimal off-target activity, and have the ability to be guided to edit precisely any site within the genome.
- ZFNs engineered zinc finger nucleases
- TALENs transcription activator like effector nucleases
- RGN RNA-guided DNA endonuclease
- NHEJ and HDR typically result in modest gene editing efficiencies as well as unwanted gene alterations that can compete with the desired alteration. Since many genetic diseases in principle can be treated by effecting a specific nucleotide change at a specific location in the genome (for example, a C to T change in a specific codon of a gene associated with a disease), the development of a programmable way to achieve such precision gene editing would represent both a powerful new research tool, as well as a potential new approach to gene editing-based human therapeutics.
- CRISPR clustered regularly interspaced short palindromic repeat
- gRNA RNA molecule
- the target DNA sequence must be both complementary to the sgRNA and also contain a “protospacer-adjacent motif” (PAM) sequence at the end of the complementary region in order for the system to function.
- PAM protospacer-adjacent motif
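As an illustration of the PAM requirement, the following minimal Python sketch scans one strand of a DNA string for candidate target sites. It assumes a 20-nucleotide complementary region followed directly by an NGG motif, the PAM commonly reported for S. pyogenes Cas9; neither the length nor the motif is stated in the text above, and the function name `find_ngg_targets` is hypothetical.

```python
import re


def find_ngg_targets(dna: str, protospacer_len: int = 20):
    """Return (start, protospacer, pam) for every forward-strand site
    where a protospacer is immediately followed by an NGG PAM.

    Assumptions (not from the source text): 20-nt protospacer, NGG PAM.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    example = "A" * 20 + "TGG"
    for start, protospacer, pam in find_ngg_targets(example):
        print(start, protospacer, pam)
```

Only the given strand is scanned; a fuller search would also scan the reverse complement.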
- S. pyogenes Cas9 has been most widely used as a tool for genome engineering.
- This Cas9 protein is a large, multi-domain protein containing two distinct nuclease domains. Point mutations can be introduced into Cas proteins to abolish nuclease activity, resulting in a dead Cas (dCas) or nickase (nCas) that still retains its ability to bind DNA guided by a gRNA.
- dCas9 can target that protein to the DNA sequence of interest simply by co-expression with an appropriate gRNA.
- HDR efficiencies vary according to the location of the target gene within the genome, the state of the cell cycle, and the type of cell/tissue.
- US 10,167,457 discloses some example base editors and provides fusion peptides for targeted base editing.
- Some aspects of this disclosure provide methods, systems, reagents, and kits that are useful for the targeted editing of nucleic acids.
- fusion proteins of an RNA-guided endonuclease domain and a cytidine deaminase domain are provided.
- methods for targeted nucleic acid editing are provided.
- reagents and kits for the generation of targeted nucleic acid editing proteins e.g., fusion proteins of Cas and deaminase domains, are provided.
- fusion proteins comprising (i) a nuclease-inactive RNA-guided endonuclease domain; and (ii) a cytidine deaminase domain.
- the nucleic-acid-editing domain is fused to the N-terminus of the RNA-guided endonuclease domain.
- the nucleic-acid-editing domain is fused to the C-terminus of the RNA-guided endonuclease domain.
- the RNA-guided endonuclease domain and the nucleic-acid-editing domain are fused via a linker.
- the methods comprise contacting a DNA molecule with (a) a fusion protein or protein complex comprising a nuclease-inactive RNA-guided endonuclease domain and a cytidine deaminase domain; and (b) a guide RNA targeting said fusion protein to a target nucleotide sequence; wherein the DNA molecule is contacted with the fusion protein or protein complex and the guide RNA in an amount effective and under conditions suitable for the deamination of a nucleotide base.
- the target DNA sequence comprises a sequence associated with a disease or disorder, and wherein the deamination of the nucleotide base results in a sequence that is not associated with a disease or disorder.
- the DNA sequence comprises a T>C point mutation.
- the deamination corrects a point mutation in the sequence associated with the disease or disorder.
- the sequence associated with the disease or disorder encodes a protein, and wherein the deamination introduces a stop codon or disrupts splicing of the sequence associated with the disease or disorder.
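To make the stop-codon case concrete, the sketch below lists positions in a coding sequence where a single C to T change (the net outcome of cytidine deamination after repair or replication) would turn a sense codon into a stop codon. It is a simplified illustration only: it assumes the standard stop codons TAA, TAG, and TGA, reads the sequence in frame from its first base, and considers only the coding strand. The function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def c_to_t_stop_candidates(cds: str):
    """Return (base_index, original_codon, edited_codon) for each single
    C->T change on the coding strand that creates a stop codon.

    Assumptions: standard genetic code, reading frame starts at index 0.
    """
    cds = cds.upper()
    results = []
    # Walk complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon in STOP_CODONS:
            continue
        for j, base in enumerate(codon):
            if base == "C":
                edited = codon[:j] + "T" + codon[j + 1:]
                if edited in STOP_CODONS:
                    results.append((i + j, codon, edited))
    return results


if __name__ == "__main__":
    print(c_to_t_stop_candidates("ATGCAGCGATGG"))
```

Deamination of a C on the opposite strand appears as a G to A change on the coding strand (for example within TGG); that case is not covered by this sketch.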
- kits comprising a nucleic acid construct that comprises a sequence encoding a nuclease-inactive RNA-guided endonuclease sequence, a sequence encoding a nucleic acid-editing enzyme or enzyme domain, such as cytidine deaminase, in-frame with the RNA-guided endonuclease-encoding sequence, and, optionally, a sequence encoding a linker positioned between the Cas encoding sequence and the cloning site.
- the kit comprises suitable reagents, buffers, and/or instructions for use.
- kits comprising a fusion protein comprising a nuclease-inactive RNA-guided endonuclease domain and a cytidine deaminase domain, and, optionally, a linker positioned between the RNA-guided endonuclease domain and the deaminase domain.
- the kit comprises suitable reagents, buffers, and/or instructions for using the fusion protein, e.g., for in vitro or in vivo DNA or RNA editing.
- the kit comprises instructions regarding the design and use of suitable gRNAs for targeted editing of a nucleic acid sequence.
- Zinc finger nuclease or “ZFN” as used herein refers to a chimeric protein molecule comprising at least one zinc finger DNA binding domain effectively linked to at least one nuclease or part of a nuclease capable of cleaving DNA when fully transcribed and assembled.
- CRISPR Clustered Regularly Interspaced Short Palindromic Repeats
- prokaryotic organisms such as bacteria and archaea. These sequences are derived from DNA fragments of bacteriophages that had previously infected the prokaryote. They are used to detect and destroy DNA from similar bacteriophages during subsequent infections.
- CRISPR system refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) proteins, including sequences encoding a Cas protein, a tracr (trans-activating CRISPR) sequence (e.g., tracrRNA or an active partial tracrRNA), a tracr-mate sequence (containing a “direct repeat” and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to herein as a “spacer” in the context of an endogenous CRISPR system), or other sequences and transcripts from a CRISPR locus.
- Type II CRISPR system refers to an effector system that carries out a targeted DNA double-strand break in four sequential steps, using a single effector enzyme, Cas9, to cleave dsDNA.
- the Type II effector system may function in alternative contexts such as eukaryotic cells.
- the Type II effector system consists of a long pre-crRNA, which is transcribed from the spacer-containing CRISPR locus, the Cas9 protein, and a tracrRNA, which is involved in pre-crRNA processing.
- nucleic acid guided DNA binding protein refers to any protein that complexes with one or more nucleic acids that guide the binding of that protein to a specific region of a DNA.
- RNA-guided nucleases are an example of nucleic acid guided DNA binding proteins.
- RNA-guided endonuclease and “RGN” are used interchangeably herein and refer to a nuclease that forms a complex with (e.g., binds or associates with) one or more RNA molecules that are not a target for cleavage.
- gRNA, also used interchangeably herein as a chimeric single guide RNA (“sgRNA”), refers to a nucleic acid which is a fusion of two noncoding RNAs: a crRNA and a tracrRNA. “gRNA” is used interchangeably to refer to guide RNAs that exist as either single molecules or as a complex of two or more molecules. Typically, gRNAs that exist as single RNA species comprise two domains: (1) a domain that shares homology to a target nucleic acid (e.g., and directs binding of a Cas9 complex to the target); and (2) a domain that binds a Cas9 protein.
- Cas9 refers to a type of RGN that cleaves nucleic acid, is encoded by the CRISPR loci, and is a part of the Type II CRISPR system.
- the Cas9 protein commonly used is from bacterial species Streptococcus pyogenes.
- the Cas9 protein may be mutated so that the nuclease activity is partly or completely inactivated.
- dCas9 refers to an inactivated Cas9 protein. Examples include dCas9 from Streptococcus pyogenes with no nuclease activity. As used herein, “dCas9” refers to a Cas9 protein that has amino acid substitutions that inactivate its nuclease activity. For S. pyogenes Cas9 these mutations are D10A and H840A.
- nCas9 refers to Cas9 nickase domain or protein.
- Cas9 nickase refers to a modified version of the Cas9, containing a single inactive catalytic domain, either RuvC- or HNH-. With only one active nuclease domain, the Cas9 nickase cuts only one strand of the target DNA, creating a single-strand break or “nick”.
- a Cas9 nickase is still able to bind DNA based on gRNA specificity, but nickases will only cut one of the DNA strands.
- nCas9 is derived from S. pyogenes and the RuvC domain can be inactivated by an amino acid substitution at position D10 (e.g., D10A) and the HNH domain can be inactivated by an amino acid substitution at position H840 (e.g., H840A), or at positions corresponding to those amino acids in other proteins.
- nicking refers to a reaction that breaks the phosphodiester bond between two nucleotides in one strand of a double-stranded DNA molecule to produce a 3' hydroxyl group and a 5' phosphate group.
- linker refers to a chemical group or a molecule linking two molecules or moieties, e.g., a binding domain and a cleavage domain of a nuclease.
- a linker joins a gRNA binding domain of an RNA-programmable nuclease and the catalytic domain of a recombinase.
- a linker joins a dCas9 and a recombinase.
- the linker is positioned between, or flanked by, two groups, molecules, or other moieties and connected to each one via a covalent bond, thus connecting the two.
- the linker is an amino acid or a plurality of amino acids (e.g., a peptide or protein).
- the linker is an organic molecule, group, polymer, or chemical moiety.
- the linker is 5-100 amino acids in length, for example, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 30-35, 35-40, 40-45, 45-50, 50-60, 60-70, 70-80, 80-90, 90-100, 100-150, or 150-200 amino acids in length. Longer or shorter linkers are also contemplated.
- nucleic acid: As used herein, the terms “nucleic acid,” “nucleic acid sequence,” “nucleotide sequence,” “oligonucleotide,” and “polynucleotide” are interchangeable and refer to a polymeric form of nucleotides.
- the nucleotides may be deoxyribonucleotides (DNA), ribonucleotides (RNA), analogs thereof, or combinations thereof, and may be of any length.
- Polynucleotides may perform any function and may have any secondary and tertiary structures.
- the terms encompass known analogs of natural nucleotides and nucleotides that are modified in the base, sugar and/or phosphate moieties.
- a polynucleotide may comprise one modified nucleotide or multiple modified nucleotides. Examples of modified nucleotides include fluorinated nucleotides, methylated nucleotides, and nucleotide analogs. Nucleotide structure may be modified before or after a polymer is assembled. Following polymerization, polynucleotides may be additionally modified via, for example, conjugation with a labeling component or target binding component. A nucleotide sequence may incorporate non-nucleotide components.
- nucleic acids comprising modified backbone residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring, and have similar binding properties as a reference polynucleotide (e.g., DNA or RNA).
- analogs include, but are not limited to, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2'-O-methyl ribonucleotides, peptide-nucleic acids (PNAs), Locked Nucleic Acid (LNA™) (Exiqon, Inc., Woburn, MA) nucleosides, glycol nucleic acid, bridged nucleic acids, and morpholino structures.
- Polynucleotide sequences are displayed herein in the conventional 5' to 3' orientation unless otherwise indicated.
- polypeptide As used herein, the terms “peptide,” “polypeptide,” and “protein” are interchangeable and refer to polymers of amino acids.
- a polypeptide may be of any length. It may be branched or linear, it may be interrupted by non-amino acids, and it may comprise modified amino acids.
- the terms may be used to refer to an amino acid polymer that has been modified through, for example, acetylation, disulfide bond formation, glycosylation, lipidation, phosphorylation, cross-linking, and/or conjugation (e.g., with a labeling component or ligand).
- Polypeptide sequences are displayed herein in the conventional N-terminal to C-terminal orientation.
- Polypeptides and polynucleotides can be made using routine techniques in the field of molecular biology (see, e.g., standard texts set forth above). Further, essentially any polypeptide or polynucleotide can be custom ordered from commercial sources.
- target region refers to the region of the target gene to which the CRISPR-based system targets.
- target site refers to a sequence within a nucleic acid molecule that is deaminated by a deaminase or a fusion protein comprising a deaminase, (e.g., a RGN-cytidine deaminase fusion protein provided herein).
- mutation refers to a substitution of a residue within a sequence, e.g., a nucleic acid or amino acid sequence, with another residue, or a deletion or insertion of one or more residues within a sequence. Mutations are typically described herein by identifying the original residue followed by the position of the residue within the sequence and by the identity of the newly substituted residue. Various methods for making the amino acid substitutions (mutations) provided herein are well known in the art, and are provided by, for example, Green and Sambrook, Molecular Cloning: A Laboratory Manual (4th ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2012)).
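The substitution notation described above (original residue, position, new residue, as in D10A or H840A) can be handled mechanically. The following is a minimal sketch, assuming single-letter amino acid codes and 1-based positions; the class and function names are hypothetical, and the short sequence in the usage example is a toy string, not a sequence from this disclosure.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    original: str      # residue present before the change
    position: int      # 1-based position within the sequence
    replacement: str   # residue present after the change


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'D10A' into its three parts."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label.strip())
    if m is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(m.group(1), int(m.group(2)), m.group(3))


def apply_substitution(sequence: str, sub: Substitution) -> str:
    """Return the sequence with the substitution applied, after checking
    that the expected original residue is actually at that position."""
    idx = sub.position - 1
    if idx < 0 or idx >= len(sequence) or sequence[idx] != sub.original:
        raise ValueError(f"{sub.original}{sub.position} not found in sequence")
    return sequence[:idx] + sub.replacement + sequence[idx + 1:]


if __name__ == "__main__":
    toy = "MKKKKKKKKDK"  # toy sequence with D at position 10
    print(apply_substitution(toy, parse_substitution("D10A")))
```

Checking the original residue before editing guards against applying a label to a sequence with different numbering, which matters when a position "corresponds" to a residue in another protein.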
- complement or “complementary” as used herein can mean Watson-Crick or Hoogsteen base pairing between nucleotides or nucleotide analogs of nucleic acid molecules.
- complementarity refers to a property shared between two nucleic acid sequences, such that when they are aligned antiparallel to each other, the nucleotide bases at each position will be complementary.
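The antiparallel alignment in this definition can be checked with a short sketch. It models Watson-Crick pairing only (A with T or U, C with G); Hoogsteen pairing and nucleotide analogs are outside its scope, and the function names are hypothetical.

```python
# Map each base to its Watson-Crick partner; U is treated like T.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA or RNA string as DNA letters."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def are_complementary(a: str, b: str) -> bool:
    """True if every position pairs when a and b are aligned antiparallel."""
    b_dna = b.upper().replace("U", "T")
    return len(a) == len(b) and reverse_complement(a) == b_dna


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(are_complementary("ATGC", "GCAU"))
```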
- promoter means a synthetic or naturally-derived molecule which is capable of conferring, activating or enhancing expression of a nucleic acid in a cell.
- a promoter may comprise one or more specific transcriptional regulatory sequences to further enhance expression and/or to alter the spatial expression and/or temporal expression of same.
- a promoter may also comprise distal enhancer or repressor elements, which may be located as much as several thousand base pairs from the start site of transcription.
- a promoter may be derived from sources including viral, bacterial, fungal, plants, insects, and animals.
- Enhancer refers to non-coding DNA sequences containing multiple activator and repressor binding sites. Enhancers range from 200 bp to 1 kb in length and may be either proximal, 5' upstream to the promoter or within the first intron of the regulated gene, or distal, in introns of neighboring genes or intergenic regions far away from the locus. Through DNA looping, active enhancers contact the promoter dependently of the core DNA binding motif promoter specificity. 4 to 5 enhancers may interact with a promoter.
- operably linked means that expression of a gene is under the control of a promoter with which it is spatially connected.
- a promoter may be positioned 5' (upstream) or 3' (downstream) of a gene under its control.
- the distance between the promoter and a gene may be approximately the same as the distance between that promoter and the gene it controls in the gene from which the promoter is derived. As is known in the art, variation in this distance may be accommodated without loss of promoter function.
- vector means a nucleic acid sequence containing an origin of replication.
- a vector may be a viral vector, bacteriophage, bacterial artificial chromosome or yeast artificial chromosome.
- a vector may be a DNA or RNA vector.
- a vector may be a self-replicating extrachromosomal vector, or a DNA plasmid.
- an effective amount refers to an amount of a biologically active agent that is sufficient to elicit a desired biological response.
- an effective amount of a nuclease may refer to the amount of the nuclease that is sufficient to induce cleavage of a target site specifically bound and cleaved by the nuclease.
- an effective amount of a recombinase may refer to the amount of the recombinase that is sufficient to induce recombination at a target site specifically bound and recombined by the recombinase.
- an agent e.g., a nuclease, a recombinase, a hybrid protein, a fusion protein, a protein dimer, a complex of a protein (or protein dimer) and a polynucleotide, or a polynucleotide
- AAV adeno-associated virus
- subject and patient, as used herein interchangeably, refer to any vertebrate, including, but not limited to, a mammal (e.g., cow, pig, camel, llama, horse, goat, rabbit, sheep, hamster, guinea pig, cat, dog, rat, and mouse), a non-human primate (for example, a monkey, such as a cynomolgus or rhesus monkey, chimpanzee, etc.), and a human.
- the subject may be a human or a non-human.
- the subject or patient may be undergoing
- treatment refers to a clinical intervention aimed to reverse, alleviate, delay the onset of, or inhibit the progress of a disease or disorder, or one or more symptoms thereof, as described herein.
- treatment may be administered after one or more symptoms have developed and/or after a disease has been diagnosed.
- treatment may be administered in the absence of symptoms, e.g., to prevent or delay onset of a symptom or inhibit onset or progression of a disease.
- treatment may be administered to a susceptible individual prior to the onset of symptoms (e.g., in light of a history of symptoms and/or in light of genetic or other susceptibility factors). Treatment may also be continued after symptoms have resolved, for example, to prevent or delay their recurrence.
- the present disclosure provides fusion proteins comprising (i) a site-specific nuclease domain; and (ii) a cytidine deaminase domain.
- site-specific nuclease domains are known to the skilled person and include zinc finger nucleases (ZFNs), TAL effector nucleases (TALENs), and CRISPR-based systems. Any nucleic acid guided DNA binding domain can be used as long as the domain is being guided to a specific point of interest within the target nucleic acid sequence.
- ZFNs zinc finger nucleases
- TALENs TAL effector nucleases
- a CRISPR nuclease-domain would be suitable for such purpose.
- the present disclosure more specifically provides fusion proteins comprising (i) a CRISPR nuclease-domain and (ii) a cytidine deaminase domain.
- the cytidine deaminase comprises one of the sequences described above.
- Suitable CRISPR nuclease domains are described herein with Cas9 being commonly used.
- the inactive CRISPR nuclease domain is a dCas9 domain.
- Alternative suitable site-specific nuclease domains will be apparent to the skilled artisan based on this disclosure.
- the CRISPR nuclease domain is a CRISPR nickase domain or inactive CRISPR nuclease domain.
- the disclosure provides CRISPR nuclease enzyme/domain fusion proteins with various configurations.
- the cytidine deaminase enzyme or domain is fused to the N-terminus of the CRISPR nuclease domain.
- the cytidine deaminase enzyme or domain is fused to the C-terminus of the CRISPR nuclease domain.
- the general architecture of Cas fusion proteins provided herein comprises the structure:
- the general architecture of Cas fusion proteins comprises the structure:
- the NLS is located C-terminal of the cytidine deaminase and/or the CRISPR nuclease domain. In some embodiments, the NLS is located between the cytidine deaminase and the Cas domain. In some embodiments, multiple NLSs are present. Preferably, an NLS is present at both the C- and N-termini of the cytidine deaminase and/or CRISPR nuclease domains.
- In some embodiments, the CRISPR nuclease domain and the cytidine deaminase domain are fused via a linker.
- the linker comprises the sequence SGGSSGGSSGSETPGTSESATPESSGGSSGGS (SEQ ID NO: 31).
- the linker comprises a (GGGGS)n (SEQ ID NO: 32), a (G)n, an (EAAAK)n (SEQ ID NO: 33), or an (XP)n motif, or a combination of any of these, wherein n is independently an integer between 1 and 30.
- n is independently 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30, or, if more than one linker or more than one linker motif is present, any combination thereof.
- suitable linker motifs and linker configurations include those described in Chen et al., Adv Drug Deliv Rev. 2013; 65(10): 1357-69.
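The repeat-motif linkers described above can be written out explicitly. The sketch below builds a linker from a motif and a repeat count n in the stated range of 1 to 30, and joins several motifs into a combined linker. The helper names are hypothetical, and the particular combination in the usage example is arbitrary rather than a linker taught by this disclosure.

```python
def repeat_linker(motif: str, n: int) -> str:
    """Return the motif repeated n times, e.g. (GGGGS)n, with 1 <= n <= 30."""
    if not 1 <= n <= 30:
        raise ValueError("n must be an integer between 1 and 30")
    return motif * n


def combined_linker(parts) -> str:
    """Join several (motif, n) pairs into one linker sequence."""
    return "".join(repeat_linker(motif, n) for motif, n in parts)


if __name__ == "__main__":
    print(repeat_linker("GGGGS", 3))
    # Arbitrary illustrative combination of a flexible and a rigid motif.
    print(combined_linker([("GGGGS", 2), ("EAAAK", 2)]))
```

The (XP)n motif is omitted because X stands for an unspecified amino acid, so a concrete residue would have to be chosen first.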
- fusion proteins as provided herein comprise the full-length amino acid sequence of a nucleic acid-editing enzyme, e.g., one of the sequences provided above. In other embodiments, however, fusion proteins as provided herein do not comprise a full-length sequence of a nucleic acid-editing enzyme, but only a fragment thereof.
- a fusion protein provided herein comprises a Cas9 domain and a fragment of a nucleic acid-editing enzyme, e.g., wherein the fragment comprises a nucleic acid-editing domain.
- Exemplary amino acid sequences of nucleic acid-editing domains are shown in the sequences above as italicized letters, and additional suitable sequences of such domains will be apparent to those of skill in the art.
- additional features may be present. Such features could be one or more linker sequences between the NLS and the rest of the fusion protein and/or between the cytidine deaminase domain and the CRISPR nuclease domain. Other features such as, for example, nuclear localization sequences, cytoplasmic localization sequences, export sequences, such as nuclear export sequences, or other localization sequences, could be present. In some embodiments sequence tags could be present. Such tags are useful for solubilization, purification, or detection of the fusion proteins.
- Suitable localization signal sequences and protein tag sequences are provided herein, and include, but are not limited to, biotin carboxylase carrier protein (BCCP) tags, myc-tags, calmodulin-tags, FLAG-tags, hemagglutinin (HA)-tags, polyhistidine tags, also referred to as histidine tags or His-tags, maltose binding protein (MBP)-tags, nus-tags, glutathione-S-transferase (GST)-tags, green fluorescent protein (GFP)-tags, thioredoxin-tags, S-tags, Softags (e.g., Softag 1, Softag 3), strep-tags, biotin ligase tags, FlAsH tags, V5 tags, and SBP-tags. Additional suitable sequences will be apparent to those of skill in the art.
- BCCP biotin carboxylase carrier protein
- MBP maltose binding protein
- GST glutathione-S-transferase
- nucleic-acid editing enzyme sequences e.g., deaminase enzyme and domain sequences, that can be used according to aspects of this invention, e.g., that can be fused to a nuclease-inactive CRISPR associated domain
- additional enzyme sequences include cytidine deaminase domain sequences that are at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% similar to the sequences provided herein.
- Additional suitable CRISPR nuclease domains, variants, and sequences will also be apparent to those of skill in the art.
- fusion proteins as provided herein comprise the full-length amino acid sequence of a cytidine deaminase, e.g., one of the sequences provided above. In other embodiments, however, fusion proteins as provided herein do not comprise a full-length sequence of a cytidine deaminase, but only a fragment thereof.
- a fusion protein provided herein comprises a CRISPR nuclease domain (such as, for example, Cas9 domain) and a fragment of a cytidine deaminase domain, e.g., wherein the fragment comprises a cytidine deaminase domain. Additional suitable sequences of such domains will be apparent to those of skill in the art.
- Cytidine deaminase domain is capable of catalyzing the hydrolytic deamination of cytidine or deoxycytidine to uridine or deoxyuridine, respectively.
- the cytidine deaminase domain catalyzes the hydrolytic deamination of cytosine to uracil.
- the cytidine deaminase or cytidine deaminase domain is a naturally occurring cytidine deaminase.
- the present disclosure provides novel cytidine deaminase domains that could be used in a fusion protein or a protein complex comprising the sequence of any one of SEQ ID NO: 1-13.
- the examples of the present disclosure demonstrate cytidine base editing of several cytidine deaminase sequences.
- CBE07, CBE08, CBE10, CBE11, CBE12, and CBE13 demonstrate active base editing ability, with CBE07, CBE11, CBE12, and CBE13 being most effective.
- the cytidine deaminase or cytidine deaminase domain is a variant of a naturally occurring deaminase from an organism, that does not occur in nature.
- the cytidine deaminase or cytidine deaminase domain is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to the deaminase domain of any one of SEQ ID NOs: 7, 8, 10, 11, 12, or 13.
- the cytidine deaminase domain is at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to the deaminase domain of any one of SEQ ID NOs: 7, 8, 10, 11, 12, or 13.
- the cytidine deaminase domain comprises the amino acid sequence of any one of SEQ ID NOs: 7, 8, 10, 11, 12, or 13.
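Percent-identity thresholds such as those above depend on how identity is computed, which the excerpt does not specify. The following sketch shows one common convention as an assumption: identity is counted over the columns of an existing pairwise alignment (gaps written as "-"), and a gap column counts as a mismatch. The function name and the toy sequences are hypothetical; producing the alignment itself is left to a dedicated alignment tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumed convention: columns where both sequences have a gap are
    skipped; a column with one gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned sequences, not sequences from this disclosure.
    identity = percent_identity("MKTA-Y", "MKTAAY")
    print(f"{identity:.1f}% identical; meets 80% threshold: {identity >= 80}")
```

Other conventions (for example dividing by the length of the shorter sequence) give different values for the same pair, so a stated threshold is only meaningful together with the method used.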
- Cytidine deaminases provided herein can be used for targeted editing of nucleic acid sequences. Such cytidine deaminases are useful for targeted editing of DNA in vitro, e.g., for the generation of mutant cells or animals; for the introduction of targeted mutations, e.g., for the correction of genetic defects in cells ex vivo, e.g., in cells obtained from a subject that are subsequently re-introduced into the same or another subject; and for the introduction of targeted mutations, e.g., the correction of genetic defects or the introduction of deactivating mutations in disease-associated genes in a subject.
- the cytidine deaminase domain has catalytic activity mutations that reduce, but do not eliminate, the catalytic activity of a cytidine deaminase domain within a base editing fusion protein. Such mutations could make it less likely that the cytidine deaminase domain will catalyze the deamination of a residue adjacent to a target residue, thereby narrowing the deamination window.
- the ability to narrow the deamination window may help to prevent unwanted deamination of residues adjacent to specific target residues, which may help to decrease or prevent off-target effects.
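The effect of narrowing the deamination window can be illustrated by listing which cytosines in a target sequence fall inside a given window. This is a sketch under stated assumptions: positions are counted from 1 at the PAM-distal end of the target sequence, and the window bounds used in the example (4 to 8, then 4 to 5) are hypothetical values chosen for illustration, not values taken from this disclosure.

```python
def cytosines_in_window(protospacer: str, window_start: int, window_end: int):
    """Return 1-based positions of cytosines inside the inclusive window.

    Assumption: position 1 is the PAM-distal end of the protospacer.
    """
    return [
        pos
        for pos, base in enumerate(protospacer.upper(), start=1)
        if base == "C" and window_start <= pos <= window_end
    ]


if __name__ == "__main__":
    toy_target = "AACCATCGCC"  # toy sequence for illustration
    print(cytosines_in_window(toy_target, 4, 8))  # wider window
    print(cytosines_in_window(toy_target, 4, 5))  # narrower window
```

With the wider window two cytosines are candidates for deamination, while the narrower window leaves only one, which is the sense in which a narrower window reduces editing of neighboring residues.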
- any of the fusion proteins provided herein comprise a cytidine deaminase domain that has reduced catalytic deaminase activity. In some embodiments, any of the fusion proteins provided herein comprise a cytidine deaminase domain that has a reduced catalytic deaminase activity as compared to an appropriate control.
- the appropriate control may be the deaminase activity of the cytidine deaminase prior to introducing one or more mutations into the cytidine deaminase. In other embodiments, the appropriate control may be a wild-type deaminase.
- the appropriate control is a wild-type apolipoprotein B mRNA-editing complex (APOBEC) family deaminase.
- the appropriate control is an APOBEC1 deaminase, an APOBEC2 deaminase, an APOBEC3A deaminase, an APOBEC3B deaminase, an APOBEC3C deaminase, an APOBEC3D deaminase, an APOBEC3F deaminase, an APOBEC3G deaminase, or an APOBEC3H deaminase.
- APOBEC apolipoprotein B mRNA-editing complex
- the appropriate control is an activation induced deaminase (AID).
- the deaminase domain may be a deaminase domain that has at least 1%, at least 5%, at least 15%, at least 20%, at least 25%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95% less catalytic deaminase activity as compared to an appropriate control.
- Some aspects of this disclosure provide fusion proteins and protein complexes that comprise an RNA-guided endonuclease domain that binds to a guide RNA (gRNA or sgRNA), which, in turn, binds a target nucleic acid sequence via strand hybridization; and a cytidine deaminase domain that can deaminate a cytidine.
- the RNA-guided endonuclease domain of the fusion proteins described herein partially lacks nuclease activity or does not have any nuclease activity.
- Such domain might be a fragment of nuclease-inactive Cas9 protein or a dCas9 protein or domain.
- a Cas9 nickase (nCas9) is used to cleave the RNA-bound (templating) strand of DNA when the Cas9 is bound.
- the RNA-guided endonuclease fusion protein or protein complex comprises at least one nuclear localization signal, which permits entry of the endonuclease into the nuclei of eukaryotic cells.
- RNA-guided endonucleases also comprise at least one nuclease domain and at least one domain that interacts with a guide RNA.
- An RNA-guided endonuclease is directed to a specific nucleic acid sequence (or target site) by a guide RNA.
- the guide RNA interacts with the RNA-guided endonuclease as well as the target site such that it directs the RNA-guided endonuclease to the target site nucleic acid sequence to which the guide RNA is complementary.
- RNA-guided endonuclease can be derived from a clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) system.
- CRISPR/Cas proteins comprise at least one RNA recognition and/or RNA binding domain.
- RNA recognition and/or RNA binding domains interact with guide RNAs.
- CRISPR/Cas proteins can also comprise nuclease domains (i.e., DNase or RNase domains), DNA binding domains, helicase domains, RNase domains, protein-protein interaction domains, dimerization domains, as well as other domains.
- the CRISPR/Cas system can be a type I, a type II, type III, type IV, type V, or type VI system.
- suitable CRISPR/Cas proteins include Cas1, Cas2, Cas3, Cas4, Cas5, Cas7, Cas8, Cas9, Cas10, Cas12 (Cpf1), Cas13 (C2c2), Csm, and Cmr.
- the RNA-guided endonuclease is derived from a type II CRISPR/Cas system. In specific embodiments, the RNA-guided endonuclease is derived from a Cas9 or Cas12 protein.
- the CRISPR/Cas protein can be a wild type CRISPR/Cas protein, a modified CRISPR/Cas protein, or a fragment of a wild type or modified CRISPR/Cas protein.
- the CRISPR/Cas protein can be modified to increase nucleic acid binding affinity and/or specificity, alter an enzymatic activity, and/or change another property of the protein.
- the CRISPR/Cas protein can be truncated to remove domains that are not essential for the function of the fusion protein or protein complex.
- the CRISPR/Cas protein can also be truncated or modified to optimize the activity of the effector domain of the fusion protein.
- the CRISPR/Cas protein can be derived from a wild type Cas9 protein or fragment thereof.
- the CRISPR/Cas protein can be derived from modified Cas9 protein.
- the amino acid sequence of the Cas9 protein can be modified to alter one or more properties (e.g., nuclease activity, affinity, stability, etc.) of the protein.
- domains of the Cas9 protein not involved in RNA-guided cleavage can be eliminated from the protein such that the modified Cas9 protein is smaller than the wild type Cas9 protein.
- Cas9 protein commonly comprises at least two nuclease domains.
- a Cas9 protein can comprise a RuvC-like nuclease domain and an HNH-like nuclease domain. The RuvC and HNH domains each cut a single strand, and together they make a double-stranded break in DNA.
- the Cas9 protein can be modified to contain only one functional nuclease domain (either a RuvC-like or an HNH-like nuclease domain).
- the Cas9-derived protein can be modified such that one of the nuclease domains is deleted or mutated such that it is no longer functional (i.e., the nuclease activity is absent).
- the Cas9-derived protein is able to introduce a nick into a double-stranded nucleic acid (such protein is termed a “nickase”), but not cleave the double-stranded DNA.
- an aspartate to alanine (D10A) conversion in a RuvC-like domain converts the Cas9-derived protein into a nickase.
- H840A or H839A mutations in an HNH domain convert the Cas9-derived protein into a nickase.
- Each nuclease domain can be modified using well-known methods, such as site-directed mutagenesis, PCR-mediated mutagenesis, and total gene synthesis, as well as other methods known in the art.
- Non-limiting, exemplary nuclease inactive Cas9 domains are well known to the skilled person.
- One exemplary suitable nuclease-inactive S. pyogenes Cas9 domain is the D10A/H840A Cas9 domain mutant.
- the RGN domain is a nNme2Cas9 (having D16A mutation) having sequence of SEQ ID NO: 34.
- nuclease-inactive CRISPR associated domains will be apparent to those of skill in the art based on this disclosure.
- Such additional exemplary suitable nuclease-inactive spCas9 domains include, but are not limited to, D10A, D10A/D839A/H840A, and D10A/D839A/H840A/N863A mutant domains (e.g., Prashant et al. Nature Biotechnology. 2013; 31(9): 833-838).
- Cas9 fusion proteins as provided herein comprise the full-length amino acid sequence of a Cas9 protein. In other embodiments, however, fusion proteins as provided herein do not comprise a full-length Cas9 sequence, but only a fragment thereof.
- a Cas9 fusion protein provided herein comprises a Cas9 fragment, wherein the fragment binds crRNA and tracrRNA or sgRNA, but does not comprise a functional nuclease domain, e.g., in that it comprises only a truncated version of a nuclease domain or no nuclease domain at all.
- Exemplary amino acid sequences of suitable modified Cas9 domains are described for example in Oakes et al., Cell. 2019 Jan 10; 176(1-2):254-267. Additional suitable sequences of Cas9 domains and fragments will be apparent to those of skill in the art.
- Cas9 refers to Cas9 from: Corynebacterium ulcerans (NCBI Refs: NC_015683.1, NC_017317.1); Corynebacterium diphtheria (NCBI Refs: NC_016782.1, NC_016786.1); Spiroplasma syrphidicola (NCBI Ref: NC_021284.1); Prevotella intermedia (NCBI Ref: NC_017861.1); Spiroplasma taiwanense (NCBI Ref: NC_021846.1); Streptococcus iniae (NCBI Ref: NC_021314.1); Belliella baltica (NCBI Ref: NC_018010.1); Psychroflexus torquis (NCBI Ref: NC_018721.1); Streptococcus thermophilus (NCBI Ref: YP_820832.1); Listeria innocua (NCBI Ref: NP______
- RNA-guided endonuclease domains that have different PAM specificities.
- RNA-guided endonuclease proteins, such as the commonly used Cas9 from S. pyogenes (spCas9), require a canonical NGG PAM sequence to bind a particular nucleic acid region. Having a nuclease domain that requires a specific PAM sequence may limit the ability to edit desired bases within a genome.
- the fusion proteins provided herein may need to be placed at a precise location.
- any of the fusion proteins or protein complexes provided herein may contain an RGN domain that is capable of binding a nucleotide sequence that does not contain a canonical (e.g., NGG) PAM sequence.
- RGN domains that bind to non-canonical PAM sequences have been described in the art and would be apparent to the skilled artisan.
- Cas9 domains that bind non-canonical PAM sequences have been described in Kleinstiver, B. P., et al., Nature 523, 481-485 (2015).
- RNA-guided endonuclease from Neisseria meningitidis recognizes a simple dinucleotide PAM (NNNNCC) that provides for high target site density (Edraki et al. Mol Cell. 2019 Feb 21; 73(4):714-726) and would be a preferred variant of a Cas9 domain for use in the fusion proteins and protein complexes described herein.
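The effect of PAM choice on target site availability can be illustrated with a short scan for PAM-adjacent protospacers. This is a minimal sketch, assuming a plain DNA string, a search of the given strand only, and a user-chosen protospacer length; the function names and example sequences are hypothetical, and only the PAM patterns (NGG and NNNNCC) come from the text above.

```python
import re

# Minimal IUPAC subset needed for the two PAMs discussed above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def pam_regex(pam: str):
    """Compile a PAM such as 'NGG' or 'NNNNCC' into an overlapping-match regex."""
    body = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead lets overlapping PAM occurrences be reported.
    return re.compile("(?=(" + body + "))")


def find_targets(seq: str, pam: str, protospacer_len: int = 20):
    """Return (protospacer_start, protospacer, pam) for PAMs lying 3' of a full protospacer."""
    seq = seq.upper()
    hits = []
    for match in pam_regex(pam).finditer(seq):
        pam_start = match.start()
        if pam_start >= protospacer_len:  # need room for the protospacer 5' of the PAM
            start = pam_start - protospacer_len
            hits.append((start, seq[start:pam_start], match.group(1)))
    return hits


if __name__ == "__main__":
    # Hypothetical sequences chosen only to make the scan easy to follow.
    print(find_targets("A" * 20 + "TGG", "NGG"))
    print(find_targets("A" * 22 + "GATTCC", "NNNNCC", protospacer_len=22))
```

A fuller treatment would also scan the reverse complement, since a PAM on either strand can define a target site.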
- UPP uracil protecting peptide
- examples of such peptides include a uracil glycosylase inhibitor (UGI) (US 10,167,457) and p56. Both UGI and p56 have been shown to inhibit the activity of uracil DNA glycosylase (UDG). Fusion proteins comprising a cytidine deaminase domain, a dCas (e.g., dCas9) domain and a uracil glycosylase inhibitor (UGI) have been demonstrated to have improved efficiency for deaminating target nucleotides (Komor, et al. Nature 533, 420-424 (2016)). Without wishing to be bound by any particular theory, cellular DNA-repair response to the presence of U:G heteroduplex DNA may be responsible for a decrease in nucleobase editing efficiency in cells.
- novel UPPs that are useful in the context of base editing.
- such UPP comprises the sequence of SEQ ID NO: 43 or 45.
- any of the fusion proteins provided herein that comprise an RNA-guided endonuclease domain (e.g., a nuclease active Cas9 domain, a nuclease inactive dCas9 domain, or a Cas9 nickase)
- Some aspects of this disclosure provide cytidine deaminase-dCas9 fusion proteins, cytidine deaminase-nuclease active Cas9 fusion proteins and cytidine deaminase-Cas9 nickase (nCas9) fusion proteins comprising a UPP.
- the present disclosure provides a fusion protein or protein complex that comprises (i) an RNA-guided endonuclease domain (such as, for example, a nuclease active Cas9 domain, a nuclease inactive dCas9 domain, or a Cas9 nickase), (ii) a cytidine deaminase domain, and (iii) a uracil protecting peptide (UPP) comprising the sequence of SEQ ID NO: 43 or 45.
- cellular DNA-repair response to the presence of U:G heteroduplex DNA may be responsible for the decrease in nucleobase editing efficiency in cells.
- this disclosure contemplates a fusion protein comprising a dCas9-nucleic acid editing domain further fused to a UPP.
- This disclosure also contemplates a fusion protein comprising a Cas9 nickase-nucleic acid editing domain further fused to a UPP.
- the use of a UPP may increase the editing efficiency of a cytidine deaminase domain that catalyzes a C to U change.
- the fusion protein comprises the structure:
- the fusion protein comprises the structure:
- nCas9 is a Cas9 nickase, and is an optional linker sequence.
- Some aspects of this disclosure provide complexes comprising any of the fusion proteins or protein complexes provided herein, and a guide RNA bound to a Cas domain (e.g., a dCas9, a nuclease active Cas9, or a Cas9 nickase) of the fusion protein.
- the guide RNA is from 15-300 nucleotides long and comprises a sequence of at least 10 contiguous nucleotides that is complementary to a target sequence.
- the guide RNA comprises a sequence of 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 contiguous nucleotides that is complementary to a target sequence.
- the target sequence is a DNA sequence.
- the target sequence is a sequence in the genome of a mammal, plant, or bacteria.
- the target sequence is a sequence in the genome of a human.
- the 3' end of the target sequence is immediately adjacent to a PAM sequence (e.g. canonical PAM sequence NGG of SpCas9).
- the guide RNA is complementary to a sequence associated with a disease or disorder.
- Fusion proteins comprising a cytidine deaminase domain can be used for the targeted editing of nucleic acid sequences. Such fusion proteins are useful for targeted editing of DNA in vitro, e.g., for the generation of mutant cells or animals; for the introduction of targeted mutations, e.g., for the correction of genetic defects in cells ex vivo, e.g., in cells obtained from a subject that are subsequently re-introduced into the same or another subject; and for the introduction of targeted mutations, e.g., the correction of genetic defects or the introduction of deactivating mutations in disease-associated genes in a subject.
- methods comprising contacting a DNA molecule with a cytidine deaminase domain of a fusion protein provided herein with at least one gRNA as provided herein.
- the 3' end of the target sequence is not immediately adjacent to a PAM sequence.
- the 3' end of the target sequence is immediately adjacent to a non-canonical PAM sequence (i.e., other than the NGG of SpCas9), e.g., an AGC, GAG, TTT, GTG, or CAA sequence.
- the target DNA sequence comprises a sequence associated with a disease or disorder.
- the target DNA sequence comprises a point mutation associated with a disease or disorder.
- the activity of the cytidine deaminase domain, the cytidine deaminase fusion protein, or the complex results in a correction of the point mutation.
- the target DNA sequence comprises a T→C point mutation associated with a disease or disorder, and wherein the deamination of the mutant C base results in a sequence that is not associated with a disease or disorder.
- the target DNA sequence encodes a protein and wherein the point mutation is in a codon and results in a change in the amino acid encoded by the mutant codon as compared to the wild-type codon.
- the deamination of the mutant C results in a change of the amino acid encoded by the mutant codon.
- the deamination of the mutant C results in the codon encoding the wild-type amino acid.
- the contacting is in vivo in a subject. In some embodiments, the subject has or has been diagnosed with a disease or disorder.
- the fusion protein is used to introduce a point mutation into a nucleic acid by deaminating a target C residue.
- the deamination of the target nucleobase results in the correction of a genetic defect, e.g., in the correction of a point mutation that leads to a loss of function in a gene product.
- the genetic defect is associated with a disease or disorder.
- the methods provided herein are used to introduce a deactivating point mutation into a gene or allele that encodes a gene product that is associated with a disease or disorder.
- methods are provided herein that employ a DNA editing fusion protein provided herein to introduce a deactivating point mutation into an oncogene (e.g., in the treatment of a proliferative disease).
- a deactivating mutation may, in some embodiments, generate a premature stop codon in a coding sequence, which results in the expression of a truncated gene product, e.g., a truncated protein lacking the function of the full-length protein.
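How a single C-to-T change can generate a premature stop codon can be sketched with a scan over an in-frame coding sequence. This is a minimal illustration, assuming the sequence is given as the coding strand in frame and that only cytosines on that strand are considered; the function name and example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_creating_edits(cds: str):
    """List (position, original_codon, edited_codon) where one C->T change yields a stop codon."""
    cds = cds.upper()
    edits = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for codon_start in range(0, usable, 3):
        codon = cds[codon_start:codon_start + 3]
        if codon in STOP_CODONS:
            continue  # already a stop codon
        for offset, base in enumerate(codon):
            if base != "C":
                continue
            edited = codon[:offset] + "T" + codon[offset + 1:]
            if edited in STOP_CODONS:
                edits.append((codon_start + offset, codon, edited))
    return edits


if __name__ == "__main__":
    # Hypothetical coding sequence: ATG CAG GCC CGA TAA
    print(stop_creating_edits("ATGCAGGCCCGATAA"))
```

Deamination of a C on the opposite strand appears as a G-to-A change on the coding strand (for example TGG to TAG); that case is not covered by this sketch.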
- the purpose of the methods provided herein is to restore the function of a dysfunctional gene via genome editing.
- the Cas-cytidine deaminase fusion proteins provided herein can be validated for gene editing-based human therapeutics in vitro, e.g., by correcting a disease-associated mutation in human cell culture. It will be understood by the skilled artisan that the fusion proteins provided herein, e.g., the fusion proteins comprising a Cas9 domain and a nucleic acid deaminase domain, can be used to correct any single point T→C or A→G mutation. In the first case, deamination of the mutant C back to U corrects the mutation, and in the latter case, deamination of the C that is base-paired with the mutant G, followed by a round of replication, corrects the mutation.
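The two correction routes described above can be written out on a single reference strand. The following is a minimal sketch, assuming the final outcome after repair and replication is what matters: a C deaminated on the reference strand reads as T, and a C deaminated on the complementary strand appears as a G-to-A change on the reference strand. The function name and the example sequences are hypothetical.

```python
def edit_outcome(reference_strand: str, pos: int, deaminated_strand: str = "reference") -> str:
    """Return the reference-strand sequence after deamination and replication.

    deaminated_strand is "reference" when the C at pos is deaminated directly,
    or "complement" when the C paired with a G at pos is deaminated.
    """
    base = reference_strand[pos]
    if deaminated_strand == "reference":
        if base != "C":
            raise ValueError("no C at this position on the reference strand")
        new_base = "T"  # C -> U, read as T after replication
    elif deaminated_strand == "complement":
        if base != "G":
            raise ValueError("no C opposite this position (reference base is not G)")
        new_base = "A"  # opposite-strand C -> U pairs with A after replication
    else:
        raise ValueError("deaminated_strand must be 'reference' or 'complement'")
    return reference_strand[:pos] + new_base + reference_strand[pos + 1:]


if __name__ == "__main__":
    wild_type = "GATTACA"  # hypothetical wild-type sequence
    print(edit_outcome("GACTACA", 2) == wild_type)                # T->C mutation corrected
    print(edit_outcome("GATTGCA", 4, "complement") == wild_type)  # A->G mutation corrected
```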
- Some aspects of this disclosure provide a base-editing system comprising the fusion proteins and protein complexes of cytidine deaminase as disclosed herein. More specifically, the disclosure provides a base-editing system comprising: (1) a fusion protein or protein complex comprising an RGN and a cytidine deaminase domain as provided herein, and (2) a guide RNA that binds to the RNA-guided nuclease of the fusion protein or the protein complex.
- the fusion protein further comprises a UPP as disclosed herein.
- the present disclosure further provides a nucleic acid that encodes the components of the base-editing system as disclosed herein and genetic constructs comprising such nucleic acids.
- the genetic construct, such as a plasmid or expression vector, may comprise a nucleic acid that encodes the fusion protein or the protein complex of RNA-guided nuclease with cytidine deaminase and/or at least one gRNA targeting the nucleic acid of interest.
- the nucleic acid may be present in the cell as a functioning extrachromosomal molecule.
- the genetic construct may be a linear minichromosome including centromere, telomeres or plasmids or cosmids.
- the nucleic acid may also be part of a genome of a recombinant viral vector, including recombinant lentivirus, recombinant adenovirus, and recombinant adeno-associated virus.
- the nucleic acid may be part of the genetic material in attenuated live microorganisms or recombinant microbial vectors.
- the nucleic acid may comprise regulatory elements for gene expression of the coding sequences of the nucleic acid.
- the regulatory elements may be a promoter, an enhancer, an initiation codon, a stop codon, or a polyadenylation signal.
- the nucleic acid sequences may be a form of a vector.
- the vector may be capable of expressing the base-editing system as provided herein, in the cell of a mammal.
- the vector may be recombinant.
- the vector may comprise heterologous nucleic acid encoding the fusion protein or protein complex provided herein.
- the vector may be a plasmid.
- the vector may be useful for transfecting cells with nucleic acid encoding the fusion protein or the protein complex.
- Coding sequences of the fusion proteins and protein complexes provided herein may be optimized for stability and high levels of expression.
- the coding sequences can also be codon optimized for the expression in the target cells.
- coding sequences are codon optimized for expression in particular cells, such as eukaryotic cells.
- the eukaryotic cells may be those of or derived from a particular organism, such as a mammal, including but not limited to human, mouse, rat, rabbit, dog, or non-human primate.
- codon optimization refers to a process of modifying a nucleic acid sequence for enhanced expression in the host cells of interest by replacing at least one codon (e.g. about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more codons) of the native sequence with codons that are more frequently or most frequently used in the genes of that host cell while maintaining the native amino acid sequence.
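The codon replacement described above can be sketched as a lookup over synonymous codons. This is a minimal illustration, assuming the standard genetic code and a caller-supplied codon usage table; the usage values in the example are illustrative placeholders rather than measured frequencies, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds: str, usage: dict) -> str:
    """Swap each codon for the most-used synonymous codon, keeping the protein unchanged."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
        # Highest usage wins; on a tie the original codon is kept.
        out.append(max(synonyms, key=lambda c: (usage.get(c, 0.0), c == codon)))
    return "".join(out)


if __name__ == "__main__":
    # Placeholder usage values for illustration only.
    toy_usage = {"CTG": 0.40, "CTA": 0.07, "GCC": 0.40, "GCG": 0.11}
    native = "ATGCTAGCGTAA"
    optimized = codon_optimize(native, toy_usage)
    print(optimized, translate(native) == translate(optimized))
```

In practice a usage table for the intended host cell would be substituted for the placeholder dictionary.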
- Various species exhibit particular bias for certain codons of a particular amino acid.
- the vector may also comprise a promoter that is operably linked to the base-editing system coding sequence.
- the promoter operably linked to the base-editing system coding sequence may be a promoter from simian virus 40 (SV40), a mouse mammary tumor virus (MMTV) promoter, a human immunodeficiency virus (HIV) promoter such as the bovine immunodeficiency virus (BIV) long terminal repeat (LTR) promoter, a cytomegalovirus (CMV) promoter such as the CMV immediate early promoter, Epstein Barr virus (EBV) promoter, or a Rous sarcoma virus (RSV) promoter.
- the promoter may also be a promoter from a human gene.
- the promoter may also be a tissue specific promoter. Examples of such promoters are described in US2004/017572.
- the vector may also comprise an enhancer upstream of the components of the base-editing system. Examples of enhancers are described in US5,593,972, US5,962,428, and WO94/016737.
- the vector may also comprise a mammalian origin of replication in order to maintain the vector extrachromosomally and produce multiple copies of the vector in a cell.
- the vector may also comprise a regulatory sequence, which may be well suited for gene expression in a mammalian or human cell into which the vector is administered.
- the vector may also comprise a reporter gene, such as green fluorescent protein ("GFP") and/or a selectable marker.
- the disclosure provides methods comprising delivering one or more polynucleotides, such as one or more vectors as described herein, one or more transcripts thereof, and/or one or more proteins transcribed therefrom, to a host cell.
- the invention further provides cells produced by such methods, and organisms (such as animals, plants, or fungi) comprising or produced from such cells.
- the method is a method for editing a base of a nucleic acid.
- the method comprises the step of contacting a target region of a double-stranded nucleic acid (such as a DNA) with the fusion protein or protein complex provided herein and a guide RNA complementary to the target region, wherein the target region comprises a targeted nucleobase pair to be edited.
- the method for editing a base of a double-stranded nucleic acid comprises the steps of: a) contacting a target region of a double-stranded nucleic acid with a complex comprising a fusion protein or protein complex provided herein and a guide RNA, wherein the target region comprises a targeted nucleobase pair to be edited; and b) converting a first nucleobase of said target nucleobase pair in a single strand of the target region to a second nucleobase, wherein a third nucleobase complementary to the first nucleobase is replaced by a fourth nucleobase complementary to the second nucleobase; and the method results in less than 20% indel formation in the nucleic acid.
- the first nucleobase is a cytidine.
- the second nucleobase is a deaminated cytidine, or a uracil.
- the third nucleobase is a guanine.
- the fourth nucleobase is an adenine.
- the method results in less than 19%, 18%, 16%, 14%, 12%, 10%, 8%, 6%, 4%, 2%, 1%, 0.5%, 0.2%, or less than 0.1% indel formation.
- the method further comprises replacing the second nucleobase with a fifth nucleobase that is complementary to the fourth nucleobase, thereby generating an intended edited base pair (e.g., C:G→T:A).
- the fifth nucleobase is a thymine.
- At least 1% of the intended base pairs are edited. In some embodiments, at least 1%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, or 50% of the intended base pairs are edited.
- the ratio of intended products to unintended products in the target nucleotide is at least 2:1, 5:1, 10:1, 20:1, 30:1, 40:1, 50:1, 60:1, 70:1, 80:1, 90:1, 100:1, or 200:1, or more. In some embodiments, the ratio of intended point mutation to indel formation is greater than 1:1, 10:1, 50:1, 100:1, 500:1, or 1000:1, or more.
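The percentages and ratios recited above can be derived from sequencing read counts. This is a minimal sketch, assuming reads at the target site have already been classified into intended edits, unintended point products, and indels; the class, field names, and counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReadCounts:
    total: int             # all aligned reads covering the target site
    intended: int          # reads carrying the intended edit
    unintended_point: int  # reads with a different base change at the target nucleotide
    indel: int             # reads with an insertion or deletion


def ratio(numerator: int, denominator: int) -> float:
    """Ratio expressed as x:1; infinite when the denominator class was never observed."""
    return float("inf") if denominator == 0 else numerator / denominator


def summarize(counts: ReadCounts) -> dict:
    """Compute editing percentage, indel percentage, and the two ratios discussed above."""
    if counts.total <= 0:
        raise ValueError("total read count must be positive")
    return {
        "pct_intended": 100.0 * counts.intended / counts.total,
        "pct_indel": 100.0 * counts.indel / counts.total,
        "intended_to_unintended": ratio(counts.intended, counts.unintended_point),
        "intended_to_indel": ratio(counts.intended, counts.indel),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(summarize(ReadCounts(total=1000, intended=400, unintended_point=20, indel=10)))
```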
- the cut single strand (nicked strand) is hybridized to the guide nucleic acid. In some embodiments, the cut single strand is opposite to the strand comprising the first nucleobase.
- the base editor comprises a Cas domain, e.g. Cas9 domain.
- the first base is cytidine.
- the second base is not a G, C, A, or T.
- the second base is uracil.
- the fusion protein or protein complex provided herein inhibits base excision repair of the edited strand.
- the fusion protein or protein complex provided herein protects or binds the non-edited strand.
- the fusion protein or protein complex provided herein protects or binds the edited strand.
- the fusion protein or protein complex provided herein protects or binds the non- edited and edited strands.
- the intended edited base pair is upstream of a PAM site. In some embodiments, the intended edited base pair is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides upstream of the PAM site. In some embodiments, the intended edited base pair is downstream of a PAM site. In some embodiments, the intended edited base pair is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides downstream of the PAM site. In some embodiments, the method does not require a canonical spCas9 (e.g., NGG) PAM site.
- the target sequence comprises a target window, wherein the target window comprises the target nucleobase pair.
- the target window comprises 1-10 nucleotides.
- the target window is 1-9, 1-8, 1-7, 1-6, 1-5, 1-4, 1-3, 1-2, or 1 nucleotide in length.
- the target window is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides in length.
- the intended edited base pair is within the target window.
- the target window comprises the intended edited base pair.
- the method is performed using any of the base editors provided herein.
- a target window is a deamination window
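Whether a cytosine falls inside a target (deamination) window can be checked by counting positions upstream of the PAM. This is a minimal sketch, assuming the protospacer is written 5' to 3' with the PAM immediately after its last base, and that distance 1 denotes the base directly 5' of the PAM; the window bounds and sequence in the example are hypothetical and do not describe any particular editor.

```python
def cytosines_in_window(protospacer: str, min_dist: int, max_dist: int):
    """Return (index, distance_upstream_of_PAM) for each C inside the window.

    index is 0-based from the 5' end of the protospacer; distance is counted
    from the PAM, so the last protospacer base has distance 1.
    """
    if not 1 <= min_dist <= max_dist:
        raise ValueError("window bounds must satisfy 1 <= min_dist <= max_dist")
    protospacer = protospacer.upper()
    length = len(protospacer)
    hits = []
    for index, base in enumerate(protospacer):
        distance = length - index
        if base == "C" and min_dist <= distance <= max_dist:
            hits.append((index, distance))
    return hits


if __name__ == "__main__":
    # Hypothetical 20-nt protospacer and a hypothetical 5-nt window.
    print(cytosines_in_window("ACGTCCGTACGTACGTACGA", 13, 17))
```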
- the disclosure provides methods for editing a nucleotide.
- the disclosure provides a method for editing a nucleobase pair of a double-stranded DNA sequence.
- the method comprises a) contacting a target region of the double-stranded DNA sequence with a complex comprising a fusion protein or protein complex provided herein and a guide nucleic acid (e.g., gRNA), where the target region comprises a target nucleobase pair; and b) converting a first nucleobase of said target nucleobase pair in a single strand of the target region to a second nucleobase, wherein a third nucleobase complementary to the first nucleobase is replaced by a fourth nucleobase complementary to the second nucleobase, and the second nucleobase is replaced with a fifth nucleobase that is complementary to the fourth nucleobase, thereby generating an intended edited base pair, wherein the efficiency of generating the intended edited base pair
- the method causes less than 19%, 18%, 16%, 14%, 12%, 10%, 8%, 6%, 4%, 2%, 1%, 0.5%, 0.2%, or less than 0.1% indel formation.
- the ratio of intended product to unintended products at the target nucleotide is at least 2:1, 5:1, 10:1, 20:1, 30:1, 40:1, 50:1, 60:1, 70:1, 80:1, 90:1, 100:1, or 200:1, or more.
- the ratio of intended point mutation to indel formation is greater than 1:1, 10:1, 50:1, 100:1, 500:1, or 1000:1, or more.
- the cut single strand is hybridized to the guide nucleic acid. In some embodiments, the cut single strand is opposite to the strand comprising the first nucleobase. In some embodiments, the first base is cytidine. In some embodiments, the second nucleobase is not G, C, A, or T. In some embodiments, the second base is uracil. In some embodiments, the base editor inhibits base excision repair of the edited strand. In some embodiments, the base editor protects or binds the non-edited strand.
- the nucleobase editor comprises a UPP. In some embodiments, the nucleobase editor comprises nickase activity. In some embodiments, the intended edited base pair is upstream of a PAM site. In some embodiments, the intended edited base pair is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides upstream of the PAM site. In some embodiments, the intended edited base pair is downstream of a PAM site. In some embodiments, the intended edited base pair is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides downstream of the PAM site.
- the method does not require a PAM site.
- the nucleobase editor comprises a linker.
- the linker is 1-25 amino acids in length. In some embodiments, the linker is 5-20 amino acids in length. In some embodiments, the linker is 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids in length.
- the target region comprises a target window, wherein the target window comprises the target nucleobase pair. In some embodiments, the target window comprises 1-10 nucleotides. In some embodiments, the target window is 1-9, 1-8, 1-7, 1-6, 1-5, 1-4, 1-3, 1-2, or 1 nucleotide in length.
- the target window is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides in length.
- the intended edited base pair occurs within the target window.
- the target window comprises the intended edited base pair.
- the nucleobase editor is any one of the base editors provided herein.
- the instant disclosure provides methods for the treatment of diseases or disorders, e.g., diseases or disorders that are associated or caused by a point mutation that can be corrected by cytidine deaminase gene editing.
- Some such diseases are described herein, and additional suitable diseases that can be treated with the strategies and fusion proteins provided herein will be apparent to those of skill in the art based on the instant disclosure.
- Exemplary suitable diseases and disorders are listed below. It will be understood that the numbering of the specific positions or residues in the respective sequences depends on the particular protein and numbering scheme used. Numbering might be different, e.g., in precursors of a mature protein and the mature protein itself, and differences in sequences from species to species may affect numbering.
- Suitable diseases and disorders include, without limitation, cystic fibrosis (see, e.g., Schwank et al., Functional repair of CFTR by CRISPR/Cas9 in intestinal stem cell organoids of cystic fibrosis patients. Cell Stem Cell. 2013; 13: 653-658; and Wu et al., Correction of a genetic disease in mouse via use of CRISPR-Cas9. Cell Stem Cell. 2013; 13: 659-662).
Pharmaceutical compositions
- the composition of the present invention may be in a pharmaceutical composition.
- the pharmaceutical composition may comprise about 1 ng to about 10 mg of DNA encoding the CRISPR/Cas9-based system or CRISPR/Cas9-based system protein component, i.e., the fusion protein.
- the pharmaceutical composition may comprise about 1 ng to about 10 mg of the DNA of the modified lentiviral vector.
- the pharmaceutical composition may comprise about 1 ng to about 10 mg of the DNA of the modified AAV vector and a nucleotide sequence encoding the site-specific nuclease.
- the pharmaceutical compositions according to the present invention can be formulated according to the mode of administration to be used.
- where the compositions are injectable pharmaceutical compositions, they are sterile, pyrogen free and particulate free.
- An isotonic formulation is preferably used.
- additives for isotonicity may include sodium chloride, dextrose, mannitol, sorbitol and lactose.
- isotonic solutions such as phosphate buffered saline are preferred.
- Stabilizers include gelatin and albumin.
- a vasoconstriction agent is added to the formulation.
- the composition may further comprise a pharmaceutically acceptable excipient.
- the pharmaceutically acceptable excipient may be functional molecules as vehicles, adjuvants, carriers, or diluents.
- the pharmaceutically acceptable excipient may be a transfection facilitating agent, which may include surface active agents, such as immune-stimulating complexes (ISCOMS), Freund's incomplete adjuvant, LPS analog including monophosphoryl lipid A, muramyl peptides, quinone analogs, vesicles such as squalene and squalane, hyaluronic acid, lipids, liposomes, calcium ions, viral proteins, polyanions, polycations, or nanoparticles, or other known transfection facilitating agents.
- the transfection facilitating agent can be a polyanion, polycation, including poly-L-glutamate (LGS), or lipid.
- the transfection facilitating agent is poly-L-glutamate, and more preferably, the poly-L-glutamate is present in the composition for genome editing in skeletal muscle or cardiac muscle at a concentration less than 6 mg/ml.
- the transfection facilitating agent may also include surface active agents such as immune-stimulating complexes (ISCOMS), Freund's incomplete adjuvant, LPS analog including monophosphoryl lipid A, muramyl peptides, quinone analogs and vesicles such as squalene and squalane, and hyaluronic acid may also be administered in conjunction with the genetic construct.
- the DNA vector encoding the composition may also include a transfection facilitating agent such as lipids, liposomes (including lecithin liposomes or other liposomes known in the art) as a DNA-liposome mixture (see, for example, WO9324640), calcium ions, viral proteins, polyanions, polycations, nanoparticles, or other known transfection facilitating agents.
- the transfection facilitating agent is a polyanion, polycation, including poly-L-glutamate (LGS), or lipid.
- Clement K, Rees H, Canver MC, Gehrke JM, Farouni R, Hsu JY, Cole MA, Liu DR, Joung JK, Bauer DE, Pinello L. CRISPResso2 provides accurate and rapid genome editing sequence analysis. Nat Biotechnol. 2019 Mar;37(3):224-226. doi: 10.1038/s41587-019-0032-3. PubMed PMID: 30809026.
- Table 5 summarizes base editing results for both target sequences, and Tables 6, 7, 8 and 9 show the editing rate of cytidine bases for the CBE07, CBE08, CBE10, CBE11, CBE12, and CBE13 deaminases and the rate of targeted cytosine deamination for the wildtype Nme2Cas9 targeted to the same region.
- Active cytosine base editing was defined as greater than the median INDEL formation of the novel CBE under investigation and greater than 5x C>T SNP base editing at target cytosines compared to the average C>T rate of the regular Nme2Cas9 within the target region.
- Increased INDEL formation indicates an active cytosine base editor because the deaminase-RGN fusion protein consists of an RGN that has an inactive RuvC domain. With only a single active nuclease domain, the RGN functions only as a nickase and does not generate detectable INDEL formation by itself. When it is fused with an active deaminase that acts on the opposite strand, a cytosine is converted into a uracil. The uracil is rapidly removed from the DNA, leaving an abasic site, and eventually a gap, on the strand opposite the strand nicked by the RGN. A gap opposite a nick amounts to a double-strand break, and its repair can produce INDELs.
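The activity criterion described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis actually used for the tables: it assumes that "median INDEL formation" means the median INDEL rate across the panel of CBEs being compared, that the control baseline is the mean C>T rate of the nuclease-only control across the target region, and that exceeding the 5x threshold at any one target cytosine is sufficient. The function name, parameter names, and input layout are hypothetical.

```python
from statistics import mean, median


def is_active_cbe(candidate_indel, panel_indels, candidate_c_to_t,
                  control_c_to_t, fold=5.0):
    """Apply the two-part activity criterion under the assumptions above.

    candidate_indel  : INDEL rate (%) of the CBE under test
    panel_indels     : INDEL rates (%) of all CBEs being compared (assumed panel)
    candidate_c_to_t : {cytosine position: C>T rate (%)} for the CBE under test
    control_c_to_t   : {cytosine position: C>T rate (%)} for the control nuclease
    fold             : required fold increase over the control baseline
    """
    # Part 1: INDEL rate must exceed the median INDEL rate of the panel.
    indel_pass = candidate_indel > median(panel_indels)

    # Part 2: at least one target cytosine must exceed `fold` times the
    # average C>T rate of the control within the target region.
    baseline = mean(list(control_c_to_t.values()))
    edit_pass = any(rate > fold * baseline
                    for rate in candidate_c_to_t.values())

    return indel_pass and edit_pass


if __name__ == "__main__":
    # Hypothetical numbers for illustration only; not taken from the tables.
    control = {8: 0.1, 12: 0.2, 14: 0.3}
    candidate = {8: 12.0, 12: 0.5, 14: 0.4}
    print(is_active_cbe(4.0, [0.5, 1.0, 4.0], candidate, control))
```

Requiring every target cytosine to pass, instead of any one, would be an equally defensible reading; swapping `any` for `all` gives that stricter variant.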
- Each CBE construct has a different editing window, or region of the target sequence that the cytosine deaminase acts upon. This is driven by two factors: 1) the steric properties of the direct fusion and how accessible the exposed single-stranded DNA is, and 2) the cytosine recognition preferences of the deaminase itself. Successful cytosine editing will only occur when there is a cytosine with a preferred sequence motif located in the preferred editing window.
- TC is a common dinucleotide preference for existing cytosine base editors, and we see successful editing on EGsG033 (ctggtggccarCtCcactgctagta) (SEQ ID NO: 39) at the C12 and C14 positions with APOBEC3A, but hardly any editing at position C8 (GC) or C15 (CC), as these are not the preferred motif. However, with the novel cytosine deaminases identified, the preferred editing occurs at C8 (GC).
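The interplay of editing window and dinucleotide preference described above can be illustrated with a short Python sketch. It assumes positions are counted 1-based from the 5' end of the given sequence and that a cytosine's context is the dinucleotide formed with its 5' neighbor. The function names and the toy sequence are hypothetical; the toy sequence is not SEQ ID NO: 39.

```python
def cytosine_contexts(seq):
    """Return (position, dinucleotide) for every C that has a 5' neighbor.

    Positions are 1-based from the 5' end (an assumption of this sketch).
    A C at position 1 has no 5' neighbor and is skipped.
    """
    seq = seq.upper()
    return [(i + 1, seq[i - 1:i + 1])
            for i in range(1, len(seq))
            if seq[i] == "C"]


def predicted_edit_sites(seq, preferred_motifs, window):
    """List C positions that lie in the editing window AND sit in a
    preferred dinucleotide context.

    preferred_motifs : set of dinucleotides such as {"TC"} or {"GC"}
    window           : (start, end), inclusive, 1-based
    """
    start, end = window
    return [pos for pos, context in cytosine_contexts(seq)
            if start <= pos <= end and context in preferred_motifs]


if __name__ == "__main__":
    toy = "AGCATCTCCA"  # hypothetical sequence for illustration
    print(cytosine_contexts(toy))
    print(predicted_edit_sites(toy, {"TC"}, (1, 10)))
    print(predicted_edit_sites(toy, {"GC"}, (1, 10)))
```

For the toy sequence, a TC-preferring deaminase with a window spanning the whole sequence is predicted to act at positions 6 and 8, while a GC-preferring deaminase is predicted to act at position 3 only. Narrowing the window removes sites even when the motif matches.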
- Example 4 Demonstration of base editing activity of fusion proteins comprising a UPP on endogenous targets in mammalian cells
- the coding sequence of the identified UPP is codon-optimized for expression in mammalian cells and introduced into the expression cassette, which produces a fusion protein that includes a 3xFLAG tag at its N-terminal end; the 3xFLAG tag is operably linked to an NLS at its C-terminal end, and the NLS is operably linked to the codon-optimized deaminase sequences at its C-terminal end.
- the putative deaminases are operably linked to a flexible amino acid linker at their C-terminal end, and the amino acid linker is operably linked at its C-terminal end to a known active RNA-guided nuclease that has been mutated to have an inactive RuvC domain (nNme2Cas9_D16A); that is, it has been mutated into an RGN that acts as a nickase.
- the RNA-guided DNA binding polypeptide is operably linked to a flexible amino acid linker at its C-terminal end, and the amino acid linker is operably linked to the putative uracil protecting peptide.
- genomic DNA is harvested from the transfected cells, and the DNA is sequenced and analyzed for the presence of targeted cytosine base editing mutations using CRISPResso2 (Clement K, et al. Nat Biotechnol. 2019; 37(3):224-226).
- Tables 11 and 12 show the editing rate of cytidine bases for UPP12 (SEQ ID NO: 43) and UPP14 (SEQ ID NO: 45) and the rate for targeted cytosine deamination for the control deaminase-RGN targeted to the same region.
- Active cytosine base editing was defined as a reduction in INDEL formation of the novel UPP under investigation, increase of C>D SNP base editing along the targeted window, and >85% C>T SNP base editing at highly mutated cytosines compared to the deaminase-RGN without a UPP within the target region.
- Decreased INDEL formation indicates an active UPP because the deaminase-RGN-UPP fusion protein consists of an RGN that has an inactive RuvC domain. With only a single active nuclease domain, the RGN functions only as a nickase and does not generate detectable INDEL formation by itself. When it is fused with an active deaminase that acts on the opposite strand, a cytosine is converted into a uracil. The uracil is rapidly removed from the DNA, leaving an abasic site, and eventually a gap, on the strand opposite the strand nicked by the RGN. A UPP that protects the uracil from removal is therefore expected to lower INDEL formation relative to the same deaminase-RGN fusion without a UPP.
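The three-part UPP criterion described above can be sketched as follows. The criterion as worded leaves room for interpretation, so this sketch makes explicit assumptions: "reduction in INDEL formation" is read as a lower INDEL rate than the deaminase-RGN control, "increase of C>D" as a higher summed C>D rate (D meaning A, G, or T) across the window, and ">85% C>T" as C>T accounting for more than 85% of C>D outcomes at each highly mutated cytosine. The cutoff defining "highly mutated" is not given in the text and is a hypothetical parameter, as are all names used here.

```python
from dataclasses import dataclass


@dataclass
class SiteOutcome:
    """Per-cytosine outcome rates, each as % of reads (assumed layout)."""
    c_to_t: float
    c_to_a: float
    c_to_g: float

    @property
    def c_to_d(self):
        # D = A, G, or T: every substitution away from C.
        return self.c_to_t + self.c_to_a + self.c_to_g


def is_active_upp(upp_indel, control_indel, upp_sites, control_sites,
                  high_mutation_cutoff=10.0, purity_cutoff=0.85):
    """Apply the three-part criterion under the assumptions stated above.

    upp_sites / control_sites : {cytosine position: SiteOutcome}
    high_mutation_cutoff      : hypothetical C>D rate (%) defining a
                                "highly mutated" cytosine
    """
    # 1) INDEL formation must drop relative to the control without a UPP.
    if not upp_indel < control_indel:
        return False

    # 2) Total C>D editing across the window must increase.
    upp_total = sum(site.c_to_d for site in upp_sites.values())
    control_total = sum(site.c_to_d for site in control_sites.values())
    if not upp_total > control_total:
        return False

    # 3) At each highly mutated cytosine, C>T must exceed the purity cutoff.
    hot = [site for site in upp_sites.values()
           if site.c_to_d >= high_mutation_cutoff and site.c_to_d > 0]
    return bool(hot) and all(site.c_to_t / site.c_to_d > purity_cutoff
                             for site in hot)
```

Returning `False` when no cytosine reaches the cutoff is a design choice of this sketch: without a highly mutated site, the third condition cannot be evaluated.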
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Engineering & Computer Science (AREA)
- Genetics & Genomics (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Medicinal Chemistry (AREA)
- General Engineering & Computer Science (AREA)
- Biochemistry (AREA)
- Biomedical Technology (AREA)
- Microbiology (AREA)
- Biotechnology (AREA)
- Pulmonology (AREA)
- Chemical Kinetics & Catalysis (AREA)
- General Chemical & Material Sciences (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Pharmacology & Pharmacy (AREA)
- Animal Behavior & Ethology (AREA)
- Public Health (AREA)
- Veterinary Medicine (AREA)
- Enzymes And Modification Thereof (AREA)
- Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
- Peptides Or Proteins (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202263346065P | 2022-05-26 | 2022-05-26 | |
| PCT/EP2023/063933 WO2023227669A2 (en) | 2022-05-26 | 2023-05-24 | Novel nucleic acid-editing proteins |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| EP4532704A2 (en) | 2025-04-09 |
Family
ID=86760329
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP23729980.5A Withdrawn EP4532704A2 (en) | 2022-05-26 | 2023-05-24 | Novel nucleic acid-editing proteins |
Country Status (3)
| Country | Link |
|---|---|
| EP (1) | EP4532704A2 (en) |
| JP (1) | JP2025517515A (en) |
| WO (1) | WO2023227669A2 (en) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2020051562A2 (en) | 2018-09-07 | 2020-03-12 | Beam Therapeutics Inc. | Compositions and methods for improving base editing |
Family Cites Families (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP0646178A1 (en) | 1992-06-04 | 1995-04-05 | The Regents Of The University Of California | expression cassette with regulatory regions functional in the mammalian host |
| WO1994016737A1 (en) | 1993-01-26 | 1994-08-04 | Weiner David B | Compositions and methods for delivery of genetic material |
| US5593972A (en) | 1993-01-26 | 1997-01-14 | The Wistar Institute | Genetic immunization |
| US5962428A (en) | 1995-03-30 | 1999-10-05 | Apollon, Inc. | Compositions and methods for delivery of genetic material |
| US6768550B2 (en) | 2002-07-26 | 2004-07-27 | Proterion Corporation | Beam shifting surface plasmon resonance system and method |
| IL258821B (en) | 2015-10-23 | 2022-07-01 | Harvard College | Nucleobase editors and uses thereof |
| CA3116555A1 (en) * | 2018-10-15 | 2020-04-23 | University Of Massachusetts | Programmable dna base editing by nme2cas9-deaminase fusion proteins |
| GB202010348D0 (en) * | 2020-07-06 | 2020-08-19 | Univ Wageningen | Base editing tools |
| EP4182454A1 (en) * | 2020-07-15 | 2023-05-24 | Lifeedit Therapeutics, Inc. | Uracil stabilizing proteins and active fragments and variants thereof and methods of use |
-
2023
- 2023-05-24 EP EP23729980.5A patent/EP4532704A2/en not_active Withdrawn
- 2023-05-24 WO PCT/EP2023/063933 patent/WO2023227669A2/en not_active Ceased
- 2023-05-24 JP JP2024569521A patent/JP2025517515A/en active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| WO2023227669A2 (en) | 2023-11-30 |
| WO2023227669A3 (en) | 2024-02-22 |
| JP2025517515A (en) | 2025-06-05 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| AU2021231074C1 (en) | Class II, type V CRISPR systems | |
| US12215365B2 (en) | Cas variants for gene editing | |
| KR102852347B1 (en) | Method for substituting pathogenic amino acids using a programmable base editor system | |
| US20240173430A1 (en) | Base editing for treating hutchinson-gilford progeria syndrome | |
| US10519454B2 (en) | Genome editing using Campylobacter jejuni CRISPR/CAS system-derived RGEN | |
| JP2022500017A (en) | Compositions and Methods for Delivering Nucleobase Editing Systems | |
| EP4028026A1 (en) | Novel nucleobase editors and methods of using same | |
| KR20250107288A (en) | Uses of adenosine base editors | |
| CN120400115A (en) | Nucleobase editor with reduced off-target deamination reaction and method for modifying nucleobase target sequence using the same | |
| CA3100014A1 (en) | Methods of suppressing pathogenic mutations using programmable base editor systems | |
| CN117925585A (en) | Adenosine deaminase, base editor fusion protein, base editor system and use thereof | |
| JP2013520190A (en) | Use of endonuclease for transgene insertion into the Safe Harbor locus | |
| KR20160050069A (en) | Cas9 variants and uses thereof | |
| WO2003089618A2 (en) | Transposon system and methods of use | |
| US12129478B1 (en) | Engineered adenosine deaminases and base editors thereof | |
| EP4532704A2 (en) | Novel nucleic acid-editing proteins | |
| US12312619B2 (en) | Deaminases and variants thereof for use in base editing | |
| KR20190122596A (en) | Gene Construct for Base Editing, Vector Comprising the Same and Method for Base Editing Using the Same | |
| CN118147120A (en) | Cytidine deaminase and base editor | |
| KR20240141024A (en) | Gene editing system using E1347A DddAtox mutant-TnpB fusion protein | |
| CN119162157B (en) | Deaminases and their variants for base editing | |
| WO2025003358A2 (en) | Novel nucleic acid targeting systems comprising rna-guided nucleases | |
| WO2024038168A1 (en) | Novel rna-guided nucleases and nucleic acid targeting systems comprising such | |
| CN120665840A (en) | Cas9 protein mutant and application thereof | |
| WO2024235991A1 (en) | Rna-guided nucleases and nucleic acid targeting systems comprising such rna-guided nucleases |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: UNKNOWN |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
| | PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
| | 17P | Request for examination filed | Effective date: 20250102 |
| | AK | Designated contracting states | Kind code of ref document: A2 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC ME MK MT NL NO PL PT RO RS SE SI SK SM TR |
| | DAV | Request for validation of the european patent (deleted) | |
| | DAX | Request for extension of the european patent (deleted) | |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
| | 18D | Application deemed to be withdrawn | Effective date: 20250715 |