
WO2025049303A1 - Compositions and methods for epigenetic regulation - Google Patents

Compositions and methods for epigenetic regulation

Info

Publication number
WO2025049303A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
acid molecule
domain
epigenetic
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2024/043629
Other languages
English (en)
Inventor
Naveen Cherukupalli REDDY
Vic MYER
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nchroma Bio Inc
Original Assignee
Nchroma Bio Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nchroma Bio Inc
Publication of WO2025049303A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85: Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • C07K14/4701: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used
    • C07K14/4702: Regulators; Modulating activity
    • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113: Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C12N15/86: Viral vectors
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10: Transferases (2.)
    • C12N9/1003: Transferases (2.) transferring one-carbon groups (2.1)
    • C12N9/1007: Methyltransferases (general) (2.1.1.)
    • C12N9/14: Hydrolases (3)
    • C12N9/16: Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22: Ribonucleases [RNase]; Deoxyribonucleases [DNase]
    • C12Y: ENZYMES
    • C12Y304/00: Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/21: Serine endopeptidases (3.4.21)
    • C12Y304/21061: Kexin (3.4.21.61), i.e. proprotein convertase subtilisin/kexin type 9
    • C07K2319/00: Fusion polypeptide
    • C07K2319/70: Fusion polypeptide containing domain for protein-protein interaction
    • C07K2319/80: Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor
    • C07K2319/90: Fusion polypeptide containing a motif for post-translational modification
    • C12N2310/00: Structure or type of the nucleic acid
    • C12N2310/10: Type of nucleic acid
    • C12N2310/20: Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
    • C12N2750/00: ssDNA viruses
    • C12N2750/00011: Details
    • C12N2750/14011: Parvoviridae
    • C12N2750/14111: Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141: Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143: Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • C12N2830/00: Vector systems having a special element relevant for transcription
    • C12N2830/001: Vector systems having a special element relevant for transcription controllable enhancer/promoter combination
    • C12N2830/48: Vector systems having a special element relevant for transcription regulating transport or export of RNA, e.g. RRE, PRE, WPRE, CTE
    • C12N2830/50: Vector systems having a special element relevant for transcription regulating RNA stability, not being an intron, e.g. poly A signal
    • C12Y201/00: Transferases transferring one-carbon groups (2.1)
    • C12Y201/01: Methyltransferases (2.1.1)
    • C12Y201/01037: DNA (cytosine-5-)-methyltransferase (2.1.1.37)

Definitions

  • Genome editing has been considered a promising therapeutic approach for the treatment of genetic disease for over a decade.
  • manipulation on the DNA level remains risky, given the potential for undesired double stranded breaks, heterogenous repair including large and small insertions and deletions at the intended site, and toxicity.
  • Some aspects of the present disclosure provide systems, compositions, strategies, and methods for epigenetic regulation.
  • nucleic acid molecule comprising a regulatory element and an epigenetic reducing construct, wherein the regulatory element comprises a promoter and a target sequence, wherein the epigenetic reducing construct encodes a fusion protein, and wherein the fusion protein comprises a DNA-binding domain that is capable of binding to the target sequence, upon which the fusion protein reduces an expression level of the epigenetic reducing construct itself.
  • the target sequence is located upstream of the promoter. In some embodiments, the target sequence is located downstream of the promoter. In some embodiments, the target sequence is located inside the promoter. In some embodiments, the nucleic acid molecule further comprises a CpG island (CGI). In some embodiments, the CGI is located upstream or downstream of the promoter. In some embodiments, the target sequence is located inside the CGI. In some embodiments, the CGI is 100-200 bp. In some embodiments, the CGI is less than 100 bp. In some embodiments, the CGI is more than 200 bp. In some embodiments, the nucleic acid molecule comprises more than one target sequence. In some embodiments, the target sequence is homologous to the promoter. In some embodiments, the target sequence is heterologous to the promoter.
  • CGI: CpG island
  • the promoter is selected from the group consisting of a cytomegalovirus (CMV) promoter, a synapsin gene promoter, a Clathrin Light Chain A (CLTA) gene promoter, or a Proprotein convertase subtilisin/kexin type 9 (PCSK9) gene promoter.
  • CMV: cytomegalovirus
  • CLTA: Clathrin Light Chain A
  • PCSK9: Proprotein convertase subtilisin/kexin type 9
  • the nucleic acid molecule further comprises an enhancer.
  • the enhancer comprises a cytomegalovirus (CMV) enhancer.
  • the DNA-binding domain comprises a zinc finger motif or a zinc finger array. In some embodiments, the DNA-binding domain comprises a nucleic acid-guided DNA-binding domain. In some embodiments, the DNA-binding domain comprises an RNA-guided DNA-binding domain. In some embodiments, the DNA-binding domain comprises a DNA-binding domain of a CRISPR-Cas protein. In some embodiments, the CRISPR-Cas protein comprises a nuclease inactive Cas9 (dCas9), a nuclease inactive Cas12a (dCas12a), or a nuclease inactive CasX (dCasX).
  • dCas9: nuclease inactive Cas9
  • dCas12a: nuclease inactive Cas12a
  • dCasX: nuclease inactive CasX
  • the fusion protein further comprises a DNA methyltransferase (DNMT) domain.
  • the DNMT domain comprises a DNMT3A domain.
  • the DNMT3A domain is a human DNMT3A domain or a mouse DNMT3A domain.
  • the DNMT domain comprises a DNMT3L domain.
  • the DNMT3L domain is a human DNMT3L domain or a mouse DNMT3L domain.
  • the fusion protein further comprises a transcriptional repressor domain.
  • the transcriptional repressor domain is a KRAB domain.
  • the fusion protein further comprises a nuclear localization sequence.
  • plasmid or a vector comprising the nucleic acid molecule of any one of the embodiments herein.
  • AAV: Adeno-associated virus
  • a cell comprising the nucleic acid molecule of any one of the preceding embodiments, or a plasmid or a vector of the embodiment herein.
  • the cell is a non-dividing cell.
  • the cell is a neuron.
  • nucleic acid molecule comprising a regulatory element, wherein the regulatory element comprises a promoter, a target sequence and a methylation site.
  • the methylation site is located upstream of the promoter. In some embodiments, the methylation site is located downstream of the promoter. In some embodiments, the methylation site is located inside the promoter. In some embodiments, the target sequence is located upstream of the promoter. In some embodiments, the target sequence is located downstream of the promoter. In some embodiments, the target sequence is located inside the promoter. In some embodiments, the target sequence is located inside the methylation site. In some embodiments, the methylation site comprises a CpG island (CGI). In some embodiments, the CGI is 100-200 bp. In some embodiments, the CGI is less than 100 bp. In some embodiments, the CGI is more than 200 bp.
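To make the CGI length ranges above concrete, the following Python sketch reports the length, GC fraction, and observed/expected CpG ratio of a candidate sequence and assigns it to one of the three length ranges named in the text. The function name `cgi_metrics`, the bin labels, and the toy input are assumptions for illustration only; the disclosure does not state how a CGI is scored or selected.

```python
def cgi_metrics(seq: str) -> dict:
    """Summarize a candidate CpG island: length, GC fraction, CpG obs/exp ratio."""
    s = seq.upper()
    n = len(s)
    if n == 0:
        raise ValueError("sequence must not be empty")
    c_count = s.count("C")
    g_count = s.count("G")
    cpg_count = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c_count + g_count) / n
    # Observed/expected CpG ratio: (CpG count * length) / (C count * G count)
    if c_count and g_count:
        obs_exp = cpg_count * n / (c_count * g_count)
    else:
        obs_exp = 0.0
    # Hypothetical labels for the three length ranges mentioned in the text
    if n < 100:
        size_bin = "<100 bp"
    elif n <= 200:
        size_bin = "100-200 bp"
    else:
        size_bin = ">200 bp"
    return {
        "length": n,
        "gc_fraction": gc_fraction,
        "cpg_obs_exp": obs_exp,
        "size_bin": size_bin,
    }


# Toy 120-nt input made up for this example; it is not a sequence from the disclosure.
example = cgi_metrics("CGCGCGCGAT" * 12)
print(example)
```

The observed/expected ratio is a commonly used descriptive statistic for CpG-rich regions; any threshold applied to it would be a separate design choice that the text does not specify.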
  • the nucleic acid molecule comprises more than one target sequence.
  • the target sequence, the methylation site, or both are heterologous to the promoter.
  • the promoter is selected from the group consisting of a cytomegalovirus (CMV) promoter, a synapsin gene promoter, a Clathrin Light Chain A (CLTA) gene promoter, or a Proprotein convertase subtilisin/kexin type 9 (PCSK9) gene promoter.
  • the nucleic acid molecule further comprises an enhancer.
  • the enhancer comprises a cytomegalovirus (CMV) enhancer.
  • plasmid or a vector comprising the nucleic acid molecule of any one of the embodiments herein.
  • a cell comprising the nucleic acid molecule of any one of the embodiments herein, or the plasmid or vector of the embodiment herein.
  • the cell is a non-dividing cell. In some embodiments, the cell is a neuron.
  • a method of modifying an epigenetic state of a target gene in a mammalian cell comprising contacting the cell with the nucleic acid molecule of any one of the embodiments herein, the plasmid or vector of the embodiment herein or the AAV vector of the embodiment herein.
  • a method of reducing expression of a target gene in a mammalian cell comprising contacting the cell with the nucleic acid molecule of any one of the embodiments herein, the plasmid or vector of the embodiment herein or the AAV vector of the embodiment herein.
  • the target gene is PCSK9, CLTA, HTT or B2M.
  • the mammalian cell is located in the central nervous system. In some embodiments, the mammalian cell is a neuron.
  • In one aspect, disclosed herein is a method of treatment, the method comprising administering to a subject in need thereof a nucleic acid molecule of an embodiment disclosed herein, a plasmid or vector of an embodiment disclosed herein, or an AAV vector of an embodiment disclosed herein. In some embodiments, the method comprises contacting the nucleic acid molecule, the plasmid, the vector, or the AAV vector to a neuron in the subject.
  • FIG. 1A shows a schematic illustration of a SLiC containing CRISPRoff.
  • FIG. 1B shows schematic illustrations of a SLiC containing ZFoff.
  • FIG. 2A shows epigenetic silencing of a SLiC with a downstream CLTA CpG island (CGI) by ZFoff using GFP as a marker.
  • FIG. 2B shows epigenetic silencing of a SLiC with an upstream CLTA CGI by ZFoff using GFP as a marker.
  • FIG. 3A shows epigenetic silencing by CRISPRoff of variant SLiCs with a downstream CLTA CGI using GFP as a marker.
  • FIG. 3B shows epigenetic silencing by CRISPRoff of variant SLiCs with an upstream CLTA CGI using GFP as a marker.
  • FIG. 3C shows epigenetic silencing by CRISPRoff of variant SLiCs with a downstream CLTA CGI and various CLTA target (TAR) positions, using GFP as a marker.
  • FIG. 3D shows epigenetic silencing by CRISPRoff of variant SLiCs with an upstream CLTA CGI and various CLTA TAR positions, using GFP as a marker.
  • FIG. 4 shows epigenetic silencing of variant SLiCs with variant CRISPR targeting domains (e.g., HTT(799), B2M, CIITA, CLTA) by CRISPRoff.
  • FIG. 5A shows epigenetic silencing by ZFoff of variant SLiCs with an upstream CLTA CGI.
  • FIG. 5B shows epigenetic silencing by ZFoff of variant SLiCs with an upstream CD151 CGI.
  • FIG. 6A shows methylation profiles after treatment with SLiC with a control construct, SLiC CRISPRi, or CRISPRoff possessing an upstream CGI.
  • FIG. 6B shows methylation profiles after treatment with a control construct, SLiC ZFi, or ZFoff possessing an upstream CGI.
  • FIG. 6C shows methylation profiles after treatment with a control construct, SLiC CRISPRi or CRISPRoff possessing a downstream CGI.
  • FIG. 6D shows methylation profiles after treatment with a control construct, SLiC ZFi, or ZFoff possessing a downstream CGI.
  • FIG. 7A shows simultaneous epigenetic silencing of SLiC and endogenous GFP by ZFoff using GFP as a marker for endogenous protein expression and mCherry as a marker for SLiC protein expression (FIG. 7B).
  • FIG. 8 shows the percentage of CLTA protein levels measured at various time points up to 28 days after treatment with SLiC ZFoff.
  • FIG. 9A shows methylation profiles after treatment with a control construct, SLiC ZFi, or ZFoff possessing a downstream CGI.
  • FIG. 9B shows percentage of methylation after treatment with a control construct, SLiC ZFi, or ZFoff possessing a downstream CGI.
  • FIG. 10 shows epigenetic silencing of CD151 by ZFoff optimized for AAV (ZFoff-ADD).
  • FIG. 11 shows GFP protein expression by SLiCs with variant synthetic CGI domains.
  • FIG. 12 shows GFP protein expression and epigenetic silencing by SLiCs with variant synthetic CGI domains.
  • FIG. 13 shows epigenetic silencing of SLiC with various TAR positions in the promoter (e.g., CLTA promoter, PCSK9 promoter).
  • the present disclosure provides nucleic acid molecules, and strategies and methods of using such nucleic acid molecules for regulating expression of the epigenetic editors and endogenous genes.
  • the present disclosure also provides a method of generating the nucleic acid molecule, and a cell comprising the nucleic acid molecule disclosed herein.
  • the nucleic acid molecule is a plasmid, a vector, or a viral vector (e.g., Adeno-associated virus (AAV) vector).
  • the nucleic acid molecule comprises an epigenetic editor.
  • the epigenetic editor of the nucleic acid molecule is self-limiting, wherein the epigenetic editor can reduce expression level of the nucleic acid molecule.
  • the nucleic acid molecule further comprises a regulatory element.
  • the regulatory element comprises a promoter and a target sequence.
  • the nucleic acid further comprises a methylation site (e.g., CpG island (CGI)).
  • the epigenetic editors described herein may be expressed in a host cell transiently, or may be integrated in a genome of the host cell; such cells and their progeny are also contemplated by the present disclosure. Both transiently expressed and integrated epigenetic editors or components thereof can effect stable epigenetic modifications. For example, after introducing to a host cell an epigenetic editor described herein, the target gene in the host cell may be stably or permanently repressed or silenced.
  • An epigenetic editor described herein may comprise one or more DNA-binding domains that direct the effector domain(s) of the epigenetic editor to target sequences within a cell genome and/or a nucleic acid molecule (e.g., a plasmid) provided herein.
  • a DNA-binding domain as described herein may be, e.g., a polynucleotide guided DNA-binding domain, a zinc finger protein (ZFP) domain, a transcription activator like effector (TALE) domain, a meganuclease DNA-binding domain, and the like. Examples of DNA-binding domains can be found in U.S. Patent 11,162,114, which is incorporated by reference herein in its entirety.
  • a DNA-binding domain described herein is encoded by its native coding sequence. In other embodiments, the DNA-binding domain is encoded by a nucleotide sequence that has been codon-optimized for optimal expression in human cells.
  • a DNA-binding domain herein may be a protein domain directed by a guide nucleic acid sequence (e.g., a guide RNA sequence) to a target site in a cell genome and/or a nucleic acid molecule (e.g., a plasmid) provided herein.
  • the protein domain may be derived from a CRISPR-associated nuclease, such as a Class I or II CRISPR-associated nuclease.
  • the protein domain may be derived from a Cas nuclease such as a Type II, Type IIA, Type IIB, Type IIC, Type V, or Type VI Cas nuclease.
  • the protein domain may be derived from a Class II Cas nuclease selected from Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9, Cas10, Cas14a, Cas14b, Cas14c, CasX, CasY, CasPhi, C2c4, C2c8, C2c9, C2c10, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf
  • “Derived from” is used to mean that the protein domain comprises the full polypeptide sequence of the parent protein, or comprises a variant thereof (e.g., with amino acid residue deletions, insertions, and/or substitutions).
  • the variant retains the desired function of the parent protein (e.g., the ability to form a complex with the guide nucleic acid sequence and the target DNA).
  • the CRISPR-associated protein domain may be a Cas9 domain described herein.
  • Cas9 may, for example, refer to a polypeptide with at least about 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity and/or sequence similarity to a wildtype Cas9 polypeptide described herein.
  • said wildtype polypeptide is Cas9 from Streptococcus pyogenes (NCBI Ref. No. NC_002737.2 (SEQ ID NO: 1)) and/or UniProt Ref. No. Q99ZW2 (SEQ ID NO: 2).
  • said wildtype polypeptide is Cas9 from Staphylococcus aureus (SEQ ID NO: 3).
  • the CRISPR-associated protein domain is a Cpf1 domain or protein, or a polypeptide with at least about 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity and/or sequence similarity to a wildtype Cpf1 polypeptide described herein (e.g., Cpf1 from Francisella novicida (UniProt Ref. No. U2UMQ6 or SEQ ID NO: 4)).
  • the CRISPR-associated protein domain may be a modified form of the wildtype protein comprising one or more amino acid residue changes such as a deletion, an insertion, or a substitution; a fusion or chimera; or any combination thereof.
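The percent-identity thresholds quoted above (at least about 85% up to 100%) can be illustrated with a small Python sketch that scores two already-aligned sequences. This is an assumption-laden example: the disclosure does not say which alignment tool or which denominator is used, and different programs count gap columns differently. The names `percent_identity` and `meets_threshold` and the toy peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Inputs must already be aligned (equal length, '-' for gaps).
    Columns where both sequences have a gap are ignored; columns with a gap
    in only one sequence count as mismatches (one convention among several).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no scorable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 85.0) -> bool:
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned peptides made up for this example (one substitution in ten columns).
print(percent_identity("MSTNPKPQRD", "MSTNPKPQRA"))
print(meets_threshold("MSTNPKPQRD", "MSTNPKPQRA", threshold=85.0))
```

Sequence similarity, which the text mentions alongside identity, would additionally need a substitution matrix and is not covered by this sketch.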
  • Cas9 sequences and structures of variant Cas9 orthologs have been described for various organisms.
  • Exemplary organisms from which a Cas9 domain herein can be derived include, but are not limited to, Streptococcus pyogenes, Streptococcus thermophilus, Streptococcus sp., Staphylococcus aureus, Listeria innocua, Lactobacillus gasseri, Francisella novicida, Wolinella succinogenes, Sutterella wadsworthensis, Gamma proteobacterium, Neisseria meningitidis, Campylobacter jejuni, Pasteurella multocida, Fibrobacter succinogenes, Rhodospirillum rubrum, Nocardiopsis rougevillei, Streptomyces pristinaespiralis, Streptomyces viridochromogenes, Streptomyces viridochromogenes, Streptosporangium
  • Cas9 sequences also include those from the organisms and loci disclosed in Chylinski et al., RNA Biol. (2013) 10(5):726-37.
  • the Cas9 domain is from Streptococcus pyogenes. In some embodiments, the Cas9 domain is from Staphylococcus aureus.
  • Cas domains are also contemplated for use in the epigenetic editors herein. These include, for example, those from CasX (Cas12e) (e.g., SEQ ID NO: 5), CasY (Cas12d) (e.g., SEQ ID NO: 6), CasΦ (CasPhi) (e.g., SEQ ID NO: 7), Cas12f1 (Cas14a) (e.g., SEQ ID NO: 8), Cas12f2 (Cas14b) (e.g., SEQ ID NO: 9), Cas12f3 (Cas14c) (e.g., SEQ ID NO: 10), and C2c8 (e.g., SEQ ID NO: 11).
  • CasX: Cas12e
  • CasY: Cas12d
  • CasΦ: CasPhi
  • Cas12f1: Cas14a
  • the nuclease-derived protein domain may have reduced or no nuclease activity through mutations such that the protein domain does not cleave DNA or has reduced DNA-cleaving activity while retaining the ability to complex with the guide nucleic acid sequence (e.g., guide RNA) and the target DNA.
  • the nuclease activity may be reduced by at least 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% compared to the wildtype domain.
  • a CRISPR-associated protein domain described herein is catalytically inactive (“dead”).
  • examples of such domains include, for example, dCas9 (“dead” Cas9), dCpf1, ddCpf1, dCasPhi, ddCas12a, dLbCpf1, and dFnCpf1.
  • a dCas9 protein domain may comprise one, two, or more mutations as compared to wildtype Cas9 that abrogate its nuclease activity.
  • the DNA cleavage domain of Cas9 is known to include two subdomains: the HNH nuclease subdomain and the RuvC1 subdomain.
  • the HNH subdomain cleaves the strand complementary to the gRNA, whereas the RuvC1 subdomain cleaves the non-complementary strand. Mutations within these subdomains can silence the nuclease activity of Cas9.
  • the mutations D10A (in RuvC1) and H840A (in HNH) completely inactivate the nuclease activity of SpCas9.
  • SaCas9, similarly, may be inactivated by the mutations D10A and N580A.
  • the dCas9 comprises at least one mutation in the HNH subdomain and/or the RuvC1 subdomain that reduces or abrogates nuclease activity.
  • the dCas9 only comprises a RuvC1 subdomain, or only comprises an HNH subdomain. It is to be understood that any mutation that inactivates the RuvC1 and/or the HNH domain may be included in a dCas9 herein, e.g., insertion, deletion, or single or multiple amino acid substitution in the RuvC1 domain and/or the HNH domain.
  • a dCas9 protein herein comprises a mutation at position(s) corresponding to position D10 (e.g., D10A), H840 (e.g., H840A), or both, of a wildtype SpCas9 sequence as numbered in the sequence provided at UniProt Accession No. Q99ZW2 (SEQ ID NO: 2).
  • the dCas9 comprises the amino acid sequence of dSpCas9 (D10A and H840A) (SEQ ID NO: 12).
  • a dCas9 protein as described herein comprises a mutation at position(s) corresponding to position D10 (e.g., D10A), N580 (e.g., N580A), or both, of a wildtype SaCas9 sequence (e.g., SEQ ID NO: 9).
  • the dCas9 comprises the amino acid sequence of dSaCas9 (D10A and N580A) (SEQ ID NO: 13).
  • Additional suitable mutations that inactivate Cas9 will be apparent to those of skill in the art based on this disclosure and knowledge in the field and are within the scope of this disclosure.
  • Such mutations may include, but are not limited to, D839A, N863A, and/or K603R in SpCas9.
  • the present disclosure contemplates any mutations that reduce or abrogate the nuclease activity of any Cas9 described herein (e.g., mutations corresponding to any of the Cas9 mutations described herein).
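The substitution notation used above (for example D10A: wildtype residue D at 1-based position 10 replaced by A) can be applied programmatically. The Python sketch below parses such labels and applies them to a protein string, checking that the stated wildtype residue is actually present. The helper name `apply_substitutions` and the toy peptide are hypothetical; the peptide is not a Cas9 sequence, and real use would start from the full reference sequence the numbering refers to.

```python
import re
from typing import Iterable

# One-letter wildtype residue, 1-based position, one-letter replacement (e.g. "D10A")
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(protein: str, mutations: Iterable[str]) -> str:
    """Return a copy of `protein` with each point substitution applied.

    Raises ValueError if a label is malformed, out of range, or if the
    residue at that position does not match the stated wildtype residue.
    """
    residues = list(protein.upper())
    for label in mutations:
        match = _SUBSTITUTION.match(label.strip().upper())
        if match is None:
            raise ValueError(f"unrecognized substitution label: {label!r}")
        wildtype, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wildtype:
            raise ValueError(
                f"{label}: expected {wildtype} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Toy 13-residue peptide with D at position 10, made up for this example.
toy = "MSTNPKPQRDITF"
print(apply_substitutions(toy, ["D10A"]))
```

The wildtype check matters because position numbers only make sense relative to a specific reference sequence; the text's phrase "corresponding to position" implies an alignment step for orthologs that this sketch does not perform.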
  • a dCpf1 protein domain may comprise one, two, or more mutations as compared to wildtype Cpf1 that reduce or abrogate its nuclease activity.
  • the Cpf1 protein has a RuvC-like endonuclease domain that is similar to the RuvC domain of Cas9, but does not have an HNH endonuclease domain, and the N-terminal of Cpf1 does not have the alpha-helical recognition lobe of Cas9.
  • the dCpf1 comprises one or more mutations corresponding to position D917A, E1006A, or D1255A as numbered in the sequence of the Francisella novicida Cpf1 protein (FnCpf1; SEQ ID NO: 4).
  • the dCpf1 protein comprises mutations corresponding to D917A, E1006A, D1255A, D917A/E1006A, D917A/D1255A, E1006A/D1255A, or D917A/E1006A/D1255A, or corresponding mutation(s) in any of the Cpf1 amino acid sequences described herein.
  • the dCpf1 comprises a D917A mutation.
  • the dCpf1 comprises the amino acid sequence of dFnCpf1 (SEQ ID NO: 14).
  • nuclease inactive CRISPR-associated protein domains contemplated herein include those from, for example, dNmeCas9 (e.g., SEQ ID NO: 15), dCjCas9 (e.g., SEQ ID NO: 16), dSt1Cas9 (e.g., SEQ ID NO: 17), dSt3Cas9 (e.g., SEQ ID NO: 18), dLbCpf1 (e.g., SEQ ID NO: 19), dAsCpf1 (e.g., SEQ ID NO: 20), denAsCpf1 (e.g., SEQ ID NO: 21), dHFAsCpf1 (e.g., SEQ ID NO: 22), dRVRAsCpf1 (e.g., SEQ ID NO: 23), dRRAsCpf1 (e.g., SEQ ID NO: 24), dCasX (e.g., SEQ ID NO: 25), and
  • a Cas9 domain described herein may be a high fidelity Cas9 domain, e.g., comprising one or more mutations that decrease electrostatic interactions between the Cas9 domain and the sugar-phosphate backbone of DNA to confer increased target binding specificity.
  • the high fidelity Cas9 domain may be nuclease inactive as described herein.
  • Cas9 domains that bind non-canonical PAM sequences have been described in Kleinstiver et al., Nature (2015) 523(7561):481-5 and Kleinstiver et al., Nat Biotechnol. (2015) 33:1293-8.
  • Such Cas9 domains may include, for example, those from “VRER” SpCas9, “EQR” SpCas9, “VQR” SpCas9, “SpG Cas9,” “SpRY Cas9,” and “KKH” SaCas9.
  • Nuclease inactive versions of these Cas9 domains are also contemplated, such as nuclease inactive VRER SpCas9 (e.g., SEQ ID NO: 27), nuclease inactive EQR SpCas9 (e.g., SEQ ID NO: 28), nuclease inactive VQR SpCas9 (e.g., SEQ ID NO: 29), nuclease inactive SpG Cas9 (e.g., SEQ ID NO: 30), nuclease inactive SpRY Cas9 (e.g., SEQ ID NO: 31), and nuclease inactive KKH SaCas9 (e.g., SEQ ID NO: 32).
  • Another example is the Cas9 of Francisella novicida engineered to recognize 5’-YG-3’ (where “Y” is a pyrimidine).
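A degenerate motif such as 5’-YG-3’ uses IUPAC nucleotide codes, where Y stands for either pyrimidine (C or T). The Python sketch below shows one way to locate such motifs in a DNA string. It is plain string matching on a single strand under stated assumptions: the function names `matches_motif` and `find_motif` and the toy sequence are hypothetical, and nothing here models how any Cas9 variant actually recognizes its PAM.

```python
# Standard IUPAC nucleotide codes mapped to the bases they allow
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_motif(site: str, motif: str) -> bool:
    """True if every base of `site` is allowed by the IUPAC code at that position."""
    site, motif = site.upper(), motif.upper()
    if len(site) != len(motif):
        return False
    return all(base in IUPAC[code] for base, code in zip(site, motif))


def find_motif(seq: str, motif: str) -> list:
    """Return 0-based start positions where `motif` matches `seq` (given strand only)."""
    width = len(motif)
    return [
        start
        for start in range(len(seq) - width + 1)
        if matches_motif(seq[start:start + width], motif)
    ]


# Toy sequence made up for this example: "CG" at index 2 and "TG" at index 5 match "YG".
print(find_motif("ATCGGTGA", "YG"))
```

A fuller search would also scan the reverse complement and restrict hits to positions adjacent to a candidate protospacer; both are left out to keep the example short.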
  • Additional suitable CRISPR-associated proteins, orthologs, and variants, including nuclease inactive variants and sequences, will be apparent to those of skill in the art based on this disclosure.
  • the DNA-binding domain of an epigenetic editor described herein comprises a zinc finger protein (ZFP) domain (or “ZF domain” as used herein).
  • ZFPs are proteins having at least one zinc finger, and bind to DNA in a sequence-specific manner.
  • a “zinc finger” (ZF) or “zinc finger motif” (ZF motif) refers to a polypeptide domain comprising a beta-beta-alpha (ββα) protein fold stabilized by a zinc ion.
  • a ZF binds from two to four base pairs of nucleotides, typically three or four base pairs (contiguous or noncontiguous). Each ZF typically comprises approximately 30 amino acids.
  • ZFP domains may contain multiple ZFs that make tandem contacts with their target nucleic acid sequence.
  • a tandem array of ZFs may be engineered to generate artificial ZFPs that bind desired nucleic acid targets.
  • ZFPs may be rationally designed by using databases comprising triplet (or quadruplet) nucleotide sequences and individual ZF amino acid sequences, in which each triplet or quadruplet nucleotide sequence is associated with one or more amino acid sequences of ZFs that bind the particular triplet or quadruplet sequence. See, e.g., U.S. Patents 6,453,242, 6,534,261, and 8,772,453.
  • ZFPs are widespread in eukaryotic cells, and may belong to, e.g., C2H2 class, CCHC class, PHD class, or RING class.
  • An exemplary motif characterizing one class of these proteins is -Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His- (SEQ ID NO: 1091), where X is any independently chosen amino acid.
  • a ZFP domain herein may comprise a ZF array comprising sequential C2H2-ZFs each contacting three or more sequential nucleotides. Additional architectures, e.g. as described in Paschon et al., Nat. Commun. 10, 1133 (2019), are also possible.
  • a ZFP domain of an epigenetic editor described herein may include 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or more ZFs.
  • the ZFP domain may include an array of two-finger or three-finger units, e.g., 3, 4, 5, 6, 7, 8, 9 or 10 or more units, wherein each unit binds a subsite in the target sequence.
  • a ZFP domain comprising at least three ZFs recognizes a target DNA sequence of 9 or 10 nucleotides.
  • a ZFP domain comprising at least four ZFs recognizes a target DNA sequence of 12 to 14 nucleotides.
  • a ZFP domain comprising at least six ZFs recognizes a target DNA sequence of 18 to 21 nucleotides.
  • ZFs in a ZFP domain described herein are connected via peptide linkers.
  • the peptide linkers may be, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more amino acids in length.
  • a linker comprises 5 or more amino acids.
  • a linker comprises 7-17 amino acids.
  • the linker may be flexible or rigid.
  • a zinc finger array may have the sequence:
  • “ ⁇ ” in italics may be TR, LR or LK, and “[linker]” represents a linker sequence.
  • the linker sequence is TGSQKP (SEQ ID NO: 1085); this linker may be used when sub-sites targeted by the ZFs are adjacent.
  • the linker sequence is TGGGGSQKP (SEQ ID NO: 1086); this linker may be used when there is a base between the sub-sites targeted by the zinc fingers.
  • the two indicated linkers may be the same or different.
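The linker rule described above (TGSQKP between fingers whose sub-sites are adjacent, TGGGGSQKP when one base separates the sub-sites) can be sketched in Python. This is a minimal illustration only: the function names and the placeholder finger sequences are hypothetical and do not come from the disclosure, and the text presents these linkers as options rather than requirements.

```python
# Linker sequences quoted in the text (SEQ ID NOs: 1085 and 1086).
ADJACENT_LINKER = "TGSQKP"
ONE_BASE_GAP_LINKER = "TGGGGSQKP"


def choose_linker(gap_bases: int) -> str:
    """Pick a linker from the number of bases between two ZF sub-sites."""
    if gap_bases == 0:
        return ADJACENT_LINKER
    if gap_bases == 1:
        return ONE_BASE_GAP_LINKER
    # The text only describes linkers for gaps of zero or one base.
    raise ValueError(f"no linker described for a gap of {gap_bases} bases")


def join_fingers(fingers: list[str], gaps: list[int]) -> str:
    """Concatenate finger sequences, inserting one linker per junction.

    `fingers` holds hypothetical ZF amino acid sequences; `gaps[i]` is the
    number of bases between the sub-sites of finger i and finger i + 1.
    """
    if not fingers or len(gaps) != len(fingers) - 1:
        raise ValueError("need one gap value per junction between fingers")
    parts = [fingers[0]]
    for finger, gap in zip(fingers[1:], gaps):
        parts.append(choose_linker(gap))
        parts.append(finger)
    return "".join(parts)
```

Because each junction is resolved independently, the two linkers in a three-finger array may be the same or different, consistent with the text.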
  • ZFP domains herein may contain arrays of two or more adjacent ZFs that are directly adjacent to one another (e.g., separated by a short (canonical) linker sequence), or are separated by longer, flexible or structured polypeptide sequences.
  • directly adjacent fingers bind to contiguous nucleic acid sequences, i.e., to adjacent trinucleotides/triplets.
  • adjacent fingers cross-bind between each other’s respective target triplets, which may help to strengthen or enhance the recognition of the target sequence and which leads to the binding of overlapping sequences.
  • distant ZFs within the ZFP domain may recognize (or bind to) non-contiguous nucleotide sequences.
  • the DNA-binding domain of an epigenetic editor described herein comprises a transcription activator-like effector (TALE) domain.
  • the DNA-binding domain of a TALE comprises a highly conserved sequence of about 33-34 amino acids, with a repeat variable di-residue (RVD) at positions 12 and 13 that is central to the recognition of specific nucleotides.
  • TALEs can be engineered to bind practically any desired DNA sequence. Methods for programming TALEs are known in the art. For example, such methods are described in Carroll et al., Genet Soc Amer. (2011) 188(4):773-82; Miller et al., Nat Biotechnol.
  • the DNA-binding domain comprises an argonaute protein domain, e.g., from Natronobacterium gregoryi (NgAgo).
  • NgAgo is a ssDNA-guided endonuclease that is guided to its target site by 5' phosphorylated ssDNA (gDNA), where it produces double-strand breaks.
  • the NgAgo-gDNA system does not require a protospacer-adjacent motif (PAM).
  • The characterization and use of NgAgo have been described, e.g., in Gao et al., Nat Biotechnol. (2016) 34(7):768-73; Swarts et al., Nature (2014) 507(7491):258-61; and Swarts et al., Nucl Acids Res. (2015) 43(10):5120-9.
  • the DNA-binding domain comprises an inactivated nuclease, for example, an inactivated meganuclease.
  • DNA-binding domains include tetracycline-controlled repressor (tetR) DNA-binding domains, leucine zippers, helix-loop-helix (HLH) domains, helix-turn-helix domains, β-sheet motifs, steroid receptor motifs, bZIP domains, homeodomains, and AT-hooks.
  • Epigenetic editors described herein that comprise a polynucleotide-guided DNA-binding domain may also include a guide polynucleotide that is capable of forming a complex with the DNA-binding domain.
  • the guide polynucleotide may comprise RNA, DNA, or a mixture of both.
  • the guide polynucleotide may be a guide RNA (gRNA).
  • a “guide RNA” or “gRNA” refers to a nucleic acid that is able to hybridize to a target sequence and direct binding of the CRISPR-Cas complex to the target sequence.
  • a guide polynucleotide sequence may comprise two parts: 1) a nucleotide sequence comprising a “targeting sequence” that is complementary to a target nucleic acid sequence (“target sequence”), e.g., to a nucleic acid sequence comprised in a genomic target site; and 2) a nucleotide sequence that binds a polynucleotide-guided DNA-binding domain (e.g., a CRISPR-Cas protein domain).
  • the nucleotide sequence in 1) may comprise a targeting sequence that is 100% complementary to a genomic nucleic acid sequence, e.g., a nucleic acid sequence comprised in a genomic target site, and thus may hybridize to the target nucleic acid sequence.
  • the nucleotide sequence in 1) may be referred to as, e.g., a crispr RNA, or crRNA.
  • the nucleotide sequence in 2) may be referred to as a scaffold sequence of a guide nucleic acid, e.g., a tracrRNA, or an activating region of a guide nucleic acid, and may comprise a stem-loop structure.
  • Parts 1) and 2) as described above may be fused to form one single guide (e.g., a single guide RNA, or sgRNA), or may be on two separate nucleic acid molecules.
  • a guide polynucleotide comprises parts 1) and 2) connected by a linker.
  • a guide polynucleotide comprises parts 1) and 2) connected by a non-nucleic acid linker, for example, a peptide linker or a chemical linker.
  • Part 2), the scaffold sequence of a guide polynucleotide as described herein, may be, for example, as described in Jinek et al., Science (2012) 337:816-21; U.S. Patent Publication 2016/0208288; or U.S. Patent Publication 2016/0200779. Variants of part 2) are also contemplated by the present disclosure.
  • the tetraloop and stem loop of a gRNA scaffold (tracrRNA) sequence may be modified to include RNA aptamers, which can be bound by specific protein domains.
  • such modified gRNAs can be used to facilitate the recruitment of repressive or activating domains fused to the protein-interacting RNA aptamers.
  • a gRNA as provided herein typically comprises a targeting domain and a binding domain.
  • the targeting domain (also termed “targeting sequence”) may comprise a nucleic acid sequence that binds to a target site, e.g., to a genomic nucleic acid molecule within a cell.
  • the target site may be a double-stranded DNA sequence comprising a PAM sequence as well as the target sequence, which is located on the same strand as, and directly adjacent to, the PAM sequence.
  • the targeting domain of the gRNA may comprise an RNA sequence that corresponds to the target sequence, i.e., it resembles the sequence of the target domain, sometimes with one or more mismatches, but typically comprising an RNA sequence instead of a DNA sequence.
  • the targeting domain of the gRNA thus may base pair (in full or partial complementarity) with the sequence of the double-stranded target site that is complementary to the target sequence, and thus with the strand complementary to the strand that comprises the PAM sequence. It will be understood that the targeting domain of the gRNA typically does not include a sequence that resembles the PAM sequence. It will further be understood that the location of the PAM may be 5’ or 3’ of the target sequence, depending on the nuclease employed. For example, the PAM is typically 3’ of the target sequence for Cas9 nucleases, and 5’ of the target sequence for Cas12a nucleases.
  • the targeting domain sequence comprises between 17 and 30 nucleotides and corresponds fully to the target sequence (i.e., without any mismatch nucleotides). In some embodiments, however, the targeting domain sequence may comprise one or more, but typically not more than 4, mismatches, e.g., 1, 2, 3, or 4 mismatches. As the targeting domain is part of gRNA, which is an RNA molecule, it will typically comprise ribonucleotides, while the DNA targeting domain will comprise deoxyribonucleotides.
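The PAM geometry described above (PAM directly 3’ of the target sequence for Cas9, directly 5’ for Cas12a) can be illustrated with a short Python sketch. It is a simplification under stated assumptions: it scans only the strand supplied, uses NGG and TTN as example PAMs, and takes the target length as a parameter. The function names and the default length of 20 are illustrative choices, not values fixed by the disclosure.

```python
import re


def cas9_targets(dna: str, length: int = 20) -> list[tuple[int, str]]:
    """Return (start, target) pairs whose 3' end directly abuts an NGG PAM."""
    dna = dna.upper()
    hits = []
    # Lookahead so that overlapping PAM candidates are all reported.
    for match in re.finditer(r"(?=[ACGT]GG)", dna):
        pam_start = match.start()
        start = pam_start - length
        if start >= 0:
            hits.append((start, dna[start:pam_start]))
    return hits


def cas12a_targets(dna: str, length: int = 20) -> list[tuple[int, str]]:
    """Return (start, target) pairs whose 5' end directly follows a TTN PAM."""
    dna = dna.upper()
    hits = []
    for match in re.finditer(r"(?=TT[ACGT])", dna):
        start = match.start() + 3  # target begins right after the 3-nt PAM
        if start + length <= len(dna):
            hits.append((start, dna[start:start + length]))
    return hits
```

In both cases the PAM itself is excluded from the returned target, matching the statement that the targeting domain typically does not include a sequence resembling the PAM.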
  • FIG. 1 An exemplary illustration of a Cas9 target site, comprising a 22 nucleotide target domain, and an NGG PAM sequence, as well as of a gRNA comprising a targeting domain that fully corresponds to the target sequence (and thus base pairs with full complementarity with the DNA strand complementary to the strand comprising the target sequence and PAM) is provided below:
  • FIG. 7 An exemplary illustration of a Cas12a target site, comprising a 22 nucleotide target domain, and a TTN PAM sequence, as well as of a gRNA comprising a targeting domain that fully corresponds to the target sequence (and thus base pairs with full complementarity with the DNA strand complementary to the strand comprising the target sequence and PAM) is provided below:
  • [ binding domain ] [ targeting domain (RNA) ]
  • the length and complementarity of the targeting domain with the target sequence contribute to specificity of the interaction of the gRNA/Cas9 molecule complex with a target nucleic acid.
  • the targeting domain of a gRNA provided herein is 5 to 50 nucleotides in length. In some embodiments, the targeting domain is 15 to 25 nucleotides in length. In some embodiments, the targeting domain is 18 to 22 nucleotides in length. In some embodiments, the targeting domain is 19-21 nucleotides in length. In some embodiments, the targeting domain is 15 nucleotides in length.
  • the targeting domain is 16 nucleotides in length. In some embodiments, the targeting domain is 17 nucleotides in length. In some embodiments, the targeting domain is 18 nucleotides in length. In some embodiments, the targeting domain is 19 nucleotides in length. In some embodiments, the targeting domain is 20 nucleotides in length. In some embodiments, the targeting domain is 21 nucleotides in length. In some embodiments, the targeting domain is 22 nucleotides in length. In some embodiments, the targeting domain is 23 nucleotides in length. In some embodiments, the targeting domain is 24 nucleotides in length. In some embodiments, the targeting domain is 25 nucleotides in length.
  • the targeting domain fully corresponds, without mismatch, to a target sequence provided herein, or a part thereof.
  • the targeting domain of a gRNA provided herein comprises 1 mismatch relative to a target sequence provided herein. In some embodiments, the targeting domain comprises 2 mismatches relative to the target sequence. In some embodiments, the targeting domain comprises 3 mismatches relative to the target sequence.
  • Methods for designing, selecting, and validating gRNAs are described herein and known in the art.
  • Software tools can be used to optimize the gRNAs corresponding to a target DNA sequence, e.g., to minimize total off-target activity across the genome.
  • DNA sequence searching algorithms can be used to identify a target sequence in crRNAs of a gRNA for use with Cas9.
  • Exemplary gRNA design tools include the ones described in Bae et al., Bioinformatics (2014) 30:1473-5.
  • the length of the spacer or targeting sequence depends on the CRISPR-associated protein component of the epigenetic editor system used.
  • Cas proteins from different bacterial species have varying optimal targeting sequence lengths.
  • the spacer sequence may comprise, e.g., 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or more than 50 nucleotides in length.
  • the spacer comprises 10-24, 11-20, 11-16, 18-24, 19-21, or 20 nucleotides in length.
  • a guide polynucleotide (e.g., gRNA) is from 15-100 (e.g., 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50) nucleotides in length and comprises a spacer sequence of at least 10 (e.g., 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50) contiguous nucleotides complementary to the target sequence.
  • a guide polynucleotide described herein may be truncated, e.g., by 1, 2,
  • the 3’ end of the target sequence is immediately adjacent to a PAM sequence (e.g., a canonical PAM sequence such as NGG for SpCas9).
  • the degree of complementarity between the targeting sequence of the guide polynucleotide (e.g., the spacer sequence of a gRNA) and the target sequence may be at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100%.
  • the targeting sequence and the target sequence may be 100% complementary.
  • the targeting sequence and the target sequence may contain, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 mismatches.
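The degree of complementarity and the mismatch count discussed above can be computed as in the following Python sketch. It assumes strict Watson-Crick pairing between an RNA spacer and the DNA strand it hybridizes to (no G-U wobble, no bulges or gaps), with both sequences written 5’ to 3’ and of equal length. The function names are hypothetical.

```python
# RNA base -> DNA base it pairs with under strict Watson-Crick rules.
PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def count_mismatches(spacer_rna: str, bound_dna: str) -> int:
    """Count non-pairing positions between a spacer and the DNA strand it binds.

    Both sequences are written 5'->3', so the DNA strand is read in reverse
    to line it up antiparallel to the spacer.
    """
    if len(spacer_rna) != len(bound_dna):
        raise ValueError("sequences must have equal length")
    antiparallel_dna = reversed(bound_dna.upper())
    return sum(PAIR[r] != d for r, d in zip(spacer_rna.upper(), antiparallel_dna))


def percent_complementarity(spacer_rna: str, bound_dna: str) -> float:
    """Percentage of spacer positions that pair with the bound DNA strand."""
    mismatches = count_mismatches(spacer_rna, bound_dna)
    return 100.0 * (len(spacer_rna) - mismatches) / len(spacer_rna)
```

Under these assumptions a 20-nucleotide spacer with one mismatch is 95% complementary and one with two mismatches is 90% complementary, which is the lower bound of the range listed in the text.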
  • a guide polynucleotide may be modified with, for example, chemical alterations and synthetic modifications.
  • a modified gRNA for instance, can include an alteration or replacement of one or both of the non-linking phosphate oxygens and/or of one or more of the linking phosphate oxygens in the phosphodiester backbone linkage, an alteration of the ribose sugar (e.g., of the 2’ hydroxyl on the ribose sugar), an alteration of the phosphate moiety, modification or replacement of a naturally occurring nucleobase, modification or replacement of the ribose-phosphate backbone, modification of the 3’ end and/or 5’ end of the oligonucleotide, replacement of a terminal phosphate group or conjugation of a moiety, cap, or linker, or any combination thereof.
  • one or more ribose groups of the gRNA may be modified.
  • chemical modifications to the ribose group include, but are not limited to, 2’-O-methyl (2’-OMe), 2’-fluoro (2’-F), 2’-deoxy, 2’-O-(2-methoxyethyl) (2’-MOE), 2’-NH2, 2’-O-allyl, 2’-O-ethylamine, 2’-O-cyanoethyl, 2’-O-acetalester, or a bicyclic nucleotide such as locked nucleic acid (LNA), 2’-(S)-constrained ethyl (S-cEt), constrained MOE, or 2’-O,4’-C-aminomethylene bridged nucleic acid (2’,4’-BNANC).
  • 2’-O-methyl modification and/or 2’-fluoro modification may increase binding affinity and/or nuclease stability of
  • one or more phosphate groups of the gRNA may be chemically modified.
  • chemical modifications to a phosphate group include, but are not limited to, a phosphorothioate (PS), phosphonoacetate (PACE), thiophosphonoacetate (thioPACE), amide, triazole, phosphonate, and phosphotriester modification.
  • a guide polynucleotide described herein may comprise one, two, three, or more PS linkages at or near the 5’ end and/or the 3’ end; the PS linkages may be contiguous or noncontiguous.
  • the gRNA herein comprises a mixture of ribonucleotides and deoxyribonucleotides and/or one or more PS linkages.
  • one or more nucleobases of the gRNA may be chemically modified.
  • chemically modified nucleobases include, but are not limited to, 2-thiouridine, 4-thiouridine, N6-methyladenosine, pseudouridine, 2,6-diaminopurine, inosine, thymidine, 5-methylcytosine, 5-substituted pyrimidine, isoguanine, isocytosine, and nucleobases with halogenated aromatic groups.
  • Chemical modifications can be made in the spacer region, the tracrRNA region, the stem loop, or any combination thereof.
  • Target domains identified above that are adjacent to a PAM sequence can be targeted by a CRISPR-based epigenetic repressor, e.g., an epigenetic repressor comprising a dCas9 DNA-binding domain.
  • target sites 1-143 are suitable for dCas9-based epigenetic repressor targeting.
  • a suitable gRNA for targeting any of the target domain sequences would, in some embodiments, comprise a targeting domain sequence that is the RNA equivalent of the provided DNA target domain sequence (i.e., the same sequence with RNA nucleotides in place of the provided DNA nucleotides, with uracil instead of thymine), and a suitable tracrRNA sequence.
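Deriving the RNA-equivalent sequence described above amounts to replacing thymine with uracil. The Python sketch below illustrates this and a Cas9-style assembly in which the targeting domain is placed 5’ of the scaffold. The function names are hypothetical, the scaffold is passed in by the caller rather than assumed, and the 5’ placement is an assumption that would not hold for Cas12a-type guides.

```python
def rna_equivalent(dna_target: str) -> str:
    """Return the RNA-equivalent of a DNA target sequence (T replaced by U)."""
    dna = dna_target.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return dna.replace("T", "U")


def assemble_sgrna(dna_target: str, scaffold_rna: str) -> str:
    """Join the targeting domain and a caller-supplied scaffold (Cas9-style order)."""
    return rna_equivalent(dna_target) + scaffold_rna
```

For example, `rna_equivalent("GATTACA")` returns `"GAUUACA"`.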
  • Epigenetic editors described herein include one or more effector protein domains (also “epigenetic effector domains,” or “effector domains,” as used herein) that effect epigenetic modification of a target gene.
  • An epigenetic editor with one or more effector domains may modulate expression of a target gene without altering its nucleobase sequence.
  • an effector domain of an epigenetic editor described herein may make histone tail modifications, e.g., by adding or removing active marks on histone tails.
  • an effector domain of an epigenetic editor described herein may comprise or recruit a transcription-related protein, e.g., a transcription repressor.
  • the transcription-related protein may be endogenous or exogenous.
  • an effector domain of an epigenetic editor described herein may, for example, comprise a protein that directly or indirectly blocks access of a transcription factor to the gene of interest harboring the target sequence.
  • An effector domain may be a full-length protein or a fragment thereof that retains the epigenetic effector function (a “functional domain”).
  • Functional domains that are capable of modulating (e.g., repressing) gene expression can be derived from a larger protein.
  • functional domains that can reduce target gene expression may be identified based on sequences of repressor proteins.
  • Amino acid sequences of gene expression-modulating proteins may be obtained from available genome browsers, such as the UCSC genome browser or Ensembl genome browser.
  • Protein annotation databases such as UniProt or Pfam can be used to identify functional domains within the full protein sequence. As a starting point, the largest sequence, encompassing all regions identified by different databases, may be tested for gene expression modulation activity. Various truncations then may be tested to identify the minimal functional unit.
  • variants of effector domains described herein are also contemplated by the present disclosure.
  • a variant may, for example, refer to a polypeptide with at least about 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity and/or sequence similarity to a wildtype effector domain described herein.
  • the variant retains at least about 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% of the epigenetic effector function of the wildtype effector domain.
  • the effector domains may induce a combination of epigenetic modifications, e.g., transcription repression and DNA methylation, DNA methylation and histone deacetylation, DNA methylation and histone demethylation, DNA methylation and histone methylation, DNA methylation and histone phosphorylation, DNA methylation and histone ubiquitylation, or DNA methylation and histone SUMOylation.
  • an effector domain described herein (e.g., DNMT3A and/or DNMT3L) is encoded by a nucleotide sequence as found in the native genome (e.g., human or murine) for that effector domain.
  • an effector domain described herein is encoded by a nucleotide sequence that has been codon-optimized for optimal expression in human cells.
  • Effector domains described herein may include, for example, transcriptional repressors, DNA methyltransferases, and/or histone modifiers, as further detailed below.
  • an epigenetic effector domain described herein mediates repression of a target gene’s expression (e.g., transcription).
  • the effector domain may comprise, e.g., a Krüppel-associated box (KRAB) repression domain, a Repressor Element Silencing Transcription Factor (REST) repression domain, a KRAB-associated protein 1 (KAP1) domain, a MAD domain, a FKHR (forkhead in rhabdosarcoma gene) repressor domain, an EGR-1 (early growth response gene product-1) repressor domain, an ets2 repressor factor repressor domain (ERD), a MAD smSIN3 interaction domain (SID), a WRPW motif of the hairy-related basic helix-loop-helix (bHLH) repressor proteins, an HP1 alpha chromo-shadow repression domain, an HP1 beta re
  • the effector domain may recruit one or more protein domains that repress expression of the target gene, e.g., through a scaffold protein.
  • the effector domain may recruit or interact with a scaffold protein domain that recruits a PRMT protein, an HDAC protein, a SETDB1 protein, or a NuRD protein domain.
  • the effector domain comprises a repression domain (e.g., KRAB) derived from KOX1/ZNF10, KOX8/ZNF708, ZNF43, ZNF184, ZNF91, HPF4, HTF10, or HTF34.
  • the effector domain comprises a repression domain (e.g., KRAB) derived from ZIM3, ZNF436, ZNF257, ZNF675, ZNF490, ZNF320, ZNF331, ZNF816, ZNF680, ZNF41, ZNF189, ZNF528, ZNF543, ZNF554, ZNF140, ZNF610, ZNF264, ZNF350, ZNF8, ZNF582, ZNF30, ZNF324, ZNF98, ZNF669, ZNF677, ZNF596, ZNF214, ZNF37, ZNF34, ZNF250, ZNF547, ZNF273, ZNF354, ZFP82, ZNF224, ZNF33, ZNF45, ZNF175, ZNF595, ZNF184, ZNF419, ZFP28-1, ZFP28-2, ZNF18, ZNF213, ZNF394, ZFP1, ZFP14, ZNF416, ZNF557, ZNF566, ZNF729, ZIM2, ZNF254, ZNF76
  • the repression domain may be a KRAB domain derived from KOX1, ZIM3, ZFP28, or ZN627.
  • the repression domain is a ZIM3 KRAB domain.
  • the effector domain is derived from a human protein, e.g., a human ZIM3, a human KOX1, a human ZFP28, or a human ZN627.
  • effector domains that may reduce or silence target gene expression are provided in Table 2 below (SEQ: SEQ ID NO, see Table 11 for sequences of exemplary effector domains). Further examples of repressors and transcriptional repressor domains can be found, e.g., in PCT Patent Publication WO 2021/226077 and Tycko et al., Cell (2020) 183(7):2020-35, each of which is incorporated herein by reference in its entirety.
  • a functional analog of any one of the above-listed proteins, i.e., a molecule having the same or substantially the same biological function (e.g., retaining 70% or more, 80% or more, 90% or more, 95% or more, or 98% or more of the protein’s transcription factor function), is encompassed by the present disclosure.
  • the functional analog may be an isoform or a variant of the above-listed protein, e.g., containing a portion of the above protein with or without additional amino acid residues and/or containing mutations relative to the above protein.
  • the functional analog has a sequence identity that is at least 75, 80, 85, 90, 95, 98, or 99% to one of the sequences listed in Table 2.
  • an epigenetic editor described herein comprises a domain derived from KOX1, ZIM3, ZFP28, and/or ZN627, optionally wherein the parental protein is a human protein.
  • the epigenetic editor may comprise a KRAB domain derived from KOX1 (ZNF10), e.g., a human KOX1.
  • the epigenetic editor may comprise a KRAB domain derived from ZIM3 (ZNF657 or ZNF264), e.g., a human ZIM3.
  • the epigenetic editor may comprise a KRAB domain derived from ZFP28, e.g., a human ZFP28.
  • the epigenetic editor may comprise a KRAB domain derived from ZN627, e.g., a human ZN627.
  • an epigenetic editor described herein may comprise a CDYL2, e.g., a human CDYL2, and/or a TOX domain (e.g., a human TOX domain) in combination with a KOX1 KRAB domain (e.g., a human KOX1 KRAB domain).
  • This term also encompasses non-canonical family members that do not catalyze methylation themselves but that recruit (including activate) catalytically active DNMTs; a non-limiting example of such a DNMT is DNMT3L. See, e.g., Lyko, Nat Review (2016) 19:81-92.
  • a DNMT domain may refer to a polypeptide domain derived from a catalytically active DNMT (e.g., DNMT1, DNMT3A, and DNMT3B) or from a catalytically inactive DNMT (e.g., DNMT3L).
  • a DNMT may repress expression of the target gene through the recruitment of repressive regulatory proteins.
  • the methylation is at a CG (or CpG) dinucleotide sequence.
  • the methylation is at a CHG or CHH sequence, where H is any one of A, T, or C.
  • DNMTs in the epigenetic editors may include, e.g., DNMT1, DNMT3A, DNMT3B, and/or DNMT3C.
  • the DNMT is a mammalian (e.g., human or murine) DNMT.
  • the DNMT is DNMT3A (e.g., human DNMT3A).
  • an effector domain described herein may be a DNMT-like domain.
  • a “DNMT-like domain” is a regulatory factor of DNA methyltransferase that may activate or recruit other DNMT domains, but does not itself possess methylation activity.
  • the DNMT-like domain is a mammalian (e.g., human or mouse) DNMT-like domain.
  • the DNMT-like domain is DNMT3L, which may be, for example, human DNMT3L or mouse DNMT3L.
  • an epigenetic editor described herein comprises a DNMT3L domain comprising SEQ ID NO: 1032, or a sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to SEQ ID NO: 1032.
  • an epigenetic editor herein comprises a DNMT3L domain comprising SEQ ID NO: 1033, or a sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to SEQ ID NO: 1033.
  • an epigenetic editor described herein comprises a DNMT3L domain comprising SEQ ID NO: 1034, or a sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to SEQ ID NO: 1034.
  • an epigenetic editor described herein comprises a DNMT3L domain comprising SEQ ID NO: 1035, or a sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to SEQ ID NO: 1035.
  • the DNMT3L domain may have, e.g., a mutation corresponding to that at position D226 (such as D226V), Q268 (such as Q268K), or both (numbering according to SEQ ID NO: 1032).
  • an epigenetic editor herein may comprise both DNMT and DNMT-like effector domains.
  • the epigenetic editor may comprise a DNMT3A-3L domain, wherein DNMT3A and DNMT3L may be covalently linked.
  • an epigenetic editor described herein may comprise an effector domain that comprises only a DNMT3A domain (e.g., human DNMT3A), or only a DNMT-like domain (e.g., DNMT3L, which may be human or mouse DNMT3L).
  • the functional analog has a sequence identity that is at least 75, 80, 85, 90, 95, 98, or 99% to one of the sequences listed in Table 3.
  • the effector domain herein comprises only the functional domain (or functional analog thereof), e.g., the catalytic domain or recruiting domain, of the above-listed proteins.
  • An epigenetic editor herein may effect methylation at, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 or more CpG dinucleotide sequences in the target gene or chromosome.
  • the CpG dinucleotide sequences may be located within or near the target gene in CpG islands, or may be located in a region that is not a CpG island.
  • a CpG island generally refers to a nucleic acid sequence or chromosome region that comprises a high frequency of CpG dinucleotides.
  • a CpG island may comprise at least 50% GC content.
  • the CpG island may have a high observed-to-expected CpG ratio, for example, an observed-to-expected CpG ratio of at least 60%.
  • an observed-to-expected CpG ratio is determined by Number of CpG * (sequence length) / (Number of C * Number of G).
  • the CpG island has an observed-to-expected CpG ratio of at least 60%, 70%, 80%, 90% or more.
  • a CpG island may be a sequence or region of, e.g., at least 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, or 800 nucleotides. In some embodiments, only 1, or less than 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, or 50 CpG dinucleotides are methylated by the epigenetic editor.
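The observed-to-expected CpG ratio given above, Number of CpG * (sequence length) / (Number of C * Number of G), together with the GC content and minimum length criteria, can be computed as in this Python sketch. The function names are hypothetical, and the default thresholds (200 nucleotides, 50% GC content, ratio of 0.6) are taken from the example values in the text rather than being a fixed definition.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC content, observed-to-expected CpG ratio) for one strand."""
    seq = seq.upper()
    length = len(seq)
    n_c = seq.count("C")
    n_g = seq.count("G")
    n_cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (n_c + n_g) / length if length else 0.0
    # Ratio from the text: CpG count * length / (C count * G count).
    obs_exp = n_cpg * length / (n_c * n_g) if n_c and n_g else 0.0
    return gc_content, obs_exp


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply the length, GC content, and observed/expected thresholds together."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_content >= min_gc and obs_exp >= min_obs_exp
```

The ratio is expressed here as a fraction, so the 60% threshold in the text corresponds to 0.6.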
  • an epigenetic editor herein effects methylation at a hypomethylated nucleic acid sequence, i.e., a sequence that may lack methyl groups on the 5-methylcytosine nucleotides (e.g., in CpG) as compared to a standard control. Hypomethylation may occur, for example, in aging cells or in cancer (e.g., early stages of neoplasia) relative to a younger cell or non-cancer cell, respectively.
  • methylation may be introduced by the epigenetic editor at a site other than a CpG dinucleotide.
  • the target gene sequence may be methylated at the C nucleotide of CpA, CpT, or CpC sequences.
  • an epigenetic editor comprises a DNMT3A domain and effects methylation at CpG, CpA, CpT, CpC sequences, or any combination thereof.
  • an epigenetic editor comprises a DNMT3A domain that lacks a regulatory subdomain and only maintains a catalytic domain.
  • an effector domain of an epigenetic editor herein mediates histone modification.
  • Histone modifications play a structural and biochemical role in gene transcription, such as by formation or disruption of the nucleosome structure that binds to the histone and prevents gene transcription.
  • Histone modifications may include, for example, acetylation, deacetylation, methylation, phosphorylation, ubiquitination, SUMOylation and the like, e.g., at their N-terminal ends (“histone tails”). These modifications maintain or specifically convert chromatin structure, thereby controlling responses such as gene expression, DNA replication, DNA repair, and the like, which occur on chromosomal DNA.
  • the unstructured N-termini of histones may be modified by acetylation, deacetylation, methylation, ubiquitylation, phosphorylation, SUMOylation, ribosylation, citrullination, O-GlcNAcylation, crotonylation, or any combination thereof.
  • histone acetyltransferases utilize acetyl-CoA as a cofactor and catalyze the transfer of an acetyl group to the epsilon amino group of the lysine side chains.
  • This neutralizes the lysine’s positive charge and weakens the interactions between histones and DNA, thus opening the chromosomes for transcription factors to bind and initiate transcription.
  • Acetylation of K14 and K9 lysines of histone H3 by histone acetyltransferase enzymes may be linked to transcriptional competence in humans. Lysine acetylation may directly or indirectly create binding sites for chromatin-modifying enzymes that regulate transcriptional activation.
  • histone methylation of lysine 9 of histone H3 may be associated with heterochromatin, or transcriptionally silent chromatin.
  • an effector domain of an epigenetic editor described herein comprises a histone methyltransferase domain.
  • the effector domain may comprise, for example, a DOT1L domain, a SET domain, a SUV39H1 domain, a G9a/EHMT2 protein domain, an EZH1 domain, an EZH2 domain, a SETDB1 domain, or any combination thereof.
  • the effector domain comprises a histone-lysine N-methyltransferase SETDB1 domain.
  • KAP1 interacts with or recruits a histone deacetylase protein, a histone-lysine methyltransferase protein, a chromatin remodeling protein, and/or a heterochromatin protein.
  • a KAP1 protein domain may interact with or recruit a heterochromatin protein 1 (HP1) protein, a SETDB1 protein, an HDAC protein, and/or a NuRD protein complex component.
  • a KAP1 protein domain interacts with or recruits a ZFP90 protein (e.g., isoform 2 of ZFP90), and/or a FOXP3 protein.
  • An exemplary KAP1 amino acid sequence is shown in SEQ ID NO: 1062.
  • the effector domain comprises a protein domain that interacts with or is recruited by one or more DNA epigenetic marks.
  • the effector domain may comprise a methyl CpG binding protein 2 (MECP2) protein that interacts with methylated DNA nucleotides in the target gene (which may or may not be at a CpG island of the target gene).
  • an MECP2 protein domain in an epigenetic editor described herein may induce condensed chromatin structure, thereby reducing or silencing expression of the target gene.
  • an MECP2 protein domain in an epigenetic editor described herein may interact with a histone deacetylase (e.g., HDAC), thereby repressing or silencing expression of the target gene.
  • an MECP2 protein domain in an epigenetic editor described herein may block access of a transcription factor or transcriptional activator to the target sequence, thereby repressing or silencing expression of the target gene.
  • An exemplary MECP2 amino acid sequence is shown in SEQ ID NO: 1063.
  • effector domains for the epigenetic editors described herein are, e.g., a chromoshadow domain, a ubiquitin-2 like Rad60 SUMO-like (Rad60-SLD/SUMO) domain, a chromatin organization modifier (Chromo) domain, a Yaf2/RYBP C-terminal binding motif domain (YAF2_RYBP), a CBX family C-terminal motif domain (CBX7_C), a zinc finger C3HC4 type (RING finger) domain (ZF-C3HC4_2), a cytochrome b5 domain (Cyt-b5), a helix-loop-helix domain (HLH), a helix-hairpin-helix motif domain (e.g., HHH_3), a high mobility group box domain (HMG-box), a basic leucine zipper domain (e.g., bZIP_1 or bZIP_2), a Myb DNA-binding domain
  • Vg Tdu a LIM domain, an RNA recognition motif domain (RRM_1), a paired amphipathic helix domain (PAH), a proteasomal ATPase OB C-terminal domain (Prot_ATP_ID_OB), a nervy homology 2 domain (NHR2), a hinge domain of cleavage stimulation factor subunit 2 (CSTF2_hinge), a PPAR gamma N-terminal region domain (PPARgamma_N), a CDC48 N-terminal domain (CDC48_2), a WD40 repeat domain (WD40), a Fip1 motif domain (Fip1), a PDZ domain (PDZ_6), a Von Willebrand factor type C domain (VWC), a NAB conserved region 1 domain (NCD1), an S1 RNA-binding domain (S1), an HNF3 C-terminal domain (HNF_C), a Tudor domain (Tudor_2), a histone-like transcription factor (CBF/
  • the effector domain comprises a protein domain selected from a group consisting of SUMO3 domain, Chromo domain from M phase phosphoprotein 8 (MPP8), chromoshadow domain from Chromobox 1 (CBX1), and SAM_1/SPM domain from Scm Polycomb Group Protein Homolog 1 (SCMH1).
  • the effector domain comprises an HNF3 C-terminal domain (HNF_C).
  • the HNF_C domain may be from FOXA1 or FOXA2.
  • the HNF_C domain comprises an EH1 (engrailed homology 1) motif.
  • the effector domain may comprise an interferon regulatory factor 2-binding protein zinc finger domain (IRF-2BP1_2), a Cyt-b5 domain from DNA repair factor HERC2 E3 ligase, a variant SH3 domain (SH3_9) from Bridging Integrator 1 (BIN1), an HMG-box domain from transcription factor TOX or a ZF-C3HC4_2 RING finger domain from the polycomb component PCGF2, a Chromodomain-helicase-DNA binding protein 3 (CHD3) domain, or a ZNF783 domain.
  • epigenetic editors also referred to herein as epigenetic editing systems, that direct epigenetic modification(s) to a target sequence in a gene of interest, e.g., using one or more DNA-binding domains as described herein and one or more effector domains (e.g., epigenetic repression domains) as described herein, in any combination.
  • the DNA-binding domain (in concert with a guide polynucleotide such as one described herein, where the DNA-binding domain is a polynucleotide guided DNA-binding domain) directs the effector domain to epigenetically modify the target sequence, resulting in gene repression or silencing that may be durable and inheritable across cell generations.
  • the epigenetic editors described herein can repress or silence genes reversibly or irreversibly in cells.
  • an epigenetic editor described herein comprises one or more fusion proteins, each comprising (1) DNA-binding domain(s) and (2) effector domain(s).
  • the effector domains may be on one or more fusion proteins comprised by the epigenetic editor.
  • a single fusion protein may comprise all of the effector domains with a DNA-binding domain.
  • the effector domains or subsets thereof may be on separate fusion proteins, each with a DNA-binding domain (which may be the same or different).
  • a fusion protein described herein may further comprise one or more linkers (e.g., peptide linkers), detectable tags, nuclear localization signals (NLSs), or any combination thereof.
  • fusion protein refers to a chimeric protein in which two or more coding sequences (e.g., for DNA-binding domain(s) and/or effector domain(s)) are covalently or non-covalently joined, directly or indirectly.
  • an epigenetic editor described herein comprises 2, 3, 4, 5, 6, 7, 8, 9, 10, or more effector (e.g., repression) domains, which may be identical or different.
  • effector domains function synergistically.
  • Combinations of effector domains may comprise DNA methylation domains, histone deacetylation domains, histone methylation domains, and/or scaffold domains that recruit any of the above.
  • an epigenetic editor described herein may comprise one or more transcriptional repressor domains (e.g., a KRAB domain such as KOX1, ZIM3, ZFP28, or ZN627 KRAB) in combination with one or more DNA methylation domains (e.g., a DNMT domain) and/or recruiter domain (e.g., a DNMT3L domain).
  • the epigenetic editor further comprises an additional effector domain (e.g., a KAP1, MECP2, HP1b, CBX8, CDYL2, TOX, TOX3, TOX4, EED, RBBP4, RCOR1, or SCML2 domain).
  • the additional effector domain is a CDYL2, TOX, TOX3, TOX4, or HP1a domain.
  • an epigenetic editor described herein may comprise a CDYL2 and/or a TOX domain in combination with a KRAB domain (e.g., a KOX1 KRAB domain).
  • a fusion protein as described herein may comprise one or more linkers that connect components of the epigenetic editor.
  • a linker may be a peptide or non-peptide linker.
  • one or more linkers utilized in an epigenetic editor provided herein is a peptide linker, i.e., a linker comprising a peptide moiety.
  • a peptide linker can be any length applicable to the epigenetic editor fusion proteins described herein.
  • the linker can comprise a peptide between 1 and 200 (e.g., between 1 and 80) amino acids.
  • the linker comprises from 1 to 5, 1 to 10, 1 to 20, 1 to 30, 1 to 40, 1 to 50, 1 to 60, 1 to 80, 1 to 100, 1 to 150, 1 to 200, 5 to 10, 5 to 20, 5 to 30, 5 to 40, 5 to 60, 5 to 80, 5 to 100, 5 to 150, 5 to 200, 10 to 20, 10 to 30, 10 to 40, 10 to 50, 10 to 60, 10 to 80, 10 to 100, 10 to 150, 10 to 200, 20 to 30, 20 to 40, 20 to 50, 20 to 60, 20 to 80, 20 to 100, 20 to 150, 20 to 200, 30 to 40, 30 to 50, 30 to 60, 30 to 80, 30 to 100, 30 to 150, 30 to 200, 40 to 50, 40 to 60, 40 to 80, 40 to 100, 40 to 150, 40 to 200, 50 to 60, 50 to 80, 50 to 100, 50 to 150, 50 to 200, 60 to 80, 60 to 100, 60 to 150, 60 to 200, 80 to 100, 80 to 150, 80 to 200, 100 to 150, 100 to 200,
  • the peptide linker is 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 amino acids in length.
  • the peptide linker may be 4, 5, 16, 20, 24, 27, 32, 40, 64, 92, or 104 amino acids in length.
  • the peptide linker may be a flexible or rigid linker.
  • the peptide linker comprises the amino acid sequence of any one of SEQ ID NOs: 1064-1068 or a sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • XTEN refers to a recombinant peptide or polypeptide lacking hydrophobic amino acid residues.
  • XTEN linkers typically are unstructured and comprise a limited set of natural amino acids. Fusion of XTEN to a protein alters the protein's hydrodynamic properties and reduces the rate of clearance and degradation of the fusion protein. These XTEN fusion proteins are produced using recombinant technology, without the need for chemical modifications, and are degraded by natural pathways.
  • the XTEN linker may be, for example, 5, 10, 16, 20, 26, or 80 amino acids in length.
  • the XTEN linker is 16 amino acids in length. In some embodiments, the XTEN linker is 80 amino acids in length. In certain embodiments, the XTEN linker may be XTEN10, XTEN16, XTEN20, or XTEN80. In certain embodiments, the XTEN linker may comprise the amino acid sequence of any one of SEQ ID NOs: 1069-1073 and 1092 or a sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • one or more linkers utilized in an epigenetic editor provided herein is a non-peptide linker.
  • the linker may be a carbon bond, a disulfide bond, or carbon-heteroatom bond.
  • the linker is a carbon-nitrogen bond of an amide linkage.
  • the linker is a cyclic or acyclic, substituted or unsubstituted, or branched or unbranched aliphatic or heteroaliphatic linker.
  • one or more linkers utilized in an epigenetic editor provided herein is polymeric (e.g., polyethylene, polyethylene glycol, polyamide, polyester, etc.).
  • the linker may comprise, for example, a monomer, dimer, or polymer of aminoalkanoic acid; an aminoalkanoic acid (e.g., glycine, ethanoic acid, alanine, beta-alanine, 3-aminopropanoic acid, 4-aminobutanoic acid, 5-pentanoic acid, etc.); a monomer, dimer, or polymer of aminohexanoic acid (Ahx); or a polyethylene glycol moiety (PEG); or an aryl or heteroaryl moiety.
  • the linker may be based on a carbocyclic moiety (e.g., cyclopentane or cyclohexane) or a phenyl ring.
  • the linker may include functionalized moieties to facilitate attachment of a nucleophile (e.g., thiol, amino) from the peptide to the linker.
  • Any electrophile may be used as part of the linker. Exemplary electrophiles include, but are not limited to, activated esters, activated amides, alkyl halides, aryl halides, acyl halides, and isothiocyanates.
  • linker lengths and flexibilities can be employed between any two components of an epigenetic editor (e.g., between an effector domain (e.g., a repressor domain) and a DNA-binding domain (e.g., a Cas9 domain), between a first effector domain and a second effector domain, etc.).
  • the linkers may range from very flexible linkers, such as glycine/serine-rich linkers, to more rigid linkers, in order to achieve the optimal length for effector domain activity for the specific application.
  • the more flexible linkers are glycine/serine-rich linkers (GS-rich linkers), where more than 45% (e.g., more than 48, 50, 55, 60, 70, 80, or 90%) of the residues are glycine or serine residues.
  • GS-rich linkers are (GGGGS)n (SEQ ID NO: 485), (G)n, and W linker (SEQ ID NO: 486).
  • the more rigid linkers are of the form (EAAAK)n (SEQ ID NO: 487), (SGGS)n (SEQ ID NO: 488), and (XP)n (SEQ ID NO: 489).
  • n may be any integer between 1 and 30. In some embodiments, n is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15. In some embodiments, the linker comprises a (GGS)n motif, wherein n is 1, 3, or 7 (SEQ ID NO: 490). In some embodiments, the linker comprises a (GGGGS)n motif, wherein n is 4 (SEQ ID NO: 491).
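The repeat-unit linkers and the "more than 45% glycine or serine" criterion described above lend themselves to a small worked example. The following is a minimal sketch under stated assumptions: the helper names (`repeat_linker`, `gs_fraction`, `is_gs_rich`) are hypothetical, the 1 to 30 range for n and the 45% threshold are taken from the text, and the classification is purely compositional.

```python
def repeat_linker(unit: str, n: int) -> str:
    """Build a linker of the form (unit)n, e.g. (GGGGS)4.

    The 1..30 bound on n follows the range stated in the text.
    """
    if not 1 <= n <= 30:
        raise ValueError("n is expected to be an integer between 1 and 30")
    return unit * n


def gs_fraction(linker: str) -> float:
    """Return the fraction of residues that are glycine (G) or serine (S)."""
    if not linker:
        raise ValueError("linker must not be empty")
    return sum(residue in "GS" for residue in linker.upper()) / len(linker)


def is_gs_rich(linker: str, threshold: float = 0.45) -> bool:
    """Apply the 'more than 45% G or S' criterion to a linker sequence."""
    return gs_fraction(linker) > threshold


if __name__ == "__main__":
    flexible = repeat_linker("GGGGS", 4)  # the (GGGGS)n motif with n = 4
    rigid = repeat_linker("EAAAK", 3)     # an (EAAAK)n linker with n = 3
    print(flexible, len(flexible), is_gs_rich(flexible))
    print(rigid, len(rigid), is_gs_rich(rigid))
```

Composition alone does not determine flexibility; the check above only mirrors the residue-percentage definition given for GS-rich linkers.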
  • a fusion protein described herein may comprise one or more nuclear localization signals, and in certain embodiments, may comprise two or more nuclear localization signals.
  • the fusion protein may comprise 1, 2, 3, 4, or 5 nuclear localization signals.
  • a “nuclear localization signal” is an amino acid sequence that directs proteins to the nucleus.
  • the NLS may be an SV40 NLS.
  • the fusion protein may comprise an NLS at its N-terminus, C-terminus, or both, and/or an NLS may be embedded in the middle of the fusion protein (e.g., at the N- or C-terminus of a DNA-binding domain or an effector domain).
  • an NLS comprises the amino acid sequence of any one of SEQ ID NOs: 1074-1079, or a sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the selected sequence. Additional NLSs are known in the art.
  • Epigenetic editors provided herein may comprise one or more additional sequences (“tags”) for tracking, detection, and localization of the editors.
  • the epigenetic editor comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more detectable tags. Each of the detectable tags may be the same or different.
  • an epigenetic editor fusion protein may comprise cytoplasmic localization sequences, export sequences, such as nuclear export sequences, or other localization sequences, as well as sequence tags that are useful for solubilization, purification, or detection of the fusion proteins.
  • Suitable protein tags include, but are not limited to, biotin carboxylase carrier protein (BCCP) tags, myc-tags, calmodulin-tags, FLAG-tags, hemagglutinin (HA)-tags, poly-histidine tags (also referred to as histidine tags or His-tags), maltose binding protein (MBP)-tags, nus-tags, glutathione-S-transferase (GST)-tags, green fluorescent protein (GFP)-tags, thioredoxin-tags, S-tags, Softags (e.g., Softag 1 or Softag 3), strep-tags, biotin ligase tags, FlAsH tags, V5 tags, and SBP-tags. Additional suitable sequences will be apparent to those of skill in the art.

D. Fusion Protein Configurations

  • a fusion protein of an epigenetic editor described herein may have its components structured in different configurations.
  • the DNA-binding domain may be at the C-terminus, the N-terminus, or in between two or more epigenetic effector domains or additional domains.
  • the DNA-binding domain is at the C-terminus of the epigenetic editor.
  • the DNA-binding domain is at the N-terminus of the epigenetic editor.
  • the DNA-binding domain is linked to one or more nuclear localization signals.
  • the DNA-binding domain is flanked by an epigenetic effector domain and/or an additional domain on both sides.
  • the epigenetic editor comprises the configuration of:
  • an epigenetic editor comprises a DNA-binding domain (DBD), a DNA methyltransferase (DNMT) domain, and a transcriptional repressor (“repressor”) domain that represses or silences expression of a target gene.
  • the DBD, DNMT, and transcriptional repressor domains may be any as described herein, in any combination.
  • the epigenetic editor comprises a fusion protein with the configuration of:
  • a connecting structure “]-[” in any one of the epigenetic editor structures is a linker, e.g., a peptide linker; a detectable tag; a peptide bond; a nuclear localization signal; and/or a promoter or regulatory sequence.
  • the multiple connecting structures “]-[” may be the same or may each be a different linker, tag, NLS, or peptide bond.
  • the DNA methyltransferase domain comprises DNMT3A, DNMT3L, or both.
  • the DBD is a catalytically inactive polynucleotide guided DNA-binding domain (e.g., a dCas9) or a ZFP domain.
  • the repressor domain is a KRAB domain.
  • the epigenetic editor comprises a configuration selected from
  • the DBD, KRAB, DNMT3A, and DNMT3L domains may be any as described herein, in any combination.
  • the DBD is a CRISPR-associated protein domain (e.g., dCas9) or a ZFP domain;
  • the KRAB domain is derived from KOX1, ZIM3, ZFP28, or ZN627;
  • the DNMT3A domain is a human DNMT3A domain;
  • the DNMT3L domain is a human or mouse DNMT3L domain; any combination of these components is also contemplated by the present disclosure.
  • the epigenetic editor comprises a configuration selected from
  • [DNMT3A-DNMT3L] indicates that the DNMT3A and DNMT3L domains are directly fused via a peptide bond
  • the connecting structure ]-[ is any one of the linkers as described herein, a detectable tag, an affinity domain, a peptide bond, a nuclear localization signal, a promoter, and/or a regulatory sequence.
  • the DBD, SETDB1, DNMT3A, and DNMT3L domains may be any as described herein, in any combination.
  • the DBD is a CRISPR-associated protein domain (e.g., dCas9) or a ZFP domain;
  • the SETDB1 domain is derived from human SETDB1, ZIM3, ZFP28, or ZN627;
  • the DNMT3A domain is a human DNMT3A domain;
  • the DNMT3L domain is a human or mouse DNMT3L domain; any combination of these components is also contemplated by the present disclosure.
  • constructs contemplated herein include: DNMT3A-DNMT3L-XTEN80-NLS-dCas9-NLS-XTEN16-KOX1 KRAB (Configuration 1), and DNMT3A-DNMT3L-XTEN80-NLS-ZFP domain-NLS-XTEN16-KOX1 KRAB (Configuration 2).
  • a fusion construct described herein may have Configuration 2 and comprise SEQ ID NO: 1081, or a sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • SEQ ID NO: 1081 the XTEN linkers are underlined, the NLS sequences are bolded and underlined, the DNMT3A sequence is italicized, the DNMT3L sequence is underlined and italicized, the ZFP domain is bolded, and the KOX1 KRAB domain is underlined and bolded.
  • Variable amino acids represented by Xs are the amino acids of the DNA-recognition helix of the zinc finger and XX in italics may be either TR, LR or LK.
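The N-to-C-terminal domain orders given for Configuration 1 and Configuration 2 can be represented as ordered component lists, which makes it easy to see that the two differ only in the DNA-binding domain. This is an illustrative sketch only: the class `FusionConfiguration` and the helper `swap_component` are hypothetical names, and the component labels are plain strings rather than actual sequences.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FusionConfiguration:
    """Ordered (N- to C-terminal) list of fusion protein component labels."""

    components: Tuple[str, ...]

    def describe(self) -> str:
        # Join labels with "-" in the same style as the construct names in the text.
        return "-".join(self.components)


def swap_component(config: FusionConfiguration, old: str, new: str) -> FusionConfiguration:
    """Return a new configuration with every occurrence of `old` replaced by `new`."""
    if old not in config.components:
        raise ValueError(f"{old!r} is not part of this configuration")
    return FusionConfiguration(tuple(new if c == old else c for c in config.components))


# Configuration 1 as listed in the text (dCas9 as the DNA-binding domain).
CONFIGURATION_1 = FusionConfiguration(
    ("DNMT3A", "DNMT3L", "XTEN80", "NLS", "dCas9", "NLS", "XTEN16", "KOX1 KRAB")
)

# Configuration 2 differs only in using a ZFP domain as the DNA-binding domain.
CONFIGURATION_2 = swap_component(CONFIGURATION_1, "dCas9", "ZFP domain")

if __name__ == "__main__":
    print(CONFIGURATION_1.describe())
    print(CONFIGURATION_2.describe())
```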
  • the target sequence of the nucleic acid molecule is derived from a target gene, which the epigenetic editor encoded by the nucleic acid molecule is designed to specifically bind to and modify epigenetically.
  • the target sequence of the nucleic acid molecule has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to a sequence of a target gene, which the epigenetic editor encoded by the nucleic acid molecule is designed to specifically recognize and bind to.
  • the methylation site (e.g., CGI) is more than 20 bp, more than 30 bp, more than 40 bp, more than 50 bp, more than 60 bp, more than 70 bp, more than 80 bp, more than 90 bp, more than 100 bp, more than 150 bp, more than 200 bp, or more than 250 bp.
  • the methylation site (e.g., CGI) is 20-50 bp, 50-100 bp, 100-150 bp, 150-200 bp, or 200-250 bp.
  • the methylation site (e.g., CGI) is 100-200 bp.
  • An epigenetic editor described herein may perform sequence-specific epigenetic modification(s) (e.g., alteration of chemical modification(s)) of a target gene that harbors the target sequence. Such epigenetic modulation may be safer and more easily reversible than modulation due to gene editing, e.g., with generation of DNA double-strand breaks. In some embodiments, the epigenetic modulation may reduce or silence the target gene. In some embodiments, the modification is at a specific site of the target sequence. In some embodiments, the modification is at a specific allele of the target gene.
  • the epigenetic modification may result in modulated (e.g., reduced) expression of one copy of a target gene harboring a specific allele, and not the other copy of the target gene.
  • the specific allele is associated with a disease, condition, or disorder.
  • the epigenetic modification reduces or abolishes transcription of the target gene harboring the target sequence. In some embodiments, the epigenetic modification reduces or abolishes transcription of a copy of the target gene harboring a specific allele recognized by the epigenetic editor. In some embodiments, the epigenetic editor reduces the level of or eliminates expression of a protein encoded by the target gene. In some embodiments, the epigenetic editor reduces the level of or eliminates expression of a protein encoded by a copy of the target gene harboring a specific allele recognized by the epigenetic editor.
  • a target gene may be epigenetically modified in vitro, ex vivo, or in vivo.
  • the effector domain of an epigenetic editor described herein may alter (e.g., deposit or remove) a chemical modification at a nucleotide of the target gene or at a histone associated with the target gene.
  • the chemical modification may be altered at a single nucleotide or a single histone, or may be altered at 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1500, 2000, 2500, 3000 or more nucleotides.
  • an effector domain of an epigenetic editor described herein may alter a CpG dinucleotide within the target gene.
  • all CpG dinucleotides within 2000, 1500, 1000, 500, or 200 bps flanking a target sequence are altered according to a modification type described herein, as compared to the original state of the gene or the gene in a comparable cell not contacted with the epigenetic editor.
  • At least 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of the CpG dinucleotides are altered as compared to the original state of the gene or the gene in a comparable cell not contacted with the epigenetic editor.
  • one single CpG dinucleotide is altered, as compared to the original state of the gene or the gene in a comparable cell not contacted with the epigenetic editor.
  • An effector domain of an epigenetic editor described herein may alter a histone modification state of a histone associated with or bound to the target gene. For example, an effector domain may deposit a modification on one or more lysine residues of histone tails of histones associated with the target gene. In some embodiments, the effector domain may result in deacetylation of one or more histone tails of histones associated with the target gene, thereby reducing or silencing expression of the target gene. In some embodiments, the histone modification state is a methylation state. For example, the effector domain may result in H3K9, H3K27, or H4K20 methylation (e.g., H3K9me2, H3K9me3, H3K27me2, H3K27me3, and H4K20me3 methylation) at one or more histone tails associated with the target gene, thereby reducing or silencing expression of the target gene.
  • all histone tails of histones bound to DNA nucleotides within 2000, 1500, 1000, 500, or 200 bps flanking the target sequence are altered according to a modification type as described herein, as compared to the original state of the chromosome or the chromosome in a comparable cell not contacted with the epigenetic editor.
  • At least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120 or more histone tails of the bound histones are altered as compared to the original state of the chromosome or the chromosome in a comparable cell not contacted with the epigenetic editor.
  • At least 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of histone tails of the bound histones are altered as compared to the original state of the chromosome or the chromosome in a comparable cell not contacted with the epigenetic editor.
  • one single histone tail of the bound histones may be altered as compared to the original state of the chromosome or the chromosome in a comparable cell not contacted with the epigenetic editor.
  • one single bound histone octamer may be altered as compared to the original state of the chromosome or the chromosome in a comparable cell not contacted with the epigenetic editor.
  • the chemical modification deposited at target gene DNA nucleotides or histone residues may be at or in close proximity to a target sequence in the target gene.
  • an effector domain of an epigenetic editor described herein alters a chemical modification state of a nucleotide or histone tail bound to a nucleotide 100-200, 200-300, 300-400, 400-500, 500-600, 600-700, or 700-800 nucleotides 5’ or 3’ to the target sequence in the target gene.
  • an effector domain alters a chemical modification state of a nucleotide or histone tail bound to a nucleotide within 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, or 2000 nucleotides flanking the target sequence.
  • “flanking” refers to nucleotide positions 5’ to the 5’ end of and 3’ to the 3’ end of a particular sequence, e.g. a target sequence.
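To make the "flanking" definition concrete, the sketch below computes the 5' and 3' flanking windows around a target sequence and lists the CpG dinucleotides that fall inside them, then reports what fraction of those CpGs is altered. Everything here is an assumption for illustration: coordinates are 0-based and half-open, the function names are hypothetical, and the set of altered positions is supplied by the caller rather than measured.

```python
from typing import Iterable, List, Set, Tuple

Interval = Tuple[int, int]  # half-open [start, end), 0-based


def flanking_windows(target_start: int, target_end: int, flank: int, seq_len: int) -> Tuple[Interval, Interval]:
    """Return the 5' and 3' flanking windows of a target occupying [target_start, target_end)."""
    five_prime = (max(0, target_start - flank), target_start)
    three_prime = (target_end, min(seq_len, target_end + flank))
    return five_prime, three_prime


def cpg_positions(seq: str, start: int, end: int) -> List[int]:
    """Positions of the C of each CpG whose C lies within [start, end)."""
    seq = seq.upper()
    last = min(end, len(seq) - 1)  # the C needs a following base
    return [i for i in range(start, last) if seq[i:i + 2] == "CG"]


def fraction_altered(cpgs: Iterable[int], altered: Set[int]) -> float:
    """Fraction of the listed CpG positions that appear in the altered set."""
    cpgs = list(cpgs)
    if not cpgs:
        return 0.0
    return sum(pos in altered for pos in cpgs) / len(cpgs)


if __name__ == "__main__":
    seq = "CGAACGTTTTGGGGACGTCG"  # made-up 20-nt sequence; target occupies [8, 12)
    left, right = flanking_windows(8, 12, flank=10, seq_len=len(seq))
    flank_cpgs = cpg_positions(seq, *left) + cpg_positions(seq, *right)
    print(left, right, flank_cpgs)
    print(fraction_altered(flank_cpgs, altered={4, 15}))
```

In practice the altered set would come from a methylation assay; here it is simply hard-coded to show the arithmetic.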
  • an effector domain mediates or induces a chemical modification change of a nucleotide or a histone tail bound to a nucleotide distant from a target sequence. Such modification may be initiated near the target sequence, and may subsequently spread to one or more nucleotides in the target gene distant from the target sequence.
  • an effector domain may initiate alteration of a chemical modification state of one or more nucleotides or one or more histone residues bound to one or more nucleotides within 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500 nucleotides flanking the target sequence, and the chemical modification state alteration may spread to one or more nucleotides at least 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2500, 3000, or more nucleotides from the target sequence in the target gene, either upstream or downstream of the target sequence.
  • the chemical modification may be initiated at less than 2, 3, 5, 10, 20, 30, 40, 50, or 100 nucleotides in the target gene and spread to at least 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, or more nucleotides in the target gene. In some embodiments, the chemical modification spreads to nucleotides in the entire target gene. Additional proteins or transcription factors, for example, transcription repressors, methyltransferases, or transcription regulation scaffold proteins, may be involved in the spreading of the chemical modification. Alternatively, the epigenetic editor alone may be involved.
  • an epigenetic editor described herein reduces expression of a target gene by at least about 20%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 99%, or more, as measured by transcription of the target gene in a cell, a tissue, or a subject as compared to a control cell, control tissue, or a control subject (e.g., in the absence of the epigenetic editor).
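The percentage reductions listed above compare target-gene transcription in a treated sample against a control. A minimal sketch of that arithmetic follows; the function names and the example values are illustrative assumptions, and the inputs are taken to be already-normalized transcript levels in arbitrary units.

```python
def percent_reduction(control_level: float, treated_level: float) -> float:
    """Percent reduction of expression relative to the control.

    Both levels are assumed to be normalized transcript measurements in the
    same arbitrary units; the control level must be positive.
    """
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    return (1.0 - treated_level / control_level) * 100.0


def meets_threshold(control_level: float, treated_level: float, threshold_percent: float) -> bool:
    """Check whether the reduction is at least the stated percentage."""
    return percent_reduction(control_level, treated_level) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical values: control = 200 units, treated = 50 units.
    print(percent_reduction(200.0, 50.0))
    print(meets_threshold(200.0, 50.0, 70.0))
```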
  • the nucleic acid molecule(s) may be in nucleic acid expression vector(s), which may include expression control sequences such as promoters, enhancers, transcription signal sequences, transcription termination sequences, introns, polyadenylation signals, Kozak consensus sequences, internal ribosome entry sites (IRES), etc.
  • expression control sequences are well known in the art.
  • a vector may also comprise a sequence encoding a signal peptide (e.g., for nuclear localization, nucleolar localization, or mitochondrial localization), associated with (e.g., inserted into or fused to) a sequence coding for a protein.
  • an AAV variant has at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% amino acid sequence identity to a wildtype AAV.
  • the AAV variant may be engineered such that its capsid proteins have reduced immunogenicity or enhanced transduction ability in humans.
  • one or more regions of at least two different AAV serotype viruses are shuffled and reassembled to generate a chimeric variant.
  • a chimeric AAV may comprise inverted terminal repeats (ITRs) that are of a heterologous serotype compared to the serotype of the capsid.
  • a chimeric variant of an AAV includes amino acid sequences from 2, 3, 4, 5, or more different AAV serotypes.
  • Non-viral systems are also contemplated for delivery as described herein.
  • Non-viral systems include, but are not limited to, nucleic acid transfection methods including electroporation, sonoporation, calcium phosphate transfection, microinjection, DNA biolistics, lipid-mediated transfection, transfection through heat shock, compacted DNA-mediated transfection, lipofection, cationic agent-mediated transfection, and transfection with liposomes, immunoliposomes, or cationic facial amphiphiles (CFAs).
  • one or more mRNAs encoding epigenetic editor fusion proteins as described herein may be co-electroporated with one or more guide polynucleotides (e.g., gRNAs) as described herein.
  • One important category of non-viral nucleic acid vectors is nanoparticles, which can be organic (e.g., lipid) or inorganic (e.g., gold).
  • organic (e.g. lipid and/or polymer) nanoparticles can be suitable for use as delivery vehicles in certain embodiments of this disclosure.
  • an LNP as described herein may be made from cationic, anionic, or neutral lipids.
  • an LNP may comprise neutral lipids, such as the fusogenic phospholipid 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE) or the membrane component cholesterol, as helper lipids to enhance transfection activity and nanoparticle stability.
  • an LNP may comprise hydrophobic lipids, hydrophilic lipids, or both hydrophobic and hydrophilic lipids. Any lipid or combination of lipids that are known in the art can be used to produce an LNP. The lipids may be combined in any molar ratios to produce the LNP.
  • the LNP is a liver-targeting (e.g., preferentially or specifically targeting the liver) LNP.
  • the cells may be eukaryotic or prokaryotic.
  • the cells are mammalian (e.g., human) cells.
  • Human cells may include, for example, neurons, glial cells, hepatocytes, biliary epithelial cells (cholangiocytes), stellate cells, Kupffer cells, and liver sinusoidal endothelial cells.
  • the cell can be a non-dividing cell.
  • the cell can be a neuron.
  • a nucleic acid molecule described herein, or component(s) thereof, is delivered to a host cell for transient expression, e.g., via a transient expression vector.
  • Transient expression of the nucleic acid molecule or its component(s) may result in prolonged or permanent epigenetic modification of the target gene or the nucleic acid molecule expressing itself.
  • the epigenetic modification may be stable for at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 weeks or more; or 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 months or more, after introduction of the epigenetic editor into the host cell.
  • the epigenetic modification may be maintained after one or more mitotic and/or meiotic events of the host cell. In particular embodiments, the epigenetic modification is maintained across generations in offspring generated or derived from the host cell.
  • the present disclosure also provides methods for treating or preventing a condition in a subject, comprising administering to the subject a nucleic acid molecule comprising an epigenetic editor or pharmaceutical composition as described herein.
  • the method for treating or preventing a condition in a subject comprises contacting a cell with the nucleic acid molecule disclosed herein.
  • the epigenetic editor of the nucleic acid molecule disclosed herein may effectuate an epigenetic modification of a target polynucleotide sequence in a target gene associated with a disease, condition, or disorder in the subject, thereby modulating expression of the target gene to treat or prevent the disease, condition, or disorder.
  • the epigenetic editor reduces the expression of the target gene to an extent sufficient to achieve a desired effect, e.g., a therapeutically relevant effect such as the prevention or treatment of the disease, condition, or disorder.
  • the disease, condition, or disorder comprises nervous system diseases or conditions. In some embodiments, the disease, condition, or disorder is of the central nervous system. In some embodiments, the method of treating or preventing a disease, condition, or disorder (e.g., nervous system diseases, central nervous system diseases) results in silencing gene expression in the neurons. In some embodiments, the method of treating or preventing a disease, condition, or disorder (e.g., nervous system diseases) results in reduction of gene expression in the neurons of the subject.
  • “Treat,” “treating” and “treatment” refer to a method of alleviating or abrogating a biological disorder and/or at least one of its attendant symptoms.
  • to “alleviate” a disease, disorder or condition means reducing the severity and/or occurrence frequency of the symptoms of the disease, disorder, or condition.
  • references herein to “treatment” include references to curative, palliative and prophylactic treatment.
  • alleviating a symptom may involve reduction of the symptom by at least 3%, 5%, 10%, 20%, 40%, 50%, 60%, 80%, 90%, 95%, 98%, 99%, 99.5%, 99.9%, or 100% as measured by any standard technique.
  • the subject may be a mammal, e.g., a human.
  • the subject is selected from a non-human primate such as chimpanzee, cynomolgus monkey, or macaque, and other apes and monkey species.
  • a nucleic acid molecule comprising an epigenetic editor and a regulatory element of the present disclosure may be administered in a therapeutically effective amount to a patient with a condition described herein.
  • “Therapeutically effective amount,” as used herein, refers to an amount of the therapeutic agent being administered that will relieve to some extent one or more of the symptoms of the disorder being treated, and/or result in clinical endpoint(s) desired by healthcare professionals.
  • An effective amount for therapy may be measured by its ability to stabilize disease progression and/or ameliorate symptoms in a patient, and preferably to reverse disease progression.
  • an epigenetic editor of the nucleic acid molecule of the present disclosure may be evaluated by in vitro assays, e.g., as described herein, as well as in suitable animal models that are predictive of the efficacy in humans.
  • Suitable dosage regimens will be selected in order to provide an optimum therapeutic response in each particular situation, for example, administered as a single bolus or as a continuous infusion, and with possible adjustment of the dosage as indicated by the exigencies of each case.
  • nucleic acid molecules comprising regulatory elements, epigenetic editors or components thereof (or nucleic acid molecules encoding the epigenetic editors or components thereof) of the present disclosure may be administered by any method accepted in the art (e.g., parenterally, intravenously, intradermally, or intramuscularly).

IX. Definitions

  • nucleic acid refers to any oligonucleotide or polynucleotide containing nucleotides (e.g., deoxyribonucleotides or ribonucleotides) in either single- or double-strand form, and includes DNA and RNA.
  • nucleotides contain a sugar deoxyribose (DNA) or ribose (RNA), a base, and a phosphate group, and are linked together through the phosphate groups.
  • Bases include purines and pyrimidines, which include natural compounds such as adenine, thymine, guanine, cytosine, uracil, inosine, and natural analogs; as well as synthetic derivatives of purines and pyrimidines, which include, but are not limited to, modified versions which place new reactive groups such as amines, alcohols, thiols, carboxylates, alkylhalides, etc.
  • Nucleic acids may contain known nucleotide analogs and/or modified backbone residues or linkages, which may be synthetic, naturally occurring, and non-naturally occurring. Such nucleotide analogs, modified residues, and modified linkages are well known in the art, and may provide a nucleic acid molecule with enhanced cellular uptake, reduced immunogenicity, and/or increased stability in the presence of nucleases.
  • an “isolated” or “purified” nucleic acid molecule is a nucleic acid molecule that exists apart from its native environment.
  • an “isolated” or “purified” nucleic acid molecule (1) has been separated away from the nucleic acids of the genomic DNA or cellular RNA of its source of origin; and/or (2) does not occur in nature.
  • an “isolated” or “purified” nucleic acid molecule is a recombinant nucleic acid molecule.
  • variants, derivatives, homologs, and fragments thereof may have the specific sequence of residues (whether amino acid or nucleic acid residues) modified in such a manner that the polypeptide or polynucleotide in question substantially retains at least one of its endogenous functions.
  • a variant sequence can be obtained by addition, deletion, substitution, modification, replacement and/or variation of at least one residue present in the naturally-occurring sequence (in some embodiments, no more than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, or 20 residues).
  • the present disclosure also contemplates any of the protein’s naturally occurring forms, or variants or homologs that retain at least one of its endogenous functions (e.g., at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, or 99% of its function as compared to the specific protein described).
  • a homologue of any polypeptide or nucleic acid sequence contemplated herein includes sequences having a certain homology with the wildtype amino acid and nucleic sequence.
  • a homologous sequence may include a sequence, e.g., an amino acid sequence, which may be at least 50%, 55%, 65%, 75%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the subject sequence.
  • the term “percent identical” in the context of amino acid or nucleotide sequences refers to the percent of residues in two sequences that are the same when aligned for maximum correspondence.
  • the length of a reference sequence aligned for comparison purposes is at least 30% (e.g., at least 40%, 50%, 60%, 70%, 80%, 90%, or 100%) of the reference sequence.
  • Sequence identity may be measured using sequence analysis software (for example, Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, Wis. 53705, BLAST, BESTFIT, GAP, or PILEUP/PRETTYBOX programs). Such software matches identical or similar sequences by assigning degrees of homology to various substitutions, deletions, and/or other modifications.
  • the percent identity of two nucleotide or polypeptide sequences is determined by, e.g., BLAST® using default parameters (available at the U.S. National Library of Medicine’s National Center for Biotechnology Information website).
  • the length of a reference sequence aligned for comparison purposes is at least 30% (e.g., at least 40%, 50%, 60%, 70%, 80%, or 90%) of the reference sequence.
  • an epigenetic editor as described herein may modulate the activity of a promoter sequence by binding to a motif within the promoter, thereby inducing, enhancing, or suppressing transcription of a gene operatively linked to the promoter sequence.
  • an epigenetic editor as described herein may block RNA polymerase from transcribing a gene, or may inhibit translation of an mRNA transcript.
  • inhibitor when used in reference to an epigenetic editor or a component thereof as described herein, refers to decreasing or preventing the activity (e.g., transcription) of a nucleic acid sequence (e.g., a target gene) or protein relative to the activity of the nucleic acid sequence or protein in the absence of the epigenetic editor or component thereof.
  • the term may include partially or totally blocking activity, or preventing or delaying activity.
  • the inhibited activity may be, e.g., 10%, 20%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% less than that of a control, or may be, e.g., at least 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, or 10-fold less than that of a control.
  • Ranges provided herein are understood to be shorthand for all of the values within the range.
  • a range of 1 to 50 is understood to include any number, combination of numbers, or sub-range from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
  • a nested sub-range of an exemplary range of 1 to 50 may comprise 1 to 10, 1 to 20, 1 to 30, and 1 to 40 in one direction, or 50 to
  • variant SLiCs (FIGs. 1A-1B; Tables 4-10) were developed using variant promoter configurations and variant effector targeting sites to have epigenetic silencing activity on endogenous targets and epigenetic silencing activity on themselves.
  • self-limiting plasmids were designed to express fluorescent proteins (GFP, d2EGFP, or mCherry), which could be measured via Incucyte or flow cytometry. After transfections were completed, cells were placed in the Incucyte and images were taken every 24 h to track GFP and/or mCherry expression levels. At day 3 and/or day 6, cells were stained with a live/dead stain (Zombie Violet) and analyzed for GFP and/or mCherry expression levels. Results were plotted using PRISM software.
  • Tables 4-10 show the locations of the components of various SLiCs.
  • Table 11 lists the corresponding sequences of each SLiC described in Tables 4-10.
  • Plasmid 1 CMV-CLTA CGI-d2EGFP (SEQ ID NO: 1093)
  • Plasmid 2 CLTA CGI-CMV-d2EGFP (SEQ ID NO: 1094)
  • Plasmid 3 All-in-one SLiC ZFoff CLTA (CMV-CGI) (SEQ ID NO: 1095)
  • SLiCs with variant promoter configurations were constructed and showed epigenetic silencing activity of ZFoff and CRISPRoff on the SLiCs.
  • HEK293T GripTite cells were co-transfected using the transfection protocol in Example 1 with 50 ng SLiC comprising a CMV enhancer domain, a CMV promoter domain, a CLTA CGI, and d2EGFP, along with either 50 ng CRISPRoff (e.g., dead Cas9 fusion protein) and 25 ng gRNA, or 50 ng ZFoff.
  • the CLTA CGI contains a target site for the ZFoff or CRISPRoff/gRNA complex.
  • d2EGFP expression was used as a marker for plasmid epigenetic silencing activity.
  • GFP expression was determined by FACS.
  • the results show a decrease in the percent of GFP positive cells (FIGs. 2A-2B, 3A-3B, and 5A) or GFP MFI (FIG. 4) in samples treated with CRISPRoff or ZFoff versus control, indicating plasmid silencing.
  • the results showed that regardless of the target positions (TAR) within the CLTA CGI, there was a decrease in the percent of GFP positive cells (FIGs. 3C-3D).
  • with a SLiC comprising a CD151 CGI containing a target site for the ZFoff complex, there was also a decrease in the percent of GFP positive cells (FIG. 5B).
  • variant SLiCs (e.g., Table 6, Table 7) were tested and showed epigenetic silencing of the endogenous target and itself.
  • HEK293T GripTite (CLTA-GFP) cells were transfected using the transfection protocol in Example 1 with 50 ng ZFoff SLiC comprising a CMV enhancer, a CMV promoter, a CGI, a DNMT-ZF-KRAB fusion, a T2A domain, and mCherry, or with 50 ng ZFi SLiC.
  • the CGI contains a target site for the ZFoff.
  • GFP and mCherry expression were read by FACS on day 6 as markers for endogenous epigenetic silencing and plasmid epigenetic silencing respectively.
  • ZFoff treatment reduced the percent of GFP positive cells (FIG. 7A) and mCherry positive cells (FIG. 7B) compared to the control treatment, demonstrating simultaneous endogenous target silencing and plasmid silencing activity.
  • Endogenous silencing was durable with stable silencing up to 28 days (FIG. 8).
  • the methylation profile of the CGI domain was analyzed at day 6, confirming methylation of the SLiC (FIG. 9A). The percentage of methylation also confirmed methylation of the SLiC, while the TAR cytosines were unmethylated.
  • synthetic CGIs (e.g., Table 9, Table 10) are constructed and show epigenetic silencing activity of SLiCs with synthetic CGIs.
  • Synthetic CGIs are designed by identifying endogenous PCSK9 and CLTA CGI sequences. The order of the CG and GC dinucleotides is then shuffled while maintaining the total number of CG and GC dinucleotides. The resulting sequences are then filtered by BLAST to exclude sequences with homology to the human genome, and filtered by HOMER to include sequences with transcription factor motifs similar to those of the original CGI. Finally, transcription factor motifs from the original CGI are added back.
  • HEK293T GripTite cells were transfected using the protocol from Example 1 with 50 ng SLiC containing the synthetic CGI.
  • the SLiC comprises a CMV enhancer, a CMV promoter, a synthetic CGI, and d2EGFP where GFP expression is used as a marker for plasmid silencing.
  • GFP is read by FACS on Day 6 demonstrating the SLiCs were functional with synthetic CGIs (FIG. 11) and that these SLiCs can be silenced upon treatment with ZFoff (FIG. 12).
  • a promoter e.g., CLTA promoter, PCSK9 promoter
  • SEQ ID NOs (SEQ) of nucleotide (nt) and amino acid (aa) sequences described in the present disclosure are listed in Table 11 below.
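
The definition of “percent identical” given above (the percent of residues that are the same when two sequences are aligned for maximum correspondence) can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned by a tool such as BLAST, with gaps written as "-". The function name `percent_identity` and the choice of denominator (all alignment columns, so that gapped columns count as mismatches) are illustrative assumptions and are not taken from the disclosure; alignment programs may report identity over a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns that hold the same residue in both sequences.

    Assumption: the inputs are already aligned and therefore have equal length.
    Columns in which either sequence has a gap are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns where both sequences carry a gap; they hold no information.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy example (hypothetical sequences, not from the disclosure):
# three of the four columns match, so the result is 75.0.
example = percent_identity("AC-T", "ACGT")
```

Under these assumptions, a sequence would meet an "at least 90% identical" threshold when `percent_identity(...) >= 90.0`.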

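The synthetic CGI design step described above, in which the CG and GC dinucleotides are reordered while their total numbers are maintained, might be sketched in Python as follows. The disclosure does not specify the exact shuffling procedure, so everything here is an assumption made for illustration: the function names, the left-to-right scan for non-overlapping CG/GC sites, and the rejection of any candidate whose overlapping CG or GC counts differ from the original. The BLAST and HOMER filtering steps and the re-insertion of transcription factor motifs are not modeled.

```python
import random
from typing import Optional, Tuple


def count_cg_gc(seq: str) -> Tuple[int, int]:
    """Count overlapping occurrences of the CG and GC dinucleotides."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return pairs.count("CG"), pairs.count("GC")


def shuffle_cg_gc(seq: str, rng: random.Random, max_tries: int = 1000) -> Optional[str]:
    """Return a variant of seq with its CG/GC sites reordered, or None if none is found.

    Hypothetical procedure: only the dinucleotides at the detected sites change;
    all other bases stay in place, and the overall CG and GC counts are preserved.
    """
    seq = seq.upper()
    # Locate non-overlapping CG/GC sites in a single left-to-right scan.
    sites = []
    i = 0
    while i < len(seq) - 1:
        if seq[i:i + 2] in ("CG", "GC"):
            sites.append(i)
            i += 2
        else:
            i += 1
    original = [seq[pos:pos + 2] for pos in sites]
    target = count_cg_gc(seq)
    for _ in range(max_tries):
        shuffled = original[:]
        rng.shuffle(shuffled)
        bases = list(seq)
        for pos, dinucleotide in zip(sites, shuffled):
            bases[pos:pos + 2] = dinucleotide
        candidate = "".join(bases)
        # Keep only candidates that differ from the input and preserve both counts.
        if candidate != seq and count_cg_gc(candidate) == target:
            return candidate
    return None


# Toy input (hypothetical, not an actual PCSK9 or CLTA CGI sequence).
toy_cgi = "ATCGTTGCAACGTT"
variant = shuffle_cg_gc(toy_cgi, random.Random(0))
```

Candidates produced this way would still need the homology and motif filtering described in the text before being used as synthetic CGIs.
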
Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Microbiology (AREA)
  • Biophysics (AREA)
  • Plant Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Medicinal Chemistry (AREA)
  • Toxicology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Virology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)

Abstract

In certain aspects, the present disclosure relates to compositions, methods, strategies, and treatment modalities associated with epigenetic modification using a self-limiting approach.
PCT/US2024/043629 2023-08-25 2024-08-23 Compositions et procédés de régulation épigénétique Pending WO2025049303A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363578902P 2023-08-25 2023-08-25
US63/578,902 2023-08-25

Publications (1)

Publication Number Publication Date
WO2025049303A1 true WO2025049303A1 (fr) 2025-03-06

Family

ID=94820215

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/043629 Pending WO2025049303A1 (fr) 2023-08-25 2024-08-23 Compositions et procédés de régulation épigénétique

Country Status (1)

Country Link
WO (1) WO2025049303A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150267205A1 (en) * 2014-03-18 2015-09-24 Sangamo Biosciences, Inc. Methods and compositions for regulation of zinc finger protein expression
US20220010375A1 (en) * 2018-12-14 2022-01-13 Aarhus Universitet Control plasmids and uses thereof
WO2022140577A2 (fr) * 2020-12-22 2022-06-30 Chroma Medicine, Inc. Compositions et méthodes pour l'édition épigénétique

Similar Documents

Publication Publication Date Title
AU2023265968A1 (en) Compositions and methods for epigenetic regulation of pcsk9 expression
US20240132855A1 (en) Compositions and methods for epigenetic regulation of hbv gene expression
AU2024230979A1 (en) Compositions and methods for epigenetic regulation of pcsk9 expression
WO2024238700A1 (fr) Compositions et méthodes pour la régulation épigénétique de l'expression du gène vhb
AU2024270764A1 (en) Compositions and methods for epigenetic regulation of hbv gene expression
AU2023289695A1 (en) Compositions and methods for epigenetic regulation of b2m expression
WO2025049303A1 (fr) Compositions et procédés de régulation épigénétique
WO2024145615A2 (fr) Compositions et méthodes de régulation épigénétique de l'expression de angptl3
US20250387518A1 (en) Compositions and methods for epigenetic regulation of b2m expression
US20250236847A1 (en) Compositions and methods for epigenetic regulation of hbv gene expression
WO2024229020A2 (fr) Compositions et procédés pour la régulation épigénétique de l'expression de pcsk9
WO2025106739A1 (fr) Compositions et procédés de régulation épigénétique de l'expression de htt
WO2025101979A1 (fr) Compositions et méthodes de régulation épigénétique de gènes pour le traitement de la stéatohépatite non alcoolique ou de la stéatohépatite associée à un dysfonctionnement métabolique
WO2025106523A1 (fr) Compositions et procédés de régulation épigénétique de l'expression de f8
WO2024081879A1 (fr) Compositions et méthodes pour régulation épigénétique de l'expression de cd247
WO2025049789A1 (fr) Compositions et méthodes de régulation épigénétique de l'expression de adora2a
EP4544057A1 (fr) Compositions et procédés de régulation épigénétique de l'expression trac
WO2025264819A1 (fr) Compositions et procédés pour la régulation épigénétique de l'expression de znf410
WO2025019807A2 (fr) Compositions et procédés aux fins de la régulation épigénétique de l'expression de rfxap
WO2025038840A1 (fr) Compositions et procédés d'édition épigénétique
WO2025049792A1 (fr) Compositions et méthodes destinées à la régulation épigénétique de l'expression de tgfbr2
WO2024238679A9 (fr) Compositions et méthodes de régulation épigénétique de l'expression de cd3gamma, de cd3delta et de rfx5
WO2023250512A1 (fr) Compositions et procédés de régulation épigénétique de l'expression de ciita
US20250387514A1 (en) Compositions and methods for epigenetic regulation of trac expression
WO2025038821A1 (fr) Procédés et compositions de modification épigénétique multiplexe

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24860788

Country of ref document: EP

Kind code of ref document: A1