WO2025172421A1 - Compositions and methods for treating Huntington's disease - Google Patents

Compositions and methods for treating Huntington's disease

Info

Publication number
WO2025172421A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
sequence
nos
polynucleotide
expression system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/EP2025/053831
Other languages
English (en)
Inventor
Oliver FREEMAN
Pinar AKCAKAYA
Sasa SVIKOVIG
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AstraZeneca AB
Original Assignee
AstraZeneca AB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by AstraZeneca AB

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113: Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00: Structure or type of the nucleic acid
    • C12N2310/10: Type of nucleic acid
    • C12N2310/20: Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2320/00: Applications; Uses
    • C12N2320/30: Special therapeutic applications
    • C12N2320/34: Allele or polymorphism specific uses

Definitions

  • the present disclosure provides polynucleotides, expression vectors, compositions, and methods for treatment of Huntington’s Disease (HD).
  • the polynucleotides, expression vectors, and compositions provided herein are components of a CRISPR system.
  • the disclosure provides a polynucleotide comprising a guide sequence and a scaffold sequence, wherein the guide sequence targets human genome coordinates chr4:3078423-3078443, and the scaffold sequence comprises any one of SEQ ID NOs:21-95.
  • the present disclosure provides a polynucleotide including a guide sequence and a scaffold sequence, wherein the guide sequence targets human genome coordinates chr4:3078423-3078443, and the scaffold sequence includes any one of SEQ ID NOs:21-95.
  • the guide sequence does not form a secondary structure.
  • the secondary structure is a stem loop.
  • the guide sequence includes any one of SEQ ID NOs:1-12.
  • the present disclosure provides a polynucleotide including a guide sequence and a scaffold sequence, wherein the guide sequence includes any one of SEQ ID NOs:2-12, and wherein the guide sequence including SEQ ID NO:2 does not include a 5' guanine.
  • the guide sequence includes SEQ ID NO:2 and does not include a 5' guanine.
  • the scaffold sequence is capable of binding to a Cas protein.
  • the Cas protein is Cas9 from Staphylococcus aureus (SaCas9).
  • the scaffold sequence does not include an early termination signal sequence.
  • the early termination signal sequence includes 4 to 6 consecutive thymine bases.
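A minimal sketch of how a scaffold sequence could be screened for the early termination signal described above (a run of 4 to 6 consecutive thymine bases). The helper name `find_poly_t_runs` and both example sequences are hypothetical and are not taken from the sequence listing; the pattern reports any run of four or more thymines, which covers the 4-to-6 range.

```python
import re

# Hypothetical helper: any run of 4 or more consecutive T bases is reported,
# which includes the 4-to-6 thymine signal described in the text.
POLY_T = re.compile(r"T{4,}")

def find_poly_t_runs(dna: str) -> list:
    """Return (start, length) for each run of at least 4 consecutive thymines."""
    return [(m.start(), len(m.group())) for m in POLY_T.finditer(dna.upper())]

# Made-up sequences for illustration only.
with_signal = "GTTTTAGTACTCTG"
without_signal = "GTTATAGTACTCTG"

print(find_poly_t_runs(with_signal))     # [(1, 4)]
print(find_poly_t_runs(without_signal))  # []
```

Breaking the run with a single substitution, as in the second example, is one simple way such a signal could be avoided; which substitutions the disclosure actually uses is defined by its sequence listing, not by this sketch.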
  • the scaffold sequence includes a stabilized secondary structure. In some embodiments, the stabilized secondary structure includes a locked loop. [008] In some embodiments, the scaffold sequence includes any one of SEQ ID NOs:20-95. In some embodiments, the scaffold sequence includes any one of SEQ ID NOs:20-22, 25, 28-31, and 44-47. [009] In some embodiments, the present disclosure provides a polynucleotide including any one of SEQ ID NOs:97-1007.
  • the present disclosure provides a polynucleotide, including any one of SEQ ID NOs:97, 98, 101, 104-107, 120-123, 172-174, 177, 180-183, and 196-199. In some embodiments, the present disclosure provides a polynucleotide, including any one of SEQ ID NOs: 97, 98, 101, 104-107, 120-123, 172-174, 183, 196, 199, 248, 324, 400, 476, 552, 628, 704, 780, 856, and 932.
  • the present disclosure provides a polynucleotide including a guide sequence and a scaffold sequence, wherein: (i) the guide sequence includes any one of SEQ ID NOs:2-12, and the scaffold sequence includes any one of SEQ ID NOs:20-95, wherein the guide sequence including SEQ ID NO:2 does not include a 5' guanine; or (ii) the guide sequence includes any one of SEQ ID NOs:1-12, and the scaffold sequence includes any one of SEQ ID NOs:21-95. [011] In some embodiments, the present disclosure provides an expression system including a nucleic acid sequence encoding the polynucleotide described herein.
  • the nucleic acid sequence is a first nucleic acid sequence
  • the polynucleotide is a first polynucleotide
  • the guide sequence is a first guide sequence
  • the scaffold sequence is a first scaffold sequence
  • the expression system further includes a second nucleic acid sequence encoding a second polynucleotide, wherein the second polynucleotide includes a second guide sequence and a second scaffold sequence, wherein the second guide sequence targets a sequence within human genome coordinates chr4:3068346-3074862.
  • the second guide sequence does not form a secondary structure.
  • the secondary structure is a stem loop.
  • the second guide sequence includes any one of SEQ ID NOs:13-19. In some embodiments, the second guide sequence includes SEQ ID NO:13. [014] In some embodiments, the second scaffold sequence is capable of binding to a Cas protein. In some embodiments, the Cas protein is SaCas9. In some embodiments, the second scaffold sequence does not include an early termination signal sequence. In some embodiments, the early termination signal sequence includes 4 to 6 consecutive thymine bases. In some embodiments, the second scaffold sequence includes a stabilized secondary structure. In some embodiments, the stabilized secondary structure includes a locked loop. [015] In some embodiments, the second scaffold sequence includes any one of SEQ ID NOs:20-95.
  • the second scaffold sequence includes any one of SEQ ID NOs:20-22, 25, 28-31, and 44-47. [016] In some embodiments, the second polynucleotide includes any one of SEQ ID NOs:1008-1539. In some embodiments, the second polynucleotide includes any one of SEQ ID NOs:1008-1010, 1013, 1016-1019, and 1032-1035. In some embodiments, the second polynucleotide includes any one of SEQ ID NOs:1008-1010, 1013, 1016-1019, 1032, 1035, 1084, 1160, 1236, 1312, 1388, and 1464.
  • the present disclosure provides an expression system including: a first nucleic acid sequence encoding a first polynucleotide including any one of SEQ ID NOs:96-1007; and a second nucleic acid sequence encoding a second polynucleotide including any one of SEQ ID NOs:1008-1539.
  • the first polynucleotide includes any one of SEQ ID NOs:96-98, 101, 104-107, 120-123, 172-174, 177, 180-183, and 196-199.
  • the first polynucleotide includes any one of SEQ ID NOs:96-98, 101, 104-107, 120-123, 172-174, 183, 196, 199, 248, 324, 400, 476, 552, 628, 704, 780, 856, and 932.
  • the second polynucleotide includes any one of SEQ ID NOs:1008-1010, 1013, 1016-1019, and 1032-1035.
  • the second polynucleotide includes any one of SEQ ID NOs:1008-1010, 1013, 1016-1019, 1032, 1035, 1084, 1160, 1236, 1312, 1388, and 1464.
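The SEQ ID NO ranges listed above are consistent with a simple numbering scheme in which each full-length polynucleotide pairs one guide with one of the 76 scaffolds (SEQ ID NOs:20-95), counted guide by guide. This scheme is inferred from the listed numbers and is not stated in the disclosure; the function name `combined_seq_id` is hypothetical. A minimal sketch under that assumption:

```python
# Assumed layout (inferred from the listed ranges, not stated in the source):
# first polynucleotides start at SEQ ID NO:96 and use guides 1-12,
# second polynucleotides start at SEQ ID NO:1008 and use guides 13-19,
# and each guide is paired with scaffolds 20-95 in ascending order.
SCAFFOLD_FIRST, SCAFFOLD_LAST = 20, 95
N_SCAFFOLDS = SCAFFOLD_LAST - SCAFFOLD_FIRST + 1  # 76 scaffolds

def combined_seq_id(guide: int, scaffold: int) -> int:
    """Map (guide SEQ ID NO, scaffold SEQ ID NO) to the inferred combined SEQ ID NO."""
    if not SCAFFOLD_FIRST <= scaffold <= SCAFFOLD_LAST:
        raise ValueError("scaffold must be in 20-95")
    if 1 <= guide <= 12:
        base, first_guide = 96, 1
    elif 13 <= guide <= 19:
        base, first_guide = 1008, 13
    else:
        raise ValueError("guide must be in 1-19")
    return base + (guide - first_guide) * N_SCAFFOLDS + (scaffold - SCAFFOLD_FIRST)

preferred_scaffolds = [20, 21, 22, 25, 28, 29, 30, 31, 44, 45, 46, 47]
print([combined_seq_id(1, s) for s in preferred_scaffolds])
print(combined_seq_id(12, 95), combined_seq_id(19, 95))
```

Under this assumption, guide 1 with the preferred scaffolds gives 96-98, 101, 104-107, and 120-123, and the last combinations fall on 1007 and 1539, matching the ranges given above.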
  • the expression system includes a vector.
  • the vector is a viral vector.
  • the viral vector is a lentiviral vector, an adenoviral vector, or an adeno-associated viral vector.
  • the first nucleic acid sequence and the second nucleic acid sequence are on a single vector.
  • each of the first nucleic acid sequence and the second nucleic acid sequence is on a separate vector.
  • the expression system including the first nucleic acid sequence further includes a third nucleic acid sequence encoding a Cas protein capable of forming a complex with the first polynucleotide.
  • the third nucleic acid sequence is on a separate vector from the first nucleic acid sequence. In some embodiments, the first nucleic acid sequence and the third nucleic acid sequence are on a single vector. In some embodiments, the Cas protein is SaCas9. [021] In some embodiments, the expression system including the first nucleic acid sequence and the second nucleic acid sequence further includes a third nucleic acid sequence encoding a Cas protein capable of forming a complex with the first polynucleotide and/or the second polynucleotide. In some embodiments, the first, second, and third nucleic acid sequences are on a single vector.
  • the first and second nucleic acid sequences are on a first vector, and the third nucleic acid sequence is on a second vector; or wherein the first and third nucleic acid sequences are on a first vector, and the second nucleic acid sequence is on a second vector; or wherein the second and third nucleic acid sequences are on a first vector, and the first nucleic acid sequence is on a second vector.
  • each of the first, second, and third nucleic acid sequences is on a separate vector.
  • the Cas protein is SaCas9.
  • the present disclosure provides a composition including a Cas protein, and one or both of: a) a first polynucleotide including a first guide sequence, wherein: (i) the first guide sequence targets human genome coordinates chr4:3078423-3078443, and the first polynucleotide further includes a first scaffold sequence including any one of SEQ ID NOs:21-95; or (ii) the first guide sequence includes any one of SEQ ID NOs:2-12, wherein the guide sequence including SEQ ID NO:2 does not include a 5' guanine; and b) a second polynucleotide including a second guide sequence, wherein: (i) the second guide sequence targets a sequence within human genome coordinates chr4:3068346-3074862, and the second polynucleotide further includes a second scaffold sequence including any one of SEQ ID NOs:21-95; or (ii) the second guide sequence includes any one of SEQ ID NO
  • the Cas protein is Cas9 from Staphylococcus aureus (SaCas9).
  • the first guide sequence and/or the second guide sequence does not form a secondary structure.
  • the secondary structure is a hairpin loop.
  • the composition includes both the first and second polynucleotides.
  • the first polynucleotide includes the first guide sequence including any one of SEQ ID NOs:2-12, wherein the guide sequence including SEQ ID NO:2 does not include a 5' guanine, and wherein the first polynucleotide further includes a first scaffold sequence.
  • the second polynucleotide includes the second guide sequence including any one of SEQ ID NOs:13-19, and wherein the second polynucleotide further includes a second scaffold sequence.
  • the first and second scaffold sequences are each capable of binding to the Cas protein.
  • the first scaffold sequence and the second scaffold sequence are identical.
  • the first scaffold sequence and the second scaffold sequence are different.
  • the first scaffold sequence and/or the second scaffold sequence does not include an early termination signal sequence.
  • the early termination signal sequence includes 4 to 6 consecutive thymine bases.
  • the first scaffold sequence and/or the second scaffold sequence includes a stabilized secondary structure.
  • the stabilized secondary structure includes a locked hairpin loop.
  • the first scaffold sequence and/or the second scaffold sequence includes any one of SEQ ID NOs:20-96.
  • the first scaffold sequence and/or the second scaffold sequence includes any one of SEQ ID NOs:20-22, 25, 28-31, and 44-47.
  • the first polynucleotide includes any one of SEQ ID NOs:96-1007.
  • the first polynucleotide includes any one of SEQ ID NOs:96-98, 101, 104-107, 120-123, 172-174, 177, 180-183, 196-199, 248, 324, 400, 476, 552, 628, 704, 780, 856, and 932.
  • the second polynucleotide includes any one of SEQ ID NOs:1008-1539.
  • the second polynucleotide includes any one of SEQ ID NOs:1008-1010, 1013, 1016-1019, 1032-1035, 1084, 1160, 1236, 1312, 1388, and 1464.
  • the first polynucleotide and the second polynucleotide are on a vector.
  • the vector is a viral vector.
  • the viral vector is a lentiviral vector, an adenoviral vector, or an adeno-associated viral vector.
  • the present disclosure provides a delivery particle including the polynucleotide, the expression system, or the composition described herein, or combination thereof.
  • the delivery particle includes a lipid-based particle or a virus-like particle.
  • the delivery particle includes a liposome, a micelle, a vesicle, an exosome, or a lipid nanoparticle.
  • the present disclosure provides a cell including the polynucleotide, the expression system, the composition, or the delivery particle described herein, or combination thereof.
  • the present disclosure provides a method of reducing or removing CAG repeats in a mutant allele of a huntingtin (HTT) gene of a cell, comprising introducing to the cell: the polynucleotide, the expression system, the composition, or the delivery particle described herein, or combination thereof.
  • the mutant allele includes greater than 35 CAG repeats prior to the introducing, and fewer than 35 CAG repeats following the introducing.
  • the present disclosure provides a method of treating Huntington’s Disease (HD) in a subject in need thereof, comprising administering to the subject: the polynucleotide, the expression system, the composition, or the delivery particle described herein, or combination thereof.
  • the subject has greater than 35 CAG repeats in a mutant allele of a huntingtin (HTT) gene prior to the administering.
  • the subject has fewer than 35 CAG repeats in the mutant allele following the administering.
  • following the administering all CAG repeats of the mutant allele are removed.
  • exon 1 of the mutant allele is removed.
  • the method reduces levels of RNA encoded by and/or levels of protein expressed from the mutant allele by at least 30%, and wherein the method does not reduce levels of RNA encoded by and/or levels of protein expressed from a wild-type allele of the HTT gene by more than 50%. In some embodiments, the method reduces the levels of RNA encoded by and/or levels of protein expressed from the mutant allele by at least 50%, and wherein the method does not reduce the levels of RNA encoded by and/or levels of protein expressed from a wild-type allele of the HTT gene by more than 50%.
  • FIG. 1 shows an exemplary selective editing strategy of the huntingtin (HTT) gene mutant allele as described in embodiments herein.
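  • a first guide RNA (gRNA) targets a sequence containing a single nucleotide polymorphism (SNP) that is present in the mutant allele but not in the wild-type allele.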
  • a second gRNA targets a sequence adjacent to a PAM in both the mutant and wild-type alleles.
  • the mutant allele is cleaved at both the first gRNA and second gRNA target sequences, thereby deleting exon 1 containing an expanded CAG repeats region.
  • the wild-type allele is only cleaved at the second gRNA target sequence, leaving exon 1 intact upon repair of the cleaved second gRNA target sequence.
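The selective excision described for FIG. 1 can be illustrated with a toy string model. This is only a sketch: the sequences, the cut positions, and the assumption that a single cut is repaired without any sequence change are hypothetical simplifications, and `simulate_editing` is not a name from the disclosure.

```python
from typing import Optional

def simulate_editing(allele: str, upstream_cut: int, snp_cut: Optional[int]) -> str:
    """Toy model of the dual-gRNA strategy.

    upstream_cut: cut position shared by both alleles (second gRNA).
    snp_cut: cut position at the SNP site (first gRNA), or None when the
             allele lacks the targeted SNP and is therefore not cut there.
    """
    if snp_cut is None:
        # Only one cut: assumed here to be repaired without loss of sequence.
        return allele
    # Two cuts: the segment between them (containing exon 1) is deleted.
    return allele[:upstream_cut] + allele[snp_cut:]

# Made-up flanking sequence; N/n are placeholders, not real bases.
upstream, intron = "NNNNNNNN", "nnnnnnnn"
mutant = upstream + "CAG" * 45 + intron
wild_type = upstream + "CAG" * 20 + intron

edited_mutant = simulate_editing(mutant, upstream_cut=4, snp_cut=len(mutant) - 4)
edited_wild_type = simulate_editing(wild_type, upstream_cut=4, snp_cut=None)

print(edited_mutant.count("CAG"), edited_wild_type.count("CAG"))  # 0 20
```

In practice a repaired single cut may carry small insertions or deletions, which the figures described later quantify as percent indels; the toy model ignores this.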
  • FIG. 2A shows an exemplary assay schematic for testing 7 different upstream gRNAs paired with a single downstream gRNA targeting the rs3856973 SNP in intron 1 of the HTT gene, as described in embodiments herein.
  • FIG. 2B shows representative results of the assay of FIG. 2A.
  • the top panel shows a gel of the uncleaved and cleaved amplicon DNA.
  • the bottom panel shows quantification of the editing band intensity for a relative comparison of each of the gRNA pairs tested.
  • FIG. 3A shows an exemplary assay schematic for testing additional upstream gRNAs paired with the same downstream gRNA as in FIG.
  • FIG. 3B shows representative results of the assay of FIG. 3A.
  • the top panel shows a gel of the amplicon DNA spanning across gRNA cleavage sites, marked with arrows in FIG. 3A.
  • the size of edited amplicon upon dual gRNA excision is shorter than the size of unedited amplicon.
  • the bottom panel shows quantification of the editing band intensity for a relative comparison of each of the gRNA pairs tested. The experiment was performed in patient or healthy donor fibroblast cells. Synthetic gRNAs and SaCas9 mRNA were delivered using the Neon electroporation system.
  • FIGS. 4A and 4B show representative quantitative assessment of the editing strategy with the lead gRNA pairs shown in FIGS. 1, 2A, and 3A in patient-derived fibroblasts, with combinations of gRNA 1 targeting the SNP in intron 1 with gRNA 11, gRNA 25, gRNA 27, or gRNA 30, each of which targets a sequence upstream of HTT exon 1.
  • FIG. 4A shows the percent mutant allele excision as measured by ddPCR assay that detects the corrected sequence upon exon 1 excision.
  • FIG. 4B shows the percent knockdown of mRNA produced from the mutant vs. wild-type allele, normalized to a control group transfected with SaCas9 protein only.
  • FIG. 5A shows an exemplary schematic of a lentiviral vector design for expressing the first and second gRNAs and SaCas9, as described in embodiments herein.
  • FIGS. 5B and 5C show representative results of editing efficiency in HD patient-derived IPS neuron cells with the viral vector of FIG. 5A.
  • FIG. 5B shows percent indels detected at each gRNA target site upon editing, measured by amplicon-sequencing.
  • FIG. 5C shows percent mutant allele excision as measured by ddPCR assay that detects the corrected sequence upon exon 1 excision.
  • MOI multiplicity of infection.
  • FIG. 6A shows an exemplary predicted secondary structure of gRNA 1 as a synthetic gRNA, which does not contain the 5’ G.
  • FIG. 6B shows an exemplary predicted secondary structure of gRNA 1 containing the 5’ G for expression from an adeno-associated viral (AAV) vector or lentiviral vector, which forms a stem loop in the spacer sequence.
  • FIG. 6C shows an exemplary predicted secondary structure of gRNA 1 with a mutation of the adenine directly following the 5’ G (A1) to uracil.
  • FIG. 6D shows an exemplary predicted secondary structure of gRNA 1 with addition of a uracil directly following the 5’ G.
  • FIG. 6E shows representative results of an in vitro cleavage assay that tests gRNA 1 and gRNA 11 with or without a 5’ G.
  • FIG. 7A shows an exemplary sequence of gRNA 1 with part of the U6 promoter region, including three loop structures in the scaffold region, as described in embodiments herein.
  • FIG. 7B shows a non-limiting list of modifications that were made to gRNA 1 to improve expression and efficiency, as described in embodiments herein.
  • FIG. 8A shows representative mutant allele excision as measured by ddPCR assay that detects the corrected sequence upon exon 1 excision, in patient IPS-derived neurons using lentiviruses expressing SaCas9, gRNA11 and gRNA1 unmodified or with various modifications described herein.
  • FIG. 8B shows representative percent indels detected at the gRNA1 target site upon editing, with gRNA11 and the same modified and unmodified gRNA1 variants shown in FIG. 8A, measured by amplicon-sequencing.
  • FIG. 8C shows representative percent indels detected at the gRNA11 target site upon editing, with gRNA11 and the same modified and unmodified gRNA1 variants shown in FIG. 8A, measured by amplicon-sequencing.
  • FIG. 10A shows representative percent indels detected at the gRNA1 target site upon editing, with gRNA11 and the same modified and unmodified gRNA1 variants shown in FIG. 8A, measured by amplicon-sequencing.
  • FIG. 10B shows representative percent indels detected at the gRNA11 target site upon editing, with gRNA11 and the same modified and unmodified gRNA1 variants shown in FIG.
  • FIG. 11 shows an exemplary schematic with combinations of the different modifications described in embodiments herein for gRNA 1 and gRNA 11.
  • FIG. 12A shows the fold change in HTT mRNA levels upon dual gRNA excision with the engineered guides and an optimized SaCas9 using lentiviral vector delivery in HD patient-derived IPS-Neurons. Measurements were obtained using allele-specific RT-qPCR assays that recognize the SNP rs362331 and can distinguish wild-type (wt) and mutant HTT RNA.
  • FIG. 12B shows the fold change in total and mutant HTT protein levels achieved in HD patient-derived IPS-Neurons upon dual gRNA excision with the engineered guides and the optimized SaCas9 with lentivirus delivery.
  • Gys1 refers to a guide RNA targeting the mouse glycogen synthase gene Gys1, used as a non-targeting control, and HTT refers to the optimized dual gRNAs targeting HTT. Measurements were obtained using the SMCxPRO immunoassay system. Data are normalized to untransduced cells.
  • FIG. 13A shows the percent mutant HTT allele excision in HD patient-derived iNeurons after treatment with escalating doses of AAVs (1E3, 1E4, and 1E5 AAV particles/cell) containing the optimized guides and the original or a codon-NLS optimized SaCas9.
  • FIG. 13B shows the efficiency of editing (indels) at single sites (g1 top, g11 bottom) associated with the dual excision shown in FIG. 13A. Measurements were obtained by NGS using primers flanking each single cut site.
  • FIG. 14A shows that using the optimized SaCas9 variant resulted in an approximately 50% increase in mutant HTT allele excision compared to the original SaCas9, reaching 9% as measured by ddPCR designed to detect the excision product.
  • FIG. 15B shows the corresponding indel % at single gRNA target sites, reaching 15% as measured by NGS using primers flanking each single cut site.
  • FIG. 16 shows exemplary sequences described herein.

DETAILED DESCRIPTION

  • [063] The present disclosure relates to treatment of Huntington’s Disease (HD) using the Clustered-Regularly Interspaced Short Palindromic Repeats (CRISPR) system.
  • HD is a monogenic, autosomal dominant neurodegenerative disease, generally understood to be caused by a pathologic expansion of CAG trinucleotide repeats (also called CAG repeats) within exon 1 of the huntingtin (HTT) gene.
  • the healthy number of CAG repeats is 26 or less.
  • the term “expanded CAG repeats” means more than 26 CAG repeats in exon 1 of the HTT gene and is generally indicative of HD.
  • Individuals with between 27 and 35 CAG repeats will not develop symptoms, but the next generation is at a small risk of further expansion, which may or may not reach the disease-causing range.
  • CAG repeats between 36 and 39 are incompletely penetrant; individuals may develop symptoms but typically with a late age of onset.
  • When CAG repeats are equal to or greater than 40, the disease is fully penetrant and symptoms of the disease will occur. If an individual has 60 or more CAG repeats, juvenile onset HD will occur. Those individuals with the earliest onset tend to have the largest expansion in the number of repeats, while a lower expansion of the repeat number correlates with onset late in life. Rate of disease progression is inversely related to repeat size.
  • [064] Most patients are heterozygous for the disease-causing expanded CAG repeats, i.e., having a wild-type allele and a mutant allele. SNPs present only in one allele can be used to selectively reduce expression of the mutant HTT allele while leaving expression of the wild-type HTT allele substantially unaltered.
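The repeat-length ranges described above can be expressed as a small lookup. This is a sketch only: the function names and category labels are hypothetical, the example sequence is made up, and counting the longest uninterrupted CAG tract is an assumed simplification of how repeats are sized in practice.

```python
import re

def longest_cag_run(dna: str) -> int:
    """Number of repeats in the longest uninterrupted CAG tract."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)

def classify_repeats(n: int) -> str:
    """Map a CAG repeat count to the ranges given in the text (labels are illustrative)."""
    if n <= 26:
        return "normal"
    if n <= 35:
        return "no symptoms expected; small risk of expansion in the next generation"
    if n <= 39:
        return "incompletely penetrant"
    if n < 60:
        return "fully penetrant"
    return "fully penetrant, juvenile onset"

example = "AA" + "CAG" * 42 + "TT"  # made-up sequence with 42 repeats
count = longest_cag_run(example)
print(count, classify_repeats(count))  # 42 fully penetrant
```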
  • the present disclosure provides a SNP that can be targeted to treat HD.
  • the CRISPR system has revolutionized the field of genome engineering and gene editing.
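  • In bacteria and archaea, fragments of foreign DNA (e.g., from an invading virus or plasmid) are incorporated into the CRISPR locus as spacers.
  • the CRISPR locus is transcribed and processed into CRISPR-RNAs (crRNAs).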
  • the crRNA includes protospacer sequences complementary to the foreign DNA and hybridizes with trans-activating CRISPR-RNA (tracrRNA).
  • the tracrRNA forms secondary structures, e.g., stem loops, and is capable of binding to an RNA-guided nuclease (e.g., Cas9).
  • the crRNA/tracrRNA/nuclease complex is capable of targeting foreign DNA bearing the protospacer sequences, thereby conferring immunity against the invading virus or plasmid. [066] Since its original discovery, extensive research has focused on CRISPR’s ability to perform site-specific cleavage of target polynucleotides.
  • CRISPR systems used for gene editing typically include two components: (i) a single guide RNA (gRNA or sgRNA), which includes a “crRNA” portion, also referred to herein as “guide sequence” or “spacer,” that recognizes a target sequence; and a “tracrRNA” portion, also referred to herein as “scaffold sequence,” that binds to an RNA-guided nuclease, e.g., Cas9; and (ii) the RNA-guided nuclease, which associates with the sgRNA.
  • In order for cleavage to occur, the target sequence generally requires a protospacer adjacent motif (PAM) that is adjacent or in proximity to the target sequence.
  • the sequence and location of the PAM varies based on the type of RNA-guided nuclease.
  • the wild-type Cas9 protein from Streptococcus pyogenes (SpCas9) recognizes the PAM “NGG,” where N is any nucleotide.
  • the wild-type Cas9 protein from Staphylococcus aureus (SaCas9) recognizes the PAM “NNGRRT,” where N is any nucleotide and R is a purine (e.g., A or G).
  • the wild-type Cas12a (formerly known as Cpf1) proteins from Acidaminococcus sp. BV3L6 (AsCas12a) and Lachnospiraceae bacterium ND2006 (LbCas12a) recognize the PAM “TTTV,” where V is G, C, or A.
  • RNA-guided nucleases such as Cas9 and Cas12a may be engineered to have altered PAM specificity.
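The PAM patterns named above use IUPAC degenerate codes (N, R, V), which translate directly into regular expressions. The sketch below is illustrative: the helper names and the example sequence are hypothetical, and only the strand as given is scanned (a real search would also check the reverse complement and the position of the PAM relative to the target sequence).

```python
import re

# IUPAC codes used by the PAMs in the text: N = any base, R = A or G, V = A, C, or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "V": "[ACG]"}

def pam_regex(pam: str) -> "re.Pattern":
    # A lookahead is used so that overlapping PAM occurrences are all reported.
    return re.compile("(?=(" + "".join(IUPAC[base] for base in pam) + "))")

def find_pams(dna: str, pam: str) -> list:
    """Return (start, matched sequence) for every PAM occurrence on the given strand."""
    return [(m.start(), m.group(1)) for m in pam_regex(pam).finditer(dna.upper())]

seq = "ACCTGAGTAGGTTTCA"  # made-up sequence for illustration only

print(find_pams(seq, "NGG"))     # SpCas9 PAM: [(8, 'AGG')]
print(find_pams(seq, "NNGRRT"))  # SaCas9 PAM: [(2, 'CTGAGT')]
print(find_pams(seq, "TTTV"))    # Cas12a PAM: [(11, 'TTTC')]
```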
  • CRISPR systems and their uses are further described in, e.g., Jinek et al., Science 337(6096):816-821, 2012; Cong et al., Science 339(6121):819-823, 2013; Mali et al., Science 339(6121):823-826, 2013; and Sander et al., Nat Biotechnol 32:347-355, 2014.
  • the present disclosure provides CRISPR systems and components thereof, which are useful for the treatment of HD.

Definitions

  • [068] Unless otherwise defined herein, scientific and technical terms used in the present disclosure shall have the meanings that are commonly understood by one of ordinary skill in the art.
  • the term “about” is meant to encompass approximately or less than 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19% or 20% variability, depending on the situation.
  • compositions, polynucleotides, vectors, cells, methods, and/or kits of the present disclosure can be used to achieve methods and proteins of the present disclosure.
  • compositions, polynucleotides, vectors, cells, and/or kits of the present disclosure can be used to achieve methods and proteins of the present disclosure.
  • nucleic acid means a polymeric compound including covalently linked nucleotides.
  • nucleic acid includes ribonucleic acid (RNA) or deoxyribonucleic acid (DNA) both of which may be single- or double-stranded.
  • the polynucleotide may comprise naturally-occurring nucleobases (e.g., guanine, adenine, cytosine, thymine, and uracil), modified nucleobases (e.g., hypoxanthine, xanthine, 7-methylguanine, dihydrouracil, 5-methylcytosine, 5-hydroxymethylcytosine), and/or artificial nucleobases (e.g., isoguanine or isocytosine). Nucleic acids are transcribed from a 5’ end to a 3’ end.
  • the disclosure provides a polynucleotide comprising RNA and DNA nucleotides.
  • Methods of generating a polynucleotide comprising RNA and DNA nucleotides are known in the art and include, e.g., ligation or oligonucleotide synthesis methods.
  • the disclosure provides a polynucleotide capable of forming a complex with a Cas protein as described herein.
  • the disclosure provides a polynucleotide encoding any one of the proteins disclosed herein, e.g., a Cas protein.
  • a “gene” refers to an assembly of nucleotides that encode a polypeptide and includes cDNA and genomic DNA nucleic acid molecules.
  • “gene” also refers to a non-coding nucleic acid fragment that can act as a regulatory sequence preceding (i.e., 5’ or “upstream”) and following (i.e., 3’ or “downstream”) the coding sequence.
  • Genes include exons, i.e., nucleotides that are included in mature mRNA following transcription, and introns, i.e., nucleotides that do not remain in the mature mRNA and do not code for amino acids in the protein encoded by the gene. Exons include coding and non-coding sequences.
  • the disclosure relates to compositions and methods for editing a gene involved in a disease described herein, e.g., Huntington’s Disease (HD).
  • locus or its plural form “loci” refers to a specific, fixed position on a chromosome where a particular gene or genetic marker is located. Genes may possess multiple variants known as “alleles,” and an allele may also be referred to as residing at a particular locus. Genes that have the same allele at a given locus are “homozygous,” while genes that have different alleles at a given locus are “heterozygous.” Human genome locus positions are identified by a first number corresponding to the chromosome number; a letter corresponding to the p-arm or q-arm of the chromosome; and subsequent numbers indicating the chromosome position.
  • the locus of the huntingtin (HTT) gene is 4p16.3, i.e., chromosome 4, p-arm, position 16.3.
  • Specific regions of genes or genetic markers can be identified by their genome coordinates, denoted as “chr” followed by the chromosome number and a base pair range, counting from the p-arm telomere.
  • the HTT gene is located at base pair 3,074,510 to base pair 3,243,960 of chromosome 4, wherein the base pair numbering is based on the reference sequence GRCh38.p14, denoted as chr4:3074510-3243960 (GRCh38.p14).
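The coordinate notation described above can be read programmatically. The following is a minimal sketch, assuming that both endpoints are 1-based and inclusive (the text does not state the convention explicitly); the names `Region`, `parse_region`, and `span_bp` are illustrative and do not come from the disclosure.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # first base pair, assumed 1-based and inclusive
    end: int    # last base pair, assumed 1-based and inclusive


def parse_region(text: str) -> Region:
    """Parse a string such as 'chr4:3074510-3243960' into a Region."""
    # Thousands separators and a stray space after the hyphen are tolerated.
    match = re.fullmatch(r"(chr[0-9A-Za-z]+):([\d,]+)-\s*([\d,]+)", text.strip())
    if match is None:
        raise ValueError(f"not a genome coordinate: {text!r}")
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return Region(match.group(1), start, end)


def span_bp(region: Region) -> int:
    """Number of base pairs covered, counting both endpoints."""
    return region.end - region.start + 1


htt = parse_region("chr4:3074510-3243960")
print(htt.chrom, htt.start, htt.end, span_bp(htt))
```

Under the inclusive convention assumed here, the stated HTT range covers 169,451 base pairs.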
  • a nucleic acid molecule is “hybridizable” or “hybridized” to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength. Hybridization and washing conditions are known and exemplified in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 1989, particularly Chapter 11 and Table 11.1 therein.
  • the conditions of temperature and ionic strength determine the stringency of the hybridization.
  • the stringency of the hybridization conditions can be selected to provide selective formation or maintenance of a desired hybridization product of two complementary polynucleotides, in the presence of other potentially cross-reacting or interfering polynucleotides.
  • Stringent conditions are sequence-dependent; typically, longer complementary sequences specifically hybridize at higher temperatures than shorter complementary sequences.
  • stringent hybridization conditions are between about 5 °C to about 10 °C lower than the thermal melting point Tm (i.e., the temperature at which 50% of the sequences hybridize to a substantially complementary sequence) for a specific polynucleotide at a defined ionic strength, concentration of chemical denaturants, pH, and concentration of the hybridization partners.
  • nucleotide sequences having a higher percentage of G and C bases hybridize under more stringent conditions than nucleotide sequences having a lower percentage of G and C bases.
  • stringency can be increased by increasing temperature, increasing pH, decreasing ionic strength, and/or increasing the concentration of chemical nucleic acid denaturants (such as formamide, dimethylformamide, dimethylsulfoxide, ethylene glycol, propylene glycol and ethylene carbonate).
  • Stringent hybridization conditions typically include salt concentrations or ionic strength of less than about 1 M, 500 mM, 200 mM, 100 mM or 50 mM; hybridization temperatures above about 20 °C, 30 °C, 40 °C, 60 °C or 80 °C; and chemical denaturant concentrations above about 10%, 20%, 30%, 40% or 50%. Because many factors can affect the stringency of hybridization, the combination of parameters may be more significant than the absolute value of any parameter alone.
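The relationship between G/C content, Tm, and the stringent temperature window can be illustrated with a rough calculation. The sketch below uses the Wallace rule of thumb (Tm ≈ 2 °C per A/T plus 4 °C per G/C), which is not part of the disclosure and is only a coarse approximation for short oligonucleotides; the example oligonucleotide is made up, and the function names are illustrative.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb Tm in deg C: 2 per A/T plus 4 per G/C (assumption, not from the text)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def stringent_window(tm: float) -> tuple:
    """Temperatures about 10 deg C to about 5 deg C below Tm, as described above."""
    return (tm - 10, tm - 5)


oligo = "ATGCGCATTAGCCGTA"  # hypothetical 16-mer, for illustration only
tm = wallace_tm(oligo)
print(gc_fraction(oligo), tm, stringent_window(tm))
```

A sequence with more G and C bases yields a higher estimate, consistent with the statement that such sequences hybridize under more stringent conditions.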
  • the term “complementary” is used to describe the relationship between nucleotide bases that are capable of hybridizing to one another. For example, with respect to DNA, adenosine is complementary to thymine and cytosine is complementary to guanine.
  • When two nucleic acids are “complementary,” it is meant that a first nucleic acid or one or more regions thereof is capable of hydrogen bonding with a second nucleic acid or one or more regions thereof.
  • Complementary nucleic acids may pair through canonical Watson-Crick base pairing, or through non-canonical base pairing, e.g., Hoogsteen base pairing.
  • Complementary nucleic acids need not have complementarity at each nucleotide and may include one or more nucleotide mismatches, i.e., points at which hydrogen bonding does not occur.
  • complementary oligonucleotides can have at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of their nucleotides form hydrogen bonds.
  • “fully complementary” or “100% complementary” in reference to oligonucleotides means that each nucleotide hydrogen bonds without any nucleotide mismatches.
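The percentages above can be computed for a pair of oligonucleotides of equal length. This is a minimal sketch that assumes antiparallel pairing, counts only canonical Watson-Crick pairs (non-canonical pairing such as Hoogsteen is ignored), and uses made-up sequences; `percent_complementary` is an illustrative name.

```python
# Canonical pairs only; U is included so RNA strands can be compared as well.
WC_PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def percent_complementary(first: str, second: str) -> float:
    """Percent of positions that pair when two 5'->3' strands anneal antiparallel."""
    first, second = first.upper(), second.upper()
    if len(first) != len(second):
        raise ValueError("this sketch only handles strands of equal length")
    paired = sum((x, y) in WC_PAIRS for x, y in zip(first, reversed(second)))
    return 100.0 * paired / len(first)


strand = "GATTACAGGC"       # hypothetical
full_match = "GCCTGTAATC"   # reverse complement of strand
one_mismatch = "GCCTGTAATA" # last base changed

print(percent_complementary(strand, full_match))
print(percent_complementary(strand, one_mismatch))
```

The first pair is fully complementary, while the second has a single mismatch in ten positions.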
  • the target sequence is a target DNA sequence comprising a coding strand and a non-coding strand of a gene described herein, wherein the non-coding strand is complementary to the coding strand.
  • a guide sequence targeting the target DNA sequence is capable of hybridizing to the coding strand.
  • a guide sequence targeting the target DNA sequence is capable of hybridizing to the non-coding strand.
  • the guide sequence is capable of hybridizing to the target DNA sequence in its entirety.
  • a target DNA sequence may comprise 30 nucleotides, and the guide sequence hybridizes to all 30 nucleotides.
  • the guide sequence is capable of hybridizing to a region within the target DNA sequence.
  • a target DNA sequence may comprise 30 nucleotides, and the guide sequence hybridizes to a region within the target DNA sequence, e.g., hybridizes to greater than 10 nucleotides, greater than 15 nucleotides, greater than 20 nucleotides, greater than 25 nucleotides, etc.
  • the guide sequence hybridizes to a region within the target DNA sequence, e.g., hybridizes to 10 to 30 nucleotides, 15 nucleotides to 30 nucleotides, or 20 nucleotides to 30 nucleotides, etc.
  • the term “targets” includes a guide sequence that is not 100% complementary to the target sequence (either the coding or non-coding strand of the target sequence), e.g., it is not complementary in 1, 2, 3 or 4 bases.
  • the noncomplementary bases are at the 5’ or the 3’ end of the target sequence.
  • the guide sequence is longer than the target sequence.
  • the guide sequence has bases that are not complementary to the target sequence but add stability to the guide sequence and/or prohibit formation of secondary structures.
  • operably linked means that a polynucleotide of interest, e.g., the polynucleotide encoding a nuclease, is linked to the regulatory element in a manner that allows for expression of the polynucleotide.
  • Regulatory elements can be cis-regulatory elements or trans-regulatory elements. Regulatory elements include, for example, promoters, enhancers, terminators, 5’ and 3’ UTRs, insulators, silencers, operators, and the like.
  • the regulatory element is a promoter.
  • a polynucleotide expressing a protein of interest is operably linked to a promoter on an expression vector.
  • promoter refers to a DNA regulatory region or polynucleotide capable of binding RNA polymerase and involved in initiating transcription of a downstream coding or non-coding sequence.
  • the promoter sequence includes the transcription initiation site and extends upstream to include the minimum number of bases or elements used to initiate transcription at levels detectable above background.
  • the promoter sequence includes a transcription initiation site, as well as protein binding domains responsible for the binding of RNA polymerase. Eukaryotic promoters typically contain “TATA” boxes and “CAT” boxes.
  • a “vector” is any means for the cloning of and/or transfer of a nucleic acid into a host cell.
  • a vector may be a replicon to which another DNA segment may be attached so as to bring about the replication of the attached segment.
  • a “replicon” is any genetic element (e.g., plasmid, phage, cosmid, chromosome, virus) that functions as an autonomous unit of DNA replication in vivo, i.e., capable of replication under its own control.
  • the vector is an episomal vector, which is removed/lost from a population of cells after a number of cellular generations, e.g., by asymmetric partitioning.
  • the term “vector” includes both viral and non-viral means for introducing the nucleic acid into a cell in vitro, ex vivo, or in vivo.
  • a large number of vectors known in the art may be used to manipulate nucleic acids, incorporate response elements and promoters into genes, etc.
  • a vector may include one or more regulatory regions, and/or selectable markers useful in selecting, measuring, and monitoring nucleic acid transfer results.
  • Possible vectors include, for example, plasmids or modified viruses including, for example, bacteriophages such as lambda derivatives, or plasmids such as pBR322 or pUC plasmid derivatives, or the Bluescript vector.
  • the insertion of the DNA fragments corresponding to response elements and promoters into a suitable vector can be accomplished by ligating the appropriate DNA fragments into a chosen vector that has complementary cohesive termini.
  • the ends of the DNA molecules may be enzymatically modified, or any site may be produced by ligating polynucleotides (linkers) into the DNA termini.
  • Viral vectors may be engineered to contain selectable marker genes that provide for the selection of cells that have incorporated the marker into the cellular genome. Such markers allow identification and/or selection of host cells that incorporate and express the proteins encoded by the marker.
  • Viral vectors, and particularly retroviral vectors, have been used in a wide variety of gene delivery applications in cells, as well as living animal subjects. Viral vectors that can be used include, but are not limited to, retrovirus, adenovirus, adeno-associated virus, pox, baculovirus, vaccinia, herpes simplex, Epstein-Barr, geminivirus, and caulimovirus vectors.
  • a viral vector is utilized to provide the polynucleotides described herein. In some embodiments, a viral vector is utilized to provide a polynucleotide coding for a protein described herein.
  • Vectors may be introduced into the desired host cells by known methods, including, but not limited to, transfection, transduction, cell fusion, and lipofection. Vectors can include various regulatory elements including promoters. In some embodiments, vector designs can be based on constructs designed by Mali et al., Nat Methods 10: 957-63, 2013. Once a suitable host system and growth conditions are established, the polynucleotides and/or expression vectors described herein can be propagated and prepared in quantity.
  • the expression vectors which can be used include, but are not limited to, the following vectors or their derivatives: human or animal viruses such as vaccinia virus or adenovirus; insect viruses such as baculovirus; yeast vectors; bacteriophage vectors (e.g., lambda), and plasmid and cosmid DNA vectors.
  • expression system refers to components for expressing a protein and/or nucleic acid (e.g., RNA) of interest.
  • Exemplary expression systems may include, without limitation, a vector (e.g., an expression vector described herein), a cell, a transfection reagent for introducing the vector into the cell, a reagent for inducing expression of the protein and/or nucleic acid of interest (e.g., an inducer for an inducible promoter), or combinations thereof.
  • the expression vector comprises a polynucleotide, wherein the polynucleotide is capable of being transcribed into an mRNA for the protein of interest, or wherein the polynucleotide is capable of being transcribed into the nucleic acid of interest.
  • an expression system of the present disclosure comprises one or more vectors.
  • the present disclosure provides an expression system for expressing polynucleotides (e.g., guide RNAs) described herein.
  • the present disclosure provides an expression system for expressing proteins (e.g., Cas proteins) described herein.
  • plasmid refers to an extrachromosomal element often carrying a gene that is not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA molecules.
  • Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear, circular, or supercoiled, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of polynucleotides have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3’ untranslated sequence into a cell.
  • a plasmid is utilized to provide the polynucleotides described herein.
  • a plasmid is utilized to provide a polynucleotide coding for a protein described herein.
  • transfection means the introduction of an exogenous nucleic acid molecule, including a vector, into a cell.
  • Transfection methods, e.g., for components of the CRISPR/Cas compositions described herein, are known to one of ordinary skill in the art.
  • a “transfected” cell includes an exogenous nucleic acid molecule inside the cell and a “transformed” cell is one in which the exogenous nucleic acid molecule within the cell induces a phenotypic change in the cell.
  • the transfected nucleic acid molecule can be integrated into the host cell’s genomic DNA and/or can be maintained by the cell, temporarily or for a prolonged period of time, extra-chromosomally.
  • Host cells or organisms that express exogenous nucleic acid molecules or fragments are referred to herein as “recombinant,” “transformed,” or “transgenic” organisms.
  • the present disclosure provides a host cell comprising any of the expression vectors described herein, e.g., an expression vector comprising a polynucleotide that encodes a protein described herein.
  • the term “host cell” refers to a cell into which a recombinant expression vector has been introduced, or “host cell” may also refer to the progeny of such a cell.
  • peptide refers to a polymeric form of amino acids of any length, which can include coded and non-coded amino acids, chemically or biochemically modified or derivatized amino acids, and polypeptides having modified peptide backbones.
  • the start of the protein or polypeptide is known as the “N-terminus” (and also referred to as the amino-terminus, NH2-terminus, N-terminal end or amine-terminus), referring to the free amine (-NH2) group of the first amino acid residue of the protein or polypeptide.
  • the end of the protein or polypeptide is known as the “C-terminus” (and also referred to as the carboxy-terminus, carboxyl-terminus, C-terminal end, or COOH-terminus), referring to the free carboxyl group (-COOH) of the last amino acid residue of the protein or polypeptide.
  • amino acid refers to a compound including both a carboxyl (-COOH) and amino (-NH2) group. “Amino acid” refers to both natural and unnatural, i.e., synthetic, amino acids.
  • Natural amino acids include: alanine (Ala; A); arginine (Arg; R); asparagine (Asn; N); aspartic acid (Asp; D); cysteine (Cys; C); glutamine (Gln; Q); glutamic acid (Glu; E); glycine (Gly; G); histidine (His; H); isoleucine (Ile; I); leucine (Leu; L); lysine (Lys; K); methionine (Met; M); phenylalanine (Phe; F); proline (Pro; P); serine (Ser; S); threonine (Thr; T); tryptophan (Trp; W); tyrosine (Tyr; Y); and valine (Val; V).
  • Unnatural or synthetic amino acids include a side chain that is distinct from the natural amino acids provided above and may include, e.g., fluorophores, post-translational modifications, metal ion chelators, photocaged and photocross-linking moieties, uniquely reactive functional groups, and NMR, IR, and x-ray crystallographic probes.
  • Exemplary unnatural or synthetic amino acids are provided in, e.g., Mitra et al., Mater Methods 3:204, 2013 and Wals et al., Front Chem 2:15, 2014.
  • Unnatural amino acids may also include naturally-occurring compounds that are not typically incorporated into a protein or polypeptide, such as, e.g., citrulline (Cit), selenocysteine (Sec), and pyrrolysine (Pyl).
  • An “amino acid substitution” refers to a polypeptide or protein including one or more substitutions of wild-type or naturally occurring amino acid with a different amino acid relative to the wild-type or naturally occurring amino acid at that amino acid residue.
  • the substituted amino acid may be a synthetic or naturally occurring amino acid.
  • the substituted amino acid is a naturally occurring amino acid selected from the group consisting of: A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y, and V.
  • the substituted amino acid is an unnatural or synthetic amino acid. Substitution mutants may be described using an abbreviated system.
  • a substitution mutation in which the fifth (5th) amino acid residue is substituted may be abbreviated as “X5Y,” wherein “X” is the wild-type or naturally occurring amino acid to be replaced, “5” is the amino acid residue position within the amino acid sequence of the protein or polypeptide, and “Y” is the substituted, or non-wild-type or non-naturally occurring, amino acid.
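The abbreviated notation can be parsed and applied to a sequence mechanically. The following sketch assumes one-letter amino acid codes and 1-based residue numbering, as in the description above; the peptide and the function name `apply_substitution` are hypothetical.

```python
import re


def apply_substitution(protein: str, mutation: str) -> str:
    """Apply a substitution written as e.g. 'Y5F' (wild-type residue, position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized substitution: {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(protein):
        raise ValueError("position lies outside the sequence")
    # Residue numbering is 1-based, so position 5 is index 4.
    if protein[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {protein[position - 1]}"
        )
    return protein[: position - 1] + replacement + protein[position:]


peptide = "MKTAYIAK"  # hypothetical peptide; residue 5 is Y
print(apply_substitution(peptide, "Y5F"))
```

Checking the stated wild-type residue before substituting catches numbering errors, which are a common source of mistakes when mutations are written in this form.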
  • An “isolated” polypeptide, protein, peptide, or nucleic acid is a molecule that has been removed from its natural environment.
  • isolated polypeptides, proteins, peptides, or nucleic acids may be formulated with excipients such as diluents or adjuvants and still be considered isolated.
  • isolated does not necessarily imply any particular level of purity of the polypeptide, protein, peptide, or nucleic acid.
  • recombinant when used in reference to a nucleic acid molecule, peptide, polypeptide, or protein means of, or resulting from, a new combination of genetic material that is not known to exist in nature.
  • a recombinant molecule can be produced by any of the techniques available in the field of recombinant technology, including, but not limited to, polymerase chain reaction (PCR), gene splicing (e.g., using restriction endonucleases), and solid-phase synthesis of nucleic acid molecules, peptides, or proteins.
  • exogenous means that the referenced molecule or activity is introduced into the host cell.
  • the molecule can be introduced, for example, by introduction of an encoding nucleic acid into the host genetic material, such as by integration into a host chromosome or as non-chromosomal genetic material, e.g., a plasmid.
  • an “exogenous” protein can be introduced into a host cell via an “exogenous” nucleic acid encoding the protein.
  • endogenous refers to a referenced molecule or activity that is naturally present in the host cell.
  • An “endogenous” protein is expressed by a nucleic acid contained within the host cell.
  • heterologous refers to a molecule or activity derived from a source other than the referenced organism/species, whereas “homologous” refers to a molecule or activity derived from the host organism/species. Accordingly, exogenous expression of an encoding nucleic acid can utilize either or both of a heterologous or homologous encoding nucleic acid.
  • sequence similarity refers to the degree of identity or correspondence between nucleic acid sequences or amino acid sequences.
  • sequence similarity may refer to nucleic acid sequences wherein changes in one or more nucleotide bases results in substitution of one or more amino acids, but do not affect the functional properties of the protein encoded by the polynucleotide.
  • sequence similarity may also refer to modifications of the polynucleotide, such as deletion or insertion of one or more nucleotide bases, that do not substantially affect the functional properties of the resulting transcript.
  • Similar polynucleotides of the present disclosure are about 70%, at least about 70%, about 75%, at least about 75%, about 80%, at least about 80%, about 85%, at least about 85%, about 90%, at least about 90%, about 95%, at least about 95%, about 99%, at least about 99%, or about 100% identical to the polynucleotides disclosed herein.
  • sequence similarity refers to two or more polypeptides wherein greater than about 40% of the amino acids are identical, or greater than about 60% of the amino acids are functionally identical. “Functionally identical” or “functionally similar” amino acids have chemically similar side chains.
  • amino acids can be grouped in the following manner according to functional similarity: (i) positively-charged side chains: Arg, His, Lys; (ii) negatively-charged side chains: Asp, Glu; (iii) polar, uncharged side chains: Ser, Thr, Asn, Gln; (iv) hydrophobic side chains: Ala, Val, Ile, Leu, Met, Phe, Tyr, Trp; and (v) others: Cys, Gly, Pro.
  • similar polypeptides of the present disclosure have about 40%, at least about 40%, about 45%, at least about 45%, about 50%, at least about 50%, about 55%, at least about 55%, about 60%, at least about 60%, about 65%, at least about 65%, about 70%, at least about 70%, about 75%, at least about 75%, about 80%, at least about 80%, about 85%, at least about 85%, about 90%, at least about 90%, about 95%, at least about 95%, about 97%, at least about 97%, about 98%, at least about 98%, about 99%, at least about 99%, or about 100% identical amino acids.
  • similar polypeptides of the present disclosure have about 60%, at least about 60%, about 65%, at least about 65%, about 70%, at least about 70%, about 75%, at least about 75%, about 80%, at least about 80%, about 85%, at least about 85%, about 90%, at least about 90%, about 95%, at least about 95%, about 97%, at least about 97%, about 98%, at least about 98%, about 99%, at least about 99%, or about 100% functionally identical amino acids.
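The two percentages described above (identical and functionally identical amino acids) can be computed for a pair of pre-aligned sequences. This is a minimal sketch, assuming the sequences are already aligned, of equal length, and gap-free. It uses groups (i) through (iv) as listed above and, as a further assumption, treats the residues in group (v), "others", as similar only to themselves. The sequences are made up.

```python
# Groups (i)-(iv) from the text, in one-letter codes.
FUNCTIONAL_GROUPS = {
    "positive": "RHK",
    "negative": "DE",
    "polar_uncharged": "STNQ",
    "hydrophobic": "AVILMFYW",
}
# Cys, Gly and Pro ("others") are deliberately left out: assumption of this sketch.
GROUP_OF = {aa: name for name, members in FUNCTIONAL_GROUPS.items() for aa in members}


def _check(a: str, b: str) -> None:
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be pre-aligned, non-empty and of equal length")


def percent_identity(a: str, b: str) -> float:
    _check(a, b)
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def _similar(x: str, y: str) -> bool:
    if x == y:
        return True
    group = GROUP_OF.get(x)
    return group is not None and group == GROUP_OF.get(y)


def percent_functional_identity(a: str, b: str) -> float:
    _check(a, b)
    return 100.0 * sum(_similar(x, y) for x, y in zip(a, b)) / len(a)


seq_a = "MKRDLSAG"  # hypothetical
seq_b = "MRKELTVP"  # hypothetical
print(percent_identity(seq_a, seq_b), percent_functional_identity(seq_a, seq_b))
```

In this example only two of eight positions are identical, yet seven of eight are functionally identical, which shows why the two thresholds are stated separately.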
  • Sequence similarity can be determined by sequence alignment using methods known in the field, such as, for example, BLAST, MUSCLE, Clustal (including ClustalW and ClustalX), and T-Coffee (including variants such as, for example, M-Coffee, R-Coffee, and Expresso).
  • Percent identity of polynucleotides or polypeptides can be determined when the polynucleotide or polypeptide sequences are aligned over a specified comparison window. In some embodiments, only specific portions of two or more sequences are aligned to determine sequence identity. In some embodiments, only specific domains of two or more sequences are aligned to determine sequence similarity.
  • a comparison window can be a segment of at least 10 to over 1000 residues, at least 20 to about 1000 residues, or at least 50 to 500 residues in which the sequences can be aligned and compared.
  • Methods of alignment for determination of sequence identity are well-known and can be performed using publicly available databases such as BLAST.
  • “percent identity” of two amino acid sequences is determined using the algorithm of Karlin and Altschul, Proc Nat Acad Sci USA 87:2264-2268 (1990), modified as in Karlin and Altschul, Proc Nat Acad Sci USA 90:5873-5877 (1993).
  • Gapped BLAST can be utilized as described in Altschul et al., Nucleic Acids Res 25(17): 3389-3402 (1997).
  • a polypeptide or polynucleotide has 70%, at least 70%, 75%, at least 75%, 80%, at least 80%, 85%, at least 85%, 90%, at least 90%, 95%, at least 95%, 97%, at least 97%, 98%, at least 98%, 99%, or at least 99% or 100% sequence identity with a reference polypeptide or polynucleotide (or a fragment of the reference polypeptide or polynucleotide) provided herein.
  • a polypeptide or polynucleotide has about 70%, at least about 70%, about 75%, at least about 75%, about 80%, at least about 80%, about 85%, at least about 85%, about 90%, at least about 90%, about 95%, at least about 95%, about 97%, at least about 97%, about 98%, at least about 98%, about 99%, at least about 99% or about 100% sequence identity with a reference polypeptide or polynucleotide (or a fragment of the reference polypeptide or nucleic acid molecule) provided herein.
  • a “complex” refers to a group of two or more associated polynucleotides and/or polypeptides.
  • association refers to molecules bound to one another through electrostatic, hydrophobic/hydrophilic, and/or hydrogen bonding interaction, without being covalently attached.
  • a molecule that comprises different moieties covalently attached to one another is known.
  • a complex is formed when all the components of the complex are present together, i.e., a self-assembling complex.
  • a complex is formed through chemical interactions between different components of the complex such as, for example, hydrogen-bonding.
  • the polynucleotides provided herein form a complex with the proteins provided herein through secondary structure recognition of the polynucleotide by the protein.
  • the term “guide sequence” is used interchangeably with “spacer,” “spacer sequence,” “crRNA,” and “crRNA sequence”; and the term “scaffold sequence” is used interchangeably with “tracrRNA” and “tracrRNA sequence.”
  • where the nucleic acid sequences of the polynucleotides described herein include the DNA nucleotide thymine (T), it will be understood by one of ordinary skill in the art that the polynucleotides also encompass the RNA nucleotide uracil (U) in place of T.
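The T-to-U correspondence can be expressed as a one-line conversion. The sketch below is illustrative only; `to_rna` is a hypothetical helper name, and the example input is the sequence given elsewhere in this disclosure as SEQ ID NO:1540.

```python
def to_rna(dna: str) -> str:
    """Return the RNA form of a sequence written with DNA letters (T becomes U)."""
    return dna.upper().replace("T", "U")


print(to_rna("GGACTTCGGTCC"))
```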
  • the locus targeted by the polynucleotide is 4p16.3 of the human genome.
  • the HTT gene is located at human genome coordinates chr4:3074510-3243960.
  • the disclosure provides a polynucleotide comprising a guide sequence and a scaffold sequence.
  • the guide sequence targets a sequence that is within about 1 to about 50 nucleotides, or within about 2 to about 40 nucleotides, or within about 3 to about 30 nucleotides, or within about 4 to about 25 nucleotides, or within about 5 to about 20 nucleotides of a SNP in the HTT gene.
  • the guide sequence targets a sequence that is within 30 nucleotides of a SNP in the HTT gene.
  • the guide sequence targets a sequence that is within 25 nucleotides of a SNP in the HTT gene. In some embodiments, the guide sequence targets a sequence that is within 20 nucleotides of a SNP in the HTT gene. In some embodiments, the guide sequence targets a sequence that is within 15 nucleotides of a SNP in the HTT gene. In some embodiments, the guide sequence targets a sequence that is within 10 nucleotides of a SNP in the HTT gene. In some embodiments, the guide sequence targets a sequence that is within 5 nucleotides of a SNP in the HTT gene. In some embodiments, the guide sequence targets a sequence that is within 3 nucleotides of a SNP in the HTT gene.
  • the guide sequence targets a sequence that comprises a SNP in the HTT gene.
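Whether a target lies within a given number of nucleotides of a SNP can be checked from coordinates. The text does not define how the distance is counted, so the sketch below assumes it is measured from the nearest end of the target and is zero when the target comprises the SNP. All positions are made up and do not correspond to any real SNP.

```python
def distance_to_snp(target_start: int, target_end: int, snp_pos: int) -> int:
    """Distance in nucleotides from the nearest target end to the SNP (0 if inside)."""
    if target_start > target_end:
        raise ValueError("target_start must not exceed target_end")
    if target_start <= snp_pos <= target_end:
        return 0  # the target sequence comprises the SNP
    if snp_pos > target_end:
        return snp_pos - target_end
    return target_start - snp_pos


def is_within(target_start: int, target_end: int, snp_pos: int, max_nt: int) -> bool:
    return distance_to_snp(target_start, target_end, snp_pos) <= max_nt


# Hypothetical 21-nucleotide target at positions 101-121 with a SNP at position 124.
print(distance_to_snp(101, 121, 124))
print(is_within(101, 121, 124, 3), is_within(101, 121, 124, 2))
```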
  • a guide sequence that “targets” a target sequence, e.g., a sequence in the HTT gene, is capable of hybridizing to a strand of the target sequence.
  • the guide sequence is capable of hybridizing to a coding strand of the target sequence.
  • the guide sequence is capable of hybridizing to a non-coding strand of the target sequence.
  • the polynucleotide of the present disclosure is a first polynucleotide
  • the guide sequence is a first guide sequence
  • the scaffold sequence is a first scaffold sequence.
  • the first polynucleotide is also referred to herein as the “first gRNA.”
  • the first guide sequence targets human genome coordinates chr4:3078423-3078443.
  • chr4:3078423-3078443 is within locus 4p16.3.
  • chr4:3078423-3078443 is within the huntingtin (HTT) gene.
  • chr4:3078423-3078443 is downstream (i.e., 3’) of exon 1 in the HTT gene.
  • a mutant allele of the HTT gene comprises expanded CAG repeats, thereby causing Huntington’s Disease.
  • chr4:3078423-3078443 is in an intron of the HTT gene.
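The statement that chr4:3078423-3078443 lies within the HTT gene can be checked against the gene coordinates given in this disclosure (chr4:3074510-3243960). The sketch below assumes inclusive endpoints; the `Interval` class is illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int  # inclusive (assumed)
    end: int    # inclusive (assumed)

    def __len__(self) -> int:
        return self.end - self.start + 1

    def contains(self, other: "Interval") -> bool:
        """True if `other` lies entirely inside this interval on the same chromosome."""
        return (
            self.chrom == other.chrom
            and self.start <= other.start
            and other.end <= self.end
        )


HTT = Interval("chr4", 3074510, 3243960)
FIRST_TARGET = Interval("chr4", 3078423, 3078443)

print(HTT.contains(FIRST_TARGET), len(FIRST_TARGET))
```

With inclusive endpoints the target spans 21 nucleotides and falls inside the stated HTT range.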
  • the first guide sequence is capable of hybridizing to a coding strand of chr4:3078423-3078443.
  • the first guide sequence is capable of hybridizing to a non-coding strand of chr4:3078423-3078443.
  • the first guide sequence is capable of hybridizing to the entire length of chr4:3078423-3078443.
  • chr4:3078423-3078443 is adjacent to a protospacer adjacent motif (PAM) recognizable by an RNA-guided nuclease.
  • the RNA-guided nuclease is a Cas9 protein. In some embodiments, the RNA-guided nuclease is a Cas12a protein. In some embodiments, the Cas9 protein is Staphylococcus aureus Cas9 (SaCas9). In some embodiments, the PAM comprises NNGRRT, where N is any nucleotide and R is any purine (e.g., A or G). In some embodiments, the Cas9 protein is Streptococcus pyogenes Cas9 (SpCas9). In some embodiments, the PAM comprises NGG, wherein N is any nucleotide.
  • the PAM comprises NGAN or NGNG, e.g., NGAG or NGCG, where N is any nucleotide.
  • a SNP of the HTT gene is located adjacent to chr4:3078423-3078443, wherein the SNP causes the loss of the PAM in a wild-type HTT allele.
  • the SNP has the NCBI SNP Database (dbSNP) accession number rs3856973.
  • the PAM is present in a mutant allele of HTT and absent in a wild-type allele of HTT.
  • the mutant allele comprises the motif NGNG.
  • the mutant allele comprises the motif NNGRRT. In some embodiments, the mutant allele comprises the sequence TCGAGT, and the wild-type allele comprises the sequence TCAAGT. In some embodiments, the sequence TCGAGT is recognizable by SpCas9 or SaCas9, while the sequence TCAAGT is not recognizable by SpCas9 or SaCas9. In some embodiments, the sequence TCGAGT is recognizable efficiently by SaCas9, while the sequence TCAAGT is not recognizable efficiently by SaCas9.
  • the SpCas9 or SaCas9 when guided to the target sequence of the first guide sequence, is capable of selectively cleaving the mutant allele of HTT and not the wild-type allele of HTT.
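The allele discrimination described above follows from matching each allele's sequence against the degenerate PAM motif. The sketch below is a plain string comparison using the standard IUPAC meanings of N (any base) and R (A or G). It does not model nuclease binding or cleavage efficiency, and `matches_pam` is an illustrative name.

```python
# Subset of IUPAC nucleotide codes needed for the motifs named in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "N": "ACGT",  # any nucleotide
}


def matches_pam(site: str, motif: str) -> bool:
    """True if `site` is consistent with the degenerate `motif`, position by position."""
    site, motif = site.upper(), motif.upper()
    if len(site) != len(motif):
        return False
    return all(base in IUPAC[code] for base, code in zip(site, motif))


mutant, wild_type = "TCGAGT", "TCAAGT"

print(matches_pam(mutant, "NNGRRT"), matches_pam(wild_type, "NNGRRT"))
# The four bases starting at the second position of each allele, tested against NGNG.
print(matches_pam(mutant[1:5], "NGNG"), matches_pam(wild_type[1:5], "NGNG"))
```

The mutant sequence TCGAGT satisfies NNGRRT and contains a match to NGNG, whereas the single G-to-A difference in TCAAGT breaks both motifs.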
  • the first guide sequence provides improved targeting efficiency over conventional guide sequences, which may form secondary structures that reduce the guide sequence’s ability to hybridize to the target sequence.
  • the first guide sequence provides increased CRISPR editing efficiency.
  • the first guide sequence does not form a secondary structure.
  • nucleic acid secondary structures include stem loops (also known as hairpin loops), internal loops, bulge loops, pseudoknots, and the like.
  • the first guide sequence does not form a stem loop.
  • a guanine is typically added to the 5’ end of a guide sequence (also referred to herein as a “5’ guanine”) to improve expression. It was discovered that a single 5’ guanine in certain guide sequences, e.g., the first guide sequence described herein, contributes to secondary structure formation and therefore decreased targeting efficiency.
  • the first guide sequence does not comprise a 5’ guanine.
  • the first guide sequence comprises any one of SEQ ID NOs:1-12, provided that the first guide sequence comprising SEQ ID NO:2 does not comprise a 5’ guanine.
  • the first guide sequence comprises any one of SEQ ID NOs:2-12, 13-19, and 1561-1576, provided that the first guide sequence comprising SEQ ID NO:2 does not comprise a 5’ guanine.
  • the first guide sequence comprises SEQ ID NO:1.
  • the first guide sequence comprises SEQ ID NO:2 and does not comprise a 5’ guanine.
  • SEQ ID NO:2 is identical to SEQ ID NO:1 except that SEQ ID NO:2 does not comprise a 5’ guanine.
  • a CRISPR system comprising a gRNA with SEQ ID NO:2 as the guide sequence has at least 1.5-fold, at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold, or at least 10-fold higher editing efficiency as compared to an otherwise identical CRISPR system except with SEQ ID NO:1 as the guide sequence.
  • the first guide sequence comprises at its 5’ end: at least one guanine and at least one additional nucleotide that do not hybridize to the target sequence and that prevent formation of a secondary structure, e.g., a stem loop.
  • the first guide sequence comprises two or more guanines at its 5’ end.
  • the two or more guanines at the 5’ end of the first guide sequence prevent formation of secondary structures.
  • the first guide sequence comprises 2, 3, 4, 5, or more than 5 guanines at its 5’ end.
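The 5’ guanine variants described above (no added guanine, or two or more guanines) can be generated from a core guide sequence. The sketch below uses a made-up core sequence because the SEQ ID NO sequences are not reproduced in this section; the helper names are illustrative.

```python
def count_5p_guanines(guide: str) -> int:
    """Number of consecutive G bases at the 5' end of the guide."""
    guide = guide.upper()
    return len(guide) - len(guide.lstrip("G"))


def drop_single_5p_guanine(guide: str) -> str:
    """Remove one 5' guanine if present, leaving the rest of the guide unchanged."""
    return guide[1:] if guide.upper().startswith("G") else guide


def add_5p_guanines(core: str, n: int) -> str:
    """Prepend n guanines to the 5' end of a core guide sequence."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return "G" * n + core


core = "ACTGACTGACTGACTGACTG"  # hypothetical 20-nt core, not a sequence from the disclosure
with_one_g = add_5p_guanines(core, 1)
with_two_g = add_5p_guanines(core, 2)

print(count_5p_guanines(with_one_g), count_5p_guanines(with_two_g))
print(drop_single_5p_guanine(with_one_g) == core)
```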
  • the first guide sequence comprises SEQ ID NO:3.
  • the first guide sequence comprises SEQ ID NO:4.
  • the first guide sequence comprises SEQ ID NO:5. In some embodiments, the first guide sequence comprises SEQ ID NO:6. In some embodiments, the first guide sequence comprises SEQ ID NO:7. In some embodiments, the first guide sequence comprises SEQ ID NO:8. In some embodiments, the first guide sequence comprises SEQ ID NO:9. In some embodiments, the first guide sequence comprises SEQ ID NO:10. In some embodiments, the first guide sequence comprises SEQ ID NO:11. [115] In some embodiments, the first guide sequence comprises a sequence at its 5’ end that forms a 5’ secondary structure. In some embodiments, the 5’ secondary structure prevents formation of further secondary structure in the guide sequence.
  • the sequence that forms the 5’ secondary structure comprises the sequence GGACTTCGGTCC (SEQ ID NO:1540). In some embodiments, the 5’ secondary structure is a stem loop. In some embodiments, the first guide sequence comprises SEQ ID NO:12. [116] In some embodiments, each of SEQ ID NOs:1-12 is capable of targeting chr4:3078423-3078443. In some embodiments, each of SEQ ID NOs:1-12 is capable of hybridizing to at least a portion of the non-coding strand of chr4:3078423-3078443.
  • each of SEQ ID NOs:1-12 targets a target sequence, wherein the 3’ end of the target sequence is within 3 nucleotides of a SNP with the dbSNP accession number rs3856973.
  • each of SEQ ID NOs:1-12 targets a sequence directly upstream of a PAM for SpCas9 or SaCas9.
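The 5’ guanine embodiments above distinguish guide sequences with no 5’ guanine, a single 5’ guanine, and two or more 5’ guanines. A minimal Python sketch of that classification follows. The function name and the example sequences are hypothetical placeholders and are not any of SEQ ID NOs:1-12.

```python
def count_5prime_guanines(guide: str) -> int:
    """Count consecutive G bases at the 5' end of a guide written 5'->3'."""
    count = 0
    for base in guide.upper():
        if base != "G":
            break
        count += 1
    return count


# Hypothetical placeholder guides, used only to illustrate the three cases.
for guide in ("ACTGACTG", "GACTGACT", "GGGACTGA"):
    n = count_5prime_guanines(guide)
    label = "no 5' guanine" if n == 0 else "single 5' guanine" if n == 1 else "two or more 5' guanines"
    print(guide, n, label)
```

The sketch only inspects the sequence string; it does not predict whether a given 5’ end forms a secondary structure.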
  • Second guide sequence [117]
  • the disclosure further provides a second polynucleotide comprising a second guide sequence and a second scaffold sequence.
  • the second polynucleotide is also referred to herein as the “second gRNA.”
  • the second guide sequence targets a sequence within human genome coordinates chr4:3068346-3074862. In some embodiments, the second guide sequence targets a sequence within human genome coordinates chr4:3072397-3074774.
  • the second guide sequence targets a sequence of about 15 to 30 nucleotides, or about 17 to about 28 nucleotides, or about 19 to about 26 nucleotides, or about 20 to about 25 nucleotides, or about 21 to about 24 nucleotides in length within the region of chr4:3068346-3074862, e.g., within chr4:3072397-3074774.
  • chr4:3068346-3074862 is upstream (i.e., 5’) of exon 1 in the HTT gene.
  • chr4:3068346-3074862 is upstream of the region comprising the CAG repeats in exon 1 of the HTT gene.
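Whether a candidate target falls inside the region described above can be checked with simple interval arithmetic. The sketch below parses coordinate strings in the form used in this disclosure and tests containment. Treating both endpoints as inclusive is an assumption of the sketch, and the names `Interval`, `parse_interval`, and `contains` are illustrative.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int
    end: int  # assumed inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def parse_interval(text: str) -> Interval:
    """Parse a string such as 'chr4:3068346-3074862'."""
    chrom, span = text.split(":")
    start, end = span.split("-")
    return Interval(chrom, int(start), int(end))


def contains(outer: Interval, inner: Interval) -> bool:
    """True if `inner` lies entirely within `outer` on the same chromosome."""
    return (
        outer.chrom == inner.chrom
        and outer.start <= inner.start
        and inner.end <= outer.end
    )


region = parse_interval("chr4:3068346-3074862")
second_target = parse_interval("chr4:3074430-3074450")
first_target = parse_interval("chr4:3078423-3078443")

print(contains(region, second_target), second_target.length)
print(contains(region, first_target))
```

Under the inclusive-endpoint assumption, each 21-position target window used here has a length within the "about 15 to 30 nucleotides" range stated above.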
  • the second guide sequence is capable of hybridizing to a coding strand of chr4:3068346-3074862, e.g., chr4:3072397-3074774. In some embodiments, the second guide sequence is capable of hybridizing to a non-coding strand of chr4:3068346-3074862, e.g., chr4:3072397-3074774. [118] In some embodiments, the second guide sequence hybridizes to a sequence adjacent to a PAM recognizable by an RNA-guided nuclease. In some embodiments, the RNA-guided nuclease is a Cas9 protein.
  • the RNA-guided nuclease is a Cas12a protein.
  • the Cas9 protein is SaCas9.
  • the PAM comprises NNGRRT, where N is any nucleotide and R is any purine.
  • the Cas9 protein is SpCas9.
  • the PAM comprises NGG, wherein N is any nucleotide.
  • the PAM comprises NGAN or NGNG, e.g., NGAG or NGCG, where N is any nucleotide.
  • the first guide sequence and the second guide sequence each targets a sequence adjacent to a PAM, wherein the PAM is recognizable by the same RNA-guided nuclease, e.g., Cas protein such as SaCas9 or SpCas9.
  • the PAM adjacent to the target sequence of the first guide sequence and the PAM adjacent to the target sequence of the second guide sequence are identical.
  • the PAM adjacent to the target sequence of the first guide sequence and the PAM adjacent to the target sequence of the second guide sequence are different.
  • the PAMs are different but are recognizable by the same RNA-guided nuclease, e.g., Cas protein such as SaCas9 or SpCas9.
  • the PAM adjacent to the target sequence of the first guide sequence and the PAM adjacent to the target sequence of the second guide sequence comprise the motif NNGRRT.
  • the PAM adjacent to the target sequence of the first guide sequence and the PAM adjacent to the target sequence of the second guide sequence comprise the motif NGAN or NGNG, e.g., NGAG or NGCG.
  • the PAM adjacent to the target sequence of the first guide sequence and the PAM adjacent to the target sequence of the second guide sequence comprise the motif NGG.
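The PAM motifs named above (NNGRRT, NGG, NGAN, NGNG) use degenerate nucleotide codes, with N as any nucleotide and R as any purine. A minimal Python sketch of matching a candidate PAM against such a motif is given below. The helper names are illustrative, and the example PAM strings are arbitrary and are not taken from the HTT locus.

```python
import re

# Degenerate codes used by the motifs in the text: N = any nucleotide, R = purine (A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}


def pam_to_regex(motif: str) -> "re.Pattern[str]":
    """Translate a PAM motif such as 'NNGRRT' into a compiled regular expression."""
    return re.compile("".join(IUPAC[code] for code in motif.upper()))


def matches_pam(candidate: str, motif: str) -> bool:
    """True if `candidate` matches `motif` over its full length."""
    return pam_to_regex(motif).fullmatch(candidate.upper()) is not None


print(matches_pam("TTGAAT", "NNGRRT"))  # True: G at position 3, purines at 4-5, T at 6
print(matches_pam("TTGACT", "NNGRRT"))  # False: C at position 5 is not a purine
print(matches_pam("AGG", "NGG"))        # True
print(matches_pam("TGCG", "NGNG"))      # True
print(matches_pam("TGCG", "NGAN"))      # False: position 3 is C, not A
```

Two different PAM strings can satisfy the same motif, which is consistent with the embodiments in which the PAMs differ but are recognizable by the same RNA-guided nuclease.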
  • the PAM adjacent to the target sequence of the first guide sequence is present in a mutant allele of the HTT gene and absent in a wild-type allele of the HTT gene.
  • the PAM adjacent to the target sequence of the second guide sequence is present in both wild-type and mutant alleles of the HTT gene.
  • the SpCas9 or SaCas9 when guided to the target sequence of the second guide sequence is capable of cleaving both the wild-type allele and the mutant allele of HTT.
  • the mutant allele comprises two cleavage sites and is cleaved at both the sequence targeted by the first guide sequence (i.e., chr4:3078423-3078443) and the sequence targeted by the second guide sequence (i.e., the sequence within chr4:3068346-3074862), while the wild-type allele is only cleaved at the sequence targeted by the second guide sequence.
  • the second guide sequence provides improved targeting efficiency over conventional guide sequences, which may form secondary structures that reduce the guide sequence’s ability to hybridize to the target sequence.
  • the second guide sequence provides increased CRISPR editing efficiency.
  • the second guide sequence does not form a secondary structure.
  • nucleic acid secondary structures include stem loops (also known as hairpin loops), internal loops, bulge loops, pseudoknots, and the like.
  • the second guide sequence does not form a stem loop.
  • the second guide sequence comprises a sequence at its 5’ end that forms a 5’ secondary structure.
  • the 5’ secondary structure prevents formation of further secondary structure in the guide sequence.
  • the sequence that forms the 5’ secondary structure comprises the sequence GGACTTCGGTCC (SEQ ID NO:1540).
  • the 5’ secondary structure is a stem loop.
  • the second guide sequence targets a target sequence within human genome coordinates chr4:3068346-3074862, as shown in Table 1.
  • the second guide sequence comprises a sequence as shown in Table 1.
  • Table 1. Second Guide Sequence Chromosomal Locations and Sequence IDs
    Target Sequence          Corresponding Guide Sequence
    chr4:3074163-3074183     SEQ ID NO:1561
    chr4:3074430-3074450     SEQ ID NO:13
    chr4:3074460-3074480     SEQ ID NO:1562
    chr4:3074550-3074570     SEQ ID NO:1563
    chr4:3074669-3074689     SEQ ID NO:1564
    chr4:3074753-3074773     SEQ ID NO:1565
    chr4:3074754-3074774     SEQ ID NO:1566
    chr4:3072396-3072416     SEQ ID NO:1567
    chr4:3072403-3072423     SEQ ID NO:15;1566
    chr4:
  • the second guide sequence comprises a sequence of Table 1. In some embodiments, the second guide sequence comprises any one of SEQ ID NOs:13-19 or any one of SEQ ID NOs:1561-1576. [125] In some embodiments, the second guide sequence targets chr4:3074430-3074450. In some embodiments, the second guide sequence comprises any one of SEQ ID NOs:13-19. In some embodiments, the second guide sequence comprises SEQ ID NO:13. In some embodiments, the second guide sequence comprises at its 5’ end: at least one guanine and at least one additional nucleotide that do not hybridize to the target sequence and that prevent formation of a secondary structure, e.g., a stem loop.
  • the second guide sequence comprises two or more guanines at its 5’ end. In some embodiments, the two or more guanines at the 5’ end of the second guide sequence prevent formation of secondary structures. In some embodiments, the second guide sequence comprises 2, 3, 4, 5, or more than 5 guanines at its 5’ end. In some embodiments, the second guide sequence comprises any one of SEQ ID NOs:14-19. Scaffold sequence [126] In some embodiments, the polynucleotide of the present disclosure comprises a guide sequence, e.g., a first guide sequence or second guide sequence as described herein, and a scaffold sequence. In some embodiments, the scaffold sequence is capable of binding to a Cas protein.
  • the scaffold sequence is a tracrRNA sequence for a Cas protein.
  • the Cas protein is SpCas9.
  • the Cas protein is SaCas9.
  • the scaffold sequence of the present disclosure provides improved stability and/or expression of the gRNA.
  • expression of the gRNA comprises transcription by a polymerase, e.g., an RNA Polymerase such as RNA Pol I, RNA Pol II, or RNA Pol III.
  • the scaffold sequence of the present disclosure provides improved CRISPR editing efficiency.
  • the scaffold sequence (i) does not comprise an early termination signal sequence; and/or (ii) comprises a stabilized secondary structure, thereby improving stability and/or expression of the gRNA.
  • the scaffold sequence does not comprise an early termination signal sequence.
  • a termination signal sequence is a nucleotide sequence that is recognized by a polymerase, e.g., an RNA polymerase such as RNA Pol I, RNA Pol II, or RNA Pol III, to terminate transcription.
  • a scaffold sequence comprising an “early” termination signal sequence comprises the termination signal sequence within 10 nucleotides, within 15 nucleotides, within 20 nucleotides, within 25 nucleotides, or within 30 nucleotides of the 3’ end of the scaffold sequence. In some embodiments, a scaffold sequence comprising an “early” termination signal sequence comprises the termination signal sequence within 1 nucleotide, within 3 nucleotides, within 5 nucleotides, within 10 nucleotides, within 15 nucleotides, or within 20 nucleotides of the 5’ end of the scaffold sequence.
  • the early termination signal sequence comprises a stretch of at least 2, at least 3, at least 4, at least 5, or at least 6 consecutive thymine bases. In some embodiments, the early termination signal sequence comprises about 2 to about 8 consecutive thymine bases. In some embodiments, the early termination signal sequence comprises about 3 to about 7 consecutive thymine bases. In some embodiments, the early termination signal sequence comprises about 4 to about 6 consecutive thymine bases.
  • a scaffold sequence that comprises an early termination signal sequence has lower expression as compared to a scaffold sequence that does not comprise an early termination signal sequence.
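The early termination signal described above is a run of consecutive thymine bases near an end of the scaffold sequence. The sketch below locates such runs in a DNA-alphabet string. The function name, the default threshold of four, and the example sequences are assumptions for illustration. The examples are hypothetical placeholders and are not SEQ ID NO:1553 or any other sequence of this disclosure.

```python
import re


def thymine_runs(seq: str, min_run: int = 4) -> list:
    """Return (1-based start position, run length) for each run of >= min_run consecutive T bases."""
    pattern = re.compile("T{%d,}" % min_run)
    return [(m.start() + 1, len(m.group())) for m in pattern.finditer(seq.upper())]


# Hypothetical placeholder with four consecutive T bases at positions 2-5.
print(thymine_runs("GTTTTAGC"))

# Replacing one T in the run with another base removes the run at this threshold.
print(thymine_runs("GTTATAGC"))
```

The `min_run` parameter can be varied to reflect the different run lengths recited above, e.g., at least 2 through at least 6 consecutive thymine bases.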
  • a conventional scaffold sequence for a Cas protein described herein, e.g., SaCas9, comprises an early termination signal sequence.
  • the conventional scaffold sequence is based on the wild-type tracrRNA sequence of the Cas protein.
  • Conventional scaffold sequences for Cas proteins such as SaCas9 are described, e.g., in Ran et al., Nature 520:186-191, 2015.
  • the scaffold sequence of the present disclosure, which does not comprise the early termination signal sequence, is capable of binding to a Cas protein with substantially similar affinity as the conventional scaffold sequence for the Cas protein.
  • the 5’ end of the conventional scaffold sequence comprises SEQ ID NO:1553.
  • SEQ ID NO:1553 comprises four consecutive thymine bases at nucleotide positions 2-5.
  • the scaffold sequence comprises SEQ ID NO:1553, except the T at nucleotide position 2 is A, G, or C. In some embodiments, the scaffold sequence comprises SEQ ID NO:1553, except the T at nucleotide position 3 is A, G, or C. In some embodiments, the scaffold sequence comprises SEQ ID NO:1553, except the T at nucleotide position 4 is A, G, or C. In some embodiments, the scaffold sequence comprises SEQ ID NO:1553, except the T at nucleotide position 5 is A, G, or C. [131] In some embodiments, the scaffold sequence comprises SEQ ID NO:1553, except that at least one of nucleotide positions 2-5 is A, G, or C.
  • the scaffold sequence comprises SEQ ID NO:1553 with the following modifications at nucleotide positions 2-5 of SEQ ID NO:1553: position 2 is A, G, or C, and positions 3, 4, and 5 are each T; position 3 is A, G, or C, and positions 2, 4, and 5 are each T; position 4 is A, G, or C, and positions 2, 3, and 5 are each T; position 5 is A, G, or C, and positions 2, 3, and 4 are each T; positions 2 and 3 are each independently A, G, or C, and positions 4 and 5 are each T; positions 2 and 4 are each independently A, G, or C, and positions 3 and 5 are each T; positions 2 and 5 are each independently A, G, or C, and positions 3 and 4 are each T; positions 3 and 4 are each independently A, G, or C, and positions 2 and 5 are each T; positions 3 and 5 are each independently A, G, or C, and positions 2 and 4 are each T; positions 3 and
  • the scaffold sequence comprises any one of SEQ ID NOs:1554-1560. In some embodiments, at least one of nucleotide positions 2-5 of SEQ ID NOs:1554-1560 is A, G, or C. [133] In some embodiments, the scaffold sequence comprises SEQ ID NO:1553 comprising a modification as described herein, i.e., a modified SEQ ID NO:1553, and a downstream sequence capable of base pairing with the modified SEQ ID NO:1553. In some embodiments, the downstream sequence capable of base pairing with the modified SEQ ID NO:1553 is a reverse complement of SEQ ID NO:1553. In some embodiments, the base pairing forms a loop structure in the scaffold sequence.
  • the scaffold sequence comprises (A) SEQ ID NO:1553 comprising a modification as described herein; and (B) a downstream sequence that is a reverse complement of (A). In some embodiments, the scaffold sequence comprises (A) any one of SEQ ID NOs:1554-1560; and (B) a downstream sequence that is a reverse complement of (A). In some embodiments, the scaffold sequence comprises (A) any one of SEQ ID NOs:1554-1560, wherein at least one of nucleotide positions 2-5 of SEQ ID NO:1554-1560 is A, G, or C; and (B) a downstream sequence that is a reverse complement of (A). In some embodiments, (A) and (B) are paired together to form a loop.
  • (A) and (B) are separated by about 1 to about 10 nucleotides. In some embodiments, (A) and (B) are separated by about 2 to about 8 nucleotides. In some embodiments, (A) and (B) are separated by about 3 to about 6 nucleotides. In some embodiments, (A) and (B) are separated by about 4 to about 5 nucleotides. In some embodiments, (A) and (B) are separated by about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, or about 10 nucleotides. In some embodiments, (A) and (B) are directly adjacent to one another.
  • (A) and (B) may be complementary without necessarily having complementarity at each nucleotide.
  • (A) and (B) include one or more nucleotide mismatches, i.e., points at which hydrogen bonding does not occur.
  • complementary oligonucleotides can have at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% of nucleotides hydrogen bonded.
  • (A) and (B) are fully complementary.
  • the complementarity between (A) and (B) comprises Watson-Crick base pairing, Hoogsteen base pairing, or a combination thereof.
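The percentage of hydrogen-bonded nucleotides between (A) and a downstream sequence (B) can be estimated by aligning (A), read 5’ to 3’, against (B), read 3’ to 5’. The sketch below counts Watson-Crick pairs only. Hoogsteen pairing, which is also contemplated above, is not modeled, and the function name and example sequences are hypothetical.

```python
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def paired_fraction(a: str, b: str) -> float:
    """Fraction of positions at which `a` (5'->3') pairs with `b` read 3'->5'.

    Both strings are given 5'->3' and must have equal length.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    b_antiparallel = b.upper()[::-1]
    pairs = sum(1 for x, y in zip(a.upper(), b_antiparallel) if WATSON_CRICK.get(x) == y)
    return pairs / len(a)


# Hypothetical placeholders: the second string is the exact reverse complement of the first.
print(paired_fraction("GATCA", "TGATC"))  # 1.0, fully complementary

# One mismatch at the end that would pair with the first base of (A).
print(paired_fraction("GATCA", "TGATA"))  # 0.8
```

A result of 1.0 corresponds to the fully complementary case, and values such as 0.8 or above correspond to the partial-complementarity thresholds listed above.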
  • the scaffold sequence comprises a stabilized secondary structure.
  • the stabilized secondary structure comprises a sequence that promotes formation of the secondary structure.
  • the stabilized secondary structure comprises a sequence that improves stability of the secondary structure.
  • the stabilized secondary structure has improved binding affinity to its corresponding Cas protein, e.g., SaCas9.
  • the stabilized secondary structure comprises a stem loop.
  • the stabilized secondary structure comprises a modification that promotes folding and/or improves stability of a stem loop sequence present in a conventional scaffold sequence for a Cas protein described herein, e.g., SaCas9.
  • the conventional scaffold sequence is based on the wild-type tracrRNA sequence of the Cas protein.
  • the stem loop sequence comprises “GAAA.” In some embodiments, the stem loop sequence comprises “AAAAT.” In some embodiments, the stem loop sequence comprises “ACTT.”
  • the scaffold sequence comprises at least 1, at least 2, or at least 3 stem loops, wherein each stem loop comprises a modification described herein. In some embodiments, the modification comprises one or more additional nucleotides that are incorporated into the stem loop sequence. In some embodiments, the modification increases the number of hydrogen bonding interactions in the stem loop, thereby promoting folding and/or improving stability of the stem loop.
  • the modification comprises adding the flanking nucleotides “CTG” and “CAG” to the stem loop sequence. In some embodiments, the modification comprises adding the flanking nucleotides “TAATT” and “AATTA” to the stem loop sequence.
  • the stabilized secondary structure comprises a locked loop. In some embodiments, one or more stem loops of the conventional scaffold sequence is replaced with a locked loop.
  • the locked loop comprises a sequence that induces folding into a highly stabilized stem loop, e.g., with a melting temperature of at least 65°C, at least 66°C, at least 67°C, at least 68°C, at least 69°C, at least 70°C, or at least 71°C.
  • An exemplary locked loop sequence is GGACTTCGGTCC (SEQ ID NO:1540). Locked loops are further described, e.g., in Riesenberg et al., Nat Commun 13:489, 2022.
  • the stabilized secondary structure comprises the sequence GGACTTCGGTCC (SEQ ID NO:1540).
  • the stabilized secondary structure comprises the sequence CTGGAAACAG (SEQ ID NO:1541). In some embodiments, the stabilized secondary structure comprises the sequence TAATTGAAAAATTA (SEQ ID NO:1542). In some embodiments, the stabilized secondary structure comprises the sequence CTGAAAATCAG (SEQ ID NO:1543). In some embodiments, the stabilized secondary structure comprises the sequence TAATTAAAATAATTA (SEQ ID NO:1544). In some embodiments, the stabilized secondary structure comprises the sequence CTGACTTCAG (SEQ ID NO:1545). In some embodiments, the stabilized secondary structure comprises the sequence TAATTACTTAATTA (SEQ ID NO:1546).
  • the stabilized secondary structure comprises the sequence CAGGAAACTG (SEQ ID NO:1547). In some embodiments, the stabilized secondary structure comprises the sequence AATTAGAAATAATT (SEQ ID NO:1548). In some embodiments, the stabilized secondary structure comprises the sequence CAGAAAATCTG (SEQ ID NO:1549). In some embodiments, the stabilized secondary structure comprises the sequence AATTAAAAATTAATT (SEQ ID NO:1550). In some embodiments, the stabilized secondary structure comprises the sequence CAGACTTCTG (SEQ ID NO:1551). In some embodiments, the stabilized secondary structure comprises the sequence AATTAACTTTAATT (SEQ ID NO:1552).
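The stabilized stem loop sequences listed above each consist of a loop sequence (“GAAA,” “AAAAT,” or “ACTT”) placed between two flanking sequences that are reverse complements of one another, so the flanks can base pair. The sketch below checks that property for SEQ ID NOs:1541-1552 as written in the text. The function names are illustrative, and the check is purely string-based and does not predict folding or melting temperature.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def is_flanked_hairpin(seq: str, loop: str) -> bool:
    """True if `seq` is 5' flank + `loop` + 3' flank, with the 3' flank
    equal to the reverse complement of the 5' flank."""
    stem_len, remainder = divmod(len(seq) - len(loop), 2)
    if stem_len <= 0 or remainder:
        return False
    five = seq[:stem_len]
    middle = seq[stem_len:stem_len + len(loop)]
    three = seq[stem_len + len(loop):]
    return middle == loop and three == reverse_complement(five)


# Sequences and loops as given in the text (SEQ ID NOs:1541-1552).
STEM_LOOPS = {
    1541: ("CTGGAAACAG", "GAAA"),
    1542: ("TAATTGAAAAATTA", "GAAA"),
    1543: ("CTGAAAATCAG", "AAAAT"),
    1544: ("TAATTAAAATAATTA", "AAAAT"),
    1545: ("CTGACTTCAG", "ACTT"),
    1546: ("TAATTACTTAATTA", "ACTT"),
    1547: ("CAGGAAACTG", "GAAA"),
    1548: ("AATTAGAAATAATT", "GAAA"),
    1549: ("CAGAAAATCTG", "AAAAT"),
    1550: ("AATTAAAAATTAATT", "AAAAT"),
    1551: ("CAGACTTCTG", "ACTT"),
    1552: ("AATTAACTTTAATT", "ACTT"),
}

for seq_id, (seq, loop) in STEM_LOOPS.items():
    print(seq_id, seq, is_flanked_hairpin(seq, loop))
```

Each added flank contributes base pairs to the stem, which is consistent with the statement above that the modification increases the number of hydrogen bonding interactions in the stem loop.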
  • the scaffold sequence comprises at least 1, at least 2, or at least 3 stem loops. In some embodiments, the scaffold sequence comprises at least three stem loops, wherein sequences of stem loops 1, 2, and 3 are according to Table 2.
    Table 2. Scaffold Sequence Stem Loops
    Combination #   Stem loop 1       Stem loop 2       Stem loop 3
    1a              SEQ ID NO:1540    AAAAT             ACTT
    1b              SEQ ID NO:1541    AAAAT             ACTT
    1c              SEQ ID NO:1542    AAAAT             ACTT
    1d              SEQ ID NO:1547    AAAAT             ACTT
    1e              SEQ ID NO:1548    AAAAT             ACTT
    2a              GAAA              SEQ ID NO:1540    ACTT
    2b              GAAA              SEQ ID NO:1543    ACTT
    2c              GAAA              SEQ ID NO:1544    ACTT
    2d              GAAA              SEQ ID NO:1549    ACTT
    2e              GAAA              SEQ ID NO:1550    ACTT
    3a              GAAA              AAAAT             SEQ ID NO:1540
    3b              GAAA              AAAAT             SEQ ID NO:1545
    3c              GAAA
  • the scaffold sequence comprises (a) SEQ ID NO:1553 at a 5’ end; and (b) at least three stem loops, wherein stem loops 1, 2, and 3 comprise any of combinations 1a-7e of Table 2.
  • the scaffold sequence comprises any one of SEQ ID NOs:28, 29, and 44-46.
  • each of SEQ ID NOs:28, 29, and 44-46 comprises SEQ ID NO:1553 at a 5’ end.
  • each of SEQ ID NOs:28, 29, and 44-46 comprises a stabilized secondary structure, e.g., a stem loop, as described herein.
  • the scaffold sequence comprises (a) any one of SEQ ID NOs:1554-1560 at a 5’ end; and (b) at least three stem loops, wherein stem loops 1, 2, and 3 comprise the sequences GAAA, AAAAT, and ACTT, respectively.
  • the scaffold sequence comprises any one of SEQ ID NOs:21-27.
  • each of SEQ ID NOs:21-27 does not comprise an early termination signal sequence.
  • each of SEQ ID NOs:21-27 comprises three stem loops, wherein stem loops 1, 2, and 3 comprise the sequences GAAA, AAAAT, and ACTT, respectively.
  • the scaffold sequence (i) does not comprise an early termination signal sequence; and (ii) comprises a stabilized secondary structure. In some embodiments, the scaffold sequence provides improved stability and/or expression of the gRNA as compared to a scaffold sequence characterized by only one of (i) and (ii) or a scaffold sequence characterized by neither of (i) and (ii). [143] In some embodiments, the scaffold sequence comprises (i) any one of SEQ ID NOs:1554-1560 at a 5’ end; and (ii) at least three stem loops, wherein stem loops 1, 2, and 3 comprise any of combinations 1a-7e of Table 2, e.g., according to Table 3: Table 3.
  • each of SEQ ID NOs:28-43 and 47-95 (i) does not comprise an early termination signal sequence and (ii) comprises a stabilized secondary structure, e.g., a stem loop, as described herein.
  • at least one of nucleotide positions 2-5 of each of SEQ ID NOs:28-43 and 47-95 is A, G, or C.
  • each of SEQ ID NOs:28-43 and 47-95 comprises at least three stem loops, wherein each stem loop independently comprises any one of SEQ ID NOs:1540-1552.
  • each of SEQ ID NOs:28-43 and 47-95 comprises at least three stem loops, wherein stem loop 1 comprises any one of SEQ ID NO:1540-1542, 1547, or 1548; stem loop 2 comprises any one of SEQ ID NOs:1540, 1543, 1544, 1549, or 1550; and/or stem loop 3 comprises any one of SEQ ID NOs:1540, 1545, 1546, 1551, or 1552.
  • the scaffold sequence comprises any one of SEQ ID NOs:20-95.
  • the scaffold sequence comprises any one of SEQ ID NOs:21-95.
  • the scaffold sequence comprises SEQ ID NO:20.
  • the scaffold sequence comprises SEQ ID NO:21. In some embodiments, the scaffold sequence comprises SEQ ID NO:22. In some embodiments, the scaffold sequence comprises SEQ ID NO:25. In some embodiments, the scaffold sequence comprises SEQ ID NO:28. In some embodiments, the scaffold sequence comprises SEQ ID NO:29. In some embodiments, the scaffold sequence comprises SEQ ID NO:30. In some embodiments, the scaffold sequence comprises SEQ ID NO:31. In some embodiments, the scaffold sequence comprises SEQ ID NO:44. In some embodiments, the scaffold sequence comprises SEQ ID NO:45. In some embodiments, the scaffold sequence comprises SEQ ID NO:46. In some embodiments, the scaffold sequence comprises SEQ ID NO:47.
  • polynucleotide comprising guide and scaffold sequences
  • the polynucleotide of the present disclosure comprises a guide sequence described herein and a scaffold sequence described herein.
  • the polynucleotide is a first polynucleotide, and the guide sequence is a first guide sequence.
  • the first guide sequence targets human genome coordinates chr4:3078423-3078443, and the scaffold sequence (i) does not comprise an early termination signal sequence; and/or (ii) comprises a stabilized secondary structure.
  • the first guide sequence targets human genome coordinates chr4:3078423-3078443, and nucleotide positions 2-5 of the scaffold sequence comprise at least one A, G, or C. In some embodiments, the first guide sequence targets human genome coordinates chr4:3078423-3078443, and a 5’ end of the scaffold sequence comprises any one of SEQ ID NOs:1554-1560. In some embodiments, the first guide sequence targets human genome coordinates chr4:3078423-3078443, and the scaffold sequence comprises a stabilized secondary structure.
  • the first guide sequence targets human genome coordinates chr4:3078423-3078443, and the scaffold sequence comprises at least three stem loops, wherein stem loops 1, 2, and 3 comprise any of combinations 1a-7e of Table 2.
  • the first guide sequence targets human genome coordinates chr4:3078423-3078443, and the scaffold sequence comprises a sequence according to Table 3.
  • the first guide sequence targets human genome coordinates chr4:3078423-3078443, and the scaffold sequence comprises any one of SEQ ID NOs:21-95.
  • the first guide sequence comprises any one of SEQ ID NOs:2-12, 13-19, and 1561-1576, wherein the first guide sequence comprising SEQ ID NO:2 does not comprise a 5’ guanine, and the scaffold sequence is capable of binding a Cas protein, e.g., SpCas9 or SaCas9. In some embodiments, the scaffold sequence is capable of binding SaCas9. In some embodiments, the first guide sequence comprises any one of SEQ ID NOs:2-12, 13-19, and 1561-1576, wherein the first guide sequence comprising SEQ ID NO:2 does not comprise a 5’ guanine, and the scaffold sequence does not comprise an early termination signal sequence.
  • the first guide sequence comprises any one of SEQ ID NOs:2-12, 13-19, and 1561-1576, wherein the first guide sequence comprising SEQ ID NO:2 does not comprise a 5’ guanine, and the scaffold sequence comprises a stabilized secondary structure.
  • the first guide sequence comprises any one of SEQ ID NOs:2-12, 13-19, and 1561-1576, wherein the first guide sequence comprising SEQ ID NO:2 does not comprise a 5’ guanine, and the scaffold sequence (i) does not comprise an early termination signal sequence; and (ii) comprises a stabilized secondary structure.
  • the first guide sequence comprises any one of SEQ ID NOs:2-12, 13-19, and 1561-1576, wherein the first guide sequence comprising SEQ ID NO:2 does not comprise a 5’ guanine, and the scaffold sequence comprises any one of SEQ ID NOs:20-95.
  • the polynucleotide is a second polynucleotide, and the guide sequence is a second guide sequence.
  • the second guide sequence targets a sequence within human genome coordinates chr4:3068346-3074862, e.g., any of the target sequences of Table 1, and the scaffold sequence comprises SEQ ID NO:20.
  • the second guide sequence targets a sequence within human genome coordinates chr4:3068346-3074862, e.g., any of the target sequences of Table 1, and the scaffold sequence (i) does not comprise an early termination signal sequence; and/or (ii) comprises a stabilized secondary structure.
  • the second guide sequence targets a sequence within human genome coordinates chr4:3068346-3074862, e.g., any of the target sequences of Table 1, and nucleotide positions 2-5 of the scaffold sequence comprise at least one A, G, or C.
  • the second guide sequence targets a sequence within human genome coordinates chr4:3068346-3074862, e.g., any of the target sequences of Table 1, and a 5’ end of the scaffold sequence comprises any one of SEQ ID NOs:1554-1560.
  • the second guide sequence targets a sequence within human genome coordinates chr4:3068346-3074862, e.g., any of the target sequences of Table 1, and the scaffold sequence comprises a stabilized secondary structure.
  • the second guide sequence targets a sequence within human genome coordinates chr4:3068346-3074862, e.g., any of the target sequences of Table 1, and the scaffold sequence comprises at least three stem loops, wherein stem loops 1, 2, and 3 comprise any of combinations 1a-7e of Table 2.
  • the second guide sequence targets a sequence within human genome coordinates chr4:3068346-3074862, e.g., any of the target sequences of Table 1, and the scaffold sequence comprises a sequence according to Table 3.
  • the second guide sequence targets a sequence within human genome coordinates chr4:3068346-3074862, e.g., any of the target sequences of Table 1, and the scaffold sequence comprises any one of SEQ ID NOs:21-95.
  • the second guide sequence targets a sequence within human genome coordinates chr4:3068346-3074862, e.g., any of the target sequences of Table 1, and the scaffold sequence comprises any one of SEQ ID NOs:21, 22, 25, 28-31, and 44-47.
  • the second guide sequence comprises any one of SEQ ID NOs:13-19 or any one of SEQ ID NOs:1561-1576, and the scaffold sequence comprises any one of SEQ ID NOs:21-95.
  • the second guide sequence comprises any one of SEQ ID NOs:13-19 or any one of SEQ ID NOs:1561-1576, and the scaffold sequence comprises any one of SEQ ID NOs:21, 22, 25, 28-31, and 44-47.
  • the second guide sequence comprises any one of SEQ ID NOs:13-19 or any one of SEQ ID NOs:1561-1576, and the scaffold sequence is capable of binding a Cas protein, e.g., SpCas9 or SaCas9.
  • the scaffold sequence is capable of binding SaCas9.
  • the second guide sequence comprises any one of SEQ ID NOs:13-19 or any one of SEQ ID NOs:1561-1576, and the scaffold sequence does not comprise an early termination signal sequence. In some embodiments, the second guide sequence comprises any one of SEQ ID NOs:13-19 or any one of SEQ ID NOs:1561-1576, and the scaffold sequence comprises a stabilized secondary structure. In some embodiments, the second guide sequence comprises any one of SEQ ID NOs:13-19 or any one of SEQ ID NOs:1561-1576, and the scaffold sequence (i) does not comprise an early termination signal sequence; and (ii) comprises a stabilized secondary structure.
  • the second guide sequence comprises any one of SEQ ID NOs:13-19 or any one of SEQ ID NOs:1561-1576, and the scaffold sequence comprises any one of SEQ ID NOs:20-95.
  • the polynucleotide comprises: a guide sequence comprising any one of SEQ ID NOs:2-12, 13-19, and 1561-1576, any one of SEQ ID NOs:13-19 or any one of SEQ ID NOs:1561-1576, wherein the guide sequence comprising SEQ ID NO:2 does not comprise a 5’ guanine; and a scaffold sequence comprising any one of SEQ ID NOs:21-95.
  • the polynucleotide comprises a sequence according to Table 4 or Table 5.
  • the polynucleotide comprises any one of SEQ ID NOs:97-1007. In some embodiments, the polynucleotide comprises any one of SEQ ID NOs:97-171. In some embodiments, the polynucleotide comprises any one of SEQ ID NOs:172-247. In some embodiments, the polynucleotide comprises any one of SEQ ID NOs:97, 98, 101, 104-107, and 120-123. In some embodiments, the polynucleotide comprises any one of SEQ ID NOs:172-174, 177, 180-183, and 196-199. In some embodiments, the polynucleotide comprises SEQ ID NO:97.
  • the polynucleotide comprises SEQ ID NO:98. In some embodiments, the polynucleotide comprises SEQ ID NO:101. In some embodiments, the polynucleotide comprises SEQ ID NO:104. In some embodiments, the polynucleotide comprises SEQ ID NO:105. In some embodiments, the polynucleotide comprises SEQ ID NO:106. In some embodiments, the polynucleotide comprises SEQ ID NO:107. In some embodiments, the polynucleotide comprises SEQ ID NO:120. In some embodiments, the polynucleotide comprises SEQ ID NO:121. In some embodiments, the polynucleotide comprises SEQ ID NO:122.
  • the polynucleotide comprises SEQ ID NO:123. In some embodiments, the polynucleotide comprises SEQ ID NO:172. In some embodiments, the polynucleotide comprises SEQ ID NO:173. In some embodiments, the polynucleotide comprises SEQ ID NO:174. In some embodiments, the polynucleotide comprises SEQ ID NO:183. In some embodiments, the polynucleotide comprises SEQ ID NO:196. In some embodiments, the polynucleotide comprises SEQ ID NO:199. In some embodiments, the polynucleotide comprises SEQ ID NO:248. In some embodiments, the polynucleotide comprises SEQ ID NO:324.
  • the polynucleotide comprises SEQ ID NO:400. In some embodiments, the polynucleotide comprises SEQ ID NO:476. In some embodiments, the polynucleotide comprises SEQ ID NO:552. In some embodiments, the polynucleotide comprises SEQ ID NO:628. In some embodiments, the polynucleotide comprises SEQ ID NO:704. In some embodiments, the polynucleotide comprises SEQ ID NO:780. In some embodiments, the polynucleotide comprises SEQ ID NO:856. In some embodiments, the polynucleotide comprises SEQ ID NO:932.
  • the polynucleotide comprising any one of SEQ ID NOs:96-1007 is a first polynucleotide, e.g., first gRNA.
  • the polynucleotide comprises any one of SEQ ID NOs:1008-1539.
  • the polynucleotide comprises any one of SEQ ID NOs:1008-1083.
  • the polynucleotide comprises any one of SEQ ID NOs:1008-1010, 1013, 1016-1019, and 1032-1035.
  • the polynucleotide comprises SEQ ID NO:1008.
  • the polynucleotide comprises SEQ ID NO:1009. In some embodiments, the polynucleotide comprises SEQ ID NO:1010. In some embodiments, the polynucleotide comprises SEQ ID NO:1013. In some embodiments, the polynucleotide comprises SEQ ID NO:1016. In some embodiments, the polynucleotide comprises SEQ ID NO:1017. In some embodiments, the polynucleotide comprises SEQ ID NO:1018. In some embodiments, the polynucleotide comprises SEQ ID NO:1019. In some embodiments, the polynucleotide comprises SEQ ID NO:1032.
  • the polynucleotide comprises SEQ ID NO:1035. In some embodiments, the polynucleotide comprises SEQ ID NO:1084. In some embodiments, the polynucleotide comprises SEQ ID NO:1160. In some embodiments, the polynucleotide comprises SEQ ID NO:1236. In some embodiments, the polynucleotide comprises SEQ ID NO:1312. In some embodiments, the polynucleotide comprises SEQ ID NO:1388. In some embodiments, the polynucleotide comprises SEQ ID NO:1464.
  • the polynucleotide comprising any one of SEQ ID NOs:1008-1539 is a second polynucleotide, e.g., second gRNA.
  • the polynucleotide comprises a combination of SEQ ID NO:1561 and any one of SEQ ID NOs:20-95.
  • the polynucleotide comprises a combination of SEQ ID NO:1562 and any one of SEQ ID NOs:20-95.
  • the polynucleotide comprises a combination of SEQ ID NO:1563 and any one of SEQ ID NOs:20-95.
  • the polynucleotide comprises a combination of SEQ ID NO:1564 and any one of SEQ ID NOs:20-95. In some embodiments, the polynucleotide comprises a combination of SEQ ID NO:1565 and any one of SEQ ID NOs:20-95. In some embodiments, the polynucleotide comprises a combination of SEQ ID NO:1566 and any one of SEQ ID NOs:20-95. In some embodiments, the polynucleotide comprises a combination of SEQ ID NO:1567 and any one of SEQ ID NOs:20-95.
  • the polynucleotide comprises a combination of SEQ ID NO:1568 and any one of SEQ ID NOs:20-95. In some embodiments, the polynucleotide comprises a combination of SEQ ID NO:1569 and any one of SEQ ID NOs:20-95. In some embodiments, the polynucleotide comprises a combination of SEQ ID NO:1570 and any one of SEQ ID NOs:20-95. In some embodiments, the polynucleotide comprises a combination of SEQ ID NO:1571 and any one of SEQ ID NOs:20-95.
  • the polynucleotide comprises a combination of SEQ ID NO:1572 and any one of SEQ ID NOs:20-95. In some embodiments, the polynucleotide comprises a combination of SEQ ID NO:1573 and any one of SEQ ID NOs:20-95. In some embodiments, the polynucleotide comprises a combination of SEQ ID NO:1574 and any one of SEQ ID NOs:20-95. In some embodiments, the polynucleotide comprises a combination of SEQ ID NO:1575 and any one of SEQ ID NOs:20-95.
  • the polynucleotide comprises a combination of SEQ ID NO:1576 and any one of SEQ ID NOs:20-95.
  • a combination of two sequences comprises the two sequences joined end-to-end (i.e., the 3’ end of the first indicated sequence joined directly to the 5’ end of the second indicated sequence).
  • the polynucleotide comprising the combination of any one of SEQ ID NOs:1561-1576 with any one of SEQ ID NOs:20-95 is a second polynucleotide, e.g., second gRNA.
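The end-to-end joining defined above (3’ end of the first indicated sequence joined directly to the 5’ end of the second) can be illustrated with a minimal Python sketch. The function name `join_end_to_end` and both example sequences are hypothetical placeholders for illustration; they are not taken from the sequence listing.

```python
def join_end_to_end(first: str, second: str) -> str:
    """Join two sequences written 5'->3' so that the 3' end of `first`
    directly abuts the 5' end of `second` (no linker in between)."""
    for name, seq in (("first", first), ("second", second)):
        if not seq or set(seq.upper()) - set("ACGTU"):
            raise ValueError(f"{name} sequence must be non-empty and use only A, C, G, T, or U")
    return first.upper() + second.upper()


# Hypothetical placeholder sequences, NOT sequences from the sequence listing.
guide = "ACGTACGTACGTACGTACGTA"
scaffold = "GGCCAATTGGCC"
grna = join_end_to_end(guide, scaffold)
```

In this sketch the order of the arguments matters: the guide is passed first so that it sits 5’ of the scaffold in the joined sequence.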
  • the disclosure provides an expression system comprising a nucleic acid sequence encoding one or more of the polynucleotides described herein.
  • the expression system comprises a first nucleic acid sequence encoding the first polynucleotide (e.g., first gRNA) described herein.
  • the expression system comprises a second nucleic acid sequence encoding the second polynucleotide (e.g., second gRNA) described herein.
  • the first polynucleotide comprises a first guide sequence and a first scaffold sequence as described herein.
  • the second polynucleotide comprises a second guide sequence and a second scaffold sequence as described herein.
  • the expression system further comprises a third nucleic acid sequence encoding an RNA-guided nuclease described herein.
  • the third nucleic acid sequence encodes a Cas protein.
  • the Cas protein is Cas9.
  • the Cas protein is capable of forming a complex with the first polynucleotide.
  • the Cas protein is capable of forming a complex with the second polynucleotide.
  • the Cas protein is capable of binding to the first scaffold sequence.
  • the Cas protein is capable of binding to the second scaffold sequence.
  • the Cas protein is SaCas9 or SpCas9.
  • the Cas protein is SaCas9.
  • the disclosure provides a composition comprising an RNA-guided nuclease and one or more polynucleotides described herein.
  • the disclosure provides a composition comprising a Cas protein and one or more polynucleotides described herein.
  • the composition comprises a Cas protein and the first polynucleotide (e.g., first gRNA) described herein.
  • the composition comprises a Cas protein and the second polynucleotide (e.g., second gRNA) described herein.
  • the first polynucleotide comprises a first guide sequence and a first scaffold sequence.
  • the second polynucleotide comprises a second guide sequence and a second scaffold sequence.
  • the Cas protein is capable of binding to the first scaffold sequence, thereby forming a complex with the first polynucleotide.
  • the Cas protein is capable of binding to the second scaffold sequence, thereby forming a complex with the second polynucleotide.
  • the Cas protein is Cas9.
  • the Cas protein is SaCas9 or SpCas9. In some embodiments, the Cas protein is SaCas9. In some embodiments, the composition comprises a Cas protein and an expression system described herein.
  • the first guide sequence targets human genome coordinates chr4:3078423-3078443 as described herein. In some embodiments, the first guide sequence comprises any one of SEQ ID NOs:1-12, wherein the first guide sequence comprising SEQ ID NO:2 does not comprise a 5’ guanine.
  • the second guide sequence targets a sequence within human genome coordinates chr4:3072397-3074774 as described herein.
  • the second guide sequence targets a target sequence of Table 1.
  • the second guide sequence targets a sequence within human genome coordinates chr4:3074430-3074450 as described herein.
  • the second guide sequence comprises any one of SEQ ID NOs:13-19 or any one of SEQ ID NOs:1561-1576.
  • the second guide sequence comprises SEQ ID NO:13.
  • the first guide sequence and/or the second guide sequence does not form a secondary structure.
  • the first guide sequence and/or the second guide sequence does not form a stem loop.
  • the first and second scaffold sequences are each capable of binding to a Cas protein.
  • the Cas protein is Cas9, e.g., SaCas9 or SpCas9.
  • the first and second scaffold sequences are each capable of binding to the Cas protein of the composition.
  • the first scaffold sequence and the second scaffold sequence each independently (i) does not comprise an early termination signal sequence and (ii) comprises a stabilized secondary structure.
  • the early termination signal sequence comprises about 2 to about 8, or about 3 to about 7, or about 4 to about 6 consecutive thymine bases.
  • a 5’ end of the first scaffold sequence and/or the second scaffold sequence comprises any one of SEQ ID NOs:1554-1560.
  • the stabilized secondary structure comprises a locked loop.
  • the stabilized secondary structure comprises any one of SEQ ID NOs:1540-1552.
  • the first scaffold sequence and the second scaffold sequence each independently comprises at least three stem loops, wherein stem loops 1, 2, and 3 comprise any of combinations 1a-7e of Table 2.
  • the first scaffold sequence and the second scaffold sequence each independently comprises a 5’ end sequence and stem loop sequences as shown in Table 3.
  • the first scaffold sequence and the second scaffold sequence each independently comprises any one of SEQ ID NOs:20-95.
  • the first scaffold sequence and the second scaffold sequence are identical.
  • the first scaffold sequence and the second scaffold sequence are different.
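As an illustration of the early termination signal described above (a run of consecutive thymine bases), the following Python sketch screens a candidate scaffold sequence for such a run. The function name, the default threshold of 4 (taken from the lower bound of the "about 4 to about 6" range), and the example sequences are assumptions made for illustration only.

```python
import re


def has_early_termination_signal(seq: str, min_run: int = 4) -> bool:
    """Return True if `seq` contains at least `min_run` consecutive T bases.

    U is treated as T so that RNA and DNA spellings are screened alike.
    """
    if min_run < 1:
        raise ValueError("min_run must be at least 1")
    normalized = seq.upper().replace("U", "T")
    # Build a pattern such as T{4,}: four or more consecutive thymines.
    return re.search(rf"T{{{min_run},}}", normalized) is not None


# Hypothetical placeholder sequences, NOT sequences from the sequence listing.
print(has_early_termination_signal("GATTTTAGC"))  # run of four T bases
print(has_early_termination_signal("GATTTAGC"))   # run of only three T bases
```

The threshold is a parameter because the text gives several ranges for the run length; a stricter screen can be obtained by lowering `min_run`.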
  • the expression system comprises a first nucleic acid sequence encoding a first polynucleotide as described herein.
  • the composition comprises a first polynucleotide as described herein.
  • the first polynucleotide encoded by the first nucleic acid sequence of the expression system, or the first polynucleotide of the composition comprises: (i) a first guide sequence targeting human genome coordinates chr4:3078423-3078443; and (ii) a first scaffold sequence comprising any one of SEQ ID NOs:21-95.
  • the first polynucleotide comprises: (i) a first guide sequence comprising any one of SEQ ID NOs:2-12, wherein the guide sequence comprising SEQ ID NO:2 does not comprise a 5’ guanine; and optionally (ii) a first scaffold sequence.
  • the first polynucleotide encoded by the first nucleic acid sequence of the expression system, or the first polynucleotide of the composition comprises: (i) a first guide sequence comprising any one of SEQ ID NOs:2-12, wherein the guide sequence comprising SEQ ID NO:2 does not comprise a 5’ guanine; and (ii) a first scaffold sequence comprising any one of SEQ ID NOs:20-95.
  • the first polynucleotide comprises: (i) a first guide sequence comprising any one of SEQ ID NOs:1-12; and (ii) a first scaffold sequence comprising any one of SEQ ID NOs:21-95.
  • the expression system comprises a second nucleic acid sequence encoding a second polynucleotide as described herein.
  • the composition comprises a second polynucleotide as described herein.
  • the second polynucleotide encoded by the second nucleic acid sequence of the expression system, or the second polynucleotide of the composition comprises: (i) a second guide sequence targeting a sequence within human genome coordinates chr4:3068346-3074862; and (ii) a second scaffold sequence comprising any one of SEQ ID NOs:21-95.
  • the second polynucleotide comprises: (i) a second guide sequence comprising any one of SEQ ID NOs:13-19 or any one of SEQ ID NOs:1561-1576; and optionally (ii) a second scaffold sequence.
  • the second polynucleotide encoded by the second nucleic acid sequence of the expression system, or the second polynucleotide of the composition comprises: (i) a second guide sequence comprising any one of SEQ ID NOs:13-19 or any one of SEQ ID NOs:1561-1576; and (ii) a second scaffold sequence comprising any one of SEQ ID NOs:20-95.
  • the expression system comprises (a) a first nucleic acid sequence encoding a first polynucleotide and (b) a second nucleic acid sequence encoding a second polynucleotide as described herein.
  • the composition comprises (a) a first polynucleotide and (b) a second polynucleotide as described herein.
  • the expression system further comprises a third nucleic acid sequence encoding a Cas protein.
  • the composition further comprises a Cas protein.
  • the Cas protein is SaCas9.
  • (a) the first polynucleotide comprises (i) a first guide sequence targeting human genome coordinates chr4:3078423-3078443; and (ii) a first scaffold sequence comprising any one of SEQ ID NOs:21-95; and (b) the second polynucleotide comprises (i) a second guide sequence targeting a sequence within human genome coordinates chr4:3068346-3074862; and (ii) a second scaffold sequence comprising any one of SEQ ID NOs:21-95.
  • (a) the first polynucleotide comprises (i) a first guide sequence targeting human genome coordinates chr4:3078423-3078443; and (ii) a first scaffold sequence comprising any one of SEQ ID NOs:21-95; and (b) the second polynucleotide comprises (i) a second guide sequence comprising any one of SEQ ID NOs:13-19 or any one of SEQ ID NOs:1561-1576; and optionally (ii) a second scaffold sequence.
  • (a) the first polynucleotide comprises (i) a first guide sequence comprising any one of SEQ ID NOs:2-12, wherein the guide sequence comprising SEQ ID NO:2 does not comprise a 5’ guanine; and optionally (ii) a first scaffold sequence; and (b) the second polynucleotide comprises (i) a second guide sequence targeting a sequence within human genome coordinates chr4:3068346-3074862; and (ii) a second scaffold sequence comprising any one of SEQ ID NOs:21-95.
  • (a) the first polynucleotide comprises (i) a first guide sequence comprising any one of SEQ ID NOs:2-12, wherein the guide sequence comprising SEQ ID NO:2 does not comprise a 5’ guanine; and optionally (ii) a first scaffold sequence; and (b) the second polynucleotide comprises (i) a second guide sequence comprising any one of SEQ ID NOs:13-19 or any one of SEQ ID NOs:1561-1576; and optionally (ii) a second scaffold sequence.
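The phrase "a sequence within human genome coordinates" used above amounts to an interval-containment check. The Python sketch below parses coordinate strings of the form used in this document and tests containment. The helper names are hypothetical, and treating both endpoints as inclusive is an assumption, since the coordinate convention is not stated in this section.

```python
import re
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int
    end: int


def parse_interval(text: str) -> Interval:
    """Parse a string such as 'chr4:3078423-3078443' into an Interval."""
    match = re.fullmatch(r"(chr\w+):(\d+)-(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized coordinate string: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("start must not exceed end")
    return Interval(chrom, start, end)


def lies_within(inner: Interval, outer: Interval) -> bool:
    """True if `inner` is fully contained in `outer` (endpoints assumed inclusive)."""
    return (inner.chrom == outer.chrom
            and outer.start <= inner.start
            and inner.end <= outer.end)


second_region = parse_interval("chr4:3068346-3074862")
second_target = parse_interval("chr4:3074430-3074450")
first_target = parse_interval("chr4:3078423-3078443")
```

With these values, `lies_within(second_target, second_region)` holds, while the first target sequence lies outside the region recited for the second guide sequence.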
  • the first polynucleotide encoded by the first nucleic acid sequence of the expression vector, or the first polynucleotide of the composition comprises any one of SEQ ID NOs:96-1007. In some embodiments, the first polynucleotide comprises any one of SEQ ID NOs:97-1007. In some embodiments, the first polynucleotide comprises any one of SEQ ID NOs:96-98, 101, 104-107, 120-123, 172-174, 177, 180-183, and 196-199.
  • the first polynucleotide comprises any one of SEQ ID NOs:97, 98, 101, 104-107, 120-123, 172-174, 177, 180-183, and 196-199. In some embodiments, the first polynucleotide comprises any one of SEQ ID NOs:96-98, 101, 104-107, 120-123, 172-174, 183, 196, 199, 248, 324, 400, 476, 552, 628, 704, 780, 856, and 932.
  • the first polynucleotide comprises any one of SEQ ID NOs:97, 98, 101, 104-107, 120-123, 172-174, 183, 196, 199, 248, 324, 400, 476, 552, 628, 704, 780, 856, and 932.
  • the second polynucleotide encoded by the second nucleic acid sequence of the expression vector, or the second polynucleotide of the composition comprises any one of SEQ ID NOs:1008-1539.
  • the second polynucleotide comprises any one of SEQ ID NOs:1008-1010, 1013, 1016-1019, and 1032-1035.
  • the second polynucleotide comprises any one of SEQ ID NOs:1008-1010, 1013, 1016-1019, 1032, 1035, 1084, 1160, 1236, 1312, 1388, and 1464.
  • the second polynucleotide comprises a combination of any one of SEQ ID NOs:1561-1576 with any one of SEQ ID NOs:20-95.
  • the first and second polynucleotides encoded respectively by the first and second nucleic acid sequences of the expression system, or the first and second polynucleotides of the composition, comprise the sequences according to Table 6.
  • the first polynucleotide comprises SEQ ID NO:174, and the second polynucleotide comprises SEQ ID NO:1010.
  • the first polynucleotide comprises SEQ ID NO:174, and the second polynucleotide comprises SEQ ID NO:1013.
  • the first polynucleotide comprises SEQ ID NO:199, and the second polynucleotide comprises SEQ ID NO:1010.
  • the first polynucleotide comprises SEQ ID NO:199, and the second polynucleotide comprises SEQ ID NO:1013.
  • the expression system comprises a vector.
  • the composition comprises a vector.
  • the first nucleic acid sequence, the second nucleic acid sequence, the third nucleic acid sequence, or any combination thereof, are on the vector.
  • the vector is an expression vector.
  • the vector is a viral vector.
  • the vector is a non-viral vector.
  • the vector is a bacterial expression vector.
  • the vector is a mammalian expression vector.
  • the vector is a human expression vector. Exemplary vectors are described herein.
  • the vector is a viral vector.
  • the viral vector is a vector from retrovirus, adeno-associated virus, pox, baculovirus, vaccinia, herpes simplex, Epstein-Barr virus, adenovirus, geminivirus, or caulimovirus.
  • the viral vector is a lentiviral vector, an adenoviral vector, or an adeno-associated viral (AAV) vector. Methods of introducing vectors, e.g., viral vectors, into cells are described herein.
  • viral transduction with adenoviral, AAV, and lentiviral vectors is a delivery method for in vivo gene therapy.
  • the RNA stabilizing sequence is located 3’ of the first, second, and/or third polynucleotide on the vector.
  • the RNA stabilizing sequence is a posttranscriptional regulatory element (PRE).
  • the RNA stabilizing sequence is a woodchuck hepatitis virus posttranscriptional regulatory element (WPRE).
  • the expression system comprises the first nucleic acid sequence and the second nucleic acid sequence.
  • the first nucleic acid sequence and the second nucleic acid sequence are on a single vector.
  • the first nucleic acid sequence and the second nucleic acid sequence are on separate vectors.
  • the expression system comprises the first nucleic acid sequence and the third nucleic acid sequence. In some embodiments, the first nucleic acid sequence and the third nucleic acid sequence are on a single vector. In some embodiments, the first nucleic acid sequence and the third nucleic acid sequence are on separate vectors. In some embodiments, the expression system comprises the second nucleic acid sequence and the third nucleic acid sequence. In some embodiments, the second nucleic acid sequence and the third nucleic acid sequence are on a single vector. In some embodiments, the second nucleic acid sequence and the third nucleic acid sequence are on separate vectors. In some embodiments, each nucleic acid sequence of the expression system is operably linked to a distinct regulatory element.
  • one regulatory element is operably linked to more than one nucleic acid sequence of the expression system.
  • the expression system comprises the first nucleic acid sequence, the second nucleic acid sequence, and the third nucleic acid sequence.
  • the first nucleic acid sequence, the second nucleic acid sequence, and the third nucleic acid sequence are on a single vector.
  • the first and second nucleic acid sequences are on a first vector, and the third nucleic acid sequence is on a second vector.
  • the first and third nucleic acid sequences are on a first vector, and the second nucleic acid sequence is on a second vector.
  • the second and third nucleic acid sequences are on a first vector, and the first nucleic acid sequence is on a second vector. In some embodiments, each of the first, second, and third nucleic acid sequences is on a separate vector. In some embodiments, each nucleic acid sequence of the expression system is operably linked to a distinct regulatory element. In some embodiments, one regulatory element is operably linked to more than one nucleic acid sequence of the expression system.
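The vector arrangements listed above for the first, second, and third nucleic acid sequences correspond to the ways of splitting three items into non-empty groups. The Python sketch below enumerates them; the function name and the labels are hypothetical and serve only to show that the list of arrangements is exhaustive.

```python
def set_partitions(items):
    """Yield every way to split `items` into non-empty groups (one group per vector)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` on each vector that already carries a sequence...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ...or place it alone on an additional vector.
        yield [[first]] + partition


sequences = ["first", "second", "third"]
arrangements = list(set_partitions(sequences))
for arrangement in arrangements:
    print(arrangement)
```

The enumeration yields five arrangements: all three sequences on a single vector, the three pairings in which two sequences share a vector, and each sequence on a separate vector.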
  • the composition comprises the Cas protein, the first polynucleotide, and the second polynucleotide described herein. In some embodiments, the composition comprises the Cas protein and the first polynucleotide described herein.
  • the composition comprises the Cas protein and the second polynucleotide described herein.
  • the first polynucleotide of the composition is encoded by a first nucleic acid sequence.
  • the second polynucleotide of the composition is encoded by a second nucleic acid sequence.
  • the first nucleic acid sequence and/or the second nucleic acid sequence is on a vector. Exemplary vectors are provided herein.
  • the vector is a viral vector.
  • the viral vector is a lentiviral vector, an adenoviral vector, or an adeno-associated viral vector.
  • the disclosure provides a delivery particle comprising a polynucleotide described herein, an expression system described herein, a composition described herein, or combination thereof.
  • the polynucleotide is a first polynucleotide (e.g., first gRNA) described herein or a second polynucleotide (e.g., second gRNA) described herein.
  • the expression system comprises a first nucleic acid sequence encoding a first polynucleotide described herein; and/or a second nucleic acid sequence encoding a second polynucleotide described herein.
  • the composition comprises a first polynucleotide described herein; and/or a second polynucleotide described herein.
  • the first polynucleotide comprises a first guide sequence and a first scaffold sequence.
  • the second polynucleotide comprises a second guide sequence and a second scaffold sequence.
  • the expression system further comprises a third nucleic acid sequence encoding an RNA-guided nuclease described herein, e.g., a Cas protein such as SaCas9.
  • the composition further comprises an RNA-guided nuclease described herein, e.g., a Cas protein such as SaCas9.
  • Delivery particles for delivering biological components are known to one of ordinary skill in the art. Delivery particles may be in any form, including but not limited to: solid, semi-solid, emulsion, or colloidal particles.
  • the delivery particle is a lipid-based particle, a virus-like particle, a liposome, a micelle, a vesicle or microvesicle, an exosome, or a lipid nanoparticle.
  • Delivery particles are further described, e.g., in US 2008/0234183, US 2011/0293703, US 2012/0251560, US 2013/0302401, US 2019/0167810, US 2020/0207833, US 5,543,158, US 5,855,913, US 5,895,309, US 6,007,845, and US 8,709,843.
  • the polynucleotide, expression system, and/or composition described herein are comprised in a single delivery particle.
  • the polynucleotide, expression system, and/or composition described herein are comprised in multiple delivery particles.
  • the delivery particle is configured to deliver the polynucleotide, expression system, composition, or combination thereof into a cell.
  • the disclosure provides a cell comprising a polynucleotide described herein, an expression system described herein, a composition described herein, a delivery particle described herein, or a combination thereof.
  • the polynucleotide is a first polynucleotide (e.g., first gRNA) described herein or a second polynucleotide (e.g., second gRNA) described herein.
  • the expression system comprises a first nucleic acid sequence encoding a first polynucleotide described herein; and/or a second nucleic acid sequence encoding a second polynucleotide described herein.
  • the composition comprises a first polynucleotide described herein; and/or a second polynucleotide described herein.
  • the first polynucleotide comprises a first guide sequence and a first scaffold sequence.
  • the second polynucleotide comprises a second guide sequence and a second scaffold sequence.
  • the expression system further comprises a third nucleic acid sequence encoding an RNA-guided nuclease described herein, e.g., a Cas protein such as SaCas9.
  • the composition further comprises an RNA-guided nuclease described herein, e.g., a Cas protein such as SaCas9.
  • the cell is a bacterial cell.
  • the cell is a eukaryotic cell.
  • the cell is an animal cell.
  • the cell is a mammalian cell.
  • the cell is a human cell.
  • the cell is from an animal or human cell line.
  • animal or human cells and cell lines include, but are not limited to, NSO, CHO, HT1080, H9, HepG2, MCF7, MDBK, Jurkat, NIH3T3, PC12, BHK, EBX, EB14, EB24, EB26, EB66, Ebvl3, VERO, SP2/0, YB2/0, Y0, C127, L cell, COS (e.g., COS1 and COS7), QC1-3, PER.C6, HEK293, HeLa, and NT2.
  • the cell is a neuronal cell.
  • the cell is a neuronal precursor cell line.
  • the cell is from an animal or human cellular model for Huntington’s Disease (HD).
  • the cell is derived from a human subject having HD.
  • the cell is a fibroblast cell, e.g., derived from a human subject having HD.
  • the cell is a human stem cell, e.g., an induced pluripotent stem cell (iPSC), an embryonic stem cell (ESC), a tissue specific stem cell (e.g., neural stem cell), or a mesenchymal stem cell (MSC).
  • the cell is a neuronal cell differentiated from a stem cell described herein.
  • Exemplary cells and cell lines for modeling HD are further described, e.g., in Szlachcic et al., Mol Neurosci 10:253, 2017; Hung et al., Mol Biol Cell 29(23):2809-2820, 2018; and Le Cann et al., Sci Rep 11:6934, 2021.
  • Further exemplary cell lines for modeling HD include, but are not limited to, Huntingtin 150Q Stable PC12 Cell Line (Cat. No. T6018) from Applied Biological Materials Inc.; and cell lines GM04281, GM04282, and GM23225 in the NIGMS Human Genetic Cell Repository, available from the Coriell Institute for Medical Research.
  • the disclosure provides a method of reducing CAG repeats in a mutant allele of a huntingtin (HTT) gene of a cell, comprising introducing to the cell: the polynucleotide, expression system, composition, or delivery particle described herein, or a combination thereof.
  • the disclosure provides a method of treating HD in a subject in need thereof, comprising administering to the subject: the polynucleotide, expression system, composition, or delivery particle described herein, or a combination thereof.
  • the subject comprises a heterozygous HD allele, i.e., an HTT gene comprising a mutant allele and a wild-type allele.
  • the first polynucleotide (e.g., first gRNA), second polynucleotide (e.g., second gRNA), and Cas protein described herein are present in the cell following the introducing.
  • the first polynucleotide e.g., first gRNA
  • second polynucleotide e.g., second gRNA
  • Cas protein is SaCas9.
  • the first polynucleotide comprises a first guide sequence and first scaffold sequence as described herein.
  • the second polynucleotide comprises a second guide sequence and second scaffold sequence as described herein.
  • the first guide sequence targets a first target sequence, i.e., chr4:3078423-3078443 as described herein.
  • the second guide sequence targets a second target sequence, i.e., a sequence within human genome coordinates chr4:3068346-3074862 as described herein.
  • the HTT gene of the cell and/or the subject comprises a mutant allele and a wild-type allele.
  • the mutant allele comprises a PAM (e.g., for SaCas9) adjacent to the first target sequence, and the wild-type allele does not comprise the PAM adjacent to the first target sequence.
  • a first complex comprising the first polynucleotide and the Cas protein, e.g., SaCas9, is guided to the first target sequence by the first guide sequence of the first polynucleotide.
  • the Cas protein, e.g., SaCas9, recognizes the PAM adjacent to the first target sequence in the mutant allele and cleaves the mutant allele at the first target sequence.
  • the Cas protein e.g., SaCas9
  • both the wild-type allele and the mutant allele comprise a PAM (e.g., for SaCas9) adjacent to the second target sequence.
  • a second complex comprising the second polynucleotide and the Cas protein, e.g., SaCas9, is guided to the second target sequence by the second guide sequence of the second polynucleotide.
  • the Cas protein, e.g., SaCas9, recognizes the PAM adjacent to the second target sequence in both the wild-type and mutant alleles, and cleaves both the wild-type and mutant alleles at the second target sequence.
  • the mutant allele is cleaved at two sites (i.e., the first target sequence and the second target sequence), thereby excising the region between the two sites.
  • the region between the two sites comprises the expanded CAG repeats associated with HD.
  • the CAG repeats region of the wild-type allele is not excised due to only one site of cleavage (i.e., the second target sequence).
  • the cleaved second target sequence in the wild-type allele is ligated and/or repaired, e.g., via native cellular repair pathways such as homology-directed repair or non-homologous end joining. See, e.g., FIG. 1.
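The allele-selective logic described above can be summarized in a small Python model. It is a deliberately simplified sketch: it assumes that cleavage occurs at a target sequence whenever the adjacent PAM is present (that is, it ignores cutting efficiency and repair outcomes), and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allele:
    name: str
    pam_at_first_target: bool   # PAM adjacent to the first target sequence
    pam_at_second_target: bool  # PAM adjacent to the second target sequence


def cleavage_outcome(allele: Allele) -> str:
    """Classify an allele by how many target sites carry a PAM (simplified model)."""
    cuts = int(allele.pam_at_first_target) + int(allele.pam_at_second_target)
    if cuts == 2:
        return "region between the two sites excised"
    if cuts == 1:
        return "single cut, repaired by the cell"
    return "not cleaved"


# Per the description: only the mutant allele carries the PAM at the first
# target sequence, while both alleles carry the PAM at the second target sequence.
mutant = Allele("mutant", pam_at_first_target=True, pam_at_second_target=True)
wild_type = Allele("wild-type", pam_at_first_target=False, pam_at_second_target=True)

for allele in (mutant, wild_type):
    print(f"{allele.name}: {cleavage_outcome(allele)}")
```

Under these assumptions only the mutant allele is cut at both sites, so only its CAG repeat region is excised.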
  • the mutant allele of the cell and/or the subject comprises greater than 27 CAG repeats prior to the introducing into the cell and/or the administering into the cell. In some embodiments, the mutant allele of the cell and/or the subject comprises greater than 35 CAG repeats prior to the introducing and/or the administering. In some embodiments, the mutant allele of the cell and/or the subject comprises fewer than 35 CAG repeats following the introducing and/or the administering.
  • the mutant allele of the cell and/or the subject comprises fewer than 27 CAG repeats following the introducing and/or the administering.
  • following the introducing and/or the administering, all CAG repeats of the mutant allele are removed.
  • following the introducing and/or the administering, exon 1 of the mutant allele is removed.
  • the method provided herein reduces levels of RNA encoded by the mutant allele by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95%.
  • the method reduces levels of RNA encoded by the mutant allele by at least 30%.
  • the method reduces levels of RNA encoded by the mutant allele by at least 50%. In some embodiments, the method reduces levels of protein expressed from the mutant allele by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95%. In some embodiments, the method reduces levels of protein expressed from the mutant allele by at least 30%. In some embodiments, the method reduces levels of protein expressed from the mutant allele by at least 50%.
  • the method provided herein does not reduce levels of RNA encoded by the wild-type allele and/or levels of protein expressed from the wild-type allele by more than 10%, more than 20%, more than 30%, more than 40%, more than 50%, more than 60%, or more than 70%. In some embodiments, the method does not reduce levels of RNA encoded by the wild-type allele and/or levels of protein expressed from the wild-type allele by more than 50%. In some embodiments, the method (1) reduces levels of RNA encoded by the mutant allele and/or levels of protein expressed from the mutant allele by at least 30%; and (2) does not reduce levels of RNA encoded by the wild-type allele and/or levels of protein expressed from the wild-type allele by more than 50%.
  • the method (1) reduces levels of RNA encoded by the mutant allele and/or levels of protein expressed from the mutant allele by at least 50%; and (2) does not reduce levels of RNA encoded by the wild-type allele and/or levels of protein expressed from the wild-type allele by more than 50%.
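The repeat-count and expression thresholds recited above can be expressed as simple checks. The following Python sketch counts the longest uninterrupted CAG tract in a sequence and tests one of the recited criteria combinations (at least 30% reduction for the mutant allele, no more than 50% reduction for the wild-type allele). All function names, the example sequence, and the expression values are hypothetical; real repeat sizing and expression measurements are more involved.

```python
import re


def longest_cag_run(seq: str) -> int:
    """Number of repeats in the longest uninterrupted CAG tract of `seq`."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


def percent_reduction(before: float, after: float) -> float:
    """Percent decrease from `before` to `after`."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return 100.0 * (before - after) / before


def meets_criteria(mut_before, mut_after, wt_before, wt_after,
                   min_mutant_reduction=30.0, max_wild_type_reduction=50.0) -> bool:
    """Mutant level reduced by at least the minimum; wild-type not reduced by more than the maximum."""
    return (percent_reduction(mut_before, mut_after) >= min_mutant_reduction
            and percent_reduction(wt_before, wt_after) <= max_wild_type_reduction)


# Hypothetical example: a tract of 42 CAG repeats flanked by placeholder bases.
example = "AT" + "CAG" * 42 + "CCG"
repeats = longest_cag_run(example)
print(repeats > 35)  # exceeds the 35-repeat threshold recited for the mutant allele
print(meets_criteria(100.0, 40.0, 100.0, 80.0))  # 60% mutant and 20% wild-type reduction
```

The thresholds are keyword parameters so that the other recited combinations (for example, at least 50% for the mutant allele) can be tested with the same function.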
  • Embodiment 1. A polynucleotide comprising a guide sequence and a scaffold sequence, wherein the guide sequence targets human genome coordinates chr4:3078423-3078443, and the scaffold sequence comprises any one of SEQ ID NOs:21-95.
  • Embodiment 2. The polynucleotide of embodiment 1, wherein the guide sequence does not form a secondary structure.
  • Embodiment 3. The polynucleotide of embodiment 2, wherein the secondary structure is a stem loop.
  • Embodiment 5. A polynucleotide comprising a guide sequence and a scaffold sequence, wherein the guide sequence comprises any one of SEQ ID NOs:2-12, and wherein the guide sequence comprising SEQ ID NO:2 does not comprise a 5' guanine.
  • Embodiment 6. The polynucleotide of embodiment 5, wherein the guide sequence comprises SEQ ID NO:2 and does not comprise a 5' guanine.
  • Embodiment 7. The polynucleotide of embodiment 5 or 6, wherein the scaffold sequence is capable of binding to a Cas protein.
  • Embodiment 8. The polynucleotide of embodiment 7, wherein the Cas protein is Cas9 from Staphylococcus aureus (SaCas9).
  • Embodiment 9. The polynucleotide of any one of embodiments 1 to 8, wherein the scaffold sequence does not comprise an early termination signal sequence.
  • Embodiment 10. The polynucleotide of embodiment 9, wherein the early termination signal sequence comprises 4 to 6 consecutive thymine bases.
  • Embodiment 11 The polynucleotide of any one of embodiments 1 to 10, wherein the scaffold sequence comprises a stabilized secondary structure.
  • Embodiment 13 The polynucleotide of any one of embodiments 1 to 12, wherein the scaffold sequence comprises any one of SEQ ID NOs:20-95.
  • Embodiment 14 The polynucleotide of embodiment 13, wherein the scaffold sequence comprises any one of SEQ ID NOs:20-22, 25, 28-31, and 44-47.
  • Embodiment 15 A polynucleotide comprising any one of SEQ ID NOs:97-1007.
  • Embodiment 16 The polynucleotide of embodiment 15, comprising any one of SEQ ID NOs:97, 98, 101, 104-107, 120-123, 172-174, 177, 180-183, and 196-199.
  • Embodiment 17 The polynucleotide of embodiment 15, comprising any one of SEQ ID NOs: 97, 98, 101, 104-107, 120-123, 172-174, 183, 196, 199, 248, 324, 400, 476, 552, 628, 704, 780, 856, and 932.
  • Embodiment 18 A polynucleotide comprising a guide sequence and a scaffold sequence, wherein: (i) the guide sequence comprises any one of SEQ ID NOs:2-12, and the scaffold sequence comprises any one of SEQ ID NOs:20-95, wherein the guide sequence comprising SEQ ID NO:2 does not comprise a 5' guanine; or (ii) the guide sequence comprises any one of SEQ ID NOs:1-12, and the scaffold sequence comprises any one of SEQ ID NOs:21-95.
  • Embodiment 19 An expression system comprising a nucleic acid sequence encoding the polynucleotide of embodiments 1-18.
  • Embodiment 20 The expression system of embodiment 19, wherein the nucleic acid sequence is a first nucleic acid sequence, the polynucleotide is a first polynucleotide, the guide sequence is a first guide sequence, and the scaffold sequence is a first scaffold sequence, and wherein the expression system further comprises a second nucleic acid sequence encoding a second polynucleotide, wherein the second polynucleotide comprises a second guide sequence and a second scaffold sequence, wherein the second guide sequence targets a sequence within human genome coordinates chr4:3068346-3074862.
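The two coordinate ranges recited above can be handled programmatically. The following is a minimal sketch, assuming inclusive coordinates in `chrom:start-end` form (this section does not state the genome build or the coordinate convention); `Region`, `parse_region`, and `contains` are hypothetical helper names, and the candidate site in the usage note is an arbitrary illustration, not a guide from the source.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Inclusive coordinates are assumed, hence the +1.
        return self.end - self.start + 1


def parse_region(text: str) -> Region:
    """Parse a string such as 'chr4:3078423-3078443' into a Region."""
    chrom, span = text.split(":")
    start, end = (int(x) for x in span.replace(",", "").split("-"))
    if start > end:
        raise ValueError("start must not exceed end")
    return Region(chrom, start, end)


def contains(outer: Region, inner: Region) -> bool:
    """True if `inner` lies entirely within `outer` on the same chromosome."""
    return (outer.chrom == inner.chrom
            and outer.start <= inner.start
            and inner.end <= outer.end)


first_target = parse_region("chr4:3078423-3078443")
second_window = parse_region("chr4:3068346-3074862")
```

Under these assumptions, `first_target.length` is 21 and `second_window.length` is 6517. A hypothetical candidate site such as `Region("chr4", 3070000, 3070020)` would satisfy `contains(second_window, candidate)`.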
  • Embodiment 21 The expression system of embodiment 20, wherein the second guide sequence does not form a secondary structure.
  • Embodiment 23 The expression system of any one of embodiments 20 to 22, wherein the second guide sequence comprises any one of SEQ ID NOs:13-19.
  • Embodiment 24. The expression system of embodiment 23, wherein the second guide sequence comprises SEQ ID NO:13.
  • Embodiment 25. The expression system of any one of embodiments 20 to 24, wherein the second scaffold sequence is capable of binding to a Cas protein.
  • Embodiment 26. The expression system of embodiment 25, wherein the Cas protein is SaCas9.
  • Embodiment 27. The expression system of any one of embodiments 20 to 26, wherein the second scaffold sequence does not comprise an early termination signal sequence.
  • Embodiment 28 The expression system of embodiment 27, wherein the early termination signal sequence comprises 4 to 6 consecutive thymine bases.
  • Embodiment 29 The expression system of any one of embodiments 20 to 28, wherein the second scaffold sequence comprises a stabilized secondary structure.
  • Embodiment 30 The expression system of embodiment 29, wherein the stabilized secondary structure comprises a locked loop.
  • Embodiment 31 The expression system of any one of embodiments 20 to 30, wherein the second scaffold sequence comprises any one of SEQ ID NOs:20-95.
  • Embodiment 32 The expression system of any one of embodiments 20 to 30, wherein the second scaffold sequence comprises any one of SEQ ID NOs:20-22, 25, 28-31, and 44-47.
  • Embodiment 33 The expression system of any one of embodiments 20 to 32, wherein the second polynucleotide comprises any one of SEQ ID NOs:1008-1539.
  • Embodiment 34 The expression system of embodiment 33, wherein the second polynucleotide comprises any one of SEQ ID NOs:1008-1010, 1013, 1016-1019, and 1032-1035.
  • Embodiment 35 The expression system of embodiment 33, wherein the second polynucleotide comprises any one of SEQ ID NOs:1008-1010, 1013, 1016-1019, 1032, 1035, 1084, 1160, 1236, 1312, 1388, and 1464.
  • Embodiment 36 An expression system comprising: a first nucleic acid sequence encoding a first polynucleotide comprising any one of SEQ ID NOs:96-1007; and a second nucleic acid sequence encoding a second polynucleotide comprising any one of SEQ ID NOs:1008-1539.
  • Embodiment 37 The expression system of embodiment 36, wherein the first polynucleotide comprises any one of SEQ ID NOs:96-98, 101, 104-107, 120-123, 172-174, 177, 180-183, and 196-199.
  • Embodiment 39 The expression system of any one of embodiments 36 to 38, wherein the second polynucleotide comprises any one of SEQ ID NOs:1008-1010, 1013, 1016-1019, and 1032-1035.
  • Embodiment 41 The expression system of any one of embodiments 19 to 40, wherein the expression system comprises a vector.
  • Embodiment 42 The expression system of embodiment 41, wherein the vector is a viral vector.
  • Embodiment 43 The expression system of embodiment 42, wherein the viral vector is a lentiviral vector, an adenoviral vector, or an adeno-associated viral vector.
  • Embodiment 44 The expression system of any one of embodiments 20 to 43, wherein the first nucleic acid sequence and the second nucleic acid sequence are on a single vector.
  • Embodiment 45 The expression system of any one of embodiments 20 to 43, wherein each of the first nucleic acid sequence and the second nucleic acid sequence is on a separate vector.
  • Embodiment 46 The expression system of embodiment 19, wherein the nucleic acid sequence is a first nucleic acid sequence, the polynucleotide is a first polynucleotide, and wherein the expression system further comprises a third nucleic acid sequence encoding a Cas protein capable of forming a complex with the first polynucleotide.
  • Embodiment 47 The expression system of embodiment 46, wherein the third nucleic acid sequence is on a separate vector from the first nucleic acid sequence.
  • Embodiment 48 The expression system of embodiment 46, wherein the first nucleic acid sequence and the third nucleic acid sequence are on a single vector.
  • Embodiment 49 The expression system of any one of embodiments 20 to 45, further comprising a third nucleic acid sequence encoding a Cas protein capable of forming a complex with the first polynucleotide and/or the second polynucleotide.
  • Embodiment 50 The expression system of embodiment 49, wherein the first, second, and third nucleic acid sequences are on a single vector.
  • Embodiment 51 The expression system of embodiment 49, wherein the first and second nucleic acid sequences are on a first vector, and the third nucleic acid sequence is on a second vector; or wherein the first and third nucleic acid sequences are on a first vector, and the second nucleic acid sequence is on a second vector; or wherein the second and third nucleic acid sequences are on a first vector, and the first nucleic acid sequence is on a second vector.
  • Embodiment 52 The expression system of embodiment 49, wherein each of the first, second, and third nucleic acid sequences is on a separate vector.
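Embodiments 50 to 52 together list every way of distributing the first, second, and third nucleic acid sequences across one or more vectors. As a hedged illustration (not part of the source), that enumeration can be reproduced by generating all set partitions of the three sequences; `set_partitions` is a hypothetical helper and each inner list stands for one vector.

```python
def set_partitions(items):
    """Yield every way to split `items` into non-empty, unordered groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` into each existing group (vector) in turn...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ...or give `first` a group (vector) of its own.
        yield [[first]] + partition


layouts = list(set_partitions(["first", "second", "third"]))
for layout in layouts:
    print(" | ".join("+".join(vector) for vector in layout))
```

The five layouts correspond to one single-vector arrangement, three two-vector arrangements, and one arrangement with each sequence on a separate vector.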
  • Embodiment 54 A composition comprising a Cas protein, and one or both of: a) a first polynucleotide comprising a first guide sequence, wherein: (i) the first guide sequence targets human genome coordinates chr4:3078423-3078443, and the first polynucleotide further comprises a first scaffold sequence comprising any one of SEQ ID NOs:21-95; or (ii) the first guide sequence comprises any one of SEQ ID NOs:2-12, wherein the guide sequence comprising SEQ ID NO:2 does not comprise a 5' guanine; and b) a second polynucleotide comprising a second guide sequence, wherein: (i) the second guide sequence targets a sequence within human genome coordinates chr4:3068346-3074862, and the second polynucleotide further comprises a second scaffold sequence comprising any one of S
  • Embodiment 55 The composition of embodiment 54, wherein the Cas protein is Cas9 from Staphylococcus aureus (SaCas9).
  • Embodiment 56 The composition of embodiment 54 or 55, wherein the first guide sequence and/or the second guide sequence does not form a secondary structure.
  • Embodiment 57 The composition of embodiment 56, wherein the secondary structure is a hairpin loop.
  • Embodiment 58 The composition of any one of embodiments 54 to 57, comprising both the first and second polynucleotides.
  • Embodiment 59 The composition of any one of embodiments 54 to 57, comprising both the first and second polynucleotides.
  • Embodiment 60 The composition of any one of embodiments 54 to 59, wherein the second polynucleotide comprises the second guide sequence comprising any one of SEQ ID NOs:13-19, and wherein the second polynucleotide further comprises a second scaffold sequence.
  • Embodiment 62 The composition of any one of embodiments 54 to 61, wherein the first scaffold sequence and the second scaffold sequence are identical.
  • Embodiment 63 The composition of any one of embodiments 54 to 61, wherein the first scaffold sequence and the second scaffold sequence are different.
  • Embodiment 64 The composition of any one of embodiments 54 to 63, wherein the first scaffold sequence and/or the second scaffold sequence does not comprise an early termination signal sequence.
  • Embodiment 65 The composition of embodiment 64, wherein the early termination signal sequence comprises 4 to 6 consecutive thymine bases.
  • Embodiment 66 The composition of any one of embodiments 54 to 65, wherein the first scaffold sequence and/or the second scaffold sequence comprises a stabilized secondary structure.
  • Embodiment 67 The composition of embodiment 66, wherein the stabilized secondary structure comprises a locked hairpin loop.
  • Embodiment 68 The composition of any one of embodiments 54 to 67, wherein the first scaffold sequence and/or the second scaffold sequence comprises any one of SEQ ID NOs:20-96.
  • Embodiment 69 The composition of any one of embodiments 54 to 68, wherein the first scaffold sequence and/or the second scaffold sequence comprises any one of SEQ ID NOs:20-22, 25, 28-31, and 44-47.
  • Embodiment 70 The composition of any one of embodiments 54 to 69, wherein the first polynucleotide comprises any one of SEQ ID NOs:96-1007.
  • Embodiment 71 The composition of any one of embodiments 54 to 70, wherein the first polynucleotide comprises any one of SEQ ID NOs:96-98, 101, 104-107, 120-123, 172-174, 177, 180-183, 196-199, 248, 324, 400, 476, 552, 628, 704, 780, 856, and 932.
  • Embodiment 73 The composition of any one of embodiments 54 to 72, wherein the second polynucleotide comprises any one of SEQ ID NOs:1008-1010, 1013, 1016-1019, 1032-1035, 1084, 1160, 1236, 1312, 1388, and 1464.
  • Embodiment 74 The composition of any one of embodiments 54 to 73, wherein the first polynucleotide and the second polynucleotide are on a vector.
  • Embodiment 75 The composition of embodiment 74, wherein the vector is a viral vector.
  • Embodiment 76 The composition of embodiment 75, wherein the viral vector is a lentiviral vector, an adenoviral vector, or an adeno-associated viral vector.
  • Embodiment 77 A delivery particle comprising the polynucleotide of any one of embodiments 1 to 18, the expression system of any one of claims 19 to 53, the composition of any one of claims 54 to 76, or combination thereof.
  • Embodiment 78 The delivery particle of embodiment 77, wherein the delivery particle comprises a lipid-based particle or a virus-like particle.
  • Embodiment 80 A cell comprising the polynucleotide of any one of embodiments 1 to 18, the expression system of any one of claims 19 to 53, the composition of any one of claims 54-76, the delivery particle of any one of claims 77 to 79, or combination thereof.
  • Embodiment 81 A method of reducing CAG repeats in a mutant allele of a huntingtin (HTT) gene of a cell, comprising introducing to the cell: the polynucleotide of any one of embodiments 1 to 18, the expression system of any one of claims 19 to 53, the composition of any one of claims 54-76, the delivery particle of any one of claims 77 to 79, or combination thereof.
  • Embodiment 82 The method of embodiment 81, wherein the mutant allele comprises greater than 35 CAG repeats prior to the introducing, and fewer than 35 CAG repeats following the introducing.
  • Embodiment 83 The method of embodiment 82, wherein, following the introducing, all CAG repeats of the mutant allele are removed.
  • Embodiment 84 The method of embodiment 83, wherein, following the introducing, exon 1 of the mutant allele is removed.
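The repeat-count threshold in embodiment 82 can be illustrated with a short sketch that counts the longest uninterrupted run of CAG triplets in a DNA string. The function names and the choice to count only uninterrupted runs are assumptions made for illustration; clinical repeat sizing may follow different conventions, and the sequences in the usage note are synthetic.

```python
import re


def longest_cag_run(seq: str) -> int:
    """Number of triplets in the longest uninterrupted CAG run in `seq`."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


def exceeds_threshold(seq: str, threshold: int = 35) -> bool:
    """True if the longest CAG run has more than `threshold` repeats."""
    return longest_cag_run(seq) > threshold


expanded = "ATG" + "CAG" * 40 + "CCG"                # synthetic, 40 repeats
interrupted = "CAG" * 20 + "CAA" + "CAG" * 10 + "T"  # synthetic, runs of 20 and 10
```

Here `exceeds_threshold(expanded)` is true, while `interrupted` has a longest run of 20 repeats and does not exceed the threshold of 35.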
  • Embodiment 85 The method of any one of embodiments 81 to 84, wherein the method reduces levels of RNA encoded by and/or levels of protein expressed from the mutant allele by at least 30%, and wherein the method does not reduce levels of RNA encoded by and/or levels of protein expressed from a wild-type allele of the HTT gene by more than 50%.
  • Embodiment 87 A method of treating Huntington's Disease (HD) in a subject in need thereof, comprising administering to the subject: the polynucleotide of any one of embodiments 1 to 18, the expression system of any one of claims 19 to 53, the composition of any one of claims 54-76, the delivery particle of any one of claims 77 to 79, or combination thereof.
  • Embodiment 88 The method of embodiment 87, wherein the subject comprises greater than 35 CAG repeats in a mutant allele of a huntingtin (HTT) gene prior to the administering.
  • Embodiment 89 The method of embodiment 88, wherein the method reduces levels of RNA encoded by and/or levels of protein expressed from the mutant allele by at least 30%, and wherein the method does not reduce levels of RNA encoded by and/or levels of protein expressed from a wild-type allele of the HTT gene by more than 50%.
  • Embodiment 91 The method of any one of embodiments 88 to 90, wherein the subject comprises fewer than 35 CAG repeats in the mutant allele following the administering.
  • Embodiment 92 The method of embodiment 91, wherein, following the administering, all CAG repeats of the mutant allele are removed.
  • Embodiment 93 The method of embodiment 91, wherein, following the administering, all CAG repeats of the mutant allele are removed.
  • A SNP identified as rs3856973 in dbSNP is present in heterozygosity in 42% of HD patients. It is located in intron 1 of the HTT gene, approximately 3.5 kb downstream of the CAG repeats region.
  • The rs3856973 SNP is a G-A variant, with the guanine present on the mutant allele and the adenine present on the wild-type allele of heterozygous HD patients.
  • The guanine forms part of a PAM for SaCas9 (TCGAGT), whereas the adenine-containing sequence (TCAAGT) is not a SaCas9 PAM.
  • The PAM-containing mutant allele can therefore be selectively edited with SaCas9.
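The allele discrimination described above can be checked with a small pattern match. This sketch assumes the commonly cited SaCas9 PAM consensus NNGRRT (R = A or G); that consensus is not stated in this section and is supplied here as an assumption, and `matches_pam` is a hypothetical helper name.

```python
# IUPAC nucleotide codes needed for the assumed consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}


def matches_pam(site: str, pattern: str = "NNGRRT") -> bool:
    """True if `site` matches the IUPAC `pattern` position by position."""
    site = site.upper()
    if len(site) != len(pattern):
        return False
    return all(base in IUPAC[code] for base, code in zip(site, pattern))


mutant_site = "TCGAGT"     # G allele, as given in the text
wild_type_site = "TCAAGT"  # A allele, as given in the text
```

Under the assumed consensus, `matches_pam(mutant_site)` is true and `matches_pam(wild_type_site)` is false, because the SNP falls on the position that must be G.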
  • FIG. 2A shows a schematic of the assay that tested 7 different upstream gRNA sequences paired with a downstream gRNA targeting the SNP.
  • a pair of gRNAs was introduced in each condition: a first gRNA that targets the mutant, PAM-containing allele described in Example 1 (shown in the figures as “gRNA 1”), and a second gRNA that targets both mutant and wild-type alleles in the promoter region of the HTT gene, i.e., upstream of exon 1 containing the CAG repeats region.
  • Several second gRNA targets were tested (FIG. 2A). Results are shown in FIG. 2B and indicate that the G allele was selectively edited with gRNA 1 when paired with each of the 7 different gRNAs targeting upstream of HTT exon 1, while the A allele was not edited, demonstrating allele-specific editing at SNP rs3856973 with SaCas9.
  • FIG. 3A shows a schematic of the assay that tested 10 different upstream gRNA sequences paired with gRNA 1 targeting the SNP. The results in FIG. 3B show the mutant allele excision bands on the gel.
  • For FIGS. 4A and 4B, a pair of gRNAs was introduced via electroporation: a first gRNA that targets the mutant, PAM-containing allele described in Example 1 (shown in the figures as “gRNA 1”), and a second gRNA that targets both mutant and wild-type alleles in the promoter region of the HTT gene, i.e., upstream of exon 1 containing the CAG repeats region.
  • Several second gRNA targets were tested, shown in the figures as “gRNA11,” “gRNA25,” “gRNA27,” and “gRNA30.” Only the mutant allele is cleaved at the target sites of both gRNAs. See FIG. 2A for a schematic of the as
  • FIG. 4A shows that, when both the first and second gRNAs are introduced, up to 50% excision of the mutant allele is observed using a quantitative ddPCR assay that detects the specific sequence expected upon excision, whereas no excision is observed when only gRNA 1 is introduced or in a non-transfected control.
  • FIG. 4B shows knockdown of greater than 70% of RNA from the mutant allele with minimal effect on the wild-type allele, measured by qPCR assays specifically designed to recognize a SNP that is located at HTT exon 50 and differs in mutant and wild-type alleles.
  • the combination of gRNA 1 and gRNA 11 provided the highest mutant allele excision and mutant allele mRNA knockdown, while not significantly impacting the wild-type allele expression.
  • FIG. 5A shows a schematic of the lentiviral vector design.
  • the results in FIG. 5B show low editing efficiency with gRNA 1 ( ⁇ 11%), therefore leading to overall low mutant allele correction as shown in FIG. 5C. This was in contrast to the results of Example 2, which shows high editing efficiency when synthetic gRNAs were delivered into the HD patient-derived fibroblast cells.
  • FIG. 6A shows the predicted secondary structure of gRNA 1 as a synthetic gRNA, which does not contain the 5’ G and has an “open” spacer sequence.
  • FIG. 6B shows the predicted secondary structure of gRNA 1 containing the 5’ G for expression from an adeno-associated viral (AAV) vector or lentiviral vector; the 5’ G causes a stem loop to form in the spacer sequence, which may inhibit binding of the spacer sequence to the target sequence.
  • RNA folding was predicted using Geneious Prime RNA folding at 37 °C (Andronescu et al., Bioinformatics 23(13):i19–i28, 2007).
  • gRNA 1 and gRNA 11 were prepared with or without a 5’ G and tested for in vitro cleavage. The results are shown in FIG. 7E. “gRNA 1” corresponds to a synthetic gRNA 1 without a 5’ G; “gRNA 1 (+5’G)” corresponds to a synthetic gRNA 1 with a 5’ G; “gRNA 11” corresponds to a synthetic gRNA 11 without a 5’ G; and “gRNA 11 (+5’G)” corresponds to a synthetic gRNA 11 with a 5’ G. “gRNA DMD 16-58” and “gRNA EMX1-sg1” were positive control gRNAs.
  • FIG. 6E shows that the DNA cleavage efficiency of gRNA 1 without the 5’ G is about 90%, and that addition of the 5’ G reduced the DNA cleavage efficiency to less than 30%. A decrease in efficiency was also observed when a 5’ G was added to gRNA 11, though to a lesser extent than for gRNA 1.
  • FIGS. 6C and 6D show modifications to the spacer sequence of gRNA 1 that are predicted to disrupt the inhibitory stem loop and “open” the spacer sequence.
  • FIG. 6C shows a mutation of the adenine directly following the 5’ G (A1) to uracil.
  • FIG. 6D shows addition of a uracil directly following the 5’ G.
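The two spacer edits shown in FIGS. 6C and 6D are simple string operations on the transcribed RNA. The sketch below only applies the edits; it does not predict folding. The spacer used here is an arbitrary placeholder (the actual gRNA 1 spacer is not reproduced in this section), and all function names are hypothetical.

```python
def add_five_prime_g(spacer: str) -> str:
    """Prepend the 5' G used for expression from a vector."""
    return "G" + spacer


def mutate_a1_to_u(transcript: str) -> str:
    """FIG. 6C style edit: change the A directly after the 5' G (A1) to U."""
    if not transcript.startswith("GA"):
        raise ValueError("expected a 5' G followed by A1")
    return "GU" + transcript[2:]


def insert_u_after_g(transcript: str) -> str:
    """FIG. 6D style edit: insert a U directly after the 5' G."""
    if not transcript.startswith("G"):
        raise ValueError("expected a 5' G")
    return "GU" + transcript[1:]


placeholder_spacer = "AUGCCGAUUACGGCAUCGAAC"  # hypothetical 21-nt spacer
with_g = add_five_prime_g(placeholder_spacer)
a1u = mutate_a1_to_u(with_g)      # same length as with_g
plus_u = insert_u_after_g(with_g)  # one nucleotide longer than with_g
```

The substitution keeps the transcript length unchanged, whereas the insertion lengthens it by one nucleotide; either variant would then need to be passed to an RNA folding tool to assess whether the stem loop is disrupted.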
  • The scaffold sequence, i.e., the region that binds to SaCas9, was also examined for modifications to improve expression and, consequently, editing efficiency.
  • The tetraloop of the scaffold sequence contains a tract of consecutive thymines (T). A run of 5-6 consecutive thymines (T5-6) is typically a strong termination signal for RNA Pol III, while a run of 4 thymines (T4) can serve as a premature termination signal and reduce gRNA transcription efficiency. See, e.g., Chen et al., Nucleic Acids Res 44(8):e75, 2016.
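Scanning a gRNA expression cassette for such thymine tracts is straightforward. The following is a minimal sketch; the function names are hypothetical, the example sequences are synthetic rather than taken from the sequence listing, and treating runs longer than six thymines as "strong" is an assumption that extends the T5-6 wording above.

```python
import re


def thymine_runs(dna: str, min_len: int = 4):
    """Return (start, length) for each run of at least `min_len` T bases."""
    pattern = re.compile(r"T{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(dna.upper())]


def classify_run(length: int) -> str:
    """Label a T run following the T4 versus T5-6 distinction in the text."""
    if length >= 5:
        return "strong terminator (T5 or longer)"
    if length == 4:
        return "possible premature terminator (T4)"
    return "not flagged"


for start, length in thymine_runs("GCATTTTAGAGCTTTTTTGC"):
    print(start, length, classify_run(length))
```

A scaffold variant that passes this check would return an empty list from `thymine_runs`, which is one way to screen candidate sequences for the "no early termination signal" condition recited in the embodiments.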
  • FIG. 7A shows the sequence of gRNA 1 expression cassette with part of the U6 promoter region, including three loop structures in the scaffold region.
  • FIG. 7B shows a non-limiting list of modifications that were made to the gRNA to improve expression and efficiency, including: (1) deletion of the 5’ G (ΔG); (2) mutation of T2 to G (T2G); (3) insertion of T between 5’ G and A1 (+5’T); (4) mutation of A1 to T (A1T); (5) insertion of a folding-inducing sequence that forms a highly stabilized loop (“locked loop”); (6) mutations of the third T in the T4 sequence of the tetraloop to A and its corresponding downstream paired nucleotide from A to T (shown in FIG.
  • FIG. 8A shows representative editing efficiency results with three batches of “original,” unmodified gRNA 1, and with gRNA 1 containing the ΔG, T2G, A1T, +5’T, sc1, sc2, and sc3 modifications described above. An untreated (uninfected) sample was included as a negative control. Mutant allele excision efficiency (%) was measured by ddPCR. As shown in FIG. 8A, removal of the 5’ G improved editing by at least 3-fold as compared to the original gRNA 1.
  • FIG. 8B shows representative % indel results with the same modified and unmodified gRNA 1 as FIG. 8A. Similar improvements in % indels for gRNA 1 were observed with the ΔG, sc1, sc2, and sc3 modifications.
  • FIG. 8C shows the % indel results of gRNA 11 with the same modifications as in FIGS. 8A and 8B.
  • FIG. 9 shows representative % removal of mutant alleles in iNeurons with different combinations of optimized gRNAs.
  • FIG. 10A shows % indel results with the same optimized gRNA combinations for gRNA1.
  • FIG. 10B shows % indel results with the same optimized gRNA combinations for gRNA11.
  • Example 5 Optimized dual gRNA excision strategy achieves mutant HTT mRNA and protein reduction in HD patient-derived IPS-Neurons
  • FIG. 12A shows the fold change in HTT mRNA levels upon dual gRNA excision with the engineered guides and an optimized SaCas9 using lentiviral vector delivery in HD patient-derived IPS-Neurons.
  • FIG. 12B shows the fold change of total and mutant HTT protein reduction achieved in HD patient-derived IPS-Neurons upon dual gRNA excision with the engineered guides and the optimized SaCas9 with lentivirus delivery.
  • Gys1 refers to a guide RNA targeting the mouse glycogen synthase gene Gys1 as non-targeting control (Gumusgoz et al.
  • FIG. 13A shows the mutant HTT allele excision % in HD patient-derived iNeurons after treatment with escalating doses of AAVs (1E3, 1E4 and 1E5 AAV particles/cell) containing the optimized guides and the original or a codon-NLS optimized SaCas9. Results were obtained by ddPCR designed to detect the excision product. A plateau is reached at the 1E4 dose, most likely due to observed toxicity.
  • FIG. 13B shows the corresponding efficiency of editing (indels) at single sites (g1 top, g11 bottom) associated with the dual excision shown in FIG. 14A. Measurements were obtained by NGS using primers flanking each single cut site.
  • Example 7 In vivo AAV delivery of the optimized SaCas9-dual gRNA excision strategy in BACHD mouse brain
  • The optimized SaCas9 variant and the optimized gRNA 1 and gRNA 11 were tested in the brain of a mouse model of HD (the BACHD mouse model). The striatum of BACHD mice was directly injected with AAV9 expressing the CRISPR components. The striatum was then dissected for DNA extraction and analysis.
  • FIG. 15A shows that the optimized SaCas9 variant increased mutant HTT allele excision by approximately 50% compared to the original SaCas9, reaching 9% as measured by a ddPCR assay designed to detect the excision product.
  • FIG. 15B shows the corresponding indel % at single gRNA target sites, reaching 15% as measured by NGS using primers flanking each single cut site.

Abstract

The present disclosure relates to polynucleotides, expression vectors, compositions, and methods for treating Huntington's disease (HD). In some embodiments, the polynucleotides, expression vectors, and compositions of the disclosure are components of a CRISPR system.
PCT/EP2025/053831 2024-02-14 2025-02-13 Compositions and methods for treating Huntington's disease Pending WO2025172421A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP24157658.6 2024-02-14
EP24157658 2024-02-14

Publications (1)

Publication Number Publication Date
WO2025172421A1 true WO2025172421A1 (fr) 2025-08-21

Family

ID=89941016

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2025/053831 Pending WO2025172421A1 (fr) 2024-02-14 2025-02-13 Compositions and methods for treating Huntington's disease

Country Status (1)

Country Link
WO (1) WO2025172421A1 (fr)

Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5543158A (en) 1993-07-23 1996-08-06 Massachusetts Institute Of Technology Biodegradable injectable nanoparticles
US5855913A (en) 1997-01-16 1999-01-05 Massachusetts Institute Of Technology Particles incorporating surfactants for pulmonary drug delivery
US5895309A (en) 1998-02-09 1999-04-20 Spector; Donald Collapsible hula-hoop
US6007845A (en) 1994-07-22 1999-12-28 Massachusetts Institute Of Technology Nanoparticles and microparticles of non-linear hydrophilic-hydrophobic multiblock copolymers
US20080234183A1 (en) 2002-06-18 2008-09-25 Mattias Hallbrink Cell Penetrating Peptides
US20110293703A1 (en) 2008-11-07 2011-12-01 Massachusetts Institute Of Technology Aminoalcohol lipidoids and uses thereof
US20120251560A1 (en) 2011-03-28 2012-10-04 Massachusetts Institute Of Technology Conjugated lipomers and uses thereof
US20130302401A1 (en) 2010-08-26 2013-11-14 Massachusetts Institute Of Technology Poly(beta-amino alcohols), their preparation, and uses thereof
US8709843B2 (en) 2006-08-24 2014-04-29 Rohm Co., Ltd. Method of manufacturing nitride semiconductor and nitride semiconductor element
WO2015089473A1 (fr) * 2013-12-12 2015-06-18 The Broad Institute Inc. Engineering of systems, methods and optimized guide compositions with new architectures for sequence manipulation
WO2017062983A1 (fr) * 2015-10-09 2017-04-13 The Children's Hospital Of Philadelphia Compositions and methods for treating Huntington's disease and related disorders
US20190167810A1 (en) 2016-05-25 2019-06-06 Evox Therapeutics Ltd Exosomes comprising therapeutic polypeptides
WO2020007325A1 (fr) * 2018-07-05 2020-01-09 Tsinghua University Cas9 variants and uses thereof
US20200207833A1 (en) 2013-04-12 2020-07-02 Evox Therapeutics Ltd. Therapeutic delivery vesicles
WO2021113769A1 (fr) * 2019-12-07 2021-06-10 Scribe Therapeutics Inc. Compositions and methods for the targeting of HTT
US11674128B2 (en) * 2016-12-12 2023-06-13 Tsinghua University Engineering of a minimal SaCas9 CRISPR/Cas system for gene editing and transcriptional regulation optimized by enhanced guide RNA
WO2023178280A2 (fr) * 2022-03-17 2023-09-21 The Board Of Trustees Of The Leland Stanford Junior University Compositions and methods for modulating alpha-synuclein expression

Non-Patent Citations (28)

* Cited by examiner, † Cited by third party
Title
"NCBI", Database accession no. rs3856973
ALEX MAS MONTEYS ET AL: "CRISPR/Cas9 Editing of the Mutant Huntingtin Allele In Vitro and In Vivo", MOLECULAR THERAPY, vol. 25, no. 1, 4 January 2017 (2017-01-04), United States, pages 12 - 23, XP055574814, ISSN: 1525-0016, DOI: 10.1016/j.ymthe.2016.11.010 *
ALTSCHUL ET AL., J MOL BIOL, vol. 215, 1990, pages 403 - 410
ALTSCHUL ET AL., NUCLEIC ACIDS RES, vol. 25, no. 17, 1997, pages 3389 - 3402
ANDRONESCU ET AL., BIOINFORMATICS, vol. 23, no. 13, 2007, pages 19 - 28
BAOHUI CHEN ET AL: "Expanding the CRISPR imaging toolset with Staphylococcus aureus Cas9 for simultaneous imaging of multiple genomic loci", NUCLEIC ACIDS RESEARCH, vol. 44, no. 8, 5 January 2016 (2016-01-05), GB, pages e75 - e75, XP055624229, ISSN: 0305-1048, DOI: 10.1093/nar/gkv1533 *
CHEN ET AL., NUCLEIC ACIDS RES, vol. 44, no. 8, 2016, pages e75
DIAZ-HERNANDEZ ET AL., J NEUROSCI, vol. 25, 2005, pages 9773 - 9781
EKMAN FREJA K. ET AL: "CRISPR-Cas9-Mediated Genome Editing Increases Lifespan and Improves Motor Deficits in a Huntington's Disease Mouse Model", MOLECULAR THERAPY-NUCLEIC ACIDS, vol. 17, 1 September 2019 (2019-09-01), US, pages 829 - 839, XP055779039, ISSN: 2162-2531, DOI: 10.1016/j.omtn.2019.07.009 *
GAO ET AL., NAT BIOTECHNOL, vol. 35, no. 8, 2017, pages 789 - 792
HUNG ET AL., MOL BIOL CELL, vol. 29, no. 23, 2018, pages 2809 - 2820
JINEK ET AL., SCIENCE, vol. 337, no. 6096, 2012, pages 816 - 821
KARLIN AND ALTSCHUL, PROC NAT ACAD SCI USA, vol. 87, 1990, pages 2264 - 2268
KARLIN AND ALTSCHUL, PROC NAT ACAD SCI USA, vol. 90, 1993, pages 5873 - 5877
KLEINSTIVER ET AL., NATURE, vol. 520, 2015, pages 186 - 191
LE CANN ET AL., SCI REP, vol. 11, 2021, pages 6934
LI FANG: "Haplotyping SNPs for allele-specific gene editing of the expanded huntingtin allele using long-read sequencing", HUMAN GENETICS AND GENOMICS ADVANCES, vol. 4, no. 1, 12 January 2023 (2023-01-12), pages 100146, XP093168127, ISSN: 2666-2477, DOI: 10.1016/j.xhgg.2022.100146 *
MALI ET AL., NAT METHODS, vol. 10, 2013, pages 957 - 963
MALI ET AL., SCIENCE, vol. 339, no. 6121, 2013, pages 823 - 826
MITRA ET AL., MATER METHODS, vol. 3, 2013, pages 204
RIESENBERG ET AL., NAT COMMUN, vol. 13, 2022, pages 489
RIESENBERG STEPHAN ET AL: "Improved gRNA secondary structures allow editing of target sites resistant to CRISPR-Cas9 cleavage", NATURE COMMUNICATIONS, vol. 13, no. 1, 25 January 2022 (2022-01-25), UK, pages 489 - 489, XP055955690, ISSN: 2041-1723, DOI: 10.1038/s41467-022-28137-7 *
SAMBROOK ET AL.: "Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY PRESS
SANDER ET AL., NAT BIOTECHNOL, vol. 32, 2014, pages 347 - 355
SZLACHCIC ET AL., MOL NEUROSCI, vol. 10, 2017, pages 253
TOTH ET AL., NUCLEIC ACIDS RES, vol. 48, no. 7, 2020, pages 3722 - 3733
WALS ET AL., FRONT CHEM, vol. 2, 2014, pages 15
YAMAMOTO ET AL., CELL, vol. 101, 2000, pages 57 - 66

Similar Documents

Publication Publication Date Title
US11479775B2 (en) RNA targeting of mutations via suppressor tRNAs and deaminases
CN113631708B (en) Methods and compositions for editing RNA
US20200056206A1 (en) CRISPR-based treatment of Friedreich ataxia
US11530421B2 (en) Self-inactivating endonuclease-encoding nucleic acids and methods of using the same
CN109021111A (en) A gene base editor
KR20180069898A (en) Nucleobase editors and uses thereof
CA3009727A1 (en) Compositions and methods for the treatment of hemoglobinopathies
WO2025061173A1 (en) Fusion protein and application thereof
EP4530351A2 (en) Compositions and methods for improved gene editing
WO2024251229A9 (en) Cas enzyme, and related system and use thereof
WO2025172421A1 (en) Compositions and methods for treating Huntington's disease
US20250041449A1 (en) Base editor and use thereof
CN115247162B (en) Fusion protein for adenine base editing and application thereof
CN117210435A (en) Editing system for regulating RNA methylation modification and application thereof
US20230313235A1 (en) Compositions for use in treating autosomal dominant BEST1-related retinopathies
WO2024245152A1 (en) Gene editing system targeting PTBP1 and use thereof
HK40081918B (en) Methods and compositions for editing RNA
WO2025085787A1 (en) Modified components of CRISPR and CRISPR-associated transposon systems
WO2025244671A1 (en) Compositions for use in treating haploinsufficiency diseases
WO2025259780A1 (en) Polypeptides and methods for modifying nucleic acids
WO2025087411A1 (en) Deaminase, base editor comprising same and use thereof
EP4577647A1 (en) CRISPR base editor
HK40056042B (en) Methods and compositions for editing RNAs
HK40056042A (en) Methods and compositions for editing RNAs

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 25704591

Country of ref document: EP

Kind code of ref document: A1